linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 4.4 00/90] 4.4.72-stable review
@ 2017-06-12 15:25 Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 01/90] bnx2x: Fix Multi-Cos Greg Kroah-Hartman
                   ` (87 more replies)
  0 siblings, 88 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuahkh, patches,
	ben.hutchings, stable

This is the start of the stable review cycle for the 4.4.72 release.
There are 90 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed Jun 14 15:25:30 UTC 2017.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.72-rc1.gz
or in the git tree and branch at:
  git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 4.4.72-rc1

Mark Rutland <mark.rutland@arm.com>
    arm64: ensure extension of smp_store_release value

Mark Rutland <mark.rutland@arm.com>
    arm64: armv8_deprecated: ensure extension of addr

Kees Cook <keescook@chromium.org>
    usercopy: Adjust tests to deal with SMAP/PAN

Amey Telawane <ameyt@codeaurora.org>
    tracing: Use strlcpy() instead of strcpy() in __trace_find_cmdline()

Mike Marciniszyn <mike.marciniszyn@intel.com>
    RDMA/qib,hfi1: Fix MR reference count leak on write with immediate

Kristina Martsenko <kristina.martsenko@arm.com>
    arm64: entry: improve data abort handling of tagged pointers

Kristina Martsenko <kristina.martsenko@arm.com>
    arm64: hw_breakpoint: fix watchpoint matching for tagged pointers

Artem Savkov <asavkov@redhat.com>
    Make __xfs_xattr_put_listen preperly report errors.

Trond Myklebust <trond.myklebust@primarydata.com>
    NFSv4: Don't perform cached access checks before we've OPENed the file

Trond Myklebust <trond.myklebust@primarydata.com>
    NFS: Ensure we revalidate attributes before using execute_ok()

Michal Hocko <mhocko@suse.com>
    mm: consider memblock reservations for deferred memory initialization sizing

Eric Dumazet <edumazet@google.com>
    net: better skb->sender_cpu and skb->napi_id cohabitation

Takatoshi Akiyama <takatoshi.akiyama.kj@ps.hitachi-solutions.com>
    serial: sh-sci: Fix panic when serial console and DMA are enabled

Peter Hurley <peter@hurleysoftware.com>
    tty: Drop krefs for interrupted tty lock

Julius Werner <jwerner@chromium.org>
    drivers: char: mem: Fix wraparound check to allow mappings up to the end

Takashi Iwai <tiwai@suse.de>
    ASoC: Fix use-after-free at card unregistration

Takashi Iwai <tiwai@suse.de>
    ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT

Takashi Iwai <tiwai@suse.de>
    ALSA: timer: Fix race between read and ioctl

Ben Skeggs <bskeggs@redhat.com>
    drm/nouveau/tmr: fully separate alarm execution/pending lists

Sinclair Yeh <syeh@vmware.com>
    drm/vmwgfx: Make sure backup_handle is always valid

Vladis Dronov <vdronov@redhat.com>
    drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl()

Dan Carpenter <dan.carpenter@oracle.com>
    drm/vmwgfx: Handle vmalloc() failure in vmw_local_fifo_reserve()

Jin Yao <yao.jin@linux.intel.com>
    perf/core: Drop kernel samples even though :u is specified

Michael Bringmann <mwb@linux.vnet.ibm.com>
    powerpc/hotplug-mem: Fix missing endian conversion of aa_index

Michael Ellerman <mpe@ellerman.id.au>
    powerpc/numa: Fix percpu allocations to be NUMA aware

Russell Currey <ruscur@russell.cc>
    powerpc/eeh: Avoid use after free in eeh_handle_special_event()

Johannes Thumshirn <jthumshirn@suse.de>
    scsi: qla2xxx: don't disable a not previously enabled PCI device

Marc Zyngier <marc.zyngier@arm.com>
    KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages

Jeff Mahoney <jeffm@suse.com>
    btrfs: fix memory leak in update_space_info failure path

David Sterba <dsterba@suse.com>
    btrfs: use correct types for page indices in btrfs_page_exists_in_range

Frederic Barrat <fbarrat@linux.vnet.ibm.com>
    cxl: Fix error path on bad ioctl

Al Viro <viro@zeniv.linux.org.uk>
    ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path

Al Viro <viro@zeniv.linux.org.uk>
    ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments()

Al Viro <viro@zeniv.linux.org.uk>
    ufs: set correct ->s_maxsize

Al Viro <viro@zeniv.linux.org.uk>
    ufs: restore maintaining ->i_blocks

Al Viro <viro@zeniv.linux.org.uk>
    fix ufs_isblockset()

Al Viro <viro@zeniv.linux.org.uk>
    ufs: restore proper tail allocation

Fabian Frederick <fabf@skynet.be>
    fs: add i_blocksize()

Tejun Heo <tj@kernel.org>
    cpuset: consider dying css as offline

Ulrik De Bie <ulrik.debie-os@e2big.org>
    Input: elantech - add Fujitsu Lifebook E546/E557 to force crc_enabled

Eric Anholt <eric@anholt.net>
    drm/msm: Expose our reservation object when exporting a dmabuf.

Nicholas Bellinger <nab@linux-iscsi.org>
    target: Re-add check to reject control WRITEs with overflow data

David Arcari <darcari@redhat.com>
    cpufreq: cpufreq_register_driver() should return -ENODEV if init fails

Daniel Micay <danielmicay@gmail.com>
    stackprotector: Increase the per-task stack canary's random range from 32 bits to 64 bits on 64-bit platforms

Eric Biggers <ebiggers3@gmail.com>
    random: properly align get_random_int_hash

Daniel Cashman <dcashman@android.com>
    drivers: char: random: add get_random_long()

Matt Ranostay <matt.ranostay@konsulko.com>
    iio: proximity: as3935: fix AS3935_INT mask

Franziska Naepelt <franziska.naepelt@idt.com>
    iio: light: ltr501 Fix interchanged als/ps register field

Oleg Drokin <green@linuxhacker.ru>
    staging/lustre/lov: remove set_fs() call from lov_getstripe()

Michael Thalmeier <michael.thalmeier@hale.at>
    usb: chipidea: debug: check before accessing ci_role

Jisheng Zhang <jszhang@marvell.com>
    usb: chipidea: udc: fix NULL pointer dereference if udc_start failed

Thinh Nguyen <Thinh.Nguyen@synopsys.com>
    usb: gadget: f_mass_storage: Serialize wake and sleep execution

Jan Kara <jack@suse.cz>
    ext4: fix fdatasync(2) after extent manipulation operations

Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
    ext4: keep existing extra fields when inode expands

Jan Kara <jack@suse.cz>
    ext4: fix SEEK_HOLE

Dongli Zhang <dongli.zhang@oracle.com>
    xen-netfront: cast grant table reference first to type int

Dongli Zhang <dongli.zhang@oracle.com>
    xen-netfront: do not cast grant table reference to signed short

Julien Grall <julien.grall@arm.com>
    xen/privcmd: Support correctly 64KB page granularity when mapping memory

Alexander Sverdlin <alexander.sverdlin@gmail.com>
    dmaengine: ep93xx: Always start from BASE0

Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>
    dmaengine: usb-dmac: Fix DMAOR AE bit definition

Wanpeng Li <wanpeng.li@hotmail.com>
    KVM: async_pf: avoid async pf injection when in guest mode

Marc Zyngier <marc.zyngier@arm.com>
    arm: KVM: Allow unaligned accesses at HYP

Wanpeng Li <wanpeng.li@hotmail.com>
    KVM: cpuid: Fix read/write out-of-bounds vulnerability in cpuid emulation

Paolo Bonzini <pbonzini@redhat.com>
    kvm: async_pf: fix rcu_irq_enter() with irqs enabled

Trond Myklebust <trond.myklebust@primarydata.com>
    nfsd: Fix up the "supattr_exclcreat" attributes

J. Bruce Fields <bfields@redhat.com>
    nfsd4: fix null dereference on replay

Alex Deucher <alexander.deucher@amd.com>
    drm/amdgpu/ci: disable mclk switching for high refresh rates (v2)

Gilad Ben-Yossef <gilad@benyossef.com>
    crypto: gcm - wait for crypto op not signal safe

Eric Biggers <ebiggers@google.com>
    KEYS: fix freeing uninitialized memory in key_update()

Eric Biggers <ebiggers@google.com>
    KEYS: fix dereferencing NULL payload with nonzero length

Eric W. Biederman <ebiederm@xmission.com>
    ptrace: Properly initialize ptracer_cred on fork

Johan Hovold <johan@kernel.org>
    serial: ifx6x60: fix use-after-free on module unload

Jane Chu <jane.chu@oracle.com>
    arch/sparc: support NR_CPUS = 4096

Pavel Tatashin <pasha.tatashin@oracle.com>
    sparc64: delete old wrap code

Pavel Tatashin <pasha.tatashin@oracle.com>
    sparc64: new context wrap

Pavel Tatashin <pasha.tatashin@oracle.com>
    sparc64: add per-cpu mm of secondary contexts

Pavel Tatashin <pasha.tatashin@oracle.com>
    sparc64: redefine first version

Pavel Tatashin <pasha.tatashin@oracle.com>
    sparc64: combine activate_mm and switch_mm

Pavel Tatashin <pasha.tatashin@oracle.com>
    sparc64: reset mm cpumask after wrap

James Clarke <jrtc27@jrtc27.com>
    sparc: Machine description indices can vary

Mike Kravetz <mike.kravetz@oracle.com>
    sparc64: mm: fix copy_tsb to correctly copy huge page TSBs

Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
    net: bridge: start hello timer only if device is up

Max Filippov <jcmvbkbc@gmail.com>
    net: ethoc: enable NAPI before poll may be scheduled

Eric Dumazet <edumazet@google.com>
    net: ping: do not abuse udp_poll()

David S. Miller <davem@davemloft.net>
    ipv6: Fix leak in ipv6_gso_segment().

Mark Bloch <markb@mellanox.com>
    vxlan: fix use-after-free on deletion

Yuchung Cheng <ycheng@google.com>
    tcp: disallow cwnd undo when switching congestion control

Ganesh Goudar <ganeshgr@chelsio.com>
    cxgb4: avoid enabling napi twice to the same queue

Ben Hutchings <ben@decadent.org.uk>
    ipv6: xfrm: Handle errors reported by xfrm6_find_1stfragopt()

Mintz, Yuval <Yuval.Mintz@cavium.com>
    bnx2x: Fix Multi-Cos


-------------

Diffstat:

 Makefile                                           |  4 +-
 arch/arm/kvm/init.S                                |  5 +-
 arch/arm/kvm/mmu.c                                 |  3 +
 arch/arm64/include/asm/asm-uaccess.h               | 13 ++++
 arch/arm64/include/asm/barrier.h                   | 18 ++++-
 arch/arm64/include/asm/uaccess.h                   |  8 ++
 arch/arm64/kernel/armv8_deprecated.c               |  3 +-
 arch/arm64/kernel/entry.S                          |  6 +-
 arch/arm64/kernel/hw_breakpoint.c                  |  3 +-
 arch/powerpc/include/asm/topology.h                | 14 ++++
 arch/powerpc/kernel/eeh_driver.c                   | 19 ++++-
 arch/powerpc/kernel/setup_64.c                     |  4 +-
 arch/powerpc/platforms/pseries/hotplug-memory.c    |  2 +
 arch/sparc/Kconfig                                 |  4 +-
 arch/sparc/include/asm/mmu_64.h                    |  2 +-
 arch/sparc/include/asm/mmu_context_64.h            | 32 +-------
 arch/sparc/include/asm/pil.h                       |  1 -
 arch/sparc/include/asm/vio.h                       |  1 +
 arch/sparc/kernel/irq_64.c                         | 17 ++++-
 arch/sparc/kernel/kernel.h                         |  1 -
 arch/sparc/kernel/smp_64.c                         | 31 --------
 arch/sparc/kernel/tsb.S                            | 11 ++-
 arch/sparc/kernel/ttable_64.S                      |  2 +-
 arch/sparc/kernel/vio.c                            | 68 ++++++++++++++++-
 arch/sparc/mm/init_64.c                            | 86 +++++++++++++++-------
 arch/sparc/mm/tsb.c                                |  7 +-
 arch/sparc/mm/ultra.S                              |  5 --
 arch/x86/kernel/kvm.c                              |  2 +-
 arch/x86/kvm/cpuid.c                               | 20 ++---
 arch/x86/kvm/mmu.c                                 |  7 +-
 arch/x86/kvm/mmu.h                                 |  1 +
 arch/x86/kvm/x86.c                                 |  3 +-
 crypto/gcm.c                                       |  6 +-
 drivers/char/mem.c                                 |  2 +-
 drivers/char/random.c                              | 26 ++++++-
 drivers/cpufreq/cpufreq.c                          |  1 +
 drivers/dma/ep93xx_dma.c                           |  2 +
 drivers/dma/sh/usb-dmac.c                          |  2 +-
 drivers/gpu/drm/amd/amdgpu/ci_dpm.c                |  6 ++
 drivers/gpu/drm/msm/msm_drv.c                      |  1 +
 drivers/gpu/drm/msm/msm_drv.h                      |  1 +
 drivers/gpu/drm/msm/msm_gem_prime.c                |  7 ++
 .../gpu/drm/nouveau/include/nvkm/subdev/timer.h    |  1 +
 drivers/gpu/drm/nouveau/nvkm/subdev/timer/base.c   |  7 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_fifo.c               |  2 +
 drivers/gpu/drm/vmwgfx/vmwgfx_surface.c            | 21 ++++--
 drivers/iio/light/ltr501.c                         |  4 +-
 drivers/iio/proximity/as3935.c                     |  4 +-
 drivers/infiniband/hw/qib/qib_rc.c                 |  4 +-
 drivers/input/mouse/elantech.c                     | 16 ++++
 drivers/misc/cxl/file.c                            |  7 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c    |  2 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c    |  4 +
 drivers/net/ethernet/ethoc.c                       |  3 +-
 drivers/net/vxlan.c                                | 19 +++--
 drivers/net/xen-netfront.c                         |  4 +-
 drivers/scsi/qla2xxx/qla_os.c                      |  8 +-
 drivers/staging/lustre/lustre/lov/lov_pack.c       |  9 ---
 drivers/target/target_core_transport.c             | 23 ++++--
 drivers/tty/serial/ifx6x60.c                       |  2 +-
 drivers/tty/serial/sh-sci.c                        | 10 ++-
 drivers/tty/tty_io.c                               |  3 +-
 drivers/tty/tty_mutex.c                            |  7 +-
 drivers/usb/chipidea/debug.c                       |  3 +-
 drivers/usb/chipidea/udc.c                         |  8 +-
 drivers/usb/gadget/function/f_mass_storage.c       | 13 +++-
 drivers/xen/privcmd.c                              |  4 +-
 fs/btrfs/extent-tree.c                             |  1 +
 fs/btrfs/file.c                                    |  2 +-
 fs/btrfs/inode.c                                   |  4 +-
 fs/buffer.c                                        | 12 +--
 fs/ceph/addr.c                                     |  2 +-
 fs/direct-io.c                                     |  2 +-
 fs/ext4/extents.c                                  |  5 ++
 fs/ext4/file.c                                     | 50 ++++---------
 fs/ext4/inode.c                                    |  9 ++-
 fs/ext4/move_extent.c                              |  2 +-
 fs/jfs/super.c                                     |  4 +-
 fs/mpage.c                                         |  2 +-
 fs/nfs/dir.c                                       | 21 +++++-
 fs/nfsd/blocklayout.c                              |  4 +-
 fs/nfsd/nfs4proc.c                                 | 13 ++--
 fs/nfsd/nfs4xdr.c                                  | 13 +++-
 fs/nilfs2/btnode.c                                 |  2 +-
 fs/nilfs2/inode.c                                  |  4 +-
 fs/nilfs2/mdt.c                                    |  4 +-
 fs/nilfs2/segment.c                                |  2 +-
 fs/ocfs2/aops.c                                    |  2 +-
 fs/ocfs2/file.c                                    |  2 +-
 fs/reiserfs/file.c                                 |  2 +-
 fs/reiserfs/inode.c                                |  2 +-
 fs/stat.c                                          |  3 +-
 fs/udf/inode.c                                     |  2 +-
 fs/ufs/balloc.c                                    | 26 ++++++-
 fs/ufs/inode.c                                     |  9 ++-
 fs/ufs/super.c                                     | 18 +++++
 fs/ufs/util.h                                      | 10 ++-
 fs/xfs/xfs_aops.c                                  | 10 +--
 fs/xfs/xfs_file.c                                  |  4 +-
 fs/xfs/xfs_xattr.c                                 |  1 +
 include/linux/cgroup.h                             | 20 +++++
 include/linux/fs.h                                 |  5 ++
 include/linux/memblock.h                           |  8 ++
 include/linux/mmzone.h                             |  1 +
 include/linux/ptrace.h                             |  7 +-
 include/linux/random.h                             |  1 +
 include/linux/skbuff.h                             |  3 -
 include/net/ipv6.h                                 |  1 +
 kernel/cpuset.c                                    |  4 +-
 kernel/events/core.c                               | 21 ++++++
 kernel/fork.c                                      |  2 +-
 kernel/ptrace.c                                    | 20 +++--
 kernel/trace/trace.c                               |  2 +-
 lib/test_user_copy.c                               | 20 ++++-
 mm/memblock.c                                      | 24 ++++++
 mm/page_alloc.c                                    | 25 ++++++-
 mm/truncate.c                                      |  2 +-
 net/bridge/br_stp_if.c                             |  3 +-
 net/core/dev.c                                     | 33 ++++-----
 net/ipv4/af_inet.c                                 |  2 +-
 net/ipv4/tcp_cong.c                                |  1 +
 net/ipv6/ip6_offload.c                             |  4 +-
 net/ipv6/ping.c                                    |  2 +-
 net/ipv6/raw.c                                     |  2 +-
 net/ipv6/xfrm6_mode_ro.c                           |  2 +
 net/ipv6/xfrm6_mode_transport.c                    |  2 +
 security/keys/key.c                                |  5 +-
 security/keys/keyctl.c                             |  4 +-
 sound/core/timer.c                                 |  7 +-
 sound/soc/soc-core.c                               |  5 +-
 130 files changed, 770 insertions(+), 362 deletions(-)

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 01/90] bnx2x: Fix Multi-Cos
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 02/90] ipv6: xfrm: Handle errors reported by xfrm6_find_1stfragopt() Greg Kroah-Hartman
                   ` (86 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Yuval Mintz, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: "Mintz, Yuval" <Yuval.Mintz@cavium.com>


[ Upstream commit 3968d38917eb9bd0cd391265f6c9c538d9b33ffa ]

Apparently multi-cos isn't working for bnx2x quite some time -
driver implements ndo_select_queue() to allow queue-selection
for FCoE, but the regular L2 flow would cause it to modulo the
fallback's result by the number of queues.
The fallback would return a queue matching the needed tc
[via __skb_tx_hash()], but since the modulo is by the number of TSS
queues where number of TCs is not accounted, transmission would always
be done by a queue configured into using TC0.

Fixes: ada7c19e6d27 ("bnx2x: use XPS if possible for bnx2x_select_queue instead of pure hash")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -1949,7 +1949,7 @@ u16 bnx2x_select_queue(struct net_device
 	}
 
 	/* select a non-FCoE queue */
-	return fallback(dev, skb) % BNX2X_NUM_ETH_QUEUES(bp);
+	return fallback(dev, skb) % (BNX2X_NUM_ETH_QUEUES(bp) * bp->max_cos);
 }
 
 void bnx2x_set_num_queues(struct bnx2x *bp)

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 02/90] ipv6: xfrm: Handle errors reported by xfrm6_find_1stfragopt()
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 01/90] bnx2x: Fix Multi-Cos Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 03/90] cxgb4: avoid enabling napi twice to the same queue Greg Kroah-Hartman
                   ` (85 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ben Hutchings, Craig Gallek, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Hutchings <ben@decadent.org.uk>


[ Upstream commit 6e80ac5cc992ab6256c3dae87f7e57db15e1a58c ]

xfrm6_find_1stfragopt() may now return an error code and we must
not treat it as a length.

Fixes: 2423496af35d ("ipv6: Prevent overrun when parsing v6 header options")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/xfrm6_mode_ro.c        |    2 ++
 net/ipv6/xfrm6_mode_transport.c |    2 ++
 2 files changed, 4 insertions(+)

--- a/net/ipv6/xfrm6_mode_ro.c
+++ b/net/ipv6/xfrm6_mode_ro.c
@@ -47,6 +47,8 @@ static int xfrm6_ro_output(struct xfrm_s
 	iph = ipv6_hdr(skb);
 
 	hdr_len = x->type->hdr_offset(x, skb, &prevhdr);
+	if (hdr_len < 0)
+		return hdr_len;
 	skb_set_mac_header(skb, (prevhdr - x->props.header_len) - skb->data);
 	skb_set_network_header(skb, -x->props.header_len);
 	skb->transport_header = skb->network_header + hdr_len;
--- a/net/ipv6/xfrm6_mode_transport.c
+++ b/net/ipv6/xfrm6_mode_transport.c
@@ -28,6 +28,8 @@ static int xfrm6_transport_output(struct
 	iph = ipv6_hdr(skb);
 
 	hdr_len = x->type->hdr_offset(x, skb, &prevhdr);
+	if (hdr_len < 0)
+		return hdr_len;
 	skb_set_mac_header(skb, (prevhdr - x->props.header_len) - skb->data);
 	skb_set_network_header(skb, -x->props.header_len);
 	skb->transport_header = skb->network_header + hdr_len;

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 03/90] cxgb4: avoid enabling napi twice to the same queue
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 01/90] bnx2x: Fix Multi-Cos Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 02/90] ipv6: xfrm: Handle errors reported by xfrm6_find_1stfragopt() Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 04/90] tcp: disallow cwnd undo when switching congestion control Greg Kroah-Hartman
                   ` (84 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Ganesh Goudar, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ganesh Goudar <ganeshgr@chelsio.com>


[ Upstream commit e7519f9926f1d0d11c776eb0475eb098c7760f68 ]

Take uld mutex to avoid race between cxgb_up() and
cxgb4_register_uld() to enable napi for the same uld
queue.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -2714,10 +2714,14 @@ static int cxgb_up(struct adapter *adap)
 		if (err)
 			goto irq_err;
 	}
+
+	mutex_lock(&uld_mutex);
 	enable_rx(adap);
 	t4_sge_start(adap);
 	t4_intr_enable(adap);
 	adap->flags |= FULL_INIT_DONE;
+	mutex_unlock(&uld_mutex);
+
 	notify_ulds(adap, CXGB4_STATE_UP);
 #if IS_ENABLED(CONFIG_IPV6)
 	update_clip(adap);

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 04/90] tcp: disallow cwnd undo when switching congestion control
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 03/90] cxgb4: avoid enabling napi twice to the same queue Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 05/90] vxlan: fix use-after-free on deletion Greg Kroah-Hartman
                   ` (83 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Yuchung Cheng, Soheil Hassas Yeganeh,
	Neal Cardwell, Eric Dumazet, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yuchung Cheng <ycheng@google.com>


[ Upstream commit 44abafc4cc094214a99f860f778c48ecb23422fc ]

When the sender switches its congestion control during loss
recovery, if the recovery is spurious then it may incorrectly
revert cwnd and ssthresh to the older values set by a previous
congestion control. Consider a congestion control (like BBR)
that does not use ssthresh and keeps it infinite: the connection
may incorrectly revert cwnd to an infinite value when switching
from BBR to another congestion control.

This patch fixes it by disallowing such cwnd undo operation
upon switching congestion control.  Note that undo_marker
is not reset s.t. the packets that were incorrectly marked
lost would be corrected. We only avoid undoing the cwnd in
tcp_undo_cwnd_reduction().

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/tcp_cong.c |    1 +
 1 file changed, 1 insertion(+)

--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -183,6 +183,7 @@ void tcp_init_congestion_control(struct
 {
 	const struct inet_connection_sock *icsk = inet_csk(sk);
 
+	tcp_sk(sk)->prior_ssthresh = 0;
 	if (icsk->icsk_ca_ops->init)
 		icsk->icsk_ca_ops->init(sk);
 	if (tcp_ca_needs_ecn(sk))

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 05/90] vxlan: fix use-after-free on deletion
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (3 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 04/90] tcp: disallow cwnd undo when switching congestion control Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 06/90] ipv6: Fix leak in ipv6_gso_segment() Greg Kroah-Hartman
                   ` (82 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jiri Benc, Roi Dayan, Mark Bloch,
	Roopa Prabhu, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Bloch <markb@mellanox.com>


[ Upstream commit a53cb29b0af346af44e4abf13d7e59f807fba690 ]

Adding a vxlan interface to a socket isn't symmetrical, while adding
is done in vxlan_open() the deletion is done in vxlan_dellink().
This can cause a use-after-free error when we close the vxlan
interface before deleting it.

We add vxlan_vs_del_dev() to match vxlan_vs_add_dev() and call
it from vxlan_stop() to match the call from vxlan_open().

Fixes: 56ef9c909b40 ("vxlan: Move socket initialization to within rtnl scope")
Acked-by: Jiri Benc <jbenc@redhat.com>
Tested-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/vxlan.c |   19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -77,6 +77,8 @@ static const u8 all_zeros_mac[ETH_ALEN];
 
 static int vxlan_sock_add(struct vxlan_dev *vxlan);
 
+static void vxlan_vs_del_dev(struct vxlan_dev *vxlan);
+
 /* per-network namespace private data for this module */
 struct vxlan_net {
 	struct list_head  vxlan_list;
@@ -1052,6 +1054,8 @@ static void __vxlan_sock_release(struct
 
 static void vxlan_sock_release(struct vxlan_dev *vxlan)
 {
+	vxlan_vs_del_dev(vxlan);
+
 	__vxlan_sock_release(vxlan->vn4_sock);
 #if IS_ENABLED(CONFIG_IPV6)
 	__vxlan_sock_release(vxlan->vn6_sock);
@@ -2255,6 +2259,15 @@ static void vxlan_cleanup(unsigned long
 	mod_timer(&vxlan->age_timer, next_timer);
 }
 
+static void vxlan_vs_del_dev(struct vxlan_dev *vxlan)
+{
+	struct vxlan_net *vn = net_generic(vxlan->net, vxlan_net_id);
+
+	spin_lock(&vn->sock_lock);
+	hlist_del_init_rcu(&vxlan->hlist);
+	spin_unlock(&vn->sock_lock);
+}
+
 static void vxlan_vs_add_dev(struct vxlan_sock *vs, struct vxlan_dev *vxlan)
 {
 	struct vxlan_net *vn = net_generic(vxlan->net, vxlan_net_id);
@@ -3028,12 +3041,6 @@ static int vxlan_newlink(struct net *src
 static void vxlan_dellink(struct net_device *dev, struct list_head *head)
 {
 	struct vxlan_dev *vxlan = netdev_priv(dev);
-	struct vxlan_net *vn = net_generic(vxlan->net, vxlan_net_id);
-
-	spin_lock(&vn->sock_lock);
-	if (!hlist_unhashed(&vxlan->hlist))
-		hlist_del_rcu(&vxlan->hlist);
-	spin_unlock(&vn->sock_lock);
 
 	gro_cells_destroy(&vxlan->gro_cells);
 	list_del(&vxlan->next);

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 06/90] ipv6: Fix leak in ipv6_gso_segment().
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 05/90] vxlan: fix use-after-free on deletion Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 07/90] net: ping: do not abuse udp_poll() Greg Kroah-Hartman
                   ` (81 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Ben Hutchings, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <davem@davemloft.net>


[ Upstream commit e3e86b5119f81e5e2499bea7ea1ebe8ac6aab789 ]

If ip6_find_1stfragopt() fails and we return an error we have to free
up 'segs' because nobody else is going to.

Fixes: 2423496af35d ("ipv6: Prevent overrun when parsing v6 header options")
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/ip6_offload.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -121,8 +121,10 @@ static struct sk_buff *ipv6_gso_segment(
 
 		if (udpfrag) {
 			int err = ip6_find_1stfragopt(skb, &prevhdr);
-			if (err < 0)
+			if (err < 0) {
+				kfree_skb_list(segs);
 				return ERR_PTR(err);
+			}
 			fptr = (struct frag_hdr *)((u8 *)ipv6h + err);
 			fptr->frag_off = htons(offset);
 			if (skb->next)

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 07/90] net: ping: do not abuse udp_poll()
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (5 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 06/90] ipv6: Fix leak in ipv6_gso_segment() Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 08/90] net: ethoc: enable NAPI before poll may be scheduled Greg Kroah-Hartman
                   ` (80 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Sasha Levin,
	Solar Designer, Vasiliy Kulikov, Lorenzo Colitti,
	David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>


[ Upstream commit 77d4b1d36926a9b8387c6b53eeba42bcaaffcea3 ]

Alexander reported various KASAN messages triggered in recent kernels

The problem is that ping sockets should not use udp_poll() in the first
place, and recent changes in UDP stack finally exposed this old bug.

Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Fixes: 6d0bfe226116 ("net: ipv6: Add IPv6 support to the ping socket.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Acked-By: Lorenzo Colitti <lorenzo@google.com>
Tested-By: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/net/ipv6.h |    1 +
 net/ipv4/af_inet.c |    2 +-
 net/ipv6/ping.c    |    2 +-
 net/ipv6/raw.c     |    2 +-
 4 files changed, 4 insertions(+), 3 deletions(-)

--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -958,6 +958,7 @@ int inet6_hash_connect(struct inet_timew
  */
 extern const struct proto_ops inet6_stream_ops;
 extern const struct proto_ops inet6_dgram_ops;
+extern const struct proto_ops inet6_sockraw_ops;
 
 struct group_source_req;
 struct group_filter;
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1014,7 +1014,7 @@ static struct inet_protosw inetsw_array[
 		.type =       SOCK_DGRAM,
 		.protocol =   IPPROTO_ICMP,
 		.prot =       &ping_prot,
-		.ops =        &inet_dgram_ops,
+		.ops =        &inet_sockraw_ops,
 		.flags =      INET_PROTOSW_REUSE,
        },
 
--- a/net/ipv6/ping.c
+++ b/net/ipv6/ping.c
@@ -50,7 +50,7 @@ static struct inet_protosw pingv6_protos
 	.type =      SOCK_DGRAM,
 	.protocol =  IPPROTO_ICMPV6,
 	.prot =      &pingv6_prot,
-	.ops =       &inet6_dgram_ops,
+	.ops =       &inet6_sockraw_ops,
 	.flags =     INET_PROTOSW_REUSE,
 };
 
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -1303,7 +1303,7 @@ void raw6_proc_exit(void)
 #endif	/* CONFIG_PROC_FS */
 
 /* Same as inet6_dgram_ops, sans udp_poll.  */
-static const struct proto_ops inet6_sockraw_ops = {
+const struct proto_ops inet6_sockraw_ops = {
 	.family		   = PF_INET6,
 	.owner		   = THIS_MODULE,
 	.release	   = inet6_release,

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 08/90] net: ethoc: enable NAPI before poll may be scheduled
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (6 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 07/90] net: ping: do not abuse udp_poll() Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 09/90] net: bridge: start hello timer only if device is up Greg Kroah-Hartman
                   ` (79 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Max Filippov, Tobias Klauser,
	Florian Fainelli, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Max Filippov <jcmvbkbc@gmail.com>


[ Upstream commit d220b942a4b6a0640aee78841608f4aa5e8e185e ]

ethoc_reset enables device interrupts, ethoc_interrupt may schedule a
NAPI poll before NAPI is enabled in the ethoc_open, which results in
device being unable to send or receive anything until it's closed and
reopened. In case the device is flooded with ingress packets it may be
unable to recover at all.
Move napi_enable above ethoc_reset in the ethoc_open to fix that.

Fixes: a1702857724f ("net: Add support for the OpenCores 10/100 Mbps Ethernet MAC.")
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/ethoc.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/ethoc.c
+++ b/drivers/net/ethernet/ethoc.c
@@ -713,6 +713,8 @@ static int ethoc_open(struct net_device
 	if (ret)
 		return ret;
 
+	napi_enable(&priv->napi);
+
 	ethoc_init_ring(priv, dev->mem_start);
 	ethoc_reset(priv);
 
@@ -725,7 +727,6 @@ static int ethoc_open(struct net_device
 	}
 
 	phy_start(priv->phy);
-	napi_enable(&priv->napi);
 
 	if (netif_msg_ifup(priv)) {
 		dev_info(&dev->dev, "I/O: %08lx Memory: %08lx-%08lx\n",

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 09/90] net: bridge: start hello timer only if device is up
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (7 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 08/90] net: ethoc: enable NAPI before poll may be scheduled Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 10/90] sparc64: mm: fix copy_tsb to correctly copy huge page TSBs Greg Kroah-Hartman
                   ` (78 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Xin Long, Ivan Vecera, Sebastian Ott,
	Nikolay Aleksandrov, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>


[ Upstream commit aeb073241fe7a2b932e04e20c60e47718332877f ]

When the transition of NO_STP -> KERNEL_STP was fixed by always calling
mod_timer in br_stp_start, it introduced a new regression which causes
the timer to be armed even when the bridge is down, and since we stop
the timers in its ndo_stop() function, they never get disabled if the
device is destroyed before it's upped.

To reproduce:
$ while :; do ip l add br0 type bridge hello_time 100; brctl stp br0 on;
ip l del br0; done;

CC: Xin Long <lucien.xin@gmail.com>
CC: Ivan Vecera <cera@cera.cz>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Fixes: 6d18c732b95c ("bridge: start hello_timer when enabling KERNEL_STP in br_stp_start")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/bridge/br_stp_if.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -166,7 +166,8 @@ static void br_stp_start(struct net_brid
 		br_debug(br, "using kernel STP\n");
 
 		/* To start timers on any ports left in blocking */
-		mod_timer(&br->hello_timer, jiffies + br->hello_time);
+		if (br->dev->flags & IFF_UP)
+			mod_timer(&br->hello_timer, jiffies + br->hello_time);
 		br_port_state_selection(br);
 	}
 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 10/90] sparc64: mm: fix copy_tsb to correctly copy huge page TSBs
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (8 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 09/90] net: bridge: start hello timer only if device is up Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 11/90] sparc: Machine description indices can vary Greg Kroah-Hartman
                   ` (77 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Anthony Yznaga, Mike Kravetz,
	David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mike Kravetz <mike.kravetz@oracle.com>


[ Upstream commit 654f4807624a657f364417c2a7454f0df9961734 ]

When a TSB grows beyond its current capacity, a new TSB is allocated
and copy_tsb is called to copy entries from the old TSB to the new.
A hash shift based on page size is used to calculate the index of an
entry in the TSB.  copy_tsb has hard coded PAGE_SHIFT in these
calculations.  However, for huge page TSBs the value REAL_HPAGE_SHIFT
should be used.  As a result, when copy_tsb is called for a huge page
TSB the entries are placed at the incorrect index in the newly
allocated TSB.  When doing hardware table walk, the MMU does not
match these entries and we end up in the TSB miss handling code.
This code will then create and write an entry to the correct index
in the TSB.  We take a performance hit for the table walk miss and
recreation of these entries.

Pass a new parameter to copy_tsb that is the page size shift to be
used when copying the TSB.

Suggested-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/sparc/kernel/tsb.S |   11 +++++++----
 arch/sparc/mm/tsb.c     |    7 +++++--
 2 files changed, 12 insertions(+), 6 deletions(-)

--- a/arch/sparc/kernel/tsb.S
+++ b/arch/sparc/kernel/tsb.S
@@ -470,13 +470,16 @@ __tsb_context_switch:
 	.type	copy_tsb,#function
 copy_tsb:		/* %o0=old_tsb_base, %o1=old_tsb_size
 			 * %o2=new_tsb_base, %o3=new_tsb_size
+			 * %o4=page_size_shift
 			 */
 	sethi		%uhi(TSB_PASS_BITS), %g7
 	srlx		%o3, 4, %o3
-	add		%o0, %o1, %g1	/* end of old tsb */
+	add		%o0, %o1, %o1	/* end of old tsb */
 	sllx		%g7, 32, %g7
 	sub		%o3, 1, %o3	/* %o3 == new tsb hash mask */
 
+	mov		%o4, %g1	/* page_size_shift */
+
 661:	prefetcha	[%o0] ASI_N, #one_read
 	.section	.tsb_phys_patch, "ax"
 	.word		661b
@@ -501,9 +504,9 @@ copy_tsb:		/* %o0=old_tsb_base, %o1=old_
 	/* This can definitely be computed faster... */
 	srlx		%o0, 4, %o5	/* Build index */
 	and		%o5, 511, %o5	/* Mask index */
-	sllx		%o5, PAGE_SHIFT, %o5 /* Put into vaddr position */
+	sllx		%o5, %g1, %o5	/* Put into vaddr position */
 	or		%o4, %o5, %o4	/* Full VADDR. */
-	srlx		%o4, PAGE_SHIFT, %o4 /* Shift down to create index */
+	srlx		%o4, %g1, %o4	/* Shift down to create index */
 	and		%o4, %o3, %o4	/* Mask with new_tsb_nents-1 */
 	sllx		%o4, 4, %o4	/* Shift back up into tsb ent offset */
 	TSB_STORE(%o2 + %o4, %g2)	/* Store TAG */
@@ -511,7 +514,7 @@ copy_tsb:		/* %o0=old_tsb_base, %o1=old_
 	TSB_STORE(%o2 + %o4, %g3)	/* Store TTE */
 
 80:	add		%o0, 16, %o0
-	cmp		%o0, %g1
+	cmp		%o0, %o1
 	bne,pt		%xcc, 90b
 	 nop
 
--- a/arch/sparc/mm/tsb.c
+++ b/arch/sparc/mm/tsb.c
@@ -451,7 +451,8 @@ retry_tsb_alloc:
 		extern void copy_tsb(unsigned long old_tsb_base,
 				     unsigned long old_tsb_size,
 				     unsigned long new_tsb_base,
-				     unsigned long new_tsb_size);
+				     unsigned long new_tsb_size,
+				     unsigned long page_size_shift);
 		unsigned long old_tsb_base = (unsigned long) old_tsb;
 		unsigned long new_tsb_base = (unsigned long) new_tsb;
 
@@ -459,7 +460,9 @@ retry_tsb_alloc:
 			old_tsb_base = __pa(old_tsb_base);
 			new_tsb_base = __pa(new_tsb_base);
 		}
-		copy_tsb(old_tsb_base, old_size, new_tsb_base, new_size);
+		copy_tsb(old_tsb_base, old_size, new_tsb_base, new_size,
+			tsb_index == MM_TSB_BASE ?
+			PAGE_SHIFT : REAL_HPAGE_SHIFT);
 	}
 
 	mm->context.tsb_block[tsb_index].tsb = new_tsb;

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 11/90] sparc: Machine description indices can vary
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (9 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 10/90] sparc64: mm: fix copy_tsb to correctly copy huge page TSBs Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 12/90] sparc64: reset mm cpumask after wrap Greg Kroah-Hartman
                   ` (76 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, James Clarke, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: James Clarke <jrtc27@jrtc27.com>


[ Upstream commit c982aa9c304bf0b9a7522fd118fed4afa5a0263c ]

VIO devices were being looked up by their index in the machine
description node block, but this often varies over time as devices are
added and removed. Instead, store the ID and look up using the type,
config handle and ID.

Signed-off-by: James Clarke <jrtc27@jrtc27.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112541
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/sparc/include/asm/vio.h |    1 
 arch/sparc/kernel/vio.c      |   68 ++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 65 insertions(+), 4 deletions(-)

--- a/arch/sparc/include/asm/vio.h
+++ b/arch/sparc/include/asm/vio.h
@@ -327,6 +327,7 @@ struct vio_dev {
 	int			compat_len;
 
 	u64			dev_no;
+	u64			id;
 
 	unsigned long		channel_id;
 
--- a/arch/sparc/kernel/vio.c
+++ b/arch/sparc/kernel/vio.c
@@ -284,13 +284,16 @@ static struct vio_dev *vio_create_one(st
 	if (!id) {
 		dev_set_name(&vdev->dev, "%s", bus_id_name);
 		vdev->dev_no = ~(u64)0;
+		vdev->id = ~(u64)0;
 	} else if (!cfg_handle) {
 		dev_set_name(&vdev->dev, "%s-%llu", bus_id_name, *id);
 		vdev->dev_no = *id;
+		vdev->id = ~(u64)0;
 	} else {
 		dev_set_name(&vdev->dev, "%s-%llu-%llu", bus_id_name,
 			     *cfg_handle, *id);
 		vdev->dev_no = *cfg_handle;
+		vdev->id = *id;
 	}
 
 	vdev->dev.parent = parent;
@@ -333,27 +336,84 @@ static void vio_add(struct mdesc_handle
 	(void) vio_create_one(hp, node, &root_vdev->dev);
 }
 
+struct vio_md_node_query {
+	const char *type;
+	u64 dev_no;
+	u64 id;
+};
+
 static int vio_md_node_match(struct device *dev, void *arg)
 {
+	struct vio_md_node_query *query = (struct vio_md_node_query *) arg;
 	struct vio_dev *vdev = to_vio_dev(dev);
 
-	if (vdev->mp == (u64) arg)
-		return 1;
+	if (vdev->dev_no != query->dev_no)
+		return 0;
+	if (vdev->id != query->id)
+		return 0;
+	if (strcmp(vdev->type, query->type))
+		return 0;
 
-	return 0;
+	return 1;
 }
 
 static void vio_remove(struct mdesc_handle *hp, u64 node)
 {
+	const char *type;
+	const u64 *id, *cfg_handle;
+	u64 a;
+	struct vio_md_node_query query;
 	struct device *dev;
 
-	dev = device_find_child(&root_vdev->dev, (void *) node,
+	type = mdesc_get_property(hp, node, "device-type", NULL);
+	if (!type) {
+		type = mdesc_get_property(hp, node, "name", NULL);
+		if (!type)
+			type = mdesc_node_name(hp, node);
+	}
+
+	query.type = type;
+
+	id = mdesc_get_property(hp, node, "id", NULL);
+	cfg_handle = NULL;
+	mdesc_for_each_arc(a, hp, node, MDESC_ARC_TYPE_BACK) {
+		u64 target;
+
+		target = mdesc_arc_target(hp, a);
+		cfg_handle = mdesc_get_property(hp, target,
+						"cfg-handle", NULL);
+		if (cfg_handle)
+			break;
+	}
+
+	if (!id) {
+		query.dev_no = ~(u64)0;
+		query.id = ~(u64)0;
+	} else if (!cfg_handle) {
+		query.dev_no = *id;
+		query.id = ~(u64)0;
+	} else {
+		query.dev_no = *cfg_handle;
+		query.id = *id;
+	}
+
+	dev = device_find_child(&root_vdev->dev, &query,
 				vio_md_node_match);
 	if (dev) {
 		printk(KERN_INFO "VIO: Removing device %s\n", dev_name(dev));
 
 		device_unregister(dev);
 		put_device(dev);
+	} else {
+		if (!id)
+			printk(KERN_ERR "VIO: Removed unknown %s node.\n",
+			       type);
+		else if (!cfg_handle)
+			printk(KERN_ERR "VIO: Removed unknown %s node %llu.\n",
+			       type, *id);
+		else
+			printk(KERN_ERR "VIO: Removed unknown %s node %llu-%llu.\n",
+			       type, *cfg_handle, *id);
 	}
 }
 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 12/90] sparc64: reset mm cpumask after wrap
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (10 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 11/90] sparc: Machine description indices can vary Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 13/90] sparc64: combine activate_mm and switch_mm Greg Kroah-Hartman
                   ` (75 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Pavel Tatashin, Bob Picco,
	Steven Sistare, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pavel Tatashin <pasha.tatashin@oracle.com>


[ Upstream commit 588974857359861891f478a070b1dc7ae04a3880 ]

After a wrap (getting a new context version) a process must get a new
context id, which means that we would need to flush the context id from
the TLB before running for the first time with this ID on every CPU. But,
we use mm_cpumask to determine if this process has been running on this CPU
before, and this mask is not reset after a wrap. So, there are two possible
fixes for this issue:

1. Clear mm cpumask whenever mm gets a new context id
2. Unconditionally flush context every time process is running on a CPU

This patch implements the first solution

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/sparc/mm/init_64.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -708,6 +708,8 @@ void get_new_mmu_context(struct mm_struc
 			goto out;
 		}
 	}
+	if (mm->context.sparc64_ctx_val)
+		cpumask_clear(mm_cpumask(mm));
 	mmu_context_bmap[new_ctx>>6] |= (1UL << (new_ctx & 63));
 	new_ctx |= (tlb_context_cache & CTX_VERSION_MASK);
 out:

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 13/90] sparc64: combine activate_mm and switch_mm
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (11 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 12/90] sparc64: reset mm cpumask after wrap Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 14/90] sparc64: redefine first version Greg Kroah-Hartman
                   ` (74 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Pavel Tatashin, Bob Picco,
	Steven Sistare, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pavel Tatashin <pasha.tatashin@oracle.com>


[ Upstream commit 14d0334c6748ff2aedb3f2f7fdc51ee90a9b54e7 ]

The only difference between these two functions is that in activate_mm we
unconditionally flush context. However, there is no need to keep this
difference after fixing a bug where cpumask was not reset on a wrap. So, in
this patch we combine these.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/sparc/include/asm/mmu_context_64.h |   21 +--------------------
 1 file changed, 1 insertion(+), 20 deletions(-)

--- a/arch/sparc/include/asm/mmu_context_64.h
+++ b/arch/sparc/include/asm/mmu_context_64.h
@@ -131,26 +131,7 @@ static inline void switch_mm(struct mm_s
 }
 
 #define deactivate_mm(tsk,mm)	do { } while (0)
-
-/* Activate a new MM instance for the current task. */
-static inline void activate_mm(struct mm_struct *active_mm, struct mm_struct *mm)
-{
-	unsigned long flags;
-	int cpu;
-
-	spin_lock_irqsave(&mm->context.lock, flags);
-	if (!CTX_VALID(mm->context))
-		get_new_mmu_context(mm);
-	cpu = smp_processor_id();
-	if (!cpumask_test_cpu(cpu, mm_cpumask(mm)))
-		cpumask_set_cpu(cpu, mm_cpumask(mm));
-
-	load_secondary_context(mm);
-	__flush_tlb_mm(CTX_HWBITS(mm->context), SECONDARY_CONTEXT);
-	tsb_context_switch(mm);
-	spin_unlock_irqrestore(&mm->context.lock, flags);
-}
-
+#define activate_mm(active_mm, mm) switch_mm(active_mm, mm, NULL)
 #endif /* !(__ASSEMBLY__) */
 
 #endif /* !(__SPARC64_MMU_CONTEXT_H) */

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 14/90] sparc64: redefine first version
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (12 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 13/90] sparc64: combine activate_mm and switch_mm Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 15/90] sparc64: add per-cpu mm of secondary contexts Greg Kroah-Hartman
                   ` (73 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Pavel Tatashin, Bob Picco,
	Steven Sistare, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pavel Tatashin <pasha.tatashin@oracle.com>


[ Upstream commit c4415235b2be0cc791572e8e7f7466ab8f73a2bf ]

CTX_FIRST_VERSION defines the first context version, but also it defines
first context. This patch redefines it to only include the first context
version.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/sparc/include/asm/mmu_64.h |    2 +-
 arch/sparc/mm/init_64.c         |    6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/arch/sparc/include/asm/mmu_64.h
+++ b/arch/sparc/include/asm/mmu_64.h
@@ -52,7 +52,7 @@
 #define CTX_NR_MASK		TAG_CONTEXT_BITS
 #define CTX_HW_MASK		(CTX_NR_MASK | CTX_PGSZ_MASK)
 
-#define CTX_FIRST_VERSION	((_AC(1,UL) << CTX_VERSION_SHIFT) + _AC(1,UL))
+#define CTX_FIRST_VERSION	BIT(CTX_VERSION_SHIFT)
 #define CTX_VALID(__ctx)	\
 	 (!(((__ctx.sparc64_ctx_val) ^ tlb_context_cache) & CTX_VERSION_MASK))
 #define CTX_HWBITS(__ctx)	((__ctx.sparc64_ctx_val) & CTX_HW_MASK)
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -656,7 +656,7 @@ EXPORT_SYMBOL(__flush_dcache_range);
 
 /* get_new_mmu_context() uses "cache + 1".  */
 DEFINE_SPINLOCK(ctx_alloc_lock);
-unsigned long tlb_context_cache = CTX_FIRST_VERSION - 1;
+unsigned long tlb_context_cache = CTX_FIRST_VERSION;
 #define MAX_CTX_NR	(1UL << CTX_NR_BITS)
 #define CTX_BMAP_SLOTS	BITS_TO_LONGS(MAX_CTX_NR)
 DECLARE_BITMAP(mmu_context_bmap, MAX_CTX_NR);
@@ -687,9 +687,9 @@ void get_new_mmu_context(struct mm_struc
 		if (new_ctx >= ctx) {
 			int i;
 			new_ctx = (tlb_context_cache & CTX_VERSION_MASK) +
-				CTX_FIRST_VERSION;
+				CTX_FIRST_VERSION + 1;
 			if (new_ctx == 1)
-				new_ctx = CTX_FIRST_VERSION;
+				new_ctx = CTX_FIRST_VERSION + 1;
 
 			/* Don't call memset, for 16 entries that's just
 			 * plain silly...

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 15/90] sparc64: add per-cpu mm of secondary contexts
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (13 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 14/90] sparc64: redefine first version Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 16/90] sparc64: new context wrap Greg Kroah-Hartman
                   ` (72 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Pavel Tatashin, Bob Picco,
	Steven Sistare, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pavel Tatashin <pasha.tatashin@oracle.com>


[ Upstream commit 7a5b4bbf49fe86ce77488a70c5dccfe2d50d7a2d ]

The new wrap is going to use information from this array to figure out
mm's that currently have valid secondary contexts setup.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/sparc/include/asm/mmu_context_64.h |    5 +++--
 arch/sparc/mm/init_64.c                 |    1 +
 2 files changed, 4 insertions(+), 2 deletions(-)

--- a/arch/sparc/include/asm/mmu_context_64.h
+++ b/arch/sparc/include/asm/mmu_context_64.h
@@ -17,6 +17,7 @@ extern spinlock_t ctx_alloc_lock;
 extern unsigned long tlb_context_cache;
 extern unsigned long mmu_context_bmap[];
 
+DECLARE_PER_CPU(struct mm_struct *, per_cpu_secondary_mm);
 void get_new_mmu_context(struct mm_struct *mm);
 #ifdef CONFIG_SMP
 void smp_new_mmu_context_version(void);
@@ -74,8 +75,9 @@ void __flush_tlb_mm(unsigned long, unsig
 static inline void switch_mm(struct mm_struct *old_mm, struct mm_struct *mm, struct task_struct *tsk)
 {
 	unsigned long ctx_valid, flags;
-	int cpu;
+	int cpu = smp_processor_id();
 
+	per_cpu(per_cpu_secondary_mm, cpu) = mm;
 	if (unlikely(mm == &init_mm))
 		return;
 
@@ -121,7 +123,6 @@ static inline void switch_mm(struct mm_s
 	 * for the first time, we must flush that context out of the
 	 * local TLB.
 	 */
-	cpu = smp_processor_id();
 	if (!ctx_valid || !cpumask_test_cpu(cpu, mm_cpumask(mm))) {
 		cpumask_set_cpu(cpu, mm_cpumask(mm));
 		__flush_tlb_mm(CTX_HWBITS(mm->context),
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -660,6 +660,7 @@ unsigned long tlb_context_cache = CTX_FI
 #define MAX_CTX_NR	(1UL << CTX_NR_BITS)
 #define CTX_BMAP_SLOTS	BITS_TO_LONGS(MAX_CTX_NR)
 DECLARE_BITMAP(mmu_context_bmap, MAX_CTX_NR);
+DEFINE_PER_CPU(struct mm_struct *, per_cpu_secondary_mm) = {0};
 
 /* Caller does TLB context flushing on local CPU if necessary.
  * The caller also ensures that CTX_VALID(mm->context) is false.

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 16/90] sparc64: new context wrap
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (14 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 15/90] sparc64: add per-cpu mm of secondary contexts Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 17/90] sparc64: delete old wrap code Greg Kroah-Hartman
                   ` (71 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Pavel Tatashin, Bob Picco,
	Steven Sistare, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pavel Tatashin <pasha.tatashin@oracle.com>


[ Upstream commit a0582f26ec9dfd5360ea2f35dd9a1b026f8adda0 ]

The current wrap implementation has a race issue: it is called outside of
the ctx_alloc_lock, and also does not wait for all CPUs to complete the
wrap.  This means that a thread can get a new context with a new version
and another thread might still be running with the same context. The
problem is especially severe on CPUs with shared TLBs, like sun4v. I used
the following test to very quickly reproduce the problem:
- start over 8K processes (must be more than context IDs)
- write and read values at a  memory location in every process.

Very quickly memory corruptions start happening, and what we read back
does not equal what we wrote.

Several approaches were explored before settling on this one:

Approach 1:
Move smp_new_mmu_context_version() inside ctx_alloc_lock, and wait for
every process to complete the wrap. (Note: every CPU must WAIT before
leaving smp_new_mmu_context_version_client() until every one arrives).

This approach ends up with deadlocks, as some threads own locks which other
threads are waiting for, and they never receive softint until these threads
exit smp_new_mmu_context_version_client(). Since we do not allow the exit,
deadlock happens.

Approach 2:
Handle wrap right during mondo interrupt. Use etrap/rtrap to enter into
into C code, and issue new versions to every CPU.
This approach adds some overhead to runtime: in switch_mm() we must add
some checks to make sure that versions have not changed due to wrap while
we were loading the new secondary context. (could be protected by PSTATE_IE
but that degrades performance as on M7 and older CPUs as it takes 50 cycles
for each access). Also, we still need a global per-cpu array of MMs to know
where we need to load new contexts, otherwise we can change context to a
thread that is going way (if we received mondo between switch_mm() and
switch_to() time). Finally, there are some issues with window registers in
rtrap() when context IDs are changed during CPU mondo time.

The approach in this patch is the simplest and has almost no impact on
runtime.  We use the array with mm's where last secondary contexts were
loaded onto CPUs and bump their versions to the new generation without
changing context IDs. If a new process comes in to get a context ID, it
will go through get_new_mmu_context() because of version mismatch. But the
running processes do not need to be interrupted. And wrap is quicker as we
do not need to xcall and wait for everyone to receive and complete wrap.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/sparc/mm/init_64.c |   81 ++++++++++++++++++++++++++++++++----------------
 1 file changed, 54 insertions(+), 27 deletions(-)

--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -662,6 +662,53 @@ unsigned long tlb_context_cache = CTX_FI
 DECLARE_BITMAP(mmu_context_bmap, MAX_CTX_NR);
 DEFINE_PER_CPU(struct mm_struct *, per_cpu_secondary_mm) = {0};
 
+static void mmu_context_wrap(void)
+{
+	unsigned long old_ver = tlb_context_cache & CTX_VERSION_MASK;
+	unsigned long new_ver, new_ctx, old_ctx;
+	struct mm_struct *mm;
+	int cpu;
+
+	bitmap_zero(mmu_context_bmap, 1 << CTX_NR_BITS);
+
+	/* Reserve kernel context */
+	set_bit(0, mmu_context_bmap);
+
+	new_ver = (tlb_context_cache & CTX_VERSION_MASK) + CTX_FIRST_VERSION;
+	if (unlikely(new_ver == 0))
+		new_ver = CTX_FIRST_VERSION;
+	tlb_context_cache = new_ver;
+
+	/*
+	 * Make sure that any new mm that are added into per_cpu_secondary_mm,
+	 * are going to go through get_new_mmu_context() path.
+	 */
+	mb();
+
+	/*
+	 * Updated versions to current on those CPUs that had valid secondary
+	 * contexts
+	 */
+	for_each_online_cpu(cpu) {
+		/*
+		 * If a new mm is stored after we took this mm from the array,
+		 * it will go into get_new_mmu_context() path, because we
+		 * already bumped the version in tlb_context_cache.
+		 */
+		mm = per_cpu(per_cpu_secondary_mm, cpu);
+
+		if (unlikely(!mm || mm == &init_mm))
+			continue;
+
+		old_ctx = mm->context.sparc64_ctx_val;
+		if (likely((old_ctx & CTX_VERSION_MASK) == old_ver)) {
+			new_ctx = (old_ctx & ~CTX_VERSION_MASK) | new_ver;
+			set_bit(new_ctx & CTX_NR_MASK, mmu_context_bmap);
+			mm->context.sparc64_ctx_val = new_ctx;
+		}
+	}
+}
+
 /* Caller does TLB context flushing on local CPU if necessary.
  * The caller also ensures that CTX_VALID(mm->context) is false.
  *
@@ -676,50 +723,30 @@ void get_new_mmu_context(struct mm_struc
 {
 	unsigned long ctx, new_ctx;
 	unsigned long orig_pgsz_bits;
-	int new_version;
 
 	spin_lock(&ctx_alloc_lock);
+retry:
+	/* wrap might have happened, test again if our context became valid */
+	if (unlikely(CTX_VALID(mm->context)))
+		goto out;
 	orig_pgsz_bits = (mm->context.sparc64_ctx_val & CTX_PGSZ_MASK);
 	ctx = (tlb_context_cache + 1) & CTX_NR_MASK;
 	new_ctx = find_next_zero_bit(mmu_context_bmap, 1 << CTX_NR_BITS, ctx);
-	new_version = 0;
 	if (new_ctx >= (1 << CTX_NR_BITS)) {
 		new_ctx = find_next_zero_bit(mmu_context_bmap, ctx, 1);
 		if (new_ctx >= ctx) {
-			int i;
-			new_ctx = (tlb_context_cache & CTX_VERSION_MASK) +
-				CTX_FIRST_VERSION + 1;
-			if (new_ctx == 1)
-				new_ctx = CTX_FIRST_VERSION + 1;
-
-			/* Don't call memset, for 16 entries that's just
-			 * plain silly...
-			 */
-			mmu_context_bmap[0] = 3;
-			mmu_context_bmap[1] = 0;
-			mmu_context_bmap[2] = 0;
-			mmu_context_bmap[3] = 0;
-			for (i = 4; i < CTX_BMAP_SLOTS; i += 4) {
-				mmu_context_bmap[i + 0] = 0;
-				mmu_context_bmap[i + 1] = 0;
-				mmu_context_bmap[i + 2] = 0;
-				mmu_context_bmap[i + 3] = 0;
-			}
-			new_version = 1;
-			goto out;
+			mmu_context_wrap();
+			goto retry;
 		}
 	}
 	if (mm->context.sparc64_ctx_val)
 		cpumask_clear(mm_cpumask(mm));
 	mmu_context_bmap[new_ctx>>6] |= (1UL << (new_ctx & 63));
 	new_ctx |= (tlb_context_cache & CTX_VERSION_MASK);
-out:
 	tlb_context_cache = new_ctx;
 	mm->context.sparc64_ctx_val = new_ctx | orig_pgsz_bits;
+out:
 	spin_unlock(&ctx_alloc_lock);
-
-	if (unlikely(new_version))
-		smp_new_mmu_context_version();
 }
 
 static int numa_enabled = 1;

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 17/90] sparc64: delete old wrap code
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (15 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 16/90] sparc64: new context wrap Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 18/90] arch/sparc: support NR_CPUS = 4096 Greg Kroah-Hartman
                   ` (70 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Pavel Tatashin, Bob Picco,
	Steven Sistare, David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pavel Tatashin <pasha.tatashin@oracle.com>


[ Upstream commit 0197e41ce70511dc3b71f7fefa1a676e2b5cd60b ]

The old method that is using xcall and softint to get new context id is
deleted, as it is replaced by a method of using per_cpu_secondary_mm
without xcall to perform the context wrap.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/sparc/include/asm/mmu_context_64.h |    6 ------
 arch/sparc/include/asm/pil.h            |    1 -
 arch/sparc/kernel/kernel.h              |    1 -
 arch/sparc/kernel/smp_64.c              |   31 -------------------------------
 arch/sparc/kernel/ttable_64.S           |    2 +-
 arch/sparc/mm/ultra.S                   |    5 -----
 6 files changed, 1 insertion(+), 45 deletions(-)

--- a/arch/sparc/include/asm/mmu_context_64.h
+++ b/arch/sparc/include/asm/mmu_context_64.h
@@ -19,12 +19,6 @@ extern unsigned long mmu_context_bmap[];
 
 DECLARE_PER_CPU(struct mm_struct *, per_cpu_secondary_mm);
 void get_new_mmu_context(struct mm_struct *mm);
-#ifdef CONFIG_SMP
-void smp_new_mmu_context_version(void);
-#else
-#define smp_new_mmu_context_version() do { } while (0)
-#endif
-
 int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
 void destroy_context(struct mm_struct *mm);
 
--- a/arch/sparc/include/asm/pil.h
+++ b/arch/sparc/include/asm/pil.h
@@ -20,7 +20,6 @@
 #define PIL_SMP_CALL_FUNC	1
 #define PIL_SMP_RECEIVE_SIGNAL	2
 #define PIL_SMP_CAPTURE		3
-#define PIL_SMP_CTX_NEW_VERSION	4
 #define PIL_DEVICE_IRQ		5
 #define PIL_SMP_CALL_FUNC_SNGL	6
 #define PIL_DEFERRED_PCR_WORK	7
--- a/arch/sparc/kernel/kernel.h
+++ b/arch/sparc/kernel/kernel.h
@@ -37,7 +37,6 @@ void handle_stdfmna(struct pt_regs *regs
 /* smp_64.c */
 void __irq_entry smp_call_function_client(int irq, struct pt_regs *regs);
 void __irq_entry smp_call_function_single_client(int irq, struct pt_regs *regs);
-void __irq_entry smp_new_mmu_context_version_client(int irq, struct pt_regs *regs);
 void __irq_entry smp_penguin_jailcell(int irq, struct pt_regs *regs);
 void __irq_entry smp_receive_signal_client(int irq, struct pt_regs *regs);
 
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -959,37 +959,6 @@ void flush_dcache_page_all(struct mm_str
 	preempt_enable();
 }
 
-void __irq_entry smp_new_mmu_context_version_client(int irq, struct pt_regs *regs)
-{
-	struct mm_struct *mm;
-	unsigned long flags;
-
-	clear_softint(1 << irq);
-
-	/* See if we need to allocate a new TLB context because
-	 * the version of the one we are using is now out of date.
-	 */
-	mm = current->active_mm;
-	if (unlikely(!mm || (mm == &init_mm)))
-		return;
-
-	spin_lock_irqsave(&mm->context.lock, flags);
-
-	if (unlikely(!CTX_VALID(mm->context)))
-		get_new_mmu_context(mm);
-
-	spin_unlock_irqrestore(&mm->context.lock, flags);
-
-	load_secondary_context(mm);
-	__flush_tlb_mm(CTX_HWBITS(mm->context),
-		       SECONDARY_CONTEXT);
-}
-
-void smp_new_mmu_context_version(void)
-{
-	smp_cross_call(&xcall_new_mmu_context_version, 0, 0, 0);
-}
-
 #ifdef CONFIG_KGDB
 void kgdb_roundup_cpus(unsigned long flags)
 {
--- a/arch/sparc/kernel/ttable_64.S
+++ b/arch/sparc/kernel/ttable_64.S
@@ -50,7 +50,7 @@ tl0_resv03e:	BTRAP(0x3e) BTRAP(0x3f) BTR
 tl0_irq1:	TRAP_IRQ(smp_call_function_client, 1)
 tl0_irq2:	TRAP_IRQ(smp_receive_signal_client, 2)
 tl0_irq3:	TRAP_IRQ(smp_penguin_jailcell, 3)
-tl0_irq4:	TRAP_IRQ(smp_new_mmu_context_version_client, 4)
+tl0_irq4:       BTRAP(0x44)
 #else
 tl0_irq1:	BTRAP(0x41)
 tl0_irq2:	BTRAP(0x42)
--- a/arch/sparc/mm/ultra.S
+++ b/arch/sparc/mm/ultra.S
@@ -971,11 +971,6 @@ xcall_capture:
 	wr		%g0, (1 << PIL_SMP_CAPTURE), %set_softint
 	retry
 
-	.globl		xcall_new_mmu_context_version
-xcall_new_mmu_context_version:
-	wr		%g0, (1 << PIL_SMP_CTX_NEW_VERSION), %set_softint
-	retry
-
 #ifdef CONFIG_KGDB
 	.globl		xcall_kgdb_capture
 xcall_kgdb_capture:

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 18/90] arch/sparc: support NR_CPUS = 4096
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (16 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 17/90] sparc64: delete old wrap code Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 19/90] serial: ifx6x60: fix use-after-free on module unload Greg Kroah-Hartman
                   ` (69 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jane Chu, Bob Picco, Atish Patra,
	David S. Miller

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jane Chu <jane.chu@oracle.com>


[ Upstream commit c79a13734d104b5b147d7cb0870276ccdd660dae ]

Linux SPARC64 limits NR_CPUS to 4064 because init_cpu_send_mondo_info()
only allocates a single page for NR_CPUS mondo entries. Thus we cannot
use all 4096 CPUs on some SPARC platforms.

To fix, allocate (2^order) pages where order is set according to the size
of cpu_list for possible cpus. Since cpu_list_pa and cpu_mondo_block_pa
are not used in asm code, there are no imm13 offsets from the base PA
that will break because they can only reach one page.

Orabug: 25505750

Signed-off-by: Jane Chu <jane.chu@oracle.com>

Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Atish Patra <atish.patra@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/sparc/Kconfig         |    4 ++--
 arch/sparc/kernel/irq_64.c |   17 +++++++++++++----
 2 files changed, 15 insertions(+), 6 deletions(-)

--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -182,9 +182,9 @@ config NR_CPUS
 	int "Maximum number of CPUs"
 	depends on SMP
 	range 2 32 if SPARC32
-	range 2 1024 if SPARC64
+	range 2 4096 if SPARC64
 	default 32 if SPARC32
-	default 64 if SPARC64
+	default 4096 if SPARC64
 
 source kernel/Kconfig.hz
 
--- a/arch/sparc/kernel/irq_64.c
+++ b/arch/sparc/kernel/irq_64.c
@@ -1034,17 +1034,26 @@ static void __init init_cpu_send_mondo_i
 {
 #ifdef CONFIG_SMP
 	unsigned long page;
+	void *mondo, *p;
 
-	BUILD_BUG_ON((NR_CPUS * sizeof(u16)) > (PAGE_SIZE - 64));
+	BUILD_BUG_ON((NR_CPUS * sizeof(u16)) > PAGE_SIZE);
+
+	/* Make sure mondo block is 64byte aligned */
+	p = kzalloc(127, GFP_KERNEL);
+	if (!p) {
+		prom_printf("SUN4V: Error, cannot allocate mondo block.\n");
+		prom_halt();
+	}
+	mondo = (void *)(((unsigned long)p + 63) & ~0x3f);
+	tb->cpu_mondo_block_pa = __pa(mondo);
 
 	page = get_zeroed_page(GFP_KERNEL);
 	if (!page) {
-		prom_printf("SUN4V: Error, cannot allocate cpu mondo page.\n");
+		prom_printf("SUN4V: Error, cannot allocate cpu list page.\n");
 		prom_halt();
 	}
 
-	tb->cpu_mondo_block_pa = __pa(page);
-	tb->cpu_list_pa = __pa(page + 64);
+	tb->cpu_list_pa = __pa(page);
 #endif
 }
 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 19/90] serial: ifx6x60: fix use-after-free on module unload
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (17 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 18/90] arch/sparc: support NR_CPUS = 4096 Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 20/90] ptrace: Properly initialize ptracer_cred on fork Greg Kroah-Hartman
                   ` (68 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Jun Chen, Johan Hovold

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan@kernel.org>

commit 1e948479b3d63e3ac0ecca13cbf4921c7d17c168 upstream.

Make sure to deregister the SPI driver before releasing the tty driver
to avoid use-after-free in the SPI remove callback where the tty
devices are deregistered.

Fixes: 72d4724ea54c ("serial: ifx6x60: Add modem power off function in the platform reboot process")
Cc: Jun Chen <jun.d.chen@intel.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/tty/serial/ifx6x60.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/tty/serial/ifx6x60.c
+++ b/drivers/tty/serial/ifx6x60.c
@@ -1378,9 +1378,9 @@ static struct spi_driver ifx_spi_driver
 static void __exit ifx_spi_exit(void)
 {
 	/* unregister */
+	spi_unregister_driver(&ifx_spi_driver);
 	tty_unregister_driver(tty_drv);
 	put_tty_driver(tty_drv);
-	spi_unregister_driver(&ifx_spi_driver);
 	unregister_reboot_notifier(&ifx_modem_reboot_notifier_block);
 }
 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 20/90] ptrace: Properly initialize ptracer_cred on fork
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (18 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 19/90] serial: ifx6x60: fix use-after-free on module unload Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 21/90] KEYS: fix dereferencing NULL payload with nonzero length Greg Kroah-Hartman
                   ` (67 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Takashi Iwai, Eric W. Biederman

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric W. Biederman <ebiederm@xmission.com>

commit c70d9d809fdeecedb96972457ee45c49a232d97f upstream.

When I introduced ptracer_cred I failed to consider the weirdness of
fork where the task_struct copies the old value by default.  This
winds up leaving ptracer_cred set even when a process forks and
the child process does not wind up being ptraced.

Because ptracer_cred is not set on non-ptraced processes whose
parents were ptraced this has broken the ability of the enlightenment
window manager to start setuid children.

Fix this by properly initializing ptracer_cred in ptrace_init_task

This must be done with a little bit of care to preserve the current value
of ptracer_cred when ptrace carries through fork.  Re-reading the
ptracer_cred from the ptracing process at this point is inconsistent
with how PT_PTRACE_CAP has been maintained all of these years.

Tested-by: Takashi Iwai <tiwai@suse.de>
Fixes: 64b875f7ac8a ("ptrace: Capture the ptracer's creds not PT_PTRACE_CAP")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/linux/ptrace.h |    7 +++++--
 kernel/ptrace.c        |   20 +++++++++++++-------
 2 files changed, 18 insertions(+), 9 deletions(-)

--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -50,7 +50,8 @@ extern int ptrace_request(struct task_st
 			  unsigned long addr, unsigned long data);
 extern void ptrace_notify(int exit_code);
 extern void __ptrace_link(struct task_struct *child,
-			  struct task_struct *new_parent);
+			  struct task_struct *new_parent,
+			  const struct cred *ptracer_cred);
 extern void __ptrace_unlink(struct task_struct *child);
 extern void exit_ptrace(struct task_struct *tracer, struct list_head *dead);
 #define PTRACE_MODE_READ	0x01
@@ -202,7 +203,7 @@ static inline void ptrace_init_task(stru
 
 	if (unlikely(ptrace) && current->ptrace) {
 		child->ptrace = current->ptrace;
-		__ptrace_link(child, current->parent);
+		__ptrace_link(child, current->parent, current->ptracer_cred);
 
 		if (child->ptrace & PT_SEIZED)
 			task_set_jobctl_pending(child, JOBCTL_TRAP_STOP);
@@ -211,6 +212,8 @@ static inline void ptrace_init_task(stru
 
 		set_tsk_thread_flag(child, TIF_SIGPENDING);
 	}
+	else
+		child->ptracer_cred = NULL;
 }
 
 /**
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -28,19 +28,25 @@
 #include <linux/compat.h>
 
 
+void __ptrace_link(struct task_struct *child, struct task_struct *new_parent,
+		   const struct cred *ptracer_cred)
+{
+	BUG_ON(!list_empty(&child->ptrace_entry));
+	list_add(&child->ptrace_entry, &new_parent->ptraced);
+	child->parent = new_parent;
+	child->ptracer_cred = get_cred(ptracer_cred);
+}
+
 /*
  * ptrace a task: make the debugger its new parent and
  * move it to the ptrace list.
  *
  * Must be called with the tasklist lock write-held.
  */
-void __ptrace_link(struct task_struct *child, struct task_struct *new_parent)
+static void ptrace_link(struct task_struct *child, struct task_struct *new_parent)
 {
-	BUG_ON(!list_empty(&child->ptrace_entry));
-	list_add(&child->ptrace_entry, &new_parent->ptraced);
-	child->parent = new_parent;
 	rcu_read_lock();
-	child->ptracer_cred = get_cred(__task_cred(new_parent));
+	__ptrace_link(child, new_parent, __task_cred(new_parent));
 	rcu_read_unlock();
 }
 
@@ -353,7 +359,7 @@ static int ptrace_attach(struct task_str
 		flags |= PT_SEIZED;
 	task->ptrace = flags;
 
-	__ptrace_link(task, current);
+	ptrace_link(task, current);
 
 	/* SEIZE doesn't trap tracee on attach */
 	if (!seize)
@@ -420,7 +426,7 @@ static int ptrace_traceme(void)
 		 */
 		if (!ret && !(current->real_parent->flags & PF_EXITING)) {
 			current->ptrace = PT_PTRACED;
-			__ptrace_link(current, current->real_parent);
+			ptrace_link(current, current->real_parent);
 		}
 	}
 	write_unlock_irq(&tasklist_lock);

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 21/90] KEYS: fix dereferencing NULL payload with nonzero length
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (19 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 20/90] ptrace: Properly initialize ptracer_cred on fork Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 22/90] KEYS: fix freeing uninitialized memory in key_update() Greg Kroah-Hartman
                   ` (66 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Biggers, David Howells, James Morris

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Biggers <ebiggers@google.com>

commit 5649645d725c73df4302428ee4e02c869248b4c5 upstream.

sys_add_key() and the KEYCTL_UPDATE operation of sys_keyctl() allowed a
NULL payload with nonzero length to be passed to the key type's
->preparse(), ->instantiate(), and/or ->update() methods.  Various key
types including asymmetric, cifs.idmap, cifs.spnego, and pkcs7_test did
not handle this case, allowing an unprivileged user to trivially cause a
NULL pointer dereference (kernel oops) if one of these key types was
present.  Fix it by doing the copy_from_user() when 'plen' is nonzero
rather than when '_payload' is non-NULL, causing the syscall to fail
with EFAULT as expected when an invalid buffer is specified.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 security/keys/keyctl.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -97,7 +97,7 @@ SYSCALL_DEFINE5(add_key, const char __us
 	/* pull the payload in if one was supplied */
 	payload = NULL;
 
-	if (_payload) {
+	if (plen) {
 		ret = -ENOMEM;
 		payload = kmalloc(plen, GFP_KERNEL | __GFP_NOWARN);
 		if (!payload) {
@@ -327,7 +327,7 @@ long keyctl_update_key(key_serial_t id,
 
 	/* pull the payload in if one was supplied */
 	payload = NULL;
-	if (_payload) {
+	if (plen) {
 		ret = -ENOMEM;
 		payload = kmalloc(plen, GFP_KERNEL);
 		if (!payload)

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 22/90] KEYS: fix freeing uninitialized memory in key_update()
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (20 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 21/90] KEYS: fix dereferencing NULL payload with nonzero length Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 23/90] crypto: gcm - wait for crypto op not signal safe Greg Kroah-Hartman
                   ` (65 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Biggers, David Howells, James Morris

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Biggers <ebiggers@google.com>

commit 63a0b0509e700717a59f049ec6e4e04e903c7fe2 upstream.

key_update() freed the key_preparsed_payload even if it was not
initialized first.  This would cause a crash if userspace called
keyctl_update() on a key with type like "asymmetric" that has a
->preparse() method but not an ->update() method.  Possibly it could
even be triggered for other key types by racing with keyctl_setperm() to
make the KEY_NEED_WRITE check fail (the permission was already checked,
so normally it wouldn't fail there).

Reproducer with key type "asymmetric", given a valid cert.der:

keyctl new_session
keyid=$(keyctl padd asymmetric desc @s < cert.der)
keyctl setperm $keyid 0x3f000000
keyctl update $keyid data

[  150.686666] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
[  150.687601] IP: asymmetric_key_free_kids+0x12/0x30
[  150.688139] PGD 38a3d067
[  150.688141] PUD 3b3de067
[  150.688447] PMD 0
[  150.688745]
[  150.689160] Oops: 0000 [#1] SMP
[  150.689455] Modules linked in:
[  150.689769] CPU: 1 PID: 2478 Comm: keyctl Not tainted 4.11.0-rc4-xfstests-00187-ga9f6b6b8cd2f #742
[  150.690916] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014
[  150.692199] task: ffff88003b30c480 task.stack: ffffc90000350000
[  150.692952] RIP: 0010:asymmetric_key_free_kids+0x12/0x30
[  150.693556] RSP: 0018:ffffc90000353e58 EFLAGS: 00010202
[  150.694142] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000004
[  150.694845] RDX: ffffffff81ee3920 RSI: ffff88003d4b0700 RDI: 0000000000000001
[  150.697569] RBP: ffffc90000353e60 R08: ffff88003d5d2140 R09: 0000000000000000
[  150.702483] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
[  150.707393] R13: 0000000000000004 R14: ffff880038a4d2d8 R15: 000000000040411f
[  150.709720] FS:  00007fcbcee35700(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000
[  150.711504] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  150.712733] CR2: 0000000000000001 CR3: 0000000039eab000 CR4: 00000000003406e0
[  150.714487] Call Trace:
[  150.714975]  asymmetric_key_free_preparse+0x2f/0x40
[  150.715907]  key_update+0xf7/0x140
[  150.716560]  ? key_default_cmp+0x20/0x20
[  150.717319]  keyctl_update_key+0xb0/0xe0
[  150.718066]  SyS_keyctl+0x109/0x130
[  150.718663]  entry_SYSCALL_64_fastpath+0x1f/0xc2
[  150.719440] RIP: 0033:0x7fcbce75ff19
[  150.719926] RSP: 002b:00007ffd5d167088 EFLAGS: 00000206 ORIG_RAX: 00000000000000fa
[  150.720918] RAX: ffffffffffffffda RBX: 0000000000404d80 RCX: 00007fcbce75ff19
[  150.721874] RDX: 00007ffd5d16785e RSI: 000000002866cd36 RDI: 0000000000000002
[  150.722827] RBP: 0000000000000006 R08: 000000002866cd36 R09: 00007ffd5d16785e
[  150.723781] R10: 0000000000000004 R11: 0000000000000206 R12: 0000000000404d80
[  150.724650] R13: 00007ffd5d16784d R14: 00007ffd5d167238 R15: 000000000040411f
[  150.725447] Code: 83 c4 08 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 85 ff 74 23 55 48 89 e5 53 48 89 fb <48> 8b 3f e8 06 21 c5 ff 48 8b 7b 08 e8 fd 20 c5 ff 48 89 df e8
[  150.727489] RIP: asymmetric_key_free_kids+0x12/0x30 RSP: ffffc90000353e58
[  150.728117] CR2: 0000000000000001
[  150.728430] ---[ end trace f7f8fe1da2d5ae8d ]---

Fixes: 4d8c0250b841 ("KEYS: Call ->free_preparse() even after ->preparse() returns an error")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 security/keys/key.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -934,12 +934,11 @@ int key_update(key_ref_t key_ref, const
 	/* the key must be writable */
 	ret = key_permission(key_ref, KEY_NEED_WRITE);
 	if (ret < 0)
-		goto error;
+		return ret;
 
 	/* attempt to update it if supported */
-	ret = -EOPNOTSUPP;
 	if (!key->type->update)
-		goto error;
+		return -EOPNOTSUPP;
 
 	memset(&prep, 0, sizeof(prep));
 	prep.data = payload;

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 23/90] crypto: gcm - wait for crypto op not signal safe
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (21 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 22/90] KEYS: fix freeing uninitialized memory in key_update() Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 25/90] nfsd4: fix null dereference on replay Greg Kroah-Hartman
                   ` (64 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Biggers, Gilad Ben-Yossef, Herbert Xu

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Gilad Ben-Yossef <gilad@benyossef.com>

commit f3ad587070d6bd961ab942b3fd7a85d00dfc934b upstream.

crypto_gcm_setkey() was using wait_for_completion_interruptible() to
wait for completion of async crypto op but if a signal occurs it
may return before DMA ops of HW crypto provider finish, thus
corrupting the data buffer that is kfree'ed in this case.

Resolve this by using wait_for_completion() instead.

Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 crypto/gcm.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/crypto/gcm.c
+++ b/crypto/gcm.c
@@ -152,10 +152,8 @@ static int crypto_gcm_setkey(struct cryp
 
 	err = crypto_ablkcipher_encrypt(&data->req);
 	if (err == -EINPROGRESS || err == -EBUSY) {
-		err = wait_for_completion_interruptible(
-			&data->result.completion);
-		if (!err)
-			err = data->result.err;
+		wait_for_completion(&data->result.completion);
+		err = data->result.err;
 	}
 
 	if (err)

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 25/90] nfsd4: fix null dereference on replay
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (22 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 23/90] crypto: gcm - wait for crypto op not signal safe Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 26/90] nfsd: Fix up the "supattr_exclcreat" attributes Greg Kroah-Hartman
                   ` (63 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Scott Mayhew, J. Bruce Fields

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: J. Bruce Fields <bfields@redhat.com>

commit 9a307403d374b993061f5992a6e260c944920d0b upstream.

if we receive a compound such that:

	- the sessionid, slot, and sequence number in the SEQUENCE op
	  match a cached succesful reply with N ops, and
	- the Nth operation of the compound is a PUTFH, PUTPUBFH,
	  PUTROOTFH, or RESTOREFH,

then nfsd4_sequence will return 0 and set cstate->status to
nfserr_replay_cache.  The current filehandle will not be set.  This will
cause us to call check_nfsd_access with first argument NULL.

To nfsd4_compound it looks like we just succesfully executed an
operation that set a filehandle, but the current filehandle is not set.

Fix this by moving the nfserr_replay_cache earlier.  There was never any
reason to have it after the encode_op label, since the only case where
he hit that is when opdesc->op_func sets it.

Note that there are two ways we could hit this case:

	- a client is resending a previously sent compound that ended
	  with one of the four PUTFH-like operations, or
	- a client is sending a *new* compound that (incorrectly) shares
	  sessionid, slot, and sequence number with a previously sent
	  compound, and the length of the previously sent compound
	  happens to match the position of a PUTFH-like operation in the
	  new compound.

The second is obviously incorrect client behavior.  The first is also
very strange--the only purpose of a PUTFH-like operation is to set the
current filehandle to be used by the following operation, so there's no
point in having it as the last in a compound.

So it's likely this requires a buggy or malicious client to reproduce.

Reported-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/nfsd/nfs4proc.c |   13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1690,6 +1690,12 @@ nfsd4_proc_compound(struct svc_rqst *rqs
 			opdesc->op_get_currentstateid(cstate, &op->u);
 		op->status = opdesc->op_func(rqstp, cstate, &op->u);
 
+		/* Only from SEQUENCE */
+		if (cstate->status == nfserr_replay_cache) {
+			dprintk("%s NFS4.1 replay from cache\n", __func__);
+			status = op->status;
+			goto out;
+		}
 		if (!op->status) {
 			if (opdesc->op_set_currentstateid)
 				opdesc->op_set_currentstateid(cstate, &op->u);
@@ -1700,14 +1706,7 @@ nfsd4_proc_compound(struct svc_rqst *rqs
 			if (need_wrongsec_check(rqstp))
 				op->status = check_nfsd_access(current_fh->fh_export, rqstp);
 		}
-
 encode_op:
-		/* Only from SEQUENCE */
-		if (cstate->status == nfserr_replay_cache) {
-			dprintk("%s NFS4.1 replay from cache\n", __func__);
-			status = op->status;
-			goto out;
-		}
 		if (op->status == nfserr_replay_me) {
 			op->replay = &cstate->replay_owner->so_replay;
 			nfsd4_encode_replay(&resp->xdr, op);

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 26/90] nfsd: Fix up the "supattr_exclcreat" attributes
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (23 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 25/90] nfsd4: fix null dereference on replay Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 29/90] arm: KVM: Allow unaligned accesses at HYP Greg Kroah-Hartman
                   ` (62 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Trond Myklebust, J. Bruce Fields

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Trond Myklebust <trond.myklebust@primarydata.com>

commit b26b78cb726007533d81fdf90a62e915002ef5c8 upstream.

If an NFSv4 client asks us for the supattr_exclcreat, then we must
not return attributes that are unsupported by this minor version.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Fixes: 75976de6556f ("NFSD: Return word2 bitmask if setting security..,")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/nfsd/nfs4xdr.c |   13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -2753,9 +2753,16 @@ out_acl:
 	}
 #endif /* CONFIG_NFSD_PNFS */
 	if (bmval2 & FATTR4_WORD2_SUPPATTR_EXCLCREAT) {
-		status = nfsd4_encode_bitmap(xdr, NFSD_SUPPATTR_EXCLCREAT_WORD0,
-						  NFSD_SUPPATTR_EXCLCREAT_WORD1,
-						  NFSD_SUPPATTR_EXCLCREAT_WORD2);
+		u32 supp[3];
+
+		supp[0] = nfsd_suppattrs0(minorversion);
+		supp[1] = nfsd_suppattrs1(minorversion);
+		supp[2] = nfsd_suppattrs2(minorversion);
+		supp[0] &= NFSD_SUPPATTR_EXCLCREAT_WORD0;
+		supp[1] &= NFSD_SUPPATTR_EXCLCREAT_WORD1;
+		supp[2] &= NFSD_SUPPATTR_EXCLCREAT_WORD2;
+
+		status = nfsd4_encode_bitmap(xdr, supp[0], supp[1], supp[2]);
 		if (status)
 			goto out;
 	}

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 29/90] arm: KVM: Allow unaligned accesses at HYP
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (24 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 26/90] nfsd: Fix up the "supattr_exclcreat" attributes Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 31/90] dmaengine: usb-dmac: Fix DMAOR AE bit definition Greg Kroah-Hartman
                   ` (61 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Marc Zyngier, Christoffer Dall

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Marc Zyngier <marc.zyngier@arm.com>

commit 33b5c38852b29736f3b472dd095c9a18ec22746f upstream.

We currently have the HSCTLR.A bit set, trapping unaligned accesses
at HYP, but we're not really prepared to deal with it.

Since the rest of the kernel is pretty happy about that, let's follow
its example and set HSCTLR.A to zero. Modern CPUs don't really care.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arm/kvm/init.S |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/arch/arm/kvm/init.S
+++ b/arch/arm/kvm/init.S
@@ -110,7 +110,6 @@ __do_hyp_init:
 	@  - Write permission implies XN: disabled
 	@  - Instruction cache: enabled
 	@  - Data/Unified cache: enabled
-	@  - Memory alignment checks: enabled
 	@  - MMU: enabled (this code must be run from an identity mapping)
 	mrc	p15, 4, r0, c1, c0, 0	@ HSCR
 	ldr	r2, =HSCTLR_MASK
@@ -118,8 +117,8 @@ __do_hyp_init:
 	mrc	p15, 0, r1, c1, c0, 0	@ SCTLR
 	ldr	r2, =(HSCTLR_EE | HSCTLR_FI | HSCTLR_I | HSCTLR_C)
 	and	r1, r1, r2
- ARM(	ldr	r2, =(HSCTLR_M | HSCTLR_A)			)
- THUMB(	ldr	r2, =(HSCTLR_M | HSCTLR_A | HSCTLR_TE)		)
+ ARM(	ldr	r2, =(HSCTLR_M)					)
+ THUMB(	ldr	r2, =(HSCTLR_M | HSCTLR_TE)			)
 	orr	r1, r1, r2
 	orr	r0, r0, r1
 	isb

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 31/90] dmaengine: usb-dmac: Fix DMAOR AE bit definition
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (25 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 29/90] arm: KVM: Allow unaligned accesses at HYP Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 32/90] dmaengine: ep93xx: Always start from BASE0 Greg Kroah-Hartman
                   ` (60 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hiroyuki Yokoyama, Yoshihiro Shimoda,
	Geert Uytterhoeven, Vinod Koul

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>

commit 9a445bbb1607d9f14556a532453dd86d1b7e381e upstream.

This patch fixes the register definition of AE (Address Error flag) bit.

Fixes: 0c1c8ff32fa2 ("dmaengine: usb-dmac: Add Renesas USB DMA Controller (USB-DMAC) driver")
Signed-off-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>
[Shimoda: add Fixes and Cc tags in the commit log]
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/dma/sh/usb-dmac.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/dma/sh/usb-dmac.c
+++ b/drivers/dma/sh/usb-dmac.c
@@ -117,7 +117,7 @@ struct usb_dmac {
 #define USB_DMASWR			0x0008
 #define USB_DMASWR_SWR			(1 << 0)
 #define USB_DMAOR			0x0060
-#define USB_DMAOR_AE			(1 << 2)
+#define USB_DMAOR_AE			(1 << 1)
 #define USB_DMAOR_DME			(1 << 0)
 
 #define USB_DMASAR			0x0000

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 32/90] dmaengine: ep93xx: Always start from BASE0
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (26 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 31/90] dmaengine: usb-dmac: Fix DMAOR AE bit definition Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 33/90] xen/privcmd: Support correctly 64KB page granularity when mapping memory Greg Kroah-Hartman
                   ` (59 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Alexander Sverdlin, Vinod Koul

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alexander Sverdlin <alexander.sverdlin@gmail.com>

commit 0037ae47812b1f431cc602100d1d51f37d77b61e upstream.

The current buffer is being reset to zero on device_free_chan_resources()
but not on device_terminate_all(). It could happen that HW is restarted and
expects BASE0 to be used, but the driver is not synchronized and will start
from BASE1. One solution is to reset the buffer explicitly in
m2p_hw_setup().

Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/dma/ep93xx_dma.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/dma/ep93xx_dma.c
+++ b/drivers/dma/ep93xx_dma.c
@@ -325,6 +325,8 @@ static int m2p_hw_setup(struct ep93xx_dm
 		| M2P_CONTROL_ENABLE;
 	m2p_set_control(edmac, control);
 
+	edmac->buffer = 0;
+
 	return 0;
 }
 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 33/90] xen/privcmd: Support correctly 64KB page granularity when mapping memory
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (27 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 32/90] dmaengine: ep93xx: Always start from BASE0 Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 34/90] xen-netfront: do not cast grant table reference to signed short Greg Kroah-Hartman
                   ` (58 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Feng Kan, Julien Grall,
	Boris Ostrovsky, Juergen Gross

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Julien Grall <julien.grall@arm.com>

commit 753c09b5652bb4fe53e2db648002ec64b32b8827 upstream.

Commit 5995a68 "xen/privcmd: Add support for Linux 64KB page granularity" did
not go far enough to support 64KB in mmap_batch_fn.

The variable 'nr' is the number of 4KB chunk to map. However, when Linux
is using 64KB page granularity the array of pages (vma->vm_private_data)
contain one page per 64KB. Fix it by incrementing st->index correctly.

Furthermore, st->va is not correctly incremented as PAGE_SIZE !=
XEN_PAGE_SIZE.

Fixes: 5995a68 ("xen/privcmd: Add support for Linux 64KB page granularity")
Reported-by: Feng Kan <fkan@apm.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/xen/privcmd.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -335,8 +335,8 @@ static int mmap_batch_fn(void *data, int
 				st->global_error = 1;
 		}
 	}
-	st->va += PAGE_SIZE * nr;
-	st->index += nr;
+	st->va += XEN_PAGE_SIZE * nr;
+	st->index += nr / XEN_PFN_PER_PAGE;
 
 	return 0;
 }

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 34/90] xen-netfront: do not cast grant table reference to signed short
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (28 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 33/90] xen/privcmd: Support correctly 64KB page granularity when mapping memory Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 35/90] xen-netfront: cast grant table reference first to type int Greg Kroah-Hartman
                   ` (57 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dongli Zhang, David S. Miller, Blake Cooper

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dongli Zhang <dongli.zhang@oracle.com>

commit 87557efc27f6a50140fb20df06a917f368ce3c66 upstream.

While grant reference is of type uint32_t, xen-netfront erroneously casts
it to signed short in BUG_ON().

This would lead to the xen domU panic during boot-up or migration when it
is attached with lots of paravirtual devices.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Blake Cooper <blake.cooper@braintreepayments.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/xen-netfront.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -304,7 +304,7 @@ static void xennet_alloc_rx_buffers(stru
 		queue->rx_skbs[id] = skb;
 
 		ref = gnttab_claim_grant_reference(&queue->gref_rx_head);
-		BUG_ON((signed short)ref < 0);
+		WARN_ON_ONCE(IS_ERR_VALUE((unsigned long)ref));
 		queue->grant_rx_ref[id] = ref;
 
 		page = skb_frag_page(&skb_shinfo(skb)->frags[0]);
@@ -437,7 +437,7 @@ static void xennet_tx_setup_grant(unsign
 	id = get_id_from_freelist(&queue->tx_skb_freelist, queue->tx_skbs);
 	tx = RING_GET_REQUEST(&queue->tx, queue->tx.req_prod_pvt++);
 	ref = gnttab_claim_grant_reference(&queue->gref_tx_head);
-	BUG_ON((signed short)ref < 0);
+	WARN_ON_ONCE(IS_ERR_VALUE((unsigned long)ref));
 
 	gnttab_grant_foreign_access_ref(ref, queue->info->xbdev->otherend_id,
 					gfn, GNTMAP_readonly);

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 35/90] xen-netfront: cast grant table reference first to type int
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (29 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 34/90] xen-netfront: do not cast grant table reference to signed short Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 36/90] ext4: fix SEEK_HOLE Greg Kroah-Hartman
                   ` (56 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dongli Zhang, David S. Miller, Blake Cooper

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dongli Zhang <dongli.zhang@oracle.com>

commit 269ebce4531b8edc4224259a02143181a1c1d77c upstream.

IS_ERR_VALUE() in commit 87557efc27f6a50140fb20df06a917f368ce3c66
("xen-netfront: do not cast grant table reference to signed short") would
not return true for error code unless we cast ref first to type int.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Blake Cooper <blake.cooper@braintreepayments.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/xen-netfront.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -304,7 +304,7 @@ static void xennet_alloc_rx_buffers(stru
 		queue->rx_skbs[id] = skb;
 
 		ref = gnttab_claim_grant_reference(&queue->gref_rx_head);
-		WARN_ON_ONCE(IS_ERR_VALUE((unsigned long)ref));
+		WARN_ON_ONCE(IS_ERR_VALUE((unsigned long)(int)ref));
 		queue->grant_rx_ref[id] = ref;
 
 		page = skb_frag_page(&skb_shinfo(skb)->frags[0]);
@@ -437,7 +437,7 @@ static void xennet_tx_setup_grant(unsign
 	id = get_id_from_freelist(&queue->tx_skb_freelist, queue->tx_skbs);
 	tx = RING_GET_REQUEST(&queue->tx, queue->tx.req_prod_pvt++);
 	ref = gnttab_claim_grant_reference(&queue->gref_tx_head);
-	WARN_ON_ONCE(IS_ERR_VALUE((unsigned long)ref));
+	WARN_ON_ONCE(IS_ERR_VALUE((unsigned long)(int)ref));
 
 	gnttab_grant_foreign_access_ref(ref, queue->info->xbdev->otherend_id,
 					gfn, GNTMAP_readonly);

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 36/90] ext4: fix SEEK_HOLE
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (30 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 35/90] xen-netfront: cast grant table reference first to type int Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 37/90] ext4: keep existing extra fields when inode expands Greg Kroah-Hartman
                   ` (55 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Zheng Liu, Jan Kara, Theodore Tso

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jan Kara <jack@suse.cz>

commit 7d95eddf313c88b24f99d4ca9c2411a4b82fef33 upstream.

Currently, SEEK_HOLE implementation in ext4 may both return that there's
a hole at some offset although that offset already has data and skip
some holes during a search for the next hole. The first problem is
demostrated by:

xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "seek -h 0" file
wrote 57344/57344 bytes at offset 0
56 KiB, 14 ops; 0.0000 sec (2.054 GiB/sec and 538461.5385 ops/sec)
Whence	Result
HOLE	0

Where we can see that SEEK_HOLE wrongly returned offset 0 as containing
a hole although we have written data there. The second problem can be
demonstrated by:

xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k"
       -c "seek -h 0" file

wrote 57344/57344 bytes at offset 0
56 KiB, 14 ops; 0.0000 sec (1.978 GiB/sec and 518518.5185 ops/sec)
wrote 8192/8192 bytes at offset 131072
8 KiB, 2 ops; 0.0000 sec (2 GiB/sec and 500000.0000 ops/sec)
Whence	Result
HOLE	139264

Where we can see that hole at offsets 56k..128k has been ignored by the
SEEK_HOLE call.

The underlying problem is in the ext4_find_unwritten_pgoff() which is
just buggy. In some cases it fails to update returned offset when it
finds a hole (when no pages are found or when the first found page has
higher index than expected), in some cases conditions for detecting hole
are just missing (we fail to detect a situation where indices of
returned pages are not contiguous).

Fix ext4_find_unwritten_pgoff() to properly detect non-contiguous page
indices and also handle all cases where we got less pages then expected
in one place and handle it properly there.

Fixes: c8c0df241cc2719b1262e627f999638411934f60
CC: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/ext4/file.c |   50 ++++++++++++++------------------------------------
 1 file changed, 14 insertions(+), 36 deletions(-)

--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -463,47 +463,27 @@ static int ext4_find_unwritten_pgoff(str
 		num = min_t(pgoff_t, end - index, PAGEVEC_SIZE);
 		nr_pages = pagevec_lookup(&pvec, inode->i_mapping, index,
 					  (pgoff_t)num);
-		if (nr_pages == 0) {
-			if (whence == SEEK_DATA)
-				break;
-
-			BUG_ON(whence != SEEK_HOLE);
-			/*
-			 * If this is the first time to go into the loop and
-			 * offset is not beyond the end offset, it will be a
-			 * hole at this offset
-			 */
-			if (lastoff == startoff || lastoff < endoff)
-				found = 1;
+		if (nr_pages == 0)
 			break;
-		}
-
-		/*
-		 * If this is the first time to go into the loop and
-		 * offset is smaller than the first page offset, it will be a
-		 * hole at this offset.
-		 */
-		if (lastoff == startoff && whence == SEEK_HOLE &&
-		    lastoff < page_offset(pvec.pages[0])) {
-			found = 1;
-			break;
-		}
 
 		for (i = 0; i < nr_pages; i++) {
 			struct page *page = pvec.pages[i];
 			struct buffer_head *bh, *head;
 
 			/*
-			 * If the current offset is not beyond the end of given
-			 * range, it will be a hole.
+			 * If current offset is smaller than the page offset,
+			 * there is a hole at this offset.
 			 */
-			if (lastoff < endoff && whence == SEEK_HOLE &&
-			    page->index > end) {
+			if (whence == SEEK_HOLE && lastoff < endoff &&
+			    lastoff < page_offset(pvec.pages[i])) {
 				found = 1;
 				*offset = lastoff;
 				goto out;
 			}
 
+			if (page->index > end)
+				goto out;
+
 			lock_page(page);
 
 			if (unlikely(page->mapping != inode->i_mapping)) {
@@ -543,20 +523,18 @@ static int ext4_find_unwritten_pgoff(str
 			unlock_page(page);
 		}
 
-		/*
-		 * The no. of pages is less than our desired, that would be a
-		 * hole in there.
-		 */
-		if (nr_pages < num && whence == SEEK_HOLE) {
-			found = 1;
-			*offset = lastoff;
+		/* The no. of pages is less than our desired, we are done. */
+		if (nr_pages < num)
 			break;
-		}
 
 		index = pvec.pages[i - 1]->index + 1;
 		pagevec_release(&pvec);
 	} while (index <= end);
 
+	if (whence == SEEK_HOLE && lastoff < endoff) {
+		found = 1;
+		*offset = lastoff;
+	}
 out:
 	pagevec_release(&pvec);
 	return found;

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 37/90] ext4: keep existing extra fields when inode expands
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (31 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 36/90] ext4: fix SEEK_HOLE Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 38/90] ext4: fix fdatasync(2) after extent manipulation operations Greg Kroah-Hartman
                   ` (54 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Konstantin Khlebnikov, Theodore Tso

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

commit 887a9730614727c4fff7cb756711b190593fc1df upstream.

ext4_expand_extra_isize() should clear only space between old and new
size.

Fixes: 6dd4ee7cab7e # v2.6.23
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/ext4/inode.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5162,8 +5162,9 @@ static int ext4_expand_extra_isize(struc
 	/* No extended attributes present */
 	if (!ext4_test_inode_state(inode, EXT4_STATE_XATTR) ||
 	    header->h_magic != cpu_to_le32(EXT4_XATTR_MAGIC)) {
-		memset((void *)raw_inode + EXT4_GOOD_OLD_INODE_SIZE, 0,
-			new_extra_isize);
+		memset((void *)raw_inode + EXT4_GOOD_OLD_INODE_SIZE +
+		       EXT4_I(inode)->i_extra_isize, 0,
+		       new_extra_isize - EXT4_I(inode)->i_extra_isize);
 		EXT4_I(inode)->i_extra_isize = new_extra_isize;
 		return 0;
 	}

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 38/90] ext4: fix fdatasync(2) after extent manipulation operations
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (32 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 37/90] ext4: keep existing extra fields when inode expands Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 39/90] usb: gadget: f_mass_storage: Serialize wake and sleep execution Greg Kroah-Hartman
                   ` (53 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Jan Kara, Theodore Tso

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jan Kara <jack@suse.cz>

commit 67a7d5f561f469ad2fa5154d2888258ab8e6df7c upstream.

Currently, extent manipulation operations such as hole punch, range
zeroing, or extent shifting do not record the fact that file data has
changed and thus fdatasync(2) has a work to do. As a result if we crash
e.g. after a punch hole and fdatasync, user can still possibly see the
punched out data after journal replay. Test generic/392 fails due to
these problems.

Fix the problem by properly marking that file data has changed in these
operations.

Fixes: a4bb6b64e39abc0e41ca077725f2a72c868e7622
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/ext4/extents.c |    5 +++++
 fs/ext4/inode.c   |    2 ++
 2 files changed, 7 insertions(+)

--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4902,6 +4902,8 @@ static long ext4_zero_range(struct file
 
 	/* Zero out partial block at the edges of the range */
 	ret = ext4_zero_partial_blocks(handle, inode, offset, len);
+	if (ret >= 0)
+		ext4_update_inode_fsync_trans(handle, inode, 1);
 
 	if (file->f_flags & O_SYNC)
 		ext4_handle_sync(handle);
@@ -5597,6 +5599,7 @@ int ext4_collapse_range(struct inode *in
 		ext4_handle_sync(handle);
 	inode->i_mtime = inode->i_ctime = ext4_current_time(inode);
 	ext4_mark_inode_dirty(handle, inode);
+	ext4_update_inode_fsync_trans(handle, inode, 1);
 
 out_stop:
 	ext4_journal_stop(handle);
@@ -5770,6 +5773,8 @@ int ext4_insert_range(struct inode *inod
 	up_write(&EXT4_I(inode)->i_data_sem);
 	if (IS_SYNC(inode))
 		ext4_handle_sync(handle);
+	if (ret >= 0)
+		ext4_update_inode_fsync_trans(handle, inode, 1);
 
 out_stop:
 	ext4_journal_stop(handle);
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3793,6 +3793,8 @@ int ext4_punch_hole(struct inode *inode,
 
 	inode->i_mtime = inode->i_ctime = ext4_current_time(inode);
 	ext4_mark_inode_dirty(handle, inode);
+	if (ret >= 0)
+		ext4_update_inode_fsync_trans(handle, inode, 1);
 out_stop:
 	ext4_journal_stop(handle);
 out_dio:

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 39/90] usb: gadget: f_mass_storage: Serialize wake and sleep execution
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (33 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 38/90] ext4: fix fdatasync(2) after extent manipulation operations Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 40/90] usb: chipidea: udc: fix NULL pointer dereference if udc_start failed Greg Kroah-Hartman
                   ` (52 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alan Stern, Thinh Nguyen, Felipe Balbi

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thinh Nguyen <Thinh.Nguyen@synopsys.com>

commit dc9217b69dd6089dcfeb86ed4b3c671504326087 upstream.

f_mass_storage has a memorry barrier issue with the sleep and wake
functions that can cause a deadlock. This results in intermittent hangs
during MSC file transfer. The host will reset the device after receiving
no response to resume the transfer. This issue is seen when dwc3 is
processing 2 transfer-in-progress events at the same time, invoking
completion handlers for CSW and CBW. Also this issue occurs depending on
the system timing and latency.

To increase the chance to hit this issue, you can force dwc3 driver to
wait and process those 2 events at once by adding a small delay (~100us)
in dwc3_check_event_buf() whenever the request is for CSW and read the
event count again. Avoid debugging with printk and ftrace as extra
delays and memory barrier will mask this issue.

Scenario which can lead to failure:
-----------------------------------
1) The main thread sleeps and waits for the next command in
   get_next_command().
2) bulk_in_complete() wakes up main thread for CSW.
3) bulk_out_complete() tries to wake up the running main thread for CBW.
4) thread_wakeup_needed is not loaded with correct value in
   sleep_thread().
5) Main thread goes to sleep again.

The pattern is shown below. Note the 2 critical variables.
 * common->thread_wakeup_needed
 * bh->state

	CPU 0 (sleep_thread)		CPU 1 (wakeup_thread)
	==============================  ===============================

					bh->state = BH_STATE_FULL;
					smp_wmb();
	thread_wakeup_needed = 0;	thread_wakeup_needed = 1;
	smp_rmb();
	if (bh->state != BH_STATE_FULL)
		sleep again ...

As pointed out by Alan Stern, this is an R-pattern issue. The issue can
be seen when there are two wakeups in quick succession. The
thread_wakeup_needed can be overwritten in sleep_thread, and the read of
the bh->state maybe reordered before the write to thread_wakeup_needed.

This patch applies full memory barrier smp_mb() in both sleep_thread()
and wakeup_thread() to ensure the order which the thread_wakeup_needed
and bh->state are written and loaded.

However, a better solution in the future would be to use wait_queue
method that takes care of managing memory barrier between waker and
waiter.

Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Thinh Nguyen <thinhn@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/gadget/function/f_mass_storage.c |   13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

--- a/drivers/usb/gadget/function/f_mass_storage.c
+++ b/drivers/usb/gadget/function/f_mass_storage.c
@@ -399,7 +399,11 @@ static int fsg_set_halt(struct fsg_dev *
 /* Caller must hold fsg->lock */
 static void wakeup_thread(struct fsg_common *common)
 {
-	smp_wmb();	/* ensure the write of bh->state is complete */
+	/*
+	 * Ensure the reading of thread_wakeup_needed
+	 * and the writing of bh->state are completed
+	 */
+	smp_mb();
 	/* Tell the main thread that something has happened */
 	common->thread_wakeup_needed = 1;
 	if (common->thread_task)
@@ -630,7 +634,12 @@ static int sleep_thread(struct fsg_commo
 	}
 	__set_current_state(TASK_RUNNING);
 	common->thread_wakeup_needed = 0;
-	smp_rmb();	/* ensure the latest bh->state is visible */
+
+	/*
+	 * Ensure the writing of thread_wakeup_needed
+	 * and the reading of bh->state are completed
+	 */
+	smp_mb();
 	return rc;
 }
 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 40/90] usb: chipidea: udc: fix NULL pointer dereference if udc_start failed
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (34 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 39/90] usb: gadget: f_mass_storage: Serialize wake and sleep execution Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 41/90] usb: chipidea: debug: check before accessing ci_role Greg Kroah-Hartman
                   ` (51 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Jisheng Zhang, Peter Chen

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jisheng Zhang <jszhang@marvell.com>

commit aa1f058d7d9244423b8c5a75b9484b1115df7f02 upstream.

Fix below NULL pointer dereference. we set ci->roles[CI_ROLE_GADGET]
too early in ci_hdrc_gadget_init(), if udc_start() fails due to some
reason, the ci->roles[CI_ROLE_GADGET] check in  ci_hdrc_gadget_destroy
can't protect us.

We fix this issue by only setting ci->roles[CI_ROLE_GADGET] if
udc_start() succeed.

[    1.398550] Unable to handle kernel NULL pointer dereference at
virtual address 00000000
...
[    1.448600] PC is at dma_pool_free+0xb8/0xf0
[    1.453012] LR is at dma_pool_free+0x28/0xf0
[    2.113369] [<ffffff80081817d8>] dma_pool_free+0xb8/0xf0
[    2.118857] [<ffffff800841209c>] destroy_eps+0x4c/0x68
[    2.124165] [<ffffff8008413770>] ci_hdrc_gadget_destroy+0x28/0x50
[    2.130461] [<ffffff800840fa30>] ci_hdrc_probe+0x588/0x7e8
[    2.136129] [<ffffff8008380fb8>] platform_drv_probe+0x50/0xb8
[    2.142066] [<ffffff800837f494>] driver_probe_device+0x1fc/0x2a8
[    2.148270] [<ffffff800837f68c>] __device_attach_driver+0x9c/0xf8
[    2.154563] [<ffffff800837d570>] bus_for_each_drv+0x58/0x98
[    2.160317] [<ffffff800837f174>] __device_attach+0xc4/0x138
[    2.166072] [<ffffff800837f738>] device_initial_probe+0x10/0x18
[    2.172185] [<ffffff800837e58c>] bus_probe_device+0x94/0xa0
[    2.177940] [<ffffff800837c560>] device_add+0x3f0/0x560
[    2.183337] [<ffffff8008380d20>] platform_device_add+0x180/0x240
[    2.189541] [<ffffff800840f0e8>] ci_hdrc_add_device+0x440/0x4f8
[    2.195654] [<ffffff8008414194>] ci_hdrc_usb2_probe+0x13c/0x2d8
[    2.201769] [<ffffff8008380fb8>] platform_drv_probe+0x50/0xb8
[    2.207705] [<ffffff800837f494>] driver_probe_device+0x1fc/0x2a8
[    2.213910] [<ffffff800837f5ec>] __driver_attach+0xac/0xb0
[    2.219575] [<ffffff800837d4b0>] bus_for_each_dev+0x60/0xa0
[    2.225329] [<ffffff800837ec80>] driver_attach+0x20/0x28
[    2.230816] [<ffffff800837e880>] bus_add_driver+0x1d0/0x238
[    2.236571] [<ffffff800837fdb0>] driver_register+0x60/0xf8
[    2.242237] [<ffffff8008380ef4>] __platform_driver_register+0x44/0x50
[    2.248891] [<ffffff80086fd440>] ci_hdrc_usb2_driver_init+0x18/0x20
[    2.255365] [<ffffff8008082950>] do_one_initcall+0x38/0x128
[    2.261121] [<ffffff80086e0d00>] kernel_init_freeable+0x1ac/0x250
[    2.267414] [<ffffff800852f0b8>] kernel_init+0x10/0x100
[    2.272810] [<ffffff8008082680>] ret_from_fork+0x10/0x50

Fixes: 3f124d233e97 ("usb: chipidea: add role init and destroy APIs")
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/chipidea/udc.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/drivers/usb/chipidea/udc.c
+++ b/drivers/usb/chipidea/udc.c
@@ -1982,6 +1982,7 @@ static void udc_id_switch_for_host(struc
 int ci_hdrc_gadget_init(struct ci_hdrc *ci)
 {
 	struct ci_role_driver *rdrv;
+	int ret;
 
 	if (!hw_read(ci, CAP_DCCPARAMS, DCCPARAMS_DC))
 		return -ENXIO;
@@ -1994,7 +1995,10 @@ int ci_hdrc_gadget_init(struct ci_hdrc *
 	rdrv->stop	= udc_id_switch_for_host;
 	rdrv->irq	= udc_irq;
 	rdrv->name	= "gadget";
-	ci->roles[CI_ROLE_GADGET] = rdrv;
 
-	return udc_start(ci);
+	ret = udc_start(ci);
+	if (!ret)
+		ci->roles[CI_ROLE_GADGET] = rdrv;
+
+	return ret;
 }

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 41/90] usb: chipidea: debug: check before accessing ci_role
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (35 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 40/90] usb: chipidea: udc: fix NULL pointer dereference if udc_start failed Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 42/90] staging/lustre/lov: remove set_fs() call from lov_getstripe() Greg Kroah-Hartman
                   ` (50 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Michael Thalmeier, Peter Chen

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Thalmeier <michael.thalmeier@hale.at>

commit 0340ff83cd4475261e7474033a381bc125b45244 upstream.

ci_role BUGs when the role is >= CI_ROLE_END.

Signed-off-by: Michael Thalmeier <michael.thalmeier@hale.at>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/usb/chipidea/debug.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/usb/chipidea/debug.c
+++ b/drivers/usb/chipidea/debug.c
@@ -295,7 +295,8 @@ static int ci_role_show(struct seq_file
 {
 	struct ci_hdrc *ci = s->private;
 
-	seq_printf(s, "%s\n", ci_role(ci)->name);
+	if (ci->role != CI_ROLE_END)
+		seq_printf(s, "%s\n", ci_role(ci)->name);
 
 	return 0;
 }

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 42/90] staging/lustre/lov: remove set_fs() call from lov_getstripe()
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (36 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 41/90] usb: chipidea: debug: check before accessing ci_role Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 43/90] iio: light: ltr501 Fix interchanged als/ps register field Greg Kroah-Hartman
                   ` (49 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, John L. Hammond, Andreas Dilger,
	Li Wei, Oleg Drokin

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Oleg Drokin <green@linuxhacker.ru>

commit 0a33252e060e97ed3fbdcec9517672f1e91aaef3 upstream.

lov_getstripe() calls set_fs(KERNEL_DS) so that it can handle a struct
lov_user_md pointer from user- or kernel-space.  This changes the
behavior of copy_from_user() on SPARC and may result in a misaligned
access exception which in turn oopses the kernel.  In fact the
relevant argument to lov_getstripe() is never called with a
kernel-space pointer and so changing the address limits is unnecessary
and so we remove the calls to save, set, and restore the address
limits.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-on: http://review.whamcloud.com/6150
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3221
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Wei <wei.g.li@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/staging/lustre/lustre/lov/lov_pack.c |    9 ---------
 1 file changed, 9 deletions(-)

--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -399,18 +399,10 @@ int lov_getstripe(struct obd_export *exp
 	struct lov_mds_md *lmmk = NULL;
 	int rc, lmm_size;
 	int lum_size;
-	mm_segment_t seg;
 
 	if (!lsm)
 		return -ENODATA;
 
-	/*
-	 * "Switch to kernel segment" to allow copying from kernel space by
-	 * copy_{to,from}_user().
-	 */
-	seg = get_fs();
-	set_fs(KERNEL_DS);
-
 	/* we only need the header part from user space to get lmm_magic and
 	 * lmm_stripe_count, (the header part is common to v1 and v3) */
 	lum_size = sizeof(struct lov_user_md_v1);
@@ -485,6 +477,5 @@ int lov_getstripe(struct obd_export *exp
 
 	obd_free_diskmd(exp, &lmmk);
 out_set:
-	set_fs(seg);
 	return rc;
 }

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 43/90] iio: light: ltr501 Fix interchanged als/ps register field
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (37 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 42/90] staging/lustre/lov: remove set_fs() call from lov_getstripe() Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 44/90] iio: proximity: as3935: fix AS3935_INT mask Greg Kroah-Hartman
                   ` (48 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Franziska Naepelt,
	Peter Meerwald-Stadler, Jonathan Cameron

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Franziska Naepelt <franziska.naepelt@idt.com>

commit 7cc3bff4efe6164a0c8163331c8aa55454799f42 upstream.

The register mapping for the IIO driver for the Liteon Light and Proximity
sensor LTR501 interrupt mode is interchanged (ALS/PS).
There is a register called INTERRUPT register (address 0x8F)
Bit 0 represents PS measurement trigger.
Bit 1 represents ALS measurement trigger.
This two bit fields are interchanged within the driver.
see datasheet page 24:
http://optoelectronics.liteon.com/upload/download/DS86-2012-0006/S_110_LTR-501ALS-01_PrelimDS_ver1%5B1%5D.pdf

Signed-off-by: Franziska Naepelt <franziska.naepelt@idt.com>
Fixes: 7ac702b3144b6 ("iio: ltr501: Add interrupt support")
Acked-by: Peter Meerwald-Stadler <pmeerw@pmeerw.net>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/iio/light/ltr501.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/iio/light/ltr501.c
+++ b/drivers/iio/light/ltr501.c
@@ -74,9 +74,9 @@ static const int int_time_mapping[] = {1
 static const struct reg_field reg_field_it =
 				REG_FIELD(LTR501_ALS_MEAS_RATE, 3, 4);
 static const struct reg_field reg_field_als_intr =
-				REG_FIELD(LTR501_INTR, 0, 0);
-static const struct reg_field reg_field_ps_intr =
 				REG_FIELD(LTR501_INTR, 1, 1);
+static const struct reg_field reg_field_ps_intr =
+				REG_FIELD(LTR501_INTR, 0, 0);
 static const struct reg_field reg_field_als_rate =
 				REG_FIELD(LTR501_ALS_MEAS_RATE, 0, 2);
 static const struct reg_field reg_field_ps_rate =

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 44/90] iio: proximity: as3935: fix AS3935_INT mask
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (38 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 43/90] iio: light: ltr501 Fix interchanged als/ps register field Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 45/90] drivers: char: random: add get_random_long() Greg Kroah-Hartman
                   ` (47 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Matt Ranostay, Jonathan Cameron

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Matt Ranostay <matt.ranostay@konsulko.com>

commit 275292d3a3d62670b1b13484707b74e5239b4bb0 upstream.

AS3935 interrupt mask has been incorrect so valid lightning events
would never trigger an buffer event. Also noise interrupt should be
BIT(0).

Fixes: 24ddb0e4bba4 ("iio: Add AS3935 lightning sensor support")
Signed-off-by: Matt Ranostay <matt.ranostay@konsulko.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/iio/proximity/as3935.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/iio/proximity/as3935.c
+++ b/drivers/iio/proximity/as3935.c
@@ -40,9 +40,9 @@
 #define AS3935_AFE_PWR_BIT	BIT(0)
 
 #define AS3935_INT		0x03
-#define AS3935_INT_MASK		0x07
+#define AS3935_INT_MASK		0x0f
 #define AS3935_EVENT_INT	BIT(3)
-#define AS3935_NOISE_INT	BIT(1)
+#define AS3935_NOISE_INT	BIT(0)
 
 #define AS3935_DATA		0x07
 #define AS3935_DATA_MASK	0x3F

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 45/90] drivers: char: random: add get_random_long()
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (39 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 44/90] iio: proximity: as3935: fix AS3935_INT mask Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 46/90] random: properly align get_random_int_hash Greg Kroah-Hartman
                   ` (46 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniel Cashman, Kees Cook,
	Theodore Tso, Arnd Bergmann, Catalin Marinas, Will Deacon,
	Ralf Baechle, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, David S. Miller, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Al Viro, Nick Kralevich, Jeff Vander Stoep,
	Mark Salyzyn, Andrew Morton, Linus Torvalds

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Cashman <dcashman@android.com>

commit ec9ee4acd97c0039a61c0ae4f12705767ae62153 upstream.

Commit d07e22597d1d ("mm: mmap: add new /proc tunable for mmap_base
ASLR") added the ability to choose from a range of values to use for
entropy count in generating the random offset to the mmap_base address.

The maximum value on this range was set to 32 bits for 64-bit x86
systems, but this value could be increased further, requiring more than
the 32 bits of randomness provided by get_random_int(), as is already
possible for arm64.  Add a new function: get_random_long() which more
naturally fits with the mmap usage of get_random_int() but operates
exactly the same as get_random_int().

Also, fix the shifting constant in mmap_rnd() to be an unsigned long so
that values greater than 31 bits generate an appropriate mask without
overflow.  This is especially important on x86, as its shift instruction
uses a 5-bit mask for the shift operand, which meant that any value for
mmap_rnd_bits over 31 acts as a no-op and effectively disables mmap_base
randomization.

Finally, replace calls to get_random_int() with get_random_long() where
appropriate.

This patch (of 2):

Add get_random_long().

Signed-off-by: Daniel Cashman <dcashman@android.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nick Kralevich <nnk@google.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/char/random.c  |   22 ++++++++++++++++++++++
 include/linux/random.h |    1 +
 2 files changed, 23 insertions(+)

--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1825,6 +1825,28 @@ unsigned int get_random_int(void)
 EXPORT_SYMBOL(get_random_int);
 
 /*
+ * Same as get_random_int(), but returns unsigned long.
+ */
+unsigned long get_random_long(void)
+{
+	__u32 *hash;
+	unsigned long ret;
+
+	if (arch_get_random_long(&ret))
+		return ret;
+
+	hash = get_cpu_var(get_random_int_hash);
+
+	hash[0] += current->pid + jiffies + random_get_entropy();
+	md5_transform(hash, random_int_secret);
+	ret = *(unsigned long *)hash;
+	put_cpu_var(get_random_int_hash);
+
+	return ret;
+}
+EXPORT_SYMBOL(get_random_long);
+
+/*
  * randomize_range() returns a start address such that
  *
  *    [...... <range> .....]
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -34,6 +34,7 @@ extern const struct file_operations rand
 #endif
 
 unsigned int get_random_int(void);
+unsigned long get_random_long(void);
 unsigned long randomize_range(unsigned long start, unsigned long end, unsigned long len);
 
 u32 prandom_u32(void);

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 46/90] random: properly align get_random_int_hash
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (40 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 45/90] drivers: char: random: add get_random_long() Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 47/90] stackprotector: Increase the per-task stack canarys random range from 32 bits to 64 bits on 64-bit platforms Greg Kroah-Hartman
                   ` (45 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Eric Biggers, Theodore Tso

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Biggers <ebiggers3@gmail.com>

commit b1132deac01c2332d234fa821a70022796b79182 upstream.

get_random_long() reads from the get_random_int_hash array using an
unsigned long pointer.  For this code to be guaranteed correct on all
architectures, the array must be aligned to an unsigned long boundary.

Signed-off-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/char/random.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1798,13 +1798,15 @@ int random_int_secret_init(void)
 	return 0;
 }
 
+static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
+		__aligned(sizeof(unsigned long));
+
 /*
  * Get a random word for internal kernel use only. Similar to urandom but
  * with the goal of minimal entropy pool depletion. As a result, the random
  * value is not cryptographically secure but for several uses the cost of
  * depleting entropy is too high
  */
-static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash);
 unsigned int get_random_int(void)
 {
 	__u32 *hash;

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 47/90] stackprotector: Increase the per-task stack canarys random range from 32 bits to 64 bits on 64-bit platforms
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (41 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 46/90] random: properly align get_random_int_hash Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:41   ` [kernel-hardening] " Jann Horn
  2017-06-12 15:25 ` [PATCH 4.4 48/90] cpufreq: cpufreq_register_driver() should return -ENODEV if init fails Greg Kroah-Hartman
                   ` (44 subsequent siblings)
  87 siblings, 1 reply; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniel Micay, Arjan van de Ven,
	Rik van Riel, Kees Cook, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, kernel-hardening, Ingo Molnar

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Micay <danielmicay@gmail.com>

commit 5ea30e4e58040cfd6434c2f33dc3ea76e2c15b05 upstream.

The stack canary is an 'unsigned long' and should be fully initialized to
random data rather than only 32 bits of random data.

Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Arjan van Ven <arjan@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-hardening@lists.openwall.com
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170504133209.3053-1-danielmicay@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/fork.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -368,7 +368,7 @@ static struct task_struct *dup_task_stru
 	set_task_stack_end_magic(tsk);
 
 #ifdef CONFIG_CC_STACKPROTECTOR
-	tsk->stack_canary = get_random_int();
+	tsk->stack_canary = get_random_long();
 #endif
 
 	/*

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 48/90] cpufreq: cpufreq_register_driver() should return -ENODEV if init fails
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (42 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 47/90] stackprotector: Increase the per-task stack canarys random range from 32 bits to 64 bits on 64-bit platforms Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 49/90] target: Re-add check to reject control WRITEs with overflow data Greg Kroah-Hartman
                   ` (43 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, David Arcari, Viresh Kumar,
	Rafael J. Wysocki

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: David Arcari <darcari@redhat.com>

commit 6c77003677d5f1ce15f26d24360cb66c0bc07bb3 upstream.

For a driver that does not set the CPUFREQ_STICKY flag, if all of the
->init() calls fail, cpufreq_register_driver() should return an error.
This will prevent the driver from loading.

Fixes: ce1bcfe94db8 (cpufreq: check cpufreq_policy_list instead of scanning policies for all CPUs)
Signed-off-by: David Arcari <darcari@redhat.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/cpufreq/cpufreq.c |    1 +
 1 file changed, 1 insertion(+)

--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2451,6 +2451,7 @@ int cpufreq_register_driver(struct cpufr
 	if (!(cpufreq_driver->flags & CPUFREQ_STICKY) &&
 	    list_empty(&cpufreq_policy_list)) {
 		/* if all ->init() calls failed, unregister */
+		ret = -ENODEV;
 		pr_debug("%s: No CPU initialized for driver %s\n", __func__,
 			 driver_data->name);
 		goto err_if_unreg;

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 49/90] target: Re-add check to reject control WRITEs with overflow data
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (43 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 48/90] cpufreq: cpufreq_register_driver() should return -ENODEV if init fails Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 50/90] drm/msm: Expose our reservation object when exporting a dmabuf Greg Kroah-Hartman
                   ` (42 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Bart Van Assche, Roland Dreier,
	Nicholas Bellinger

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nicholas Bellinger <nab@linux-iscsi.org>

commit 4ff83daa0200affe1894bd33d17bac404e3d78d4 upstream.

During v4.3 when the overflow/underflow check was relaxed by
commit c72c525022:

  commit c72c5250224d475614a00c1d7e54a67f77cd3410
  Author: Roland Dreier <roland@purestorage.com>
  Date:   Wed Jul 22 15:08:18 2015 -0700

       target: allow underflow/overflow for PR OUT etc. commands

to allow underflow/overflow for Windows compliance + FCP, a
consequence was to allow control CDBs to process overflow
data for iscsi-target with immediate data as well.

As per Roland's original change, continue to allow underflow
cases for control CDBs to make Windows compliance + FCP happy,
but until overflow for control CDBs is supported tree-wide,
explicitly reject all control WRITEs with overflow following
pre v4.3.y logic.

Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/target/target_core_transport.c |   23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -1154,15 +1154,28 @@ target_cmd_size_check(struct se_cmd *cmd
 	if (cmd->unknown_data_length) {
 		cmd->data_length = size;
 	} else if (size != cmd->data_length) {
-		pr_warn("TARGET_CORE[%s]: Expected Transfer Length:"
+		pr_warn_ratelimited("TARGET_CORE[%s]: Expected Transfer Length:"
 			" %u does not match SCSI CDB Length: %u for SAM Opcode:"
 			" 0x%02x\n", cmd->se_tfo->get_fabric_name(),
 				cmd->data_length, size, cmd->t_task_cdb[0]);
 
-		if (cmd->data_direction == DMA_TO_DEVICE &&
-		    cmd->se_cmd_flags & SCF_SCSI_DATA_CDB) {
-			pr_err("Rejecting underflow/overflow WRITE data\n");
-			return TCM_INVALID_CDB_FIELD;
+		if (cmd->data_direction == DMA_TO_DEVICE) {
+			if (cmd->se_cmd_flags & SCF_SCSI_DATA_CDB) {
+				pr_err_ratelimited("Rejecting underflow/overflow"
+						   " for WRITE data CDB\n");
+				return TCM_INVALID_CDB_FIELD;
+			}
+			/*
+			 * Some fabric drivers like iscsi-target still expect to
+			 * always reject overflow writes.  Reject this case until
+			 * full fabric driver level support for overflow writes
+			 * is introduced tree-wide.
+			 */
+			if (size > cmd->data_length) {
+				pr_err_ratelimited("Rejecting overflow for"
+						   " WRITE control CDB\n");
+				return TCM_INVALID_CDB_FIELD;
+			}
 		}
 		/*
 		 * Reject READ_* or WRITE_* with overflow/underflow for

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 50/90] drm/msm: Expose our reservation object when exporting a dmabuf.
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (44 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 49/90] target: Re-add check to reject control WRITEs with overflow data Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 51/90] Input: elantech - add Fujitsu Lifebook E546/E557 to force crc_enabled Greg Kroah-Hartman
                   ` (41 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Anholt, Daniel Vetter,
	Rob Clark, linux-arm-msm, freedreno

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Anholt <eric@anholt.net>

commit 43523eba79bda8f5b4c27f8ffe20ea078d20113a upstream.

Without this, polling on the dma-buf (and presumably other devices
synchronizing against our rendering) would return immediately, even
while the BO was busy.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Rob Clark <robdclark@gmail.com>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/gpu/drm/msm/msm_drv.c       |    1 +
 drivers/gpu/drm/msm/msm_drv.h       |    1 +
 drivers/gpu/drm/msm/msm_gem_prime.c |    7 +++++++
 3 files changed, 9 insertions(+)

--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -986,6 +986,7 @@ static struct drm_driver msm_driver = {
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
 	.gem_prime_export   = drm_gem_prime_export,
 	.gem_prime_import   = drm_gem_prime_import,
+	.gem_prime_res_obj  = msm_gem_prime_res_obj,
 	.gem_prime_pin      = msm_gem_prime_pin,
 	.gem_prime_unpin    = msm_gem_prime_unpin,
 	.gem_prime_get_sg_table = msm_gem_prime_get_sg_table,
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -212,6 +212,7 @@ struct sg_table *msm_gem_prime_get_sg_ta
 void *msm_gem_prime_vmap(struct drm_gem_object *obj);
 void msm_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
 int msm_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma);
+struct reservation_object *msm_gem_prime_res_obj(struct drm_gem_object *obj);
 struct drm_gem_object *msm_gem_prime_import_sg_table(struct drm_device *dev,
 		struct dma_buf_attachment *attach, struct sg_table *sg);
 int msm_gem_prime_pin(struct drm_gem_object *obj);
--- a/drivers/gpu/drm/msm/msm_gem_prime.c
+++ b/drivers/gpu/drm/msm/msm_gem_prime.c
@@ -70,3 +70,10 @@ void msm_gem_prime_unpin(struct drm_gem_
 	if (!obj->import_attach)
 		msm_gem_put_pages(obj);
 }
+
+struct reservation_object *msm_gem_prime_res_obj(struct drm_gem_object *obj)
+{
+	struct msm_gem_object *msm_obj = to_msm_bo(obj);
+
+	return msm_obj->resv;
+}

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 51/90] Input: elantech - add Fujitsu Lifebook E546/E557 to force crc_enabled
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (45 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 50/90] drm/msm: Expose our reservation object when exporting a dmabuf Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 52/90] cpuset: consider dying css as offline Greg Kroah-Hartman
                   ` (40 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ulrik De Bie, Arjan Opmeer, Dmitry Torokhov

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ulrik De Bie <ulrik.debie-os@e2big.org>

commit 47eb0c8b4d9eb6368941c6a9bb443f00847a46d7 upstream.

The Lifebook E546 and E557 touchpad were also not functioning and
worked after running:

        echo "1" > /sys/devices/platform/i8042/serio2/crc_enabled

Add them to the list of machines that need this workaround.

Signed-off-by: Ulrik De Bie <ulrik.debie-os@e2big.org>
Reviewed-by: Arjan Opmeer <arjan@opmeer.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/mouse/elantech.c |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

--- a/drivers/input/mouse/elantech.c
+++ b/drivers/input/mouse/elantech.c
@@ -1122,8 +1122,10 @@ static int elantech_get_resolution_v4(st
  * Asus UX32VD             0x361f02        00, 15, 0e      clickpad
  * Avatar AVIU-145A2       0x361f00        ?               clickpad
  * Fujitsu LIFEBOOK E544   0x470f00        d0, 12, 09      2 hw buttons
+ * Fujitsu LIFEBOOK E546   0x470f00        50, 12, 09      2 hw buttons
  * Fujitsu LIFEBOOK E547   0x470f00        50, 12, 09      2 hw buttons
  * Fujitsu LIFEBOOK E554   0x570f01        40, 14, 0c      2 hw buttons
+ * Fujitsu LIFEBOOK E557   0x570f01        40, 14, 0c      2 hw buttons
  * Fujitsu T725            0x470f01        05, 12, 09      2 hw buttons
  * Fujitsu H730            0x570f00        c0, 14, 0c      3 hw buttons (**)
  * Gigabyte U2442          0x450f01        58, 17, 0c      2 hw buttons
@@ -1529,6 +1531,13 @@ static const struct dmi_system_id elante
 		},
 	},
 	{
+		/* Fujitsu LIFEBOOK E546  does not work with crc_enabled == 0 */
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "LIFEBOOK E546"),
+		},
+	},
+	{
 		/* Fujitsu LIFEBOOK E547 does not work with crc_enabled == 0 */
 		.matches = {
 			DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU"),
@@ -1550,6 +1559,13 @@ static const struct dmi_system_id elante
 		},
 	},
 	{
+		/* Fujitsu LIFEBOOK E557 does not work with crc_enabled == 0 */
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "LIFEBOOK E557"),
+		},
+	},
+	{
 		/* Fujitsu LIFEBOOK U745 does not work with crc_enabled == 0 */
 		.matches = {
 			DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU"),

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 52/90] cpuset: consider dying css as offline
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (46 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 51/90] Input: elantech - add Fujitsu Lifebook E546/E557 to force crc_enabled Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:25 ` [PATCH 4.4 53/90] fs: add i_blocksize() Greg Kroah-Hartman
                   ` (39 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Tejun Heo, Daniel Jordan

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Tejun Heo <tj@kernel.org>

commit 41c25707d21716826e3c1f60967f5550610ec1c9 upstream.

In most cases, a cgroup controller don't care about the liftimes of
cgroups.  For the controller, a css becomes online when ->css_online()
is called on it and offline when ->css_offline() is called.

However, cpuset is special in that the user interface it exposes cares
whether certain cgroups exist or not.  Combined with the RCU delay
between cgroup removal and css offlining, this can lead to user
visible behavior oddities where operations which should succeed after
cgroup removals fail for some time period.  The effects of cgroup
removals are delayed when seen from userland.

This patch adds css_is_dying() which tests whether offline is pending
and updates is_cpuset_online() so that the function returns false also
while offline is pending.  This gets rid of the userland visible
delays.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Link: http://lkml.kernel.org/r/327ca1f5-7957-fbb9-9e5f-9ba149d40ba2@oracle.com
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/linux/cgroup.h |   20 ++++++++++++++++++++
 kernel/cpuset.c        |    4 ++--
 2 files changed, 22 insertions(+), 2 deletions(-)

--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -340,6 +340,26 @@ static inline bool css_tryget_online(str
 }
 
 /**
+ * css_is_dying - test whether the specified css is dying
+ * @css: target css
+ *
+ * Test whether @css is in the process of offlining or already offline.  In
+ * most cases, ->css_online() and ->css_offline() callbacks should be
+ * enough; however, the actual offline operations are RCU delayed and this
+ * test returns %true also when @css is scheduled to be offlined.
+ *
+ * This is useful, for example, when the use case requires synchronous
+ * behavior with respect to cgroup removal.  cgroup removal schedules css
+ * offlining but the css can seem alive while the operation is being
+ * delayed.  If the delay affects user visible semantics, this test can be
+ * used to resolve the situation.
+ */
+static inline bool css_is_dying(struct cgroup_subsys_state *css)
+{
+	return !(css->flags & CSS_NO_REF) && percpu_ref_is_dying(&css->refcnt);
+}
+
+/**
  * css_put - put a css reference
  * @css: target css
  *
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -173,9 +173,9 @@ typedef enum {
 } cpuset_flagbits_t;
 
 /* convenient tests for these bits */
-static inline bool is_cpuset_online(const struct cpuset *cs)
+static inline bool is_cpuset_online(struct cpuset *cs)
 {
-	return test_bit(CS_ONLINE, &cs->flags);
+	return test_bit(CS_ONLINE, &cs->flags) && !css_is_dying(&cs->css);
 }
 
 static inline int is_cpu_exclusive(const struct cpuset *cs)

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 53/90] fs: add i_blocksize()
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (47 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 52/90] cpuset: consider dying css as offline Greg Kroah-Hartman
@ 2017-06-12 15:25 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 54/90] ufs: restore proper tail allocation Greg Kroah-Hartman
                   ` (38 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Fabian Frederick, Geliang Tang,
	Alexander Viro, Ross Zwisler, Andrew Morton, Linus Torvalds

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Fabian Frederick <fabf@skynet.be>

commit 93407472a21b82f39c955ea7787e5bc7da100642 upstream.

Replace all 1 << inode->i_blkbits and (1 << inode->i_blkbits) in fs
branch.

This patch also fixes multiple checkpatch warnings: WARNING: Prefer
'unsigned int' to bare use of 'unsigned'

Thanks to Andrew Morton for suggesting more appropriate function instead
of macro.

[geliangtang@gmail.com: truncate: use i_blocksize()]
  Link: http://lkml.kernel.org/r/9c8b2cd83c8f5653805d43debde9fa8817e02fc4.1484895804.git.geliangtang@gmail.com
Link: http://lkml.kernel.org/r/1481319905-10126-1-git-send-email-fabf@skynet.be
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/file.c       |    2 +-
 fs/buffer.c           |   12 ++++++------
 fs/ceph/addr.c        |    2 +-
 fs/direct-io.c        |    2 +-
 fs/ext4/inode.c       |    2 +-
 fs/ext4/move_extent.c |    2 +-
 fs/jfs/super.c        |    4 ++--
 fs/mpage.c            |    2 +-
 fs/nfsd/blocklayout.c |    4 ++--
 fs/nilfs2/btnode.c    |    2 +-
 fs/nilfs2/inode.c     |    4 ++--
 fs/nilfs2/mdt.c       |    4 ++--
 fs/nilfs2/segment.c   |    2 +-
 fs/ocfs2/aops.c       |    2 +-
 fs/ocfs2/file.c       |    2 +-
 fs/reiserfs/file.c    |    2 +-
 fs/reiserfs/inode.c   |    2 +-
 fs/stat.c             |    2 +-
 fs/udf/inode.c        |    2 +-
 fs/xfs/xfs_aops.c     |   10 +++++-----
 fs/xfs/xfs_file.c     |    4 ++--
 include/linux/fs.h    |    5 +++++
 mm/truncate.c         |    2 +-
 23 files changed, 41 insertions(+), 36 deletions(-)

--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2771,7 +2771,7 @@ static long btrfs_fallocate(struct file
 		if (!ret)
 			ret = btrfs_prealloc_file_range(inode, mode,
 					range->start,
-					range->len, 1 << inode->i_blkbits,
+					range->len, i_blocksize(inode),
 					offset + len, &alloc_hint);
 		list_del(&range->list);
 		kfree(range);
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2298,7 +2298,7 @@ static int cont_expand_zero(struct file
 			    loff_t pos, loff_t *bytes)
 {
 	struct inode *inode = mapping->host;
-	unsigned blocksize = 1 << inode->i_blkbits;
+	unsigned int blocksize = i_blocksize(inode);
 	struct page *page;
 	void *fsdata;
 	pgoff_t index, curidx;
@@ -2378,8 +2378,8 @@ int cont_write_begin(struct file *file,
 			get_block_t *get_block, loff_t *bytes)
 {
 	struct inode *inode = mapping->host;
-	unsigned blocksize = 1 << inode->i_blkbits;
-	unsigned zerofrom;
+	unsigned int blocksize = i_blocksize(inode);
+	unsigned int zerofrom;
 	int err;
 
 	err = cont_expand_zero(file, mapping, pos, bytes);
@@ -2741,7 +2741,7 @@ int nobh_truncate_page(struct address_sp
 	struct buffer_head map_bh;
 	int err;
 
-	blocksize = 1 << inode->i_blkbits;
+	blocksize = i_blocksize(inode);
 	length = offset & (blocksize - 1);
 
 	/* Block boundary? Nothing to do */
@@ -2819,7 +2819,7 @@ int block_truncate_page(struct address_s
 	struct buffer_head *bh;
 	int err;
 
-	blocksize = 1 << inode->i_blkbits;
+	blocksize = i_blocksize(inode);
 	length = offset & (blocksize - 1);
 
 	/* Block boundary? Nothing to do */
@@ -2931,7 +2931,7 @@ sector_t generic_block_bmap(struct addre
 	struct inode *inode = mapping->host;
 	tmp.b_state = 0;
 	tmp.b_blocknr = 0;
-	tmp.b_size = 1 << inode->i_blkbits;
+	tmp.b_size = i_blocksize(inode);
 	get_block(inode, block, &tmp, 0);
 	return tmp.b_blocknr;
 }
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -697,7 +697,7 @@ static int ceph_writepages_start(struct
 	struct pagevec pvec;
 	int done = 0;
 	int rc = 0;
-	unsigned wsize = 1 << inode->i_blkbits;
+	unsigned int wsize = i_blocksize(inode);
 	struct ceph_osd_request *req = NULL;
 	int do_sync = 0;
 	loff_t snap_size, i_size;
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -575,7 +575,7 @@ static int dio_set_defer_completion(stru
 /*
  * Call into the fs to map some more disk blocks.  We record the current number
  * of available blocks at sdio->blocks_available.  These are in units of the
- * fs blocksize, (1 << inode->i_blkbits).
+ * fs blocksize, i_blocksize(inode).
  *
  * The fs is allowed to map lots of blocks at once.  If it wants to do that,
  * it uses the passed inode-relative block number as the file offset, as usual.
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2044,7 +2044,7 @@ static int mpage_process_page_bufs(struc
 {
 	struct inode *inode = mpd->inode;
 	int err;
-	ext4_lblk_t blocks = (i_size_read(inode) + (1 << inode->i_blkbits) - 1)
+	ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
 							>> inode->i_blkbits;
 
 	do {
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -187,7 +187,7 @@ mext_page_mkuptodate(struct page *page,
 	if (PageUptodate(page))
 		return 0;
 
-	blocksize = 1 << inode->i_blkbits;
+	blocksize = i_blocksize(inode);
 	if (!page_has_buffers(page))
 		create_empty_buffers(page, blocksize, 0);
 
--- a/fs/jfs/super.c
+++ b/fs/jfs/super.c
@@ -758,7 +758,7 @@ static ssize_t jfs_quota_read(struct sup
 				sb->s_blocksize - offset : toread;
 
 		tmp_bh.b_state = 0;
-		tmp_bh.b_size = 1 << inode->i_blkbits;
+		tmp_bh.b_size = i_blocksize(inode);
 		err = jfs_get_block(inode, blk, &tmp_bh, 0);
 		if (err)
 			return err;
@@ -798,7 +798,7 @@ static ssize_t jfs_quota_write(struct su
 				sb->s_blocksize - offset : towrite;
 
 		tmp_bh.b_state = 0;
-		tmp_bh.b_size = 1 << inode->i_blkbits;
+		tmp_bh.b_size = i_blocksize(inode);
 		err = jfs_get_block(inode, blk, &tmp_bh, 1);
 		if (err)
 			goto out;
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -111,7 +111,7 @@ map_buffer_to_page(struct page *page, st
 			SetPageUptodate(page);    
 			return;
 		}
-		create_empty_buffers(page, 1 << inode->i_blkbits, 0);
+		create_empty_buffers(page, i_blocksize(inode), 0);
 	}
 	head = page_buffers(page);
 	page_bh = head;
--- a/fs/nfsd/blocklayout.c
+++ b/fs/nfsd/blocklayout.c
@@ -50,7 +50,7 @@ nfsd4_block_proc_layoutget(struct inode
 {
 	struct nfsd4_layout_seg *seg = &args->lg_seg;
 	struct super_block *sb = inode->i_sb;
-	u32 block_size = (1 << inode->i_blkbits);
+	u32 block_size = i_blocksize(inode);
 	struct pnfs_block_extent *bex;
 	struct iomap iomap;
 	u32 device_generation = 0;
@@ -151,7 +151,7 @@ nfsd4_block_proc_layoutcommit(struct ino
 	int error;
 
 	nr_iomaps = nfsd4_block_decode_layoutupdate(lcp->lc_up_layout,
-			lcp->lc_up_len, &iomaps, 1 << inode->i_blkbits);
+			lcp->lc_up_len, &iomaps, i_blocksize(inode));
 	if (nr_iomaps < 0)
 		return nfserrno(nr_iomaps);
 
--- a/fs/nilfs2/btnode.c
+++ b/fs/nilfs2/btnode.c
@@ -55,7 +55,7 @@ nilfs_btnode_create_block(struct address
 		brelse(bh);
 		BUG();
 	}
-	memset(bh->b_data, 0, 1 << inode->i_blkbits);
+	memset(bh->b_data, 0, i_blocksize(inode));
 	bh->b_bdev = inode->i_sb->s_bdev;
 	bh->b_blocknr = blocknr;
 	set_buffer_mapped(bh);
--- a/fs/nilfs2/inode.c
+++ b/fs/nilfs2/inode.c
@@ -55,7 +55,7 @@ void nilfs_inode_add_blocks(struct inode
 {
 	struct nilfs_root *root = NILFS_I(inode)->i_root;
 
-	inode_add_bytes(inode, (1 << inode->i_blkbits) * n);
+	inode_add_bytes(inode, i_blocksize(inode) * n);
 	if (root)
 		atomic64_add(n, &root->blocks_count);
 }
@@ -64,7 +64,7 @@ void nilfs_inode_sub_blocks(struct inode
 {
 	struct nilfs_root *root = NILFS_I(inode)->i_root;
 
-	inode_sub_bytes(inode, (1 << inode->i_blkbits) * n);
+	inode_sub_bytes(inode, i_blocksize(inode) * n);
 	if (root)
 		atomic64_sub(n, &root->blocks_count);
 }
--- a/fs/nilfs2/mdt.c
+++ b/fs/nilfs2/mdt.c
@@ -60,7 +60,7 @@ nilfs_mdt_insert_new_block(struct inode
 	set_buffer_mapped(bh);
 
 	kaddr = kmap_atomic(bh->b_page);
-	memset(kaddr + bh_offset(bh), 0, 1 << inode->i_blkbits);
+	memset(kaddr + bh_offset(bh), 0, i_blocksize(inode));
 	if (init_block)
 		init_block(inode, bh, kaddr);
 	flush_dcache_page(bh->b_page);
@@ -503,7 +503,7 @@ void nilfs_mdt_set_entry_size(struct ino
 	struct nilfs_mdt_info *mi = NILFS_MDT(inode);
 
 	mi->mi_entry_size = entry_size;
-	mi->mi_entries_per_block = (1 << inode->i_blkbits) / entry_size;
+	mi->mi_entries_per_block = i_blocksize(inode) / entry_size;
 	mi->mi_first_entry_offset = DIV_ROUND_UP(header_size, entry_size);
 }
 
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -719,7 +719,7 @@ static size_t nilfs_lookup_dirty_data_bu
 
 		lock_page(page);
 		if (!page_has_buffers(page))
-			create_empty_buffers(page, 1 << inode->i_blkbits, 0);
+			create_empty_buffers(page, i_blocksize(inode), 0);
 		unlock_page(page);
 
 		bh = head = page_buffers(page);
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -1103,7 +1103,7 @@ int ocfs2_map_page_blocks(struct page *p
 	int ret = 0;
 	struct buffer_head *head, *bh, *wait[2], **wait_bh = wait;
 	unsigned int block_end, block_start;
-	unsigned int bsize = 1 << inode->i_blkbits;
+	unsigned int bsize = i_blocksize(inode);
 
 	if (!page_has_buffers(page))
 		create_empty_buffers(page, bsize, 0);
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -808,7 +808,7 @@ static int ocfs2_write_zero_page(struct
 	/* We know that zero_from is block aligned */
 	for (block_start = zero_from; block_start < zero_to;
 	     block_start = block_end) {
-		block_end = block_start + (1 << inode->i_blkbits);
+		block_end = block_start + i_blocksize(inode);
 
 		/*
 		 * block_start is block-aligned.  Bump it by one to force
--- a/fs/reiserfs/file.c
+++ b/fs/reiserfs/file.c
@@ -189,7 +189,7 @@ int reiserfs_commit_page(struct inode *i
 	int ret = 0;
 
 	th.t_trans_id = 0;
-	blocksize = 1 << inode->i_blkbits;
+	blocksize = i_blocksize(inode);
 
 	if (logit) {
 		reiserfs_write_lock(s);
--- a/fs/reiserfs/inode.c
+++ b/fs/reiserfs/inode.c
@@ -524,7 +524,7 @@ static int reiserfs_get_blocks_direct_io
 	 * referenced in convert_tail_for_hole() that may be called from
 	 * reiserfs_get_block()
 	 */
-	bh_result->b_size = (1 << inode->i_blkbits);
+	bh_result->b_size = i_blocksize(inode);
 
 	ret = reiserfs_get_block(inode, iblock, bh_result,
 				 create | GET_BLOCK_NO_DANGLE);
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -31,7 +31,7 @@ void generic_fillattr(struct inode *inod
 	stat->atime = inode->i_atime;
 	stat->mtime = inode->i_mtime;
 	stat->ctime = inode->i_ctime;
-	stat->blksize = (1 << inode->i_blkbits);
+	stat->blksize = i_blocksize(inode);
 	stat->blocks = inode->i_blocks;
 }
 
--- a/fs/udf/inode.c
+++ b/fs/udf/inode.c
@@ -1206,7 +1206,7 @@ int udf_setsize(struct inode *inode, lof
 {
 	int err;
 	struct udf_inode_info *iinfo;
-	int bsize = 1 << inode->i_blkbits;
+	int bsize = i_blocksize(inode);
 
 	if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
 	      S_ISLNK(inode->i_mode)))
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -288,7 +288,7 @@ xfs_map_blocks(
 {
 	struct xfs_inode	*ip = XFS_I(inode);
 	struct xfs_mount	*mp = ip->i_mount;
-	ssize_t			count = 1 << inode->i_blkbits;
+	ssize_t			count = i_blocksize(inode);
 	xfs_fileoff_t		offset_fsb, end_fsb;
 	int			error = 0;
 	int			bmapi_flags = XFS_BMAPI_ENTIRE;
@@ -921,7 +921,7 @@ xfs_aops_discard_page(
 			break;
 		}
 next_buffer:
-		offset += 1 << inode->i_blkbits;
+		offset += i_blocksize(inode);
 
 	} while ((bh = bh->b_this_page) != head);
 
@@ -1363,7 +1363,7 @@ xfs_map_trim_size(
 	    offset + mapping_size >= i_size_read(inode)) {
 		/* limit mapping to block that spans EOF */
 		mapping_size = roundup_64(i_size_read(inode) - offset,
-					  1 << inode->i_blkbits);
+					  i_blocksize(inode));
 	}
 	if (mapping_size > LONG_MAX)
 		mapping_size = LONG_MAX;
@@ -1395,7 +1395,7 @@ __xfs_get_blocks(
 		return -EIO;
 
 	offset = (xfs_off_t)iblock << inode->i_blkbits;
-	ASSERT(bh_result->b_size >= (1 << inode->i_blkbits));
+	ASSERT(bh_result->b_size >= i_blocksize(inode));
 	size = bh_result->b_size;
 
 	if (!create && direct && offset >= i_size_read(inode))
@@ -1968,7 +1968,7 @@ xfs_vm_set_page_dirty(
 			if (offset < end_offset)
 				set_buffer_dirty(bh);
 			bh = bh->b_this_page;
-			offset += 1 << inode->i_blkbits;
+			offset += i_blocksize(inode);
 		} while (bh != head);
 	}
 	/*
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -947,7 +947,7 @@ xfs_file_fallocate(
 		if (error)
 			goto out_unlock;
 	} else if (mode & FALLOC_FL_COLLAPSE_RANGE) {
-		unsigned blksize_mask = (1 << inode->i_blkbits) - 1;
+		unsigned int blksize_mask = i_blocksize(inode) - 1;
 
 		if (offset & blksize_mask || len & blksize_mask) {
 			error = -EINVAL;
@@ -969,7 +969,7 @@ xfs_file_fallocate(
 		if (error)
 			goto out_unlock;
 	} else if (mode & FALLOC_FL_INSERT_RANGE) {
-		unsigned blksize_mask = (1 << inode->i_blkbits) - 1;
+		unsigned int blksize_mask = i_blocksize(inode) - 1;
 
 		new_size = i_size_read(inode) + len;
 		if (offset & blksize_mask || len & blksize_mask) {
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -680,6 +680,11 @@ struct inode {
 	void			*i_private; /* fs or device private pointer */
 };
 
+static inline unsigned int i_blocksize(const struct inode *node)
+{
+	return (1 << node->i_blkbits);
+}
+
 static inline int inode_unhashed(struct inode *inode)
 {
 	return hlist_unhashed(&inode->i_hash);
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -732,7 +732,7 @@ EXPORT_SYMBOL(truncate_setsize);
  */
 void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to)
 {
-	int bsize = 1 << inode->i_blkbits;
+	int bsize = i_blocksize(inode);
 	loff_t rounded_from;
 	struct page *page;
 	pgoff_t index;

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 54/90] ufs: restore proper tail allocation
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (48 preceding siblings ...)
  2017-06-12 15:25 ` [PATCH 4.4 53/90] fs: add i_blocksize() Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 55/90] fix ufs_isblockset() Greg Kroah-Hartman
                   ` (37 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Al Viro

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Al Viro <viro@zeniv.linux.org.uk>

commit 8785d84d002c2ce0f68fbcd6c2c86be859802c7e upstream.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/ufs/inode.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/ufs/inode.c
+++ b/fs/ufs/inode.c
@@ -284,7 +284,7 @@ ufs_inode_getfrag(struct inode *inode, u
 			goal += uspi->s_fpb;
 	}
 	tmp = ufs_new_fragments(inode, p, ufs_blknum(new_fragment),
-				goal, uspi->s_fpb, err, locked_page);
+				goal, nfrags, err, locked_page);
 
 	if (!tmp) {
 		*err = -ENOSPC;

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 55/90] fix ufs_isblockset()
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (49 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 54/90] ufs: restore proper tail allocation Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 56/90] ufs: restore maintaining ->i_blocks Greg Kroah-Hartman
                   ` (36 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Al Viro

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Al Viro <viro@zeniv.linux.org.uk>

commit 414cf7186dbec29bd946c138d6b5c09da5955a08 upstream.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/ufs/util.h |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

--- a/fs/ufs/util.h
+++ b/fs/ufs/util.h
@@ -473,15 +473,19 @@ static inline unsigned _ubh_find_last_ze
 static inline int _ubh_isblockset_(struct ufs_sb_private_info * uspi,
 	struct ufs_buffer_head * ubh, unsigned begin, unsigned block)
 {
+	u8 mask;
 	switch (uspi->s_fpb) {
 	case 8:
 	    	return (*ubh_get_addr (ubh, begin + block) == 0xff);
 	case 4:
-		return (*ubh_get_addr (ubh, begin + (block >> 1)) == (0x0f << ((block & 0x01) << 2)));
+		mask = 0x0f << ((block & 0x01) << 2);
+		return (*ubh_get_addr (ubh, begin + (block >> 1)) & mask) == mask;
 	case 2:
-		return (*ubh_get_addr (ubh, begin + (block >> 2)) == (0x03 << ((block & 0x03) << 1)));
+		mask = 0x03 << ((block & 0x03) << 1);
+		return (*ubh_get_addr (ubh, begin + (block >> 2)) & mask) == mask;
 	case 1:
-		return (*ubh_get_addr (ubh, begin + (block >> 3)) == (0x01 << (block & 0x07)));
+		mask = 0x01 << (block & 0x07);
+		return (*ubh_get_addr (ubh, begin + (block >> 3)) & mask) == mask;
 	}
 	return 0;	
 }

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 56/90] ufs: restore maintaining ->i_blocks
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (50 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 55/90] fix ufs_isblockset() Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 57/90] ufs: set correct ->s_maxsize Greg Kroah-Hartman
                   ` (35 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Al Viro

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Al Viro <viro@zeniv.linux.org.uk>

commit eb315d2ae614493fd1ebb026c75a80573d84f7ad upstream.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/stat.c       |    1 +
 fs/ufs/balloc.c |   26 +++++++++++++++++++++++++-
 2 files changed, 26 insertions(+), 1 deletion(-)

--- a/fs/stat.c
+++ b/fs/stat.c
@@ -454,6 +454,7 @@ void __inode_add_bytes(struct inode *ino
 		inode->i_bytes -= 512;
 	}
 }
+EXPORT_SYMBOL(__inode_add_bytes);
 
 void inode_add_bytes(struct inode *inode, loff_t bytes)
 {
--- a/fs/ufs/balloc.c
+++ b/fs/ufs/balloc.c
@@ -81,7 +81,8 @@ void ufs_free_fragments(struct inode *in
 			ufs_error (sb, "ufs_free_fragments",
 				   "bit already cleared for fragment %u", i);
 	}
-	
+
+	inode_sub_bytes(inode, count << uspi->s_fshift);
 	fs32_add(sb, &ucg->cg_cs.cs_nffree, count);
 	uspi->cs_total.cs_nffree += count;
 	fs32_add(sb, &UFS_SB(sb)->fs_cs(cgno).cs_nffree, count);
@@ -183,6 +184,7 @@ do_more:
 			ufs_error(sb, "ufs_free_blocks", "freeing free fragment");
 		}
 		ubh_setblock(UCPI_UBH(ucpi), ucpi->c_freeoff, blkno);
+		inode_sub_bytes(inode, uspi->s_fpb << uspi->s_fshift);
 		if ((UFS_SB(sb)->s_flags & UFS_CG_MASK) == UFS_CG_44BSD)
 			ufs_clusteracct (sb, ucpi, blkno, 1);
 
@@ -494,6 +496,20 @@ u64 ufs_new_fragments(struct inode *inod
 	return 0;
 }		
 
+static bool try_add_frags(struct inode *inode, unsigned frags)
+{
+	unsigned size = frags * i_blocksize(inode);
+	spin_lock(&inode->i_lock);
+	__inode_add_bytes(inode, size);
+	if (unlikely((u32)inode->i_blocks != inode->i_blocks)) {
+		__inode_sub_bytes(inode, size);
+		spin_unlock(&inode->i_lock);
+		return false;
+	}
+	spin_unlock(&inode->i_lock);
+	return true;
+}
+
 static u64 ufs_add_fragments(struct inode *inode, u64 fragment,
 			     unsigned oldcount, unsigned newcount)
 {
@@ -530,6 +546,9 @@ static u64 ufs_add_fragments(struct inod
 	for (i = oldcount; i < newcount; i++)
 		if (ubh_isclr (UCPI_UBH(ucpi), ucpi->c_freeoff, fragno + i))
 			return 0;
+
+	if (!try_add_frags(inode, count))
+		return 0;
 	/*
 	 * Block can be extended
 	 */
@@ -647,6 +666,7 @@ cg_found:
 			ubh_setbit (UCPI_UBH(ucpi), ucpi->c_freeoff, goal + i);
 		i = uspi->s_fpb - count;
 
+		inode_sub_bytes(inode, i << uspi->s_fshift);
 		fs32_add(sb, &ucg->cg_cs.cs_nffree, i);
 		uspi->cs_total.cs_nffree += i;
 		fs32_add(sb, &UFS_SB(sb)->fs_cs(cgno).cs_nffree, i);
@@ -657,6 +677,8 @@ cg_found:
 	result = ufs_bitmap_search (sb, ucpi, goal, allocsize);
 	if (result == INVBLOCK)
 		return 0;
+	if (!try_add_frags(inode, count))
+		return 0;
 	for (i = 0; i < count; i++)
 		ubh_clrbit (UCPI_UBH(ucpi), ucpi->c_freeoff, result + i);
 	
@@ -716,6 +738,8 @@ norot:
 		return INVBLOCK;
 	ucpi->c_rotor = result;
 gotit:
+	if (!try_add_frags(inode, uspi->s_fpb))
+		return 0;
 	blkno = ufs_fragstoblks(result);
 	ubh_clrblock (UCPI_UBH(ucpi), ucpi->c_freeoff, blkno);
 	if ((UFS_SB(sb)->s_flags & UFS_CG_MASK) == UFS_CG_44BSD)

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 57/90] ufs: set correct ->s_maxsize
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (51 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 56/90] ufs: restore maintaining ->i_blocks Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 58/90] ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments() Greg Kroah-Hartman
                   ` (34 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Al Viro

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Al Viro <viro@zeniv.linux.org.uk>

commit 6b0d144fa758869bdd652c50aa41aaf601232550 upstream.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/ufs/super.c |   18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

--- a/fs/ufs/super.c
+++ b/fs/ufs/super.c
@@ -746,6 +746,23 @@ static void ufs_put_super(struct super_b
 	return;
 }
 
+static u64 ufs_max_bytes(struct super_block *sb)
+{
+	struct ufs_sb_private_info *uspi = UFS_SB(sb)->s_uspi;
+	int bits = uspi->s_apbshift;
+	u64 res;
+
+	if (bits > 21)
+		res = ~0ULL;
+	else
+		res = UFS_NDADDR + (1LL << bits) + (1LL << (2*bits)) +
+			(1LL << (3*bits));
+
+	if (res >= (MAX_LFS_FILESIZE >> uspi->s_bshift))
+		return MAX_LFS_FILESIZE;
+	return res << uspi->s_bshift;
+}
+
 static int ufs_fill_super(struct super_block *sb, void *data, int silent)
 {
 	struct ufs_sb_info * sbi;
@@ -1212,6 +1229,7 @@ magic_found:
 			    "fast symlink size (%u)\n", uspi->s_maxsymlinklen);
 		uspi->s_maxsymlinklen = maxsymlen;
 	}
+	sb->s_maxbytes = ufs_max_bytes(sb);
 	sb->s_max_links = UFS_LINK_MAX;
 
 	inode = ufs_iget(sb, UFS_ROOTINO);

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 58/90] ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments()
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (52 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 57/90] ufs: set correct ->s_maxsize Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 59/90] ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path Greg Kroah-Hartman
                   ` (33 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Al Viro

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Al Viro <viro@zeniv.linux.org.uk>

commit 940ef1a0ed939c2ca029fca715e25e7778ce1e34 upstream.

... and it really needs splitting into "new" and "extend" cases, but that's for
later

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/ufs/inode.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/fs/ufs/inode.c
+++ b/fs/ufs/inode.c
@@ -235,7 +235,8 @@ ufs_extend_tail(struct inode *inode, u64
 
 	p = ufs_get_direct_data_ptr(uspi, ufsi, block);
 	tmp = ufs_new_fragments(inode, p, lastfrag, ufs_data_ptr_to_cpu(sb, p),
-				new_size, err, locked_page);
+				new_size - (lastfrag & uspi->s_fpbmask), err,
+				locked_page);
 	return tmp != 0;
 }
 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 59/90] ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (53 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 58/90] ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments() Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 60/90] cxl: Fix error path on bad ioctl Greg Kroah-Hartman
                   ` (32 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Al Viro

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Al Viro <viro@zeniv.linux.org.uk>

commit 006351ac8ead0d4a67dd3845e3ceffe650a23212 upstream.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/ufs/inode.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/fs/ufs/inode.c
+++ b/fs/ufs/inode.c
@@ -403,7 +403,9 @@ static int ufs_getfrag_block(struct inod
 
 	if (!create) {
 		phys64 = ufs_frag_map(inode, offsets, depth);
-		goto out;
+		if (phys64)
+			map_bh(bh_result, sb, phys64 + frag);
+		return 0;
 	}
 
         /* This code entered only while writing ....? */

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 60/90] cxl: Fix error path on bad ioctl
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (54 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 59/90] ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 61/90] btrfs: use correct types for page indices in btrfs_page_exists_in_range Greg Kroah-Hartman
                   ` (31 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Frederic Barrat, Vaibhav Jain,
	Andrew Donnellan, Michael Ellerman

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Frederic Barrat <fbarrat@linux.vnet.ibm.com>

commit cec422c11caeeccae709e9942058b6b644ce434c upstream.

Fix error path if we can't copy user structure on CXL_IOCTL_START_WORK
ioctl. We shouldn't unlock the context status mutex as it was not
locked (yet).

Fixes: 0712dc7e73e5 ("cxl: Fix issues when unmapping contexts")
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/misc/cxl/file.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

--- a/drivers/misc/cxl/file.c
+++ b/drivers/misc/cxl/file.c
@@ -158,11 +158,8 @@ static long afu_ioctl_start_work(struct
 
 	/* Do this outside the status_mutex to avoid a circular dependency with
 	 * the locking in cxl_mmap_fault() */
-	if (copy_from_user(&work, uwork,
-			   sizeof(struct cxl_ioctl_start_work))) {
-		rc = -EFAULT;
-		goto out;
-	}
+	if (copy_from_user(&work, uwork, sizeof(work)))
+		return -EFAULT;
 
 	mutex_lock(&ctx->status_mutex);
 	if (ctx->status != OPENED) {

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 61/90] btrfs: use correct types for page indices in btrfs_page_exists_in_range
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (55 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 60/90] cxl: Fix error path on bad ioctl Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 62/90] btrfs: fix memory leak in update_space_info failure path Greg Kroah-Hartman
                   ` (30 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Liu Bo, David Sterba

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: David Sterba <dsterba@suse.com>

commit cc2b702c52094b637a351d7491ac5200331d0445 upstream.

Variables start_idx and end_idx are supposed to hold a page index
derived from the file offsets. The int type is not the right one though,
offsets larger than 1 << 44 will get silently trimmed off the high bits.
(1 << 44 is 16TiB)

What can go wrong, if start is below the boundary and end gets trimmed:
- if there's a page after start, we'll find it (radix_tree_gang_lookup_slot)
- the final check "if (page->index <= end_idx)" will unexpectedly fail

The function will return false, ie. "there's no page in the range",
although there is at least one.

btrfs_page_exists_in_range is used to prevent races in:

* in hole punching, where we make sure there are not pages in the
  truncated range, otherwise we'll wait for them to finish and redo
  truncation, but we're going to replace the pages with holes anyway so
  the only problem is the intermediate state

* lock_extent_direct: we want to make sure there are no pages before we
  lock and start DIO, to prevent stale data reads

For practical occurence of the bug, there are several constaints.  The
file must be quite large, the affected range must cross the 16TiB
boundary and the internal state of the file pages and pending operations
must match.  Also, we must not have started any ordered data in the
range, otherwise we don't even reach the buggy function check.

DIO locking tries hard in several places to avoid deadlocks with
buffered IO and avoids waiting for ranges. The worst consequence seems
to be stale data read.

CC: Liu Bo <bo.li.liu@oracle.com>
Fixes: fc4adbff823f7 ("btrfs: Drop EXTENT_UPTODATE check in hole punching and direct locking")
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/inode.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7318,8 +7318,8 @@ bool btrfs_page_exists_in_range(struct i
 	int found = false;
 	void **pagep = NULL;
 	struct page *page = NULL;
-	int start_idx;
-	int end_idx;
+	unsigned long start_idx;
+	unsigned long end_idx;
 
 	start_idx = start >> PAGE_CACHE_SHIFT;
 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 62/90] btrfs: fix memory leak in update_space_info failure path
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (56 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 61/90] btrfs: use correct types for page indices in btrfs_page_exists_in_range Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 63/90] KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages Greg Kroah-Hartman
                   ` (29 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jeff Mahoney, Liu Bo, David Sterba

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jeff Mahoney <jeffm@suse.com>

commit 896533a7da929136d0432713f02a3edffece2826 upstream.

If we fail to add the space_info kobject, we'll leak the memory
for the percpu counter.

Fixes: 6ab0a2029c (btrfs: publish allocation data in sysfs)
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/extent-tree.c |    1 +
 1 file changed, 1 insertion(+)

--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3854,6 +3854,7 @@ static int update_space_info(struct btrf
 				    info->space_info_kobj, "%s",
 				    alloc_name(found->flags));
 	if (ret) {
+		percpu_counter_destroy(&found->total_bytes_pinned);
 		kfree(found);
 		return ret;
 	}

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 63/90] KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (57 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 62/90] btrfs: fix memory leak in update_space_info failure path Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 64/90] scsi: qla2xxx: dont disable a not previously enabled PCI device Greg Kroah-Hartman
                   ` (28 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Marc Zyngier, Christoffer Dall

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Marc Zyngier <marc.zyngier@arm.com>

commit d6dbdd3c8558cad3b6d74cc357b408622d122331 upstream.

Under memory pressure, we start ageing pages, which amounts to parsing
the page tables. Since we don't want to allocate any extra level,
we pass NULL for our private allocation cache. Which means that
stage2_get_pud() is allowed to fail. This results in the following
splat:

[ 1520.409577] Unable to handle kernel NULL pointer dereference at virtual address 00000008
[ 1520.417741] pgd = ffff810f52fef000
[ 1520.421201] [00000008] *pgd=0000010f636c5003, *pud=0000010f56f48003, *pmd=0000000000000000
[ 1520.429546] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[ 1520.435156] Modules linked in:
[ 1520.438246] CPU: 15 PID: 53550 Comm: qemu-system-aar Tainted: G        W       4.12.0-rc4-00027-g1885c397eaec #7205
[ 1520.448705] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB12A 10/26/2016
[ 1520.463726] task: ffff800ac5fb4e00 task.stack: ffff800ce04e0000
[ 1520.469666] PC is at stage2_get_pmd+0x34/0x110
[ 1520.474119] LR is at kvm_age_hva_handler+0x44/0xf0
[ 1520.478917] pc : [<ffff0000080b137c>] lr : [<ffff0000080b149c>] pstate: 40000145
[ 1520.486325] sp : ffff800ce04e33d0
[ 1520.489644] x29: ffff800ce04e33d0 x28: 0000000ffff40064
[ 1520.494967] x27: 0000ffff27e00000 x26: 0000000000000000
[ 1520.500289] x25: ffff81051ba65008 x24: 0000ffff40065000
[ 1520.505618] x23: 0000ffff40064000 x22: 0000000000000000
[ 1520.510947] x21: ffff810f52b20000 x20: 0000000000000000
[ 1520.516274] x19: 0000000058264000 x18: 0000000000000000
[ 1520.521603] x17: 0000ffffa6fe7438 x16: ffff000008278b70
[ 1520.526940] x15: 000028ccd8000000 x14: 0000000000000008
[ 1520.532264] x13: ffff7e0018298000 x12: 0000000000000002
[ 1520.537582] x11: ffff000009241b93 x10: 0000000000000940
[ 1520.542908] x9 : ffff0000092ef800 x8 : 0000000000000200
[ 1520.548229] x7 : ffff800ce04e36a8 x6 : 0000000000000000
[ 1520.553552] x5 : 0000000000000001 x4 : 0000000000000000
[ 1520.558873] x3 : 0000000000000000 x2 : 0000000000000008
[ 1520.571696] x1 : ffff000008fd5000 x0 : ffff0000080b149c
[ 1520.577039] Process qemu-system-aar (pid: 53550, stack limit = 0xffff800ce04e0000)
[...]
[ 1521.510735] [<ffff0000080b137c>] stage2_get_pmd+0x34/0x110
[ 1521.516221] [<ffff0000080b149c>] kvm_age_hva_handler+0x44/0xf0
[ 1521.522054] [<ffff0000080b0610>] handle_hva_to_gpa+0xb8/0xe8
[ 1521.527716] [<ffff0000080b3434>] kvm_age_hva+0x44/0xf0
[ 1521.532854] [<ffff0000080a58b0>] kvm_mmu_notifier_clear_flush_young+0x70/0xc0
[ 1521.539992] [<ffff000008238378>] __mmu_notifier_clear_flush_young+0x88/0xd0
[ 1521.546958] [<ffff00000821eca0>] page_referenced_one+0xf0/0x188
[ 1521.552881] [<ffff00000821f36c>] rmap_walk_anon+0xec/0x250
[ 1521.558370] [<ffff000008220f78>] rmap_walk+0x78/0xa0
[ 1521.563337] [<ffff000008221104>] page_referenced+0x164/0x180
[ 1521.569002] [<ffff0000081f1af0>] shrink_active_list+0x178/0x3b8
[ 1521.574922] [<ffff0000081f2058>] shrink_node_memcg+0x328/0x600
[ 1521.580758] [<ffff0000081f23f4>] shrink_node+0xc4/0x328
[ 1521.585986] [<ffff0000081f2718>] do_try_to_free_pages+0xc0/0x340
[ 1521.592000] [<ffff0000081f2a64>] try_to_free_pages+0xcc/0x240
[...]

The trivial fix is to handle this NULL pud value early, rather than
dereferencing it blindly.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arm/kvm/mmu.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -869,6 +869,9 @@ static pmd_t *stage2_get_pmd(struct kvm
 	pmd_t *pmd;
 
 	pud = stage2_get_pud(kvm, cache, addr);
+	if (!pud)
+		return NULL;
+
 	if (pud_none(*pud)) {
 		if (!cache)
 			return NULL;

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 64/90] scsi: qla2xxx: dont disable a not previously enabled PCI device
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (58 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 63/90] KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 65/90] powerpc/eeh: Avoid use after free in eeh_handle_special_event() Greg Kroah-Hartman
                   ` (27 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Johannes Thumshirn, Bart Van Assche,
	Giridhar Malavali, Martin K. Petersen

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johannes Thumshirn <jthumshirn@suse.de>

commit ddff7ed45edce4a4c92949d3c61cd25d229c4a14 upstream.

When pci_enable_device() or pci_enable_device_mem() fail in
qla2x00_probe_one() we bail out but do a call to
pci_disable_device(). This causes the dev_WARN_ON() in
pci_disable_device() to trigger, as the device wasn't enabled
previously.

So instead of taking the 'probe_out' error path we can directly return
*iff* one of the pci_enable_device() calls fails.

Additionally rename the 'probe_out' goto label's name to the more
descriptive 'disable_device'.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Fixes: e315cd28b9ef ("[SCSI] qla2xxx: Code changes for qla data structure refactoring")
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/scsi/qla2xxx/qla_os.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -2311,10 +2311,10 @@ qla2x00_probe_one(struct pci_dev *pdev,
 
 	if (mem_only) {
 		if (pci_enable_device_mem(pdev))
-			goto probe_out;
+			return ret;
 	} else {
 		if (pci_enable_device(pdev))
-			goto probe_out;
+			return ret;
 	}
 
 	/* This may fail but that's ok */
@@ -2324,7 +2324,7 @@ qla2x00_probe_one(struct pci_dev *pdev,
 	if (!ha) {
 		ql_log_pci(ql_log_fatal, pdev, 0x0009,
 		    "Unable to allocate memory for ha.\n");
-		goto probe_out;
+		goto disable_device;
 	}
 	ql_dbg_pci(ql_dbg_init, pdev, 0x000a,
 	    "Memory allocated for ha=%p.\n", ha);
@@ -2923,7 +2923,7 @@ iospace_config_failed:
 	kfree(ha);
 	ha = NULL;
 
-probe_out:
+disable_device:
 	pci_disable_device(pdev);
 	return ret;
 }

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 65/90] powerpc/eeh: Avoid use after free in eeh_handle_special_event()
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (59 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 64/90] scsi: qla2xxx: dont disable a not previously enabled PCI device Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 66/90] powerpc/numa: Fix percpu allocations to be NUMA aware Greg Kroah-Hartman
                   ` (26 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexey Kardashevskiy, Russell Currey,
	Gavin Shan, Michael Ellerman

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Russell Currey <ruscur@russell.cc>

commit daeba2956f32f91f3493788ff6ee02fb1b2f02fa upstream.

eeh_handle_special_event() is called when an EEH event is detected but
can't be narrowed down to a specific PE.  This function looks through
every PE to find one in an erroneous state, then calls the regular event
handler eeh_handle_normal_event() once it knows which PE has an error.

However, if eeh_handle_normal_event() found that the PE cannot possibly
be recovered, it will free it, rendering the passed PE stale.
This leads to a use after free in eeh_handle_special_event() as it attempts to
clear the "recovering" state on the PE after eeh_handle_normal_event() returns.

Thus, make sure the PE is valid when attempting to clear state in
eeh_handle_special_event().

Fixes: 8a6b1bc70dbb ("powerpc/eeh: EEH core to handle special event")
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


---
 arch/powerpc/kernel/eeh_driver.c |   19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -655,7 +655,7 @@ static int eeh_reset_device(struct eeh_p
  */
 #define MAX_WAIT_FOR_RECOVERY 300
 
-static void eeh_handle_normal_event(struct eeh_pe *pe)
+static bool eeh_handle_normal_event(struct eeh_pe *pe)
 {
 	struct pci_bus *frozen_bus;
 	int rc = 0;
@@ -665,7 +665,7 @@ static void eeh_handle_normal_event(stru
 	if (!frozen_bus) {
 		pr_err("%s: Cannot find PCI bus for PHB#%d-PE#%x\n",
 			__func__, pe->phb->global_number, pe->addr);
-		return;
+		return false;
 	}
 
 	eeh_pe_update_time_stamp(pe);
@@ -790,7 +790,7 @@ static void eeh_handle_normal_event(stru
 	pr_info("EEH: Notify device driver to resume\n");
 	eeh_pe_dev_traverse(pe, eeh_report_resume, NULL);
 
-	return;
+	return false;
 
 excess_failures:
 	/*
@@ -831,7 +831,11 @@ perm_error:
 		pci_lock_rescan_remove();
 		pcibios_remove_pci_devices(frozen_bus);
 		pci_unlock_rescan_remove();
+
+		/* The passed PE should no longer be used */
+		return true;
 	}
+	return false;
 }
 
 static void eeh_handle_special_event(void)
@@ -897,7 +901,14 @@ static void eeh_handle_special_event(voi
 		 */
 		if (rc == EEH_NEXT_ERR_FROZEN_PE ||
 		    rc == EEH_NEXT_ERR_FENCED_PHB) {
-			eeh_handle_normal_event(pe);
+			/*
+			 * eeh_handle_normal_event() can make the PE stale if it
+			 * determines that the PE cannot possibly be recovered.
+			 * Don't modify the PE state if that's the case.
+			 */
+			if (eeh_handle_normal_event(pe))
+				continue;
+
 			eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
 		} else {
 			pci_lock_rescan_remove();

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 66/90] powerpc/numa: Fix percpu allocations to be NUMA aware
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (60 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 65/90] powerpc/eeh: Avoid use after free in eeh_handle_special_event() Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-07-28 13:53   ` Michal Hocko
  2017-06-12 15:26 ` [PATCH 4.4 67/90] powerpc/hotplug-mem: Fix missing endian conversion of aa_index Greg Kroah-Hartman
                   ` (25 subsequent siblings)
  87 siblings, 1 reply; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Michael Ellerman, Nicholas Piggin

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Ellerman <mpe@ellerman.id.au>

commit ba4a648f12f4cd0a8003dd229b6ca8a53348ee4b upstream.

In commit 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we
switched to the generic implementation of cpu_to_node(), which uses a percpu
variable to hold the NUMA node for each CPU.

Unfortunately we neglected to notice that we use cpu_to_node() in the allocation
of our percpu areas, leading to a chicken and egg problem. In practice what
happens is when we are setting up the percpu areas, cpu_to_node() reports that
all CPUs are on node 0, so we allocate all percpu areas on node 0.

This is visible in the dmesg output, as all pcpu allocs being in group 0:

  pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07
  pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15
  pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23
  pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31
  pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39
  pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47

To fix it we need an early_cpu_to_node() which can run prior to percpu being
setup. We already have the numa_cpu_lookup_table we can use, so just plumb it
in. With the patch dmesg output shows two groups, 0 and 1:

  pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07
  pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15
  pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23
  pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31
  pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39
  pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47

We can also check the data_offset in the paca of various CPUs, with the fix we
see:

  CPU 0:  data_offset = 0x0ffe8b0000
  CPU 24: data_offset = 0x1ffe5b0000

And we can see from dmesg that CPU 24 has an allocation on node 1:

  node   0: [mem 0x0000000000000000-0x0000000fffffffff]
  node   1: [mem 0x0000001000000000-0x0000001fffffffff]

Fixes: 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/powerpc/include/asm/topology.h |   14 ++++++++++++++
 arch/powerpc/kernel/setup_64.c      |    4 ++--
 2 files changed, 16 insertions(+), 2 deletions(-)

--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -44,8 +44,22 @@ extern void __init dump_numa_cpu_topolog
 extern int sysfs_add_device_to_node(struct device *dev, int nid);
 extern void sysfs_remove_device_from_node(struct device *dev, int nid);
 
+static inline int early_cpu_to_node(int cpu)
+{
+	int nid;
+
+	nid = numa_cpu_lookup_table[cpu];
+
+	/*
+	 * Fall back to node 0 if nid is unset (it should be, except bugs).
+	 * This allows callers to safely do NODE_DATA(early_cpu_to_node(cpu)).
+	 */
+	return (nid < 0) ? 0 : nid;
+}
 #else
 
+static inline int early_cpu_to_node(int cpu) { return 0; }
+
 static inline void dump_numa_cpu_topology(void) {}
 
 static inline int sysfs_add_device_to_node(struct device *dev, int nid)
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -751,7 +751,7 @@ void __init setup_arch(char **cmdline_p)
 
 static void * __init pcpu_fc_alloc(unsigned int cpu, size_t size, size_t align)
 {
-	return __alloc_bootmem_node(NODE_DATA(cpu_to_node(cpu)), size, align,
+	return __alloc_bootmem_node(NODE_DATA(early_cpu_to_node(cpu)), size, align,
 				    __pa(MAX_DMA_ADDRESS));
 }
 
@@ -762,7 +762,7 @@ static void __init pcpu_fc_free(void *pt
 
 static int pcpu_cpu_distance(unsigned int from, unsigned int to)
 {
-	if (cpu_to_node(from) == cpu_to_node(to))
+	if (early_cpu_to_node(from) == early_cpu_to_node(to))
 		return LOCAL_DISTANCE;
 	else
 		return REMOTE_DISTANCE;

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 67/90] powerpc/hotplug-mem: Fix missing endian conversion of aa_index
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (61 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 66/90] powerpc/numa: Fix percpu allocations to be NUMA aware Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 68/90] perf/core: Drop kernel samples even though :u is specified Greg Kroah-Hartman
                   ` (24 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Michael Bringmann, Michael Ellerman

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Bringmann <mwb@linux.vnet.ibm.com>

commit dc421b200f91930c9c6a9586810ff8c232cf10fc upstream.

When adding or removing memory, the aa_index (affinity value) for the
memblock must also be converted to match the endianness of the rest
of the 'ibm,dynamic-memory' property.  Otherwise, subsequent retrieval
of the attribute will likely lead to non-existent nodes, followed by
using the default node in the code inappropriately.

Fixes: 5f97b2a0d176 ("powerpc/pseries: Implement memory hotplug add in the kernel")
Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/powerpc/platforms/pseries/hotplug-memory.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -110,6 +110,7 @@ static struct property *dlpar_clone_drco
 	for (i = 0; i < num_lmbs; i++) {
 		lmbs[i].base_addr = be64_to_cpu(lmbs[i].base_addr);
 		lmbs[i].drc_index = be32_to_cpu(lmbs[i].drc_index);
+		lmbs[i].aa_index = be32_to_cpu(lmbs[i].aa_index);
 		lmbs[i].flags = be32_to_cpu(lmbs[i].flags);
 	}
 
@@ -553,6 +554,7 @@ static void dlpar_update_drconf_property
 	for (i = 0; i < num_lmbs; i++) {
 		lmbs[i].base_addr = cpu_to_be64(lmbs[i].base_addr);
 		lmbs[i].drc_index = cpu_to_be32(lmbs[i].drc_index);
+		lmbs[i].aa_index = cpu_to_be32(lmbs[i].aa_index);
 		lmbs[i].flags = cpu_to_be32(lmbs[i].flags);
 	}
 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 68/90] perf/core: Drop kernel samples even though :u is specified
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (62 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 67/90] powerpc/hotplug-mem: Fix missing endian conversion of aa_index Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 69/90] drm/vmwgfx: Handle vmalloc() failure in vmw_local_fifo_reserve() Greg Kroah-Hartman
                   ` (23 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jin Yao, Peter Zijlstra (Intel),
	Alexander Shishkin, Arnaldo Carvalho de Melo, Jiri Olsa,
	Linus Torvalds, Namhyung Kim, Stephane Eranian, Thomas Gleixner,
	Vince Weaver, acme, jolsa, kan.liang, mark.rutland, will.deacon,
	yao.jin, Ingo Molnar

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jin Yao <yao.jin@linux.intel.com>

commit cc1582c231ea041fbc68861dfaf957eaf902b829 upstream.

When doing sampling, for example:

  perf record -e cycles:u ...

On workloads that do a lot of kernel entry/exits we see kernel
samples, even though :u is specified. This is due to skid existing.

This might be a security issue because it can leak kernel addresses even
though kernel sampling support is disabled.

The patch drops the kernel samples if exclude_kernel is specified.

For example, test on Haswell desktop:

  perf record -e cycles:u <mgen>
  perf report --stdio

Before patch applied:

    99.77%  mgen     mgen              [.] buf_read
     0.20%  mgen     mgen              [.] rand_buf_init
     0.01%  mgen     [kernel.vmlinux]  [k] apic_timer_interrupt
     0.00%  mgen     mgen              [.] last_free_elem
     0.00%  mgen     libc-2.23.so      [.] __random_r
     0.00%  mgen     libc-2.23.so      [.] _int_malloc
     0.00%  mgen     mgen              [.] rand_array_init
     0.00%  mgen     [kernel.vmlinux]  [k] page_fault
     0.00%  mgen     libc-2.23.so      [.] __random
     0.00%  mgen     libc-2.23.so      [.] __strcasestr
     0.00%  mgen     ld-2.23.so        [.] strcmp
     0.00%  mgen     ld-2.23.so        [.] _dl_start
     0.00%  mgen     libc-2.23.so      [.] sched_setaffinity@@GLIBC_2.3.4
     0.00%  mgen     ld-2.23.so        [.] _start

We can see kernel symbols apic_timer_interrupt and page_fault.

After patch applied:

    99.79%  mgen     mgen           [.] buf_read
     0.19%  mgen     mgen           [.] rand_buf_init
     0.00%  mgen     libc-2.23.so   [.] __random_r
     0.00%  mgen     mgen           [.] rand_array_init
     0.00%  mgen     mgen           [.] last_free_elem
     0.00%  mgen     libc-2.23.so   [.] vfprintf
     0.00%  mgen     libc-2.23.so   [.] rand
     0.00%  mgen     libc-2.23.so   [.] __random
     0.00%  mgen     libc-2.23.so   [.] _int_malloc
     0.00%  mgen     libc-2.23.so   [.] _IO_doallocbuf
     0.00%  mgen     ld-2.23.so     [.] do_lookup_x
     0.00%  mgen     ld-2.23.so     [.] open_verify.constprop.7
     0.00%  mgen     ld-2.23.so     [.] _dl_important_hwcaps
     0.00%  mgen     libc-2.23.so   [.] sched_setaffinity@@GLIBC_2.3.4
     0.00%  mgen     ld-2.23.so     [.] _start

There are only userspace symbols.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Cc: kan.liang@intel.com
Cc: mark.rutland@arm.com
Cc: will.deacon@arm.com
Cc: yao.jin@intel.com
Link: http://lkml.kernel.org/r/1495706947-3744-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/events/core.c |   21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6410,6 +6410,21 @@ static void perf_log_itrace_start(struct
 	perf_output_end(&handle);
 }
 
+static bool sample_is_allowed(struct perf_event *event, struct pt_regs *regs)
+{
+	/*
+	 * Due to interrupt latency (AKA "skid"), we may enter the
+	 * kernel before taking an overflow, even if the PMU is only
+	 * counting user events.
+	 * To avoid leaking information to userspace, we must always
+	 * reject kernel samples when exclude_kernel is set.
+	 */
+	if (event->attr.exclude_kernel && !user_mode(regs))
+		return false;
+
+	return true;
+}
+
 /*
  * Generic event overflow handling, sampling.
  */
@@ -6457,6 +6472,12 @@ static int __perf_event_overflow(struct
 	}
 
 	/*
+	 * For security, drop the skid kernel samples if necessary.
+	 */
+	if (!sample_is_allowed(event, regs))
+		return ret;
+
+	/*
 	 * XXX event_limit might not quite work as expected on inherited
 	 * events
 	 */

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 69/90] drm/vmwgfx: Handle vmalloc() failure in vmw_local_fifo_reserve()
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (63 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 68/90] perf/core: Drop kernel samples even though :u is specified Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 70/90] drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl() Greg Kroah-Hartman
                   ` (22 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Dan Carpenter, Sinclair Yeh

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

commit f0c62e9878024300319ba2438adc7b06c6b9c448 upstream.

If vmalloc() fails then we need to a bit of cleanup before returning.

Fixes: fb1d9738ca05 ("drm/vmwgfx: Add DRM driver for VMware Virtual GPU")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/gpu/drm/vmwgfx/vmwgfx_fifo.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/gpu/drm/vmwgfx/vmwgfx_fifo.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_fifo.c
@@ -368,6 +368,8 @@ static void *vmw_local_fifo_reserve(stru
 				return fifo_state->static_buffer;
 			else {
 				fifo_state->dynamic_buffer = vmalloc(bytes);
+				if (!fifo_state->dynamic_buffer)
+					goto out_err;
 				return fifo_state->dynamic_buffer;
 			}
 		}

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 70/90] drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl()
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (64 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 69/90] drm/vmwgfx: Handle vmalloc() failure in vmw_local_fifo_reserve() Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 71/90] drm/vmwgfx: Make sure backup_handle is always valid Greg Kroah-Hartman
                   ` (21 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Vladis Dronov, Sinclair Yeh

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vladis Dronov <vdronov@redhat.com>

commit ee9c4e681ec4f58e42a83cb0c22a0289ade1aacf upstream.

The 'req->mip_levels' parameter in vmw_gb_surface_define_ioctl() is
a user-controlled 'uint32_t' value which is used as a loop count limit.
This can lead to a kernel lockup and DoS. Add check for 'req->mip_levels'.

References:
https://bugzilla.redhat.com/show_bug.cgi?id=1437431

Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/gpu/drm/vmwgfx/vmwgfx_surface.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
@@ -1293,6 +1293,9 @@ int vmw_gb_surface_define_ioctl(struct d
 	if (req->multisample_count != 0)
 		return -EINVAL;
 
+	if (req->mip_levels > DRM_VMW_MAX_MIP_LEVELS)
+		return -EINVAL;
+
 	if (unlikely(vmw_user_surface_size == 0))
 		vmw_user_surface_size = ttm_round_pot(sizeof(*user_srf)) +
 			128;

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 71/90] drm/vmwgfx: Make sure backup_handle is always valid
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (65 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 70/90] drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl() Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 72/90] drm/nouveau/tmr: fully separate alarm execution/pending lists Greg Kroah-Hartman
                   ` (20 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Murray McAllister, Sinclair Yeh,
	Deepak Rawat

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sinclair Yeh <syeh@vmware.com>

commit 07678eca2cf9c9a18584e546c2b2a0d0c9a3150c upstream.

When vmw_gb_surface_define_ioctl() is called with an existing buffer,
we end up returning an uninitialized variable in the backup_handle.

The fix is to first initialize backup_handle to 0 just to be sure, and
second, when a user-provided buffer is found, we will use the
req->buffer_handle as the backup_handle.

Reported-by: Murray McAllister <murray.mcallister@insomniasec.com>
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/gpu/drm/vmwgfx/vmwgfx_surface.c |   18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

--- a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
@@ -1288,7 +1288,7 @@ int vmw_gb_surface_define_ioctl(struct d
 	struct ttm_object_file *tfile = vmw_fpriv(file_priv)->tfile;
 	int ret;
 	uint32_t size;
-	uint32_t backup_handle;
+	uint32_t backup_handle = 0;
 
 	if (req->multisample_count != 0)
 		return -EINVAL;
@@ -1331,12 +1331,16 @@ int vmw_gb_surface_define_ioctl(struct d
 		ret = vmw_user_dmabuf_lookup(tfile, req->buffer_handle,
 					     &res->backup,
 					     &user_srf->backup_base);
-		if (ret == 0 && res->backup->base.num_pages * PAGE_SIZE <
-		    res->backup_size) {
-			DRM_ERROR("Surface backup buffer is too small.\n");
-			vmw_dmabuf_unreference(&res->backup);
-			ret = -EINVAL;
-			goto out_unlock;
+		if (ret == 0) {
+			if (res->backup->base.num_pages * PAGE_SIZE <
+			    res->backup_size) {
+				DRM_ERROR("Surface backup buffer is too small.\n");
+				vmw_dmabuf_unreference(&res->backup);
+				ret = -EINVAL;
+				goto out_unlock;
+			} else {
+				backup_handle = req->buffer_handle;
+			}
 		}
 	} else if (req->drm_surface_flags & drm_vmw_surface_flag_create_buffer)
 		ret = vmw_user_dmabuf_alloc(dev_priv, tfile,

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 72/90] drm/nouveau/tmr: fully separate alarm execution/pending lists
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (66 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 71/90] drm/vmwgfx: Make sure backup_handle is always valid Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 73/90] ALSA: timer: Fix race between read and ioctl Greg Kroah-Hartman
                   ` (19 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Ben Skeggs

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Skeggs <bskeggs@redhat.com>

commit b4e382ca7586a63b6c1e5221ce0863ff867c2df6 upstream.

Reusing the list_head for both is a bad idea.  Callback execution is done
with the lock dropped so that alarms can be rescheduled from the callback,
which means that with some unfortunate timing, lists can get corrupted.

The execution list should not require its own locking, the single function
that uses it can only be called from a single context.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/gpu/drm/nouveau/include/nvkm/subdev/timer.h |    1 +
 drivers/gpu/drm/nouveau/nvkm/subdev/timer/base.c    |    7 ++++---
 2 files changed, 5 insertions(+), 3 deletions(-)

--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/timer.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/timer.h
@@ -4,6 +4,7 @@
 
 struct nvkm_alarm {
 	struct list_head head;
+	struct list_head exec;
 	u64 timestamp;
 	void (*func)(struct nvkm_alarm *);
 };
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/timer/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/timer/base.c
@@ -50,7 +50,8 @@ nvkm_timer_alarm_trigger(struct nvkm_tim
 		/* Move to completed list.  We'll drop the lock before
 		 * executing the callback so it can reschedule itself.
 		 */
-		list_move_tail(&alarm->head, &exec);
+		list_del_init(&alarm->head);
+		list_add(&alarm->exec, &exec);
 	}
 
 	/* Shut down interrupt if no more pending alarms. */
@@ -59,8 +60,8 @@ nvkm_timer_alarm_trigger(struct nvkm_tim
 	spin_unlock_irqrestore(&tmr->lock, flags);
 
 	/* Execute completed callbacks. */
-	list_for_each_entry_safe(alarm, atemp, &exec, head) {
-		list_del_init(&alarm->head);
+	list_for_each_entry_safe(alarm, atemp, &exec, exec) {
+		list_del(&alarm->exec);
 		alarm->func(alarm);
 	}
 }

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 73/90] ALSA: timer: Fix race between read and ioctl
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (67 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 72/90] drm/nouveau/tmr: fully separate alarm execution/pending lists Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 74/90] ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT Greg Kroah-Hartman
                   ` (18 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexander Potapenko, Takashi Iwai

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <tiwai@suse.de>

commit d11662f4f798b50d8c8743f433842c3e40fe3378 upstream.

The read from ALSA timer device, the function snd_timer_user_tread(),
may access to an uninitialized struct snd_timer_user fields when the
read is concurrently performed while the ioctl like
snd_timer_user_tselect() is invoked.  We have already fixed the races
among ioctls via a mutex, but we seem to have forgotten the race
between read vs ioctl.

This patch simply applies (more exactly extends the already applied
range of) tu->ioctl_lock in snd_timer_user_tread() for closing the
race window.

Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/core/timer.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/sound/core/timer.c
+++ b/sound/core/timer.c
@@ -1958,6 +1958,7 @@ static ssize_t snd_timer_user_read(struc
 
 	tu = file->private_data;
 	unit = tu->tread ? sizeof(struct snd_timer_tread) : sizeof(struct snd_timer_read);
+	mutex_lock(&tu->ioctl_lock);
 	spin_lock_irq(&tu->qlock);
 	while ((long)count - result >= unit) {
 		while (!tu->qused) {
@@ -1973,7 +1974,9 @@ static ssize_t snd_timer_user_read(struc
 			add_wait_queue(&tu->qchange_sleep, &wait);
 
 			spin_unlock_irq(&tu->qlock);
+			mutex_unlock(&tu->ioctl_lock);
 			schedule();
+			mutex_lock(&tu->ioctl_lock);
 			spin_lock_irq(&tu->qlock);
 
 			remove_wait_queue(&tu->qchange_sleep, &wait);
@@ -1993,7 +1996,6 @@ static ssize_t snd_timer_user_read(struc
 		tu->qused--;
 		spin_unlock_irq(&tu->qlock);
 
-		mutex_lock(&tu->ioctl_lock);
 		if (tu->tread) {
 			if (copy_to_user(buffer, &tu->tqueue[qhead],
 					 sizeof(struct snd_timer_tread)))
@@ -2003,7 +2005,6 @@ static ssize_t snd_timer_user_read(struc
 					 sizeof(struct snd_timer_read)))
 				err = -EFAULT;
 		}
-		mutex_unlock(&tu->ioctl_lock);
 
 		spin_lock_irq(&tu->qlock);
 		if (err < 0)
@@ -2013,6 +2014,7 @@ static ssize_t snd_timer_user_read(struc
 	}
  _error:
 	spin_unlock_irq(&tu->qlock);
+	mutex_unlock(&tu->ioctl_lock);
 	return result > 0 ? result : err;
 }
 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 74/90] ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (68 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 73/90] ALSA: timer: Fix race between read and ioctl Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 75/90] ASoC: Fix use-after-free at card unregistration Greg Kroah-Hartman
                   ` (17 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexander Potapenko, Takashi Iwai

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <tiwai@suse.de>

commit ba3021b2c79b2fa9114f92790a99deb27a65b728 upstream.

snd_timer_user_tselect() reallocates the queue buffer dynamically, but
it forgot to reset its indices.  Since the read may happen
concurrently with ioctl and snd_timer_user_tselect() allocates the
buffer via kmalloc(), this may lead to the leak of uninitialized
kernel-space data, as spotted via KMSAN:

  BUG: KMSAN: use of unitialized memory in snd_timer_user_read+0x6c4/0xa10
  CPU: 0 PID: 1037 Comm: probe Not tainted 4.11.0-rc5+ #2739
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Call Trace:
   __dump_stack lib/dump_stack.c:16
   dump_stack+0x143/0x1b0 lib/dump_stack.c:52
   kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:1007
   kmsan_check_memory+0xc2/0x140 mm/kmsan/kmsan.c:1086
   copy_to_user ./arch/x86/include/asm/uaccess.h:725
   snd_timer_user_read+0x6c4/0xa10 sound/core/timer.c:2004
   do_loop_readv_writev fs/read_write.c:716
   __do_readv_writev+0x94c/0x1380 fs/read_write.c:864
   do_readv_writev fs/read_write.c:894
   vfs_readv fs/read_write.c:908
   do_readv+0x52a/0x5d0 fs/read_write.c:934
   SYSC_readv+0xb6/0xd0 fs/read_write.c:1021
   SyS_readv+0x87/0xb0 fs/read_write.c:1018

This patch adds the missing reset of queue indices.  Together with the
previous fix for the ioctl/read race, we cover the whole problem.

Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/core/timer.c |    1 +
 1 file changed, 1 insertion(+)

--- a/sound/core/timer.c
+++ b/sound/core/timer.c
@@ -1621,6 +1621,7 @@ static int snd_timer_user_tselect(struct
 	if (err < 0)
 		goto __err;
 
+	tu->qhead = tu->qtail = tu->qused = 0;
 	kfree(tu->queue);
 	tu->queue = NULL;
 	kfree(tu->tqueue);

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 75/90] ASoC: Fix use-after-free at card unregistration
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (69 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 74/90] ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 76/90] drivers: char: mem: Fix wraparound check to allow mappings up to the end Greg Kroah-Hartman
                   ` (16 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Takashi Iwai, Mark Brown

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <tiwai@suse.de>

commit 4efda5f2130da033aeedc5b3205569893b910de2 upstream.

soc_cleanup_card_resources() call snd_card_free() at the last of its
procedure.  This turned out to lead to a use-after-free.
PCM runtimes have been already removed via soc_remove_pcm_runtimes(),
while it's dereferenced later in soc_pcm_free() called via
snd_card_free().

The fix is simple: just move the snd_card_free() call to the beginning
of the whole procedure.  This also gives another benefit: it
guarantees that all operations have been shut down before actually
releasing the resources, which was racy until now.

Reported-and-tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 sound/soc/soc-core.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -1775,6 +1775,9 @@ static int soc_cleanup_card_resources(st
 	for (i = 0; i < card->num_aux_devs; i++)
 		soc_remove_aux_dev(card, i);
 
+	/* free the ALSA card at first; this syncs with pending operations */
+	snd_card_free(card->snd_card);
+
 	/* remove and free each DAI */
 	soc_remove_dai_links(card);
 
@@ -1786,9 +1789,7 @@ static int soc_cleanup_card_resources(st
 
 	snd_soc_dapm_free(&card->dapm);
 
-	snd_card_free(card->snd_card);
 	return 0;
-
 }
 
 /* removes a socdev */

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 76/90] drivers: char: mem: Fix wraparound check to allow mappings up to the end
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (70 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 75/90] ASoC: Fix use-after-free at card unregistration Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 77/90] tty: Drop krefs for interrupted tty lock Greg Kroah-Hartman
                   ` (15 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Nico Huber, Julius Werner

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Julius Werner <jwerner@chromium.org>

commit 32829da54d9368103a2f03269a5120aa9ee4d5da upstream.

A recent fix to /dev/mem prevents mappings from wrapping around the end
of physical address space. However, the check was written in a way that
also prevents a mapping reaching just up to the end of physical address
space, which may be a valid use case (especially on 32-bit systems).
This patch fixes it by checking the last mapped address (instead of the
first address behind that) for overflow.

Fixes: b299cde245 ("drivers: char: mem: Check for address space wraparound with mmap()")
Reported-by: Nico Huber <nico.h@gmx.de>
Signed-off-by: Julius Werner <jwerner@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/char/mem.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -346,7 +346,7 @@ static int mmap_mem(struct file *file, s
 	phys_addr_t offset = (phys_addr_t)vma->vm_pgoff << PAGE_SHIFT;
 
 	/* It's illegal to wrap around the end of the physical address space. */
-	if (offset + (phys_addr_t)size < offset)
+	if (offset + (phys_addr_t)size - 1 < offset)
 		return -EINVAL;
 
 	if (!valid_mmap_phys_addr_range(vma->vm_pgoff, size))

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 77/90] tty: Drop krefs for interrupted tty lock
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (71 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 76/90] drivers: char: mem: Fix wraparound check to allow mappings up to the end Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 78/90] serial: sh-sci: Fix panic when serial console and DMA are enabled Greg Kroah-Hartman
                   ` (14 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dmitry Vyukov, Peter Hurley, Jiri Slaby

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Hurley <peter@hurleysoftware.com>

commit e9036d0662360cd4c79578565ce422ed5872f301 upstream.

When the tty lock is interrupted on attempted re-open, 2 tty krefs
are still held. Drop extra kref before returning failure from
tty_lock_interruptible(), and drop lookup kref before returning
failure from tty_open().

Fixes: 0bfd464d3fdd ("tty: Wait interruptibly for tty lock on reopen")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/tty/tty_io.c    |    3 +--
 drivers/tty/tty_mutex.c |    7 ++++++-
 2 files changed, 7 insertions(+), 3 deletions(-)

--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -2070,13 +2070,12 @@ retry_open:
 		if (tty) {
 			mutex_unlock(&tty_mutex);
 			retval = tty_lock_interruptible(tty);
+			tty_kref_put(tty);  /* drop kref from tty_driver_lookup_tty() */
 			if (retval) {
 				if (retval == -EINTR)
 					retval = -ERESTARTSYS;
 				goto err_unref;
 			}
-			/* safe to drop the kref from tty_driver_lookup_tty() */
-			tty_kref_put(tty);
 			retval = tty_reopen(tty);
 			if (retval < 0) {
 				tty_unlock(tty);
--- a/drivers/tty/tty_mutex.c
+++ b/drivers/tty/tty_mutex.c
@@ -24,10 +24,15 @@ EXPORT_SYMBOL(tty_lock);
 
 int tty_lock_interruptible(struct tty_struct *tty)
 {
+	int ret;
+
 	if (WARN(tty->magic != TTY_MAGIC, "L Bad %p\n", tty))
 		return -EIO;
 	tty_kref_get(tty);
-	return mutex_lock_interruptible(&tty->legacy_mutex);
+	ret = mutex_lock_interruptible(&tty->legacy_mutex);
+	if (ret)
+		tty_kref_put(tty);
+	return ret;
 }
 
 void __lockfunc tty_unlock(struct tty_struct *tty)

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 78/90] serial: sh-sci: Fix panic when serial console and DMA are enabled
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (72 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 77/90] tty: Drop krefs for interrupted tty lock Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 79/90] net: better skb->sender_cpu and skb->napi_id cohabitation Greg Kroah-Hartman
                   ` (13 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Takatoshi Akiyama, Yoshihiro Shimoda,
	Jiri Slaby

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Takatoshi Akiyama <takatoshi.akiyama.kj@ps.hitachi-solutions.com>

commit 3c9101766b502a0163d1d437fada5801cf616be2 upstream.

This patch fixes an issue that kernel panic happens when DMA is enabled
and we press enter key while the kernel booting on the serial console.

* An interrupt may occur after sci_request_irq().
* DMA transfer area is initialized by setup_timer() in sci_request_dma()
  and used in interrupt.

If an interrupt occurred between sci_request_irq() and setup_timer() in
sci_request_dma(), DMA transfer area has not been initialized yet.
So, this patch changes the order of sci_request_irq() and
sci_request_dma().

Fixes: 73a19e4c0301 ("serial: sh-sci: Add DMA support.")
Signed-off-by: Takatoshi Akiyama <takatoshi.akiyama.kj@ps.hitachi-solutions.com>
[Shimoda changes the commit log]
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/tty/serial/sh-sci.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

--- a/drivers/tty/serial/sh-sci.c
+++ b/drivers/tty/serial/sh-sci.c
@@ -1800,11 +1800,13 @@ static int sci_startup(struct uart_port
 
 	dev_dbg(port->dev, "%s(%d)\n", __func__, port->line);
 
+	sci_request_dma(port);
+
 	ret = sci_request_irq(s);
-	if (unlikely(ret < 0))
+	if (unlikely(ret < 0)) {
+		sci_free_dma(port);
 		return ret;
-
-	sci_request_dma(port);
+	}
 
 	spin_lock_irqsave(&port->lock, flags);
 	sci_start_tx(port);
@@ -1834,8 +1836,8 @@ static void sci_shutdown(struct uart_por
 	}
 #endif
 
-	sci_free_dma(port);
 	sci_free_irq(s);
+	sci_free_dma(port);
 }
 
 static unsigned int sci_scbrr_calc(struct sci_port *s, unsigned int bps,

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 79/90] net: better skb->sender_cpu and skb->napi_id cohabitation
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (73 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 78/90] serial: sh-sci: Fix panic when serial console and DMA are enabled Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 80/90] mm: consider memblock reservations for deferred memory initialization sizing Greg Kroah-Hartman
                   ` (12 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Alexei Starovoitov,
	David S. Miller, Paul Menzel

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

commit 52bd2d62ce6758d811edcbd2256eb9ea7f6a56cb upstream.

skb->sender_cpu and skb->napi_id share a common storage,
and we had various bugs about this.

We had to call skb_sender_cpu_clear() in some places to
not leave a prior skb->napi_id and fool netdev_pick_tx()

As suggested by Alexei, we could split the space so that
these errors can not happen.

0 value being reserved as the common (not initialized) value,
let's reserve [1 .. NR_CPUS] range for valid sender_cpu,
and [NR_CPUS+1 .. ~0U] for valid napi_id.

This will allow proper busy polling support over tunnels.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/linux/skbuff.h |    3 ---
 net/core/dev.c         |   33 ++++++++++++++++-----------------
 2 files changed, 16 insertions(+), 20 deletions(-)

--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1084,9 +1084,6 @@ static inline void skb_copy_hash(struct
 
 static inline void skb_sender_cpu_clear(struct sk_buff *skb)
 {
-#ifdef CONFIG_XPS
-	skb->sender_cpu = 0;
-#endif
 }
 
 #ifdef NET_SKBUFF_DATA_USES_OFFSET
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -182,7 +182,7 @@ EXPORT_SYMBOL(dev_base_lock);
 /* protects napi_hash addition/deletion and napi_gen_id */
 static DEFINE_SPINLOCK(napi_hash_lock);
 
-static unsigned int napi_gen_id;
+static unsigned int napi_gen_id = NR_CPUS;
 static DEFINE_HASHTABLE(napi_hash, 8);
 
 static seqcount_t devnet_rename_seq;
@@ -3049,7 +3049,9 @@ struct netdev_queue *netdev_pick_tx(stru
 	int queue_index = 0;
 
 #ifdef CONFIG_XPS
-	if (skb->sender_cpu == 0)
+	u32 sender_cpu = skb->sender_cpu - 1;
+
+	if (sender_cpu >= (u32)NR_CPUS)
 		skb->sender_cpu = raw_smp_processor_id() + 1;
 #endif
 
@@ -4726,25 +4728,22 @@ EXPORT_SYMBOL_GPL(napi_by_id);
 
 void napi_hash_add(struct napi_struct *napi)
 {
-	if (!test_and_set_bit(NAPI_STATE_HASHED, &napi->state)) {
+	if (test_and_set_bit(NAPI_STATE_HASHED, &napi->state))
+		return;
 
-		spin_lock(&napi_hash_lock);
+	spin_lock(&napi_hash_lock);
 
-		/* 0 is not a valid id, we also skip an id that is taken
-		 * we expect both events to be extremely rare
-		 */
-		napi->napi_id = 0;
-		while (!napi->napi_id) {
-			napi->napi_id = ++napi_gen_id;
-			if (napi_by_id(napi->napi_id))
-				napi->napi_id = 0;
-		}
+	/* 0..NR_CPUS+1 range is reserved for sender_cpu use */
+	do {
+		if (unlikely(++napi_gen_id < NR_CPUS + 1))
+			napi_gen_id = NR_CPUS + 1;
+	} while (napi_by_id(napi_gen_id));
+	napi->napi_id = napi_gen_id;
 
-		hlist_add_head_rcu(&napi->napi_hash_node,
-			&napi_hash[napi->napi_id % HASH_SIZE(napi_hash)]);
+	hlist_add_head_rcu(&napi->napi_hash_node,
+			   &napi_hash[napi->napi_id % HASH_SIZE(napi_hash)]);
 
-		spin_unlock(&napi_hash_lock);
-	}
+	spin_unlock(&napi_hash_lock);
 }
 EXPORT_SYMBOL_GPL(napi_hash_add);
 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 80/90] mm: consider memblock reservations for deferred memory initialization sizing
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (74 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 79/90] net: better skb->sender_cpu and skb->napi_id cohabitation Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 81/90] NFS: Ensure we revalidate attributes before using execute_ok() Greg Kroah-Hartman
                   ` (11 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Michal Hocko, Mel Gorman,
	Srikar Dronamraju, Andrew Morton, Linus Torvalds

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michal Hocko <mhocko@suse.com>

commit 864b9a393dcb5aed09b8fd31b9bbda0fdda99374 upstream.

We have seen an early OOM killer invocation on ppc64 systems with
crashkernel=4096M:

	kthreadd invoked oom-killer: gfp_mask=0x16040c0(GFP_KERNEL|__GFP_COMP|__GFP_NOTRACK), nodemask=7, order=0, oom_score_adj=0
	kthreadd cpuset=/ mems_allowed=7
	CPU: 0 PID: 2 Comm: kthreadd Not tainted 4.4.68-1.gd7fe927-default #1
	Call Trace:
	  dump_stack+0xb0/0xf0 (unreliable)
	  dump_header+0xb0/0x258
	  out_of_memory+0x5f0/0x640
	  __alloc_pages_nodemask+0xa8c/0xc80
	  kmem_getpages+0x84/0x1a0
	  fallback_alloc+0x2a4/0x320
	  kmem_cache_alloc_node+0xc0/0x2e0
	  copy_process.isra.25+0x260/0x1b30
	  _do_fork+0x94/0x470
	  kernel_thread+0x48/0x60
	  kthreadd+0x264/0x330
	  ret_from_kernel_thread+0x5c/0xa4

	Mem-Info:
	active_anon:0 inactive_anon:0 isolated_anon:0
	 active_file:0 inactive_file:0 isolated_file:0
	 unevictable:0 dirty:0 writeback:0 unstable:0
	 slab_reclaimable:5 slab_unreclaimable:73
	 mapped:0 shmem:0 pagetables:0 bounce:0
	 free:0 free_pcp:0 free_cma:0
	Node 7 DMA free:0kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:52428800kB managed:110016kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:320kB slab_unreclaimable:4672kB kernel_stack:1152kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
	lowmem_reserve[]: 0 0 0 0
	Node 7 DMA: 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 0kB
	0 total pagecache pages
	0 pages in swap cache
	Swap cache stats: add 0, delete 0, find 0/0
	Free swap  = 0kB
	Total swap = 0kB
	819200 pages RAM
	0 pages HighMem/MovableOnly
	817481 pages reserved
	0 pages cma reserved
	0 pages hwpoisoned

the reason is that the managed memory is too low (only 110MB) while the
rest of the the 50GB is still waiting for the deferred intialization to
be done.  update_defer_init estimates the initial memoty to initialize
to 2GB at least but it doesn't consider any memory allocated in that
range.  In this particular case we've had

	Reserving 4096MB of memory at 128MB for crashkernel (System RAM: 51200MB)

so the low 2GB is mostly depleted.

Fix this by considering memblock allocations in the initial static
initialization estimation.  Move the max_initialise to
reset_deferred_meminit and implement a simple memblock_reserved_memory
helper which iterates all reserved blocks and sums the size of all that
start below the given address.  The cumulative size is than added on top
of the initial estimation.  This is still not ideal because
reset_deferred_meminit doesn't consider holes and so reservation might
be above the initial estimation whihch we ignore but let's make the
logic simpler until we really need to handle more complicated cases.

Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Link: http://lkml.kernel.org/r/20170531104010.GI27783@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


---
 include/linux/memblock.h |    8 ++++++++
 include/linux/mmzone.h   |    1 +
 mm/memblock.c            |   24 ++++++++++++++++++++++++
 mm/page_alloc.c          |   25 ++++++++++++++++++++++---
 4 files changed, 55 insertions(+), 3 deletions(-)

--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -408,11 +408,19 @@ static inline void early_memtest(phys_ad
 }
 #endif
 
+extern unsigned long memblock_reserved_memory_within(phys_addr_t start_addr,
+		phys_addr_t end_addr);
 #else
 static inline phys_addr_t memblock_alloc(phys_addr_t size, phys_addr_t align)
 {
 	return 0;
 }
+
+static inline unsigned long memblock_reserved_memory_within(phys_addr_t start_addr,
+		phys_addr_t end_addr)
+{
+	return 0;
+}
 
 #endif /* CONFIG_HAVE_MEMBLOCK */
 
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -688,6 +688,7 @@ typedef struct pglist_data {
 	 * is the first PFN that needs to be initialised.
 	 */
 	unsigned long first_deferred_pfn;
+	unsigned long static_init_size;
 #endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */
 } pg_data_t;
 
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1634,6 +1634,30 @@ static void __init_memblock memblock_dum
 	}
 }
 
+extern unsigned long __init_memblock
+memblock_reserved_memory_within(phys_addr_t start_addr, phys_addr_t end_addr)
+{
+	struct memblock_type *type = &memblock.reserved;
+	unsigned long size = 0;
+	int idx;
+
+	for (idx = 0; idx < type->cnt; idx++) {
+		struct memblock_region *rgn = &type->regions[idx];
+		phys_addr_t start, end;
+
+		if (rgn->base + rgn->size < start_addr)
+			continue;
+		if (rgn->base > end_addr)
+			continue;
+
+		start = rgn->base;
+		end = start + rgn->size;
+		size += end - start;
+	}
+
+	return size;
+}
+
 void __init_memblock __memblock_dump_all(void)
 {
 	pr_info("MEMBLOCK configuration:\n");
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -269,6 +269,26 @@ int page_group_by_mobility_disabled __re
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 static inline void reset_deferred_meminit(pg_data_t *pgdat)
 {
+	unsigned long max_initialise;
+	unsigned long reserved_lowmem;
+
+	/*
+	 * Initialise at least 2G of a node but also take into account that
+	 * two large system hashes that can take up 1GB for 0.25TB/node.
+	 */
+	max_initialise = max(2UL << (30 - PAGE_SHIFT),
+		(pgdat->node_spanned_pages >> 8));
+
+	/*
+	 * Compensate the all the memblock reservations (e.g. crash kernel)
+	 * from the initial estimation to make sure we will initialize enough
+	 * memory to boot.
+	 */
+	reserved_lowmem = memblock_reserved_memory_within(pgdat->node_start_pfn,
+			pgdat->node_start_pfn + max_initialise);
+	max_initialise += reserved_lowmem;
+
+	pgdat->static_init_size = min(max_initialise, pgdat->node_spanned_pages);
 	pgdat->first_deferred_pfn = ULONG_MAX;
 }
 
@@ -302,10 +322,9 @@ static inline bool update_defer_init(pg_
 	/* Always populate low zones for address-contrained allocations */
 	if (zone_end < pgdat_end_pfn(pgdat))
 		return true;
-
 	/* Initialise at least 2G of the highest zone */
 	(*nr_initialised)++;
-	if (*nr_initialised > (2UL << (30 - PAGE_SHIFT)) &&
+	if ((*nr_initialised > pgdat->static_init_size) &&
 	    (pfn & (PAGES_PER_SECTION - 1)) == 0) {
 		pgdat->first_deferred_pfn = pfn;
 		return false;
@@ -5343,7 +5362,6 @@ void __paginginit free_area_init_node(in
 	/* pg_data_t should be reset to zero when it's allocated */
 	WARN_ON(pgdat->nr_zones || pgdat->classzone_idx);
 
-	reset_deferred_meminit(pgdat);
 	pgdat->node_id = nid;
 	pgdat->node_start_pfn = node_start_pfn;
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
@@ -5362,6 +5380,7 @@ void __paginginit free_area_init_node(in
 		(unsigned long)pgdat->node_mem_map);
 #endif
 
+	reset_deferred_meminit(pgdat);
 	free_area_init_core(pgdat);
 }
 

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 81/90] NFS: Ensure we revalidate attributes before using execute_ok()
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (75 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 80/90] mm: consider memblock reservations for deferred memory initialization sizing Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 82/90] NFSv4: Dont perform cached access checks before weve OPENed the file Greg Kroah-Hartman
                   ` (10 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Donald Buczek, Trond Myklebust, Paul Menzel

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Trond Myklebust <trond.myklebust@primarydata.com>

commit 5c5fc09a1157a11dbe84e6421c3e0b37d05238cb upstream.

Donald Buczek reports that NFS clients can also report incorrect
results for access() due to lack of revalidation of attributes
before calling execute_ok().
Looking closely, it seems chdir() is afflicted with the same problem.

Fix is to ensure we call nfs_revalidate_inode_rcu() or
nfs_revalidate_inode() as appropriate before deciding to trust
execute_ok().

Reported-by: Donald Buczek <buczek@molgen.mpg.de>
Link: http://lkml.kernel.org/r/1451331530-3748-1-git-send-email-buczek@molgen.mpg.de
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/nfs/dir.c |   18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -2421,6 +2421,20 @@ int nfs_may_open(struct inode *inode, st
 }
 EXPORT_SYMBOL_GPL(nfs_may_open);
 
+static int nfs_execute_ok(struct inode *inode, int mask)
+{
+	struct nfs_server *server = NFS_SERVER(inode);
+	int ret;
+
+	if (mask & MAY_NOT_BLOCK)
+		ret = nfs_revalidate_inode_rcu(server, inode);
+	else
+		ret = nfs_revalidate_inode(server, inode);
+	if (ret == 0 && !execute_ok(inode))
+		ret = -EACCES;
+	return ret;
+}
+
 int nfs_permission(struct inode *inode, int mask)
 {
 	struct rpc_cred *cred;
@@ -2470,8 +2484,8 @@ force_lookup:
 			res = PTR_ERR(cred);
 	}
 out:
-	if (!res && (mask & MAY_EXEC) && !execute_ok(inode))
-		res = -EACCES;
+	if (!res && (mask & MAY_EXEC))
+		res = nfs_execute_ok(inode, mask);
 
 	dfprintk(VFS, "NFS: permission(%s/%lu), mask=0x%x, res=%d\n",
 		inode->i_sb->s_id, inode->i_ino, mask, res);

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 82/90] NFSv4: Dont perform cached access checks before weve OPENed the file
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (76 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 81/90] NFS: Ensure we revalidate attributes before using execute_ok() Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 83/90] Make __xfs_xattr_put_listen preperly report errors Greg Kroah-Hartman
                   ` (9 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Donald Buczek, Al Viro,
	Trond Myklebust, Paul Menzel

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Trond Myklebust <trond.myklebust@primarydata.com>

commit 762674f86d0328d5dc923c966e209e1ee59663f2 upstream.

Donald Buczek reports that a nfs4 client incorrectly denies
execute access based on outdated file mode (missing 'x' bit).
After the mode on the server is 'fixed' (chmod +x) further execution
attempts continue to fail, because the nfs ACCESS call updates
the access parameter but not the mode parameter or the mode in
the inode.

The root cause is ultimately that the VFS is calling may_open()
before the NFS client has a chance to OPEN the file and hence revalidate
the access and attribute caches.

Al Viro suggests:
>>> Make nfs_permission() relax the checks when it sees MAY_OPEN, if you know
>>> that things will be caught by server anyway?
>>
>> That can work as long as we're guaranteed that everything that calls
>> inode_permission() with MAY_OPEN on a regular file will also follow up
>> with a vfs_open() or dentry_open() on success. Is this always the
>> case?
>
> 1) in do_tmpfile(), followed by do_dentry_open() (not reachable by NFS since
> it doesn't have ->tmpfile() instance anyway)
>
> 2) in atomic_open(), after the call of ->atomic_open() has succeeded.
>
> 3) in do_last(), followed on success by vfs_open()
>
> That's all.  All calls of inode_permission() that get MAY_OPEN come from
> may_open(), and there's no other callers of that puppy.

Reported-by: Donald Buczek <buczek@molgen.mpg.de>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=109771
Link: http://lkml.kernel.org/r/1451046656-26319-1-git-send-email-buczek@molgen.mpg.de
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/nfs/dir.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -2452,6 +2452,9 @@ int nfs_permission(struct inode *inode,
 		case S_IFLNK:
 			goto out;
 		case S_IFREG:
+			if ((mask & MAY_OPEN) &&
+			   nfs_server_capable(inode, NFS_CAP_ATOMIC_OPEN))
+				return 0;
 			break;
 		case S_IFDIR:
 			/*

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 83/90] Make __xfs_xattr_put_listen preperly report errors.
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (77 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 82/90] NFSv4: Dont perform cached access checks before weve OPENed the file Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 84/90] arm64: hw_breakpoint: fix watchpoint matching for tagged pointers Greg Kroah-Hartman
                   ` (8 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Artem Savkov, Dave Chinner,
	Dave Chinner, Nikolay Borisov

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Artem Savkov <asavkov@redhat.com>

commit 791cc43b36eb1f88166c8505900cad1b43c7fe1a upstream.

Commit 2a6fba6 "xfs: only return -errno or success from attr ->put_listent"
changes the returnvalue of __xfs_xattr_put_listen to 0 in case when there is
insufficient space in the buffer assuming that setting context->count to -1
would be enough, but all of the ->put_listent callers only check seen_enough.
This results in a failed assertion:
XFS: Assertion failed: context->count >= 0, file: fs/xfs/xfs_xattr.c, line: 175
in insufficient buffer size case.

This is only reproducible with at least 2 xattrs and only when the buffer
gets depleted before the last one.

Furthermore if buffersize is such that it is enough to hold the last xattr's
name, but not enough to hold the sum of preceeding xattr names listxattr won't
fail with ERANGE, but will suceed returning last xattr's name without the
first character. The first character end's up overwriting data stored at
(context->alist - 1).

Signed-off-by: Artem Savkov <asavkov@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/xfs/xfs_xattr.c |    1 +
 1 file changed, 1 insertion(+)

--- a/fs/xfs/xfs_xattr.c
+++ b/fs/xfs/xfs_xattr.c
@@ -180,6 +180,7 @@ xfs_xattr_put_listent(
 	arraytop = context->count + prefix_len + namelen + 1;
 	if (arraytop > context->firstu) {
 		context->count = -1;	/* insufficient space */
+		context->seen_enough = 1;
 		return 0;
 	}
 	offset = (char *)context->alist + context->count;

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 84/90] arm64: hw_breakpoint: fix watchpoint matching for tagged pointers
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (78 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 83/90] Make __xfs_xattr_put_listen preperly report errors Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 85/90] arm64: entry: improve data abort handling of " Greg Kroah-Hartman
                   ` (7 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mark Rutland, Will Deacon,
	Kristina Martsenko, Catalin Marinas

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kristina Martsenko <kristina.martsenko@arm.com>

commit 7dcd9dd8cebe9fa626af7e2358d03a37041a70fb upstream.

This backport has a few small differences from the upstream commit:
 - The address tag is removed in watchpoint_handler() instead of
   get_distance_from_watchpoint(), because 4.4 does not have commit
   fdfeff0f9e3d ("arm64: hw_breakpoint: Handle inexact watchpoint
   addresses").
 - A macro is backported (untagged_addr), as it is not present in 4.4.

Original patch description:

When we take a watchpoint exception, the address that triggered the
watchpoint is found in FAR_EL1. We compare it to the address of each
configured watchpoint to see which one was hit.

The configured watchpoint addresses are untagged, while the address in
FAR_EL1 will have an address tag if the data access was done using a
tagged address. The tag needs to be removed to compare the address to
the watchpoints.

Currently we don't remove it, and as a result can report the wrong
watchpoint as being hit (specifically, always either the highest TTBR0
watchpoint or lowest TTBR1 watchpoint). This patch removes the tag.

Fixes: d50240a5f6ce ("arm64: mm: permit use of tagged pointers at EL0")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/include/asm/uaccess.h  |    8 ++++++++
 arch/arm64/kernel/hw_breakpoint.c |    3 ++-
 2 files changed, 10 insertions(+), 1 deletion(-)

--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -21,6 +21,7 @@
 /*
  * User space memory access functions
  */
+#include <linux/bitops.h>
 #include <linux/string.h>
 #include <linux/thread_info.h>
 
@@ -103,6 +104,13 @@ static inline void set_fs(mm_segment_t f
 	flag;								\
 })
 
+/*
+ * When dealing with data aborts, watchpoints, or instruction traps we may end
+ * up with a tagged userland pointer. Clear the tag to get a sane pointer to
+ * pass on to access_ok(), for instance.
+ */
+#define untagged_addr(addr)		sign_extend64(addr, 55)
+
 #define access_ok(type, addr, size)	__range_ok(addr, size)
 #define user_addr_max			get_fs
 
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -35,6 +35,7 @@
 #include <asm/traps.h>
 #include <asm/cputype.h>
 #include <asm/system_misc.h>
+#include <asm/uaccess.h>
 
 /* Breakpoint currently in use for each BRP. */
 static DEFINE_PER_CPU(struct perf_event *, bp_on_reg[ARM_MAX_BRP]);
@@ -690,7 +691,7 @@ static int watchpoint_handler(unsigned l
 
 		/* Check if the watchpoint value matches. */
 		val = read_wb_reg(AARCH64_DBG_REG_WVR, i);
-		if (val != (addr & ~alignment_mask))
+		if (val != (untagged_addr(addr) & ~alignment_mask))
 			goto unlock;
 
 		/* Possible match, check the byte address select to confirm. */

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 85/90] arm64: entry: improve data abort handling of tagged pointers
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (79 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 84/90] arm64: hw_breakpoint: fix watchpoint matching for tagged pointers Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 86/90] RDMA/qib,hfi1: Fix MR reference count leak on write with immediate Greg Kroah-Hartman
                   ` (6 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dave Martin, Will Deacon,
	Kristina Martsenko, Catalin Marinas

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kristina Martsenko <kristina.martsenko@arm.com>

commit 276e93279a630657fff4b086ba14c95955912dfa upstream.

This backport has a minor difference from the upstream commit: it adds
the asm-uaccess.h file, which is not present in 4.4, because 4.4 does
not have commit b4b8664d291a ("arm64: don't pull uaccess.h into *.S").

Original patch description:

When handling a data abort from EL0, we currently zero the top byte of
the faulting address, as we assume the address is a TTBR0 address, which
may contain a non-zero address tag. However, the address may be a TTBR1
address, in which case we should not zero the top byte. This patch fixes
that. The effect is that the full TTBR1 address is passed to the task's
signal handler (or printed out in the kernel log).

When handling a data abort from EL1, we leave the faulting address
intact, as we assume it's either a TTBR1 address or a TTBR0 address with
tag 0x00. This is true as far as I'm aware, we don't seem to access a
tagged TTBR0 address anywhere in the kernel. Regardless, it's easy to
forget about address tags, and code added in the future may not always
remember to remove tags from addresses before accessing them. So add tag
handling to the EL1 data abort handler as well. This also makes it
consistent with the EL0 data abort handler.

Fixes: d50240a5f6ce ("arm64: mm: permit use of tagged pointers at EL0")
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arm64/include/asm/asm-uaccess.h |   13 +++++++++++++
 arch/arm64/kernel/entry.S            |    6 ++++--
 2 files changed, 17 insertions(+), 2 deletions(-)

--- /dev/null
+++ b/arch/arm64/include/asm/asm-uaccess.h
@@ -0,0 +1,13 @@
+#ifndef __ASM_ASM_UACCESS_H
+#define __ASM_ASM_UACCESS_H
+
+/*
+ * Remove the address tag from a virtual address, if present.
+ */
+	.macro	clear_address_tag, dst, addr
+	tst	\addr, #(1 << 55)
+	bic	\dst, \addr, #(0xff << 56)
+	csel	\dst, \dst, \addr, eq
+	.endm
+
+#endif
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -29,6 +29,7 @@
 #include <asm/esr.h>
 #include <asm/memory.h>
 #include <asm/thread_info.h>
+#include <asm/asm-uaccess.h>
 #include <asm/unistd.h>
 
 /*
@@ -316,12 +317,13 @@ el1_da:
 	/*
 	 * Data abort handling
 	 */
-	mrs	x0, far_el1
+	mrs	x3, far_el1
 	enable_dbg
 	// re-enable interrupts if they were enabled in the aborted context
 	tbnz	x23, #7, 1f			// PSR_I_BIT
 	enable_irq
 1:
+	clear_address_tag x0, x3
 	mov	x2, sp				// struct pt_regs
 	bl	do_mem_abort
 
@@ -483,7 +485,7 @@ el0_da:
 	// enable interrupts before calling the main handler
 	enable_dbg_and_irq
 	ct_user_exit
-	bic	x0, x26, #(0xff << 56)
+	clear_address_tag x0, x26
 	mov	x1, x25
 	mov	x2, sp
 	bl	do_mem_abort

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 86/90] RDMA/qib,hfi1: Fix MR reference count leak on write with immediate
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (80 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 85/90] arm64: entry: improve data abort handling of " Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 87/90] tracing: Use strlcpy() instead of strcpy() in __trace_find_cmdline() Greg Kroah-Hartman
                   ` (5 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tadeusz Struk, Dennis Dalessandro,
	Mike Marciniszyn, Doug Ledford

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mike Marciniszyn <mike.marciniszyn@intel.com>

commit 1feb40067cf04ae48d65f728d62ca255c9449178 upstream.

The handling of IB_RDMA_WRITE_ONLY_WITH_IMMEDIATE will leak a memory
reference when a buffer cannot be allocated for returning the immediate
data.

The issue is that the rkey validation has already occurred and the RNR
nak fails to release the reference that was fruitlessly gotten.  The
the peer will send the identical single packet request when its RNR
timer pops.

The fix is to release the held reference prior to the rnr nak exit.
This is the only sequence the requires both rkey validation and the
buffer allocation on the same packet.

Cc: Stable <stable@vger.kernel.org> # 4.7+
Tested-by: Tadeusz Struk <tadeusz.struk@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/infiniband/hw/qib/qib_rc.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/infiniband/hw/qib/qib_rc.c
+++ b/drivers/infiniband/hw/qib/qib_rc.c
@@ -2088,8 +2088,10 @@ send_last:
 		ret = qib_get_rwqe(qp, 1);
 		if (ret < 0)
 			goto nack_op_err;
-		if (!ret)
+		if (!ret) {
+			qib_put_ss(&qp->r_sge);
 			goto rnr_nak;
+		}
 		wc.ex.imm_data = ohdr->u.rc.imm_data;
 		hdrsize += 4;
 		wc.wc_flags = IB_WC_WITH_IMM;

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 87/90] tracing: Use strlcpy() instead of strcpy() in __trace_find_cmdline()
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (81 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 86/90] RDMA/qib,hfi1: Fix MR reference count leak on write with immediate Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 88/90] usercopy: Adjust tests to deal with SMAP/PAN Greg Kroah-Hartman
                   ` (4 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Amey Telawane, Amit Pundir,
	Steven Rostedt (VMware)

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Amey Telawane <ameyt@codeaurora.org>

commit e09e28671cda63e6308b31798b997639120e2a21 upstream.

Strcpy is inherently not safe, and strlcpy() should be used instead.
__trace_find_cmdline() uses strcpy() because the comms saved must have a
terminating nul character, but it doesn't hurt to add the extra protection
of using strlcpy() instead of strcpy().

Link: http://lkml.kernel.org/r/1493806274-13936-1-git-send-email-amit.pundir@linaro.org

Signed-off-by: Amey Telawane <ameyt@codeaurora.org>
[AmitP: Cherry-picked this commit from CodeAurora kernel/msm-3.10
https://source.codeaurora.org/quic/la/kernel/msm-3.10/commit/?id=2161ae9a70b12cf18ac8e5952a20161ffbccb477]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
[ Updated change log and removed the "- 1" from len parameter ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/trace/trace.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1617,7 +1617,7 @@ static void __trace_find_cmdline(int pid
 
 	map = savedcmd->map_pid_to_cmdline[pid];
 	if (map != NO_CMDLINE_MAP)
-		strcpy(comm, get_saved_cmdlines(map));
+		strlcpy(comm, get_saved_cmdlines(map), TASK_COMM_LEN);
 	else
 		strcpy(comm, "<...>");
 }

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 88/90] usercopy: Adjust tests to deal with SMAP/PAN
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (82 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 87/90] tracing: Use strlcpy() instead of strcpy() in __trace_find_cmdline() Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 89/90] arm64: armv8_deprecated: ensure extension of addr Greg Kroah-Hartman
                   ` (3 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Kees Cook, Arnd Bergmann

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit f5f893c57e37ca730808cb2eee3820abd05e7507 upstream.

Under SMAP/PAN/etc, we cannot write directly to userspace memory, so
this rearranges the test bytes to get written through copy_to_user().
Additionally drops the bad copy_from_user() test that would trigger a
memcpy() against userspace on failure.

[arnd: the test module was added in 3.14, and this backported patch
       should apply cleanly on all version from 3.14 to 4.10.
       The original patch was in 4.11 on top of a context change
       I saw the bug triggered with kselftest on a 4.4.y stable kernel]

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 lib/test_user_copy.c |   20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

--- a/lib/test_user_copy.c
+++ b/lib/test_user_copy.c
@@ -58,7 +58,9 @@ static int __init test_user_copy_init(vo
 	usermem = (char __user *)user_addr;
 	bad_usermem = (char *)user_addr;
 
-	/* Legitimate usage: none of these should fail. */
+	/*
+	 * Legitimate usage: none of these copies should fail.
+	 */
 	ret |= test(copy_from_user(kmem, usermem, PAGE_SIZE),
 		    "legitimate copy_from_user failed");
 	ret |= test(copy_to_user(usermem, kmem, PAGE_SIZE),
@@ -68,19 +70,33 @@ static int __init test_user_copy_init(vo
 	ret |= test(put_user(value, (unsigned long __user *)usermem),
 		    "legitimate put_user failed");
 
-	/* Invalid usage: none of these should succeed. */
+	/*
+	 * Invalid usage: none of these copies should succeed.
+	 */
+
+	/* Reject kernel-to-kernel copies through copy_from_user(). */
 	ret |= test(!copy_from_user(kmem, (char __user *)(kmem + PAGE_SIZE),
 				    PAGE_SIZE),
 		    "illegal all-kernel copy_from_user passed");
+
+#if 0
+	/*
+	 * When running with SMAP/PAN/etc, this will Oops the kernel
+	 * due to the zeroing of userspace memory on failure. This needs
+	 * to be tested in LKDTM instead, since this test module does not
+	 * expect to explode.
+	 */
 	ret |= test(!copy_from_user(bad_usermem, (char __user *)kmem,
 				    PAGE_SIZE),
 		    "illegal reversed copy_from_user passed");
+#endif
 	ret |= test(!copy_to_user((char __user *)kmem, kmem + PAGE_SIZE,
 				  PAGE_SIZE),
 		    "illegal all-kernel copy_to_user passed");
 	ret |= test(!copy_to_user((char __user *)kmem, bad_usermem,
 				  PAGE_SIZE),
 		    "illegal reversed copy_to_user passed");
+
 	ret |= test(!get_user(value, (unsigned long __user *)kmem),
 		    "illegal get_user passed");
 	ret |= test(!put_user(value, (unsigned long __user *)kmem),

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 89/90] arm64: armv8_deprecated: ensure extension of addr
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (83 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 88/90] usercopy: Adjust tests to deal with SMAP/PAN Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 15:26 ` [PATCH 4.4 90/90] arm64: ensure extension of smp_store_release value Greg Kroah-Hartman
                   ` (2 subsequent siblings)
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Will Deacon, Mark Rutland, Catalin Marinas

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>

commit 55de49f9aa17b0b2b144dd2af587177b9aadf429 upstream.

Our compat swp emulation holds the compat user address in an unsigned
int, which it passes to __user_swpX_asm(). When a 32-bit value is passed
in a register, the upper 32 bits of the register are unknown, and we
must extend the value to 64 bits before we can use it as a base address.

This patch casts the address to unsigned long to ensure it has been
suitably extended, avoiding the potential issue, and silencing a related
warning from clang.

Fixes: bd35a4adc413 ("arm64: Port SWP/SWPB emulation support from arm")
Cc: <stable@vger.kernel.org> # 3.19.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arm64/kernel/armv8_deprecated.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/arm64/kernel/armv8_deprecated.c
+++ b/arch/arm64/kernel/armv8_deprecated.c
@@ -305,7 +305,8 @@ static void register_insn_emulation_sysc
 	ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,	\
 		CONFIG_ARM64_PAN)				\
 	: "=&r" (res), "+r" (data), "=&r" (temp)		\
-	: "r" (addr), "i" (-EAGAIN), "i" (-EFAULT)		\
+	: "r" ((unsigned long)addr), "i" (-EAGAIN),		\
+	  "i" (-EFAULT)						\
 	: "memory")
 
 #define __user_swp_asm(data, addr, res, temp) \

^ permalink raw reply	[flat|nested] 95+ messages in thread

* [PATCH 4.4 90/90] arm64: ensure extension of smp_store_release value
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (84 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 89/90] arm64: armv8_deprecated: ensure extension of addr Greg Kroah-Hartman
@ 2017-06-12 15:26 ` Greg Kroah-Hartman
  2017-06-12 21:53 ` [PATCH 4.4 00/90] 4.4.72-stable review Guenter Roeck
  2017-06-13  0:45 ` Shuah Khan
  87 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-06-12 15:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Will Deacon, Mark Rutland,
	Matthias Kaehlcke, Catalin Marinas

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mark Rutland <mark.rutland@arm.com>

commit 994870bead4ab19087a79492400a5478e2906196 upstream.

When an inline assembly operand's type is narrower than the register it
is allocated to, the least significant bits of the register (up to the
operand type's width) are valid, and any other bits are permitted to
contain any arbitrary value. This aligns with the AAPCS64 parameter
passing rules.

Our __smp_store_release() implementation does not account for this, and
implicitly assumes that operands have been zero-extended to the width of
the type being stored to. Thus, we may store unknown values to memory
when the value type is narrower than the pointer type (e.g. when storing
a char to a long).

This patch fixes the issue by casting the value operand to the same
width as the pointer operand in all cases, which ensures that the value
is zero-extended as we expect. We use the same union trickery as
__smp_load_acquire and {READ,WRITE}_ONCE() to avoid GCC complaining that
pointers are potentially cast to narrower width integers in unreachable
paths.

A whitespace issue at the top of __smp_store_release() is also
corrected.

No changes are necessary for __smp_load_acquire(). Load instructions
implicitly clear any upper bits of the register, and the compiler will
only consider the least significant bits of the register as valid
regardless.

Fixes: 47933ad41a86 ("arch: Introduce smp_load_acquire(), smp_store_release()")
Fixes: 878a84d5a8a1 ("arm64: add missing data types in smp_load_acquire/smp_store_release")
Cc: <stable@vger.kernel.org> # 3.14.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arm64/include/asm/barrier.h |   18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -41,23 +41,33 @@
 
 #define smp_store_release(p, v)						\
 do {									\
+	union { typeof(*p) __val; char __c[1]; } __u =			\
+		{ .__val = (__force typeof(*p)) (v) }; 			\
 	compiletime_assert_atomic_type(*p);				\
 	switch (sizeof(*p)) {						\
 	case 1:								\
 		asm volatile ("stlrb %w1, %0"				\
-				: "=Q" (*p) : "r" (v) : "memory");	\
+				: "=Q" (*p)				\
+				: "r" (*(__u8 *)__u.__c)		\
+				: "memory");				\
 		break;							\
 	case 2:								\
 		asm volatile ("stlrh %w1, %0"				\
-				: "=Q" (*p) : "r" (v) : "memory");	\
+				: "=Q" (*p)				\
+				: "r" (*(__u16 *)__u.__c)		\
+				: "memory");				\
 		break;							\
 	case 4:								\
 		asm volatile ("stlr %w1, %0"				\
-				: "=Q" (*p) : "r" (v) : "memory");	\
+				: "=Q" (*p)				\
+				: "r" (*(__u32 *)__u.__c)		\
+				: "memory");				\
 		break;							\
 	case 8:								\
 		asm volatile ("stlr %1, %0"				\
-				: "=Q" (*p) : "r" (v) : "memory");	\
+				: "=Q" (*p)				\
+				: "r" (*(__u64 *)__u.__c)		\
+				: "memory");				\
 		break;							\
 	}								\
 } while (0)

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [kernel-hardening] [PATCH 4.4 47/90] stackprotector: Increase the per-task stack canarys random range from 32 bits to 64 bits on 64-bit platforms
  2017-06-12 15:25 ` [PATCH 4.4 47/90] stackprotector: Increase the per-task stack canarys random range from 32 bits to 64 bits on 64-bit platforms Greg Kroah-Hartman
@ 2017-06-12 15:41   ` Jann Horn
  2017-06-12 15:45     ` Jann Horn
  0 siblings, 1 reply; 95+ messages in thread
From: Jann Horn @ 2017-06-12 15:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: kernel list, stable, Daniel Micay, Arjan van de Ven,
	Rik van Riel, Kees Cook, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, Kernel Hardening, Ingo Molnar

AFAICS get_random_long() doesn't exist in 4.4 (except in
arch/x86/boot/compressed/aslr.c)? IIRC the same problem already
occured with another kernel version?

On Mon, Jun 12, 2017 at 5:25 PM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> 4.4-stable review patch.  If anyone has any objections, please let me know.
>
> ------------------
>
> From: Daniel Micay <danielmicay@gmail.com>
>
> commit 5ea30e4e58040cfd6434c2f33dc3ea76e2c15b05 upstream.
>
> The stack canary is an 'unsigned long' and should be fully initialized to
> random data rather than only 32 bits of random data.
>
> Signed-off-by: Daniel Micay <danielmicay@gmail.com>
> Acked-by: Arjan van de Ven <arjan@linux.intel.com>
> Acked-by: Rik van Riel <riel@redhat.com>
> Acked-by: Kees Cook <keescook@chromium.org>
> Cc: Arjan van Ven <arjan@linux.intel.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: kernel-hardening@lists.openwall.com
> Cc: stable@vger.kernel.org
> Link: http://lkml.kernel.org/r/20170504133209.3053-1-danielmicay@gmail.com
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
> ---
>  kernel/fork.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -368,7 +368,7 @@ static struct task_struct *dup_task_stru
>         set_task_stack_end_magic(tsk);
>
>  #ifdef CONFIG_CC_STACKPROTECTOR
> -       tsk->stack_canary = get_random_int();
> +       tsk->stack_canary = get_random_long();
>  #endif
>
>         /*
>
>

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [kernel-hardening] [PATCH 4.4 47/90] stackprotector: Increase the per-task stack canarys random range from 32 bits to 64 bits on 64-bit platforms
  2017-06-12 15:41   ` [kernel-hardening] " Jann Horn
@ 2017-06-12 15:45     ` Jann Horn
  0 siblings, 0 replies; 95+ messages in thread
From: Jann Horn @ 2017-06-12 15:45 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: kernel list, stable, Daniel Micay, Arjan van de Ven,
	Rik van Riel, Kees Cook, Linus Torvalds, Peter Zijlstra,
	Thomas Gleixner, Kernel Hardening, Ingo Molnar

Ah, nevermind, I just saw that this was changed in
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-4.4/drivers-char-random-add-get_random_long.patch
.

On Mon, Jun 12, 2017 at 5:41 PM, Jann Horn <jannh@google.com> wrote:
> AFAICS get_random_long() doesn't exist in 4.4 (except in
> arch/x86/boot/compressed/aslr.c)? IIRC the same problem already
> occured with another kernel version?
>
> On Mon, Jun 12, 2017 at 5:25 PM, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
>> 4.4-stable review patch.  If anyone has any objections, please let me know.
>>
>> ------------------
>>
>> From: Daniel Micay <danielmicay@gmail.com>
>>
>> commit 5ea30e4e58040cfd6434c2f33dc3ea76e2c15b05 upstream.
>>
>> The stack canary is an 'unsigned long' and should be fully initialized to
>> random data rather than only 32 bits of random data.
>>
>> Signed-off-by: Daniel Micay <danielmicay@gmail.com>
>> Acked-by: Arjan van de Ven <arjan@linux.intel.com>
>> Acked-by: Rik van Riel <riel@redhat.com>
>> Acked-by: Kees Cook <keescook@chromium.org>
>> Cc: Arjan van Ven <arjan@linux.intel.com>
>> Cc: Linus Torvalds <torvalds@linux-foundation.org>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: kernel-hardening@lists.openwall.com
>> Cc: stable@vger.kernel.org
>> Link: http://lkml.kernel.org/r/20170504133209.3053-1-danielmicay@gmail.com
>> Signed-off-by: Ingo Molnar <mingo@kernel.org>
>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>
>> ---
>>  kernel/fork.c |    2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -368,7 +368,7 @@ static struct task_struct *dup_task_stru
>>         set_task_stack_end_magic(tsk);
>>
>>  #ifdef CONFIG_CC_STACKPROTECTOR
>> -       tsk->stack_canary = get_random_int();
>> +       tsk->stack_canary = get_random_long();
>>  #endif
>>
>>         /*
>>
>>

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [PATCH 4.4 00/90] 4.4.72-stable review
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (85 preceding siblings ...)
  2017-06-12 15:26 ` [PATCH 4.4 90/90] arm64: ensure extension of smp_store_release value Greg Kroah-Hartman
@ 2017-06-12 21:53 ` Guenter Roeck
  2017-06-13  0:45 ` Shuah Khan
  87 siblings, 0 replies; 95+ messages in thread
From: Guenter Roeck @ 2017-06-12 21:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, shuahkh, patches, ben.hutchings, stable

On Mon, Jun 12, 2017 at 05:25:06PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.72 release.
> There are 90 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed Jun 14 15:25:30 UTC 2017.
> Anything received after that time might be too late.
> 

Build results:
	total: 145 pass: 145 fail: 0
Qemu test results:
	total: 115 pass: 115 fail: 0

Details are available at http://kerneltests.org/builders.

Guenter

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [PATCH 4.4 00/90] 4.4.72-stable review
  2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
                   ` (86 preceding siblings ...)
  2017-06-12 21:53 ` [PATCH 4.4 00/90] 4.4.72-stable review Guenter Roeck
@ 2017-06-13  0:45 ` Shuah Khan
  87 siblings, 0 replies; 95+ messages in thread
From: Shuah Khan @ 2017-06-13  0:45 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, linux, patches, ben.hutchings, stable, Shuah Khan

On 06/12/2017 09:25 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.72 release.
> There are 90 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed Jun 14 15:25:30 UTC 2017.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.72-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [PATCH 4.4 66/90] powerpc/numa: Fix percpu allocations to be NUMA aware
  2017-06-12 15:26 ` [PATCH 4.4 66/90] powerpc/numa: Fix percpu allocations to be NUMA aware Greg Kroah-Hartman
@ 2017-07-28 13:53   ` Michal Hocko
  2017-07-28 22:41     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 95+ messages in thread
From: Michal Hocko @ 2017-07-28 13:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, Michael Ellerman, Nicholas Piggin

JFYI. We have encountered a regression after applying this patch on a
large ppc machine. While the patch is the right thing to do it doesn't
work well with the current vmalloc area size on ppc and large machines
where NUMA nodes are very far from each other. Just for the reference
the boot fails on such a machine with bunch of warning preceeding it.
See http://lkml.kernel.org/r/20170724134240.GL25221@dhcp22.suse.cz

It seems the right thing to do is to enlarge the vmalloc space on ppc
but this is not the case in the upstream kernel yet AFAIK. It is also
questionable whether that is a stable material but I will decision on
you here.

We have reverted this patch from our 4.4 based kernel.

On Mon 12-06-17 17:26:12, Greg KH wrote:
> 4.4-stable review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Michael Ellerman <mpe@ellerman.id.au>
> 
> commit ba4a648f12f4cd0a8003dd229b6ca8a53348ee4b upstream.
> 
> In commit 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we
> switched to the generic implementation of cpu_to_node(), which uses a percpu
> variable to hold the NUMA node for each CPU.
> 
> Unfortunately we neglected to notice that we use cpu_to_node() in the allocation
> of our percpu areas, leading to a chicken and egg problem. In practice what
> happens is when we are setting up the percpu areas, cpu_to_node() reports that
> all CPUs are on node 0, so we allocate all percpu areas on node 0.
> 
> This is visible in the dmesg output, as all pcpu allocs being in group 0:
> 
>   pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07
>   pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15
>   pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23
>   pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31
>   pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39
>   pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47
> 
> To fix it we need an early_cpu_to_node() which can run prior to percpu being
> setup. We already have the numa_cpu_lookup_table we can use, so just plumb it
> in. With the patch dmesg output shows two groups, 0 and 1:
> 
>   pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07
>   pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15
>   pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23
>   pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31
>   pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39
>   pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47
> 
> We can also check the data_offset in the paca of various CPUs, with the fix we
> see:
> 
>   CPU 0:  data_offset = 0x0ffe8b0000
>   CPU 24: data_offset = 0x1ffe5b0000
> 
> And we can see from dmesg that CPU 24 has an allocation on node 1:
> 
>   node   0: [mem 0x0000000000000000-0x0000000fffffffff]
>   node   1: [mem 0x0000001000000000-0x0000001fffffffff]
> 
> Fixes: 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID")
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> ---
>  arch/powerpc/include/asm/topology.h |   14 ++++++++++++++
>  arch/powerpc/kernel/setup_64.c      |    4 ++--
>  2 files changed, 16 insertions(+), 2 deletions(-)
> 
> --- a/arch/powerpc/include/asm/topology.h
> +++ b/arch/powerpc/include/asm/topology.h
> @@ -44,8 +44,22 @@ extern void __init dump_numa_cpu_topolog
>  extern int sysfs_add_device_to_node(struct device *dev, int nid);
>  extern void sysfs_remove_device_from_node(struct device *dev, int nid);
>  
> +static inline int early_cpu_to_node(int cpu)
> +{
> +	int nid;
> +
> +	nid = numa_cpu_lookup_table[cpu];
> +
> +	/*
> +	 * Fall back to node 0 if nid is unset (it should be, except bugs).
> +	 * This allows callers to safely do NODE_DATA(early_cpu_to_node(cpu)).
> +	 */
> +	return (nid < 0) ? 0 : nid;
> +}
>  #else
>  
> +static inline int early_cpu_to_node(int cpu) { return 0; }
> +
>  static inline void dump_numa_cpu_topology(void) {}
>  
>  static inline int sysfs_add_device_to_node(struct device *dev, int nid)
> --- a/arch/powerpc/kernel/setup_64.c
> +++ b/arch/powerpc/kernel/setup_64.c
> @@ -751,7 +751,7 @@ void __init setup_arch(char **cmdline_p)
>  
>  static void * __init pcpu_fc_alloc(unsigned int cpu, size_t size, size_t align)
>  {
> -	return __alloc_bootmem_node(NODE_DATA(cpu_to_node(cpu)), size, align,
> +	return __alloc_bootmem_node(NODE_DATA(early_cpu_to_node(cpu)), size, align,
>  				    __pa(MAX_DMA_ADDRESS));
>  }
>  
> @@ -762,7 +762,7 @@ static void __init pcpu_fc_free(void *pt
>  
>  static int pcpu_cpu_distance(unsigned int from, unsigned int to)
>  {
> -	if (cpu_to_node(from) == cpu_to_node(to))
> +	if (early_cpu_to_node(from) == early_cpu_to_node(to))
>  		return LOCAL_DISTANCE;
>  	else
>  		return REMOTE_DISTANCE;
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [PATCH 4.4 66/90] powerpc/numa: Fix percpu allocations to be NUMA aware
  2017-07-28 13:53   ` Michal Hocko
@ 2017-07-28 22:41     ` Greg Kroah-Hartman
  2017-07-31  6:41       ` Michal Hocko
  0 siblings, 1 reply; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-07-28 22:41 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-kernel, stable, Michael Ellerman, Nicholas Piggin

On Fri, Jul 28, 2017 at 03:53:35PM +0200, Michal Hocko wrote:
> JFYI. We have encountered a regression after applying this patch on a
> large ppc machine. While the patch is the right thing to do it doesn't
> work well with the current vmalloc area size on ppc and large machines
> where NUMA nodes are very far from each other. Just for the reference
> the boot fails on such a machine with bunch of warning preceeding it.
> See http://lkml.kernel.org/r/20170724134240.GL25221@dhcp22.suse.cz
> 
> It seems the right thing to do is to enlarge the vmalloc space on ppc
> but this is not the case in the upstream kernel yet AFAIK. It is also
> questionable whether that is a stable material but I will decision on
> you here.
> 
> We have reverted this patch from our 4.4 based kernel.

But all is fine on newer kernels?  That is odd.

I'll be glad to drop it, but should it be dropped from all stable trees?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [PATCH 4.4 66/90] powerpc/numa: Fix percpu allocations to be NUMA aware
  2017-07-28 22:41     ` Greg Kroah-Hartman
@ 2017-07-31  6:41       ` Michal Hocko
  2017-08-03 19:29         ` Greg Kroah-Hartman
  0 siblings, 1 reply; 95+ messages in thread
From: Michal Hocko @ 2017-07-31  6:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, Michael Ellerman, Nicholas Piggin

On Fri 28-07-17 15:41:47, Greg KH wrote:
> On Fri, Jul 28, 2017 at 03:53:35PM +0200, Michal Hocko wrote:
> > JFYI. We have encountered a regression after applying this patch on a
> > large ppc machine. While the patch is the right thing to do it doesn't
> > work well with the current vmalloc area size on ppc and large machines
> > where NUMA nodes are very far from each other. Just for the reference
> > the boot fails on such a machine with bunch of warning preceeding it.
> > See http://lkml.kernel.org/r/20170724134240.GL25221@dhcp22.suse.cz
> > 
> > It seems the right thing to do is to enlarge the vmalloc space on ppc
> > but this is not the case in the upstream kernel yet AFAIK. It is also
> > questionable whether that is a stable material but I will decision on
> > you here.
> > 
> > We have reverted this patch from our 4.4 based kernel.
> 
> But all is fine on newer kernels?  That is odd.

Newer kernels do not have enlarged vmalloc space yet AFAIK so they won't
work properly eiter. This bug is quite rare though because you need a
specific HW configuration to trigger the issue - namely NUMA nodes have
to be far away from each other in the physical memory space.

> I'll be glad to drop it, but should it be dropped from all stable trees?

Yes from all stable backports.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [PATCH 4.4 66/90] powerpc/numa: Fix percpu allocations to be NUMA aware
  2017-07-31  6:41       ` Michal Hocko
@ 2017-08-03 19:29         ` Greg Kroah-Hartman
  0 siblings, 0 replies; 95+ messages in thread
From: Greg Kroah-Hartman @ 2017-08-03 19:29 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-kernel, stable, Michael Ellerman, Nicholas Piggin

On Mon, Jul 31, 2017 at 08:41:16AM +0200, Michal Hocko wrote:
> On Fri 28-07-17 15:41:47, Greg KH wrote:
> > On Fri, Jul 28, 2017 at 03:53:35PM +0200, Michal Hocko wrote:
> > > JFYI. We have encountered a regression after applying this patch on a
> > > large ppc machine. While the patch is the right thing to do it doesn't
> > > work well with the current vmalloc area size on ppc and large machines
> > > where NUMA nodes are very far from each other. Just for the reference
> > > the boot fails on such a machine with bunch of warning preceeding it.
> > > See http://lkml.kernel.org/r/20170724134240.GL25221@dhcp22.suse.cz
> > > 
> > > It seems the right thing to do is to enlarge the vmalloc space on ppc
> > > but this is not the case in the upstream kernel yet AFAIK. It is also
> > > questionable whether that is a stable material but I will decision on
> > > you here.
> > > 
> > > We have reverted this patch from our 4.4 based kernel.
> > 
> > But all is fine on newer kernels?  That is odd.
> 
> Newer kernels do not have enlarged vmalloc space yet AFAIK so they won't
> work properly eiter. This bug is quite rare though because you need a
> specific HW configuration to trigger the issue - namely NUMA nodes have
> to be far away from each other in the physical memory space.
> 
> > I'll be glad to drop it, but should it be dropped from all stable trees?
> 
> Yes from all stable backports.

Ok, now all reverted, thanks.

greg k-h

^ permalink raw reply	[flat|nested] 95+ messages in thread

end of thread, other threads:[~2017-08-03 19:30 UTC | newest]

Thread overview: 95+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-12 15:25 [PATCH 4.4 00/90] 4.4.72-stable review Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 01/90] bnx2x: Fix Multi-Cos Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 02/90] ipv6: xfrm: Handle errors reported by xfrm6_find_1stfragopt() Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 03/90] cxgb4: avoid enabling napi twice to the same queue Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 04/90] tcp: disallow cwnd undo when switching congestion control Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 05/90] vxlan: fix use-after-free on deletion Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 06/90] ipv6: Fix leak in ipv6_gso_segment() Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 07/90] net: ping: do not abuse udp_poll() Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 08/90] net: ethoc: enable NAPI before poll may be scheduled Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 09/90] net: bridge: start hello timer only if device is up Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 10/90] sparc64: mm: fix copy_tsb to correctly copy huge page TSBs Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 11/90] sparc: Machine description indices can vary Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 12/90] sparc64: reset mm cpumask after wrap Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 13/90] sparc64: combine activate_mm and switch_mm Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 14/90] sparc64: redefine first version Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 15/90] sparc64: add per-cpu mm of secondary contexts Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 16/90] sparc64: new context wrap Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 17/90] sparc64: delete old wrap code Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 18/90] arch/sparc: support NR_CPUS = 4096 Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 19/90] serial: ifx6x60: fix use-after-free on module unload Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 20/90] ptrace: Properly initialize ptracer_cred on fork Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 21/90] KEYS: fix dereferencing NULL payload with nonzero length Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 22/90] KEYS: fix freeing uninitialized memory in key_update() Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 23/90] crypto: gcm - wait for crypto op not signal safe Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 25/90] nfsd4: fix null dereference on replay Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 26/90] nfsd: Fix up the "supattr_exclcreat" attributes Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 29/90] arm: KVM: Allow unaligned accesses at HYP Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 31/90] dmaengine: usb-dmac: Fix DMAOR AE bit definition Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 32/90] dmaengine: ep93xx: Always start from BASE0 Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 33/90] xen/privcmd: Support correctly 64KB page granularity when mapping memory Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 34/90] xen-netfront: do not cast grant table reference to signed short Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 35/90] xen-netfront: cast grant table reference first to type int Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 36/90] ext4: fix SEEK_HOLE Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 37/90] ext4: keep existing extra fields when inode expands Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 38/90] ext4: fix fdatasync(2) after extent manipulation operations Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 39/90] usb: gadget: f_mass_storage: Serialize wake and sleep execution Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 40/90] usb: chipidea: udc: fix NULL pointer dereference if udc_start failed Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 41/90] usb: chipidea: debug: check before accessing ci_role Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 42/90] staging/lustre/lov: remove set_fs() call from lov_getstripe() Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 43/90] iio: light: ltr501 Fix interchanged als/ps register field Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 44/90] iio: proximity: as3935: fix AS3935_INT mask Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 45/90] drivers: char: random: add get_random_long() Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 46/90] random: properly align get_random_int_hash Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 47/90] stackprotector: Increase the per-task stack canarys random range from 32 bits to 64 bits on 64-bit platforms Greg Kroah-Hartman
2017-06-12 15:41   ` [kernel-hardening] " Jann Horn
2017-06-12 15:45     ` Jann Horn
2017-06-12 15:25 ` [PATCH 4.4 48/90] cpufreq: cpufreq_register_driver() should return -ENODEV if init fails Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 49/90] target: Re-add check to reject control WRITEs with overflow data Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 50/90] drm/msm: Expose our reservation object when exporting a dmabuf Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 51/90] Input: elantech - add Fujitsu Lifebook E546/E557 to force crc_enabled Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 52/90] cpuset: consider dying css as offline Greg Kroah-Hartman
2017-06-12 15:25 ` [PATCH 4.4 53/90] fs: add i_blocksize() Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 54/90] ufs: restore proper tail allocation Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 55/90] fix ufs_isblockset() Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 56/90] ufs: restore maintaining ->i_blocks Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 57/90] ufs: set correct ->s_maxsize Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 58/90] ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments() Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 59/90] ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 60/90] cxl: Fix error path on bad ioctl Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 61/90] btrfs: use correct types for page indices in btrfs_page_exists_in_range Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 62/90] btrfs: fix memory leak in update_space_info failure path Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 63/90] KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 64/90] scsi: qla2xxx: dont disable a not previously enabled PCI device Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 65/90] powerpc/eeh: Avoid use after free in eeh_handle_special_event() Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 66/90] powerpc/numa: Fix percpu allocations to be NUMA aware Greg Kroah-Hartman
2017-07-28 13:53   ` Michal Hocko
2017-07-28 22:41     ` Greg Kroah-Hartman
2017-07-31  6:41       ` Michal Hocko
2017-08-03 19:29         ` Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 67/90] powerpc/hotplug-mem: Fix missing endian conversion of aa_index Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 68/90] perf/core: Drop kernel samples even though :u is specified Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 69/90] drm/vmwgfx: Handle vmalloc() failure in vmw_local_fifo_reserve() Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 70/90] drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl() Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 71/90] drm/vmwgfx: Make sure backup_handle is always valid Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 72/90] drm/nouveau/tmr: fully separate alarm execution/pending lists Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 73/90] ALSA: timer: Fix race between read and ioctl Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 74/90] ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 75/90] ASoC: Fix use-after-free at card unregistration Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 76/90] drivers: char: mem: Fix wraparound check to allow mappings up to the end Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 77/90] tty: Drop krefs for interrupted tty lock Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 78/90] serial: sh-sci: Fix panic when serial console and DMA are enabled Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 79/90] net: better skb->sender_cpu and skb->napi_id cohabitation Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 80/90] mm: consider memblock reservations for deferred memory initialization sizing Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 81/90] NFS: Ensure we revalidate attributes before using execute_ok() Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 82/90] NFSv4: Dont perform cached access checks before weve OPENed the file Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 83/90] Make __xfs_xattr_put_listen preperly report errors Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 84/90] arm64: hw_breakpoint: fix watchpoint matching for tagged pointers Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 85/90] arm64: entry: improve data abort handling of " Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 86/90] RDMA/qib,hfi1: Fix MR reference count leak on write with immediate Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 87/90] tracing: Use strlcpy() instead of strcpy() in __trace_find_cmdline() Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 88/90] usercopy: Adjust tests to deal with SMAP/PAN Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 89/90] arm64: armv8_deprecated: ensure extension of addr Greg Kroah-Hartman
2017-06-12 15:26 ` [PATCH 4.4 90/90] arm64: ensure extension of smp_store_release value Greg Kroah-Hartman
2017-06-12 21:53 ` [PATCH 4.4 00/90] 4.4.72-stable review Guenter Roeck
2017-06-13  0:45 ` Shuah Khan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).