* [PATCH 3.12 00/78] 3.12.36-stable review
@ 2015-01-09 10:30 Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 01/78] ipv6: gre: fix wrong skb->protocol in WCCP Jiri Slaby
                   ` (79 more replies)
  0 siblings, 80 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:30 UTC (permalink / raw)
  To: stable; +Cc: linux, satoru.takeuchi, shuah.kh, linux-kernel, Jiri Slaby

This is the start of the stable review cycle for the 3.12.36 release.
There are 78 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Tue Jan 13 11:29:45 CET 2015.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	http://kernel.org/pub/linux/kernel/people/jirislaby/stable-review/patch-3.12.36-rc1.xz
and the diffstat can be found below.

thanks,
js

===============


Alexander Kochetkov (2):
  i2c: omap: fix NACK and Arbitration Lost irq handling
  i2c: omap: fix i207 errata handling

Andreas Müller (1):
  mac80211: fix multicast LED blinking and counter

Andrew Morton (1):
  mm/vmpressure.c: fix race in vmpressure_work_fn()

Andy Lutomirski (4):
  x86/tls: Validate TLS entries to protect espfix
  x86/tls: Disallow unusual TLS segments
  x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit
  x86/tls: Don't validate lm in set_thread_area() after all

Anton Blanchard (1):
  powerpc: 32 bit getcpu VDSO function uses 64 bit instructions

Baruch Siach (1):
  mmc: block: add newline to sysfs display of force_ro

Dan Carpenter (1):
  dm space map metadata: fix sm_bootstrap_get_nr_blocks()

Daniel Borkmann (1):
  net: sctp: use MAX_HEADER for headroom reserve in output path

Daniel Forrest (1):
  mm: fix anon_vma_clone() error treatment

Daniel Vetter (2):
  drm/i915: More cautious with pch fifo underruns
  drm/i915: Unlock panel even when LVDS is disabled

Darrick J. Wong (1):
  dm bufio: fix memleak when using a dm_buffer's inline bio

Devin Ryles (1):
  AHCI: Add DeviceIDs for Sunrise Point-LP SATA controller

Dmitry Eremin-Solenikov (1):
  mfd: tc6393xb: Fail ohci suspend if full state restore is required

Dmitry Torokhov (1):
  sata_fsl: fix error handling of irq_of_parse_and_map

Eric Dumazet (1):
  net: mvneta: fix race condition in mvneta_tx()

Eric W. Biederman (13):
  mnt: Implicitly add MNT_NODEV on remount when it was implicitly added
    by mount
  mnt: Update unprivileged remount test
  umount: Disallow unprivileged mount force
  groups: Consolidate the setgroups permission checks
  userns: Document what the invariant required for safe unprivileged
    mappings.
  userns: Don't allow setgroups until a gid mapping has been setablished
  userns: Don't allow unprivileged creation of gid mappings
  userns: Check euid no fsuid when establishing an unprivileged uid
    mapping
  userns: Only allow the creator of the userns unprivileged mappings
  userns: Rename id_map_mutex to userns_state_mutex
  userns: Add a knob to disable setgroups on a per user namespace basis
  userns: Allow setting gid_maps without privilege when setgroups is
    disabled
  userns: Unbreak the unprivileged remount tests

Filipe Manana (1):
  Btrfs: fix fs corruption on transaction abort if device supports
    discard

Francesco Ruggeri (1):
  tty: Fix pty master poll() after slave closes v2

Grygorii Strashko (1):
  i2c: davinci: generate STP always when NACK is received

Hannes Reinecke (1):
  scsi: correct return values for .eh_abort_handler implementations

Hugh Dickins (2):
  mm: fix swapoff hang after page migration and fork
  mm: let mm_find_pmd fix buggy race with THP fault

Jack Morgenstein (1):
  net/mlx4_core: Limit count field to 24 bits in qp_alloc_res

Jan Kara (4):
  isofs: Fix infinite looping over CE entries
  isofs: Fix unchecked printing of ER records
  ncpfs: return proper error from NCP_IOC_SETROOT ioctl
  udf: Verify symlink size before loading it

Johan Hovold (1):
  mfd: viperboard: Fix platform-device id collision

Johannes Berg (1):
  mac80211: free management frame keys when removing station

Josef Bacik (1):
  Btrfs: do not move em to modified list when unpinning

Kan Liang (1):
  perf/x86/intel: Protect LBR and extra_regs against KVM lying

Linus Walleij (1):
  mfd: stmpe: Fix STMPE24xx GPMR LSB

Luis Henriques (1):
  thermal: Fix error path in thermal_init()

Marcelo Leitner (1):
  Fix race condition between vxlan_sock_add and vxlan_sock_release

Martin Schwidefsky (2):
  s390/3215: fix hanging console issue
  s390/3215: fix tty output containing tabs

Mathias Nyman (1):
  USB: xhci: Reset a halted endpoint immediately when we encounter a
    stall.

Michael Halcrow (1):
  eCryptfs: Remove buggy and unnecessary write in file name decode
    routine

Nicolas Dichtel (1):
  rtnetlink: release net refcnt on error in do_setlink()

Oleg Nesterov (1):
  exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is
    exiting

Peng Tao (1):
  nfs41: fix nfs4_proc_layoutget error handling

Petr Mladek (1):
  drm/radeon: kernel panic in drm_calc_vbltimestamp_from_scanoutpos with
    3.18.0-rc6

Rabin Vincent (1):
  crypto: af_alg - fix backlog handling

Richard Guy Briggs (2):
  audit: change decimal constant to macro for invalid uid
  audit: restore AUDIT_LOGINUID unset ABI

Ronald Wahl (1):
  usb: gadget: at91_udc: move prepare clk into process context

Sakari Ailus (1):
  media: smiapp: Only some selection targets are settable

Seth Forshee (1):
  xen-netfront: Remove BUGs on paged skb data which crosses a page
    boundary

Sumit.Saxena@avagotech.com (1):
  megaraid_sas: corrected return of wait_event from abort frame path

Takashi Iwai (4):
  ALSA: hda - Add EAPD fixup for ASUS Z99He laptop
  ALSA: hda - Fix built-in mic at resume on Lenovo Ideapad S210
  ALSA: usb-audio: Don't resubmit pending URBs at MIDI error recovery
  KEYS: Fix stale key registration at error path

Tejun Heo (1):
  ahci: disable MSI on SAMSUNG 0xa800 SSD

Thadeu Lima de Souza Cascardo (1):
  tg3: fix ring init when there are more TX than RX channels

Todd Fujinaka (1):
  igb: bring link up when PHY is powered up

Tyler Hicks (1):
  eCryptfs: Force RO mount when encrypted view is enabled

Weijie Yang (1):
  mm: frontswap: invalidate expired data on a dup-store failure

Yan, Zheng (1):
  ceph: fix null pointer dereference in discard_cap_releases()

Yuri Chislov (1):
  ipv6: gre: fix wrong skb->protocol in WCCP

willy tarreau (1):
  net: mvneta: fix Tx interrupt delay

 arch/powerpc/kernel/vdso32/getcpu.S                |   4 +-
 arch/s390/kernel/compat_linux.c                    |   2 +-
 arch/x86/include/uapi/asm/ldt.h                    |   7 +
 arch/x86/kernel/cpu/perf_event.c                   |   3 +
 arch/x86/kernel/cpu/perf_event.h                   |  12 +-
 arch/x86/kernel/cpu/perf_event_intel.c             |  66 ++++++-
 arch/x86/kernel/kvm.c                              |   9 +-
 arch/x86/kernel/kvmclock.c                         |   1 -
 arch/x86/kernel/tls.c                              |  39 ++++
 crypto/af_alg.c                                    |   3 +
 drivers/ata/ahci.c                                 |   4 +
 drivers/ata/sata_fsl.c                             |   2 +-
 drivers/gpu/drm/i915/intel_display.c               |   2 -
 drivers/gpu/drm/i915/intel_lvds.c                  |  22 +--
 drivers/gpu/drm/radeon/radeon_kms.c                |   2 +
 drivers/i2c/busses/i2c-davinci.c                   |   8 +-
 drivers/i2c/busses/i2c-omap.c                      |  10 +-
 drivers/md/dm-bufio.c                              |  20 +-
 drivers/md/persistent-data/dm-space-map-metadata.c |   4 +-
 drivers/media/i2c/smiapp/smiapp-core.c             |   2 +-
 drivers/mfd/stmpe.h                                |   2 +-
 drivers/mfd/tc6393xb.c                             |  13 +-
 drivers/mfd/viperboard.c                           |   5 +-
 drivers/mmc/card/block.c                           |   2 +-
 drivers/net/ethernet/broadcom/tg3.c                |   3 +-
 drivers/net/ethernet/intel/igb/igb_main.c          |   2 +
 drivers/net/ethernet/marvell/mvneta.c              |   5 +-
 .../net/ethernet/mellanox/mlx4/resource_tracker.c  |   2 +-
 drivers/net/vxlan.c                                |  10 +-
 drivers/net/xen-netfront.c                         |   5 -
 drivers/s390/char/con3215.c                        |  52 ++++--
 drivers/scsi/NCR5380.c                             |  12 +-
 drivers/scsi/aha1740.c                             |   2 +-
 drivers/scsi/atari_NCR5380.c                       |   2 +-
 drivers/scsi/esas2r/esas2r_main.c                  |   2 +-
 drivers/scsi/megaraid.c                            |   8 +-
 drivers/scsi/megaraid/megaraid_sas_base.c          |   2 +-
 drivers/scsi/sun3_NCR5380.c                        |  10 +-
 drivers/thermal/thermal_core.c                     |   4 +-
 drivers/tty/n_tty.c                                |   9 +-
 drivers/usb/gadget/at91_udc.c                      |  44 +++--
 drivers/usb/host/xhci-ring.c                       |  40 +---
 drivers/usb/host/xhci.c                            |  60 ++----
 fs/btrfs/disk-io.c                                 |   6 -
 fs/btrfs/extent-tree.c                             |  10 +-
 fs/btrfs/extent_map.c                              |   2 -
 fs/ceph/mds_client.c                               |  21 ++-
 fs/ecryptfs/crypto.c                               |   1 -
 fs/ecryptfs/file.c                                 |  12 --
 fs/ecryptfs/main.c                                 |  16 +-
 fs/isofs/rock.c                                    |   9 +
 fs/namespace.c                                     |  11 +-
 fs/ncpfs/ioctl.c                                   |   1 -
 fs/nfs/nfs4proc.c                                  |   6 +-
 fs/proc/base.c                                     |  53 ++++++
 fs/udf/symlink.c                                   |  17 +-
 include/linux/audit.h                              |   4 +
 include/linux/cred.h                               |   1 +
 include/linux/user_namespace.h                     |  12 ++
 include/uapi/linux/audit.h                         |   2 +
 kernel/auditfilter.c                               |  12 +-
 kernel/groups.c                                    |  11 +-
 kernel/pid.c                                       |   2 +
 kernel/uid16.c                                     |   2 +-
 kernel/user.c                                      |   1 +
 kernel/user_namespace.c                            | 125 +++++++++++--
 mm/frontswap.c                                     |   4 +-
 mm/huge_memory.c                                   |  18 +-
 mm/ksm.c                                           |   1 -
 mm/memory.c                                        |  26 +--
 mm/migrate.c                                       |   2 -
 mm/mmap.c                                          |  10 +-
 mm/rmap.c                                          |  18 +-
 mm/vmpressure.c                                    |   8 +-
 net/core/rtnetlink.c                               |   1 +
 net/ipv6/ip6_gre.c                                 |   4 +-
 net/mac80211/key.c                                 |   2 +-
 net/mac80211/rx.c                                  |  11 +-
 net/sctp/output.c                                  |   4 +-
 security/keys/encrypted-keys/encrypted.c           |   5 +-
 sound/pci/hda/patch_analog.c                       |   1 +
 sound/pci/hda/patch_realtek.c                      |   1 +
 sound/usb/midi.c                                   |   2 +
 .../selftests/mount/unprivileged-remount-test.c    | 204 +++++++++++++++++----
 84 files changed, 845 insertions(+), 332 deletions(-)

-- 
2.2.1



* [PATCH 3.12 01/78] ipv6: gre: fix wrong skb->protocol in WCCP
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 02/78] Fix race condition between vxlan_sock_add and vxlan_sock_release Jiri Slaby
                   ` (78 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Yuri Chislov, Dmitry Kozlov, Daniel Borkmann,
	David S. Miller, Jiri Slaby

From: Yuri Chislov <yuri.chislov@gmail.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

[ Upstream commit be6572fdb1bfbe23b2624d477de50af50b02f5d6 ]

When using GRE redirection in WCCP, it sets the wrong skb->protocol,
that is, ETH_P_IP instead of ETH_P_IPV6 for the encapsulated traffic.

Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Cc: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: Yuri Chislov <yuri.chislov@gmail.com>
Tested-by: Yuri Chislov <yuri.chislov@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 net/ipv6/ip6_gre.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 88774ccb3dda..7d640f276e87 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -511,11 +511,11 @@ static int ip6gre_rcv(struct sk_buff *skb)
 
 		skb->protocol = gre_proto;
 		/* WCCP version 1 and 2 protocol decoding.
-		 * - Change protocol to IP
+		 * - Change protocol to IPv6
 		 * - When dealing with WCCPv2, Skip extra 4 bytes in GRE header
 		 */
 		if (flags == 0 && gre_proto == htons(ETH_P_WCCP)) {
-			skb->protocol = htons(ETH_P_IP);
+			skb->protocol = htons(ETH_P_IPV6);
 			if ((*(h + offset) & 0xF0) != 0x40)
 				offset += 4;
 		}
-- 
2.2.1



* [PATCH 3.12 02/78] Fix race condition between vxlan_sock_add and vxlan_sock_release
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 01/78] ipv6: gre: fix wrong skb->protocol in WCCP Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 03/78] tg3: fix ring init when there are more TX than RX channels Jiri Slaby
                   ` (77 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Marcelo Leitner, David S. Miller, Jiri Slaby

From: Marcelo Leitner <mleitner@redhat.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

[ Upstream commit 00c83b01d58068dfeb2e1351cca6fccf2a83fa8f ]

Currently, when trying to reuse a socket, vxlan_sock_add will grab
vn->sock_lock, locate a reusable socket, inc refcount and release
vn->sock_lock.

But vxlan_sock_release() will first decrement refcount, and then grab
that lock. refcnt operations are atomic, but because the deferred works
each hold a vs->refcnt, the following interleaving can happen, leading
to a use after free (especially after vxlan_igmp_leave):

  CPU 1                            CPU 2

deferred work                    vxlan_sock_add
  ...                              ...
                                   spin_lock(&vn->sock_lock)
                                   vs = vxlan_find_sock();
  vxlan_sock_release
    dec vs->refcnt, reaches 0
    spin_lock(&vn->sock_lock)
                                   vxlan_sock_hold(vs), refcnt=1
                                   spin_unlock(&vn->sock_lock)
    hlist_del_rcu(&vs->hlist);
    vxlan_notify_del_rx_port(vs)
    spin_unlock(&vn->sock_lock)

So when we look for a reusable socket, we check if it wasn't freed
already before reusing it.

Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Fixes: 7c47cedf43a8b3 ("vxlan: move IGMP join/leave to work queue")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/net/vxlan.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 5407c11a9f14..c8e333306c4c 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2002,9 +2002,8 @@ static int vxlan_init(struct net_device *dev)
 	spin_lock(&vn->sock_lock);
 	vs = vxlan_find_sock(dev_net(dev), ipv6 ? AF_INET6 : AF_INET,
 			     vxlan->dst_port);
-	if (vs) {
+	if (vs && atomic_add_unless(&vs->refcnt, 1, 0)) {
 		/* If we have a socket with same port already, reuse it */
-		atomic_inc(&vs->refcnt);
 		vxlan_vs_add_dev(vs, vxlan);
 	} else {
 		/* otherwise make new socket outside of RTNL */
@@ -2447,12 +2446,9 @@ struct vxlan_sock *vxlan_sock_add(struct net *net, __be16 port,
 
 	spin_lock(&vn->sock_lock);
 	vs = vxlan_find_sock(net, ipv6 ? AF_INET6 : AF_INET, port);
-	if (vs) {
-		if (vs->rcv == rcv)
-			atomic_inc(&vs->refcnt);
-		else
+	if (vs && ((vs->rcv != rcv) ||
+		   !atomic_add_unless(&vs->refcnt, 1, 0)))
 			vs = ERR_PTR(-EBUSY);
-	}
 	spin_unlock(&vn->sock_lock);
 
 	if (!vs)
-- 
2.2.1
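
The "take a reference only if the object is still live" pattern used by
patch 02/78 can be sketched in userspace with C11 atomics. This is only an
illustration: add_unless() and sock_hold_if_live() are hypothetical names
that mimic the semantics of the kernel's atomic_add_unless(); they are not
the kernel implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Add `a` to *v unless *v currently equals `u`.
 * Returns true if the addition happened.
 * Hypothetical userspace analogue of atomic_add_unless().
 */
static bool add_unless(atomic_int *v, int a, int u)
{
	int cur = atomic_load(v);

	while (cur != u) {
		/* On failure, cur is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &cur, cur + a))
			return true;
	}
	return false;
}

/* Take a reference only if the refcount has not already dropped to 0. */
static bool sock_hold_if_live(atomic_int *refcnt)
{
	return add_unless(refcnt, 1, 0);
}
```

A lookup that finds a socket whose refcount already reached zero then
treats it as not found, instead of resurrecting an object that is about
to be freed.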



* [PATCH 3.12 03/78] tg3: fix ring init when there are more TX than RX channels
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 01/78] ipv6: gre: fix wrong skb->protocol in WCCP Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 02/78] Fix race condition between vxlan_sock_add and vxlan_sock_release Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 04/78] net/mlx4_core: Limit count field to 24 bits in qp_alloc_res Jiri Slaby
                   ` (76 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Thadeu Lima de Souza Cascardo, David S. Miller, Jiri Slaby

From: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

[ Upstream commit a620a6bc1c94c22d6c312892be1e0ae171523125 ]

If TX channels are set to 4 and RX channels are set to less than 4,
using ethtool -L, the driver will try to initialize more RX channels
than it has allocated, causing an oops.

This fix only initializes the RX ring if it has been allocated.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/net/ethernet/broadcom/tg3.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 18f0d772e544..8d45dce7cfdb 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -8523,7 +8523,8 @@ static int tg3_init_rings(struct tg3 *tp)
 		if (tnapi->rx_rcb)
 			memset(tnapi->rx_rcb, 0, TG3_RX_RCB_RING_BYTES(tp));
 
-		if (tg3_rx_prodring_alloc(tp, &tnapi->prodring)) {
+		if (tnapi->prodring.rx_std &&
+		    tg3_rx_prodring_alloc(tp, &tnapi->prodring)) {
 			tg3_free_rings(tp);
 			return -ENOMEM;
 		}
-- 
2.2.1



* [PATCH 3.12 04/78] net/mlx4_core: Limit count field to 24 bits in qp_alloc_res
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (2 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 03/78] tg3: fix ring init when there are more TX than RX channels Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 05/78] rtnetlink: release net refcnt on error in do_setlink() Jiri Slaby
                   ` (75 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Jack Morgenstein, Or Gerlitz, David S. Miller, Jiri Slaby

From: Jack Morgenstein <jackm@dev.mellanox.co.il>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

[ Upstream commit 2d5c57d7fbfaa642fb7f0673df24f32b83d9066c ]

Some VF drivers use the upper byte of "param1" (the qp count field)
in mlx4_qp_reserve_range() to pass flags which are used to optimize
the range allocation.

Under the current code, if any of these flags are set, the 32-bit
count field yields a count greater than 2^24, which is out of range,
and this VF fails.

As these flags represent a "best-effort" allocation hint anyway, they may
safely be ignored. Therefore, the PF driver may simply mask out the bits.

Fixes: c82e9aa0a8 "mlx4_core: resource tracking for HCA resources used by guests"
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/net/ethernet/mellanox/mlx4/resource_tracker.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
index dd6876321116..cdbe63712d2d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
+++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c
@@ -1227,7 +1227,7 @@ static int qp_alloc_res(struct mlx4_dev *dev, int slave, int op, int cmd,
 
 	switch (op) {
 	case RES_OP_RESERVE:
-		count = get_param_l(&in_param);
+		count = get_param_l(&in_param) & 0xffffff;
 		align = get_param_h(&in_param);
 		err = __mlx4_qp_reserve_range(dev, count, align, &base);
 		if (err)
-- 
2.2.1
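
The masking done by patch 04/78 can be sketched as follows. The helper
names are hypothetical, and the sketch assumes that the low 32 bits of the
64-bit input parameter carry the count (with VF flags in its upper byte)
and the high 32 bits carry the alignment, as the patch context suggests.

```c
#include <stdint.h>

/* The QP count occupies the low 24 bits; the upper byte may hold flags. */
#define QP_COUNT_MASK 0xffffffu

/* Hypothetical stand-ins for the driver's get_param_l()/get_param_h(). */
static uint32_t param_low32(uint64_t in_param)
{
	return (uint32_t)(in_param & 0xffffffffu);
}

static uint32_t param_high32(uint64_t in_param)
{
	return (uint32_t)(in_param >> 32);
}

/* Drop the best-effort allocation hints and keep only the count. */
static uint32_t qp_reserve_count(uint64_t in_param)
{
	return param_low32(in_param) & QP_COUNT_MASK;
}
```

With a flag set in the upper byte, for example a low word of 0x80000008,
the unmasked value exceeds 2^24 while the masked count is 8.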



* [PATCH 3.12 05/78] rtnetlink: release net refcnt on error in do_setlink()
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (3 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 04/78] net/mlx4_core: Limit count field to 24 bits in qp_alloc_res Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 06/78] xen-netfront: Remove BUGs on paged skb data which crosses a page boundary Jiri Slaby
                   ` (74 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Nicolas Dichtel, Eric W. Biederman,
	David S. Miller, Jiri Slaby

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

[ Upstream commit e0ebde0e131b529fd721b24f62872def5ec3718c ]

rtnl_link_get_net() holds a reference on the 'struct net', we need to release
it in case of error.

CC: Eric W. Biederman <ebiederm@xmission.com>
Fixes: b51642f6d77b ("net: Enable a userns root rtnl calls that are safe for unprivilged users")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 net/core/rtnetlink.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 93ad6c5b2d77..f3224755b328 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1391,6 +1391,7 @@ static int do_setlink(const struct sk_buff *skb,
 			goto errout;
 		}
 		if (!netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) {
+			put_net(net);
 			err = -EPERM;
 			goto errout;
 		}
-- 
2.2.1
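
The leak fixed by patch 05/78 is an instance of a general rule: every exit
path taken after a reference is acquired has to release it. A minimal
sketch, with a plain counter and hypothetical ns_get()/ns_put() helpers
standing in for the real 'struct net' reference counting:

```c
#include <errno.h>
#include <stdbool.h>

struct ns {
	int refcnt;
};

/* Hypothetical helpers modelling get/put of a namespace reference. */
static struct ns *ns_get(struct ns *n)
{
	n->refcnt++;
	return n;
}

static void ns_put(struct ns *n)
{
	n->refcnt--;
}

/* Returns 0 on success or -EPERM; never leaks the reference it took. */
static int set_ns(struct ns *target, bool capable)
{
	struct ns *n = ns_get(target);

	if (!capable) {
		ns_put(n);	/* the release that was missing on this path */
		return -EPERM;
	}

	/* ... use n to perform the change ... */
	ns_put(n);
	return 0;
}
```

After either outcome the reference count is back at its starting value.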



* [PATCH 3.12 06/78] xen-netfront: Remove BUGs on paged skb data which crosses a page boundary
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (4 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 05/78] rtnetlink: release net refcnt on error in do_setlink() Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 07/78] net: mvneta: fix Tx interrupt delay Jiri Slaby
                   ` (73 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Seth Forshee, David S. Miller, Jiri Slaby

From: Seth Forshee <seth.forshee@canonical.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

[ Upstream commit 8d609725d4357f499e2103e46011308b32f53513 ]

These BUGs can be erroneously triggered by frags which refer to
tail pages within a compound page. The data in these pages may
overrun the hardware page while still being contained within the
compound page, but since compound_order() evaluates to 0 for tail
pages the assertion fails. The code already iterates through
subsequent pages correctly in this scenario, so the BUGs are
unnecessary and can be removed.

Fixes: f36c374782e4 ("xen/netfront: handle compound page fragments on transmit")
Cc: <stable@vger.kernel.org> # 3.7+
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/net/xen-netfront.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 7c541dc1647e..fd3c1da14495 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -468,9 +468,6 @@ static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev,
 		len = skb_frag_size(frag);
 		offset = frag->page_offset;
 
-		/* Data must not cross a page boundary. */
-		BUG_ON(len + offset > PAGE_SIZE<<compound_order(page));
-
 		/* Skip unused frames from start of page */
 		page += offset >> PAGE_SHIFT;
 		offset &= ~PAGE_MASK;
@@ -478,8 +475,6 @@ static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev,
 		while (len > 0) {
 			unsigned long bytes;
 
-			BUG_ON(offset >= PAGE_SIZE);
-
 			bytes = PAGE_SIZE - offset;
 			if (bytes > len)
 				bytes = len;
-- 
2.2.1
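
To see why the removed assertions in patch 06/78 were unnecessary, the
page walk can be modelled in userspace. The sketch assumes 4 KiB pages;
count_page_chunks() and the SKETCH_* constants are hypothetical and only
mirror the arithmetic of the loop, not the grant-table work done by the
driver.

```c
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Split a fragment (offset, len) into per-page chunks.
 * Stores the index of the first page actually used in *first_page and
 * returns the number of chunks needed.
 */
static size_t count_page_chunks(size_t offset, size_t len, size_t *first_page)
{
	size_t chunks = 0;

	/* Skip whole pages that lie before the start of the data. */
	*first_page = offset >> SKETCH_PAGE_SHIFT;
	offset &= SKETCH_PAGE_SIZE - 1;

	while (len > 0) {
		size_t bytes = SKETCH_PAGE_SIZE - offset;

		if (bytes > len)
			bytes = len;

		chunks++;
		len -= bytes;
		offset = 0;	/* later chunks start at a page boundary */
	}
	return chunks;
}
```

A fragment with offset 5000 and length 4000 starts in the second page and
is split into two chunks, so data that crosses a page boundary is handled
without tripping any assertion.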



* [PATCH 3.12 07/78] net: mvneta: fix Tx interrupt delay
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (5 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 06/78] xen-netfront: Remove BUGs on paged skb data which crosses a page boundary Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 08/78] net: mvneta: fix race condition in mvneta_tx() Jiri Slaby
                   ` (72 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, willy tarreau, David S. Miller, Jiri Slaby

From: willy tarreau <w@1wt.eu>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

[ Upstream commit aebea2ba0f7495e1a1c9ea5e753d146cb2f6b845 ]

The mvneta driver sets the amount of Tx coalesce packets to 16 by
default. Normally that does not cause any trouble since the driver
uses a much larger Tx ring size (532 packets). But some sockets
might run with very small buffers, much smaller than the equivalent
of 16 packets. This is what ping is doing for example, by setting
SNDBUF to 324 bytes rounded up to 2kB by the kernel.

The problem is that there is no documented method to force a specific
packet to emit an interrupt (eg: the last of the ring) nor is it
possible to make the NIC emit an interrupt after a given delay.

In this case, it causes trouble, because when ping sends packets over
its raw socket, the few first packets leave the system, and the first
15 packets will be emitted without an IRQ being generated, so without
the skbs being freed. And since the socket's buffer is small, there's
no way to reach that amount of packets, and the ping ends up with
"send: no buffer available" after sending 6 packets. Running with 3
instances of ping in parallel is enough to hide the problem, because
with 6 packets per instance, that's 18 packets total, which is enough
to grant a Tx interrupt before all are sent.

The original driver in the LSP kernel worked around this design flaw
by using a software timer to clean up the Tx descriptors. This timer
was slow and caused terrible network performance on some Tx-bound
workloads (such as routing) but was enough to make tools like ping
work correctly.

Instead here, we simply set the packet counts before interrupt to 1.
This ensures that each packet sent will produce an interrupt. NAPI
takes care of coalescing interrupts since the interrupt is disabled
once generated.

No measurable impact on performance or CPU usage was observed with
small or large packets, including when saturating the link on Tx, and
this fixes tools like ping which rely on too small a send buffer. If one
wants to increase this value for certain workloads where it is safe
to do so, "ethtool -C $dev tx-frames" will override this default
setting.

This fix needs to be applied to stable kernels starting with 3.10.

Tested-By: Maggie Mae Roxas <maggie.mae.roxas@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/net/ethernet/marvell/mvneta.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index fabdda91fd0e..0e38db6469fb 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -172,7 +172,7 @@
 /* Various constants */
 
 /* Coalescing */
-#define MVNETA_TXDONE_COAL_PKTS		16
+#define MVNETA_TXDONE_COAL_PKTS		1
 #define MVNETA_RX_COAL_PKTS		32
 #define MVNETA_RX_COAL_USEC		100
 
-- 
2.2.1
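
The arithmetic in the description of patch 07/78 can be checked with a
deliberately simplified model: a Tx-done interrupt is assumed to fire once
the number of transmitted but not yet reclaimed packets reaches the
coalescing threshold. tx_irq_fires() is a hypothetical helper, not driver
code.

```c
#include <stdbool.h>

/*
 * Simplified model: the NIC raises a Tx-done interrupt only when at least
 * coal_pkts packets have been sent since the last one.
 */
static bool tx_irq_fires(unsigned int in_flight, unsigned int coal_pkts)
{
	return in_flight >= coal_pkts;
}
```

One ping instance stalls with 6 packets in flight against a threshold of
16, three instances reach 18 and get an interrupt, and with the threshold
set to 1 every packet triggers one.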



* [PATCH 3.12 08/78] net: mvneta: fix race condition in mvneta_tx()
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (6 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 07/78] net: mvneta: fix Tx interrupt delay Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 09/78] net: sctp: use MAX_HEADER for headroom reserve in output path Jiri Slaby
                   ` (71 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Eric Dumazet, David S. Miller, Jiri Slaby

From: Eric Dumazet <edumazet@google.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

[ Upstream commit 5f478b41033606d325e420df693162e2524c2b94 ]

mvneta_tx() dereferences skb to get skb->len too late,
as hardware might have completed the transmit and TX completion
could have freed the skb from another cpu.

Fixes: 71f6d1b31fb1 ("net: mvneta: replace Tx timer with a real interrupt")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/net/ethernet/marvell/mvneta.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 0e38db6469fb..9c66d3168911 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -1524,6 +1524,7 @@ static int mvneta_tx(struct sk_buff *skb, struct net_device *dev)
 	struct mvneta_tx_queue *txq = &pp->txqs[txq_id];
 	struct mvneta_tx_desc *tx_desc;
 	struct netdev_queue *nq;
+	int len = skb->len;
 	int frags = 0;
 	u32 tx_cmd;
 
@@ -1584,7 +1585,7 @@ out:
 	if (frags > 0) {
 		u64_stats_update_begin(&pp->tx_stats.syncp);
 		pp->tx_stats.packets++;
-		pp->tx_stats.bytes += skb->len;
+		pp->tx_stats.bytes += len;
 		u64_stats_update_end(&pp->tx_stats.syncp);
 
 	} else {
-- 
2.2.1
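
The fix in patch 08/78 follows a common ownership rule: copy whatever is
still needed out of a buffer before handing it to a party that may free
it. A userspace sketch, in which the hypothetical hw_submit_and_complete()
frees the packet immediately to model a Tx completion running on another
CPU:

```c
#include <stdlib.h>

struct pkt {
	size_t len;
};

struct tx_stats {
	unsigned long packets;
	unsigned long bytes;
};

/* Models the hand-off to hardware; completion may free the packet at once. */
static void hw_submit_and_complete(struct pkt *p)
{
	free(p);
}

static void xmit(struct pkt *p, struct tx_stats *st)
{
	size_t len = p->len;	/* read while we still own the packet */

	hw_submit_and_complete(p);

	/* p must not be dereferenced from here on. */
	st->packets++;
	st->bytes += len;
}
```

Reading p->len after the hand-off would be a use after free in this model,
which corresponds to the late skb->len access removed by the patch.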



* [PATCH 3.12 09/78] net: sctp: use MAX_HEADER for headroom reserve in output path
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (7 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 08/78] net: mvneta: fix race condition in mvneta_tx() Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 10/78] ceph: fix null pointer dereference in discard_cap_releases() Jiri Slaby
                   ` (70 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Daniel Borkmann, David S. Miller, Jiri Slaby

From: Daniel Borkmann <dborkman@redhat.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

[ Upstream commit 9772b54c55266ce80c639a80aa68eeb908f8ecf5 ]

To accommodate enough headroom for tunnels, use MAX_HEADER instead
of LL_MAX_HEADER. Robert reported that, after roughly 40 hours of
trinity, he hit an skb_under_panic() via the SCTP output path (see
reference). I couldn't reproduce it here, but not using MAX_HEADER, as
other protocols do, might be one possible cause.

In any case, the accounting on the chunks themselves looks correct, as
the skb already passed the SCTP output path and did not hit any
skb_over_panic(). Given tunneling was enabled in his .config, the
headroom would have been expanded by MAX_HEADER in this case.

Reported-by: Robert Święcki <robert@swiecki.net>
Reference: https://lkml.org/lkml/2014/12/1/507
Fixes: 594ccc14dfe4d ("[SCTP] Replace incorrect use of dev_alloc_skb with alloc_skb in sctp_packet_transmit().")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 net/sctp/output.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sctp/output.c b/net/sctp/output.c
index 2a41465729ab..69faf79a48c6 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -403,12 +403,12 @@ int sctp_packet_transmit(struct sctp_packet *packet)
 	sk = chunk->skb->sk;
 
 	/* Allocate the new skb.  */
-	nskb = alloc_skb(packet->size + LL_MAX_HEADER, GFP_ATOMIC);
+	nskb = alloc_skb(packet->size + MAX_HEADER, GFP_ATOMIC);
 	if (!nskb)
 		goto nomem;
 
 	/* Make sure the outbound skb has enough header room reserved. */
-	skb_reserve(nskb, packet->overhead + LL_MAX_HEADER);
+	skb_reserve(nskb, packet->overhead + MAX_HEADER);
 
 	/* Set the owning socket so that we know where to get the
 	 * destination IP address.
-- 
2.2.1


^ permalink raw reply related	[flat|nested] 89+ messages in thread
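
Why the amount of reserved headroom matters in patch 09/78 can be shown
with a much-simplified buffer model. This is a sketch under assumptions:
struct buf, buf_reserve() and buf_push() are hypothetical user-space
analogues of an skb, skb_reserve() and skb_push(), and the byte counts used
with them are illustrative, not the real values of LL_MAX_HEADER or
MAX_HEADER (which depend on the kernel configuration).

```c
#include <stddef.h>

/* Hypothetical, much-simplified model of an skb's linear buffer. */
struct buf {
	unsigned char *head;	/* start of the allocation */
	unsigned char *data;	/* start of the current payload */
};

/*
 * Like skb_reserve(): leave len bytes of headroom in an empty buffer.
 * The caller is assumed to have allocated at least len bytes.
 */
static void buf_reserve(struct buf *b, size_t len)
{
	b->data += len;
}

/*
 * Like skb_push(): prepend a header of len bytes. Returns NULL if the
 * header would start before b->head, which is the condition under which
 * the kernel would call skb_under_panic().
 */
static unsigned char *buf_push(struct buf *b, size_t len)
{
	if ((size_t)(b->data - b->head) < len)
		return NULL;
	b->data -= len;
	return b->data;
}
```

With a small reservation, a link-layer header fits but an additional
tunnel header underflows the buffer; with a larger reservation both fit.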

* [PATCH 3.12 10/78] ceph: fix null pointer dereference in discard_cap_releases()
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (8 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 09/78] net: sctp: use MAX_HEADER for headroom reserve in output path Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 11/78] perf/x86/intel: Protect LBR and extra_regs against KVM lying Jiri Slaby
                   ` (69 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Yan, Zheng, Jiri Slaby

From: "Yan, Zheng" <zheng.z.yan@intel.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 00bd8edb861eb41d274938cfc0338999d9c593a3 upstream.

send_mds_reconnect() may call discard_cap_releases() after all
release messages have been dropped by cleanup_cap_releases(). In that
case the s_cap_releases list is empty and there is no in-progress
message to zero out.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 fs/ceph/mds_client.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 788901552eb1..6f1161324f91 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -1420,15 +1420,18 @@ static void discard_cap_releases(struct ceph_mds_client *mdsc,
 	dout("discard_cap_releases mds%d\n", session->s_mds);
 	spin_lock(&session->s_cap_lock);
 
-	/* zero out the in-progress message */
-	msg = list_first_entry(&session->s_cap_releases,
-			       struct ceph_msg, list_head);
-	head = msg->front.iov_base;
-	num = le32_to_cpu(head->num);
-	dout("discard_cap_releases mds%d %p %u\n", session->s_mds, msg, num);
-	head->num = cpu_to_le32(0);
-	msg->front.iov_len = sizeof(*head);
-	session->s_num_cap_releases += num;
+	if (!list_empty(&session->s_cap_releases)) {
+		/* zero out the in-progress message */
+		msg = list_first_entry(&session->s_cap_releases,
+					struct ceph_msg, list_head);
+		head = msg->front.iov_base;
+		num = le32_to_cpu(head->num);
+		dout("discard_cap_releases mds%d %p %u\n",
+		     session->s_mds, msg, num);
+		head->num = cpu_to_le32(0);
+		msg->front.iov_len = sizeof(*head);
+		session->s_num_cap_releases += num;
+	}
 
 	/* requeue completed messages */
 	while (!list_empty(&session->s_cap_releases_done)) {
-- 
2.2.1


^ permalink raw reply related	[flat|nested] 89+ messages in thread
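
The guard added by patch 10/78 can be sketched with a minimal list
implementation. This is a user-space illustration only: the list helpers
are modeled on the kernel's list_head API, and struct msg, msg_from_node()
and discard_first() are hypothetical names, not the ceph code.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Hypothetical message type with an embedded list node. */
struct msg {
	unsigned int num;
	struct list_head node;
};

#define msg_from_node(ptr) \
	((struct msg *)((char *)(ptr) - offsetof(struct msg, node)))

/*
 * Zero the count of the first queued message and return the old count.
 * On an empty list there is no first message: head->next points back at
 * the head itself, so converting it to a struct msg would yield a bogus
 * pointer. Return 0 in that case instead of touching it.
 */
static unsigned int discard_first(struct list_head *head)
{
	struct msg *m;
	unsigned int num;

	if (list_empty(head))
		return 0;

	m = msg_from_node(head->next);
	num = m->num;
	m->num = 0;
	return num;
}
```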

* [PATCH 3.12 11/78] perf/x86/intel: Protect LBR and extra_regs against KVM lying
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (9 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 10/78] ceph: fix null pointer dereference in discard_cap_releases() Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-10 11:24   ` Dongsu Park
  2015-01-09 10:31 ` [PATCH 3.12 12/78] s390/3215: fix hanging console issue Jiri Slaby
                   ` (68 subsequent siblings)
  79 siblings, 1 reply; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Kan Liang, Peter Zijlstra, Andi Kleen,
	Arnaldo Carvalho de Melo, Linus Torvalds, Maria Dimakopoulou,
	Mark Davies, Paul Mackerras, Stephane Eranian, Yan, Zheng,
	Ingo Molnar, Jiri Slaby

From: Kan Liang <kan.liang@intel.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 338b522ca43cfd32d11a370f4203bcd089c6c877 upstream.

With -cpu host, KVM reports LBR and extra_regs support if the host has
it.

When the guest perf driver then accesses an LBR or extra_regs MSR, every
such access raises #GP, since KVM does not handle these MSRs.
So check once, at initialization time, whether the related MSRs can be
accessed, to avoid faulting accesses at runtime.

To reproduce the issue, build the host kernel with CONFIG_KVM_INTEL=y
and the guest kernel with CONFIG_PARAVIRT=n and CONFIG_KVM_GUEST=n, then
start the guest with -cpu host.
In the guest, run perf record with --branch-any or --branch-filter to
trigger the LBR #GP, or run perf stat on offcore events (e.g.
LLC-loads/LLC-load-misses) to trigger the offcore_rsp #GP.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Cc: Mark Davies <junk@eslaf.co.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yan, Zheng <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1405365957-20202-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/kernel/cpu/perf_event.c       |  3 ++
 arch/x86/kernel/cpu/perf_event.h       | 12 ++++---
 arch/x86/kernel/cpu/perf_event_intel.c | 66 +++++++++++++++++++++++++++++++++-
 3 files changed, 75 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 5edd3c0b437a..c7106f116fb0 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -118,6 +118,9 @@ static int x86_pmu_extra_regs(u64 config, struct perf_event *event)
 			continue;
 		if (event->attr.config1 & ~er->valid_mask)
 			return -EINVAL;
+		/* Check if the extra msrs can be safely accessed*/
+		if (!er->extra_msr_access)
+			return -ENXIO;
 
 		reg->idx = er->idx;
 		reg->config = event->attr.config1;
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index cc16faae0538..53bd2726f4cd 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -279,14 +279,16 @@ struct extra_reg {
 	u64			config_mask;
 	u64			valid_mask;
 	int			idx;  /* per_xxx->regs[] reg index */
+	bool			extra_msr_access;
 };
 
 #define EVENT_EXTRA_REG(e, ms, m, vm, i) {	\
-	.event = (e),		\
-	.msr = (ms),		\
-	.config_mask = (m),	\
-	.valid_mask = (vm),	\
-	.idx = EXTRA_REG_##i,	\
+	.event = (e),			\
+	.msr = (ms),			\
+	.config_mask = (m),		\
+	.valid_mask = (vm),		\
+	.idx = EXTRA_REG_##i,		\
+	.extra_msr_access = true,	\
 	}
 
 #define INTEL_EVENT_EXTRA_REG(event, msr, vm, idx)	\
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 959bbf204dae..02554ddf8481 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2144,6 +2144,41 @@ static void intel_snb_check_microcode(void)
 	}
 }
 
+/*
+ * Under certain circumstances, access certain MSR may cause #GP.
+ * The function tests if the input MSR can be safely accessed.
+ */
+static bool check_msr(unsigned long msr, u64 mask)
+{
+	u64 val_old, val_new, val_tmp;
+
+	/*
+	 * Read the current value, change it and read it back to see if it
+	 * matches, this is needed to detect certain hardware emulators
+	 * (qemu/kvm) that don't trap on the MSR access and always return 0s.
+	 */
+	if (rdmsrl_safe(msr, &val_old))
+		return false;
+
+	/*
+	 * Only change the bits which can be updated by wrmsrl.
+	 */
+	val_tmp = val_old ^ mask;
+	if (wrmsrl_safe(msr, val_tmp) ||
+	    rdmsrl_safe(msr, &val_new))
+		return false;
+
+	if (val_new != val_tmp)
+		return false;
+
+	/* Here it's sure that the MSR can be safely accessed.
+	 * Restore the old value and return.
+	 */
+	wrmsrl(msr, val_old);
+
+	return true;
+}
+
 static __init void intel_sandybridge_quirk(void)
 {
 	x86_pmu.check_microcode = intel_snb_check_microcode;
@@ -2207,7 +2242,8 @@ __init int intel_pmu_init(void)
 	union cpuid10_ebx ebx;
 	struct event_constraint *c;
 	unsigned int unused;
-	int version;
+	struct extra_reg *er;
+	int version, i;
 
 	if (!cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
 		switch (boot_cpu_data.x86) {
@@ -2515,6 +2551,34 @@ __init int intel_pmu_init(void)
 		}
 	}
 
+	/*
+	 * Access LBR MSR may cause #GP under certain circumstances.
+	 * E.g. KVM doesn't support LBR MSR
+	 * Check all LBT MSR here.
+	 * Disable LBR access if any LBR MSRs can not be accessed.
+	 */
+	if (x86_pmu.lbr_nr && !check_msr(x86_pmu.lbr_tos, 0x3UL))
+		x86_pmu.lbr_nr = 0;
+	for (i = 0; i < x86_pmu.lbr_nr; i++) {
+		if (!(check_msr(x86_pmu.lbr_from + i, 0xffffUL) &&
+		      check_msr(x86_pmu.lbr_to + i, 0xffffUL)))
+			x86_pmu.lbr_nr = 0;
+	}
+
+	/*
+	 * Access extra MSR may cause #GP under certain circumstances.
+	 * E.g. KVM doesn't support offcore event
+	 * Check all extra_regs here.
+	 */
+	if (x86_pmu.extra_regs) {
+		for (er = x86_pmu.extra_regs; er->msr; er++) {
+			er->extra_msr_access = check_msr(er->msr, 0x1ffUL);
+			/* Disable LBR select mapping */
+			if ((er->idx == EXTRA_REG_LBR) && !er->extra_msr_access)
+				x86_pmu.lbr_sel_map = NULL;
+		}
+	}
+
 	/* Support full width counters using alternative MSR range */
 	if (x86_pmu.intel_cap.full_width_write) {
 		x86_pmu.max_period = x86_pmu.cntval_mask;
-- 
2.2.1


^ permalink raw reply related	[flat|nested] 89+ messages in thread
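
The probing strategy used by check_msr() in patch 11/78 (read, flip the
writable bits, write, read back, compare, restore) can be tried in user
space against fake register backends. This is a sketch under assumptions:
struct reg_ops, probe_reg() and the three backends are hypothetical and
only stand in for rdmsrl_safe()/wrmsrl_safe() and for real or emulated
hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register backend; each accessor returns nonzero on a fault. */
struct reg_ops {
	int (*read)(void *ctx, uint64_t *val);
	int (*write)(void *ctx, uint64_t val);
	void *ctx;
};

/*
 * Returns true only if the register can be read and written without
 * faulting and a written value is actually observed on read-back.
 * mask selects the bits that are expected to be writable.
 */
static bool probe_reg(const struct reg_ops *ops, uint64_t mask)
{
	uint64_t val_old, val_new, val_tmp;

	if (ops->read(ops->ctx, &val_old))
		return false;

	val_tmp = val_old ^ mask;
	if (ops->write(ops->ctx, val_tmp) || ops->read(ops->ctx, &val_new))
		return false;

	if (val_new != val_tmp)
		return false;

	/* The register behaves like real storage; restore the old value. */
	ops->write(ops->ctx, val_old);
	return true;
}

/* Backend that really stores values in the uint64_t that ctx points to. */
static int real_read(void *ctx, uint64_t *val)
{
	*val = *(uint64_t *)ctx;
	return 0;
}

static int real_write(void *ctx, uint64_t val)
{
	*(uint64_t *)ctx = val;
	return 0;
}

/* Backend that accepts writes silently but always reads back 0. */
static int stub_read(void *ctx, uint64_t *val)
{
	(void)ctx;
	*val = 0;
	return 0;
}

static int stub_write(void *ctx, uint64_t val)
{
	(void)ctx;
	(void)val;
	return 0;
}

/* Backend whose reads always fault. */
static int fault_read(void *ctx, uint64_t *val)
{
	(void)ctx;
	(void)val;
	return -1;
}
```

A plain "does the read fault?" test would accept the stub backend; the
write and read-back step is what rejects an emulator that does not trap
but also does not store anything.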

* [PATCH 3.12 12/78] s390/3215: fix hanging console issue
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (10 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 11/78] perf/x86/intel: Protect LBR and extra_regs against KVM lying Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 13/78] s390/3215: fix tty output containing tabs Jiri Slaby
                   ` (67 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Martin Schwidefsky, Jiri Slaby

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 26d766c60f4ea08cd14f0f3435a6db3d6cc2ae96 upstream.

The ccw_device_start in raw3215_start_io can fail. raw3215_try_io
does not check if the request could be started and removes any
pending timer. This can leave the system in a hanging state.
Check for pending request after raw3215_start_io and start a
timer if necessary.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/s390/char/con3215.c | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/s390/char/con3215.c b/drivers/s390/char/con3215.c
index bb86494e2b7b..9a408f6e95db 100644
--- a/drivers/s390/char/con3215.c
+++ b/drivers/s390/char/con3215.c
@@ -288,12 +288,16 @@ static void raw3215_timeout(unsigned long __data)
 	unsigned long flags;
 
 	spin_lock_irqsave(get_ccwdev_lock(raw->cdev), flags);
-	if (raw->flags & RAW3215_TIMER_RUNS) {
-		del_timer(&raw->timer);
-		raw->flags &= ~RAW3215_TIMER_RUNS;
-		if (!(raw->port.flags & ASYNC_SUSPENDED)) {
-			raw3215_mk_write_req(raw);
-			raw3215_start_io(raw);
+	raw->flags &= ~RAW3215_TIMER_RUNS;
+	if (!(raw->port.flags & ASYNC_SUSPENDED)) {
+		raw3215_mk_write_req(raw);
+		raw3215_start_io(raw);
+		if ((raw->queued_read || raw->queued_write) &&
+		    !(raw->flags & RAW3215_WORKING) &&
+		    !(raw->flags & RAW3215_TIMER_RUNS)) {
+			raw->timer.expires = RAW3215_TIMEOUT + jiffies;
+			add_timer(&raw->timer);
+			raw->flags |= RAW3215_TIMER_RUNS;
 		}
 	}
 	spin_unlock_irqrestore(get_ccwdev_lock(raw->cdev), flags);
@@ -317,17 +321,15 @@ static inline void raw3215_try_io(struct raw3215_info *raw)
 		    (raw->flags & RAW3215_FLUSHING)) {
 			/* execute write requests bigger than minimum size */
 			raw3215_start_io(raw);
-			if (raw->flags & RAW3215_TIMER_RUNS) {
-				del_timer(&raw->timer);
-				raw->flags &= ~RAW3215_TIMER_RUNS;
-			}
-		} else if (!(raw->flags & RAW3215_TIMER_RUNS)) {
-			/* delay small writes */
-			raw->timer.expires = RAW3215_TIMEOUT + jiffies;
-			add_timer(&raw->timer);
-			raw->flags |= RAW3215_TIMER_RUNS;
 		}
 	}
+	if ((raw->queued_read || raw->queued_write) &&
+	    !(raw->flags & RAW3215_WORKING) &&
+	    !(raw->flags & RAW3215_TIMER_RUNS)) {
+		raw->timer.expires = RAW3215_TIMEOUT + jiffies;
+		add_timer(&raw->timer);
+		raw->flags |= RAW3215_TIMER_RUNS;
+	}
 }
 
 /*
-- 
2.2.1


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH 3.12 13/78] s390/3215: fix tty output containing tabs
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (11 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 12/78] s390/3215: fix hanging console issue Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 14/78] usb: gadget: at91_udc: move prepare clk into process context Jiri Slaby
                   ` (66 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Martin Schwidefsky, Jiri Slaby

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit e512d56c799517f33b301d81e9a5e0ebf30c2d1e upstream.

git commit 37f81fa1f63ad38e16125526bb2769ae0ea8d332
"n_tty: do O_ONLCR translation as a single write"
surfaced a bug in the 3215 device driver. In combination this
broke tab expansion for tty output.

The cause is an asymmetry in the behaviour of tty3215_ops->write
vs tty3215_ops->put_char. The put_char function scans for '\t'
but the write function does not.

As the driver has its own logic for '\t' expansion, remove XTABS
from c_oflag of the initial termios as well.

Reported-by: Stephen Powell <zlinuxman@wowway.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/s390/char/con3215.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/s390/char/con3215.c b/drivers/s390/char/con3215.c
index 9a408f6e95db..19915c5b256f 100644
--- a/drivers/s390/char/con3215.c
+++ b/drivers/s390/char/con3215.c
@@ -1029,12 +1029,26 @@ static int tty3215_write(struct tty_struct * tty,
 			 const unsigned char *buf, int count)
 {
 	struct raw3215_info *raw;
+	int i, written;
 
 	if (!tty)
 		return 0;
 	raw = (struct raw3215_info *) tty->driver_data;
-	raw3215_write(raw, buf, count);
-	return count;
+	written = count;
+	while (count > 0) {
+		for (i = 0; i < count; i++)
+			if (buf[i] == '\t' || buf[i] == '\n')
+				break;
+		raw3215_write(raw, buf, i);
+		count -= i;
+		buf += i;
+		if (count > 0) {
+			raw3215_putchar(raw, *buf);
+			count--;
+			buf++;
+		}
+	}
+	return written;
 }
 
 /*
@@ -1182,7 +1196,7 @@ static int __init tty3215_init(void)
 	driver->subtype = SYSTEM_TYPE_TTY;
 	driver->init_termios = tty_std_termios;
 	driver->init_termios.c_iflag = IGNBRK | IGNPAR;
-	driver->init_termios.c_oflag = ONLCR | XTABS;
+	driver->init_termios.c_oflag = ONLCR;
 	driver->init_termios.c_lflag = ISIG;
 	driver->flags = TTY_DRIVER_REAL_RAW;
 	tty_set_operations(driver, &tty3215_ops);
-- 
2.2.1


^ permalink raw reply related	[flat|nested] 89+ messages in thread
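
The splitting loop that patch 13/78 adds to tty3215_write() can be
exercised in isolation. In this sketch, struct sink, sink_write(),
sink_putchar() and split_write() are hypothetical: the sink only counts how
the input was dispatched, standing in for raw3215_write() (bulk path) and
raw3215_putchar() (per-character path that handles '\t' and '\n').

```c
#include <stddef.h>

/* Hypothetical sink that records how the input was dispatched. */
struct sink {
	size_t bulk_bytes;	/* bytes sent through the bulk path */
	size_t special_chars;	/* '\t' and '\n' sent one at a time */
};

static void sink_write(struct sink *s, const char *buf, size_t len)
{
	(void)buf;
	s->bulk_bytes += len;
}

static void sink_putchar(struct sink *s, char c)
{
	(void)c;
	s->special_chars++;
}

/*
 * Send runs of ordinary characters through the bulk path and route each
 * '\t' or '\n' through the per-character path, so that both entry points
 * treat special characters the same way. Returns the number of input
 * bytes consumed, which is always count.
 */
static size_t split_write(struct sink *s, const char *buf, size_t count)
{
	size_t written = count;
	size_t i;

	while (count > 0) {
		for (i = 0; i < count; i++)
			if (buf[i] == '\t' || buf[i] == '\n')
				break;
		sink_write(s, buf, i);
		count -= i;
		buf += i;
		if (count > 0) {
			sink_putchar(s, *buf);
			count--;
			buf++;
		}
	}
	return written;
}
```

As in the patch, the bulk path may be called with a length of zero when
the input starts with a special character or has two in a row.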

* [PATCH 3.12 14/78] usb: gadget: at91_udc: move prepare clk into process context
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (12 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 13/78] s390/3215: fix tty output containing tabs Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 15/78] tty: Fix pty master poll() after slave closes v2 Jiri Slaby
                   ` (65 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Ronald Wahl, Felipe Balbi, Jiri Slaby

From: Ronald Wahl <ronald.wahl@raritan.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit b2ba27a5c56ff7204d8a8684893d64d4afe2cee5 upstream.

Commit 7628083227b6bc4a7e33d7c381d7a4e558424b6b (usb: gadget: at91_udc:
prepare clk before calling enable) added clock preparation in interrupt
context. This is not allowed as it might sleep. Setting the clock rate
from there is unsafe for the same reason. Move clock preparation and
setting the clock rate into process context (at91udc_probe).

Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Felipe Balbi <balbi@ti.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/usb/gadget/at91_udc.c | 44 +++++++++++++++++++++++++++++++------------
 1 file changed, 32 insertions(+), 12 deletions(-)

diff --git a/drivers/usb/gadget/at91_udc.c b/drivers/usb/gadget/at91_udc.c
index dfd29438a11e..e3101cec93c9 100644
--- a/drivers/usb/gadget/at91_udc.c
+++ b/drivers/usb/gadget/at91_udc.c
@@ -871,12 +871,10 @@ static void clk_on(struct at91_udc *udc)
 		return;
 	udc->clocked = 1;
 
-	if (IS_ENABLED(CONFIG_COMMON_CLK)) {
-		clk_set_rate(udc->uclk, 48000000);
-		clk_prepare_enable(udc->uclk);
-	}
-	clk_prepare_enable(udc->iclk);
-	clk_prepare_enable(udc->fclk);
+	if (IS_ENABLED(CONFIG_COMMON_CLK))
+		clk_enable(udc->uclk);
+	clk_enable(udc->iclk);
+	clk_enable(udc->fclk);
 }
 
 static void clk_off(struct at91_udc *udc)
@@ -885,10 +883,10 @@ static void clk_off(struct at91_udc *udc)
 		return;
 	udc->clocked = 0;
 	udc->gadget.speed = USB_SPEED_UNKNOWN;
-	clk_disable_unprepare(udc->fclk);
-	clk_disable_unprepare(udc->iclk);
+	clk_disable(udc->fclk);
+	clk_disable(udc->iclk);
 	if (IS_ENABLED(CONFIG_COMMON_CLK))
-		clk_disable_unprepare(udc->uclk);
+		clk_disable(udc->uclk);
 }
 
 /*
@@ -1781,14 +1779,24 @@ static int at91udc_probe(struct platform_device *pdev)
 	}
 
 	/* don't do anything until we have both gadget driver and VBUS */
+	if (IS_ENABLED(CONFIG_COMMON_CLK)) {
+		clk_set_rate(udc->uclk, 48000000);
+		retval = clk_prepare(udc->uclk);
+		if (retval)
+			goto fail1;
+	}
+	retval = clk_prepare(udc->fclk);
+	if (retval)
+		goto fail1a;
+
 	retval = clk_prepare_enable(udc->iclk);
 	if (retval)
-		goto fail1;
+		goto fail1b;
 	at91_udp_write(udc, AT91_UDP_TXVC, AT91_UDP_TXVC_TXVDIS);
 	at91_udp_write(udc, AT91_UDP_IDR, 0xffffffff);
 	/* Clear all pending interrupts - UDP may be used by bootloader. */
 	at91_udp_write(udc, AT91_UDP_ICR, 0xffffffff);
-	clk_disable_unprepare(udc->iclk);
+	clk_disable(udc->iclk);
 
 	/* request UDC and maybe VBUS irqs */
 	udc->udp_irq = platform_get_irq(pdev, 0);
@@ -1796,7 +1804,7 @@ static int at91udc_probe(struct platform_device *pdev)
 			0, driver_name, udc);
 	if (retval < 0) {
 		DBG("request irq %d failed\n", udc->udp_irq);
-		goto fail1;
+		goto fail1c;
 	}
 	if (gpio_is_valid(udc->board.vbus_pin)) {
 		retval = gpio_request(udc->board.vbus_pin, "udc_vbus");
@@ -1849,6 +1857,13 @@ fail3:
 		gpio_free(udc->board.vbus_pin);
 fail2:
 	free_irq(udc->udp_irq, udc);
+fail1c:
+	clk_unprepare(udc->iclk);
+fail1b:
+	clk_unprepare(udc->fclk);
+fail1a:
+	if (IS_ENABLED(CONFIG_COMMON_CLK))
+		clk_unprepare(udc->uclk);
 fail1:
 	if (IS_ENABLED(CONFIG_COMMON_CLK) && !IS_ERR(udc->uclk))
 		clk_put(udc->uclk);
@@ -1897,6 +1912,11 @@ static int __exit at91udc_remove(struct platform_device *pdev)
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	release_mem_region(res->start, resource_size(res));
 
+	if (IS_ENABLED(CONFIG_COMMON_CLK))
+		clk_unprepare(udc->uclk);
+	clk_unprepare(udc->fclk);
+	clk_unprepare(udc->iclk);
+
 	clk_put(udc->iclk);
 	clk_put(udc->fclk);
 	if (IS_ENABLED(CONFIG_COMMON_CLK))
-- 
2.2.1


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH 3.12 15/78] tty: Fix pty master poll() after slave closes v2
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (13 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 14/78] usb: gadget: at91_udc: move prepare clk into process context Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 16/78] mm: frontswap: invalidate expired data on a dup-store failure Jiri Slaby
                   ` (64 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Francesco Ruggeri, Francesco Ruggeri,
	Greg Kroah-Hartman, Jiri Slaby

From: Francesco Ruggeri <fruggeri@aristanetworks.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit c4dc304677e8d566572c4738d95c48be150c6606 upstream.

Commit f95499c3030f ("n_tty: Don't wait for buffer work in read() loop")
introduces a race window where a pty master can be signalled that the pty
slave was closed before all the data that the slave wrote is delivered.
Commit f8747d4a466a ("tty: Fix pty master read() after slave closes") fixed the
problem in case of n_tty_read, but the problem still exists for n_tty_poll.
This can be seen by running 'for ((i=0; i<100;i++));do ./test.py ;done'
where test.py is:

import os, select, pty

(pid, pty_fd) = pty.fork()

if pid == 0:
   os.write(1, 'This string should be received by parent')
else:
   poller = select.epoll()
   poller.register( pty_fd, select.EPOLLIN )
   ready = poller.poll( 1 * 1000 )
   for fd, events in ready:
      if not events & select.EPOLLIN:
         print 'missed POLLIN event'
      else:
         print os.read(fd, 100)
   poller.close()

The string from the slave is missed several times.
This patch takes the same approach as the fix for read and special cases
this condition for poll.
Tested on 3.16.

Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/tty/n_tty.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index eac1b0d5b463..1197767b3019 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -2410,12 +2410,17 @@ static unsigned int n_tty_poll(struct tty_struct *tty, struct file *file,
 
 	poll_wait(file, &tty->read_wait, wait);
 	poll_wait(file, &tty->write_wait, wait);
+	if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
+		mask |= POLLHUP;
 	if (input_available_p(tty, TIME_CHAR(tty) ? 0 : MIN_CHAR(tty)))
 		mask |= POLLIN | POLLRDNORM;
+	else if (mask & POLLHUP) {
+		tty_flush_to_ldisc(tty);
+		if (input_available_p(tty, TIME_CHAR(tty) ? 0 : MIN_CHAR(tty)))
+			mask |= POLLIN | POLLRDNORM;
+	}
 	if (tty->packet && tty->link->ctrl_status)
 		mask |= POLLPRI | POLLIN | POLLRDNORM;
-	if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
-		mask |= POLLHUP;
 	if (tty_hung_up_p(file))
 		mask |= POLLHUP;
 	if (!(mask & (POLLHUP | POLLIN | POLLRDNORM))) {
-- 
2.2.1


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH 3.12 16/78] mm: frontswap: invalidate expired data on a dup-store failure
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (14 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 15/78] tty: Fix pty master poll() after slave closes v2 Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 17/78] mm/vmpressure.c: fix race in vmpressure_work_fn() Jiri Slaby
                   ` (63 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Weijie Yang, Konrad Rzeszutek Wilk, Seth Jennings,
	Dan Streetman, Minchan Kim, Bob Liu, Andrew Morton,
	Linus Torvalds, Jiri Slaby

From: Weijie Yang <weijie.yang@samsung.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit fb993fa1a2f669215fa03a09eed7848f2663e336 upstream.

If a frontswap dup-store fails, the expired page must be invalidated
in the backend; otherwise it can cause data corruption.
For example:
 1. use zswap as the frontswap backend with the writeback feature
 2. store a swap page (version_1) to entry A: success
 3. dup-store a newer page (version_2) to the same entry A: failure
 4. __swap_writepage() writes the version_2 page to the swapfile: success
 5. zswap shrinks and writes the version_1 page back to the swapfile
 6. the version_2 page is overwritten by version_1: data corruption

This patch fixes the issue by invalidating the expired data immediately
when a dup-store fails.

Signed-off-by: Weijie Yang <weijie.yang@samsung.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Seth Jennings <sjennings@variantweb.net>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 mm/frontswap.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/frontswap.c b/mm/frontswap.c
index c30eec536f03..f2a3571c6e22 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -244,8 +244,10 @@ int __frontswap_store(struct page *page)
 		  the (older) page from frontswap
 		 */
 		inc_frontswap_failed_stores();
-		if (dup)
+		if (dup) {
 			__frontswap_clear(sis, offset);
+			frontswap_ops->invalidate_page(type, offset);
+		}
 	}
 	if (frontswap_writethrough_enabled)
 		/* report failure so swap also writes to swap device */
-- 
2.2.1


^ permalink raw reply related	[flat|nested] 89+ messages in thread
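
The invalidate-on-failed-dup-store rule from patch 16/78 can be modeled
with a tiny cache. This is a sketch, not the frontswap API: struct backend,
backend_store(), backend_invalidate() and cache_store() are hypothetical,
offsets are assumed to be below NR_SLOTS, and fail_next_store is a test
hook that forces one store to fail.

```c
#include <stdbool.h>

#define NR_SLOTS 8

/* Hypothetical backend: one cached value per swap offset. */
struct backend {
	bool present[NR_SLOTS];
	int value[NR_SLOTS];
	bool fail_next_store;	/* test hook: make the next store fail */
};

static int backend_store(struct backend *b, unsigned int off, int val)
{
	if (b->fail_next_store) {
		b->fail_next_store = false;
		return -1;
	}
	b->present[off] = true;
	b->value[off] = val;
	return 0;
}

static void backend_invalidate(struct backend *b, unsigned int off)
{
	b->present[off] = false;
}

/*
 * Store val at off. If the store fails while an older copy is cached
 * (a dup-store), drop that copy so it can never be written back over
 * the newer data that the caller will now send to the swap device.
 */
static int cache_store(struct backend *b, unsigned int off, int val)
{
	bool dup = b->present[off];
	int ret = backend_store(b, off, val);

	if (ret && dup)
		backend_invalidate(b, off);
	return ret;
}
```

Without the invalidate call, the model would still hold version 1 at the
offset after the failed store of version 2, matching steps 3 to 6 of the
scenario in the commit message.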

* [PATCH 3.12 17/78] mm/vmpressure.c: fix race in vmpressure_work_fn()
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (15 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 16/78] mm: frontswap: invalidate expired data on a dup-store failure Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 18/78] mm: fix swapoff hang after page migration and fork Jiri Slaby
                   ` (62 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Andrew Morton, Anton Vorontsov, Linus Torvalds, Jiri Slaby

From: Andrew Morton <akpm@linux-foundation.org>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 91b57191cfd152c02ded0745250167d0263084f8 upstream.

On some Android devices, a "divide by zero" exception occurs.
vmpr->scanned can become zero between the unlocked check and
spin_lock(&vmpr->sr_lock).

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=88051

[akpm@linux-foundation.org: neaten]
Reported-by: ji_ang <ji_ang@163.com>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 mm/vmpressure.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/vmpressure.c b/mm/vmpressure.c
index e0f62837c3f4..c98b14ee69d6 100644
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -164,6 +164,7 @@ static void vmpressure_work_fn(struct work_struct *work)
 	unsigned long scanned;
 	unsigned long reclaimed;
 
+	spin_lock(&vmpr->sr_lock);
 	/*
 	 * Several contexts might be calling vmpressure(), so it is
 	 * possible that the work was rescheduled again before the old
@@ -172,11 +173,12 @@ static void vmpressure_work_fn(struct work_struct *work)
 	 * here. No need for any locks here since we don't care if
 	 * vmpr->reclaimed is in sync.
 	 */
-	if (!vmpr->scanned)
+	scanned = vmpr->scanned;
+	if (!scanned) {
+		spin_unlock(&vmpr->sr_lock);
 		return;
+	}
 
-	spin_lock(&vmpr->sr_lock);
-	scanned = vmpr->scanned;
 	reclaimed = vmpr->reclaimed;
 	vmpr->scanned = 0;
 	vmpr->reclaimed = 0;
-- 
2.2.1


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH 3.12 18/78] mm: fix swapoff hang after page migration and fork
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (16 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 17/78] mm/vmpressure.c: fix race in vmpressure_work_fn() Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 19/78] mm: fix anon_vma_clone() error treatment Jiri Slaby
                   ` (61 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Hugh Dickins, Kelley Nielsen, Andrew Morton,
	Linus Torvalds, Jiri Slaby

From: Hugh Dickins <hughd@google.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 2022b4d18a491a578218ce7a4eca8666db895a73 upstream.

I've been seeing swapoff hangs in recent testing: it's cycling around
trying unsuccessfully to find an mm for some remaining pages of swap.

I have been exercising swap and page migration more heavily recently,
and now notice a long-standing error in copy_one_pte(): it's trying to
add dst_mm to swapoff's mmlist when it finds a swap entry, but is doing
so even when it's a migration entry or an hwpoison entry.

Which wouldn't matter much, except it adds dst_mm next to src_mm,
assuming src_mm is already on the mmlist: which may not be so.  Then if
pages are later swapped out from dst_mm, swapoff won't be able to find
where to replace them.

There's already a !non_swap_entry() test for stats: move that up before
the swap_duplicate() and the addition to mmlist.
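
As a rough userspace illustration of the reordered logic (not part of the
patch): the types and helpers below (struct mm_model, enum entry_kind,
non_swap_entry_model(), copy_entry_model()) are hypothetical stand-ins, not
kernel APIs. Only a real swap entry puts the destination mm on the mmlist.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kinds of entry copy_one_pte() can meet. */
enum entry_kind { ENTRY_SWAP, ENTRY_MIGRATION, ENTRY_HWPOISON };

/* Hypothetical model of an mm: only the state this sketch needs. */
struct mm_model {
	bool on_mmlist;	/* models !list_empty(&mm->mmlist) */
	int swapents;	/* models rss[MM_SWAPENTS] */
};

/* Models non_swap_entry(): true for migration and hwpoison entries. */
bool non_swap_entry_model(enum entry_kind kind)
{
	return kind != ENTRY_SWAP;
}

/*
 * Patched ordering: the mmlist addition and the swap entry count are
 * both guarded by the non-swap test.
 */
void copy_entry_model(struct mm_model *dst, enum entry_kind kind)
{
	if (!non_swap_entry_model(kind)) {
		dst->on_mmlist = true;
		dst->swapents++;
	}
}
```

With this ordering, forking over a migration or hwpoison entry leaves the
destination mm off the list until it really owns swap.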

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kelley Nielsen <kelleynnn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 mm/memory.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index b5901068495f..827a7ed7f5a2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -808,20 +808,20 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		if (!pte_file(pte)) {
 			swp_entry_t entry = pte_to_swp_entry(pte);
 
-			if (swap_duplicate(entry) < 0)
-				return entry.val;
-
-			/* make sure dst_mm is on swapoff's mmlist. */
-			if (unlikely(list_empty(&dst_mm->mmlist))) {
-				spin_lock(&mmlist_lock);
-				if (list_empty(&dst_mm->mmlist))
-					list_add(&dst_mm->mmlist,
-						 &src_mm->mmlist);
-				spin_unlock(&mmlist_lock);
-			}
-			if (likely(!non_swap_entry(entry)))
+			if (likely(!non_swap_entry(entry))) {
+				if (swap_duplicate(entry) < 0)
+					return entry.val;
+
+				/* make sure dst_mm is on swapoff's mmlist. */
+				if (unlikely(list_empty(&dst_mm->mmlist))) {
+					spin_lock(&mmlist_lock);
+					if (list_empty(&dst_mm->mmlist))
+						list_add(&dst_mm->mmlist,
+							 &src_mm->mmlist);
+					spin_unlock(&mmlist_lock);
+				}
 				rss[MM_SWAPENTS]++;
-			else if (is_migration_entry(entry)) {
+			} else if (is_migration_entry(entry)) {
 				page = migration_entry_to_page(entry);
 
 				if (PageAnon(page))
-- 
2.2.1



* [PATCH 3.12 19/78] mm: fix anon_vma_clone() error treatment
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (17 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 18/78] mm: fix swapoff hang after page migration and fork Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 20/78] i2c: omap: fix NACK and Arbitration Lost irq handling Jiri Slaby
                   ` (60 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Daniel Forrest, Konstantin Khlebnikov,
	Andrea Arcangeli, Rik van Riel, Tim Hartrick, Hugh Dickins,
	Michel Lespinasse, Vlastimil Babka, Andrew Morton,
	Linus Torvalds, Jiri Slaby

From: Daniel Forrest <dan.forrest@ssec.wisc.edu>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit c4ea95d7cd08d9ffd7fa75e6c5e0332d596dd11e upstream.

Andrew Morton noticed that the error return from anon_vma_clone() was
being dropped and replaced with -ENOMEM (which is not itself a bug
because the only error return value from anon_vma_clone() is -ENOMEM).

I did an audit of callers of anon_vma_clone() and discovered an actual
bug where the error return was being lost.  In __split_vma(), between
Linux 3.11 and 3.12 the code was changed so the err variable is used
before the call to anon_vma_clone() and the default initial value of
-ENOMEM is overwritten.  So a failure of anon_vma_clone() will return
success since err at this point is now zero.

Below is a patch which fixes this bug and also propagates the error
return value from anon_vma_clone() in all cases.
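
The __split_vma() failure mode can be modelled in a few lines of userspace
C. This is only a sketch: clone_model(), split_before() and split_after()
are hypothetical names, and the initial err = 0 stands in for whatever
earlier call overwrote the default -ENOMEM.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for anon_vma_clone(): fails on request. */
int clone_model(int should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

/* Pre-patch shape: the return value of the clone is only tested. */
int split_before(int clone_fails)
{
	int err = 0;	/* models err after an earlier call succeeded */

	if (clone_model(clone_fails))
		goto out;	/* err is still 0: failure reported as success */
	return 0;
out:
	return err;
}

/* Patched shape: the return value is captured in err before the test. */
int split_after(int clone_fails)
{
	int err = 0;	/* same starting state as above */

	err = clone_model(clone_fails);
	if (err)
		goto out;
	return 0;
out:
	return err;
}
```

In this model split_before(1) returns 0 even though the clone failed, while
split_after(1) returns -ENOMEM.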

Fixes: ef0855d334e1 ("mm: mempolicy: turn vma_set_policy() into vma_dup_policy()")
Signed-off-by: Daniel Forrest <dan.forrest@ssec.wisc.edu>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tim Hartrick <tim@edgecast.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 mm/mmap.c | 10 +++++++---
 mm/rmap.c |  6 ++++--
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index c1249cb7dc15..15e07d5a75cb 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -746,8 +746,11 @@ again:			remove_next = 1 + (end > next->vm_end);
 		 * shrinking vma had, to cover any anon pages imported.
 		 */
 		if (exporter && exporter->anon_vma && !importer->anon_vma) {
-			if (anon_vma_clone(importer, exporter))
-				return -ENOMEM;
+			int error;
+
+			error = anon_vma_clone(importer, exporter);
+			if (error)
+				return error;
 			importer->anon_vma = exporter->anon_vma;
 		}
 	}
@@ -2419,7 +2422,8 @@ static int __split_vma(struct mm_struct * mm, struct vm_area_struct * vma,
 	if (err)
 		goto out_free_vma;
 
-	if (anon_vma_clone(new, vma))
+	err = anon_vma_clone(new, vma);
+	if (err)
 		goto out_free_mpol;
 
 	if (new->vm_file)
diff --git a/mm/rmap.c b/mm/rmap.c
index 4271107aa46e..5b8675ccc1ef 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -274,6 +274,7 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
 {
 	struct anon_vma_chain *avc;
 	struct anon_vma *anon_vma;
+	int error;
 
 	/* Don't bother if the parent process has no anon_vma here. */
 	if (!pvma->anon_vma)
@@ -283,8 +284,9 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
 	 * First, attach the new VMA to the parent VMA's anon_vmas,
 	 * so rmap can find non-COWed pages in child processes.
 	 */
-	if (anon_vma_clone(vma, pvma))
-		return -ENOMEM;
+	error = anon_vma_clone(vma, pvma);
+	if (error)
+		return error;
 
 	/* Then add our own anon_vma. */
 	anon_vma = anon_vma_alloc();
-- 
2.2.1



* [PATCH 3.12 20/78] i2c: omap: fix NACK and Arbitration Lost irq handling
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (18 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 19/78] mm: fix anon_vma_clone() error treatment Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 21/78] i2c: omap: fix i207 errata handling Jiri Slaby
                   ` (59 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Alexander Kochetkov, Wolfram Sang, Jiri Slaby

From: Alexander Kochetkov <al.kochet@gmail.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 27caca9d2e01c92b26d0690f065aad093fea01c7 upstream.

commit 1d7afc95946487945cc7f5019b41255b72224b70 (i2c: omap: ack IRQ in parts)
changed the interrupt handler to complete transfers without clearing the
XRDY (AL case) and ARDY (NACK case) flags, so XRDY or ARDY interrupts will be
fired again. As a result, the ISR keeps processing the transfer after it is
already complete (from the driver code's point of view).

I didn't see any real impact from 1d7afc9, but it is a really bad idea to
have the ISR running on user data after the transfer is complete.

It looks like 1d7afc9 violates the TI specs on how AL and NACK should be
handled (see Note 1, sprugn4r, Figure 17-31 and Figure 17-32).

According to the specs (if I understood correctly), in case of NACK and AL the
driver must reset NACK, AL, ARDY, RDR, and RRDY (Master Receive Mode), and
NACK, AL, ARDY, and XDR (Master Transmitter Mode).

All of that is done further down in the code, under the condition:
if (stat & (OMAP_I2C_STAT_ARDY | OMAP_I2C_STAT_NACK | OMAP_I2C_STAT_AL)) ...

The patch restores the pre-1d7afc9 logic for handling NACK and AL interrupts,
so no interrupts are fired after the ISR informs the rest of the driver that
the transfer is complete.
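
A simplified userspace model of one handler pass may help to see the effect
of the early break. Everything here is an assumption made for illustration:
the ST_* bit values are not the OMAP register layout, and handle_pass() is a
hypothetical reduction of the ISR to the three status bits discussed above.

```c
#include <assert.h>

/* Hypothetical status bits; values are illustrative only. */
#define ST_NACK 0x1u
#define ST_AL   0x2u
#define ST_ARDY 0x4u

/*
 * One pass over a status word. Returns the bits that are still pending
 * (not acknowledged) when the handler stops processing.
 */
unsigned int handle_pass(unsigned int stat, int break_early)
{
	unsigned int pending = stat;

	if (stat & ST_NACK) {
		pending &= ~ST_NACK;	/* ack NACK */
		if (break_early)
			return pending;	/* models the removed 'break' */
	}
	if (stat & ST_AL) {
		pending &= ~ST_AL;	/* ack AL */
		if (break_early)
			return pending;	/* models the removed 'break' */
	}
	/* Common completion block: resets every transfer-related bit. */
	if (stat & (ST_ARDY | ST_NACK | ST_AL))
		pending &= ~(ST_ARDY | ST_NACK | ST_AL);
	return pending;
}
```

In this model, a status of NACK plus ARDY leaves ARDY pending when the
handler breaks early, and leaves nothing pending once the breaks are gone.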

Note: instead of removing the break in the NACK case, we could just replace
'break' with 'continue' and allow the NACK transfer to finish using the ARDY
event. I found that the NACK and ARDY bits are usually set together. The TI
wiki confirms that case:
http://processors.wiki.ti.com/index.php/I2C_Tips#Detecting_and_handling_NACK

In case someone is interested in the event traces for the NACK and AL cases,
I sent them to the mailing list.

Tested on Beagleboard XM C.

Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Fixes: 1d7afc9 i2c: omap: ack IRQ in parts
Acked-by: Felipe Balbi <balbi@ti.com>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/i2c/busses/i2c-omap.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
index 9967a6f9c2ff..05f5919a7165 100644
--- a/drivers/i2c/busses/i2c-omap.c
+++ b/drivers/i2c/busses/i2c-omap.c
@@ -926,14 +926,12 @@ omap_i2c_isr_thread(int this_irq, void *dev_id)
 		if (stat & OMAP_I2C_STAT_NACK) {
 			err |= OMAP_I2C_STAT_NACK;
 			omap_i2c_ack_stat(dev, OMAP_I2C_STAT_NACK);
-			break;
 		}
 
 		if (stat & OMAP_I2C_STAT_AL) {
 			dev_err(dev->dev, "Arbitration lost\n");
 			err |= OMAP_I2C_STAT_AL;
 			omap_i2c_ack_stat(dev, OMAP_I2C_STAT_AL);
-			break;
 		}
 
 		/*
-- 
2.2.1



* [PATCH 3.12 21/78] i2c: omap: fix i207 errata handling
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (19 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 20/78] i2c: omap: fix NACK and Arbitration Lost irq handling Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 22/78] i2c: davinci: generate STP always when NACK is received Jiri Slaby
                   ` (58 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Alexander Kochetkov, Wolfram Sang, Jiri Slaby

From: Alexander Kochetkov <al.kochet@gmail.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit ccfc866356674cb3a61829d239c685af6e85f197 upstream.

commit 6d9939f651419a63e091105663821f9c7d3fec37 (i2c: omap: split out [XR]DR
and [XR]RDY) changed how errata i207 (I2C: RDR Flag May Be Incorrectly
Set) is handled. The 6d9939f6514 code doesn't correspond to the workaround
provided by the errata.

According to the errata, the ISR must filter out a spurious RDR before the
data read, not after. The ISR must read RXSTAT to get the number of bytes
available to read, because RDR could be set while there is no data in the
receive FIFO.
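
The byte count computation used by the fix can be sketched on its own. The
shift and mask are taken from the diff below; the helper names
rxstat_from_bufstat() and bytes_to_read() are hypothetical and exist only
for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extracts the receive count the same way the patch does:
 * shift the buffer status value right by 8 and keep six bits.
 */
unsigned int rxstat_from_bufstat(uint16_t bufstat)
{
	return (bufstat >> 8) & 0x3F;
}

/*
 * Number of bytes to drain: the FIFO level decides, not the RDR flag
 * alone, so a spurious RDR with an empty FIFO reads nothing.
 */
unsigned int bytes_to_read(int rdr_set, uint16_t bufstat)
{
	if (!rdr_set)
		return 0;
	return rxstat_from_bufstat(bufstat);
}
```

A status value of 0x0500 gives 5 bytes, and a set RDR with a status of 0
gives 0 bytes, which is the spurious case the errata describes.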

Restored the pre-6d9939f6514 way of handling the errata.

Found by code review. No real impact has been seen.
Tested on Beagleboard XM C.

Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Fixes: 6d9939f651419a63e09110 i2c: omap: split out [XR]DR and [XR]RDY
Tested-by: Felipe Balbi <balbi@ti.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/i2c/busses/i2c-omap.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
index 05f5919a7165..8eaaff831d7c 100644
--- a/drivers/i2c/busses/i2c-omap.c
+++ b/drivers/i2c/busses/i2c-omap.c
@@ -956,11 +956,13 @@ omap_i2c_isr_thread(int this_irq, void *dev_id)
 			if (dev->fifo_size)
 				num_bytes = dev->buf_len;
 
-			omap_i2c_receive_data(dev, num_bytes, true);
-
-			if (dev->errata & I2C_OMAP_ERRATA_I207)
+			if (dev->errata & I2C_OMAP_ERRATA_I207) {
 				i2c_omap_errata_i207(dev, stat);
+				num_bytes = (omap_i2c_read_reg(dev,
+					OMAP_I2C_BUFSTAT_REG) >> 8) & 0x3F;
+			}
 
+			omap_i2c_receive_data(dev, num_bytes, true);
 			omap_i2c_ack_stat(dev, OMAP_I2C_STAT_RDR);
 			continue;
 		}
-- 
2.2.1



* [PATCH 3.12 22/78] i2c: davinci: generate STP always when NACK is received
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (20 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 21/78] i2c: omap: fix i207 errata handling Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 23/78] drm/radeon: kernel panic in drm_calc_vbltimestamp_from_scanoutpos with 3.18.0-rc6 Jiri Slaby
                   ` (57 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Grygorii Strashko, Wolfram Sang, Jiri Slaby

From: Grygorii Strashko <grygorii.strashko@ti.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 9ea359f7314132cbcb5a502d2d8ef095be1f45e4 upstream.

According to I2C specification the NACK should be handled as follows:
"When SDA remains HIGH during this ninth clock pulse, this is defined as the Not
Acknowledge signal. The master can then generate either a STOP condition to
abort the transfer, or a repeated START condition to start a new transfer."
[I2C spec Rev. 6, 3.1.6: http://www.nxp.com/documents/user_manual/UM10204.pdf]

Currently the Davinci i2c driver interrupts the transfer on receipt of a
NACK but fails to send a STOP in some situations and so makes the bus
stuck until next I2C IP reset (idle/enable).

For example, the issue will happen during an SMBus read transfer, which
consists of two i2c messages, write command/address and read data:

S Slave Address Wr A Command Code A Sr Slave Address Rd A D1..Dn A P
<--- write -----------------------> <--- read --------------------->

The I2C client device will send NACK if it can't recognize "Command Code",
and the I2C master is expected to generate STP in this case.
But currently the Davinci I2C driver just exits with -EREMOTEIO and STP is
not generated.

Hence, fix it by generating Stop condition (STP) always when NACK is received.

This patch fixes Davinci I2C in the same way it was done for OMAP I2C
commit cda2109a26eb ("i2c: omap: query STP always when NACK is received").
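
The patched NACK path can be modelled as follows. This is a sketch under
stated assumptions: struct bus_model, MDR_STP_MODEL (an illustrative bit
value, not the real register definition) and nack_path() are hypothetical,
and the EREMOTEIO fallback is only there for non-Linux hosts.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#ifndef EREMOTEIO
#define EREMOTEIO 121	/* Linux value, for hosts whose errno.h lacks it */
#endif

/* Hypothetical STP bit in a modelled mode register. */
#define MDR_STP_MODEL 0x0800u

/* Hypothetical model of the controller: just the mode register. */
struct bus_model {
	uint16_t mdr;
};

/* Patched NACK path: STP is requested whatever the caller's stop flag says. */
int nack_path(struct bus_model *bus, int ignore_nak, int msg_len, int stop)
{
	(void)stop;	/* no longer consulted once a NACK is seen */

	if (ignore_nak)
		return msg_len;
	bus->mdr |= MDR_STP_MODEL;	/* read-modify-write of the register */
	return -EREMOTEIO;
}
```

Calling nack_path() with stop set to 0 still sets the STP bit, which is the
case that left the bus stuck before the change.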

Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reported-by: Hein Tibosch <hein_tibosch@yahoo.es>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/i2c/busses/i2c-davinci.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/i2c/busses/i2c-davinci.c b/drivers/i2c/busses/i2c-davinci.c
index 132369fad4e0..4e73f3ee05d8 100644
--- a/drivers/i2c/busses/i2c-davinci.c
+++ b/drivers/i2c/busses/i2c-davinci.c
@@ -411,11 +411,9 @@ i2c_davinci_xfer_msg(struct i2c_adapter *adap, struct i2c_msg *msg, int stop)
 	if (dev->cmd_err & DAVINCI_I2C_STR_NACK) {
 		if (msg->flags & I2C_M_IGNORE_NAK)
 			return msg->len;
-		if (stop) {
-			w = davinci_i2c_read_reg(dev, DAVINCI_I2C_MDR_REG);
-			w |= DAVINCI_I2C_MDR_STP;
-			davinci_i2c_write_reg(dev, DAVINCI_I2C_MDR_REG, w);
-		}
+		w = davinci_i2c_read_reg(dev, DAVINCI_I2C_MDR_REG);
+		w |= DAVINCI_I2C_MDR_STP;
+		davinci_i2c_write_reg(dev, DAVINCI_I2C_MDR_REG, w);
 		return -EREMOTEIO;
 	}
 	return -EIO;
-- 
2.2.1



* [PATCH 3.12 23/78] drm/radeon: kernel panic in drm_calc_vbltimestamp_from_scanoutpos with 3.18.0-rc6
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (21 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 22/78] i2c: davinci: generate STP always when NACK is received Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 24/78] drm/i915: More cautious with pch fifo underruns Jiri Slaby
                   ` (56 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Petr Mladek, Alex Deucher, Jiri Slaby

From: Petr Mladek <pmladek@suse.cz>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit f5475cc43c899e33098d4db44b7c5e710f16589d upstream.

I was unable to boot 3.18.0-rc6 because of the following kernel
panic in drm_calc_vbltimestamp_from_scanoutpos():

    [drm] Initialized drm 1.1.0 20060810
    [drm] radeon kernel modesetting enabled.
    [drm] initializing kernel modesetting (RV100 0x1002:0x515E 0x15D9:0x8080).
    [drm] register mmio base: 0xC8400000
    [drm] register mmio size: 65536
    radeon 0000:0b:01.0: VRAM: 128M 0x00000000D0000000 - 0x00000000D7FFFFFF (16M used)
    radeon 0000:0b:01.0: GTT: 512M 0x00000000B0000000 - 0x00000000CFFFFFFF
    [drm] Detected VRAM RAM=128M, BAR=128M
    [drm] RAM width 16bits DDR
    [TTM] Zone  kernel: Available graphics memory: 3829346 kiB
    [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
    [TTM] Initializing pool allocator
    [TTM] Initializing DMA pool allocator
    [drm] radeon: 16M of VRAM memory ready
    [drm] radeon: 512M of GTT memory ready.
    [drm] GART: num cpu pages 131072, num gpu pages 131072
    [drm] PCI GART of 512M enabled (table at 0x0000000037880000).
    radeon 0000:0b:01.0: WB disabled
    radeon 0000:0b:01.0: fence driver on ring 0 use gpu addr 0x00000000b0000000 and cpu addr 0xffff8800bbbfa000
    [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
    [drm] Driver supports precise vblank timestamp query.
    [drm] radeon: irq initialized.
    [drm] Loading R100 Microcode
    radeon 0000:0b:01.0: Direct firmware load for radeon/R100_cp.bin failed with error -2
    radeon_cp: Failed to load firmware "radeon/R100_cp.bin"
    [drm:r100_cp_init] *ERROR* Failed to load firmware!
    radeon 0000:0b:01.0: failed initializing CP (-2).
    radeon 0000:0b:01.0: Disabling GPU acceleration
    [drm] radeon: cp finalized
    BUG: unable to handle kernel NULL pointer dereference at 000000000000025c
    IP: [<ffffffff8150423b>] drm_calc_vbltimestamp_from_scanoutpos+0x4b/0x320
    PGD 0
    Oops: 0000 [#1] SMP
    Modules linked in:
    CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc6-4-default #2649
    Hardware name: Supermicro X7DB8/X7DB8, BIOS 6.00 07/26/2006
    task: ffff880234da2010 ti: ffff880234da4000 task.ti: ffff880234da4000
    RIP: 0010:[<ffffffff8150423b>]  [<ffffffff8150423b>] drm_calc_vbltimestamp_from_scanoutpos+0x4b/0x320
    RSP: 0000:ffff880234da7918  EFLAGS: 00010086
    RAX: ffffffff81557890 RBX: 0000000000000000 RCX: ffff880234da7a48
    RDX: ffff880234da79f4 RSI: 0000000000000000 RDI: ffff880232e15000
    RBP: ffff880234da79b8 R08: 0000000000000000 R09: 0000000000000000
    R10: 000000000000000a R11: 0000000000000001 R12: ffff880232dda1c0
    R13: ffff880232e1518c R14: 0000000000000292 R15: ffff880232e15000
    FS:  0000000000000000(0000) GS:ffff88023fc40000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 000000000000025c CR3: 0000000002014000 CR4: 00000000000007e0
    Stack:
     ffff880234da79d8 0000000000000286 ffff880232dcbc00 0000000000002480
     ffff880234da7958 0000000000000296 ffff880234da7998 ffffffff8151b51d
     ffff880234da7a48 0000000032dcbeb0 ffff880232dcbc00 ffff880232dcbc58
    Call Trace:
     [<ffffffff8151b51d>] ? drm_vma_offset_remove+0x1d/0x110
     [<ffffffff8152dc98>] radeon_get_vblank_timestamp_kms+0x38/0x60
     [<ffffffff8152076a>] ? ttm_bo_release_list+0xba/0x180
     [<ffffffff81503751>] drm_get_last_vbltimestamp+0x41/0x70
     [<ffffffff81503933>] vblank_disable_and_save+0x73/0x1d0
     [<ffffffff81106b2f>] ? try_to_del_timer_sync+0x4f/0x70
     [<ffffffff81505245>] drm_vblank_cleanup+0x65/0xa0
     [<ffffffff815604fa>] radeon_irq_kms_fini+0x1a/0x70
     [<ffffffff8156c07e>] r100_init+0x26e/0x410
     [<ffffffff8152ae3e>] radeon_device_init+0x7ae/0xb50
     [<ffffffff8152d57f>] radeon_driver_load_kms+0x8f/0x210
     [<ffffffff81506965>] drm_dev_register+0xb5/0x110
     [<ffffffff8150998f>] drm_get_pci_dev+0x8f/0x200
     [<ffffffff815291cd>] radeon_pci_probe+0xad/0xe0
     [<ffffffff8141a365>] local_pci_probe+0x45/0xa0
     [<ffffffff8141b741>] pci_device_probe+0xd1/0x130
     [<ffffffff81633dad>] driver_probe_device+0x12d/0x3e0
     [<ffffffff8163413b>] __driver_attach+0x9b/0xa0
     [<ffffffff816340a0>] ? __device_attach+0x40/0x40
     [<ffffffff81631cd3>] bus_for_each_dev+0x63/0xa0
     [<ffffffff8163378e>] driver_attach+0x1e/0x20
     [<ffffffff81633390>] bus_add_driver+0x180/0x240
     [<ffffffff81634914>] driver_register+0x64/0xf0
     [<ffffffff81419cac>] __pci_register_driver+0x4c/0x50
     [<ffffffff81509bf5>] drm_pci_init+0xf5/0x120
     [<ffffffff821dc871>] ? ttm_init+0x6a/0x6a
     [<ffffffff821dc908>] radeon_init+0x97/0xb5
     [<ffffffff810002fc>] do_one_initcall+0xbc/0x1f0
     [<ffffffff810e3278>] ? __wake_up+0x48/0x60
     [<ffffffff8218e256>] kernel_init_freeable+0x18a/0x215
     [<ffffffff8218d983>] ? initcall_blacklist+0xc0/0xc0
     [<ffffffff818a78f0>] ? rest_init+0x80/0x80
     [<ffffffff818a78fe>] kernel_init+0xe/0xf0
     [<ffffffff818c0c3c>] ret_from_fork+0x7c/0xb0
     [<ffffffff818a78f0>] ? rest_init+0x80/0x80
    Code: 45 ac 0f 88 a8 01 00 00 3b b7 d0 01 00 00 49 89 ff 0f 83 99 01 00 00 48 8b 47 20 48 8b 80 88 00 00 00 48 85 c0 0f 84 cd 01 00 00 <41> 8b b1 5c 02 00 00 41 8b 89 58 02 00 00 89 75 98 41 8b b1 60
    RIP  [<ffffffff8150423b>] drm_calc_vbltimestamp_from_scanoutpos+0x4b/0x320
     RSP <ffff880234da7918>
    CR2: 000000000000025c
    ---[ end trace ad2c0aadf48e2032 ]---
    Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009

It has helped me to add a NULL pointer check that was suggested at
http://lists.freedesktop.org/archives/dri-devel/2014-October/070663.html

I am not familiar with the code. But the change looks sane
and we need something fast at this stage of 3.18 development.
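
For illustration only, the same kind of guard can be written in userspace C
by testing the array slot before deriving the member address. The patch
itself tests the derived member pointer; the two checks should agree only
when the embedded member sits at offset 0, which is an assumption of this
sketch. All types and names below are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the embedded base object and its wrapper. */
struct crtc_base_model {
	int hwmode;
};

struct crtc_model {
	struct crtc_base_model base;	/* first member, offset 0 */
};

#define MAX_CRTCS_MODEL 2

/* Hypothetical device: slots stay NULL until the CRTCs are set up. */
struct dev_model {
	struct crtc_model *crtcs[MAX_CRTCS_MODEL];
};

/* Returns the modelled mode value, or -EINVAL when the CRTC is missing. */
int get_timestamp_model(const struct dev_model *dev, int crtc)
{
	const struct crtc_model *slot;

	if (crtc < 0 || crtc >= MAX_CRTCS_MODEL)
		return -EINVAL;
	slot = dev->crtcs[crtc];
	if (!slot)
		return -EINVAL;	/* not set up, e.g. after a failed init */
	return slot->base.hwmode;
}
```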

Suggested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Petr Mladek <pmladek@suse.cz>
Tested-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/gpu/drm/radeon/radeon_kms.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c
index a134e8bf53f5..03ff6726ce9f 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -684,6 +684,8 @@ int radeon_get_vblank_timestamp_kms(struct drm_device *dev, int crtc,
 
 	/* Get associated drm_crtc: */
 	drmcrtc = &rdev->mode_info.crtcs[crtc]->base;
+	if (!drmcrtc)
+		return -EINVAL;
 
 	/* Helper routine in DRM core does all the work: */
 	return drm_calc_vbltimestamp_from_scanoutpos(dev, crtc, max_error,
-- 
2.2.1



* [PATCH 3.12 24/78] drm/i915: More cautious with pch fifo underruns
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (22 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 23/78] drm/radeon: kernel panic in drm_calc_vbltimestamp_from_scanoutpos with 3.18.0-rc6 Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 25/78] drm/i915: Unlock panel even when LVDS is disabled Jiri Slaby
                   ` (55 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Daniel Vetter, Daniel Vetter, Jani Nikula, Jiri Slaby

From: Daniel Vetter <daniel.vetter@ffwll.ch>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit b68362278af94e1171f5be9d4e44988601fb0439 upstream.

Apparently PCH fifo underruns are tricky: we have plenty of reports that
we see the occasional underrun (especially at boot-up).

So for a change let's see what happens when we don't re-enable pch
fifo underrun reporting when the pipe is disabled. This means that the
kernel can't catch pch fifo underruns when they happen (except when
all pipes are on on the pch). But we'll still catch underruns when
disabling the pipe again. So not a terrible reduction in test
coverage.

Since the DRM_ERROR is new and hence a regression, plan B would be to
revert it back to a debug output, which would be a lot worse than this
hack for underrun test coverage in the wild. See the referenced
discussions for more.

References: http://mid.gmane.org/CA+gsUGRfGe3t4NcjdeA=qXysrhLY3r4CEu7z4bjTwxi1uOfy+g@mail.gmail.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85898
References: https://bugs.freedesktop.org/show_bug.cgi?id=85898
References: https://bugs.freedesktop.org/show_bug.cgi?id=86233
References: https://bugs.freedesktop.org/show_bug.cgi?id=86478
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Tested-by: lu hua <huax.lu@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/gpu/drm/i915/intel_display.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 837cc6cd7472..37a9d3c89feb 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3537,7 +3537,6 @@ static void ironlake_crtc_disable(struct drm_crtc *crtc)
 		ironlake_fdi_disable(crtc);
 
 		ironlake_disable_pch_transcoder(dev_priv, pipe);
-		intel_set_pch_fifo_underrun_reporting(dev, pipe, true);
 
 		if (HAS_PCH_CPT(dev)) {
 			/* disable TRANS_DP_CTL */
@@ -3613,7 +3612,6 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
 
 	if (intel_crtc->config.has_pch_encoder) {
 		lpt_disable_pch_transcoder(dev_priv);
-		intel_set_pch_fifo_underrun_reporting(dev, TRANSCODER_A, true);
 		intel_ddi_fdi_disable(crtc);
 	}
 
-- 
2.2.1



* [PATCH 3.12 25/78] drm/i915: Unlock panel even when LVDS is disabled
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (23 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 24/78] drm/i915: More cautious with pch fifo underruns Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 26/78] media: smiapp: Only some selection targets are settable Jiri Slaby
                   ` (54 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Daniel Vetter, Alexey Orishko, Chris Wilson,
	Francois Tigeot, Daniel Vetter, Jani Nikula, Jiri Slaby

From: Daniel Vetter <daniel.vetter@ffwll.ch>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit b0616c5306b342ceca07044dbc4f917d95c4f825 upstream.

Otherwise we'll have backtraces in assert_panel_unlocked because the
BIOS locks the register. In the reporter's case this regression was
introduced in

commit c31407a3672aaebb4acddf90944a114fa5c8af7b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Oct 18 21:07:01 2012 +0100

    drm/i915: Add no-lvds quirk for Supermicro X7SPA-H

Reported-by: Alexey Orishko <alexey.orishko@gmail.com>
Cc: Alexey Orishko <alexey.orishko@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Francois Tigeot <ftigeot@wolfpond.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Tested-by: Alexey Orishko <alexey.orishko@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/gpu/drm/i915/intel_lvds.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lvds.c b/drivers/gpu/drm/i915/intel_lvds.c
index 667f2117e1d9..e5473daab676 100644
--- a/drivers/gpu/drm/i915/intel_lvds.c
+++ b/drivers/gpu/drm/i915/intel_lvds.c
@@ -934,6 +934,17 @@ void intel_lvds_init(struct drm_device *dev)
 	int pipe;
 	u8 pin;
 
+	/*
+	 * Unlock registers and just leave them unlocked. Do this before
+	 * checking quirk lists to avoid bogus WARNINGs.
+	 */
+	if (HAS_PCH_SPLIT(dev)) {
+		I915_WRITE(PCH_PP_CONTROL,
+			   I915_READ(PCH_PP_CONTROL) | PANEL_UNLOCK_REGS);
+	} else {
+		I915_WRITE(PP_CONTROL,
+			   I915_READ(PP_CONTROL) | PANEL_UNLOCK_REGS);
+	}
 	if (!intel_lvds_supported(dev))
 		return;
 
@@ -1113,17 +1124,6 @@ out:
 	DRM_DEBUG_KMS("detected %s-link lvds configuration\n",
 		      lvds_encoder->is_dual_link ? "dual" : "single");
 
-	/*
-	 * Unlock registers and just
-	 * leave them unlocked
-	 */
-	if (HAS_PCH_SPLIT(dev)) {
-		I915_WRITE(PCH_PP_CONTROL,
-			   I915_READ(PCH_PP_CONTROL) | PANEL_UNLOCK_REGS);
-	} else {
-		I915_WRITE(PP_CONTROL,
-			   I915_READ(PP_CONTROL) | PANEL_UNLOCK_REGS);
-	}
 	lvds_connector->lid_notifier.notifier_call = intel_lid_notify;
 	if (acpi_lid_notifier_register(&lvds_connector->lid_notifier)) {
 		DRM_DEBUG_KMS("lid notifier registration failed\n");
-- 
2.2.1



* [PATCH 3.12 26/78] media: smiapp: Only some selection targets are settable
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (24 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 25/78] drm/i915: Unlock panel even when LVDS is disabled Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 27/78] USB: xhci: Reset a halted endpoint immediately when we encounter a stall Jiri Slaby
                   ` (53 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Sakari Ailus, Mauro Carvalho Chehab, Jiri Slaby

From: Sakari Ailus <sakari.ailus@iki.fi>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit b31eb901c4e5eeef4c83c43dfbc7fe0d4348cb21 upstream.

Setting a non-settable selection target caused BUG() to be called. The check
for valid selections only takes the selection target into account, but does
not tell whether it may be set, or only get. Fix the issue by simply
returning an error to the user.
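
The shape of the fix, reduced to a userspace sketch: a target that is valid
but not settable falls into the default case and yields an error instead of
stopping the kernel. The enum values and set_selection_model() are
hypothetical and are not the V4L2 selection target constants.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical selection targets: two settable, one get-only. */
enum sel_target_model { SEL_CROP, SEL_COMPOSE, SEL_CROP_BOUNDS };

/* Returns 0 for settable targets and -EINVAL for everything else. */
int set_selection_model(enum sel_target_model target)
{
	int ret;

	switch (target) {
	case SEL_CROP:
	case SEL_COMPOSE:
		ret = 0;	/* settable: the real code calls a helper here */
		break;
	default:
		ret = -EINVAL;	/* valid to get, but not settable */
	}

	return ret;
}
```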

Signed-off-by: Sakari Ailus <sakari.ailus@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/media/i2c/smiapp/smiapp-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/i2c/smiapp/smiapp-core.c b/drivers/media/i2c/smiapp/smiapp-core.c
index ae66d91bf713..371ca22843ee 100644
--- a/drivers/media/i2c/smiapp/smiapp-core.c
+++ b/drivers/media/i2c/smiapp/smiapp-core.c
@@ -2139,7 +2139,7 @@ static int smiapp_set_selection(struct v4l2_subdev *subdev,
 		ret = smiapp_set_compose(subdev, fh, sel);
 		break;
 	default:
-		BUG();
+		ret = -EINVAL;
 	}
 
 	mutex_unlock(&sensor->mutex);
-- 
2.2.1



* [PATCH 3.12 27/78] USB: xhci: Reset a halted endpoint immediately when we encounter a stall.
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (25 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 26/78] media: smiapp: Only some selection targets are settable Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 28/78] AHCI: Add DeviceIDs for Sunrise Point-LP SATA controller Jiri Slaby
                   ` (52 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Mathias Nyman, Jiri Slaby

From: Mathias Nyman <mathias.nyman@linux.intel.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 8e71a322fdb127814bcba423a512914ca5bc6cf5 upstream.

If a device is halted and returns a STALL, then the halted endpoint
needs to be cleared both on the host and device side. The host
side halt is cleared by issuing an xhci reset endpoint command. The device side
is cleared with a ClearFeature(ENDPOINT_HALT) request, which should
be issued by the device driver if a URB returns -EPIPE.

Previously we cleared the host side halt after the device side was cleared.
To make sure the host side halt is cleared in time we want to issue the
reset endpoint command immediately when a STALL status is encountered.

Otherwise we end up not following the specs and not returning -EPIPE
several times in a row when trying to transfer data to a halted endpoint.

Fixes: bcef3fd (USB: xhci: Handle errors that cause endpoint halts.)
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/usb/host/xhci-ring.c | 40 ++++++++---------------------
 drivers/usb/host/xhci.c      | 60 +++++++++++---------------------------------
 2 files changed, 25 insertions(+), 75 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index d761c040ee2e..6f052daed694 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1965,22 +1965,13 @@ static int finish_td(struct xhci_hcd *xhci, struct xhci_td *td,
 		ep->stopped_td = td;
 		return 0;
 	} else {
-		if (trb_comp_code == COMP_STALL) {
-			/* The transfer is completed from the driver's
-			 * perspective, but we need to issue a set dequeue
-			 * command for this stalled endpoint to move the dequeue
-			 * pointer past the TD.  We can't do that here because
-			 * the halt condition must be cleared first.  Let the
-			 * USB class driver clear the stall later.
-			 */
-			ep->stopped_td = td;
-			ep->stopped_stream = ep_ring->stream_id;
-		} else if (xhci_requires_manual_halt_cleanup(xhci,
-					ep_ctx, trb_comp_code)) {
-			/* Other types of errors halt the endpoint, but the
-			 * class driver doesn't call usb_reset_endpoint() unless
-			 * the error is -EPIPE.  Clear the halted status in the
-			 * xHCI hardware manually.
+		if (trb_comp_code == COMP_STALL ||
+		    xhci_requires_manual_halt_cleanup(xhci, ep_ctx,
+						      trb_comp_code)) {
+			/* Issue a reset endpoint command to clear the host side
+			 * halt, followed by a set dequeue command to move the
+			 * dequeue pointer past the TD.
+			 * The class driver clears the device side halt later.
 			 */
 			xhci_cleanup_halted_endpoint(xhci,
 					slot_id, ep_index, ep_ring->stream_id,
@@ -2100,9 +2091,7 @@ static int process_ctrl_td(struct xhci_hcd *xhci, struct xhci_td *td,
 		else
 			td->urb->actual_length = 0;
 
-		xhci_cleanup_halted_endpoint(xhci,
-			slot_id, ep_index, 0, td, event_trb);
-		return finish_td(xhci, td, event_trb, event, ep, status, true);
+		return finish_td(xhci, td, event_trb, event, ep, status, false);
 	}
 	/*
 	 * Did we transfer any data, despite the errors that might have
@@ -2656,17 +2645,8 @@ cleanup:
 		if (ret) {
 			urb = td->urb;
 			urb_priv = urb->hcpriv;
-			/* Leave the TD around for the reset endpoint function
-			 * to use(but only if it's not a control endpoint,
-			 * since we already queued the Set TR dequeue pointer
-			 * command for stalled control endpoints).
-			 */
-			if (usb_endpoint_xfer_control(&urb->ep->desc) ||
-				(trb_comp_code != COMP_STALL &&
-					trb_comp_code != COMP_BABBLE))
-				xhci_urb_free_priv(xhci, urb_priv);
-			else
-				kfree(urb_priv);
+
+			xhci_urb_free_priv(xhci, urb_priv);
 
 			usb_hcd_unlink_urb_from_ep(bus_to_hcd(urb->dev->bus), urb);
 			if ((urb->actual_length != urb->transfer_buffer_length &&
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 381965957a67..e0ccc95c91e2 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -2924,63 +2924,33 @@ void xhci_cleanup_stalled_ring(struct xhci_hcd *xhci,
 	}
 }
 
-/* Deal with stalled endpoints.  The core should have sent the control message
- * to clear the halt condition.  However, we need to make the xHCI hardware
- * reset its sequence number, since a device will expect a sequence number of
- * zero after the halt condition is cleared.
+/* Called when clearing halted device. The core should have sent the control
+ * message to clear the device halt condition. The host side of the halt should
+ * already be cleared with a reset endpoint command issued when the STALL tx
+ * event was received.
+ *
  * Context: in_interrupt
  */
+
 void xhci_endpoint_reset(struct usb_hcd *hcd,
 		struct usb_host_endpoint *ep)
 {
 	struct xhci_hcd *xhci;
-	struct usb_device *udev;
-	unsigned int ep_index;
-	unsigned long flags;
-	int ret;
-	struct xhci_virt_ep *virt_ep;
 
 	xhci = hcd_to_xhci(hcd);
-	udev = (struct usb_device *) ep->hcpriv;
-	/* Called with a root hub endpoint (or an endpoint that wasn't added
-	 * with xhci_add_endpoint()
-	 */
-	if (!ep->hcpriv)
-		return;
-	ep_index = xhci_get_endpoint_index(&ep->desc);
-	virt_ep = &xhci->devs[udev->slot_id]->eps[ep_index];
-	if (!virt_ep->stopped_td) {
-		xhci_dbg_trace(xhci, trace_xhci_dbg_reset_ep,
-			"Endpoint 0x%x not halted, refusing to reset.",
-			ep->desc.bEndpointAddress);
-		return;
-	}
-	if (usb_endpoint_xfer_control(&ep->desc)) {
-		xhci_dbg_trace(xhci, trace_xhci_dbg_reset_ep,
-				"Control endpoint stall already handled.");
-		return;
-	}
 
-	xhci_dbg_trace(xhci, trace_xhci_dbg_reset_ep,
-			"Queueing reset endpoint command");
-	spin_lock_irqsave(&xhci->lock, flags);
-	ret = xhci_queue_reset_ep(xhci, udev->slot_id, ep_index);
 	/*
-	 * Can't change the ring dequeue pointer until it's transitioned to the
-	 * stopped state, which is only upon a successful reset endpoint
-	 * command.  Better hope that last command worked!
+	 * We might need to implement the config ep cmd in xhci 4.8.1 note:
+	 * The Reset Endpoint Command may only be issued to endpoints in the
+	 * Halted state. If software wishes reset the Data Toggle or Sequence
+	 * Number of an endpoint that isn't in the Halted state, then software
+	 * may issue a Configure Endpoint Command with the Drop and Add bits set
+	 * for the target endpoint. that is in the Stopped state.
 	 */
-	if (!ret) {
-		xhci_cleanup_stalled_ring(xhci, udev, ep_index);
-		kfree(virt_ep->stopped_td);
-		xhci_ring_cmd_db(xhci);
-	}
-	virt_ep->stopped_td = NULL;
-	virt_ep->stopped_stream = 0;
-	spin_unlock_irqrestore(&xhci->lock, flags);
 
-	if (ret)
-		xhci_warn(xhci, "FIXME allocate a new ring segment\n");
+	/* For now just print debug to follow the situation */
+	xhci_dbg(xhci, "Endpoint 0x%x ep reset callback called\n",
+		 ep->desc.bEndpointAddress);
 }
 
 static int xhci_check_streams_endpoint(struct xhci_hcd *xhci,
-- 
2.2.1



* [PATCH 3.12 28/78] AHCI: Add DeviceIDs for Sunrise Point-LP SATA controller
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (26 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 27/78] USB: xhci: Reset a halted endpoint immediately when we encounter a stall Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 29/78] ahci: disable MSI on SAMSUNG 0xa800 SSD Jiri Slaby
                   ` (51 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Devin Ryles, Tejun Heo, Jiri Slaby

From: Devin Ryles <devin.ryles@intel.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 249cd0a187ed4ef1d0af7f74362cc2791ec5581b upstream.

This patch adds DeviceIDs for Sunrise Point-LP.

Signed-off-by: Devin Ryles <devin.ryles@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/ata/ahci.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 4432c9dc9c7a..e0a8a793d14a 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -320,6 +320,9 @@ static const struct pci_device_id ahci_pci_tbl[] = {
 	{ PCI_VDEVICE(INTEL, 0x8c87), board_ahci }, /* 9 Series RAID */
 	{ PCI_VDEVICE(INTEL, 0x8c8e), board_ahci }, /* 9 Series RAID */
 	{ PCI_VDEVICE(INTEL, 0x8c8f), board_ahci }, /* 9 Series RAID */
+	{ PCI_VDEVICE(INTEL, 0x9d03), board_ahci }, /* Sunrise Point-LP AHCI */
+	{ PCI_VDEVICE(INTEL, 0x9d05), board_ahci }, /* Sunrise Point-LP RAID */
+	{ PCI_VDEVICE(INTEL, 0x9d07), board_ahci }, /* Sunrise Point-LP RAID */
 	{ PCI_VDEVICE(INTEL, 0xa103), board_ahci }, /* Sunrise Point-H AHCI */
 	{ PCI_VDEVICE(INTEL, 0xa103), board_ahci }, /* Sunrise Point-H RAID */
 	{ PCI_VDEVICE(INTEL, 0xa105), board_ahci }, /* Sunrise Point-H RAID */
-- 
2.2.1



* [PATCH 3.12 29/78] ahci: disable MSI on SAMSUNG 0xa800 SSD
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (27 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 28/78] AHCI: Add DeviceIDs for Sunrise Point-LP SATA controller Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 30/78] sata_fsl: fix error handling of irq_of_parse_and_map Jiri Slaby
                   ` (50 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Tejun Heo, Jiri Slaby

From: Tejun Heo <tj@kernel.org>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 2b21ef0aae65f22f5ba86b13c4588f6f0c2dbefb upstream.

Just like 0x1600 which got blacklisted by 66a7cbc303f4 ("ahci: disable
MSI instead of NCQ on Samsung pci-e SSDs on macbooks"), 0xa800 chokes
on NCQ commands if MSI is enabled.  Disable MSI.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dominik Mierzejewski <dominik@greysector.net>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=89171
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/ata/ahci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index e0a8a793d14a..53111fd27ebb 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -494,6 +494,7 @@ static const struct pci_device_id ahci_pci_tbl[] = {
 	 * enabled.  https://bugzilla.kernel.org/show_bug.cgi?id=60731
 	 */
 	{ PCI_VDEVICE(SAMSUNG, 0x1600), board_ahci_nomsi },
+	{ PCI_VDEVICE(SAMSUNG, 0xa800), board_ahci_nomsi },
 
 	/* Enmotus */
 	{ PCI_DEVICE(0x1c44, 0x8000), board_ahci },
-- 
2.2.1



* [PATCH 3.12 30/78] sata_fsl: fix error handling of irq_of_parse_and_map
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (28 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 29/78] ahci: disable MSI on SAMSUNG 0xa800 SSD Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 31/78] igb: bring link up when PHY is powered up Jiri Slaby
                   ` (49 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Dmitry Torokhov, Tejun Heo, Jiri Slaby

From: Dmitry Torokhov <dtor@chromium.org>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit aad0b624129709c94c2e19e583b6053520353fa8 upstream.

irq_of_parse_and_map() returns 0 on error (the result is unsigned int),
so testing for negative result never works.
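
A small user-space sketch of why the old test could never fire, assuming
the result is stored in a signed int as in the hunk below. The helper
sketch_parse_and_map() is a hypothetical stand-in for
irq_of_parse_and_map(), and the value 42 is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for irq_of_parse_and_map(): 0 means "no mapping". */
static unsigned int sketch_parse_and_map(bool ok)
{
	return ok ? 42u : 0u;
}

/* Old check: failure is reported as 0, which is never negative. */
static bool sketch_failed_old(int irq)
{
	return irq < 0;
}

/* New check: treat 0 as the error value. */
static bool sketch_failed_new(int irq)
{
	return !irq;
}
```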

Signed-off-by: Dmitry Torokhov <dtor@chromium.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/ata/sata_fsl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/sata_fsl.c b/drivers/ata/sata_fsl.c
index 851bd3f43ac6..017ed84a0cc4 100644
--- a/drivers/ata/sata_fsl.c
+++ b/drivers/ata/sata_fsl.c
@@ -1501,7 +1501,7 @@ static int sata_fsl_probe(struct platform_device *ofdev)
 	host_priv->csr_base = csr_base;
 
 	irq = irq_of_parse_and_map(ofdev->dev.of_node, 0);
-	if (irq < 0) {
+	if (!irq) {
 		dev_err(&ofdev->dev, "invalid irq from platform\n");
 		goto error_exit_with_cleanup;
 	}
-- 
2.2.1



* [PATCH 3.12 31/78] igb: bring link up when PHY is powered up
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (29 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 30/78] sata_fsl: fix error handling of irq_of_parse_and_map Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 32/78] powerpc: 32 bit getcpu VDSO function uses 64 bit instructions Jiri Slaby
                   ` (48 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Todd Fujinaka, Jeff Kirsher, Vincent Donnefort, Jiri Slaby

From: Todd Fujinaka <todd.fujinaka@intel.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit aec653c43b0c55667355e26d7de1236bda9fb4e3 upstream.

Call igb_setup_link() when the PHY is powered up.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Reported-by: Jeff Westfahl <jeff.westfahl@ni.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Vincent Donnefort <vdonnefort@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 2b76ae55f2af..02544ce60b1f 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -1587,6 +1587,8 @@ void igb_power_up_link(struct igb_adapter *adapter)
 		igb_power_up_phy_copper(&adapter->hw);
 	else
 		igb_power_up_serdes_link_82575(&adapter->hw);
+
+	igb_setup_link(&adapter->hw);
 }
 
 /**
-- 
2.2.1



* [PATCH 3.12 32/78] powerpc: 32 bit getcpu VDSO function uses 64 bit instructions
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (30 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 31/78] igb: bring link up when PHY is powered up Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 33/78] ALSA: hda - Add EAPD fixup for ASUS Z99He laptop Jiri Slaby
                   ` (47 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Anton Blanchard, Michael Ellerman, Jiri Slaby

From: Anton Blanchard <anton@samba.org>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 152d44a853e42952f6c8a504fb1f8eefd21fd5fd upstream.

I used some 64 bit instructions when adding the 32 bit getcpu VDSO
function. Fix it.

Fixes: 18ad51dd342a ("powerpc: Add VDSO version of getcpu")
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/powerpc/kernel/vdso32/getcpu.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/vdso32/getcpu.S b/arch/powerpc/kernel/vdso32/getcpu.S
index 47afd08c90f7..fe7e97a1aad9 100644
--- a/arch/powerpc/kernel/vdso32/getcpu.S
+++ b/arch/powerpc/kernel/vdso32/getcpu.S
@@ -30,8 +30,8 @@
 V_FUNCTION_BEGIN(__kernel_getcpu)
   .cfi_startproc
 	mfspr	r5,SPRN_USPRG3
-	cmpdi	cr0,r3,0
-	cmpdi	cr1,r4,0
+	cmpwi	cr0,r3,0
+	cmpwi	cr1,r4,0
 	clrlwi  r6,r5,16
 	rlwinm  r7,r5,16,31-15,31-0
 	beq	cr0,1f
-- 
2.2.1



* [PATCH 3.12 33/78] ALSA: hda - Add EAPD fixup for ASUS Z99He laptop
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (31 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 32/78] powerpc: 32 bit getcpu VDSO function uses 64 bit instructions Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 34/78] ALSA: hda - Fix built-in mic at resume on Lenovo Ideapad S210 Jiri Slaby
                   ` (46 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Takashi Iwai, Jiri Slaby

From: Takashi Iwai <tiwai@suse.de>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit f62f5eff3d40a56ad1cf0d81a6cac8dd8743e8a1 upstream.

The ASUS Z99He with the AD1986A codec needs the same fixup to enable
EAPD as another ASUS machine.

Reported-and-tested-by: Dmitry V. Zimin <pfzim@mail.ru>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 sound/pci/hda/patch_analog.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/pci/hda/patch_analog.c b/sound/pci/hda/patch_analog.c
index 01338064260e..10dc0c8fbb87 100644
--- a/sound/pci/hda/patch_analog.c
+++ b/sound/pci/hda/patch_analog.c
@@ -316,6 +316,7 @@ static const struct hda_fixup ad1986a_fixups[] = {
 
 static const struct snd_pci_quirk ad1986a_fixup_tbl[] = {
 	SND_PCI_QUIRK(0x103c, 0x30af, "HP B2800", AD1986A_FIXUP_LAPTOP_IMIC),
+	SND_PCI_QUIRK(0x1043, 0x1443, "ASUS Z99He", AD1986A_FIXUP_EAPD),
 	SND_PCI_QUIRK(0x1043, 0x1447, "ASUS A8JN", AD1986A_FIXUP_EAPD),
 	SND_PCI_QUIRK_MASK(0x1043, 0xff00, 0x8100, "ASUS P5", AD1986A_FIXUP_3STACK),
 	SND_PCI_QUIRK_MASK(0x1043, 0xff00, 0x8200, "ASUS M2", AD1986A_FIXUP_3STACK),
-- 
2.2.1



* [PATCH 3.12 34/78] ALSA: hda - Fix built-in mic at resume on Lenovo Ideapad S210
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (32 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 33/78] ALSA: hda - Add EAPD fixup for ASUS Z99He laptop Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 35/78] ALSA: usb-audio: Don't resubmit pending URBs at MIDI error recovery Jiri Slaby
                   ` (45 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Takashi Iwai, Jiri Slaby

From: Takashi Iwai <tiwai@suse.de>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit fedb2245cbb8d823e449ebdd48ba9bb35c071ce0 upstream.

The built-in mic boost volume gets almost muted after suspend/resume
on Lenovo Ideapad S210.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=88121
Reported-and-tested-by: Roman Kagan <rkagan@mail.ru>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 sound/pci/hda/patch_realtek.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index 8be86358f640..09193457d0b0 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -4147,6 +4147,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
 	SND_PCI_QUIRK(0x17aa, 0x2212, "Thinkpad", ALC269_FIXUP_LIMIT_INT_MIC_BOOST),
 	SND_PCI_QUIRK(0x17aa, 0x2214, "Thinkpad", ALC269_FIXUP_LIMIT_INT_MIC_BOOST),
 	SND_PCI_QUIRK(0x17aa, 0x2215, "Thinkpad", ALC269_FIXUP_LIMIT_INT_MIC_BOOST),
+	SND_PCI_QUIRK(0x17aa, 0x3977, "IdeaPad S210", ALC283_FIXUP_INT_MIC),
 	SND_PCI_QUIRK(0x17aa, 0x3978, "IdeaPad Y410P", ALC269_FIXUP_NO_SHUTUP),
 	SND_PCI_QUIRK(0x17aa, 0x5013, "Thinkpad", ALC269_FIXUP_LIMIT_INT_MIC_BOOST),
 	SND_PCI_QUIRK(0x17aa, 0x501a, "Thinkpad", ALC283_FIXUP_INT_MIC),
-- 
2.2.1



* [PATCH 3.12 35/78] ALSA: usb-audio: Don't resubmit pending URBs at MIDI error recovery
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (33 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 34/78] ALSA: hda - Fix built-in mic at resume on Lenovo Ideapad S210 Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 36/78] isofs: Fix infinite looping over CE entries Jiri Slaby
                   ` (44 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Takashi Iwai, Jiri Slaby

From: Takashi Iwai <tiwai@suse.de>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 66139a48cee1530c91f37c145384b4ee7043f0b7 upstream.

In snd_usbmidi_error_timer(), the driver tries to resubmit MIDI input
URBs to reactivate the MIDI stream, but this causes an error like the
following when some of the URBs are still pending:

 WARNING: CPU: 0 PID: 0 at ../drivers/usb/core/urb.c:339 usb_submit_urb+0x5f/0x70()
 URB ef705c40 submitted while active
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.6-2-desktop #1
 Hardware name: FOXCONN TPS01/TPS01, BIOS 080015  03/23/2010
  c0984bfa f4009ed4 c078deaf f4009ee4 c024c884 c09a135c f4009f00 00000000
  c0984bfa 00000153 c061ac4f c061ac4f 00000009 00000001 ef705c40 e854d1c0
  f4009eec c024c8d3 00000009 f4009ee4 c09a135c f4009f00 f4009f04 c061ac4f
 Call Trace:
  [<c0205df6>] try_stack_unwind+0x156/0x170
  [<c020482a>] dump_trace+0x5a/0x1b0
  [<c0205e56>] show_trace_log_lvl+0x46/0x50
  [<c02049d1>] show_stack_log_lvl+0x51/0xe0
  [<c0205eb7>] show_stack+0x27/0x50
  [<c078deaf>] dump_stack+0x45/0x65
  [<c024c884>] warn_slowpath_common+0x84/0xa0
  [<c024c8d3>] warn_slowpath_fmt+0x33/0x40
  [<c061ac4f>] usb_submit_urb+0x5f/0x70
  [<f7974104>] snd_usbmidi_submit_urb+0x14/0x60 [snd_usbmidi_lib]
  [<f797483a>] snd_usbmidi_error_timer+0x6a/0xa0 [snd_usbmidi_lib]
  [<c02570c0>] call_timer_fn+0x30/0x130
  [<c0257442>] run_timer_softirq+0x1c2/0x260
  [<c0251493>] __do_softirq+0xc3/0x270
  [<c0204732>] do_softirq_own_stack+0x22/0x30
  [<c025186d>] irq_exit+0x8d/0xa0
  [<c0795228>] smp_apic_timer_interrupt+0x38/0x50
  [<c0794a3c>] apic_timer_interrupt+0x34/0x3c
  [<c0673d9e>] cpuidle_enter_state+0x3e/0xd0
  [<c028bb8d>] cpu_idle_loop+0x29d/0x3e0
  [<c028bd23>] cpu_startup_entry+0x53/0x60
  [<c0bfac1e>] start_kernel+0x415/0x41a

To avoid these errors, check for pending URBs and skip resubmitting
them.
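
The idea can be sketched in user space as follows. struct sketch_urb is
a hypothetical, heavily simplified model of a URB that keeps only an
in-flight counter; it is not the kernel's struct urb, and the plain int
stands in for the atomic use_count read in the patch.

```c
#include <assert.h>

#define SKETCH_INPUT_URBS 4

/* Hypothetical, simplified URB: only the in-flight counter is modelled. */
struct sketch_urb {
	int use_count;	/* non-zero while the URB is still pending */
	int submitted;	/* how often this sketch "submitted" the URB */
};

/* Resubmit only idle URBs; returns the number of URBs resubmitted. */
static int sketch_resubmit_idle(struct sketch_urb *urbs, int n)
{
	int i, done = 0;

	for (i = 0; i < n; ++i) {
		if (urbs[i].use_count)
			continue;	/* still active: skip it */
		urbs[i].use_count = 1;	/* now in flight */
		urbs[i].submitted++;
		done++;
	}
	return done;
}
```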

Reported-and-tested-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
Acked-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 sound/usb/midi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sound/usb/midi.c b/sound/usb/midi.c
index b901f468b67a..c7aa71ee775b 100644
--- a/sound/usb/midi.c
+++ b/sound/usb/midi.c
@@ -364,6 +364,8 @@ static void snd_usbmidi_error_timer(unsigned long data)
 		if (in && in->error_resubmit) {
 			in->error_resubmit = 0;
 			for (j = 0; j < INPUT_URBS; ++j) {
+				if (atomic_read(&in->urbs[j]->use_count))
+					continue;
 				in->urbs[j]->dev = umidi->dev;
 				snd_usbmidi_submit_urb(in->urbs[j], GFP_ATOMIC);
 			}
-- 
2.2.1



* [PATCH 3.12 36/78] isofs: Fix infinite looping over CE entries
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (34 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 35/78] ALSA: usb-audio: Don't resubmit pending URBs at MIDI error recovery Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 37/78] x86/tls: Validate TLS entries to protect espfix Jiri Slaby
                   ` (43 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Jan Kara, Jiri Slaby

From: Jan Kara <jack@suse.cz>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit f54e18f1b831c92f6512d2eedb224cd63d607d3d upstream.

Rock Ridge extensions define so-called Continuation Entries (CE), which
specify where further space with Rock Ridge data is located. A corrupted
isofs image can contain an arbitrarily long chain of these, including
one that contains a loop, causing the kernel to end up in an infinite
loop when traversing these entries.

Limit the traversal to 32 entries which should be more than enough space
to store all the Rock Ridge data.
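
A minimal user-space sketch of such a capped traversal, under the
assumption that the chain can be modelled as an array of "next" indices.
The names sketch_walk_chain() and SKETCH_MAX_CE_ENTRIES are hypothetical;
the patch itself keeps the counter in struct rock_state and checks it in
rock_continue().

```c
#include <assert.h>
#include <errno.h>

#define SKETCH_MAX_CE_ENTRIES 32

/*
 * Hypothetical model of a continuation chain: next[i] is the index of the
 * entry that follows entry i, or -1 when the chain ends.
 * Returns the number of entries visited, or -EIO once the cap is reached.
 */
static int sketch_walk_chain(const int *next, int start)
{
	int loops = 0;
	int cur = start;

	while (cur != -1) {
		/* Give up with an I/O error instead of looping forever. */
		if (++loops >= SKETCH_MAX_CE_ENTRIES)
			return -EIO;
		cur = next[cur];
	}
	return loops;
}
```

A well-formed three-entry chain is walked to the end, while a chain
whose entries point at each other is cut off with -EIO.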

Reported-by: P J P <ppandit@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 fs/isofs/rock.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/isofs/rock.c b/fs/isofs/rock.c
index f488bbae541a..bb63254ed848 100644
--- a/fs/isofs/rock.c
+++ b/fs/isofs/rock.c
@@ -30,6 +30,7 @@ struct rock_state {
 	int cont_size;
 	int cont_extent;
 	int cont_offset;
+	int cont_loops;
 	struct inode *inode;
 };
 
@@ -73,6 +74,9 @@ static void init_rock_state(struct rock_state *rs, struct inode *inode)
 	rs->inode = inode;
 }
 
+/* Maximum number of Rock Ridge continuation entries */
+#define RR_MAX_CE_ENTRIES 32
+
 /*
  * Returns 0 if the caller should continue scanning, 1 if the scan must end
  * and -ve on error.
@@ -105,6 +109,8 @@ static int rock_continue(struct rock_state *rs)
 			goto out;
 		}
 		ret = -EIO;
+		if (++rs->cont_loops >= RR_MAX_CE_ENTRIES)
+			goto out;
 		bh = sb_bread(rs->inode->i_sb, rs->cont_extent);
 		if (bh) {
 			memcpy(rs->buffer, bh->b_data + rs->cont_offset,
-- 
2.2.1



* [PATCH 3.12 37/78] x86/tls: Validate TLS entries to protect espfix
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (35 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 36/78] isofs: Fix infinite looping over CE entries Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 38/78] x86/tls: Disallow unusual TLS segments Jiri Slaby
                   ` (42 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Andy Lutomirski, Konrad Rzeszutek Wilk,
	Linus Torvalds, Willy Tarreau, Ingo Molnar, Jiri Slaby

From: Andy Lutomirski <luto@amacapital.net>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 41bdc78544b8a93a9c6814b8bbbfef966272abbe upstream.

Installing a 16-bit RW data segment into the GDT defeats espfix.
AFAICT this will not affect glibc, Wine, or dosemu at all.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/kernel/tls.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/arch/x86/kernel/tls.c b/arch/x86/kernel/tls.c
index f7fec09e3e3a..e7650bd71109 100644
--- a/arch/x86/kernel/tls.c
+++ b/arch/x86/kernel/tls.c
@@ -27,6 +27,21 @@ static int get_free_idx(void)
 	return -ESRCH;
 }
 
+static bool tls_desc_okay(const struct user_desc *info)
+{
+	if (LDT_empty(info))
+		return true;
+
+	/*
+	 * espfix is required for 16-bit data segments, but espfix
+	 * only works for LDT segments.
+	 */
+	if (!info->seg_32bit)
+		return false;
+
+	return true;
+}
+
 static void set_tls_desc(struct task_struct *p, int idx,
 			 const struct user_desc *info, int n)
 {
@@ -66,6 +81,9 @@ int do_set_thread_area(struct task_struct *p, int idx,
 	if (copy_from_user(&info, u_info, sizeof(info)))
 		return -EFAULT;
 
+	if (!tls_desc_okay(&info))
+		return -EINVAL;
+
 	if (idx == -1)
 		idx = info.entry_number;
 
@@ -192,6 +210,7 @@ int regset_tls_set(struct task_struct *target, const struct user_regset *regset,
 {
 	struct user_desc infobuf[GDT_ENTRY_TLS_ENTRIES];
 	const struct user_desc *info;
+	int i;
 
 	if (pos >= GDT_ENTRY_TLS_ENTRIES * sizeof(struct user_desc) ||
 	    (pos % sizeof(struct user_desc)) != 0 ||
@@ -205,6 +224,10 @@ int regset_tls_set(struct task_struct *target, const struct user_regset *regset,
 	else
 		info = infobuf;
 
+	for (i = 0; i < count / sizeof(struct user_desc); i++)
+		if (!tls_desc_okay(info + i))
+			return -EINVAL;
+
 	set_tls_desc(target,
 		     GDT_ENTRY_TLS_MIN + (pos / sizeof(struct user_desc)),
 		     info, count / sizeof(struct user_desc));
-- 
2.2.1



* [PATCH 3.12 38/78] x86/tls: Disallow unusual TLS segments
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (36 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 37/78] x86/tls: Validate TLS entries to protect espfix Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 39/78] x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit Jiri Slaby
                   ` (41 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Andy Lutomirski, Konrad Rzeszutek Wilk,
	Linus Torvalds, Willy Tarreau, Ingo Molnar, Jiri Slaby

From: Andy Lutomirski <luto@amacapital.net>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 0e58af4e1d2166e9e33375a0f121e4867010d4f8 upstream.

Users have no business installing custom code segments into the
GDT, and segments that are not present but are otherwise valid
are a historical source of interesting attacks.

For completeness, block attempts to set the L bit.  (Prior to
this patch, the L bit would have been silently dropped.)

This is an ABI break.  I've checked glibc, musl, and Wine, and
none of them look like they'll have any trouble.

Note to stable maintainers: this is a hardening patch that fixes
no known bugs.  Given the possibility of ABI issues, this
probably shouldn't be backported quickly.
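
The checks described above, together with the 16-bit check from the
previous patch, can be sketched in user space like this. struct
sketch_desc is a hypothetical subset of struct user_desc that reuses the
field names from the diff; the sketch leaves out the LDT_empty() shortcut
and the CONFIG_X86_64 guard around the lm test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of struct user_desc; field names follow the patch. */
struct sketch_desc {
	unsigned int seg_32bit:1;
	unsigned int contents:2;
	unsigned int seg_not_present:1;
	unsigned int lm:1;
};

/* Returns true only for descriptors the hardened check would accept. */
static bool sketch_desc_okay(const struct sketch_desc *d)
{
	if (!d->seg_32bit)
		return false;	/* 16-bit segments would defeat espfix */
	if (d->contents > 1)
		return false;	/* only data segments are allowed */
	if (d->seg_not_present)
		return false;	/* not-present segments are rejected */
	if (d->lm)
		return false;	/* the L bit makes no sense for data */
	return true;
}
```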

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/kernel/tls.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/x86/kernel/tls.c b/arch/x86/kernel/tls.c
index e7650bd71109..3e551eee87b9 100644
--- a/arch/x86/kernel/tls.c
+++ b/arch/x86/kernel/tls.c
@@ -39,6 +39,28 @@ static bool tls_desc_okay(const struct user_desc *info)
 	if (!info->seg_32bit)
 		return false;
 
+	/* Only allow data segments in the TLS array. */
+	if (info->contents > 1)
+		return false;
+
+	/*
+	 * Non-present segments with DPL 3 present an interesting attack
+	 * surface.  The kernel should handle such segments correctly,
+	 * but TLS is very difficult to protect in a sandbox, so prevent
+	 * such segments from being created.
+	 *
+	 * If userspace needs to remove a TLS entry, it can still delete
+	 * it outright.
+	 */
+	if (info->seg_not_present)
+		return false;
+
+#ifdef CONFIG_X86_64
+	/* The L bit makes no sense for data. */
+	if (info->lm)
+		return false;
+#endif
+
 	return true;
 }
 
-- 
2.2.1



* [PATCH 3.12 39/78] x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (37 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 38/78] x86/tls: Disallow unusual TLS segments Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 40/78] mfd: tc6393xb: Fail ohci suspend if full state restore is required Jiri Slaby
                   ` (40 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Andy Lutomirski, Paolo Bonzini, Jiri Slaby

From: Andy Lutomirski <luto@amacapital.net>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 29fa6825463c97e5157284db80107d1bfac5d77b upstream.

paravirt_enabled has the following effects:

 - Disables the F00F bug workaround warning.  There is no F00F bug
   workaround any more because Linux's standard IDT handling already
   works around the F00F bug, but the warning still exists.  This
   is only cosmetic, and, in any event, there is no such thing as
   KVM on a CPU with the F00F bug.

 - Disables 32-bit APM BIOS detection.  On a KVM paravirt system,
   there should be no APM BIOS anyway.

 - Disables tboot.  I think that the tboot code should check the
   CPUID hypervisor bit directly if it matters.

 - paravirt_enabled disables espfix32.  espfix32 should *not* be
   disabled under KVM paravirt.

The last point is the purpose of this patch.  It fixes a leak of the
high 16 bits of the kernel stack address on 32-bit KVM paravirt
guests.  Fixes CVE-2014-8134.

Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/kernel/kvm.c      | 9 ++++++++-
 arch/x86/kernel/kvmclock.c | 1 -
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index f022c54a79a4..e72593338df6 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -280,7 +280,14 @@ do_async_page_fault(struct pt_regs *regs, unsigned long error_code)
 static void __init paravirt_ops_setup(void)
 {
 	pv_info.name = "KVM";
-	pv_info.paravirt_enabled = 1;
+
+	/*
+	 * KVM isn't paravirt in the sense of paravirt_enabled.  A KVM
+	 * guest kernel works like a bare metal kernel with additional
+	 * features, and paravirt_enabled is about features that are
+	 * missing.
+	 */
+	pv_info.paravirt_enabled = 0;
 
 	if (kvm_para_has_feature(KVM_FEATURE_NOP_IO_DELAY))
 		pv_cpu_ops.io_delay = kvm_io_delay;
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 1570e0741344..23457e5f0f4f 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -262,7 +262,6 @@ void __init kvmclock_init(void)
 #endif
 	kvm_get_preset_lpj();
 	clocksource_register_hz(&kvm_clock, NSEC_PER_SEC);
-	pv_info.paravirt_enabled = 1;
 	pv_info.name = "KVM";
 
 	if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT))
-- 
2.2.1



* [PATCH 3.12 40/78] mfd: tc6393xb: Fail ohci suspend if full state restore is required
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (38 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 39/78] x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 41/78] mmc: block: add newline to sysfs display of force_ro Jiri Slaby
                   ` (39 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Dmitry Eremin-Solenikov, Lee Jones, Jiri Slaby

From: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 1a5fb99de4850cba710d91becfa2c65653048589 upstream.

Some boards with the TC6393XB chip require a full state restore during
system resume because the chip's VCC is cut off during suspend (Sharp
SL-6000 tosa is one of them). Failing to do so would result in an ohci
Oops on resume due to internal memory contents being changed. Fail ohci
suspend on tc6393xb if full state restore is required.

Recommended workaround is to unbind tmio-ohci driver before suspend and
rebind it after resume.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/mfd/tc6393xb.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/mfd/tc6393xb.c b/drivers/mfd/tc6393xb.c
index 11c19e538551..48579e5ef02c 100644
--- a/drivers/mfd/tc6393xb.c
+++ b/drivers/mfd/tc6393xb.c
@@ -263,6 +263,17 @@ static int tc6393xb_ohci_disable(struct platform_device *dev)
 	return 0;
 }
 
+static int tc6393xb_ohci_suspend(struct platform_device *dev)
+{
+	struct tc6393xb_platform_data *tcpd = dev_get_platdata(dev->dev.parent);
+
+	/* We can't properly store/restore OHCI state, so fail here */
+	if (tcpd->resume_restore)
+		return -EBUSY;
+
+	return tc6393xb_ohci_disable(dev);
+}
+
 static int tc6393xb_fb_enable(struct platform_device *dev)
 {
 	struct tc6393xb *tc6393xb = dev_get_drvdata(dev->dev.parent);
@@ -403,7 +414,7 @@ static struct mfd_cell tc6393xb_cells[] = {
 		.num_resources = ARRAY_SIZE(tc6393xb_ohci_resources),
 		.resources = tc6393xb_ohci_resources,
 		.enable = tc6393xb_ohci_enable,
-		.suspend = tc6393xb_ohci_disable,
+		.suspend = tc6393xb_ohci_suspend,
 		.resume = tc6393xb_ohci_enable,
 		.disable = tc6393xb_ohci_disable,
 	},
-- 
2.2.1



* [PATCH 3.12 41/78] mmc: block: add newline to sysfs display of force_ro
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (39 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 40/78] mfd: tc6393xb: Fail ohci suspend if full state restore is required Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 42/78] megaraid_sas: corrected return of wait_event from abort frame path Jiri Slaby
                   ` (38 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Baruch Siach, Andrei Warkentin, Ulf Hansson, Jiri Slaby

From: Baruch Siach <baruch@tkos.co.il>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 0031a98a85e9fca282624bfc887f9531b2768396 upstream.

Make force_ro consistent with other sysfs entries.

Fixes: 371a689f64b0d ('mmc: MMC boot partitions support')
Cc: Andrei Warkentin <andrey.warkentin@gmail.com>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/mmc/card/block.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 4e8212c714b1..2aea365e096e 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -260,7 +260,7 @@ static ssize_t force_ro_show(struct device *dev, struct device_attribute *attr,
 	int ret;
 	struct mmc_blk_data *md = mmc_blk_get(dev_to_disk(dev));
 
-	ret = snprintf(buf, PAGE_SIZE, "%d",
+	ret = snprintf(buf, PAGE_SIZE, "%d\n",
 		       get_disk_ro(dev_to_disk(dev)) ^
 		       md->read_only);
 	mmc_blk_put(md);
-- 
2.2.1



* [PATCH 3.12 42/78] megaraid_sas: corrected return of wait_event from abort frame path
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (40 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 41/78] mmc: block: add newline to sysfs display of force_ro Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 43/78] scsi: correct return values for .eh_abort_handler implementations Jiri Slaby
                   ` (37 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Sumit.Saxena, Sumit Saxena, Kashyap Desai,
	Christoph Hellwig, Jiri Slaby

From: "Sumit.Saxena@avagotech.com" <Sumit.Saxena@avagotech.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 170c238701ec38b1829321b17c70671c101bac55 upstream.

Correct the wait_event() call, which was waiting for the wrong
completion status (0xFF).

Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com>
Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index a59a5526a318..855dc7c4cad7 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -953,7 +953,7 @@ megasas_issue_blocked_abort_cmd(struct megasas_instance *instance,
 		cpu_to_le32(upper_32_bits(cmd_to_abort->frame_phys_addr));
 
 	cmd->sync_cmd = 1;
-	cmd->cmd_status = 0xFF;
+	cmd->cmd_status = ENODATA;
 
 	instance->instancet->issue_dcmd(instance, cmd);
 
-- 
2.2.1



* [PATCH 3.12 43/78] scsi: correct return values for .eh_abort_handler implementations
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (41 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 42/78] megaraid_sas: corrected return of wait_event from abort frame path Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 44/78] nfs41: fix nfs4_proc_layoutget error handling Jiri Slaby
                   ` (36 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Hannes Reinecke, Christoph Hellwig, Jiri Slaby

From: Hannes Reinecke <hare@suse.de>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit b6c92b7e0af575e2b8b05bdf33633cf9e1661cbf upstream.

The .eh_abort_handler needs to return SUCCESS, FAILED, or
FAST_IO_FAIL. So fix up all implementations to adhere to this requirement.
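
To illustrate why a bare 0 is not an acceptable "success" value, here is
a minimal userspace sketch. The enum, its values, and all function names
are stand-ins invented for this example, not the kernel's definitions;
the only assumption is that the caller compares the handler's result
against a named status code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the status codes an abort handler returns. */
enum eh_status {
	EH_SUCCESS = 0x2002,	/* illustrative value */
	EH_FAILED  = 0x2003,	/* illustrative value */
};

/* Old-style handler: reports success as 0, which is not a status code. */
int legacy_abort(void)
{
	return 0;
}

/* Fixed handler: reports success with the named status code. */
int fixed_abort(void)
{
	return EH_SUCCESS;
}

/* Hypothetical caller: only the named success code counts as aborted. */
bool abort_succeeded(int (*handler)(void))
{
	return handler() == EH_SUCCESS;
}
```

With this shape, a handler that returns 0 is treated the same as one
that failed, which is presumably not what its author intended.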

Reviewed-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/scsi/NCR5380.c            | 12 ++++++------
 drivers/scsi/aha1740.c            |  2 +-
 drivers/scsi/atari_NCR5380.c      |  2 +-
 drivers/scsi/esas2r/esas2r_main.c |  2 +-
 drivers/scsi/megaraid.c           |  8 ++++----
 drivers/scsi/sun3_NCR5380.c       | 10 +++++-----
 6 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index 1e9d6ad9302b..7563b3d9cc76 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -2655,14 +2655,14 @@ static void NCR5380_dma_complete(NCR5380_instance * instance) {
  *
  * Purpose : abort a command
  *
- * Inputs : cmd - the Scsi_Cmnd to abort, code - code to set the 
- *      host byte of the result field to, if zero DID_ABORTED is 
+ * Inputs : cmd - the Scsi_Cmnd to abort, code - code to set the
+ *      host byte of the result field to, if zero DID_ABORTED is
  *      used.
  *
- * Returns : 0 - success, -1 on failure.
+ * Returns : SUCCESS - success, FAILED on failure.
  *
- *	XXX - there is no way to abort the command that is currently 
- *	connected, you have to wait for it to complete.  If this is 
+ *	XXX - there is no way to abort the command that is currently
+ *	connected, you have to wait for it to complete.  If this is
  *	a problem, we could implement longjmp() / setjmp(), setjmp()
  *	called where the loop started in NCR5380_main().
  *
@@ -2712,7 +2712,7 @@ static int NCR5380_abort(Scsi_Cmnd * cmd) {
  * aborted flag and get back into our main loop.
  */
 
-		return 0;
+		return SUCCESS;
 	}
 #endif
 
diff --git a/drivers/scsi/aha1740.c b/drivers/scsi/aha1740.c
index 5f3101797c93..31ace4bef8fe 100644
--- a/drivers/scsi/aha1740.c
+++ b/drivers/scsi/aha1740.c
@@ -531,7 +531,7 @@ static int aha1740_eh_abort_handler (Scsi_Cmnd *dummy)
  * quiet as possible...
  */
 
-	return 0;
+	return SUCCESS;
 }
 
 static struct scsi_host_template aha1740_template = {
diff --git a/drivers/scsi/atari_NCR5380.c b/drivers/scsi/atari_NCR5380.c
index 0f3cdbc80ba6..30073d43d87b 100644
--- a/drivers/scsi/atari_NCR5380.c
+++ b/drivers/scsi/atari_NCR5380.c
@@ -2613,7 +2613,7 @@ static void NCR5380_reselect(struct Scsi_Host *instance)
  *	host byte of the result field to, if zero DID_ABORTED is
  *	used.
  *
- * Returns : 0 - success, -1 on failure.
+ * Returns : SUCCESS - success, FAILED on failure.
  *
  * XXX - there is no way to abort the command that is currently
  *	 connected, you have to wait for it to complete.  If this is
diff --git a/drivers/scsi/esas2r/esas2r_main.c b/drivers/scsi/esas2r/esas2r_main.c
index 4abf1272e1eb..5718b1febd57 100644
--- a/drivers/scsi/esas2r/esas2r_main.c
+++ b/drivers/scsi/esas2r/esas2r_main.c
@@ -1057,7 +1057,7 @@ int esas2r_eh_abort(struct scsi_cmnd *cmd)
 
 		cmd->scsi_done(cmd);
 
-		return 0;
+		return SUCCESS;
 	}
 
 	spin_lock_irqsave(&a->queue_lock, flags);
diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 816db12ef5d5..52587ceac099 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -1967,7 +1967,7 @@ megaraid_abort_and_reset(adapter_t *adapter, Scsi_Cmnd *cmd, int aor)
 	     cmd->device->id, cmd->device->lun);
 
 	if(list_empty(&adapter->pending_list))
-		return FALSE;
+		return FAILED;
 
 	list_for_each_safe(pos, next, &adapter->pending_list) {
 
@@ -1990,7 +1990,7 @@ megaraid_abort_and_reset(adapter_t *adapter, Scsi_Cmnd *cmd, int aor)
 					(aor==SCB_ABORT) ? "ABORTING":"RESET",
 					scb->idx);
 
-				return FALSE;
+				return FAILED;
 			}
 			else {
 
@@ -2015,12 +2015,12 @@ megaraid_abort_and_reset(adapter_t *adapter, Scsi_Cmnd *cmd, int aor)
 				list_add_tail(SCSI_LIST(cmd),
 						&adapter->completed_list);
 
-				return TRUE;
+				return SUCCESS;
 			}
 		}
 	}
 
-	return FALSE;
+	return FAILED;
 }
 
 static inline int
diff --git a/drivers/scsi/sun3_NCR5380.c b/drivers/scsi/sun3_NCR5380.c
index 636bbe0ea84c..fc57c8aec2b3 100644
--- a/drivers/scsi/sun3_NCR5380.c
+++ b/drivers/scsi/sun3_NCR5380.c
@@ -2597,15 +2597,15 @@ static void NCR5380_reselect (struct Scsi_Host *instance)
  * Purpose : abort a command
  *
  * Inputs : cmd - the struct scsi_cmnd to abort, code - code to set the
- * 	host byte of the result field to, if zero DID_ABORTED is 
+ *	host byte of the result field to, if zero DID_ABORTED is
  *	used.
  *
- * Returns : 0 - success, -1 on failure.
+ * Returns : SUCCESS - success, FAILED on failure.
  *
- * XXX - there is no way to abort the command that is currently 
- * 	 connected, you have to wait for it to complete.  If this is 
+ * XXX - there is no way to abort the command that is currently
+ *	 connected, you have to wait for it to complete.  If this is
  *	 a problem, we could implement longjmp() / setjmp(), setjmp()
- * 	 called where the loop started in NCR5380_main().
+ *	 called where the loop started in NCR5380_main().
  */
 
 static int NCR5380_abort(struct scsi_cmnd *cmd)
-- 
2.2.1



* [PATCH 3.12 44/78] nfs41: fix nfs4_proc_layoutget error handling
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (42 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 43/78] scsi: correct return values for .eh_abort_handler implementations Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 45/78] dm bufio: fix memleak when using a dm_buffer's inline bio Jiri Slaby
                   ` (35 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Peng Tao, Trond Myklebust, Jiri Slaby

From: Peng Tao <tao.peng@primarydata.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 4bd5a980de87d2b5af417485bde97b8eb3d6cf6a upstream.

nfs4_layoutget_release() drops layout hdr refcnt. Grab the refcnt
early so that it is safe to call .release in case nfs4_alloc_pages
fails.
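
The ordering issue can be sketched in userspace as follows. All types
and function names here are hypothetical stand-ins (they are not the NFS
client's structures), and passing a size of 0 is just this sketch's way
of simulating an allocation failure.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the layout header and the layoutget request. */
struct layout_hdr { int refcount; };
struct layoutget  { struct layout_hdr *hdr; void *pages; };

/* Release path: always drops one reference, like the .release callback. */
void layoutget_release(struct layoutget *lgp)
{
	lgp->hdr->refcount--;
	free(lgp->pages);
	lgp->pages = NULL;
}

/*
 * Take the reference first, so that the release call on the allocation
 * failure path drops a reference that was actually taken.
 */
int layoutget_start(struct layoutget *lgp, size_t alloc_size)
{
	lgp->hdr->refcount++;

	/* alloc_size == 0 simulates allocation failure in this sketch. */
	lgp->pages = alloc_size ? malloc(alloc_size) : NULL;
	if (!lgp->pages) {
		layoutget_release(lgp);
		return -1;
	}
	return 0;
}
```

If the increment came after the allocation instead, the failure path
would drop a reference the function never took.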

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Fixes: a47970ff78147 ("NFSv4.1: Hold reference to layout hdr in layoutget")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 fs/nfs/nfs4proc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 759875038791..43c27110387a 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -7238,6 +7238,9 @@ nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags)
 
 	dprintk("--> %s\n", __func__);
 
+	/* nfs4_layoutget_release calls pnfs_put_layout_hdr */
+	pnfs_get_layout_hdr(NFS_I(inode)->layout);
+
 	lgp->args.layout.pages = nfs4_alloc_pages(max_pages, gfp_flags);
 	if (!lgp->args.layout.pages) {
 		nfs4_layoutget_release(lgp);
@@ -7250,9 +7253,6 @@ nfs4_proc_layoutget(struct nfs4_layoutget *lgp, gfp_t gfp_flags)
 	lgp->res.seq_res.sr_slot = NULL;
 	nfs4_init_sequence(&lgp->args.seq_args, &lgp->res.seq_res, 0);
 
-	/* nfs4_layoutget_release calls pnfs_put_layout_hdr */
-	pnfs_get_layout_hdr(NFS_I(inode)->layout);
-
 	task = rpc_run_task(&task_setup_data);
 	if (IS_ERR(task))
 		return ERR_CAST(task);
-- 
2.2.1



* [PATCH 3.12 45/78] dm bufio: fix memleak when using a dm_buffer's inline bio
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (43 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 44/78] nfs41: fix nfs4_proc_layoutget error handling Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 46/78] dm space map metadata: fix sm_bootstrap_get_nr_blocks() Jiri Slaby
                   ` (34 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Darrick J. Wong, Mikulas Patocka, Mike Snitzer, Jiri Slaby

From: "Darrick J. Wong" <darrick.wong@oracle.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 445559cdcb98a141f5de415b94fd6eaccab87e6d upstream.

When dm-bufio sets out to use the bio built into a struct dm_buffer to
issue an IO, it needs to call bio_reset after it's done with the bio
so that we can free things attached to the bio such as the integrity
payload.  Therefore, inject our own endio callback to take care of
the bio_reset after calling submit_io's end_io callback.
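
The wrapper-callback pattern described above can be sketched in
userspace like this. The struct and every function name are hypothetical
and much reduced; a typed saved_end_io field plays the role of
.bi_private, and the sketch's reset also clears that field, which is why
the wrapper reads it before resetting.

```c
#include <stddef.h>

struct sketch_bio;
typedef void (*end_io_fn)(struct sketch_bio *bio, int error);

/* Hypothetical, much-reduced stand-in for struct bio. */
struct sketch_bio {
	end_io_fn end_io;	/* completion callback invoked on I/O end */
	end_io_fn saved_end_io;	/* plays the role of bi_private */
	int has_payload;	/* e.g. an attached integrity payload */
	int completions;	/* bookkeeping for the example only */
};

/* Stand-in for bio_reset(): drop per-I/O state attached to the bio. */
void sketch_bio_reset(struct sketch_bio *bio)
{
	bio->has_payload = 0;
	bio->saved_end_io = NULL;
}

/* Wrapper: fetch the real callback, reset the bio, then call it. */
void inline_endio_sketch(struct sketch_bio *bio, int error)
{
	end_io_fn end_fn = bio->saved_end_io;

	sketch_bio_reset(bio);
	end_fn(bio, error);
}

/* Submission side: install the wrapper and stash the caller's callback. */
void use_inline_bio_sketch(struct sketch_bio *bio, end_io_fn end_io)
{
	bio->has_payload = 1;
	bio->saved_end_io = end_io;
	bio->end_io = inline_endio_sketch;
}

/* Example caller-supplied callback. */
void count_completion(struct sketch_bio *bio, int error)
{
	(void)error;
	bio->completions++;
}
```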

Test case:
1. modprobe scsi_debug delay=0 dif=1 dix=199 ato=1 dev_size_mb=300
2. Set up a dm-bufio client, e.g. dm-verity, on the scsi_debug device
3. Repeatedly read metadata and watch kmalloc-192 leak!

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/md/dm-bufio.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 140be2dd3e23..93edd894e94b 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -530,6 +530,19 @@ static void use_dmio(struct dm_buffer *b, int rw, sector_t block,
 		end_io(&b->bio, r);
 }
 
+static void inline_endio(struct bio *bio, int error)
+{
+	bio_end_io_t *end_fn = bio->bi_private;
+
+	/*
+	 * Reset the bio to free any attached resources
+	 * (e.g. bio integrity profiles).
+	 */
+	bio_reset(bio);
+
+	end_fn(bio, error);
+}
+
 static void use_inline_bio(struct dm_buffer *b, int rw, sector_t block,
 			   bio_end_io_t *end_io)
 {
@@ -541,7 +554,12 @@ static void use_inline_bio(struct dm_buffer *b, int rw, sector_t block,
 	b->bio.bi_max_vecs = DM_BUFIO_INLINE_VECS;
 	b->bio.bi_sector = block << b->c->sectors_per_block_bits;
 	b->bio.bi_bdev = b->c->bdev;
-	b->bio.bi_end_io = end_io;
+	b->bio.bi_end_io = inline_endio;
+	/*
+	 * Use of .bi_private isn't a problem here because
+	 * the dm_buffer's inline bio is local to bufio.
+	 */
+	b->bio.bi_private = end_io;
 
 	/*
 	 * We assume that if len >= PAGE_SIZE ptr is page-aligned.
-- 
2.2.1



* [PATCH 3.12 46/78] dm space map metadata: fix sm_bootstrap_get_nr_blocks()
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (44 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 45/78] dm bufio: fix memleak when using a dm_buffer's inline bio Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 47/78] x86/tls: Don't validate lm in set_thread_area() after all Jiri Slaby
                   ` (33 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Dan Carpenter, Mike Snitzer, Jiri Slaby

From: Dan Carpenter <dan.carpenter@oracle.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit c1c6156fe4d4577444b769d7edd5dd503e57bbc9 upstream.

This function isn't right and it causes a static checker warning:

	drivers/md/dm-thin.c:3016 maybe_resize_data_dev()
	error: potentially using uninitialized 'sb_data_size'.

It should set "*count" and return zero on success the same as the
sm_metadata_get_nr_blocks() function does earlier.
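
A minimal userspace sketch of the two shapes, using a hypothetical
struct and function names rather than the dm persistent-data types,
shows why the caller's variable stays uninitialized with the old code:

```c
/* Hypothetical stand-in for the space map; only the block count matters. */
struct sketch_sm { unsigned long long nr_blocks; };

/* Buggy shape: returns the count as the status and never writes *count. */
int get_nr_blocks_buggy(const struct sketch_sm *sm, unsigned long long *count)
{
	(void)count;
	return (int)sm->nr_blocks;
}

/* Fixed shape: result goes through the out-parameter, status is 0. */
int get_nr_blocks_fixed(const struct sketch_sm *sm, unsigned long long *count)
{
	*count = sm->nr_blocks;
	return 0;
}
```

A caller that treats a nonzero return as an error would also misread
the buggy variant's result whenever the count is nonzero.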

Fixes: 3241b1d3e0aa ('dm: add persistent data library')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/md/persistent-data/dm-space-map-metadata.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/md/persistent-data/dm-space-map-metadata.c b/drivers/md/persistent-data/dm-space-map-metadata.c
index 579b58200bf2..d9a5aa532017 100644
--- a/drivers/md/persistent-data/dm-space-map-metadata.c
+++ b/drivers/md/persistent-data/dm-space-map-metadata.c
@@ -564,7 +564,9 @@ static int sm_bootstrap_get_nr_blocks(struct dm_space_map *sm, dm_block_t *count
 {
 	struct sm_metadata *smm = container_of(sm, struct sm_metadata, sm);
 
-	return smm->ll.nr_blocks;
+	*count = smm->ll.nr_blocks;
+
+	return 0;
 }
 
 static int sm_bootstrap_get_nr_free(struct dm_space_map *sm, dm_block_t *count)
-- 
2.2.1



* [PATCH 3.12 47/78] x86/tls: Don't validate lm in set_thread_area() after all
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (45 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 46/78] dm space map metadata: fix sm_bootstrap_get_nr_blocks() Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 48/78] audit: change decimal constant to macro for invalid uid Jiri Slaby
                   ` (32 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Andy Lutomirski, Linus Torvalds, Ingo Molnar, Jiri Slaby

From: Andy Lutomirski <luto@amacapital.net>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 3fb2f4237bb452eb4e98f6a5dbd5a445b4fed9d0 upstream.

It turns out that there's a lurking ABI issue.  GCC, when
compiling this in a 32-bit program:

struct user_desc desc = {
	.entry_number    = idx,
	.base_addr       = base,
	.limit           = 0xfffff,
	.seg_32bit       = 1,
	.contents        = 0, /* Data, grow-up */
	.read_exec_only  = 0,
	.limit_in_pages  = 1,
	.seg_not_present = 0,
	.useable         = 0,
};

will leave .lm uninitialized.  This means that anything in the
kernel that reads user_desc.lm for 32-bit tasks is unreliable.

Revert the .lm check in set_thread_area().  The value never did
anything in the first place.

Fixes: 0e58af4e1d21 ("x86/tls: Disallow unusual TLS segments")
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/d7875b60e28c512f6a6fc0baf5714d58e7eaadbb.1418856405.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/x86/include/uapi/asm/ldt.h | 7 +++++++
 arch/x86/kernel/tls.c           | 6 ------
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/uapi/asm/ldt.h b/arch/x86/include/uapi/asm/ldt.h
index 46727eb37bfe..6e1aaf73852a 100644
--- a/arch/x86/include/uapi/asm/ldt.h
+++ b/arch/x86/include/uapi/asm/ldt.h
@@ -28,6 +28,13 @@ struct user_desc {
 	unsigned int  seg_not_present:1;
 	unsigned int  useable:1;
 #ifdef __x86_64__
+	/*
+	 * Because this bit is not present in 32-bit user code, user
+	 * programs can pass uninitialized values here.  Therefore, in
+	 * any context in which a user_desc comes from a 32-bit program,
+	 * the kernel must act as though lm == 0, regardless of the
+	 * actual value.
+	 */
 	unsigned int  lm:1;
 #endif
 };
diff --git a/arch/x86/kernel/tls.c b/arch/x86/kernel/tls.c
index 3e551eee87b9..4e942f31b1a7 100644
--- a/arch/x86/kernel/tls.c
+++ b/arch/x86/kernel/tls.c
@@ -55,12 +55,6 @@ static bool tls_desc_okay(const struct user_desc *info)
 	if (info->seg_not_present)
 		return false;
 
-#ifdef CONFIG_X86_64
-	/* The L bit makes no sense for data. */
-	if (info->lm)
-		return false;
-#endif
-
 	return true;
 }
 
-- 
2.2.1



* [PATCH 3.12 48/78] audit: change decimal constant to macro for invalid uid
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (46 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 47/78] x86/tls: Don't validate lm in set_thread_area() after all Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 49/78] isofs: Fix unchecked printing of ER records Jiri Slaby
                   ` (31 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Richard Guy Briggs, Stephen Rothwell,
	Eric W. Biederman, Eric Paris, Jiri Slaby

From: Richard Guy Briggs <rgb@redhat.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 42f74461a5b60cf6b42887e6d2ff5b7be4abf1ca upstream.

SFR reported this 2013-05-15:

> After merging the final tree, today's linux-next build (i386 defconfig)
> produced this warning:
>
> kernel/auditfilter.c: In function 'audit_data_to_entry':
> kernel/auditfilter.c:426:3: warning: this decimal constant is unsigned only
> in ISO C90 [enabled by default]
>
> Introduced by commit 780a7654cee8 ("audit: Make testing for a valid
> loginuid explicit") from Linus' tree.

Replace this decimal constant in the code with a macro to make it more
readable (and add the unsigned cast to quiet the warning).
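
As a small userspace sketch of the idea: converting -1 to unsigned int
yields the all-ones value, so the macro matches ~0U without spelling out
a decimal constant. The names SKETCH_UID_UNSET and is_uid_unset() are
invented for this example, and the sketch adds outer parentheses that
the definition in the patch does not have.

```c
#include <limits.h>

/* Sketch of the macro; parenthesised so it is safe inside any expression. */
#define SKETCH_UID_UNSET ((unsigned int)-1)

/* Hypothetical helper: true when val is the "unset" sentinel. */
int is_uid_unset(unsigned int val)
{
	return val == SKETCH_UID_UNSET;
}
```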

Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 include/uapi/linux/audit.h | 2 ++
 kernel/auditfilter.c       | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index 75cef3fd97ad..b7cb978ed579 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -374,6 +374,8 @@ struct audit_tty_status {
 	__u32		log_passwd;	/* 1 = enabled, 0 = disabled */
 };
 
+#define AUDIT_UID_UNSET (unsigned int)-1
+
 /* audit_rule_data supports filter rules with both integer and string
  * fields.  It corresponds with AUDIT_ADD_RULE, AUDIT_DEL_RULE and
  * AUDIT_LIST_RULES requests.
diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
index f7aee8be7fb2..8a344cebd8bf 100644
--- a/kernel/auditfilter.c
+++ b/kernel/auditfilter.c
@@ -423,7 +423,7 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
 		f->lsm_rule = NULL;
 
 		/* Support legacy tests for a valid loginuid */
-		if ((f->type == AUDIT_LOGINUID) && (f->val == ~0U)) {
+		if ((f->type == AUDIT_LOGINUID) && (f->val == AUDIT_UID_UNSET)) {
 			f->type = AUDIT_LOGINUID_SET;
 			f->val = 0;
 		}
-- 
2.2.1



* [PATCH 3.12 49/78] isofs: Fix unchecked printing of ER records
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (47 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 48/78] audit: change decimal constant to macro for invalid uid Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:31 ` [PATCH 3.12 50/78] KEYS: Fix stale key registration at error path Jiri Slaby
                   ` (30 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Jan Kara, Jiri Slaby

From: Jan Kara <jack@suse.cz>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 4e2024624e678f0ebb916e6192bd23c1f9fdf696 upstream.

We didn't check the length of Rock Ridge ER records before printing them.
Thus a corrupted isofs image can cause us to access and print some memory
behind the buffer with obvious consequences.
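
The shape of the added check can be sketched in userspace as below. The
record layout is a hypothetical, reduced one (it is not the kernel's
struct rock_ridge), and er_id_fits() is a name made up for the example.

```c
#include <stddef.h>

/* Hypothetical, reduced record layout for illustration only. */
struct sketch_record {
	unsigned char len;	/* total record length in bytes */
	unsigned char len_id;	/* claimed length of the id string */
	char data[1];		/* first byte of the id string */
};

/* True when the claimed id length fits inside the record's total length. */
int er_id_fits(const struct sketch_record *rr)
{
	return rr->len_id + offsetof(struct sketch_record, data) <= rr->len;
}
```

A record that claims a 200-byte id inside a 4-byte record is rejected
before anything is read or printed.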

Reported-and-tested-by: Carl Henrik Lunde <chlunde@ping.uio.no>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 fs/isofs/rock.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/isofs/rock.c b/fs/isofs/rock.c
index bb63254ed848..735d7522a3a9 100644
--- a/fs/isofs/rock.c
+++ b/fs/isofs/rock.c
@@ -362,6 +362,9 @@ repeat:
 			rs.cont_size = isonum_733(rr->u.CE.size);
 			break;
 		case SIG('E', 'R'):
+			/* Invalid length of ER tag id? */
+			if (rr->u.ER.len_id + offsetof(struct rock_ridge, u.ER.data) > rr->len)
+				goto out;
 			ISOFS_SB(inode->i_sb)->s_rock = 1;
 			printk(KERN_DEBUG "ISO 9660 Extensions: ");
 			{
-- 
2.2.1



* [PATCH 3.12 50/78] KEYS: Fix stale key registration at error path
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (48 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 49/78] isofs: Fix unchecked printing of ER records Jiri Slaby
@ 2015-01-09 10:31 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 51/78] mac80211: fix multicast LED blinking and counter Jiri Slaby
                   ` (29 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:31 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Takashi Iwai, Mimi Zohar, Jiri Slaby

From: Takashi Iwai <tiwai@suse.de>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit b26bdde5bb27f3f900e25a95e33a0c476c8c2c48 upstream.

When loading the encrypted-keys module, if the last check of
aes_get_sizes() in init_encrypted() fails, the driver just returns an
error without unregistering its key type.  This leaves a stale
entry in the list.  In addition to memory leaks, this leads to a kernel
crash when registering a new key type later.

This patch fixes the problem by swapping the calls of aes_get_sizes()
and register_key_type(), and releasing resources properly at the error
paths.

Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=908163
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 security/keys/encrypted-keys/encrypted.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/security/keys/encrypted-keys/encrypted.c b/security/keys/encrypted-keys/encrypted.c
index 9e1e005c7596..c4c8df4b214d 100644
--- a/security/keys/encrypted-keys/encrypted.c
+++ b/security/keys/encrypted-keys/encrypted.c
@@ -1018,10 +1018,13 @@ static int __init init_encrypted(void)
 	ret = encrypted_shash_alloc();
 	if (ret < 0)
 		return ret;
+	ret = aes_get_sizes();
+	if (ret < 0)
+		goto out;
 	ret = register_key_type(&key_type_encrypted);
 	if (ret < 0)
 		goto out;
-	return aes_get_sizes();
+	return 0;
 out:
 	encrypted_shash_release();
 	return ret;
-- 
2.2.1



* [PATCH 3.12 51/78] mac80211: fix multicast LED blinking and counter
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (49 preceding siblings ...)
  2015-01-09 10:31 ` [PATCH 3.12 50/78] KEYS: Fix stale key registration at error path Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 52/78] mac80211: free management frame keys when removing station Jiri Slaby
                   ` (28 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Andreas Müller, Johannes Berg, Jiri Slaby

From: Andreas Müller <goo@stapelspeicher.org>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit d025933e29872cb1fe19fc54d80e4dfa4ee5779c upstream.

As multicast frames can't be fragmented, "dot11MulticastReceivedFrameCount"
stopped being incremented after the use-after-free fix. Furthermore, the
RX-LED is now triggered by every multicast frame (which did not happen
before), so the LED never gets to rest.

Fixes https://bugzilla.kernel.org/show_bug.cgi?id=89431 which also had the
patch.

Fixes: b8fff407a180 ("mac80211: fix use-after-free in defragmentation")
Signed-off-by: Andreas Müller <goo@stapelspeicher.org>
[rewrite commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 net/mac80211/rx.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index 275cb85bfa31..ef3bdba9309e 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -1646,14 +1646,14 @@ ieee80211_rx_h_defragment(struct ieee80211_rx_data *rx)
 	sc = le16_to_cpu(hdr->seq_ctrl);
 	frag = sc & IEEE80211_SCTL_FRAG;
 
-	if (likely(!ieee80211_has_morefrags(fc) && frag == 0))
-		goto out;
-
 	if (is_multicast_ether_addr(hdr->addr1)) {
 		rx->local->dot11MulticastReceivedFrameCount++;
-		goto out;
+		goto out_no_led;
 	}
 
+	if (likely(!ieee80211_has_morefrags(fc) && frag == 0))
+		goto out;
+
 	I802_DEBUG_INC(rx->local->rx_handlers_fragments);
 
 	if (skb_linearize(rx->skb))
@@ -1744,9 +1744,10 @@ ieee80211_rx_h_defragment(struct ieee80211_rx_data *rx)
 	status->rx_flags |= IEEE80211_RX_FRAGMENTED;
 
  out:
+	ieee80211_led_rx(rx->local);
+ out_no_led:
 	if (rx->sta)
 		rx->sta->rx_packets++;
-	ieee80211_led_rx(rx->local);
 	return RX_CONTINUE;
 }
 
-- 
2.2.1



* [PATCH 3.12 52/78] mac80211: free management frame keys when removing station
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (50 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 51/78] mac80211: fix multicast LED blinking and counter Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 53/78] thermal: Fix error path in thermal_init() Jiri Slaby
                   ` (27 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Johannes Berg, Jiri Slaby

From: Johannes Berg <johannes.berg@intel.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 28a9bc68124c319b2b3dc861e80828a8865fd1ba upstream.

When writing the code to allow per-station GTKs, I neglected to
take into account the management frame keys (index 4 and 5) when
freeing the station and only added code to free the first four
data frame keys.

Fix this by iterating the array of keys over the right length.

Fixes: e31b82136d1a ("cfg80211/mac80211: allow per-station GTKs")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 net/mac80211/key.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mac80211/key.c b/net/mac80211/key.c
index 620677e897bd..23dfd244c892 100644
--- a/net/mac80211/key.c
+++ b/net/mac80211/key.c
@@ -615,7 +615,7 @@ void ieee80211_free_sta_keys(struct ieee80211_local *local,
 	int i;
 
 	mutex_lock(&local->key_mtx);
-	for (i = 0; i < NUM_DEFAULT_KEYS; i++) {
+	for (i = 0; i < ARRAY_SIZE(sta->gtk); i++) {
 		key = key_mtx_dereference(local, sta->gtk[i]);
 		if (!key)
 			continue;
-- 
2.2.1



* [PATCH 3.12 53/78] thermal: Fix error path in thermal_init()
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (51 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 52/78] mac80211: free management frame keys when removing station Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 54/78] mnt: Implicitly add MNT_NODEV on remount when it was implicitly added by mount Jiri Slaby
                   ` (26 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Luis Henriques, Zhang Rui, Jiri Slaby

From: Luis Henriques <luis.henriques@canonical.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 9d367e5e7b05c71a8c1ac4e9b6e00ba45a79f2fc upstream.

thermal_unregister_governors() and class_unregister() were being called in
the wrong order.

Fixes: 80a26a5c22b9 ("Thermal: build thermal governors into thermal_sys module")
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/thermal/thermal_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 4962a6aaf295..4f35f1ca3ce3 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1747,10 +1747,10 @@ static int __init thermal_init(void)
 
 	return 0;
 
-unregister_governors:
-	thermal_unregister_governors();
 unregister_class:
 	class_unregister(&thermal_class);
+unregister_governors:
+	thermal_unregister_governors();
 error:
 	idr_destroy(&thermal_tz_idr);
 	idr_destroy(&thermal_cdev_idr);
-- 
2.2.1



* [PATCH 3.12 54/78] mnt: Implicitly add MNT_NODEV on remount when it was implicitly added by mount
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (52 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 53/78] thermal: Fix error path in thermal_init() Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 55/78] mnt: Update unprivileged remount test Jiri Slaby
                   ` (25 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Eric W. Biederman, Jiri Slaby

From: "Eric W. Biederman" <ebiederm@xmission.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 3e1866410f11356a9fd869beb3e95983dc79c067 upstream.

Now that remount properly enforces the rule that you can't remove
nodev, at least sandstorm.io is breaking when performing a remount.

It turns out that there is an easy, intuitive solution: implicitly
add nodev on remount when nodev was implicitly added on mount.

Tested-by: Cedric Bosdonnat <cbosdonnat@suse.com>
Tested-by: Richard Weinberger <richard@nod.at>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 fs/namespace.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index d00750d2f91e..6b42c6d1590e 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1858,7 +1858,13 @@ static int do_remount(struct path *path, int flags, int mnt_flags,
 	}
 	if ((mnt->mnt.mnt_flags & MNT_LOCK_NODEV) &&
 	    !(mnt_flags & MNT_NODEV)) {
-		return -EPERM;
+		/* Was the nodev implicitly added in mount? */
+		if ((mnt->mnt_ns->user_ns != &init_user_ns) &&
+		    !(sb->s_type->fs_flags & FS_USERNS_DEV_MOUNT)) {
+			mnt_flags |= MNT_NODEV;
+		} else {
+			return -EPERM;
+		}
 	}
 	if ((mnt->mnt.mnt_flags & MNT_LOCK_NOSUID) &&
 	    !(mnt_flags & MNT_NOSUID)) {
-- 
2.2.1



* [PATCH 3.12 55/78] mnt: Update unprivileged remount test
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (53 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 54/78] mnt: Implicitly add MNT_NODEV on remount when it was implicitly added by mount Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 56/78] umount: Disallow unprivileged mount force Jiri Slaby
                   ` (24 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Eric W. Biederman, Jiri Slaby

From: "Eric W. Biederman" <ebiederm@xmission.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 4a44a19b470a886997d6647a77bb3e38dcbfa8c5 upstream.

- MNT_NODEV should be irrelevant except when reading back mount flags;
  no longer specify MNT_NODEV on remount.

- Test MNT_NODEV on devpts where it is meaningful even for unprivileged mounts.

- Add a test to verify that remount of a preexisting mount with the same flags
  is allowed and does not change those flags.

- Clean up the definitions of MS_REC, MS_RELATIME, MS_STRICTATIME that are used
  when the code is built in an environment without them.

- Correct the test error messages when tests fail.  There were not 5 tests
  that tested MS_RELATIME.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 .../selftests/mount/unprivileged-remount-test.c    | 172 +++++++++++++++++----
 1 file changed, 142 insertions(+), 30 deletions(-)

diff --git a/tools/testing/selftests/mount/unprivileged-remount-test.c b/tools/testing/selftests/mount/unprivileged-remount-test.c
index 1b3ff2fda4d0..9669d375625a 100644
--- a/tools/testing/selftests/mount/unprivileged-remount-test.c
+++ b/tools/testing/selftests/mount/unprivileged-remount-test.c
@@ -6,6 +6,8 @@
 #include <sys/types.h>
 #include <sys/mount.h>
 #include <sys/wait.h>
+#include <sys/vfs.h>
+#include <sys/statvfs.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <fcntl.h>
@@ -32,11 +34,14 @@
 # define CLONE_NEWPID 0x20000000
 #endif
 
+#ifndef MS_REC
+# define MS_REC 16384
+#endif
 #ifndef MS_RELATIME
-#define MS_RELATIME (1 << 21)
+# define MS_RELATIME (1 << 21)
 #endif
 #ifndef MS_STRICTATIME
-#define MS_STRICTATIME (1 << 24)
+# define MS_STRICTATIME (1 << 24)
 #endif
 
 static void die(char *fmt, ...)
@@ -87,6 +92,45 @@ static void write_file(char *filename, char *fmt, ...)
 	}
 }
 
+static int read_mnt_flags(const char *path)
+{
+	int ret;
+	struct statvfs stat;
+	int mnt_flags;
+
+	ret = statvfs(path, &stat);
+	if (ret != 0) {
+		die("statvfs of %s failed: %s\n",
+			path, strerror(errno));
+	}
+	if (stat.f_flag & ~(ST_RDONLY | ST_NOSUID | ST_NODEV | \
+			ST_NOEXEC | ST_NOATIME | ST_NODIRATIME | ST_RELATIME | \
+			ST_SYNCHRONOUS | ST_MANDLOCK)) {
+		die("Unrecognized mount flags\n");
+	}
+	mnt_flags = 0;
+	if (stat.f_flag & ST_RDONLY)
+		mnt_flags |= MS_RDONLY;
+	if (stat.f_flag & ST_NOSUID)
+		mnt_flags |= MS_NOSUID;
+	if (stat.f_flag & ST_NODEV)
+		mnt_flags |= MS_NODEV;
+	if (stat.f_flag & ST_NOEXEC)
+		mnt_flags |= MS_NOEXEC;
+	if (stat.f_flag & ST_NOATIME)
+		mnt_flags |= MS_NOATIME;
+	if (stat.f_flag & ST_NODIRATIME)
+		mnt_flags |= MS_NODIRATIME;
+	if (stat.f_flag & ST_RELATIME)
+		mnt_flags |= MS_RELATIME;
+	if (stat.f_flag & ST_SYNCHRONOUS)
+		mnt_flags |= MS_SYNCHRONOUS;
+	if (stat.f_flag & ST_MANDLOCK)
+		mnt_flags |= MS_MANDLOCK;
+
+	return mnt_flags;
+}
+
 static void create_and_enter_userns(void)
 {
 	uid_t uid;
@@ -118,7 +162,8 @@ static void create_and_enter_userns(void)
 }
 
 static
-bool test_unpriv_remount(int mount_flags, int remount_flags, int invalid_flags)
+bool test_unpriv_remount(const char *fstype, const char *mount_options,
+			 int mount_flags, int remount_flags, int invalid_flags)
 {
 	pid_t child;
 
@@ -151,9 +196,11 @@ bool test_unpriv_remount(int mount_flags, int remount_flags, int invalid_flags)
 			strerror(errno));
 	}
 
-	if (mount("testing", "/tmp", "ramfs", mount_flags, NULL) != 0) {
-		die("mount of /tmp failed: %s\n",
-			strerror(errno));
+	if (mount("testing", "/tmp", fstype, mount_flags, mount_options) != 0) {
+		die("mount of %s with options '%s' on /tmp failed: %s\n",
+		    fstype,
+		    mount_options? mount_options : "",
+		    strerror(errno));
 	}
 
 	create_and_enter_userns();
@@ -181,62 +228,127 @@ bool test_unpriv_remount(int mount_flags, int remount_flags, int invalid_flags)
 
 static bool test_unpriv_remount_simple(int mount_flags)
 {
-	return test_unpriv_remount(mount_flags, mount_flags, 0);
+	return test_unpriv_remount("ramfs", NULL, mount_flags, mount_flags, 0);
 }
 
 static bool test_unpriv_remount_atime(int mount_flags, int invalid_flags)
 {
-	return test_unpriv_remount(mount_flags, mount_flags, invalid_flags);
+	return test_unpriv_remount("ramfs", NULL, mount_flags, mount_flags,
+				   invalid_flags);
+}
+
+static bool test_priv_mount_unpriv_remount(void)
+{
+	pid_t child;
+	int ret;
+	const char *orig_path = "/dev";
+	const char *dest_path = "/tmp";
+	int orig_mnt_flags, remount_mnt_flags;
+
+	child = fork();
+	if (child == -1) {
+		die("fork failed: %s\n",
+			strerror(errno));
+	}
+	if (child != 0) { /* parent */
+		pid_t pid;
+		int status;
+		pid = waitpid(child, &status, 0);
+		if (pid == -1) {
+			die("waitpid failed: %s\n",
+				strerror(errno));
+		}
+		if (pid != child) {
+			die("waited for %d got %d\n",
+				child, pid);
+		}
+		if (!WIFEXITED(status)) {
+			die("child did not terminate cleanly\n");
+		}
+		return WEXITSTATUS(status) == EXIT_SUCCESS ? true : false;
+	}
+
+	orig_mnt_flags = read_mnt_flags(orig_path);
+
+	create_and_enter_userns();
+	ret = unshare(CLONE_NEWNS);
+	if (ret != 0) {
+		die("unshare(CLONE_NEWNS) failed: %s\n",
+			strerror(errno));
+	}
+
+	ret = mount(orig_path, dest_path, "bind", MS_BIND | MS_REC, NULL);
+	if (ret != 0) {
+		die("recursive bind mount of %s onto %s failed: %s\n",
+			orig_path, dest_path, strerror(errno));
+	}
+
+	ret = mount(dest_path, dest_path, "none",
+		    MS_REMOUNT | MS_BIND | orig_mnt_flags , NULL);
+	if (ret != 0) {
+		/* system("cat /proc/self/mounts"); */
+		die("remount of /tmp failed: %s\n",
+		    strerror(errno));
+	}
+
+	remount_mnt_flags = read_mnt_flags(dest_path);
+	if (orig_mnt_flags != remount_mnt_flags) {
+		die("Mount flags unexpectedly changed during remount of %s originally mounted on %s\n",
+			dest_path, orig_path);
+	}
+	exit(EXIT_SUCCESS);
 }
 
 int main(int argc, char **argv)
 {
-	if (!test_unpriv_remount_simple(MS_RDONLY|MS_NODEV)) {
+	if (!test_unpriv_remount_simple(MS_RDONLY)) {
 		die("MS_RDONLY malfunctions\n");
 	}
-	if (!test_unpriv_remount_simple(MS_NODEV)) {
+	if (!test_unpriv_remount("devpts", "newinstance", MS_NODEV, MS_NODEV, 0)) {
 		die("MS_NODEV malfunctions\n");
 	}
-	if (!test_unpriv_remount_simple(MS_NOSUID|MS_NODEV)) {
+	if (!test_unpriv_remount_simple(MS_NOSUID)) {
 		die("MS_NOSUID malfunctions\n");
 	}
-	if (!test_unpriv_remount_simple(MS_NOEXEC|MS_NODEV)) {
+	if (!test_unpriv_remount_simple(MS_NOEXEC)) {
 		die("MS_NOEXEC malfunctions\n");
 	}
-	if (!test_unpriv_remount_atime(MS_RELATIME|MS_NODEV,
-				       MS_NOATIME|MS_NODEV))
+	if (!test_unpriv_remount_atime(MS_RELATIME,
+				       MS_NOATIME))
 	{
 		die("MS_RELATIME malfunctions\n");
 	}
-	if (!test_unpriv_remount_atime(MS_STRICTATIME|MS_NODEV,
-				       MS_NOATIME|MS_NODEV))
+	if (!test_unpriv_remount_atime(MS_STRICTATIME,
+				       MS_NOATIME))
 	{
 		die("MS_STRICTATIME malfunctions\n");
 	}
-	if (!test_unpriv_remount_atime(MS_NOATIME|MS_NODEV,
-				       MS_STRICTATIME|MS_NODEV))
+	if (!test_unpriv_remount_atime(MS_NOATIME,
+				       MS_STRICTATIME))
 	{
-		die("MS_RELATIME malfunctions\n");
+		die("MS_NOATIME malfunctions\n");
 	}
-	if (!test_unpriv_remount_atime(MS_RELATIME|MS_NODIRATIME|MS_NODEV,
-				       MS_NOATIME|MS_NODEV))
+	if (!test_unpriv_remount_atime(MS_RELATIME|MS_NODIRATIME,
+				       MS_NOATIME))
 	{
-		die("MS_RELATIME malfunctions\n");
+		die("MS_RELATIME|MS_NODIRATIME malfunctions\n");
 	}
-	if (!test_unpriv_remount_atime(MS_STRICTATIME|MS_NODIRATIME|MS_NODEV,
-				       MS_NOATIME|MS_NODEV))
+	if (!test_unpriv_remount_atime(MS_STRICTATIME|MS_NODIRATIME,
+				       MS_NOATIME))
 	{
-		die("MS_RELATIME malfunctions\n");
+		die("MS_STRICTATIME|MS_NODIRATIME malfunctions\n");
 	}
-	if (!test_unpriv_remount_atime(MS_NOATIME|MS_NODIRATIME|MS_NODEV,
-				       MS_STRICTATIME|MS_NODEV))
+	if (!test_unpriv_remount_atime(MS_NOATIME|MS_NODIRATIME,
+				       MS_STRICTATIME))
 	{
-		die("MS_RELATIME malfunctions\n");
+		die("MS_NOATIME|MS_NODIRATIME malfunctions\n");
 	}
-	if (!test_unpriv_remount(MS_STRICTATIME|MS_NODEV, MS_NODEV,
-				 MS_NOATIME|MS_NODEV))
+	if (!test_unpriv_remount("ramfs", NULL, MS_STRICTATIME, 0, MS_NOATIME))
 	{
 		die("Default atime malfunctions\n");
 	}
+	if (!test_priv_mount_unpriv_remount()) {
+		die("Mount flags unexpectedly changed after remount\n");
+	}
 	return EXIT_SUCCESS;
 }
-- 
2.2.1



* [PATCH 3.12 56/78] umount: Disallow unprivileged mount force
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (54 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 55/78] mnt: Update unprivileged remount test Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 57/78] groups: Consolidate the setgroups permission checks Jiri Slaby
                   ` (23 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Eric W. Biederman, Jiri Slaby

From: "Eric W. Biederman" <ebiederm@xmission.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit b2f5d4dc38e034eecb7987e513255265ff9aa1cf upstream.

Forced unmount affects not just the mount namespace but the underlying
superblock as well.  Restrict forced unmount to the global root user
for now.  Otherwise it becomes possible a user in a less privileged
mount namespace to force the shutdown of a superblock of a filesystem
in a more privileged mount namespace, allowing a DOS attack on root.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 fs/namespace.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/namespace.c b/fs/namespace.c
index 6b42c6d1590e..7c3c0f6d2744 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1347,6 +1347,9 @@ SYSCALL_DEFINE2(umount, char __user *, name, int, flags)
 		goto dput_and_out;
 	if (mnt->mnt.mnt_flags & MNT_LOCKED)
 		goto dput_and_out;
+	retval = -EPERM;
+	if (flags & MNT_FORCE && !capable(CAP_SYS_ADMIN))
+		goto dput_and_out;
 
 	retval = do_umount(mnt, flags);
 dput_and_out:
-- 
2.2.1



* [PATCH 3.12 57/78] groups: Consolidate the setgroups permission checks
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (55 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 56/78] umount: Disallow unprivileged mount force Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 58/78] userns: Document what the invariant required for safe unprivileged mappings Jiri Slaby
                   ` (22 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Eric W. Biederman, Jiri Slaby

From: "Eric W. Biederman" <ebiederm@xmission.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 7ff4d90b4c24a03666f296c3d4878cd39001e81e upstream.

Today there are 3 instances of setgroups and due to an oversight their
permission checking has diverged.  Add a common function so that
they may all share the same permission checking code.

This corrects the oversight in the current permission checks
and adds a helper to avoid a repeat in the future.

A user namespace security fix will update this new helper shortly.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/s390/kernel/compat_linux.c | 2 +-
 include/linux/cred.h            | 1 +
 kernel/groups.c                 | 9 ++++++++-
 kernel/uid16.c                  | 2 +-
 4 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/s390/kernel/compat_linux.c b/arch/s390/kernel/compat_linux.c
index 1f1b8c70ab97..0ebb699aad1e 100644
--- a/arch/s390/kernel/compat_linux.c
+++ b/arch/s390/kernel/compat_linux.c
@@ -249,7 +249,7 @@ asmlinkage long sys32_setgroups16(int gidsetsize, u16 __user *grouplist)
 	struct group_info *group_info;
 	int retval;
 
-	if (!capable(CAP_SETGID))
+	if (!may_setgroups())
 		return -EPERM;
 	if ((unsigned)gidsetsize > NGROUPS_MAX)
 		return -EINVAL;
diff --git a/include/linux/cred.h b/include/linux/cred.h
index 04421e825365..6c58dd7cb9ac 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -68,6 +68,7 @@ extern void groups_free(struct group_info *);
 extern int set_current_groups(struct group_info *);
 extern int set_groups(struct cred *, struct group_info *);
 extern int groups_search(const struct group_info *, kgid_t);
+extern bool may_setgroups(void);
 
 /* access the groups "array" with this macro */
 #define GROUP_AT(gi, i) \
diff --git a/kernel/groups.c b/kernel/groups.c
index 90cf1c38c8ea..984bb629c68c 100644
--- a/kernel/groups.c
+++ b/kernel/groups.c
@@ -223,6 +223,13 @@ out:
 	return i;
 }
 
+bool may_setgroups(void)
+{
+	struct user_namespace *user_ns = current_user_ns();
+
+	return ns_capable(user_ns, CAP_SETGID);
+}
+
 /*
  *	SMP: Our groups are copy-on-write. We can set them safely
  *	without another task interfering.
@@ -233,7 +240,7 @@ SYSCALL_DEFINE2(setgroups, int, gidsetsize, gid_t __user *, grouplist)
 	struct group_info *group_info;
 	int retval;
 
-	if (!ns_capable(current_user_ns(), CAP_SETGID))
+	if (!may_setgroups())
 		return -EPERM;
 	if ((unsigned)gidsetsize > NGROUPS_MAX)
 		return -EINVAL;
diff --git a/kernel/uid16.c b/kernel/uid16.c
index 602e5bbbceff..d58cc4d8f0d1 100644
--- a/kernel/uid16.c
+++ b/kernel/uid16.c
@@ -176,7 +176,7 @@ SYSCALL_DEFINE2(setgroups16, int, gidsetsize, old_gid_t __user *, grouplist)
 	struct group_info *group_info;
 	int retval;
 
-	if (!ns_capable(current_user_ns(), CAP_SETGID))
+	if (!may_setgroups())
 		return -EPERM;
 	if ((unsigned)gidsetsize > NGROUPS_MAX)
 		return -EINVAL;
-- 
2.2.1



* [PATCH 3.12 58/78] userns: Document what the invariant required for safe unprivileged mappings.
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (56 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 57/78] groups: Consolidate the setgroups permission checks Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 59/78] userns: Don't allow setgroups until a gid mapping has been setablished Jiri Slaby
                   ` (21 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Eric W. Biederman, Jiri Slaby

From: "Eric W. Biederman" <ebiederm@xmission.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 0542f17bf2c1f2430d368f44c8fcf2f82ec9e53e upstream.

The rule is simple.  Don't allow anything that wouldn't be allowed
without unprivileged mappings.

It was previously overlooked that establishing gid mappings would
allow dropping groups and potentially gaining permission to files and
directories that had lesser permissions for a specific group than for
all other users.

This is the rule needed to fix CVE-2014-8989 and prevent any other
security issues with new_idmap_permitted.

The reason for this rule is that the unix permission model is old and
there are programs out there somewhere that take advantage of every
little corner of it.  So allowing a uid or gid mapping to be
established without privilege that would allow anything that would not
be allowed without that mapping will result in expectations from some
code somewhere being violated.  Violated expectations about the
behavior of the OS is a long way of saying a security issue.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 kernel/user_namespace.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 6991139e3303..c9aa0e2c07ba 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -798,7 +798,9 @@ static bool new_idmap_permitted(const struct file *file,
 				struct user_namespace *ns, int cap_setid,
 				struct uid_gid_map *new_map)
 {
-	/* Allow mapping to your own filesystem ids */
+	/* Don't allow mappings that would allow anything that wouldn't
+	 * be allowed without the establishment of unprivileged mappings.
+	 */
 	if ((new_map->nr_extents == 1) && (new_map->extent[0].count == 1)) {
 		u32 id = new_map->extent[0].lower_first;
 		if (cap_setid == CAP_SETUID) {
-- 
2.2.1



* [PATCH 3.12 59/78] userns: Don't allow setgroups until a gid mapping has been setablished
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (57 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 58/78] userns: Document what the invariant required for safe unprivileged mappings Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 60/78] userns: Don't allow unprivileged creation of gid mappings Jiri Slaby
                   ` (20 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Eric W. Biederman, Jiri Slaby

From: "Eric W. Biederman" <ebiederm@xmission.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 273d2c67c3e179adb1e74f403d1e9a06e3f841b5 upstream.

setgroups is unique in not needing a valid mapping before it can be called,
in the case of setgroups(0, NULL) which drops all supplemental groups.

The design of the user namespace assumes that CAP_SETGID can not actually
be used until a gid mapping is established.  Therefore add a helper function
to see if the user namespace gid mapping has been established and call
that function in the setgroups permission check.

This is part of the fix for CVE-2014-8989, being able to drop groups
without privilege using user namespaces.

Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 include/linux/user_namespace.h |  5 +++++
 kernel/groups.c                |  4 +++-
 kernel/user_namespace.c        | 14 ++++++++++++++
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 4db29859464f..736bee2b5664 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -57,6 +57,7 @@ extern struct seq_operations proc_projid_seq_operations;
 extern ssize_t proc_uid_map_write(struct file *, const char __user *, size_t, loff_t *);
 extern ssize_t proc_gid_map_write(struct file *, const char __user *, size_t, loff_t *);
 extern ssize_t proc_projid_map_write(struct file *, const char __user *, size_t, loff_t *);
+extern bool userns_may_setgroups(const struct user_namespace *ns);
 #else
 
 static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
@@ -81,6 +82,10 @@ static inline void put_user_ns(struct user_namespace *ns)
 {
 }
 
+static inline bool userns_may_setgroups(const struct user_namespace *ns)
+{
+	return true;
+}
 #endif
 
 #endif /* _LINUX_USER_H */
diff --git a/kernel/groups.c b/kernel/groups.c
index 984bb629c68c..67b4ba30475f 100644
--- a/kernel/groups.c
+++ b/kernel/groups.c
@@ -6,6 +6,7 @@
 #include <linux/slab.h>
 #include <linux/security.h>
 #include <linux/syscalls.h>
+#include <linux/user_namespace.h>
 #include <asm/uaccess.h>
 
 /* init to 2 - one for init_task, one to ensure it is never freed */
@@ -227,7 +228,8 @@ bool may_setgroups(void)
 {
 	struct user_namespace *user_ns = current_user_ns();
 
-	return ns_capable(user_ns, CAP_SETGID);
+	return ns_capable(user_ns, CAP_SETGID) &&
+		userns_may_setgroups(user_ns);
 }
 
 /*
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index c9aa0e2c07ba..048bb7b641a9 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -830,6 +830,20 @@ static bool new_idmap_permitted(const struct file *file,
 	return false;
 }
 
+bool userns_may_setgroups(const struct user_namespace *ns)
+{
+	bool allowed;
+
+	mutex_lock(&id_map_mutex);
+	/* It is not safe to use setgroups until a gid mapping in
+	 * the user namespace has been established.
+	 */
+	allowed = ns->gid_map.nr_extents != 0;
+	mutex_unlock(&id_map_mutex);
+
+	return allowed;
+}
+
 static void *userns_get(struct task_struct *task)
 {
 	struct user_namespace *user_ns;
-- 
2.2.1



* [PATCH 3.12 60/78] userns: Don't allow unprivileged creation of gid mappings
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (58 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 59/78] userns: Don't allow setgroups until a gid mapping has been setablished Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 61/78] userns: Check euid no fsuid when establishing an unprivileged uid mapping Jiri Slaby
                   ` (19 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Eric W. Biederman, Jiri Slaby

From: "Eric W. Biederman" <ebiederm@xmission.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit be7c6dba2332cef0677fbabb606e279ae76652c3 upstream.

Because any gid mapping allows, and for backwards compatibility must
allow, dropping groups, don't allow any gid mappings to be established
without CAP_SETGID in the parent user namespace.

For a small class of applications this change breaks userspace
and removes useful functionality.  This small class of applications
includes tools/testing/selftests/mount/unprivileged-remount-test.c.

Most of the removed functionality will be added back with the addition
of a one way knob to disable setgroups.  Once setgroups is disabled
setting the gid_map becomes as safe as setting the uid_map.

For more common applications that set the uid_map and the gid_map
with privilege, this change will have no effect.

This is part of a fix for CVE-2014-8989.

Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 kernel/user_namespace.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 048bb7b641a9..a5809c42a1b3 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -808,11 +808,6 @@ static bool new_idmap_permitted(const struct file *file,
 			if (uid_eq(uid, file->f_cred->fsuid))
 				return true;
 		}
-		else if (cap_setid == CAP_SETGID) {
-			kgid_t gid = make_kgid(ns->parent, id);
-			if (gid_eq(gid, file->f_cred->fsgid))
-				return true;
-		}
 	}
 
 	/* Allow anyone to set a mapping that doesn't require privilege */
-- 
2.2.1



* [PATCH 3.12 61/78] userns: Check euid no fsuid when establishing an unprivileged uid mapping
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (59 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 60/78] userns: Don't allow unprivileged creation of gid mappings Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 62/78] userns: Only allow the creator of the userns unprivileged mappings Jiri Slaby
                   ` (18 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Eric W. Biederman, Jiri Slaby

From: "Eric W. Biederman" <ebiederm@xmission.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 80dd00a23784b384ccea049bfb3f259d3f973b9d upstream.

setresuid allows the euid to be set to any of uid, euid, suid, and
fsuid.  Therefore it is safe to allow an unprivileged user to map
their euid and use CAP_SETUID privilege with exactly that uid,
as no new credentials can be obtained.

I cannot find a combination of existing system calls that allows setting
uid, euid, suid, and fsuid from the fsuid, making the previous use
of fsuid for allowing unprivileged mappings a bug.

This is part of a fix for CVE-2014-8989.

Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 kernel/user_namespace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index a5809c42a1b3..6e495bd672a7 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -805,7 +805,7 @@ static bool new_idmap_permitted(const struct file *file,
 		u32 id = new_map->extent[0].lower_first;
 		if (cap_setid == CAP_SETUID) {
 			kuid_t uid = make_kuid(ns->parent, id);
-			if (uid_eq(uid, file->f_cred->fsuid))
+			if (uid_eq(uid, file->f_cred->euid))
 				return true;
 		}
 	}
-- 
2.2.1



* [PATCH 3.12 62/78] userns: Only allow the creator of the userns unprivileged mappings
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (60 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 61/78] userns: Check euid no fsuid when establishing an unprivileged uid mapping Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 63/78] userns: Rename id_map_mutex to userns_state_mutex Jiri Slaby
                   ` (17 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Eric W. Biederman, Jiri Slaby

From: "Eric W. Biederman" <ebiederm@xmission.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit f95d7918bd1e724675de4940039f2865e5eec5fe upstream.

If you did not create the user namespace and are allowed
to write to uid_map or gid_map, you should already have the necessary
privilege in the parent user namespace to establish any mapping
you want, so this will not affect userspace in practice.

Limiting unprivileged uid mapping establishment to the creator of the
user namespace makes it easier to verify that all credentials obtained
with the uid mapping can also be obtained without the uid mapping and
without privilege.

Limiting unprivileged gid mapping establishment (which is temporarily
absent) to the creator of the user namespace also ensures that the
combination of uid and gid can already be obtained without privilege.

This is part of the fix for CVE-2014-8989.

Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 kernel/user_namespace.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 6e495bd672a7..c2ca6c01e575 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -798,14 +798,16 @@ static bool new_idmap_permitted(const struct file *file,
 				struct user_namespace *ns, int cap_setid,
 				struct uid_gid_map *new_map)
 {
+	const struct cred *cred = file->f_cred;
 	/* Don't allow mappings that would allow anything that wouldn't
 	 * be allowed without the establishment of unprivileged mappings.
 	 */
-	if ((new_map->nr_extents == 1) && (new_map->extent[0].count == 1)) {
+	if ((new_map->nr_extents == 1) && (new_map->extent[0].count == 1) &&
+	    uid_eq(ns->owner, cred->euid)) {
 		u32 id = new_map->extent[0].lower_first;
 		if (cap_setid == CAP_SETUID) {
 			kuid_t uid = make_kuid(ns->parent, id);
-			if (uid_eq(uid, file->f_cred->euid))
+			if (uid_eq(uid, cred->euid))
 				return true;
 		}
 	}
-- 
2.2.1



* [PATCH 3.12 63/78] userns: Rename id_map_mutex to userns_state_mutex
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (61 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 62/78] userns: Only allow the creator of the userns unprivileged mappings Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 64/78] userns: Add a knob to disable setgroups on a per user namespace basis Jiri Slaby
                   ` (16 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Eric W. Biederman, Jiri Slaby

From: "Eric W. Biederman" <ebiederm@xmission.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit f0d62aec931e4ae3333c797d346dc4f188f454ba upstream.

Generalize id_map_mutex so it can be used for more state of a user namespace.

Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 kernel/user_namespace.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index c2ca6c01e575..a607b24bec0b 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -24,6 +24,7 @@
 #include <linux/fs_struct.h>
 
 static struct kmem_cache *user_ns_cachep __read_mostly;
+static DEFINE_MUTEX(userns_state_mutex);
 
 static bool new_idmap_permitted(const struct file *file,
 				struct user_namespace *ns, int cap_setid,
@@ -575,9 +576,6 @@ static bool mappings_overlap(struct uid_gid_map *new_map, struct uid_gid_extent
 	return false;
 }
 
-
-static DEFINE_MUTEX(id_map_mutex);
-
 static ssize_t map_write(struct file *file, const char __user *buf,
 			 size_t count, loff_t *ppos,
 			 int cap_setid,
@@ -594,7 +592,7 @@ static ssize_t map_write(struct file *file, const char __user *buf,
 	ssize_t ret = -EINVAL;
 
 	/*
-	 * The id_map_mutex serializes all writes to any given map.
+	 * The userns_state_mutex serializes all writes to any given map.
 	 *
 	 * Any map is only ever written once.
 	 *
@@ -612,7 +610,7 @@ static ssize_t map_write(struct file *file, const char __user *buf,
 	 * order and smp_rmb() is guaranteed that we don't have crazy
 	 * architectures returning stale data.
 	 */
-	mutex_lock(&id_map_mutex);
+	mutex_lock(&userns_state_mutex);
 
 	ret = -EPERM;
 	/* Only allow one successful write to the map */
@@ -739,7 +737,7 @@ static ssize_t map_write(struct file *file, const char __user *buf,
 	*ppos = count;
 	ret = count;
 out:
-	mutex_unlock(&id_map_mutex);
+	mutex_unlock(&userns_state_mutex);
 	if (page)
 		free_page(page);
 	return ret;
@@ -831,12 +829,12 @@ bool userns_may_setgroups(const struct user_namespace *ns)
 {
 	bool allowed;
 
-	mutex_lock(&id_map_mutex);
+	mutex_lock(&userns_state_mutex);
 	/* It is not safe to use setgroups until a gid mapping in
 	 * the user namespace has been established.
 	 */
 	allowed = ns->gid_map.nr_extents != 0;
-	mutex_unlock(&id_map_mutex);
+	mutex_unlock(&userns_state_mutex);
 
 	return allowed;
 }
-- 
2.2.1



* [PATCH 3.12 64/78] userns: Add a knob to disable setgroups on a per user namespace basis
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (62 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 63/78] userns: Rename id_map_mutex to userns_state_mutex Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 65/78] userns: Allow setting gid_maps without privilege when setgroups is disabled Jiri Slaby
                   ` (15 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Eric W. Biederman, Jiri Slaby

From: "Eric W. Biederman" <ebiederm@xmission.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 9cc46516ddf497ea16e8d7cb986ae03a0f6b92f8 upstream.

- Expose the knob to user space through a proc file /proc/<pid>/setgroups

  A value of "deny" means the setgroups system call is disabled in the
  current process's user namespace and cannot be enabled in the
  future in this user namespace.

  A value of "allow" means the setgroups system call is enabled.

- Descendant user namespaces inherit the value of setgroups from
  their parents.

- A proc file is used (instead of a sysctl) as sysctls currently do
  not allow checking the permissions at open time.

- Writing to the proc file is restricted to before the gid_map
  for the user namespace is set.

  This ensures that disabling setgroups at a user namespace
  level will never remove the ability to call setgroups
  from a process that already has that ability.

  A process may opt in to the setgroups disable for itself by
  creating, entering and configuring a user namespace or by calling
  setns on an existing user namespace with setgroups disabled.
  Processes without privileges already cannot call setgroups so this
  is a no-op.  Processes with privilege become processes without
  privilege when entering a user namespace and as with any other path
  to dropping privilege they would not have the ability to call
  setgroups.  So this remains within the bounds of what is possible
  without a knob to disable setgroups permanently in a user namespace.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
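Editorial note, not part of the upstream commit: the input format that
the new proc file accepts can be tried out in user space with the sketch
below.  It mirrors the string checks of proc_setgroups_write() from this
patch, under stated assumptions: parse_setgroups_request() is a
hypothetical helper name, memcpy() stands in for copy_from_user(), an
isspace() loop stands in for skip_spaces(), and -1 stands in for
-EINVAL.  The permission and state checks done under
userns_state_mutex are not modelled.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model of the string check; not kernel code.
 * Returns 0 and stores the request in *allowed, or -1 if the input
 * would be rejected. */
static int parse_setgroups_request(const char *buf, size_t count, bool *allowed)
{
	char kbuf[8];
	const char *pos = kbuf;

	/* Same bound as the patch: input plus a NUL must fit in kbuf. */
	if (count >= sizeof(kbuf))
		return -1;
	memcpy(kbuf, buf, count);	/* stand-in for copy_from_user() */
	kbuf[count] = '\0';

	/* What is being requested? */
	if (strncmp(pos, "allow", 5) == 0) {
		pos += 5;
		*allowed = true;
	} else if (strncmp(pos, "deny", 4) == 0) {
		pos += 4;
		*allowed = false;
	} else {
		return -1;
	}

	/* Only trailing whitespace may follow (stand-in for skip_spaces()). */
	while (isspace((unsigned char)*pos))
		pos++;
	return (*pos == '\0') ? 0 : -1;
}
```

With this model, "deny" and "allow\n" are accepted, while "denyx" or any
write of 8 bytes or more is rejected, which matches the narrow range of
strings the kernel code intends to allow.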
 fs/proc/base.c                 | 53 ++++++++++++++++++++++++++
 include/linux/user_namespace.h |  7 ++++
 kernel/user.c                  |  1 +
 kernel/user_namespace.c        | 85 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 146 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index c35eaa404933..dfce13e5327b 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2544,6 +2544,57 @@ static const struct file_operations proc_projid_map_operations = {
 	.llseek		= seq_lseek,
 	.release	= proc_id_map_release,
 };
+
+static int proc_setgroups_open(struct inode *inode, struct file *file)
+{
+	struct user_namespace *ns = NULL;
+	struct task_struct *task;
+	int ret;
+
+	ret = -ESRCH;
+	task = get_proc_task(inode);
+	if (task) {
+		rcu_read_lock();
+		ns = get_user_ns(task_cred_xxx(task, user_ns));
+		rcu_read_unlock();
+		put_task_struct(task);
+	}
+	if (!ns)
+		goto err;
+
+	if (file->f_mode & FMODE_WRITE) {
+		ret = -EACCES;
+		if (!ns_capable(ns, CAP_SYS_ADMIN))
+			goto err_put_ns;
+	}
+
+	ret = single_open(file, &proc_setgroups_show, ns);
+	if (ret)
+		goto err_put_ns;
+
+	return 0;
+err_put_ns:
+	put_user_ns(ns);
+err:
+	return ret;
+}
+
+static int proc_setgroups_release(struct inode *inode, struct file *file)
+{
+	struct seq_file *seq = file->private_data;
+	struct user_namespace *ns = seq->private;
+	int ret = single_release(inode, file);
+	put_user_ns(ns);
+	return ret;
+}
+
+static const struct file_operations proc_setgroups_operations = {
+	.open		= proc_setgroups_open,
+	.write		= proc_setgroups_write,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= proc_setgroups_release,
+};
 #endif /* CONFIG_USER_NS */
 
 static int proc_pid_personality(struct seq_file *m, struct pid_namespace *ns,
@@ -2652,6 +2703,7 @@ static const struct pid_entry tgid_base_stuff[] = {
 	REG("uid_map",    S_IRUGO|S_IWUSR, proc_uid_map_operations),
 	REG("gid_map",    S_IRUGO|S_IWUSR, proc_gid_map_operations),
 	REG("projid_map", S_IRUGO|S_IWUSR, proc_projid_map_operations),
+	REG("setgroups",  S_IRUGO|S_IWUSR, proc_setgroups_operations),
 #endif
 #ifdef CONFIG_CHECKPOINT_RESTORE
 	REG("timers",	  S_IRUGO, proc_timers_operations),
@@ -2987,6 +3039,7 @@ static const struct pid_entry tid_base_stuff[] = {
 	REG("uid_map",    S_IRUGO|S_IWUSR, proc_uid_map_operations),
 	REG("gid_map",    S_IRUGO|S_IWUSR, proc_gid_map_operations),
 	REG("projid_map", S_IRUGO|S_IWUSR, proc_projid_map_operations),
+	REG("setgroups",  S_IRUGO|S_IWUSR, proc_setgroups_operations),
 #endif
 };
 
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 736bee2b5664..67c11082bde2 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -17,6 +17,10 @@ struct uid_gid_map {	/* 64 bytes -- 1 cache line */
 	} extent[UID_GID_MAP_MAX_EXTENTS];
 };
 
+#define USERNS_SETGROUPS_ALLOWED 1UL
+
+#define USERNS_INIT_FLAGS USERNS_SETGROUPS_ALLOWED
+
 struct user_namespace {
 	struct uid_gid_map	uid_map;
 	struct uid_gid_map	gid_map;
@@ -27,6 +31,7 @@ struct user_namespace {
 	kuid_t			owner;
 	kgid_t			group;
 	unsigned int		proc_inum;
+	unsigned long		flags;
 };
 
 extern struct user_namespace init_user_ns;
@@ -57,6 +62,8 @@ extern struct seq_operations proc_projid_seq_operations;
 extern ssize_t proc_uid_map_write(struct file *, const char __user *, size_t, loff_t *);
 extern ssize_t proc_gid_map_write(struct file *, const char __user *, size_t, loff_t *);
 extern ssize_t proc_projid_map_write(struct file *, const char __user *, size_t, loff_t *);
+extern ssize_t proc_setgroups_write(struct file *, const char __user *, size_t, loff_t *);
+extern int proc_setgroups_show(struct seq_file *m, void *v);
 extern bool userns_may_setgroups(const struct user_namespace *ns);
 #else
 
diff --git a/kernel/user.c b/kernel/user.c
index 5bbb91988e69..75774ce9bf58 100644
--- a/kernel/user.c
+++ b/kernel/user.c
@@ -51,6 +51,7 @@ struct user_namespace init_user_ns = {
 	.owner = GLOBAL_ROOT_UID,
 	.group = GLOBAL_ROOT_GID,
 	.proc_inum = PROC_USER_INIT_INO,
+	.flags = USERNS_INIT_FLAGS,
 };
 EXPORT_SYMBOL_GPL(init_user_ns);
 
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index a607b24bec0b..7737b3da335c 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -100,6 +100,11 @@ int create_user_ns(struct cred *new)
 	ns->owner = owner;
 	ns->group = group;
 
+	/* Inherit USERNS_SETGROUPS_ALLOWED from our parent */
+	mutex_lock(&userns_state_mutex);
+	ns->flags = parent_ns->flags;
+	mutex_unlock(&userns_state_mutex);
+
 	set_cred_user_ns(new, ns);
 
 	return 0;
@@ -825,6 +830,84 @@ static bool new_idmap_permitted(const struct file *file,
 	return false;
 }
 
+int proc_setgroups_show(struct seq_file *seq, void *v)
+{
+	struct user_namespace *ns = seq->private;
+	unsigned long userns_flags = ACCESS_ONCE(ns->flags);
+
+	seq_printf(seq, "%s\n",
+		   (userns_flags & USERNS_SETGROUPS_ALLOWED) ?
+		   "allow" : "deny");
+	return 0;
+}
+
+ssize_t proc_setgroups_write(struct file *file, const char __user *buf,
+			     size_t count, loff_t *ppos)
+{
+	struct seq_file *seq = file->private_data;
+	struct user_namespace *ns = seq->private;
+	char kbuf[8], *pos;
+	bool setgroups_allowed;
+	ssize_t ret;
+
+	/* Only allow a very narrow range of strings to be written */
+	ret = -EINVAL;
+	if ((*ppos != 0) || (count >= sizeof(kbuf)))
+		goto out;
+
+	/* What was written? */
+	ret = -EFAULT;
+	if (copy_from_user(kbuf, buf, count))
+		goto out;
+	kbuf[count] = '\0';
+	pos = kbuf;
+
+	/* What is being requested? */
+	ret = -EINVAL;
+	if (strncmp(pos, "allow", 5) == 0) {
+		pos += 5;
+		setgroups_allowed = true;
+	}
+	else if (strncmp(pos, "deny", 4) == 0) {
+		pos += 4;
+		setgroups_allowed = false;
+	}
+	else
+		goto out;
+
+	/* Verify there is not trailing junk on the line */
+	pos = skip_spaces(pos);
+	if (*pos != '\0')
+		goto out;
+
+	ret = -EPERM;
+	mutex_lock(&userns_state_mutex);
+	if (setgroups_allowed) {
+		/* Enabling setgroups after setgroups has been disabled
+		 * is not allowed.
+		 */
+		if (!(ns->flags & USERNS_SETGROUPS_ALLOWED))
+			goto out_unlock;
+	} else {
+		/* Permanently disabling setgroups after setgroups has
+		 * been enabled by writing the gid_map is not allowed.
+		 */
+		if (ns->gid_map.nr_extents != 0)
+			goto out_unlock;
+		ns->flags &= ~USERNS_SETGROUPS_ALLOWED;
+	}
+	mutex_unlock(&userns_state_mutex);
+
+	/* Report a successful write */
+	*ppos = count;
+	ret = count;
+out:
+	return ret;
+out_unlock:
+	mutex_unlock(&userns_state_mutex);
+	goto out;
+}
+
 bool userns_may_setgroups(const struct user_namespace *ns)
 {
 	bool allowed;
@@ -834,6 +917,8 @@ bool userns_may_setgroups(const struct user_namespace *ns)
 	 * the user namespace has been established.
 	 */
 	allowed = ns->gid_map.nr_extents != 0;
+	/* Is setgroups allowed? */
+	allowed = allowed && (ns->flags & USERNS_SETGROUPS_ALLOWED);
 	mutex_unlock(&userns_state_mutex);
 
 	return allowed;
-- 
2.2.1



* [PATCH 3.12 65/78] userns: Allow setting gid_maps without privilege when setgroups is disabled
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (63 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 64/78] userns: Add a knob to disable setgroups on a per user namespace basis Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 66/78] userns: Unbreak the unprivileged remount tests Jiri Slaby
                   ` (14 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Eric W. Biederman, Jiri Slaby

From: "Eric W. Biederman" <ebiederm@xmission.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 66d2f338ee4c449396b6f99f5e75cd18eb6df272 upstream.

Now that setgroups can be disabled and not reenabled, setting gid_map
without privilege can be enabled when setgroups is disabled.

This restores most of the functionality that was lost when unprivileged
setting of gid_map was removed.  Applications that use this functionality
will need to check to see if they use setgroups or initgroups, and if they
don't they can be fixed by simply disabling setgroups before writing to
gid_map.

Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
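Editorial note, not part of the upstream commit: the decision that the
unprivileged branch of new_idmap_permitted() makes after this patch can
be summarised as a small user-space model.  This is only a sketch under
assumptions: all names prefixed with model_ or MODEL_ are hypothetical,
ids are plain integers rather than kuid_t/kgid_t, and the later
fallbacks in the real function (mappings that need no privilege, and
the capability check against the parent namespace) are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for USERNS_SETGROUPS_ALLOWED; hypothetical name. */
#define MODEL_SETGROUPS_ALLOWED 1UL

enum model_map_kind { MODEL_UID_MAP, MODEL_GID_MAP };

struct model_request {
	unsigned int nr_extents;	/* extents in the proposed map */
	uint32_t count;			/* length of the first extent */
	uint32_t lower_first;		/* parent-namespace id being mapped */
	uint32_t ns_owner;		/* uid that created the namespace */
	uint32_t euid, egid;		/* credentials of the file opener */
	unsigned long ns_flags;		/* per-namespace flags */
};

/* Models only the single-extent, unprivileged branch. */
static bool model_unprivileged_map_ok(enum model_map_kind kind,
				      const struct model_request *r)
{
	/* One id only, and only for the creator of the namespace. */
	if (r->nr_extents != 1 || r->count != 1 || r->ns_owner != r->euid)
		return false;

	if (kind == MODEL_UID_MAP)
		return r->lower_first == r->euid;

	/* gid_map: allowed only once setgroups has been denied for good. */
	return !(r->ns_flags & MODEL_SETGROUPS_ALLOWED) &&
	       r->lower_first == r->egid;
}
```

In this model a gid_map request from the namespace creator fails while
the setgroups flag is still set and succeeds once it has been cleared,
which is why the message above tells applications to disable setgroups
before writing to gid_map.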
 kernel/user_namespace.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 7737b3da335c..c09fe8b87cb0 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -812,6 +812,11 @@ static bool new_idmap_permitted(const struct file *file,
 			kuid_t uid = make_kuid(ns->parent, id);
 			if (uid_eq(uid, cred->euid))
 				return true;
+		} else if (cap_setid == CAP_SETGID) {
+			kgid_t gid = make_kgid(ns->parent, id);
+			if (!(ns->flags & USERNS_SETGROUPS_ALLOWED) &&
+			    gid_eq(gid, cred->egid))
+				return true;
 		}
 	}
 
-- 
2.2.1



* [PATCH 3.12 66/78] userns: Unbreak the unprivileged remount tests
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (64 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 65/78] userns: Allow setting gid_maps without privilege when setgroups is disabled Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 67/78] audit: restore AUDIT_LOGINUID unset ABI Jiri Slaby
                   ` (13 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Eric W. Biederman, Jiri Slaby

From: "Eric W. Biederman" <ebiederm@xmission.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit db86da7cb76f797a1a8b445166a15cb922c6ff85 upstream.

A security fix caused the way the unprivileged remount tests were
using user namespaces to break.  Tweak the way user namespaces are
being used so the test works again.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 .../selftests/mount/unprivileged-remount-test.c    | 32 ++++++++++++++++------
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/mount/unprivileged-remount-test.c b/tools/testing/selftests/mount/unprivileged-remount-test.c
index 9669d375625a..517785052f1c 100644
--- a/tools/testing/selftests/mount/unprivileged-remount-test.c
+++ b/tools/testing/selftests/mount/unprivileged-remount-test.c
@@ -53,17 +53,14 @@ static void die(char *fmt, ...)
 	exit(EXIT_FAILURE);
 }
 
-static void write_file(char *filename, char *fmt, ...)
+static void vmaybe_write_file(bool enoent_ok, char *filename, char *fmt, va_list ap)
 {
 	char buf[4096];
 	int fd;
 	ssize_t written;
 	int buf_len;
-	va_list ap;
 
-	va_start(ap, fmt);
 	buf_len = vsnprintf(buf, sizeof(buf), fmt, ap);
-	va_end(ap);
 	if (buf_len < 0) {
 		die("vsnprintf failed: %s\n",
 		    strerror(errno));
@@ -74,6 +71,8 @@ static void write_file(char *filename, char *fmt, ...)
 
 	fd = open(filename, O_WRONLY);
 	if (fd < 0) {
+		if ((errno == ENOENT) && enoent_ok)
+			return;
 		die("open of %s failed: %s\n",
 		    filename, strerror(errno));
 	}
@@ -92,6 +91,26 @@ static void write_file(char *filename, char *fmt, ...)
 	}
 }
 
+static void maybe_write_file(char *filename, char *fmt, ...)
+{
+	va_list ap;
+
+	va_start(ap, fmt);
+	vmaybe_write_file(true, filename, fmt, ap);
+	va_end(ap);
+
+}
+
+static void write_file(char *filename, char *fmt, ...)
+{
+	va_list ap;
+
+	va_start(ap, fmt);
+	vmaybe_write_file(false, filename, fmt, ap);
+	va_end(ap);
+
+}
+
 static int read_mnt_flags(const char *path)
 {
 	int ret;
@@ -144,13 +163,10 @@ static void create_and_enter_userns(void)
 			strerror(errno));
 	}
 
+	maybe_write_file("/proc/self/setgroups", "deny");
 	write_file("/proc/self/uid_map", "0 %d 1", uid);
 	write_file("/proc/self/gid_map", "0 %d 1", gid);
 
-	if (setgroups(0, NULL) != 0) {
-		die("setgroups failed: %s\n",
-			strerror(errno));
-	}
 	if (setgid(0) != 0) {
 		die ("setgid(0) failed %s\n",
 			strerror(errno));
-- 
2.2.1



* [PATCH 3.12 67/78] audit: restore AUDIT_LOGINUID unset ABI
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (65 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 66/78] userns: Unbreak the unprivileged remount tests Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 68/78] crypto: af_alg - fix backlog handling Jiri Slaby
                   ` (12 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Richard Guy Briggs, Paul Moore, Jiri Slaby

From: Richard Guy Briggs <rgb@redhat.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 041d7b98ffe59c59fdd639931dea7d74f9aa9a59 upstream.

A regression was caused by commit 780a7654cee8:
	 audit: Make testing for a valid loginuid explicit.
(which in turn attempted to fix a regression caused by e1760bd)

When audit_krule_to_data() fills in the rules to get a listing, there was a
missing clause to convert back from AUDIT_LOGINUID_SET to AUDIT_LOGINUID.

This broke userspace by not returning the same information that was sent and
expected.

The rule:
	auditctl -a exit,never -F auid=-1
gives:
	auditctl -l
		LIST_RULES: exit,never f24=0 syscall=all
when it should give:
		LIST_RULES: exit,never auid=-1 (0xffffffff) syscall=all

Tag it so that it is reported the same way it was set.  Create a new
private flags audit_krule field (pflags) to store it that won't interact with
the public one from the API.

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
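Editorial note, not part of the upstream commit: the round trip that
this patch restores can be illustrated with a small user-space model.
It is a sketch under assumptions: every model_ and MODEL_ name is
hypothetical, the numeric constants are illustrative stand-ins for the
AUDIT_* values, and a rule is reduced to a single field.

```c
#include <stdint.h>

/* Illustrative stand-ins for AUDIT_LOGINUID, AUDIT_LOGINUID_SET,
 * AUDIT_UID_UNSET and AUDIT_LOGINUID_LEGACY. */
enum { MODEL_LOGINUID = 9, MODEL_LOGINUID_SET = 24 };
#define MODEL_UID_UNSET		0xffffffffu
#define MODEL_LOGINUID_LEGACY	0x1u

struct model_field { uint32_t type; uint32_t val; };
struct model_rule  { uint32_t pflags; struct model_field f; };

/* Userspace -> kernel form, as in audit_data_to_entry(). */
static void model_data_to_entry(struct model_rule *rule,
				uint32_t type, uint32_t val)
{
	rule->pflags = 0;
	rule->f.type = type;
	rule->f.val = val;
	if (type == MODEL_LOGINUID && val == MODEL_UID_UNSET) {
		rule->f.type = MODEL_LOGINUID_SET;
		rule->f.val = 0;
		/* Remember that the legacy spelling was used. */
		rule->pflags |= MODEL_LOGINUID_LEGACY;
	}
}

/* Kernel -> userspace listing, as in audit_krule_to_data(). */
static struct model_field model_krule_to_data(const struct model_rule *rule)
{
	struct model_field out = rule->f;

	if (rule->f.type == MODEL_LOGINUID_SET &&
	    (rule->pflags & MODEL_LOGINUID_LEGACY) && !rule->f.val) {
		out.type = MODEL_LOGINUID;
		out.val = MODEL_UID_UNSET;
	}
	return out;
}
```

A rule entered in the legacy form comes back in the legacy form, while a
rule entered directly as "loginuid set == 0" carries no private flag and
is listed unchanged.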
 include/linux/audit.h |  4 ++++
 kernel/auditfilter.c  | 10 ++++++++++
 2 files changed, 14 insertions(+)

diff --git a/include/linux/audit.h b/include/linux/audit.h
index 4fb28b23a4a4..c25cb64db967 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -46,6 +46,7 @@ struct audit_tree;
 
 struct audit_krule {
 	int			vers_ops;
+	u32			pflags;
 	u32			flags;
 	u32			listnr;
 	u32			action;
@@ -63,6 +64,9 @@ struct audit_krule {
 	u64			prio;
 };
 
+/* Flag to indicate legacy AUDIT_LOGINUID unset usage */
+#define AUDIT_LOGINUID_LEGACY		0x1
+
 struct audit_field {
 	u32				type;
 	u32				val;
diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
index 8a344cebd8bf..dfd2f4af81a9 100644
--- a/kernel/auditfilter.c
+++ b/kernel/auditfilter.c
@@ -426,6 +426,7 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
 		if ((f->type == AUDIT_LOGINUID) && (f->val == AUDIT_UID_UNSET)) {
 			f->type = AUDIT_LOGINUID_SET;
 			f->val = 0;
+			entry->rule.pflags |= AUDIT_LOGINUID_LEGACY;
 		}
 
 		err = audit_field_valid(entry, f);
@@ -601,6 +602,13 @@ static struct audit_rule_data *audit_krule_to_data(struct audit_krule *krule)
 			data->buflen += data->values[i] =
 				audit_pack_string(&bufp, krule->filterkey);
 			break;
+		case AUDIT_LOGINUID_SET:
+			if (krule->pflags & AUDIT_LOGINUID_LEGACY && !f->val) {
+				data->fields[i] = AUDIT_LOGINUID;
+				data->values[i] = AUDIT_UID_UNSET;
+				break;
+			}
+			/* fallthrough if set */
 		default:
 			data->values[i] = f->val;
 		}
@@ -617,6 +625,7 @@ static int audit_compare_rule(struct audit_krule *a, struct audit_krule *b)
 	int i;
 
 	if (a->flags != b->flags ||
+	    a->pflags != b->pflags ||
 	    a->listnr != b->listnr ||
 	    a->action != b->action ||
 	    a->field_count != b->field_count)
@@ -735,6 +744,7 @@ struct audit_entry *audit_dupe_rule(struct audit_krule *old)
 	new = &entry->rule;
 	new->vers_ops = old->vers_ops;
 	new->flags = old->flags;
+	new->pflags = old->pflags;
 	new->listnr = old->listnr;
 	new->action = old->action;
 	for (i = 0; i < AUDIT_BITMASK_SIZE; i++)
-- 
2.2.1



* [PATCH 3.12 68/78] crypto: af_alg - fix backlog handling
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (66 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 67/78] audit: restore AUDIT_LOGINUID unset ABI Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 69/78] ncpfs: return proper error from NCP_IOC_SETROOT ioctl Jiri Slaby
                   ` (11 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Rabin Vincent, Herbert Xu, Jiri Slaby

From: Rabin Vincent <rabin.vincent@axis.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 7e77bdebff5cb1e9876c561f69710b9ab8fa1f7e upstream.

If a request is backlogged, its complete() handler will get called
twice: once with -EINPROGRESS, and once with the final error code.

af_alg's complete handler, unlike other users, does not handle the
-EINPROGRESS but instead always completes the completion that recvmsg()
is waiting on.  This can lead to a return to user space while the
request is still pending in the driver.  If userspace closes the sockets
before the requests are handled by the driver, this will lead to
use-after-frees (and potential crashes) in the kernel due to the tfm
having been freed.

The crashes can be easily reproduced (for example) by reducing the max
queue length in cryptd.c and running the following (from
http://www.chronox.de/libkcapi.html) on AES-NI capable hardware:

 $ while true; do kcapi -x 1 -e -c '__ecb-aes-aesni' \
    -k 00000000000000000000000000000000 \
    -p 00000000000000000000000000000000 >/dev/null & done

Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
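Editorial note, not part of the upstream commit: the double-callback
behaviour described above can be modelled in user space.  This is only a
sketch under assumptions: struct model_completion and model_complete()
are hypothetical stand-ins for struct af_alg_completion and
af_alg_complete(), and a boolean replaces the kernel completion object.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct af_alg_completion. */
struct model_completion {
	int err;	/* final status of the request */
	bool done;	/* true once the waiter may proceed */
};

/* Models the fixed handler: the -EINPROGRESS notification that a
 * backlogged request sends first must not wake the waiter. */
static void model_complete(struct model_completion *c, int err)
{
	if (err == -EINPROGRESS)
		return;

	c->err = err;
	c->done = true;
}
```

Calling model_complete() first with -EINPROGRESS and then with the final
status leaves done false after the first call and true only after the
second, so the waiter is released only once the request is really
finished.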
 crypto/af_alg.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index bf948e134981..6ef6e2ad344e 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -449,6 +449,9 @@ void af_alg_complete(struct crypto_async_request *req, int err)
 {
 	struct af_alg_completion *completion = req->data;
 
+	if (err == -EINPROGRESS)
+		return;
+
 	completion->err = err;
 	complete(&completion->completion);
 }
-- 
2.2.1



* [PATCH 3.12 69/78] ncpfs: return proper error from NCP_IOC_SETROOT ioctl
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (67 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 68/78] crypto: af_alg - fix backlog handling Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 70/78] exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting Jiri Slaby
                   ` (10 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Jan Kara, Petr Vandrovec, Andrew Morton,
	Linus Torvalds, Jiri Slaby

From: Jan Kara <jack@suse.cz>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit a682e9c28cac152e6e54c39efcf046e0c8cfcf63 upstream.

If some error happens in the NCP_IOC_SETROOT ioctl, the appropriate error
return value is then (in most cases) just overwritten before we return.
This can result in reporting success to userspace although an error happened.

This bug was introduced by commit 2e54eb96e2c8 ("BKL: Remove BKL from
ncpfs").  Propagate the errors correctly.

Coverity id: 1226925.
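For illustration only (not part of the upstream commit): a minimal userspace
sketch of the error-clobbering pattern described above. The function names
and parameters are hypothetical stand-ins, not the ncpfs code; the point is
only that an unconditional "result = 0" after the error handling hides the
failure from the caller.

```c
#include <errno.h>

/* Hypothetical stand-in for the SETROOT ioctl body; not the ncpfs code. */
static int setroot_buggy(int already_set_up, int lookup_failed)
{
	int result = 0;

	if (already_set_up) {
		result = -EBUSY;
	} else {
		if (lookup_failed)
			result = -EIO;
		result = 0;	/* bug: clobbers the -EIO set just above */
	}
	return result;
}

/* Same flow with the unconditional assignment removed. */
static int setroot_fixed(int already_set_up, int lookup_failed)
{
	int result = 0;

	if (already_set_up) {
		result = -EBUSY;
	} else {
		if (lookup_failed)
			result = -EIO;
		/* result keeps whatever the error handling stored */
	}
	return result;
}
```

With lookup_failed set, the buggy variant returns 0 while the fixed variant
returns -EIO.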

Fixes: 2e54eb96e2c80 ("BKL: Remove BKL from ncpfs")
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 fs/ncpfs/ioctl.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/ncpfs/ioctl.c b/fs/ncpfs/ioctl.c
index 60426ccb3b65..2f970de02b16 100644
--- a/fs/ncpfs/ioctl.c
+++ b/fs/ncpfs/ioctl.c
@@ -448,7 +448,6 @@ static long __ncp_ioctl(struct inode *inode, unsigned int cmd, unsigned long arg
 						result = -EIO;
 					}
 				}
-				result = 0;
 			}
 			mutex_unlock(&server->root_setup_lock);
 
-- 
2.2.1



* [PATCH 3.12 70/78] exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (68 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 69/78] ncpfs: return proper error from NCP_IOC_SETROOT ioctl Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 71/78] udf: Verify symlink size before loading it Jiri Slaby
                   ` (9 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Oleg Nesterov, Aaron Tomlin, Pavel Emelyanov,
	Serge Hallyn, Sterling Alexander, Andrew Morton, Linus Torvalds,
	Jiri Slaby

From: Oleg Nesterov <oleg@redhat.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 24c037ebf5723d4d9ab0996433cee4f96c292a4d upstream.

alloc_pid() does get_pid_ns() beforehand but forgets to put_pid_ns() if it
fails because disable_pid_allocation() was called by the exiting
child_reaper.

We could simply move get_pid_ns() down to successful return, but this fix
tries to be as trivial as possible.
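For illustration only (not part of the upstream commit): a toy userspace
model of the reference leak. struct toy_ns and the toy_* helpers are
hypothetical; they only mimic the get-early, fail-later shape of
alloc_pid(), assuming a plain integer refcount.

```c
/* Toy refcounted namespace; hypothetical, not struct pid_namespace. */
struct toy_ns {
	int refcount;
	int allocation_disabled;	/* stands in for disable_pid_allocation() */
};

static void toy_get_ns(struct toy_ns *ns) { ns->refcount++; }
static void toy_put_ns(struct toy_ns *ns) { ns->refcount--; }

/* Returns 0 on success (the reference is kept), -1 on failure. */
static int toy_alloc(struct toy_ns *ns, int fixed)
{
	toy_get_ns(ns);			/* reference taken up front */

	if (ns->allocation_disabled) {
		if (fixed)
			toy_put_ns(ns);	/* the fix: drop it on the failure path */
		return -1;
	}
	return 0;
}
```

Without the put on the failure path, every failed allocation leaves the
count one higher than it should be, so the namespace is never freed.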

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: Sterling Alexander <stalexan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 kernel/pid.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/pid.c b/kernel/pid.c
index 9b9a26698144..82430c858d69 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -341,6 +341,8 @@ out:
 
 out_unlock:
 	spin_unlock_irq(&pidmap_lock);
+	put_pid_ns(ns);
+
 out_free:
 	while (++i <= ns->level)
 		free_pidmap(pid->numbers + i);
-- 
2.2.1



* [PATCH 3.12 71/78] udf: Verify symlink size before loading it
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (69 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 70/78] exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 72/78] eCryptfs: Force RO mount when encrypted view is enabled Jiri Slaby
                   ` (8 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Jan Kara, Jiri Slaby

From: Jan Kara <jack@suse.cz>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit a1d47b262952a45aae62bd49cfaf33dd76c11a2c upstream.

The UDF specification allows arbitrarily large symlinks. However, we support
only symlinks at most one block large. Check the length of the symlink
so that we don't access memory beyond the end of the symlink block.
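For illustration only (not part of the upstream commit): a userspace sketch
of the check-before-read idea. toy_read_symlink() and its parameters are
hypothetical; it assumes the symlink body lives in a single block buffer and
that the claimed length comes from untrusted on-disk metadata.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy a symlink body of claimed length i_size out of
 * a single block of blocksize bytes. Not the udf_symlink_filler() code.
 */
static int toy_read_symlink(const unsigned char *block, size_t blocksize,
			    size_t i_size, unsigned char *out, size_t outsize)
{
	/* Reject before touching the block: longer than one block is unsupported. */
	if (i_size > blocksize || i_size >= outsize)
		return -ENAMETOOLONG;

	memcpy(out, block, i_size);
	out[i_size] = '\0';
	return 0;
}
```

The length test runs before any byte of the block is read, which is the
ordering the patch establishes.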

Reported-by: Carl Henrik Lunde <chlunde@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 fs/udf/symlink.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/fs/udf/symlink.c b/fs/udf/symlink.c
index d7c6dbe4194b..d89f324bc387 100644
--- a/fs/udf/symlink.c
+++ b/fs/udf/symlink.c
@@ -80,11 +80,17 @@ static int udf_symlink_filler(struct file *file, struct page *page)
 	struct inode *inode = page->mapping->host;
 	struct buffer_head *bh = NULL;
 	unsigned char *symlink;
-	int err = -EIO;
+	int err;
 	unsigned char *p = kmap(page);
 	struct udf_inode_info *iinfo;
 	uint32_t pos;
 
+	/* We don't support symlinks longer than one block */
+	if (inode->i_size > inode->i_sb->s_blocksize) {
+		err = -ENAMETOOLONG;
+		goto out_unmap;
+	}
+
 	iinfo = UDF_I(inode);
 	pos = udf_block_map(inode, 0);
 
@@ -94,8 +100,10 @@ static int udf_symlink_filler(struct file *file, struct page *page)
 	} else {
 		bh = sb_bread(inode->i_sb, pos);
 
-		if (!bh)
-			goto out;
+		if (!bh) {
+			err = -EIO;
+			goto out_unlock_inode;
+		}
 
 		symlink = bh->b_data;
 	}
@@ -109,9 +117,10 @@ static int udf_symlink_filler(struct file *file, struct page *page)
 	unlock_page(page);
 	return 0;
 
-out:
+out_unlock_inode:
 	up_read(&iinfo->i_data_sem);
 	SetPageError(page);
+out_unmap:
 	kunmap(page);
 	unlock_page(page);
 	return err;
-- 
2.2.1



* [PATCH 3.12 72/78] eCryptfs: Force RO mount when encrypted view is enabled
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (70 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 71/78] udf: Verify symlink size before loading it Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 73/78] eCryptfs: Remove buggy and unnecessary write in file name decode routine Jiri Slaby
                   ` (7 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Tyler Hicks, Jiri Slaby

From: Tyler Hicks <tyhicks@canonical.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 332b122d39c9cbff8b799007a825d94b2e7c12f2 upstream.

The ecryptfs_encrypted_view mount option greatly changes the
functionality of an eCryptfs mount. Instead of encrypting and decrypting
lower files, it provides a unified view of the encrypted files in the
lower filesystem. The presence of the ecryptfs_encrypted_view mount
option is intended to force a read-only mount and modifying files is not
supported when the feature is in use. See the following commit for more
information:

  e77a56d [PATCH] eCryptfs: Encrypted passthrough

This patch forces the mount to be read-only when the
ecryptfs_encrypted_view mount option is specified by setting the
MS_RDONLY flag on the superblock. Additionally, this patch removes some
broken logic in ecryptfs_open() that attempted to prevent modifications
of files when the encrypted view feature was in use. The check in
ecryptfs_open() was not sufficient to prevent file modifications using
system calls that do not operate on a file descriptor.
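For illustration only (not part of the upstream commit): a userspace sketch
of the flag computation described above. The TOY_* constants and
toy_sb_flags() are hypothetical; the real MS_* and ECRYPTFS_* values live in
kernel headers and are not reproduced here.

```c
/* Illustrative flag values only; not the kernel's constants. */
#define TOY_MS_RDONLY			0x1UL
#define TOY_MS_POSIXACL			0x2UL
#define TOY_ENCRYPTED_VIEW_ENABLED	0x1U

static unsigned long toy_sb_flags(unsigned long mount_flags,
				  unsigned long lower_flags,
				  unsigned int crypt_flags)
{
	unsigned long s_flags = mount_flags & ~TOY_MS_POSIXACL;

	/* POSIX ACL support is inherited from the lower mount. */
	s_flags |= lower_flags & TOY_MS_POSIXACL;

	/* Read-only if the lower mount is ro or encrypted view is on. */
	if ((lower_flags & TOY_MS_RDONLY) ||
	    (crypt_flags & TOY_ENCRYPTED_VIEW_ENABLED))
		s_flags |= TOY_MS_RDONLY;

	return s_flags;
}
```

Enforcing the restriction once on the superblock covers every write path,
whereas a check in open() only covers callers that go through a file
descriptor.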

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reported-by: Priya Bansal <p.bansal@samsung.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 fs/ecryptfs/file.c | 12 ------------
 fs/ecryptfs/main.c | 16 +++++++++++++---
 2 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/fs/ecryptfs/file.c b/fs/ecryptfs/file.c
index 992cf95830b5..f3fd66acae47 100644
--- a/fs/ecryptfs/file.c
+++ b/fs/ecryptfs/file.c
@@ -191,23 +191,11 @@ static int ecryptfs_open(struct inode *inode, struct file *file)
 {
 	int rc = 0;
 	struct ecryptfs_crypt_stat *crypt_stat = NULL;
-	struct ecryptfs_mount_crypt_stat *mount_crypt_stat;
 	struct dentry *ecryptfs_dentry = file->f_path.dentry;
 	/* Private value of ecryptfs_dentry allocated in
 	 * ecryptfs_lookup() */
 	struct ecryptfs_file_info *file_info;
 
-	mount_crypt_stat = &ecryptfs_superblock_to_private(
-		ecryptfs_dentry->d_sb)->mount_crypt_stat;
-	if ((mount_crypt_stat->flags & ECRYPTFS_ENCRYPTED_VIEW_ENABLED)
-	    && ((file->f_flags & O_WRONLY) || (file->f_flags & O_RDWR)
-		|| (file->f_flags & O_CREAT) || (file->f_flags & O_TRUNC)
-		|| (file->f_flags & O_APPEND))) {
-		printk(KERN_WARNING "Mount has encrypted view enabled; "
-		       "files may only be read\n");
-		rc = -EPERM;
-		goto out;
-	}
 	/* Released in ecryptfs_release or end of function if failure */
 	file_info = kmem_cache_zalloc(ecryptfs_file_info_cache, GFP_KERNEL);
 	ecryptfs_set_file_private(file, file_info);
diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
index eb1c5979ecaf..539a399b8339 100644
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -493,6 +493,7 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
 {
 	struct super_block *s;
 	struct ecryptfs_sb_info *sbi;
+	struct ecryptfs_mount_crypt_stat *mount_crypt_stat;
 	struct ecryptfs_dentry_info *root_info;
 	const char *err = "Getting sb failed";
 	struct inode *inode;
@@ -511,6 +512,7 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
 		err = "Error parsing options";
 		goto out;
 	}
+	mount_crypt_stat = &sbi->mount_crypt_stat;
 
 	s = sget(fs_type, NULL, set_anon_super, flags, NULL);
 	if (IS_ERR(s)) {
@@ -557,11 +559,19 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
 
 	/**
 	 * Set the POSIX ACL flag based on whether they're enabled in the lower
-	 * mount. Force a read-only eCryptfs mount if the lower mount is ro.
-	 * Allow a ro eCryptfs mount even when the lower mount is rw.
+	 * mount.
 	 */
 	s->s_flags = flags & ~MS_POSIXACL;
-	s->s_flags |= path.dentry->d_sb->s_flags & (MS_RDONLY | MS_POSIXACL);
+	s->s_flags |= path.dentry->d_sb->s_flags & MS_POSIXACL;
+
+	/**
+	 * Force a read-only eCryptfs mount when:
+	 *   1) The lower mount is ro
+	 *   2) The ecryptfs_encrypted_view mount option is specified
+	 */
+	if (path.dentry->d_sb->s_flags & MS_RDONLY ||
+	    mount_crypt_stat->flags & ECRYPTFS_ENCRYPTED_VIEW_ENABLED)
+		s->s_flags |= MS_RDONLY;
 
 	s->s_maxbytes = path.dentry->d_sb->s_maxbytes;
 	s->s_blocksize = path.dentry->d_sb->s_blocksize;
-- 
2.2.1



* [PATCH 3.12 73/78] eCryptfs: Remove buggy and unnecessary write in file name decode routine
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (71 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 72/78] eCryptfs: Force RO mount when encrypted view is enabled Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 74/78] Btrfs: do not move em to modified list when unpinning Jiri Slaby
                   ` (6 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Michael Halcrow, Tyler Hicks, Jiri Slaby

From: Michael Halcrow <mhalcrow@google.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 942080643bce061c3dd9d5718d3b745dcb39a8bc upstream.

Dmitry Chernenkov used KASAN to discover that eCryptfs writes past the
end of the allocated buffer during encrypted filename decoding. This
fix corrects the issue by getting rid of the unnecessary 0 write when
the current bit offset is 2.
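For illustration only (not part of the upstream commit): a simplified
userspace model of the 6-bit to 8-bit unpacking loop. toy_decode() is
hypothetical; it skips the filename reverse-map step and takes a
write_trailing_zero flag that re-creates the store this patch removes, so
the effect on a buffer sized for exactly three output bytes can be observed.

```c
#include <stddef.h>

/*
 * src holds values 0..63 already mapped from the filename alphabet.
 * Returns the number of complete bytes produced.
 */
static size_t toy_decode(unsigned char *dst, const unsigned char *src,
			 size_t src_size, int write_trailing_zero)
{
	size_t di = 0;
	int bit_offset = 0;

	for (size_t si = 0; si < src_size; si++) {
		unsigned char v = src[si] & 0x3F;

		switch (bit_offset) {
		case 0:
			dst[di] = (unsigned char)(v << 2);
			bit_offset = 6;
			break;
		case 6:
			dst[di++] |= v >> 4;
			dst[di] = (unsigned char)((v & 0xF) << 4);
			bit_offset = 4;
			break;
		case 4:
			dst[di++] |= v >> 2;
			dst[di] = (unsigned char)(v << 6);
			bit_offset = 2;
			break;
		case 2:
			dst[di++] |= v;
			if (write_trailing_zero)
				dst[di] = 0;	/* lands one byte past the group */
			bit_offset = 0;
			break;
		}
	}
	return di;
}
```

Four input symbols fill three output bytes completely, so the extra store in
the last case touches a fourth byte that the group never needed.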

Signed-off-by: Michael Halcrow <mhalcrow@google.com>
Reported-by: Dmitry Chernenkov <dmitryc@google.com>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 fs/ecryptfs/crypto.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/ecryptfs/crypto.c b/fs/ecryptfs/crypto.c
index 000eae2782b6..bf926f7a5f0c 100644
--- a/fs/ecryptfs/crypto.c
+++ b/fs/ecryptfs/crypto.c
@@ -1917,7 +1917,6 @@ ecryptfs_decode_from_filename(unsigned char *dst, size_t *dst_size,
 			break;
 		case 2:
 			dst[dst_byte_offset++] |= (src_byte);
-			dst[dst_byte_offset] = 0;
 			current_bit_offset = 0;
 			break;
 		}
-- 
2.2.1



* [PATCH 3.12 74/78] Btrfs: do not move em to modified list when unpinning
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (72 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 73/78] eCryptfs: Remove buggy and unnecessary write in file name decode routine Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 75/78] Btrfs: fix fs corruption on transaction abort if device supports discard Jiri Slaby
                   ` (5 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Josef Bacik, Chris Mason, Jiri Slaby

From: Josef Bacik <jbacik@fb.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit a28046956c71985046474283fa3bcd256915fb72 upstream.

We use the modified list to keep track of which extents have been modified so we
know which ones are candidates for logging at fsync() time.  Newly modified
extents are added to the list at modification time, around the same time the
ordered extent is created.  We do this so that we don't have to wait for ordered
extents to complete before we know what we need to log.  The problem is when
something like this happens

log extent 0-4k on inode 1
copy csum for 0-4k from ordered extent into log
sync log
commit transaction
log some other extent on inode 1
ordered extent for 0-4k completes and adds itself onto modified list again
log changed extents
see ordered extent for 0-4k has already been logged
	at this point we assume the csum has been copied
sync log
crash

On replay we will see the extent 0-4k in the log, drop the original 0-4k extent
which is the same one that we are replaying which also drops the csum, and then
we won't find the csum in the log for that bytenr.  This of course causes us to
have errors about not having csums for certain ranges of our inode.  So remove
the modified list manipulation in unpin_extent_cache, any modified extents
should have been added well before now, and we don't want them re-logged.  This
fixes my test that I could reliably reproduce this problem with.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 fs/btrfs/extent_map.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index a4a7a1a8da95..0a3809500599 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -263,8 +263,6 @@ int unpin_extent_cache(struct extent_map_tree *tree, u64 start, u64 len,
 	if (!em)
 		goto out;
 
-	if (!test_bit(EXTENT_FLAG_LOGGING, &em->flags))
-		list_move(&em->list, &tree->modified_extents);
 	em->generation = gen;
 	clear_bit(EXTENT_FLAG_PINNED, &em->flags);
 	em->mod_start = em->start;
-- 
2.2.1



* [PATCH 3.12 75/78] Btrfs: fix fs corruption on transaction abort if device supports discard
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (73 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 74/78] Btrfs: do not move em to modified list when unpinning Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 76/78] mfd: stmpe: Fix STMPE24xx GPMR LSB Jiri Slaby
                   ` (4 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Filipe Manana, Chris Mason, Jiri Slaby

From: Filipe Manana <fdmanana@suse.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 678886bdc6378c1cbd5072da2c5a3035000214e3 upstream.

When we abort a transaction we iterate over all the ranges marked as dirty
in fs_info->freed_extents[0] and fs_info->freed_extents[1], clear them
from those trees, add them back (unpin) to the free space caches and, if
the fs was mounted with "-o discard", perform a discard on those regions.
Also, after adding the regions to the free space caches, a fitrim ioctl call
can see those ranges in a block group's free space cache and perform a discard
on the ranges, so the same issue can happen without "-o discard" as well.

This causes corruption, affecting one or multiple btree nodes (in the worst
case leaving the fs unmountable) because some of those ranges (the ones in
the fs_info->pinned_extents tree) correspond to btree nodes/leaves that are
referred to by the last committed super block - breaking the rule that anything
that was committed by a transaction is untouched until the next transaction
commits successfully.

I ran into this while running in a loop (for several hours) the fstest that
I recently submitted:

  [PATCH] fstests: add btrfs test to stress chunk allocation/removal and fstrim

The corruption always happened when a transaction aborted and then fsck complained
like this:

   _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent
   *** fsck.btrfs output ***
   Check tree block failed, want=94945280, have=0
   Check tree block failed, want=94945280, have=0
   Check tree block failed, want=94945280, have=0
   Check tree block failed, want=94945280, have=0
   Check tree block failed, want=94945280, have=0
   read block failed check_tree_block
   Couldn't open file system

In this case 94945280 corresponded to the root of a tree.
Using ftrace, I observed the following sequence of steps:

   1) transaction N started, fs_info->pinned_extents pointed to
      fs_info->freed_extents[0];

   2) node/eb 94945280 is created;

   3) eb is persisted to disk;

   4) transaction N commit starts, fs_info->pinned_extents now points to
      fs_info->freed_extents[1], and transaction N completes;

   5) transaction N + 1 starts;

   6) eb is COWed, and btrfs_free_tree_block() called for this eb;

   7) eb range (94945280 to 94945280 + 16Kb) is added to
      fs_info->pinned_extents (fs_info->freed_extents[1]);

   8) Something goes wrong in transaction N + 1, like hitting ENOSPC
      for example, and the transaction is aborted, turning the fs into
      readonly mode. The stack trace I got for example:

      [112065.253935]  [<ffffffff8140c7b6>] dump_stack+0x4d/0x66
      [112065.254271]  [<ffffffff81042984>] warn_slowpath_common+0x7f/0x98
      [112065.254567]  [<ffffffffa0325990>] ? __btrfs_abort_transaction+0x50/0x10b [btrfs]
      [112065.261674]  [<ffffffff810429e5>] warn_slowpath_fmt+0x48/0x50
      [112065.261922]  [<ffffffffa032949e>] ? btrfs_free_path+0x26/0x29 [btrfs]
      [112065.262211]  [<ffffffffa0325990>] __btrfs_abort_transaction+0x50/0x10b [btrfs]
      [112065.262545]  [<ffffffffa036b1d6>] btrfs_remove_chunk+0x537/0x58b [btrfs]
      [112065.262771]  [<ffffffffa033840f>] btrfs_delete_unused_bgs+0x1de/0x21b [btrfs]
      [112065.263105]  [<ffffffffa0343106>] cleaner_kthread+0x100/0x12f [btrfs]
      (...)
      [112065.264493] ---[ end trace dd7903a975a31a08 ]---
      [112065.264673] BTRFS: error (device sdc) in btrfs_remove_chunk:2625: errno=-28 No space left
      [112065.264997] BTRFS info (device sdc): forced readonly

   9) The cleaner kthread sees that the BTRFS_FS_STATE_ERROR bit is set in
      fs_info->fs_state and calls btrfs_cleanup_transaction(), which in
      turn calls btrfs_destroy_pinned_extent();

   10) Then btrfs_destroy_pinned_extent() iterates over all the ranges
       marked as dirty in fs_info->freed_extents[], and for each one
       it calls discard, if the fs was mounted with "-o discard", and
       adds the range to the free space cache of the respective block
       group;

   11) btrfs_trim_block_group(), invoked from the fitrim ioctl code path,
       sees the free space entries and performs a discard;

   12) After an umount and mount (or fsck), our eb's location on disk was full
       of zeroes, and it should have been untouched, because it was marked as
       dirty in the fs_info->pinned_extents tree, and therefore used by the
       trees that the last committed superblock points to.

Fix this by not performing a discard and not adding the ranges to the free space
caches - it's useless from this point since the fs is now in readonly mode and
we won't write free space caches to disk anymore (otherwise we would leak space)
nor any new superblock. By not adding the ranges to the free space caches, it
prevents other code paths from allocating that space and writing to it as well,
therefore being safer and simpler.
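For illustration only (not part of the upstream commit): a toy userspace
model of the return_free_space switch. struct toy_block_group and
toy_unpin_range() are hypothetical; the sketch assumes the free space cache
can be reduced to a byte counter.

```c
#include <stdbool.h>

/* Toy block group: counts bytes handed back to the free space cache. */
struct toy_block_group {
	unsigned long long free_bytes;
};

static void toy_unpin_range(struct toy_block_group *bg,
			    unsigned long long start, unsigned long long end,
			    bool return_free_space)
{
	unsigned long long len = end + 1 - start;

	/* Normal commit: the range becomes allocatable (and trimmable). */
	if (return_free_space)
		bg->free_bytes += len;
	/* Abort path: leave it out so nothing can reuse or discard it. */
}
```

Passing false on the abort path keeps the range invisible to both the
allocator and the trim code, which is what protects the blocks still
referenced by the last committed superblock.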

This isn't a new problem, as it's been present since 2011 (git commit
acce952b0263825da32cf10489413dec78053347).

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 fs/btrfs/disk-io.c     |  6 ------
 fs/btrfs/extent-tree.c | 10 ++++++----
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 8964b59fee92..f46ad53626be 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3995,12 +3995,6 @@ again:
 		if (ret)
 			break;
 
-		/* opt_discard */
-		if (btrfs_test_opt(root, DISCARD))
-			ret = btrfs_error_discard_extent(root, start,
-							 end + 1 - start,
-							 NULL);
-
 		clear_extent_dirty(unpin, start, end, GFP_NOFS);
 		btrfs_error_unpin_extent_range(root, start, end);
 		cond_resched();
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 63ee604efa6c..b1c6e490379c 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5476,7 +5476,8 @@ void btrfs_prepare_extent_commit(struct btrfs_trans_handle *trans,
 	update_global_block_rsv(fs_info);
 }
 
-static int unpin_extent_range(struct btrfs_root *root, u64 start, u64 end)
+static int unpin_extent_range(struct btrfs_root *root, u64 start, u64 end,
+			      const bool return_free_space)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
 	struct btrfs_block_group_cache *cache = NULL;
@@ -5500,7 +5501,8 @@ static int unpin_extent_range(struct btrfs_root *root, u64 start, u64 end)
 
 		if (start < cache->last_byte_to_unpin) {
 			len = min(len, cache->last_byte_to_unpin - start);
-			btrfs_add_free_space(cache, start, len);
+			if (return_free_space)
+				btrfs_add_free_space(cache, start, len);
 		}
 
 		start += len;
@@ -5563,7 +5565,7 @@ int btrfs_finish_extent_commit(struct btrfs_trans_handle *trans,
 						   end + 1 - start, NULL);
 
 		clear_extent_dirty(unpin, start, end, GFP_NOFS);
-		unpin_extent_range(root, start, end);
+		unpin_extent_range(root, start, end, true);
 		cond_resched();
 	}
 
@@ -8809,7 +8811,7 @@ out:
 
 int btrfs_error_unpin_extent_range(struct btrfs_root *root, u64 start, u64 end)
 {
-	return unpin_extent_range(root, start, end);
+	return unpin_extent_range(root, start, end, false);
 }
 
 int btrfs_error_discard_extent(struct btrfs_root *root, u64 bytenr,
-- 
2.2.1



* [PATCH 3.12 76/78] mfd: stmpe: Fix STMPE24xx GPMR LSB
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (74 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 75/78] Btrfs: fix fs corruption on transaction abort if device supports discard Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 77/78] mfd: viperboard: Fix platform-device id collision Jiri Slaby
                   ` (3 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Linus Walleij, Lee Jones, Jiri Slaby

From: Linus Walleij <linus.walleij@linaro.org>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 871c3cf4ea7d5baf58e0a40bce7431ca5525aa2a upstream.

The least significant byte of the GPIO value read register
on the STMPE24xx series is at address 0xA4, not 0xA5. Corrected
against the datasheet and tested on STMPE2401 hardware.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/mfd/stmpe.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mfd/stmpe.h b/drivers/mfd/stmpe.h
index ff2b09ba8797..50a5c8697bf7 100644
--- a/drivers/mfd/stmpe.h
+++ b/drivers/mfd/stmpe.h
@@ -269,7 +269,7 @@ int stmpe_remove(struct stmpe *stmpe);
 #define STMPE24XX_REG_CHIP_ID		0x80
 #define STMPE24XX_REG_IEGPIOR_LSB	0x18
 #define STMPE24XX_REG_ISGPIOR_MSB	0x19
-#define STMPE24XX_REG_GPMR_LSB		0xA5
+#define STMPE24XX_REG_GPMR_LSB		0xA4
 #define STMPE24XX_REG_GPSR_LSB		0x85
 #define STMPE24XX_REG_GPCR_LSB		0x88
 #define STMPE24XX_REG_GPDR_LSB		0x8B
-- 
2.2.1



* [PATCH 3.12 77/78] mfd: viperboard: Fix platform-device id collision
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (75 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 76/78] mfd: stmpe: Fix STMPE24xx GPMR LSB Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-09 10:32 ` [PATCH 3.12 78/78] mm: let mm_find_pmd fix buggy race with THP fault Jiri Slaby
                   ` (2 subsequent siblings)
  79 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable; +Cc: linux-kernel, Johan Hovold, Lee Jones, Jiri Slaby

From: Johan Hovold <johan@kernel.org>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit b6684228726cc25551a43f5c0bd9c5f977f10f48 upstream.

Allow more than one viperboard to be connected by registering with
PLATFORM_DEVID_AUTO instead of PLATFORM_DEVID_NONE.

The subdevices are currently registered with PLATFORM_DEVID_NONE, which
will cause a name collision on the platform bus when a second viperboard
is plugged in:

viperboard 1-2.4:1.0: version 0.00 found at bus 001 address 004
------------[ cut here ]------------
WARNING: CPU: 0 PID: 181 at /home/johan/work/omicron/src/linux/fs/sysfs/dir.c:31 sysfs_warn_dup+0x74/0x84()
sysfs: cannot create duplicate filename '/bus/platform/devices/viperboard-gpio'
Modules linked in: i2c_viperboard viperboard netconsole [last unloaded: viperboard]
CPU: 0 PID: 181 Comm: bash Tainted: G        W      3.17.0-rc6 #1
[<c0016bf4>] (unwind_backtrace) from [<c0013860>] (show_stack+0x20/0x24)
[<c0013860>] (show_stack) from [<c04305f8>] (dump_stack+0x24/0x28)
[<c04305f8>] (dump_stack) from [<c0040fb4>] (warn_slowpath_common+0x80/0x98)
[<c0040fb4>] (warn_slowpath_common) from [<c004100c>] (warn_slowpath_fmt+0x40/0x48)
[<c004100c>] (warn_slowpath_fmt) from [<c016f1bc>] (sysfs_warn_dup+0x74/0x84)
[<c016f1bc>] (sysfs_warn_dup) from [<c016f548>] (sysfs_do_create_link_sd.isra.2+0xcc/0xd0)
[<c016f548>] (sysfs_do_create_link_sd.isra.2) from [<c016f588>] (sysfs_create_link+0x3c/0x48)
[<c016f588>] (sysfs_create_link) from [<c02867ec>] (bus_add_device+0x12c/0x1e0)
[<c02867ec>] (bus_add_device) from [<c0284820>] (device_add+0x410/0x584)
[<c0284820>] (device_add) from [<c0289440>] (platform_device_add+0xd8/0x26c)
[<c0289440>] (platform_device_add) from [<c02a5ae4>] (mfd_add_device+0x240/0x344)
[<c02a5ae4>] (mfd_add_device) from [<c02a5ce0>] (mfd_add_devices+0xb8/0x110)
[<c02a5ce0>] (mfd_add_devices) from [<bf00d1c8>] (vprbrd_probe+0x160/0x1b0 [viperboard])
[<bf00d1c8>] (vprbrd_probe [viperboard]) from [<c030c000>] (usb_probe_interface+0x1bc/0x2a8)
[<c030c000>] (usb_probe_interface) from [<c028768c>] (driver_probe_device+0x14c/0x3ac)
[<c028768c>] (driver_probe_device) from [<c02879e4>] (__driver_attach+0xa4/0xa8)
[<c02879e4>] (__driver_attach) from [<c0285698>] (bus_for_each_dev+0x70/0xa4)
[<c0285698>] (bus_for_each_dev) from [<c0287030>] (driver_attach+0x2c/0x30)
[<c0287030>] (driver_attach) from [<c030a288>] (usb_store_new_id+0x170/0x1ac)
[<c030a288>] (usb_store_new_id) from [<c030a2f8>] (new_id_store+0x34/0x3c)
[<c030a2f8>] (new_id_store) from [<c02853ec>] (drv_attr_store+0x30/0x3c)
[<c02853ec>] (drv_attr_store) from [<c016eaa8>] (sysfs_kf_write+0x5c/0x60)
[<c016eaa8>] (sysfs_kf_write) from [<c016dc68>] (kernfs_fop_write+0xd4/0x194)
[<c016dc68>] (kernfs_fop_write) from [<c010fe40>] (vfs_write+0xb4/0x1c0)
[<c010fe40>] (vfs_write) from [<c01104a8>] (SyS_write+0x4c/0xa0)
[<c01104a8>] (SyS_write) from [<c000f900>] (ret_fast_syscall+0x0/0x48)
---[ end trace 98e8603c22d65817 ]---
viperboard 1-2.4:1.0: Failed to add mfd devices to core.
viperboard: probe of 1-2.4:1.0 failed with error -17
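For illustration only (not part of the upstream commit): a userspace sketch
of why the two id modes behave differently. toy_device_name() and the TOY_*
constants are hypothetical; the name formats follow my understanding of how
the platform core builds device names and should be treated as an
assumption.

```c
#include <stdio.h>

#define TOY_DEVID_NONE	(-1)
#define TOY_DEVID_AUTO	(-2)

static int toy_next_auto_id;	/* stands in for the core's id allocator */

static void toy_device_name(char *buf, size_t len, const char *name, int id)
{
	if (id == TOY_DEVID_AUTO)
		snprintf(buf, len, "%s.%d.auto", name, toy_next_auto_id++);
	else if (id == TOY_DEVID_NONE)
		snprintf(buf, len, "%s", name);	/* same name every time */
	else
		snprintf(buf, len, "%s.%d", name, id);
}
```

Two devices registered without an id end up with the identical name
"viperboard-gpio", which is the duplicate sysfs entry in the trace above;
with automatic ids each registration gets a distinct suffix.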

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/mfd/viperboard.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/mfd/viperboard.c b/drivers/mfd/viperboard.c
index af2a6703f34f..7bf6dd9625b9 100644
--- a/drivers/mfd/viperboard.c
+++ b/drivers/mfd/viperboard.c
@@ -93,8 +93,9 @@ static int vprbrd_probe(struct usb_interface *interface,
 		 version >> 8, version & 0xff,
 		 vb->usb_dev->bus->busnum, vb->usb_dev->devnum);
 
-	ret = mfd_add_devices(&interface->dev, -1, vprbrd_devs,
-				ARRAY_SIZE(vprbrd_devs), NULL, 0, NULL);
+	ret = mfd_add_devices(&interface->dev, PLATFORM_DEVID_AUTO,
+				vprbrd_devs, ARRAY_SIZE(vprbrd_devs), NULL, 0,
+				NULL);
 	if (ret != 0) {
 		dev_err(&interface->dev, "Failed to add mfd devices to core.");
 		goto error;
-- 
2.2.1



* [PATCH 3.12 78/78] mm: let mm_find_pmd fix buggy race with THP fault
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (76 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 77/78] mfd: viperboard: Fix platform-device id collision Jiri Slaby
@ 2015-01-09 10:32 ` Jiri Slaby
  2015-01-10  5:01   ` Hugh Dickins
  2015-01-09 17:59 ` [PATCH 3.12 00/78] 3.12.36-stable review Guenter Roeck
  2015-01-12 18:00 ` Shuah Khan
  79 siblings, 1 reply; 89+ messages in thread
From: Jiri Slaby @ 2015-01-09 10:32 UTC (permalink / raw)
  To: stable
  Cc: linux-kernel, Hugh Dickins, Konstantin Khlebnikov, Mel Gorman,
	Bob Liu, Christoph Lameter, Dave Jones, David Rientjes,
	Andrew Morton, Linus Torvalds, Jiri Slaby

From: Hugh Dickins <hughd@google.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit f72e7dcdd25229446b102e587ef2f826f76bff28 upstream.

Trinity has reported:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
    IP: __lock_acquire (kernel/locking/lockdep.c:3070 (discriminator 1))
    CPU: 6 PID: 16173 Comm: trinity-c364 Tainted: G        W
                            3.15.0-rc1-next-20140415-sasha-00020-gaa90d09 #398
    lock_acquire (arch/x86/include/asm/current.h:14
                  kernel/locking/lockdep.c:3602)
    _raw_spin_lock (include/linux/spinlock_api_smp.h:143
                    kernel/locking/spinlock.c:151)
    remove_migration_pte (mm/migrate.c:137)
    rmap_walk (mm/rmap.c:1628 mm/rmap.c:1699)
    remove_migration_ptes (mm/migrate.c:224)
    migrate_pages (mm/migrate.c:922 mm/migrate.c:960 mm/migrate.c:1126)
    migrate_misplaced_page (mm/migrate.c:1733)
    __handle_mm_fault (mm/memory.c:3762 mm/memory.c:3812 mm/memory.c:3925)
    handle_mm_fault (mm/memory.c:3948)
    __get_user_pages (mm/memory.c:1851)
    __mlock_vma_pages_range (mm/mlock.c:255)
    __mm_populate (mm/mlock.c:711)
    SyS_mlockall (include/linux/mm.h:1799 mm/mlock.c:817 mm/mlock.c:791)

I believe this comes about because, whereas collapsing and splitting THP
functions take anon_vma lock in write mode (which excludes concurrent
rmap walks), faulting THP functions (write protection and misplaced
NUMA) do not - and mostly they do not need to.

But they do use a pmdp_clear_flush(), set_pmd_at() sequence which, for
an instant (indeed, for a long instant, given the inter-CPU TLB flush in
there), leaves *pmd neither present nor trans_huge.

Which can confuse a concurrent rmap walk, as when removing migration
ptes, seen in the dumped trace.  Although that rmap walk has a 4k page
to insert, anon_vmas containing THPs are in no way segregated from
4k-page anon_vmas, so the 4k-intent mm_find_pmd() does need to cope with
that instant when a trans_huge pmd is temporarily absent.

I don't think we need strengthen the locking at the THP end: it's easily
handled with an ACCESS_ONCE() before testing both conditions.

And since mm_find_pmd() had only one caller who wanted a THP rather than
a pmd, let's slightly repurpose it to fail when it hits a THP or
non-present pmd, and open code split_huge_page_address() again.

Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Christoph Lameter <cl@gentwo.org>
Cc: Dave Jones <davej@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 mm/huge_memory.c | 18 ++++++++++++------
 mm/ksm.c         |  1 -
 mm/migrate.c     |  2 --
 mm/rmap.c        | 12 ++++++++----
 4 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e497843f5f65..04d17ba00893 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2408,8 +2408,6 @@ static void collapse_huge_page(struct mm_struct *mm,
 	pmd = mm_find_pmd(mm, address);
 	if (!pmd)
 		goto out;
-	if (pmd_trans_huge(*pmd))
-		goto out;
 
 	anon_vma_lock_write(vma->anon_vma);
 
@@ -2508,8 +2506,6 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 	pmd = mm_find_pmd(mm, address);
 	if (!pmd)
 		goto out;
-	if (pmd_trans_huge(*pmd))
-		goto out;
 
 	memset(khugepaged_node_load, 0, sizeof(khugepaged_node_load));
 	pte = pte_offset_map_lock(mm, pmd, address, &ptl);
@@ -2863,12 +2859,22 @@ void split_huge_page_pmd_mm(struct mm_struct *mm, unsigned long address,
 static void split_huge_page_address(struct mm_struct *mm,
 				    unsigned long address)
 {
+	pgd_t *pgd;
+	pud_t *pud;
 	pmd_t *pmd;
 
 	VM_BUG_ON(!(address & ~HPAGE_PMD_MASK));
 
-	pmd = mm_find_pmd(mm, address);
-	if (!pmd)
+	pgd = pgd_offset(mm, address);
+	if (!pgd_present(*pgd))
+		return;
+
+	pud = pud_offset(pgd, address);
+	if (!pud_present(*pud))
+		return;
+
+	pmd = pmd_offset(pud, address);
+	if (!pmd_present(*pmd))
 		return;
 	/*
 	 * Caller holds the mmap_sem write mode, so a huge pmd cannot
diff --git a/mm/ksm.c b/mm/ksm.c
index c78fff1e9eae..29cbd06c4884 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -945,7 +945,6 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
 	pmd = mm_find_pmd(mm, addr);
 	if (!pmd)
 		goto out;
-	BUG_ON(pmd_trans_huge(*pmd));
 
 	mmun_start = addr;
 	mmun_end   = addr + PAGE_SIZE;
diff --git a/mm/migrate.c b/mm/migrate.c
index d5c84b0a5243..fac5fa0813c4 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -136,8 +136,6 @@ static int remove_migration_pte(struct page *new, struct vm_area_struct *vma,
 		pmd = mm_find_pmd(mm, addr);
 		if (!pmd)
 			goto out;
-		if (pmd_trans_huge(*pmd))
-			goto out;
 
 		ptep = pte_offset_map(pmd, addr);
 
diff --git a/mm/rmap.c b/mm/rmap.c
index 5b8675ccc1ef..440c71c43b8d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -571,6 +571,7 @@ pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
 	pgd_t *pgd;
 	pud_t *pud;
 	pmd_t *pmd = NULL;
+	pmd_t pmde;
 
 	pgd = pgd_offset(mm, address);
 	if (!pgd_present(*pgd))
@@ -581,7 +582,13 @@ pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
 		goto out;
 
 	pmd = pmd_offset(pud, address);
-	if (!pmd_present(*pmd))
+	/*
+	 * Some THP functions use the sequence pmdp_clear_flush(), set_pmd_at()
+	 * without holding anon_vma lock for write.  So when looking for a
+	 * genuine pmde (in which to find pte), test present and !THP together.
+	 */
+	pmde = ACCESS_ONCE(*pmd);
+	if (!pmd_present(pmde) || pmd_trans_huge(pmde))
 		pmd = NULL;
 out:
 	return pmd;
@@ -617,9 +624,6 @@ pte_t *__page_check_address(struct page *page, struct mm_struct *mm,
 	if (!pmd)
 		return NULL;
 
-	if (pmd_trans_huge(*pmd))
-		return NULL;
-
 	pte = pte_offset_map(pmd, address);
 	/* Make a quick check before getting the lock */
 	if (!sync && !pte_present(*pte)) {
-- 
2.2.1
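
A note on the mm_find_pmd() change above. Before the patch, pmd_present()
was tested inside mm_find_pmd() and pmd_trans_huge() was tested afterwards
by the caller, each with its own read of *pmd, so the two tests could fall
on either side of the pmdp_clear_flush(), set_pmd_at() window. What follows
is only a userspace sketch of the single-snapshot test, not kernel code: it
assumes a C11 relaxed atomic load as a stand-in for ACCESS_ONCE(), and the
toy_ names and flag bit positions are invented for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for a toy pmd entry (assumption, not a real layout). */
#define TOY_PRESENT ((uint64_t)1 << 0)
#define TOY_HUGE    ((uint64_t)1 << 7)

typedef _Atomic uint64_t toy_pmd_t;

/*
 * Return the slot only if it holds a present, non-huge entry, that is,
 * one under which a table of ptes can be looked up.  The entry is loaded
 * exactly once and both tests run on that snapshot, so a concurrent
 * clear-then-set cannot make the two tests see different values.
 */
toy_pmd_t *toy_find_pmd(toy_pmd_t *slot)
{
	uint64_t snap;

	assert(slot != NULL);
	snap = atomic_load_explicit(slot, memory_order_relaxed);
	if (!(snap & TOY_PRESENT) || (snap & TOY_HUGE))
		return NULL;
	return slot;
}
```

In this model a transiently cleared entry (value 0) and a huge entry both
yield NULL, which is the "fail when it hits a THP or non-present pmd"
behaviour the commit message describes.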



* Re: [PATCH 3.12 00/78] 3.12.36-stable review
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (77 preceding siblings ...)
  2015-01-09 10:32 ` [PATCH 3.12 78/78] mm: let mm_find_pmd fix buggy race with THP fault Jiri Slaby
@ 2015-01-09 17:59 ` Guenter Roeck
  2015-01-11  3:40   ` Satoru Takeuchi
  2015-01-12 18:00 ` Shuah Khan
  79 siblings, 1 reply; 89+ messages in thread
From: Guenter Roeck @ 2015-01-09 17:59 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: stable, satoru.takeuchi, shuah.kh, linux-kernel

On Fri, Jan 09, 2015 at 11:30:32AM +0100, Jiri Slaby wrote:
> This is the start of the stable review cycle for the 3.12.36 release.
> There are 78 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue Jan 13 11:29:45 CET 2015.
> Anything received after that time might be too late.
> 
Build results:
	total: 135 pass: 135 fail: 0
Qemu tests:
	total: 27 pass: 27 fail: 0

Details are available at http://server.roeck-us.net:8010/builders.

Guenter


* Re: [PATCH 3.12 78/78] mm: let mm_find_pmd fix buggy race with THP fault
  2015-01-09 10:32 ` [PATCH 3.12 78/78] mm: let mm_find_pmd fix buggy race with THP fault Jiri Slaby
@ 2015-01-10  5:01   ` Hugh Dickins
  2015-01-12 10:01     ` Jiri Slaby
  0 siblings, 1 reply; 89+ messages in thread
From: Hugh Dickins @ 2015-01-10  5:01 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: stable, Kirill A. Shutemov, linux-kernel, Hugh Dickins,
	Konstantin Khlebnikov, Mel Gorman, Bob Liu, Christoph Lameter,
	Dave Jones, David Rientjes, Andrew Morton, Linus Torvalds

On Fri, 9 Jan 2015, Jiri Slaby wrote:

> From: Hugh Dickins <hughd@google.com>
> 
> 3.12-stable review patch.  If anyone has any objections, please let me know.
> 
> ===============
> 
> commit f72e7dcdd25229446b102e587ef2f826f76bff28 upstream.
> 
> Trinity has reported:
> 
>     BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
>     IP: __lock_acquire (kernel/locking/lockdep.c:3070 (discriminator 1))
>     CPU: 6 PID: 16173 Comm: trinity-c364 Tainted: G        W
>                             3.15.0-rc1-next-20140415-sasha-00020-gaa90d09 #398
>     lock_acquire (arch/x86/include/asm/current.h:14
>                   kernel/locking/lockdep.c:3602)
>     _raw_spin_lock (include/linux/spinlock_api_smp.h:143
>                     kernel/locking/spinlock.c:151)
>     remove_migration_pte (mm/migrate.c:137)
>     rmap_walk (mm/rmap.c:1628 mm/rmap.c:1699)
>     remove_migration_ptes (mm/migrate.c:224)
>     migrate_pages (mm/migrate.c:922 mm/migrate.c:960 mm/migrate.c:1126)
>     migrate_misplaced_page (mm/migrate.c:1733)
>     __handle_mm_fault (mm/memory.c:3762 mm/memory.c:3812 mm/memory.c:3925)
>     handle_mm_fault (mm/memory.c:3948)
>     __get_user_pages (mm/memory.c:1851)
>     __mlock_vma_pages_range (mm/mlock.c:255)
>     __mm_populate (mm/mlock.c:711)
>     SyS_mlockall (include/linux/mm.h:1799 mm/mlock.c:817 mm/mlock.c:791)
> 
> I believe this comes about because, whereas collapsing and splitting THP
> functions take anon_vma lock in write mode (which excludes concurrent
> rmap walks), faulting THP functions (write protection and misplaced
> NUMA) do not - and mostly they do not need to.
> 
> But they do use a pmdp_clear_flush(), set_pmd_at() sequence which, for
> an instant (indeed, for a long instant, given the inter-CPU TLB flush in
> there), leaves *pmd neither present nor trans_huge.
> 
> Which can confuse a concurrent rmap walk, as when removing migration
> ptes, seen in the dumped trace.  Although that rmap walk has a 4k page
> to insert, anon_vmas containing THPs are in no way segregated from
> 4k-page anon_vmas, so the 4k-intent mm_find_pmd() does need to cope with
> that instant when a trans_huge pmd is temporarily absent.
> 
> I don't think we need strengthen the locking at the THP end: it's easily
> handled with an ACCESS_ONCE() before testing both conditions.
> 
> And since mm_find_pmd() had only one caller who wanted a THP rather than
> a pmd, let's slightly repurpose it to fail when it hits a THP or
> non-present pmd, and open code split_huge_page_address() again.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Konstantin Khlebnikov <koct9i@gmail.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Bob Liu <bob.liu@oracle.com>
> Cc: Christoph Lameter <cl@gentwo.org>
> Cc: Dave Jones <davej@redhat.com>
> Cc: David Rientjes <rientjes@google.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
> ---
>  mm/huge_memory.c | 18 ++++++++++++------
>  mm/ksm.c         |  1 -
>  mm/migrate.c     |  2 --
>  mm/rmap.c        | 12 ++++++++----
>  4 files changed, 20 insertions(+), 13 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index e497843f5f65..04d17ba00893 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2408,8 +2408,6 @@ static void collapse_huge_page(struct mm_struct *mm,
>  	pmd = mm_find_pmd(mm, address);
>  	if (!pmd)
>  		goto out;
> -	if (pmd_trans_huge(*pmd))
> -		goto out;
>  
>  	anon_vma_lock_write(vma->anon_vma);
>  
> @@ -2508,8 +2506,6 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
>  	pmd = mm_find_pmd(mm, address);
>  	if (!pmd)
>  		goto out;
> -	if (pmd_trans_huge(*pmd))
> -		goto out;
>  
>  	memset(khugepaged_node_load, 0, sizeof(khugepaged_node_load));
>  	pte = pte_offset_map_lock(mm, pmd, address, &ptl);
> @@ -2863,12 +2859,22 @@ void split_huge_page_pmd_mm(struct mm_struct *mm, unsigned long address,
>  static void split_huge_page_address(struct mm_struct *mm,
>  				    unsigned long address)
>  {
> +	pgd_t *pgd;
> +	pud_t *pud;
>  	pmd_t *pmd;
>  
>  	VM_BUG_ON(!(address & ~HPAGE_PMD_MASK));
>  
> -	pmd = mm_find_pmd(mm, address);
> -	if (!pmd)
> +	pgd = pgd_offset(mm, address);
> +	if (!pgd_present(*pgd))
> +		return;
> +
> +	pud = pud_offset(pgd, address);
> +	if (!pud_present(*pud))
> +		return;
> +
> +	pmd = pmd_offset(pud, address);
> +	if (!pmd_present(*pmd))
>  		return;
>  	/*
>  	 * Caller holds the mmap_sem write mode, so a huge pmd cannot
> diff --git a/mm/ksm.c b/mm/ksm.c
> index c78fff1e9eae..29cbd06c4884 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -945,7 +945,6 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
>  	pmd = mm_find_pmd(mm, addr);
>  	if (!pmd)
>  		goto out;
> -	BUG_ON(pmd_trans_huge(*pmd));
>  
>  	mmun_start = addr;
>  	mmun_end   = addr + PAGE_SIZE;
> diff --git a/mm/migrate.c b/mm/migrate.c
> index d5c84b0a5243..fac5fa0813c4 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -136,8 +136,6 @@ static int remove_migration_pte(struct page *new, struct vm_area_struct *vma,
>  		pmd = mm_find_pmd(mm, addr);
>  		if (!pmd)
>  			goto out;
> -		if (pmd_trans_huge(*pmd))
> -			goto out;
>  
>  		ptep = pte_offset_map(pmd, addr);
>  
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 5b8675ccc1ef..440c71c43b8d 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -571,6 +571,7 @@ pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
>  	pgd_t *pgd;
>  	pud_t *pud;
>  	pmd_t *pmd = NULL;
> +	pmd_t pmde;
>  
>  	pgd = pgd_offset(mm, address);
>  	if (!pgd_present(*pgd))
> @@ -581,7 +582,13 @@ pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
>  		goto out;
>  
>  	pmd = pmd_offset(pud, address);
> -	if (!pmd_present(*pmd))
> +	/*
> +	 * Some THP functions use the sequence pmdp_clear_flush(), set_pmd_at()
> +	 * without holding anon_vma lock for write.  So when looking for a
> +	 * genuine pmde (in which to find pte), test present and !THP together.
> +	 */
> +	pmde = ACCESS_ONCE(*pmd);
> +	if (!pmd_present(pmde) || pmd_trans_huge(pmde))
>  		pmd = NULL;
>  out:
>  	return pmd;
> @@ -617,9 +624,6 @@ pte_t *__page_check_address(struct page *page, struct mm_struct *mm,
>  	if (!pmd)
>  		return NULL;
>  
> -	if (pmd_trans_huge(*pmd))
> -		return NULL;
> -
>  	pte = pte_offset_map(pmd, address);
>  	/* Make a quick check before getting the lock */
>  	if (!sync && !pte_present(*pte)) {
> -- 
> 2.2.1

Fine for this to go in, but there is one catch, which I discovered when
backporting to v3.11: it needed one more hunk.  I haven't checked your
base tree, but if this applies then I believe you need it - most of the
time no problem, but it can cause page migration to fail to find a
migration entry it inserted earlier, then BUG_ON(!PageLocked(p)) in
migration_entry_to_page() soon after.  Here's what I wrote back then:

Note on rebase to v3.11: added a hunk to replace the use of mm_find_pmd()
in page_check_address_pmd().  This call had been similarly replaced by
the time of my v3.16 commit, in Kirill Shutemov's v3.15 b5a8cad376ee
("thp: close race between split and zap huge pages"): which we do not
need as such, since it's fixing v3.13 117b0791ac42 ("mm, thp: move ptl
taking inside page_check_address_pmd()"), from a split page-table-lock
series we are not backporting.  But without this additional hunk, rmap
sometimes broke when the new semantic for mm_find_pmd() was used here.

(Adding Kirill to Cc: shouldn't he have been Cc'ed already?)

Hugh
    
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1584,12 +1584,20 @@ pmd_t *page_check_address_pmd(struct page *page,
 			      unsigned long address,
 			      enum page_check_address_pmd_flag flag)
 {
+	pgd_t *pgd;
+	pud_t *pud;
 	pmd_t *pmd, *ret = NULL;
 
 	if (address & ~HPAGE_PMD_MASK)
 		goto out;
 
-	pmd = mm_find_pmd(mm, address);
+	pgd = pgd_offset(mm, address);
+	if (!pgd_present(*pgd))
+		goto out;
+	pud = pud_offset(pgd, address);
+	if (!pud_present(*pud))
+		goto out;
+	pmd = pmd_offset(pud, address);
 	if (!pmd)
 		goto out;
 	if (pmd_none(*pmd))
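
On the extra hunk above: once mm_find_pmd() returns NULL for a trans_huge
pmd, a caller that is looking for the huge pmd itself has to walk down to
the slot on its own and apply its own tests there.  The following is a toy
userspace model of the last step of such an open-coded walk, offered as a
sketch under stated assumptions rather than as the kernel's code: the shift
of 21 and the 512 entries per table match x86-64 with 4k pages, while the
toy_ names and flag bits are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PMD_SHIFT    21	/* each pmd entry covers 2MB (x86-64, 4k pages) */
#define TOY_PTRS_PER_PMD 512
#define TOY_PRESENT      ((uint64_t)1 << 0)	/* hypothetical flag bits */
#define TOY_HUGE         ((uint64_t)1 << 7)

/* Index of the pmd slot that covers this address. */
size_t toy_pmd_index(uint64_t address)
{
	return (size_t)((address >> TOY_PMD_SHIFT) & (TOY_PTRS_PER_PMD - 1));
}

/*
 * Plain address arithmetic into the table: for a valid table this always
 * yields a slot pointer.  Whether the slot is empty, huge or a pte table
 * is a property of the entry stored there, not of the pointer.
 */
uint64_t *toy_pmd_offset(uint64_t *pmd_table, uint64_t address)
{
	assert(pmd_table != NULL);
	return pmd_table + toy_pmd_index(address);
}

/* A huge-page caller applies its own test to the slot it walked to. */
bool toy_maps_huge(uint64_t *pmd_table, uint64_t address)
{
	uint64_t entry = *toy_pmd_offset(pmd_table, address);

	return (entry & TOY_PRESENT) && (entry & TOY_HUGE);
}
```

In this model a lookup helper that rejects huge entries could never serve
toy_maps_huge(), which is the situation the additional hunk avoids.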


* Re: [PATCH 3.12 11/78] perf/x86/intel: Protect LBR and extra_regs against KVM lying
  2015-01-09 10:31 ` [PATCH 3.12 11/78] perf/x86/intel: Protect LBR and extra_regs against KVM lying Jiri Slaby
@ 2015-01-10 11:24   ` Dongsu Park
  2015-01-10 11:42     ` Jiri Slaby
  0 siblings, 1 reply; 89+ messages in thread
From: Dongsu Park @ 2015-01-10 11:24 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: stable, linux-kernel, Kan Liang, Peter Zijlstra, Andi Kleen,
	Arnaldo Carvalho de Melo, Linus Torvalds, Maria Dimakopoulou,
	Mark Davies, Paul Mackerras, Stephane Eranian, Yan, Zheng,
	Ingo Molnar

Hi Jiri,

On 09.01.2015 11:31, Jiri Slaby wrote:
> From: Kan Liang <kan.liang@intel.com>
> 
> 3.12-stable review patch.  If anyone has any objections, please let me know.

Thanks for taking this patch to 3.12-stable.
I've just tested 3.12.36 from your stable-3.12-queue tree.
Unfortunately, the kernel still crashes at intel_pmu_init().

It turns out that this commit alone is not enough for fixing the bug.
Actually you also need commit c9b08884c9c98929ec2d8abafd78e89062d01ee7
("perf/x86: Correctly use FEATURE_PDCM").
Can you please pick that commit too?

Thanks,
Dongsu

p.s. In contrast, that commit didn't have to be backported to kernel
3.14.27, as it was already included in the 3.14 tree. So the situation
was slightly different between 3.12 and 3.14. This also explains why I
haven't been able to get it working in 3.10 so far.
I'll send a separate mail to stable@.

> ===============
> 
> commit 338b522ca43cfd32d11a370f4203bcd089c6c877 upstream.
> 
> With -cpu host, KVM reports LBR and extra_regs support, if the host has
> support.
> 
> When the guest perf driver tries to access LBR or extra_regs MSR,
> it #GPs all MSR accesses, since KVM doesn't handle LBR and extra_regs support.
> So check the related MSRs access right once at initialization time to avoid
> the error access at runtime.
> 
> For reproducing the issue, please build the kernel with CONFIG_KVM_INTEL = y
> (for host kernel).
> And CONFIG_PARAVIRT = n and CONFIG_KVM_GUEST = n (for guest kernel).
> Start the guest with -cpu host.
> Run perf record with --branch-any or --branch-filter in guest to trigger LBR
> Run perf stat offcore events (E.g. LLC-loads/LLC-load-misses ...) in guest to
> trigger offcore_rsp #GP
> 
> Signed-off-by: Kan Liang <kan.liang@intel.com>
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
> Cc: Mark Davies <junk@eslaf.co.uk>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Stephane Eranian <eranian@google.com>
> Cc: Yan, Zheng <zheng.z.yan@intel.com>
> Link: http://lkml.kernel.org/r/1405365957-20202-1-git-send-email-kan.liang@intel.com
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
> ---
>  arch/x86/kernel/cpu/perf_event.c       |  3 ++
>  arch/x86/kernel/cpu/perf_event.h       | 12 ++++---
>  arch/x86/kernel/cpu/perf_event_intel.c | 66 +++++++++++++++++++++++++++++++++-
>  3 files changed, 75 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index 5edd3c0b437a..c7106f116fb0 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -118,6 +118,9 @@ static int x86_pmu_extra_regs(u64 config, struct perf_event *event)
>  			continue;
>  		if (event->attr.config1 & ~er->valid_mask)
>  			return -EINVAL;
> +		/* Check if the extra msrs can be safely accessed*/
> +		if (!er->extra_msr_access)
> +			return -ENXIO;
>  
>  		reg->idx = er->idx;
>  		reg->config = event->attr.config1;
> diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
> index cc16faae0538..53bd2726f4cd 100644
> --- a/arch/x86/kernel/cpu/perf_event.h
> +++ b/arch/x86/kernel/cpu/perf_event.h
> @@ -279,14 +279,16 @@ struct extra_reg {
>  	u64			config_mask;
>  	u64			valid_mask;
>  	int			idx;  /* per_xxx->regs[] reg index */
> +	bool			extra_msr_access;
>  };
>  
>  #define EVENT_EXTRA_REG(e, ms, m, vm, i) {	\
> -	.event = (e),		\
> -	.msr = (ms),		\
> -	.config_mask = (m),	\
> -	.valid_mask = (vm),	\
> -	.idx = EXTRA_REG_##i,	\
> +	.event = (e),			\
> +	.msr = (ms),			\
> +	.config_mask = (m),		\
> +	.valid_mask = (vm),		\
> +	.idx = EXTRA_REG_##i,		\
> +	.extra_msr_access = true,	\
>  	}
>  
>  #define INTEL_EVENT_EXTRA_REG(event, msr, vm, idx)	\
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index 959bbf204dae..02554ddf8481 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -2144,6 +2144,41 @@ static void intel_snb_check_microcode(void)
>  	}
>  }
>  
> +/*
> + * Under certain circumstances, access certain MSR may cause #GP.
> + * The function tests if the input MSR can be safely accessed.
> + */
> +static bool check_msr(unsigned long msr, u64 mask)
> +{
> +	u64 val_old, val_new, val_tmp;
> +
> +	/*
> +	 * Read the current value, change it and read it back to see if it
> +	 * matches, this is needed to detect certain hardware emulators
> +	 * (qemu/kvm) that don't trap on the MSR access and always return 0s.
> +	 */
> +	if (rdmsrl_safe(msr, &val_old))
> +		return false;
> +
> +	/*
> +	 * Only change the bits which can be updated by wrmsrl.
> +	 */
> +	val_tmp = val_old ^ mask;
> +	if (wrmsrl_safe(msr, val_tmp) ||
> +	    rdmsrl_safe(msr, &val_new))
> +		return false;
> +
> +	if (val_new != val_tmp)
> +		return false;
> +
> +	/* Here it's sure that the MSR can be safely accessed.
> +	 * Restore the old value and return.
> +	 */
> +	wrmsrl(msr, val_old);
> +
> +	return true;
> +}
> +
>  static __init void intel_sandybridge_quirk(void)
>  {
>  	x86_pmu.check_microcode = intel_snb_check_microcode;
> @@ -2207,7 +2242,8 @@ __init int intel_pmu_init(void)
>  	union cpuid10_ebx ebx;
>  	struct event_constraint *c;
>  	unsigned int unused;
> -	int version;
> +	struct extra_reg *er;
> +	int version, i;
>  
>  	if (!cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
>  		switch (boot_cpu_data.x86) {
> @@ -2515,6 +2551,34 @@ __init int intel_pmu_init(void)
>  		}
>  	}
>  
> +	/*
> +	 * Access LBR MSR may cause #GP under certain circumstances.
> +	 * E.g. KVM doesn't support LBR MSR
> +	 * Check all LBT MSR here.
> +	 * Disable LBR access if any LBR MSRs can not be accessed.
> +	 */
> +	if (x86_pmu.lbr_nr && !check_msr(x86_pmu.lbr_tos, 0x3UL))
> +		x86_pmu.lbr_nr = 0;
> +	for (i = 0; i < x86_pmu.lbr_nr; i++) {
> +		if (!(check_msr(x86_pmu.lbr_from + i, 0xffffUL) &&
> +		      check_msr(x86_pmu.lbr_to + i, 0xffffUL)))
> +			x86_pmu.lbr_nr = 0;
> +	}
> +
> +	/*
> +	 * Access extra MSR may cause #GP under certain circumstances.
> +	 * E.g. KVM doesn't support offcore event
> +	 * Check all extra_regs here.
> +	 */
> +	if (x86_pmu.extra_regs) {
> +		for (er = x86_pmu.extra_regs; er->msr; er++) {
> +			er->extra_msr_access = check_msr(er->msr, 0x1ffUL);
> +			/* Disable LBR select mapping */
> +			if ((er->idx == EXTRA_REG_LBR) && !er->extra_msr_access)
> +				x86_pmu.lbr_sel_map = NULL;
> +		}
> +	}
> +
>  	/* Support full width counters using alternative MSR range */
>  	if (x86_pmu.intel_cap.full_width_write) {
>  		x86_pmu.max_period = x86_pmu.cntval_mask;
> -- 
> 2.2.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe stable" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
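
The check_msr() probe in the quoted patch reads the MSR, flips the bits
given by the mask, writes the result, reads it back, and restores the old
value.  Below is a userspace model of that sequence, meant as a minimal
sketch only: struct toy_msr_ops and the three backends are hypothetical
stand-ins for rdmsrl_safe()/wrmsrl_safe() on real hardware, on an emulator
that accepts writes but always reads back zero, and on an access that
faults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical MSR backend: callbacks return 0 on success, nonzero on fault. */
struct toy_msr_ops {
	int (*read)(void *ctx, uint64_t *val);
	int (*write)(void *ctx, uint64_t val);
	void *ctx;
};

/* Same steps as the quoted check_msr(): read, toggle, write, read back. */
bool toy_check_msr(const struct toy_msr_ops *ops, uint64_t mask)
{
	uint64_t val_old, val_new, val_tmp;

	assert(ops && ops->read && ops->write);
	if (ops->read(ops->ctx, &val_old))
		return false;

	/* Only toggle the bits the caller says are writable. */
	val_tmp = val_old ^ mask;
	if (ops->write(ops->ctx, val_tmp) || ops->read(ops->ctx, &val_new))
		return false;

	/* An emulator that silently drops the write is caught here. */
	if (val_new != val_tmp)
		return false;

	/* The register behaves like a register: restore the old value. */
	ops->write(ops->ctx, val_old);
	return true;
}

/* Backend 1: behaves like a real register backed by *ctx. */
int toy_real_read(void *ctx, uint64_t *val) { *val = *(uint64_t *)ctx; return 0; }
int toy_real_write(void *ctx, uint64_t val) { *(uint64_t *)ctx = val; return 0; }

/* Backend 2: never faults, drops writes, always reads back 0. */
int toy_zero_read(void *ctx, uint64_t *val) { (void)ctx; *val = 0; return 0; }
int toy_drop_write(void *ctx, uint64_t val) { (void)ctx; (void)val; return 0; }

/* Backend 3: every read faults. */
int toy_fault_read(void *ctx, uint64_t *val) { (void)ctx; (void)val; return -1; }
```

With a nonzero mask, only the first backend passes the probe in this model,
and its register holds the original value again afterwards.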


* Re: [PATCH 3.12 11/78] perf/x86/intel: Protect LBR and extra_regs against KVM lying
  2015-01-10 11:24   ` Dongsu Park
@ 2015-01-10 11:42     ` Jiri Slaby
  0 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-10 11:42 UTC (permalink / raw)
  To: Dongsu Park
  Cc: stable, linux-kernel, Kan Liang, Peter Zijlstra, Andi Kleen,
	Arnaldo Carvalho de Melo, Linus Torvalds, Maria Dimakopoulou,
	Mark Davies, Paul Mackerras, Stephane Eranian, Yan, Zheng,
	Ingo Molnar

On 01/10/2015, 12:24 PM, Dongsu Park wrote:
> Hi Jiri,
> 
> On 09.01.2015 11:31, Jiri Slaby wrote:
>> From: Kan Liang <kan.liang@intel.com>
>>
>> 3.12-stable review patch.  If anyone has any objections, please let me know.
> 
> Thanks for taking this patch to 3.12-stable.
> I've just tested 3.12.36 from your stable-3.12-queue tree.
> Unfortunately, the kernel still crashes at intel_pmu_init().
> 
> It turns out that this commit alone is not enough for fixing the bug.
> Actually you also need commit c9b08884c9c98929ec2d8abafd78e89062d01ee7
> ("perf/x86: Correctly use FEATURE_PDCM").
> Can you please pick that commit too?

Now applied, thanks.


-- 
js
suse labs


* Re: [PATCH 3.12 00/78] 3.12.36-stable review
  2015-01-09 17:59 ` [PATCH 3.12 00/78] 3.12.36-stable review Guenter Roeck
@ 2015-01-11  3:40   ` Satoru Takeuchi
  2015-01-12 10:35     ` Jiri Slaby
  0 siblings, 1 reply; 89+ messages in thread
From: Satoru Takeuchi @ 2015-01-11  3:40 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Jiri Slaby, stable, satoru.takeuchi, shuah.kh, linux-kernel

At Fri, 9 Jan 2015 09:59:10 -0800,
Guenter Roeck wrote:
> 
> On Fri, Jan 09, 2015 at 11:30:32AM +0100, Jiri Slaby wrote:
> > This is the start of the stable review cycle for the 3.12.36 release.
> > There are 78 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Tue Jan 13 11:29:45 CET 2015.
> > Anything received after that time might be too late.
> > 
> Build results:
> 	total: 135 pass: 135 fail: 0
> Qemu tests:
> 	total: 27 pass: 27 fail: 0
> 
> Details are available at http://server.roeck-us.net:8010/builders.
> 
> Guenter

Plus, this kernel passed my test.

 - Test Cases:
   - Build this kernel.
   - Boot this kernel.
   - Build the latest mainline kernel with this kernel.

 - Test Tool:
   https://github.com/satoru-takeuchi/test-linux-stable

 - Test Result (kernel .config, ktest config and test log):
   http://satoru-takeuchi.org/test-linux-stable/results/<version>-<test datetime>.tar.xz

 - Build Environment:
   - OS: Debian Jessie x86_64
   - CPU: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz x 4
   - memory: 8GB

 - Test Target Environment:
   - Debian Jessie x86_64 (KVM guest on the Build Environment)
   - # of vCPU: 2
   - memory: 2GB

Thanks,
Satoru


* Re: [PATCH 3.12 78/78] mm: let mm_find_pmd fix buggy race with THP fault
  2015-01-10  5:01   ` Hugh Dickins
@ 2015-01-12 10:01     ` Jiri Slaby
  2015-01-12 11:13       ` Kirill A. Shutemov
  0 siblings, 1 reply; 89+ messages in thread
From: Jiri Slaby @ 2015-01-12 10:01 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: stable, Kirill A. Shutemov, linux-kernel, Konstantin Khlebnikov,
	Mel Gorman, Bob Liu, Christoph Lameter, Dave Jones,
	David Rientjes, Andrew Morton, Linus Torvalds

[-- Attachment #1: Type: text/plain, Size: 2365 bytes --]

On 01/10/2015, 06:01 AM, Hugh Dickins wrote:
> On Fri, 9 Jan 2015, Jiri Slaby wrote:
> 
>> From: Hugh Dickins <hughd@google.com>
>>
>> 3.12-stable review patch.  If anyone has any objections, please let me know.
>>
>> ===============
>>
>> commit f72e7dcdd25229446b102e587ef2f826f76bff28 upstream.
...
> Fine for this to go in, but there is one catch, which I discovered when
> backporting to v3.11: it needed one more hunk.  I haven't checked your
> base tree, but if this applies then I believe you need it - most of the
> time no problem, but it can cause page migration to fail to find a
> migration entry it inserted earlier, then BUG_ON(!PageLocked(p)) in
> migration_entry_to_page() soon after.  Here's what I wrote back then:
> 
> Note on rebase to v3.11: added a hunk to replace the use of mm_find_pmd()
> in page_check_address_pmd().  This call had been similarly replaced by
> the time of my v3.16 commit, in Kirill Shutemov's v3.15 b5a8cad376ee
> ("thp: close race between split and zap huge pages"): which we do not
> need as such, since it's fixing v3.13 117b0791ac42 ("mm, thp: move ptl
> taking inside page_check_address_pmd()"), from a split page-table-lock
> series we are not backporting.  But without this additional hunk, rmap
> sometimes broke when the new semantic for mm_find_pmd() was used here.
> 
> (Adding Kirill to Cc: shouldn't he have been Cc'ed already?)
> 
> Hugh

Thanks, I see. So there are two differences between the hunk below and
117b0791ac42:

> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1584,12 +1584,20 @@ pmd_t *page_check_address_pmd(struct page *page,
>  			      unsigned long address,
>  			      enum page_check_address_pmd_flag flag)
>  {
> +	pgd_t *pgd;
> +	pud_t *pud;
>  	pmd_t *pmd, *ret = NULL;
>  
>  	if (address & ~HPAGE_PMD_MASK)
>  		goto out;
>  
> -	pmd = mm_find_pmd(mm, address);
> +	pgd = pgd_offset(mm, address);
> +	if (!pgd_present(*pgd))
> +		goto out;
> +	pud = pud_offset(pgd, address);
> +	if (!pud_present(*pud))
> +		goto out;
> +	pmd = pmd_offset(pud, address);
>  	if (!pmd)
>  		goto out;

This check is removed by 117b0791ac42. Can the pmd returned from
pmd_offset() actually be NULL?

>  	if (pmd_none(*pmd))

pmd_none() is replaced by !pmd_present().

My question is: is it OK to take the backport of 117b0791ac42 attached
(to stay with what upstream has)?

thanks,
-- 
js
suse labs

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-thp-close-race-between-split-and-zap-huge-pages.patch --]
[-- Type: text/x-patch; name="0001-thp-close-race-between-split-and-zap-huge-pages.patch", Size: 4479 bytes --]

From f43340a2b0a461572ed53284148f9eb67d93733b Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Fri, 18 Apr 2014 15:07:25 -0700
Subject: [PATCH 1/1] thp: close race between split and zap huge pages

commit b5a8cad376eebbd8598642697e92a27983aee802 upstream.

Sasha Levin has reported two THP BUGs[1][2].  I believe both of them
have the same root cause.  Let's look at them one by one.

The first bug[1] is "kernel BUG at mm/huge_memory.c:1829!".  It's
BUG_ON(mapcount != page_mapcount(page)) in __split_huge_page().  From my
testing I see that page_mapcount() is higher than mapcount here.

I think it happens due to a race between zap_huge_pmd() and
page_check_address_pmd().  page_check_address_pmd() misses a PMD which
is under zap:

	CPU0						CPU1
						zap_huge_pmd()
						  pmdp_get_and_clear()
__split_huge_page()
  anon_vma_interval_tree_foreach()
    __split_huge_page_splitting()
      page_check_address_pmd()
        mm_find_pmd()
	  /*
	   * We check if PMD present without taking ptl: no
	   * serialization against zap_huge_pmd(). We miss this PMD,
	   * it's not accounted to 'mapcount' in __split_huge_page().
	   */
	  pmd_present(pmd) == 0

  BUG_ON(mapcount != page_mapcount(page)) // CRASH!!!

						  page_remove_rmap(page)
						    atomic_add_negative(-1, &page->_mapcount)

The second bug[2] is "kernel BUG at mm/huge_memory.c:1371!".
It's VM_BUG_ON_PAGE(!PageHead(page), page) in zap_huge_pmd().

This happens in a similar way:

	CPU0						CPU1
						zap_huge_pmd()
						  pmdp_get_and_clear()
						  page_remove_rmap(page)
						    atomic_add_negative(-1, &page->_mapcount)
__split_huge_page()
  anon_vma_interval_tree_foreach()
    __split_huge_page_splitting()
      page_check_address_pmd()
        mm_find_pmd()
	  pmd_present(pmd) == 0	/* The same comment as above */
  /*
   * No crash this time since we already decremented page->_mapcount in
   * zap_huge_pmd().
   */
  BUG_ON(mapcount != page_mapcount(page))

  /*
   * We split the compound page here into small pages without
   * serialization against zap_huge_pmd()
   */
  __split_huge_page_refcount()
						VM_BUG_ON_PAGE(!PageHead(page), page); // CRASH!!!

So my understanding is that the problem is the pmd_present() check in
mm_find_pmd() without taking the page table lock.

The bug was introduced by my commit 117b0791ac42. Sorry for
that. :(

Let's open code mm_find_pmd() in page_check_address_pmd() and do the
check under page table lock.

Note that __page_check_address() does the same for PTE entries
if sync != 0.

I've stress tested split and zap code paths for 36+ hours by now and
don't see crashes with the patch applied. Before, it took <20 min to
trigger the first bug and a few hours for the second one (if we ignore
the first).

[1] https://lkml.kernel.org/g/<53440991.9090001@oracle.com>
[2] https://lkml.kernel.org/g/<5310C56C.60709@oracle.com>

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Bob Liu <lliubbo@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michel Lespinasse <walken@google.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>	[3.13+]

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 mm/huge_memory.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 04d17ba00893..04535b64119c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1541,15 +1541,22 @@ pmd_t *page_check_address_pmd(struct page *page,
 			      unsigned long address,
 			      enum page_check_address_pmd_flag flag)
 {
+	pgd_t *pgd;
+	pud_t *pud;
 	pmd_t *pmd, *ret = NULL;
 
 	if (address & ~HPAGE_PMD_MASK)
 		goto out;
 
-	pmd = mm_find_pmd(mm, address);
-	if (!pmd)
+	pgd = pgd_offset(mm, address);
+	if (!pgd_present(*pgd))
 		goto out;
-	if (pmd_none(*pmd))
+	pud = pud_offset(pgd, address);
+	if (!pud_present(*pud))
+		goto out;
+	pmd = pmd_offset(pud, address);
+
+	if (!pmd_present(*pmd))
 		goto out;
 	if (pmd_page(*pmd) != page)
 		goto out;
-- 
2.2.1



* Re: [PATCH 3.12 00/78] 3.12.36-stable review
  2015-01-11  3:40   ` Satoru Takeuchi
@ 2015-01-12 10:35     ` Jiri Slaby
  0 siblings, 0 replies; 89+ messages in thread
From: Jiri Slaby @ 2015-01-12 10:35 UTC (permalink / raw)
  To: Satoru Takeuchi, Guenter Roeck; +Cc: stable, shuah.kh, linux-kernel

On 01/11/2015, 04:40 AM, Satoru Takeuchi wrote:
> At Fri, 9 Jan 2015 09:59:10 -0800,
> Guenter Roeck wrote:
>>
>> On Fri, Jan 09, 2015 at 11:30:32AM +0100, Jiri Slaby wrote:
>>> This is the start of the stable review cycle for the 3.12.36 release.
>>> There are 78 patches in this series, all will be posted as a response
>>> to this one.  If anyone has any issues with these being applied, please
>>> let me know.
>>>
>>> Responses should be made by Tue Jan 13 11:29:45 CET 2015.
>>> Anything received after that time might be too late.
>>>
>> Build results:
>> 	total: 135 pass: 135 fail: 0
>> Qemu tests:
>> 	total: 27 pass: 27 fail: 0
>>
>> Details are available at http://server.roeck-us.net:8010/builders.
> 
> Plus, this kernel passed my test.


Thank you both!


-- 
js
suse labs


* Re: [PATCH 3.12 78/78] mm: let mm_find_pmd fix buggy race with THP fault
  2015-01-12 10:01     ` Jiri Slaby
@ 2015-01-12 11:13       ` Kirill A. Shutemov
  2015-01-12 23:13         ` Hugh Dickins
  0 siblings, 1 reply; 89+ messages in thread
From: Kirill A. Shutemov @ 2015-01-12 11:13 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Hugh Dickins, stable, Kirill A. Shutemov, linux-kernel,
	Konstantin Khlebnikov, Mel Gorman, Bob Liu, Christoph Lameter,
	Dave Jones, David Rientjes, Andrew Morton, Linus Torvalds

On Mon, Jan 12, 2015 at 11:01:46AM +0100, Jiri Slaby wrote:
> On 01/10/2015, 06:01 AM, Hugh Dickins wrote:
> > On Fri, 9 Jan 2015, Jiri Slaby wrote:
> > 
> >> From: Hugh Dickins <hughd@google.com>
> >>
> >> 3.12-stable review patch.  If anyone has any objections, please let me know.
> >>
> >> ===============
> >>
> >> commit f72e7dcdd25229446b102e587ef2f826f76bff28 upstream.
> ...
> > Fine for this to go in, but there is one catch, which I discovered when
> > backporting to v3.11: it needed one more hunk.  I haven't checked your
> > base tree, but if this applies then I believe you need it - most of the
> > time no problem, but it can cause page migration to fail to find a
> > migration entry it inserted earlier, then BUG_ON(!PageLocked(p)) in
> > migration_entry_to_page() soon after.  Here's what I wrote back then:
> > 
> > Note on rebase to v3.11: added a hunk to replace the use of mm_find_pmd()
> > in page_check_address_pmd().  This call had been similarly replaced by
> > the time of my v3.16 commit, in Kirill Shutemov's v3.15 b5a8cad376ee
> > ("thp: close race between split and zap huge pages"): which we do not
> > need as such, since it's fixing v3.13 117b0791ac42 ("mm, thp: move ptl
> > taking inside page_check_address_pmd()"), from a split page-table-lock
> > series we are not backporting.  But without this additional hunk, rmap
> > sometimes broke when the new semantic for mm_find_pmd() was used here.
> > 
> > (Adding Kirill to Cc: shouldn't he have been Cc'ed already?)
> > 
> > Hugh
> 
> Thanks, I see. So the diff between the hunk below and 117b0791ac42 are
> two things:
> 
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -1584,12 +1584,20 @@ pmd_t *page_check_address_pmd(struct page *page,
> >  			      unsigned long address,
> >  			      enum page_check_address_pmd_flag flag)
> >  {
> > +	pgd_t *pgd;
> > +	pud_t *pud;
> >  	pmd_t *pmd, *ret = NULL;
> >  
> >  	if (address & ~HPAGE_PMD_MASK)
> >  		goto out;
> >  
> > -	pmd = mm_find_pmd(mm, address);
> > +	pgd = pgd_offset(mm, address);
> > +	if (!pgd_present(*pgd))
> > +		goto out;
> > +	pud = pud_offset(pgd, address);
> > +	if (!pud_present(*pud))
> > +		goto out;
> > +	pmd = pmd_offset(pud, address);
> >  	if (!pmd)
> >  		goto out;
> 
> This check is removed by 117b0791ac42. Can actually pmd returned from
> pmd_offset be NULL?

[ I believe, you mean by b5a8cad376ee, right? ]

No, pmd cannot be NULL here, if pud is present and valid (pud_page_vaddr()
is not NULL).
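
To illustrate why (a toy userspace sketch, not the kernel's
implementation): pmd_offset() is just pointer arithmetic on the pmd
table that the pud points to, so once the pud is known to be present the
result cannot be NULL. The toy_* names, the 512-entry table and the
21-bit shift below are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PTRS_PER_PMD 512
#define TOY_PMD_SHIFT    21

typedef struct { uint64_t val; } toy_pmd_t;
/* A NULL table models a pud that is not present. */
typedef struct { toy_pmd_t *pmd_table; } toy_pud_t;

static int toy_pud_present(toy_pud_t pud)
{
	return pud.pmd_table != NULL;
}

/* Base of the pmd table plus an index derived from the address. */
static toy_pmd_t *toy_pmd_offset(toy_pud_t *pud, uint64_t addr)
{
	size_t idx = (addr >> TOY_PMD_SHIFT) & (TOY_PTRS_PER_PMD - 1);

	assert(pud->pmd_table != NULL);	/* caller checked presence */
	return pud->pmd_table + idx;
}

/* The only NULL return comes from the presence check, never from the offset. */
static toy_pmd_t *toy_walk(toy_pud_t *pud, uint64_t addr)
{
	if (!toy_pud_present(*pud))
		return NULL;
	return toy_pmd_offset(pud, addr);
}
```

In this model a "!pmd" test after toy_pmd_offset() could never fire,
which matches the point that the test is redundant.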

> 
> >  	if (pmd_none(*pmd))
> 
> pmd_none() is replaced by !pmd_present().

Both pmd_none() and !pmd_present() would work. pmd_none() can be slightly
faster.
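
A toy sketch of how the two predicates relate, assuming a single
"present" bit (real architectures may test additional bits in
pmd_present(), so this is only a model; all toy_* names are
hypothetical):

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_PRESENT 0x1ull

typedef struct { uint64_t val; } toy_pmd_t;

/* "none": the entry is completely empty; a plain compare against zero. */
static int toy_pmd_none(toy_pmd_t pmd)
{
	return pmd.val == 0;
}

/* "present": the entry has the present bit set; a mask and test. */
static int toy_pmd_present(toy_pmd_t pmd)
{
	return (pmd.val & TOY_PAGE_PRESENT) != 0;
}

/* An empty entry is never present; the converse does not hold. */
static void toy_check_implication(toy_pmd_t pmd)
{
	if (toy_pmd_none(pmd))
		assert(!toy_pmd_present(pmd));
}
```

The two checks only disagree for an entry that is non-zero but has the
present bit clear: such an entry is neither "none" nor "present".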

> My question is: is it OK to take the backport of 117b0791ac42 attached
> (to stay with what upstream has)?

The commit message would be totally misleading, since the fixed bug is not
present in v3.12. It's better to fold the patch into "mm: let mm_find_pmd
fix buggy race with THP".

-- 
 Kirill A. Shutemov


* Re: [PATCH 3.12 00/78] 3.12.36-stable review
  2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
                   ` (78 preceding siblings ...)
  2015-01-09 17:59 ` [PATCH 3.12 00/78] 3.12.36-stable review Guenter Roeck
@ 2015-01-12 18:00 ` Shuah Khan
  79 siblings, 0 replies; 89+ messages in thread
From: Shuah Khan @ 2015-01-12 18:00 UTC (permalink / raw)
  To: Jiri Slaby, stable; +Cc: linux, satoru.takeuchi, shuah.kh, linux-kernel

On 01/09/2015 03:30 AM, Jiri Slaby wrote:
> This is the start of the stable review cycle for the 3.12.36 release.
> There are 78 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue Jan 13 11:29:45 CET 2015.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	http://kernel.org/pub/linux/kernel/people/jirislaby/stable-review/patch-3.12.36-rc1.xz
> and the diffstat can be found below.
> 
> thanks,
> js
> 

Compiled and booted on my test system. No dmesg regressions.

-- Shuah


-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America (Silicon Valley)
shuahkh@osg.samsung.com | (970) 217-8978


* Re: [PATCH 3.12 78/78] mm: let mm_find_pmd fix buggy race with THP fault
  2015-01-12 11:13       ` Kirill A. Shutemov
@ 2015-01-12 23:13         ` Hugh Dickins
  0 siblings, 0 replies; 89+ messages in thread
From: Hugh Dickins @ 2015-01-12 23:13 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Kirill A. Shutemov, Hugh Dickins, stable, Kirill A. Shutemov,
	linux-kernel, Konstantin Khlebnikov, Mel Gorman, Bob Liu,
	Christoph Lameter, Dave Jones, David Rientjes, Andrew Morton,
	Linus Torvalds

On Mon, 12 Jan 2015, Kirill A. Shutemov wrote:
> On Mon, Jan 12, 2015 at 11:01:46AM +0100, Jiri Slaby wrote:
> > On 01/10/2015, 06:01 AM, Hugh Dickins wrote:
> > > On Fri, 9 Jan 2015, Jiri Slaby wrote:
> > > 
> > >> From: Hugh Dickins <hughd@google.com>
> > >>
> > >> 3.12-stable review patch.  If anyone has any objections, please let me know.
> > >>
> > >> ===============
> > >>
> > >> commit f72e7dcdd25229446b102e587ef2f826f76bff28 upstream.
> > ...
> > > Fine for this to go in, but there is one catch, which I discovered when
> > > backporting to v3.11: it needed one more hunk.  I haven't checked your
> > > base tree, but if this applies then I believe you need it - most of the
> > > > time no problem, but it can cause page migration to fail to find a
> > > migration entry it inserted earlier, then BUG_ON(!PageLocked(p)) in
> > > migration_entry_to_page() soon after.  Here's what I wrote back then:
> > > 
> > > Note on rebase to v3.11: added a hunk to replace the use of mm_find_pmd()
> > > in page_check_address_pmd().  This call had been similarly replaced by
> > > the time of my v3.16 commit, in Kirill Shutemov's v3.15 b5a8cad376ee
> > > ("thp: close race between split and zap huge pages"): which we do not
> > > need as such, since it's fixing v3.13 117b0791ac42 ("mm, thp: move ptl
> > > taking inside page_check_address_pmd()"), from a split page-table-lock
> > > series we are not backporting.  But without this additional hunk, rmap
> > > sometimes broke when the new semantic for mm_find_pmd() was used here.
> > > 
> > > (Adding Kirill to Cc: shouldn't he have been Cc'ed already?)
> > > 
> > > Hugh
> > 
> > Thanks, I see. So the diff between the hunk below and 117b0791ac42 are
> > two things:
> > 
> > > --- a/mm/huge_memory.c
> > > +++ b/mm/huge_memory.c
> > > @@ -1584,12 +1584,20 @@ pmd_t *page_check_address_pmd(struct page *page,
> > >  			      unsigned long address,
> > >  			      enum page_check_address_pmd_flag flag)
> > >  {
> > > +	pgd_t *pgd;
> > > +	pud_t *pud;
> > >  	pmd_t *pmd, *ret = NULL;
> > >  
> > >  	if (address & ~HPAGE_PMD_MASK)
> > >  		goto out;
> > >  
> > > -	pmd = mm_find_pmd(mm, address);
> > > +	pgd = pgd_offset(mm, address);
> > > +	if (!pgd_present(*pgd))
> > > +		goto out;
> > > +	pud = pud_offset(pgd, address);
> > > +	if (!pud_present(*pud))
> > > +		goto out;
> > > +	pmd = pmd_offset(pud, address);
> > >  	if (!pmd)
> > >  		goto out;
> > 
> > This check is removed by 117b0791ac42. Can actually pmd returned from
> > pmd_offset be NULL?
> 
> [ I believe, you mean by b5a8cad376ee, right? ]
> 
> No, pmd cannot be NULL here, if pud is present and valid (pud_page_vaddr()
> is not NULL).

Right, the !pmd test after pmd_offset is just stupid: I thought I was
copying a standard version of that sequence from somewhere, and blindly
duplicating the stupid test; but looking back now, suspect I was the
one introducing that stupidity.  It doesn't do anything wrong, but
it's misleadingly stupid and better removed.

> 
> > 
> > >  	if (pmd_none(*pmd))
> > 
> > pmd_none() is replaced by !pmd_present().
> 
> Both pmd_none() and !pmd_present() would work. pmd_none() can be slightly
> faster.

Right, you can use whichever you feel like.

> 
> > My question is: is it OK to take the backport of 117b0791ac42 attached
> > (to stay with what upstream has)?
> 
> The commit message would be totally misleading, since the fixed bug is not
> present in v3.12. It's better to fold the patch into "mm: let mm_find_pmd
> fix buggy race with THP".

Agreed, it would be inappropriate to incorporate Kirill's changelog:
better just to append the text I supplied pointing to his commits.

Hugh


Thread overview: 89+ messages
2015-01-09 10:30 [PATCH 3.12 00/78] 3.12.36-stable review Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 01/78] ipv6: gre: fix wrong skb->protocol in WCCP Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 02/78] Fix race condition between vxlan_sock_add and vxlan_sock_release Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 03/78] tg3: fix ring init when there are more TX than RX channels Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 04/78] net/mlx4_core: Limit count field to 24 bits in qp_alloc_res Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 05/78] rtnetlink: release net refcnt on error in do_setlink() Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 06/78] xen-netfront: Remove BUGs on paged skb data which crosses a page boundary Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 07/78] net: mvneta: fix Tx interrupt delay Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 08/78] net: mvneta: fix race condition in mvneta_tx() Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 09/78] net: sctp: use MAX_HEADER for headroom reserve in output path Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 10/78] ceph: fix null pointer dereference in discard_cap_releases() Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 11/78] perf/x86/intel: Protect LBR and extra_regs against KVM lying Jiri Slaby
2015-01-10 11:24   ` Dongsu Park
2015-01-10 11:42     ` Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 12/78] s390/3215: fix hanging console issue Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 13/78] s390/3215: fix tty output containing tabs Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 14/78] usb: gadget: at91_udc: move prepare clk into process context Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 15/78] tty: Fix pty master poll() after slave closes v2 Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 16/78] mm: frontswap: invalidate expired data on a dup-store failure Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 17/78] mm/vmpressure.c: fix race in vmpressure_work_fn() Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 18/78] mm: fix swapoff hang after page migration and fork Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 19/78] mm: fix anon_vma_clone() error treatment Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 20/78] i2c: omap: fix NACK and Arbitration Lost irq handling Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 21/78] i2c: omap: fix i207 errata handling Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 22/78] i2c: davinci: generate STP always when NACK is received Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 23/78] drm/radeon: kernel panic in drm_calc_vbltimestamp_from_scanoutpos with 3.18.0-rc6 Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 24/78] drm/i915: More cautious with pch fifo underruns Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 25/78] drm/i915: Unlock panel even when LVDS is disabled Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 26/78] media: smiapp: Only some selection targets are settable Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 27/78] USB: xhci: Reset a halted endpoint immediately when we encounter a stall Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 28/78] AHCI: Add DeviceIDs for Sunrise Point-LP SATA controller Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 29/78] ahci: disable MSI on SAMSUNG 0xa800 SSD Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 30/78] sata_fsl: fix error handling of irq_of_parse_and_map Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 31/78] igb: bring link up when PHY is powered up Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 32/78] powerpc: 32 bit getcpu VDSO function uses 64 bit instructions Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 33/78] ALSA: hda - Add EAPD fixup for ASUS Z99He laptop Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 34/78] ALSA: hda - Fix built-in mic at resume on Lenovo Ideapad S210 Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 35/78] ALSA: usb-audio: Don't resubmit pending URBs at MIDI error recovery Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 36/78] isofs: Fix infinite looping over CE entries Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 37/78] x86/tls: Validate TLS entries to protect espfix Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 38/78] x86/tls: Disallow unusual TLS segments Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 39/78] x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefit Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 40/78] mfd: tc6393xb: Fail ohci suspend if full state restore is required Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 41/78] mmc: block: add newline to sysfs display of force_ro Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 42/78] megaraid_sas: corrected return of wait_event from abort frame path Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 43/78] scsi: correct return values for .eh_abort_handler implementations Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 44/78] nfs41: fix nfs4_proc_layoutget error handling Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 45/78] dm bufio: fix memleak when using a dm_buffer's inline bio Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 46/78] dm space map metadata: fix sm_bootstrap_get_nr_blocks() Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 47/78] x86/tls: Don't validate lm in set_thread_area() after all Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 48/78] audit: change decimal constant to macro for invalid uid Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 49/78] isofs: Fix unchecked printing of ER records Jiri Slaby
2015-01-09 10:31 ` [PATCH 3.12 50/78] KEYS: Fix stale key registration at error path Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 51/78] mac80211: fix multicast LED blinking and counter Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 52/78] mac80211: free management frame keys when removing station Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 53/78] thermal: Fix error path in thermal_init() Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 54/78] mnt: Implicitly add MNT_NODEV on remount when it was implicitly added by mount Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 55/78] mnt: Update unprivileged remount test Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 56/78] umount: Disallow unprivileged mount force Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 57/78] groups: Consolidate the setgroups permission checks Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 58/78] userns: Document what the invariant required for safe unprivileged mappings Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 59/78] userns: Don't allow setgroups until a gid mapping has been setablished Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 60/78] userns: Don't allow unprivileged creation of gid mappings Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 61/78] userns: Check euid no fsuid when establishing an unprivileged uid mapping Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 62/78] userns: Only allow the creator of the userns unprivileged mappings Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 63/78] userns: Rename id_map_mutex to userns_state_mutex Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 64/78] userns: Add a knob to disable setgroups on a per user namespace basis Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 65/78] userns: Allow setting gid_maps without privilege when setgroups is disabled Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 66/78] userns: Unbreak the unprivileged remount tests Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 67/78] audit: restore AUDIT_LOGINUID unset ABI Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 68/78] crypto: af_alg - fix backlog handling Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 69/78] ncpfs: return proper error from NCP_IOC_SETROOT ioctl Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 70/78] exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 71/78] udf: Verify symlink size before loading it Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 72/78] eCryptfs: Force RO mount when encrypted view is enabled Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 73/78] eCryptfs: Remove buggy and unnecessary write in file name decode routine Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 74/78] Btrfs: do not move em to modified list when unpinning Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 75/78] Btrfs: fix fs corruption on transaction abort if device supports discard Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 76/78] mfd: stmpe: Fix STMPE24xx GPMR LSB Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 77/78] mfd: viperboard: Fix platform-device id collision Jiri Slaby
2015-01-09 10:32 ` [PATCH 3.12 78/78] mm: let mm_find_pmd fix buggy race with THP fault Jiri Slaby
2015-01-10  5:01   ` Hugh Dickins
2015-01-12 10:01     ` Jiri Slaby
2015-01-12 11:13       ` Kirill A. Shutemov
2015-01-12 23:13         ` Hugh Dickins
2015-01-09 17:59 ` [PATCH 3.12 00/78] 3.12.36-stable review Guenter Roeck
2015-01-11  3:40   ` Satoru Takeuchi
2015-01-12 10:35     ` Jiri Slaby
2015-01-12 18:00 ` Shuah Khan
