* [ 000/143] 2.6.32.62-longterm review
From: Willy Tarreau @ 2014-05-12  0:32 UTC
  To: linux-kernel, stable

This is the start of the longterm review cycle for the 2.6.32.62 release.
All patches will be posted as a response to this one. If anyone has any
issue with these being applied, please let me know. If anyone is a
maintainer of the proper subsystem, and wants to add a Signed-off-by: line
to the patch, please respond with it.

Responses should be made before Friday, May 16th, 8 PM UTC. Anything received
after that time might be too late. If someone wants a bit more time for
a deeper review, please let me know.

The whole patch series can be found in one patch at:
     kernel.org/pub/linux/kernel/v2.6/longterm-review/patch-2.6.32.62-rc1.gz

The shortlog and diffstat are appended below.

----------

Andreas Henriksson (1):
      net: Fix "ip rule delete table 256"

Andy Honig (2):
      KVM: Improve create VCPU parameter (CVE-2013-4587)
      KVM: x86: Fix potential divide by 0 in lapic (CVE-2013-6367)

Ben Greear (1):
      Fix lockup related to stop_machine being stuck in __do_softirq.

Changli Gao (2):
      net: Swap ver and type in pppoe_hdr
      net: drop_monitor: fix the value of maxattr

Chris Healy (1):
      resubmit bridge: fix message_age_timer calculation

Dan Carpenter (13):
      cciss: fix info leak in cciss_ioctl32_passthru()
      cpqarray: fix info leak in ida_locked_ioctl()
      net: heap overflow in __audit_sockaddr()
      arcnet: cleanup sizeof parameter
      af_key: more info leaks in pfkey messages
      net_sched: info leak in atm_tc_dump_class()
      isdnloop: use strlcpy() instead of strcpy()
      net: clamp ->msg_namelen instead of returning an error
      isdnloop: several buffer overflows
      libertas: potential oops in debugfs
      uml: check length in exitcode_proc_write()
      xfs: underflow bug in xfs_attrlist_by_handle()
      aacraid: missing capable() check in compat ioctl

Daniel Borkmann (8):
      net: sctp: fix NULL pointer dereference in socket destruction
      packet: packet_getname_spkt: make sure string is always 0-terminated
      random32: fix off-by-one in seeding requirement
      net: llc: fix use after free in llc_ui_recvmsg
      net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode
      net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable
      net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk
      netfilter: nf_conntrack_dccp: fix skb_header_pointer API usages

Dave Kleikamp (1):
      sunvnet: vnet_port_remove must call unregister_netdev

David S. Miller (1):
      net_sched: Fix stack info leak in cbq_dump_wrr().

Ding Tianhong (1):
      bridge: flush br's address entry in fdb when remove the bridge dev

Duan Jiong (1):
      ipv6: use rt6_get_dflt_router to get default router in rt6_route_rcv

Eric Dumazet (12):
      ipv6: ip6_sk_dst_check() must not assume ipv6 dst
      ipv6: tcp: fix panic in SYN processing
      tcp: must unclone packets before mangling them
      net: do not call sock_put() on TIMEWAIT sockets
      tcp: fix tcp_md5_hash_skb_data()
      ipv6: fix possible crashes in ip6_cork_release()
      ip_tunnel: fix kernel panic with icmp_dest_unreach
      neighbour: fix a race in neigh_destroy()
      vlan: fix a race in egress prio management
      tcp: cubic: fix bug in bictcp_acked()
      ipv4: fix possible seqlock deadlock
      inet: fix possible seqlock deadlocks

Fan Du (1):
      sctp: Use software crc32 checksum when xfrm transform will happen.

Florian Westphal (1):
      net: rose: restore old recvmsg behavior

Hannes Frederic Sowa (12):
      ipv6: don't stop backtracking in fib6_lookup_1 if subtree does not match
      ipv6: remove max_addresses check from ipv6_create_tempaddr
      ipv6: drop packets with multiple fragmentation headers
      inet: prevent leakage of uninitialized memory to user in recv syscalls
      net: rework recvmsg handler msg_name and msg_namelen logic
      net: add BUG_ON if kernel advertises msg_namelen > sizeof(struct sockaddr_storage)
      inet: fix addr_len/msg->msg_namelen assignment in recv_error and rxpmtu functions
      ipv6: fix leaking uninitialized port number of offender sockaddr
      ipv6: fix possible seqlock deadlock in ip6_finish_output2
      ipv6: udp packets following an UFO enqueued packet need also be handled by UFO
      inet: fix possible memory corruption with UDP_CORK and UFO
      ipv6: call udp_push_pending_frames when uncorking a socket with AF_INET pending data

Ian Abbott (1):
      staging: comedi: ni_65xx: (bug fix) confine insn_bits to one subdevice

Jason Wang (1):
      virtio-net: alloc big buffers also when guest can receive UFO

Jiri Bohac (2):
      ICMPv6: treat dest unreachable codes 5 and 6 as EACCES, not EPROTO
      bonding: 802.3ad: make aggregator_identifier bond-private

Jitendra Bhivare (1):
      intel-iommu: Flush unmaps at domain_exit

Jonathan Salwan (1):
      drivers/cdrom/cdrom.c: use kzalloc() for failing hardware

Julian Anastasov (1):
      ipvs: fix CHECKSUM_PARTIAL for TCP, UDP

Kees Cook (9):
      block: do not pass disk names as format strings
      b43: stop format string leaking into error msgs
      HID: validate HID report id size
      HID: zeroplus: validate output report details
      HID: pantherlord: validate output report details
      HID: LG: validate HID output report details
      HID: check for NULL field when setting values
      HID: provide a helper for validating hid reports
      exec/ptrace: fix get_dumpable() incorrect tests

Krzysztof Helt (1):
      [CPUFREQ] powernow-k6: set transition latency value so ondemand governor can be used

Linus Torvalds (3):
      vm: add vm_iomap_memory() helper function
      Fix a few incorrectly checked [io_]remap_pfn_range() calls
      x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround

Liu Yu (1):
      tcp_cubic: fix the range of delayed_ack

Maciej Zenczykowski (1):
      net: fix 'ip rule' iif/oif device rename

Mahesh Rajashekhara (1):
      aacraid: prevent invalid pointer dereference

Marc Kleine-Budde (2):
      can: dev: fix nlmsg size calculation in can_get_size()
      net: vlan: fix nlmsg size calculation in vlan_get_size()

Mariusz Ceier (1):
      davinci_emac.c: Fix IFF_ALLMULTI setup

Martin Schwidefsky (1):
      s390: fix kernel crash due to linkage stack instructions

Mathias Krause (3):
      af_key: fix info leaks in notify messages
      proc connector: fix info leaks
      connector: use nlmsg_len() to check message length

Matthew Daley (2):
      floppy: ignore kernel-only members in FDRAWCMD ioctl input
      floppy: don't write kernel-only members to FDRAWCMD ioctl output

Matthew Leach (1):
      net: socket: error on a negative msg_namelen

Max Matveev (1):
      sctp: deal with multiple COOKIE_ECHO chunks

Michael Chan (1):
      tg3: Don't check undefined error bits in RXBD

Michal Tesar (1):
      sysctl net: Keep tcp_syn_retries inside the boundary

Mikulas Patocka (4):
      powernow-k6: disable cache when changing frequency
      powernow-k6: correctly initialize default parameters
      powernow-k6: reorder frequencies
      dm snapshot: fix data corruption

Neal Cardwell (2):
      inet_diag: fix inet_diag_dump_icsk() timewait socket state logic
      tcp: fix tcp_trim_head() to adjust segment count with skb MSS

Neil Horman (3):
      bonding: Fix broken promiscuity reference counting issue
      sctp: fully initialize sctp_outq in sctp_outq_init
      crypto: ansi_cprng - Fix off by one error in non-block size request

Nicolas Dichtel (2):
      af_key: initialize satype in key_notify_policy_flush()
      sctp: unbalanced rcu lock in ip_queue_xmit()

Nikola Pajkovsky (1):
      crypto: api - Fix race condition in larval lookup

Nikolay Aleksandrov (1):
      bonding: fix two race conditions in bond_store_updelay/downdelay

Nithin Sujir (1):
      tg3: Fix deadlock in tg3_change_mtu()

Pablo Neira (1):
      netlink: don't compare the nul-termination in nla_strcmp

Peter Hurley (1):
      n_tty: Fix n_tty_write crash when echoing in raw mode

Peter Korsgaard (1):
      dm9601: fix IFF_ALLMULTI handling

Ricardo Ribalda (1):
      ll_temac: Reset dma descriptors indexes on ndo_open

Roman Gushchin (1):
      net: check net.core.somaxconn sysctl values

Salam Noureddine (2):
      ipv6 mcast: use in6_dev_put in timer handlers instead of __in6_dev_put
      ipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put

Salva Peiró (3):
      farsync: fix info leak in ioctl
      wanxl: fix info leak in ioctl
      hamradio/yam: fix info leak in ioctl

Sasha Levin (3):
      net: unix: allow bind to fail on mutex lock
      rds: prevent dereference of a NULL device
      rds: prevent dereference of a NULL device in rds_iw_laddr_check

Stephen Smalley (1):
      SELinux: Fix kernel BUG on empty security contexts.

Tetsuo Handa (1):
      kernel/kmod.c: check for NULL in call_usermodehelper_exec()

Thomas Bork (1):
      scsi: fix missing include linux/types.h in scsi_netlink.h

Thomas Graf (1):
      ipv6: Don't depend on per socket memory for neighbour discovery messages

Ursula Braun (1):
      qeth: avoid buffer overflow in snmp ioctl

Vlad Yasevich (4):
      sctp: Use correct sideffect command in duplicate cookie handling
      net: dst: provide accessor function to dst->xfrm
      sctp: Perform software checksum if packet has to be fragmented.
      net: core: Always propagate flag changes to interfaces

Wenliang Fan (1):
      drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl()

Willy Tarreau (2):
      Revert "x86, ptrace: fix build breakage with gcc 4.7"
      x86, ptrace: fix build breakage with gcc 4.7 (second try)

YOSHIFUJI Hideaki (1):
      isdnloop: Validate NUL-terminated strings from user.

Ying Xue (2):
      tipc: fix lockdep warning during bearer initialization
      atm: idt77252: fix dev refcnt leak

Zhu Yanjun (1):
      gianfar: disable TX vlan based on kernel 2.6.x

dingtianhong (3):
      ifb: fix rcu_sched self-detected stalls
      dummy: fix oops when loading the dummy failed
      ifb: fix oops when loading the ifb failed

fan.du (1):
      {pktgen, xfrm} Update IPv4 header total len and checksum after tranformation

stephen hemminger (2):
      htb: fix sign extension bug
      tcp_cubic: limit delayed_ack ratio to prevent divide error

------------

 arch/ia64/include/asm/processor.h         |   2 +-
 arch/s390/kernel/head64.S                 |   7 +-
 arch/um/kernel/exitcode.c                 |   4 +-
 arch/x86/include/asm/i387.h               |  13 +--
 arch/x86/include/asm/ptrace.h             |   4 -
 arch/x86/kernel/cpu/cpufreq/powernow-k6.c | 147 ++++++++++++++++++++++++------
 arch/x86/kvm/lapic.c                      |   3 +-
 crypto/ansi_cprng.c                       |   4 +-
 crypto/api.c                              |   7 +-
 drivers/atm/idt77252.c                    |   1 +
 drivers/block/cciss.c                     |   1 +
 drivers/block/cpqarray.c                  |   1 +
 drivers/block/floppy.c                    |  12 ++-
 drivers/block/nbd.c                       |   4 +-
 drivers/cdrom/cdrom.c                     |   2 +-
 drivers/char/n_tty.c                      |   2 +
 drivers/connector/cn_proc.c               |  16 ++++
 drivers/connector/connector.c             |   7 +-
 drivers/hid/hid-core.c                    |  75 ++++++++++++++-
 drivers/hid/hid-lg2ff.c                   |  19 +---
 drivers/hid/hid-lgff.c                    |  17 +---
 drivers/hid/hid-pl.c                      |  10 +-
 drivers/hid/hid-zpff.c                    |  18 +---
 drivers/isdn/isdnloop/isdnloop.c          |  31 ++++---
 drivers/isdn/mISDN/socket.c               |  13 +--
 drivers/md/dm-snap-persistent.c           |  18 ++--
 drivers/net/arcnet/arcnet.c               |   2 +-
 drivers/net/bonding/bond_3ad.c            |   6 +-
 drivers/net/bonding/bond_3ad.h            |   1 +
 drivers/net/bonding/bond_main.c           |  13 ++-
 drivers/net/bonding/bond_sysfs.c          |   6 ++
 drivers/net/can/dev.c                     |   8 +-
 drivers/net/davinci_emac.c                |   2 +-
 drivers/net/dummy.c                       |   4 +
 drivers/net/gianfar.c                     |   8 +-
 drivers/net/hamradio/hdlcdrv.c            |   2 +
 drivers/net/hamradio/yam.c                |   1 +
 drivers/net/ifb.c                         |   9 +-
 drivers/net/ll_temac_main.c               |   6 ++
 drivers/net/sunvnet.c                     |   2 +
 drivers/net/tg3.c                         |   7 +-
 drivers/net/tg3.h                         |   6 +-
 drivers/net/usb/dm9601.c                  |   2 +-
 drivers/net/virtio_net.c                  |   3 +-
 drivers/net/wan/farsync.c                 |   1 +
 drivers/net/wan/wanxl.c                   |   1 +
 drivers/net/wireless/b43/main.c           |   2 +-
 drivers/net/wireless/libertas/debugfs.c   |   6 +-
 drivers/pci/intel-iommu.c                 |   4 +
 drivers/s390/net/qeth_core_main.c         |   6 +-
 drivers/scsi/aacraid/commctrl.c           |   3 +-
 drivers/scsi/aacraid/linit.c              |   2 +
 drivers/staging/comedi/drivers/ni_65xx.c  |  25 +++--
 drivers/uio/uio.c                         |  16 +++-
 drivers/video/au1100fb.c                  |  26 +-----
 drivers/video/au1200fb.c                  |  26 +-----
 fs/exec.c                                 |   6 ++
 fs/partitions/check.c                     |   2 +-
 fs/xfs/linux-2.6/xfs_ioctl.c              |   3 +-
 fs/xfs/linux-2.6/xfs_ioctl32.c            |   4 +-
 include/linux/binfmts.h                   |   3 -
 include/linux/hid.h                       |   8 +-
 include/linux/icmpv6.h                    |   2 +
 include/linux/if_pppox.h                  |   4 +-
 include/linux/ipv6.h                      |   1 +
 include/linux/mm.h                        |   2 +
 include/linux/net.h                       |   8 ++
 include/linux/sched.h                     |   4 +
 include/linux/skbuff.h                    |  10 ++
 include/net/dst.h                         |  11 +++
 include/net/ip.h                          |   2 +-
 include/net/ipv6.h                        |   3 +-
 include/net/sctp/command.h                |   1 +
 include/net/udp.h                         |   1 +
 include/scsi/scsi_netlink.h               |   2 +-
 kernel/kmod.c                             |   4 +
 kernel/ptrace.c                           |   2 +-
 kernel/softirq.c                          |  13 ++-
 lib/nlattr.c                              |  10 +-
 lib/random32.c                            |  14 +--
 mm/memory.c                               |  47 ++++++++++
 net/8021q/vlan_dev.c                      |   7 ++
 net/8021q/vlan_netlink.c                  |   2 +-
 net/appletalk/ddp.c                       |  16 ++--
 net/atm/common.c                          |   2 -
 net/ax25/af_ax25.c                        |   4 +-
 net/bluetooth/af_bluetooth.c              |   2 -
 net/bluetooth/hci_sock.c                  |   2 -
 net/bluetooth/rfcomm/sock.c               |   3 -
 net/bridge/br_if.c                        |   2 +
 net/bridge/br_stp.c                       |   2 +-
 net/compat.c                              |   5 +-
 net/core/dev.c                            |   2 +-
 net/core/drop_monitor.c                   |   1 -
 net/core/fib_rules.c                      |  10 +-
 net/core/iovec.c                          |   3 +-
 net/core/neighbour.c                      |  12 ++-
 net/core/pktgen.c                         |   7 ++
 net/core/sysctl_net_core.c                |   7 +-
 net/ipv4/datagram.c                       |   2 +-
 net/ipv4/igmp.c                           |   4 +-
 net/ipv4/inet_diag.c                      |   4 +-
 net/ipv4/inet_hashtables.c                |   2 +-
 net/ipv4/ip_output.c                      |   4 +-
 net/ipv4/ip_sockglue.c                    |   3 +-
 net/ipv4/ipip.c                           |   2 +-
 net/ipv4/raw.c                            |   6 +-
 net/ipv4/sysctl_net_ipv4.c                |   6 +-
 net/ipv4/tcp.c                            |   6 +-
 net/ipv4/tcp_cubic.c                      |  11 ++-
 net/ipv4/tcp_ipv4.c                       |   2 +-
 net/ipv4/tcp_output.c                     |  15 +--
 net/ipv4/udp.c                            |  14 +--
 net/ipv6/addrconf.c                       |  10 +-
 net/ipv6/datagram.c                       |   4 +-
 net/ipv6/icmp.c                           |  10 +-
 net/ipv6/inet6_connection_sock.c          |   2 +-
 net/ipv6/inet6_hashtables.c               |   2 +-
 net/ipv6/ip6_fib.c                        |  16 +++-
 net/ipv6/ip6_output.c                     |  45 +++++----
 net/ipv6/mcast.c                          |   4 +-
 net/ipv6/ndisc.c                          |  16 ++--
 net/ipv6/raw.c                            |   6 +-
 net/ipv6/reassembly.c                     |   5 +
 net/ipv6/route.c                          |   7 +-
 net/ipv6/udp.c                            |  14 +--
 net/ipx/af_ipx.c                          |   3 +-
 net/irda/af_irda.c                        |   4 -
 net/iucv/af_iucv.c                        |   2 -
 net/key/af_key.c                          |   8 +-
 net/llc/af_llc.c                          |   7 +-
 net/netfilter/ipvs/ip_vs_proto_tcp.c      |  10 +-
 net/netfilter/ipvs/ip_vs_proto_udp.c      |  10 +-
 net/netfilter/nf_conntrack_proto_dccp.c   |   6 +-
 net/netlink/af_netlink.c                  |   2 -
 net/netrom/af_netrom.c                    |   3 +-
 net/packet/af_packet.c                    |  38 ++++----
 net/phonet/datagram.c                     |   9 +-
 net/rds/ib.c                              |   3 +-
 net/rds/iw.c                              |   3 +-
 net/rds/recv.c                            |   2 -
 net/rose/af_rose.c                        |  24 ++---
 net/rxrpc/ar-recvmsg.c                    |   8 +-
 net/sched/sch_atm.c                       |   1 +
 net/sched/sch_cbq.c                       |   1 +
 net/sched/sch_htb.c                       |   2 +-
 net/sctp/output.c                         |   3 +-
 net/sctp/outqueue.c                       |   8 +-
 net/sctp/sm_make_chunk.c                  |   4 +-
 net/sctp/sm_sideeffect.c                  |   5 +
 net/sctp/sm_statefuns.c                   |  19 +++-
 net/sctp/socket.c                         |  47 ++++++++--
 net/socket.c                              |  40 ++++++--
 net/tipc/eth_media.c                      |  15 ++-
 net/tipc/socket.c                         |   6 --
 net/unix/af_unix.c                        |  13 ++-
 net/x25/af_x25.c                          |   3 +-
 security/selinux/ss/services.c            |   4 +
 virt/kvm/kvm_main.c                       |   3 +
 159 files changed, 937 insertions(+), 501 deletions(-)




* [ 001/143] scsi: fix missing include linux/types.h in scsi_netlink.h
From: Willy Tarreau @ 2014-05-12  0:32 UTC
  To: linux-kernel, stable; +Cc: Thomas Bork, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Bork <tom@eisfair.net>

Thomas Bork reported that commit c6203cd ("scsi: use __uX
types for headers exported to user space") caused a regression:
he now gets this warning:

> /usr/src/linux-2.6.32-eisfair-1/usr/include/scsi/scsi_netlink.h:108:
> found __[us]{8,16,32,64} type without #include <linux/types.h>

This issue was addressed later by commit 10db4e1 ("headers:
include linux/types.h where appropriate"), so let's just pick the
relevant part from it.
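
For readers unfamiliar with the check: as far as I can tell, the warning
above comes from the exported-header sanity check that make headers_check
runs. A much simplified sketch of the rule it enforces could look like the
C below. The helper names are hypothetical and the real check is a script
in the kernel tree, not this code; plain substring matching is an
assumption made to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: does the header text mention a __u / __s
 * fixed-width type? Substring matching only, no real parsing. */
static bool mentions_kernel_int_type(const char *src)
{
	static const char *const names[] = {
		"__u8", "__u16", "__u32", "__u64",
		"__s8", "__s16", "__s32", "__s64",
	};
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
		if (strstr(src, names[i]))
			return true;
	return false;
}

/* Hypothetical helper: true when the header uses such a type but never
 * includes <linux/types.h>, i.e. the situation the warning reports. */
static bool needs_types_include(const char *src)
{
	assert(src != NULL);
	return mentions_kernel_int_type(src) &&
	       strstr(src, "#include <linux/types.h>") == NULL;
}
```

With this sketch, a header that includes only <linux/netlink.h> and
declares a __u8 member is flagged, and stops being flagged once the
#include <linux/types.h> line from the patch is present.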

Cc: Thomas Bork <tom@eisfair.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 include/scsi/scsi_netlink.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/scsi/scsi_netlink.h b/include/scsi/scsi_netlink.h
index 58ce8fe..5cb20cc 100644
--- a/include/scsi/scsi_netlink.h
+++ b/include/scsi/scsi_netlink.h
@@ -23,7 +23,7 @@
 #define SCSI_NETLINK_H
 
 #include <linux/netlink.h>
-
+#include <linux/types.h>
 
 /*
  * This file intended to be included by both kernel and user space
-- 
1.7.12.2.21.g234cd45.dirty





* [ 002/143] Fix lockup related to stop_machine being stuck in __do_softirq.
From: Willy Tarreau @ 2014-05-12  0:32 UTC
  To: linux-kernel, stable
  Cc: Ben Greear, Tejun Heo, Pekka Riikonen, Eric Dumazet, stable,
	Ben Hutchings, Linus Torvalds, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Ben Greear <greearb@candelatech.com>

The stop machine logic can lock up if all but one of the migration
threads make it through the disable-irq step and the one remaining
thread gets stuck in __do_softirq.  The reason __do_softirq can hang is
that it has a bail-out based on jiffies timeout, but in the lockup case,
jiffies itself is not incremented.

To work around this, re-add the max_restart counter in __do_softirq and
stop processing softirqs after 10 restarts.
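
To illustrate why the extra counter matters, here is a small user-space
simulation of the restart decision. It is only a sketch: the function
name, the "tick" parameter standing in for jiffies progress, and the
safety cap are assumptions for the example, and need_resched() is left
out entirely. Only the value 10 and the shape of the condition are taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_MAX_RESTART 10	/* mirrors MAX_SOFTIRQ_RESTART in the patch */

/*
 * Simulate the restart loop. Work stays pending until rounds_pending
 * passes have run. tick is how far the simulated clock advances per
 * pass; tick == 0 models the stop_machine() case where jiffies is
 * frozen. Returns the number of passes executed.
 */
static int sim_softirq_passes(int rounds_pending, unsigned long tick,
			      unsigned long time_budget,
			      bool use_restart_limit)
{
	unsigned long now = 0;
	unsigned long end = now + time_budget;
	int max_restart = SIM_MAX_RESTART;
	int passes = 0;
	const int safety_cap = 1000000;	/* only so the simulation ends */

	assert(rounds_pending >= 0);

	for (;;) {
		passes++;		/* one pass over pending handlers */
		now += tick;
		if (passes >= rounds_pending)
			break;		/* nothing pending any more */
		if (passes >= safety_cap)
			break;
		if (!(now < end))
			break;		/* time budget spent */
		if (use_restart_limit && --max_restart == 0)
			break;		/* restart budget spent */
	}
	return passes;
}
```

With a frozen clock (tick of 0) and work that keeps being re-raised, the
time check alone never fires and the loop runs for as long as work is
pending, while the restart limit stops it after 10 passes. When the
clock does advance, the time budget still ends the loop first.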

Thanks to Tejun Heo and Rusty Russell and others for helping me track
this down.

This was introduced in 3.9 by commit c10d73671ad3 ("softirq: reduce
latencies").

It may be worth looking into ath9k to see if it has issues with its irq
handler at a later date.

The hang stack traces look something like this:

    ------------[ cut here ]------------
    WARNING: at kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xa7()
    Watchdog detected hard LOCKUP on cpu 2
    Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
    Pid: 23, comm: migration/2 Tainted: G         C   3.9.4+ #11
    Call Trace:
     <NMI>   warn_slowpath_common+0x85/0x9f
      warn_slowpath_fmt+0x46/0x48
      watchdog_overflow_callback+0x9c/0xa7
      __perf_event_overflow+0x137/0x1cb
      perf_event_overflow+0x14/0x16
      intel_pmu_handle_irq+0x2dc/0x359
      perf_event_nmi_handler+0x19/0x1b
      nmi_handle+0x7f/0xc2
      do_nmi+0xbc/0x304
      end_repeat_nmi+0x1e/0x2e
     <<EOE>>
      cpu_stopper_thread+0xae/0x162
      smpboot_thread_fn+0x258/0x260
      kthread+0xc7/0xcf
      ret_from_fork+0x7c/0xb0
    ---[ end trace 4947dfa9b0a4cec3 ]---
    BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:17]
    Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
    irq event stamp: 835637905
    hardirqs last  enabled at (835637904): __do_softirq+0x9f/0x257
    hardirqs last disabled at (835637905): apic_timer_interrupt+0x6d/0x80
    softirqs last  enabled at (5654720): __do_softirq+0x1ff/0x257
    softirqs last disabled at (5654725): irq_exit+0x5f/0xbb
    CPU 1
    Pid: 17, comm: migration/1 Tainted: G        WC   3.9.4+ #11 To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M.
    RIP: tasklet_hi_action+0xf0/0xf0
    Process migration/1
    Call Trace:
     <IRQ>
      __do_softirq+0x117/0x257
      irq_exit+0x5f/0xbb
      smp_apic_timer_interrupt+0x8a/0x98
      apic_timer_interrupt+0x72/0x80
     <EOI>
      printk+0x4d/0x4f
      stop_machine_cpu_stop+0x22c/0x274
      cpu_stopper_thread+0xae/0x162
      smpboot_thread_fn+0x258/0x260
      kthread+0xc7/0xcf
      ret_from_fork+0x7c/0xb0

Signed-off-by: Ben Greear <greearb@candelatech.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Pekka Riikonen <priikone@iki.fi>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: stable@kernel.org
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 34376a50fb1fa095b9d0636fa41ed2e73125f214)
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 kernel/softirq.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index d75c136..e4d5d8c 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -194,8 +194,12 @@ void local_bh_enable_ip(unsigned long ip)
 EXPORT_SYMBOL(local_bh_enable_ip);
 
 /*
- * We restart softirq processing for at most 2 ms,
- * and if need_resched() is not set.
+ * We restart softirq processing for at most MAX_SOFTIRQ_RESTART times,
+ * but break the loop if need_resched() is set or after 2 ms.
+ * The MAX_SOFTIRQ_TIME provides a nice upper bound in most cases, but in
+ * certain cases, such as stop_machine(), jiffies may cease to
+ * increment and so we need the MAX_SOFTIRQ_RESTART limit as
+ * well to make sure we eventually return from this method.
  *
  * These limits have been established via experimentation.
  * The two things to balance is latency against fairness -
@@ -203,6 +207,7 @@ EXPORT_SYMBOL(local_bh_enable_ip);
  * should not be able to lock up the box.
  */
 #define MAX_SOFTIRQ_TIME  msecs_to_jiffies(2)
+#define MAX_SOFTIRQ_RESTART 10
 
 asmlinkage void __do_softirq(void)
 {
@@ -210,6 +215,7 @@ asmlinkage void __do_softirq(void)
 	__u32 pending;
 	unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
 	int cpu;
+	int max_restart = MAX_SOFTIRQ_RESTART;
 
 	pending = local_softirq_pending();
 	account_system_vtime(current);
@@ -254,7 +260,8 @@ restart:
 
 	pending = local_softirq_pending();
 	if (pending) {
-		if (time_before(jiffies, end) && !need_resched())
+		if (time_before(jiffies, end) && !need_resched() &&
+		    --max_restart)
 			goto restart;
 
 		wakeup_softirqd();
-- 
1.7.12.2.21.g234cd45.dirty





* [ 003/143] Revert "x86, ptrace: fix build breakage with gcc 4.7"
From: Willy Tarreau @ 2014-05-12  0:32 UTC
  To: linux-kernel, stable; +Cc: Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Willy Tarreau <w@1wt.eu>

This reverts commit 4ed3bb08f1698c62685278051c19f474fbf961d2.

As reported by Sven-Haegar Koch, this patch breaks make headers_check:

   CHECK   include (0 files)
   CHECK   include/asm (54 files)
   /home/haegar/src/2.6.32/linux/usr/include/asm/ptrace.h:5: included file 'linux/linkage.h' is not exported

Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 arch/x86/include/asm/ptrace.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index e668d72..0f0d908 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -2,7 +2,6 @@
 #define _ASM_X86_PTRACE_H
 
 #include <linux/compiler.h>	/* For __user */
-#include <linux/linkage.h>	/* For asmregparm */
 #include <asm/ptrace-abi.h>
 #include <asm/processor-flags.h>
 
@@ -143,8 +142,8 @@ extern void send_sigtrap(struct task_struct *tsk, struct pt_regs *regs,
 			 int error_code, int si_code);
 void signal_fault(struct pt_regs *regs, void __user *frame, char *where);
 
-extern asmregparm long syscall_trace_enter(struct pt_regs *);
-extern asmregparm void syscall_trace_leave(struct pt_regs *);
+extern long syscall_trace_enter(struct pt_regs *);
+extern void syscall_trace_leave(struct pt_regs *);
 
 static inline unsigned long regs_return_value(struct pt_regs *regs)
 {
-- 
1.7.12.2.21.g234cd45.dirty





* [ 004/143] x86, ptrace: fix build breakage with gcc 4.7 (second try)
From: Willy Tarreau @ 2014-05-12  0:32 UTC
  To: linux-kernel, stable; +Cc: Sven-Haegar Koch, Christoph Biedl, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Willy Tarreau <w@1wt.eu>

syscall_trace_enter() and syscall_trace_leave() are only called from
within asm code and do not need to be declared for C code at all.
Removing their declarations fixes the build issue that was happening
with gcc 4.7.

Both Sven-Haegar Koch and Christoph Biedl confirmed this patch
addresses their respective build issues.

Cc: Sven-Haegar Koch <haegar@sdinet.de>
Cc: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 arch/x86/include/asm/ptrace.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 0f0d908..1ec926d 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -142,9 +142,6 @@ extern void send_sigtrap(struct task_struct *tsk, struct pt_regs *regs,
 			 int error_code, int si_code);
 void signal_fault(struct pt_regs *regs, void __user *frame, char *where);
 
-extern long syscall_trace_enter(struct pt_regs *);
-extern void syscall_trace_leave(struct pt_regs *);
-
 static inline unsigned long regs_return_value(struct pt_regs *regs)
 {
 	return regs->ax;
-- 
1.7.12.2.21.g234cd45.dirty





* [ 005/143] ipvs: fix CHECKSUM_PARTIAL for TCP, UDP
From: Willy Tarreau @ 2014-05-12  0:32 UTC
  To: linux-kernel, stable; +Cc: Julian Anastasov, Simon Horman, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Julian Anastasov <ja@ssi.bg>

	Fix CHECKSUM_PARTIAL handling. Tested for IPv4 TCP only; UDP was
not tested because that needs a network card with HW CSUM support.
This may fix the problem where IPVS cannot be used in virtual boxes.
The problem appears with DNAT to a local address when the local stack
sends its reply in CHECKSUM_PARTIAL mode.

	Fix tcp_dnat_handler and udp_dnat_handler to provide
vaddr and daddr in the right order (old and new IP) when calling
tcp_partial_csum_update/udp_partial_csum_update (CHECKSUM_PARTIAL).

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
(cherry picked from commit 5bc9068e9d962ca6b8bec3f0eb6f60ab4dee1d04)
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/netfilter/ipvs/ip_vs_proto_tcp.c | 10 +++++-----
 net/netfilter/ipvs/ip_vs_proto_udp.c | 10 +++++-----
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
index 91d28e0..d462b0d 100644
--- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
@@ -147,15 +147,15 @@ tcp_partial_csum_update(int af, struct tcphdr *tcph,
 #ifdef CONFIG_IP_VS_IPV6
 	if (af == AF_INET6)
 		tcph->check =
-			csum_fold(ip_vs_check_diff16(oldip->ip6, newip->ip6,
+			~csum_fold(ip_vs_check_diff16(oldip->ip6, newip->ip6,
 					 ip_vs_check_diff2(oldlen, newlen,
-						~csum_unfold(tcph->check))));
+						csum_unfold(tcph->check))));
 	else
 #endif
 	tcph->check =
-		csum_fold(ip_vs_check_diff4(oldip->ip, newip->ip,
+		~csum_fold(ip_vs_check_diff4(oldip->ip, newip->ip,
 				ip_vs_check_diff2(oldlen, newlen,
-						~csum_unfold(tcph->check))));
+						csum_unfold(tcph->check))));
 }
 
 
@@ -269,7 +269,7 @@ tcp_dnat_handler(struct sk_buff *skb,
 	 *	Adjust TCP checksums
 	 */
 	if (skb->ip_summed == CHECKSUM_PARTIAL) {
-		tcp_partial_csum_update(cp->af, tcph, &cp->daddr, &cp->vaddr,
+		tcp_partial_csum_update(cp->af, tcph, &cp->vaddr, &cp->daddr,
 					htons(oldlen),
 					htons(skb->len - tcphoff));
 	} else if (!cp->app) {
diff --git a/net/netfilter/ipvs/ip_vs_proto_udp.c b/net/netfilter/ipvs/ip_vs_proto_udp.c
index e7a6885..c1781f5 100644
--- a/net/netfilter/ipvs/ip_vs_proto_udp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_udp.c
@@ -154,15 +154,15 @@ udp_partial_csum_update(int af, struct udphdr *uhdr,
 #ifdef CONFIG_IP_VS_IPV6
 	if (af == AF_INET6)
 		uhdr->check =
-			csum_fold(ip_vs_check_diff16(oldip->ip6, newip->ip6,
+			~csum_fold(ip_vs_check_diff16(oldip->ip6, newip->ip6,
 					 ip_vs_check_diff2(oldlen, newlen,
-						~csum_unfold(uhdr->check))));
+						csum_unfold(uhdr->check))));
 	else
 #endif
 	uhdr->check =
-		csum_fold(ip_vs_check_diff4(oldip->ip, newip->ip,
+		~csum_fold(ip_vs_check_diff4(oldip->ip, newip->ip,
 				ip_vs_check_diff2(oldlen, newlen,
-						~csum_unfold(uhdr->check))));
+						csum_unfold(uhdr->check))));
 }
 
 
@@ -205,7 +205,7 @@ udp_snat_handler(struct sk_buff *skb,
 	 *	Adjust UDP checksums
 	 */
 	if (skb->ip_summed == CHECKSUM_PARTIAL) {
-		udp_partial_csum_update(cp->af, udph, &cp->daddr, &cp->vaddr,
+		udp_partial_csum_update(cp->af, udph, &cp->vaddr, &cp->daddr,
 					htons(oldlen),
 					htons(skb->len - udphoff));
 	} else if (!cp->app && (udph->check != 0)) {
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 006/143] intel-iommu: Flush unmaps at domain_exit
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (5 preceding siblings ...)
  2014-05-12  0:32 ` [ 005/143] ipvs: fix CHECKSUM_PARTIAL for TCP, UDP Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 007/143] staging: comedi: ni_65xx: (bug fix) confine insn_bits to one Willy Tarreau
                   ` (136 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Jitendra Bhivare, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Jitendra Bhivare <jitendra.bhivare@gmail.com>

Backported Alex Williamson's commit to 2.6.32.y
http://git.kernel.org/linus/7b668357810ecb5fdda4418689d50f5d95aea6a8

It resolves the following assert when the module is immediately reloaded.

kernel BUG at drivers/pci/iova.c:155!
<snip>
Call Trace:
[<ffffffff812645c5>] intel_alloc_iova+0xb5/0xe0
[<ffffffff8126725e>] __intel_map_single+0xbe/0x210
[<ffffffff812674ae>] intel_alloc_coherent+0xae/0x120
[<ffffffffa035f909>] be_queue_alloc+0xb9/0x140 [be2net]
[<ffffffffa035fa5a>] be_rx_qs_create+0xca/0x370 [be2net]
<snip>

The issue is reproducible in 2.6.32.60 and is also resolved
by passing intel_iommu=strict to the kernel.

Signed-off-by: Jitendra Bhivare <jitendra.bhivare@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/pci/intel-iommu.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index 5b680df..c1a7b01 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -1434,6 +1434,10 @@ static void domain_exit(struct dmar_domain *domain)
 	if (!domain)
 		return;
 
+	/* Flush any lazy unmaps that may reference this domain */
+	if (!intel_iommu_strict)
+		flush_unmaps_timeout(0);
+
 	domain_remove_dev_info(domain);
 	/* destroy iovas */
 	put_iova_domain(&domain->iovad);
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 007/143] staging: comedi: ni_65xx: (bug fix) confine insn_bits to one
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (6 preceding siblings ...)
  2014-05-12  0:32 ` [ 006/143] intel-iommu: Flush unmaps at domain_exit Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 008/143] kernel/kmod.c: check for NULL in call_usermodehelper_exec() Willy Tarreau
                   ` (135 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Ian Abbott, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------
 subdevice

From: Ian Abbott <abbotti@mev.co.uk>

Commit 677a31565692d596ef42ea589b53ba289abf4713 upstream.

The `insn_bits` handler `ni_65xx_dio_insn_bits()` has a `for` loop that
currently writes (optionally) and reads back up to 5 "ports" consisting
of 8 channels each.  It reads up to 32 1-bit channels but can only read
and write a whole port at once - it needs to handle up to 5 ports as the
first channel it reads might not be aligned on a port boundary.  It
breaks out of the loop early if the next port it handles is beyond the
final port on the card.  It also breaks out early on the 5th port in the
loop if the first channel was aligned.  Unfortunately, it doesn't check
that the current port it is dealing with belongs to the comedi subdevice
the `insn_bits` handler is acting on.  That's a bug.

Redo the `for` loop to terminate after the final port belonging to the
subdevice, changing the loop variable in the process to simplify things
a bit.  The `for` loop could now try and handle more than 5 ports if the
subdevice has more than 40 channels, but the test `if (bitshift >= 32)`
ensures it will break out early after 4 or 5 ports (depending on whether
the first channel is aligned on a port boundary).  (`bitshift` will be
between -7 and 7 inclusive on the first iteration, increasing by 8 for
each subsequent operation.)
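
As a rough illustration of the loop bounds described above, the
following standalone C sketch (not the driver code) counts how many
ports the reworked loop would visit. It assumes 8 channels per port and
that the port of a channel is simply channel / 8; demo_ports_visited()
and DEMO_CHANNELS_PER_PORT are made-up names used only for this example.

```c
/* Assumption: 8 channels per port, as stated in the commit message. */
enum { DEMO_CHANNELS_PER_PORT = 8 };

/*
 * Count the ports visited for a bitfield starting at base_channel on a
 * subdevice with n_chan channels. Mirrors the shape of the new loop:
 * stop after the subdevice's last port, or once bitshift reaches 32.
 */
int demo_ports_visited(int base_channel, int n_chan)
{
	int last_port = (n_chan - 1) / DEMO_CHANNELS_PER_PORT;
	int port, visited = 0;

	for (port = base_channel / DEMO_CHANNELS_PER_PORT;
	     port <= last_port; port++) {
		int bitshift = port * DEMO_CHANNELS_PER_PORT - base_channel;

		if (bitshift >= 32)
			break;
		visited++;
	}
	return visited;
}
```

With an aligned first channel on a 48-channel subdevice the sketch
visits 4 ports; with an unaligned first channel it visits 5; on a
16-channel subdevice it never goes past the subdevice's second port.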

Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/staging/comedi/drivers/ni_65xx.c | 25 +++++++++++--------------
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/staging/comedi/drivers/ni_65xx.c b/drivers/staging/comedi/drivers/ni_65xx.c
index bbf75eb..bb23291 100644
--- a/drivers/staging/comedi/drivers/ni_65xx.c
+++ b/drivers/staging/comedi/drivers/ni_65xx.c
@@ -410,28 +410,25 @@ static int ni_65xx_dio_insn_bits(struct comedi_device *dev,
 				 struct comedi_subdevice *s,
 				 struct comedi_insn *insn, unsigned int *data)
 {
-	unsigned base_bitfield_channel;
-	const unsigned max_ports_per_bitfield = 5;
+	int base_bitfield_channel;
 	unsigned read_bits = 0;
-	unsigned j;
+	int last_port_offset = ni_65xx_port_by_channel(s->n_chan - 1);
+	int port_offset;
+
 	if (insn->n != 2)
 		return -EINVAL;
 	base_bitfield_channel = CR_CHAN(insn->chanspec);
-	for (j = 0; j < max_ports_per_bitfield; ++j) {
-		const unsigned port_offset = ni_65xx_port_by_channel(base_bitfield_channel) + j;
-		const unsigned port =
-		    sprivate(s)->base_port + port_offset;
-		unsigned base_port_channel;
+	for (port_offset = ni_65xx_port_by_channel(base_bitfield_channel);
+	     port_offset <= last_port_offset; port_offset++) {
+		unsigned port = sprivate(s)->base_port + port_offset;
+		int base_port_channel = port_offset * ni_65xx_channels_per_port;
 		unsigned port_mask, port_data, port_read_bits;
-		int bitshift;
-		if (port >= ni_65xx_total_num_ports(board(dev)))
+		int bitshift = base_port_channel - base_bitfield_channel;
+
+		if (bitshift >= 32)
 			break;
-		base_port_channel = port_offset * ni_65xx_channels_per_port;
 		port_mask = data[0];
 		port_data = data[1];
-		bitshift = base_port_channel - base_bitfield_channel;
-		if (bitshift >= 32 || bitshift <= -32)
-			break;
 		if (bitshift > 0) {
 			port_mask >>= bitshift;
 			port_data >>= bitshift;
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 008/143] kernel/kmod.c: check for NULL in call_usermodehelper_exec()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (7 preceding siblings ...)
  2014-05-12  0:32 ` [ 007/143] staging: comedi: ni_65xx: (bug fix) confine insn_bits to one Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 009/143] cciss: fix info leak in cciss_ioctl32_passthru() Willy Tarreau
                   ` (134 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Tetsuo Handa, Oleg Nesterov, Andrew Morton, Linus Torvalds,
	Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>

If /proc/sys/kernel/core_pattern contains only "|", a NULL pointer
dereference happens upon core dump because argv_split("") returns
argv[0] == NULL.
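
The ordering of the checks matters here: the pointer has to be tested
before its first byte is read. A minimal userspace sketch of that
ordering follows; demo_exec_helper() and its return values are
hypothetical and only mimic the shape of the fix, not the kernel API.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns -EINVAL for a NULL path, 0 for an empty
 * path (nothing to run), and 1 when a helper would be started.
 */
int demo_exec_helper(const char *path)
{
	if (path == NULL)	/* must come first: path[0] would oops */
		return -EINVAL;
	if (path[0] == '\0')	/* safe now that path is known non-NULL */
		return 0;
	return 1;
}
```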

This bug was once fixed by commit 264b83c07a84 ("usermodehelper: check
subprocess_info->path != NULL") but was reintroduced by mistake by commit
7f57cfa4e2aa ("usermodehelper: kill the sub_info->path[0] check").

This bug seems to have existed since 2.6.19 (the version in which core
dump to a pipe was added).  Depending on kernel version and config, some
side effects might happen immediately after this oops (e.g. a kernel
panic with 2.6.32-358.18.1.el6).

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 4c1c7be95c345cf2ad537a0c48e9aeadc7304527)
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 kernel/kmod.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 8ecc509..3da09a9 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -560,6 +560,10 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info,
 	BUG_ON(atomic_read(&sub_info->cred->usage) != 1);
 	validate_creds(sub_info->cred);
 
+	if (!sub_info->path) {
+		call_usermodehelper_freeinfo(sub_info);
+		return -EINVAL;
+	}
 	helper_lock();
 	if (sub_info->path[0] == '\0')
 		goto out;
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 009/143] cciss: fix info leak in cciss_ioctl32_passthru()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (8 preceding siblings ...)
  2014-05-12  0:32 ` [ 008/143] kernel/kmod.c: check for NULL in call_usermodehelper_exec() Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 010/143] cpqarray: fix info leak in ida_locked_ioctl() Willy Tarreau
                   ` (133 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Dan Carpenter, Mike Miller, Andrew Morton, Linus Torvalds, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

commit 58f09e00ae095e46ef9edfcf3a5fd9ccdfad065e upstream.

The arg64 struct has a hole after ->buf_size which isn't cleared.  Also,
if any of the calls to copy_from_user() fail, that would cause an
information leak as well.

This was assigned CVE-2013-2147.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/block/cciss.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
index 68b90d9..b2225ab 100644
--- a/drivers/block/cciss.c
+++ b/drivers/block/cciss.c
@@ -1051,6 +1051,7 @@ static int cciss_ioctl32_big_passthru(struct block_device *bdev, fmode_t mode,
 	int err;
 	u32 cp;
 
+	memset(&arg64, 0, sizeof(arg64));
 	err = 0;
 	err |=
 	    copy_from_user(&arg64.LUN_info, &arg32->LUN_info,
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 010/143] cpqarray: fix info leak in ida_locked_ioctl()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (9 preceding siblings ...)
  2014-05-12  0:32 ` [ 009/143] cciss: fix info leak in cciss_ioctl32_passthru() Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 011/143] drivers/cdrom/cdrom.c: use kzalloc() for failing hardware Willy Tarreau
                   ` (132 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Dan Carpenter, Mike Miller, Andrew Morton, Linus Torvalds, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

commit 627aad1c01da6f881e7f98d71fd928ca0c316b1a upstream

The pciinfo struct has a two byte hole after ->dev_fn so stack
information could be leaked to the user.
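
To make the "hole" concrete, here is a small userspace sketch. The
struct below is only modeled on the description above (two one-byte
members followed by a 32-bit member); it is an assumption, not the real
ida_pci_info_struct definition, and all demo_* names are hypothetical.
It relies on the usual compiler behaviour that member stores leave
padding bytes untouched.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* On common ABIs the compiler pads 2 bytes between dev_fn and board_id. */
struct demo_pci_info {
	unsigned char bus;
	unsigned char dev_fn;
	uint32_t board_id;
};

/* Clear every byte, padding included, before setting the members. */
void demo_fill_pci_info(struct demo_pci_info *out, unsigned char bus,
			unsigned char dev_fn, uint32_t board_id)
{
	memset(out, 0, sizeof(*out));
	out->bus = bus;
	out->dev_fn = dev_fn;
	out->board_id = board_id;
}

/* Count non-zero bytes in the gap between dev_fn and board_id. */
size_t demo_dirty_hole_bytes(const struct demo_pci_info *info)
{
	const unsigned char *p = (const unsigned char *)info;
	size_t start = offsetof(struct demo_pci_info, dev_fn) +
		       sizeof(info->dev_fn);
	size_t end = offsetof(struct demo_pci_info, board_id);
	size_t i, dirty = 0;

	for (i = start; i < end; i++)
		if (p[i] != 0)
			dirty++;
	return dirty;
}
```

Without the memset(), whatever was previously on the stack would remain
in the gap and be copied out along with the members.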

This was assigned CVE-2013-2147.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/block/cpqarray.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/block/cpqarray.c b/drivers/block/cpqarray.c
index 6422651..f9caa45 100644
--- a/drivers/block/cpqarray.c
+++ b/drivers/block/cpqarray.c
@@ -1181,6 +1181,7 @@ out_passthru:
 		ida_pci_info_struct pciinfo;
 
 		if (!arg) return -EINVAL;
+		memset(&pciinfo, 0, sizeof(pciinfo));
 		pciinfo.bus = host->pci_dev->bus->number;
 		pciinfo.dev_fn = host->pci_dev->devfn;
 		pciinfo.board_id = host->board_id;
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 011/143] drivers/cdrom/cdrom.c: use kzalloc() for failing hardware
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (10 preceding siblings ...)
  2014-05-12  0:32 ` [ 010/143] cpqarray: fix info leak in ida_locked_ioctl() Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 012/143] sctp: deal with multiple COOKIE_ECHO chunks Willy Tarreau
                   ` (131 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Dan Carpenter, Jens Axboe, Andrew Morton, Linus Torvalds, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Jonathan Salwan <jonathan.salwan@gmail.com>

commit 542db01579fbb7ea7d1f7bb9ddcef1559df660b2 upstream

In drivers/cdrom/cdrom.c mmc_ioctl_cdrom_read_data() allocates a memory
area with kmalloc in line 2885.

  2885         cgc->buffer = kmalloc(blocksize, GFP_KERNEL);
  2886         if (cgc->buffer == NULL)
  2887                 return -ENOMEM;

In line 2908 we can find the copy_to_user function:

  2908         if (!ret && copy_to_user(arg, cgc->buffer, blocksize))

The cgc->buffer is never cleared or initialized before this call.
If ret = 0 in the previous basic block, it's possible to expose some
kernel memory bytes to userspace.

When we read a block from the disk it normally fills the ->buffer but if
the drive is malfunctioning there is a chance that it would only be
partially filled.  The result is an information leak to userspace.
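
The effect of a zeroing allocator on a short read can be sketched in
userspace, with calloc() standing in for kzalloc(). The demo_* functions
are hypothetical; demo_partial_read() simply pretends to be a drive that
fills fewer bytes than requested.

```c
#include <stdlib.h>
#include <string.h>

/* Pretend drive: fills only the first `got` bytes of the block. */
size_t demo_partial_read(unsigned char *buf, size_t blocksize, size_t got)
{
	if (got > blocksize)
		got = blocksize;
	memset(buf, 0x5A, got);
	return got;
}

/*
 * Allocate a zeroed block and read into it. Bytes the "drive" did not
 * fill stay zero instead of holding stale heap contents. The caller
 * owns the returned buffer and must free() it.
 */
unsigned char *demo_read_block(size_t blocksize, size_t got)
{
	unsigned char *buf = calloc(1, blocksize);

	if (buf == NULL)
		return NULL;
	demo_partial_read(buf, blocksize, got);
	return buf;
}
```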

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/cdrom/cdrom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index a4592ec..71a78dc 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -2822,7 +2822,7 @@ static noinline int mmc_ioctl_cdrom_read_data(struct cdrom_device_info *cdi,
 	if (lba < 0)
 		return -EINVAL;
 
-	cgc->buffer = kmalloc(blocksize, GFP_KERNEL);
+	cgc->buffer = kzalloc(blocksize, GFP_KERNEL);
 	if (cgc->buffer == NULL)
 		return -ENOMEM;
 
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 012/143] sctp: deal with multiple COOKIE_ECHO chunks
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (11 preceding siblings ...)
  2014-05-12  0:32 ` [ 011/143] drivers/cdrom/cdrom.c: use kzalloc() for failing hardware Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 013/143] sctp: Use correct sideffect command in duplicate cookie handling Willy Tarreau
                   ` (130 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Max Matveev, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Max Matveev <makc@redhat.com>

commit d5ccd496601b8776a516d167a6485754575dc38f upstream

An attempt to reduce the number of IP packets emitted in response to a
single SCTP packet (2e3216cd) introduced a complication: if a packet
contains two COOKIE_ECHO chunks and nothing else, then the SCTP state
machine corks the socket while processing the first COOKIE_ECHO and then
loses the association and forgets to uncork the socket. To deal with the
issue, add a new SCTP command which can be used to set the association
explicitly. Use this new command when processing the second COOKIE_ECHO
chunk to restore the context for the SCTP state machine.

Signed-off-by: Max Matveev <makc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[dannf: backported to Debian's 2.6.32]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 include/net/sctp/command.h | 1 +
 net/sctp/sm_sideeffect.c   | 5 +++++
 net/sctp/sm_statefuns.c    | 6 ++++++
 3 files changed, 12 insertions(+)

diff --git a/include/net/sctp/command.h b/include/net/sctp/command.h
index 2c55a7e..0edc14d 100644
--- a/include/net/sctp/command.h
+++ b/include/net/sctp/command.h
@@ -108,6 +108,7 @@ typedef enum {
 	SCTP_CMD_UPDATE_INITTAG, /* Update peer inittag */
 	SCTP_CMD_SEND_MSG,	 /* Send the whole use message */
 	SCTP_CMD_SEND_NEXT_ASCONF, /* Send the next ASCONF after ACK */
+	SCTP_CMD_SET_ASOC,	 /* Restore association context */
 	SCTP_CMD_LAST
 } sctp_verb_t;
 
diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index ed742bf..9005d83 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -1676,6 +1676,11 @@ static int sctp_cmd_interpreter(sctp_event_t event_type,
 		case SCTP_CMD_SEND_NEXT_ASCONF:
 			sctp_cmd_send_asconf(asoc);
 			break;
+
+		case SCTP_CMD_SET_ASOC:
+			asoc = cmd->obj.asoc;
+			break;
+
 		default:
 			printk(KERN_WARNING "Impossible command: %u, %p\n",
 			       cmd->verb, cmd->obj.ptr);
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 2f8e1c8..9e4e846 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -2048,6 +2048,12 @@ sctp_disposition_t sctp_sf_do_5_2_4_dupcook(const struct sctp_endpoint *ep,
 	sctp_add_cmd_sf(commands, SCTP_CMD_NEW_ASOC, SCTP_ASOC(new_asoc));
 	sctp_add_cmd_sf(commands, SCTP_CMD_DELETE_TCB, SCTP_NULL());
 
+	/* Restore association pointer to provide SCTP command interpeter
+	 * with a valid context in case it needs to manipulate
+	 * the queues */
+	sctp_add_cmd_sf(commands, SCTP_CMD_SET_ASOC,
+			 SCTP_ASOC((struct sctp_association *)asoc));
+
 	return retval;
 
 nomem:
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 013/143] sctp: Use correct sideffect command in duplicate cookie handling
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (12 preceding siblings ...)
  2014-05-12  0:32 ` [ 012/143] sctp: deal with multiple COOKIE_ECHO chunks Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 014/143] ipv6: ip6_sk_dst_check() must not assume ipv6 dst Willy Tarreau
                   ` (129 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Neil Horman, Vlad Yasevich, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Vlad Yasevich <vyasevich@gmail.com>

commit f2815633504b442ca0b0605c16bf3d88a3a0fcea upstream

When SCTP is done processing a duplicate cookie chunk, it tries
to delete a newly created association.  For that, it has to set
the right association for the side-effect processing to work.
However, when it uses the SCTP_CMD_NEW_ASOC command, that performs
more work than really needed (like hashing the association and
assigning it an id) and there is no point in doing that only to
delete the association as a next step.  In fact, it also creates
an impossible condition where an association may be found by
the getsockopt() call, and that association is empty.  This
causes a crash in some sctp getsockopts.

The solution is rather simple.  We simply use SCTP_CMD_SET_ASOC
command that doesn't have all the overhead and does exactly
what we need.

Reported-by: Karl Heiss <kheiss@gmail.com>
Tested-by: Karl Heiss <kheiss@gmail.com>
CC: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/sctp/sm_statefuns.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 9e4e846..486df56 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -2045,7 +2045,7 @@ sctp_disposition_t sctp_sf_do_5_2_4_dupcook(const struct sctp_endpoint *ep,
 	}
 
 	/* Delete the tempory new association. */
-	sctp_add_cmd_sf(commands, SCTP_CMD_NEW_ASOC, SCTP_ASOC(new_asoc));
+	sctp_add_cmd_sf(commands, SCTP_CMD_SET_ASOC, SCTP_ASOC(new_asoc));
 	sctp_add_cmd_sf(commands, SCTP_CMD_DELETE_TCB, SCTP_NULL());
 
 	/* Restore association pointer to provide SCTP command interpeter
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 014/143] ipv6: ip6_sk_dst_check() must not assume ipv6 dst
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (13 preceding siblings ...)
  2014-05-12  0:32 ` [ 013/143] sctp: Use correct sideffect command in duplicate cookie handling Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 015/143] af_key: fix info leaks in notify messages Willy Tarreau
                   ` (128 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Eric Dumazet, Hannes Frederic Sowa, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

commit a963a37d384d71ad43b3e9e79d68d42fbe0901f3 upstream

It's possible to use AF_INET6 sockets and to connect to an IPv4
destination. After this, socket dst cache is a pointer to a rtable,
not rt6_info.

ip6_sk_dst_check() should check the socket dst cache is IPv6, or else
various corruptions/crashes can happen.
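
The underlying pattern is "check the type tag before downcasting". A
minimal sketch with made-up types follows; demo_dst, demo_rt6 and
demo_as_rt6() are hypothetical stand-ins, not the kernel's dst_entry or
rt6_info.

```c
#include <stddef.h>

enum demo_family { DEMO_AF_INET = 2, DEMO_AF_INET6 = 10 };

/* Common base object, embedded first in each route type. */
struct demo_dst {
	enum demo_family family;
};

struct demo_rt4 {
	struct demo_dst dst;
};

struct demo_rt6 {
	struct demo_dst dst;
	int rt6_flags;
};

/* Only hand back an IPv6 route if the cached entry really is one. */
struct demo_rt6 *demo_as_rt6(struct demo_dst *dst)
{
	if (dst == NULL || dst->family != DEMO_AF_INET6)
		return NULL;
	return (struct demo_rt6 *)dst;
}
```

Casting first and checking later would read rt6_flags from memory that
belongs to a different, smaller object.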

Dave Jones can reproduce an immediate crash with
trinity -q -l off -n -c sendmsg -c connect

With help from Hannes Frederic Sowa

Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv6/ip6_output.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 6ba0fe2..bba91a1 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -920,11 +920,17 @@ static struct dst_entry *ip6_sk_dst_check(struct sock *sk,
 					  struct flowi *fl)
 {
 	struct ipv6_pinfo *np = inet6_sk(sk);
-	struct rt6_info *rt = (struct rt6_info *)dst;
+	struct rt6_info *rt;
 
 	if (!dst)
 		goto out;
 
+	if (dst->ops->family != AF_INET6) {
+		dst_release(dst);
+		return NULL;
+	}
+
+	rt = (struct rt6_info *)dst;
 	/* Yes, checking route validity in not connected
 	 * case is not very simple. Take into account,
 	 * that we do not support routing by source, TOS,
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 015/143] af_key: fix info leaks in notify messages
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (14 preceding siblings ...)
  2014-05-12  0:32 ` [ 014/143] ipv6: ip6_sk_dst_check() must not assume ipv6 dst Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 016/143] af_key: initialize satype in key_notify_policy_flush() Willy Tarreau
                   ` (127 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Mathias Krause, Steffen Klassert, David S. Miller, Herbert Xu,
	Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Mathias Krause <minipli@googlemail.com>

commit a5cc68f3d63306d0d288f31edfc2ae6ef8ecd887 upstream

key_notify_sa_flush() and key_notify_policy_flush() fail to initialize
the sadb_msg_reserved member of the broadcast message and thereby
leak 2 bytes of heap memory to listeners. Fix that.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/key/af_key.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/key/af_key.c b/net/key/af_key.c
index 4e98193..03d626f 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -1726,6 +1726,7 @@ static int key_notify_sa_flush(struct km_event *c)
 	hdr->sadb_msg_version = PF_KEY_V2;
 	hdr->sadb_msg_errno = (uint8_t) 0;
 	hdr->sadb_msg_len = (sizeof(struct sadb_msg) / sizeof(uint64_t));
+	hdr->sadb_msg_reserved = 0;
 
 	pfkey_broadcast(skb, GFP_ATOMIC, BROADCAST_ALL, NULL, c->net);
 
@@ -2694,6 +2695,7 @@ static int key_notify_policy_flush(struct km_event *c)
 	hdr->sadb_msg_version = PF_KEY_V2;
 	hdr->sadb_msg_errno = (uint8_t) 0;
 	hdr->sadb_msg_len = (sizeof(struct sadb_msg) / sizeof(uint64_t));
+	hdr->sadb_msg_reserved = 0;
 	pfkey_broadcast(skb_out, GFP_ATOMIC, BROADCAST_ALL, NULL, c->net);
 	return 0;
 
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 016/143] af_key: initialize satype in key_notify_policy_flush()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (15 preceding siblings ...)
  2014-05-12  0:32 ` [ 015/143] af_key: fix info leaks in notify messages Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 017/143] block: do not pass disk names as format strings Willy Tarreau
                   ` (126 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Nicolas Dichtel, Steffen Klassert, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>

commit 85dfb745ee40232876663ae206cba35f24ab2a40 upstream

This field was left uninitialized. Some user daemons perform checks
against this field.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/key/af_key.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/key/af_key.c b/net/key/af_key.c
index 03d626f..9d22e46 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -2694,6 +2694,7 @@ static int key_notify_policy_flush(struct km_event *c)
 	hdr->sadb_msg_pid = c->pid;
 	hdr->sadb_msg_version = PF_KEY_V2;
 	hdr->sadb_msg_errno = (uint8_t) 0;
+	hdr->sadb_msg_satype = SADB_SATYPE_UNSPEC;
 	hdr->sadb_msg_len = (sizeof(struct sadb_msg) / sizeof(uint64_t));
 	hdr->sadb_msg_reserved = 0;
 	pfkey_broadcast(skb_out, GFP_ATOMIC, BROADCAST_ALL, NULL, c->net);
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 017/143] block: do not pass disk names as format strings
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (16 preceding siblings ...)
  2014-05-12  0:32 ` [ 016/143] af_key: initialize satype in key_notify_policy_flush() Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 018/143] b43: stop format string leaking into error msgs Willy Tarreau
                   ` (125 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Kees Cook, Jens Axboe, Andrew Morton, Linus Torvalds, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit ffc8b30866879ed9ba62bd0a86fecdbd51cd3d19 upstream

Disk names may contain arbitrary strings, so they must not be
interpreted as format strings.  It seems that only md allows arbitrary
strings to be used for disk names, but this could allow for a local
memory corruption from uid 0 into ring 0.
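
The same rule applies to any printf-style interface: an untrusted name
goes in as an argument to "%s", never as the format itself. A minimal
userspace sketch using snprintf() follows; demo_set_name() is a
hypothetical helper, not a kernel function.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Copy an untrusted name into dst, treating it strictly as data.
 * Any '%' sequences in the name are copied literally.
 * Returns the length snprintf() reports for the full name.
 */
int demo_set_name(char *dst, size_t len, const char *name)
{
	return snprintf(dst, len, "%s", name);
}
```

Calling snprintf(dst, len, name) instead would make the library parse
sequences such as %x or %n inside the name, which is the class of bug
being closed here.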

CVE-2013-2851

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[jmm: Backport to 2.6.32]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/block/nbd.c   | 4 +++-
 fs/partitions/check.c | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 26ada47..90550ba 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -655,7 +655,9 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *lo,
 
 		mutex_unlock(&lo->tx_lock);
 
-		thread = kthread_create(nbd_thread, lo, lo->disk->disk_name);
+		thread = kthread_create(nbd_thread, lo, "%s",
+					lo->disk->disk_name);
+
 		if (IS_ERR(thread)) {
 			mutex_lock(&lo->tx_lock);
 			return PTR_ERR(thread);
diff --git a/fs/partitions/check.c b/fs/partitions/check.c
index 7b685e1..aa90d88 100644
--- a/fs/partitions/check.c
+++ b/fs/partitions/check.c
@@ -476,7 +476,7 @@ void register_disk(struct gendisk *disk)
 
 	ddev->parent = disk->driverfs_dev;
 
-	dev_set_name(ddev, disk->disk_name);
+	dev_set_name(ddev, "%s", disk->disk_name);
 
 	/* delay uevents, until we scanned partition table */
 	dev_set_uevent_suppress(ddev, 1);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 018/143] b43: stop format string leaking into error msgs
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (17 preceding siblings ...)
  2014-05-12  0:32 ` [ 017/143] block: do not pass disk names as format strings Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 019/143] HID: validate HID report id size Willy Tarreau
                   ` (124 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Kees Cook, John W. Linville, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit e0e29b683d6784ef59bbc914eac85a04b650e63c upstream

The module parameter "fwpostfix" is userspace controllable, unfiltered,
and is used to define the firmware filename. b43_do_request_fw() populates
ctx->errors[] on error, containing the firmware filename. b43err()
parses its arguments as a format string. For systems with b43 hardware,
this could lead to a uid-0 to ring-0 escalation.

CVE-2013-2852

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/wireless/b43/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/b43/main.c b/drivers/net/wireless/b43/main.c
index 94dae56..3cf2472 100644
--- a/drivers/net/wireless/b43/main.c
+++ b/drivers/net/wireless/b43/main.c
@@ -2257,7 +2257,7 @@ static int b43_request_firmware(struct b43_wldev *dev)
 	for (i = 0; i < B43_NR_FWTYPES; i++) {
 		errmsg = ctx->errors[i];
 		if (strlen(errmsg))
-			b43err(dev->wl, errmsg);
+			b43err(dev->wl, "%s", errmsg);
 	}
 	b43_print_fw_helptext(dev->wl, 1);
 	err = -ENOENT;
-- 
1.7.12.2.21.g234cd45.dirty





* [ 019/143] HID: validate HID report id size
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (18 preceding siblings ...)
  2014-05-12  0:32 ` [ 018/143] b43: stop format string leaking into error msgs Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 020/143] HID: zeroplus: validate output report details Willy Tarreau
                   ` (123 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Kees Cook, stable, Jiri Kosina, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit 43622021d2e2b82ea03d883926605bdd0525e1d1 upstream

The "Report ID" field of a HID report is used to build indexes of
reports. The kernel's index of these is limited to 256 entries, so any
malicious device that sets a Report ID greater than 255 will trigger
memory corruption on the host:

[ 1347.156239] BUG: unable to handle kernel paging request at ffff88094958a878
[ 1347.156261] IP: [<ffffffff813e4da0>] hid_register_report+0x2a/0x8b
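
For illustration only (not part of the patch): a simplified userspace
sketch of the bounds check, assuming a fixed 256-entry table indexed by
report ID. The types and the helper report_slot() are hypothetical and
much smaller than the real struct hid_report_enum.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IDS 256	/* mirrors HID_MAX_IDS from the patch */

struct report {
	unsigned int id;
};

struct report_table {
	struct report *by_id[MAX_IDS];
};

/* Return the slot for id, or NULL when the id cannot index the table. */
static struct report **report_slot(struct report_table *t, unsigned int id)
{
	assert(t != NULL);

	if (id >= MAX_IDS)
		return NULL;

	return &t->by_id[id];
}
```

A device-supplied ID of 256 or more is rejected before it is ever used
as an array index.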

CVE-2013-2888

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
[jmm: backport to 2.6.32]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/hid/hid-core.c | 10 +++++++---
 include/linux/hid.h    |  4 +++-
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index 11f8069..e40e3c4 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -58,6 +58,8 @@ static struct hid_report *hid_register_report(struct hid_device *device, unsigne
 	struct hid_report_enum *report_enum = device->report_enum + type;
 	struct hid_report *report;
 
+	if (id >= HID_MAX_IDS)
+		return NULL;
 	if (report_enum->report_id_hash[id])
 		return report_enum->report_id_hash[id];
 
@@ -368,8 +370,10 @@ static int hid_parser_global(struct hid_parser *parser, struct hid_item *item)
 
 	case HID_GLOBAL_ITEM_TAG_REPORT_ID:
 		parser->global.report_id = item_udata(item);
-		if (parser->global.report_id == 0) {
-			dbg_hid("report_id 0 is invalid\n");
+		if (parser->global.report_id == 0 ||
+		    parser->global.report_id >= HID_MAX_IDS) {
+			dbg_hid("report_id %u is invalid\n",
+				parser->global.report_id);
 			return -1;
 		}
 		return 0;
@@ -545,7 +549,7 @@ static void hid_device_release(struct device *dev)
 	for (i = 0; i < HID_REPORT_TYPES; i++) {
 		struct hid_report_enum *report_enum = device->report_enum + i;
 
-		for (j = 0; j < 256; j++) {
+		for (j = 0; j < HID_MAX_IDS; j++) {
 			struct hid_report *report = report_enum->report_id_hash[j];
 			if (report)
 				hid_free_report(report);
diff --git a/include/linux/hid.h b/include/linux/hid.h
index 8709365..481080d 100644
--- a/include/linux/hid.h
+++ b/include/linux/hid.h
@@ -410,10 +410,12 @@ struct hid_report {
 	struct hid_device *device;			/* associated device */
 };
 
+#define HID_MAX_IDS 256
+
 struct hid_report_enum {
 	unsigned numbered;
 	struct list_head report_list;
-	struct hid_report *report_id_hash[256];
+	struct hid_report *report_id_hash[HID_MAX_IDS];
 };
 
 #define HID_REPORT_TYPES 3
-- 
1.7.12.2.21.g234cd45.dirty





* [ 020/143] HID: zeroplus: validate output report details
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (19 preceding siblings ...)
  2014-05-12  0:32 ` [ 019/143] HID: validate HID report id size Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 021/143] HID: pantherlord: " Willy Tarreau
                   ` (122 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Kees Cook, Jiri Kosina, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit 78214e81a1bf43740ce89bb5efda78eac2f8ef83 upstream

The zeroplus HID driver was not checking the size of allocated values
in fields it used. A HID device could send a malicious output report
that would cause the driver to write beyond the output report allocation
during initialization, causing a heap overflow:

[ 1442.728680] usb 1-1: New USB device found, idVendor=0c12, idProduct=0005
...
[ 1466.243173] BUG kmalloc-192 (Tainted: G        W   ): Redzone overwritten

CVE-2013-2889

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
[jmm: backport to 2.6.32]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/hid/hid-zpff.c | 18 +++++-------------
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/drivers/hid/hid-zpff.c b/drivers/hid/hid-zpff.c
index a79f0d7..5617ea9 100644
--- a/drivers/hid/hid-zpff.c
+++ b/drivers/hid/hid-zpff.c
@@ -68,21 +68,13 @@ static int zpff_init(struct hid_device *hid)
 	struct hid_report *report;
 	struct hid_input *hidinput = list_entry(hid->inputs.next,
 						struct hid_input, list);
-	struct list_head *report_list =
-			&hid->report_enum[HID_OUTPUT_REPORT].report_list;
 	struct input_dev *dev = hidinput->input;
-	int error;
+	int i, error;
 
-	if (list_empty(report_list)) {
-		dev_err(&hid->dev, "no output report found\n");
-		return -ENODEV;
-	}
-
-	report = list_entry(report_list->next, struct hid_report, list);
-
-	if (report->maxfield < 4) {
-		dev_err(&hid->dev, "not enough fields in report\n");
-		return -ENODEV;
+	for (i = 0; i < 4; i++) {
+		report = hid_validate_values(hid, HID_OUTPUT_REPORT, 0, i, 1);
+		if (!report)
+			return -ENODEV;
 	}
 
 	zpff = kzalloc(sizeof(struct zpff_device), GFP_KERNEL);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 021/143] HID: pantherlord: validate output report details
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (20 preceding siblings ...)
  2014-05-12  0:32 ` [ 020/143] HID: zeroplus: validate output report details Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 022/143] HID: LG: validate HID " Willy Tarreau
                   ` (121 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Kees Cook, stable, Jiri Kosina, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit 412f30105ec6735224535791eed5cdc02888ecb4 upstream

A HID device could send a malicious output report that would cause the
pantherlord HID driver to write beyond the output report allocation
during initialization, causing a heap overflow:

[  310.939483] usb 1-1: New USB device found, idVendor=0e8f, idProduct=0003
...
[  315.980774] BUG kmalloc-192 (Tainted: G        W   ): Redzone overwritten

CVE-2013-2892

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/hid/hid-pl.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/hid/hid-pl.c b/drivers/hid/hid-pl.c
index c6d7dbc..8cdf7b8 100644
--- a/drivers/hid/hid-pl.c
+++ b/drivers/hid/hid-pl.c
@@ -128,8 +128,14 @@ static int plff_init(struct hid_device *hid)
 			strong = &report->field[0]->value[2];
 			weak = &report->field[0]->value[3];
 			debug("detected single-field device");
-		} else if (report->maxfield >= 4 && report->field[0]->maxusage == 1 &&
-				report->field[0]->usage[0].hid == (HID_UP_LED | 0x43)) {
+		} else if (report->field[0]->maxusage == 1 &&
+			   report->field[0]->usage[0].hid ==
+				(HID_UP_LED | 0x43) &&
+			   report->maxfield >= 4 &&
+			   report->field[0]->report_count >= 1 &&
+			   report->field[1]->report_count >= 1 &&
+			   report->field[2]->report_count >= 1 &&
+			   report->field[3]->report_count >= 1) {
 			report->field[0]->value[0] = 0x00;
 			report->field[1]->value[0] = 0x00;
 			strong = &report->field[2]->value[0];
-- 
1.7.12.2.21.g234cd45.dirty





* [ 022/143] HID: LG: validate HID output report details
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (21 preceding siblings ...)
  2014-05-12  0:32 ` [ 021/143] HID: pantherlord: " Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 023/143] HID: check for NULL field when setting values Willy Tarreau
                   ` (120 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Kees Cook, Jiri Kosina, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit 0fb6bd06e06792469acc15bbe427361b56ada528 upstream

A HID device could send a malicious output report that would cause the
lg, lg3, and lg4 HID drivers to write beyond the output report allocation
during an event, causing a heap overflow:

[  325.245240] usb 1-1: New USB device found, idVendor=046d, idProduct=c287
...
[  414.518960] BUG kmalloc-4096 (Not tainted): Redzone overwritten

Additionally, while lg2 did correctly validate the report details, it was
cleaned up and shortened.

CVE-2013-2893

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

[jmm: backported to 2.6.32]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/hid/hid-lg2ff.c | 19 +++----------------
 drivers/hid/hid-lgff.c  | 17 ++---------------
 2 files changed, 5 insertions(+), 31 deletions(-)

diff --git a/drivers/hid/hid-lg2ff.c b/drivers/hid/hid-lg2ff.c
index 4e6dc6e..a260a8c 100644
--- a/drivers/hid/hid-lg2ff.c
+++ b/drivers/hid/hid-lg2ff.c
@@ -65,26 +65,13 @@ int lg2ff_init(struct hid_device *hid)
 	struct hid_report *report;
 	struct hid_input *hidinput = list_entry(hid->inputs.next,
 						struct hid_input, list);
-	struct list_head *report_list =
-			&hid->report_enum[HID_OUTPUT_REPORT].report_list;
 	struct input_dev *dev = hidinput->input;
 	int error;
 
-	if (list_empty(report_list)) {
-		dev_err(&hid->dev, "no output report found\n");
+	/* Check that the report looks ok */
+	report = hid_validate_values(hid, HID_OUTPUT_REPORT, 0, 0, 7);
+	if (!report)
 		return -ENODEV;
-	}
-
-	report = list_entry(report_list->next, struct hid_report, list);
-
-	if (report->maxfield < 1) {
-		dev_err(&hid->dev, "output report is empty\n");
-		return -ENODEV;
-	}
-	if (report->field[0]->report_count < 7) {
-		dev_err(&hid->dev, "not enough values in the field\n");
-		return -ENODEV;
-	}
 
 	lg2ff = kmalloc(sizeof(struct lg2ff_device), GFP_KERNEL);
 	if (!lg2ff)
diff --git a/drivers/hid/hid-lgff.c b/drivers/hid/hid-lgff.c
index 987abeb..df26abb 100644
--- a/drivers/hid/hid-lgff.c
+++ b/drivers/hid/hid-lgff.c
@@ -135,27 +135,14 @@ static void hid_lgff_set_autocenter(struct input_dev *dev, u16 magnitude)
 int lgff_init(struct hid_device* hid)
 {
 	struct hid_input *hidinput = list_entry(hid->inputs.next, struct hid_input, list);
-	struct list_head *report_list = &hid->report_enum[HID_OUTPUT_REPORT].report_list;
 	struct input_dev *dev = hidinput->input;
-	struct hid_report *report;
-	struct hid_field *field;
 	const signed short *ff_bits = ff_joystick;
 	int error;
 	int i;
 
-	/* Find the report to use */
-	if (list_empty(report_list)) {
-		err_hid("No output report found");
-		return -1;
-	}
-
 	/* Check that the report looks ok */
-	report = list_entry(report_list->next, struct hid_report, list);
-	field = report->field[0];
-	if (!field) {
-		err_hid("NULL field");
-		return -1;
-	}
+	if (!hid_validate_values(hid, HID_OUTPUT_REPORT, 0, 0, 7))
+		return -ENODEV;
 
 	for (i = 0; i < ARRAY_SIZE(devices); i++) {
 		if (dev->id.vendor == devices[i].idVendor &&
-- 
1.7.12.2.21.g234cd45.dirty





* [ 023/143] HID: check for NULL field when setting values
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (22 preceding siblings ...)
  2014-05-12  0:32 ` [ 022/143] HID: LG: validate HID " Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 024/143] HID: provide a helper for validating hid reports Willy Tarreau
                   ` (119 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Kees Cook, stable, Jiri Kosina, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit be67b68d52fa28b9b721c47bb42068f0c1214855 upstream

Defensively check that the field to be worked on is not NULL.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/hid/hid-core.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index e40e3c4..a222cbb 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -983,7 +983,12 @@ EXPORT_SYMBOL_GPL(hid_output_report);
 
 int hid_set_field(struct hid_field *field, unsigned offset, __s32 value)
 {
-	unsigned size = field->report_size;
+	unsigned size;
+
+	if (!field)
+		return -1;
+
+	size = field->report_size;
 
 	hid_dump_input(field->report->device, field->usage + offset, value);
 
-- 
1.7.12.2.21.g234cd45.dirty





* [ 024/143] HID: provide a helper for validating hid reports
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (23 preceding siblings ...)
  2014-05-12  0:32 ` [ 023/143] HID: check for NULL field when setting values Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 025/143] crypto: api - Fix race condition in larval lookup Willy Tarreau
                   ` (118 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Kees Cook, Jiri Kosina, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit 331415ff16a12147d57d5c953f3a961b7ede348b upstream

Many drivers need to validate the characteristics of their HID report
during initialization to avoid misusing the reports. This adds a common
helper to perform validation of the report existing, the field existing,
and the expected number of values within the field.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

[jmm: backported to 2.6.32]
[wt: dev_err() in 2.6.32 instead of hid_err()]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/hid/hid-core.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/hid.h    |  4 ++++
 2 files changed, 62 insertions(+)

diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index a222cbb..e7e28b5 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -808,6 +808,64 @@ static __inline__ int search(__s32 *array, __s32 value, unsigned n)
 	return -1;
 }
 
+static const char * const hid_report_names[] = {
+	"HID_INPUT_REPORT",
+	"HID_OUTPUT_REPORT",
+	"HID_FEATURE_REPORT",
+};
+/**
+ * hid_validate_values - validate existing device report's value indexes
+ *
+ * @device: hid device
+ * @type: which report type to examine
+ * @id: which report ID to examine (0 for first)
+ * @field_index: which report field to examine
+ * @report_counts: expected number of values
+ *
+ * Validate the number of values in a given field of a given report, after
+ * parsing.
+ */
+struct hid_report *hid_validate_values(struct hid_device *hid,
+				       unsigned int type, unsigned int id,
+				       unsigned int field_index,
+				       unsigned int report_counts)
+{
+	struct hid_report *report;
+
+	if (type > HID_FEATURE_REPORT) {
+		dev_err(&hid->dev, "invalid HID report type %u\n", type);
+		return NULL;
+	}
+
+	if (id >= HID_MAX_IDS) {
+		dev_err(&hid->dev, "invalid HID report id %u\n", id);
+		return NULL;
+	}
+
+	/*
+	 * Explicitly not using hid_get_report() here since it depends on
+	 * ->numbered being checked, which may not always be the case when
+	 * drivers go to access report values.
+	 */
+	report = hid->report_enum[type].report_id_hash[id];
+	if (!report) {
+		dev_err(&hid->dev, "missing %s %u\n", hid_report_names[type], id);
+		return NULL;
+	}
+	if (report->maxfield <= field_index) {
+		dev_err(&hid->dev, "not enough fields in %s %u\n",
+			hid_report_names[type], id);
+		return NULL;
+	}
+	if (report->field[field_index]->report_count < report_counts) {
+		dev_err(&hid->dev, "not enough values in %s %u field %u\n",
+			hid_report_names[type], id, field_index);
+		return NULL;
+	}
+	return report;
+}
+EXPORT_SYMBOL_GPL(hid_validate_values);
+
 /**
  * hid_match_report - check if driver's raw_event should be called
  *
diff --git a/include/linux/hid.h b/include/linux/hid.h
index 481080d..e5db8e5 100644
--- a/include/linux/hid.h
+++ b/include/linux/hid.h
@@ -693,6 +693,10 @@ int hidinput_find_field(struct hid_device *hid, unsigned int type, unsigned int
 void hid_output_report(struct hid_report *report, __u8 *data);
 struct hid_device *hid_allocate_device(void);
 int hid_parse_report(struct hid_device *hid, __u8 *start, unsigned size);
+struct hid_report *hid_validate_values(struct hid_device *hid,
+				       unsigned int type, unsigned int id,
+				       unsigned int field_index,
+				       unsigned int report_counts);
 int hid_check_keys_pressed(struct hid_device *hid);
 int hid_connect(struct hid_device *hid, unsigned int connect_mask);
 void hid_disconnect(struct hid_device *hid);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 025/143] crypto: api - Fix race condition in larval lookup
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (24 preceding siblings ...)
  2014-05-12  0:32 ` [ 024/143] HID: provide a helper for validating hid reports Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 026/143] ipv6: tcp: fix panic in SYN processing Willy Tarreau
                   ` (117 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Herbert Xu, Nikola Pajkovsky, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Nikola Pajkovsky <npajkovs@redhat.com>

https://bugzilla.redhat.com/1016108

64z is missing rhel6 commit 3af031a395c0 ("[crypto] algboss: Hold ref
count on larval") which is causing cosmetic fuzz, because crypto_alg_get
was moved from crypto/api.c to crypto/internal.h.

From: Herbert Xu <herbert@gondor.apana.org.au>

[ upstream commit 77dbd7a95e4a4f15264c333a9e9ab97ee27dc2aa ]

crypto_larval_lookup should only return a larval if it created one.
Any larval created by another entity must be processed through
crypto_larval_wait before being returned.

Otherwise this will lead to a larval being killed twice, which
will most likely lead to a crash.

Cc: stable@vger.kernel.org
Reported-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Nikola Pajkovsky <npajkovs@redhat.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 crypto/api.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/crypto/api.c b/crypto/api.c
index 798526d..f4be65f 100644
--- a/crypto/api.c
+++ b/crypto/api.c
@@ -40,6 +40,8 @@ static inline struct crypto_alg *crypto_alg_get(struct crypto_alg *alg)
 	return alg;
 }
 
+static struct crypto_alg *crypto_larval_wait(struct crypto_alg *alg);
+
 struct crypto_alg *crypto_mod_get(struct crypto_alg *alg)
 {
 	return try_module_get(alg->cra_module) ? crypto_alg_get(alg) : NULL;
@@ -150,8 +152,11 @@ static struct crypto_alg *crypto_larval_add(const char *name, u32 type,
 	}
 	up_write(&crypto_alg_sem);
 
-	if (alg != &larval->alg)
+	if (alg != &larval->alg) {
 		kfree(larval);
+		if (crypto_is_larval(alg))
+			alg = crypto_larval_wait(alg);
+	}
 
 	return alg;
 }
-- 
1.7.12.2.21.g234cd45.dirty





* [ 026/143] ipv6: tcp: fix panic in SYN processing
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (25 preceding siblings ...)
  2014-05-12  0:32 ` [ 025/143] crypto: api - Fix race condition in larval lookup Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 027/143] tcp: must unclone packets before mangling them Willy Tarreau
                   ` (116 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Eric Dumazet, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <eric.dumazet@gmail.com>

commit 72a3effaf633bc ([NET]: Size listen hash tables using backlog
hint) added a bug allowing inet6_synq_hash() to return an out of bound
array index, because of u16 overflow.

The bug can happen if system admins set the net.core.somaxconn and
net.ipv4.tcp_max_syn_backlog sysctls to values greater than 65536.
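
For illustration only (not part of the patch): a userspace sketch of the
truncation, assuming the hash is reduced to a bucket with a power-of-two
mask (hash & (size - 1)). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour: the table size is squeezed through a 16-bit value. */
static uint32_t bucket_u16(uint32_t hash, uint32_t table_size)
{
	uint16_t hsize = (uint16_t)table_size;	/* 131072 becomes 0 */

	/* 0 - 1 yields an all-ones mask, so the hash is not reduced. */
	return hash & (uint32_t)(hsize - 1);
}

/* Fixed behaviour: the size keeps its full 32-bit width. */
static uint32_t bucket_u32(uint32_t hash, uint32_t table_size)
{
	assert(table_size != 0 && (table_size & (table_size - 1)) == 0);

	return hash & (table_size - 1);
}
```

For a table of 131072 entries, bucket_u16() hands back the raw hash,
while bucket_u32() stays below the table size.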

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c16a98ed91597b40b22b540c6517103497ef8e74)
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv6/inet6_connection_sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c
index cc4797d..59f4063 100644
--- a/net/ipv6/inet6_connection_sock.c
+++ b/net/ipv6/inet6_connection_sock.c
@@ -57,7 +57,7 @@ EXPORT_SYMBOL_GPL(inet6_csk_bind_conflict);
  * request_sock (formerly open request) hash tables.
  */
 static u32 inet6_synq_hash(const struct in6_addr *raddr, const __be16 rport,
-			   const u32 rnd, const u16 synq_hsize)
+			   const u32 rnd, const u32 synq_hsize)
 {
 	u32 a = (__force u32)raddr->s6_addr32[0];
 	u32 b = (__force u32)raddr->s6_addr32[1];
-- 
1.7.12.2.21.g234cd45.dirty





* [ 027/143] tcp: must unclone packets before mangling them
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (26 preceding siblings ...)
  2014-05-12  0:32 ` [ 026/143] ipv6: tcp: fix panic in SYN processing Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 028/143] net: do not call sock_put() on TIMEWAIT sockets Willy Tarreau
                   ` (115 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Eric Dumazet, Neal Cardwell, Yuchung Cheng, David S. Miller,
	Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit c52e2421f7368fd36cbe330d2cf41b10452e39a9 ]

TCP stack should make sure it owns skbs before mangling them.

We had various crashes using bnx2x, and it turned out gso_size
was cleared right before the bnx2x driver was populating the TC descriptor
of the _previous_ packet sent. The TCP stack can sometimes retransmit
packets that are still in the Qdisc.

Of course we could make the bnx2x driver more robust (using
ACCESS_ONCE(shinfo->gso_size) for example), but the bug is in the TCP stack.

We have identified two points where skb_unclone() was needed.

This patch adds a WARN_ON_ONCE() to warn us if we missed another
fix of this kind.

Kudos to Neal for finding the root cause of this bug. It's visible
using small MSS.
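
For illustration only (not part of the patch): a userspace sketch of the
"unclone before mangling" idea, using a reference-counted payload in
place of skb shared info. All types and names here (struct pkt,
pkt_unclone(), and so on) are hypothetical and far simpler than
struct sk_buff.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct shared_data {
	int refs;			/* number of packets pointing here */
	unsigned char bytes[16];
};

struct pkt {
	struct shared_data *d;
};

/* True when another packet still references the same payload. */
static int pkt_cloned(const struct pkt *p)
{
	return p->d->refs != 1;
}

/* Give p a private copy of the payload if it is currently shared. */
static int pkt_unclone(struct pkt *p)
{
	struct shared_data *copy;

	assert(p != NULL && p->d != NULL);

	if (!pkt_cloned(p))
		return 0;

	copy = malloc(sizeof(*copy));
	if (!copy)
		return -1;

	memcpy(copy, p->d, sizeof(*copy));
	copy->refs = 1;
	p->d->refs--;
	p->d = copy;
	return 0;
}
```

After pkt_unclone() succeeds, writes through one packet no longer show
up in the other holder's payload, which is the property the TCP code
needs before it touches gso_size and gso_segs.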

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 include/linux/skbuff.h | 10 ++++++++++
 net/ipv4/tcp_output.c  |  9 ++++++---
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4e647bb..ae77862 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -641,6 +641,16 @@ static inline int skb_cloned(const struct sk_buff *skb)
 	       (atomic_read(&skb_shinfo(skb)->dataref) & SKB_DATAREF_MASK) != 1;
 }
 
+static inline int skb_unclone(struct sk_buff *skb, gfp_t pri)
+{
+	might_sleep_if(pri & __GFP_WAIT);
+
+	if (skb_cloned(skb))
+		return pskb_expand_head(skb, 0, 0, pri);
+
+	return 0;
+}
+
 /**
  *	skb_header_cloned - is the header a clone
  *	@skb: buffer to check
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 38a23e4..49da29e 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -744,6 +744,9 @@ static void tcp_queue_skb(struct sock *sk, struct sk_buff *skb)
 static void tcp_set_skb_tso_segs(struct sock *sk, struct sk_buff *skb,
 				 unsigned int mss_now)
 {
+	/* Make sure we own this skb before messing gso_size/gso_segs */
+	WARN_ON_ONCE(skb_cloned(skb));
+
 	if (skb->len <= mss_now || !sk_can_gso(sk) ||
 	    skb->ip_summed == CHECKSUM_NONE) {
 		/* Avoid the costly divide in the normal
@@ -824,9 +827,7 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
 	if (nsize < 0)
 		nsize = 0;
 
-	if (skb_cloned(skb) &&
-	    skb_is_nonlinear(skb) &&
-	    pskb_expand_head(skb, 0, 0, GFP_ATOMIC))
+	if (skb_unclone(skb, GFP_ATOMIC))
 		return -ENOMEM;
 
 	/* Get a new skb... force flag on. */
@@ -1932,6 +1933,8 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 		int oldpcount = tcp_skb_pcount(skb);
 
 		if (unlikely(oldpcount > 1)) {
+			if (skb_unclone(skb, GFP_ATOMIC))
+				return -ENOMEM;
 			tcp_init_tso_segs(sk, skb, cur_mss);
 			tcp_adjust_pcount(sk, skb, oldpcount - tcp_skb_pcount(skb));
 		}
-- 
1.7.12.2.21.g234cd45.dirty





* [ 028/143] net: do not call sock_put() on TIMEWAIT sockets
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (27 preceding siblings ...)
  2014-05-12  0:32 ` [ 027/143] tcp: must unclone packets before mangling them Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 029/143] net: heap overflow in __audit_sockaddr() Willy Tarreau
                   ` (114 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Eric Dumazet, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 80ad1d61e72d626e30ebe8529a0455e660ca4693 ]

commit 3ab5aee7fe84 ("net: Convert TCP & DCCP hash tables to use RCU /
hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets.

We should instead use inet_twsk_put().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv4/inet_hashtables.c  | 2 +-
 net/ipv6/inet6_hashtables.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index d717267..03fd04a 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -247,7 +247,7 @@ begintw:
 			}
 			if (unlikely(!INET_TW_MATCH(sk, net, hash, acookie,
 				 saddr, daddr, ports, dif))) {
-				sock_put(sk);
+				inet_twsk_put(inet_twsk(sk));
 				goto begintw;
 			}
 			goto out;
diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
index 093e9b2..93765577 100644
--- a/net/ipv6/inet6_hashtables.c
+++ b/net/ipv6/inet6_hashtables.c
@@ -104,7 +104,7 @@ begintw:
 				goto out;
 			}
 			if (!INET6_TW_MATCH(sk, net, hash, saddr, daddr, ports, dif)) {
-				sock_put(sk);
+				inet_twsk_put(inet_twsk(sk));
 				goto begintw;
 			}
 			goto out;
-- 
1.7.12.2.21.g234cd45.dirty





* [ 029/143] net: heap overflow in __audit_sockaddr()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (28 preceding siblings ...)
  2014-05-12  0:32 ` [ 028/143] net: do not call sock_put() on TIMEWAIT sockets Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 030/143] proc connector: fix info leaks Willy Tarreau
                   ` (113 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Dan Carpenter, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

[ Upstream commit 1661bf364ae9c506bc8795fef70d1532931be1e8 ]

We need to cap ->msg_namelen or it leads to a buffer overflow when we
do the memcpy() in __audit_sockaddr().  It requires CAP_AUDIT_CONTROL to
exploit this bug.

The call tree is:
___sys_recvmsg()
  move_addr_to_user()
    audit_sockaddr()
      __audit_sockaddr()

Reported-by: Jüri Aedla <juri.aedla@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[wt: 2.6.32: msg_sys is a struct, not a pointer]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/compat.c |  2 ++
 net/socket.c | 24 ++++++++++++++++++++----
 2 files changed, 22 insertions(+), 4 deletions(-)
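
For illustration only, a minimal user-space sketch of the length check this
patch adds. The demo_* names are hypothetical stand-ins, not kernel API; the
only point shown is that the header is rejected with -EINVAL before any later
code can memcpy() msg_namelen bytes into a sockaddr_storage-sized buffer.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the few msghdr fields that matter here. */
struct demo_msghdr {
	void *msg_name;
	unsigned int msg_namelen;
};

/*
 * Copy a caller-supplied header, then reject an address length that
 * cannot fit in a struct sockaddr_storage.  memcpy() stands in for
 * copy_from_user(), so the -EFAULT path is not modelled.
 */
static int demo_copy_msghdr(struct demo_msghdr *kmsg,
			    const struct demo_msghdr *umsg)
{
	memcpy(kmsg, umsg, sizeof(*kmsg));
	if (kmsg->msg_namelen > sizeof(struct sockaddr_storage))
		return -EINVAL;
	return 0;
}
```

A length of exactly sizeof(struct sockaddr_storage) is still accepted; only
larger values are refused.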

diff --git a/net/compat.c b/net/compat.c
index 9559afc..da3d0fc 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -69,6 +69,8 @@ int get_compat_msghdr(struct msghdr *kmsg, struct compat_msghdr __user *umsg)
 	    __get_user(kmsg->msg_controllen, &umsg->msg_controllen) ||
 	    __get_user(kmsg->msg_flags, &umsg->msg_flags))
 		return -EFAULT;
+	if (kmsg->msg_namelen > sizeof(struct sockaddr_storage))
+		return -EINVAL;
 	kmsg->msg_name = compat_ptr(tmp1);
 	kmsg->msg_iov = compat_ptr(tmp2);
 	kmsg->msg_control = compat_ptr(tmp3);
diff --git a/net/socket.c b/net/socket.c
index bf9fc68..9f8cd74 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1863,6 +1863,16 @@ SYSCALL_DEFINE2(shutdown, int, fd, int, how)
 #define COMPAT_NAMELEN(msg)	COMPAT_MSG(msg, msg_namelen)
 #define COMPAT_FLAGS(msg)	COMPAT_MSG(msg, msg_flags)
 
+static int copy_msghdr_from_user(struct msghdr *kmsg,
+                                 struct msghdr __user *umsg)
+{
+	if (copy_from_user(kmsg, umsg, sizeof(struct msghdr)))
+		return -EFAULT;
+	if (kmsg->msg_namelen > sizeof(struct sockaddr_storage))
+		return -EINVAL;
+	return 0;
+}
+
 /*
  *	BSD sendmsg interface
  */
@@ -1887,8 +1897,11 @@ SYSCALL_DEFINE3(sendmsg, int, fd, struct msghdr __user *, msg, unsigned, flags)
 		if (get_compat_msghdr(&msg_sys, msg_compat))
 			return -EFAULT;
 	}
-	else if (copy_from_user(&msg_sys, msg, sizeof(struct msghdr)))
-		return -EFAULT;
+	else {
+		err = copy_msghdr_from_user(&msg_sys, msg);
+		if (err)
+			return err;
+	}
 
 	sock = sockfd_lookup_light(fd, &err, &fput_needed);
 	if (!sock)
@@ -1997,8 +2010,11 @@ SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
 		if (get_compat_msghdr(&msg_sys, msg_compat))
 			return -EFAULT;
 	}
-	else if (copy_from_user(&msg_sys, msg, sizeof(struct msghdr)))
-		return -EFAULT;
+	else {
+		err = copy_msghdr_from_user(&msg_sys, msg);
+		if (err)
+			return err;
+	}
 
 	sock = sockfd_lookup_light(fd, &err, &fput_needed);
 	if (!sock)
-- 
1.7.12.2.21.g234cd45.dirty





* [ 030/143] proc connector: fix info leaks
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (29 preceding siblings ...)
  2014-05-12  0:32 ` [ 029/143] net: heap overflow in __audit_sockaddr() Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  8:41   ` Christoph Biedl
  2014-05-12  8:51   ` Mathias Krause
  2014-05-12  0:32 ` [ 031/143] can: dev: fix nlmsg size calculation in can_get_size() Willy Tarreau
                   ` (112 subsequent siblings)
  143 siblings, 2 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Mathias Krause, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Mathias Krause <minipli@googlemail.com>

[ Upstream commit e727ca82e0e9616ab4844301e6bae60ca7327682 ]

Initialize event_data for all possible message types to prevent leaking
kernel stack contents to userland (up to 20 bytes). Also set the flags
member of the connector message to 0 to prevent leaking two more stack
bytes this way.

Cc: stable@vger.kernel.org  
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/connector/cn_proc.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index 6069790..3a2587a 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -59,6 +59,7 @@ void proc_fork_connector(struct task_struct *task)
 
 	msg = (struct cn_msg*)buffer;
 	ev = (struct proc_event*)msg->data;
+	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	get_seq(&msg->seq, &ev->cpu);
 	ktime_get_ts(&ts); /* get high res monotonic timestamp */
 	put_unaligned(timespec_to_ns(&ts), (__u64 *)&ev->timestamp_ns);
@@ -71,6 +72,7 @@ void proc_fork_connector(struct task_struct *task)
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
 	msg->ack = 0; /* not used */
 	msg->len = sizeof(*ev);
+	msg->flags = 0; /* not used */
 	/*  If cn_netlink_send() failed, the data is not sent */
 	cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
 }
@@ -87,6 +89,7 @@ void proc_exec_connector(struct task_struct *task)
 
 	msg = (struct cn_msg*)buffer;
 	ev = (struct proc_event*)msg->data;
+	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	get_seq(&msg->seq, &ev->cpu);
 	ktime_get_ts(&ts); /* get high res monotonic timestamp */
 	put_unaligned(timespec_to_ns(&ts), (__u64 *)&ev->timestamp_ns);
@@ -97,6 +100,7 @@ void proc_exec_connector(struct task_struct *task)
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
 	msg->ack = 0; /* not used */
 	msg->len = sizeof(*ev);
+	msg->flags = 0; /* not used */
 	cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
 }
 
@@ -113,6 +117,7 @@ void proc_id_connector(struct task_struct *task, int which_id)
 
 	msg = (struct cn_msg*)buffer;
 	ev = (struct proc_event*)msg->data;
+	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	ev->what = which_id;
 	ev->event_data.id.process_pid = task->pid;
 	ev->event_data.id.process_tgid = task->tgid;
@@ -136,6 +141,7 @@ void proc_id_connector(struct task_struct *task, int which_id)
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
 	msg->ack = 0; /* not used */
 	msg->len = sizeof(*ev);
+	msg->flags = 0; /* not used */
 	cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
 }
 
@@ -151,6 +157,7 @@ void proc_sid_connector(struct task_struct *task)
 
 	msg = (struct cn_msg *)buffer;
 	ev = (struct proc_event *)msg->data;
+	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	get_seq(&msg->seq, &ev->cpu);
 	ktime_get_ts(&ts); /* get high res monotonic timestamp */
 	put_unaligned(timespec_to_ns(&ts), (__u64 *)&ev->timestamp_ns);
@@ -161,6 +168,7 @@ void proc_sid_connector(struct task_struct *task)
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
 	msg->ack = 0; /* not used */
 	msg->len = sizeof(*ev);
+	msg->flags = 0; /* not used */
 	cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
 }
 
@@ -176,8 +184,10 @@ void proc_exit_connector(struct task_struct *task)
 
 	msg = (struct cn_msg*)buffer;
 	ev = (struct proc_event*)msg->data;
+	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	get_seq(&msg->seq, &ev->cpu);
 	ktime_get_ts(&ts); /* get high res monotonic timestamp */
+	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	put_unaligned(timespec_to_ns(&ts), (__u64 *)&ev->timestamp_ns);
 	ev->what = PROC_EVENT_EXIT;
 	ev->event_data.exit.process_pid = task->pid;
@@ -188,6 +198,7 @@ void proc_exit_connector(struct task_struct *task)
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
 	msg->ack = 0; /* not used */
 	msg->len = sizeof(*ev);
+	msg->flags = 0; /* not used */
 	cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
 }
 
@@ -211,6 +222,7 @@ static void cn_proc_ack(int err, int rcvd_seq, int rcvd_ack)
 
 	msg = (struct cn_msg*)buffer;
 	ev = (struct proc_event*)msg->data;
+	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	msg->seq = rcvd_seq;
 	ktime_get_ts(&ts); /* get high res monotonic timestamp */
 	put_unaligned(timespec_to_ns(&ts), (__u64 *)&ev->timestamp_ns);
@@ -220,6 +232,7 @@ static void cn_proc_ack(int err, int rcvd_seq, int rcvd_ack)
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
 	msg->ack = rcvd_ack + 1;
 	msg->len = sizeof(*ev);
+	msg->flags = 0; /* not used */
 	cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
 }
 
@@ -249,6 +262,7 @@ static void cn_proc_mcast_ctl(struct cn_msg *msg,
 		break;
 	}
 	cn_proc_ack(err, msg->seq, msg->ack);
+	msg->flags = 0; /* not used */
 }
 
 /*
@@ -269,3 +283,5 @@ static int __init cn_proc_init(void)
 }
 
 module_init(cn_proc_init);
+	memset(&ev->event_data, 0, sizeof(ev->event_data));
+	msg->flags = 0; /* not used */
-- 
1.7.12.2.21.g234cd45.dirty





* [ 031/143] can: dev: fix nlmsg size calculation in can_get_size()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (30 preceding siblings ...)
  2014-05-12  0:32 ` [ 030/143] proc connector: fix info leaks Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 032/143] net: vlan: fix nlmsg size calculation in vlan_get_size() Willy Tarreau
                   ` (111 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Marc Kleine-Budde, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Marc Kleine-Budde <mkl@pengutronix.de>

[ Upstream commit fe119a05f8ca481623a8d02efcc984332e612528 ]

This patch fixes the calculation of the nlmsg size, by adding the missing
nla_total_size().

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/can/dev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
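
As a rough sketch of why a bare sizeof() under-counts: each netlink attribute
carries a 4-byte header and is padded to a 4-byte boundary. The DEMO_* macros
and demo_nla_total_size() below are user-space re-implementations written for
this note (assumed to match what nla_total_size() computes), not the kernel
helpers themselves.

```c
#include <stddef.h>

/* Assumed attribute alignment and aligned header size (u16 len + u16 type). */
#define DEMO_NLA_ALIGNTO ((size_t)4)
#define DEMO_NLA_ALIGN(len) \
	(((len) + DEMO_NLA_ALIGNTO - 1) & ~(DEMO_NLA_ALIGNTO - 1))
#define DEMO_NLA_HDRLEN ((size_t)4)

/* Space one attribute occupies in the message: header + payload + padding. */
static size_t demo_nla_total_size(size_t payload)
{
	return DEMO_NLA_ALIGN(DEMO_NLA_HDRLEN + payload);
}
```

Under these assumptions an 8-byte payload needs 12 bytes, and a 2-byte payload
needs 8, so every attribute sized with plain sizeof() was short by at least
the header.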

diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev.c
index 2868fe8..ea2749f9 100644
--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -595,12 +595,12 @@ static size_t can_get_size(const struct net_device *dev)
 	size_t size;
 
 	size = nla_total_size(sizeof(u32));   /* IFLA_CAN_STATE */
-	size += sizeof(struct can_ctrlmode);  /* IFLA_CAN_CTRLMODE */
+	size += nla_total_size(sizeof(struct can_ctrlmode));  /* IFLA_CAN_CTRLMODE */
 	size += nla_total_size(sizeof(u32));  /* IFLA_CAN_RESTART_MS */
-	size += sizeof(struct can_bittiming); /* IFLA_CAN_BITTIMING */
-	size += sizeof(struct can_clock);     /* IFLA_CAN_CLOCK */
+	size += nla_total_size(sizeof(struct can_bittiming)); /* IFLA_CAN_BITTIMING */
+	size += nla_total_size(sizeof(struct can_clock));     /* IFLA_CAN_CLOCK */
 	if (priv->bittiming_const)	      /* IFLA_CAN_BITTIMING_CONST */
-		size += sizeof(struct can_bittiming_const);
+		size += nla_total_size(sizeof(struct can_bittiming_const));
 
 	return size;
 }
-- 
1.7.12.2.21.g234cd45.dirty





* [ 032/143] net: vlan: fix nlmsg size calculation in vlan_get_size()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (31 preceding siblings ...)
  2014-05-12  0:32 ` [ 031/143] can: dev: fix nlmsg size calculation in can_get_size() Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 033/143] farsync: fix info leak in ioctl Willy Tarreau
                   ` (110 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Patrick McHardy, Marc Kleine-Budde, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Marc Kleine-Budde <mkl@pengutronix.de>

[ Upstream commit c33a39c575068c2ea9bffb22fd6de2df19c74b89 ]

This patch fixes the calculation of the nlmsg size, by adding the missing
nla_total_size().

Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/8021q/vlan_netlink.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c
index a915048..1f13bcf 100644
--- a/net/8021q/vlan_netlink.c
+++ b/net/8021q/vlan_netlink.c
@@ -169,7 +169,7 @@ static size_t vlan_get_size(const struct net_device *dev)
 	struct vlan_dev_info *vlan = vlan_dev_info(dev);
 
 	return nla_total_size(2) +	/* IFLA_VLAN_ID */
-	       sizeof(struct ifla_vlan_flags) + /* IFLA_VLAN_FLAGS */
+	       nla_total_size(sizeof(struct ifla_vlan_flags)) + /* IFLA_VLAN_FLAGS */
 	       vlan_qos_map_size(vlan->nr_ingress_mappings) +
 	       vlan_qos_map_size(vlan->nr_egress_mappings);
 }
-- 
1.7.12.2.21.g234cd45.dirty





* [ 033/143] farsync: fix info leak in ioctl
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (32 preceding siblings ...)
  2014-05-12  0:32 ` [ 032/143] net: vlan: fix nlmsg size calculation in vlan_get_size() Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 034/143] connector: use nlmsg_len() to check message length Willy Tarreau
                   ` (109 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Dan Carpenter, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Salva Peiró <speiro@ai2.upv.es>

[ Upstream commit 96b340406724d87e4621284ebac5e059d67b2194 ]

The fst_get_iface() code fails to initialize the two padding bytes of
struct sync_serial_settings after the ->loopback member. Add an explicit
memset(0) before filling the structure to avoid the info leak.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/wan/farsync.c | 1 +
 1 file changed, 1 insertion(+)
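
A small user-space sketch of the pattern, for illustration only. The
demo_settings layout is an assumption meant to mirror two 4-byte members
followed by a 2-byte member, which leaves two trailing padding bytes on common
ABIs; all demo_* names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: two 4-byte fields, one 2-byte field, then padding. */
struct demo_settings {
	unsigned int clock_rate;
	unsigned int clock_type;
	unsigned short loopback;
};

/*
 * Zero the whole object first so trailing padding has a defined value,
 * then assign the named members.
 */
static void demo_fill_settings(struct demo_settings *out)
{
	memset(out, 0, sizeof(*out));
	out->clock_rate = 64000;
	out->clock_type = 1;
	out->loopback = 0;
}

/* Count non-zero bytes after the last member (the padding, if any). */
static size_t demo_dirty_padding(const struct demo_settings *s)
{
	const unsigned char *p = (const unsigned char *)s;
	size_t i, dirty = 0;

	for (i = offsetof(struct demo_settings, loopback) + sizeof(s->loopback);
	     i < sizeof(*s); i++)
		dirty += (p[i] != 0);
	return dirty;
}
```

Without the memset(), whatever was previously on the stack in those padding
bytes would be copied out along with the members.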

diff --git a/drivers/net/wan/farsync.c b/drivers/net/wan/farsync.c
index beda387..433bf99 100644
--- a/drivers/net/wan/farsync.c
+++ b/drivers/net/wan/farsync.c
@@ -1971,6 +1971,7 @@ fst_get_iface(struct fst_card_info *card, struct fst_port_info *port,
 	}
 
 	i = port->index;
+	memset(&sync, 0, sizeof(sync));
 	sync.clock_rate = FST_RDL(card, portConfig[i].lineSpeed);
 	/* Lucky card and linux use same encoding here */
 	sync.clock_type = FST_RDB(card, portConfig[i].internalClock) ==
-- 
1.7.12.2.21.g234cd45.dirty





* [ 034/143] connector: use nlmsg_len() to check message length
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (33 preceding siblings ...)
  2014-05-12  0:32 ` [ 033/143] farsync: fix info leak in ioctl Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 035/143] net: dst: provide accessor function to dst->xfrm Willy Tarreau
                   ` (108 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Mathias Krause, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Mathias Krause <minipli@googlemail.com>

[ Upstream commit 162b2bedc084d2d908a04c93383ba02348b648b0 ]

The current code tests the length of the whole netlink message to be
at least as long to fit a cn_msg. This is wrong as nlmsg_len includes
the length of the netlink message header. Use nlmsg_len() instead to
fix this "off-by-NLMSG_HDRLEN" size check.

Cc: stable@vger.kernel.org  
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/connector/connector.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
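
To make the "off-by-NLMSG_HDRLEN" concrete, here is a hedged user-space
sketch. The two constants are assumptions chosen for the example (a 16-byte
netlink header and a 20-byte cn_msg), and the demo_* helpers are hypothetical,
not the kernel's nlmsg_len().

```c
#define DEMO_NLMSG_HDRLEN 16	/* assumed aligned size of struct nlmsghdr */
#define DEMO_CN_MSG_SIZE  20	/* assumed sizeof(struct cn_msg) */

/* Payload length: total message length minus the netlink header. */
static int demo_nlmsg_len(unsigned int nlmsg_len)
{
	return (int)nlmsg_len - DEMO_NLMSG_HDRLEN;
}

/* Return 1 if the payload alone is large enough to hold a cn_msg. */
static int demo_payload_fits_cn_msg(unsigned int nlmsg_len)
{
	return demo_nlmsg_len(nlmsg_len) >= DEMO_CN_MSG_SIZE;
}
```

With these numbers a message whose total length is 20 would have passed the
old comparison even though it carries only 4 bytes of payload; the payload
check requires a total length of at least 36.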

diff --git a/drivers/connector/connector.c b/drivers/connector/connector.c
index 537c29a..980412b 100644
--- a/drivers/connector/connector.c
+++ b/drivers/connector/connector.c
@@ -177,17 +177,18 @@ static int cn_call_callback(struct sk_buff *skb)
 static void cn_rx_skb(struct sk_buff *__skb)
 {
 	struct nlmsghdr *nlh;
-	int err;
 	struct sk_buff *skb;
+	int len, err;
 
 	skb = skb_get(__skb);
 
 	if (skb->len >= NLMSG_SPACE(0)) {
 		nlh = nlmsg_hdr(skb);
+		len = nlmsg_len(nlh);
 
-		if (nlh->nlmsg_len < sizeof(struct cn_msg) ||
+		if (len < (int)sizeof(struct cn_msg) ||
 		    skb->len < nlh->nlmsg_len ||
-		    nlh->nlmsg_len > CONNECTOR_MAX_MSG_SIZE) {
+		    len > CONNECTOR_MAX_MSG_SIZE) {
 			kfree_skb(skb);
 			return;
 		}
-- 
1.7.12.2.21.g234cd45.dirty





* [ 035/143] net: dst: provide accessor function to dst->xfrm
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (34 preceding siblings ...)
  2014-05-12  0:32 ` [ 034/143] connector: use nlmsg_len() to check message length Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 036/143] sctp: Use software crc32 checksum when xfrm transform will happen Willy Tarreau
                   ` (107 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Vlad Yasevich, Neil Horman, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Vlad Yasevich <vyasevich@gmail.com>

[ Upstream commit e87b3998d795123b4139bc3f25490dd236f68212 ]

dst->xfrm is conditionally defined.  Provide an accessor function that
is always available.

Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 include/net/dst.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/net/dst.h b/include/net/dst.h
index 5a900dd..49f443b 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -286,11 +286,22 @@ static inline int __xfrm_lookup(struct net *net, struct dst_entry **dst_p,
 {
 	return 0;
 }
+static inline struct xfrm_state *dst_xfrm(const struct dst_entry *dst)
+{
+	return NULL;
+}
+
 #else
 extern int xfrm_lookup(struct net *net, struct dst_entry **dst_p,
 		       struct flowi *fl, struct sock *sk, int flags);
 extern int __xfrm_lookup(struct net *net, struct dst_entry **dst_p,
 			 struct flowi *fl, struct sock *sk, int flags);
+
+/* skb attached with this dst needs transformation if dst->xfrm is valid */
+static inline struct xfrm_state *dst_xfrm(const struct dst_entry *dst)
+{
+	return dst->xfrm;
+}
 #endif
 #endif
 
-- 
1.7.12.2.21.g234cd45.dirty





* [ 036/143] sctp: Use software crc32 checksum when xfrm transform will happen.
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (35 preceding siblings ...)
  2014-05-12  0:32 ` [ 035/143] net: dst: provide accessor function to dst->xfrm Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 037/143] sctp: Perform software checksum if packet has to be fragmented Willy Tarreau
                   ` (106 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Neil Horman, Steffen Klassert, Fan Du, Vlad Yasevich,
	David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Fan Du <fan.du@windriver.com>

[ Upstream commit 27127a82561a2a3ed955ce207048e1b066a80a2a ]

igb/ixgbe have hardware SCTP checksum support. When this feature is enabled
and IPsec is also set up to protect SCTP traffic, things go wrong because
xfrm_output sees CHECKSUM_PARTIAL and performs its own checksum operation
(summing everything up and packing the 16-bit result into the checksum
field). As a result, SCTP communication fails to establish.

Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/sctp/output.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/sctp/output.c b/net/sctp/output.c
index d494100..8d4eacf 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -506,7 +506,8 @@ int sctp_packet_transmit(struct sctp_packet *packet)
 	 * by CRC32-C as described in <draft-ietf-tsvwg-sctpcsum-02.txt>.
 	 */
 	if (!sctp_checksum_disable &&
-	    !(dst->dev->features & (NETIF_F_NO_CSUM | NETIF_F_SCTP_CSUM))) {
+	    (!(dst->dev->features & (NETIF_F_NO_CSUM | NETIF_F_SCTP_CSUM)) ||
+	     (dst_xfrm(dst) != NULL))) {
 		__u32 crc32 = sctp_start_cksum((__u8 *)sh, cksum_buf_len);
 
 		/* 3) Put the resultant value into the checksum field in the
-- 
1.7.12.2.21.g234cd45.dirty





* [ 037/143] sctp: Perform software checksum if packet has to be fragmented.
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (36 preceding siblings ...)
  2014-05-12  0:32 ` [ 036/143] sctp: Use software crc32 checksum when xfrm transform will happen Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 038/143] wanxl: fix info leak in ioctl Willy Tarreau
                   ` (105 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Fan Du, Vlad Yasevich, Neil Horman, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Vlad Yasevich <vyasevich@gmail.com>

[ Upstream commit d2dbbba77e95dff4b4f901fee236fef6d9552072 ]

IP/IPv6 fragmentation only knows how to compute TCP/UDP checksums.
This causes problems if an SCTP packet has to be fragmented and
ip_summed has been set to CHECKSUM_PARTIAL due to checksum offload support.
This condition can happen when retransmitting after MTU discovery,
or when INIT or other control chunks are larger than the MTU.
Check for the rare fragmentation condition in SCTP and use software
checksum calculation in this case.

CC: Fan Du <fan.du@windriver.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/sctp/output.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
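
The combined condition after this patch and the previous one can be read as a
small predicate. This is only a sketch: struct demo_tx_state and its fields
are hypothetical stand-ins for the values tested in sctp_packet_transmit().

```c
#include <stdbool.h>

/* Hypothetical snapshot of the inputs to the checksum decision. */
struct demo_tx_state {
	bool checksum_disabled;	/* stands in for sctp_checksum_disable */
	bool dev_offloads_csum;	/* NETIF_F_NO_CSUM or NETIF_F_SCTP_CSUM set */
	bool has_xfrm;		/* dst_xfrm(dst) != NULL */
	bool ipfragok;		/* packet->ipfragok */
};

/*
 * Compute CRC32-C in software unless checksumming is disabled, or the
 * device can do it and neither IPsec nor fragmentation is in the way.
 */
static bool demo_needs_sw_crc32c(const struct demo_tx_state *s)
{
	return !s->checksum_disabled &&
	       (!s->dev_offloads_csum || s->has_xfrm || s->ipfragok);
}
```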

diff --git a/net/sctp/output.c b/net/sctp/output.c
index 8d4eacf..54bc011 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -507,7 +507,7 @@ int sctp_packet_transmit(struct sctp_packet *packet)
 	 */
 	if (!sctp_checksum_disable &&
 	    (!(dst->dev->features & (NETIF_F_NO_CSUM | NETIF_F_SCTP_CSUM)) ||
-	     (dst_xfrm(dst) != NULL))) {
+	     (dst_xfrm(dst) != NULL) || packet->ipfragok)) {
 		__u32 crc32 = sctp_start_cksum((__u8 *)sh, cksum_buf_len);
 
 		/* 3) Put the resultant value into the checksum field in the
-- 
1.7.12.2.21.g234cd45.dirty





* [ 038/143] wanxl: fix info leak in ioctl
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (37 preceding siblings ...)
  2014-05-12  0:32 ` [ 037/143] sctp: Perform software checksum if packet has to be fragmented Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 039/143] davinci_emac.c: Fix IFF_ALLMULTI setup Willy Tarreau
                   ` (104 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Salva Peiró, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Salva Peiró <speiro@ai2.upv.es>

[ Upstream commit 2b13d06c9584b4eb773f1e80bbaedab9a1c344e1 ]

The wanxl_ioctl() code fails to initialize the two padding bytes of
struct sync_serial_settings after the ->loopback member. Add an explicit
memset(0) before filling the structure to avoid the info leak.

Signed-off-by: Salva Peiró <speiro@ai2.upv.es>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/wan/wanxl.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/wan/wanxl.c b/drivers/net/wan/wanxl.c
index daee8a0..b52b378 100644
--- a/drivers/net/wan/wanxl.c
+++ b/drivers/net/wan/wanxl.c
@@ -354,6 +354,7 @@ static int wanxl_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 			ifr->ifr_settings.size = size; /* data size wanted */
 			return -ENOBUFS;
 		}
+		memset(&line, 0, sizeof(line));
 		line.clock_type = get_status(port)->clocking;
 		line.clock_rate = 0;
 		line.loopback = 0;
-- 
1.7.12.2.21.g234cd45.dirty





* [ 039/143] davinci_emac.c: Fix IFF_ALLMULTI setup
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (38 preceding siblings ...)
  2014-05-12  0:32 ` [ 038/143] wanxl: fix info leak in ioctl Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 040/143] resubmit bridge: fix message_age_timer calculation Willy Tarreau
                   ` (103 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Mariusz Ceier, Mugunthan V N, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Mariusz Ceier <mceier+kernel@gmail.com>

[ Upstream commit d69e0f7ea95fef8059251325a79c004bac01f018 ]

When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't,
emac_dev_mcast_set should only enable RX of multicasts and reset
MACHASH registers.

It does this, but afterwards it either sets up multicast MACs
filtering or disables RX of multicasts and resets MACHASH registers
again, rendering IFF_ALLMULTI flag useless.

This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and
disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set.

Tested with kernel 2.6.37.

Signed-off-by: Mariusz Ceier <mceier+kernel@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/davinci_emac.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/davinci_emac.c b/drivers/net/davinci_emac.c
index e347831..eafd1e4 100644
--- a/drivers/net/davinci_emac.c
+++ b/drivers/net/davinci_emac.c
@@ -960,7 +960,7 @@ static void emac_dev_mcast_set(struct net_device *ndev)
 			mbp_enable = (mbp_enable | EMAC_MBP_RXMCAST);
 			emac_add_mcast(priv, EMAC_ALL_MULTI_SET, NULL);
 		}
-		if (ndev->mc_count > 0) {
+		else if (ndev->mc_count > 0) {
 			struct dev_mc_list *mc_ptr;
 			mbp_enable = (mbp_enable | EMAC_MBP_RXMCAST);
 			emac_add_mcast(priv, EMAC_ALL_MULTI_CLR, NULL);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 040/143] resubmit bridge: fix message_age_timer calculation
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (39 preceding siblings ...)
  2014-05-12  0:32 ` [ 039/143] davinci_emac.c: Fix IFF_ALLMULTI setup Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 041/143] ipv6 mcast: use in6_dev_put in timer handlers instead of __in6_dev_put Willy Tarreau
                   ` (102 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Chris Healy, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Chris Healy <cphealy@gmail.com>

[ Upstream commit 9a0620133ccce9dd35c00a96405c8d80938c2cc0 ]

This changes the message_age_timer calculation to use the BPDU's max age as
opposed to the local bridge's max age.  This is in accordance with section
8.6.2.3.2 Step 2 of the 802.1D-1998 specification.

With the current implementation, when running with very large bridge
diameters, convergence will not always occur even if a root bridge is
configured to have a longer max age.

Tested successfully on bridge diameters of ~200.

Signed-off-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/bridge/br_stp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
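
A worked example of the arithmetic, as a sketch only. The helper name and the
numbers are hypothetical; signed values are used here purely to make an
already-expired result visible.

```c
/* Remaining lifetime of received configuration information, in ticks. */
static long demo_message_age_remaining(long max_age, long message_age)
{
	return max_age - message_age;
}
```

Suppose a BPDU arrives with message_age 25 from a root advertising max_age 40,
while the local bridge is configured with max_age 20. Using the BPDU's value
leaves 15 ticks before the information ages out; using the local value gives
-5, so the information would be treated as stale on arrival.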

diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
index c7d6bfc..a67e6ce 100644
--- a/net/bridge/br_stp.c
+++ b/net/bridge/br_stp.c
@@ -192,7 +192,7 @@ static inline void br_record_config_information(struct net_bridge_port *p,
 	p->designated_age = jiffies + bpdu->message_age;
 
 	mod_timer(&p->message_age_timer, jiffies
-		  + (p->br->max_age - bpdu->message_age));
+		  + (bpdu->max_age - bpdu->message_age));
 }
 
 /* called under bridge lock */
-- 
1.7.12.2.21.g234cd45.dirty





* [ 041/143] ipv6 mcast: use in6_dev_put in timer handlers instead of
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (40 preceding siblings ...)
  2014-05-12  0:32 ` [ 040/143] resubmit bridge: fix message_age_timer calculation Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 042/143] ipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put Willy Tarreau
                   ` (101 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Salam Noureddine, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Salam Noureddine <noureddine@aristanetworks.com>

[ Upstream commit 9260d3e1013701aa814d10c8fc6a9f92bd17d643 ]

It is possible for the timer handlers to run after the call to
ipv6_mc_down so use in6_dev_put instead of __in6_dev_put in the
handler function in order to do proper cleanup when the refcnt
reaches 0. Otherwise, the refcnt can reach zero without the
inet6_dev being destroyed and we end up leaking a reference to
the net_device and see messages like the following,

unregister_netdevice: waiting for eth0 to become free. Usage count = 1

Tested on linux-3.4.43.
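
The difference between the two put variants can be sketched in user space
as follows. This is a toy model with hypothetical names (toy_dev and
friends), not the kernel implementation: one variant only decrements the
counter, the other also runs the cleanup when the counter reaches zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a refcounted device object. */
struct toy_dev {
	int refcnt;
	bool destroyed;
};

/* Analogue of the __put variant: drops a reference, never cleans up. */
static void toy_dev_put_nocleanup(struct toy_dev *d)
{
	d->refcnt--;
}

/* Analogue of the put variant: drops a reference, destroys on zero. */
static void toy_dev_put(struct toy_dev *d)
{
	assert(d->refcnt > 0);
	if (--d->refcnt == 0)
		d->destroyed = true;
}

/* Drop the last reference and report whether cleanup ran. */
static bool last_put_destroys(bool use_cleanup_variant)
{
	struct toy_dev d = { .refcnt = 1, .destroyed = false };

	if (use_cleanup_variant)
		toy_dev_put(&d);
	else
		toy_dev_put_nocleanup(&d);
	return d.destroyed;
}
```

If the timer handler happens to hold the last reference, only the second
variant releases the object, which is the situation described above.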

Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv6/mcast.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index f9fcf69..99ae9e3 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -2208,7 +2208,7 @@ static void mld_gq_timer_expire(unsigned long data)
 
 	idev->mc_gq_running = 0;
 	mld_send_report(idev, NULL);
-	__in6_dev_put(idev);
+	in6_dev_put(idev);
 }
 
 static void mld_ifc_timer_expire(unsigned long data)
@@ -2221,7 +2221,7 @@ static void mld_ifc_timer_expire(unsigned long data)
 		if (idev->mc_ifc_count)
 			mld_ifc_start_timer(idev, idev->mc_maxdelay);
 	}
-	__in6_dev_put(idev);
+	in6_dev_put(idev);
 }
 
 static void mld_ifc_event(struct inet6_dev *idev)
-- 
1.7.12.2.21.g234cd45.dirty





* [ 042/143] ipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (41 preceding siblings ...)
  2014-05-12  0:32 ` [ 041/143] ipv6 mcast: use in6_dev_put in timer handlers instead of Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 043/143] dm9601: fix IFF_ALLMULTI handling Willy Tarreau
                   ` (100 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Salam Noureddine, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Salam Noureddine <noureddine@aristanetworks.com>

[ Upstream commit e2401654dd0f5f3fb7a8d80dad9554d73d7ca394 ]

It is possible for the timer handlers to run after the call to
ip_mc_down so use in_dev_put instead of __in_dev_put in the handler
function in order to do proper cleanup when the refcnt reaches 0.
Otherwise, the refcnt can reach zero without the in_device being
destroyed and we end up leaking a reference to the net_device and
see messages like the following,

unregister_netdevice: waiting for eth0 to become free. Usage count = 1

Tested on linux-3.4.43.

Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv4/igmp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 169da93..c07be7c 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -697,7 +697,7 @@ static void igmp_gq_timer_expire(unsigned long data)
 
 	in_dev->mr_gq_running = 0;
 	igmpv3_send_report(in_dev, NULL);
-	__in_dev_put(in_dev);
+	in_dev_put(in_dev);
 }
 
 static void igmp_ifc_timer_expire(unsigned long data)
@@ -709,7 +709,7 @@ static void igmp_ifc_timer_expire(unsigned long data)
 		in_dev->mr_ifc_count--;
 		igmp_ifc_start_timer(in_dev, IGMP_Unsolicited_Report_Interval);
 	}
-	__in_dev_put(in_dev);
+	in_dev_put(in_dev);
 }
 
 static void igmp_ifc_event(struct in_device *in_dev)
-- 
1.7.12.2.21.g234cd45.dirty





* [ 043/143] dm9601: fix IFF_ALLMULTI handling
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (42 preceding siblings ...)
  2014-05-12  0:32 ` [ 042/143] ipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 044/143] bonding: Fix broken promiscuity reference counting issue Willy Tarreau
                   ` (99 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Peter Korsgaard, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Korsgaard <peter@korsgaard.com>

[ Upstream commit bf0ea6380724beb64f27a722dfc4b0edabff816e ]

Pass-all-multicast is controlled by bit 3 in RX control, not bit 2
(pass undersized frames).
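
For reference, the mapping from bit number to mask value can be checked
with a small sketch. The helper name is hypothetical; only the bit
numbers and their meaning come from the description above.

```c
#include <assert.h>

/* Hypothetical helper: mask for bit n of an 8-bit register. */
static unsigned int reg8_bit(unsigned int n)
{
	assert(n < 8);
	return 1u << n;
}
```

Bit 2 therefore corresponds to 0x04 and bit 3 to 0x08, which is the
value the driver has to OR into rx_ctl for pass-all-multicast.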

Reported-by: Joseph Chang <joseph_chang@davicom.com.tw>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/usb/dm9601.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/usb/dm9601.c b/drivers/net/usb/dm9601.c
index 9a6eede..498681a 100644
--- a/drivers/net/usb/dm9601.c
+++ b/drivers/net/usb/dm9601.c
@@ -382,7 +382,7 @@ static void dm9601_set_multicast(struct net_device *net)
 	if (net->flags & IFF_PROMISC) {
 		rx_ctl |= 0x02;
 	} else if (net->flags & IFF_ALLMULTI || net->mc_count > DM_MAX_MCAST) {
-		rx_ctl |= 0x04;
+		rx_ctl |= 0x08;
 	} else if (net->mc_count) {
 		struct dev_mc_list *mc_list = net->mc_list;
 		int i;
-- 
1.7.12.2.21.g234cd45.dirty





* [ 044/143] bonding: Fix broken promiscuity reference counting issue
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (43 preceding siblings ...)
  2014-05-12  0:32 ` [ 043/143] dm9601: fix IFF_ALLMULTI handling Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 045/143] ll_temac: Reset dma descriptors indexes on ndo_open Willy Tarreau
                   ` (98 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Jay Vosburgh, Andy Gospodarek, Mark Wu, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Neil Horman <nhorman@tuxdriver.com>

[ Upstream commit 5a0068deb611109c5ba77358be533f763f395ee4 ]

Recently grabbed this report:
https://bugzilla.redhat.com/show_bug.cgi?id=1005567

Of an issue in which the bonding driver, with an attached vlan encountered the
following errors when bond0 was taken down and back up:

dummy1: promiscuity touches roof, set promiscuity failed. promiscuity feature of
device might be broken.

The error occurs because, during __bond_release_one, if we release our last
slave, we take on a random mac address and issue a NETDEV_CHANGEADDR
notification.  With an attached vlan, the vlan may see that the vlan and bond
mac address were in sync, but no longer are.  This triggers a call to dev_uc_add
and dev_set_rx_mode, which enables IFF_PROMISC on the bond device.  Then, when
we complete __bond_release_one, we use the current state of the bond flags to
determine if we should decrement the promiscuity of the releasing slave.  But
since the bond changed promiscuity state during the release operation, we
incorrectly decrement the slave promisc count when it wasn't in promiscuous mode
to begin with, causing the above error

Fix is pretty simple, just cache the bonding flags at the start of the function
and use those when determining the need to set promiscuity.

This is also needed for the ALLMULTI flag
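
The pattern can be sketched in user space like this. All names are
hypothetical and the "notification" is reduced to a function that sets a
flag; the point is only that a decision taken after the side effect must
use the value sampled before it.

```c
#include <assert.h>

#define TOY_PROMISC 0x1u

struct toy_master { unsigned int flags; };
struct toy_slave  { int promiscuity; };

/* Stand-in for the address-change notification turning promisc on. */
static void toy_notify_side_effect(struct toy_master *m)
{
	m->flags |= TOY_PROMISC;
}

/* Release a slave; optionally decide based on the cached flags. */
static void toy_release(struct toy_master *m, struct toy_slave *s,
			int use_snapshot)
{
	unsigned int old_flags = m->flags;	/* sampled before the change */
	unsigned int decide;

	toy_notify_side_effect(m);
	decide = use_snapshot ? old_flags : m->flags;
	if (decide & TOY_PROMISC)
		s->promiscuity--;
}

/* Master and slave both start non-promiscuous; return the slave count. */
static int toy_promiscuity_after_release(int use_snapshot)
{
	struct toy_master m = { .flags = 0 };
	struct toy_slave s = { .promiscuity = 0 };

	toy_release(&m, &s, use_snapshot);
	return s.promiscuity;
}
```

Without the snapshot the slave count drops below zero, which is the toy
equivalent of the unbalanced decrement described above.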

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Mark Wu <wudxw@linux.vnet.ibm.com>
CC: "David S. Miller" <davem@davemloft.net>
Reported-by: Mark Wu <wudxw@linux.vnet.ibm.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/bonding/bond_main.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 6ffbfb7..4f52101 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1794,6 +1794,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
 	struct bonding *bond = netdev_priv(bond_dev);
 	struct slave *slave, *oldcurrent;
 	struct sockaddr addr;
+	int old_flags = bond_dev->flags;
 
 	/* slave is not a slave or master is not master of this slave */
 	if (!(slave_dev->flags & IFF_SLAVE) ||
@@ -1929,12 +1930,18 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
 	 * already taken care of above when we detached the slave
 	 */
 	if (!USES_PRIMARY(bond->params.mode)) {
-		/* unset promiscuity level from slave */
-		if (bond_dev->flags & IFF_PROMISC)
+		/* unset promiscuity level from slave
+		 * NOTE: The NETDEV_CHANGEADDR call above may change the value
+		 * of the IFF_PROMISC flag in the bond_dev, but we need the
+		 * value of that flag before that change, as that was the value
+		 * when this slave was attached, so we cache at the start of the
+		 * function and use it here. Same goes for ALLMULTI below
+		 */
+		if (old_flags & IFF_PROMISC)
 			dev_set_promiscuity(slave_dev, -1);
 
 		/* unset allmulti level from slave */
-		if (bond_dev->flags & IFF_ALLMULTI)
+		if (old_flags & IFF_ALLMULTI)
 			dev_set_allmulti(slave_dev, -1);
 
 		/* flush master's mc_list from slave */
-- 
1.7.12.2.21.g234cd45.dirty





* [ 045/143] ll_temac: Reset dma descriptors indexes on ndo_open
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (44 preceding siblings ...)
  2014-05-12  0:32 ` [ 044/143] bonding: Fix broken promiscuity reference counting issue Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 046/143] tcp: fix tcp_md5_hash_skb_data() Willy Tarreau
                   ` (97 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Ricardo Ribalda Delgado, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Ricardo Ribalda <ricardo.ribalda@gmail.com>

[ Upstream commit 7167cf0e8cd10287b7912b9ffcccd9616f382922 ]

The dma descriptors indexes are only initialized on the probe function.

If a packet is on the buffer when temac_stop is called, the dma
descriptors indexes can be left on an incorrect state where no other
packet can be sent.

So an interface could be left in an unusable state after ifdown/ifup.

This patch makes sure that the descriptors indexes are in a proper
state when the device is opened.

Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/ll_temac_main.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ll_temac_main.c b/drivers/net/ll_temac_main.c
index f2a197f..d2516dd 100644
--- a/drivers/net/ll_temac_main.c
+++ b/drivers/net/ll_temac_main.c
@@ -190,6 +190,12 @@ static int temac_dma_bd_init(struct net_device *ndev)
 		       lp->rx_bd_p + (sizeof(*lp->rx_bd_v) * (RX_BD_NUM - 1)));
 	temac_dma_out32(lp, TX_CURDESC_PTR, lp->tx_bd_p);
 
+	/* Init descriptor indexes */
+	lp->tx_bd_ci = 0;
+	lp->tx_bd_next = 0;
+	lp->tx_bd_tail = 0;
+	lp->rx_bd_ci = 0;
+
 	return 0;
 }
 
-- 
1.7.12.2.21.g234cd45.dirty





* [ 046/143] tcp: fix tcp_md5_hash_skb_data()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (45 preceding siblings ...)
  2014-05-12  0:32 ` [ 045/143] ll_temac: Reset dma descriptors indexes on ndo_open Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 047/143] ipv6: fix possible crashes in ip6_cork_release() Willy Tarreau
                   ` (96 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Eric Dumazet, Bernhard Beck, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 54d27fcb338bd9c42d1dfc5a39e18f6f9d373c2e ]

TCP md5 communications fail [1] for some devices, because sg/crypto code
assumes page offsets are below PAGE_SIZE.

This was discovered using mlx4 driver [2], but I suspect loopback
might trigger the same bug now that we use order-3 pages in tcp_sendmsg().

[1] Failure is giving following messages.

huh, entered softirq 3 NET_RX ffffffff806ad230 preempt_count 00000100,
exited with 00000101?

[2] mlx4 driver uses order-2 pages to allocate RX frags
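
The normalisation applied by the fix can be sketched as plain arithmetic.
This is an illustration only: the names are hypothetical, pages are
represented by an index instead of a struct page pointer, and a 4 KiB
page size is assumed.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12			/* assumption: 4 KiB pages */
#define TOY_PAGE_SIZE  (1u << TOY_PAGE_SHIFT)

struct toy_loc {
	unsigned int page_index;	/* which page of the compound page */
	unsigned int offset;		/* offset inside that page */
};

/* Turn (first page, arbitrary offset) into (page, offset < PAGE_SIZE). */
static struct toy_loc toy_normalize(unsigned int first_page,
				    unsigned int offset)
{
	struct toy_loc loc;

	loc.page_index = first_page + (offset >> TOY_PAGE_SHIFT);
	loc.offset = offset & (TOY_PAGE_SIZE - 1);
	assert(loc.offset < TOY_PAGE_SIZE);
	return loc;
}
```

For example, an offset of 9000 bytes into a fragment starting at page 10
lands on page 12 at offset 808, which is within the range the sg/crypto
code expects.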

Reported-by: Matt Schnall <mischnal@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Bernhard Beck <bbeck@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv4/tcp.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 6232462..fc18410 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2826,7 +2826,11 @@ int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,
 
 	for (i = 0; i < shi->nr_frags; ++i) {
 		const struct skb_frag_struct *f = &shi->frags[i];
-		sg_set_page(&sg, f->page, f->size, f->page_offset);
+		unsigned int offset = f->page_offset;
+		struct page *page = f->page + (offset >> PAGE_SHIFT);
+
+		sg_set_page(&sg, page, f->size,
+			    offset_in_page(offset));
 		if (crypto_hash_update(desc, &sg, f->size))
 			return 1;
 	}
-- 
1.7.12.2.21.g234cd45.dirty





* [ 047/143] ipv6: fix possible crashes in ip6_cork_release()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (46 preceding siblings ...)
  2014-05-12  0:32 ` [ 046/143] tcp: fix tcp_md5_hash_skb_data() Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 048/143] ip_tunnel: fix kernel panic with icmp_dest_unreach Willy Tarreau
                   ` (95 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Eric Dumazet, David S. Miller, Herbert Xu, Hideaki YOSHIFUJI,
	Neal Cardwell, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 284041ef21fdf2e0d216ab6b787bc9072b4eb58a ]

commit 0178b695fd6b4 ("ipv6: Copy cork options in ip6_append_data")
added some code duplication and bad error recovery, leading to potential
crash in ip6_cork_release() as kfree() could be called with garbage.

Use kzalloc() to make sure this won't happen.
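
A user-space analogue of the idea, with calloc()/free() standing in for
kzalloc()/kfree(). The structure and function names are hypothetical.
The sketch relies on zero-filled memory reading back as NULL pointers,
which holds on the platforms Linux runs on.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy options block holding separately allocated members. */
struct toy_opts {
	void *a;
	void *b;
};

/* Zeroing allocation: every member starts out as NULL. */
static struct toy_opts *toy_opts_alloc(void)
{
	return calloc(1, sizeof(struct toy_opts));
}

/* Release path frees every member unconditionally. */
static void toy_opts_release(struct toy_opts *o)
{
	if (!o)
		return;
	free(o->a);	/* free(NULL) is a no-op */
	free(o->b);
	free(o);
}

/* Set up only one member, then release; returns 1 if b was still NULL. */
static int toy_partial_setup_is_safe(void)
{
	struct toy_opts *o = toy_opts_alloc();
	int ok;

	if (!o)
		return 0;
	o->a = malloc(8);
	ok = (o->b == NULL);
	toy_opts_release(o);
	return ok;
}
```

With a non-zeroing allocation, the member that was never set up would
hold an indeterminate value and the release path would free garbage.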

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv6/ip6_output.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index bba91a1..bb63ffc 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1174,7 +1174,7 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 			if (WARN_ON(np->cork.opt))
 				return -EINVAL;
 
-			np->cork.opt = kmalloc(opt->tot_len, sk->sk_allocation);
+			np->cork.opt = kzalloc(opt->tot_len, sk->sk_allocation);
 			if (unlikely(np->cork.opt == NULL))
 				return -ENOBUFS;
 
-- 
1.7.12.2.21.g234cd45.dirty





* [ 048/143] ip_tunnel: fix kernel panic with icmp_dest_unreach
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (47 preceding siblings ...)
  2014-05-12  0:32 ` [ 047/143] ipv6: fix possible crashes in ip6_cork_release() Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 049/143] net: sctp: fix NULL pointer dereference in socket destruction Willy Tarreau
                   ` (94 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Eric Dumazet, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit a622260254ee481747cceaaa8609985b29a31565 ]

Daniel Petre reported crashes in icmp_dst_unreach() with following call
graph:

Daniel found a similar problem mentioned in
 http://lkml.indiana.edu/hypermail/linux/kernel/1007.0/00961.html

And indeed this is the root cause : skb->cb[] contains data fooling IP
stack.

We must clear IPCB in ip_tunnel_xmit() sooner in case dst_link_failure()
is called. Or else skb->cb[] might contain garbage from GSO segmentation
layer.

A similar fix was tested on linux-3.9, but gre code was refactored in
linux-3.10. I'll send patches for stable kernels as well.

Many thanks to Daniel for providing reports, patches and testing !

Reported-by: Daniel Petre <daniel.petre@rcs-rds.ro>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv4/ipip.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
index 860b5c5..49aa1ad 100644
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -408,6 +408,7 @@ static netdev_tx_t ipip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (tos&1)
 		tos = old_iph->tos;
 
+	memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
 	if (!dst) {
 		/* NBMA tunnel */
 		if ((rt = skb_rtable(skb)) == NULL) {
@@ -494,7 +495,6 @@ static netdev_tx_t ipip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
 	skb->transport_header = skb->network_header;
 	skb_push(skb, sizeof(struct iphdr));
 	skb_reset_network_header(skb);
-	memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
 	IPCB(skb)->flags &= ~(IPSKB_XFRM_TUNNEL_SIZE | IPSKB_XFRM_TRANSFORMED |
 			      IPSKB_REROUTED);
 	skb_dst_drop(skb);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 049/143] net: sctp: fix NULL pointer dereference in socket destruction
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (48 preceding siblings ...)
  2014-05-12  0:32 ` [ 048/143] ip_tunnel: fix kernel panic with icmp_dest_unreach Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 050/143] packet: packet_getname_spkt: make sure string is always 0-terminated Willy Tarreau
                   ` (93 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Daniel Borkmann, Neil Horman, Vlad Yasevich, David S. Miller,
	Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dborkman@redhat.com>

[ Upstream commit 1abd165ed757db1afdefaac0a4bc8a70f97d258c ]

While stress testing sctp sockets, I hit the following panic:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
PGD 7cead067 PUD 7ce76067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: sctp(F) libcrc32c(F) [...]
CPU: 7 PID: 2950 Comm: acc Tainted: GF            3.10.0-rc2+ #1
Hardware name: Dell Inc. PowerEdge T410/0H19HD, BIOS 1.6.3 02/01/2011
task: ffff88007ce0e0c0 ti: ffff88007b568000 task.ti: ffff88007b568000
RIP: 0010:[<ffffffffa0490c4e>]  [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
RSP: 0018:ffff88007b569e08  EFLAGS: 00010292
RAX: 0000000000000000 RBX: ffff88007db78a00 RCX: dead000000200200
RDX: ffffffffa049fdb0 RSI: ffff8800379baf38 RDI: 0000000000000000
RBP: ffff88007b569e18 R08: ffff88007c230da0 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff880077990d00 R14: 0000000000000084 R15: ffff88007db78a00
FS:  00007fc18ab61700(0000) GS:ffff88007fc60000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 000000007cf9d000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff88007b569e38 ffff88007db78a00 ffff88007b569e38 ffffffffa049fded
 ffffffff81abf0c0 ffff88007db78a00 ffff88007b569e58 ffffffff8145b60e
 0000000000000000 0000000000000000 ffff88007b569eb8 ffffffff814df36e
Call Trace:
 [<ffffffffa049fded>] sctp_destroy_sock+0x3d/0x80 [sctp]
 [<ffffffff8145b60e>] sk_common_release+0x1e/0xf0
 [<ffffffff814df36e>] inet_create+0x2ae/0x350
 [<ffffffff81455a6f>] __sock_create+0x11f/0x240
 [<ffffffff81455bf0>] sock_create+0x30/0x40
 [<ffffffff8145696c>] SyS_socket+0x4c/0xc0
 [<ffffffff815403be>] ? do_page_fault+0xe/0x10
 [<ffffffff8153cb32>] ? page_fault+0x22/0x30
 [<ffffffff81544e02>] system_call_fastpath+0x16/0x1b
Code: 0c c9 c3 66 2e 0f 1f 84 00 00 00 00 00 e8 fb fe ff ff c9 c3 66 0f
      1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 <48>
      8b 47 20 48 89 fb c6 47 1c 01 c6 40 12 07 e8 9e 68 01 00 48
RIP  [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
 RSP <ffff88007b569e08>
CR2: 0000000000000020
---[ end trace e0d71ec1108c1dd9 ]---

I did not hit this with the lksctp-tools functional tests, but with a
small, multi-threaded test program, that heavily allocates, binds,
listens and waits in accept on sctp sockets, and then randomly kills
some of them (no need for an actual client in this case to hit this).
Then, again, allocating, binding, etc, and then killing child processes.

This panic then only occurs when ``echo 1 > /proc/sys/net/sctp/auth_enable''
is set. The cause for that is actually very simple: in sctp_endpoint_init()
we enter the path of sctp_auth_init_hmacs(). There, we try to allocate
our crypto transforms through crypto_alloc_hash(). In our scenario,
it then can happen that crypto_alloc_hash() fails with -EINTR from
crypto_larval_wait(), thus we bail out and release the socket via
sk_common_release(), sctp_destroy_sock() and hit the NULL pointer
dereference as soon as we try to access members in the endpoint during
sctp_endpoint_free(), since endpoint at that time is still NULL. Now,
if we have that case, we do not need to do any cleanup work and just
leave the destruction handler.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/sctp/socket.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 26ffae2..44d8eab 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -3743,6 +3743,12 @@ SCTP_STATIC void sctp_destroy_sock(struct sock *sk)
 
 	/* Release our hold on the endpoint. */
 	ep = sctp_sk(sk)->ep;
+	/* This could happen during socket init, thus we bail out
+	 * early, since the rest of the below is not setup either.
+	 */
+	if (ep == NULL)
+		return;
+
 	sctp_endpoint_free(ep);
 	percpu_counter_dec(&sctp_sockets_allocated);
 	local_bh_disable();
-- 
1.7.12.2.21.g234cd45.dirty





* [ 050/143] packet: packet_getname_spkt: make sure string is always 0-terminated
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (49 preceding siblings ...)
  2014-05-12  0:32 ` [ 049/143] net: sctp: fix NULL pointer dereference in socket destruction Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 051/143] neighbour: fix a race in neigh_destroy() Willy Tarreau
                   ` (92 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Daniel Borkmann, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dborkman@redhat.com>

[ Upstream commit 2dc85bf323515e59e15dfa858d1472bb25cad0fe ]

uaddr->sa_data is exactly of size 14, which is hard-coded here and
passed as a size argument to strncpy(). A device name can be of size
IFNAMSIZ (== 16), meaning we might leave the destination string
unterminated. Thus, use strlcpy() and also sizeof() while we're
at it. We need to memset the data area beforehand, since strlcpy
does not pad the remaining buffer with zeroes for user space, so
that we do not possibly leak anything.
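
The memset-then-strlcpy sequence can be tried out in user space with the
sketch below. toy_strlcpy() is a local stand-in written for this example
(strlcpy is not part of ISO C), and toy_sockaddr only mimics the 14-byte
sa_data layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_sockaddr {
	unsigned short sa_family;
	char sa_data[14];
};

/* Local stand-in for strlcpy(): always terminates, returns strlen(src). */
static size_t toy_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	assert(dst != NULL);
	if (size) {
		size_t n = len >= size ? size - 1 : len;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

/* Zero the whole area first, then copy a terminated (maybe cut) name. */
static void toy_fill_name(struct toy_sockaddr *sa, const char *name)
{
	memset(sa->sa_data, 0, sizeof(sa->sa_data));
	toy_strlcpy(sa->sa_data, name, sizeof(sa->sa_data));
}

/* Length of the stored name after filling. */
static size_t toy_stored_len(const char *name)
{
	struct toy_sockaddr sa;

	toy_fill_name(&sa, name);
	return strlen(sa.sa_data);
}

/* 1 if every byte after the stored name is zero. */
static int toy_tail_is_zeroed(const char *name)
{
	struct toy_sockaddr sa;
	size_t i;

	memset(&sa, 0xff, sizeof(sa));	/* simulate stale contents */
	toy_fill_name(&sa, name);
	for (i = strlen(sa.sa_data); i < sizeof(sa.sa_data); i++)
		if (sa.sa_data[i] != '\0')
			return 0;
	return 1;
}
```

A 15-character name is cut to 13 characters plus the terminator, and a
short name such as "eth0" leaves no stale bytes behind it.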

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/packet/af_packet.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 728c080..f084e01 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1525,12 +1525,12 @@ static int packet_getname_spkt(struct socket *sock, struct sockaddr *uaddr,
 		return -EOPNOTSUPP;
 
 	uaddr->sa_family = AF_PACKET;
+	memset(uaddr->sa_data, 0, sizeof(uaddr->sa_data));
 	dev = dev_get_by_index(sock_net(sk), pkt_sk(sk)->ifindex);
 	if (dev) {
-		strncpy(uaddr->sa_data, dev->name, 14);
+		strlcpy(uaddr->sa_data, dev->name, sizeof(uaddr->sa_data));
 		dev_put(dev);
-	} else
-		memset(uaddr->sa_data, 0, 14);
+	}
 	*uaddr_len = sizeof(*uaddr);
 
 	return 0;
-- 
1.7.12.2.21.g234cd45.dirty





* [ 051/143] neighbour: fix a race in neigh_destroy()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (50 preceding siblings ...)
  2014-05-12  0:32 ` [ 050/143] packet: packet_getname_spkt: make sure string is always 0-terminated Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 052/143] net: Swap ver and type in pppoe_hdr Willy Tarreau
                   ` (91 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Eric Dumazet, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <eric.dumazet@gmail.com>

[ Upstream commit c9ab4d85de222f3390c67aedc9c18a50e767531e ]

There is a race in neighbour code, because neigh_destroy() uses
skb_queue_purge(&neigh->arp_queue) without holding neighbour lock,
while other parts of the code assume neighbour rwlock is what
protects arp_queue

Convert all skb_queue_purge() calls to the __skb_queue_purge() variant

Use __skb_queue_head_init() instead of skb_queue_head_init()
to make clear we do not use arp_queue.lock

And hold neigh->lock in neigh_destroy() to close the race.

Reported-by: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/core/neighbour.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index e696250..fc9feaa 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -222,7 +222,7 @@ static void neigh_flush_dev(struct neigh_table *tbl, struct net_device *dev)
 				   we must kill timers etc. and move
 				   it to safe state.
 				 */
-				skb_queue_purge(&n->arp_queue);
+				__skb_queue_purge(&n->arp_queue);
 				n->output = neigh_blackhole;
 				if (n->nud_state & NUD_VALID)
 					n->nud_state = NUD_NOARP;
@@ -276,7 +276,7 @@ static struct neighbour *neigh_alloc(struct neigh_table *tbl)
 	if (!n)
 		goto out_entries;
 
-	skb_queue_head_init(&n->arp_queue);
+	__skb_queue_head_init(&n->arp_queue);
 	rwlock_init(&n->lock);
 	n->updated	  = n->used = now;
 	n->nud_state	  = NUD_NONE;
@@ -646,7 +646,9 @@ void neigh_destroy(struct neighbour *neigh)
 			kfree(hh);
 	}
 
-	skb_queue_purge(&neigh->arp_queue);
+	write_lock_bh(&neigh->lock);
+	__skb_queue_purge(&neigh->arp_queue);
+	write_unlock_bh(&neigh->lock);
 
 	dev_put(neigh->dev);
 	neigh_parms_put(neigh->parms);
@@ -789,7 +791,7 @@ static void neigh_invalidate(struct neighbour *neigh)
 		neigh->ops->error_report(neigh, skb);
 		write_lock(&neigh->lock);
 	}
-	skb_queue_purge(&neigh->arp_queue);
+	__skb_queue_purge(&neigh->arp_queue);
 }
 
 /* Called when a timer expires for a neighbour entry. */
@@ -1105,7 +1107,7 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
 			n1->output(skb);
 			write_lock_bh(&neigh->lock);
 		}
-		skb_queue_purge(&neigh->arp_queue);
+		__skb_queue_purge(&neigh->arp_queue);
 	}
 out:
 	if (update_isrouter) {
-- 
1.7.12.2.21.g234cd45.dirty





* [ 052/143] net: Swap ver and type in pppoe_hdr
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (51 preceding siblings ...)
  2014-05-12  0:32 ` [ 051/143] neighbour: fix a race in neigh_destroy() Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 053/143] sunvnet: vnet_port_remove must call unregister_netdev Willy Tarreau
                   ` (90 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Changli Gao <xiaosuo@gmail.com>

[ Upstream commit b1a5a34bd0b8767ea689e68f8ea513e9710b671e ]

Ver and type in pppoe_hdr should be swapped as defined by RFC2516
section-4.
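
For illustration, the wire layout of the first header octet can be
written with explicit shifts instead of bitfields. The helper names are
hypothetical; the layout (VER in the upper four bits, TYPE in the lower
four) follows the RFC 2516 header diagram.

```c
#include <assert.h>

/* Build the first PPPoE octet: VER in bits 7..4, TYPE in bits 3..0. */
static unsigned char pppoe_first_octet(unsigned int ver, unsigned int type)
{
	assert(ver < 16 && type < 16);
	return (unsigned char)((ver << 4) | type);
}

/* Extract VER (upper nibble). */
static unsigned int pppoe_ver(unsigned char octet)
{
	return octet >> 4;
}

/* Extract TYPE (lower nibble). */
static unsigned int pppoe_type(unsigned char octet)
{
	return octet & 0x0f;
}
```

With little-endian bitfields the first declared member usually occupies
the low-order bits, which is why type has to be declared before ver in
that branch of the structure.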

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 include/linux/if_pppox.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/if_pppox.h b/include/linux/if_pppox.h
index 90b5fae..1750054 100644
--- a/include/linux/if_pppox.h
+++ b/include/linux/if_pppox.h
@@ -108,11 +108,11 @@ struct pppoe_tag {
 
 struct pppoe_hdr {
 #if defined(__LITTLE_ENDIAN_BITFIELD)
-	__u8 ver : 4;
 	__u8 type : 4;
+	__u8 ver : 4;
 #elif defined(__BIG_ENDIAN_BITFIELD)
-	__u8 type : 4;
 	__u8 ver : 4;
+	__u8 type : 4;
 #else
 #error	"Please fix <asm/byteorder.h>"
 #endif
-- 
1.7.12.2.21.g234cd45.dirty





* [ 053/143] sunvnet: vnet_port_remove must call unregister_netdev
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (52 preceding siblings ...)
  2014-05-12  0:32 ` [ 052/143] net: Swap ver and type in pppoe_hdr Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 054/143] ifb: fix rcu_sched self-detected stalls Willy Tarreau
                   ` (89 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Dave Kleikamp, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Dave Kleikamp <dave.kleikamp@oracle.com>

[ Upstream commit aabb9875d02559ab9b928cd6f259a5cc4c21a589 ]

The missing call to unregister_netdev() leaves the interface active
after the driver is unloaded by rmmod.

Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/sunvnet.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/sunvnet.c b/drivers/net/sunvnet.c
index bc74db0..b6d0348 100644
--- a/drivers/net/sunvnet.c
+++ b/drivers/net/sunvnet.c
@@ -1260,6 +1260,8 @@ static int vnet_port_remove(struct vio_dev *vdev)
 		dev_set_drvdata(&vdev->dev, NULL);
 
 		kfree(port);
+
+		unregister_netdev(vp->dev);
 	}
 	return 0;
 }
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 054/143] ifb: fix rcu_sched self-detected stalls
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (53 preceding siblings ...)
  2014-05-12  0:32 ` [ 053/143] sunvnet: vnet_port_remove must call unregister_netdev Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 055/143] dummy: fix oops when loading the dummy failed Willy Tarreau
                   ` (88 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Ding Tianhong, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: dingtianhong <dingtianhong@huawei.com>

[ Upstream commit 440d57bc5ff55ec1efb3efc9cbe9420b4bbdfefa ]

Commit 16b0dc29c1af9df341428f4c49ada4f626258082
(dummy: fix rcu_sched self-detected stalls)
from Eric Dumazet fixed the problem in dummy, but ifb has the same
problem.

Trying to "modprobe ifb numifbs=30000" triggers :

INFO: rcu_sched self-detected stall on CPU

After this splat, RTNL is locked and reboot is needed.

We must call cond_resched() to avoid this, even holding RTNL.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[wt: 2.6.32: cond_resched() needs linux/sched.h]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/ifb.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index 030913f..cfd5510 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -33,6 +33,7 @@
 #include <linux/etherdevice.h>
 #include <linux/init.h>
 #include <linux/moduleparam.h>
+#include <linux/sched.h>
 #include <net/pkt_sched.h>
 #include <net/net_namespace.h>
 
@@ -269,8 +270,10 @@ static int __init ifb_init_module(void)
 	rtnl_lock();
 	err = __rtnl_link_register(&ifb_link_ops);
 
-	for (i = 0; i < numifbs && !err; i++)
+	for (i = 0; i < numifbs && !err; i++) {
 		err = ifb_init_one(i);
+		cond_resched();
+	}
 	if (err)
 		__rtnl_link_unregister(&ifb_link_ops);
 	rtnl_unlock();
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 055/143] dummy: fix oops when loading the dummy failed
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (54 preceding siblings ...)
  2014-05-12  0:32 ` [ 054/143] ifb: fix rcu_sched self-detected stalls Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 056/143] ifb: fix oops when loading the ifb failed Willy Tarreau
                   ` (87 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Tan Xiaojun, Ding Tianhong, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: dingtianhong <dingtianhong@huawei.com>

[ Upstream commit 2c8a01894a12665d8059fad8f0a293c98a264121 ]

We rename the dummy in modprobe.conf like this:

install dummy0 /sbin/modprobe -o dummy0 --ignore-install dummy
install dummy1 /sbin/modprobe -o dummy1 --ignore-install dummy

We got an oops when we ran the commands:

modprobe dummy0
modprobe dummy1

------------[ cut here ]------------

[ 3302.187584] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 3302.195411] IP: [<ffffffff813fe62a>] __rtnl_link_unregister+0x9a/0xd0
[ 3302.201844] PGD 85c94a067 PUD 8517bd067 PMD 0
[ 3302.206305] Oops: 0002 [#1] SMP
[ 3302.299737] task: ffff88105ccea300 ti: ffff880eba4a0000 task.ti: ffff880eba4a0000
[ 3302.307186] RIP: 0010:[<ffffffff813fe62a>]  [<ffffffff813fe62a>] __rtnl_link_unregister+0x9a/0xd0
[ 3302.316044] RSP: 0018:ffff880eba4a1dd8  EFLAGS: 00010246
[ 3302.321332] RAX: 0000000000000000 RBX: ffffffff81a9d738 RCX: 0000000000000002
[ 3302.328436] RDX: 0000000000000000 RSI: ffffffffa04d602c RDI: ffff880eba4a1dd8
[ 3302.335541] RBP: ffff880eba4a1e18 R08: dead000000200200 R09: dead000000100100
[ 3302.342644] R10: 0000000000000080 R11: 0000000000000003 R12: ffffffff81a9d788
[ 3302.349748] R13: ffffffffa04d7020 R14: ffffffff81a9d670 R15: ffff880eba4a1dd8
[ 3302.364910] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3302.370630] CR2: 0000000000000008 CR3: 000000085e15e000 CR4: 00000000000427e0
[ 3302.377734] DR0: 0000000000000003 DR1: 00000000000000b0 DR2: 0000000000000001
[ 3302.384838] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 3302.391940] Stack:
[ 3302.393944]  ffff880eba4a1dd8 ffff880eba4a1dd8 ffff880eba4a1e18 ffffffffa04d70c0
[ 3302.401350]  00000000ffffffef ffffffffa01a8000 0000000000000000 ffffffff816111c8
[ 3302.408758]  ffff880eba4a1e48 ffffffffa01a80be ffff880eba4a1e48 ffffffffa04d70c0
[ 3302.416164] Call Trace:
[ 3302.418605]  [<ffffffffa01a8000>] ? 0xffffffffa01a7fff
[ 3302.423727]  [<ffffffffa01a80be>] dummy_init_module+0xbe/0x1000 [dummy0]
[ 3302.430405]  [<ffffffffa01a8000>] ? 0xffffffffa01a7fff
[ 3302.435535]  [<ffffffff81000322>] do_one_initcall+0x152/0x1b0
[ 3302.441263]  [<ffffffff810ab24b>] do_init_module+0x7b/0x200
[ 3302.446824]  [<ffffffff810ad3d2>] load_module+0x4e2/0x530
[ 3302.452215]  [<ffffffff8127ae40>] ? ddebug_dyndbg_boot_param_cb+0x60/0x60
[ 3302.458979]  [<ffffffff810ad5f1>] SyS_init_module+0xd1/0x130
[ 3302.464627]  [<ffffffff814b9652>] system_call_fastpath+0x16/0x1b
[ 3302.490090] RIP  [<ffffffff813fe62a>] __rtnl_link_unregister+0x9a/0xd0
[ 3302.496607]  RSP <ffff880eba4a1dd8>
[ 3302.500084] CR2: 0000000000000008
[ 3302.503466] ---[ end trace 8342d49cd49f78ed ]---

The reason is that when loading dummy, if __rtnl_link_register() fails,
init_module should return right away instead of taking the wrong path.
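
The error path can be modelled in userspace as follows. This is only a
sketch: model_register() and model_unregister() are hypothetical stand-ins
for __rtnl_link_register() and __rtnl_link_unregister(), and the error
codes are chosen for illustration.

```c
#include <assert.h>
#include <errno.h>

int unregister_calls;	/* how many times the cleanup ran */

/* Hypothetical stand-in for __rtnl_link_register(). */
int model_register(int already_registered)
{
	return already_registered ? -EEXIST : 0;
}

/* Hypothetical stand-in for __rtnl_link_unregister(). */
void model_unregister(void)
{
	unregister_calls++;
}

/* Mirrors the fixed control flow: skip the cleanup when registration
 * itself failed, because there is nothing to undo. */
int model_init_module(int already_registered, int numdevs, int fail_at)
{
	int i, err;

	err = model_register(already_registered);
	if (err < 0)
		goto out;

	for (i = 0; i < numdevs && !err; i++)
		err = (i == fail_at) ? -ENOMEM : 0;
	if (err < 0)
		model_unregister();
out:
	return err;
}

void init_module_example(void)
{
	unregister_calls = 0;

	/* Registration fails: no unregister must happen. */
	assert(model_init_module(1, 2, -1) == -EEXIST);
	assert(unregister_calls == 0);

	/* Registration succeeds but a device fails: undo once. */
	assert(model_init_module(0, 2, 1) == -ENOMEM);
	assert(unregister_calls == 1);
}
```

Without the early "goto out", the first case would fall through to the
cleanup and unregister ops that were never registered, which is the
situation behind the oops above.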

Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/dummy.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index 37dcfdc..9d9de18 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -137,11 +137,15 @@ static int __init dummy_init_module(void)
 
 	rtnl_lock();
 	err = __rtnl_link_register(&dummy_link_ops);
+	if (err < 0)
+		goto out;
 
 	for (i = 0; i < numdummies && !err; i++)
 		err = dummy_init_one();
 	if (err < 0)
 		__rtnl_link_unregister(&dummy_link_ops);
+
+out:
 	rtnl_unlock();
 
 	return err;
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 056/143] ifb: fix oops when loading the ifb failed
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (55 preceding siblings ...)
  2014-05-12  0:32 ` [ 055/143] dummy: fix oops when loading the dummy failed Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 057/143] vlan: fix a race in egress prio management Willy Tarreau
                   ` (86 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Ding Tianhong, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: dingtianhong <dingtianhong@huawei.com>

[ Upstream commit f2966cd5691058b8674a20766525bedeaea9cbcf ]

If __rtnl_link_register() fails when loading ifb, the init code takes
the wrong path and oopses, so fix it just like dummy.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/ifb.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index cfd5510..509c6f5 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -269,6 +269,8 @@ static int __init ifb_init_module(void)
 
 	rtnl_lock();
 	err = __rtnl_link_register(&ifb_link_ops);
+	if (err < 0)
+		goto out;
 
 	for (i = 0; i < numifbs && !err; i++) {
 		err = ifb_init_one(i);
@@ -276,6 +278,8 @@ static int __init ifb_init_module(void)
 	}
 	if (err)
 		__rtnl_link_unregister(&ifb_link_ops);
+
+out:
 	rtnl_unlock();
 
 	return err;
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 057/143] vlan: fix a race in egress prio management
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (56 preceding siblings ...)
  2014-05-12  0:32 ` [ 056/143] ifb: fix oops when loading the ifb failed Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 058/143] arcnet: cleanup sizeof parameter Willy Tarreau
                   ` (85 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Eric Dumazet, Patrick McHardy, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 3e3aac497513c669e1c62c71e1d552ea85c1d974 ]

egress_priority_map[] hash table updates are protected by rtnl,
and we never remove elements until device is dismantled.

We have to make sure that before inserting a new element in the hash table,
all its fields are committed to memory, or else another cpu could
find corrupt values and crash.
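
A userspace analogue of this publish pattern, for illustration only. It
uses C11 release/acquire atomics rather than the kernel's smp_wmb() and
smp_rmb(), and every type and function name here is hypothetical. Writers
are assumed to be serialized externally, as rtnl does in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct prio_mapping {
	struct prio_mapping *next;
	uint32_t priority;
	uint16_t vlan_qos;
};

struct prio_table {
	_Atomic(struct prio_mapping *) head;	/* one hash bucket */
};

/* Fill in every field first, then publish the node. The release store
 * orders the field writes before the pointer becomes visible. */
void prio_publish(struct prio_table *t, struct prio_mapping *np,
		  uint32_t prio, uint16_t qos)
{
	np->next = atomic_load_explicit(&t->head, memory_order_relaxed);
	np->priority = prio;
	np->vlan_qos = qos;
	atomic_store_explicit(&t->head, np, memory_order_release);
}

/* The acquire load pairs with the release store above, so a reader that
 * sees the node also sees its initialized fields. */
int prio_lookup(struct prio_table *t, uint32_t prio, uint16_t *qos)
{
	struct prio_mapping *mp;

	mp = atomic_load_explicit(&t->head, memory_order_acquire);
	for (; mp; mp = mp->next) {
		if (mp->priority == prio) {
			*qos = mp->vlan_qos;
			return 1;
		}
	}
	return 0;
}

void prio_example(void)
{
	struct prio_table t;
	struct prio_mapping a;
	uint16_t qos = 0;

	atomic_init(&t.head, NULL);
	prio_publish(&t, &a, 3, 0x6000);
	assert(prio_lookup(&t, 3, &qos) && qos == 0x6000);
	assert(!prio_lookup(&t, 5, &qos));
}
```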

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/8021q/vlan_dev.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 4198ec5..9796ea4 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -220,6 +220,8 @@ vlan_dev_get_egress_qos_mask(struct net_device *dev, struct sk_buff *skb)
 {
 	struct vlan_priority_tci_mapping *mp;
 
+	smp_rmb(); /* coupled with smp_wmb() in vlan_dev_set_egress_priority() */
+
 	mp = vlan_dev_info(dev)->egress_priority_map[(skb->priority & 0xF)];
 	while (mp) {
 		if (mp->priority == skb->priority) {
@@ -418,6 +420,11 @@ int vlan_dev_set_egress_priority(const struct net_device *dev,
 	np->next = mp;
 	np->priority = skb_prio;
 	np->vlan_qos = vlan_qos;
+	/* Before inserting this element in hash table, make sure all its fields
+	 * are committed to memory.
+	 * coupled with smp_rmb() in vlan_dev_get_egress_qos_mask()
+	 */
+	smp_wmb();
 	vlan->egress_priority_map[skb_prio & 0xF] = np;
 	if (vlan_qos)
 		vlan->nr_egress_mappings++;
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 058/143] arcnet: cleanup sizeof parameter
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (57 preceding siblings ...)
  2014-05-12  0:32 ` [ 057/143] vlan: fix a race in egress prio management Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-05-12  0:32 ` [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary Willy Tarreau
                   ` (84 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Dan Carpenter, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

[ Upstream commit 087d273caf4f7d3f2159256f255f1f432bc84a5b ]

This patch doesn't change the compiled code because ARC_HDR_SIZE is 4
and sizeof(int) is 4, but the intent was to use the header size and not
the sizeof the header size.
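
A small standalone illustration of why the old code only worked by
coincidence. The macro names are hypothetical; the value 4 is the one the
text above gives for ARC_HDR_SIZE, and 6 is an arbitrary second value.

```c
#include <assert.h>
#include <stddef.h>

#define HDR_SIZE_4 4	/* same value as ARC_HDR_SIZE per the text above */
#define HDR_SIZE_6 6	/* hypothetical header size, for contrast */

/* sizeof applied to an integer constant yields sizeof(int), regardless
 * of the constant's value. */
size_t sizeof_hdr_4(void)
{
	return sizeof(HDR_SIZE_4);
}

size_t sizeof_hdr_6(void)
{
	return sizeof(HDR_SIZE_6);
}

void sizeof_example(void)
{
	assert(sizeof_hdr_4() == sizeof(int));
	assert(sizeof_hdr_6() == sizeof(int));
}
```

On a platform where int is 4 bytes, sizeof(HDR_SIZE_4) happens to equal the
header size, while sizeof(HDR_SIZE_6) would silently copy too few bytes.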

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/arcnet/arcnet.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/arcnet/arcnet.c b/drivers/net/arcnet/arcnet.c
index 75a5725..e29940d 100644
--- a/drivers/net/arcnet/arcnet.c
+++ b/drivers/net/arcnet/arcnet.c
@@ -1008,7 +1008,7 @@ static void arcnet_rx(struct net_device *dev, int bufnum)
 
 	soft = &pkt.soft.rfc1201;
 
-	lp->hw.copy_from_card(dev, bufnum, 0, &pkt, sizeof(ARC_HDR_SIZE));
+	lp->hw.copy_from_card(dev, bufnum, 0, &pkt, ARC_HDR_SIZE);
 	if (pkt.hard.offset[0]) {
 		ofs = pkt.hard.offset[0];
 		length = 256 - ofs;
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (58 preceding siblings ...)
  2014-05-12  0:32 ` [ 058/143] arcnet: cleanup sizeof parameter Willy Tarreau
@ 2014-05-12  0:32 ` Willy Tarreau
  2014-06-11 18:46   ` Luis Henriques
  2014-05-12  0:33 ` [ 060/143] sctp: fully initialize sctp_outq in sctp_outq_init Willy Tarreau
                   ` (83 subsequent siblings)
  143 siblings, 1 reply; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:32 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Michal Tesar, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Michal Tesar <mtesar@redhat.com>

[ Upstream commit 651e92716aaae60fc41b9652f54cb6803896e0da ]

Limit the min/max value passed to the
/proc/sys/net/ipv4/tcp_syn_retries.

Signed-off-by: Michal Tesar <mtesar@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv4/sysctl_net_ipv4.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 2dcf04d..910fa54 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -23,6 +23,8 @@
 
 static int zero;
 static int tcp_retr1_max = 255;
+static int tcp_syn_retries_min = 1;
+static int tcp_syn_retries_max = MAX_TCP_SYNCNT;
 static int ip_local_port_range_min[] = { 1, 1 };
 static int ip_local_port_range_max[] = { 65535, 65535 };
 
@@ -237,7 +239,9 @@ static struct ctl_table ipv4_table[] = {
 		.data		= &ipv4_config.no_pmtu_disc,
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &tcp_syn_retries_min,
+		.extra2		= &tcp_syn_retries_max
 	},
 	{
 		.ctl_name	= NET_IPV4_NONLOCAL_BIND,
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 060/143] sctp: fully initialize sctp_outq in sctp_outq_init
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (59 preceding siblings ...)
  2014-05-12  0:32 ` [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 061/143] net_sched: Fix stack info leak in cbq_dump_wrr() Willy Tarreau
                   ` (82 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Neil Horman, Vlad Yasevich, netdev, davem, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Neil Horman <nhorman@tuxdriver.com>

[ Upstream commit c5c7774d7eb4397891edca9ebdf750ba90977a69 ]

In commit 2f94aabd9f6c925d77aecb3ff020f1cc12ed8f86
(refactor sctp_outq_teardown to insure proper re-initalization)
we modified sctp_outq_teardown to use sctp_outq_init to fully re-initialize the
outq structure.  Steve West recently asked me why I removed the q->error = 0
initialization from sctp_outq_teardown.  I did so because I was operating under
the impression that sctp_outq_init would properly initialize that value for us,
but it doesn't.  sctp_outq_init operates under the assumption that the outq
struct is all 0's (as it is when called from sctp_association_init), but using
it in __sctp_outq_teardown violates that assumption. We should do a memset in
sctp_outq_init to ensure that the entire structure is in a known state there
instead.
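
The idea, sketched in userspace with a hypothetical cut-down structure (the
field set is illustrative and not the real struct sctp_outq):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the queue structure. */
struct outq_model {
	void *owner;
	int error;
	int cork;
	int empty;
	unsigned int out_qlen;
};

/* Zero the whole structure first, then set only the fields whose initial
 * value is not zero. Reusing this on a dirty structure is then safe. */
void outq_model_init(struct outq_model *q, void *owner)
{
	memset(q, 0, sizeof(*q));
	q->owner = owner;
	q->empty = 1;
}

void outq_example(void)
{
	struct outq_model q;
	int owner = 0;

	outq_model_init(&q, &owner);
	q.error = -5;		/* simulate state left over from use */
	q.out_qlen = 17;

	outq_model_init(&q, &owner);	/* re-initialize, as teardown does */
	assert(q.error == 0);
	assert(q.out_qlen == 0);
	assert(q.empty == 1);
}
```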

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: "West, Steve (NSN - US/Fort Worth)" <steve.west@nsn.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: netdev@vger.kernel.org
CC: davem@davemloft.net
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/sctp/outqueue.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 23e5e97..bc423b4 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -203,6 +203,8 @@ static inline int sctp_cacc_skip(struct sctp_transport *primary,
  */
 void sctp_outq_init(struct sctp_association *asoc, struct sctp_outq *q)
 {
+	memset(q, 0, sizeof(struct sctp_outq));
+
 	q->asoc = asoc;
 	INIT_LIST_HEAD(&q->out_chunk_list);
 	INIT_LIST_HEAD(&q->control_chunk_list);
@@ -210,13 +212,7 @@ void sctp_outq_init(struct sctp_association *asoc, struct sctp_outq *q)
 	INIT_LIST_HEAD(&q->sacked);
 	INIT_LIST_HEAD(&q->abandoned);
 
-	q->fast_rtx = 0;
-	q->outstanding_bytes = 0;
 	q->empty = 1;
-	q->cork  = 0;
-
-	q->malloced = 0;
-	q->out_qlen = 0;
 }
 
 /* Free the outqueue structure and any related pending chunks.
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 061/143] net_sched: Fix stack info leak in cbq_dump_wrr().
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (60 preceding siblings ...)
  2014-05-12  0:33 ` [ 060/143] sctp: fully initialize sctp_outq in sctp_outq_init Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 062/143] af_key: more info leaks in pfkey messages Willy Tarreau
                   ` (81 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <davem@davemloft.net>

[ Upstream commit a0db856a95a29efb1c23db55c02d9f0ff4f0db48 ]

Make sure the reserved fields, and padding (if any), are
fully initialized.

Based upon a patch by Dan Carpenter and feedback from
Joe Perches.
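
A minimal userspace sketch of the same pattern. The structure is a
hypothetical model with a named reserved field, loosely shaped like an
option block that gets copied out to userspace; it is not the real
struct tc_cbq_wrropt.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical option block with a reserved field. */
struct wrropt_model {
	unsigned char flags;
	unsigned char priority;
	unsigned char cpriority;
	unsigned char reserved;
	unsigned int allot;
	unsigned int weight;
};

/* Clear everything before filling in the known fields, so that fields the
 * code never assigns do not carry old stack contents. */
void wrropt_fill(struct wrropt_model *opt, unsigned int allot,
		 unsigned char prio)
{
	memset(opt, 0, sizeof(*opt));
	opt->allot = allot;
	opt->priority = (unsigned char)(prio + 1);
}

void wrropt_example(void)
{
	struct wrropt_model opt;

	memset(&opt, 0xAA, sizeof(opt));	/* pretend the stack is dirty */
	wrropt_fill(&opt, 1500, 2);
	assert(opt.reserved == 0);
	assert(opt.weight == 0);
	assert(opt.priority == 3);
}
```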

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/sched/sch_cbq.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c
index 5b132c4..8b6f05d 100644
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -1458,6 +1458,7 @@ static __inline__ int cbq_dump_wrr(struct sk_buff *skb, struct cbq_class *cl)
 	unsigned char *b = skb_tail_pointer(skb);
 	struct tc_cbq_wrropt opt;
 
+	memset(&opt, 0, sizeof(opt));
 	opt.flags = 0;
 	opt.allot = cl->allot;
 	opt.priority = cl->priority+1;
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 062/143] af_key: more info leaks in pfkey messages
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (61 preceding siblings ...)
  2014-05-12  0:33 ` [ 061/143] net_sched: Fix stack info leak in cbq_dump_wrr() Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 063/143] net_sched: info leak in atm_tc_dump_class() Willy Tarreau
                   ` (80 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Mathias Krause, Dan Carpenter, Steffen Klassert, David S. Miller,
	Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

[ Upstream commit ff862a4668dd6dba962b1d2d8bd344afa6375683 ]

This is inspired by a5cc68f3d6 "af_key: fix info leaks in notify
messages".  There are some struct members which don't get initialized
and could disclose small amounts of private information.

Acked-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/key/af_key.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/key/af_key.c b/net/key/af_key.c
index 9d22e46..3f55faa 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -2079,6 +2079,7 @@ static int pfkey_xfrm_policy2msg(struct sk_buff *skb, struct xfrm_policy *xp, in
 			pol->sadb_x_policy_type = IPSEC_POLICY_NONE;
 	}
 	pol->sadb_x_policy_dir = dir+1;
+	pol->sadb_x_policy_reserved = 0;
 	pol->sadb_x_policy_id = xp->index;
 	pol->sadb_x_policy_priority = xp->priority;
 
@@ -3111,7 +3112,9 @@ static int pfkey_send_acquire(struct xfrm_state *x, struct xfrm_tmpl *t, struct
 	pol->sadb_x_policy_exttype = SADB_X_EXT_POLICY;
 	pol->sadb_x_policy_type = IPSEC_POLICY_IPSEC;
 	pol->sadb_x_policy_dir = dir+1;
+	pol->sadb_x_policy_reserved = 0;
 	pol->sadb_x_policy_id = xp->index;
+	pol->sadb_x_policy_priority = xp->priority;
 
 	/* Set sadb_comb's. */
 	if (x->id.proto == IPPROTO_AH)
@@ -3499,6 +3502,7 @@ static int pfkey_send_migrate(struct xfrm_selector *sel, u8 dir, u8 type,
 	pol->sadb_x_policy_exttype = SADB_X_EXT_POLICY;
 	pol->sadb_x_policy_type = IPSEC_POLICY_IPSEC;
 	pol->sadb_x_policy_dir = dir + 1;
+	pol->sadb_x_policy_reserved = 0;
 	pol->sadb_x_policy_id = 0;
 	pol->sadb_x_policy_priority = 0;
 
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 063/143] net_sched: info leak in atm_tc_dump_class()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (62 preceding siblings ...)
  2014-05-12  0:33 ` [ 062/143] af_key: more info leaks in pfkey messages Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 064/143] htb: fix sign extension bug Willy Tarreau
                   ` (79 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Dan Carpenter, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

[ Upstream commit 8cb3b9c3642c0263d48f31d525bcee7170eedc20 ]

The "pvc" struct has a hole after pvc.sap_family which is not cleared.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/sched/sch_atm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/sched/sch_atm.c b/net/sched/sch_atm.c
index ab82f14..b022c59 100644
--- a/net/sched/sch_atm.c
+++ b/net/sched/sch_atm.c
@@ -628,6 +628,7 @@ static int atm_tc_dump_class(struct Qdisc *sch, unsigned long cl,
 		struct sockaddr_atmpvc pvc;
 		int state;
 
+		memset(&pvc, 0, sizeof(pvc));
 		pvc.sap_family = AF_ATMPVC;
 		pvc.sap_addr.itf = flow->vcc->dev ? flow->vcc->dev->number : -1;
 		pvc.sap_addr.vpi = flow->vcc->vpi;
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 064/143] htb: fix sign extension bug
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (63 preceding siblings ...)
  2014-05-12  0:33 ` [ 063/143] net_sched: info leak in atm_tc_dump_class() Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 065/143] net: check net.core.somaxconn sysctl values Willy Tarreau
                   ` (78 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Stephen Hemminger, Eric Dumazet, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: stephen hemminger <stephen@networkplumber.org>

[ Upstream commit cbd375567f7e4811b1c721f75ec519828ac6583f ]

When userspace passes a large priority value,
the assignment of the unsigned value hopt->prio
to the signed int cl->prio causes cl->prio to become negative, so the
comparison with TC_HTB_NUMPRIO is always false.

The result is that HTB crashes by referencing outside
the array when processing packets. With this patch the large value
wraps around like other values outside the normal range.
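
The effect can be reproduced in userspace. This sketch assumes that the
range check caps the value at the maximum priority index (that check is not
shown in this patch), that NUMPRIO stands in for TC_HTB_NUMPRIO with an
assumed value of 8, and that converting 0xffffffff to int yields -1, as it
does on common ABIs.

```c
#include <assert.h>
#include <stdint.h>

#define NUMPRIO 8	/* stands in for TC_HTB_NUMPRIO; value assumed */

/* Old behaviour: the value is stored in a signed int before the check. */
int prio_check_signed(uint32_t user_prio)
{
	int prio = (int)user_prio;	/* 0xffffffff becomes -1 here */

	if (prio >= NUMPRIO)
		prio = NUMPRIO - 1;
	return prio;			/* may still be negative */
}

/* New behaviour: the value stays unsigned, so the check catches it. */
uint32_t prio_check_unsigned(uint32_t user_prio)
{
	uint32_t prio = user_prio;

	if (prio >= NUMPRIO)
		prio = NUMPRIO - 1;
	return prio;
}

void prio_check_example(void)
{
	assert(prio_check_signed(0xffffffffu) < 0);	/* escapes the check */
	assert(prio_check_unsigned(0xffffffffu) == NUMPRIO - 1);
}
```

A negative result used as an array index is what leads to the out-of-bounds
reference described above.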

See: https://bugzilla.kernel.org/show_bug.cgi?id=60669

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/sched/sch_htb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 2f074d6..9ce5963 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -85,7 +85,7 @@ struct htb_class {
 	unsigned int children;
 	struct htb_class *parent;	/* parent class */
 
-	int prio;		/* these two are used only by leaves... */
+	u32 prio;		/* these two are used only by leaves... */
 	int quantum;		/* but stored for parent-to-leaf return */
 
 	union {
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 065/143] net: check net.core.somaxconn sysctl values
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (64 preceding siblings ...)
  2014-05-12  0:33 ` [ 064/143] htb: fix sign extension bug Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 066/143] tcp: cubic: fix bug in bictcp_acked() Willy Tarreau
                   ` (77 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Roman Gushchin, Eric Dumazet, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Roman Gushchin <klamm@yandex-team.ru>

[ Upstream commit 5f671d6b4ec3e6d66c2a868738af2cdea09e7509 ]

It's possible to assign an invalid value to the net.core.somaxconn
sysctl variable, because there are no checks at all.

The sk_max_ack_backlog field of the sock structure is defined as
unsigned short. Therefore, the backlog argument in inet_listen()
shouldn't exceed USHRT_MAX. The backlog argument in the listen() syscall
is truncated to the somaxconn value. So, the somaxconn value shouldn't
exceed 65535 (USHRT_MAX).
Also, negative values of somaxconn are meaningless.

before:
$ sysctl -w net.core.somaxconn=256
net.core.somaxconn = 256
$ sysctl -w net.core.somaxconn=65536
net.core.somaxconn = 65536
$ sysctl -w net.core.somaxconn=-100
net.core.somaxconn = -100

after:
$ sysctl -w net.core.somaxconn=256
net.core.somaxconn = 256
$ sysctl -w net.core.somaxconn=65536
error: "Invalid argument" setting key "net.core.somaxconn"
$ sysctl -w net.core.somaxconn=-100
error: "Invalid argument" setting key "net.core.somaxconn"
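
The range check behind the "after" behaviour can be modelled like this. It
is a userspace sketch with hypothetical names, not the kernel's
proc_dointvec_minmax() implementation; the bounds are the ones given above.

```c
#include <assert.h>
#include <errno.h>

static const int somaxconn_min = 0;
static const int somaxconn_max = 65535;	/* USHRT_MAX */

/* Hypothetical model of a bounds-checked sysctl write: values outside
 * [min, max] are rejected and the stored value is left untouched. */
int somaxconn_set(int *slot, int val)
{
	if (val < somaxconn_min || val > somaxconn_max)
		return -EINVAL;
	*slot = val;
	return 0;
}

void somaxconn_example(void)
{
	int somaxconn = 128;

	assert(somaxconn_set(&somaxconn, 256) == 0);
	assert(somaxconn_set(&somaxconn, 65536) == -EINVAL);
	assert(somaxconn_set(&somaxconn, -100) == -EINVAL);
	assert(somaxconn == 256);	/* rejected writes changed nothing */
}
```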

Based on a prior patch from Changli Gao.

Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Reported-by: Changli Gao <xiaosuo@gmail.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/core/sysctl_net_core.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 7db1de0..e2eaf29 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -14,6 +14,9 @@
 #include <net/ip.h>
 #include <net/sock.h>
 
+static int zero = 0;
+static int ushort_max = 65535;
+
 static struct ctl_table net_core_table[] = {
 #ifdef CONFIG_NET
 	{
@@ -116,7 +119,9 @@ static struct ctl_table netns_core_table[] = {
 		.data		= &init_net.core.sysctl_somaxconn,
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec
+		.extra1		= &zero,
+		.extra2		= &ushort_max,
+		.proc_handler	= proc_dointvec_minmax
 	},
 	{ .ctl_name = 0 }
 };
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 066/143] tcp: cubic: fix bug in bictcp_acked()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (65 preceding siblings ...)
  2014-05-12  0:33 ` [ 065/143] net: check net.core.somaxconn sysctl values Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 067/143] ipv6: dont stop backtracking in fib6_lookup_1 if subtree does not Willy Tarreau
                   ` (76 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Eric Dumazet, Neal Cardwell, Yuchung Cheng, David S. Miller,
	Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit cd6b423afd3c08b27e1fed52db828ade0addbc6b ]

While investigating a strange increase in retransmit rates
on hosts ~24 days after boot, Van found hystart was disabled
if ca->epoch_start was 0, as the following condition is true
when the tcp_time_stamp high order bit is set.

(s32)(tcp_time_stamp - ca->epoch_start) < HZ

Quoting Van :

 At initialization & after every loss ca->epoch_start is set to zero so
 I believe that the above line will turn off hystart as soon as the 2^31
 bit is set in tcp_time_stamp & hystart will stay off for 24 days.
 I think we've observed that cubic's restart is too aggressive without
 hystart so this might account for the higher drop rate we observe.
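
The arithmetic can be checked in userspace. This sketch assumes a tick rate
of 1000 per second (MODEL_HZ is a hypothetical stand-in for HZ) and the
usual two's-complement conversion from uint32_t to int32_t.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_HZ 1000	/* assumed tick rate, stands in for HZ */

/* Condition before the patch. */
int discard_sample_old(uint32_t now, uint32_t epoch_start)
{
	return (int32_t)(now - epoch_start) < MODEL_HZ;
}

/* Condition after the patch: an unset epoch never discards samples. */
int discard_sample_new(uint32_t now, uint32_t epoch_start)
{
	return epoch_start && (int32_t)(now - epoch_start) < MODEL_HZ;
}

void cubic_example(void)
{
	uint32_t now = 0x80000001u;	/* high order bit set */

	/* With epoch_start == 0 the difference is negative as s32, so the
	 * old test always discarded the sample. */
	assert(discard_sample_old(now, 0) == 1);
	assert(discard_sample_new(now, 0) == 0);
}
```

At 1000 ticks per second, 2^31 ticks is roughly 24.9 days, which matches
the "~24 days after boot" observation.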

Diagnosed-by: Van Jacobson <vanj@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv4/tcp_cubic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
index 71d5f2f..0d41c26 100644
--- a/net/ipv4/tcp_cubic.c
+++ b/net/ipv4/tcp_cubic.c
@@ -388,7 +388,7 @@ static void bictcp_acked(struct sock *sk, u32 cnt, s32 rtt_us)
 		return;
 
 	/* Discard delay samples right after fast recovery */
-	if ((s32)(tcp_time_stamp - ca->epoch_start) < HZ)
+	if (ca->epoch_start && (s32)(tcp_time_stamp - ca->epoch_start) < HZ)
 		return;
 
 	delay = usecs_to_jiffies(rtt_us) << 3;
-- 
1.7.12.2.21.g234cd45.dirty





* [ 067/143] ipv6: dont stop backtracking in fib6_lookup_1 if subtree does not
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (66 preceding siblings ...)
  2014-05-12  0:33 ` [ 066/143] tcp: cubic: fix bug in bictcp_acked() Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 068/143] ipv6: remove max_addresses check from ipv6_create_tempaddr Willy Tarreau
                   ` (75 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: YOSHIFUJI Hideaki, David Lamparter, boutier,
	Hannes Frederic Sowa, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------
 match

From: Hannes Frederic Sowa <hannes@stressinduktion.org>

[ Upstream commit 3e3be275851bc6fc90bfdcd732cd95563acd982b ]

In case a subtree did not match we currently stop backtracking and return
NULL (root table from fib_lookup). This could result in invalid routing
table lookups when using subtrees.

Instead continue to backtrack until a valid subtree or node is found
and return this match.

Also remove unneeded NULL check.
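
The behaviour change can be sketched with a heavily simplified model.
Everything below is hypothetical: struct node, lookup_old() and
lookup_new() are illustrative names, the subtree lookup is reduced to
a precomputed pointer, and a subtree match is assumed to always carry
routing information. It is not the fib6 code, only the control flow.

```c
#include <stddef.h>

struct node {
	struct node *parent;        /* next node to try when backtracking */
	int has_subtree;            /* node carries a subtree */
	struct node *subtree_match; /* result of the subtree lookup, or NULL */
	int has_route;              /* node carries routing information */
};

/* Old behaviour: a failed subtree lookup ends the whole lookup. */
static struct node *lookup_old(struct node *fn)
{
	for (; fn; fn = fn->parent) {
		if (fn->has_subtree)
			return fn->subtree_match;
		if (fn->has_route)
			return fn;
	}
	return NULL;
}

/* New behaviour: a failed subtree lookup backtracks to the parent. */
static struct node *lookup_new(struct node *fn)
{
	for (; fn; fn = fn->parent) {
		if (fn->has_subtree) {
			if (fn->subtree_match)
				return fn->subtree_match;
			continue;
		}
		if (fn->has_route)
			return fn;
	}
	return NULL;
}

/* Returns 1 if the given lookup falls back to the parent's route. */
static int finds_fallback(struct node *(*lookup)(struct node *))
{
	struct node top = { NULL, 0, NULL, 1 };
	struct node leaf = { &top, 1, NULL, 0 };

	return lookup(&leaf) == &top;
}
```

In this model finds_fallback(lookup_old) is 0 and
finds_fallback(lookup_new) is 1.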

Reported-by: Teco Boot <teco@inf-net.nl>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: David Lamparter <equinox@diac24.net>
Cc: <boutier@pps.univ-paris-diderot.fr>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv6/ip6_fib.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 0e93ca5..0a36d8d 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -846,14 +846,22 @@ static struct fib6_node * fib6_lookup_1(struct fib6_node *root,
 
 			if (ipv6_prefix_equal(&key->addr, args->addr, key->plen)) {
 #ifdef CONFIG_IPV6_SUBTREES
-				if (fn->subtree)
-					fn = fib6_lookup_1(fn->subtree, args + 1);
+				if (fn->subtree) {
+					struct fib6_node *sfn;
+					sfn = fib6_lookup_1(fn->subtree,
+							    args + 1);
+					if (!sfn)
+						goto backtrack;
+					fn = sfn;
+				}
 #endif
-				if (!fn || fn->fn_flags & RTN_RTINFO)
+				if (fn->fn_flags & RTN_RTINFO)
 					return fn;
 			}
 		}
-
+#ifdef CONFIG_IPV6_SUBTREES
+backtrack:
+#endif
 		if (fn->fn_flags & RTN_ROOT)
 			break;
 
-- 
1.7.12.2.21.g234cd45.dirty





* [ 068/143] ipv6: remove max_addresses check from ipv6_create_tempaddr
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (67 preceding siblings ...)
  2014-05-12  0:33 ` [ 067/143] ipv6: dont stop backtracking in fib6_lookup_1 if subtree does not Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 069/143] ipv6: drop packets with multiple fragmentation headers Willy Tarreau
                   ` (74 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Ding Tianhong, George Kargiotakis, P J P, YOSHIFUJI Hideaki,
	Hannes Frederic Sowa, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <hannes@stressinduktion.org>

[ Upstream commit 4b08a8f1bd8cb4541c93ec170027b4d0782dab52 ]

Because of the max_addresses check attackers were able to disable privacy
extensions on an interface by creating enough autoconfigured addresses:

<http://seclists.org/oss-sec/2012/q4/292>

But the check is not actually needed: max_addresses protects the
kernel from installing too many ipv6 addresses on an interface and
guards addrconf_prefix_rcv against installing further addresses as
soon as this limit is reached. We only generate temporary addresses in
direct response to a new address showing up. As soon as we have filled
up the maximum number of addresses of an interface, we stop installing
more addresses and thus also stop generating more temp addresses.

Even if the attacker tries to generate a lot of temporary addresses
by announcing a prefix and removing it again (lifetime == 0) we won't
install more temp addresses, because the temporary addresses do count
towards the maximum number of addresses, thus we would stop installing
new autoconfigured addresses when the limit is reached.

This patch fixes CVE-2013-0343 (but other layer-2 attacks are still
possible).

Thanks to Ding Tianhong for bringing this topic up again.

Cc: Ding Tianhong <dingtianhong@huawei.com>
Cc: George Kargiotakis <kargig@void.gr>
Cc: P J P <ppandit@redhat.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv6/addrconf.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 8ac3d09..e8c4fd9 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -920,12 +920,10 @@ retry:
 	if (ifp->flags & IFA_F_OPTIMISTIC)
 		addr_flags |= IFA_F_OPTIMISTIC;
 
-	ift = !max_addresses ||
-	      ipv6_count_addresses(idev) < max_addresses ?
-		ipv6_add_addr(idev, &addr, tmp_plen,
-			      ipv6_addr_type(&addr)&IPV6_ADDR_SCOPE_MASK,
-			      addr_flags) : NULL;
-	if (!ift || IS_ERR(ift)) {
+	ift = ipv6_add_addr(idev, &addr, tmp_plen,
+			    ipv6_addr_type(&addr)&IPV6_ADDR_SCOPE_MASK,
+			    addr_flags);
+	if (IS_ERR(ift)) {
 		in6_ifa_put(ifp);
 		in6_dev_put(idev);
 		printk(KERN_INFO
-- 
1.7.12.2.21.g234cd45.dirty





* [ 069/143] ipv6: drop packets with multiple fragmentation headers
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (68 preceding siblings ...)
  2014-05-12  0:33 ` [ 068/143] ipv6: remove max_addresses check from ipv6_create_tempaddr Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 070/143] ipv6: Dont depend on per socket memory for neighbour discovery Willy Tarreau
                   ` (73 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: YOSHIFUJI Hideaki, Hannes Frederic Sowa, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <hannes@stressinduktion.org>

[ Upstream commit f46078cfcd77fa5165bf849f5e568a7ac5fa569c ]

It is not allowed for an ipv6 packet to contain multiple fragmentation
headers. So discard packets which were already reassembled by
fragmentation logic and send back a parameter problem icmp.
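
The idea can be sketched in userspace as follows. This is only an
illustration: struct pkt_cb and frag_hdr_rcv() are hypothetical
stand-ins for the skb control block and the fragment header handler,
reassembly itself is omitted, and only the flag value is taken from
the patch.

```c
#define IP6SKB_FRAGMENTED 16 /* flag value taken from the patch */

struct pkt_cb {
	unsigned int flags; /* per-packet control block flags */
};

/*
 * Called once per fragment header found in a packet.
 * Returns 1 to continue parsing, -1 to drop the packet.
 */
static int frag_hdr_rcv(struct pkt_cb *cb)
{
	if (cb->flags & IP6SKB_FRAGMENTED)
		return -1; /* a fragment header was already processed */

	cb->flags |= IP6SKB_FRAGMENTED;
	return 1;
}

/* Result of the first fragment header seen in a fresh packet. */
static int first_frag_hdr_result(void)
{
	struct pkt_cb cb = { 0 };

	return frag_hdr_rcv(&cb);
}

/* Result of a second fragment header in the same packet. */
static int second_frag_hdr_result(void)
{
	struct pkt_cb cb = { 0 };

	(void)frag_hdr_rcv(&cb);
	return frag_hdr_rcv(&cb);
}
```

The first header is accepted and marks the packet; any later fragment
header in the same packet is rejected.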

The updates for RFC 6980 will come in later, I have to do a bit more
research here.

Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 include/linux/ipv6.h  | 1 +
 net/ipv6/reassembly.c | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index c662efa..5bf3324 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -248,6 +248,7 @@ struct inet6_skb_parm {
 
 #define IP6SKB_XFRM_TRANSFORMED	1
 #define IP6SKB_FORWARDED	2
+#define IP6SKB_FRAGMENTED      16
 };
 
 #define IP6CB(skb)	((struct inet6_skb_parm*)((skb)->cb))
diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
index 105de22..0c09d8e 100644
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -503,6 +503,7 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
 	head->tstamp = fq->q.stamp;
 	ipv6_hdr(head)->payload_len = htons(payload_len);
 	IP6CB(head)->nhoff = nhoff;
+	IP6CB(head)->flags |= IP6SKB_FRAGMENTED;
 
 	/* Yes, and fold redundant checksum back. 8) */
 	if (head->ip_summed == CHECKSUM_COMPLETE)
@@ -537,6 +538,9 @@ static int ipv6_frag_rcv(struct sk_buff *skb)
 	struct ipv6hdr *hdr = ipv6_hdr(skb);
 	struct net *net = dev_net(skb_dst(skb)->dev);
 
+	if (IP6CB(skb)->flags & IP6SKB_FRAGMENTED)
+		goto fail_hdr;
+
 	IP6_INC_STATS_BH(net, ip6_dst_idev(skb_dst(skb)), IPSTATS_MIB_REASMREQDS);
 
 	/* Jumbo payload inhibits frag. header */
@@ -557,6 +561,7 @@ static int ipv6_frag_rcv(struct sk_buff *skb)
 				 ip6_dst_idev(skb_dst(skb)), IPSTATS_MIB_REASMOKS);
 
 		IP6CB(skb)->nhoff = (u8 *)fhdr - skb_network_header(skb);
+		IP6CB(skb)->flags |= IP6SKB_FRAGMENTED;
 		return 1;
 	}
 
-- 
1.7.12.2.21.g234cd45.dirty





* [ 070/143] ipv6: Dont depend on per socket memory for neighbour discovery
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (69 preceding siblings ...)
  2014-05-12  0:33 ` [ 069/143] ipv6: drop packets with multiple fragmentation headers Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 071/143] ICMPv6: treat dest unreachable codes 5 and 6 as EACCES, not EPROTO Willy Tarreau
                   ` (72 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Thomas Graf, Eric Dumazet, Hannes Frederic Sowa, Stephen Warren,
	Fabio Estevam, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------
 messages

From: Thomas Graf <tgraf@suug.ch>

[ Upstream commit 25a6e6b84fba601eff7c28d30da8ad7cfbef0d43 ]

Allocating skbs when sending out neighbour discovery messages
currently uses sock_alloc_send_skb() based on a per net namespace
socket and thus shares a socket wmem buffer space.

If a netdevice is temporarily unable to transmit due to carrier
loss or for other reasons, the queued up ndisc messages will consume
all of the wmem space and will thus prevent any more skbs from
being allocated even for netdevices that are able to transmit packets.

The number of neighbour discovery messages sent is very limited.
Use of alloc_skb() bypasses the socket wmem buffer size enforcement
while the manual call to skb_set_owner_w() maintains the socket
reference needed for the IPv6 output path.

This patch was originally posted by Eric Dumazet in a modified
form.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Fabio Estevam <festevam@gmail.com>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv6/ndisc.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index f74e4e2..752da21 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -449,7 +449,6 @@ struct sk_buff *ndisc_build_skb(struct net_device *dev,
 	struct sk_buff *skb;
 	struct icmp6hdr *hdr;
 	int len;
-	int err;
 	u8 *opt;
 
 	if (!dev->addr_len)
@@ -459,14 +458,12 @@ struct sk_buff *ndisc_build_skb(struct net_device *dev,
 	if (llinfo)
 		len += ndisc_opt_addr_space(dev);
 
-	skb = sock_alloc_send_skb(sk,
-				  (MAX_HEADER + sizeof(struct ipv6hdr) +
-				   len + LL_ALLOCATED_SPACE(dev)),
-				  1, &err);
+	skb = alloc_skb((MAX_HEADER + sizeof(struct ipv6hdr) +
+			 len + LL_ALLOCATED_SPACE(dev)), GFP_ATOMIC);
 	if (!skb) {
 		ND_PRINTK0(KERN_ERR
-			   "ICMPv6 ND: %s() failed to allocate an skb, err=%d.\n",
-			   __func__, err);
+			   "ICMPv6 ND: %s() failed to allocate an skb.\n",
+			   __func__);
 		return NULL;
 	}
 
@@ -494,6 +491,11 @@ struct sk_buff *ndisc_build_skb(struct net_device *dev,
 					   csum_partial(hdr,
 							len, 0));
 
+	/* Manually assign socket ownership as we avoid calling
+	 * sock_alloc_send_pskb() to bypass wmem buffer limits
+	 */
+	skb_set_owner_w(skb, sk);
+
 	return skb;
 }
 
-- 
1.7.12.2.21.g234cd45.dirty





* [ 071/143] ICMPv6: treat dest unreachable codes 5 and 6 as EACCES, not EPROTO
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (70 preceding siblings ...)
  2014-05-12  0:33 ` [ 070/143] ipv6: Dont depend on per socket memory for neighbour discovery Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 072/143] tipc: fix lockdep warning during bearer initialization Willy Tarreau
                   ` (71 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Jiri Bohac, Hannes Frederic Sowa, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Jiri Bohac <jbohac@suse.cz>

[ Upstream commit 61e76b178dbe7145e8d6afa84bb4ccea71918994 ]

RFC 4443 has defined two additional codes for ICMPv6 type 1 (destination
unreachable) messages:
        5 - Source address failed ingress/egress policy
	6 - Reject route to destination

Currently they are treated as a protocol error and icmpv6_err_convert()
converts them to EPROTO.

RFC 4443 says:
	"Codes 5 and 6 are more informative subsets of code 1."

Treat codes 5 and 6 as code 1 (EACCES)
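
A table-driven conversion of this kind can be sketched in userspace as
below. The names struct err_entry, unreach_err_convert() and
unreach_errno() are hypothetical. The entries for codes 4 to 6 follow
the diff; the entries for codes 0 to 3 are assumptions added only to
fill the table. Bounding the index with ARRAY_SIZE() keeps the check
correct when further codes are appended.

```c
#include <errno.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct err_entry {
	int err;
	int fatal;
};

static const struct err_entry tab_unreach[] = {
	{ ENETUNREACH,  0 }, /* code 0: assumed for illustration */
	{ EACCES,       1 }, /* code 1: assumed for illustration */
	{ EHOSTUNREACH, 0 }, /* code 2: assumed for illustration */
	{ EHOSTUNREACH, 0 }, /* code 3: assumed for illustration */
	{ ECONNREFUSED, 1 }, /* code 4: port unreachable */
	{ EACCES,       1 }, /* code 5: policy fail */
	{ EACCES,       1 }, /* code 6: reject route */
};

/* Stores the errno for a destination unreachable code, returns fatal. */
static int unreach_err_convert(unsigned int code, int *err)
{
	int fatal = 1;

	*err = EPROTO; /* default for unknown codes */
	if (code < ARRAY_SIZE(tab_unreach)) {
		*err = tab_unreach[code].err;
		fatal = tab_unreach[code].fatal;
	}
	return fatal;
}

/* Convenience wrapper returning only the errno value. */
static int unreach_errno(unsigned int code)
{
	int err;

	(void)unreach_err_convert(code, &err);
	return err;
}
```

With this table, codes 5 and 6 yield EACCES, while an unknown code
such as 7 still falls back to EPROTO.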

Btw, connect() returning -EPROTO confuses firefox, so that fallback to
other/IPv4 addresses does not work:
https://bugzilla.mozilla.org/show_bug.cgi?id=910773

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 include/linux/icmpv6.h |  2 ++
 net/ipv6/icmp.c        | 10 +++++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/linux/icmpv6.h b/include/linux/icmpv6.h
index c0d8357..2e3d33c 100644
--- a/include/linux/icmpv6.h
+++ b/include/linux/icmpv6.h
@@ -123,6 +123,8 @@ static inline struct icmp6hdr *icmp6_hdr(const struct sk_buff *skb)
 #define ICMPV6_NOT_NEIGHBOUR		2
 #define ICMPV6_ADDR_UNREACH		3
 #define ICMPV6_PORT_UNREACH		4
+#define ICMPV6_POLICY_FAIL		5
+#define ICMPV6_REJECT_ROUTE		6
 
 /*
  *	Codes for Time Exceeded
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index f23ebbe..376a4b6 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -903,6 +903,14 @@ static const struct icmp6_err {
 		.err	= ECONNREFUSED,
 		.fatal	= 1,
 	},
+	{	/* POLICY_FAIL */
+		.err	= EACCES,
+		.fatal	= 1,
+	},
+	{	/* REJECT_ROUTE	*/
+		.err	= EACCES,
+		.fatal	= 1,
+	},
 };
 
 int icmpv6_err_convert(u8 type, u8 code, int *err)
@@ -914,7 +922,7 @@ int icmpv6_err_convert(u8 type, u8 code, int *err)
 	switch (type) {
 	case ICMPV6_DEST_UNREACH:
 		fatal = 1;
-		if (code <= ICMPV6_PORT_UNREACH) {
+		if (code < ARRAY_SIZE(tab_unreach)) {
 			*err  = tab_unreach[code].err;
 			fatal = tab_unreach[code].fatal;
 		}
-- 
1.7.12.2.21.g234cd45.dirty





* [ 072/143] tipc: fix lockdep warning during bearer initialization
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (71 preceding siblings ...)
  2014-05-12  0:33 ` [ 071/143] ICMPv6: treat dest unreachable codes 5 and 6 as EACCES, not EPROTO Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12 16:04   ` Jon Maloy
  2014-05-12  0:33 ` [ 073/143] net: Fix "ip rule delete table 256" Willy Tarreau
                   ` (70 subsequent siblings)
  143 siblings, 1 reply; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Ying Xue, Jon Maloy, Paul Gortmaker, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Ying Xue <ying.xue@windriver.com>

[ Upstream commit 4225a398c1352a7a5c14dc07277cb5cc4473983b ]

When the lockdep validator is enabled, it will report the below
warning when we enable a TIPC bearer:

[ INFO: possible irq lock inversion dependency detected ]
---------------------------------------------------------
Possible interrupt unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(ptype_lock);
                                local_irq_disable();
                                lock(tipc_net_lock);
                                lock(ptype_lock);
   <Interrupt>
   lock(tipc_net_lock);

  *** DEADLOCK ***

the shortest dependencies between 2nd lock and 1st lock:
  -> (ptype_lock){+.+...} ops: 10 {
[...]
SOFTIRQ-ON-W at:
                      [<c1089418>] __lock_acquire+0x528/0x13e0
                      [<c108a360>] lock_acquire+0x90/0x100
                      [<c1553c38>] _raw_spin_lock+0x38/0x50
                      [<c14651ca>] dev_add_pack+0x3a/0x60
                      [<c182da75>] arp_init+0x1a/0x48
                      [<c182dce5>] inet_init+0x181/0x27e
                      [<c1001114>] do_one_initcall+0x34/0x170
                      [<c17f7329>] kernel_init+0x110/0x1b2
                      [<c155b6a2>] kernel_thread_helper+0x6/0x10
[...]
   ... key      at: [<c17e4b10>] ptype_lock+0x10/0x20
   ... acquired at:
    [<c108a360>] lock_acquire+0x90/0x100
    [<c1553c38>] _raw_spin_lock+0x38/0x50
    [<c14651ca>] dev_add_pack+0x3a/0x60
    [<c8bc18d2>] enable_bearer+0xf2/0x140 [tipc]
    [<c8bb283a>] tipc_enable_bearer+0x1ba/0x450 [tipc]
    [<c8bb3a04>] tipc_cfg_do_cmd+0x5c4/0x830 [tipc]
    [<c8bbc032>] handle_cmd+0x42/0xd0 [tipc]
    [<c148e802>] genl_rcv_msg+0x232/0x280
    [<c148d3f6>] netlink_rcv_skb+0x86/0xb0
    [<c148e5bc>] genl_rcv+0x1c/0x30
    [<c148d144>] netlink_unicast+0x174/0x1f0
    [<c148ddab>] netlink_sendmsg+0x1eb/0x2d0
    [<c1456bc1>] sock_aio_write+0x161/0x170
    [<c1135a7c>] do_sync_write+0xac/0xf0
    [<c11360f6>] vfs_write+0x156/0x170
    [<c11361e2>] sys_write+0x42/0x70
    [<c155b0df>] sysenter_do_call+0x12/0x38
[...]
}
  -> (tipc_net_lock){+..-..} ops: 4 {
[...]
    IN-SOFTIRQ-R at:
                     [<c108953a>] __lock_acquire+0x64a/0x13e0
                     [<c108a360>] lock_acquire+0x90/0x100
                     [<c15541cd>] _raw_read_lock_bh+0x3d/0x50
                     [<c8bb874d>] tipc_recv_msg+0x1d/0x830 [tipc]
                     [<c8bc195f>] recv_msg+0x3f/0x50 [tipc]
                     [<c146a5fa>] __netif_receive_skb+0x22a/0x590
                     [<c146ab0b>] netif_receive_skb+0x2b/0xf0
                     [<c13c43d2>] pcnet32_poll+0x292/0x780
                     [<c146b00a>] net_rx_action+0xfa/0x1e0
                     [<c103a4be>] __do_softirq+0xae/0x1e0
[...]
}

From the log, we can see three different call chains between
CPU0 and CPU1:

Time 0 on CPU0:

  kernel_init()->inet_init()->dev_add_pack()

At time 0, the ptype_lock is held by CPU0 in dev_add_pack();

Time 1 on CPU1:

  tipc_enable_bearer()->enable_bearer()->dev_add_pack()

At time 1, tipc_enable_bearer() first holds tipc_net_lock, and then
wants to take ptype_lock to register TIPC protocol handler into the
networking stack.  But the ptype_lock has been taken by dev_add_pack()
on CPU0, so at this time the dev_add_pack() running on CPU1 has to be
busy looping.

Time 2 on CPU0:

  netif_receive_skb()->recv_msg()->tipc_recv_msg()

At time 2, an incoming TIPC packet arrives at CPU0, hence
tipc_recv_msg() will be invoked. In tipc_recv_msg(), it first wants
to hold tipc_net_lock.  At the moment, below scenario happens:

On CPU0, below is our sequence of taking locks:

  lock(ptype_lock)->lock(tipc_net_lock)

On CPU1, our sequence of taking locks looks like:

  lock(tipc_net_lock)->lock(ptype_lock)

Obviously deadlock may happen in this case.

Note, however, that the deadlock is unlikely to occur when the
first TIPC bearer is enabled. Until enable_bearer() running on
CPU1 has taken ptype_lock, the TIPC receive handler (i.e.
recv_msg()) is not yet registered via dev_add_pack(), so
tipc_recv_msg() cannot be called by recv_msg() even if a TIPC
message arrives at CPU0. But when the second TIPC bearer is
registered, the deadlock can really happen.

To fix it, we will push the work of registering TIPC protocol
handler into workqueue context. After the change, both paths taking
ptype_lock are always in process contexts, thus, the deadlock should
never occur.
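
The deferral pattern can be illustrated in userspace. This is only a
sketch under stated assumptions: struct work_item, schedule_item() and
run_pending() are hypothetical stand-ins for the kernel workqueue API,
there is no locking or threading here, and setting a flag stands in
for dev_add_pack(). It shows how a work item embedded in a larger
structure lets the handler recover its owner with container_of().

```c
#include <stddef.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct work_item {
	void (*fn)(struct work_item *);
};

struct bearer {
	int registered;         /* set once the handler is registered */
	struct work_item setup; /* deferred registration work */
};

#define MAX_PENDING 8
static struct work_item *pending[MAX_PENDING];
static size_t n_pending;

/* Queue a work item for later execution; returns -1 if the queue is full. */
static int schedule_item(struct work_item *w)
{
	if (n_pending == MAX_PENDING)
		return -1;
	pending[n_pending++] = w;
	return 0;
}

/* Run all queued items, as a worker would do later in its own context. */
static void run_pending(void)
{
	size_t i;

	for (i = 0; i < n_pending; i++)
		pending[i]->fn(pending[i]);
	n_pending = 0;
}

/* Deferred handler: recover the bearer from the embedded work item. */
static void setup_bearer(struct work_item *w)
{
	struct bearer *b = container_of(w, struct bearer, setup);

	b->registered = 1; /* stands in for dev_add_pack() */
}

/* Returns the registered state, optionally after running the queue. */
static int enable_bearer_demo(int run)
{
	struct bearer b = { 0, { setup_bearer } };

	n_pending = 0; /* start from an empty queue */
	schedule_item(&b.setup);
	if (run)
		run_pending();
	return b.registered;
}
```

The caller only queues the work; registration happens when the queue
is drained, outside the caller's locking context.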

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/tipc/eth_media.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/net/tipc/eth_media.c b/net/tipc/eth_media.c
index 524ba56..22453a8 100644
--- a/net/tipc/eth_media.c
+++ b/net/tipc/eth_media.c
@@ -56,6 +56,7 @@ struct eth_bearer {
 	struct tipc_bearer *bearer;
 	struct net_device *dev;
 	struct packet_type tipc_packet_type;
+	struct work_struct setup;
 };
 
 static struct eth_bearer eth_bearers[MAX_ETH_BEARERS];
@@ -122,6 +123,17 @@ static int recv_msg(struct sk_buff *buf, struct net_device *dev,
 }
 
 /**
+ * setup_bearer - setup association between Ethernet bearer and interface
+ */
+static void setup_bearer(struct work_struct *work)
+{
+	struct eth_bearer *eb_ptr =
+		container_of(work, struct eth_bearer, setup);
+
+	dev_add_pack(&eb_ptr->tipc_packet_type);
+}
+
+/**
  * enable_bearer - attach TIPC bearer to an Ethernet interface
  */
 
@@ -157,7 +169,8 @@ static int enable_bearer(struct tipc_bearer *tb_ptr)
 		eb_ptr->tipc_packet_type.af_packet_priv = eb_ptr;
 		INIT_LIST_HEAD(&(eb_ptr->tipc_packet_type.list));
 		dev_hold(dev);
-		dev_add_pack(&eb_ptr->tipc_packet_type);
+		INIT_WORK(&eb_ptr->setup, setup_bearer);
+		schedule_work(&eb_ptr->setup);
 	}
 
 	/* Associate TIPC bearer with Ethernet bearer */
-- 
1.7.12.2.21.g234cd45.dirty





* [ 073/143] net: Fix "ip rule delete table 256"
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (72 preceding siblings ...)
  2014-05-12  0:33 ` [ 072/143] tipc: fix lockdep warning during bearer initialization Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 074/143] ipv6: use rt6_get_dflt_router to get default router in rt6_route_rcv Willy Tarreau
                   ` (69 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Andreas Henriksson, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Andreas Henriksson <andreas@fatal.se>

[ Upstream commit 13eb2ab2d33c57ebddc57437a7d341995fc9138c ]

When trying to delete a table >= 256 using iproute2 the local table
will be deleted.
The table id is specified as a netlink attribute when it needs more than
8 bits and iproute2 then sets the table field to RT_TABLE_UNSPEC (0).
Preconditions to matching the table id in the rule delete code
don't seem to take the "table id in netlink attribute" into consideration
so the frh_get_table helper function never gets to do its job when
matching against current rule.
Use the helper function twice instead of peeking at the table value directly.
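
The effect can be shown with a small userspace model. The names
struct rule_hdr, get_table(), skip_rule_old() and skip_rule_new() are
hypothetical stand-ins for the rule header, frh_get_table() and the
check in fib_nl_delrule(); the netlink attribute is reduced to an
optional pointer. The scenario assumes a request to delete table 300
being compared against a rule for the local table (255), with no
other selector ruling that rule out.

```c
#include <stdint.h>
#include <stddef.h>

#define RT_TABLE_UNSPEC 0
#define RT_TABLE_LOCAL  255

struct rule_hdr {
	uint8_t table; /* 8-bit table field of the request header */
};

/* Prefer the table id from the attribute, else use the header field. */
static uint32_t get_table(const struct rule_hdr *frh,
			  const uint32_t *attr_table)
{
	return attr_table ? *attr_table : frh->table;
}

/* Old check: only compares tables when the 8-bit header field is set. */
static int skip_rule_old(const struct rule_hdr *frh,
			 const uint32_t *attr_table, uint32_t rule_table)
{
	return frh->table && get_table(frh, attr_table) != rule_table;
}

/* New check: uses the helper for both the precondition and the compare. */
static int skip_rule_new(const struct rule_hdr *frh,
			 const uint32_t *attr_table, uint32_t rule_table)
{
	uint32_t table = get_table(frh, attr_table);

	return table && table != rule_table;
}

/* Returns 1 if the local-table rule is skipped for "delete table 300". */
static int local_rule_skipped(int use_new)
{
	struct rule_hdr frh = { RT_TABLE_UNSPEC };
	uint32_t wanted = 300;

	return use_new ? skip_rule_new(&frh, &wanted, RT_TABLE_LOCAL)
		       : skip_rule_old(&frh, &wanted, RT_TABLE_LOCAL);
}
```

With the old check the local rule is not skipped and so becomes the
deletion candidate; with the new check it is skipped.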

Originally reported at: http://bugs.debian.org/724783

Reported-by: Nicolas HICHER <nhicher@avencall.com>
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/core/fib_rules.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index bd30938..de9eac9 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -381,7 +381,8 @@ static int fib_nl_delrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
 		if (frh->action && (frh->action != rule->action))
 			continue;
 
-		if (frh->table && (frh_get_table(frh, tb) != rule->table))
+		if (frh_get_table(frh, tb) &&
+		    (frh_get_table(frh, tb) != rule->table))
 			continue;
 
 		if (tb[FRA_PRIORITY] &&
-- 
1.7.12.2.21.g234cd45.dirty





* [ 074/143] ipv6: use rt6_get_dflt_router to get default router in rt6_route_rcv
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (73 preceding siblings ...)
  2014-05-12  0:33 ` [ 073/143] net: Fix "ip rule delete table 256" Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 075/143] random32: fix off-by-one in seeding requirement Willy Tarreau
                   ` (68 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Duan Jiong, Hannes Frederic Sowa, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Duan Jiong <duanj.fnst@cn.fujitsu.com>

[ Upstream commit f104a567e673f382b09542a8dc3500aa689957b4 ]

As RFC 4191 says, the Router Preference and Lifetime values in a
::/0 Route Information Option should override the preference and lifetime
values in the Router Advertisement header. But when the kernel deals with
a ::/0 Route Information Option, rt6_get_route_info() always returns
NULL, which means that the overriding will not happen, because those default
routers were added without flag RTF_ROUTEINFO in rt6_add_dflt_router().

In order to deal with that condition, we should call rt6_get_dflt_router
when the prefix length is 0.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv6/route.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index e307517..5af0d1e 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -495,8 +495,11 @@ int rt6_route_rcv(struct net_device *dev, u8 *opt, int len,
 		prefix = &prefix_buf;
 	}
 
-	rt = rt6_get_route_info(net, prefix, rinfo->prefix_len, gwaddr,
-				dev->ifindex);
+	if (rinfo->prefix_len == 0)
+		rt = rt6_get_dflt_router(gwaddr, dev);
+	else
+		rt = rt6_get_route_info(net, prefix, rinfo->prefix_len,
+					gwaddr, dev->ifindex);
 
 	if (rt && !lifetime) {
 		ip6_del_rt(rt);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 075/143] random32: fix off-by-one in seeding requirement
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (74 preceding siblings ...)
  2014-05-12  0:33 ` [ 074/143] ipv6: use rt6_get_dflt_router to get default router in rt6_route_rcv Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 076/143] bonding: fix two race conditions in bond_store_updelay/downdelay Willy Tarreau
                   ` (67 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Stephen Hemminger, Florian Weimer, Theodore Tso, Daniel Borkmann,
	Hannes Frederic Sowa, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dborkman@redhat.com>

[ Upstream commit 51c37a70aaa3f95773af560e6db3073520513912 ]

For properly initialising the Tausworthe generator [1], we have
a strict seeding requirement, that is, s1 > 1, s2 > 7, s3 > 15.

Commit 697f8d0348 ("random32: seeding improvement") introduced
a __seed() function that imposes boundary checks proposed by the
errata paper [2] to properly ensure above conditions.

However, we're off by one, as the function is implemented as:
"return (x < m) ? x + m : x;", and called with __seed(X, 1),
__seed(X, 7), __seed(X, 15). Thus, an unwanted seed of 1, 7, 15
would be possible, whereas the lower boundary should actually
be of at least 2, 8, 16, just as GSL does. Fix this, as otherwise
an initialization with an unwanted seed could have the effect
that Tausworthe's PRNG properties cannot be ensured.
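
The off-by-one is easy to check in isolation. In the sketch below,
seed_min() is a hypothetical userspace copy of the expression quoted
above, not the kernel's __seed() itself.

```c
#include <stdint.h>

/* Same expression as quoted above: bump x by m when it is below m. */
static uint32_t seed_min(uint32_t x, uint32_t m)
{
	return (x < m) ? x + m : x;
}
```

With the old bounds, seed_min(0, 1) and seed_min(1, 1) both return 1,
which violates s1 > 1; likewise seed_min(7, 7) returns 7 and
seed_min(15, 15) returns 15. With the new bounds the smallest possible
results are 2, 8 and 16, for example seed_min(0, 2) == 2 and
seed_min(1, 2) == 3.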

Note that this PRNG is *not* used for cryptography in the kernel.

 [1] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme.ps
 [2] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme2.ps

Joint work with Hannes Frederic Sowa.

Fixes: 697f8d0348a6 ("random32: seeding improvement")
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 lib/random32.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/lib/random32.c b/lib/random32.c
index 217d5c4..b9275d2 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -96,7 +96,7 @@ void srandom32(u32 entropy)
 	 */
 	for_each_possible_cpu (i) {
 		struct rnd_state *state = &per_cpu(net_rand_state, i);
-		state->s1 = __seed(state->s1 ^ entropy, 1);
+		state->s1 = __seed(state->s1 ^ entropy, 2);
 	}
 }
 EXPORT_SYMBOL(srandom32);
@@ -113,9 +113,9 @@ static int __init random32_init(void)
 		struct rnd_state *state = &per_cpu(net_rand_state,i);
 
 #define LCG(x)	((x) * 69069)	/* super-duper LCG */
-		state->s1 = __seed(LCG(i + jiffies), 1);
-		state->s2 = __seed(LCG(state->s1), 7);
-		state->s3 = __seed(LCG(state->s2), 15);
+		state->s1 = __seed(LCG(i + jiffies), 2);
+		state->s2 = __seed(LCG(state->s1), 8);
+		state->s3 = __seed(LCG(state->s2), 16);
 
 		/* "warm it up" */
 		__random32(state);
@@ -142,9 +142,9 @@ static int __init random32_reseed(void)
 		u32 seeds[3];
 
 		get_random_bytes(&seeds, sizeof(seeds));
-		state->s1 = __seed(seeds[0], 1);
-		state->s2 = __seed(seeds[1], 7);
-		state->s3 = __seed(seeds[2], 15);
+		state->s1 = __seed(seeds[0], 2);
+		state->s2 = __seed(seeds[1], 8);
+		state->s3 = __seed(seeds[2], 16);
 
 		/* mix it in */
 		__random32(state);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 076/143] bonding: fix two race conditions in bond_store_updelay/downdelay
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (75 preceding siblings ...)
  2014-05-12  0:33 ` [ 075/143] random32: fix off-by-one in seeding requirement Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 077/143] isdnloop: use strlcpy() instead of strcpy() Willy Tarreau
                   ` (66 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Jay Vosburgh, Andy Gospodarek, Veaceslav Falico,
	Nikolay Aleksandrov, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Nikolay Aleksandrov <nikolay@redhat.com>

[ Upstream commit b869ccfab1e324507fa3596e3e1308444fb68227 ]

This patch fixes two race conditions between bond_store_updelay/downdelay
and bond_store_miimon which could lead to a division by zero. miimon can be
set to 0 while either updelay or downdelay is being set, after the zero
check at the beginning has already passed. The division by zero happens
because updelay/downdelay are stored as new_value / bond->params.miimon.
Use rtnl to synchronize with miimon setting.
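
As an illustration only (a userspace sketch, not the bonding driver's code):
the zero check and the division have to run under the same lock that the
miimon setter takes. The pthread mutex below stands in for rtnl, and
struct bond_sketch, store_miimon() and store_updelay() are hypothetical
names. The real patch uses rtnl_trylock() plus restart_syscall() rather than
a blocking lock; that detail is left out here.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the bonding parameters; not the kernel struct. */
struct bond_sketch {
	pthread_mutex_t lock;	/* plays the role of rtnl in this sketch */
	int miimon;		/* monitoring interval in ms, 0 = disabled */
	int updelay;		/* stored in units of miimon intervals */
};

/* Setter for miimon: takes the same lock as the delay setter. */
static void store_miimon(struct bond_sketch *bond, int value)
{
	assert(value >= 0);
	pthread_mutex_lock(&bond->lock);
	bond->miimon = value;
	pthread_mutex_unlock(&bond->lock);
}

/*
 * Setter for updelay: the zero check and the division both run with the
 * lock held, so miimon cannot become 0 between them.
 */
static int store_updelay(struct bond_sketch *bond, int new_value)
{
	int ret = 0;

	pthread_mutex_lock(&bond->lock);
	if (bond->miimon == 0)
		ret = -EPERM;	/* error value chosen for the sketch */
	else
		bond->updelay = new_value / bond->miimon;
	pthread_mutex_unlock(&bond->lock);
	return ret;
}
```

Without the lock, a concurrent store_miimon(bond, 0) could land between the
check and the division, which is the window the patch closes.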

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/bonding/bond_sysfs.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 8762a27..3666a9a 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -755,6 +755,8 @@ static ssize_t bonding_store_downdelay(struct device *d,
 	int new_value, ret = count;
 	struct bonding *bond = to_bond(d);
 
+	if (!rtnl_trylock())
+		return restart_syscall();
 	if (!(bond->params.miimon)) {
 		pr_err(DRV_NAME
 		       ": %s: Unable to set down delay as MII monitoring is disabled\n",
@@ -795,6 +797,7 @@ static ssize_t bonding_store_downdelay(struct device *d,
 	}
 
 out:
+	rtnl_unlock();
 	return ret;
 }
 static DEVICE_ATTR(downdelay, S_IRUGO | S_IWUSR,
@@ -817,6 +820,8 @@ static ssize_t bonding_store_updelay(struct device *d,
 	int new_value, ret = count;
 	struct bonding *bond = to_bond(d);
 
+	if (!rtnl_trylock())
+		return restart_syscall();
 	if (!(bond->params.miimon)) {
 		pr_err(DRV_NAME
 		       ": %s: Unable to set up delay as MII monitoring is disabled\n",
@@ -856,6 +861,7 @@ static ssize_t bonding_store_updelay(struct device *d,
 	}
 
 out:
+	rtnl_unlock();
 	return ret;
 }
 static DEVICE_ATTR(updelay, S_IRUGO | S_IWUSR,
-- 
1.7.12.2.21.g234cd45.dirty





* [ 077/143] isdnloop: use strlcpy() instead of strcpy()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (76 preceding siblings ...)
  2014-05-12  0:33 ` [ 076/143] bonding: fix two race conditions in bond_store_updelay/downdelay Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 078/143] ipv4: fix possible seqlock deadlock Willy Tarreau
                   ` (65 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Dan Carpenter, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

[ Upstream commit f9a23c84486ed350cce7bb1b2828abd1f6658796 ]

These strings come from a copy_from_user() and there is no way to be
sure they are NUL terminated.
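
A minimal userspace sketch of the idea, assuming a fixed-size source buffer
that may lack a terminating NUL. copy_bounded() is a hypothetical helper,
not strlcpy() and not a kernel API; unlike the patch, it also bounds the
read from the source, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy at most dst_size - 1 bytes from src into dst and always terminate
 * dst.  src is scanned for a NUL only within its first src_size bytes.
 * Returns the number of bytes copied, not counting the terminator.
 */
static size_t copy_bounded(char *dst, size_t dst_size,
			   const char *src, size_t src_size)
{
	const char *nul;
	size_t len;

	assert(dst != NULL && src != NULL);

	nul = memchr(src, '\0', src_size);
	len = nul ? (size_t)(nul - src) : src_size;

	if (dst_size == 0)
		return 0;
	if (len >= dst_size)
		len = dst_size - 1;	/* truncate instead of overflowing */
	memcpy(dst, src, len);
	dst[len] = '\0';
	return len;
}
```

The destination size is passed as sizeof() of the destination array, in the
same spirit as sizeof(card->s0num[0]) in the patch.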

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/isdn/isdnloop/isdnloop.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/isdn/isdnloop/isdnloop.c b/drivers/isdn/isdnloop/isdnloop.c
index 22446f7..92d895f 100644
--- a/drivers/isdn/isdnloop/isdnloop.c
+++ b/drivers/isdn/isdnloop/isdnloop.c
@@ -1082,8 +1082,10 @@ isdnloop_start(isdnloop_card * card, isdnloop_sdef * sdefp)
 				spin_unlock_irqrestore(&card->isdnloop_lock, flags);
 				return -ENOMEM;
 			}
-			for (i = 0; i < 3; i++)
-				strcpy(card->s0num[i], sdef.num[i]);
+			for (i = 0; i < 3; i++) {
+				strlcpy(card->s0num[i], sdef.num[i],
+					sizeof(card->s0num[0]));
+			}
 			break;
 		case ISDN_PTYPE_1TR6:
 			if (isdnloop_fake(card, "DRV1.04TC-1TR6-CAPI-CNS-BASIS-29.11.95",
@@ -1096,7 +1098,7 @@ isdnloop_start(isdnloop_card * card, isdnloop_sdef * sdefp)
 				spin_unlock_irqrestore(&card->isdnloop_lock, flags);
 				return -ENOMEM;
 			}
-			strcpy(card->s0num[0], sdef.num[0]);
+			strlcpy(card->s0num[0], sdef.num[0], sizeof(card->s0num[0]));
 			card->s0num[1][0] = '\0';
 			card->s0num[2][0] = '\0';
 			break;
-- 
1.7.12.2.21.g234cd45.dirty





* [ 078/143] ipv4: fix possible seqlock deadlock
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (77 preceding siblings ...)
  2014-05-12  0:33 ` [ 077/143] isdnloop: use strlcpy() instead of strcpy() Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 079/143] inet: prevent leakage of uninitialized memory to user in recv Willy Tarreau
                   ` (64 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Eric Dumazet, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit c9e9042994d37cbc1ee538c500e9da1bb9d1bcdf ]

ip4_datagram_connect() is called from process context, so it should use
IP_INC_STATS() instead of IP_INC_STATS_BH(). Otherwise we can deadlock on
32-bit arches or corrupt the SNMP counters.

Fixes: 584bdf8cbdf6 ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv4/datagram.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
index 5e6c5a0..30aeb26 100644
--- a/net/ipv4/datagram.c
+++ b/net/ipv4/datagram.c
@@ -52,7 +52,7 @@ int ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 			       inet->sport, usin->sin_port, sk, 1);
 	if (err) {
 		if (err == -ENETUNREACH)
-			IP_INC_STATS_BH(sock_net(sk), IPSTATS_MIB_OUTNOROUTES);
+			IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTNOROUTES);
 		return err;
 	}
 
-- 
1.7.12.2.21.g234cd45.dirty





* [ 079/143] inet: prevent leakage of uninitialized memory to user in recv syscalls
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (78 preceding siblings ...)
  2014-05-12  0:33 ` [ 078/143] ipv4: fix possible seqlock deadlock Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 080/143] net: rework recvmsg handler msg_name and msg_namelen logic Willy Tarreau
                   ` (63 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Hannes Frederic Sowa, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <hannes@stressinduktion.org>

[ Upstream commit bceaa90240b6019ed73b49965eac7d167610be69 ]

Only update *addr_len when we actually fill in sockaddr, otherwise we
can return uninitialized memory from the stack to the caller in the
recvfrom, recvmmsg and recvmsg syscalls. Drop the (addr_len == NULL)
checks because we only get called with a valid addr_len pointer either
from sock_common_recvmsg or inet_recvmsg.

If a blocking read waits on a socket which is concurrently shut down we
now return zero and set msg_namelen to 0.
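
The rule can be sketched in userspace as follows. This is an illustration
under stated assumptions, not the kernel handlers: struct addr_sketch and
fill_source() are hypothetical names, and the layout is only loosely
modelled on sockaddr_in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte address layout, loosely modelled on sockaddr_in. */
struct addr_sketch {
	uint16_t family;
	uint16_t port;
	uint32_t addr;
	uint8_t  zero[8];
};

/*
 * Tail of a receive handler: "sin" is NULL when the caller did not ask for
 * the source address.  *addr_len is written only when *sin is filled in,
 * so an error or early-return path leaves it at the caller's default.
 */
static void fill_source(struct addr_sketch *sin, int *addr_len,
			uint32_t saddr, uint16_t sport)
{
	assert(addr_len != NULL);
	if (sin) {
		sin->family = 2;	/* value of AF_INET on Linux */
		sin->port = sport;
		sin->addr = saddr;
		memset(sin->zero, 0, sizeof(sin->zero));
		*addr_len = (int)sizeof(*sin);
	}
}
```

Setting the length at the top of the handler, as the removed code did, would
advertise sizeof(*sin) bytes even on paths that never wrote them.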

Reported-by: mpb <mpb.mail@gmail.com>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[wt: no ieee802154, ping nor l2tp in 2.6.32]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv4/raw.c        | 4 +---
 net/ipv4/udp.c        | 7 +------
 net/ipv6/raw.c        | 4 +---
 net/ipv6/udp.c        | 5 +----
 net/phonet/datagram.c | 9 ++++-----
 5 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 07ab583..c50344b 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -681,9 +681,6 @@ static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	if (flags & MSG_OOB)
 		goto out;
 
-	if (addr_len)
-		*addr_len = sizeof(*sin);
-
 	if (flags & MSG_ERRQUEUE) {
 		err = ip_recv_error(sk, msg, len);
 		goto out;
@@ -711,6 +708,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		sin->sin_addr.s_addr = ip_hdr(skb)->saddr;
 		sin->sin_port = 0;
 		memset(&sin->sin_zero, 0, sizeof(sin->sin_zero));
+		*addr_len = sizeof(*sin);
 	}
 	if (inet->cmsg_flags)
 		ip_cmsg_recv(msg, skb);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index af559e0..80487ee 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -941,12 +941,6 @@ int udp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	int err;
 	int is_udplite = IS_UDPLITE(sk);
 
-	/*
-	 *	Check any passed addresses
-	 */
-	if (addr_len)
-		*addr_len = sizeof(*sin);
-
 	if (flags & MSG_ERRQUEUE)
 		return ip_recv_error(sk, msg, len);
 
@@ -1001,6 +995,7 @@ try_again:
 		sin->sin_port = udp_hdr(skb)->source;
 		sin->sin_addr.s_addr = ip_hdr(skb)->saddr;
 		memset(sin->sin_zero, 0, sizeof(sin->sin_zero));
+		*addr_len = sizeof(*sin);
 	}
 	if (inet->cmsg_flags)
 		ip_cmsg_recv(msg, skb);
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 4f24570..df75f94 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -456,9 +456,6 @@ static int rawv6_recvmsg(struct kiocb *iocb, struct sock *sk,
 	if (flags & MSG_OOB)
 		return -EOPNOTSUPP;
 
-	if (addr_len)
-		*addr_len=sizeof(*sin6);
-
 	if (flags & MSG_ERRQUEUE)
 		return ipv6_recv_error(sk, msg, len);
 
@@ -495,6 +492,7 @@ static int rawv6_recvmsg(struct kiocb *iocb, struct sock *sk,
 		sin6->sin6_scope_id = 0;
 		if (ipv6_addr_type(&sin6->sin6_addr) & IPV6_ADDR_LINKLOCAL)
 			sin6->sin6_scope_id = IP6CB(skb)->iif;
+		*addr_len = sizeof(*sin6);
 	}
 
 	sock_recv_timestamp(msg, sk, skb);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index d8c0374..ce291af 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -200,9 +200,6 @@ int udpv6_recvmsg(struct kiocb *iocb, struct sock *sk,
 	int is_udplite = IS_UDPLITE(sk);
 	int is_udp4;
 
-	if (addr_len)
-		*addr_len=sizeof(struct sockaddr_in6);
-
 	if (flags & MSG_ERRQUEUE)
 		return ipv6_recv_error(sk, msg, len);
 
@@ -273,7 +270,7 @@ try_again:
 			if (ipv6_addr_type(&sin6->sin6_addr) & IPV6_ADDR_LINKLOCAL)
 				sin6->sin6_scope_id = IP6CB(skb)->iif;
 		}
-
+		*addr_len = sizeof(*sin6);
 	}
 	if (is_udp4) {
 		if (inet->cmsg_flags)
diff --git a/net/phonet/datagram.c b/net/phonet/datagram.c
index ef5c75c..c88da73 100644
--- a/net/phonet/datagram.c
+++ b/net/phonet/datagram.c
@@ -122,9 +122,6 @@ static int pn_recvmsg(struct kiocb *iocb, struct sock *sk,
 	if (flags & MSG_OOB)
 		goto out_nofree;
 
-	if (addr_len)
-		*addr_len = sizeof(sa);
-
 	skb = skb_recv_datagram(sk, flags, noblock, &rval);
 	if (skb == NULL)
 		goto out_nofree;
@@ -145,8 +142,10 @@ static int pn_recvmsg(struct kiocb *iocb, struct sock *sk,
 
 	rval = (flags & MSG_TRUNC) ? skb->len : copylen;
 
-	if (msg->msg_name != NULL)
-		memcpy(msg->msg_name, &sa, sizeof(struct sockaddr_pn));
+	if (msg->msg_name != NULL) {
+		memcpy(msg->msg_name, &sa, sizeof(sa));
+		*addr_len = sizeof(sa);
+	}
 
 out:
 	skb_free_datagram(sk, skb);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 080/143] net: rework recvmsg handler msg_name and msg_namelen logic
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (79 preceding siblings ...)
  2014-05-12  0:33 ` [ 079/143] inet: prevent leakage of uninitialized memory to user in recv Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-13 12:44   ` Luis Henriques
  2014-05-12  0:33 ` [ 081/143] net: add BUG_ON if kernel advertises msg_namelen > sizeof(struct Willy Tarreau
                   ` (62 subsequent siblings)
  143 siblings, 1 reply; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: David Miller, Hannes Frederic Sowa, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <hannes@stressinduktion.org>

[ Upstream commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c ]

This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
to return msg_name to the user.

This prevents numerous uninitialized memory leaks we had in the
recvmsg handlers and makes it harder for new code to accidentally leak
uninitialized memory.

Optimize for the case recvfrom is called with NULL as address. We don't
need to copy the address at all, so set it to NULL before invoking the
recvmsg handler. We can do so, because all the recvmsg handlers must
cope with the case a plain read() is called on them. read() also sets
msg_name to NULL.
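
A rough userspace sketch of the caller-side contract described above. All
names here (struct msg_sketch, recvfrom_sketch() and the two handlers) are
hypothetical, and STORAGE_SIZE merely stands in for
sizeof(struct sockaddr_storage).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STORAGE_SIZE 128	/* stand-in for sizeof(struct sockaddr_storage) */

/* Hypothetical, trimmed-down message header. */
struct msg_sketch {
	void  *name;		/* NULL when the caller wants no address */
	size_t namelen;		/* 0 on entry; set by the handler */
};

typedef void (*recv_handler)(struct msg_sketch *msg);

/* Handler that reports a 4-byte address, but only if one was requested. */
static void handler_with_addr(struct msg_sketch *msg)
{
	static const unsigned char peer[4] = { 192, 0, 2, 1 };

	if (msg->name) {
		memcpy(msg->name, peer, sizeof(peer));
		msg->namelen = sizeof(peer);
	}
}

/* Handler that never fills in an address. */
static void handler_no_addr(struct msg_sketch *msg)
{
	(void)msg;
}

/* Caller side: returns the number of address bytes copied to uaddr. */
static size_t recvfrom_sketch(recv_handler handler, void *uaddr, size_t ulen)
{
	unsigned char storage[STORAGE_SIZE];
	struct msg_sketch msg;
	size_t n;

	msg.name = uaddr ? storage : NULL;	/* skip the copy if not wanted */
	msg.namelen = 0;			/* default: nothing to return */
	handler(&msg);

	assert(msg.namelen <= sizeof(storage));	/* handler contract */
	n = msg.namelen < ulen ? msg.namelen : ulen;
	if (uaddr && n)
		memcpy(uaddr, storage, n);
	return n;
}
```

Because namelen starts at 0, a handler that forgets to set it returns no
address at all instead of stale bytes from the caller's stack buffer.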

Also document these changes in include/linux/net.h as suggested by David
Miller.

Changes since RFC:

Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
affect sendto as it would bail out earlier while trying to copy-in the
address. It also more naturally reflects the logic by the callers of
verify_iovec.

With this change in place I could remove "
if (!uaddr || msg_sys->msg_namelen == 0)
	msg->msg_name = NULL
".

This change does not alter the user visible error logic as we ignore
msg_namelen as long as msg_name is NULL.

Also remove two unnecessary curly brackets in ___sys_recvmsg and change
comments to netdev style.

Cc: David Miller <davem@davemloft.net>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[wt: 2.6.32: msg_sys is a struct not a pointer]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/isdn/mISDN/socket.c  | 13 ++++---------
 include/linux/net.h          |  8 ++++++++
 net/appletalk/ddp.c          | 16 +++++++---------
 net/atm/common.c             |  2 --
 net/ax25/af_ax25.c           |  4 ++--
 net/bluetooth/af_bluetooth.c |  2 --
 net/bluetooth/hci_sock.c     |  2 --
 net/bluetooth/rfcomm/sock.c  |  3 ---
 net/compat.c                 |  3 ++-
 net/core/iovec.c             |  3 ++-
 net/ipx/af_ipx.c             |  3 +--
 net/irda/af_irda.c           |  4 ----
 net/iucv/af_iucv.c           |  2 --
 net/key/af_key.c             |  1 -
 net/llc/af_llc.c             |  2 --
 net/netlink/af_netlink.c     |  2 --
 net/netrom/af_netrom.c       |  3 +--
 net/packet/af_packet.c       | 32 +++++++++++++++-----------------
 net/rds/recv.c               |  2 --
 net/rose/af_rose.c           |  8 +++++---
 net/rxrpc/ar-recvmsg.c       |  8 +++++---
 net/socket.c                 |  9 +++++++--
 net/tipc/socket.c            |  6 ------
 net/unix/af_unix.c           |  5 -----
 net/x25/af_x25.c             |  3 +--
 25 files changed, 60 insertions(+), 86 deletions(-)

diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
index feb0fa4..db69cb4 100644
--- a/drivers/isdn/mISDN/socket.c
+++ b/drivers/isdn/mISDN/socket.c
@@ -115,7 +115,6 @@ mISDN_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 {
 	struct sk_buff		*skb;
 	struct sock		*sk = sock->sk;
-	struct sockaddr_mISDN	*maddr;
 
 	int		copied, err;
 
@@ -133,9 +132,9 @@ mISDN_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (!skb)
 		return err;
 
-	if (msg->msg_namelen >= sizeof(struct sockaddr_mISDN)) {
-		msg->msg_namelen = sizeof(struct sockaddr_mISDN);
-		maddr = (struct sockaddr_mISDN *)msg->msg_name;
+	if (msg->msg_name) {
+		struct sockaddr_mISDN *maddr = msg->msg_name;
+
 		maddr->family = AF_ISDN;
 		maddr->dev = _pms(sk)->dev->id;
 		if ((sk->sk_protocol == ISDN_P_LAPD_TE) ||
@@ -148,11 +147,7 @@ mISDN_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 			maddr->sapi = _pms(sk)->ch.addr & 0xFF;
 			maddr->tei =  (_pms(sk)->ch.addr >> 8) & 0xFF;
 		}
-	} else {
-		if (msg->msg_namelen)
-			printk(KERN_WARNING "%s: too small namelen %d\n",
-			    __func__, msg->msg_namelen);
-		msg->msg_namelen = 0;
+		msg->msg_namelen = sizeof(*maddr);
 	}
 
 	copied = skb->len + MISDN_HEADER_LEN;
diff --git a/include/linux/net.h b/include/linux/net.h
index 529a093..e40cbcc 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -187,6 +187,14 @@ struct proto_ops {
 				      int optname, char __user *optval, int __user *optlen);
 	int		(*sendmsg)   (struct kiocb *iocb, struct socket *sock,
 				      struct msghdr *m, size_t total_len);
+	/* Notes for implementing recvmsg:
+	 * ===============================
+	 * msg->msg_namelen should get updated by the recvmsg handlers
+	 * iff msg_name != NULL. It is by default 0 to prevent
+	 * returning uninitialized memory to user space.  The recvfrom
+	 * handlers can assume that msg.msg_name is either NULL or has
+	 * a minimum size of sizeof(struct sockaddr_storage).
+	 */
 	int		(*recvmsg)   (struct kiocb *iocb, struct socket *sock,
 				      struct msghdr *m, size_t total_len,
 				      int flags);
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index b1a4290..5eae360 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1703,7 +1703,6 @@ static int atalk_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr
 			 size_t size, int flags)
 {
 	struct sock *sk = sock->sk;
-	struct sockaddr_at *sat = (struct sockaddr_at *)msg->msg_name;
 	struct ddpehdr *ddp;
 	int copied = 0;
 	int offset = 0;
@@ -1728,14 +1727,13 @@ static int atalk_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr
 	}
 	err = skb_copy_datagram_iovec(skb, offset, msg->msg_iov, copied);
 
-	if (!err) {
-		if (sat) {
-			sat->sat_family      = AF_APPLETALK;
-			sat->sat_port        = ddp->deh_sport;
-			sat->sat_addr.s_node = ddp->deh_snode;
-			sat->sat_addr.s_net  = ddp->deh_snet;
-		}
-		msg->msg_namelen = sizeof(*sat);
+	if (!err && msg->msg_name) {
+		struct sockaddr_at *sat = msg->msg_name;
+		sat->sat_family      = AF_APPLETALK;
+		sat->sat_port        = ddp->deh_sport;
+		sat->sat_addr.s_node = ddp->deh_snode;
+		sat->sat_addr.s_net  = ddp->deh_snet;
+		msg->msg_namelen     = sizeof(*sat);
 	}
 
 	skb_free_datagram(sk, skb);	/* Free the datagram. */
diff --git a/net/atm/common.c b/net/atm/common.c
index 65737b8..0baf05e 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -473,8 +473,6 @@ int vcc_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 	struct sk_buff *skb;
 	int copied, error = -EINVAL;
 
-	msg->msg_namelen = 0;
-
 	if (sock->state != SS_CONNECTED)
 		return -ENOTCONN;
 	if (flags & ~MSG_DONTWAIT)		/* only handle MSG_DONTWAIT */
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index 8613bd1..6b9d62b 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1648,11 +1648,11 @@ static int ax25_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 	skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
 
-	if (msg->msg_namelen != 0) {
-		struct sockaddr_ax25 *sax = (struct sockaddr_ax25 *)msg->msg_name;
+	if (msg->msg_name) {
 		ax25_digi digi;
 		ax25_address src;
 		const unsigned char *mac = skb_mac_header(skb);
+		struct sockaddr_ax25 *sax = msg->msg_name;
 
 		memset(sax, 0, sizeof(struct full_sockaddr_ax25));
 		ax25_addr_parse(mac + 1, skb->data - mac - 1, &src, NULL,
diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index d7239dd..143b8a7 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -240,8 +240,6 @@ int bt_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (flags & (MSG_OOB))
 		return -EOPNOTSUPP;
 
-	msg->msg_namelen = 0;
-
 	if (!(skb = skb_recv_datagram(sk, flags, noblock, &err))) {
 		if (sk->sk_shutdown & RCV_SHUTDOWN)
 			return 0;
diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index 45caaaa..0e0f517 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -370,8 +370,6 @@ static int hci_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (!(skb = skb_recv_datagram(sk, flags, noblock, &err)))
 		return err;
 
-	msg->msg_namelen = 0;
-
 	copied = skb->len;
 	if (len < copied) {
 		msg->msg_flags |= MSG_TRUNC;
diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index 1db0132..3fabaad 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -652,15 +652,12 @@ static int rfcomm_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 	if (test_and_clear_bit(RFCOMM_DEFER_SETUP, &d->flags)) {
 		rfcomm_dlc_accept(d);
-		msg->msg_namelen = 0;
 		return 0;
 	}
 
 	if (flags & MSG_OOB)
 		return -EOPNOTSUPP;
 
-	msg->msg_namelen = 0;
-
 	BT_DBG("sk %p size %zu", sk, size);
 
 	lock_sock(sk);
diff --git a/net/compat.c b/net/compat.c
index da3d0fc..d325d16 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -91,7 +91,8 @@ int verify_compat_iovec(struct msghdr *kern_msg, struct iovec *kern_iov,
 			if (err < 0)
 				return err;
 		}
-		kern_msg->msg_name = kern_address;
+		if (kern_msg->msg_name)
+			kern_msg->msg_name = kern_address;
 	} else
 		kern_msg->msg_name = NULL;
 
diff --git a/net/core/iovec.c b/net/core/iovec.c
index f911e66..39369e9 100644
--- a/net/core/iovec.c
+++ b/net/core/iovec.c
@@ -47,7 +47,8 @@ int verify_iovec(struct msghdr *m, struct iovec *iov, struct sockaddr *address,
 			if (err < 0)
 				return err;
 		}
-		m->msg_name = address;
+		if (m->msg_name)
+			m->msg_name = address;
 	} else {
 		m->msg_name = NULL;
 	}
diff --git a/net/ipx/af_ipx.c b/net/ipx/af_ipx.c
index 66c7a20..25931b3 100644
--- a/net/ipx/af_ipx.c
+++ b/net/ipx/af_ipx.c
@@ -1808,8 +1808,6 @@ static int ipx_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (skb->tstamp.tv64)
 		sk->sk_stamp = skb->tstamp;
 
-	msg->msg_namelen = sizeof(*sipx);
-
 	if (sipx) {
 		sipx->sipx_family	= AF_IPX;
 		sipx->sipx_port		= ipx->ipx_source.sock;
@@ -1817,6 +1815,7 @@ static int ipx_recvmsg(struct kiocb *iocb, struct socket *sock,
 		sipx->sipx_network	= IPX_SKB_CB(skb)->ipx_source_net;
 		sipx->sipx_type 	= ipx->ipx_type;
 		sipx->sipx_zero		= 0;
+		msg->msg_namelen	= sizeof(*sipx);
 	}
 	rc = copied;
 
diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index bfb325d..7cb7613 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1338,8 +1338,6 @@ static int irda_recvmsg_dgram(struct kiocb *iocb, struct socket *sock,
 	if ((err = sock_error(sk)) < 0)
 		return err;
 
-	msg->msg_namelen = 0;
-
 	skb = skb_recv_datagram(sk, flags & ~MSG_DONTWAIT,
 				flags & MSG_DONTWAIT, &err);
 	if (!skb)
@@ -1402,8 +1400,6 @@ static int irda_recvmsg_stream(struct kiocb *iocb, struct socket *sock,
 	target = sock_rcvlowat(sk, flags & MSG_WAITALL, size);
 	timeo = sock_rcvtimeo(sk, noblock);
 
-	msg->msg_namelen = 0;
-
 	do {
 		int chunk;
 		struct sk_buff *skb = skb_dequeue(&sk->sk_receive_queue);
diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
index f605b23..bada1b9 100644
--- a/net/iucv/af_iucv.c
+++ b/net/iucv/af_iucv.c
@@ -1160,8 +1160,6 @@ static int iucv_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	struct sk_buff *skb, *rskb, *cskb;
 	int err = 0;
 
-	msg->msg_namelen = 0;
-
 	if ((sk->sk_state == IUCV_DISCONN || sk->sk_state == IUCV_SEVERED) &&
 	    skb_queue_empty(&iucv->backlog_skb_q) &&
 	    skb_queue_empty(&sk->sk_receive_queue) &&
diff --git a/net/key/af_key.c b/net/key/af_key.c
index 3f55faa..3e5d0dc 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3597,7 +3597,6 @@ static int pfkey_recvmsg(struct kiocb *kiocb,
 	if (flags & ~(MSG_PEEK|MSG_DONTWAIT|MSG_TRUNC|MSG_CMSG_COMPAT))
 		goto out;
 
-	msg->msg_namelen = 0;
 	skb = skb_recv_datagram(sk, flags, flags & MSG_DONTWAIT, &err);
 	if (skb == NULL)
 		goto out;
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index 8a814a5..606b6ad 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -674,8 +674,6 @@ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock,
 	int target;	/* Read at least this many bytes */
 	long timeo;
 
-	msg->msg_namelen = 0;
-
 	lock_sock(sk);
 	copied = -ENOTCONN;
 	if (unlikely(sk->sk_type == SOCK_STREAM && sk->sk_state == TCP_LISTEN))
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index fc91ff6..39a6d5d 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1400,8 +1400,6 @@ static int netlink_recvmsg(struct kiocb *kiocb, struct socket *sock,
 	}
 #endif
 
-	msg->msg_namelen = 0;
-
 	copied = data_skb->len;
 	if (len < copied) {
 		msg->msg_flags |= MSG_TRUNC;
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 7a83495..ad1ec1b 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1184,10 +1184,9 @@ static int nr_recvmsg(struct kiocb *iocb, struct socket *sock,
 		sax->sax25_family = AF_NETROM;
 		skb_copy_from_linear_data_offset(skb, 7, sax->sax25_call.ax25_call,
 			      AX25_ADDR_LEN);
+		msg->msg_namelen = sizeof(*sax);
 	}
 
-	msg->msg_namelen = sizeof(*sax);
-
 	skb_free_datagram(sk, skb);
 
 	release_sock(sk);
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index f084e01..06707d0 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1423,7 +1423,6 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
 	struct sock *sk = sock->sk;
 	struct sk_buff *skb;
 	int copied, err;
-	struct sockaddr_ll *sll;
 
 	err = -EINVAL;
 	if (flags & ~(MSG_PEEK|MSG_DONTWAIT|MSG_TRUNC|MSG_CMSG_COMPAT))
@@ -1455,22 +1454,10 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (skb == NULL)
 		goto out;
 
-	/*
-	 *	If the address length field is there to be filled in, we fill
-	 *	it in now.
+	/* You lose any data beyond the buffer you gave. If it worries
+	 * a user program they can ask the device for its MTU
+	 * anyway.
 	 */
-
-	sll = &PACKET_SKB_CB(skb)->sa.ll;
-	if (sock->type == SOCK_PACKET)
-		msg->msg_namelen = sizeof(struct sockaddr_pkt);
-	else
-		msg->msg_namelen = sll->sll_halen + offsetof(struct sockaddr_ll, sll_addr);
-
-	/*
-	 *	You lose any data beyond the buffer you gave. If it worries a
-	 *	user program they can ask the device for its MTU anyway.
-	 */
-
 	copied = skb->len;
 	if (copied > len) {
 		copied = len;
@@ -1483,9 +1470,20 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 	sock_recv_timestamp(msg, sk, skb);
 
-	if (msg->msg_name)
+	if (msg->msg_name) {
+		/* If the address length field is there to be filled
+		 * in, we fill it in now.
+		 */
+		if (sock->type == SOCK_PACKET) {
+			msg->msg_namelen = sizeof(struct sockaddr_pkt);
+		} else {
+			struct sockaddr_ll *sll = &PACKET_SKB_CB(skb)->sa.ll;
+			msg->msg_namelen = sll->sll_halen +
+				offsetof(struct sockaddr_ll, sll_addr);
+		}
 		memcpy(msg->msg_name, &PACKET_SKB_CB(skb)->sa,
 		       msg->msg_namelen);
+	}
 
 	if (pkt_sk(sk)->auxdata) {
 		struct tpacket_auxdata aux;
diff --git a/net/rds/recv.c b/net/rds/recv.c
index c45a881c..a11cab9 100644
--- a/net/rds/recv.c
+++ b/net/rds/recv.c
@@ -410,8 +410,6 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 
 	rdsdebug("size %zu flags 0x%x timeo %ld\n", size, msg_flags, timeo);
 
-	msg->msg_namelen = 0;
-
 	if (msg_flags & MSG_OOB)
 		goto out;
 
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index 2984999..08a86f6 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1238,7 +1238,6 @@ static int rose_recvmsg(struct kiocb *iocb, struct socket *sock,
 {
 	struct sock *sk = sock->sk;
 	struct rose_sock *rose = rose_sk(sk);
-	struct sockaddr_rose *srose = (struct sockaddr_rose *)msg->msg_name;
 	size_t copied;
 	unsigned char *asmptr;
 	struct sk_buff *skb;
@@ -1274,8 +1273,11 @@ static int rose_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 	skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
 
-	if (srose != NULL) {
-		memset(srose, 0, msg->msg_namelen);
+	if (msg->msg_name) {
+		struct sockaddr_rose *srose;
+
+		memset(msg->msg_name, 0, sizeof(struct full_sockaddr_rose));
+		srose = msg->msg_name;
 		srose->srose_family = AF_ROSE;
 		srose->srose_addr   = rose->dest_addr;
 		srose->srose_call   = rose->dest_call;
diff --git a/net/rxrpc/ar-recvmsg.c b/net/rxrpc/ar-recvmsg.c
index a39bf97..f779fc3 100644
--- a/net/rxrpc/ar-recvmsg.c
+++ b/net/rxrpc/ar-recvmsg.c
@@ -142,10 +142,12 @@ int rxrpc_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 		/* copy the peer address and timestamp */
 		if (!continue_call) {
-			if (msg->msg_name && msg->msg_namelen > 0)
+			if (msg->msg_name) {
+				size_t len =
+					sizeof(call->conn->trans->peer->srx);
 				memcpy(msg->msg_name,
-				       &call->conn->trans->peer->srx,
-				       sizeof(call->conn->trans->peer->srx));
+				       &call->conn->trans->peer->srx, len);
+			}
 			sock_recv_timestamp(msg, &rx->sk, skb);
 		}
 
diff --git a/net/socket.c b/net/socket.c
index 9f8cd74..2b7be6d 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1744,8 +1744,10 @@ SYSCALL_DEFINE6(recvfrom, int, fd, void __user *, ubuf, size_t, size,
 	msg.msg_iov = &iov;
 	iov.iov_len = size;
 	iov.iov_base = ubuf;
-	msg.msg_name = (struct sockaddr *)&address;
-	msg.msg_namelen = sizeof(address);
+	/* Save some cycles and don't copy the address if not needed */
+	msg.msg_name = addr ? (struct sockaddr *)&address : NULL;
+	/* We assume all kernel code knows the size of sockaddr_storage */
+	msg.msg_namelen = 0;
 	if (sock->file->f_flags & O_NONBLOCK)
 		flags |= MSG_DONTWAIT;
 	err = sock_recvmsg(sock, &msg, size, flags);
@@ -2055,6 +2057,9 @@ SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
 	cmsg_ptr = (unsigned long)msg_sys.msg_control;
 	msg_sys.msg_flags = flags & (MSG_CMSG_CLOEXEC|MSG_CMSG_COMPAT);
 
+	/* We assume all kernel code knows the size of sockaddr_storage */
+	msg_sys.msg_namelen = 0;
+
 	if (sock->file->f_flags & O_NONBLOCK)
 		flags |= MSG_DONTWAIT;
 	err = sock_recvmsg(sock, &msg_sys, total_len, flags);
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index eccb86b..124f1a2 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -917,9 +917,6 @@ static int recv_msg(struct kiocb *iocb, struct socket *sock,
 		goto exit;
 	}
 
-	/* will be updated in set_orig_addr() if needed */
-	m->msg_namelen = 0;
-
 restart:
 
 	/* Look for a message in receive queue; wait if necessary */
@@ -1053,9 +1050,6 @@ static int recv_stream(struct kiocb *iocb, struct socket *sock,
 		goto exit;
 	}
 
-	/* will be updated in set_orig_addr() if needed */
-	m->msg_namelen = 0;
-
 restart:
 
 	/* Look for a message in receive queue; wait if necessary */
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index d146b76..bb0b008 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1682,7 +1682,6 @@ static void unix_copy_addr(struct msghdr *msg, struct sock *sk)
 {
 	struct unix_sock *u = unix_sk(sk);
 
-	msg->msg_namelen = 0;
 	if (u->addr) {
 		msg->msg_namelen = u->addr->len;
 		memcpy(msg->msg_name, u->addr->name, u->addr->len);
@@ -1705,8 +1704,6 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (flags&MSG_OOB)
 		goto out;
 
-	msg->msg_namelen = 0;
-
 	mutex_lock(&u->readlock);
 
 	skb = skb_recv_datagram(sk, flags, noblock, &err);
@@ -1832,8 +1829,6 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
 	target = sock_rcvlowat(sk, flags&MSG_WAITALL, size);
 	timeo = sock_rcvtimeo(sk, flags&MSG_DONTWAIT);
 
-	msg->msg_namelen = 0;
-
 	/* Lock the socket to prevent queue disordering
 	 * while sleeps in memcpy_tomsg
 	 */
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 2e9e300..40c447f 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1294,10 +1294,9 @@ static int x25_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (sx25) {
 		sx25->sx25_family = AF_X25;
 		sx25->sx25_addr   = x25->dest_addr;
+		msg->msg_namelen = sizeof(*sx25);
 	}
 
-	msg->msg_namelen = sizeof(struct sockaddr_x25);
-
 	lock_sock(sk);
 	x25_check_rbuf(sk);
 	release_sock(sk);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 081/143] net: add BUG_ON if kernel advertises msg_namelen > sizeof(struct
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (80 preceding siblings ...)
  2014-05-12  0:33 ` [ 080/143] net: rework recvmsg handler msg_name and msg_namelen logic Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 082/143] inet: fix addr_len/msg->msg_namelen assignment in recv_error and Willy Tarreau
                   ` (61 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Hannes Frederic Sowa, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------
 sockaddr_storage)

From: Hannes Frederic Sowa <hannes@stressinduktion.org>

[ Upstream commit 68c6beb373955da0886d8f4f5995b3922ceda4be ]

In that case it is probable that kernel code overwrote part of the
stack. So we should bail out loudly here.

The BUG_ON may be removed in future if we are sure all protocols are
conformant.
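
As a rough user-space illustration of the resulting length handling
(addr_copy_len() is a made-up name for this sketch and assert() merely
stands in for BUG_ON(); this is not the kernel code itself):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical model of the length checks in move_addr_to_user().
 * Returns the number of bytes that would be copied to user space,
 * or -EINVAL for a negative user-supplied length.
 */
static int addr_copy_len(int klen, int user_len)
{
	/*
	 * Stand-in for BUG_ON(): a kernel-side length beyond
	 * sockaddr_storage means some protocol already misbehaved.
	 */
	assert(klen >= 0 && (size_t)klen <= sizeof(struct sockaddr_storage));

	/* Never copy more than the kernel actually filled in. */
	if (user_len > klen)
		user_len = klen;
	if (user_len < 0)
		return -EINVAL;
	return user_len;
}
```

Since the user length is clamped to klen first, the old upper-bound test
on len becomes redundant once klen itself is known to fit.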

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/socket.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/socket.c b/net/socket.c
index 2b7be6d..712f977 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -216,12 +216,13 @@ int move_addr_to_user(struct sockaddr *kaddr, int klen, void __user *uaddr,
 	int err;
 	int len;
 
+	BUG_ON(klen > sizeof(struct sockaddr_storage));
 	err = get_user(len, ulen);
 	if (err)
 		return err;
 	if (len > klen)
 		len = klen;
-	if (len < 0 || len > sizeof(struct sockaddr_storage))
+	if (len < 0)
 		return -EINVAL;
 	if (len) {
 		if (audit_sockaddr(klen, kaddr))
-- 
1.7.12.2.21.g234cd45.dirty





* [ 082/143] inet: fix addr_len/msg->msg_namelen assignment in recv_error and
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (81 preceding siblings ...)
  2014-05-12  0:33 ` [ 081/143] net: add BUG_ON if kernel advertises msg_namelen > sizeof(struct Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 083/143] net: clamp ->msg_namelen instead of returning an error Willy Tarreau
                   ` (60 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: mpb, David S. Miller, Eric Dumazet, Hannes Frederic Sowa, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------
 rxpmtu functions

From: Hannes Frederic Sowa <hannes@stressinduktion.org>

[ Upstream commit 85fbaa75037d0b6b786ff18658ddf0b4014ce2a4 ]

Commit bceaa90240b6019ed73b49965eac7d167610be69 ("inet: prevent leakage
of uninitialized memory to user in recv syscalls") conditionally updated
addr_len if the msg_name is written to. The recv_error and rxpmtu
functions relied on the recvmsg functions to set up addr_len before.

As this does not happen any more we have to pass addr_len to those
functions as well and set it to the size of the corresponding sockaddr.

This broke traceroute and such.
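
The calling convention can be sketched in user space as follows.
toy_recv_error() is a hypothetical stand-in for the recv_error handlers,
not the kernel function; the only point is that the callee reports the
address length itself, and only when it really wrote an address:

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical model of an error-queue handler: *addr_len is updated
 * only if the caller supplied a name buffer and it was filled in.
 */
static int toy_recv_error(struct sockaddr_in *name, int *addr_len)
{
	if (name) {
		memset(name, 0, sizeof(*name));
		name->sin_family = AF_INET;
		*addr_len = sizeof(*name);
	}
	return 0;
}
```

A caller that passes no name buffer keeps whatever length it had set up
before the call, typically 0.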

Fixes: bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls")
Reported-by: Brad Spengler <spender@grsecurity.net>
Reported-by: Tom Labanowski
Cc: mpb <mpb.mail@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 include/net/ip.h       | 2 +-
 include/net/ipv6.h     | 3 ++-
 net/ipv4/ip_sockglue.c | 3 ++-
 net/ipv4/raw.c         | 2 +-
 net/ipv4/udp.c         | 2 +-
 net/ipv6/datagram.c    | 3 ++-
 net/ipv6/raw.c         | 2 +-
 net/ipv6/udp.c         | 2 +-
 8 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index a7d4675..e6860b1 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -391,7 +391,7 @@ extern int	compat_ip_getsockopt(struct sock *sk, int level,
 			int optname, char __user *optval, int __user *optlen);
 extern int	ip_ra_control(struct sock *sk, unsigned char on, void (*destructor)(struct sock *));
 
-extern int 	ip_recv_error(struct sock *sk, struct msghdr *msg, int len);
+extern int 	ip_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len);
 extern void	ip_icmp_error(struct sock *sk, struct sk_buff *skb, int err, 
 			      __be16 port, u32 info, u8 *payload);
 extern void	ip_local_error(struct sock *sk, int err, __be32 daddr, __be16 dport,
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 52d86da..cf928c4 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -567,7 +567,8 @@ extern int			compat_ipv6_getsockopt(struct sock *sk,
 extern int			ip6_datagram_connect(struct sock *sk, 
 						     struct sockaddr *addr, int addr_len);
 
-extern int 			ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len);
+extern int 			ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len,
+						int *addr_len);
 extern void			ipv6_icmp_error(struct sock *sk, struct sk_buff *skb, int err, __be16 port,
 						u32 info, u8 *payload);
 extern void			ipv6_local_error(struct sock *sk, int err, struct flowi *fl, u32 info);
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 099e6c3..d5a179b 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -356,7 +356,7 @@ void ip_local_error(struct sock *sk, int err, __be32 daddr, __be16 port, u32 inf
 /*
  *	Handle MSG_ERRQUEUE
  */
-int ip_recv_error(struct sock *sk, struct msghdr *msg, int len)
+int ip_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
 {
 	struct sock_exterr_skb *serr;
 	struct sk_buff *skb, *skb2;
@@ -393,6 +393,7 @@ int ip_recv_error(struct sock *sk, struct msghdr *msg, int len)
 						   serr->addr_offset);
 		sin->sin_port = serr->port;
 		memset(&sin->sin_zero, 0, sizeof(sin->sin_zero));
+		*addr_len = sizeof(*sin);
 	}
 
 	memcpy(&errhdr.ee, &serr->ee, sizeof(struct sock_extended_err));
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index c50344b..8065efa 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -682,7 +682,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		goto out;
 
 	if (flags & MSG_ERRQUEUE) {
-		err = ip_recv_error(sk, msg, len);
+		err = ip_recv_error(sk, msg, len, addr_len);
 		goto out;
 	}
 
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 80487ee..a9aed7e 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -942,7 +942,7 @@ int udp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	int is_udplite = IS_UDPLITE(sk);
 
 	if (flags & MSG_ERRQUEUE)
-		return ip_recv_error(sk, msg, len);
+		return ip_recv_error(sk, msg, len, addr_len);
 
 try_again:
 	skb = __skb_recv_datagram(sk, flags | (noblock ? MSG_DONTWAIT : 0),
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index e2bdc6d..ef6436d 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -281,7 +281,7 @@ void ipv6_local_error(struct sock *sk, int err, struct flowi *fl, u32 info)
 /*
  *	Handle MSG_ERRQUEUE
  */
-int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len)
+int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
 {
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct sock_exterr_skb *serr;
@@ -333,6 +333,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len)
 				      htonl(0xffff),
 				      *(__be32 *)(nh + serr->addr_offset));
 		}
+		*addr_len = sizeof(*sin);
 	}
 
 	memcpy(&errhdr.ee, &serr->ee, sizeof(struct sock_extended_err));
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index df75f94..d5b09c7 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -457,7 +457,7 @@ static int rawv6_recvmsg(struct kiocb *iocb, struct sock *sk,
 		return -EOPNOTSUPP;
 
 	if (flags & MSG_ERRQUEUE)
-		return ipv6_recv_error(sk, msg, len);
+		return ipv6_recv_error(sk, msg, len, addr_len);
 
 	skb = skb_recv_datagram(sk, flags, noblock, &err);
 	if (!skb)
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index ce291af..3a91859 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -201,7 +201,7 @@ int udpv6_recvmsg(struct kiocb *iocb, struct sock *sk,
 	int is_udp4;
 
 	if (flags & MSG_ERRQUEUE)
-		return ipv6_recv_error(sk, msg, len);
+		return ipv6_recv_error(sk, msg, len, addr_len);
 
 try_again:
 	skb = __skb_recv_datagram(sk, flags | (noblock ? MSG_DONTWAIT : 0),
-- 
1.7.12.2.21.g234cd45.dirty





* [ 083/143] net: clamp ->msg_namelen instead of returning an error
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (82 preceding siblings ...)
  2014-05-12  0:33 ` [ 082/143] inet: fix addr_len/msg->msg_namelen assignment in recv_error and Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-14 10:02   ` Dan Carpenter
  2014-05-12  0:33 ` [ 084/143] ipv6: fix leaking uninitialized port number of offender sockaddr Willy Tarreau
                   ` (59 subsequent siblings)
  143 siblings, 1 reply; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Dan Carpenter, Eric Dumazet, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

[ Upstream commit db31c55a6fb245fdbb752a2ca4aefec89afabb06 ]

If kmsg->msg_namelen > sizeof(struct sockaddr_storage) then in the
original code that would lead to memory corruption in the kernel if you
had audit configured.  If you didn't have audit configured it was
harmless.

There are some programs such as beta versions of Ruby which use too
large of a buffer and returning an error code breaks them.  We should
clamp the ->msg_namelen value instead.
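
A minimal user-space sketch of the clamping (clamp_msg_namelen() is a
name invented for this illustration; the real change is the two-line
diff below):

```c
#include <sys/socket.h>

/*
 * Hypothetical helper: limit a caller-supplied name length to the
 * largest address the kernel can hold, instead of rejecting it.
 */
static socklen_t clamp_msg_namelen(socklen_t namelen)
{
	if (namelen > sizeof(struct sockaddr_storage))
		namelen = sizeof(struct sockaddr_storage);
	return namelen;
}
```

Oversized buffers are thus accepted and treated as if they were exactly
sizeof(struct sockaddr_storage) bytes long.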

Fixes: 1661bf364ae9 ("net: heap overflow in __audit_sockaddr()")
Reported-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Eric Wong <normalperson@yhbt.net>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/compat.c | 2 +-
 net/socket.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/compat.c b/net/compat.c
index d325d16..e9672c8 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -70,7 +70,7 @@ int get_compat_msghdr(struct msghdr *kmsg, struct compat_msghdr __user *umsg)
 	    __get_user(kmsg->msg_flags, &umsg->msg_flags))
 		return -EFAULT;
 	if (kmsg->msg_namelen > sizeof(struct sockaddr_storage))
-		return -EINVAL;
+		kmsg->msg_namelen = sizeof(struct sockaddr_storage);
 	kmsg->msg_name = compat_ptr(tmp1);
 	kmsg->msg_iov = compat_ptr(tmp2);
 	kmsg->msg_control = compat_ptr(tmp3);
diff --git a/net/socket.c b/net/socket.c
index 712f977..0823497 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1872,7 +1872,7 @@ static int copy_msghdr_from_user(struct msghdr *kmsg,
 	if (copy_from_user(kmsg, umsg, sizeof(struct msghdr)))
 		return -EFAULT;
 	if (kmsg->msg_namelen > sizeof(struct sockaddr_storage))
-		return -EINVAL;
+		kmsg->msg_namelen = sizeof(struct sockaddr_storage);
 	return 0;
 }
 
-- 
1.7.12.2.21.g234cd45.dirty





* [ 084/143] ipv6: fix leaking uninitialized port number of offender sockaddr
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (83 preceding siblings ...)
  2014-05-12  0:33 ` [ 083/143] net: clamp ->msg_namelen instead of returning an error Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 085/143] atm: idt77252: fix dev refcnt leak Willy Tarreau
                   ` (58 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Hannes Frederic Sowa, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <hannes@stressinduktion.org>

[ Upstream commit 1fa4c710b6fe7b0aac9907240291b6fe6aafc3b8 ]

Offenders don't have port numbers, so set it to 0.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv6/datagram.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index ef6436d..5da306b 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -342,6 +342,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
 	if (serr->ee.ee_origin != SO_EE_ORIGIN_LOCAL) {
 		sin->sin6_family = AF_INET6;
 		sin->sin6_flowinfo = 0;
+		sin->sin6_port = 0;
 		sin->sin6_scope_id = 0;
 		if (serr->ee.ee_origin == SO_EE_ORIGIN_ICMP6) {
 			ipv6_addr_copy(&sin->sin6_addr, &ipv6_hdr(skb)->saddr);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 085/143] atm: idt77252: fix dev refcnt leak
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (84 preceding siblings ...)
  2014-05-12  0:33 ` [ 084/143] ipv6: fix leaking uninitialized port number of offender sockaddr Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 086/143] net: core: Always propagate flag changes to interfaces Willy Tarreau
                   ` (57 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Ying Xue, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Ying Xue <ying.xue@windriver.com>

[ Upstream commit b5de4a22f157ca345cdb3575207bf46402414bc1 ]

init_card() calls dev_get_by_name() to get a network device, but it
doesn't decrease the network device reference count after the device is
used.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/atm/idt77252.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/atm/idt77252.c b/drivers/atm/idt77252.c
index e33ae00..adbaed5 100644
--- a/drivers/atm/idt77252.c
+++ b/drivers/atm/idt77252.c
@@ -3557,6 +3557,7 @@ init_card(struct atm_dev *dev)
 	if (tmp) {
 		memcpy(card->atmdev->esi, tmp->dev_addr, 6);
 
+		dev_put(tmp);
 		printk("%s: ESI %02x:%02x:%02x:%02x:%02x:%02x\n",
 		       card->name, card->atmdev->esi[0], card->atmdev->esi[1],
 		       card->atmdev->esi[2], card->atmdev->esi[3],
-- 
1.7.12.2.21.g234cd45.dirty





* [ 086/143] net: core: Always propagate flag changes to interfaces
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (85 preceding siblings ...)
  2014-05-12  0:33 ` [ 085/143] atm: idt77252: fix dev refcnt leak Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 087/143] bridge: flush brs address entry in fdb when remove the bridge dev Willy Tarreau
                   ` (56 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Vlad Yasevich, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Vlad Yasevich <vyasevic@redhat.com>

[ Upstream commit d2615bf450694c1302d86b9cc8a8958edfe4c3a4 ]

The following commit:
    b6c40d68ff6498b7f63ddf97cf0aa818d748dee7
    net: only invoke dev->change_rx_flags when device is UP

tried to fix a problem with VLAN devices and promiscuous flag setting.
The issue was that a VLAN device was setting a flag on an interface that
was down, thus resulting in a bad promiscuity count.
This commit blocked flag propagation to any device that is currently
down.

A later commit:
    deede2fabe24e00bd7e246eb81cd5767dc6fcfc7
    vlan: Don't propagate flag changes on down interfaces

fixed VLAN code to only propagate flags when the VLAN interface is up,
thus fixing the same issue as above, only localized to VLAN.

The problem we have now is that if we create a complex stack
involving multiple software devices like bridges, bonds, and vlans,
then it is possible that the flags would not propagate properly to
the physical devices.  A simple example of the scenario is the
following:

  eth0----> bond0 ----> bridge0 ---> vlan50

If bond0 or eth0 happen to be down at the time bond0 is added to
the bridge, then eth0 will never have promisc mode set which is
currently required for operation as part of the bridge.  As a
result, packets with vlan50 will be dropped by the interface.

The only 2 devices that implement the special flag handling are
VLAN and DSA and they both have required code to prevent incorrect
flag propagation.  As a result we can remove the generic solution
introduced in b6c40d68ff6498b7f63ddf97cf0aa818d748dee7 and leave
it to the individual devices to decide whether they will block
flag propagation or not.
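
As a rough illustration of the scenario above, here is a toy user-space
model. It is not kernel code: struct toy_dev, toy_set_promisc() and the
require_up switch are all made up for this sketch, with require_up
standing in for the generic IFF_UP test that this patch removes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy device: just enough state to show how a flag change propagates. */
struct toy_dev {
	bool up;
	int promiscuity;
	struct toy_dev *lower;	/* next device down the stack, or NULL */
};

/*
 * Bump the promiscuity count on dev and pass the change down.
 * With require_up set, a device that is down swallows the change,
 * which models the behaviour before this patch.
 */
static void toy_set_promisc(struct toy_dev *dev, bool require_up)
{
	dev->promiscuity++;
	if (require_up && !dev->up)
		return;
	if (dev->lower)
		toy_set_promisc(dev->lower, require_up);
}
```

With bond0 down and eth0 below it, calling toy_set_promisc(&bond0, true)
leaves eth0 untouched, while toy_set_promisc(&bond0, false) reaches it.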

Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Suggested-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index d775563..d250444 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3388,7 +3388,7 @@ static void dev_change_rx_flags(struct net_device *dev, int flags)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
 
-	if ((dev->flags & IFF_UP) && ops->ndo_change_rx_flags)
+	if (ops->ndo_change_rx_flags)
 		ops->ndo_change_rx_flags(dev, flags);
 }
 
-- 
1.7.12.2.21.g234cd45.dirty





* [ 087/143] bridge: flush brs address entry in fdb when remove the bridge dev
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (86 preceding siblings ...)
  2014-05-12  0:33 ` [ 086/143] net: core: Always propagate flag changes to interfaces Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 088/143] inet: fix possible seqlock deadlocks Willy Tarreau
                   ` (55 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Ding Tianhong, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Ding Tianhong <dingtianhong@huawei.com>

[ Upstream commit f873042093c0b418d2351fe142222b625c740149 ]

When the following commands are executed:

brctl addbr br0
ifconfig br0 hw ether <addr>
rmmod bridge

The following call trace occurs:

[  563.312114] device eth1 left promiscuous mode
[  563.312188] br0: port 1(eth1) entered disabled state
[  563.468190] kmem_cache_destroy bridge_fdb_cache: Slab cache still has objects
[  563.468197] CPU: 6 PID: 6982 Comm: rmmod Tainted: G           O 3.12.0-0.7-default+ #9
[  563.468199] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  563.468200]  0000000000000880 ffff88010f111e98 ffffffff814d1c92 ffff88010f111eb8
[  563.468204]  ffffffff81148efd ffff88010f111eb8 0000000000000000 ffff88010f111ec8
[  563.468206]  ffffffffa062a270 ffff88010f111ed8 ffffffffa063ac76 ffff88010f111f78
[  563.468209] Call Trace:
[  563.468218]  [<ffffffff814d1c92>] dump_stack+0x6a/0x78
[  563.468234]  [<ffffffff81148efd>] kmem_cache_destroy+0xfd/0x100
[  563.468242]  [<ffffffffa062a270>] br_fdb_fini+0x10/0x20 [bridge]
[  563.468247]  [<ffffffffa063ac76>] br_deinit+0x4e/0x50 [bridge]
[  563.468254]  [<ffffffff810c7dc9>] SyS_delete_module+0x199/0x2b0
[  563.468259]  [<ffffffff814e0922>] system_call_fastpath+0x16/0x1b
[  570.377958] Bridge firewalling registered

--------------------------- cut here -------------------------------

The reason is that when the bridge dev's address is changed,
br_fdb_change_mac_address() adds the new address to the fdb, but when
the bridge is removed, that address entry is not freed, so
bridge_fdb_cache still has objects when the cache is destroyed. Fix
this by flushing the bridge address entry when removing the bridge.

v2: according to Toshiaki Makita's and Vlad's suggestion: deleting only
    the vlan0 entry still leaks if the vlan id is any other number, so
    I need to call fdb_delete_by_port(br, NULL, 1) to flush all entries
    whose dst is NULL for the bridge.

Suggested-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/bridge/br_if.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 4a9f527..c01e65d 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -162,6 +162,8 @@ static void del_br(struct net_bridge *br)
 		del_nbp(p);
 	}
 
+	br_fdb_delete_by_port(br, NULL, 1);
+
 	del_timer_sync(&br->gc_timer);
 
 	br_sysfs_delbr(br->dev);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 088/143] inet: fix possible seqlock deadlocks
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (87 preceding siblings ...)
  2014-05-12  0:33 ` [ 087/143] bridge: flush brs address entry in fdb when remove the bridge dev Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 089/143] ipv6: fix possible seqlock deadlock in ip6_finish_output2 Willy Tarreau
                   ` (54 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Eric Dumazet, Hannes Frederic Sowa, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit f1d8cba61c3c4b1eb88e507249c4cb8d635d9a76 ]

In commit c9e9042994d3 ("ipv4: fix possible seqlock deadlock") I left
other places where IP_INC_STATS_BH() was improperly used.

udp_sendmsg(), ping_v4_sendmsg() and tcp_v4_connect() are called from
process context, not from softirq context.

This was detected by lockdep seqlock support.

Reported-by: jongman heo <jongman.heo@samsung.com>
Fixes: 584bdf8cbdf6 ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP")
Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv4/tcp_ipv4.c | 2 +-
 net/ipv4/udp.c      | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index d746d3b3..e60f0fd 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -174,7 +174,7 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 			       inet->sport, usin->sin_port, sk, 1);
 	if (tmp < 0) {
 		if (tmp == -ENETUNREACH)
-			IP_INC_STATS_BH(sock_net(sk), IPSTATS_MIB_OUTNOROUTES);
+			IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTNOROUTES);
 		return tmp;
 	}
 
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index a9aed7e..dba3c01 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -723,7 +723,7 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		err = ip_route_output_flow(net, &rt, &fl, sk, 1);
 		if (err) {
 			if (err == -ENETUNREACH)
-				IP_INC_STATS_BH(net, IPSTATS_MIB_OUTNOROUTES);
+				IP_INC_STATS(net, IPSTATS_MIB_OUTNOROUTES);
 			goto out;
 		}
 
-- 
1.7.12.2.21.g234cd45.dirty





* [ 089/143] ipv6: fix possible seqlock deadlock in ip6_finish_output2
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (88 preceding siblings ...)
  2014-05-12  0:33 ` [ 088/143] inet: fix possible seqlock deadlocks Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 090/143] {pktgen, xfrm} Update IPv4 header total len and checksum after Willy Tarreau
                   ` (53 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Eric Dumazet, Hannes Frederic Sowa, Eric Dumazet,
	David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <hannes@stressinduktion.org>

[ Upstream commit 7f88c6b23afbd31545c676dea77ba9593a1a14bf ]

IPv6 stats are 64 bits and thus are protected with a seqlock. If we
don't disable bottom halves here, we could deadlock when a softirq
reentrantly updates the same mib.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv6/ip6_output.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index bb63ffc..6ff4d07 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -91,8 +91,8 @@ static int ip6_output_finish(struct sk_buff *skb)
 	else if (dst->neighbour)
 		return dst->neighbour->output(skb);
 
-	IP6_INC_STATS_BH(dev_net(dst->dev),
-			 ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES);
+	IP6_INC_STATS(dev_net(dst->dev),
+	              ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES);
 	kfree_skb(skb);
 	return -EINVAL;
 
-- 
1.7.12.2.21.g234cd45.dirty





* [ 090/143] {pktgen, xfrm} Update IPv4 header total len and checksum after
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (89 preceding siblings ...)
  2014-05-12  0:33 ` [ 089/143] ipv6: fix possible seqlock deadlock in ip6_finish_output2 Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 091/143] net: drop_monitor: fix the value of maxattr Willy Tarreau
                   ` (52 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Fan Du, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------
 transformation

From: "fan.du" <fan.du@windriver.com>

[ Upstream commit 3868204d6b89ea373a273e760609cb08020beb1a ]

commit a553e4a6317b2cfc7659542c10fe43184ffe53da ("[PKTGEN]: IPSEC support")
tried to support IPsec ESP transport transformation for pktgen, but actually
this doesn't work at all, for two reasons (the original transformed packet has
a bad IPv4 checksum value as well as a wrong auth value, as reported by
Wireshark):

- After transformation, the IPv4 header total length needs updating,
  because the encrypted payload's length is NOT the same as that of the
  plain text.

- After transformation, the IPv4 checksum needs recalculating because the
  payload has been changed.

With this patch, and pktgen armed with the configuration below, Wireshark is
able to decrypt the ESP packets generated by pktgen without any IPv4 checksum
error or auth value error.

pgset "flag IPSEC"
pgset "flows 1"
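
For illustration only, the two fix-ups can be sketched in user space on a
raw header buffer. The helper names ipv4_header_checksum() and
ipv4_fixup_header() are invented for this sketch; the patch itself uses
ip_hdr(), htons() and ip_send_check() in the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement checksum over an IPv4 header of len bytes (len even). */
static uint16_t ipv4_header_checksum(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;

	/* Add the header as big-endian 16-bit words. */
	for (size_t i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Hypothetical helper: store the new total length, then recompute
 * the header checksum with the checksum field zeroed.
 */
static void ipv4_fixup_header(uint8_t *hdr, uint16_t total_len)
{
	size_t ihl = (size_t)(hdr[0] & 0x0f) * 4;	/* header length in bytes */
	uint16_t csum;

	hdr[2] = (uint8_t)(total_len >> 8);
	hdr[3] = (uint8_t)(total_len & 0xff);
	hdr[10] = 0;
	hdr[11] = 0;
	csum = ipv4_header_checksum(hdr, ihl);
	hdr[10] = (uint8_t)(csum >> 8);
	hdr[11] = (uint8_t)(csum & 0xff);
}
```

Running ipv4_header_checksum() over a header that already carries a
correct checksum yields 0, which is a convenient sanity check.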

Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/core/pktgen.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 6a993b1..f776b99 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -2495,6 +2495,8 @@ static int process_ipsec(struct pktgen_dev *pkt_dev,
 		if (x) {
 			int ret;
 			__u8 *eth;
+			struct iphdr *iph;
+
 			nhead = x->props.header_len - skb_headroom(skb);
 			if (nhead > 0) {
 				ret = pskb_expand_head(skb, nhead, 0, GFP_ATOMIC);
@@ -2517,6 +2519,11 @@ static int process_ipsec(struct pktgen_dev *pkt_dev,
 			eth = (__u8 *) skb_push(skb, ETH_HLEN);
 			memcpy(eth, pkt_dev->hh, 12);
 			*(u16 *) &eth[12] = protocol;
+
+			/* Update IPv4 header len as well as checksum value */
+			iph = ip_hdr(skb);
+			iph->tot_len = htons(skb->len - ETH_HLEN);
+			ip_send_check(iph);
 		}
 	}
 	return 1;
-- 
1.7.12.2.21.g234cd45.dirty





* [ 091/143] net: drop_monitor: fix the value of maxattr
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (90 preceding siblings ...)
  2014-05-12  0:33 ` [ 090/143] {pktgen, xfrm} Update IPv4 header total len and checksum after Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 092/143] net: unix: allow bind to fail on mutex lock Willy Tarreau
                   ` (51 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Changli Gao <xiaosuo@gmail.com>

[ Upstream commit d323e92cc3f4edd943610557c9ea1bb4bb5056e8 ]

maxattr in genl_family should be used to save the max attribute
type, but not the max command type. Drop monitor doesn't support
any attributes, so we should leave it as zero.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/core/drop_monitor.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
index 0a113f2..e65fa2f 100644
--- a/net/core/drop_monitor.c
+++ b/net/core/drop_monitor.c
@@ -63,7 +63,6 @@ static struct genl_family net_drop_monitor_family = {
 	.hdrsize        = 0,
 	.name           = "NET_DM",
 	.version        = 2,
-	.maxattr        = NET_DM_CMD_MAX,
 };
 
 static DEFINE_PER_CPU(struct per_cpu_dm_data, dm_cpu_data);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 092/143] net: unix: allow bind to fail on mutex lock
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (91 preceding siblings ...)
  2014-05-12  0:33 ` [ 091/143] net: drop_monitor: fix the value of maxattr Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 093/143] drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl() Willy Tarreau
                   ` (50 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Sasha Levin, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Sasha Levin <sasha.levin@oracle.com>

[ Upstream commit 37ab4fa7844a044dc21fde45e2a0fc2f3c3b6490 ]

This is similar to the set_peek_off patch where calling bind while the
socket is stuck in unix_dgram_recvmsg() will block and cause a hung task
spew after a while.

This is also the last place that did a straightforward mutex_lock(), so
there shouldn't be any more of these patches.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/unix/af_unix.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index bb0b008..79c1dce 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -674,7 +674,9 @@ static int unix_autobind(struct socket *sock)
 	int err;
 	unsigned int retries = 0;
 
-	mutex_lock(&u->readlock);
+	err = mutex_lock_interruptible(&u->readlock);
+	if (err)
+		return err;
 
 	err = 0;
 	if (u->addr)
@@ -806,7 +808,9 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 		goto out;
 	addr_len = err;
 
-	mutex_lock(&u->readlock);
+	err = mutex_lock_interruptible(&u->readlock);
+	if (err)
+		goto out;
 
 	err = -EINVAL;
 	if (u->addr)
-- 
1.7.12.2.21.g234cd45.dirty





* [ 093/143] drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (92 preceding siblings ...)
  2014-05-12  0:33 ` [ 092/143] net: unix: allow bind to fail on mutex lock Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 094/143] hamradio/yam: fix info leak in ioctl Willy Tarreau
                   ` (49 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Wenliang Fan, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Wenliang Fan <fanwlexca@gmail.com>

[ Upstream commit e9db5c21d3646a6454fcd04938dd215ac3ab620a ]

The local variable 'bi' comes from userspace. If userspace passed a
large number to 'bi.data.calibrate', there would be an integer overflow
in the following line:
	s->hdlctx.calibrate = bi.data.calibrate * s->par.bitrate / 16;

Signed-off-by: Wenliang Fan <fanwlexca@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/hamradio/hdlcdrv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/hamradio/hdlcdrv.c b/drivers/net/hamradio/hdlcdrv.c
index 91c5790..c1b265d 100644
--- a/drivers/net/hamradio/hdlcdrv.c
+++ b/drivers/net/hamradio/hdlcdrv.c
@@ -572,6 +572,8 @@ static int hdlcdrv_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 	case HDLCDRVCTL_CALIBRATE:
 		if(!capable(CAP_SYS_RAWIO))
 			return -EPERM;
+		if (bi.data.calibrate > INT_MAX / s->par.bitrate)
+			return -EINVAL;
 		s->hdlctx.calibrate = bi.data.calibrate * s->par.bitrate / 16;
 		return 0;
 
-- 
1.7.12.2.21.g234cd45.dirty
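
Note for reviewers: the bound added to hdlcdrv_ioctl() above can be tried
out in userspace. The following is only a minimal sketch under stated
assumptions, not the driver code: calibrate_ticks() is a hypothetical
helper, and the rejection of non-positive bitrates and negative calibrate
values is an extra guard of this sketch that the patch itself does not add.

```c
#include <errno.h>
#include <limits.h>

/*
 * Hypothetical userspace model of the HDLCDRVCTL_CALIBRATE arithmetic.
 * Returns 0 and stores the result in *ticks, or -EINVAL when the
 * multiplication calibrate * bitrate could not be represented in an int.
 */
int calibrate_ticks(int calibrate, int bitrate, int *ticks)
{
	/* Assumption of this sketch only: keep both operands in a sane range. */
	if (bitrate <= 0 || calibrate < 0)
		return -EINVAL;

	/* Same test as the patch: reject values whose product exceeds INT_MAX. */
	if (calibrate > INT_MAX / bitrate)
		return -EINVAL;

	*ticks = calibrate * bitrate / 16;
	return 0;
}
```

The division is done on the limit rather than on the product, so the check
itself cannot overflow.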





* [ 094/143] hamradio/yam: fix info leak in ioctl
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (93 preceding siblings ...)
  2014-05-12  0:33 ` [ 093/143] drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl() Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 095/143] rds: prevent dereference of a NULL device Willy Tarreau
                   ` (48 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Salva Peiró, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Salva Peiró <speiro@ai2.upv.es>

[ Upstream commit 8e3fbf870481eb53b2d3a322d1fc395ad8b367ed ]

The yam_ioctl() code fails to initialise the cmd field
of the struct yamdrv_ioctl_cfg. Add an explicit memset(0)
before filling the structure to avoid the 4-byte info leak.

Signed-off-by: Salva Peiró <speiro@ai2.upv.es>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/hamradio/yam.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/hamradio/yam.c b/drivers/net/hamradio/yam.c
index 694132e..1a1002d 100644
--- a/drivers/net/hamradio/yam.c
+++ b/drivers/net/hamradio/yam.c
@@ -1060,6 +1060,7 @@ static int yam_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 		break;
 
 	case SIOCYAMGCFG:
+		memset(&yi, 0, sizeof(yi));
 		yi.cfg.mask = 0xffffffff;
 		yi.cfg.iobase = yp->iobase;
 		yi.cfg.irq = yp->irq;
-- 
1.7.12.2.21.g234cd45.dirty
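
Note for reviewers: the effect of the added memset() can be illustrated in
userspace. This is a minimal sketch, assuming a simplified stand-in for
struct yamdrv_ioctl_cfg; demo_ioctl_cfg and demo_fill_cfg() are
hypothetical names and the field list is abbreviated.

```c
#include <string.h>

/* Hypothetical, flattened stand-in for the ioctl reply structure. */
struct demo_ioctl_cfg {
	int cmd;		/* not assigned by the "get config" path */
	unsigned int mask;
	unsigned int iobase;
	unsigned int irq;
};

/*
 * Fill the reply. Zeroing the whole structure first guarantees that
 * fields the code never assigns (here: cmd) and any padding carry no
 * stale stack contents when the structure is copied out.
 */
void demo_fill_cfg(struct demo_ioctl_cfg *out, unsigned int iobase,
		   unsigned int irq)
{
	memset(out, 0, sizeof(*out));
	out->mask = 0xffffffff;
	out->iobase = iobase;
	out->irq = irq;
}
```

Pre-filling the structure with a poison byte before calling demo_fill_cfg()
shows that cmd comes back as zero rather than as leftover data.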





* [ 095/143] rds: prevent dereference of a NULL device
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (94 preceding siblings ...)
  2014-05-12  0:33 ` [ 094/143] hamradio/yam: fix info leak in ioctl Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 096/143] net: rose: restore old recvmsg behavior Willy Tarreau
                   ` (47 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Sasha Levin, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Sasha Levin <sasha.levin@oracle.com>

[ Upstream commit c2349758acf1874e4c2b93fe41d072336f1a31d0 ]

Binding might result in a NULL device, which is dereferenced
causing this BUG:

[ 1317.260548] BUG: unable to handle kernel NULL pointer dereference at 0000000000000974
[ 1317.261847] IP: [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110
[ 1317.263315] PGD 418bcb067 PUD 3ceb21067 PMD 0
[ 1317.263502] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 1317.264179] Dumping ftrace buffer:
[ 1317.264774]    (ftrace buffer empty)
[ 1317.265220] Modules linked in:
[ 1317.265824] CPU: 4 PID: 836 Comm: trinity-child46 Tainted: G        W    3.13.0-rc4-next-20131218-sasha-00013-g2cebb9b-dirty #4159
[ 1317.267415] task: ffff8803ddf33000 ti: ffff8803cd31a000 task.ti: ffff8803cd31a000
[ 1317.268399] RIP: 0010:[<ffffffff84225f52>]  [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110
[ 1317.269670] RSP: 0000:ffff8803cd31bdf8  EFLAGS: 00010246
[ 1317.270230] RAX: 0000000000000000 RBX: ffff88020b0dd388 RCX: 0000000000000000
[ 1317.270230] RDX: ffffffff8439822e RSI: 00000000000c000a RDI: 0000000000000286
[ 1317.270230] RBP: ffff8803cd31be38 R08: 0000000000000000 R09: 0000000000000000
[ 1317.270230] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 1317.270230] R13: 0000000054086700 R14: 0000000000a25de0 R15: 0000000000000031
[ 1317.270230] FS:  00007ff40251d700(0000) GS:ffff88022e200000(0000) knlGS:0000000000000000
[ 1317.270230] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1317.270230] CR2: 0000000000000974 CR3: 00000003cd478000 CR4: 00000000000006e0
[ 1317.270230] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1317.270230] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602
[ 1317.270230] Stack:
[ 1317.270230]  0000000054086700 5408670000a25de0 5408670000000002 0000000000000000
[ 1317.270230]  ffffffff84223542 00000000ea54c767 0000000000000000 ffffffff86d26160
[ 1317.270230]  ffff8803cd31be68 ffffffff84223556 ffff8803cd31beb8 ffff8800c6765280
[ 1317.270230] Call Trace:
[ 1317.270230]  [<ffffffff84223542>] ? rds_trans_get_preferred+0x42/0xa0
[ 1317.270230]  [<ffffffff84223556>] rds_trans_get_preferred+0x56/0xa0
[ 1317.270230]  [<ffffffff8421c9c3>] rds_bind+0x73/0xf0
[ 1317.270230]  [<ffffffff83e4ce62>] SYSC_bind+0x92/0xf0
[ 1317.270230]  [<ffffffff812493f8>] ? context_tracking_user_exit+0xb8/0x1d0
[ 1317.270230]  [<ffffffff8119313d>] ? trace_hardirqs_on+0xd/0x10
[ 1317.270230]  [<ffffffff8107a852>] ? syscall_trace_enter+0x32/0x290
[ 1317.270230]  [<ffffffff83e4cece>] SyS_bind+0xe/0x10
[ 1317.270230]  [<ffffffff843a6ad0>] tracesys+0xdd/0xe2
[ 1317.270230] Code: 00 8b 45 cc 48 8d 75 d0 48 c7 45 d8 00 00 00 00 66 c7 45 d0 02 00 89 45 d4 48 89 df e8 78 49 76 ff 41 89 c4 85 c0 75 0c 48 8b 03 <80> b8 74 09 00 00 01 74 06 41 bc 9d ff ff ff f6 05 2a b6 c2 02
[ 1317.270230] RIP  [<ffffffff84225f52>] rds_ib_laddr_check+0x82/0x110
[ 1317.270230]  RSP <ffff8803cd31bdf8>
[ 1317.270230] CR2: 0000000000000974

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/rds/ib.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/rds/ib.c b/net/rds/ib.c
index 536ebe5..5018f3d 100644
--- a/net/rds/ib.c
+++ b/net/rds/ib.c
@@ -235,7 +235,8 @@ static int rds_ib_laddr_check(__be32 addr)
 	ret = rdma_bind_addr(cm_id, (struct sockaddr *)&sin);
 	/* due to this, we will claim to support iWARP devices unless we
 	   check node_type. */
-	if (ret || cm_id->device->node_type != RDMA_NODE_IB_CA)
+	if (ret || !cm_id->device ||
+	    cm_id->device->node_type != RDMA_NODE_IB_CA)
 		ret = -EADDRNOTAVAIL;
 
 	rdsdebug("addr %pI4 ret %d node type %d\n",
-- 
1.7.12.2.21.g234cd45.dirty
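
Note for reviewers: the reordered condition in rds_ib_laddr_check() relies
on short-circuit evaluation, so the device pointer is tested before it is
dereferenced. A minimal userspace sketch, assuming hypothetical demo_*
types in place of the RDMA structures (the enum values are placeholders,
not the kernel's):

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the node type, device and cm_id. */
enum demo_node_type { DEMO_NODE_IB_CA = 1, DEMO_NODE_RNIC = 4 };

struct demo_device {
	enum demo_node_type node_type;
};

struct demo_cm_id {
	struct demo_device *device;	/* may be NULL after a bind */
};

/*
 * Mirrors the patched condition: a failed bind, a missing device, or a
 * non-IB device all yield -EADDRNOTAVAIL. The !cm_id->device test is
 * evaluated before node_type is read, so a NULL device is never
 * dereferenced.
 */
int demo_laddr_check(int bind_ret, const struct demo_cm_id *cm_id)
{
	if (bind_ret || !cm_id->device ||
	    cm_id->device->node_type != DEMO_NODE_IB_CA)
		return -EADDRNOTAVAIL;
	return 0;
}
```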





* [ 096/143] net: rose: restore old recvmsg behavior
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (95 preceding siblings ...)
  2014-05-12  0:33 ` [ 095/143] rds: prevent dereference of a NULL device Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 097/143] net: llc: fix use after free in llc_ui_recvmsg Willy Tarreau
                   ` (46 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Florian Westphal, Hannes Frederic Sowa, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Florian Westphal <fw@strlen.de>

[ Upstream commit f81152e35001e91997ec74a7b4e040e6ab0acccf ]

The recvmsg handler in net/rose/af_rose.c performs a size check on
->msg_namelen.

After commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c
(net: rework recvmsg handler msg_name and msg_namelen logic), we now
always take the else branch due to namelen being initialized to 0.

Digging in the netdev-vger-cvs git repo shows that msg_namelen has been
initialized with a fixed size since at least 1995, so the else branch
was never taken.

Compile tested only.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/rose/af_rose.c | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index 08a86f6..7119ea6 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1275,6 +1275,7 @@ static int rose_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 	if (msg->msg_name) {
 		struct sockaddr_rose *srose;
+		struct full_sockaddr_rose *full_srose = msg->msg_name;
 
 		memset(msg->msg_name, 0, sizeof(struct full_sockaddr_rose));
 		srose = msg->msg_name;
@@ -1282,18 +1283,9 @@ static int rose_recvmsg(struct kiocb *iocb, struct socket *sock,
 		srose->srose_addr   = rose->dest_addr;
 		srose->srose_call   = rose->dest_call;
 		srose->srose_ndigis = rose->dest_ndigis;
-		if (msg->msg_namelen >= sizeof(struct full_sockaddr_rose)) {
-			struct full_sockaddr_rose *full_srose = (struct full_sockaddr_rose *)msg->msg_name;
-			for (n = 0 ; n < rose->dest_ndigis ; n++)
-				full_srose->srose_digis[n] = rose->dest_digis[n];
-			msg->msg_namelen = sizeof(struct full_sockaddr_rose);
-		} else {
-			if (rose->dest_ndigis >= 1) {
-				srose->srose_ndigis = 1;
-				srose->srose_digi = rose->dest_digis[0];
-			}
-			msg->msg_namelen = sizeof(struct sockaddr_rose);
-		}
+		for (n = 0 ; n < rose->dest_ndigis ; n++)
+			full_srose->srose_digis[n] = rose->dest_digis[n];
+		msg->msg_namelen = sizeof(struct full_sockaddr_rose);
 	}
 
 	skb_free_datagram(sk, skb);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 097/143] net: llc: fix use after free in llc_ui_recvmsg
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (96 preceding siblings ...)
  2014-05-12  0:33 ` [ 096/143] net: rose: restore old recvmsg behavior Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 098/143] inet_diag: fix inet_diag_dump_icsk() timewait socket state logic Willy Tarreau
                   ` (45 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Daniel Borkmann, Stephen Hemminger, Arnaldo Carvalho de Melo,
	David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dborkman@redhat.com>

[ Upstream commit 4d231b76eef6c4a6bd9c96769e191517765942cb ]

While commit 30a584d944fb fixes datagram interface in LLC, a use
after free bug has been introduced for SOCK_STREAM sockets that do
not make use of MSG_PEEK.

The flow is as follows ...

  if (!(flags & MSG_PEEK)) {
    ...
    sk_eat_skb(sk, skb, false);
    ...
  }
  ...
  if (used + offset < skb->len)
    continue;

... where sk_eat_skb() calls __kfree_skb(). Therefore, cache
original length and work on skb_len to check partial reads.

Fixes: 30a584d944fb ("[LLX]: SOCK_DGRAM interface fixes")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/llc/af_llc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index 606b6ad..f62b63e 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -669,7 +669,7 @@ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock,
 	struct llc_sock *llc = llc_sk(sk);
 	size_t copied = 0;
 	u32 peek_seq = 0;
-	u32 *seq;
+	u32 *seq, skb_len;
 	unsigned long used;
 	int target;	/* Read at least this many bytes */
 	long timeo;
@@ -767,6 +767,7 @@ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock,
 		}
 		continue;
 	found_ok_skb:
+		skb_len = skb->len;
 		/* Ok so how much can we use? */
 		used = skb->len - offset;
 		if (len < used)
@@ -797,7 +798,7 @@ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock,
 			goto copy_uaddr;
 
 		/* Partial read */
-		if (used + offset < skb->len)
+		if (used + offset < skb_len)
 			continue;
 	} while (len > 0);
 
-- 
1.7.12.2.21.g234cd45.dirty
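
Note for reviewers: the pattern used by the fix above (cache the length
before the buffer can be freed, then compare against the cached value) can
be sketched in userspace. This is only an illustration under stated
assumptions: demo_skb, demo_eat_skb() and demo_read_is_partial() are
hypothetical names, and free() stands in for the release done by
sk_eat_skb().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the few sk_buff bits that matter here. */
struct demo_skb {
	size_t len;
};

/* Stand-in for sk_eat_skb(): the buffer is gone after this call. */
void demo_eat_skb(struct demo_skb *skb)
{
	free(skb);
}

/*
 * Returns true when the read consumed only part of the buffer.
 * The length is copied to a local before the buffer may be released,
 * so the final comparison never touches freed memory.
 */
bool demo_read_is_partial(struct demo_skb *skb, size_t used, size_t offset,
			  bool peek)
{
	size_t skb_len = skb->len;	/* cache before a possible free */

	if (!peek)
		demo_eat_skb(skb);

	return used + offset < skb_len;
}
```

With peek set, the caller keeps ownership of the buffer; otherwise the
buffer must not be used again after the call.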





* [ 098/143] inet_diag: fix inet_diag_dump_icsk() timewait socket state logic
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (97 preceding siblings ...)
  2014-05-12  0:33 ` [ 097/143] net: llc: fix use after free in llc_ui_recvmsg Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 099/143] net: fix ip rule iif/oif device rename Willy Tarreau
                   ` (44 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Neal Cardwell, Eric Dumazet, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Neal Cardwell <ncardwell@google.com>

[ Based upon upstream commit 70315d22d3c7383f9a508d0aab21e2eb35b2303a ]

Fix inet_diag_dump_icsk() to reflect the fact that both TIME_WAIT and
FIN_WAIT2 connections are represented by inet_timewait_sock (not just
TIME_WAIT). Thus:

(a) We need to iterate through the time_wait buckets if the user wants
either TIME_WAIT or FIN_WAIT2. (Before fixing this, "ss -nemoi state
fin-wait-2" would not return any sockets, even if there were some in
FIN_WAIT2.)

(b) We need to check tw_substate to see if the user wants to dump
sockets in the particular substate (TIME_WAIT or FIN_WAIT2) that a
given connection is in. (Before fixing this, "ss -nemoi state
time-wait" would actually return sockets in state FIN_WAIT2.)

An analogous fix is in v3.13: 70315d22d3c7383f9a508d0aab21e2eb35b2303a
("inet_diag: fix inet_diag_dump_icsk() to use correct state for
timewait sockets") but that patch is quite different because 3.13 code
is very different in this area due to the unification of TCP hash
tables in 05dbc7b ("tcp/dccp: remove twchain") in v3.13-rc1.

I tested that this applies cleanly between v3.3 and v3.12, and tested
that it works in both 3.3 and 3.12. It does not apply cleanly to 3.2
and earlier (though it makes semantic sense), and semantically is not
the right fix for 3.13 and beyond (as mentioned above).

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv4/inet_diag.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index dba56d2..65ee65a 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -814,7 +814,7 @@ next_normal:
 			++num;
 		}
 
-		if (r->idiag_states & TCPF_TIME_WAIT) {
+		if (r->idiag_states & (TCPF_TIME_WAIT | TCPF_FIN_WAIT2)) {
 			struct inet_timewait_sock *tw;
 
 			inet_twsk_for_each(tw, node,
@@ -822,6 +822,8 @@ next_normal:
 
 				if (num < s_num)
 					goto next_dying;
+				if (!(r->idiag_states & (1 << tw->tw_substate)))
+					goto next_dying;
 				if (r->id.idiag_sport != tw->tw_sport &&
 				    r->id.idiag_sport)
 					goto next_dying;
-- 
1.7.12.2.21.g234cd45.dirty
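
Note for reviewers: points (a) and (b) of the changelog above come down to
two bitmask tests. A minimal userspace sketch follows; the demo_* names are
hypothetical, and the numeric state values are meant to mirror the kernel's
TCP_FIN_WAIT2 and TCP_TIME_WAIT but should be treated as assumptions of
this sketch.

```c
/* Assumed state numbers; the TCPF_* style masks are 1 << state. */
enum {
	DEMO_TCP_FIN_WAIT2 = 5,
	DEMO_TCP_TIME_WAIT = 6,
};

#define DEMO_TCPF_FIN_WAIT2	(1u << DEMO_TCP_FIN_WAIT2)
#define DEMO_TCPF_TIME_WAIT	(1u << DEMO_TCP_TIME_WAIT)

/* (a) Walk the timewait chain if either substate was requested. */
int demo_scan_tw_chain(unsigned int idiag_states)
{
	return (idiag_states &
		(DEMO_TCPF_TIME_WAIT | DEMO_TCPF_FIN_WAIT2)) != 0;
}

/* (b) Report a timewait socket only if its own substate was requested. */
int demo_tw_matches(unsigned int idiag_states, int tw_substate)
{
	return (idiag_states & (1u << tw_substate)) != 0;
}
```

A request for only fin-wait-2 now scans the chain, and a request for only
time-wait no longer matches a socket whose substate is FIN_WAIT2.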





* [ 099/143] net: fix ip rule iif/oif device rename
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (98 preceding siblings ...)
  2014-05-12  0:33 ` [ 098/143] inet_diag: fix inet_diag_dump_icsk() timewait socket state logic Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 100/143] tg3: Fix deadlock in tg3_change_mtu() Willy Tarreau
                   ` (43 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Maciej Zenczykowski, Willem de Bruijn, Eric Dumazet, Chris Davis,
	Carlo Contavalli, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Maciej Zenczykowski <maze@google.com>

[ Upstream commit 946c032e5a53992ea45e062ecb08670ba39b99e3 ]

ip rules with iif/oif references are not updated (detached/attached)
across interface renames.

Signed-off-by: Maciej Zenczykowski <maze@google.com>
CC: Willem de Bruijn <willemb@google.com>
CC: Eric Dumazet <edumazet@google.com>
CC: Chris Davis <chrismd@google.com>
CC: Carlo Contavalli <ccontavalli@google.com>

Google-Bug-Id: 12936021
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/core/fib_rules.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index de9eac9..06bdee7 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -633,6 +633,13 @@ static int fib_rules_event(struct notifier_block *this, unsigned long event,
 			attach_rules(&ops->rules_list, dev);
 		break;
 
+	case NETDEV_CHANGENAME:
+		list_for_each_entry(ops, &net->rules_ops, list) {
+			detach_rules(&ops->rules_list, dev);
+			attach_rules(&ops->rules_list, dev);
+		}
+		break;
+
 	case NETDEV_UNREGISTER:
 		list_for_each_entry(ops, &net->rules_ops, list)
 			detach_rules(&ops->rules_list, dev);
-- 
1.7.12.2.21.g234cd45.dirty
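
Note for reviewers: the detach-then-attach sequence added for
NETDEV_CHANGENAME can be modelled in userspace. This is a simplified
sketch, assuming that a rule is bound by interface index, remembers the
interface name it was configured with, and uses -1 for "not bound"; all
demo_* names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_IFNAMSIZ 16

struct demo_dev {
	int ifindex;
	char name[DEMO_IFNAMSIZ];
};

struct demo_rule {
	int ifindex;			/* -1 while not bound to a device */
	char ifname[DEMO_IFNAMSIZ];	/* name the rule was configured with */
};

/* Unbind every rule currently bound to this device. */
void demo_detach_rules(struct demo_rule *rules, size_t n,
		       const struct demo_dev *dev)
{
	for (size_t i = 0; i < n; i++)
		if (rules[i].ifindex == dev->ifindex)
			rules[i].ifindex = -1;
}

/* Bind every unbound rule whose configured name matches the device. */
void demo_attach_rules(struct demo_rule *rules, size_t n,
		       const struct demo_dev *dev)
{
	for (size_t i = 0; i < n; i++)
		if (rules[i].ifindex == -1 &&
		    strcmp(rules[i].ifname, dev->name) == 0)
			rules[i].ifindex = dev->ifindex;
}

/* Rename the device, then refresh the bindings as the patch does. */
void demo_changename(struct demo_rule *rules, size_t n,
		     struct demo_dev *dev, const char *new_name)
{
	strncpy(dev->name, new_name, DEMO_IFNAMSIZ - 1);
	dev->name[DEMO_IFNAMSIZ - 1] = '\0';

	demo_detach_rules(rules, n, dev);
	demo_attach_rules(rules, n, dev);
}
```

After a rename from "eth0" to "wan0", a rule configured for "eth0" becomes
unbound and a rule configured for "wan0" picks up the device's index.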





* [ 100/143] tg3: Fix deadlock in tg3_change_mtu()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (99 preceding siblings ...)
  2014-05-12  0:33 ` [ 099/143] net: fix ip rule iif/oif device rename Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 101/143] bonding: 802.3ad: make aggregator_identifier bond-private Willy Tarreau
                   ` (42 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Michael Chan, Nithin Nayak Sujir, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Nithin Sujir <nsujir@broadcom.com>

[ Upstream commit c6993dfd7db9b0c6b7ca7503a56fda9236a4710f ]

Quoting David Vrabel -
"5780 cards cannot have jumbo frames and TSO enabled together.  When
jumbo frames are enabled by setting the MTU, the TSO feature must be
cleared.  This is done indirectly by calling netdev_update_features()
which will call tg3_fix_features() to actually clear the flags.

netdev_update_features() will also trigger a new netlink message for the
feature change event which will result in a call to tg3_get_stats64()
which deadlocks on the tg3 lock."

tg3_set_mtu() does not need to be under the tg3 lock since converting
the flags to use set_bit(). Move it out to after tg3_netif_stop().

Reported-by: David Vrabel <david.vrabel@citrix.com>
Tested-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/tg3.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 89aa69c..56648b4 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -5583,12 +5583,12 @@ static int tg3_change_mtu(struct net_device *dev, int new_mtu)
 
 	tg3_netif_stop(tp);
 
+	tg3_set_mtu(dev, tp, new_mtu);
+
 	tg3_full_lock(tp, 1);
 
 	tg3_halt(tp, RESET_KIND_SHUTDOWN, 1);
 
-	tg3_set_mtu(dev, tp, new_mtu);
-
 	err = tg3_restart_hw(tp, 0);
 
 	if (!err)
-- 
1.7.12.2.21.g234cd45.dirty





* [ 101/143] bonding: 802.3ad: make aggregator_identifier bond-private
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (100 preceding siblings ...)
  2014-05-12  0:33 ` [ 100/143] tg3: Fix deadlock in tg3_change_mtu() Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 102/143] net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode Willy Tarreau
                   ` (41 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Jiri Bohac, Veaceslav Falico, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Jiri Bohac <jiri@boha.cz>

[ Upstream commit 163c8ff30dbe473abfbb24a7eac5536c87f3baa9 ]

aggregator_identifier is used to assign unique aggregator identifiers
to aggregators of a bond during device enslaving.

aggregator_identifier is currently a global variable that is zeroed in
bond_3ad_initialize().

This sequence will lead to duplicate aggregator identifiers for eth1 and eth3:

create bond0
change bond0 mode to 802.3ad
enslave eth0 to bond0 		//eth0 gets agg id 1
enslave eth1 to bond0 		//eth1 gets agg id 2
create bond1
change bond1 mode to 802.3ad
enslave eth2 to bond1		//aggregator_identifier is reset to 0
				//eth2 gets agg id 1
enslave eth3 to bond0 		//eth3 gets agg id 2

Fix this by making aggregator_identifier private to the bond.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/bonding/bond_3ad.c | 6 ++----
 drivers/net/bonding/bond_3ad.h | 1 +
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index 05308e6..ec2bf8c 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -1846,8 +1846,6 @@ void bond_3ad_initiate_agg_selection(struct bonding *bond, int timeout)
 	BOND_AD_INFO(bond).agg_select_mode = bond->params.ad_select;
 }
 
-static u16 aggregator_identifier;
-
 /**
  * bond_3ad_initialize - initialize a bond's 802.3ad parameters and structures
  * @bond: bonding struct to work on
@@ -1862,7 +1860,7 @@ void bond_3ad_initialize(struct bonding *bond, u16 tick_resolution, int lacp_fas
 	if (MAC_ADDRESS_COMPARE(&(BOND_AD_INFO(bond).system.sys_mac_addr),
 				bond->dev->dev_addr)) {
 
-		aggregator_identifier = 0;
+		BOND_AD_INFO(bond).aggregator_identifier = 0;
 
 		BOND_AD_INFO(bond).lacp_fast = lacp_fast;
 		BOND_AD_INFO(bond).system.sys_priority = 0xFFFF;
@@ -1937,7 +1935,7 @@ int bond_3ad_bind_slave(struct slave *slave)
 		ad_initialize_agg(aggregator);
 
 		aggregator->aggregator_mac_address = *((struct mac_addr *)bond->dev->dev_addr);
-		aggregator->aggregator_identifier = (++aggregator_identifier);
+		aggregator->aggregator_identifier = ++BOND_AD_INFO(bond).aggregator_identifier;
 		aggregator->slave = slave;
 		aggregator->is_active = 0;
 		aggregator->num_of_ports = 0;
diff --git a/drivers/net/bonding/bond_3ad.h b/drivers/net/bonding/bond_3ad.h
index 2c46a154..f04f465 100644
--- a/drivers/net/bonding/bond_3ad.h
+++ b/drivers/net/bonding/bond_3ad.h
@@ -253,6 +253,7 @@ struct ad_system {
 struct ad_bond_info {
 	struct ad_system system;	    /* 802.3ad system structure */
 	u32 agg_select_timer;	    // Timer to select aggregator after all adapter's hand shakes
+	u16 aggregator_identifier;
 	u32 agg_select_mode;	    // Mode of selection of active aggregator(bandwidth/count)
 	int lacp_fast;		/* whether fast periodic tx should be
 				 * requested
-- 
1.7.12.2.21.g234cd45.dirty
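
Note for reviewers: the enslave sequence from the changelog above can be
replayed against a per-bond counter in userspace. A minimal sketch, assuming
hypothetical demo_* names that stand in for the bonding structures:

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-bond 802.3ad state. */
struct demo_bond {
	uint16_t aggregator_identifier;	/* private to this bond */
};

/* Stand-in for bond_3ad_initialize(): resets only this bond's counter. */
void demo_bond_3ad_initialize(struct demo_bond *bond)
{
	bond->aggregator_identifier = 0;
}

/* Stand-in for bond_3ad_bind_slave(): hands out the next id of this bond. */
uint16_t demo_bind_slave(struct demo_bond *bond)
{
	return ++bond->aggregator_identifier;
}
```

With the counter stored per bond, initializing bond1 between the second and
third enslave on bond0 no longer resets bond0's numbering, so eth3 gets a
fresh identifier instead of repeating eth1's.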





* [ 102/143] net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (101 preceding siblings ...)
  2014-05-12  0:33 ` [ 101/143] bonding: 802.3ad: make aggregator_identifier bond-private Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 103/143] virtio-net: alloc big buffers also when guest can receive UFO Willy Tarreau
                   ` (40 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Daniel Borkmann, Neil Horman, Vlad Yasevich, David S. Miller,
	Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dborkman@redhat.com>

[ Upstream commit ffd5939381c609056b33b7585fb05a77b4c695f3 ]

SCTP's sctp_connectx() abi breaks for 64bit kernels compiled with 32bit
emulation (e.g. ia32 emulation or x86_x32). Due to internal usage of
'struct sctp_getaddrs_old' which includes a struct sockaddr pointer,
sizeof(param) check will always fail in kernel as the structure in
64bit kernel space is 4bytes larger than for user binaries compiled
in 32bit mode. Thus, applications making use of sctp_connectx() won't
be able to run under such circumstances.

Introduce a compat interface in the kernel to deal with such
situations by using a 'struct compat_sctp_getaddrs_old' structure
where user data is copied into it, and then successively transformed
into a 'struct sctp_getaddrs_old' structure with the help of
compat_ptr(). That fixes sctp_connectx() abi without any changes
needed in user space, and lets the SCTP test suite pass when compiled
in 32bit and run on 64bit kernels.

Fixes: f9c67811ebc0 ("sctp: Fix regression introduced by new sctp_connectx api")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/sctp/socket.c | 41 ++++++++++++++++++++++++++++++++---------
 1 file changed, 32 insertions(+), 9 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 44d8eab..c26d905 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -67,6 +67,7 @@
 #include <linux/poll.h>
 #include <linux/init.h>
 #include <linux/crypto.h>
+#include <linux/compat.h>
 
 #include <net/ip.h>
 #include <net/icmp.h>
@@ -1284,11 +1285,19 @@ SCTP_STATIC int sctp_setsockopt_connectx(struct sock* sk,
 /*
  * New (hopefully final) interface for the API.
  * We use the sctp_getaddrs_old structure so that use-space library
- * can avoid any unnecessary allocations.   The only defferent part
+ * can avoid any unnecessary allocations. The only different part
  * is that we store the actual length of the address buffer into the
- * addrs_num structure member.  That way we can re-use the existing
+ * addrs_num structure member. That way we can re-use the existing
  * code.
  */
+#ifdef CONFIG_COMPAT
+struct compat_sctp_getaddrs_old {
+	sctp_assoc_t	assoc_id;
+	s32		addr_num;
+	compat_uptr_t	addrs;		/* struct sockaddr * */
+};
+#endif
+
 SCTP_STATIC int sctp_getsockopt_connectx3(struct sock* sk, int len,
 					char __user *optval,
 					int __user *optlen)
@@ -1297,16 +1306,30 @@ SCTP_STATIC int sctp_getsockopt_connectx3(struct sock* sk, int len,
 	sctp_assoc_t assoc_id = 0;
 	int err = 0;
 
-	if (len < sizeof(param))
-		return -EINVAL;
+#ifdef CONFIG_COMPAT
+	if (is_compat_task()) {
+		struct compat_sctp_getaddrs_old param32;
 
-	if (copy_from_user(&param, optval, sizeof(param)))
-		return -EFAULT;
+		if (len < sizeof(param32))
+			return -EINVAL;
+		if (copy_from_user(&param32, optval, sizeof(param32)))
+			return -EFAULT;
 
-	err = __sctp_setsockopt_connectx(sk,
-			(struct sockaddr __user *)param.addrs,
-			param.addr_num, &assoc_id);
+		param.assoc_id = param32.assoc_id;
+		param.addr_num = param32.addr_num;
+		param.addrs = compat_ptr(param32.addrs);
+	} else
+#endif
+	{
+		if (len < sizeof(param))
+			return -EINVAL;
+		if (copy_from_user(&param, optval, sizeof(param)))
+			return -EFAULT;
+	}
 
+	err = __sctp_setsockopt_connectx(sk, (struct sockaddr __user *)
+					 param.addrs, param.addr_num,
+					 &assoc_id);
 	if (err == 0 || err == -EINPROGRESS) {
 		if (copy_to_user(optval, &assoc_id, sizeof(assoc_id)))
 			return -EFAULT;
-- 
1.7.12.2.21.g234cd45.dirty
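
Note for reviewers: the layout difference that the compat path above works
around can be reproduced in userspace. This is a minimal sketch under
stated assumptions: the demo_* structures are hypothetical mirrors of
sctp_getaddrs_old and its compat variant, and a cast through uintptr_t
stands in for compat_ptr().

```c
#include <stdint.h>

struct demo_sockaddr;	/* opaque; only the pointer width matters here */

/* Native layout: the trailing pointer is 8 bytes wide on a 64-bit build. */
struct demo_getaddrs_old {
	int32_t assoc_id;
	int addr_num;
	struct demo_sockaddr *addrs;
};

/* Layout seen by a 32-bit binary: the pointer is a 32-bit value. */
struct demo_compat_getaddrs_old {
	int32_t assoc_id;
	int32_t addr_num;
	uint32_t addrs;		/* struct sockaddr * in the 32-bit ABI */
};

/* Widen the 32-bit user structure into the native one, field by field. */
void demo_from_compat(struct demo_getaddrs_old *dst,
		      const struct demo_compat_getaddrs_old *src)
{
	dst->assoc_id = src->assoc_id;
	dst->addr_num = src->addr_num;
	dst->addrs = (struct demo_sockaddr *)(uintptr_t)src->addrs;
}
```

The compat structure is 12 bytes; on a typical 64-bit build the native one
is larger, which is why a length check against the native size rejects what
a 32-bit caller passes in.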





* [ 103/143] virtio-net: alloc big buffers also when guest can receive UFO
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (102 preceding siblings ...)
  2014-05-12  0:33 ` [ 102/143] net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 104/143] tg3: Dont check undefined error bits in RXBD Willy Tarreau
                   ` (39 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Rusty Russell, Michael S. Tsirkin, Sridhar Samudrala, Jason Wang,
	David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Jason Wang <jasowang@redhat.com>

[ Upstream commit 0e7ede80d929ff0f830c44a543daa1acd590c749 ]

We should also allocate big buffers when the guest can receive UFO
packets, so that big packets fit into the guest rx buffer.

Fixes 5c5167515d80f78f6bb538492c423adcae31ad65
(virtio-net: Allow UFO feature to be set and advertised.)

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/virtio_net.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index bf6d850..97a56f0 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -904,7 +904,8 @@ static int virtnet_probe(struct virtio_device *vdev)
 	/* If we can receive ANY GSO packets, we must allocate large ones. */
 	if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO4)
 	    || virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_TSO6)
-	    || virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_ECN))
+	    || virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_ECN)
+	    || virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_UFO))
 		vi->big_packets = true;
 
 	if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))
-- 
1.7.12.2.21.g234cd45.dirty





* [ 104/143] tg3: Dont check undefined error bits in RXBD
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (103 preceding siblings ...)
  2014-05-12  0:33 ` [ 103/143] virtio-net: alloc big buffers also when guest can receive UFO Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 105/143] net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH Willy Tarreau
                   ` (38 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Michael Chan, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Chan <mchan@broadcom.com>

[ Upstream commit d7b95315cc7f441418845a165ee56df723941487 ]

Redefine the RXD_ERR_MASK to include only relevant error bits. This fixes
a customer reported issue of randomly dropping packets on the 5719.
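
The effect of narrowing the mask can be sketched in userspace. This is a minimal illustration, not the driver code: it uses only the three error-bit values visible in the tg3.h context lines of this patch (the real mask ORs eight bits), and it assumes that bit 0x80000000 is one of the undefined bits. The `sketch_`/`drops_` names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the tg3.h context lines of this patch. */
#define RXD_ERR_TOO_SMALL	0x00400000u
#define RXD_ERR_NO_RESOURCES	0x00800000u
#define RXD_ERR_HUGE_FRAME	0x01000000u

/* Old mask: any bit in the upper half-word counts as an error. */
#define OLD_RXD_ERR_MASK	0xffff0000u

/* Illustrative subset of the new mask (the real one lists eight bits). */
#define SKETCH_RXD_ERR_MASK	(RXD_ERR_TOO_SMALL | RXD_ERR_NO_RESOURCES | \
				 RXD_ERR_HUGE_FRAME)

/* True if the descriptor would be dropped under the old catch-all mask. */
static bool drops_with_old_mask(uint32_t err_vlan)
{
	return (err_vlan & OLD_RXD_ERR_MASK) != 0;
}

/* True if the descriptor would be dropped under the explicit mask. */
static bool drops_with_sketch_mask(uint32_t err_vlan)
{
	return (err_vlan & SKETCH_RXD_ERR_MASK) != 0;
}
```

With the catch-all mask, a descriptor carrying only an undefined bit is dropped; with an explicit mask it is not, while defined error bits are still caught.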

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/tg3.c | 3 +--
 drivers/net/tg3.h | 6 +++++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 56648b4..17e8abe 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -4557,8 +4557,7 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
 
 		work_mask |= opaque_key;
 
-		if ((desc->err_vlan & RXD_ERR_MASK) != 0 &&
-		    (desc->err_vlan != RXD_ERR_ODD_NIBBLE_RCVD_MII)) {
+		if (desc->err_vlan & RXD_ERR_MASK) {
 		drop_it:
 			tg3_recycle_rx(tnapi, opaque_key,
 				       desc_idx, *post_ptr);
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index 529f55a..593f8c6 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -2219,7 +2219,11 @@ struct tg3_rx_buffer_desc {
 #define RXD_ERR_TOO_SMALL		0x00400000
 #define RXD_ERR_NO_RESOURCES		0x00800000
 #define RXD_ERR_HUGE_FRAME		0x01000000
-#define RXD_ERR_MASK			0xffff0000
+
+#define RXD_ERR_MASK	(RXD_ERR_BAD_CRC | RXD_ERR_COLLISION |		\
+			 RXD_ERR_LINK_LOST | RXD_ERR_PHY_DECODE |	\
+			 RXD_ERR_MAC_ABRT | RXD_ERR_TOO_SMALL |		\
+			 RXD_ERR_NO_RESOURCES | RXD_ERR_HUGE_FRAME)
 
 	u32				reserved;
 	u32				opaque;
-- 
1.7.12.2.21.g234cd45.dirty





* [ 105/143] net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (104 preceding siblings ...)
  2014-05-12  0:33 ` [ 104/143] tg3: Dont check undefined error bits in RXBD Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 106/143] net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk Willy Tarreau
                   ` (37 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Daniel Borkmann, Vlad Yasevich, Neil Horman, Vlad Yasevich,
	David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dborkman@redhat.com>

[ Upstream commit ec0223ec48a90cb605244b45f7c62de856403729 ]

RFC4895 introduced AUTH chunks for SCTP; during the SCTP
handshake RANDOM; CHUNKS; HMAC-ALGO are negotiated (CHUNKS
being optional though):

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------

A special case is when an endpoint requires COOKIE-ECHO
chunks to be authenticated:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  ------------------ AUTH; COOKIE-ECHO ---------------->
  <-------------------- COOKIE-ACK ---------------------

RFC4895, section 6.3. Receiving Authenticated Chunks says:

  The receiver MUST use the HMAC algorithm indicated in
  the HMAC Identifier field. If this algorithm was not
  specified by the receiver in the HMAC-ALGO parameter in
  the INIT or INIT-ACK chunk during association setup, the
  AUTH chunk and all the chunks after it MUST be discarded
  and an ERROR chunk SHOULD be sent with the error cause
  defined in Section 4.1. [...] If no endpoint pair shared
  key has been configured for that Shared Key Identifier,
  all authenticated chunks MUST be silently discarded. [...]

  When an endpoint requires COOKIE-ECHO chunks to be
  authenticated, some special procedures have to be followed
  because the reception of a COOKIE-ECHO chunk might result
  in the creation of an SCTP association. If a packet arrives
  containing an AUTH chunk as a first chunk, a COOKIE-ECHO
  chunk as the second chunk, and possibly more chunks after
  them, and the receiver does not have an STCB for that
  packet, then authentication is based on the contents of
  the COOKIE-ECHO chunk. In this situation, the receiver MUST
  authenticate the chunks in the packet by using the RANDOM
  parameters, CHUNKS parameters and HMAC_ALGO parameters
  obtained from the COOKIE-ECHO chunk, and possibly a local
  shared secret as inputs to the authentication procedure
  specified in Section 6.3. If authentication fails, then
  the packet is discarded. If the authentication is successful,
  the COOKIE-ECHO and all the chunks after the COOKIE-ECHO
  MUST be processed. If the receiver has an STCB, it MUST
  process the AUTH chunk as described above using the STCB
  from the existing association to authenticate the
  COOKIE-ECHO chunk and all the chunks after it. [...]

Commit bbd0d59809f9 introduced the reception and
verification of AUTH chunks, including the edge case for
authenticated COOKIE-ECHO. On reception of COOKIE-ECHO,
the function sctp_sf_do_5_1D_ce() handles processing,
unpacks and creates a new association if it passed sanity
checks and also tests for authentication chunks being
present. After a new association has been processed, it
invokes sctp_process_init() on the new association and
walks through the parameter list it received from the INIT
chunk. It checks SCTP_PARAM_RANDOM, SCTP_PARAM_HMAC_ALGO
and SCTP_PARAM_CHUNKS, and copies them into asoc->peer
meta data (peer_random, peer_hmacs, peer_chunks) in case
sysctl -w net.sctp.auth_enable=1 is set. If in INIT's
SCTP_PARAM_SUPPORTED_EXT parameter SCTP_CID_AUTH is set,
peer_random != NULL and peer_hmacs != NULL the peer is to be
assumed asoc->peer.auth_capable=1, in any other case
asoc->peer.auth_capable=0.

Now, if chunk->auth_chunk is available in sctp_sf_do_5_1D_ce(),
we set up a fake auth chunk and pass it on to
sctp_sf_authenticate(), which at the latest in
sctp_auth_calculate_hmac() reliably dereferences a NULL pointer
at position 0..0008 when setting up the crypto key in
crypto_hash_setkey() using asoc->asoc_shared_key. That key is
NULL, while the condition key_id == asoc->active_key_id is true
if the AUTH chunk was injected correctly from remote. This
happens no matter what the net.sctp.auth_enable sysctl says.

The fix is to check for net->sctp.auth_enable and for
asoc->peer.auth_capable before doing any operations like
sctp_sf_authenticate() as no key is activated in
sctp_auth_asoc_init_active_key() for each case.

Now as RFC4895 section 6.3 states that if the used HMAC-ALGO
passed from the INIT chunk was not used in the AUTH chunk, we
SHOULD send an error; however in this case it would be better
to just silently discard such a maliciously prepared handshake
as we didn't even receive a parameter at all. Also, as our
endpoint has no shared key configured, section 6.3 says that
we MUST silently discard, which we are doing from now onwards.

Before calling sctp_sf_pdiscard(), we need not only to free
the association, but also the chunk->auth_chunk skb, as
commit bbd0d59809f9 created a skb clone in that case.

I have tested this locally by using netfilter's nfqueue and
re-injecting packets into the local stack after maliciously
modifying the INIT chunk (removing RANDOM; HMAC-ALGO param)
and the SCTP packet containing the COOKIE_ECHO (injecting
AUTH chunk before COOKIE_ECHO). Fixed with this patch applied.

Fixes: bbd0d59809f9 ("[SCTP]: Implement the receive and verification of AUTH chunk")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <yasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/sctp/sm_statefuns.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 486df56..d43002b 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -745,6 +745,13 @@ sctp_disposition_t sctp_sf_do_5_1D_ce(const struct sctp_endpoint *ep,
 		struct sctp_chunk auth;
 		sctp_ierror_t ret;
 
+		/* Make sure that we and the peer are AUTH capable */
+		if (!sctp_auth_enable || !new_asoc->peer.auth_capable) {
+			kfree_skb(chunk->auth_chunk);
+			sctp_association_free(new_asoc);
+			return sctp_sf_pdiscard(ep, asoc, type, arg, commands);
+		}
+
 		/* set-up our fake chunk so that we can process it */
 		auth.skb = chunk->auth_chunk;
 		auth.asoc = chunk->asoc;
-- 
1.7.12.2.21.g234cd45.dirty





* [ 106/143] net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (105 preceding siblings ...)
  2014-05-12  0:33 ` [ 105/143] net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 107/143] net: socket: error on a negative msg_namelen Willy Tarreau
                   ` (36 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Daniel Borkmann, Vlad Yasevich, Neil Horman, Vlad Yasevich,
	David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dborkman@redhat.com>

[ Upstream commit c485658bae87faccd7aed540fd2ca3ab37992310 ]

While working on ec0223ec48a9 ("net: sctp: fix sctp_sf_do_5_1D_ce to
verify if we/peer is AUTH capable"), we noticed that there's a skb
memory leakage in the error path.

Running the same reproducer as in ec0223ec48a9 and by unconditionally
jumping to the error label (to simulate an error condition) in
sctp_sf_do_5_1D_ce() receive path lets kmemleak detector bark about
the unfreed chunk->auth_chunk skb clone:

Unreferenced object 0xffff8800b8f3a000 (size 256):
  comm "softirq", pid 0, jiffies 4294769856 (age 110.757s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    89 ab 75 5e d4 01 58 13 00 00 00 00 00 00 00 00  ..u^..X.........
  backtrace:
    [<ffffffff816660be>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff8119f328>] kmem_cache_alloc+0xc8/0x210
    [<ffffffff81566929>] skb_clone+0x49/0xb0
    [<ffffffffa0467459>] sctp_endpoint_bh_rcv+0x1d9/0x230 [sctp]
    [<ffffffffa046fdbc>] sctp_inq_push+0x4c/0x70 [sctp]
    [<ffffffffa047e8de>] sctp_rcv+0x82e/0x9a0 [sctp]
    [<ffffffff815abd38>] ip_local_deliver_finish+0xa8/0x210
    [<ffffffff815a64af>] nf_reinject+0xbf/0x180
    [<ffffffffa04b4762>] nfqnl_recv_verdict+0x1d2/0x2b0 [nfnetlink_queue]
    [<ffffffffa04aa40b>] nfnetlink_rcv_msg+0x14b/0x250 [nfnetlink]
    [<ffffffff815a3269>] netlink_rcv_skb+0xa9/0xc0
    [<ffffffffa04aa7cf>] nfnetlink_rcv+0x23f/0x408 [nfnetlink]
    [<ffffffff815a2bd8>] netlink_unicast+0x168/0x250
    [<ffffffff815a2fa1>] netlink_sendmsg+0x2e1/0x3f0
    [<ffffffff8155cc6b>] sock_sendmsg+0x8b/0xc0
    [<ffffffff8155d449>] ___sys_sendmsg+0x369/0x380

What happens is that commit bbd0d59809f9 clones the skb containing
the AUTH chunk in sctp_endpoint_bh_rcv() when having the edge case
that an endpoint requires COOKIE-ECHO chunks to be authenticated:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  ------------------ AUTH; COOKIE-ECHO ---------------->
  <-------------------- COOKIE-ACK ---------------------

When we enter sctp_sf_do_5_1D_ce() and before we actually get to
the point where we process (and subsequently free) a non-NULL
chunk->auth_chunk, we could hit the "goto nomem_init" path from
an error condition and thus leave the cloned skb around w/o
freeing it.

The fix is to centrally free such clones in sctp_chunk_destroy()
handler that is invoked from sctp_chunk_free() after all refs have
dropped; and also move both kfree_skb(chunk->auth_chunk) there,
so that chunk->auth_chunk is either NULL (since sctp_chunkify()
allocs new chunks through kmem_cache_zalloc()) or non-NULL with
a valid skb pointer. chunk->skb and chunk->auth_chunk are the
only skbs in the sctp_chunk structure that need to be handled.

While at it, we should use consume_skb() for both. It is the same
as dev_kfree_skb() but more appropriately named, as we are not
a device but a protocol. This also effectively replaces the
kfree_skb() in both invocations with consume_skb(). The functions
are the same, except that kfree_skb() assumes that the frame was
dropped after a failure (e.g. for tools like drop monitor);
consume_skb() seems more appropriate in sctp_chunk_destroy().

Fixes: bbd0d59809f9 ("[SCTP]: Implement the receive and verification of AUTH chunk")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <yasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/sctp/sm_make_chunk.c | 4 ++--
 net/sctp/sm_statefuns.c  | 4 ----
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index feedee7..22d4ed8 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1356,8 +1356,8 @@ static void sctp_chunk_destroy(struct sctp_chunk *chunk)
 	BUG_ON(!list_empty(&chunk->list));
 	list_del_init(&chunk->transmitted_list);
 
-	/* Free the chunk skb data and the SCTP_chunk stub itself. */
-	dev_kfree_skb(chunk->skb);
+	consume_skb(chunk->skb);
+	consume_skb(chunk->auth_chunk);
 
 	SCTP_DBG_OBJCNT_DEC(chunk);
 	kmem_cache_free(sctp_chunk_cachep, chunk);
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index d43002b..6da0171 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -762,10 +762,6 @@ sctp_disposition_t sctp_sf_do_5_1D_ce(const struct sctp_endpoint *ep,
 		auth.transport = chunk->transport;
 
 		ret = sctp_sf_authenticate(ep, new_asoc, type, &auth);
-
-		/* We can now safely free the auth_chunk clone */
-		kfree_skb(chunk->auth_chunk);
-
 		if (ret != SCTP_IERROR_NO_ERROR) {
 			sctp_association_free(new_asoc);
 			return sctp_sf_pdiscard(ep, asoc, type, arg, commands);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 107/143] net: socket: error on a negative msg_namelen
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (106 preceding siblings ...)
  2014-05-12  0:33 ` [ 106/143] net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 108/143] netlink: dont compare the nul-termination in nla_strcmp Willy Tarreau
                   ` (35 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Matthew Leach, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Matthew Leach <matthew.leach@arm.com>

[ Upstream commit dbb490b96584d4e958533fb637f08b557f505657 ]

When copying in a struct msghdr from the user, if the user has set the
msg_namelen parameter to a negative value it gets clamped to a valid
size due to a comparison between signed and unsigned values.

Ensure the syscall errors when the user passes in a negative value.
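
The clamping described above comes from C's usual arithmetic conversions: when an int is compared with a size_t, the int is converted to unsigned first. A minimal userspace sketch, assuming a 128-byte stand-in for sizeof(struct sockaddr_storage); the `SKETCH_`/`clamp_only`/`check_then_clamp` names are hypothetical and the explicit casts only spell out what the implicit conversion does:

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for sizeof(struct sockaddr_storage); the value is an assumption. */
#define SKETCH_STORAGE_SIZE 128

/* Pre-patch logic: a negative length converts to a huge unsigned value,
 * so it is clamped to the maximum instead of being rejected. */
static int clamp_only(int namelen)
{
	if ((size_t)namelen > (size_t)SKETCH_STORAGE_SIZE)
		namelen = SKETCH_STORAGE_SIZE;
	return namelen;
}

/* Post-patch logic: reject negative lengths before the unsigned compare. */
static int check_then_clamp(int namelen)
{
	if (namelen < 0)
		return -EINVAL;
	if ((size_t)namelen > (size_t)SKETCH_STORAGE_SIZE)
		namelen = SKETCH_STORAGE_SIZE;
	return namelen;
}
```

Here clamp_only(-1) silently yields the maximum size, whereas check_then_clamp(-1) reports an error, which is the behaviour the patch introduces.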

Signed-off-by: Matthew Leach <matthew.leach@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/socket.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/socket.c b/net/socket.c
index 0823497..bc151b8 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1871,6 +1871,10 @@ static int copy_msghdr_from_user(struct msghdr *kmsg,
 {
 	if (copy_from_user(kmsg, umsg, sizeof(struct msghdr)))
 		return -EFAULT;
+
+	if (kmsg->msg_namelen < 0)
+		return -EINVAL;
+
 	if (kmsg->msg_namelen > sizeof(struct sockaddr_storage))
 		kmsg->msg_namelen = sizeof(struct sockaddr_storage);
 	return 0;
-- 
1.7.12.2.21.g234cd45.dirty





* [ 108/143] netlink: dont compare the nul-termination in nla_strcmp
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (107 preceding siblings ...)
  2014-05-12  0:33 ` [ 107/143] net: socket: error on a negative msg_namelen Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 109/143] isdnloop: several buffer overflows Willy Tarreau
                   ` (34 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Florian Westphal, Thomas Graf, Pablo Neira Ayuso,
	David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Pablo Neira <pablo@netfilter.org>

[ Upstream commit 8b7b932434f5eee495b91a2804f5b64ebb2bc835 ]

nla_strcmp compares the string length plus one, so it's implicitly
including the nul-termination in the comparison.

 int nla_strcmp(const struct nlattr *nla, const char *str)
 {
        int len = strlen(str) + 1;
        ...
                d = memcmp(nla_data(nla), str, len);

However, if NLA_STRING is used, userspace can send us a string without
the nul-termination. This is a problem since the string
comparison will not match, as the last byte may not be the
nul-termination.

Fix this by skipping the comparison of the nul-termination if the
attribute data is nul-terminated. Suggested by Thomas Graf.
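
The patched comparison can be modelled in userspace as follows. This is a sketch, not the kernel function: `data` and `attrlen` stand in for nla_data(nla) and nla_len(nla), and the name `sketch_nla_strcmp` is hypothetical.

```c
#include <string.h>

/* Userspace model of the patched nla_strcmp(): compare an attribute
 * payload of attrlen bytes against a C string, ignoring one trailing
 * NUL in the payload if it is present. */
static int sketch_nla_strcmp(const char *data, int attrlen, const char *str)
{
	int len = (int)strlen(str);
	int d;

	/* Drop the terminator from the payload length, if there is one. */
	if (attrlen > 0 && data[attrlen - 1] == '\0')
		attrlen--;

	d = attrlen - len;
	if (d == 0)
		d = memcmp(data, str, (size_t)len);

	return d;
}
```

Both a 4-byte payload "eth0" and a 5-byte payload "eth0\0" now compare equal to the string "eth0".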

Cc: Florian Westphal <fw@strlen.de>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 lib/nlattr.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/lib/nlattr.c b/lib/nlattr.c
index 109d4fe..51b84de 100644
--- a/lib/nlattr.c
+++ b/lib/nlattr.c
@@ -299,9 +299,15 @@ int nla_memcmp(const struct nlattr *nla, const void *data,
  */
 int nla_strcmp(const struct nlattr *nla, const char *str)
 {
-	int len = strlen(str) + 1;
-	int d = nla_len(nla) - len;
+	int len = strlen(str);
+	char *buf = nla_data(nla);
+	int attrlen = nla_len(nla);
+	int d;
 
+	if (attrlen > 0 && buf[attrlen - 1] == '\0')
+		attrlen--;
+
+	d = attrlen - len;
 	if (d == 0)
 		d = memcmp(nla_data(nla), str, len);
 
-- 
1.7.12.2.21.g234cd45.dirty





* [ 109/143] isdnloop: several buffer overflows
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (108 preceding siblings ...)
  2014-05-12  0:33 ` [ 108/143] netlink: dont compare the nul-termination in nla_strcmp Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 110/143] rds: prevent dereference of a NULL device in rds_iw_laddr_check Willy Tarreau
                   ` (33 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Dan Carpenter, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

[ Upstream commit 7563487cbf865284dcd35e9ef5a95380da046737 ]

There are three buffer overflows addressed in this patch.

1) In isdnloop_fake_err() we add an 'E' to a 60 character string and
then copy it into a 60 character buffer.  I have made the destination
buffer 64 characters and changed the sprintf() to a snprintf().

2) In isdnloop_parse_cmd(), p points 6 characters into a 60
character buffer so we have 54 characters.  The ->eazlist[] is 11
characters long.  I have modified the code to return if the source
buffer is too long.

3) In isdnloop_command() the cbuf[] array was 60 characters long but the
max length of the string can be up to 79 characters.  I made the
cbuf array 80 characters long and changed the sprintf() to snprintf().
I also removed the temporary "dial" buffer and changed it to use "p"
directly.

Unfortunately, we pass the "cbuf" string from isdnloop_command() to
isdnloop_writecmd() which truncates anything over 60 characters to make
it fit in card->omsg[].  (It can accept values up to 255 characters so
long as there is a '\n' character every 60 characters).  For now I have
just fixed the memory corruption bug and left the other problems in this
driver alone.
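
The two techniques used for fixes 1) and 2) can be sketched in userspace. This is an illustration under assumptions, not the driver code: the buffer sizes (64 and 11) are taken from the description above, and the `sketch_` names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Fix 1: bounded formatting. snprintf() never writes more than
 * sizeof(buf) bytes, terminator included, so an over-long message is
 * truncated instead of overflowing. Returns the stored length. */
static size_t sketch_fake_err_len(const char *msg)
{
	char buf[64];

	snprintf(buf, sizeof(buf), "E%s", msg);
	return strlen(buf);
}

/* Fix 2: check the source length before an unbounded strcpy(). */
static char sketch_eaz[11];

static int sketch_set_eaz(const char *p)
{
	if (strlen(p) >= sizeof(sketch_eaz))
		return -1;	/* too long: leave the destination untouched */
	strcpy(sketch_eaz, p);
	return 0;
}
```

Note that truncation keeps memory safe but can still produce a malformed command, which matches the caveat above about the remaining problems in the driver.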

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/isdn/isdnloop/isdnloop.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/isdn/isdnloop/isdnloop.c b/drivers/isdn/isdnloop/isdnloop.c
index 92d895f..bf4168b 100644
--- a/drivers/isdn/isdnloop/isdnloop.c
+++ b/drivers/isdn/isdnloop/isdnloop.c
@@ -517,9 +517,9 @@ static isdnloop_stat isdnloop_cmd_table[] =
 static void
 isdnloop_fake_err(isdnloop_card * card)
 {
-	char buf[60];
+	char buf[64];
 
-	sprintf(buf, "E%s", card->omsg);
+	snprintf(buf, sizeof(buf), "E%s", card->omsg);
 	isdnloop_fake(card, buf, -1);
 	isdnloop_fake(card, "NAK", -1);
 }
@@ -902,6 +902,8 @@ isdnloop_parse_cmd(isdnloop_card * card)
 		case 7:
 			/* 0x;EAZ */
 			p += 3;
+			if (strlen(p) >= sizeof(card->eazlist[0]))
+				break;
 			strcpy(card->eazlist[ch - 1], p);
 			break;
 		case 8:
@@ -1126,7 +1128,7 @@ isdnloop_command(isdn_ctrl * c, isdnloop_card * card)
 {
 	ulong a;
 	int i;
-	char cbuf[60];
+	char cbuf[80];
 	isdn_ctrl cmd;
 	isdnloop_cdef cdef;
 
@@ -1191,7 +1193,6 @@ isdnloop_command(isdn_ctrl * c, isdnloop_card * card)
 				break;
 			if ((c->arg & 255) < ISDNLOOP_BCH) {
 				char *p;
-				char dial[50];
 				char dcode[4];
 
 				a = c->arg;
@@ -1203,10 +1204,10 @@ isdnloop_command(isdn_ctrl * c, isdnloop_card * card)
 				} else
 					/* Normal Dial */
 					strcpy(dcode, "CAL");
-				strcpy(dial, p);
-				sprintf(cbuf, "%02d;D%s_R%s,%02d,%02d,%s\n", (int) (a + 1),
-					dcode, dial, c->parm.setup.si1,
-				c->parm.setup.si2, c->parm.setup.eazmsn);
+				snprintf(cbuf, sizeof(cbuf),
+					 "%02d;D%s_R%s,%02d,%02d,%s\n", (int) (a + 1),
+					 dcode, p, c->parm.setup.si1,
+					 c->parm.setup.si2, c->parm.setup.eazmsn);
 				i = isdnloop_writecmd(cbuf, strlen(cbuf), 0, card);
 			}
 			break;
-- 
1.7.12.2.21.g234cd45.dirty





* [ 110/143] rds: prevent dereference of a NULL device in rds_iw_laddr_check
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (109 preceding siblings ...)
  2014-05-12  0:33 ` [ 109/143] isdnloop: several buffer overflows Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 111/143] isdnloop: Validate NUL-terminated strings from user Willy Tarreau
                   ` (32 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Sasha Levin, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Sasha Levin <sasha.levin@oracle.com>

[ Upstream commit bf39b4247b8799935ea91d90db250ab608a58e50 ]

Binding might result in a NULL device which is later dereferenced
without checking.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/rds/iw.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/rds/iw.c b/net/rds/iw.c
index db224f7..bff1e4b 100644
--- a/net/rds/iw.c
+++ b/net/rds/iw.c
@@ -237,7 +237,8 @@ static int rds_iw_laddr_check(__be32 addr)
 	ret = rdma_bind_addr(cm_id, (struct sockaddr *)&sin);
 	/* due to this, we will claim to support IB devices unless we
 	   check node_type. */
-	if (ret || cm_id->device->node_type != RDMA_NODE_RNIC)
+	if (ret || !cm_id->device ||
+	    cm_id->device->node_type != RDMA_NODE_RNIC)
 		ret = -EADDRNOTAVAIL;
 
 	rdsdebug("addr %pI4 ret %d node type %d\n",
-- 
1.7.12.2.21.g234cd45.dirty





* [ 111/143] isdnloop: Validate NUL-terminated strings from user.
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (110 preceding siblings ...)
  2014-05-12  0:33 ` [ 110/143] rds: prevent dereference of a NULL device in rds_iw_laddr_check Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 112/143] sctp: unbalanced rcu lock in ip_queue_xmit() Willy Tarreau
                   ` (31 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: YOSHIFUJI Hideaki, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[ Upstream commit 77bc6bed7121936bb2e019a8c336075f4c8eef62 ]

Return -EINVAL unless all user-given strings are correctly
NUL-terminated.
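
The check relies on memchr() finding a NUL byte within the fixed size of each field. A minimal userspace sketch of that test for a single field; the helper name `sketch_is_terminated` is hypothetical, and the patch applies the same test to each of the three sdef.num[] entries:

```c
#include <errno.h>
#include <string.h>

/* Returns 0 if the fixed-size field contains a NUL terminator within
 * its first `size` bytes, -EINVAL otherwise. */
static int sketch_is_terminated(const char *field, size_t size)
{
	if (!memchr(field, 0, size))
		return -EINVAL;
	return 0;
}
```

Unlike strlen(), memchr() never reads past the given size, so it is safe to run on data that may lack a terminator.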

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/isdn/isdnloop/isdnloop.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/isdn/isdnloop/isdnloop.c b/drivers/isdn/isdnloop/isdnloop.c
index bf4168b..4267d48 100644
--- a/drivers/isdn/isdnloop/isdnloop.c
+++ b/drivers/isdn/isdnloop/isdnloop.c
@@ -1071,6 +1071,12 @@ isdnloop_start(isdnloop_card * card, isdnloop_sdef * sdefp)
 		return -EBUSY;
 	if (copy_from_user((char *) &sdef, (char *) sdefp, sizeof(sdef)))
 		return -EFAULT;
+
+	for (i = 0; i < 3; i++) {
+		if (!memchr(sdef.num[i], 0, sizeof(sdef.num[i])))
+			return -EINVAL;
+	}
+
 	spin_lock_irqsave(&card->isdnloop_lock, flags);
 	switch (sdef.ptype) {
 		case ISDN_PTYPE_EURO:
-- 
1.7.12.2.21.g234cd45.dirty





* [ 112/143] sctp: unbalanced rcu lock in ip_queue_xmit()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (111 preceding siblings ...)
  2014-05-12  0:33 ` [ 111/143] isdnloop: Validate NUL-terminated strings from user Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 113/143] aacraid: prevent invalid pointer dereference Willy Tarreau
                   ` (30 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Nicolas Dichtel, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>

The bug was introduced in 2.6.32.61 by commit b8710128e201 ("inet: add RCU
protection to inet->opt") (it's a backport of upstream commit f6d8bd051c39).

In the SCTP case, the packet is already routed, hence we jump to the
label 'packet_routed', but without having called rcu_read_lock(). After
this label, rcu_read_unlock() is called unconditionally.

Spotted-by: Guo Fengtian <fengtian.guo@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv4/ip_output.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 7dde039..2cd69e3 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -320,13 +320,13 @@ int ip_queue_xmit(struct sk_buff *skb, int ipfragok)
 	/* Skip all of this if the packet is already routed,
 	 * f.e. by something like SCTP.
 	 */
+	rcu_read_lock();
 	rt = skb_rtable(skb);
 	if (rt != NULL)
 		goto packet_routed;
 
 	/* Make sure we can route this packet. */
 	rt = (struct rtable *)__sk_dst_check(sk, 0);
-	rcu_read_lock();
 	inet_opt = rcu_dereference(inet->inet_opt);
 	if (rt == NULL) {
 		__be32 daddr;
-- 
1.7.12.2.21.g234cd45.dirty





* [ 113/143] aacraid: prevent invalid pointer dereference
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (112 preceding siblings ...)
  2014-05-12  0:33 ` [ 112/143] sctp: unbalanced rcu lock in ip_queue_xmit() Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 114/143] ipv6: udp packets following an UFO enqueued packet need also be Willy Tarreau
                   ` (29 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Mahesh Rajashekhara, Linus Torvalds, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Mahesh Rajashekhara <Mahesh.Rajashekhara@pmcs.com>

It appears that the driver runs into a problem here if fibsize is too
small, because we allocate user_srbcmd with only fibsize bytes but later
access it up to user_srbcmd->sg.count when copying it over to srbcmd.

It is not correct to test (fibsize < sizeof(*user_srbcmd)) because this
structure already includes one sg element, which is not needed for
commands without data.  So we recommend adding the following check
(instead of a test for fibsize == 0).
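
The lower bound can be illustrated with a simplified layout. The structs
below are hypothetical stand-ins (the real user_aac_srb has more fields
and a nested sg map); the sketch only shows why the minimum is the
structure size minus one embedded sg element.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real layouts live in the aacraid driver. */
struct model_sgentry {
	uint32_t addr;
	uint32_t count;
};

struct model_srb {
	uint32_t function;
	uint32_t flags;
	uint32_t sg_count;		/* number of valid entries in sg[] */
	struct model_sgentry sg[1];	/* one entry is always embedded */
};

/* Smallest request that still contains every field up to sg_count. */
static size_t model_min_fibsize(void)
{
	return sizeof(struct model_srb) - sizeof(struct model_sgentry);
}

/* Accept only sizes between the header-only minimum and the payload limit. */
static bool model_fibsize_ok(size_t fibsize, size_t max_payload)
{
	return fibsize >= model_min_fibsize() && fibsize <= max_payload;
}
```

A command without data may legitimately be as small as
model_min_fibsize(), which is why comparing against the full structure
size would reject valid requests.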

Signed-off-by: Mahesh Rajashekhara <Mahesh.Rajashekhara@pmcs.com>
Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit b4789b8e6be3151a955ade74872822f30e8cd914)

CVE-2013-6380
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/scsi/aacraid/commctrl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/aacraid/commctrl.c b/drivers/scsi/aacraid/commctrl.c
index a5b8e7b..c895174 100644
--- a/drivers/scsi/aacraid/commctrl.c
+++ b/drivers/scsi/aacraid/commctrl.c
@@ -507,7 +507,8 @@ static int aac_send_raw_srb(struct aac_dev* dev, void __user * arg)
 		goto cleanup;
 	}
 
-	if (fibsize > (dev->max_fib_size - sizeof(struct aac_fibhdr))) {
+	if ((fibsize < (sizeof(struct user_aac_srb) - sizeof(struct user_sgentry))) ||
+	    (fibsize > (dev->max_fib_size - sizeof(struct aac_fibhdr)))) {
 		rcode = -EINVAL;
 		goto cleanup;
 	}
-- 
1.7.12.2.21.g234cd45.dirty





* [ 114/143] ipv6: udp packets following an UFO enqueued packet need also be
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (113 preceding siblings ...)
  2014-05-12  0:33 ` [ 113/143] aacraid: prevent invalid pointer dereference Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 115/143] inet: fix possible memory corruption with UDP_CORK and UFO Willy Tarreau
                   ` (28 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: YOSHIFUJI Hideaki, Hannes Frederic Sowa, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------
 handled by UFO

From: Hannes Frederic Sowa <hannes@stressinduktion.org>

In the following scenario the socket is corked:
If the first UDP packet is larger than the MTU, we try to append it to
the write queue via ip6_ufo_append_data. A following packet, which is
smaller than the MTU, would be appended to the already queued-up gso-skb
via plain ip6_append_data. This causes random memory corruption.

In ip6_ufo_append_data we also have to be careful not to queue up the
same skb multiple times. So set up the gso frame only when no first skb
is available.

This also fixes a shortcoming where we add the current packet's length to
cork->length but return early because of a packet > mtu with dontfrag set
(instead of subtracting it again).

Found with trinity.

Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 2811ebac2521ceac84f2bdae402455baa6a7fb47)
[wt: 2.6.32 doesn't have dontfrag so remove the optimization]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv6/ip6_output.c | 31 ++++++++++++-------------------
 1 file changed, 12 insertions(+), 19 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 6ff4d07..5a1b5bc 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1086,6 +1086,8 @@ static inline int ip6_ufo_append_data(struct sock *sk,
 	 * udp datagram
 	 */
 	if ((skb = skb_peek_tail(&sk->sk_write_queue)) == NULL) {
+		struct frag_hdr fhdr;
+
 		skb = sock_alloc_send_skb(sk,
 			hh_len + fragheaderlen + transhdrlen + 20,
 			(flags & MSG_DONTWAIT), &err);
@@ -1107,12 +1109,6 @@ static inline int ip6_ufo_append_data(struct sock *sk,
 		skb->ip_summed = CHECKSUM_PARTIAL;
 		skb->csum = 0;
 		sk->sk_sndmsg_off = 0;
-	}
-
-	err = skb_append_datato_frags(sk,skb, getfrag, from,
-				      (length - transhdrlen));
-	if (!err) {
-		struct frag_hdr fhdr;
 
 		/* Specify the length of each IPv6 datagram fragment.
 		 * It has to be a multiple of 8.
@@ -1123,15 +1119,10 @@ static inline int ip6_ufo_append_data(struct sock *sk,
 		ipv6_select_ident(&fhdr, rt);
 		skb_shinfo(skb)->ip6_frag_id = fhdr.identification;
 		__skb_queue_tail(&sk->sk_write_queue, skb);
-
-		return 0;
 	}
-	/* There is not enough support do UPD LSO,
-	 * so follow normal path
-	 */
-	kfree_skb(skb);
 
-	return err;
+	return skb_append_datato_frags(sk, skb, getfrag, from,
+				       (length - transhdrlen));
 }
 
 static inline struct ipv6_opt_hdr *ip6_opt_dup(struct ipv6_opt_hdr *src,
@@ -1264,18 +1255,20 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 	 */
 
 	inet->cork.length += length;
-	if (((length > mtu) && (sk->sk_protocol == IPPROTO_UDP)) &&
+	skb = skb_peek_tail(&sk->sk_write_queue);
+	if (((length > mtu) ||
+	     (skb && skb_is_gso(skb))) &&
+	    (sk->sk_protocol == IPPROTO_UDP) &&
 	    (rt->u.dst.dev->features & NETIF_F_UFO)) {
-
-		err = ip6_ufo_append_data(sk, getfrag, from, length, hh_len,
-					  fragheaderlen, transhdrlen, mtu,
-					  flags, rt);
+		err = ip6_ufo_append_data(sk, getfrag, from, length,
+					  hh_len, fragheaderlen,
+					  transhdrlen, mtu, flags, rt);
 		if (err)
 			goto error;
 		return 0;
 	}
 
-	if ((skb = skb_peek_tail(&sk->sk_write_queue)) == NULL)
+	if (!skb)
 		goto alloc_new_skb;
 
 	while (length > 0) {
-- 
1.7.12.2.21.g234cd45.dirty





* [ 115/143] inet: fix possible memory corruption with UDP_CORK and UFO
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (114 preceding siblings ...)
  2014-05-12  0:33 ` [ 114/143] ipv6: udp packets following an UFO enqueued packet need also be Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 116/143] vm: add vm_iomap_memory() helper function Willy Tarreau
                   ` (27 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Jiri Pirko, Eric Dumazet, David Miller, Hannes Frederic Sowa,
	Greg Kroah-Hartman, Ben Hutchings, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <hannes@stressinduktion.org>

[ This is a simplified -stable version of a set of upstream commits. ]

This is a replacement patch only for stable which does fix the problems
handled by the following two commits in -net:

"ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d748be887cd7639b113ba7d7ef792a7efb9)
"ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b)

Three frames are written on a corked udp socket for which the output
netdevice has UFO enabled.  If the first and third frames are smaller than
the mtu and the second one is bigger, we enqueue the second frame with
skb_append_datato_frags without initializing the gso fields. This leads
to the third frame being appended regularly, which constructs an invalid
skb.

This fixes the problem by always using skb_append_datato_frags as soon
as the first frag got enqueued to the skb without marking the packet
as SKB_GSO_UDP.
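
The three-frame scenario can be replayed with a small user-space model.
Everything here is a simplification made for illustration: struct
model_skb, its fields and both functions are hypothetical, and the
has_frags field only models "a frag was already enqueued", not what any
particular kernel helper inspects.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the tail skb on the cork queue. */
struct model_skb {
	bool gso_initialised;	/* set only when the UFO path allocated it */
	bool has_frags;		/* data was appended via the frags path */
	bool mixed;		/* regular append landed on a frags skb */
};

enum predicate { CHECK_GSO, CHECK_FRAGS };

/* Decide whether a frame of length len goes through the UFO path. */
static bool take_ufo_path(const struct model_skb *tail, size_t len,
			  size_t mtu, enum predicate p)
{
	if (len > mtu)
		return true;
	if (tail == NULL)
		return false;
	return p == CHECK_GSO ? tail->gso_initialised : tail->has_frags;
}

/* Feed n frames into one corked queue; report whether the skb got mixed. */
static bool queue_gets_mixed(const size_t *lens, size_t n, size_t mtu,
			     enum predicate p)
{
	struct model_skb skb = { false, false, false };
	bool queued = false;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct model_skb *tail = queued ? &skb : NULL;

		if (take_ufo_path(tail, lens[i], mtu, p)) {
			if (!queued)
				skb.gso_initialised = true;
			skb.has_frags = true;
		} else if (skb.has_frags) {
			skb.mixed = true;
		}
		queued = true;
	}
	return skb.mixed;
}
```

With frames of 500, 3000 and 500 bytes and an mtu of 1500, the model
reports a mixed skb for CHECK_GSO and none for CHECK_FRAGS. The two-frame
case (3000 then 500) is not mixed under either predicate in this model.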

The problem with only two frames for ipv6 was fixed by "ipv6: udp
packets following an UFO enqueued packet need also be handled by UFO"
(2811ebac2521ceac84f2bdae402455baa6a7fb47).

Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
(cherry picked from commit 5124ae99ac8a8f63d0fca9b75adaef40b20678ff)
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv4/ip_output.c  | 2 +-
 net/ipv6/ip6_output.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 2cd69e3..faa6623 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -875,7 +875,7 @@ int ip_append_data(struct sock *sk,
 	skb = skb_peek_tail(&sk->sk_write_queue);
 
 	inet->cork.length += length;
-	if (((length > mtu) || (skb && skb_is_gso(skb))) &&
+	if (((length > mtu) || (skb && skb_has_frags(skb))) &&
 	    (sk->sk_protocol == IPPROTO_UDP) &&
 	    (rt->u.dst.dev->features & NETIF_F_UFO)) {
 		err = ip_ufo_append_data(sk, getfrag, from, length, hh_len,
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 5a1b5bc..6dff3d7 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1257,7 +1257,7 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 	inet->cork.length += length;
 	skb = skb_peek_tail(&sk->sk_write_queue);
 	if (((length > mtu) ||
-	     (skb && skb_is_gso(skb))) &&
+	     (skb && skb_has_frags(skb))) &&
 	    (sk->sk_protocol == IPPROTO_UDP) &&
 	    (rt->u.dst.dev->features & NETIF_F_UFO)) {
 		err = ip6_ufo_append_data(sk, getfrag, from, length,
-- 
1.7.12.2.21.g234cd45.dirty





* [ 116/143] vm: add vm_iomap_memory() helper function
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (115 preceding siblings ...)
  2014-05-12  0:33 ` [ 115/143] inet: fix possible memory corruption with UDP_CORK and UFO Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 117/143] Fix a few incorrectly checked [io_]remap_pfn_range() calls Willy Tarreau
                   ` (26 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Greg Kroah-Hartman, Linus Torvalds, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <torvalds@linux-foundation.org>

Various drivers end up replicating the code to mmap() their memory
buffers into user space, and our core memory remapping function may be
very flexible but it is unnecessarily complicated for the common cases
to use.

Our internal VM uses pfn's ("page frame numbers") which simplifies
things for the VM, and allows us to pass physical addresses around in a
denser and more efficient format than passing a "phys_addr_t" around,
and having to shift it up and down by the page size.  But it just means
that drivers end up doing that shifting instead at the interface level.

It also means that drivers end up mucking around with internal VM things
like the vma details (vm_pgoff, vm_start/end) way more than they really
need to.

So this just exports a function to map a certain physical memory range
into user space (using a phys_addr_t based interface that is much more
natural for a driver) and hides all the complexity from the driver.
Some drivers will still end up tweaking the vm_page_prot details for
things like prefetching or cacheability etc, but that's actually
relevant to the driver, rather than caring about what the page offset of
the mapping is into the particular IO memory region.
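
The page arithmetic the helper performs can be tried out in user space.
The sketch below follows the same steps as the function in this patch,
under stated assumptions: 4 KiB pages, uint64_t in place of phys_addr_t,
and plain parameters in place of the vma fields. All MODEL_* names and
model_iomap_range() are hypothetical.

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12			/* assumption: 4 KiB pages */
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)
#define MODEL_PAGE_MASK  (~(MODEL_PAGE_SIZE - 1))
#define MODEL_EINVAL     22

/*
 * start/len describe the physical area, pgoff is the page offset asked
 * for by the mapping and vm_len its size in bytes.  On success the first
 * page frame number to map is stored in *first_pfn.
 */
static int model_iomap_range(uint64_t start, unsigned long len,
			     unsigned long pgoff, unsigned long vm_len,
			     unsigned long *first_pfn)
{
	unsigned long pfn, pages;

	/* Reject a physical range that wraps around */
	if (start + len < start)
		return -MODEL_EINVAL;

	/* Account for a start address that is not page-aligned */
	len += (unsigned long)(start & ~MODEL_PAGE_MASK);
	pfn = (unsigned long)(start >> MODEL_PAGE_SHIFT);
	pages = (len + ~MODEL_PAGE_MASK) >> MODEL_PAGE_SHIFT;
	if (pfn + pages < pfn)
		return -MODEL_EINVAL;

	/* The mapping starts pgoff pages into the area */
	if (pgoff > pages)
		return -MODEL_EINVAL;
	pfn += pgoff;
	pages -= pgoff;

	/* The remaining pages must cover the whole mapping */
	if ((vm_len >> MODEL_PAGE_SHIFT) > pages)
		return -MODEL_EINVAL;

	*first_pfn = pfn;
	return 0;
}
```

For example, an area at 0x10000800 of length 0x2000 spans three pages;
with pgoff 1, a two-page mapping fits and starts at pfn 0x10001, while a
three-page mapping is rejected.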

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit b4cbb197c7e7a68dbad0d491242e3ca67420c13e)
[WT: only needed in 2.6.32 for next commit]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 include/linux/mm.h |  2 ++
 mm/memory.c        | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 11e5be6..5ef50c1 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1243,6 +1243,8 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn);
 int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn);
+int vm_iomap_memory(struct vm_area_struct *vma, phys_addr_t start, unsigned long len);
+
 
 struct page *follow_page(struct vm_area_struct *, unsigned long address,
 			unsigned int foll_flags);
diff --git a/mm/memory.c b/mm/memory.c
index 6c836d3..085b068 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1811,6 +1811,53 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
 }
 EXPORT_SYMBOL(remap_pfn_range);
 
+/**
+ * vm_iomap_memory - remap memory to userspace
+ * @vma: user vma to map to
+ * @start: start of area
+ * @len: size of area
+ *
+ * This is a simplified io_remap_pfn_range() for common driver use. The
+ * driver just needs to give us the physical memory range to be mapped,
+ * we'll figure out the rest from the vma information.
+ *
+ * NOTE! Some drivers might want to tweak vma->vm_page_prot first to get
+ * whatever write-combining details or similar.
+ */
+int vm_iomap_memory(struct vm_area_struct *vma, phys_addr_t start, unsigned long len)
+{
+	unsigned long vm_len, pfn, pages;
+
+	/* Check that the physical memory area passed in looks valid */
+	if (start + len < start)
+		return -EINVAL;
+	/*
+	 * You *really* shouldn't map things that aren't page-aligned,
+	 * but we've historically allowed it because IO memory might
+	 * just have smaller alignment.
+	 */
+	len += start & ~PAGE_MASK;
+	pfn = start >> PAGE_SHIFT;
+	pages = (len + ~PAGE_MASK) >> PAGE_SHIFT;
+	if (pfn + pages < pfn)
+		return -EINVAL;
+
+	/* We start the mapping 'vm_pgoff' pages into the area */
+	if (vma->vm_pgoff > pages)
+		return -EINVAL;
+	pfn += vma->vm_pgoff;
+	pages -= vma->vm_pgoff;
+
+	/* Can we fit all of the mapping? */
+	vm_len = vma->vm_end - vma->vm_start;
+	if (vm_len >> PAGE_SHIFT > pages)
+		return -EINVAL;
+
+	/* Ok, let it rip */
+	return io_remap_pfn_range(vma, vma->vm_start, pfn, vm_len, vma->vm_page_prot);
+}
+EXPORT_SYMBOL(vm_iomap_memory);
+
 static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
 				     unsigned long addr, unsigned long end,
 				     pte_fn_t fn, void *data)
-- 
1.7.12.2.21.g234cd45.dirty





* [ 117/143] Fix a few incorrectly checked [io_]remap_pfn_range() calls
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (116 preceding siblings ...)
  2014-05-12  0:33 ` [ 116/143] vm: add vm_iomap_memory() helper function Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 118/143] libertas: potential oops in debugfs Willy Tarreau
                   ` (25 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Linus Torvalds, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <torvalds@linux-foundation.org>

Nico Golde reports a few straggling uses of [io_]remap_pfn_range() that
really should use the vm_iomap_memory() helper.  This trivially converts
two of them to the helper, and comments about why the third one really
needs to continue to use remap_pfn_range(), and adds the missing size
check.

Reported-by: Nico Golde <nico@ngolde.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 7314e613d5ff9f0934f7a0f74ed7973b903315d1)
[wt: vm_flags were absent in mainline, Ben removed them in 3.2, but
 I kept them to minimize changes and avoid any side effect]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/uio/uio.c        | 16 +++++++++++++++-
 drivers/video/au1100fb.c | 26 +-------------------------
 drivers/video/au1200fb.c | 26 +-------------------------
 3 files changed, 17 insertions(+), 51 deletions(-)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index e941367..e3804d3 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -669,16 +669,30 @@ static int uio_mmap_physical(struct vm_area_struct *vma)
 {
 	struct uio_device *idev = vma->vm_private_data;
 	int mi = uio_find_mem_index(vma);
+	struct uio_mem *mem;
 	if (mi < 0)
 		return -EINVAL;
+	mem = idev->info->mem + mi;
+
+	if (vma->vm_end - vma->vm_start > mem->size)
+		return -EINVAL;
 
 	vma->vm_flags |= VM_IO | VM_RESERVED;
 
 	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
+	/*
+	 * We cannot use the vm_iomap_memory() helper here,
+	 * because vma->vm_pgoff is the map index we looked
+	 * up above in uio_find_mem_index(), rather than an
+	 * actual page offset into the mmap.
+	 *
+	 * So we just do the physical mmap without a page
+	 * offset.
+	 */
 	return remap_pfn_range(vma,
 			       vma->vm_start,
-			       idev->info->mem[mi].addr >> PAGE_SHIFT,
+			       mem->addr >> PAGE_SHIFT,
 			       vma->vm_end - vma->vm_start,
 			       vma->vm_page_prot);
 }
diff --git a/drivers/video/au1100fb.c b/drivers/video/au1100fb.c
index a699aab..745e5b3 100644
--- a/drivers/video/au1100fb.c
+++ b/drivers/video/au1100fb.c
@@ -392,39 +392,15 @@ void au1100fb_fb_rotate(struct fb_info *fbi, int angle)
 int au1100fb_fb_mmap(struct fb_info *fbi, struct vm_area_struct *vma)
 {
 	struct au1100fb_device *fbdev;
-	unsigned int len;
-	unsigned long start=0, off;
 
 	fbdev = to_au1100fb_device(fbi);
 
-	if (vma->vm_pgoff > (~0UL >> PAGE_SHIFT)) {
-		return -EINVAL;
-	}
-
-	start = fbdev->fb_phys & PAGE_MASK;
-	len = PAGE_ALIGN((start & ~PAGE_MASK) + fbdev->fb_len);
-
-	off = vma->vm_pgoff << PAGE_SHIFT;
-
-	if ((vma->vm_end - vma->vm_start + off) > len) {
-		return -EINVAL;
-	}
-
-	off += start;
-	vma->vm_pgoff = off >> PAGE_SHIFT;
-
 	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 	pgprot_val(vma->vm_page_prot) |= (6 << 9); //CCA=6
 
 	vma->vm_flags |= VM_IO;
 
-	if (io_remap_pfn_range(vma, vma->vm_start, off >> PAGE_SHIFT,
-				vma->vm_end - vma->vm_start,
-				vma->vm_page_prot)) {
-		return -EAGAIN;
-	}
-
-	return 0;
+	return vm_iomap_memory(vma, fbdev->fb_phys, fbdev->fb_len);
 }
 
 /* fb_cursor
diff --git a/drivers/video/au1200fb.c b/drivers/video/au1200fb.c
index 0d96f1d..5d6e509 100644
--- a/drivers/video/au1200fb.c
+++ b/drivers/video/au1200fb.c
@@ -1241,42 +1241,18 @@ static int au1200fb_fb_blank(int blank_mode, struct fb_info *fbi)
  * method mainly to allow the use of the TLB streaming flag (CCA=6)
  */
 static int au1200fb_fb_mmap(struct fb_info *info, struct vm_area_struct *vma)
-
 {
-	unsigned int len;
-	unsigned long start=0, off;
 	struct au1200fb_device *fbdev = (struct au1200fb_device *) info;
 
 #ifdef CONFIG_PM
 	au1xxx_pm_access(LCD_pm_dev);
 #endif
-
-	if (vma->vm_pgoff > (~0UL >> PAGE_SHIFT)) {
-		return -EINVAL;
-	}
-
-	start = fbdev->fb_phys & PAGE_MASK;
-	len = PAGE_ALIGN((start & ~PAGE_MASK) + fbdev->fb_len);
-
-	off = vma->vm_pgoff << PAGE_SHIFT;
-
-	if ((vma->vm_end - vma->vm_start + off) > len) {
-		return -EINVAL;
-	}
-
-	off += start;
-	vma->vm_pgoff = off >> PAGE_SHIFT;
-
 	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 	pgprot_val(vma->vm_page_prot) |= _CACHE_MASK; /* CCA=7 */
 
 	vma->vm_flags |= VM_IO;
 
-	return io_remap_pfn_range(vma, vma->vm_start, off >> PAGE_SHIFT,
-				  vma->vm_end - vma->vm_start,
-				  vma->vm_page_prot);
-
-	return 0;
+	return vm_iomap_memory(vma, fbdev->fb_phys, fbdev->fb_len);
 }
 
 static void set_global(u_int cmd, struct au1200_lcd_global_regs_t *pdata)
-- 
1.7.12.2.21.g234cd45.dirty





* [ 118/143] libertas: potential oops in debugfs
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (117 preceding siblings ...)
  2014-05-12  0:33 ` [ 117/143] Fix a few incorrectly checked [io_]remap_pfn_range() calls Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:33 ` [ 119/143] x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround Willy Tarreau
                   ` (24 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Dan Carpenter, Dan Williams, John W. Linville, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

If we do a zero size allocation then it will oops.  Also we can't be
sure the user passes us a NUL terminated string so I've added a
terminator.
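
The same two precautions can be shown in a user-space sketch, with
malloc() and memcpy() standing in for kmalloc() and copy_from_user().
model_copy_terminated() is a hypothetical helper, not part of the driver,
and it does not guard against cnt + 1 overflowing.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Copy cnt bytes from src and guarantee a terminator.
 * Returns NULL for an empty request or on allocation failure;
 * the caller frees the result.
 */
static char *model_copy_terminated(const char *src, size_t cnt)
{
	char *buf;

	if (cnt == 0)
		return NULL;

	buf = malloc(cnt + 1);	/* one extra byte for the terminator */
	if (buf == NULL)
		return NULL;

	memcpy(buf, src, cnt);
	buf[cnt] = '\0';
	return buf;
}
```

The result is safe to hand to string functions even when the source
buffer carries no terminator of its own.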

This code can only be triggered by root.

Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
(cherry picked from commit a497e47d4aec37aaf8f13509f3ef3d1f6a717d88)
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/wireless/libertas/debugfs.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/libertas/debugfs.c b/drivers/net/wireless/libertas/debugfs.c
index 893a55c..89532a6 100644
--- a/drivers/net/wireless/libertas/debugfs.c
+++ b/drivers/net/wireless/libertas/debugfs.c
@@ -925,7 +925,10 @@ static ssize_t lbs_debugfs_write(struct file *f, const char __user *buf,
 	char *p2;
 	struct debug_data *d = (struct debug_data *)f->private_data;
 
-	pdata = kmalloc(cnt, GFP_KERNEL);
+	if (cnt == 0)
+		return 0;
+
+	pdata = kmalloc(cnt + 1, GFP_KERNEL);
 	if (pdata == NULL)
 		return 0;
 
@@ -934,6 +937,7 @@ static ssize_t lbs_debugfs_write(struct file *f, const char __user *buf,
 		kfree(pdata);
 		return 0;
 	}
+	pdata[cnt] = '\0';
 
 	p0 = pdata;
 	for (i = 0; i < num_of_items; i++) {
-- 
1.7.12.2.21.g234cd45.dirty





* [ 119/143] x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (118 preceding siblings ...)
  2014-05-12  0:33 ` [ 118/143] libertas: potential oops in debugfs Willy Tarreau
@ 2014-05-12  0:33 ` Willy Tarreau
  2014-05-12  0:34 ` [ 120/143] gianfar: disable TX vlan based on kernel 2.6.x Willy Tarreau
                   ` (23 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:33 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: H. Peter Anvin, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <torvalds@linux-foundation.org>

Before we do an EMMS in the AMD FXSAVE information leak workaround we
need to clear any pending exceptions, otherwise we trap with a
floating-point exception inside this code.

Reported-by: halfdog <me@halfdog.net>
Tested-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/CA%2B55aFxQnY_PCG_n4=0w-VG=YLXL-yr7oMxyy0WU2gCBAf3ydg@mail.gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
(cherry picked from commit 26bef1318adc1b3a530ecc807ef99346db2aa8b0)
[wt: in 2.6.32, patch applies to arch/x86/include/asm/i387.h. There's
 no static_cpu_has() so we use boot_cpu_has() like other kernels do
 with gcc3.
]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 arch/x86/include/asm/i387.h | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index 0b20bbb..cb42fad 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -242,12 +242,13 @@ clear_state:
 	/* AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception
 	   is pending.  Clear the x87 state here by setting it to fixed
 	   values. safe_address is a random variable that should be in L1 */
-	alternative_input(
-		GENERIC_NOP8 GENERIC_NOP2,
-		"emms\n\t"	  	/* clear stack tags */
-		"fildl %[addr]", 	/* set F?P to defined value */
-		X86_FEATURE_FXSAVE_LEAK,
-		[addr] "m" (safe_address));
+	if (unlikely(boot_cpu_has(X86_FEATURE_FXSAVE_LEAK))) {
+		asm volatile(
+			"fnclex\n\t"
+			"emms\n\t"
+			"fildl %[addr]"        /* set F?P to defined value */
+			: : [addr] "m" (safe_address));
+	}
 end:
 	task_thread_info(tsk)->status &= ~TS_USEDFPU;
 }
-- 
1.7.12.2.21.g234cd45.dirty





* [ 120/143] gianfar: disable TX vlan based on kernel 2.6.x
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (119 preceding siblings ...)
  2014-05-12  0:33 ` [ 119/143] x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 121/143] [CPUFREQ] powernow-k6: set transition latency value so ondemand Willy Tarreau
                   ` (22 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Zhu Yanjun, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Zhu Yanjun <Yanjun.Zhu@windriver.com>

2.6.x kernels require a logic change similar to the one that commit
e1653c3e [gianfar: do vlan cleanup] and commit 51b8cbfc
[gianfar: fix bug caused by e1653c3e] introduced for newer kernels.

Since there is something wrong with tx vlan in the gianfar NIC driver,
tx vlan is disabled in newer kernels (3.1+). But in 2.6.x kernels, tx
vlan is still enabled. Thus, the gianfar NIC driver cannot support vlan
packets and non-vlan packets at the same time.

Signed-off-by: Zhu Yanjun <Yanjun.Zhu@windriver.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/net/gianfar.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
index 934a28f..8aa2cf6 100644
--- a/drivers/net/gianfar.c
+++ b/drivers/net/gianfar.c
@@ -365,7 +365,7 @@ static int gfar_probe(struct of_device *ofdev,
 	priv->vlgrp = NULL;
 
 	if (priv->device_flags & FSL_GIANFAR_DEV_HAS_VLAN)
-		dev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;
+		dev->features |= NETIF_F_HW_VLAN_RX;
 
 	if (priv->device_flags & FSL_GIANFAR_DEV_HAS_EXTENDED_HASH) {
 		priv->extended_hash = 1;
@@ -1451,12 +1451,6 @@ static void gfar_vlan_rx_register(struct net_device *dev,
 	priv->vlgrp = grp;
 
 	if (grp) {
-		/* Enable VLAN tag insertion */
-		tempval = gfar_read(&priv->regs->tctrl);
-		tempval |= TCTRL_VLINS;
-
-		gfar_write(&priv->regs->tctrl, tempval);
-
 		/* Enable VLAN tag extraction */
 		tempval = gfar_read(&priv->regs->rctrl);
 		tempval |= (RCTRL_VLEX | RCTRL_PRSDEP_INIT);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 121/143] [CPUFREQ] powernow-k6: set transition latency value so ondemand
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (120 preceding siblings ...)
  2014-05-12  0:34 ` [ 120/143] gianfar: disable TX vlan based on kernel 2.6.x Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 122/143] powernow-k6: disable cache when changing frequency Willy Tarreau
                   ` (21 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Krzysztof Helt, Dave Jones, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------
 governor can be used

From: Krzysztof Helt <krzysztof.h1@wp.pl>

Set the transition latency to a value smaller than CPUFREQ_ETERNAL so
governors other than "performance" work (like the "ondemand" one).

The value is found in "AMD PowerNow! Technology Platform Design Guide for
Embedded Processors" dated December 2000 (AMD doc #24267A). The answer to
one of the FAQs on page 40 states that the suggested complete transition
period is 200 us, which is 200000 in the nanosecond units used by
transition_latency.

Tested on K6-2+ CPU with K6-3 core (model 13, stepping 4).

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
(cherry picked from commit db2820dd5445a44b4726f15a2bc89b9ded2503eb)
[wt: in 2.6.32, we only need this one so that next series applies cleanly]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 arch/x86/kernel/cpu/cpufreq/powernow-k6.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
index f10dea4..cb01dac 100644
--- a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
+++ b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
@@ -164,7 +164,7 @@ static int powernow_k6_cpu_init(struct cpufreq_policy *policy)
 	}
 
 	/* cpuinfo and default policy values */
-	policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
+	policy->cpuinfo.transition_latency = 200000;
 	policy->cur = busfreq * max_multiplier;
 
 	result = cpufreq_frequency_table_cpuinfo(policy, clock_ratio);
-- 
1.7.12.2.21.g234cd45.dirty





* [ 122/143] powernow-k6: disable cache when changing frequency
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (121 preceding siblings ...)
  2014-05-12  0:34 ` [ 121/143] [CPUFREQ] powernow-k6: set transition latency value so ondemand Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 123/143] powernow-k6: correctly initialize default parameters Willy Tarreau
                   ` (20 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Mikulas Patocka, Rafael J. Wysocki, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Mikulas Patocka <mpatocka@redhat.com>

commit e20e1d0ac02308e2211306fc67abcd0b2668fb8b upstream

I found out that a system with a K6-3+ processor is unstable under network
server load. The system locks up or the network card stops receiving. The
reason for the instability is the CPU frequency scaling.

During frequency transition the processor is in "EPM Stop Grant" state.
The documentation says that the processor doesn't respond to inquiry
requests in this state. Consequently, coherency of processor caches and
bus master devices is not maintained, causing the system instability.

This patch disables and flushes the cache during the frequency transition.
It fixes the instability.

Other minor changes:
* u64 invalue changed to unsigned long because the variable is 32-bit
* move the logic to set the multiplier to a separate function
  powernow_k6_set_cpu_multiplier
* preserve lower 5 bits of the powernow port instead of 4 (the voltage
  field has 5 bits)
* mask interrupts when reading the multiplier, so that the port is not
  open during other activity (running other kernel code with the port open
  shouldn't cause any misbehavior, but it is better to be safe and keep
  the port closed)
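
The bit handling described in the list above can be sketched on its own.
The two functions below are hypothetical helpers that mirror the
expressions used in the patch; they do not touch any hardware.

```c
#include <stdint.h>

/*
 * Build the value written to the PowerNow port: the fixed control bits,
 * the multiplier index in bits 5..7, and the low five bits (the voltage
 * field) carried over from the value read back from the port.
 */
static uint32_t model_compose_outvalue(uint32_t best_i, uint32_t invalue)
{
	uint32_t outvalue = (1u << 12) | (1u << 10) | (1u << 9) | (best_i << 5);

	return outvalue | (invalue & 0x1f);
}

/* Extract the multiplier index from a value read from the port. */
static uint32_t model_multiplier_index(uint32_t invalue)
{
	return (invalue >> 5) & 7;
}
```

Masking with 0x1f instead of 0xf is what keeps the fifth voltage bit
intact when the new multiplier is written.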

This patch should be backported to all stable kernels. If it doesn't
apply cleanly, change it, or ask me to change it.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 arch/x86/kernel/cpu/cpufreq/powernow-k6.c | 56 +++++++++++++++++++++----------
 1 file changed, 39 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
index cb01dac..9f80a34 100644
--- a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
+++ b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
@@ -44,23 +44,58 @@ static struct cpufreq_frequency_table clock_ratio[] = {
 /**
  * powernow_k6_get_cpu_multiplier - returns the current FSB multiplier
  *
- *   Returns the current setting of the frequency multiplier. Core clock
+ * Returns the current setting of the frequency multiplier. Core clock
  * speed is frequency of the Front-Side Bus multiplied with this value.
  */
 static int powernow_k6_get_cpu_multiplier(void)
 {
-	u64 invalue = 0;
+	unsigned long invalue = 0;
 	u32 msrval;
 
+	local_irq_disable();
+
 	msrval = POWERNOW_IOPORT + 0x1;
 	wrmsr(MSR_K6_EPMR, msrval, 0); /* enable the PowerNow port */
 	invalue = inl(POWERNOW_IOPORT + 0x8);
 	msrval = POWERNOW_IOPORT + 0x0;
 	wrmsr(MSR_K6_EPMR, msrval, 0); /* disable it again */
 
+	local_irq_enable();
+
 	return clock_ratio[(invalue >> 5)&7].index;
 }
 
+static void powernow_k6_set_cpu_multiplier(unsigned int best_i)
+{
+	unsigned long outvalue, invalue;
+	unsigned long msrval;
+	unsigned long cr0;
+
+	/* we now need to transform best_i to the BVC format, see AMD#23446 */
+
+	/*
+	 * The processor doesn't respond to inquiry cycles while changing the
+	 * frequency, so we must disable cache.
+	 */
+	local_irq_disable();
+	cr0 = read_cr0();
+	write_cr0(cr0 | X86_CR0_CD);
+	wbinvd();
+
+	outvalue = (1<<12) | (1<<10) | (1<<9) | (best_i<<5);
+
+	msrval = POWERNOW_IOPORT + 0x1;
+	wrmsr(MSR_K6_EPMR, msrval, 0); /* enable the PowerNow port */
+	invalue = inl(POWERNOW_IOPORT + 0x8);
+	invalue = invalue & 0x1f;
+	outvalue = outvalue | invalue;
+	outl(outvalue, (POWERNOW_IOPORT + 0x8));
+	msrval = POWERNOW_IOPORT + 0x0;
+	wrmsr(MSR_K6_EPMR, msrval, 0); /* disable it again */
+
+	write_cr0(cr0);
+	local_irq_enable();
+}
 
 /**
  * powernow_k6_set_state - set the PowerNow! multiplier
@@ -70,8 +105,6 @@ static int powernow_k6_get_cpu_multiplier(void)
  */
 static void powernow_k6_set_state(unsigned int best_i)
 {
-	unsigned long outvalue = 0, invalue = 0;
-	unsigned long msrval;
 	struct cpufreq_freqs freqs;
 
 	if (clock_ratio[best_i].index > max_multiplier) {
@@ -85,18 +118,7 @@ static void powernow_k6_set_state(unsigned int best_i)
 
 	cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);
 
-	/* we now need to transform best_i to the BVC format, see AMD#23446 */
-
-	outvalue = (1<<12) | (1<<10) | (1<<9) | (best_i<<5);
-
-	msrval = POWERNOW_IOPORT + 0x1;
-	wrmsr(MSR_K6_EPMR, msrval, 0); /* enable the PowerNow port */
-	invalue = inl(POWERNOW_IOPORT + 0x8);
-	invalue = invalue & 0xf;
-	outvalue = outvalue | invalue;
-	outl(outvalue , (POWERNOW_IOPORT + 0x8));
-	msrval = POWERNOW_IOPORT + 0x0;
-	wrmsr(MSR_K6_EPMR, msrval, 0); /* disable it again */
+	powernow_k6_set_cpu_multiplier(best_i);
 
 	cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
 
@@ -164,7 +186,7 @@ static int powernow_k6_cpu_init(struct cpufreq_policy *policy)
 	}
 
 	/* cpuinfo and default policy values */
-	policy->cpuinfo.transition_latency = 200000;
+	policy->cpuinfo.transition_latency = 500000;
 	policy->cur = busfreq * max_multiplier;
 
 	result = cpufreq_frequency_table_cpuinfo(policy, clock_ratio);
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 123/143] powernow-k6: correctly initialize default parameters
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (122 preceding siblings ...)
  2014-05-12  0:34 ` [ 122/143] powernow-k6: disable cache when changing frequency Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 124/143] powernow-k6: reorder frequencies Willy Tarreau
                   ` (19 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Mikulas Patocka, Rafael J. Wysocki, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Mikulas Patocka <mpatocka@redhat.com>

commit d82b922a4acc1781d368aceac2f9da43b038cab2 upstream

The powernow-k6 driver used to read the initial multiplier from the
powernow register. However, there is a problem with this:

* If there was a frequency transition before, the multiplier read from the
  register corresponds to the current multiplier.
* If there was no frequency transition since reset, the field in the
  register always reads as zero, regardless of the current multiplier that
  is set using switches on the mainboard and that the CPU is running at.

The zero value corresponds to multiplier 4.5, so as a consequence, the
powernow-k6 driver always assumes multiplier 4.5.

For example, if we have a 550MHz CPU with a bus frequency of 100MHz and a
multiplier of 5.5, the powernow-k6 driver thinks that the multiplier is 4.5
and the bus frequency is 122MHz. The powernow-k6 driver then sets the
multiplier to 4.5, underclocking the CPU to 450MHz, but reports the
current frequency as 550MHz.

There is no reliable way to read the initial multiplier. I modified
the driver so that it contains a table of known frequencies (based on
parameters of existing CPUs and some common overclocking schemes) and sets
the multiplier according to the frequency. If the frequency is unknown
(because of unusual overclocking or underclocking), the user must supply
the bus speed and maximum multiplier as module parameters.
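
As an illustration only, the table lookup described above can be sketched
in user-space C. This is not the driver code: the table is shortened, and
the names usual_frequencies and lookup_multiplier are made up for this
sketch. Only the 3000 kHz tolerance and the "multiplier times 10" encoding
are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Shortened table: CPU frequency in kHz and multiplier times 10. */
struct freq_entry {
	unsigned freq;
	unsigned mult;
};

static const struct freq_entry usual_frequencies[] = {
	{ 450000, 45 },	/* 100 * 4.5 */
	{ 500000, 50 },	/* 100 * 5   */
	{ 550000, 55 },	/* 100 * 5.5 */
	{ 600000, 60 },	/* 100 * 6   */
};

#define FREQ_RANGE 3000	/* accepted deviation in kHz, as in the patch */

/*
 * Returns the multiplier (times 10) for a measured frequency, or 0 if the
 * frequency is not within FREQ_RANGE of any table entry. On a match, *khz
 * is snapped to the nominal frequency from the table.
 */
static unsigned lookup_multiplier(unsigned *khz)
{
	size_t i;
	size_t n = sizeof(usual_frequencies) / sizeof(usual_frequencies[0]);

	assert(khz != NULL);
	for (i = 0; i < n; i++) {
		if (*khz >= usual_frequencies[i].freq - FREQ_RANGE &&
		    *khz <= usual_frequencies[i].freq + FREQ_RANGE) {
			*khz = usual_frequencies[i].freq;
			return usual_frequencies[i].mult;
		}
	}
	return 0;
}
```

With a measured frequency of 549870 kHz the sketch snaps to 550000 kHz and
returns 55, so the bus frequency in 10 kHz units becomes 550000 / 55 =
10000, i.e. 100MHz. A frequency such as 520000 kHz matches nothing and
returns 0, which is the case where the module parameters are needed.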

This patch should be backported to all stable kernels. If it doesn't
apply cleanly, change it, or ask me to change it.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 arch/x86/kernel/cpu/cpufreq/powernow-k6.c | 76 +++++++++++++++++++++++++++++--
 1 file changed, 72 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
index 9f80a34..2d9161e 100644
--- a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
+++ b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
@@ -26,6 +26,14 @@
 static unsigned int                     busfreq;   /* FSB, in 10 kHz */
 static unsigned int                     max_multiplier;
 
+static unsigned int			param_busfreq = 0;
+static unsigned int			param_max_multiplier = 0;
+
+module_param_named(max_multiplier, param_max_multiplier, uint, S_IRUGO);
+MODULE_PARM_DESC(max_multiplier, "Maximum multiplier (allowed values: 20 30 35 40 45 50 55 60)");
+
+module_param_named(bus_frequency, param_busfreq, uint, S_IRUGO);
+MODULE_PARM_DESC(bus_frequency, "Bus frequency in kHz");
 
 /* Clock ratio multiplied by 10 - see table 27 in AMD#23446 */
 static struct cpufreq_frequency_table clock_ratio[] = {
@@ -40,6 +48,27 @@ static struct cpufreq_frequency_table clock_ratio[] = {
 	{0, CPUFREQ_TABLE_END}
 };
 
+static const struct {
+	unsigned freq;
+	unsigned mult;
+} usual_frequency_table[] = {
+	{ 400000, 40 },	// 100   * 4
+	{ 450000, 45 }, // 100   * 4.5
+	{ 475000, 50 }, //  95   * 5
+	{ 500000, 50 }, // 100   * 5
+	{ 506250, 45 }, // 112.5 * 4.5
+	{ 533500, 55 }, //  97   * 5.5
+	{ 550000, 55 }, // 100   * 5.5
+	{ 562500, 50 }, // 112.5 * 5
+	{ 570000, 60 }, //  95   * 6
+	{ 600000, 60 }, // 100   * 6
+	{ 618750, 55 }, // 112.5 * 5.5
+	{ 660000, 55 }, // 120   * 5.5
+	{ 675000, 60 }, // 112.5 * 6
+	{ 720000, 60 }, // 120   * 6
+};
+
+#define FREQ_RANGE		3000
 
 /**
  * powernow_k6_get_cpu_multiplier - returns the current FSB multiplier
@@ -163,18 +192,57 @@ static int powernow_k6_target(struct cpufreq_policy *policy,
 	return 0;
 }
 
-
 static int powernow_k6_cpu_init(struct cpufreq_policy *policy)
 {
 	unsigned int i, f;
 	int result;
+	unsigned khz;
 
 	if (policy->cpu != 0)
 		return -ENODEV;
 
-	/* get frequencies */
-	max_multiplier = powernow_k6_get_cpu_multiplier();
-	busfreq = cpu_khz / max_multiplier;
+	max_multiplier = 0;
+	khz = cpu_khz;
+	for (i = 0; i < ARRAY_SIZE(usual_frequency_table); i++) {
+		if (khz >= usual_frequency_table[i].freq - FREQ_RANGE &&
+		    khz <= usual_frequency_table[i].freq + FREQ_RANGE) {
+			khz = usual_frequency_table[i].freq;
+			max_multiplier = usual_frequency_table[i].mult;
+			break;
+		}
+	}
+	if (param_max_multiplier) {
+		for (i = 0; (clock_ratio[i].frequency != CPUFREQ_TABLE_END); i++) {
+			if (clock_ratio[i].index == param_max_multiplier) {
+				max_multiplier = param_max_multiplier;
+				goto have_max_multiplier;
+			}
+		}
+		printk(KERN_ERR "powernow-k6: invalid max_multiplier parameter, valid parameters 20, 30, 35, 40, 45, 50, 55, 60\n");
+		return -EINVAL;
+	}
+
+	if (!max_multiplier) {
+		printk(KERN_WARNING "powernow-k6: unknown frequency %u, cannot determine current multiplier\n", khz);
+		printk(KERN_WARNING "powernow-k6: use module parameters max_multiplier and bus_frequency\n");
+		return -EOPNOTSUPP;
+	}
+
+have_max_multiplier:
+	param_max_multiplier = max_multiplier;
+
+	if (param_busfreq) {
+		if (param_busfreq >= 50000 && param_busfreq <= 150000) {
+			busfreq = param_busfreq / 10;
+			goto have_busfreq;
+		}
+		printk(KERN_ERR "powernow-k6: invalid bus_frequency parameter, allowed range 50000 - 150000 kHz\n");
+		return -EINVAL;
+	}
+
+	busfreq = khz / max_multiplier;
+have_busfreq:
+	param_busfreq = busfreq * 10;
 
 	/* table init */
 	for (i = 0; (clock_ratio[i].frequency != CPUFREQ_TABLE_END); i++) {
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 124/143] powernow-k6: reorder frequencies
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (123 preceding siblings ...)
  2014-05-12  0:34 ` [ 123/143] powernow-k6: correctly initialize default parameters Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 125/143] tcp: fix tcp_trim_head() to adjust segment count with skb MSS Willy Tarreau
                   ` (18 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Mikulas Patocka, Viresh Kumar, Rafael J. Wysocki, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Mikulas Patocka <mpatocka@redhat.com>

commit 22c73795b101597051924556dce019385a1e2fa0 upstream

This patch reorders reported frequencies from the highest to the lowest,
just like in other frequency drivers.
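
Because the table order no longer matches the 3-bit register encoding, the
patch adds two translation tables. The following user-space sketch copies
their values from the diff and checks that they are inverse permutations
of each other. The arrays mult_by_index and mult_by_register and both
function names are assumptions made for this illustration; they restate
the multipliers from the comments in clock_ratio[].

```c
#include <assert.h>

/* Values copied from the patch. */
static const unsigned char index_to_register[8] = { 6, 3, 1, 0, 2, 7, 5, 4 };
static const unsigned char register_to_index[8] = { 3, 2, 4, 1, 7, 6, 0, 5 };

/* Multiplier times 10, in the new (descending) table order. */
static const unsigned mult_by_index[8] = { 60, 55, 50, 45, 40, 35, 30, 20 };
/* Multiplier times 10, indexed by the 3-bit register encoding. */
static const unsigned mult_by_register[8] = { 45, 50, 40, 55, 20, 30, 60, 35 };

/* Aborts if the two tables do not describe the same mapping. */
static void check_tables(void)
{
	unsigned i;

	for (i = 0; i < 8; i++) {
		assert(register_to_index[index_to_register[i]] == i);
		assert(mult_by_register[index_to_register[i]] == mult_by_index[i]);
	}
}

/* Decodes bits 5..7 of the port value into a multiplier times 10. */
static unsigned multiplier_from_port_value(unsigned long invalue)
{
	return mult_by_index[register_to_index[(invalue >> 5) & 7]];
}
```

For example, the encoding 110 (6) decodes to 60 and the encoding 000
decodes to 45, the same values the old, register-ordered table gave.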

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 arch/x86/kernel/cpu/cpufreq/powernow-k6.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
index 2d9161e..eb890f1 100644
--- a/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
+++ b/arch/x86/kernel/cpu/cpufreq/powernow-k6.c
@@ -37,17 +37,20 @@ MODULE_PARM_DESC(bus_frequency, "Bus frequency in kHz");
 
 /* Clock ratio multiplied by 10 - see table 27 in AMD#23446 */
 static struct cpufreq_frequency_table clock_ratio[] = {
-	{45,  /* 000 -> 4.5x */ 0},
+	{60,  /* 110 -> 6.0x */ 0},
+	{55,  /* 011 -> 5.5x */ 0},
 	{50,  /* 001 -> 5.0x */ 0},
+	{45,  /* 000 -> 4.5x */ 0},
 	{40,  /* 010 -> 4.0x */ 0},
-	{55,  /* 011 -> 5.5x */ 0},
-	{20,  /* 100 -> 2.0x */ 0},
-	{30,  /* 101 -> 3.0x */ 0},
-	{60,  /* 110 -> 6.0x */ 0},
 	{35,  /* 111 -> 3.5x */ 0},
+	{30,  /* 101 -> 3.0x */ 0},
+	{20,  /* 100 -> 2.0x */ 0},
 	{0, CPUFREQ_TABLE_END}
 };
 
+static const u8 index_to_register[8] = { 6, 3, 1, 0, 2, 7, 5, 4 };
+static const u8 register_to_index[8] = { 3, 2, 4, 1, 7, 6, 0, 5 };
+
 static const struct {
 	unsigned freq;
 	unsigned mult;
@@ -91,7 +94,7 @@ static int powernow_k6_get_cpu_multiplier(void)
 
 	local_irq_enable();
 
-	return clock_ratio[(invalue >> 5)&7].index;
+	return clock_ratio[register_to_index[(invalue >> 5)&7]].index;
 }
 
 static void powernow_k6_set_cpu_multiplier(unsigned int best_i)
@@ -111,7 +114,7 @@ static void powernow_k6_set_cpu_multiplier(unsigned int best_i)
 	write_cr0(cr0 | X86_CR0_CD);
 	wbinvd();
 
-	outvalue = (1<<12) | (1<<10) | (1<<9) | (best_i<<5);
+	outvalue = (1<<12) | (1<<10) | (1<<9) | (index_to_register[best_i]<<5);
 
 	msrval = POWERNOW_IOPORT + 0x1;
 	wrmsr(MSR_K6_EPMR, msrval, 0); /* enable the PowerNow port */
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 125/143] tcp: fix tcp_trim_head() to adjust segment count with skb MSS
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (124 preceding siblings ...)
  2014-05-12  0:34 ` [ 124/143] powernow-k6: reorder frequencies Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 126/143] tcp_cubic: limit delayed_ack ratio to prevent divide error Willy Tarreau
                   ` (17 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Neal Cardwell, Nandita Dukkipati, Ilpo Järvinen,
	David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Neal Cardwell <ncardwell@google.com>

This commit fixes tcp_trim_head() to recalculate the number of
segments in the skb with the skb's existing MSS, so trimming the head
causes the skb segment count to be monotonically non-increasing - it
should stay the same or go down, but not increase.

Previously tcp_trim_head() used the current MSS of the connection. But
if there was a decrease in MSS between original transmission and ACK
(e.g. due to PMTUD), this could cause tcp_trim_head() to
counter-intuitively increase the segment count when trimming bytes off
the head of an skb. This violated assumptions in tcp_tso_acked() that
tcp_trim_head() only decreases the packet count, so that packets_acked
in tcp_tso_acked() could underflow, leading tcp_clean_rtx_queue() to
pass u32 pkts_acked values as large as 0xffffffff to
ca_ops->pkts_acked().
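
The arithmetic behind the underflow can be sketched in user-space C. This
is a simplified model, not the kernel code: tso_segs and packets_acked are
hypothetical helpers, and the sizes (a 5840-byte skb, MSS 1460 dropping to
536) are example numbers chosen for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of segments needed for len bytes at a given MSS (at least 1). */
static uint32_t tso_segs(uint32_t len, uint32_t mss)
{
	assert(mss > 0);
	if (len <= mss)
		return 1;
	return (len + mss - 1) / mss;
}

/*
 * Packet count before the trim minus packet count after it, computed in
 * u32. The skb was built with old_mss; new_mss is the MSS used to recount
 * the segments after `trimmed` bytes were removed from the head.
 */
static uint32_t packets_acked(uint32_t len, uint32_t trimmed,
			      uint32_t old_mss, uint32_t new_mss)
{
	uint32_t before = tso_segs(len, old_mss);
	uint32_t after = tso_segs(len - trimmed, new_mss);

	return before - after;	/* wraps around if after > before */
}
```

Recounting with the skb's own MSS gives 4 - 3 = 1 acked packet. Recounting
with a connection MSS that dropped to 536 gives 4 - 9, which wraps to
0xfffffffb in u32.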

As an aside, if tcp_trim_head() had really wanted the skb to reflect
the current MSS, it should have called tcp_set_skb_tso_segs()
unconditionally, since a decrease in MSS would mean that a
single-packet skb should now be sliced into multiple segments.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5b35e1e6e9ca651e6b291c96d1106043c9af314a)
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv4/tcp_output.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 49da29e..0fc0a73 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -949,11 +949,9 @@ int tcp_trim_head(struct sock *sk, struct sk_buff *skb, u32 len)
 	sk_mem_uncharge(sk, len);
 	sock_set_flag(sk, SOCK_QUEUE_SHRUNK);
 
-	/* Any change of skb->len requires recalculation of tso
-	 * factor and mss.
-	 */
+	/* Any change of skb->len requires recalculation of tso factor. */
 	if (tcp_skb_pcount(skb) > 1)
-		tcp_set_skb_tso_segs(sk, skb, tcp_current_mss(sk));
+		tcp_set_skb_tso_segs(sk, skb, tcp_skb_mss(skb));
 
 	return 0;
 }
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 126/143] tcp_cubic: limit delayed_ack ratio to prevent divide error
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (125 preceding siblings ...)
  2014-05-12  0:34 ` [ 125/143] tcp: fix tcp_trim_head() to adjust segment count with skb MSS Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 127/143] tcp_cubic: fix the range of delayed_ack Willy Tarreau
                   ` (16 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Stephen Hemminger, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: stephen hemminger <shemminger@vyatta.com>

TCP Cubic keeps a metric that estimates the amount of delayed
acknowledgements to use in adjusting the window. If an abnormally
large number of packets are acknowledged at once, then the update
could wrap and reach zero. This kind of ACK could only
happen when there was a large window and a huge number of
ACKs were lost.

This patch limits the value of the delayed ack ratio. The choice of 32
is just a conservative value since normally it should be in the range of
1 to 4 packets.
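
A user-space sketch of the wrap, for illustration. The two helpers below
model the update of the u16 delayed_ack field before and after this patch;
their names are made up, and the value 65506 is just one example of a
count that makes the old 16-bit sum land exactly on zero.

```c
#include <assert.h>
#include <stdint.h>

#define ACK_RATIO_SHIFT	4
#define ACK_RATIO_LIMIT	(32u << ACK_RATIO_SHIFT)

/* Old update: the sum is stored back into a u16 and can wrap to 0. */
static uint16_t update_unbounded(uint16_t delayed_ack, uint32_t cnt)
{
	cnt -= delayed_ack >> ACK_RATIO_SHIFT;
	return (uint16_t)(delayed_ack + cnt);
}

/* Patched update: computed in u32, then capped at ACK_RATIO_LIMIT. */
static uint16_t update_limited(uint16_t delayed_ack, uint32_t cnt)
{
	uint32_t ratio = delayed_ack;

	assert(delayed_ack <= ACK_RATIO_LIMIT);
	ratio -= delayed_ack >> ACK_RATIO_SHIFT;
	ratio += cnt;
	return (uint16_t)(ratio < ACK_RATIO_LIMIT ? ratio : ACK_RATIO_LIMIT);
}
```

Starting from delayed_ack = 32 (a ratio of 2), a count of 65506 makes the
old update produce 32 + 65504 = 65536, which is 0 in 16 bits. The limited
update yields 512 instead.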

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b9f47a3aaeabdce3b42829bbb27765fa340f76ba)
[wt: in 2.6.32, this fix is needed for the next one]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv4/tcp_cubic.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
index 0d41c26..1b6b8c2 100644
--- a/net/ipv4/tcp_cubic.c
+++ b/net/ipv4/tcp_cubic.c
@@ -90,6 +90,7 @@ struct bictcp {
 	u32	ack_cnt;	/* number of acks */
 	u32	tcp_cwnd;	/* estimated tcp cwnd */
 #define ACK_RATIO_SHIFT	4
+#define ACK_RATIO_LIMIT (32u << ACK_RATIO_SHIFT)
 	u16	delayed_ack;	/* estimate the ratio of Packets/ACKs << 4 */
 	u8	sample_cnt;	/* number of samples to decide curr_rtt */
 	u8	found;		/* the exit point is found? */
@@ -379,8 +380,12 @@ static void bictcp_acked(struct sock *sk, u32 cnt, s32 rtt_us)
 	u32 delay;
 
 	if (icsk->icsk_ca_state == TCP_CA_Open) {
-		cnt -= ca->delayed_ack >> ACK_RATIO_SHIFT;
-		ca->delayed_ack += cnt;
+		u32 ratio = ca->delayed_ack;
+
+		ratio -= ca->delayed_ack >> ACK_RATIO_SHIFT;
+		ratio += cnt;
+
+		ca->delayed_ack = min(ratio, ACK_RATIO_LIMIT);
 	}
 
 	/* Some calls are for duplicates without timetamps */
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 127/143] tcp_cubic: fix the range of delayed_ack
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (126 preceding siblings ...)
  2014-05-12  0:34 ` [ 126/143] tcp_cubic: limit delayed_ack ratio to prevent divide error Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 128/143] n_tty: Fix n_tty_write crash when echoing in raw mode Willy Tarreau
                   ` (15 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Stephen Hemminger, Liu Yu, Eric Dumazet, Neal Cardwell,
	David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Liu Yu <allanyuliu@tencent.com>

commit b9f47a3aaeab (tcp_cubic: limit delayed_ack ratio to prevent
divide error) tries to prevent a divide error, but there is still a small
chance that delayed_ack can reach zero. If the cnt parameter is
negative, ratio+cnt overflows and may happen to be zero.
As a result, min(ratio, ACK_RATIO_LIMIT) evaluates to zero.

In some old kernels, such as 2.6.32, there is a bug that would
pass a negative parameter, which then ultimately leads to this divide error.

commit 5b35e1e6e9c (tcp: fix tcp_trim_head() to adjust segment count
with skb MSS) fixed the negative parameter issue. However,
it is safer to fix the range of delayed_ack as well,
to make sure we do not hit a divide by zero.
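
To make the remaining corner case concrete, here is a user-space sketch.
The helpers update_min_only, update_clamped and scale_by_ratio are
hypothetical names; the arithmetic follows the diff, with the kernel's
min() and clamp() macros written out by hand. The "negative" count is
modeled as (uint32_t)-30, an example value chosen so the sum is exactly 0.

```c
#include <assert.h>
#include <stdint.h>

#define ACK_RATIO_SHIFT	4
#define ACK_RATIO_LIMIT	(32u << ACK_RATIO_SHIFT)

static uint32_t next_ratio(uint16_t delayed_ack, uint32_t cnt)
{
	uint32_t ratio = delayed_ack;

	ratio -= delayed_ack >> ACK_RATIO_SHIFT;
	ratio += cnt;	/* a "negative" cnt wraps around here */
	return ratio;
}

/* Upper bound only: a sum of 0 passes through unchanged. */
static uint16_t update_min_only(uint16_t delayed_ack, uint32_t cnt)
{
	uint32_t ratio = next_ratio(delayed_ack, cnt);

	return (uint16_t)(ratio < ACK_RATIO_LIMIT ? ratio : ACK_RATIO_LIMIT);
}

/* Lower and upper bound: the result is always in 1..ACK_RATIO_LIMIT. */
static uint16_t update_clamped(uint16_t delayed_ack, uint32_t cnt)
{
	uint32_t ratio = next_ratio(delayed_ack, cnt);

	if (ratio < 1u)
		ratio = 1u;
	if (ratio > ACK_RATIO_LIMIT)
		ratio = ACK_RATIO_LIMIT;
	return (uint16_t)ratio;
}

/* The kind of division that faults when delayed_ack is 0. */
static uint32_t scale_by_ratio(uint32_t cnt, uint16_t delayed_ack)
{
	assert(delayed_ack != 0);
	return (cnt << ACK_RATIO_SHIFT) / delayed_ack;
}
```

With delayed_ack = 32 and cnt = (uint32_t)-30 the sum is 32 - 2 - 30 = 0:
the min-only version stores 0, the clamped version stores 1, and the later
division stays defined.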

CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Liu Yu <allanyuliu@tencent.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 0cda345d1b2201dd15591b163e3c92bad5191745)
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv4/tcp_cubic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
index 1b6b8c2..db41113 100644
--- a/net/ipv4/tcp_cubic.c
+++ b/net/ipv4/tcp_cubic.c
@@ -385,7 +385,7 @@ static void bictcp_acked(struct sock *sk, u32 cnt, s32 rtt_us)
 		ratio -= ca->delayed_ack >> ACK_RATIO_SHIFT;
 		ratio += cnt;
 
-		ca->delayed_ack = min(ratio, ACK_RATIO_LIMIT);
+		ca->delayed_ack = clamp(ratio, 1U, ACK_RATIO_LIMIT);
 	}
 
 	/* Some calls are for duplicates without timetamps */
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 128/143] n_tty: Fix n_tty_write crash when echoing in raw mode
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (127 preceding siblings ...)
  2014-05-12  0:34 ` [ 127/143] tcp_cubic: fix the range of delayed_ack Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 129/143] exec/ptrace: fix get_dumpable() incorrect tests Willy Tarreau
                   ` (14 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Peter Hurley, Jiri Slaby, Linus Torvalds, Alan Cox,
	Greg Kroah-Hartman, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Hurley <peter@hurleysoftware.com>

The tty atomic_write_lock does not provide an exclusion guarantee for
the tty driver if the termios settings are LECHO & !OPOST.  And since
it is unexpected and not allowed to call TTY buffer helpers like
tty_insert_flip_string concurrently, this may lead to crashes when
concurrent writers call pty_write. In that case the following two
writers:
* the ECHOing from a workqueue and
* pty_write from the process
race and can overflow the corresponding TTY buffer as follows.

If we look into tty_insert_flip_string_fixed_flag, there is:
  int space = __tty_buffer_request_room(port, goal, flags);
  struct tty_buffer *tb = port->buf.tail;
  ...
  memcpy(char_buf_ptr(tb, tb->used), chars, space);
  ...
  tb->used += space;

so the race of the two can result in something like this:
              A                                B
__tty_buffer_request_room
                                  __tty_buffer_request_room
memcpy(buf(tb->used), ...)
tb->used += space;
                                  memcpy(buf(tb->used), ...) ->BOOM

B's memcpy is past the tty_buffer due to the previous A's tb->used
increment.

Since the N_TTY line discipline input processing can output
concurrently with a tty write, obtain the N_TTY ldisc output_lock to
serialize echo output with normal tty writes.  This ensures the tty
buffer helper tty_insert_flip_string is not called concurrently and
everything is fine.
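
The interleaving above can be replayed deterministically in user space.
This is a toy model under stated assumptions: toy_buffer, request_room and
commit are invented for the sketch, the buffer holds 8 bytes, and
request_room only clips the request instead of allocating a new buffer as
the real helper may do. No threads are used; the two "writers" are simply
executed in the racy order and in the serialized order.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 8	/* hypothetical capacity of one buffer */

struct toy_buffer {
	size_t used;	/* bytes already written */
};

/* How many of `want` bytes fit, judged by the current value of used. */
static size_t request_room(const struct toy_buffer *tb, size_t want)
{
	size_t left;

	assert(tb->used <= BUF_SIZE);
	left = BUF_SIZE - tb->used;
	return want < left ? want : left;
}

/* Models memcpy + "tb->used += space"; returns the offset the copy ends at. */
static size_t commit(struct toy_buffer *tb, size_t space)
{
	tb->used += space;
	return tb->used;
}

/* A and B both reserve room before either of them commits. */
static size_t interleaved_end(void)
{
	struct toy_buffer tb = { 4 };
	size_t space_a = request_room(&tb, 6);	/* sees 4 bytes free */
	size_t space_b = request_room(&tb, 6);	/* also sees 4 bytes free */

	commit(&tb, space_a);
	return commit(&tb, space_b);	/* copy ends past BUF_SIZE */
}

/* With a lock held around each write, B reserves only after A is done. */
static size_t serialized_end(void)
{
	struct toy_buffer tb = { 4 };
	size_t space_a = request_room(&tb, 6);
	size_t space_b;

	commit(&tb, space_a);
	space_b = request_room(&tb, 6);	/* sees 0 bytes free */
	return commit(&tb, space_b);
}
```

In the interleaved order B's copy would end at offset 12 of an 8-byte
buffer; in the serialized order it ends at offset 8.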

Note that this is nicely reproducible by an ordinary user using
forkpty and some setup around that (raw termios + ECHO). And it is
present in kernels at least after commit
d945cb9cce20ac7143c2de8d88b187f62db99bdc (pty: Rework the pty layer to
use the normal buffering logic) in 2.6.31-rc3.

js: add more info to the commit log
js: switch to bool
js: lock unconditionally
js: lock only the tty->ops->write call

References: CVE-2014-0196
Reported-and-tested-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 4291086b1f081b869c6d79e5b7441633dc3ace00)
[wt: 2.6.32 has no n_tty_data, so output_lock is in tty, not tty->disc_data]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/char/n_tty.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/char/n_tty.c b/drivers/char/n_tty.c
index 2e50f4d..5269fa0 100644
--- a/drivers/char/n_tty.c
+++ b/drivers/char/n_tty.c
@@ -1969,7 +1969,9 @@ static ssize_t n_tty_write(struct tty_struct *tty, struct file *file,
 				tty->ops->flush_chars(tty);
 		} else {
 			while (nr > 0) {
+				mutex_lock(&tty->output_lock);
 				c = tty->ops->write(tty, b, nr);
+				mutex_unlock(&tty->output_lock);
 				if (c < 0) {
 					retval = c;
 					goto break_out;
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 129/143] exec/ptrace: fix get_dumpable() incorrect tests
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (128 preceding siblings ...)
  2014-05-12  0:34 ` [ 128/143] n_tty: Fix n_tty_write crash when echoing in raw mode Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 130/143] ipv6: call udp_push_pending_frames when uncorking a socket with Willy Tarreau
                   ` (13 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Kees Cook, Tony Luck, Oleg Nesterov, Eric W. Biederman,
	Andrew Morton, Linus Torvalds, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Kees Cook <keescook@chromium.org>

commit d049f74f2dbe71354d43d393ac3a188947811348 upstream

The get_dumpable() return value is not boolean.  Most users of the
function actually want to be testing for non-SUID_DUMP_USER(1) rather than
SUID_DUMP_DISABLE(0).  The SUID_DUMP_ROOT(2) is also considered a
protected state.  Almost all places did this correctly, excepting the two
places fixed in this patch.

Wrong logic:
    if (dumpable == SUID_DUMP_DISABLE) { /* be protective */ }
        or
    if (dumpable == 0) { /* be protective */ }
        or
    if (!dumpable) { /* be protective */ }

Correct logic:
    if (dumpable != SUID_DUMP_USER) { /* be protective */ }
        or
    if (dumpable != 1) { /* be protective */ }

Without this patch, if the system had set the sysctl fs/suid_dumpable=2, a
user was able to ptrace attach to processes that had dropped privileges to
that user.  (This may have been partially mitigated if Yama was enabled.)
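
The difference between the two tests can be shown with a small user-space
sketch. The SUID_DUMP_* values are those from the patch; the helper names
is_protected_wrong, is_protected and ptrace_denied are made up here and
only model the comparison, not the rest of __ptrace_may_access().

```c
#include <assert.h>

#define SUID_DUMP_DISABLE	0	/* No setuid dumping */
#define SUID_DUMP_USER		1	/* Dump as user of process */
#define SUID_DUMP_ROOT		2	/* Dump as root */

/* Boolean-style test: misses SUID_DUMP_ROOT. */
static int is_protected_wrong(int dumpable)
{
	return !dumpable;
}

/* Test against SUID_DUMP_USER: both 0 and 2 count as protected. */
static int is_protected(int dumpable)
{
	return dumpable != SUID_DUMP_USER;
}

/* Models the fixed check: deny unless dumpable as user or capable. */
static int ptrace_denied(int dumpable, int has_cap_sys_ptrace)
{
	assert(dumpable >= SUID_DUMP_DISABLE && dumpable <= SUID_DUMP_ROOT);
	return is_protected(dumpable) && !has_cap_sys_ptrace;
}
```

For dumpable == SUID_DUMP_ROOT the boolean-style test reports "not
protected", while the corrected test denies an unprivileged attach.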

The macros have been moved into the file that declares get/set_dumpable(),
which means things like the ia64 code can see them too.

CVE-2013-2929

Reported-by: Vasily Kulikov <segoon@openwall.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[dannf: backported to Debian's 2.6.32]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 arch/ia64/include/asm/processor.h | 2 +-
 fs/exec.c                         | 6 ++++++
 include/linux/binfmts.h           | 3 ---
 include/linux/sched.h             | 4 ++++
 kernel/ptrace.c                   | 2 +-
 5 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h
index 3eaeedf..d77b342 100644
--- a/arch/ia64/include/asm/processor.h
+++ b/arch/ia64/include/asm/processor.h
@@ -361,7 +361,7 @@ struct thread_struct {
 	regs->loadrs = 0;									\
 	regs->r8 = get_dumpable(current->mm);	/* set "don't zap registers" flag */		\
 	regs->r12 = new_sp - 16;	/* allocate 16 byte scratch area */			\
-	if (unlikely(!get_dumpable(current->mm))) {							\
+	if (unlikely(get_dumpable(current->mm) != SUID_DUMP_USER)) {	\
 		/*										\
 		 * Zap scratch regs to avoid leaking bits between processes with different	\
 		 * uid/privileges.								\
diff --git a/fs/exec.c b/fs/exec.c
index feb2435..c32ae34 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1793,6 +1793,12 @@ void set_dumpable(struct mm_struct *mm, int value)
 	}
 }
 
+/*
+ * This returns the actual value of the suid_dumpable flag. For things
+ * that are using this for checking for privilege transitions, it must
+ * test against SUID_DUMP_USER rather than treating it as a boolean
+ * value.
+ */
 int get_dumpable(struct mm_struct *mm)
 {
 	int ret;
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index 9ffffec..8eab628 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -107,9 +107,6 @@ extern int flush_old_exec(struct linux_binprm * bprm);
 extern void setup_new_exec(struct linux_binprm * bprm);
 
 extern int suid_dumpable;
-#define SUID_DUMP_DISABLE	0	/* No setuid dumping */
-#define SUID_DUMP_USER		1	/* Dump as user of process */
-#define SUID_DUMP_ROOT		2	/* Dump as root */
 
 /* Stack area protections */
 #define EXSTACK_DEFAULT   0	/* Whatever the arch defaults to */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 73c3b9b..56e1771 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -442,6 +442,10 @@ static inline unsigned long get_mm_hiwater_vm(struct mm_struct *mm)
 extern void set_dumpable(struct mm_struct *mm, int value);
 extern int get_dumpable(struct mm_struct *mm);
 
+#define SUID_DUMP_DISABLE	0	/* No setuid dumping */
+#define SUID_DUMP_USER		1	/* Dump as user of process */
+#define SUID_DUMP_ROOT		2	/* Dump as root */
+
 /* mm flags */
 /* dumpable bits */
 #define MMF_DUMPABLE      0  /* core dump is permitted */
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index d9c8c47..4185220 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -187,7 +187,7 @@ int __ptrace_may_access(struct task_struct *task, unsigned int mode)
 	smp_rmb();
 	if (task->mm)
 		dumpable = get_dumpable(task->mm);
-	if (!dumpable && !capable(CAP_SYS_PTRACE))
+	if (dumpable != SUID_DUMP_USER && !capable(CAP_SYS_PTRACE))
 		return -EPERM;
 
 	return security_ptrace_access_check(task, mode);
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 130/143] ipv6: call udp_push_pending_frames when uncorking a socket with
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (129 preceding siblings ...)
  2014-05-12  0:34 ` [ 129/143] exec/ptrace: fix get_dumpable() incorrect tests Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 131/143] dm snapshot: fix data corruption Willy Tarreau
                   ` (12 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Dave Jones, YOSHIFUJI Hideaki, Hannes Frederic Sowa,
	David S. Miller, Luis Henriques, Brad Figg, Tim Gardner,
	Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------
 AF_INET pending data

From: Hannes Frederic Sowa <hannes@stressinduktion.org>

CVE-2013-4162

BugLink: http://bugs.launchpad.net/bugs/1205070

We accidentally call down to ip6_push_pending_frames when uncorking
pending AF_INET data on an ipv6 socket. This results in the following
splat (from Dave Jones):

skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:126!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth
+netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c
CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ #37
task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000
RIP: 0010:[<ffffffff816e759c>]  [<ffffffff816e759c>] skb_panic+0x63/0x65
RSP: 0018:ffff8801e6431de8  EFLAGS: 00010282
RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006
RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520
RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800
R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800
FS:  00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Stack:
 ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4
 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6
 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0
Call Trace:
 [<ffffffff8159a9aa>] skb_push+0x3a/0x40
 [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0
 [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140
 [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0
 [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20
 [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0
 [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0
 [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40
 [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20
 [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0
 [<ffffffff816f5d54>] tracesys+0xdd/0xe2
Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55
RIP  [<ffffffff816e759c>] skb_panic+0x63/0x65
 RSP <ffff8801e6431de8>

This patch adds a check whether the pending data is of address family
AF_INET and, if so, calls udp_push_pending_frames directly from
udp_v6_push_pending_frames.
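
For illustration only, a minimal user-space sketch of the dispatch this
patch adds. Every toy_* name is hypothetical and merely stands in for the
kernel's corking state and push functions; it models the control flow, not
the skb handling.

```c
/* Illustrative address-family tags; the values mirror Linux's AF_INET and
 * AF_INET6, but the enum itself is hypothetical. */
enum toy_family { TOY_NONE = 0, TOY_AF_INET = 2, TOY_AF_INET6 = 10 };

/* Which output path ended up handling the corked data. */
enum toy_path { TOY_PATH_NONE, TOY_PATH_V4, TOY_PATH_V6 };

/* Hypothetical stand-in for the corking state kept in struct udp_sock. */
struct toy_udp_sock {
    enum toy_family pending;
};

/* Models udp_push_pending_frames(): the IPv4 output path. */
static enum toy_path toy_push_v4(struct toy_udp_sock *up)
{
    up->pending = TOY_NONE;
    return TOY_PATH_V4;
}

/* Models udp_v6_push_pending_frames() with the early AF_INET return. */
static enum toy_path toy_push_v6(struct toy_udp_sock *up)
{
    /* Corked IPv4 data on an IPv6 socket must take the IPv4 path. */
    if (up->pending == TOY_AF_INET)
        return toy_push_v4(up);

    if (up->pending == TOY_NONE)
        return TOY_PATH_NONE;

    up->pending = TOY_NONE;
    return TOY_PATH_V6;
}
```

Calling toy_push_v6() on a socket whose pending family is TOY_AF_INET
returns TOY_PATH_V4, which is the behaviour the check is meant to
guarantee.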

This bug was found by Dave Jones with trinity.

(Also move the initialization of fl6 below the AF_INET check, even if
not strictly necessary.)

Cc: Dave Jones <davej@redhat.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(back ported from commit 8822b64a0fa64a5dd1dfcf837c5b0be83f8c05d1)
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 include/net/udp.h | 1 +
 net/ipv4/udp.c    | 3 ++-
 net/ipv6/udp.c    | 7 ++++++-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/include/net/udp.h b/include/net/udp.h
index f98abd2..702bea0 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -134,6 +134,7 @@ extern void	udp_err(struct sk_buff *, u32);
 
 extern int	udp_sendmsg(struct kiocb *iocb, struct sock *sk,
 			    struct msghdr *msg, size_t len);
+extern int udp_push_pending_frames(struct sock *sk);
 extern void	udp_flush_pending_frames(struct sock *sk);
 
 extern int	udp_rcv(struct sk_buff *skb);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index dba3c01..0b2e07f 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -513,7 +513,7 @@ static void udp4_hwcsum_outgoing(struct sock *sk, struct sk_buff *skb,
 /*
  * Push out all pending data as one UDP datagram. Socket is locked.
  */
-static int udp_push_pending_frames(struct sock *sk)
+int udp_push_pending_frames(struct sock *sk)
 {
 	struct udp_sock  *up = udp_sk(sk);
 	struct inet_sock *inet = inet_sk(sk);
@@ -575,6 +575,7 @@ out:
 	up->pending = 0;
 	return err;
 }
+EXPORT_SYMBOL(udp_push_pending_frames);
 
 int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		size_t len)
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 3a91859..d0367eb 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -687,11 +687,16 @@ static int udp_v6_push_pending_frames(struct sock *sk)
 	struct udphdr *uh;
 	struct udp_sock  *up = udp_sk(sk);
 	struct inet_sock *inet = inet_sk(sk);
-	struct flowi *fl = &inet->cork.fl;
+	struct flowi *fl;
 	int err = 0;
 	int is_udplite = IS_UDPLITE(sk);
 	__wsum csum = 0;
 
+	if (up->pending == AF_INET)
+		return udp_push_pending_frames(sk);
+
+	fl = &inet->cork.fl;
+
 	/* Grab the skbuff where UDP header space exists. */
 	if ((skb = skb_peek(&sk->sk_write_queue)) == NULL)
 		goto out;
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 131/143] dm snapshot: fix data corruption
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (130 preceding siblings ...)
  2014-05-12  0:34 ` [ 130/143] ipv6: call udp_push_pending_frames when uncorking a socket with Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 132/143] crypto: ansi_cprng - Fix off by one error in non-block size request Willy Tarreau
                   ` (11 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Mikulas Patocka, Mike Snitzer, Alasdair G Kergon, Luis Henriques,
	Stefan Bader, Tim Gardner, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Mikulas Patocka <mpatocka@redhat.com>

CVE-2013-4299

BugLink: http://bugs.launchpad.net/bugs/1241769

This patch fixes a particular type of data corruption that has been
encountered when loading a snapshot's metadata from disk.

When we allocate a new chunk in persistent_prepare, we increment
ps->next_free and we make sure that it doesn't point to a metadata area
by further incrementing it if necessary.

When we load metadata from disk on device activation, ps->next_free is
positioned after the last used data chunk. However, if this last used
data chunk is followed by a metadata area, ps->next_free is positioned
erroneously to the metadata area. A newly-allocated chunk is placed at
the same location as the metadata area, resulting in data or metadata
corruption.

This patch changes the code so that ps->next_free skips the metadata
area when metadata are loaded in function read_exceptions.

The patch also moves a piece of code from persistent_prepare_exception
to a separate function skip_metadata to avoid code duplication.
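
For illustration only, a user-space sketch of the chunk arithmetic. It
assumes the layout implied by area_location(): chunk 0 is the header and
each area starts with one metadata chunk, so metadata sits at indices where
index % stride == 1. The toy_* names are hypothetical, and plain '%'
replaces the kernel's sector_div().

```c
#include <stdint.h>

typedef uint64_t toy_chunk_t;

/* Hypothetical, cut-down version of struct pstore. */
struct toy_pstore {
    uint32_t exceptions_per_area;
    toy_chunk_t next_free;
};

/* If next_free landed on a metadata chunk, step over it. */
static void toy_skip_metadata(struct toy_pstore *ps)
{
    uint32_t stride = ps->exceptions_per_area + 1;

    if (ps->next_free % stride == 1)
        ps->next_free++;
}

/* Hand out the current free chunk, then advance past any metadata. */
static toy_chunk_t toy_alloc_chunk(struct toy_pstore *ps)
{
    toy_chunk_t chunk = ps->next_free;

    ps->next_free++;
    toy_skip_metadata(ps);
    return chunk;
}
```

With exceptions_per_area = 3 the stride is 4 and metadata lives at chunks
1, 5, 9, and so on. If loading leaves next_free at 5, toy_skip_metadata()
moves it to 6, so the next allocation no longer overlaps the metadata area.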

CVE-2013-4299

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Cc: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
(back ported from commit e9c6a182649f4259db704ae15a91ac820e63b0ca)
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/md/dm-snap-persistent.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/md/dm-snap-persistent.c b/drivers/md/dm-snap-persistent.c
index 0c74642..97c3f06 100644
--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -252,6 +252,14 @@ static chunk_t area_location(struct pstore *ps, chunk_t area)
 	return 1 + ((ps->exceptions_per_area + 1) * area);
 }
 
+static void skip_metadata(struct pstore *ps)
+{
+	uint32_t stride = ps->exceptions_per_area + 1;
+	chunk_t next_free = ps->next_free;
+	if (sector_div(next_free, stride) == 1)
+		ps->next_free++;
+}
+
 /*
  * Read or write a metadata area.  Remembering to skip the first
  * chunk which holds the header.
@@ -481,6 +489,8 @@ static int read_exceptions(struct pstore *ps,
 
 	ps->current_area--;
 
+	skip_metadata(ps);
+
 	return 0;
 }
 
@@ -587,8 +597,6 @@ static int persistent_prepare_exception(struct dm_exception_store *store,
 					struct dm_snap_exception *e)
 {
 	struct pstore *ps = get_info(store);
-	uint32_t stride;
-	chunk_t next_free;
 	sector_t size = get_dev_size(store->cow->bdev);
 
 	/* Is there enough room ? */
@@ -601,10 +609,8 @@ static int persistent_prepare_exception(struct dm_exception_store *store,
 	 * Move onto the next free pending, making sure to take
 	 * into account the location of the metadata chunks.
 	 */
-	stride = (ps->exceptions_per_area + 1);
-	next_free = ++ps->next_free;
-	if (sector_div(next_free, stride) == 1)
-		ps->next_free++;
+	ps->next_free++;
+	skip_metadata(ps);
 
 	atomic_inc(&ps->pending_count);
 	return 0;
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 132/143] crypto: ansi_cprng - Fix off by one error in non-block size request
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (131 preceding siblings ...)
  2014-05-12  0:34 ` [ 131/143] dm snapshot: fix data corruption Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 133/143] uml: check length in exitcode_proc_write() Willy Tarreau
                   ` (10 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Neil Horman, Stephan Mueller, Petr Matousek, Herbert Xu,
	David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Neil Horman <nhorman@tuxdriver.com>

commit 714b33d15130cbb5ab426456d4e3de842d6c5b8a upstream

Stephan Mueller recently reported to me an error in random number generation in
the ansi cprng. If several small requests are made that are less than the
instance's block size, the remainder for loop code doesn't increment
rand_data_valid in the last iteration, meaning that the last bytes in the
rand_data buffer get reused on the subsequent smaller-than-a-block request for
random data.

The fix is pretty easy: re-code the for loop to make sure that
rand_data_valid gets incremented appropriately.
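
For illustration only, a user-space sketch contrasting the two loop shapes.
The toy_* names and TOY_BLK_SZ are hypothetical; an early return plays the
role of the "goto done" in the real code.

```c
#include <stddef.h>

#define TOY_BLK_SZ 16

/* Hypothetical, cut-down version of the PRNG context. */
struct toy_ctx {
    unsigned char rand_data[TOY_BLK_SZ];
    unsigned int rand_data_valid;   /* index of the next unused byte */
};

/* Old shape: the increment lives in the for-header, so leaving the loop
 * on the last requested byte skips it. */
static size_t toy_drain_buggy(struct toy_ctx *ctx, unsigned char *out, size_t n)
{
    size_t copied = 0;

    if (n == 0)
        return 0;
    for (; ctx->rand_data_valid < TOY_BLK_SZ; ctx->rand_data_valid++) {
        out[copied++] = ctx->rand_data[ctx->rand_data_valid];
        if (copied == n)
            return copied;      /* exits before the increment runs */
    }
    return copied;
}

/* New shape: the index is advanced before the exit test. */
static size_t toy_drain_fixed(struct toy_ctx *ctx, unsigned char *out, size_t n)
{
    size_t copied = 0;

    if (n == 0)
        return 0;
    while (ctx->rand_data_valid < TOY_BLK_SZ) {
        out[copied++] = ctx->rand_data[ctx->rand_data_valid];
        ctx->rand_data_valid++;
        if (copied == n)
            return copied;
    }
    return copied;
}
```

Two back-to-back 4-byte requests against the buggy variant hand out the
byte at index 3 twice; the fixed variant starts the second request at
index 4.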

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Stephan Mueller <stephan.mueller@atsec.com>
CC: Stephan Mueller <stephan.mueller@atsec.com>
CC: Petr Matousek <pmatouse@redhat.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 crypto/ansi_cprng.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/crypto/ansi_cprng.c b/crypto/ansi_cprng.c
index 3aa6e38..0ffd5995 100644
--- a/crypto/ansi_cprng.c
+++ b/crypto/ansi_cprng.c
@@ -232,11 +232,11 @@ remainder:
 	 */
 	if (byte_count < DEFAULT_BLK_SZ) {
 empty_rbuf:
-		for (; ctx->rand_data_valid < DEFAULT_BLK_SZ;
-			ctx->rand_data_valid++) {
+		while (ctx->rand_data_valid < DEFAULT_BLK_SZ) {
 			*ptr = ctx->rand_data[ctx->rand_data_valid];
 			ptr++;
 			byte_count--;
+			ctx->rand_data_valid++;
 			if (byte_count == 0)
 				goto done;
 		}
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 133/143] uml: check length in exitcode_proc_write()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (132 preceding siblings ...)
  2014-05-12  0:34 ` [ 132/143] crypto: ansi_cprng - Fix off by one error in non-block size request Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 134/143] KVM: Improve create VCPU parameter (CVE-2013-4587) Willy Tarreau
                   ` (9 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Dan Carpenter, stable, Linus Torvalds, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

commit 201f99f170df14ba52ea4c52847779042b7a623b upstream

We don't cap the size of the buffer from the user, so we could write past
the end of the array here.  Only root can write to this file.
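
For illustration only, a user-space sketch of the clamp. memcpy() stands in
for copy_from_user() and toy_bounded_copy() is a hypothetical helper; like
the clamp itself, it does not NUL-terminate the destination.

```c
#include <stddef.h>
#include <string.h>

/* Copy at most dst_size bytes from src, however large count claims to be.
 * Returns the number of bytes actually copied. */
static size_t toy_bounded_copy(char *dst, size_t dst_size,
                               const char *src, size_t count)
{
    size_t size = count < dst_size ? count : dst_size;

    memcpy(dst, src, size);
    return size;
}
```

With a 7-byte destination (the size of "nnnnn\0" including its implicit
terminator), a 10-byte write is cut down to 7 bytes and memory after the
buffer is left untouched.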

Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 arch/um/kernel/exitcode.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/um/kernel/exitcode.c b/arch/um/kernel/exitcode.c
index 6540d2c..ce057af 100644
--- a/arch/um/kernel/exitcode.c
+++ b/arch/um/kernel/exitcode.c
@@ -42,9 +42,11 @@ static int write_proc_exitcode(struct file *file, const char __user *buffer,
 			       unsigned long count, void *data)
 {
 	char *end, buf[sizeof("nnnnn\0")];
+	size_t size;
 	int tmp;
 
-	if (copy_from_user(buf, buffer, count))
+	size = min(count, sizeof(buf));
+	if (copy_from_user(buf, buffer, size))
 		return -EFAULT;
 
 	tmp = simple_strtol(buf, &end, 0);
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 134/143] KVM: Improve create VCPU parameter (CVE-2013-4587)
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (133 preceding siblings ...)
  2014-05-12  0:34 ` [ 133/143] uml: check length in exitcode_proc_write() Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 135/143] KVM: x86: Fix potential divide by 0 in lapic (CVE-2013-6367) Willy Tarreau
                   ` (8 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Andrew Honig, Paolo Bonzini, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Honig <ahonig@google.com>

commit 338c7dbadd2671189cec7faf64c84d01071b3f96 upstream

In multiple functions the vcpu_id is used as an offset into a bitfield.  A
malicious user could specify a vcpu_id greater than 255 in order to set or
clear bits in kernel memory.  This could be used to elevate privileges in the
kernel.  This patch verifies that the vcpu_id provided is less than 255.
The API documentation already specifies that the vcpu_id must be less than
max_vcpus, but this is currently not checked.
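
For illustration only, a user-space sketch of why an unchecked id is
dangerous when it indexes a bitmap. TOY_MAX_VCPUS and the toy_* names are
hypothetical and are not KVM's definitions.

```c
#include <errno.h>

#define TOY_MAX_VCPUS 64    /* hypothetical limit */

/* Hypothetical VM state: one bit per possible vcpu. */
struct toy_vm {
    unsigned char vcpu_bitmap[TOY_MAX_VCPUS / 8];
};

/* Reject out-of-range ids before they are used as a bit offset. */
static int toy_create_vcpu(struct toy_vm *vm, unsigned int id)
{
    if (id >= TOY_MAX_VCPUS)
        return -EINVAL;

    vm->vcpu_bitmap[id / 8] |= (unsigned char)(1u << (id % 8));
    return 0;
}
```

Without the range check, an id of 300 would address byte 37 of an 8-byte
array, i.e. memory that does not belong to the bitmap.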

Reported-by: Andrew Honig <ahonig@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 virt/kvm/kvm_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 82b6fdc..3b9443b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1846,6 +1846,9 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
 	int r;
 	struct kvm_vcpu *vcpu, *v;
 
+	if (id >= KVM_MAX_VCPUS)
+		return -EINVAL;
+
 	vcpu = kvm_arch_vcpu_create(kvm, id);
 	if (IS_ERR(vcpu))
 		return PTR_ERR(vcpu);
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 135/143] KVM: x86: Fix potential divide by 0 in lapic (CVE-2013-6367)
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (134 preceding siblings ...)
  2014-05-12  0:34 ` [ 134/143] KVM: Improve create VCPU parameter (CVE-2013-4587) Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 136/143] qeth: avoid buffer overflow in snmp ioctl Willy Tarreau
                   ` (7 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Andrew Honig, Paolo Bonzini, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Honig <ahonig@google.com>

commit b963a22e6d1a266a67e9eecc88134713fd54775c upstream

Under guest controllable circumstances apic_get_tmcct will execute a
divide by zero and cause a crash.  If the guest cpuid supports
tsc deadline timers and the guest performs the following sequence of
requests, the host will crash.
- Set the mode to periodic
- Set the TMICT to 0
- Set the mode bits to 11 (neither periodic, nor one shot, nor tsc deadline)
- Set the TMICT to non-zero.
Then the lapic_timer.period will be 0, but the TMICT will not be.  If the
guest then reads from the TMCCT then the host will perform a divide by 0.

This patch ensures that if the lapic_timer.period is 0, then the division
does not occur.
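
For illustration only, a simplified user-space model of the guard. Reducing
the remaining time modulo the period is an assumption about where the
division by zero would occur; toy_get_tmcct() and its parameters are
hypothetical, and ns_per_count is assumed to be non-zero.

```c
#include <stdint.h>

/* Return the current timer count, or 0 when the timer is not armed or the
 * period is zero. */
static uint32_t toy_get_tmcct(uint32_t tmict, uint64_t period_ns,
                              uint64_t remaining_ns, uint64_t ns_per_count)
{
    /* Either condition means there is nothing to divide by. */
    if (tmict == 0 || period_ns == 0)
        return 0;

    return (uint32_t)((remaining_ns % period_ns) / ns_per_count);
}
```

With a non-zero TMICT and a zero period the function returns 0 instead of
evaluating remaining_ns % 0.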

Reported-by: Andrew Honig <ahonig@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[dannf: backported to Debian's 2.6.32]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 arch/x86/kvm/lapic.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 8dfeaaa..b77857f 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -519,7 +519,8 @@ static u32 apic_get_tmcct(struct kvm_lapic *apic)
 	ASSERT(apic != NULL);
 
 	/* if initial count is 0, current count should also be 0 */
-	if (apic_get_reg(apic, APIC_TMICT) == 0)
+	if (apic_get_reg(apic, APIC_TMICT) == 0 ||
+		apic->lapic_timer.period == 0)
 		return 0;
 
 	remaining = hrtimer_get_remaining(&apic->lapic_timer.timer);
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 136/143] qeth: avoid buffer overflow in snmp ioctl
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (135 preceding siblings ...)
  2014-05-12  0:34 ` [ 135/143] KVM: x86: Fix potential divide by 0 in lapic (CVE-2013-6367) Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 137/143] xfs: underflow bug in xfs_attrlist_by_handle() Willy Tarreau
                   ` (6 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Ursula Braun, Frank Blaschka, David S. Miller, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Ursula Braun <ursula.braun@de.ibm.com>

commit 6fb392b1a63ae36c31f62bc3fc8630b49d602b62 upstream

Check user-defined length in snmp ioctl request and allow request
only if it fits into a qeth command buffer.
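
For illustration only, a user-space sketch of the length check. The sizes
are hypothetical placeholders and do not reflect the real QETH_BUFSIZE or
header sizes.

```c
#include <errno.h>

#define TOY_BUFSIZE      4096u  /* hypothetical command buffer size */
#define TOY_HDR_OVERHEAD   96u  /* hypothetical combined header size */

/* Accept a user-supplied length only if it fits after the headers.
 * Taking the value as unsigned means a "negative" length from user space
 * shows up as a huge number and is rejected by the same comparison. */
static int toy_check_req_len(unsigned int req_len)
{
    if (req_len > TOY_BUFSIZE - TOY_HDR_OVERHEAD)
        return -EINVAL;
    return 0;
}
```

A request of exactly TOY_BUFSIZE - TOY_HDR_OVERHEAD bytes is accepted; one
byte more, or a value such as (unsigned int)-1, is refused.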

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reviewed-by: Heiko Carstens <heicars2@linux.vnet.ibm.com>
Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[jmm: backport 2.6.32]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/s390/net/qeth_core_main.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index c4a42d9..29afd6c 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -3557,7 +3557,7 @@ int qeth_snmp_command(struct qeth_card *card, char __user *udata)
 	struct qeth_cmd_buffer *iob;
 	struct qeth_ipa_cmd *cmd;
 	struct qeth_snmp_ureq *ureq;
-	int req_len;
+	unsigned int req_len;
 	struct qeth_arp_query_info qinfo = {0, };
 	int rc = 0;
 
@@ -3573,6 +3573,10 @@ int qeth_snmp_command(struct qeth_card *card, char __user *udata)
 	/* skip 4 bytes (data_len struct member) to get req_len */
 	if (copy_from_user(&req_len, udata + sizeof(int), sizeof(int)))
 		return -EFAULT;
+	if (req_len > (QETH_BUFSIZE - IPA_PDU_HEADER_SIZE -
+		       sizeof(struct qeth_ipacmd_hdr) -
+		       sizeof(struct qeth_ipacmd_setadpparms_hdr)))
+		return -EINVAL;
 	ureq = kmalloc(req_len+sizeof(struct qeth_snmp_ureq_hdr), GFP_KERNEL);
 	if (!ureq) {
 		QETH_DBF_TEXT(TRACE, 2, "snmpnome");
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 137/143] xfs: underflow bug in xfs_attrlist_by_handle()
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (136 preceding siblings ...)
  2014-05-12  0:34 ` [ 136/143] qeth: avoid buffer overflow in snmp ioctl Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-13 11:08   ` Luis Henriques
  2014-05-12  0:34 ` [ 138/143] aacraid: missing capable() check in compat ioctl Willy Tarreau
                   ` (5 subsequent siblings)
  143 siblings, 1 reply; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Dan Carpenter, Ben Myers, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

If we allocate less than sizeof(struct attrlist) then we end up
corrupting memory or doing a ZERO_SIZE_PTR dereference.

This can only be triggered with CAP_SYS_ADMIN.
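
For illustration only, a user-space sketch of the two-sided range check.
struct toy_attrlist is a hypothetical stand-in for struct attrlist, and
TOY_LIST_MAX is assumed to play the role of XATTR_LIST_MAX.

```c
#include <errno.h>

/* Hypothetical fixed header that the result buffer must at least hold. */
struct toy_attrlist {
    int al_count;
    int al_more;
    int al_offset[1];
};

#define TOY_LIST_MAX 65536u     /* assumed upper bound */

/* The buffer must hold the header and must not exceed the maximum. */
static int toy_check_buflen(unsigned int buflen)
{
    if (buflen < sizeof(struct toy_attrlist) || buflen > TOY_LIST_MAX)
        return -EINVAL;
    return 0;
}
```

A buflen of 0 previously passed the upper-bound test; with the lower bound
in place it is rejected before any allocation happens.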

Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

(cherry picked from commit 071c529eb672648ee8ca3f90944bcbcc730b4c06)
[dannf: backported to Debian's 2.6.32]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 fs/xfs/linux-2.6/xfs_ioctl.c   | 3 ++-
 fs/xfs/linux-2.6/xfs_ioctl32.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_ioctl.c b/fs/xfs/linux-2.6/xfs_ioctl.c
index 942362f..5663351 100644
--- a/fs/xfs/linux-2.6/xfs_ioctl.c
+++ b/fs/xfs/linux-2.6/xfs_ioctl.c
@@ -410,7 +410,8 @@ xfs_attrlist_by_handle(
 		return -XFS_ERROR(EPERM);
 	if (copy_from_user(&al_hreq, arg, sizeof(xfs_fsop_attrlist_handlereq_t)))
 		return -XFS_ERROR(EFAULT);
-	if (al_hreq.buflen > XATTR_LIST_MAX)
+	if (al_hreq.buflen < sizeof(struct attrlist) ||
+	    al_hreq.buflen > XATTR_LIST_MAX)
 		return -XFS_ERROR(EINVAL);
 
 	/*
diff --git a/fs/xfs/linux-2.6/xfs_ioctl32.c b/fs/xfs/linux-2.6/xfs_ioctl32.c
index bad485a..782d03d 100644
--- a/fs/xfs/linux-2.6/xfs_ioctl32.c
+++ b/fs/xfs/linux-2.6/xfs_ioctl32.c
@@ -361,8 +361,9 @@ xfs_compat_attrlist_by_handle(
 	if (copy_from_user(&al_hreq, arg,
 			   sizeof(compat_xfs_fsop_attrlist_handlereq_t)))
 		return -XFS_ERROR(EFAULT);
-	if (al_hreq.buflen > XATTR_LIST_MAX)
+	if (al_hreq.buflen < sizeof(struct attrlist) ||
+	    al_hreq.buflen > XATTR_LIST_MAX)
 		return -XFS_ERROR(EINVAL);
 
 	/*
 	 * Reject flags, only allow namespaces.
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 138/143] aacraid: missing capable() check in compat ioctl
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (137 preceding siblings ...)
  2014-05-12  0:34 ` [ 137/143] xfs: underflow bug in xfs_attrlist_by_handle() Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 139/143] SELinux: Fix kernel BUG on empty security contexts Willy Tarreau
                   ` (4 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Dan Carpenter, stable, Linus Torvalds, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Dan Carpenter <dan.carpenter@oracle.com>

commit f856567b930dfcdbc3323261bf77240ccdde01f5 upstream

In commit d496f94d22d1 ('[SCSI] aacraid: fix security weakness') we
added a check on CAP_SYS_RAWIO to the ioctl.  The compat ioctls need the
check as well.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/scsi/aacraid/linit.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
index 9b97c3e..387872c 100644
--- a/drivers/scsi/aacraid/linit.c
+++ b/drivers/scsi/aacraid/linit.c
@@ -754,6 +754,8 @@ static long aac_compat_do_ioctl(struct aac_dev *dev, unsigned cmd, unsigned long
 static int aac_compat_ioctl(struct scsi_device *sdev, int cmd, void __user *arg)
 {
 	struct aac_dev *dev = (struct aac_dev *)sdev->host->hostdata;
+	if (!capable(CAP_SYS_RAWIO))
+		return -EPERM;
 	return aac_compat_do_ioctl(dev, cmd, (unsigned long)arg);
 }
 
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 139/143] SELinux: Fix kernel BUG on empty security contexts.
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (138 preceding siblings ...)
  2014-05-12  0:34 ` [ 138/143] aacraid: missing capable() check in compat ioctl Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 140/143] s390: fix kernel crash due to linkage stack instructions Willy Tarreau
                   ` (3 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Stephen Smalley, Paul Moore, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Stephen Smalley <sds@tycho.nsa.gov>

commit 2172fa709ab32ca60e86179dc67d0857be8e2c98 upstream

Setting an empty security context (length=0) on a file will
lead to incorrectly dereferencing the type and other fields
of the security context structure, yielding a kernel BUG.
As a zero-length security context is never valid, just reject
all such security contexts whether coming from userspace
via setxattr or coming from the filesystem upon a getxattr
request by SELinux.

Setting a security context value (empty or otherwise) unknown to
SELinux in the first place is only possible for a root process
(CAP_MAC_ADMIN), and, if running SELinux in enforcing mode, only
if the corresponding SELinux mac_admin permission is also granted
to the domain by policy.  In Fedora policies, this is only allowed for
specific domains such as livecd for setting down security contexts
that are not defined in the build host policy.

Reproducer:
su
setenforce 0
touch foo
setfattr -n security.selinux foo

Caveat:
Relabeling or removing foo after doing the above may not be possible
without booting with SELinux disabled.  Any subsequent access to foo
after doing the above will also trigger the BUG.

BUG output from Matthew Thode:
[  473.893141] ------------[ cut here ]------------
[  473.962110] kernel BUG at security/selinux/ss/services.c:654!
[  473.995314] invalid opcode: 0000 [#6] SMP
[  474.027196] Modules linked in:
[  474.058118] CPU: 0 PID: 8138 Comm: ls Tainted: G      D   I
3.13.0-grsec #1
[  474.116637] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0
07/29/10
[  474.149768] task: ffff8805f50cd010 ti: ffff8805f50cd488 task.ti:
ffff8805f50cd488
[  474.183707] RIP: 0010:[<ffffffff814681c7>]  [<ffffffff814681c7>]
context_struct_compute_av+0xce/0x308
[  474.219954] RSP: 0018:ffff8805c0ac3c38  EFLAGS: 00010246
[  474.252253] RAX: 0000000000000000 RBX: ffff8805c0ac3d94 RCX:
0000000000000100
[  474.287018] RDX: ffff8805e8aac000 RSI: 00000000ffffffff RDI:
ffff8805e8aaa000
[  474.321199] RBP: ffff8805c0ac3cb8 R08: 0000000000000010 R09:
0000000000000006
[  474.357446] R10: 0000000000000000 R11: ffff8805c567a000 R12:
0000000000000006
[  474.419191] R13: ffff8805c2b74e88 R14: 00000000000001da R15:
0000000000000000
[  474.453816] FS:  00007f2e75220800(0000) GS:ffff88061fc00000(0000)
knlGS:0000000000000000
[  474.489254] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  474.522215] CR2: 00007f2e74716090 CR3: 00000005c085e000 CR4:
00000000000207f0
[  474.556058] Stack:
[  474.584325]  ffff8805c0ac3c98 ffffffff811b549b ffff8805c0ac3c98
ffff8805f1190a40
[  474.618913]  ffff8805a6202f08 ffff8805c2b74e88 00068800d0464990
ffff8805e8aac860
[  474.653955]  ffff8805c0ac3cb8 000700068113833a ffff880606c75060
ffff8805c0ac3d94
[  474.690461] Call Trace:
[  474.723779]  [<ffffffff811b549b>] ? lookup_fast+0x1cd/0x22a
[  474.778049]  [<ffffffff81468824>] security_compute_av+0xf4/0x20b
[  474.811398]  [<ffffffff8196f419>] avc_compute_av+0x2a/0x179
[  474.843813]  [<ffffffff8145727b>] avc_has_perm+0x45/0xf4
[  474.875694]  [<ffffffff81457d0e>] inode_has_perm+0x2a/0x31
[  474.907370]  [<ffffffff81457e76>] selinux_inode_getattr+0x3c/0x3e
[  474.938726]  [<ffffffff81455cf6>] security_inode_getattr+0x1b/0x22
[  474.970036]  [<ffffffff811b057d>] vfs_getattr+0x19/0x2d
[  475.000618]  [<ffffffff811b05e5>] vfs_fstatat+0x54/0x91
[  475.030402]  [<ffffffff811b063b>] vfs_lstat+0x19/0x1b
[  475.061097]  [<ffffffff811b077e>] SyS_newlstat+0x15/0x30
[  475.094595]  [<ffffffff8113c5c1>] ? __audit_syscall_entry+0xa1/0xc3
[  475.148405]  [<ffffffff8197791e>] system_call_fastpath+0x16/0x1b
[  475.179201] Code: 00 48 85 c0 48 89 45 b8 75 02 0f 0b 48 8b 45 a0 48
8b 3d 45 d0 b6 00 8b 40 08 89 c6 ff ce e8 d1 b0 06 00 48 85 c0 49 89 c7
75 02 <0f> 0b 48 8b 45 b8 4c 8b 28 eb 1e 49 8d 7d 08 be 80 01 00 00 e8
[  475.255884] RIP  [<ffffffff814681c7>]
context_struct_compute_av+0xce/0x308
[  475.296120]  RSP <ffff8805c0ac3c38>
[  475.328734] ---[ end trace f076482e9d754adc ]---

Reported-by:  Matthew Thode <mthode@mthode.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 security/selinux/ss/services.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index ff17820..dee7177 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -1074,6 +1074,10 @@ static int security_context_to_sid_core(const char *scontext, u32 scontext_len,
 	struct context context;
 	int rc = 0;
 
+	/* An empty security context is never valid. */
+	if (!scontext_len)
+		return -EINVAL;
+
 	if (!ss_initialized) {
 		int i;
 
-- 
1.7.12.2.21.g234cd45.dirty




^ permalink raw reply related	[flat|nested] 172+ messages in thread

* [ 140/143] s390: fix kernel crash due to linkage stack instructions
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (139 preceding siblings ...)
  2014-05-12  0:34 ` [ 139/143] SELinux: Fix kernel BUG on empty security contexts Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 141/143] netfilter: nf_conntrack_dccp: fix skb_header_pointer API usages Willy Tarreau
                   ` (2 subsequent siblings)
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Martin Schwidefsky, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

commit 8d7f6690cedb83456edd41c9bd583783f0703bf0 upstream

The kernel currently crashes with a low-address-protection exception
if a user space process executes an instruction that tries to use the
linkage stack. Set the base-ASTE origin and the subspace-ASTE origin
of the dispatchable-unit-control-table to point to a dummy ASTE.
Set up control register 15 to point to an empty linkage stack with no
room left.

A user space process with a linkage stack instruction will still crash
but with a different exception which is correctly translated to a
segmentation fault instead of a kernel oops.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[dannf: backported to Debian's 2.6.32]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 arch/s390/kernel/head64.S | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/s390/kernel/head64.S b/arch/s390/kernel/head64.S
index d984a2a..5b27ed0 100644
--- a/arch/s390/kernel/head64.S
+++ b/arch/s390/kernel/head64.S
@@ -124,7 +124,7 @@ startup_continue:
 	.quad	0			# cr12: tracing off
 	.quad	0			# cr13: home space segment table
 	.quad	0xc0000000		# cr14: machine check handling off
-	.quad	0			# cr15: linkage stack operations
+	.quad	.Llinkage_stack		# cr15: linkage stack operations
 .Lpcmsk:.quad	0x0000000180000000
 .L4malign:.quad 0xffffffffffc00000
 .Lscan2g:.quad	0x80000000 + 0x20000 - 8	# 2GB + 128K - 8
@@ -139,12 +139,15 @@ startup_continue:
 .Lparmaddr:
 	.quad	PARMAREA
 	.align	64
-.Lduct: .long	0,0,0,0,.Lduald,0,0,0
+.Lduct: .long	0,.Laste,.Laste,0,.Lduald,0,0,0
 	.long	0,0,0,0,0,0,0,0
+.Laste:	.quad	0,0xffffffffffffffff,0,0,0,0,0,0
 	.align	128
 .Lduald:.rept	8
 	.long	0x80000000,0,0,0	# invalid access-list entries
 	.endr
+.Llinkage_stack:
+	.long	0,0,0x89000000,0,0,0,0x8a000000,0
 
 	.org	0x12000
 	.globl	_ehead
-- 
1.7.12.2.21.g234cd45.dirty





* [ 141/143] netfilter: nf_conntrack_dccp: fix skb_header_pointer API usages
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (140 preceding siblings ...)
  2014-05-12  0:34 ` [ 140/143] s390: fix kernel crash due to linkage stack instructions Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 142/143] floppy: ignore kernel-only members in FDRAWCMD ioctl input Willy Tarreau
  2014-05-12  0:34 ` [ 143/143] floppy: dont write kernel-only members to FDRAWCMD ioctl output Willy Tarreau
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Daniel Borkmann, Pablo Neira Ayuso, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dborkman@redhat.com>

commit b22f5126a24b3b2f15448c3f2a254fc10cbc2b92 upstream

Some occurrences in the netfilter tree use skb_header_pointer() in
the following way ...

  struct dccp_hdr _dh, *dh;
  ...
  skb_header_pointer(skb, dataoff, sizeof(_dh), &dh);

... where dh itself is a pointer that is being passed as the copy
buffer. Instead, we need to use &_dh as the fourth argument so that
we're copying the data into an actual buffer that sits on the stack.

Currently, we could probably overwrite memory on the stack (e.g.
with a possibly malformed DCCP packet), but only unintentionally, as
we only want the data to be placed into the _dh variable.

Fixes: 2bc780499aa3 ("[NETFILTER]: nf_conntrack: add DCCP protocol support")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/netfilter/nf_conntrack_proto_dccp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nf_conntrack_proto_dccp.c b/net/netfilter/nf_conntrack_proto_dccp.c
index 1b816a2..274e8a7 100644
--- a/net/netfilter/nf_conntrack_proto_dccp.c
+++ b/net/netfilter/nf_conntrack_proto_dccp.c
@@ -430,7 +430,7 @@ static bool dccp_new(struct nf_conn *ct, const struct sk_buff *skb,
 	const char *msg;
 	u_int8_t state;
 
-	dh = skb_header_pointer(skb, dataoff, sizeof(_dh), &dh);
+	dh = skb_header_pointer(skb, dataoff, sizeof(_dh), &_dh);
 	BUG_ON(dh == NULL);
 
 	state = dccp_state_table[CT_DCCP_ROLE_CLIENT][dh->dccph_type][CT_DCCP_NONE];
@@ -479,7 +479,7 @@ static int dccp_packet(struct nf_conn *ct, const struct sk_buff *skb,
 	u_int8_t type, old_state, new_state;
 	enum ct_dccp_roles role;
 
-	dh = skb_header_pointer(skb, dataoff, sizeof(_dh), &dh);
+	dh = skb_header_pointer(skb, dataoff, sizeof(_dh), &_dh);
 	BUG_ON(dh == NULL);
 	type = dh->dccph_type;
 
@@ -570,7 +570,7 @@ static int dccp_error(struct net *net, struct sk_buff *skb,
 	unsigned int cscov;
 	const char *msg;
 
-	dh = skb_header_pointer(skb, dataoff, sizeof(_dh), &dh);
+	dh = skb_header_pointer(skb, dataoff, sizeof(_dh), &_dh);
 	if (dh == NULL) {
 		msg = "nf_ct_dccp: short packet ";
 		goto out_invalid;
-- 
1.7.12.2.21.g234cd45.dirty
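
As a side note on the fix above, here is a minimal userspace sketch of the
calling convention it restores. It only models the contract described in the
commit message (return a pointer into the packet when the range is directly
addressable, otherwise copy into the caller's buffer). Every toy_* name is
hypothetical and none of this is kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff: a linear head plus one fragment. */
struct toy_skb {
	const unsigned char *head;	/* linear data */
	size_t head_len;
	const unsigned char *frag;	/* non-linear data following the head */
	size_t frag_len;
};

/* Hypothetical 4-byte header, standing in for struct dccp_hdr. */
struct toy_hdr {
	unsigned char bytes[4];
};

/*
 * Model of the skb_header_pointer() contract: return a pointer straight
 * into the linear data when the range fits there, otherwise copy len bytes
 * into the caller's buffer and return that buffer. NULL means short packet.
 */
static void *toy_header_pointer(const struct toy_skb *skb, size_t offset,
				size_t len, void *buffer)
{
	unsigned char *out = buffer;
	size_t i;

	if (offset + len <= skb->head_len)
		return (void *)(skb->head + offset);
	if (offset + len > skb->head_len + skb->frag_len)
		return NULL;

	assert(buffer != NULL);
	for (i = 0; i < len; i++) {
		size_t pos = offset + i;

		out[i] = pos < skb->head_len ? skb->head[pos]
					     : skb->frag[pos - skb->head_len];
	}
	return buffer;
}

/* Correct pattern: the scratch buffer is the struct (&_dh), not the pointer. */
static int toy_first_byte(const struct toy_skb *skb, size_t offset)
{
	struct toy_hdr _dh, *dh;

	dh = toy_header_pointer(skb, offset, sizeof(_dh), &_dh);
	if (dh == NULL)
		return -1;
	return dh->bytes[0];
}
```

Passing &dh instead of &_dh would hand the helper a pointer-sized object as
scratch space for sizeof(_dh) bytes, which is the stack overwrite the commit
message warns about whenever the copy path is taken.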





* [ 142/143] floppy: ignore kernel-only members in FDRAWCMD ioctl input
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (141 preceding siblings ...)
  2014-05-12  0:34 ` [ 141/143] netfilter: nf_conntrack_dccp: fix skb_header_pointer API usages Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  2014-05-12  0:34 ` [ 143/143] floppy: dont write kernel-only members to FDRAWCMD ioctl output Willy Tarreau
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Matthew Daley, Linus Torvalds, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Matthew Daley <mattd@bugfuzz.com>

Always clear out these floppy_raw_cmd struct members after copying the
entire structure from userspace so that the in-kernel version is always
valid and never left in an indeterminate state.

Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit ef87dbe7614341c2e7bfe8d32fcb7028cc97442c)
[wt: be careful in 2.6.32 we still have the ugly macros everywhere]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/block/floppy.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 5c01f74..19d45e6 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -3209,9 +3209,12 @@ static inline int raw_cmd_copyin(int cmd, char __user *param,
 		if (!ptr)
 			return -ENOMEM;
 		*rcmd = ptr;
-		COPYIN(*ptr);
+		ret = copy_from_user(ptr, (void __user *)param, sizeof(*ptr));
 		ptr->next = NULL;
 		ptr->buffer_length = 0;
+		ptr->kernel_data = NULL;
+		if (ret)
+			return -EFAULT;
 		param += sizeof(struct floppy_raw_cmd);
 		if (ptr->cmd_count > 33)
 			/* the command may now also take up the space
-- 
1.7.12.2.21.g234cd45.dirty
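
For illustration only, the copy-then-sanitize pattern used by the patch above
can be sketched in plain userspace C. The struct is a hypothetical, trimmed
stand-in for struct floppy_raw_cmd, and memcpy() stands in for
copy_from_user(). Unlike memcpy(), the real call can fail, and the patch
checks its result only after the kernel-only members have been reset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct floppy_raw_cmd. */
struct toy_raw_cmd {
	unsigned int flags;
	long length;
	void *kernel_data;		/* kernel-only: DMA buffer */
	struct toy_raw_cmd *next;	/* kernel-only: list link */
	int buffer_length;		/* kernel-only: size of kernel_data */
};

/*
 * Copy a command from an untrusted source, then unconditionally reset the
 * members that only the kernel side may set, so the in-kernel copy never
 * carries caller-supplied pointers.
 */
static void toy_copyin(struct toy_raw_cmd *dst, const void *untrusted)
{
	assert(dst != NULL && untrusted != NULL);

	memcpy(dst, untrusted, sizeof(*dst));
	dst->next = NULL;
	dst->buffer_length = 0;
	dst->kernel_data = NULL;
}
```

The reset happens regardless of what the caller put into those fields, so
later cleanup code cannot be tricked into freeing or following a pointer that
came from userspace.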





* [ 143/143] floppy: dont write kernel-only members to FDRAWCMD ioctl output
       [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
                   ` (142 preceding siblings ...)
  2014-05-12  0:34 ` [ 142/143] floppy: ignore kernel-only members in FDRAWCMD ioctl input Willy Tarreau
@ 2014-05-12  0:34 ` Willy Tarreau
  143 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  0:34 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Matthew Daley, Linus Torvalds, Willy Tarreau

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Matthew Daley <mattd@bugfuzz.com>

Do not leak kernel-only floppy_raw_cmd structure members to userspace.
This includes the linked-list pointer and the pointer to the allocated
DMA space.

Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 2145e15e0557a01b9195d1c7199a1b92cb9be81f)
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/block/floppy.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 19d45e6..f959aad 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -3162,7 +3162,12 @@ static inline int raw_cmd_copyout(int cmd, char __user *param,
 	int ret;
 
 	while (ptr) {
-		COPYOUT(*ptr);
+		struct floppy_raw_cmd cmd = *ptr;
+		cmd.next = NULL;
+		cmd.kernel_data = NULL;
+		ret = copy_to_user((void __user *)param, &cmd, sizeof(cmd));
+		if (ret)
+			return -EFAULT;
 		param += sizeof(struct floppy_raw_cmd);
 		if ((ptr->flags & FD_RAW_READ) && ptr->buffer_length) {
 			if (ptr->length >= 0
-- 
1.7.12.2.21.g234cd45.dirty
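
The output side follows the mirror-image pattern: scrub a private copy, then
hand only that copy to userspace. Below is a rough userspace sketch, again
with a hypothetical struct and with memcpy() standing in for copy_to_user();
it is not the driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct floppy_raw_cmd. */
struct toy_out_cmd {
	unsigned int flags;
	void *kernel_data;		/* kernel-only: DMA buffer */
	struct toy_out_cmd *next;	/* kernel-only: list link */
};

/*
 * Export a command without leaking kernel pointers: scrub a local copy and
 * leave *src untouched, since the kernel still needs its own pointers.
 */
static void toy_copyout(void *user_dst, const struct toy_out_cmd *src)
{
	struct toy_out_cmd cmd;

	assert(user_dst != NULL && src != NULL);

	cmd = *src;			/* private copy */
	cmd.next = NULL;
	cmd.kernel_data = NULL;
	memcpy(user_dst, &cmd, sizeof(cmd));
}
```

Scrubbing the original in place would be a bug here, because the loop in
raw_cmd_copyout() still walks the list and reads the data buffer afterwards.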





* Re: [ 030/143] proc connector: fix info leaks
  2014-05-12  0:32 ` [ 030/143] proc connector: fix info leaks Willy Tarreau
@ 2014-05-12  8:41   ` Christoph Biedl
  2014-05-12  8:51   ` Mathias Krause
  1 sibling, 0 replies; 172+ messages in thread
From: Christoph Biedl @ 2014-05-12  8:41 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: linux-kernel, stable, Mathias Krause, David S. Miller

Willy Tarreau wrote...

> Initialize event_data for all possible message types to prevent leaking
> kernel stack contents to userland (up to 20 bytes). Also set the flags
> member of the connector message to 0 to prevent leaking two more stack
> bytes this way.

There are build errors as shown below and I guess that one is the
culprit. I can do detailed checks tonight, I'm a bit in a hurry right
now.

(Using gcc-4.7 as provided by Debian wheezy)

    Christoph

drivers/connector/cn_proc.c:286:9: error: expected declaration specifiers or '...' before '&' token
drivers/connector/cn_proc.c:286:26: error: expected declaration specifiers or '...' before numeric constant
drivers/connector/cn_proc.c:286:29: error: expected declaration specifiers or '...' before 'sizeof'
drivers/connector/cn_proc.c:287:5: error: expected '=', ',', ';', 'asm' or '__attribute__' before '->' token
make[5]: *** [drivers/connector/cn_proc.o] Error 1
make[4]: *** [drivers/connector] Error 2
make[4]: *** Waiting for unfinished jobs....


* Re: [ 030/143] proc connector: fix info leaks
  2014-05-12  0:32 ` [ 030/143] proc connector: fix info leaks Willy Tarreau
  2014-05-12  8:41   ` Christoph Biedl
@ 2014-05-12  8:51   ` Mathias Krause
  2014-05-12  8:57     ` Willy Tarreau
  1 sibling, 1 reply; 172+ messages in thread
From: Mathias Krause @ 2014-05-12  8:51 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: linux-kernel, stable, David S. Miller, Christoph Biedl

On 12 May 2014 02:32, Willy Tarreau <w@1wt.eu> wrote:
> 2.6.32-longterm review patch.  If anyone has any objections, please let me know.
>
> ------------------
>
> From: Mathias Krause <minipli@googlemail.com>
>
> [ Upstream commit e727ca82e0e9616ab4844301e6bae60ca7327682 ]
>
> Initialize event_data for all possible message types to prevent leaking
> kernel stack contents to userland (up to 20 bytes). Also set the flags
> member of the connector message to 0 to prevent leaking two more stack
> bytes this way.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Mathias Krause <minipli@googlemail.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Willy Tarreau <w@1wt.eu>
> ---
>  drivers/connector/cn_proc.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
> index 6069790..3a2587a 100644
> --- a/drivers/connector/cn_proc.c
> +++ b/drivers/connector/cn_proc.c
> [...]
>  module_init(cn_proc_init);
> +       memset(&ev->event_data, 0, sizeof(ev->event_data));
> +       msg->flags = 0; /* not used */

That last hunk looks bogus. Probably the source of Christoph's compile error.

Mathias


* Re: [ 030/143] proc connector: fix info leaks
  2014-05-12  8:51   ` Mathias Krause
@ 2014-05-12  8:57     ` Willy Tarreau
  2014-05-12 11:43       ` Willy Tarreau
  2014-05-12 14:42       ` David Miller
  0 siblings, 2 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12  8:57 UTC (permalink / raw)
  To: Mathias Krause; +Cc: linux-kernel, stable, David S. Miller, Christoph Biedl

On Mon, May 12, 2014 at 10:51:31AM +0200, Mathias Krause wrote:
> On 12 May 2014 02:32, Willy Tarreau <w@1wt.eu> wrote:
> > 2.6.32-longterm review patch.  If anyone has any objections, please let me know.
> >
> > ------------------
> >
> > From: Mathias Krause <minipli@googlemail.com>
> >
> > [ Upstream commit e727ca82e0e9616ab4844301e6bae60ca7327682 ]
> >
> > Initialize event_data for all possible message types to prevent leaking
> > kernel stack contents to userland (up to 20 bytes). Also set the flags
> > member of the connector message to 0 to prevent leaking two more stack
> > bytes this way.
> >
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Mathias Krause <minipli@googlemail.com>
> > Signed-off-by: David S. Miller <davem@davemloft.net>
> > Signed-off-by: Willy Tarreau <w@1wt.eu>
> > ---
> >  drivers/connector/cn_proc.c | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> >
> > diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
> > index 6069790..3a2587a 100644
> > --- a/drivers/connector/cn_proc.c
> > +++ b/drivers/connector/cn_proc.c
> > [...]
> >  module_init(cn_proc_init);
> > +       memset(&ev->event_data, 0, sizeof(ev->event_data));
> > +       msg->flags = 0; /* not used */
> 
> That last hunk looks bogus. Probably the source of Christoph's compile error.

Thank you guys, I'll check. I don't know why I didn't see them on an
allmodconfig build.

Willy



* Re: [ 030/143] proc connector: fix info leaks
  2014-05-12  8:57     ` Willy Tarreau
@ 2014-05-12 11:43       ` Willy Tarreau
  2014-05-12 14:42       ` David Miller
  1 sibling, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12 11:43 UTC (permalink / raw)
  To: Mathias Krause; +Cc: linux-kernel, stable, David S. Miller, Christoph Biedl

On Mon, May 12, 2014 at 10:57:04AM +0200, Willy Tarreau wrote:
> > > diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
> > > index 6069790..3a2587a 100644
> > > --- a/drivers/connector/cn_proc.c
> > > +++ b/drivers/connector/cn_proc.c
> > > [...]
> > >  module_init(cn_proc_init);
> > > +       memset(&ev->event_data, 0, sizeof(ev->event_data));
> > > +       msg->flags = 0; /* not used */
> > 
> > That last hunk looks bogus. Probably the source of Christoph's compile error.
> 
> Thank you guys, I'll check. I don't know why I didn't see them on an
> allmodconfig build.

I found why: CONFIG_PROC_EVENTS is not enabled when CONNECTOR=m.

Here's a fixed patch. Sorry for the trouble.

Willy


[-- Attachment #2: 0001-proc-connector-fix-info-leaks.patch --]
[-- Type: text/plain, Size: 4873 bytes --]

From 8ab62fb31d8fb0f84686612437cfa69e32cd3626 Mon Sep 17 00:00:00 2001
From: Mathias Krause <minipli@googlemail.com>
Date: Mon, 30 Sep 2013 22:03:06 +0200
Subject: proc connector: fix info leaks

[ Upstream commit e727ca82e0e9616ab4844301e6bae60ca7327682 ]

Initialize event_data for all possible message types to prevent leaking
kernel stack contents to userland (up to 20 bytes). Also set the flags
member of the connector message to 0 to prevent leaking two more stack
bytes this way.

Cc: stable@vger.kernel.org  # v2.6.15+
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 drivers/connector/cn_proc.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index 6069790..3603599 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -59,6 +59,7 @@ void proc_fork_connector(struct task_struct *task)
 
 	msg = (struct cn_msg*)buffer;
 	ev = (struct proc_event*)msg->data;
+	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	get_seq(&msg->seq, &ev->cpu);
 	ktime_get_ts(&ts); /* get high res monotonic timestamp */
 	put_unaligned(timespec_to_ns(&ts), (__u64 *)&ev->timestamp_ns);
@@ -71,6 +72,7 @@ void proc_fork_connector(struct task_struct *task)
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
 	msg->ack = 0; /* not used */
 	msg->len = sizeof(*ev);
+	msg->flags = 0; /* not used */
 	/*  If cn_netlink_send() failed, the data is not sent */
 	cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
 }
@@ -87,6 +89,7 @@ void proc_exec_connector(struct task_struct *task)
 
 	msg = (struct cn_msg*)buffer;
 	ev = (struct proc_event*)msg->data;
+	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	get_seq(&msg->seq, &ev->cpu);
 	ktime_get_ts(&ts); /* get high res monotonic timestamp */
 	put_unaligned(timespec_to_ns(&ts), (__u64 *)&ev->timestamp_ns);
@@ -97,6 +100,7 @@ void proc_exec_connector(struct task_struct *task)
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
 	msg->ack = 0; /* not used */
 	msg->len = sizeof(*ev);
+	msg->flags = 0; /* not used */
 	cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
 }
 
@@ -113,6 +117,7 @@ void proc_id_connector(struct task_struct *task, int which_id)
 
 	msg = (struct cn_msg*)buffer;
 	ev = (struct proc_event*)msg->data;
+	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	ev->what = which_id;
 	ev->event_data.id.process_pid = task->pid;
 	ev->event_data.id.process_tgid = task->tgid;
@@ -136,6 +141,7 @@ void proc_id_connector(struct task_struct *task, int which_id)
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
 	msg->ack = 0; /* not used */
 	msg->len = sizeof(*ev);
+	msg->flags = 0; /* not used */
 	cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
 }
 
@@ -151,6 +157,7 @@ void proc_sid_connector(struct task_struct *task)
 
 	msg = (struct cn_msg *)buffer;
 	ev = (struct proc_event *)msg->data;
+	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	get_seq(&msg->seq, &ev->cpu);
 	ktime_get_ts(&ts); /* get high res monotonic timestamp */
 	put_unaligned(timespec_to_ns(&ts), (__u64 *)&ev->timestamp_ns);
@@ -161,6 +168,7 @@ void proc_sid_connector(struct task_struct *task)
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
 	msg->ack = 0; /* not used */
 	msg->len = sizeof(*ev);
+	msg->flags = 0; /* not used */
 	cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
 }
 
@@ -176,8 +184,10 @@ void proc_exit_connector(struct task_struct *task)
 
 	msg = (struct cn_msg*)buffer;
 	ev = (struct proc_event*)msg->data;
+	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	get_seq(&msg->seq, &ev->cpu);
 	ktime_get_ts(&ts); /* get high res monotonic timestamp */
+	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	put_unaligned(timespec_to_ns(&ts), (__u64 *)&ev->timestamp_ns);
 	ev->what = PROC_EVENT_EXIT;
 	ev->event_data.exit.process_pid = task->pid;
@@ -188,6 +198,7 @@ void proc_exit_connector(struct task_struct *task)
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
 	msg->ack = 0; /* not used */
 	msg->len = sizeof(*ev);
+	msg->flags = 0; /* not used */
 	cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
 }
 
@@ -211,6 +222,7 @@ static void cn_proc_ack(int err, int rcvd_seq, int rcvd_ack)
 
 	msg = (struct cn_msg*)buffer;
 	ev = (struct proc_event*)msg->data;
+	memset(&ev->event_data, 0, sizeof(ev->event_data));
 	msg->seq = rcvd_seq;
 	ktime_get_ts(&ts); /* get high res monotonic timestamp */
 	put_unaligned(timespec_to_ns(&ts), (__u64 *)&ev->timestamp_ns);
@@ -220,6 +232,7 @@ static void cn_proc_ack(int err, int rcvd_seq, int rcvd_ack)
 	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
 	msg->ack = rcvd_ack + 1;
 	msg->len = sizeof(*ev);
+	msg->flags = 0; /* not used */
 	cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
 }
 
-- 
1.7.12.1
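
To show why the memset() matters, here is a small userspace sketch of the
leak. It assumes a simplified event layout. The struct, the 0xAA marker and
all toy_* names are hypothetical and only model the idea of a union whose
active member is smaller than the union itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of struct proc_event: a union of event payloads. */
struct toy_event {
	int what;
	union {
		struct { int pid; int tgid; } exec;
		struct { int pid; int tgid; int code; int sig; } exit;
	} event_data;
};

/* Fill an exec event in a caller-provided, possibly dirty, buffer. */
static void toy_fill_exec(struct toy_event *ev, int pid, int tgid,
			  int zero_first)
{
	assert(ev != NULL);

	if (zero_first)
		memset(&ev->event_data, 0, sizeof(ev->event_data));
	ev->what = 1;
	ev->event_data.exec.pid = pid;
	ev->event_data.exec.tgid = tgid;
}

/* Count payload bytes that still hold the 0xAA "stale stack" marker. */
static int toy_stale_bytes(const struct toy_event *ev)
{
	const unsigned char *p = (const unsigned char *)&ev->event_data;
	size_t i;
	int n = 0;

	for (i = 0; i < sizeof(ev->event_data); i++)
		n += (p[i] == 0xAA);
	return n;
}
```

Without the memset(), the bytes of the union that the smaller exec member
does not cover keep whatever was in the buffer before, and the whole struct
is what gets sent to userspace.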



* Re: [ 030/143] proc connector: fix info leaks
  2014-05-12  8:57     ` Willy Tarreau
  2014-05-12 11:43       ` Willy Tarreau
@ 2014-05-12 14:42       ` David Miller
  1 sibling, 0 replies; 172+ messages in thread
From: David Miller @ 2014-05-12 14:42 UTC (permalink / raw)
  To: w; +Cc: minipli, linux-kernel, stable, linux-kernel.bfrz

From: Willy Tarreau <w@1wt.eu>
Date: Mon, 12 May 2014 10:57:04 +0200

> Thank you guys, I'll check. I don't know why I didn't see them on an
> allmodconfig build.

Because you have to enable connector as "y", rather than "m", for the
proc connector to be available in the config.


* Re: [ 072/143] tipc: fix lockdep warning during bearer initialization
  2014-05-12  0:33 ` [ 072/143] tipc: fix lockdep warning during bearer initialization Willy Tarreau
@ 2014-05-12 16:04   ` Jon Maloy
  2014-05-12 16:16     ` Willy Tarreau
  0 siblings, 1 reply; 172+ messages in thread
From: Jon Maloy @ 2014-05-12 16:04 UTC (permalink / raw)
  To: Willy Tarreau, linux-kernel, stable
  Cc: Ying Xue, Paul Gortmaker, David S. Miller

This one is obsolete.
tipc_net_lock does not exist in the current code. It was removed in commit
 7216cd949c9bd56a4ccd952c624ab68f8c9aa0a4 ("tipc: purge tipc_net_lock lock")

Regards
///jon

On 05/11/2014 08:33 PM, Willy Tarreau wrote:
> 2.6.32-longterm review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Ying Xue <ying.xue@windriver.com>
> 
> [ Upstream commit 4225a398c1352a7a5c14dc07277cb5cc4473983b ]
> 
> When the lockdep validator is enabled, it will report the below
> warning when we enable a TIPC bearer:
> 
> [ INFO: possible irq lock inversion dependency detected ]
> ---------------------------------------------------------
> Possible interrupt unsafe locking scenario:
> 
>         CPU0                    CPU1
>         ----                    ----
>    lock(ptype_lock);
>                                 local_irq_disable();
>                                 lock(tipc_net_lock);
>                                 lock(ptype_lock);
>    <Interrupt>
>    lock(tipc_net_lock);
> 
>   *** DEADLOCK ***
> 
> the shortest dependencies between 2nd lock and 1st lock:
>   -> (ptype_lock){+.+...} ops: 10 {
> [...]
> SOFTIRQ-ON-W at:
>                       [<c1089418>] __lock_acquire+0x528/0x13e0
>                       [<c108a360>] lock_acquire+0x90/0x100
>                       [<c1553c38>] _raw_spin_lock+0x38/0x50
>                       [<c14651ca>] dev_add_pack+0x3a/0x60
>                       [<c182da75>] arp_init+0x1a/0x48
>                       [<c182dce5>] inet_init+0x181/0x27e
>                       [<c1001114>] do_one_initcall+0x34/0x170
>                       [<c17f7329>] kernel_init+0x110/0x1b2
>                       [<c155b6a2>] kernel_thread_helper+0x6/0x10
> [...]
>    ... key      at: [<c17e4b10>] ptype_lock+0x10/0x20
>    ... acquired at:
>     [<c108a360>] lock_acquire+0x90/0x100
>     [<c1553c38>] _raw_spin_lock+0x38/0x50
>     [<c14651ca>] dev_add_pack+0x3a/0x60
>     [<c8bc18d2>] enable_bearer+0xf2/0x140 [tipc]
>     [<c8bb283a>] tipc_enable_bearer+0x1ba/0x450 [tipc]
>     [<c8bb3a04>] tipc_cfg_do_cmd+0x5c4/0x830 [tipc]
>     [<c8bbc032>] handle_cmd+0x42/0xd0 [tipc]
>     [<c148e802>] genl_rcv_msg+0x232/0x280
>     [<c148d3f6>] netlink_rcv_skb+0x86/0xb0
>     [<c148e5bc>] genl_rcv+0x1c/0x30
>     [<c148d144>] netlink_unicast+0x174/0x1f0
>     [<c148ddab>] netlink_sendmsg+0x1eb/0x2d0
>     [<c1456bc1>] sock_aio_write+0x161/0x170
>     [<c1135a7c>] do_sync_write+0xac/0xf0
>     [<c11360f6>] vfs_write+0x156/0x170
>     [<c11361e2>] sys_write+0x42/0x70
>     [<c155b0df>] sysenter_do_call+0x12/0x38
> [...]
> }
>   -> (tipc_net_lock){+..-..} ops: 4 {
> [...]
>     IN-SOFTIRQ-R at:
>                      [<c108953a>] __lock_acquire+0x64a/0x13e0
>                      [<c108a360>] lock_acquire+0x90/0x100
>                      [<c15541cd>] _raw_read_lock_bh+0x3d/0x50
>                      [<c8bb874d>] tipc_recv_msg+0x1d/0x830 [tipc]
>                      [<c8bc195f>] recv_msg+0x3f/0x50 [tipc]
>                      [<c146a5fa>] __netif_receive_skb+0x22a/0x590
>                      [<c146ab0b>] netif_receive_skb+0x2b/0xf0
>                      [<c13c43d2>] pcnet32_poll+0x292/0x780
>                      [<c146b00a>] net_rx_action+0xfa/0x1e0
>                      [<c103a4be>] __do_softirq+0xae/0x1e0
> [...]
> }
> 
> From the log, we can see three different call chains between
> CPU0 and CPU1:
> 
> Time 0 on CPU0:
> 
>   kernel_init()->inet_init()->dev_add_pack()
> 
> At time 0, the ptype_lock is held by CPU0 in dev_add_pack();
> 
> Time 1 on CPU1:
> 
>   tipc_enable_bearer()->enable_bearer()->dev_add_pack()
> 
> At time 1, tipc_enable_bearer() first holds tipc_net_lock, and then
> wants to take ptype_lock to register TIPC protocol handler into the
> networking stack.  But the ptype_lock has been taken by dev_add_pack()
> on CPU0, so at this time the dev_add_pack() running on CPU1 has to be
> busy looping.
> 
> Time 2 on CPU0:
> 
>   netif_receive_skb()->recv_msg()->tipc_recv_msg()
> 
> At time 2, an incoming TIPC packet arrives at CPU0, hence
> tipc_recv_msg() will be invoked. In tipc_recv_msg(), it first wants
> to hold tipc_net_lock.  At the moment, below scenario happens:
> 
> On CPU0, below is our sequence of taking locks:
> 
>   lock(ptype_lock)->lock(tipc_net_lock)
> 
> On CPU1, our sequence of taking locks looks like:
> 
>   lock(tipc_net_lock)->lock(ptype_lock)
> 
> Obviously deadlock may happen in this case.
> 
> But please note the deadlock possibly doesn't occur at all when the
> first TIPC bearer is enabled.  Before enable_bearer() -- running on
> CPU1 does not hold ptype_lock, so the TIPC receive handler (i.e.
> recv_msg()) is not registered successfully via dev_add_pack(), so
> the tipc_recv_msg() cannot be called by recv_msg() even if a TIPC
> message comes to CPU0. But when the second TIPC bearer is
> registered, the deadlock can perhaps really happen.
> 
> To fix it, we will push the work of registering TIPC protocol
> handler into workqueue context. After the change, both paths taking
> ptype_lock are always in process contexts, thus, the deadlock should
> never occur.
> 
> Signed-off-by: Ying Xue <ying.xue@windriver.com>
> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Willy Tarreau <w@1wt.eu>
> ---
>  net/tipc/eth_media.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/net/tipc/eth_media.c b/net/tipc/eth_media.c
> index 524ba56..22453a8 100644
> --- a/net/tipc/eth_media.c
> +++ b/net/tipc/eth_media.c
> @@ -56,6 +56,7 @@ struct eth_bearer {
>  	struct tipc_bearer *bearer;
>  	struct net_device *dev;
>  	struct packet_type tipc_packet_type;
> +	struct work_struct setup;
>  };
>  
>  static struct eth_bearer eth_bearers[MAX_ETH_BEARERS];
> @@ -122,6 +123,17 @@ static int recv_msg(struct sk_buff *buf, struct net_device *dev,
>  }
>  
>  /**
> + * setup_bearer - setup association between Ethernet bearer and interface
> + */
> +static void setup_bearer(struct work_struct *work)
> +{
> +	struct eth_bearer *eb_ptr =
> +		container_of(work, struct eth_bearer, setup);
> +
> +	dev_add_pack(&eb_ptr->tipc_packet_type);
> +}
> +
> +/**
>   * enable_bearer - attach TIPC bearer to an Ethernet interface
>   */
>  
> @@ -157,7 +169,8 @@ static int enable_bearer(struct tipc_bearer *tb_ptr)
>  		eb_ptr->tipc_packet_type.af_packet_priv = eb_ptr;
>  		INIT_LIST_HEAD(&(eb_ptr->tipc_packet_type.list));
>  		dev_hold(dev);
> -		dev_add_pack(&eb_ptr->tipc_packet_type);
> +		INIT_WORK(&eb_ptr->setup, setup_bearer);
> +		schedule_work(&eb_ptr->setup);
>  	}
>  
>  	/* Associate TIPC bearer with Ethernet bearer */
> 



* Re: [ 072/143] tipc: fix lockdep warning during bearer initialization
  2014-05-12 16:04   ` Jon Maloy
@ 2014-05-12 16:16     ` Willy Tarreau
  2014-05-12 16:41       ` Jon Maloy
  0 siblings, 1 reply; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12 16:16 UTC (permalink / raw)
  To: Jon Maloy; +Cc: linux-kernel, stable, Ying Xue, Paul Gortmaker, David S. Miller

On Mon, May 12, 2014 at 12:04:00PM -0400, Jon Maloy wrote:
> This one is obsolete.
> tipc_net_lock does not exist in the current code. It was removed in commit
>  7216cd949c9bd56a4ccd952c624ab68f8c9aa0a4("tipc: purge tipc_net_lock lock")

I'm a bit confused, I can't find this commit, in what branch is it ?

Thanks,
Willy



* RE: [ 072/143] tipc: fix lockdep warning during bearer initialization
  2014-05-12 16:16     ` Willy Tarreau
@ 2014-05-12 16:41       ` Jon Maloy
  2014-05-12 17:12         ` Willy Tarreau
  0 siblings, 1 reply; 172+ messages in thread
From: Jon Maloy @ 2014-05-12 16:41 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: linux-kernel, stable, Ying Xue, Paul Gortmaker, David S. Miller

Sorry, I should have mentioned that it is still only in net-next.
It is pretty recent.

///jon

> -----Original Message-----
> From: Willy Tarreau [mailto:w@1wt.eu]
> Sent: May-12-14 12:16 PM
> To: Jon Maloy
> Cc: linux-kernel@vger.kernel.org; stable@vger.kernel.org; Ying Xue; Paul
> Gortmaker; David S. Miller
> Subject: Re: [ 072/143] tipc: fix lockdep warning during bearer initialization
> 
> On Mon, May 12, 2014 at 12:04:00PM -0400, Jon Maloy wrote:
> > This one is obsolete.
> > tipc_net_lock does not exist in the current code. It was removed in commit
> >  7216cd949c9bd56a4ccd952c624ab68f8c9aa0a4("tipc: purge tipc_net_lock
> lock")
> 
> I'm a bit confused, I can't find this commit, in what branch is it ?
> 
> Thanks,
> Willy



* Re: [ 072/143] tipc: fix lockdep warning during bearer initialization
  2014-05-12 16:41       ` Jon Maloy
@ 2014-05-12 17:12         ` Willy Tarreau
  2014-05-12 17:19           ` Jon Maloy
  0 siblings, 1 reply; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12 17:12 UTC (permalink / raw)
  To: Jon Maloy; +Cc: linux-kernel, stable, Ying Xue, Paul Gortmaker, David S. Miller

On Mon, May 12, 2014 at 04:41:00PM +0000, Jon Maloy wrote:
> Sorry, I should have mentioned that it is still only in net-next.
> It is pretty recent.

OK indeed, got it here for reference :

  https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=7216cd949c9bd56

But anyway, what is mentioned in this commit is a number of changes that
happened between 2.6.32 and 3.15, so I hardly see how the tipc_net_lock
is obsolete in 2.6.32 if it does not have all these changes. I have counted
489 commits between 2.6.32 and 3.15-rc5 touching net/tipc, so surely we
cannot count on what we have to get rid of this lock.

Am I missing anything ?

Thanks,
Willy



* Re: [ 072/143] tipc: fix lockdep warning during bearer initialization
  2014-05-12 17:12         ` Willy Tarreau
@ 2014-05-12 17:19           ` Jon Maloy
  2014-05-12 18:11             ` Willy Tarreau
  0 siblings, 1 reply; 172+ messages in thread
From: Jon Maloy @ 2014-05-12 17:19 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: linux-kernel, stable, Ying Xue, Paul Gortmaker, David S. Miller

On 05/12/2014 01:12 PM, Willy Tarreau wrote:
> On Mon, May 12, 2014 at 04:41:00PM +0000, Jon Maloy wrote:
>> Sorry, I should have mentioned that it is still only in net-next.
>> It is pretty recent.
> 
> OK indeed, got it here for reference :
> 
>   https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=7216cd949c9bd56
> 
> But anyway, what is mentioned in this commit is a number of changes that
> happened between 2.6.32 and 3.15, so I hardly see how the tipc_net_lock
> is obsolete in 2.6.32 if it does not have all these changes. I have counted
> 489 commits between 2.6.32 and 3.15-rc5 touching net/tipc, so surely we
> cannot count on what we have to get rid of this lock.
> 
> Am I missing anything ?

Ok. I missed the 2.6.32 part. I withdraw my objection.

///jon

> 
> Thanks,
> Willy
> 



* Re: [ 072/143] tipc: fix lockdep warning during bearer initialization
  2014-05-12 17:19           ` Jon Maloy
@ 2014-05-12 18:11             ` Willy Tarreau
  0 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-12 18:11 UTC (permalink / raw)
  To: Jon Maloy; +Cc: linux-kernel, stable, Ying Xue, Paul Gortmaker, David S. Miller

On Mon, May 12, 2014 at 01:19:54PM -0400, Jon Maloy wrote:
> On 05/12/2014 01:12 PM, Willy Tarreau wrote:
> > On Mon, May 12, 2014 at 04:41:00PM +0000, Jon Maloy wrote:
> >> Sorry, I should have mentioned that it is still only in net-next.
> >> It is pretty recent.
> > 
> > OK indeed, got it here for reference :
> > 
> >   https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=7216cd949c9bd56
> > 
> > But anyway, what is mentioned in this commit is a number of changes that
> > happened between 2.6.32 and 3.15, so I hardly see how the tipc_net_lock
> > is obsolete in 2.6.32 if it does not have all these changes. I have counted
> > 489 commits between 2.6.32 and 3.15-rc5 touching net/tipc, so surely we
> > cannot count on what we have to get rid of this lock.
> > 
> > Am I missing anything ?
> 
> Ok. I missed the 2.6.32 part. I withdraw my objection.

Ah OK now it's much easier for me to understand your point :-)
No problem, Thanks!
Willy



* Re: [ 137/143] xfs: underflow bug in xfs_attrlist_by_handle()
  2014-05-12  0:34 ` [ 137/143] xfs: underflow bug in xfs_attrlist_by_handle() Willy Tarreau
@ 2014-05-13 11:08   ` Luis Henriques
  2014-05-13 11:18     ` Willy Tarreau
  2014-05-14  9:50     ` Dan Carpenter
  0 siblings, 2 replies; 172+ messages in thread
From: Luis Henriques @ 2014-05-13 11:08 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: linux-kernel, stable, Dan Carpenter, Ben Myers

On Mon, May 12, 2014 at 02:34:17AM +0200, Willy Tarreau wrote:
> 2.6.32-longterm review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Dan Carpenter <dan.carpenter@oracle.com>
> 
> If we allocate less than sizeof(struct attrlist) then we end up
> corrupting memory or doing a ZERO_PTR_SIZE dereference.
> 
> This can only be triggered with CAP_SYS_ADMIN.
> 
> Reported-by: Nico Golde <nico@ngolde.de>
> Reported-by: Fabian Yamaguchi <fabs@goesec.de>
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> Reviewed-by: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Ben Myers <bpm@sgi.com>
> 
> (cherry picked from commit 071c529eb672648ee8ca3f90944bcbcc730b4c06)
> [dannf: backported to Debian's 2.6.32]
> Signed-off-by: Willy Tarreau <w@1wt.eu>
> ---
>  fs/xfs/linux-2.6/xfs_ioctl.c   | 3 ++-
>  fs/xfs/linux-2.6/xfs_ioctl32.c | 4 ++--
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/linux-2.6/xfs_ioctl.c b/fs/xfs/linux-2.6/xfs_ioctl.c
> index 942362f..5663351 100644
> --- a/fs/xfs/linux-2.6/xfs_ioctl.c
> +++ b/fs/xfs/linux-2.6/xfs_ioctl.c
> @@ -410,7 +410,8 @@ xfs_attrlist_by_handle(
>  		return -XFS_ERROR(EPERM);
>  	if (copy_from_user(&al_hreq, arg, sizeof(xfs_fsop_attrlist_handlereq_t)))
>  		return -XFS_ERROR(EFAULT);
> -	if (al_hreq.buflen > XATTR_LIST_MAX)
> +	if (al_hreq.buflen < sizeof(struct attrlist) ||
> +	    al_hreq.buflen > XATTR_LIST_MAX)
>  		return -XFS_ERROR(EINVAL);
>  
>  	/*
> diff --git a/fs/xfs/linux-2.6/xfs_ioctl32.c b/fs/xfs/linux-2.6/xfs_ioctl32.c
> index bad485a..782d03d 100644
> --- a/fs/xfs/linux-2.6/xfs_ioctl32.c
> +++ b/fs/xfs/linux-2.6/xfs_ioctl32.c
> @@ -361,8 +361,8 @@ xfs_compat_attrlist_by_handle(
>  	if (copy_from_user(&al_hreq, arg,
>  			   sizeof(compat_xfs_fsop_attrlist_handlereq_t)))
>  		return -XFS_ERROR(EFAULT);
> -	if (al_hreq.buflen > XATTR_LIST_MAX)
> -		return -XFS_ERROR(EINVAL);

Am I missing something or was the above return statement deleted by
mistake?
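
A minimal sketch of the effect, assuming the statement after the check
is another validation. The names, the constants and the flags test are
hypothetical stand-ins, not the XFS code. Once the return is gone, the
first "if" takes the next statement as its body, so an out-of-range
buflen is no longer rejected:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real kernel definitions differ. */
#define DEMO_LIST_MAX 65536u
struct demo_attrlist { int count; int more; int offset[1]; };

/* Intended check: reject a buffer that is too small or too large. */
static int check_fixed(unsigned int buflen, unsigned int flags)
{
	if (buflen < sizeof(struct demo_attrlist) ||
	    buflen > DEMO_LIST_MAX)
		return -EINVAL;

	if (flags & ~0x3u)	/* hypothetical: only two flag bits allowed */
		return -EINVAL;
	return 0;
}

/*
 * Same code with the return dropped: the first "if" now governs the
 * second one, so the size test rejects nothing on its own and the
 * flags test only runs when the size is out of range.
 */
static int check_broken(unsigned int buflen, unsigned int flags)
{
	if (buflen < sizeof(struct demo_attrlist) ||
	    buflen > DEMO_LIST_MAX)

	if (flags & ~0x3u)
		return -EINVAL;
	return 0;
}
```

In this sketch check_broken(0, 0) returns 0 where check_fixed(0, 0)
returns -EINVAL, and a valid size with a bad flag also passes.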

Cheers,
--
Luís

> +	if (al_hreq.buflen < sizeof(struct attrlist) ||
> +	    al_hreq.buflen > XATTR_LIST_MAX)
>  
>  	/*
>  	 * Reject flags, only allow namespaces.
> -- 
> 1.7.12.2.21.g234cd45.dirty
> 
> 
> 


* Re: [ 137/143] xfs: underflow bug in xfs_attrlist_by_handle()
  2014-05-13 11:08   ` Luis Henriques
@ 2014-05-13 11:18     ` Willy Tarreau
  2014-05-14  9:50     ` Dan Carpenter
  1 sibling, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-13 11:18 UTC (permalink / raw)
  To: Luis Henriques; +Cc: linux-kernel, stable, Dan Carpenter, Ben Myers, dannf, jmm

[-- Attachment #1: Type: text/plain, Size: 880 bytes --]

Hi Luis,

On Tue, May 13, 2014 at 12:08:12PM +0100, Luis Henriques wrote:
> > diff --git a/fs/xfs/linux-2.6/xfs_ioctl32.c b/fs/xfs/linux-2.6/xfs_ioctl32.c
> > index bad485a..782d03d 100644
> > --- a/fs/xfs/linux-2.6/xfs_ioctl32.c
> > +++ b/fs/xfs/linux-2.6/xfs_ioctl32.c
> > @@ -361,8 +361,8 @@ xfs_compat_attrlist_by_handle(
> >  	if (copy_from_user(&al_hreq, arg,
> >  			   sizeof(compat_xfs_fsop_attrlist_handlereq_t)))
> >  		return -XFS_ERROR(EFAULT);
> > -	if (al_hreq.buflen > XATTR_LIST_MAX)
> > -		return -XFS_ERROR(EINVAL);
> 
> Am I missing something or was the above return statement deleted by
> mistake?
>
> > +	if (al_hreq.buflen < sizeof(struct attrlist) ||
> > +	    al_hreq.buflen > XATTR_LIST_MAX)

Ouch! You're absolutely right, thanks a lot for spotting this!

Here's an updated patch. Dann, Moritz, you will want to use this one
instead as well!

thanks,
Willy


[-- Attachment #2: 0001-xfs-underflow-bug-in-xfs_attrlist_by_handle.patch --]
[-- Type: text/plain, Size: 1964 bytes --]

From 2282cff50cb9c2205d92b31257d894a4eda4ed86 Mon Sep 17 00:00:00 2001
From: Dan Carpenter <dan.carpenter@oracle.com>
Date: Thu, 31 Oct 2013 21:00:10 +0300
Subject: xfs: underflow bug in xfs_attrlist_by_handle()

If we allocate less than sizeof(struct attrlist) then we end up
corrupting memory or doing a ZERO_PTR_SIZE dereference.

This can only be triggered with CAP_SYS_ADMIN.

Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

(cherry picked from commit 071c529eb672648ee8ca3f90944bcbcc730b4c06)
[dannf: backported to Debian's 2.6.32]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 fs/xfs/linux-2.6/xfs_ioctl.c   | 3 ++-
 fs/xfs/linux-2.6/xfs_ioctl32.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_ioctl.c b/fs/xfs/linux-2.6/xfs_ioctl.c
index 942362f..5663351 100644
--- a/fs/xfs/linux-2.6/xfs_ioctl.c
+++ b/fs/xfs/linux-2.6/xfs_ioctl.c
@@ -410,7 +410,8 @@ xfs_attrlist_by_handle(
 		return -XFS_ERROR(EPERM);
 	if (copy_from_user(&al_hreq, arg, sizeof(xfs_fsop_attrlist_handlereq_t)))
 		return -XFS_ERROR(EFAULT);
-	if (al_hreq.buflen > XATTR_LIST_MAX)
+	if (al_hreq.buflen < sizeof(struct attrlist) ||
+	    al_hreq.buflen > XATTR_LIST_MAX)
 		return -XFS_ERROR(EINVAL);
 
 	/*
diff --git a/fs/xfs/linux-2.6/xfs_ioctl32.c b/fs/xfs/linux-2.6/xfs_ioctl32.c
index bad485a..e671047 100644
--- a/fs/xfs/linux-2.6/xfs_ioctl32.c
+++ b/fs/xfs/linux-2.6/xfs_ioctl32.c
@@ -361,7 +361,8 @@ xfs_compat_attrlist_by_handle(
 	if (copy_from_user(&al_hreq, arg,
 			   sizeof(compat_xfs_fsop_attrlist_handlereq_t)))
 		return -XFS_ERROR(EFAULT);
-	if (al_hreq.buflen > XATTR_LIST_MAX)
+	if (al_hreq.buflen < sizeof(struct attrlist) ||
+	    al_hreq.buflen > XATTR_LIST_MAX)
 		return -XFS_ERROR(EINVAL);
 
 	/*
-- 
1.7.12.2.21.g234cd45.dirty



* Re: [ 080/143] net: rework recvmsg handler msg_name and msg_namelen logic
  2014-05-12  0:33 ` [ 080/143] net: rework recvmsg handler msg_name and msg_namelen logic Willy Tarreau
@ 2014-05-13 12:44   ` Luis Henriques
  2014-05-13 12:49     ` Willy Tarreau
  2014-05-14  5:45     ` Willy Tarreau
  0 siblings, 2 replies; 172+ messages in thread
From: Luis Henriques @ 2014-05-13 12:44 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: linux-kernel, stable, David Miller, Hannes Frederic Sowa

[-- Attachment #1: Type: text/plain, Size: 1515 bytes --]

On Mon, May 12, 2014 at 02:33:20AM +0200, Willy Tarreau wrote:
> 2.6.32-longterm review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> 
> [ Upstream commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c ]
> 

<snip>

> diff --git a/net/rxrpc/ar-recvmsg.c b/net/rxrpc/ar-recvmsg.c
> index a39bf97..f779fc3 100644
> --- a/net/rxrpc/ar-recvmsg.c
> +++ b/net/rxrpc/ar-recvmsg.c
> @@ -142,10 +142,12 @@ int rxrpc_recvmsg(struct kiocb *iocb, struct socket *sock,
>  
>  		/* copy the peer address and timestamp */
>  		if (!continue_call) {
> -			if (msg->msg_name && msg->msg_namelen > 0)
> +			if (msg->msg_name) {
> +				size_t len =
> +					sizeof(call->conn->trans->peer->srx);
>  				memcpy(msg->msg_name,
> -				       &call->conn->trans->peer->srx,
> -				       sizeof(call->conn->trans->peer->srx));
> +				       &call->conn->trans->peer->srx, len);
  +                                    msg->msg_namelen = len;

This statement^^^ is missing in this backport.
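
A minimal user-space sketch of why the assignment matters, assuming the
caller starts with msg_namelen = 0 and hands back only msg_namelen
bytes of address. All demo_* names and the two handlers are
hypothetical simplifications, not the kernel code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct demo_addr { unsigned short family; unsigned char data[14]; };
struct demo_msghdr { void *msg_name; int msg_namelen; };

static const struct demo_addr demo_peer = { 2, { 10, 0, 0, 1 } };

/* Handler following the convention: fill msg_name, report its size. */
static void recv_good(struct demo_msghdr *msg)
{
	if (msg->msg_name) {
		size_t len = sizeof(demo_peer);

		memcpy(msg->msg_name, &demo_peer, len);
		msg->msg_namelen = (int)len;
	}
}

/* Handler with the assignment left out: msg_namelen stays at 0. */
static void recv_missing(struct demo_msghdr *msg)
{
	if (msg->msg_name)
		memcpy(msg->msg_name, &demo_peer, sizeof(demo_peer));
}

/*
 * Caller side: msg_namelen defaults to 0 and only msg_namelen bytes
 * are copied back. Returns the address length the user would see.
 */
static int demo_recvfrom(void (*handler)(struct demo_msghdr *),
			 struct demo_addr *user_addr)
{
	struct demo_addr kaddr;
	struct demo_msghdr msg;

	memset(&kaddr, 0, sizeof(kaddr));
	msg.msg_name = user_addr ? &kaddr : NULL;
	msg.msg_namelen = 0;
	handler(&msg);
	if (user_addr && msg.msg_namelen > 0)
		memcpy(user_addr, &kaddr, (size_t)msg.msg_namelen);
	return msg.msg_namelen;
}
```

With recv_missing() the address is written into the kernel-side buffer
but the caller sees a length of 0, so nothing reaches the user.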

Also missing in this patch are the changes to the following functions:

- pppoe_recvmsg()
  In file drivers/net/pppoe.c instead of drivers/net/ppp/pppoe.c

- pppol2tp_recvmsg()
  In file drivers/net/pppol2tp.c instead of net/l2tp/l2tp_ppp.c

For reference, I'm attaching the backport we have used in our Ubuntu
kernel.  (Note that we added some extra information to the commit
message to include the CVE number and a link to the CVE bug.)

Cheers,
--
Luís

[-- Attachment #2: 0001-net-rework-recvmsg-handler-msg_name-and-msg_namelen-.patch --]
[-- Type: text/x-diff, Size: 21884 bytes --]

From b50dda4282bec22ee8f0a3ee93527215d6f1028d Mon Sep 17 00:00:00 2001
From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Fri, 17 Jan 2014 14:31:09 +0000
Subject: [PATCH] net: rework recvmsg handler msg_name and msg_namelen logic

CVE-2013-7266

BugLink: http://bugs.launchpad.net/bugs/1267081

This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
to return msg_name to the user.

This prevents numerous uninitialized memory leaks we had in the
recvmsg handlers and makes it harder for new code to accidentally leak
uninitialized memory.

Optimize for the case recvfrom is called with NULL as address. We don't
need to copy the address at all, so set it to NULL before invoking the
recvmsg handler. We can do so, because all the recvmsg handlers must
cope with the case a plain read() is called on them. read() also sets
msg_name to NULL.

Also document these changes in include/linux/net.h as suggested by David
Miller.

Changes since RFC:

Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
affect sendto as it would bail out earlier while trying to copy-in the
address. It also more naturally reflects the logic by the callers of
verify_iovec.

With this change in place I could remove "
if (!uaddr || msg_sys->msg_namelen == 0)
	msg->msg_name = NULL
".

This change does not alter the user visible error logic as we ignore
msg_namelen as long as msg_name is NULL.

Also remove two unnecessary curly brackets in ___sys_recvmsg and change
comments to netdev style.

Cc: David Miller <davem@davemloft.net>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(back ported from commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c)
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Andy Whitcroft <andy.whitcroft@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
---
 drivers/isdn/mISDN/socket.c  | 13 ++++---------
 drivers/net/pppoe.c          |  2 --
 drivers/net/pppol2tp.c       |  2 --
 include/linux/net.h          |  8 ++++++++
 net/appletalk/ddp.c          | 16 +++++++---------
 net/atm/common.c             |  2 --
 net/ax25/af_ax25.c           |  4 ++--
 net/bluetooth/af_bluetooth.c |  2 --
 net/bluetooth/hci_sock.c     |  2 --
 net/bluetooth/rfcomm/sock.c  |  3 ---
 net/compat.c                 |  3 ++-
 net/core/iovec.c             |  3 ++-
 net/ipx/af_ipx.c             |  3 +--
 net/irda/af_irda.c           |  4 ----
 net/iucv/af_iucv.c           |  2 --
 net/key/af_key.c             |  1 -
 net/llc/af_llc.c             |  2 --
 net/netlink/af_netlink.c     |  2 --
 net/netrom/af_netrom.c       |  3 +--
 net/packet/af_packet.c       | 32 +++++++++++++++-----------------
 net/rds/recv.c               |  2 --
 net/rose/af_rose.c           |  8 +++++---
 net/rxrpc/ar-recvmsg.c       |  9 ++++++---
 net/socket.c                 | 19 +++++++++++--------
 net/tipc/socket.c            |  6 ------
 net/unix/af_unix.c           |  5 -----
 net/x25/af_x25.c             |  3 +--
 27 files changed, 65 insertions(+), 96 deletions(-)

diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
index feb0fa4..db69cb4 100644
--- a/drivers/isdn/mISDN/socket.c
+++ b/drivers/isdn/mISDN/socket.c
@@ -115,7 +115,6 @@ mISDN_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 {
 	struct sk_buff		*skb;
 	struct sock		*sk = sock->sk;
-	struct sockaddr_mISDN	*maddr;
 
 	int		copied, err;
 
@@ -133,9 +132,9 @@ mISDN_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (!skb)
 		return err;
 
-	if (msg->msg_namelen >= sizeof(struct sockaddr_mISDN)) {
-		msg->msg_namelen = sizeof(struct sockaddr_mISDN);
-		maddr = (struct sockaddr_mISDN *)msg->msg_name;
+	if (msg->msg_name) {
+		struct sockaddr_mISDN *maddr = msg->msg_name;
+
 		maddr->family = AF_ISDN;
 		maddr->dev = _pms(sk)->dev->id;
 		if ((sk->sk_protocol == ISDN_P_LAPD_TE) ||
@@ -148,11 +147,7 @@ mISDN_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 			maddr->sapi = _pms(sk)->ch.addr & 0xFF;
 			maddr->tei =  (_pms(sk)->ch.addr >> 8) & 0xFF;
 		}
-	} else {
-		if (msg->msg_namelen)
-			printk(KERN_WARNING "%s: too small namelen %d\n",
-			    __func__, msg->msg_namelen);
-		msg->msg_namelen = 0;
+		msg->msg_namelen = sizeof(*maddr);
 	}
 
 	copied = skb->len + MISDN_HEADER_LEN;
diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index 2559991..343fd1e 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -992,8 +992,6 @@ static int pppoe_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (error < 0)
 		goto end;
 
-	m->msg_namelen = 0;
-
 	if (skb) {
 		total_len = min_t(size_t, total_len, skb->len);
 		error = skb_copy_datagram_iovec(skb, 0, m->msg_iov, total_len);
diff --git a/drivers/net/pppol2tp.c b/drivers/net/pppol2tp.c
index 9235901..4cdc1cf 100644
--- a/drivers/net/pppol2tp.c
+++ b/drivers/net/pppol2tp.c
@@ -829,8 +829,6 @@ static int pppol2tp_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (sk->sk_state & PPPOX_BOUND)
 		goto end;
 
-	msg->msg_namelen = 0;
-
 	err = 0;
 	skb = skb_recv_datagram(sk, flags & ~MSG_DONTWAIT,
 				flags & MSG_DONTWAIT, &err);
diff --git a/include/linux/net.h b/include/linux/net.h
index 529a093..e40cbcc 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -187,6 +187,14 @@ struct proto_ops {
 				      int optname, char __user *optval, int __user *optlen);
 	int		(*sendmsg)   (struct kiocb *iocb, struct socket *sock,
 				      struct msghdr *m, size_t total_len);
+	/* Notes for implementing recvmsg:
+	 * ===============================
+	 * msg->msg_namelen should get updated by the recvmsg handlers
+	 * iff msg_name != NULL. It is by default 0 to prevent
+	 * returning uninitialized memory to user space.  The recvfrom
+	 * handlers can assume that msg.msg_name is either NULL or has
+	 * a minimum size of sizeof(struct sockaddr_storage).
+	 */
 	int		(*recvmsg)   (struct kiocb *iocb, struct socket *sock,
 				      struct msghdr *m, size_t total_len,
 				      int flags);
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index b1a4290..5eae360 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1703,7 +1703,6 @@ static int atalk_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr
 			 size_t size, int flags)
 {
 	struct sock *sk = sock->sk;
-	struct sockaddr_at *sat = (struct sockaddr_at *)msg->msg_name;
 	struct ddpehdr *ddp;
 	int copied = 0;
 	int offset = 0;
@@ -1728,14 +1727,13 @@ static int atalk_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr
 	}
 	err = skb_copy_datagram_iovec(skb, offset, msg->msg_iov, copied);
 
-	if (!err) {
-		if (sat) {
-			sat->sat_family      = AF_APPLETALK;
-			sat->sat_port        = ddp->deh_sport;
-			sat->sat_addr.s_node = ddp->deh_snode;
-			sat->sat_addr.s_net  = ddp->deh_snet;
-		}
-		msg->msg_namelen = sizeof(*sat);
+	if (!err && msg->msg_name) {
+		struct sockaddr_at *sat = msg->msg_name;
+		sat->sat_family      = AF_APPLETALK;
+		sat->sat_port        = ddp->deh_sport;
+		sat->sat_addr.s_node = ddp->deh_snode;
+		sat->sat_addr.s_net  = ddp->deh_snet;
+		msg->msg_namelen     = sizeof(*sat);
 	}
 
 	skb_free_datagram(sk, skb);	/* Free the datagram. */
diff --git a/net/atm/common.c b/net/atm/common.c
index 65737b8..0baf05e 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -473,8 +473,6 @@ int vcc_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 	struct sk_buff *skb;
 	int copied, error = -EINVAL;
 
-	msg->msg_namelen = 0;
-
 	if (sock->state != SS_CONNECTED)
 		return -ENOTCONN;
 	if (flags & ~MSG_DONTWAIT)		/* only handle MSG_DONTWAIT */
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index 8613bd1..6b9d62b 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1648,11 +1648,11 @@ static int ax25_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 	skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
 
-	if (msg->msg_namelen != 0) {
-		struct sockaddr_ax25 *sax = (struct sockaddr_ax25 *)msg->msg_name;
+	if (msg->msg_name) {
 		ax25_digi digi;
 		ax25_address src;
 		const unsigned char *mac = skb_mac_header(skb);
+		struct sockaddr_ax25 *sax = msg->msg_name;
 
 		memset(sax, 0, sizeof(struct full_sockaddr_ax25));
 		ax25_addr_parse(mac + 1, skb->data - mac - 1, &src, NULL,
diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index d7239dd..143b8a7 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -240,8 +240,6 @@ int bt_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (flags & (MSG_OOB))
 		return -EOPNOTSUPP;
 
-	msg->msg_namelen = 0;
-
 	if (!(skb = skb_recv_datagram(sk, flags, noblock, &err))) {
 		if (sk->sk_shutdown & RCV_SHUTDOWN)
 			return 0;
diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index 6c00bf7..bb2548b 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -370,8 +370,6 @@ static int hci_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (!(skb = skb_recv_datagram(sk, flags, noblock, &err)))
 		return err;
 
-	msg->msg_namelen = 0;
-
 	copied = skb->len;
 	if (len < copied) {
 		msg->msg_flags |= MSG_TRUNC;
diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index 1db0132..3fabaad 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -652,15 +652,12 @@ static int rfcomm_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 	if (test_and_clear_bit(RFCOMM_DEFER_SETUP, &d->flags)) {
 		rfcomm_dlc_accept(d);
-		msg->msg_namelen = 0;
 		return 0;
 	}
 
 	if (flags & MSG_OOB)
 		return -EOPNOTSUPP;
 
-	msg->msg_namelen = 0;
-
 	BT_DBG("sk %p size %zu", sk, size);
 
 	lock_sock(sk);
diff --git a/net/compat.c b/net/compat.c
index 9559afc..305bca6 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -89,7 +89,8 @@ int verify_compat_iovec(struct msghdr *kern_msg, struct iovec *kern_iov,
 			if (err < 0)
 				return err;
 		}
-		kern_msg->msg_name = kern_address;
+		if (kern_msg->msg_name)
+			kern_msg->msg_name = kern_address;
 	} else
 		kern_msg->msg_name = NULL;
 
diff --git a/net/core/iovec.c b/net/core/iovec.c
index f911e66..39369e9 100644
--- a/net/core/iovec.c
+++ b/net/core/iovec.c
@@ -47,7 +47,8 @@ int verify_iovec(struct msghdr *m, struct iovec *iov, struct sockaddr *address,
 			if (err < 0)
 				return err;
 		}
-		m->msg_name = address;
+		if (m->msg_name)
+			m->msg_name = address;
 	} else {
 		m->msg_name = NULL;
 	}
diff --git a/net/ipx/af_ipx.c b/net/ipx/af_ipx.c
index 66c7a20..25931b3 100644
--- a/net/ipx/af_ipx.c
+++ b/net/ipx/af_ipx.c
@@ -1808,8 +1808,6 @@ static int ipx_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (skb->tstamp.tv64)
 		sk->sk_stamp = skb->tstamp;
 
-	msg->msg_namelen = sizeof(*sipx);
-
 	if (sipx) {
 		sipx->sipx_family	= AF_IPX;
 		sipx->sipx_port		= ipx->ipx_source.sock;
@@ -1817,6 +1815,7 @@ static int ipx_recvmsg(struct kiocb *iocb, struct socket *sock,
 		sipx->sipx_network	= IPX_SKB_CB(skb)->ipx_source_net;
 		sipx->sipx_type 	= ipx->ipx_type;
 		sipx->sipx_zero		= 0;
+		msg->msg_namelen	= sizeof(*sipx);
 	}
 	rc = copied;
 
diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index bfb325d..7cb7613 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1338,8 +1338,6 @@ static int irda_recvmsg_dgram(struct kiocb *iocb, struct socket *sock,
 	if ((err = sock_error(sk)) < 0)
 		return err;
 
-	msg->msg_namelen = 0;
-
 	skb = skb_recv_datagram(sk, flags & ~MSG_DONTWAIT,
 				flags & MSG_DONTWAIT, &err);
 	if (!skb)
@@ -1402,8 +1400,6 @@ static int irda_recvmsg_stream(struct kiocb *iocb, struct socket *sock,
 	target = sock_rcvlowat(sk, flags & MSG_WAITALL, size);
 	timeo = sock_rcvtimeo(sk, noblock);
 
-	msg->msg_namelen = 0;
-
 	do {
 		int chunk;
 		struct sk_buff *skb = skb_dequeue(&sk->sk_receive_queue);
diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
index f605b23..bada1b9 100644
--- a/net/iucv/af_iucv.c
+++ b/net/iucv/af_iucv.c
@@ -1160,8 +1160,6 @@ static int iucv_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	struct sk_buff *skb, *rskb, *cskb;
 	int err = 0;
 
-	msg->msg_namelen = 0;
-
 	if ((sk->sk_state == IUCV_DISCONN || sk->sk_state == IUCV_SEVERED) &&
 	    skb_queue_empty(&iucv->backlog_skb_q) &&
 	    skb_queue_empty(&sk->sk_receive_queue) &&
diff --git a/net/key/af_key.c b/net/key/af_key.c
index 9d22e46..b6a6b85 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3593,7 +3593,6 @@ static int pfkey_recvmsg(struct kiocb *kiocb,
 	if (flags & ~(MSG_PEEK|MSG_DONTWAIT|MSG_TRUNC|MSG_CMSG_COMPAT))
 		goto out;
 
-	msg->msg_namelen = 0;
 	skb = skb_recv_datagram(sk, flags, flags & MSG_DONTWAIT, &err);
 	if (skb == NULL)
 		goto out;
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index 8a814a5..606b6ad 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -674,8 +674,6 @@ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock,
 	int target;	/* Read at least this many bytes */
 	long timeo;
 
-	msg->msg_namelen = 0;
-
 	lock_sock(sk);
 	copied = -ENOTCONN;
 	if (unlikely(sk->sk_type == SOCK_STREAM && sk->sk_state == TCP_LISTEN))
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index fc91ff6..39a6d5d 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1400,8 +1400,6 @@ static int netlink_recvmsg(struct kiocb *kiocb, struct socket *sock,
 	}
 #endif
 
-	msg->msg_namelen = 0;
-
 	copied = data_skb->len;
 	if (len < copied) {
 		msg->msg_flags |= MSG_TRUNC;
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index d240523..b3c9b48 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1185,10 +1185,9 @@ static int nr_recvmsg(struct kiocb *iocb, struct socket *sock,
 		sax->sax25_family = AF_NETROM;
 		skb_copy_from_linear_data_offset(skb, 7, sax->sax25_call.ax25_call,
 			      AX25_ADDR_LEN);
+		msg->msg_namelen = sizeof(*sax);
 	}
 
-	msg->msg_namelen = sizeof(*sax);
-
 	skb_free_datagram(sk, skb);
 
 	release_sock(sk);
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 728c080..1de1992 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1423,7 +1423,6 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
 	struct sock *sk = sock->sk;
 	struct sk_buff *skb;
 	int copied, err;
-	struct sockaddr_ll *sll;
 
 	err = -EINVAL;
 	if (flags & ~(MSG_PEEK|MSG_DONTWAIT|MSG_TRUNC|MSG_CMSG_COMPAT))
@@ -1455,22 +1454,10 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (skb == NULL)
 		goto out;
 
-	/*
-	 *	If the address length field is there to be filled in, we fill
-	 *	it in now.
+	/* You lose any data beyond the buffer you gave. If it worries
+	 * a user program they can ask the device for its MTU
+	 * anyway.
 	 */
-
-	sll = &PACKET_SKB_CB(skb)->sa.ll;
-	if (sock->type == SOCK_PACKET)
-		msg->msg_namelen = sizeof(struct sockaddr_pkt);
-	else
-		msg->msg_namelen = sll->sll_halen + offsetof(struct sockaddr_ll, sll_addr);
-
-	/*
-	 *	You lose any data beyond the buffer you gave. If it worries a
-	 *	user program they can ask the device for its MTU anyway.
-	 */
-
 	copied = skb->len;
 	if (copied > len) {
 		copied = len;
@@ -1483,9 +1470,20 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 	sock_recv_timestamp(msg, sk, skb);
 
-	if (msg->msg_name)
+	if (msg->msg_name) {
+		/* If the address length field is there to be filled
+		 * in, we fill it in now.
+		 */
+		if (sock->type == SOCK_PACKET) {
+			msg->msg_namelen = sizeof(struct sockaddr_pkt);
+		} else {
+			struct sockaddr_ll *sll = &PACKET_SKB_CB(skb)->sa.ll;
+			msg->msg_namelen = sll->sll_halen +
+				offsetof(struct sockaddr_ll, sll_addr);
+		}
 		memcpy(msg->msg_name, &PACKET_SKB_CB(skb)->sa,
 		       msg->msg_namelen);
+	}
 
 	if (pkt_sk(sk)->auxdata) {
 		struct tpacket_auxdata aux;
diff --git a/net/rds/recv.c b/net/rds/recv.c
index c45a881c..a11cab9 100644
--- a/net/rds/recv.c
+++ b/net/rds/recv.c
@@ -410,8 +410,6 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 
 	rdsdebug("size %zu flags 0x%x timeo %ld\n", size, msg_flags, timeo);
 
-	msg->msg_namelen = 0;
-
 	if (msg_flags & MSG_OOB)
 		goto out;
 
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index 2984999..08a86f6 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1238,7 +1238,6 @@ static int rose_recvmsg(struct kiocb *iocb, struct socket *sock,
 {
 	struct sock *sk = sock->sk;
 	struct rose_sock *rose = rose_sk(sk);
-	struct sockaddr_rose *srose = (struct sockaddr_rose *)msg->msg_name;
 	size_t copied;
 	unsigned char *asmptr;
 	struct sk_buff *skb;
@@ -1274,8 +1273,11 @@ static int rose_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 	skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
 
-	if (srose != NULL) {
-		memset(srose, 0, msg->msg_namelen);
+	if (msg->msg_name) {
+		struct sockaddr_rose *srose;
+
+		memset(msg->msg_name, 0, sizeof(struct full_sockaddr_rose));
+		srose = msg->msg_name;
 		srose->srose_family = AF_ROSE;
 		srose->srose_addr   = rose->dest_addr;
 		srose->srose_call   = rose->dest_call;
diff --git a/net/rxrpc/ar-recvmsg.c b/net/rxrpc/ar-recvmsg.c
index a39bf97..d5630d9 100644
--- a/net/rxrpc/ar-recvmsg.c
+++ b/net/rxrpc/ar-recvmsg.c
@@ -142,10 +142,13 @@ int rxrpc_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 		/* copy the peer address and timestamp */
 		if (!continue_call) {
-			if (msg->msg_name && msg->msg_namelen > 0)
+			if (msg->msg_name) {
+				size_t len =
+					sizeof(call->conn->trans->peer->srx);
 				memcpy(msg->msg_name,
-				       &call->conn->trans->peer->srx,
-				       sizeof(call->conn->trans->peer->srx));
+				       &call->conn->trans->peer->srx, len);
+				msg->msg_namelen = len;
+			}
 			sock_recv_timestamp(msg, &rx->sk, skb);
 		}
 
diff --git a/net/socket.c b/net/socket.c
index bf9fc68..e6c3396 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1744,8 +1744,10 @@ SYSCALL_DEFINE6(recvfrom, int, fd, void __user *, ubuf, size_t, size,
 	msg.msg_iov = &iov;
 	iov.iov_len = size;
 	iov.iov_base = ubuf;
-	msg.msg_name = (struct sockaddr *)&address;
-	msg.msg_namelen = sizeof(address);
+	/* Save some cycles and don't copy the address if not needed */
+	msg.msg_name = addr ? (struct sockaddr *)&address : NULL;
+	/* We assume all kernel code knows the size of sockaddr_storage */
+	msg.msg_namelen = 0;
 	if (sock->file->f_flags & O_NONBLOCK)
 		flags |= MSG_DONTWAIT;
 	err = sock_recvmsg(sock, &msg, size, flags);
@@ -2017,18 +2019,16 @@ SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
 			goto out_put;
 	}
 
-	/*
-	 *      Save the user-mode address (verify_iovec will change the
-	 *      kernel msghdr to use the kernel address space)
+	/* Save the user-mode address (verify_iovec will change the
+	 * kernel msghdr to use the kernel address space)
 	 */
-
 	uaddr = (__force void __user *)msg_sys.msg_name;
 	uaddr_len = COMPAT_NAMELEN(msg);
-	if (MSG_CMSG_COMPAT & flags) {
+	if (MSG_CMSG_COMPAT & flags)
 		err = verify_compat_iovec(&msg_sys, iov,
 					  (struct sockaddr *)&addr,
 					  VERIFY_WRITE);
-	} else
+	else
 		err = verify_iovec(&msg_sys, iov,
 				   (struct sockaddr *)&addr,
 				   VERIFY_WRITE);
@@ -2039,6 +2039,9 @@ SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
 	cmsg_ptr = (unsigned long)msg_sys.msg_control;
 	msg_sys.msg_flags = flags & (MSG_CMSG_CLOEXEC|MSG_CMSG_COMPAT);
 
+	/* We assume all kernel code knows the size of sockaddr_storage */
+	msg_sys.msg_namelen = 0;
+
 	if (sock->file->f_flags & O_NONBLOCK)
 		flags |= MSG_DONTWAIT;
 	err = sock_recvmsg(sock, &msg_sys, total_len, flags);
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index b453345..024f490 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -917,9 +917,6 @@ static int recv_msg(struct kiocb *iocb, struct socket *sock,
 		goto exit;
 	}
 
-	/* will be updated in set_orig_addr() if needed */
-	m->msg_namelen = 0;
-
 restart:
 
 	/* Look for a message in receive queue; wait if necessary */
@@ -1053,9 +1050,6 @@ static int recv_stream(struct kiocb *iocb, struct socket *sock,
 		goto exit;
 	}
 
-	/* will be updated in set_orig_addr() if needed */
-	m->msg_namelen = 0;
-
 restart:
 
 	/* Look for a message in receive queue; wait if necessary */
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index d65e7f0..fc57017 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1668,7 +1668,6 @@ static void unix_copy_addr(struct msghdr *msg, struct sock *sk)
 {
 	struct unix_sock *u = unix_sk(sk);
 
-	msg->msg_namelen = 0;
 	if (u->addr) {
 		msg->msg_namelen = u->addr->len;
 		memcpy(msg->msg_name, u->addr->name, u->addr->len);
@@ -1691,8 +1690,6 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (flags&MSG_OOB)
 		goto out;
 
-	msg->msg_namelen = 0;
-
 	mutex_lock(&u->readlock);
 
 	skb = skb_recv_datagram(sk, flags, noblock, &err);
@@ -1818,8 +1815,6 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
 	target = sock_rcvlowat(sk, flags&MSG_WAITALL, size);
 	timeo = sock_rcvtimeo(sk, flags&MSG_DONTWAIT);
 
-	msg->msg_namelen = 0;
-
 	/* Lock the socket to prevent queue disordering
 	 * while sleeps in memcpy_tomsg
 	 */
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 2e9e300..40c447f 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1294,10 +1294,9 @@ static int x25_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (sx25) {
 		sx25->sx25_family = AF_X25;
 		sx25->sx25_addr   = x25->dest_addr;
+		msg->msg_namelen = sizeof(*sx25);
 	}
 
-	msg->msg_namelen = sizeof(struct sockaddr_x25);
-
 	lock_sock(sk);
 	x25_check_rbuf(sk);
 	release_sock(sk);
-- 
1.9.1



* Re: [ 080/143] net: rework recvmsg handler msg_name and msg_namelen logic
  2014-05-13 12:44   ` Luis Henriques
@ 2014-05-13 12:49     ` Willy Tarreau
  2014-05-14  5:45     ` Willy Tarreau
  1 sibling, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-13 12:49 UTC (permalink / raw)
  To: Luis Henriques; +Cc: linux-kernel, stable, David Miller, Hannes Frederic Sowa

Hi Luis,

On Tue, May 13, 2014 at 01:44:25PM +0100, Luis Henriques wrote:
> On Mon, May 12, 2014 at 02:33:20AM +0200, Willy Tarreau wrote:
> > 2.6.32-longterm review patch.  If anyone has any objections, please let me know.
> > 
> > ------------------
> > 
> > From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > 
> > [ Upstream commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c ]
> > 
> 
> <snip>
> 
> > diff --git a/net/rxrpc/ar-recvmsg.c b/net/rxrpc/ar-recvmsg.c
> > index a39bf97..f779fc3 100644
> > --- a/net/rxrpc/ar-recvmsg.c
> > +++ b/net/rxrpc/ar-recvmsg.c
> > @@ -142,10 +142,12 @@ int rxrpc_recvmsg(struct kiocb *iocb, struct socket *sock,
> >  
> >  		/* copy the peer address and timestamp */
> >  		if (!continue_call) {
> > -			if (msg->msg_name && msg->msg_namelen > 0)
> > +			if (msg->msg_name) {
> > +				size_t len =
> > +					sizeof(call->conn->trans->peer->srx);
> >  				memcpy(msg->msg_name,
> > -				       &call->conn->trans->peer->srx,
> > -				       sizeof(call->conn->trans->peer->srx));
> > +				       &call->conn->trans->peer->srx, len);
>   +                                    msg->msg_namelen = len;
> 
> This statement^^^ is missing in this backport.

Ah this is the one that was placed out of scope and which failed on
the first round. Now I understand what happened. I remember having
applied one patch with git am -C0 to help it pass, and a few patches
later I was scared to see that I still had -C0 on the command I was
reusing (up arrow, ctrl-w, copy-paste patch name and enter). I carefully
rechecked the few affected patches but found nothing suspicious. It seems
I did not check well enough. Given how this patch was mangled, clearly it
was affected by this mistake.

> Also missing in this patch are the changes to the following functions:
> 
> - pppoe_recvmsg()
>   In file drivers/net/pppoe.c instead of drivers/net/ppp/pppoe.c
> 
> - pppol2tp_recvmsg()
>   In file drivers/net/pppol2tp.c instead of net/l2tp/l2tp_ppp.c
> 
> For reference, I'm attaching the backport we have used in our Ubuntu
> kernel.  (Note that we added some extra information to the commit
> message to include the CVE number and a link to the CVE bug.)

OK thank you very much Luis!

Willy


^ permalink raw reply	[flat|nested] 172+ messages in thread

* Re: [ 080/143] net: rework recvmsg handler msg_name and msg_namelen logic
  2014-05-13 12:44   ` Luis Henriques
  2014-05-13 12:49     ` Willy Tarreau
@ 2014-05-14  5:45     ` Willy Tarreau
  1 sibling, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-14  5:45 UTC (permalink / raw)
  To: Luis Henriques; +Cc: linux-kernel, stable, David Miller, Hannes Frederic Sowa

Hi Luis,

On Tue, May 13, 2014 at 01:44:25PM +0100, Luis Henriques wrote:
> On Mon, May 12, 2014 at 02:33:20AM +0200, Willy Tarreau wrote:
> > 2.6.32-longterm review patch.  If anyone has any objections, please let me know.
> > 
> > ------------------
> > 
> > From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > 
> > [ Upstream commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c ]
> > 
> 
> <snip>
> 
> > diff --git a/net/rxrpc/ar-recvmsg.c b/net/rxrpc/ar-recvmsg.c
> > index a39bf97..f779fc3 100644
> > --- a/net/rxrpc/ar-recvmsg.c
> > +++ b/net/rxrpc/ar-recvmsg.c
> > @@ -142,10 +142,12 @@ int rxrpc_recvmsg(struct kiocb *iocb, struct socket *sock,
> >  
> >  		/* copy the peer address and timestamp */
> >  		if (!continue_call) {
> > -			if (msg->msg_name && msg->msg_namelen > 0)
> > +			if (msg->msg_name) {
> > +				size_t len =
> > +					sizeof(call->conn->trans->peer->srx);
> >  				memcpy(msg->msg_name,
> > -				       &call->conn->trans->peer->srx,
> > -				       sizeof(call->conn->trans->peer->srx));
> > +				       &call->conn->trans->peer->srx, len);
>   +                                    msg->msg_namelen = len;
> 
> This statement^^^ is missing in this backport.
> 
> Also missing in this patch are the changes to the following functions:
> 
> - pppoe_recvmsg()
>   In file drivers/net/pppoe.c instead of drivers/net/ppp/pppoe.c
> 
> - pppol2tp_recvmsg()
>   In file drivers/net/pppol2tp.c instead of net/l2tp/l2tp_ppp.c
> 
> For reference, I'm attaching the backport we have used in our Ubuntu
> kernel.  (Note that we added some extra information to the commit
> message to include the CVE number and a link to the CVE bug.)

Just a quick note to let you know that your patch applied well, and
that the diff between the two correctly reports the parts you pointed
above. So I'm taking it instead.

Thanks!
Willy


^ permalink raw reply	[flat|nested] 172+ messages in thread

* Re: [ 137/143] xfs: underflow bug in xfs_attrlist_by_handle()
  2014-05-13 11:08   ` Luis Henriques
  2014-05-13 11:18     ` Willy Tarreau
@ 2014-05-14  9:50     ` Dan Carpenter
  2014-05-22  8:19       ` Dan Carpenter
  1 sibling, 1 reply; 172+ messages in thread
From: Dan Carpenter @ 2014-05-14  9:50 UTC (permalink / raw)
  To: Luis Henriques; +Cc: Willy Tarreau, linux-kernel, stable, Ben Myers

On Tue, May 13, 2014 at 12:08:12PM +0100, Luis Henriques wrote:
> > diff --git a/fs/xfs/linux-2.6/xfs_ioctl32.c b/fs/xfs/linux-2.6/xfs_ioctl32.c
> > index bad485a..782d03d 100644
> > --- a/fs/xfs/linux-2.6/xfs_ioctl32.c
> > +++ b/fs/xfs/linux-2.6/xfs_ioctl32.c
> > @@ -361,8 +361,8 @@ xfs_compat_attrlist_by_handle(
> >  	if (copy_from_user(&al_hreq, arg,
> >  			   sizeof(compat_xfs_fsop_attrlist_handlereq_t)))
> >  		return -XFS_ERROR(EFAULT);
> > -	if (al_hreq.buflen > XATTR_LIST_MAX)
> > -		return -XFS_ERROR(EINVAL);
> 
> Am I missing something or was the above return statement deleted by
> mistake?
> 
> Cheers,
> --
> Luís

Good eye.  I have created a Smatch check to look for these bugs.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 172+ messages in thread

* Re: [ 083/143] net: clamp ->msg_namelen instead of returning an error
  2014-05-12  0:33 ` [ 083/143] net: clamp ->msg_namelen instead of returning an error Willy Tarreau
@ 2014-05-14 10:02   ` Dan Carpenter
  2014-05-14 12:27     ` Willy Tarreau
  0 siblings, 1 reply; 172+ messages in thread
From: Dan Carpenter @ 2014-05-14 10:02 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: linux-kernel, stable, Eric Dumazet, David S. Miller

On Mon, May 12, 2014 at 02:33:23AM +0200, Willy Tarreau wrote:
> 2.6.32-longterm review patch.  If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Dan Carpenter <dan.carpenter@oracle.com>
> 
> [ Upstream commit db31c55a6fb245fdbb752a2ca4aefec89afabb06 ]
> 
> If kmsg->msg_namelen > sizeof(struct sockaddr_storage) then in the
> original code that would lead to memory corruption in the kernel if you
> had audit configured.  If you didn't have audit configured it was
> harmless.
> 
> There are some programs such as beta versions of Ruby which use too
> large of a buffer and returning an error code breaks them.  We should
> clamp the ->msg_namelen value instead.
> 
> Fixes: 1661bf364ae9 ("net: heap overflow in __audit_sockaddr()")

You should probably take dbb490b96584 ('net: socket: error on a negative
msg_namelen') as well.  LTP has a test that passes negative values to
this code and expects an error return so my clamp patch breaks LTP.
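
To make the interaction between the two commits concrete, here is a
minimal userspace sketch, assuming the negative-length check runs before
the clamp. check_namelen() is a hypothetical name used only for
illustration; it is not the kernel function. Note that in C a negative
int compared against sizeof() is converted to a large unsigned value, so
a clamp on its own would presumably turn a negative length into the
maximum instead of failing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not a kernel function): validate a user-supplied
 * msg_namelen. Negative values are rejected, oversized values are
 * clamped to sizeof(struct sockaddr_storage) instead of being refused.
 */
static int check_namelen(int *namelen)
{
	assert(namelen != NULL);

	/* Reject negative lengths first, as LTP expects. */
	if (*namelen < 0)
		return -EINVAL;

	/* Clamp instead of failing, so callers passing a too-large buffer keep working. */
	if ((size_t)*namelen > sizeof(struct sockaddr_storage))
		*namelen = (int)sizeof(struct sockaddr_storage);

	return 0;
}
```

With this ordering a length of -1 yields -EINVAL, while a length of 4096
is accepted and reduced to sizeof(struct sockaddr_storage).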

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 172+ messages in thread

* Re: [ 083/143] net: clamp ->msg_namelen instead of returning an error
  2014-05-14 10:02   ` Dan Carpenter
@ 2014-05-14 12:27     ` Willy Tarreau
  0 siblings, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-05-14 12:27 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: linux-kernel, stable, Eric Dumazet, David S. Miller

Hi Dan,

On Wed, May 14, 2014 at 01:02:15PM +0300, Dan Carpenter wrote:
> On Mon, May 12, 2014 at 02:33:23AM +0200, Willy Tarreau wrote:
> > 2.6.32-longterm review patch.  If anyone has any objections, please let me know.
> > 
> > ------------------
> > 
> > From: Dan Carpenter <dan.carpenter@oracle.com>
> > 
> > [ Upstream commit db31c55a6fb245fdbb752a2ca4aefec89afabb06 ]
> > 
> > If kmsg->msg_namelen > sizeof(struct sockaddr_storage) then in the
> > original code that would lead to memory corruption in the kernel if you
> > had audit configured.  If you didn't have audit configured it was
> > harmless.
> > 
> > There are some programs such as beta versions of Ruby which use too
> > large of a buffer and returning an error code breaks them.  We should
> > clamp the ->msg_namelen value instead.
> > 
> > Fixes: 1661bf364ae9 ("net: heap overflow in __audit_sockaddr()")
> 
> You should probably take dbb490b96584 ('net: socket: error on a negative
> msg_namelen') as well.  LTP has a test that passes negative values to
> this code and expects an error return so my clamp patch breaks LTP.

It happens that we already have it (127/143), but thank you for
checking, I really appreciate it.

Willy


^ permalink raw reply	[flat|nested] 172+ messages in thread

* Re: [ 137/143] xfs: underflow bug in xfs_attrlist_by_handle()
  2014-05-14  9:50     ` Dan Carpenter
@ 2014-05-22  8:19       ` Dan Carpenter
  0 siblings, 0 replies; 172+ messages in thread
From: Dan Carpenter @ 2014-05-22  8:19 UTC (permalink / raw)
  To: Luis Henriques; +Cc: Willy Tarreau, linux-kernel, stable, Ben Myers

On Wed, May 14, 2014 at 12:50:20PM +0300, Dan Carpenter wrote:
> On Tue, May 13, 2014 at 12:08:12PM +0100, Luis Henriques wrote:
> > > diff --git a/fs/xfs/linux-2.6/xfs_ioctl32.c b/fs/xfs/linux-2.6/xfs_ioctl32.c
> > > index bad485a..782d03d 100644
> > > --- a/fs/xfs/linux-2.6/xfs_ioctl32.c
> > > +++ b/fs/xfs/linux-2.6/xfs_ioctl32.c
> > > @@ -361,8 +361,8 @@ xfs_compat_attrlist_by_handle(
> > >  	if (copy_from_user(&al_hreq, arg,
> > >  			   sizeof(compat_xfs_fsop_attrlist_handlereq_t)))
> > >  		return -XFS_ERROR(EFAULT);
> > > -	if (al_hreq.buflen > XATTR_LIST_MAX)
> > > -		return -XFS_ERROR(EINVAL);
> > 
> > Am I missing something or was the above return statement deleted by
> > mistake?
> > 
> > Cheers,
> > --
> > Luís
> 
> Good eye.  I have created a Smatch check to look for these bugs.
> 

Oh.  It turns out checkpatch.pl catches this bug as well.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 172+ messages in thread

* Re: [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
  2014-05-12  0:32 ` [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary Willy Tarreau
@ 2014-06-11 18:46   ` Luis Henriques
  2014-06-11 19:46     ` Willy Tarreau
  0 siblings, 1 reply; 172+ messages in thread
From: Luis Henriques @ 2014-06-11 18:46 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: linux-kernel, stable, Michal Tesar, David S. Miller

Hi Willy,

On Mon, May 12, 2014 at 02:32:59AM +0200, Willy Tarreau wrote:
> 2.6.32-longterm review patch.  If anyone has any objections, please let me know.
> 

During Ubuntu Lucid kernel regression testing, after the merge of
2.6.32.62, we found problems with the following patches

[ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
           (Upstream commit 651e92716aaae60fc41b9652f54cb6803896e0da)

[ 065/143] net: check net.core.somaxconn sysctl values
           (Upstream commit 5f671d6b4ec3e6d66c2a868738af2cdea09e7509)

The following two stack traces were found in kernel logs:

[    0.199908] sysctl table check failed: /net/core/somaxconn .3.1.18 Missing strategy
[    0.201100] Pid: 1, comm: swapper Not tainted 2.6.32-02063262-generic #201405200837
[    0.202173] Call Trace:
[    0.202523]  [<ffffffff8108e419>] set_fail+0x59/0x60
[    0.203213]  [<ffffffff8108e74b>] sysctl_check_table+0x16b/0x4b0
[    0.204065]  [<ffffffff8108e75c>] sysctl_check_table+0x17c/0x4b0
[    0.204879]  [<ffffffff8108e75c>] sysctl_check_table+0x17c/0x4b0
[    0.205697]  [<ffffffff810712dd>] __register_sysctl_paths+0x11d/0x360
[    0.206709]  [<ffffffff8108e75c>] ? sysctl_check_table+0x17c/0x4b0
[    0.207552]  [<ffffffff81528af1>] register_net_sysctl_table+0x61/0x70
[    0.208425]  [<ffffffff814566d5>] sysctl_core_net_init+0x45/0xb0
[    0.209297]  [<ffffffff81455af8>] register_pernet_operations+0x48/0x100
[    0.210119]  [<ffffffff8187b6ee>] ? sysctl_core_init+0x0/0x38
[    0.210867]  [<ffffffff81455c5c>] register_pernet_subsys+0x2c/0x50
[    0.211699]  [<ffffffff8187b724>] sysctl_core_init+0x36/0x38
[    0.212448]  [<ffffffff8100a04c>] do_one_initcall+0x3c/0x1a0
[    0.213324]  [<ffffffff818446d1>] do_basic_setup+0x54/0x66
[    0.214563]  [<ffffffff818447f1>] kernel_init+0x10e/0x156
[    0.215766]  [<ffffffff810131ea>] child_rip+0xa/0x20
[    0.216882]  [<ffffffff818446e3>] ? kernel_init+0x0/0x156
[    0.218099]  [<ffffffff810131e0>] ? child_rip+0x0/0x20

and

[    0.398433] sysctl table check failed: /net/ipv4/ip_no_pmtu_disc .3.5.39 Missing strategy
[    0.398437] Pid: 1, comm: swapper Not tainted 2.6.32-02063262-generic #201405200837
[    0.398438] Call Trace:
[    0.398444]  [<ffffffff8108e419>] set_fail+0x59/0x60
[    0.398446]  [<ffffffff8108e74b>] sysctl_check_table+0x16b/0x4b0
[    0.398447]  [<ffffffff8108e75c>] sysctl_check_table+0x17c/0x4b0
[    0.398449]  [<ffffffff8108e75c>] sysctl_check_table+0x17c/0x4b0
[    0.398452]  [<ffffffff810712dd>] __register_sysctl_paths+0x11d/0x360
[    0.398455]  [<ffffffff811a21d8>] ? __proc_create+0xd8/0x130
[    0.398459]  [<ffffffff8187d106>] ? sysctl_ipv4_init+0x0/0x4e
[    0.398461]  [<ffffffff8107154b>] register_sysctl_paths+0x2b/0x30
[    0.398463]  [<ffffffff8187d122>] sysctl_ipv4_init+0x1c/0x4e
[    0.398466]  [<ffffffff8100a04c>] do_one_initcall+0x3c/0x1a0
[    0.398469]  [<ffffffff818446d1>] do_basic_setup+0x54/0x66
[    0.398470]  [<ffffffff818447f1>] kernel_init+0x10e/0x156
[    0.398473]  [<ffffffff810131ea>] child_rip+0xa/0x20
[    0.398474]  [<ffffffff818446e3>] ? kernel_init+0x0/0x156
[    0.398476]  [<ffffffff810131e0>] ? child_rip+0x0/0x20

and here's a bug link:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1326473

For the Ubuntu Lucid kernel, we ended up reverting the offending
commits.  Since I was able to reproduce this problem with a vanilla
2.6.32.62, you may want to take a similar action for the next 2.6.32
release.

Cheers,
--
Luís

> ------------------
> 
> From: Michal Tesar <mtesar@redhat.com>
> 
> [ Upstream commit 651e92716aaae60fc41b9652f54cb6803896e0da ]
> 
> Limit the min/max value passed to the
> /proc/sys/net/ipv4/tcp_syn_retries.
> 
> Signed-off-by: Michal Tesar <mtesar@redhat.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Willy Tarreau <w@1wt.eu>
> ---
>  net/ipv4/sysctl_net_ipv4.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
> index 2dcf04d..910fa54 100644
> --- a/net/ipv4/sysctl_net_ipv4.c
> +++ b/net/ipv4/sysctl_net_ipv4.c
> @@ -23,6 +23,8 @@
>  
>  static int zero;
>  static int tcp_retr1_max = 255;
> +static int tcp_syn_retries_min = 1;
> +static int tcp_syn_retries_max = MAX_TCP_SYNCNT;
>  static int ip_local_port_range_min[] = { 1, 1 };
>  static int ip_local_port_range_max[] = { 65535, 65535 };
>  
> @@ -237,7 +239,9 @@ static struct ctl_table ipv4_table[] = {
>  		.data		= &ipv4_config.no_pmtu_disc,
>  		.maxlen		= sizeof(int),
>  		.mode		= 0644,
> -		.proc_handler	= proc_dointvec
> +		.proc_handler	= proc_dointvec_minmax,
> +		.extra1		= &tcp_syn_retries_min,
> +		.extra2		= &tcp_syn_retries_max
>  	},
>  	{
>  		.ctl_name	= NET_IPV4_NONLOCAL_BIND,
> -- 
> 1.7.12.2.21.g234cd45.dirty
> 
> 
> 

^ permalink raw reply	[flat|nested] 172+ messages in thread

* Re: [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
  2014-06-11 18:46   ` Luis Henriques
@ 2014-06-11 19:46     ` Willy Tarreau
  2014-06-12 12:55       ` Luis Henriques
  0 siblings, 1 reply; 172+ messages in thread
From: Willy Tarreau @ 2014-06-11 19:46 UTC (permalink / raw)
  To: Luis Henriques; +Cc: linux-kernel, stable, Michal Tesar, David S. Miller

Hi Luis,

On Wed, Jun 11, 2014 at 07:46:44PM +0100, Luis Henriques wrote:
> Hi Willy,
> 
> On Mon, May 12, 2014 at 02:32:59AM +0200, Willy Tarreau wrote:
> > 2.6.32-longterm review patch.  If anyone has any objections, please let me know.
> > 
> 
> During Ubuntu Lucid kernel regression testing, after the merge of
> 2.6.32.62, we found problems with the following patches
> 
> [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
>            (Upstream commit 651e92716aaae60fc41b9652f54cb6803896e0da)
> 
> [ 065/143] net: check net.core.somaxconn sysctl values
>            (Upstream commit 5f671d6b4ec3e6d66c2a868738af2cdea09e7509)
> 
> The following two stack traces were found in kernel logs:

Aie :-/

> [    0.199908] sysctl table check failed: /net/core/somaxconn .3.1.18 Missing strategy
> [    0.201100] Pid: 1, comm: swapper Not tainted 2.6.32-02063262-generic #201405200837
> [    0.202173] Call Trace:
(...)
> and here's a bug link:
> 
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1326473

I think that Tyler's suggestion is the right approach.

> For the Ubuntu Lucid kernel, we ended up reverting the offending
> commits.  Since I was able to reproduce this problem with a vanilla
> 2.6.32.62, you may want to take a similar action for the next 2.6.32
> release.

The initial bug is hard to debug on live systems. I've been hit myself
and it took me a lot of time to find the root cause. The problem is that
the backlog is stored on an unsigned short while the sysctl is stored
on an int, and the value is naturally truncated, so when you use a
somaxconn of N*65536 + just a few, you end up with just a few and drop
a lot of SYNs even under moderate loads. Worse, the only people who
touch these values are those who run under high loads and who are the
most likely to face the issue.
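
The truncation can be reproduced in userspace with a minimal sketch.
Here struct fake_sock and store_backlog() are hypothetical stand-ins
used only for illustration; they are not the kernel's structures. The
sketch only assumes what is described above: an int-sized sysctl value
stored into a 16-bit unsigned field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a listening socket; only the backlog field matters. */
struct fake_sock {
	unsigned short max_ack_backlog;
};

/*
 * Store an int-sized somaxconn value into the 16-bit backlog field and
 * return what was actually kept. The conversion wraps modulo 65536.
 */
static unsigned short store_backlog(struct fake_sock *sk, int somaxconn)
{
	assert(sk != NULL);
	sk->max_ack_backlog = (unsigned short)somaxconn;
	return sk->max_ack_backlog;
}
```

With somaxconn set to 2 * 65536 + 5, the stored backlog is 5, and with
exactly 65536 it is 0. Bounding the sysctl at 65535 keeps the stored
value equal to the configured one.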

Thus if there's a quick way to check that Tyler's fix reliably addresses
the issue, I think we should take it instead. Of course I understand that
in the meantime the revert is better for you!

Regards,
Willy


^ permalink raw reply	[flat|nested] 172+ messages in thread

* Re: [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
  2014-06-11 19:46     ` Willy Tarreau
@ 2014-06-12 12:55       ` Luis Henriques
  2014-06-12 13:02         ` Willy Tarreau
  2014-06-14 17:50         ` Willy Tarreau
  0 siblings, 2 replies; 172+ messages in thread
From: Luis Henriques @ 2014-06-12 12:55 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: linux-kernel, stable, Michal Tesar, David S. Miller, tyler.hicks

(Adding Tyler to the thread, as I should have done in the first place)

Willy Tarreau <w@1wt.eu> writes:

> Hi Luis,
>
> On Wed, Jun 11, 2014 at 07:46:44PM +0100, Luis Henriques wrote:
>> Hi Willy,
>> 
>> On Mon, May 12, 2014 at 02:32:59AM +0200, Willy Tarreau wrote:
>> > 2.6.32-longterm review patch.  If anyone has any objections, please let me know.
>> > 
>> 
>> During Ubuntu Lucid kernel regression testing, after the merge of
>> 2.6.32.62, we found problems with the following patches
>> 
>> [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
>>            (Upstream commit 651e92716aaae60fc41b9652f54cb6803896e0da)
>> 
>> [ 065/143] net: check net.core.somaxconn sysctl values
>>            (Upstream commit 5f671d6b4ec3e6d66c2a868738af2cdea09e7509)
>> 
>> The following two stack traces were found in kernel logs:
>
> Aie :-/
>
>> [    0.199908] sysctl table check failed: /net/core/somaxconn .3.1.18 Missing strategy
>> [    0.201100] Pid: 1, comm: swapper Not tainted 2.6.32-02063262-generic #201405200837
>> [    0.202173] Call Trace:
> (...)
>> and here's a bug link:
>> 
>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1326473
>
> I think that Tyler's suggestion is the right approach.
>
>> For the Ubuntu Lucid kernel, we ended up reverting the offending
>> commits.  Since I was able to reproduce this problem with a vanilla
>> 2.6.32.62, you may want to take a similar action for the next 2.6.32
>> release.
>
> The initial bug is hard to debug on live systems. I've been hit myself
> and it took me a lot of time to find the root cause. The problem is that
> the backlog is stored on an unsigned short while the sysctl is stored
> on an int, and the value is naturally truncated, so when you use an
> somaxconn of N*65536 + just a few, you end up with just a few and drop
> a lot of SYNs even under moderate loads. Worse, the only people who
> touch these values are those who run under high loads and who are the
> most likely to face the issue.
>
> Thus if there's a quick way to check that Tyler's fix reliably addresses
> the issue, I think we should take it instead. Of course I understand that
> in the mean time the revert is better for you!
>
> Regards,
> Willy
>

I was finally able to spend some more time with this and tried (a
modified) Tyler's patch on top of 2.6.32.62, and it seems to work.
Although I haven't done any extended testing, I don't see the two
stack traces and the /proc/sys/net/ipv4/ directory seems to be
correctly populated.

I'm attaching the patch I've used, based on Tyler's.

Cheers,
-- 
Luís

diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index e2eaf29..e6bf72c 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -121,7 +121,8 @@ static struct ctl_table netns_core_table[] = {
 		.mode		= 0644,
 		.extra1		= &zero,
 		.extra2		= &ushort_max,
-		.proc_handler	= proc_dointvec_minmax
+		.proc_handler	= proc_dointvec_minmax,
+		.strategy	= &sysctl_intvec
 	},
 	{ .ctl_name = 0 }
 };
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 910fa54..d957371 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -241,7 +241,8 @@ static struct ctl_table ipv4_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= &tcp_syn_retries_min,
-		.extra2		= &tcp_syn_retries_max
+		.extra2		= &tcp_syn_retries_max,
+		.strategy	= &sysctl_intvec
 	},
 	{
 		.ctl_name	= NET_IPV4_NONLOCAL_BIND,

^ permalink raw reply related	[flat|nested] 172+ messages in thread

* Re: [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
  2014-06-12 12:55       ` Luis Henriques
@ 2014-06-12 13:02         ` Willy Tarreau
  2014-06-14 17:50         ` Willy Tarreau
  1 sibling, 0 replies; 172+ messages in thread
From: Willy Tarreau @ 2014-06-12 13:02 UTC (permalink / raw)
  To: Luis Henriques
  Cc: linux-kernel, stable, Michal Tesar, David S. Miller, tyler.hicks

Hi Luis,

On Thu, Jun 12, 2014 at 01:55:53PM +0100, Luis Henriques wrote:
> (Adding Tyler to the thread, as I should have done in the first place)
> 
> Willy Tarreau <w@1wt.eu> writes:
> 
> > Hi Luis,
> >
> > On Wed, Jun 11, 2014 at 07:46:44PM +0100, Luis Henriques wrote:
> >> Hi Willy,
> >> 
> >> On Mon, May 12, 2014 at 02:32:59AM +0200, Willy Tarreau wrote:
> >> > 2.6.32-longterm review patch.  If anyone has any objections, please let me know.
> >> > 
> >> 
> >> During Ubuntu Lucid kernel regression testing, after the merge of
> >> 2.6.32.62, we found problems with the following patches
> >> 
> >> [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
> >>            (Upstream commit 651e92716aaae60fc41b9652f54cb6803896e0da)
> >> 
> >> [ 065/143] net: check net.core.somaxconn sysctl values
> >>            (Upstream commit 5f671d6b4ec3e6d66c2a868738af2cdea09e7509)
> >> 
> >> The following two stack traces were found in kernel logs:
> >
> > Aie :-/
> >
> >> [    0.199908] sysctl table check failed: /net/core/somaxconn .3.1.18 Missing strategy
> >> [    0.201100] Pid: 1, comm: swapper Not tainted 2.6.32-02063262-generic #201405200837
> >> [    0.202173] Call Trace:
> > (...)
> >> and here's a bug link:
> >> 
> >> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1326473
> >
> > I think that Tyler's suggestion is the right approach.
> >
> >> For the Ubuntu Lucid kernel, we ended up reverting the offending
> >> commits.  Since I was able to reproduce this problem with a vanilla
> >> 2.6.32.62, you may want to take a similar action for the next 2.6.32
> >> release.
> >
> > The initial bug is hard to debug on live systems. I've been hit myself
> > and it took me a lot of time to find the root cause. The problem is that
> > the backlog is stored on an unsigned short while the sysctl is stored
> > on an int, and the value is naturally truncated, so when you use an
> > somaxconn of N*65536 + just a few, you end up with just a few and drop
> > a lot of SYNs even under moderate loads. Worse, the only people who
> > touch these values are those who run under high loads and who are the
> > most likely to face the issue.
> >
> > Thus if there's a quick way to check that Tyler's fix reliably addresses
> > the issue, I think we should take it instead. Of course I understand that
> > in the mean time the revert is better for you!
> >
> > Regards,
> > Willy
> >
> 
> I was finally able to spend some more time with this and tried (a
> modified) Tyler's patch on top of 2.6.32.62, and it seems to work.
> Although I haven't done any extended testing, I don't see the two
> stack traces and the /proc/sys/net/ipv4/ directory seems to be
> correctly populated.

OK so that's confirmed now. I remember we had to do the same for
another patch during the 32.y cycle last year, so I'm not surprised.

> I'm attaching the patch I've used, based on Tyler's.

Great, thank you guys. I'll queue it up for .63. I just have to check
if there is anything else pending for a release.

Cheers,
Willy


^ permalink raw reply	[flat|nested] 172+ messages in thread

* Re: [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
  2014-06-12 12:55       ` Luis Henriques
  2014-06-12 13:02         ` Willy Tarreau
@ 2014-06-14 17:50         ` Willy Tarreau
  2014-06-20 22:16           ` Eric W. Biederman
  1 sibling, 1 reply; 172+ messages in thread
From: Willy Tarreau @ 2014-06-14 17:50 UTC (permalink / raw)
  To: Luis Henriques
  Cc: linux-kernel, stable, Michal Tesar, David S. Miller, tyler.hicks

Hi Luis,

On Thu, Jun 12, 2014 at 01:55:53PM +0100, Luis Henriques wrote:
> I was finally able to spend some more time with this and tried (a
> modified) Tyler's patch on top of 2.6.32.62, and it seems to work.
> Although I haven't done any extended testing, I don't see the two
> stack traces and the /proc/sys/net/ipv4/ directory seems to be
> correctly populated.
> 
> I'm attaching the patch I've used, based on Tyler's.

Would either you or Tyler please kindly pass me a signed-off-by with
a commit message? That would be great. Alternatively I'd do it myself
and mention you authored it.

> Cheers,
> -- 
> Luís

Thanks,
Willy

> diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
> index e2eaf29..e6bf72c 100644
> --- a/net/core/sysctl_net_core.c
> +++ b/net/core/sysctl_net_core.c
> @@ -121,7 +121,8 @@ static struct ctl_table netns_core_table[] = {
>  		.mode		= 0644,
>  		.extra1		= &zero,
>  		.extra2		= &ushort_max,
> -		.proc_handler	= proc_dointvec_minmax
> +		.proc_handler	= proc_dointvec_minmax,
> +		.strategy	= &sysctl_intvec
>  	},
>  	{ .ctl_name = 0 }
>  };
> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
> index 910fa54..d957371 100644
> --- a/net/ipv4/sysctl_net_ipv4.c
> +++ b/net/ipv4/sysctl_net_ipv4.c
> @@ -241,7 +241,8 @@ static struct ctl_table ipv4_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec_minmax,
>  		.extra1		= &tcp_syn_retries_min,
> -		.extra2		= &tcp_syn_retries_max
> +		.extra2		= &tcp_syn_retries_max,
> +		.strategy	= &sysctl_intvec
>  	},
>  	{
>  		.ctl_name	= NET_IPV4_NONLOCAL_BIND,

^ permalink raw reply	[flat|nested] 172+ messages in thread

* Re: [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
  2014-06-14 17:50         ` Willy Tarreau
@ 2014-06-20 22:16           ` Eric W. Biederman
  2014-06-20 22:58             ` Willy Tarreau
  0 siblings, 1 reply; 172+ messages in thread
From: Eric W. Biederman @ 2014-06-20 22:16 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Luis Henriques, linux-kernel, stable, Michal Tesar,
	David S. Miller, tyler.hicks

Willy Tarreau <w@1wt.eu> writes:

> Hi Luis,
>
> On Thu, Jun 12, 2014 at 01:55:53PM +0100, Luis Henriques wrote:
>> I was finally able to spend some more time with this and tried (a
>> modified) Tyler's patch on top of 2.6.32.62, and it seems to work.
>> Although I haven't done any extended testing, I don't see the two
>> stack traces and the /proc/sys/net/ipv4/ directory seems to be
>> correctly populated.
>> 
>> I'm attaching the patch I've used, based on Tyler's.
>
> Would any of you or Tyler please kindly pass me a signed-off-by with
> a commit message ? That would be great. Alternately I'd do it myself
> and mention you authored them.

If my memory serves, it is possible in 2.6.32 to set
.ctl_name = CTL_UNNEEDED

and not need to implement a .strategy routine at all.

Given the fact that most people got the strategy routines
slightly wrong and that sys_sysctl is effectively unused,
an approach where you don't implement code that no-one
will use in a backport would be preferable.

Since you have mentioned this has come up a couple of times, if nothing
else this will be something to think about for next time.

I am puzzled why .ctl_name was populated in a backport at all.

Eric

^ permalink raw reply	[flat|nested] 172+ messages in thread

* Re: [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
  2014-06-20 22:16           ` Eric W. Biederman
@ 2014-06-20 22:58             ` Willy Tarreau
  2014-06-21  0:19               ` Eric W. Biederman
  0 siblings, 1 reply; 172+ messages in thread
From: Willy Tarreau @ 2014-06-20 22:58 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Luis Henriques, linux-kernel, stable, Michal Tesar,
	David S. Miller, tyler.hicks

Hi Eric,

On Fri, Jun 20, 2014 at 03:16:07PM -0700, Eric W. Biederman wrote:
> Willy Tarreau <w@1wt.eu> writes:
> 
> > Hi Luis,
> >
> > On Thu, Jun 12, 2014 at 01:55:53PM +0100, Luis Henriques wrote:
> >> I was finally able to spend some more time with this and tried (a
> >> modified) Tyler's patch on top of 2.6.32.62, and it seems to work.
> >> Although I haven't done any extended testing, I don't see the two
> >> stack traces and the /proc/sys/net/ipv4/ directory seems to be
> >> correctly populated.
> >> 
> >> I'm attaching the patch I've used, based on Tyler's.
> >
> > Would any of you or Tyler please kindly pass me a signed-off-by with
> > a commit message ? That would be great. Alternately I'd do it myself
> > and mention you authored them.
> 
> If my memory serves, it is possible in 2.6.32 to set
> .ctl_name = CTL_UNNEEDED
> 
> and not need to implement a .strategy routine at all.

Ah that's quite interesting, thanks for the tip!

> Given the fact that most people got the strategy routines
> slightly wrong and that sys_sysctl is effectively unused,
> an approach where you don't implement code that no-one
> will use in a backport would be preferable.

OK.

> Since you have mentioned this has come up a couple of times, if nothing
> else this will be something to think about for next time.

I'm keeping your e-mail where I manage patches, hoping to recognize
this case next time.

> I am puzzled why .ctl_name was populated in a backport at all.

Oh it's simply because I didn't know it did not have to be there,
and among the few reviewers, I guess that it's not common to know
what version uses what semantics.

Thank you for the explanation, it's really helpful. We're not used
to backporting sysctl changes but here I got caught a few times and have
found some sysctl.conf with bogus values in the field a few times, so it
was really important to backport this one.

Best regards,
Willy


^ permalink raw reply	[flat|nested] 172+ messages in thread

* Re: [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary
  2014-06-20 22:58             ` Willy Tarreau
@ 2014-06-21  0:19               ` Eric W. Biederman
  0 siblings, 0 replies; 172+ messages in thread
From: Eric W. Biederman @ 2014-06-21  0:19 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Luis Henriques, linux-kernel, stable, Michal Tesar,
	David S. Miller, tyler.hicks

Willy Tarreau <w@1wt.eu> writes:

> Hi Eric,
>
> On Fri, Jun 20, 2014 at 03:16:07PM -0700, Eric W. Biederman wrote:
>> Willy Tarreau <w@1wt.eu> writes:
>> 
>> > Hi Luis,
>> >
>> > On Thu, Jun 12, 2014 at 01:55:53PM +0100, Luis Henriques wrote:
>> >> I was finally able to spend some more time with this and tried (a
>> >> modified) Tyler's patch on top of 2.6.32.62, and it seems to work.
>> >> Although I haven't done any extended testing, I don't see the two
>> >> stack traces and the /proc/sys/net/ipv4/ directory seems to be
>> >> correctly populated.
>> >> 
>> >> I'm attaching the patch I've used, based on Tyler's.
>> >
>> > Would any of you or Tyler please kindly pass me a signed-off-by with
>> > a commit message ? That would be great. Alternately I'd do it myself
>> > and mention you authored them.
>> 
>> If my memory serves it is possible in 2.6.32 to set
>> .ctl_name = CTL_UNNEEDED
>> 
>> and not need to implement a .strategy routine at all.
>
> Ah that's quite interesting, thanks for the tip!
>
>> Given that most people got the strategy routines slightly wrong
>> and that sys_sysctl is effectively unused, an approach where you
>> don't implement code that no one will use in a backport would be
>> preferable.
>
> OK.
>
>> Since you have mentioned this has come up a couple of times, if
>> nothing else this will be something to think about for next time.
>
> I'm keeping your e-mail where I manage patches, hoping to recognize
> this case next time.
>
>> I am puzzled why .ctl_name was populated in a backport at all.
>
> Oh it's simply because I didn't know it did not have to be there,
> and among the few reviewers, I guess that it's not common to know
> what version uses what semantics.

I guess what I meant is that the field .ctl_name does not even exist
anymore, for the same reasons .strategy does not exist anymore. So I
was just surprised that someone picked a randomish number and stuck
it in there.

If anyone actually were to use those randomish numbers in the binary
sys_sysctl call their applications would break when they eventually
moved to a more recent kernel.

That is one of the reasons it was decided, around the 2.6.32
timeframe, that no more binary sysctls would be allocated.

> Thank you for the explanation, it's really helpful. We're not used
> to backporting sysctl changes, but I got caught a few times here and
> have found sysctl.conf files with bogus values in the field, so it
> was really important to backport this one.

Sysctls do have their uses, and at least 2.6.32 has runtime sysctl checks
to keep the insanity to a dull roar.

Eric
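
[Illustration, not part of the original message.] The suggestion above
(leave the binary id unset and provide no .strategy routine, so only the
/proc/sys text path does range checking) can be sketched in plain
userspace C. Everything here is an assumption made for illustration:
struct ctl_table_sketch, CTL_UNNUMBERED_SKETCH, write_int_minmax() and
the 1..127 bounds are hypothetical stand-ins, not the kernel's
definitions and not the patch that was applied. The message names the
constant from memory as CTL_UNNEEDED; as far as I can tell, kernel
headers of that era spell it CTL_UNNUMBERED.

```c
/*
 * Userspace sketch only. The struct, the macro and the helper below are
 * hypothetical stand-ins that mimic the shape of a 2.6.32-era sysctl
 * table entry; they are not the kernel's definitions.
 */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define CTL_UNNUMBERED_SKETCH 0	/* assumed placeholder: "no binary id" */

struct ctl_table_sketch;
typedef int (*handler_fn)(const struct ctl_table_sketch *entry, int value);

struct ctl_table_sketch {
	int ctl_name;			/* binary sys_sysctl id, unset here */
	const char *procname;		/* file name under /proc/sys */
	int *data;			/* variable backing the sysctl */
	handler_fn proc_handler;	/* text (/proc/sys) write path */
	handler_fn strategy;		/* binary path, deliberately NULL */
	const int *min;			/* lower bound (extra1 in the kernel) */
	const int *max;			/* upper bound (extra2 in the kernel) */
};

/* Range-checked write, in the spirit of a min/max integer handler. */
static int write_int_minmax(const struct ctl_table_sketch *entry, int value)
{
	assert(entry != NULL && entry->data != NULL);

	if ((entry->min && value < *entry->min) ||
	    (entry->max && value > *entry->max))
		return -EINVAL;		/* reject out-of-range values */

	*entry->data = value;
	return 0;
}

/* Illustrative bounds; the real limits come from the backported patch. */
static int syn_retries = 5;
static const int syn_retries_min = 1;
static const int syn_retries_max = 127;

static const struct ctl_table_sketch tcp_syn_retries_entry = {
	.ctl_name	= CTL_UNNUMBERED_SKETCH,
	.procname	= "tcp_syn_retries",
	.data		= &syn_retries,
	.proc_handler	= write_int_minmax,
	.strategy	= NULL,		/* nothing to get "slightly wrong" */
	.min		= &syn_retries_min,
	.max		= &syn_retries_max,
};
```

With only one write path, an out-of-range value such as 0 is rejected
and the stored value is left untouched, and there is no second,
rarely-exercised binary handler to keep in sync.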


Thread overview: 172+ messages
     [not found] <f07e5fe6d87f172fc73580b9c86ba9a2@local>
2014-05-12  0:32 ` [ 000/143] 2.6.32.62-longterm review Willy Tarreau
2014-05-12  0:32 ` [ 001/143] scsi: fix missing include linux/types.h in scsi_netlink.h Willy Tarreau
2014-05-12  0:32 ` [ 002/143] Fix lockup related to stop_machine being stuck in __do_softirq Willy Tarreau
2014-05-12  0:32 ` [ 003/143] Revert "x86, ptrace: fix build breakage with gcc 4.7" Willy Tarreau
2014-05-12  0:32 ` [ 004/143] x86, ptrace: fix build breakage with gcc 4.7 (second try) Willy Tarreau
2014-05-12  0:32 ` [ 005/143] ipvs: fix CHECKSUM_PARTIAL for TCP, UDP Willy Tarreau
2014-05-12  0:32 ` [ 006/143] intel-iommu: Flush unmaps at domain_exit Willy Tarreau
2014-05-12  0:32 ` [ 007/143] staging: comedi: ni_65xx: (bug fix) confine insn_bits to one Willy Tarreau
2014-05-12  0:32 ` [ 008/143] kernel/kmod.c: check for NULL in call_usermodehelper_exec() Willy Tarreau
2014-05-12  0:32 ` [ 009/143] cciss: fix info leak in cciss_ioctl32_passthru() Willy Tarreau
2014-05-12  0:32 ` [ 010/143] cpqarray: fix info leak in ida_locked_ioctl() Willy Tarreau
2014-05-12  0:32 ` [ 011/143] drivers/cdrom/cdrom.c: use kzalloc() for failing hardware Willy Tarreau
2014-05-12  0:32 ` [ 012/143] sctp: deal with multiple COOKIE_ECHO chunks Willy Tarreau
2014-05-12  0:32 ` [ 013/143] sctp: Use correct sideffect command in duplicate cookie handling Willy Tarreau
2014-05-12  0:32 ` [ 014/143] ipv6: ip6_sk_dst_check() must not assume ipv6 dst Willy Tarreau
2014-05-12  0:32 ` [ 015/143] af_key: fix info leaks in notify messages Willy Tarreau
2014-05-12  0:32 ` [ 016/143] af_key: initialize satype in key_notify_policy_flush() Willy Tarreau
2014-05-12  0:32 ` [ 017/143] block: do not pass disk names as format strings Willy Tarreau
2014-05-12  0:32 ` [ 018/143] b43: stop format string leaking into error msgs Willy Tarreau
2014-05-12  0:32 ` [ 019/143] HID: validate HID report id size Willy Tarreau
2014-05-12  0:32 ` [ 020/143] HID: zeroplus: validate output report details Willy Tarreau
2014-05-12  0:32 ` [ 021/143] HID: pantherlord: " Willy Tarreau
2014-05-12  0:32 ` [ 022/143] HID: LG: validate HID " Willy Tarreau
2014-05-12  0:32 ` [ 023/143] HID: check for NULL field when setting values Willy Tarreau
2014-05-12  0:32 ` [ 024/143] HID: provide a helper for validating hid reports Willy Tarreau
2014-05-12  0:32 ` [ 025/143] crypto: api - Fix race condition in larval lookup Willy Tarreau
2014-05-12  0:32 ` [ 026/143] ipv6: tcp: fix panic in SYN processing Willy Tarreau
2014-05-12  0:32 ` [ 027/143] tcp: must unclone packets before mangling them Willy Tarreau
2014-05-12  0:32 ` [ 028/143] net: do not call sock_put() on TIMEWAIT sockets Willy Tarreau
2014-05-12  0:32 ` [ 029/143] net: heap overflow in __audit_sockaddr() Willy Tarreau
2014-05-12  0:32 ` [ 030/143] proc connector: fix info leaks Willy Tarreau
2014-05-12  8:41   ` Christoph Biedl
2014-05-12  8:51   ` Mathias Krause
2014-05-12  8:57     ` Willy Tarreau
2014-05-12 11:43       ` Willy Tarreau
2014-05-12 14:42       ` David Miller
2014-05-12  0:32 ` [ 031/143] can: dev: fix nlmsg size calculation in can_get_size() Willy Tarreau
2014-05-12  0:32 ` [ 032/143] net: vlan: fix nlmsg size calculation in vlan_get_size() Willy Tarreau
2014-05-12  0:32 ` [ 033/143] farsync: fix info leak in ioctl Willy Tarreau
2014-05-12  0:32 ` [ 034/143] connector: use nlmsg_len() to check message length Willy Tarreau
2014-05-12  0:32 ` [ 035/143] net: dst: provide accessor function to dst->xfrm Willy Tarreau
2014-05-12  0:32 ` [ 036/143] sctp: Use software crc32 checksum when xfrm transform will happen Willy Tarreau
2014-05-12  0:32 ` [ 037/143] sctp: Perform software checksum if packet has to be fragmented Willy Tarreau
2014-05-12  0:32 ` [ 038/143] wanxl: fix info leak in ioctl Willy Tarreau
2014-05-12  0:32 ` [ 039/143] davinci_emac.c: Fix IFF_ALLMULTI setup Willy Tarreau
2014-05-12  0:32 ` [ 040/143] resubmit bridge: fix message_age_timer calculation Willy Tarreau
2014-05-12  0:32 ` [ 041/143] ipv6 mcast: use in6_dev_put in timer handlers instead of Willy Tarreau
2014-05-12  0:32 ` [ 042/143] ipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put Willy Tarreau
2014-05-12  0:32 ` [ 043/143] dm9601: fix IFF_ALLMULTI handling Willy Tarreau
2014-05-12  0:32 ` [ 044/143] bonding: Fix broken promiscuity reference counting issue Willy Tarreau
2014-05-12  0:32 ` [ 045/143] ll_temac: Reset dma descriptors indexes on ndo_open Willy Tarreau
2014-05-12  0:32 ` [ 046/143] tcp: fix tcp_md5_hash_skb_data() Willy Tarreau
2014-05-12  0:32 ` [ 047/143] ipv6: fix possible crashes in ip6_cork_release() Willy Tarreau
2014-05-12  0:32 ` [ 048/143] ip_tunnel: fix kernel panic with icmp_dest_unreach Willy Tarreau
2014-05-12  0:32 ` [ 049/143] net: sctp: fix NULL pointer dereference in socket destruction Willy Tarreau
2014-05-12  0:32 ` [ 050/143] packet: packet_getname_spkt: make sure string is always 0-terminated Willy Tarreau
2014-05-12  0:32 ` [ 051/143] neighbour: fix a race in neigh_destroy() Willy Tarreau
2014-05-12  0:32 ` [ 052/143] net: Swap ver and type in pppoe_hdr Willy Tarreau
2014-05-12  0:32 ` [ 053/143] sunvnet: vnet_port_remove must call unregister_netdev Willy Tarreau
2014-05-12  0:32 ` [ 054/143] ifb: fix rcu_sched self-detected stalls Willy Tarreau
2014-05-12  0:32 ` [ 055/143] dummy: fix oops when loading the dummy failed Willy Tarreau
2014-05-12  0:32 ` [ 056/143] ifb: fix oops when loading the ifb failed Willy Tarreau
2014-05-12  0:32 ` [ 057/143] vlan: fix a race in egress prio management Willy Tarreau
2014-05-12  0:32 ` [ 058/143] arcnet: cleanup sizeof parameter Willy Tarreau
2014-05-12  0:32 ` [ 059/143] sysctl net: Keep tcp_syn_retries inside the boundary Willy Tarreau
2014-06-11 18:46   ` Luis Henriques
2014-06-11 19:46     ` Willy Tarreau
2014-06-12 12:55       ` Luis Henriques
2014-06-12 13:02         ` Willy Tarreau
2014-06-14 17:50         ` Willy Tarreau
2014-06-20 22:16           ` Eric W. Biederman
2014-06-20 22:58             ` Willy Tarreau
2014-06-21  0:19               ` Eric W. Biederman
2014-05-12  0:33 ` [ 060/143] sctp: fully initialize sctp_outq in sctp_outq_init Willy Tarreau
2014-05-12  0:33 ` [ 061/143] net_sched: Fix stack info leak in cbq_dump_wrr() Willy Tarreau
2014-05-12  0:33 ` [ 062/143] af_key: more info leaks in pfkey messages Willy Tarreau
2014-05-12  0:33 ` [ 063/143] net_sched: info leak in atm_tc_dump_class() Willy Tarreau
2014-05-12  0:33 ` [ 064/143] htb: fix sign extension bug Willy Tarreau
2014-05-12  0:33 ` [ 065/143] net: check net.core.somaxconn sysctl values Willy Tarreau
2014-05-12  0:33 ` [ 066/143] tcp: cubic: fix bug in bictcp_acked() Willy Tarreau
2014-05-12  0:33 ` [ 067/143] ipv6: dont stop backtracking in fib6_lookup_1 if subtree does not Willy Tarreau
2014-05-12  0:33 ` [ 068/143] ipv6: remove max_addresses check from ipv6_create_tempaddr Willy Tarreau
2014-05-12  0:33 ` [ 069/143] ipv6: drop packets with multiple fragmentation headers Willy Tarreau
2014-05-12  0:33 ` [ 070/143] ipv6: Dont depend on per socket memory for neighbour discovery Willy Tarreau
2014-05-12  0:33 ` [ 071/143] ICMPv6: treat dest unreachable codes 5 and 6 as EACCES, not EPROTO Willy Tarreau
2014-05-12  0:33 ` [ 072/143] tipc: fix lockdep warning during bearer initialization Willy Tarreau
2014-05-12 16:04   ` Jon Maloy
2014-05-12 16:16     ` Willy Tarreau
2014-05-12 16:41       ` Jon Maloy
2014-05-12 17:12         ` Willy Tarreau
2014-05-12 17:19           ` Jon Maloy
2014-05-12 18:11             ` Willy Tarreau
2014-05-12  0:33 ` [ 073/143] net: Fix "ip rule delete table 256" Willy Tarreau
2014-05-12  0:33 ` [ 074/143] ipv6: use rt6_get_dflt_router to get default router in rt6_route_rcv Willy Tarreau
2014-05-12  0:33 ` [ 075/143] random32: fix off-by-one in seeding requirement Willy Tarreau
2014-05-12  0:33 ` [ 076/143] bonding: fix two race conditions in bond_store_updelay/downdelay Willy Tarreau
2014-05-12  0:33 ` [ 077/143] isdnloop: use strlcpy() instead of strcpy() Willy Tarreau
2014-05-12  0:33 ` [ 078/143] ipv4: fix possible seqlock deadlock Willy Tarreau
2014-05-12  0:33 ` [ 079/143] inet: prevent leakage of uninitialized memory to user in recv Willy Tarreau
2014-05-12  0:33 ` [ 080/143] net: rework recvmsg handler msg_name and msg_namelen logic Willy Tarreau
2014-05-13 12:44   ` Luis Henriques
2014-05-13 12:49     ` Willy Tarreau
2014-05-14  5:45     ` Willy Tarreau
2014-05-12  0:33 ` [ 081/143] net: add BUG_ON if kernel advertises msg_namelen > sizeof(struct Willy Tarreau
2014-05-12  0:33 ` [ 082/143] inet: fix addr_len/msg->msg_namelen assignment in recv_error and Willy Tarreau
2014-05-12  0:33 ` [ 083/143] net: clamp ->msg_namelen instead of returning an error Willy Tarreau
2014-05-14 10:02   ` Dan Carpenter
2014-05-14 12:27     ` Willy Tarreau
2014-05-12  0:33 ` [ 084/143] ipv6: fix leaking uninitialized port number of offender sockaddr Willy Tarreau
2014-05-12  0:33 ` [ 085/143] atm: idt77252: fix dev refcnt leak Willy Tarreau
2014-05-12  0:33 ` [ 086/143] net: core: Always propagate flag changes to interfaces Willy Tarreau
2014-05-12  0:33 ` [ 087/143] bridge: flush brs address entry in fdb when remove the bridge dev Willy Tarreau
2014-05-12  0:33 ` [ 088/143] inet: fix possible seqlock deadlocks Willy Tarreau
2014-05-12  0:33 ` [ 089/143] ipv6: fix possible seqlock deadlock in ip6_finish_output2 Willy Tarreau
2014-05-12  0:33 ` [ 090/143] {pktgen, xfrm} Update IPv4 header total len and checksum after Willy Tarreau
2014-05-12  0:33 ` [ 091/143] net: drop_monitor: fix the value of maxattr Willy Tarreau
2014-05-12  0:33 ` [ 092/143] net: unix: allow bind to fail on mutex lock Willy Tarreau
2014-05-12  0:33 ` [ 093/143] drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl() Willy Tarreau
2014-05-12  0:33 ` [ 094/143] hamradio/yam: fix info leak in ioctl Willy Tarreau
2014-05-12  0:33 ` [ 095/143] rds: prevent dereference of a NULL device Willy Tarreau
2014-05-12  0:33 ` [ 096/143] net: rose: restore old recvmsg behavior Willy Tarreau
2014-05-12  0:33 ` [ 097/143] net: llc: fix use after free in llc_ui_recvmsg Willy Tarreau
2014-05-12  0:33 ` [ 098/143] inet_diag: fix inet_diag_dump_icsk() timewait socket state logic Willy Tarreau
2014-05-12  0:33 ` [ 099/143] net: fix ip rule iif/oif device rename Willy Tarreau
2014-05-12  0:33 ` [ 100/143] tg3: Fix deadlock in tg3_change_mtu() Willy Tarreau
2014-05-12  0:33 ` [ 101/143] bonding: 802.3ad: make aggregator_identifier bond-private Willy Tarreau
2014-05-12  0:33 ` [ 102/143] net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode Willy Tarreau
2014-05-12  0:33 ` [ 103/143] virtio-net: alloc big buffers also when guest can receive UFO Willy Tarreau
2014-05-12  0:33 ` [ 104/143] tg3: Dont check undefined error bits in RXBD Willy Tarreau
2014-05-12  0:33 ` [ 105/143] net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH Willy Tarreau
2014-05-12  0:33 ` [ 106/143] net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk Willy Tarreau
2014-05-12  0:33 ` [ 107/143] net: socket: error on a negative msg_namelen Willy Tarreau
2014-05-12  0:33 ` [ 108/143] netlink: dont compare the nul-termination in nla_strcmp Willy Tarreau
2014-05-12  0:33 ` [ 109/143] isdnloop: several buffer overflows Willy Tarreau
2014-05-12  0:33 ` [ 110/143] rds: prevent dereference of a NULL device in rds_iw_laddr_check Willy Tarreau
2014-05-12  0:33 ` [ 111/143] isdnloop: Validate NUL-terminated strings from user Willy Tarreau
2014-05-12  0:33 ` [ 112/143] sctp: unbalanced rcu lock in ip_queue_xmit() Willy Tarreau
2014-05-12  0:33 ` [ 113/143] aacraid: prevent invalid pointer dereference Willy Tarreau
2014-05-12  0:33 ` [ 114/143] ipv6: udp packets following an UFO enqueued packet need also be Willy Tarreau
2014-05-12  0:33 ` [ 115/143] inet: fix possible memory corruption with UDP_CORK and UFO Willy Tarreau
2014-05-12  0:33 ` [ 116/143] vm: add vm_iomap_memory() helper function Willy Tarreau
2014-05-12  0:33 ` [ 117/143] Fix a few incorrectly checked [io_]remap_pfn_range() calls Willy Tarreau
2014-05-12  0:33 ` [ 118/143] libertas: potential oops in debugfs Willy Tarreau
2014-05-12  0:33 ` [ 119/143] x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround Willy Tarreau
2014-05-12  0:34 ` [ 120/143] gianfar: disable TX vlan based on kernel 2.6.x Willy Tarreau
2014-05-12  0:34 ` [ 121/143] [CPUFREQ] powernow-k6: set transition latency value so ondemand Willy Tarreau
2014-05-12  0:34 ` [ 122/143] powernow-k6: disable cache when changing frequency Willy Tarreau
2014-05-12  0:34 ` [ 123/143] powernow-k6: correctly initialize default parameters Willy Tarreau
2014-05-12  0:34 ` [ 124/143] powernow-k6: reorder frequencies Willy Tarreau
2014-05-12  0:34 ` [ 125/143] tcp: fix tcp_trim_head() to adjust segment count with skb MSS Willy Tarreau
2014-05-12  0:34 ` [ 126/143] tcp_cubic: limit delayed_ack ratio to prevent divide error Willy Tarreau
2014-05-12  0:34 ` [ 127/143] tcp_cubic: fix the range of delayed_ack Willy Tarreau
2014-05-12  0:34 ` [ 128/143] n_tty: Fix n_tty_write crash when echoing in raw mode Willy Tarreau
2014-05-12  0:34 ` [ 129/143] exec/ptrace: fix get_dumpable() incorrect tests Willy Tarreau
2014-05-12  0:34 ` [ 130/143] ipv6: call udp_push_pending_frames when uncorking a socket with Willy Tarreau
2014-05-12  0:34 ` [ 131/143] dm snapshot: fix data corruption Willy Tarreau
2014-05-12  0:34 ` [ 132/143] crypto: ansi_cprng - Fix off by one error in non-block size request Willy Tarreau
2014-05-12  0:34 ` [ 133/143] uml: check length in exitcode_proc_write() Willy Tarreau
2014-05-12  0:34 ` [ 134/143] KVM: Improve create VCPU parameter (CVE-2013-4587) Willy Tarreau
2014-05-12  0:34 ` [ 135/143] KVM: x86: Fix potential divide by 0 in lapic (CVE-2013-6367) Willy Tarreau
2014-05-12  0:34 ` [ 136/143] qeth: avoid buffer overflow in snmp ioctl Willy Tarreau
2014-05-12  0:34 ` [ 137/143] xfs: underflow bug in xfs_attrlist_by_handle() Willy Tarreau
2014-05-13 11:08   ` Luis Henriques
2014-05-13 11:18     ` Willy Tarreau
2014-05-14  9:50     ` Dan Carpenter
2014-05-22  8:19       ` Dan Carpenter
2014-05-12  0:34 ` [ 138/143] aacraid: missing capable() check in compat ioctl Willy Tarreau
2014-05-12  0:34 ` [ 139/143] SELinux: Fix kernel BUG on empty security contexts Willy Tarreau
2014-05-12  0:34 ` [ 140/143] s390: fix kernel crash due to linkage stack instructions Willy Tarreau
2014-05-12  0:34 ` [ 141/143] netfilter: nf_conntrack_dccp: fix skb_header_pointer API usages Willy Tarreau
2014-05-12  0:34 ` [ 142/143] floppy: ignore kernel-only members in FDRAWCMD ioctl input Willy Tarreau
2014-05-12  0:34 ` [ 143/143] floppy: dont write kernel-only members to FDRAWCMD ioctl output Willy Tarreau
