All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 4.19 00/92] 4.19.100-stable review
@ 2020-01-28 14:07 Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 01/92] can, slip: Protect tty->disc_data in write_wakeup and close with RCU Greg Kroah-Hartman
                   ` (88 more replies)
  0 siblings, 89 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuah, patches,
	ben.hutchings, lkft-triage, stable

This is the start of the stable review cycle for the 4.19.100 release.
There are 92 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu, 30 Jan 2020 13:57:09 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.100-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 4.19.100-rc1

David Hildenbrand <david@redhat.com>
    mm/memory_hotplug: shrink zones when offlining memory

David Hildenbrand <david@redhat.com>
    mm/memory_hotplug: fix try_offline_node()

Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    mm/memunmap: don't access uninitialized memmap in memunmap_pages()

David Hildenbrand <david@redhat.com>
    drivers/base/node.c: simplify unregister_memory_block_under_nodes()

Dan Williams <dan.j.williams@intel.com>
    mm/hotplug: kill is_dev_zone() usage in __remove_pages()

David Hildenbrand <david@redhat.com>
    mm/memory_hotplug: remove "zone" parameter from sparse_remove_one_section

David Hildenbrand <david@redhat.com>
    mm/memory_hotplug: make unregister_memory_block_under_nodes() never fail

David Hildenbrand <david@redhat.com>
    mm/memory_hotplug: remove memory block devices before arch_remove_memory()

David Hildenbrand <david@redhat.com>
    mm/memory_hotplug: create memory block devices after arch_add_memory()

David Hildenbrand <david@redhat.com>
    drivers/base/memory: pass a block_id to init_memory_block()

David Hildenbrand <david@redhat.com>
    mm/memory_hotplug: allow arch_remove_memory() without CONFIG_MEMORY_HOTREMOVE

David Hildenbrand <david@redhat.com>
    s390x/mm: implement arch_remove_memory()

David Hildenbrand <david@redhat.com>
    mm/memory_hotplug: make __remove_pages() and arch_remove_memory() never fail

Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    powerpc/mm: Fix section mismatch warning

David Hildenbrand <david@redhat.com>
    mm/memory_hotplug: make __remove_section() never fail

David Hildenbrand <david@redhat.com>
    mm/memory_hotplug: make unregister_memory_section() never fail

Dan Carpenter <dan.carpenter@oracle.com>
    mm, memory_hotplug: update a comment in unregister_memory()

Baoquan He <bhe@redhat.com>
    drivers/base/memory.c: clean up relics in function parameters

David Hildenbrand <david@redhat.com>
    mm/memory_hotplug: release memory resource after arch_remove_memory()

Oscar Salvador <osalvador@suse.com>
    mm, memory_hotplug: add nid parameter to arch_remove_memory

Wei Yang <richard.weiyang@gmail.com>
    drivers/base/memory.c: remove an unnecessary check on NR_MEM_SECTIONS

Wei Yang <richard.weiyang@gmail.com>
    mm, sparse: pass nid instead of pgdat to sparse_add_one_section()

Wei Yang <richard.weiyang@gmail.com>
    mm, sparse: drop pgdat_resize_lock in sparse_add/remove_one_section()

David Hildenbrand <david@redhat.com>
    mm/memory_hotplug: make remove_memory() take the device_hotplug_lock

Martin Schiller <ms@dev.tdt.de>
    net/x25: fix nonblocking connect

Pablo Neira Ayuso <pablo@netfilter.org>
    netfilter: nf_tables: add __nft_chain_type_get()

Kadlecsik József <kadlec@blackhole.kfki.hu>
    netfilter: ipset: use bitmap infrastructure completely

Bo Wu <wubo40@huawei.com>
    scsi: iscsi: Avoid potential deadlock in iscsi_if_rx func

Hans Verkuil <hverkuil-cisco@xs4all.nl>
    media: v4l2-ioctl.c: zero reserved fields for S/TRY_FMT

Wen Huang <huangwenabc@gmail.com>
    libertas: Fix two buffer overflows at parsing bss descriptor

Suzuki K Poulose <suzuki.poulose@arm.com>
    coresight: tmc-etf: Do not call smp_processor_id from preemptible

Suzuki K Poulose <suzuki.poulose@arm.com>
    coresight: etb10: Do not call smp_processor_id from preemptible

Ard Biesheuvel <ard.biesheuvel@linaro.org>
    crypto: geode-aes - switch to skcipher for cbc(aes) fallback

Masato Suzuki <masato.suzuki@wdc.com>
    sd: Fix REQ_OP_ZONE_REPORT completion handling

Steven Rostedt (VMware) <rostedt@goodmis.org>
    tracing: Fix histogram code when expression has same var as value

Tom Zanussi <tom.zanussi@linux.intel.com>
    tracing: Remove open-coding of hist trigger var_ref management

Tom Zanussi <tom.zanussi@linux.intel.com>
    tracing: Use hist trigger's var_ref array to destroy var_refs

Finn Thain <fthain@telegraphics.com.au>
    net/sonic: Prevent tx watchdog timeout

Finn Thain <fthain@telegraphics.com.au>
    net/sonic: Fix CAM initialization

Finn Thain <fthain@telegraphics.com.au>
    net/sonic: Fix command register usage

Finn Thain <fthain@telegraphics.com.au>
    net/sonic: Quiesce SONIC before re-initializing descriptor memory

Finn Thain <fthain@telegraphics.com.au>
    net/sonic: Fix receive buffer replenishment

Finn Thain <fthain@telegraphics.com.au>
    net/sonic: Improve receive descriptor status flag check

Finn Thain <fthain@telegraphics.com.au>
    net/sonic: Avoid needless receive descriptor EOL flag updates

Finn Thain <fthain@telegraphics.com.au>
    net/sonic: Fix receive buffer handling

Finn Thain <fthain@telegraphics.com.au>
    net/sonic: Fix interface error stats collection

Finn Thain <fthain@telegraphics.com.au>
    net/sonic: Use MMIO accessors

Finn Thain <fthain@telegraphics.com.au>
    net/sonic: Clear interrupt flags immediately

Finn Thain <fthain@telegraphics.com.au>
    net/sonic: Add mutual exclusion for accessing shared state

Al Viro <viro@zeniv.linux.org.uk>
    do_last(): fetch directory ->i_mode and ->i_uid before it's too late

Changbin Du <changbin.du@gmail.com>
    tracing: xen: Ordered comparison of function pointers

Bart Van Assche <bvanassche@acm.org>
    scsi: RDMA/isert: Fix a recently introduced regression related to logout

Gilles Buloz <gilles.buloz@kontron.com>
    hwmon: (nct7802) Fix voltage limits to wrong registers

Florian Westphal <fw@strlen.de>
    netfilter: nft_osf: add missing check for DREG attribute

Chuhong Yuan <hslester96@gmail.com>
    Input: sun4i-ts - add a check for devm_thermal_zone_of_sensor_register

Johan Hovold <johan@kernel.org>
    Input: pegasus_notetaker - fix endpoint sanity check

Johan Hovold <johan@kernel.org>
    Input: aiptek - fix endpoint sanity check

Johan Hovold <johan@kernel.org>
    Input: gtco - fix endpoint sanity check

Johan Hovold <johan@kernel.org>
    Input: sur40 - fix interface sanity checks

Stephan Gerhold <stephan@gerhold.net>
    Input: pm8xxx-vib - fix handling of separate enable register

Jeremy Linton <jeremy.linton@arm.com>
    Documentation: Document arm64 kpti control

Michał Mirosław <mirq-linux@rere.qmqm.pl>
    mmc: sdhci: fix minimum clock rate for v3 controller

Michał Mirosław <mirq-linux@rere.qmqm.pl>
    mmc: tegra: fix SDR50 tuning override

Alex Sverdlin <alexander.sverdlin@nokia.com>
    ARM: 8950/1: ftrace/recordmcount: filter relocation types

Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Revert "Input: synaptics-rmi4 - don't increment rmiaddr for SMBus transfers"

Johan Hovold <johan@kernel.org>
    Input: keyspan-remote - fix control-message timeouts

Masami Hiramatsu <mhiramat@kernel.org>
    tracing: trigger: Replace unneeded RCU-list traversals

Alex Deucher <alexander.deucher@amd.com>
    PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken

Guenter Roeck <linux@roeck-us.net>
    hwmon: (core) Do not use device managed functions for memory allocations

Luuk Paulussen <luuk.paulussen@alliedtelesis.co.nz>
    hwmon: (adt7475) Make volt2reg return same reg as reg2volt input

David Howells <dhowells@redhat.com>
    afs: Fix characters allowed into cell names

Eric Dumazet <edumazet@google.com>
    tun: add mutex_unlock() call and napi.skb clearing in tun_get_user()

Eric Dumazet <edumazet@google.com>
    tcp: do not leave dangling pointers in tp->highest_sack

Wen Yang <wenyang@linux.alibaba.com>
    tcp_bbr: improve arithmetic division in bbr_update_bw()

Paolo Abeni <pabeni@redhat.com>
    Revert "udp: do rmem bulk free even if the rx sk queue is empty"

James Hughes <james.hughes@raspberrypi.org>
    net: usb: lan78xx: Add .ndo_features_check

Jouni Hogander <jouni.hogander@unikie.com>
    net-sysfs: Fix reference count leak

Jouni Hogander <jouni.hogander@unikie.com>
    net-sysfs: Call dev_hold always in rx_queue_add_kobject

Jouni Hogander <jouni.hogander@unikie.com>
    net-sysfs: Call dev_hold always in netdev_queue_add_kobject

Eric Dumazet <edumazet@google.com>
    net-sysfs: fix netdev_queue_add_kobject() breakage

Jouni Hogander <jouni.hogander@unikie.com>
    net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject

Cong Wang <xiyou.wangcong@gmail.com>
    net_sched: fix datalen for ematch

Eric Dumazet <edumazet@google.com>
    net: rtnetlink: validate IFLA_MTU attribute in rtnl_create_link()

William Dauchy <w.dauchy@criteo.com>
    net, ip_tunnel: fix namespaces move

William Dauchy <w.dauchy@criteo.com>
    net, ip6_tunnel: fix namespaces move

Niko Kortstrom <niko.kortstrom@nokia.com>
    net: ip6_gre: fix moving ip6gre between namespaces

Michael Ellerman <mpe@ellerman.id.au>
    net: cxgb3_main: Add CAP_NET_ADMIN check to CHELSIO_GET_MEM

Florian Fainelli <f.fainelli@gmail.com>
    net: bcmgenet: Use netif_tx_napi_add() for TX NAPI

Yuki Taguchi <tagyounit@gmail.com>
    ipv6: sr: remove SKB_GSO_IPXIP6 on End.D* actions

Eric Dumazet <edumazet@google.com>
    gtp: make sure only SOCK_DGRAM UDP sockets are accepted

Wenwen Wang <wenwen@cs.uga.edu>
    firestream: fix memory leaks

Richard Palethorpe <rpalethorpe@suse.com>
    can, slip: Protect tty->disc_data in write_wakeup and close with RCU


-------------

Diffstat:

 Documentation/admin-guide/kernel-parameters.txt |   6 +
 Makefile                                        |   4 +-
 arch/ia64/mm/init.c                             |  15 +-
 arch/powerpc/mm/mem.c                           |  25 +-
 arch/powerpc/platforms/powernv/memtrace.c       |   2 +-
 arch/powerpc/platforms/pseries/hotplug-memory.c |   6 +-
 arch/s390/mm/init.c                             |  16 +-
 arch/sh/mm/init.c                               |  15 +-
 arch/x86/mm/init_32.c                           |   9 +-
 arch/x86/mm/init_64.c                           |  17 +-
 drivers/acpi/acpi_memhotplug.c                  |   2 +-
 drivers/atm/firestream.c                        |   3 +
 drivers/base/memory.c                           | 203 ++++++++-----
 drivers/base/node.c                             |  52 ++--
 drivers/crypto/geode-aes.c                      |  57 ++--
 drivers/crypto/geode-aes.h                      |   2 +-
 drivers/hwmon/adt7475.c                         |   5 +-
 drivers/hwmon/hwmon.c                           |  68 +++--
 drivers/hwmon/nct7802.c                         |   4 +-
 drivers/hwtracing/coresight/coresight-etb10.c   |   4 +-
 drivers/hwtracing/coresight/coresight-tmc-etf.c |   4 +-
 drivers/infiniband/ulp/isert/ib_isert.c         |  12 -
 drivers/input/misc/keyspan_remote.c             |   9 +-
 drivers/input/misc/pm8xxx-vibrator.c            |   2 +-
 drivers/input/rmi4/rmi_smbus.c                  |   2 +
 drivers/input/tablet/aiptek.c                   |   6 +-
 drivers/input/tablet/gtco.c                     |  10 +-
 drivers/input/tablet/pegasus_notetaker.c        |   2 +-
 drivers/input/touchscreen/sun4i-ts.c            |   6 +-
 drivers/input/touchscreen/sur40.c               |   2 +-
 drivers/media/v4l2-core/v4l2-ioctl.c            |  24 +-
 drivers/mmc/host/sdhci-tegra.c                  |   2 +-
 drivers/mmc/host/sdhci.c                        |  10 +-
 drivers/net/can/slcan.c                         |  12 +-
 drivers/net/ethernet/broadcom/genet/bcmgenet.c  |   4 +-
 drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c |   2 +
 drivers/net/ethernet/natsemi/sonic.c            | 380 ++++++++++++++----------
 drivers/net/ethernet/natsemi/sonic.h            |  44 ++-
 drivers/net/gtp.c                               |  10 +-
 drivers/net/slip/slip.c                         |  12 +-
 drivers/net/tun.c                               |   4 +
 drivers/net/usb/lan78xx.c                       |  15 +
 drivers/net/wireless/marvell/libertas/cfg.c     |  16 +-
 drivers/pci/quirks.c                            |  19 +-
 drivers/scsi/scsi_transport_iscsi.c             |   7 +
 drivers/scsi/sd.c                               |   8 +-
 drivers/target/iscsi/iscsi_target.c             |   6 +-
 fs/afs/cell.c                                   |  11 +-
 fs/namei.c                                      |  17 +-
 include/linux/memory.h                          |   8 +-
 include/linux/memory_hotplug.h                  |  22 +-
 include/linux/mmzone.h                          |   3 +-
 include/linux/netdevice.h                       |   2 +
 include/linux/netfilter/ipset/ip_set.h          |   7 -
 include/linux/node.h                            |   7 +-
 include/trace/events/xen.h                      |   6 +-
 kernel/memremap.c                               |  12 +-
 kernel/trace/trace_events_hist.c                | 177 +++++++++--
 kernel/trace/trace_events_trigger.c             |  20 +-
 mm/hmm.c                                        |   8 +-
 mm/memory_hotplug.c                             | 166 ++++++-----
 mm/sparse.c                                     |  27 +-
 net/core/dev.c                                  |  33 +-
 net/core/net-sysfs.c                            |  39 ++-
 net/core/rtnetlink.c                            |  13 +-
 net/ipv4/ip_tunnel.c                            |   4 +-
 net/ipv4/tcp.c                                  |   1 +
 net/ipv4/tcp_bbr.c                              |   3 +-
 net/ipv4/tcp_input.c                            |   1 +
 net/ipv4/tcp_output.c                           |   1 +
 net/ipv4/udp.c                                  |   3 +-
 net/ipv6/ip6_gre.c                              |   3 -
 net/ipv6/ip6_tunnel.c                           |   4 +-
 net/ipv6/seg6_local.c                           |   4 +-
 net/netfilter/ipset/ip_set_bitmap_gen.h         |   2 +-
 net/netfilter/ipset/ip_set_bitmap_ip.c          |   6 +-
 net/netfilter/ipset/ip_set_bitmap_ipmac.c       |   6 +-
 net/netfilter/ipset/ip_set_bitmap_port.c        |   6 +-
 net/netfilter/nf_tables_api.c                   |  29 +-
 net/netfilter/nft_osf.c                         |   3 +
 net/sched/ematch.c                              |   2 +-
 net/x25/af_x25.c                                |   6 +-
 scripts/recordmcount.c                          |  17 ++
 83 files changed, 1102 insertions(+), 722 deletions(-)



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 01/92] can, slip: Protect tty->disc_data in write_wakeup and close with RCU
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 02/92] firestream: fix memory leaks Greg Kroah-Hartman
                   ` (87 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+017e491ae13c0068598a,
	Richard Palethorpe, Wolfgang Grandegger, Marc Kleine-Budde,
	David S. Miller, Tyler Hall, linux-can, netdev, syzkaller

From: Richard Palethorpe <rpalethorpe@suse.com>

[ Upstream commit 0ace17d56824165c7f4c68785d6b58971db954dd ]

write_wakeup can happen in parallel with close/hangup where tty->disc_data
is set to NULL and the netdevice is freed thus also freeing
disc_data. write_wakeup accesses disc_data so we must prevent close from
freeing the netdev while write_wakeup has a non-NULL view of
tty->disc_data.

We also need to make sure that accesses to disc_data are atomic. Which can
all be done with RCU.

This problem was found by Syzkaller on SLCAN, but the same issue is
reproducible with the SLIP line discipline using an LTP test based on the
Syzkaller reproducer.

A fix which didn't use RCU was posted by Hillf Danton.

Fixes: 661f7fda21b1 ("slip: Fix deadlock in write_wakeup")
Fixes: a8e83b17536a ("slcan: Port write_wakeup deadlock fix from slip")
Reported-by: syzbot+017e491ae13c0068598a@syzkaller.appspotmail.com
Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Tyler Hall <tylerwhall@gmail.com>
Cc: linux-can@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: syzkaller@googlegroups.com
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/can/slcan.c |   12 ++++++++++--
 drivers/net/slip/slip.c |   12 ++++++++++--
 2 files changed, 20 insertions(+), 4 deletions(-)

--- a/drivers/net/can/slcan.c
+++ b/drivers/net/can/slcan.c
@@ -343,9 +343,16 @@ static void slcan_transmit(struct work_s
  */
 static void slcan_write_wakeup(struct tty_struct *tty)
 {
-	struct slcan *sl = tty->disc_data;
+	struct slcan *sl;
+
+	rcu_read_lock();
+	sl = rcu_dereference(tty->disc_data);
+	if (!sl)
+		goto out;
 
 	schedule_work(&sl->tx_work);
+out:
+	rcu_read_unlock();
 }
 
 /* Send a can_frame to a TTY queue. */
@@ -640,10 +647,11 @@ static void slcan_close(struct tty_struc
 		return;
 
 	spin_lock_bh(&sl->lock);
-	tty->disc_data = NULL;
+	rcu_assign_pointer(tty->disc_data, NULL);
 	sl->tty = NULL;
 	spin_unlock_bh(&sl->lock);
 
+	synchronize_rcu();
 	flush_work(&sl->tx_work);
 
 	/* Flush network side */
--- a/drivers/net/slip/slip.c
+++ b/drivers/net/slip/slip.c
@@ -452,9 +452,16 @@ static void slip_transmit(struct work_st
  */
 static void slip_write_wakeup(struct tty_struct *tty)
 {
-	struct slip *sl = tty->disc_data;
+	struct slip *sl;
+
+	rcu_read_lock();
+	sl = rcu_dereference(tty->disc_data);
+	if (!sl)
+		goto out;
 
 	schedule_work(&sl->tx_work);
+out:
+	rcu_read_unlock();
 }
 
 static void sl_tx_timeout(struct net_device *dev)
@@ -882,10 +889,11 @@ static void slip_close(struct tty_struct
 		return;
 
 	spin_lock_bh(&sl->lock);
-	tty->disc_data = NULL;
+	rcu_assign_pointer(tty->disc_data, NULL);
 	sl->tty = NULL;
 	spin_unlock_bh(&sl->lock);
 
+	synchronize_rcu();
 	flush_work(&sl->tx_work);
 
 	/* VSV = very important to remove timers */

^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 02/92] firestream: fix memory leaks
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 01/92] can, slip: Protect tty->disc_data in write_wakeup and close with RCU Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 03/92] gtp: make sure only SOCK_DGRAM UDP sockets are accepted Greg Kroah-Hartman
                   ` (86 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Wenwen Wang, David S. Miller

From: Wenwen Wang <wenwen@cs.uga.edu>

[ Upstream commit fa865ba183d61c1ec8cbcab8573159c3b72b89a4 ]

In fs_open(), 'vcc' is allocated through kmalloc() and assigned to
'atm_vcc->dev_data.' In the following execution, if an error occurs, e.g.,
there is no more free channel, an error code EBUSY or ENOMEM will be
returned. However, 'vcc' is not deallocated, leading to memory leaks. Note
that, in normal cases where fs_open() returns 0, 'vcc' will be deallocated
in fs_close(). But, if fs_open() fails, there is no guarantee that
fs_close() will be invoked.

To fix this issue, deallocate 'vcc' before the error code is returned.

Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/atm/firestream.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/atm/firestream.c
+++ b/drivers/atm/firestream.c
@@ -927,6 +927,7 @@ static int fs_open(struct atm_vcc *atm_v
 			}
 			if (!to) {
 				printk ("No more free channels for FS50..\n");
+				kfree(vcc);
 				return -EBUSY;
 			}
 			vcc->channo = dev->channo;
@@ -937,6 +938,7 @@ static int fs_open(struct atm_vcc *atm_v
 			if (((DO_DIRECTION(rxtp) && dev->atm_vccs[vcc->channo])) ||
 			    ( DO_DIRECTION(txtp) && test_bit (vcc->channo, dev->tx_inuse))) {
 				printk ("Channel is in use for FS155.\n");
+				kfree(vcc);
 				return -EBUSY;
 			}
 		}
@@ -950,6 +952,7 @@ static int fs_open(struct atm_vcc *atm_v
 			    tc, sizeof (struct fs_transmit_config));
 		if (!tc) {
 			fs_dprintk (FS_DEBUG_OPEN, "fs: can't alloc transmit_config.\n");
+			kfree(vcc);
 			return -ENOMEM;
 		}
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 03/92] gtp: make sure only SOCK_DGRAM UDP sockets are accepted
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 01/92] can, slip: Protect tty->disc_data in write_wakeup and close with RCU Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 02/92] firestream: fix memory leaks Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 04/92] ipv6: sr: remove SKB_GSO_IPXIP6 on End.D* actions Greg Kroah-Hartman
                   ` (85 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Pablo Neira, syzbot,
	David S. Miller

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 940ba14986657a50c15f694efca1beba31fa568f ]

A malicious user could use RAW sockets and fool
GTP using them as standard SOCK_DGRAM UDP sockets.

BUG: KMSAN: uninit-value in udp_tunnel_encap_enable include/net/udp_tunnel.h:174 [inline]
BUG: KMSAN: uninit-value in setup_udp_tunnel_sock+0x45e/0x6f0 net/ipv4/udp_tunnel.c:85
CPU: 0 PID: 11262 Comm: syz-executor613 Not tainted 5.5.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c9/0x220 lib/dump_stack.c:118
 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118
 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
 udp_tunnel_encap_enable include/net/udp_tunnel.h:174 [inline]
 setup_udp_tunnel_sock+0x45e/0x6f0 net/ipv4/udp_tunnel.c:85
 gtp_encap_enable_socket+0x37f/0x5a0 drivers/net/gtp.c:827
 gtp_encap_enable drivers/net/gtp.c:844 [inline]
 gtp_newlink+0xfb/0x1e50 drivers/net/gtp.c:666
 __rtnl_newlink net/core/rtnetlink.c:3305 [inline]
 rtnl_newlink+0x2973/0x3920 net/core/rtnetlink.c:3363
 rtnetlink_rcv_msg+0x1153/0x1570 net/core/rtnetlink.c:5424
 netlink_rcv_skb+0x451/0x650 net/netlink/af_netlink.c:2477
 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5442
 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
 netlink_unicast+0xf9e/0x1100 net/netlink/af_netlink.c:1328
 netlink_sendmsg+0x1248/0x14d0 net/netlink/af_netlink.c:1917
 sock_sendmsg_nosec net/socket.c:639 [inline]
 sock_sendmsg net/socket.c:659 [inline]
 ____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330
 ___sys_sendmsg net/socket.c:2384 [inline]
 __sys_sendmsg+0x451/0x5f0 net/socket.c:2417
 __do_sys_sendmsg net/socket.c:2426 [inline]
 __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424
 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x441359
Code: e8 ac e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fff1cd0ac28 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441359
RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 00000000006cb018 R08: 00000000004002c8 R09: 00000000004002c8
R10: 00000000004002c8 R11: 0000000000000246 R12: 00000000004020d0
R13: 0000000000402160 R14: 0000000000000000 R15: 0000000000000000

Uninit was created at:
 kmsan_save_stack_with_flags+0x3c/0x90 mm/kmsan/kmsan.c:144
 kmsan_internal_alloc_meta_for_pages mm/kmsan/kmsan_shadow.c:307 [inline]
 kmsan_alloc_page+0x12a/0x310 mm/kmsan/kmsan_shadow.c:336
 __alloc_pages_nodemask+0x57f2/0x5f60 mm/page_alloc.c:4800
 alloc_pages_current+0x67d/0x990 mm/mempolicy.c:2207
 alloc_pages include/linux/gfp.h:534 [inline]
 alloc_slab_page+0x111/0x12f0 mm/slub.c:1511
 allocate_slab mm/slub.c:1656 [inline]
 new_slab+0x2bc/0x1130 mm/slub.c:1722
 new_slab_objects mm/slub.c:2473 [inline]
 ___slab_alloc+0x1533/0x1f30 mm/slub.c:2624
 __slab_alloc mm/slub.c:2664 [inline]
 slab_alloc_node mm/slub.c:2738 [inline]
 slab_alloc mm/slub.c:2783 [inline]
 kmem_cache_alloc+0xb23/0xd70 mm/slub.c:2788
 sk_prot_alloc+0xf2/0x620 net/core/sock.c:1597
 sk_alloc+0xf0/0xbe0 net/core/sock.c:1657
 inet_create+0x7c7/0x1370 net/ipv4/af_inet.c:321
 __sock_create+0x8eb/0xf00 net/socket.c:1420
 sock_create net/socket.c:1471 [inline]
 __sys_socket+0x1a1/0x600 net/socket.c:1513
 __do_sys_socket net/socket.c:1522 [inline]
 __se_sys_socket+0x8d/0xb0 net/socket.c:1520
 __x64_sys_socket+0x4a/0x70 net/socket.c:1520
 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Pablo Neira <pablo@netfilter.org>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/gtp.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -809,19 +809,21 @@ static struct sock *gtp_encap_enable_soc
 		return NULL;
 	}
 
-	if (sock->sk->sk_protocol != IPPROTO_UDP) {
+	sk = sock->sk;
+	if (sk->sk_protocol != IPPROTO_UDP ||
+	    sk->sk_type != SOCK_DGRAM ||
+	    (sk->sk_family != AF_INET && sk->sk_family != AF_INET6)) {
 		pr_debug("socket fd=%d not UDP\n", fd);
 		sk = ERR_PTR(-EINVAL);
 		goto out_sock;
 	}
 
-	lock_sock(sock->sk);
-	if (sock->sk->sk_user_data) {
+	lock_sock(sk);
+	if (sk->sk_user_data) {
 		sk = ERR_PTR(-EBUSY);
 		goto out_rel_sock;
 	}
 
-	sk = sock->sk;
 	sock_hold(sk);
 
 	tuncfg.sk_user_data = gtp;



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 04/92] ipv6: sr: remove SKB_GSO_IPXIP6 on End.D* actions
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 03/92] gtp: make sure only SOCK_DGRAM UDP sockets are accepted Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 05/92] net: bcmgenet: Use netif_tx_napi_add() for TX NAPI Greg Kroah-Hartman
                   ` (84 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Yuki Taguchi, David S. Miller

From: Yuki Taguchi <tagyounit@gmail.com>

[ Upstream commit 62ebaeaedee7591c257543d040677a60e35c7aec ]

After LRO/GRO is applied, SRv6 encapsulated packets have
SKB_GSO_IPXIP6 feature flag, and this flag must be removed right after
decapulation procedure.

Currently, SKB_GSO_IPXIP6 flag is not removed on End.D* actions, which
creates inconsistent packet state, that is, a normal TCP/IP packets
have the SKB_GSO_IPXIP6 flag. This behavior can cause unexpected
fallback to GSO on routing to netdevices that do not support
SKB_GSO_IPXIP6. For example, on inter-VRF forwarding, decapsulated
packets separated into small packets by GSO because VRF devices do not
support TSO for packets with SKB_GSO_IPXIP6 flag, and this degrades
forwarding performance.

This patch removes encapsulation related GSO flags from the skb right
after the End.D* action is applied.

Fixes: d7a669dd2f8b ("ipv6: sr: add helper functions for seg6local")
Signed-off-by: Yuki Taguchi <tagyounit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/seg6_local.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -28,6 +28,7 @@
 #include <net/addrconf.h>
 #include <net/ip6_route.h>
 #include <net/dst_cache.h>
+#include <net/ip_tunnels.h>
 #ifdef CONFIG_IPV6_SEG6_HMAC
 #include <net/seg6_hmac.h>
 #endif
@@ -135,7 +136,8 @@ static bool decap_and_validate(struct sk
 
 	skb_reset_network_header(skb);
 	skb_reset_transport_header(skb);
-	skb->encapsulation = 0;
+	if (iptunnel_pull_offloads(skb))
+		return false;
 
 	return true;
 }



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 05/92] net: bcmgenet: Use netif_tx_napi_add() for TX NAPI
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (3 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 04/92] ipv6: sr: remove SKB_GSO_IPXIP6 on End.D* actions Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 06/92] net: cxgb3_main: Add CAP_NET_ADMIN check to CHELSIO_GET_MEM Greg Kroah-Hartman
                   ` (83 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Florian Fainelli, Doug Berger,
	David S. Miller

From: Florian Fainelli <f.fainelli@gmail.com>

[ Upstream commit 148965df1a990af98b2c84092c2a2274c7489284 ]

Before commit 7587935cfa11 ("net: bcmgenet: move NAPI initialization to
ring initialization") moved the code, this used to be
netif_tx_napi_add(), but we lost that small semantic change in the
process, restore that.

Fixes: 7587935cfa11 ("net: bcmgenet: move NAPI initialization to ring initialization")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/broadcom/genet/bcmgenet.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -2166,8 +2166,8 @@ static void bcmgenet_init_tx_ring(struct
 				  DMA_END_ADDR);
 
 	/* Initialize Tx NAPI */
-	netif_napi_add(priv->dev, &ring->napi, bcmgenet_tx_poll,
-		       NAPI_POLL_WEIGHT);
+	netif_tx_napi_add(priv->dev, &ring->napi, bcmgenet_tx_poll,
+			  NAPI_POLL_WEIGHT);
 }
 
 /* Initialize a RDMA ring */



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 06/92] net: cxgb3_main: Add CAP_NET_ADMIN check to CHELSIO_GET_MEM
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 05/92] net: bcmgenet: Use netif_tx_napi_add() for TX NAPI Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 07/92] net: ip6_gre: fix moving ip6gre between namespaces Greg Kroah-Hartman
                   ` (82 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ilja Van Sprundel, Michael Ellerman,
	David S. Miller

From: Michael Ellerman <mpe@ellerman.id.au>

[ Upstream commit 3546d8f1bbe992488ed91592cf6bf76e7114791a =

The cxgb3 driver for "Chelsio T3-based gigabit and 10Gb Ethernet
adapters" implements a custom ioctl as SIOCCHIOCTL/SIOCDEVPRIVATE in
cxgb_extension_ioctl().

One of the subcommands of the ioctl is CHELSIO_GET_MEM, which appears
to read memory directly out of the adapter and return it to userspace.
It's not entirely clear what the contents of the adapter memory
contains, but the assumption is that it shouldn't be accessible to all
users.

So add a CAP_NET_ADMIN check to the CHELSIO_GET_MEM case. Put it after
the is_offload() check, which matches two of the other subcommands in
the same function which also check for is_offload() and CAP_NET_ADMIN.

Found by Ilja by code inspection, not tested as I don't have the
required hardware.

Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
@@ -2449,6 +2449,8 @@ static int cxgb_extension_ioctl(struct n
 
 		if (!is_offload(adapter))
 			return -EOPNOTSUPP;
+		if (!capable(CAP_NET_ADMIN))
+			return -EPERM;
 		if (!(adapter->flags & FULL_INIT_DONE))
 			return -EIO;	/* need the memory controllers */
 		if (copy_from_user(&t, useraddr, sizeof(t)))



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 07/92] net: ip6_gre: fix moving ip6gre between namespaces
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (5 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 06/92] net: cxgb3_main: Add CAP_NET_ADMIN check to CHELSIO_GET_MEM Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 08/92] net, ip6_tunnel: fix namespaces move Greg Kroah-Hartman
                   ` (81 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Niko Kortstrom, Nicolas Dichtel,
	William Tu, David S. Miller

From: Niko Kortstrom <niko.kortstrom@nokia.com>

[ Upstream commit 690afc165bb314354667f67157c1a1aea7dc797a ]

Support for moving IPv4 GRE tunnels between namespaces was added in
commit b57708add314 ("gre: add x-netns support"). The respective change
for IPv6 tunnels, commit 22f08069e8b4 ("ip6gre: add x-netns support")
did not drop NETIF_F_NETNS_LOCAL flag so moving them from one netns to
another is still denied in IPv6 case. Drop NETIF_F_NETNS_LOCAL flag from
ip6gre tunnels to allow moving ip6gre tunnel endpoints between network
namespaces.

Signed-off-by: Niko Kortstrom <niko.kortstrom@nokia.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/ip6_gre.c |    3 ---
 1 file changed, 3 deletions(-)

--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -1486,7 +1486,6 @@ static int ip6gre_tunnel_init_common(str
 		dev->mtu -= 8;
 
 	if (tunnel->parms.collect_md) {
-		dev->features |= NETIF_F_NETNS_LOCAL;
 		netif_keep_dst(dev);
 	}
 	ip6gre_tnl_init_features(dev);
@@ -1914,7 +1913,6 @@ static void ip6gre_tap_setup(struct net_
 	dev->needs_free_netdev = true;
 	dev->priv_destructor = ip6gre_dev_free;
 
-	dev->features |= NETIF_F_NETNS_LOCAL;
 	dev->priv_flags &= ~IFF_TX_SKB_SHARING;
 	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
 	netif_keep_dst(dev);
@@ -2223,7 +2221,6 @@ static void ip6erspan_tap_setup(struct n
 	dev->needs_free_netdev = true;
 	dev->priv_destructor = ip6gre_dev_free;
 
-	dev->features |= NETIF_F_NETNS_LOCAL;
 	dev->priv_flags &= ~IFF_TX_SKB_SHARING;
 	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
 	netif_keep_dst(dev);



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 08/92] net, ip6_tunnel: fix namespaces move
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (6 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 07/92] net: ip6_gre: fix moving ip6gre between namespaces Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 09/92] net, ip_tunnel: " Greg Kroah-Hartman
                   ` (80 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, William Dauchy, Nicolas Dichtel,
	David S. Miller

From: William Dauchy <w.dauchy@criteo.com>

[ Upstream commit 5311a69aaca30fa849c3cc46fb25f75727fb72d0 ]

in the same manner as commit d0f418516022 ("net, ip_tunnel: fix
namespaces move"), fix namespace moving as it was broken since commit
8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnel"), but for
ipv6 this time; there is no reason to keep it for ip6_tunnel.

Fixes: 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnel")
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/ip6_tunnel.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1882,10 +1882,8 @@ static int ip6_tnl_dev_init(struct net_d
 	if (err)
 		return err;
 	ip6_tnl_link_config(t);
-	if (t->parms.collect_md) {
-		dev->features |= NETIF_F_NETNS_LOCAL;
+	if (t->parms.collect_md)
 		netif_keep_dst(dev);
-	}
 	return 0;
 }
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 09/92] net, ip_tunnel: fix namespaces move
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (7 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 08/92] net, ip6_tunnel: fix namespaces move Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 10/92] net: rtnetlink: validate IFLA_MTU attribute in rtnl_create_link() Greg Kroah-Hartman
                   ` (79 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, William Dauchy, Nicolas Dichtel,
	David S. Miller

From: William Dauchy <w.dauchy@criteo.com>

[ Upstream commit d0f418516022c32ecceaf4275423e5bd3f8743a9 ]

in the same manner as commit 690afc165bb3 ("net: ip6_gre: fix moving
ip6gre between namespaces"), fix namespace moving as it was broken since
commit 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.").
Indeed, the ip6_gre commit removed the local flag for collect_md
condition, so there is no reason to keep it for ip_gre/ip_tunnel.

this patch will fix both ip_tunnel and ip_gre modules.

Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
Signed-off-by: William Dauchy <w.dauchy@criteo.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/ip_tunnel.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -1203,10 +1203,8 @@ int ip_tunnel_init(struct net_device *de
 	iph->version		= 4;
 	iph->ihl		= 5;
 
-	if (tunnel->collect_md) {
-		dev->features |= NETIF_F_NETNS_LOCAL;
+	if (tunnel->collect_md)
 		netif_keep_dst(dev);
-	}
 	return 0;
 }
 EXPORT_SYMBOL_GPL(ip_tunnel_init);



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 10/92] net: rtnetlink: validate IFLA_MTU attribute in rtnl_create_link()
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (8 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 09/92] net, ip_tunnel: " Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 11/92] net_sched: fix datalen for ematch Greg Kroah-Hartman
                   ` (78 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, syzbot, David S. Miller

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit d836f5c69d87473ff65c06a6123e5b2cf5e56f5b ]

rtnl_create_link() needs to apply dev->min_mtu and dev->max_mtu
checks that we apply in do_setlink()

Otherwise malicious users can crash the kernel, for example after
an integer overflow :

BUG: KASAN: use-after-free in memset include/linux/string.h:365 [inline]
BUG: KASAN: use-after-free in __alloc_skb+0x37b/0x5e0 net/core/skbuff.c:238
Write of size 32 at addr ffff88819f20b9c0 by task swapper/0/0

CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x197/0x210 lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
 __kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506
 kasan_report+0x12/0x20 mm/kasan/common.c:639
 check_memory_region_inline mm/kasan/generic.c:185 [inline]
 check_memory_region+0x134/0x1a0 mm/kasan/generic.c:192
 memset+0x24/0x40 mm/kasan/common.c:108
 memset include/linux/string.h:365 [inline]
 __alloc_skb+0x37b/0x5e0 net/core/skbuff.c:238
 alloc_skb include/linux/skbuff.h:1049 [inline]
 alloc_skb_with_frags+0x93/0x590 net/core/skbuff.c:5664
 sock_alloc_send_pskb+0x7ad/0x920 net/core/sock.c:2242
 sock_alloc_send_skb+0x32/0x40 net/core/sock.c:2259
 mld_newpack+0x1d7/0x7f0 net/ipv6/mcast.c:1609
 add_grhead.isra.0+0x299/0x370 net/ipv6/mcast.c:1713
 add_grec+0x7db/0x10b0 net/ipv6/mcast.c:1844
 mld_send_cr net/ipv6/mcast.c:1970 [inline]
 mld_ifc_timer_expire+0x3d3/0x950 net/ipv6/mcast.c:2477
 call_timer_fn+0x1ac/0x780 kernel/time/timer.c:1404
 expire_timers kernel/time/timer.c:1449 [inline]
 __run_timers kernel/time/timer.c:1773 [inline]
 __run_timers kernel/time/timer.c:1740 [inline]
 run_timer_softirq+0x6c3/0x1790 kernel/time/timer.c:1786
 __do_softirq+0x262/0x98c kernel/softirq.c:292
 invoke_softirq kernel/softirq.c:373 [inline]
 irq_exit+0x19b/0x1e0 kernel/softirq.c:413
 exiting_irq arch/x86/include/asm/apic.h:536 [inline]
 smp_apic_timer_interrupt+0x1a3/0x610 arch/x86/kernel/apic/apic.c:1137
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
 </IRQ>
RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
Code: 98 6b ea f9 eb 8a cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d 44 1c 60 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 34 1c 60 00 fb f4 <c3> cc 55 48 89 e5 41 57 41 56 41 55 41 54 53 e8 4e 5d 9a f9 e8 79
RSP: 0018:ffffffff89807ce8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
RAX: 1ffffffff13266ae RBX: ffffffff8987a1c0 RCX: 0000000000000000
RDX: dffffc0000000000 RSI: 0000000000000006 RDI: ffffffff8987aa54
RBP: ffffffff89807d18 R08: ffffffff8987a1c0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000
R13: ffffffff8a799980 R14: 0000000000000000 R15: 0000000000000000
 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:690
 default_idle_call+0x84/0xb0 kernel/sched/idle.c:94
 cpuidle_idle_call kernel/sched/idle.c:154 [inline]
 do_idle+0x3c8/0x6e0 kernel/sched/idle.c:269
 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:361
 rest_init+0x23b/0x371 init/main.c:451
 arch_call_rest_init+0xe/0x1b
 start_kernel+0x904/0x943 init/main.c:784
 x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:490
 x86_64_start_kernel+0x77/0x7b arch/x86/kernel/head64.c:471
 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:242

The buggy address belongs to the page:
page:ffffea00067c82c0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
raw: 057ffe0000000000 ffffea00067c82c8 ffffea00067c82c8 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88819f20b880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff88819f20b900: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff88819f20b980: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                           ^
 ffff88819f20ba00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff88819f20ba80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

Fixes: 61e84623ace3 ("net: centralize net_device min/max MTU checking")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/netdevice.h |    2 ++
 net/core/dev.c            |   29 +++++++++++++++++++----------
 net/core/rtnetlink.c      |   13 +++++++++++--
 3 files changed, 32 insertions(+), 12 deletions(-)

--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3579,6 +3579,8 @@ int dev_set_alias(struct net_device *, c
 int dev_get_alias(const struct net_device *, char *, size_t);
 int dev_change_net_namespace(struct net_device *, struct net *, const char *);
 int __dev_set_mtu(struct net_device *, int);
+int dev_validate_mtu(struct net_device *dev, int mtu,
+		     struct netlink_ext_ack *extack);
 int dev_set_mtu_ext(struct net_device *dev, int mtu,
 		    struct netlink_ext_ack *extack);
 int dev_set_mtu(struct net_device *, int);
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7752,6 +7752,22 @@ int __dev_set_mtu(struct net_device *dev
 }
 EXPORT_SYMBOL(__dev_set_mtu);
 
+int dev_validate_mtu(struct net_device *dev, int new_mtu,
+		     struct netlink_ext_ack *extack)
+{
+	/* MTU must be positive, and in range */
+	if (new_mtu < 0 || new_mtu < dev->min_mtu) {
+		NL_SET_ERR_MSG(extack, "mtu less than device minimum");
+		return -EINVAL;
+	}
+
+	if (dev->max_mtu > 0 && new_mtu > dev->max_mtu) {
+		NL_SET_ERR_MSG(extack, "mtu greater than device maximum");
+		return -EINVAL;
+	}
+	return 0;
+}
+
 /**
  *	dev_set_mtu_ext - Change maximum transfer unit
  *	@dev: device
@@ -7768,16 +7784,9 @@ int dev_set_mtu_ext(struct net_device *d
 	if (new_mtu == dev->mtu)
 		return 0;
 
-	/* MTU must be positive, and in range */
-	if (new_mtu < 0 || new_mtu < dev->min_mtu) {
-		NL_SET_ERR_MSG(extack, "mtu less than device minimum");
-		return -EINVAL;
-	}
-
-	if (dev->max_mtu > 0 && new_mtu > dev->max_mtu) {
-		NL_SET_ERR_MSG(extack, "mtu greater than device maximum");
-		return -EINVAL;
-	}
+	err = dev_validate_mtu(dev, new_mtu, extack);
+	if (err)
+		return err;
 
 	if (!netif_device_present(dev))
 		return -ENODEV;
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2875,8 +2875,17 @@ struct net_device *rtnl_create_link(stru
 	dev->rtnl_link_ops = ops;
 	dev->rtnl_link_state = RTNL_LINK_INITIALIZING;
 
-	if (tb[IFLA_MTU])
-		dev->mtu = nla_get_u32(tb[IFLA_MTU]);
+	if (tb[IFLA_MTU]) {
+		u32 mtu = nla_get_u32(tb[IFLA_MTU]);
+		int err;
+
+		err = dev_validate_mtu(dev, mtu, NULL);
+		if (err) {
+			free_netdev(dev);
+			return ERR_PTR(err);
+		}
+		dev->mtu = mtu;
+	}
 	if (tb[IFLA_ADDRESS]) {
 		memcpy(dev->dev_addr, nla_data(tb[IFLA_ADDRESS]),
 				nla_len(tb[IFLA_ADDRESS]));



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 11/92] net_sched: fix datalen for ematch
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (9 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 10/92] net: rtnetlink: validate IFLA_MTU attribute in rtnl_create_link() Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 12/92] net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject Greg Kroah-Hartman
                   ` (77 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+2f07903a5b05e7f36410,
	Eric Dumazet, Cong Wang, Eric Dumazet, David S. Miller,
	syzbot+5af9a90dad568aa9f611

From: Cong Wang <xiyou.wangcong@gmail.com>

[ Upstream commit 61678d28d4a45ef376f5d02a839cc37509ae9281 ]

syzbot reported an out-of-bound access in em_nbyte. As initially
analyzed by Eric, this is because em_nbyte sets its own em->datalen
in em_nbyte_change() other than the one specified by user, but this
value gets overwritten later by its caller tcf_em_validate().
We should leave em->datalen untouched to respect their choices.

I audit all the in-tree ematch users, all of those implement
->change() set em->datalen, so we can just avoid setting it twice
in this case.

Reported-and-tested-by: syzbot+5af9a90dad568aa9f611@syzkaller.appspotmail.com
Reported-by: syzbot+2f07903a5b05e7f36410@syzkaller.appspotmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sched/ematch.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/sched/ematch.c
+++ b/net/sched/ematch.c
@@ -267,12 +267,12 @@ static int tcf_em_validate(struct tcf_pr
 				}
 				em->data = (unsigned long) v;
 			}
+			em->datalen = data_len;
 		}
 	}
 
 	em->matchid = em_hdr->matchid;
 	em->flags = em_hdr->flags;
-	em->datalen = data_len;
 	em->net = net;
 
 	err = 0;



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 12/92] net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (10 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 11/92] net_sched: fix datalen for ematch Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 13/92] net-sysfs: fix netdev_queue_add_kobject() breakage Greg Kroah-Hartman
                   ` (76 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, David Miller, Lukas Bulwahn, Jouni Hogander

From: Jouni Hogander <jouni.hogander@unikie.com>

commit b8eb718348b8fb30b5a7d0a8fce26fb3f4ac741b upstream.

kobject_init_and_add takes reference even when it fails. This has
to be given up by the caller in error handling. Otherwise memory
allocated by kobject_init_and_add is never freed. Originally found
by Syzkaller:

BUG: memory leak
unreferenced object 0xffff8880679f8b08 (size 8):
  comm "netdev_register", pid 269, jiffies 4294693094 (age 12.132s)
  hex dump (first 8 bytes):
    72 78 2d 30 00 36 20 d4                          rx-0.6 .
  backtrace:
    [<000000008c93818e>] __kmalloc_track_caller+0x16e/0x290
    [<000000001f2e4e49>] kvasprintf+0xb1/0x140
    [<000000007f313394>] kvasprintf_const+0x56/0x160
    [<00000000aeca11c8>] kobject_set_name_vargs+0x5b/0x140
    [<0000000073a0367c>] kobject_init_and_add+0xd8/0x170
    [<0000000088838e4b>] net_rx_queue_update_kobjects+0x152/0x560
    [<000000006be5f104>] netdev_register_kobject+0x210/0x380
    [<00000000e31dab9d>] register_netdevice+0xa1b/0xf00
    [<00000000f68b2465>] __tun_chr_ioctl+0x20d5/0x3dd0
    [<000000004c50599f>] tun_chr_ioctl+0x2f/0x40
    [<00000000bbd4c317>] do_vfs_ioctl+0x1c7/0x1510
    [<00000000d4c59e8f>] ksys_ioctl+0x99/0xb0
    [<00000000946aea81>] __x64_sys_ioctl+0x78/0xb0
    [<0000000038d946e5>] do_syscall_64+0x16f/0x580
    [<00000000e0aa5d8f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [<00000000285b3d1a>] 0xffffffffffffffff

Cc: David Miller <davem@davemloft.net>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Jouni Hogander <jouni.hogander@unikie.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/core/net-sysfs.c |   24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -932,21 +932,23 @@ static int rx_queue_add_kobject(struct n
 	error = kobject_init_and_add(kobj, &rx_queue_ktype, NULL,
 				     "rx-%u", index);
 	if (error)
-		return error;
+		goto err;
 
 	dev_hold(queue->dev);
 
 	if (dev->sysfs_rx_queue_group) {
 		error = sysfs_create_group(kobj, dev->sysfs_rx_queue_group);
-		if (error) {
-			kobject_put(kobj);
-			return error;
-		}
+		if (error)
+			goto err;
 	}
 
 	kobject_uevent(kobj, KOBJ_ADD);
 
 	return error;
+
+err:
+	kobject_put(kobj);
+	return error;
 }
 #endif /* CONFIG_SYSFS */
 
@@ -1471,21 +1473,21 @@ static int netdev_queue_add_kobject(stru
 	error = kobject_init_and_add(kobj, &netdev_queue_ktype, NULL,
 				     "tx-%u", index);
 	if (error)
-		return error;
+		goto err;
 
 	dev_hold(queue->dev);
 
 #ifdef CONFIG_BQL
 	error = sysfs_create_group(kobj, &dql_group);
-	if (error) {
-		kobject_put(kobj);
-		return error;
-	}
+	if (error)
+		goto err;
 #endif
 
 	kobject_uevent(kobj, KOBJ_ADD);
 
-	return 0;
+err:
+	kobject_put(kobj);
+	return error;
 }
 #endif /* CONFIG_SYSFS */
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 13/92] net-sysfs: fix netdev_queue_add_kobject() breakage
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (11 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 12/92] net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 14/92] net-sysfs: Call dev_hold always in netdev_queue_add_kobject Greg Kroah-Hartman
                   ` (75 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Jouni Hogander,
	David S. Miller

From: Eric Dumazet <edumazet@google.com>

commit 48a322b6f9965b2f1e4ce81af972f0e287b07ed0 upstream.

kobject_put() should only be called in error path.

Fixes: b8eb718348b8 ("net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jouni Hogander <jouni.hogander@unikie.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/core/net-sysfs.c |    1 +
 1 file changed, 1 insertion(+)

--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1484,6 +1484,7 @@ static int netdev_queue_add_kobject(stru
 #endif
 
 	kobject_uevent(kobj, KOBJ_ADD);
+	return 0;
 
 err:
 	kobject_put(kobj);



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 14/92] net-sysfs: Call dev_hold always in netdev_queue_add_kobject
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (12 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 13/92] net-sysfs: fix netdev_queue_add_kobject() breakage Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 15/92] net-sysfs: Call dev_hold always in rx_queue_add_kobject Greg Kroah-Hartman
                   ` (74 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hulk Robot, Tetsuo Handa,
	David Miller, Lukas Bulwahn

From: Jouni Hogander <jouni.hogander@unikie.com>

commit e0b60903b434a7ee21ba8d8659f207ed84101e89 upstream.

Dev_hold has to be called always in netdev_queue_add_kobject.
Otherwise usage count drops below 0 in case of failure in
kobject_init_and_add.

Fixes: b8eb718348b8 ("net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject")
Reported-by: Hulk Robot <hulkci@huawei.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: David Miller <davem@davemloft.net>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/core/net-sysfs.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1469,14 +1469,17 @@ static int netdev_queue_add_kobject(stru
 	struct kobject *kobj = &queue->kobj;
 	int error = 0;
 
+	/* Kobject_put later will trigger netdev_queue_release call
+	 * which decreases dev refcount: Take that reference here
+	 */
+	dev_hold(queue->dev);
+
 	kobj->kset = dev->queues_kset;
 	error = kobject_init_and_add(kobj, &netdev_queue_ktype, NULL,
 				     "tx-%u", index);
 	if (error)
 		goto err;
 
-	dev_hold(queue->dev);
-
 #ifdef CONFIG_BQL
 	error = sysfs_create_group(kobj, &dql_group);
 	if (error)



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 15/92] net-sysfs: Call dev_hold always in rx_queue_add_kobject
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (13 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 14/92] net-sysfs: Call dev_hold always in netdev_queue_add_kobject Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 16/92] net-sysfs: Fix reference count leak Greg Kroah-Hartman
                   ` (73 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot, Tetsuo Handa, David Miller,
	Lukas Bulwahn, Jouni Hogander

From: Jouni Hogander <jouni.hogander@unikie.com>

commit ddd9b5e3e765d8ed5a35786a6cb00111713fe161 upstream.

Dev_hold has to be called always in rx_queue_add_kobject.
Otherwise usage count drops below 0 in case of failure in
kobject_init_and_add.

Fixes: b8eb718348b8 ("net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject")
Reported-by: syzbot <syzbot+30209ea299c09d8785c9@syzkaller.appspotmail.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: David Miller <davem@davemloft.net>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Jouni Hogander <jouni.hogander@unikie.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/core/net-sysfs.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -928,14 +928,17 @@ static int rx_queue_add_kobject(struct n
 	struct kobject *kobj = &queue->kobj;
 	int error = 0;
 
+	/* Kobject_put later will trigger rx_queue_release call which
+	 * decreases dev refcount: Take that reference here
+	 */
+	dev_hold(queue->dev);
+
 	kobj->kset = dev->queues_kset;
 	error = kobject_init_and_add(kobj, &rx_queue_ktype, NULL,
 				     "rx-%u", index);
 	if (error)
 		goto err;
 
-	dev_hold(queue->dev);
-
 	if (dev->sysfs_rx_queue_group) {
 		error = sysfs_create_group(kobj, dev->sysfs_rx_queue_group);
 		if (error)



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 16/92] net-sysfs: Fix reference count leak
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (14 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 15/92] net-sysfs: Call dev_hold always in rx_queue_add_kobject Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 17/92] net: usb: lan78xx: Add .ndo_features_check Greg Kroah-Hartman
                   ` (72 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+ad8ca40ecd77896d51e2,
	David Miller, Lukas Bulwahn, Jouni Hogander

From: Jouni Hogander <jouni.hogander@unikie.com>

[ Upstream commit cb626bf566eb4433318d35681286c494f04fedcc ]

Netdev_register_kobject is calling device_initialize. In case of error
reference taken by device_initialize is not given up.

Drivers are supposed to call free_netdev in case of error. In non-error
case the last reference is given up there and device release sequence
is triggered. In error case this reference is kept and the release
sequence is never started.

Fix this by setting reg_state as NETREG_UNREGISTERED if registering
fails.

This is the rootcause for couple of memory leaks reported by Syzkaller:

BUG: memory leak unreferenced object 0xffff8880675ca008 (size 256):
  comm "netdev_register", pid 281, jiffies 4294696663 (age 6.808s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
  backtrace:
    [<0000000058ca4711>] kmem_cache_alloc_trace+0x167/0x280
    [<000000002340019b>] device_add+0x882/0x1750
    [<000000001d588c3a>] netdev_register_kobject+0x128/0x380
    [<0000000011ef5535>] register_netdevice+0xa1b/0xf00
    [<000000007fcf1c99>] __tun_chr_ioctl+0x20d5/0x3dd0
    [<000000006a5b7b2b>] tun_chr_ioctl+0x2f/0x40
    [<00000000f30f834a>] do_vfs_ioctl+0x1c7/0x1510
    [<00000000fba062ea>] ksys_ioctl+0x99/0xb0
    [<00000000b1c1b8d2>] __x64_sys_ioctl+0x78/0xb0
    [<00000000984cabb9>] do_syscall_64+0x16f/0x580
    [<000000000bde033d>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [<00000000e6ca2d9f>] 0xffffffffffffffff

BUG: memory leak
unreferenced object 0xffff8880668ba588 (size 8):
  comm "kobject_set_nam", pid 286, jiffies 4294725297 (age 9.871s)
  hex dump (first 8 bytes):
    6e 72 30 00 cc be df 2b                          nr0....+
  backtrace:
    [<00000000a322332a>] __kmalloc_track_caller+0x16e/0x290
    [<00000000236fd26b>] kstrdup+0x3e/0x70
    [<00000000dd4a2815>] kstrdup_const+0x3e/0x50
    [<0000000049a377fc>] kvasprintf_const+0x10e/0x160
    [<00000000627fc711>] kobject_set_name_vargs+0x5b/0x140
    [<0000000019eeab06>] dev_set_name+0xc0/0xf0
    [<0000000069cb12bc>] netdev_register_kobject+0xc8/0x320
    [<00000000f2e83732>] register_netdevice+0xa1b/0xf00
    [<000000009e1f57cc>] __tun_chr_ioctl+0x20d5/0x3dd0
    [<000000009c560784>] tun_chr_ioctl+0x2f/0x40
    [<000000000d759e02>] do_vfs_ioctl+0x1c7/0x1510
    [<00000000351d7c31>] ksys_ioctl+0x99/0xb0
    [<000000008390040a>] __x64_sys_ioctl+0x78/0xb0
    [<0000000052d196b7>] do_syscall_64+0x16f/0x580
    [<0000000019af9236>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [<00000000bc384531>] 0xffffffffffffffff

v3 -> v4:
  Set reg_state to NETREG_UNREGISTERED if registering fails

v2 -> v3:
* Replaced BUG_ON with WARN_ON in free_netdev and netdev_release

v1 -> v2:
* Relying on driver calling free_netdev rather than calling
  put_device directly in error path

Reported-by: syzbot+ad8ca40ecd77896d51e2@syzkaller.appspotmail.com
Cc: David Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Jouni Hogander <jouni.hogander@unikie.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/dev.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -8705,8 +8705,10 @@ int register_netdevice(struct net_device
 		goto err_uninit;
 
 	ret = netdev_register_kobject(dev);
-	if (ret)
+	if (ret) {
+		dev->reg_state = NETREG_UNREGISTERED;
 		goto err_uninit;
+	}
 	dev->reg_state = NETREG_REGISTERED;
 
 	__netdev_update_features(dev);



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 17/92] net: usb: lan78xx: Add .ndo_features_check
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (15 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 16/92] net-sysfs: Fix reference count leak Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 18/92] Revert "udp: do rmem bulk free even if the rx sk queue is empty" Greg Kroah-Hartman
                   ` (71 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, James Hughes, Eric Dumazet, David S. Miller

From: James Hughes <james.hughes@raspberrypi.org>

[ Upstream commit ce896476c65d72b4b99fa09c2f33436b4198f034 ]

As reported by Eric Dumazet, there are still some outstanding
cases where the driver does not handle TSO correctly when skb's
are over a certain size. Most cases have been fixed, this patch
should ensure that forwarded SKB's that are greater than
MAX_SINGLE_PACKET_SIZE - TX_OVERHEAD are software segmented
and handled correctly.

Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/usb/lan78xx.c |   15 +++++++++++++++
 1 file changed, 15 insertions(+)

--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -31,6 +31,7 @@
 #include <linux/mdio.h>
 #include <linux/phy.h>
 #include <net/ip6_checksum.h>
+#include <net/vxlan.h>
 #include <linux/interrupt.h>
 #include <linux/irqdomain.h>
 #include <linux/irq.h>
@@ -3686,6 +3687,19 @@ static void lan78xx_tx_timeout(struct ne
 	tasklet_schedule(&dev->bh);
 }
 
+static netdev_features_t lan78xx_features_check(struct sk_buff *skb,
+						struct net_device *netdev,
+						netdev_features_t features)
+{
+	if (skb->len + TX_OVERHEAD > MAX_SINGLE_PACKET_SIZE)
+		features &= ~NETIF_F_GSO_MASK;
+
+	features = vlan_features_check(skb, features);
+	features = vxlan_features_check(skb, features);
+
+	return features;
+}
+
 static const struct net_device_ops lan78xx_netdev_ops = {
 	.ndo_open		= lan78xx_open,
 	.ndo_stop		= lan78xx_stop,
@@ -3699,6 +3713,7 @@ static const struct net_device_ops lan78
 	.ndo_set_features	= lan78xx_set_features,
 	.ndo_vlan_rx_add_vid	= lan78xx_vlan_rx_add_vid,
 	.ndo_vlan_rx_kill_vid	= lan78xx_vlan_rx_kill_vid,
+	.ndo_features_check	= lan78xx_features_check,
 };
 
 static void lan78xx_stat_monitor(struct timer_list *t)



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 18/92] Revert "udp: do rmem bulk free even if the rx sk queue is empty"
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (16 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 17/92] net: usb: lan78xx: Add .ndo_features_check Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 19/92] tcp_bbr: improve arithmetic division in bbr_update_bw() Greg Kroah-Hartman
                   ` (70 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Willem de Bruijn, Paolo Abeni,
	Willem de Bruijn, David S. Miller, Eric Dumazet

From: Paolo Abeni <pabeni@redhat.com>

[ Upstream commit d39ca2590d10712f412add7a88e1dd467a7246f4 ]

This reverts commit 0d4a6608f68c7532dcbfec2ea1150c9761767d03.

Willem reported that after commit 0d4a6608f68c ("udp: do rmem bulk
free even if the rx sk queue is empty") the memory allocated by
an almost idle system with many UDP sockets can grow a lot.

For stable kernel keep the solution as simple as possible and revert
the offending commit.

Reported-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Diagnosed-by: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: 0d4a6608f68c ("udp: do rmem bulk free even if the rx sk queue is empty")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/udp.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1305,7 +1305,8 @@ static void udp_rmem_release(struct sock
 	if (likely(partial)) {
 		up->forward_deficit += size;
 		size = up->forward_deficit;
-		if (size < (sk->sk_rcvbuf >> 2))
+		if (size < (sk->sk_rcvbuf >> 2) &&
+		    !skb_queue_empty(&up->reader_queue))
 			return;
 	} else {
 		size += up->forward_deficit;



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 19/92] tcp_bbr: improve arithmetic division in bbr_update_bw()
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (17 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 18/92] Revert "udp: do rmem bulk free even if the rx sk queue is empty" Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 20/92] tcp: do not leave dangling pointers in tp->highest_sack Greg Kroah-Hartman
                   ` (69 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Wen Yang, Eric Dumazet,
	David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI, netdev

From: Wen Yang <wenyang@linux.alibaba.com>

[ Upstream commit 5b2f1f3070b6447b76174ea8bfb7390dc6253ebd ]

do_div() does a 64-by-32 division. Use div64_long() instead of it
if the divisor is long, to avoid truncation to 32-bit.
And as a nice side effect also cleans up the function a bit.

Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/tcp_bbr.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -680,8 +680,7 @@ static void bbr_update_bw(struct sock *s
 	 * bandwidth sample. Delivered is in packets and interval_us in uS and
 	 * ratio will be <<1 for most connections. So delivered is first scaled.
 	 */
-	bw = (u64)rs->delivered * BW_UNIT;
-	do_div(bw, rs->interval_us);
+	bw = div64_long((u64)rs->delivered * BW_UNIT, rs->interval_us);
 
 	/* If this sample is application-limited, it is likely to have a very
 	 * low delivered count that represents application behavior rather than



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 20/92] tcp: do not leave dangling pointers in tp->highest_sack
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (18 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 19/92] tcp_bbr: improve arithmetic division in bbr_update_bw() Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 21/92] tun: add mutex_unlock() call and napi.skb clearing in tun_get_user() Greg Kroah-Hartman
                   ` (68 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Cambda Zhu,
	Yuchung Cheng, Neal Cardwell, David S. Miller

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 2bec445f9bf35e52e395b971df48d3e1e5dc704a ]

Latest commit 853697504de0 ("tcp: Fix highest_sack and highest_sack_seq")
apparently allowed syzbot to trigger various crashes in TCP stack [1]

I believe this commit only made things easier for syzbot to find
its way into triggering use-after-frees. But really the bugs
could lead to bad TCP behavior or even plain crashes even for
non malicious peers.

I have audited all calls to tcp_rtx_queue_unlink() and
tcp_rtx_queue_unlink_and_free() and made sure tp->highest_sack would be updated
if we are removing from rtx queue the skb that tp->highest_sack points to.

These updates were missing in three locations :

1) tcp_clean_rtx_queue() [This one seems quite serious,
                          I have no idea why this was not caught earlier]

2) tcp_rtx_queue_purge() [Probably not a big deal for normal operations]

3) tcp_send_synack()     [Probably not a big deal for normal operations]

[1]
BUG: KASAN: use-after-free in tcp_highest_sack_seq include/net/tcp.h:1864 [inline]
BUG: KASAN: use-after-free in tcp_highest_sack_seq include/net/tcp.h:1856 [inline]
BUG: KASAN: use-after-free in tcp_check_sack_reordering+0x33c/0x3a0 net/ipv4/tcp_input.c:891
Read of size 4 at addr ffff8880a488d068 by task ksoftirqd/1/16

CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.5.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x197/0x210 lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
 __kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506
 kasan_report+0x12/0x20 mm/kasan/common.c:639
 __asan_report_load4_noabort+0x14/0x20 mm/kasan/generic_report.c:134
 tcp_highest_sack_seq include/net/tcp.h:1864 [inline]
 tcp_highest_sack_seq include/net/tcp.h:1856 [inline]
 tcp_check_sack_reordering+0x33c/0x3a0 net/ipv4/tcp_input.c:891
 tcp_try_undo_partial net/ipv4/tcp_input.c:2730 [inline]
 tcp_fastretrans_alert+0xf74/0x23f0 net/ipv4/tcp_input.c:2847
 tcp_ack+0x2577/0x5bf0 net/ipv4/tcp_input.c:3710
 tcp_rcv_established+0x6dd/0x1e90 net/ipv4/tcp_input.c:5706
 tcp_v4_do_rcv+0x619/0x8d0 net/ipv4/tcp_ipv4.c:1619
 tcp_v4_rcv+0x307f/0x3b40 net/ipv4/tcp_ipv4.c:2001
 ip_protocol_deliver_rcu+0x5a/0x880 net/ipv4/ip_input.c:204
 ip_local_deliver_finish+0x23b/0x380 net/ipv4/ip_input.c:231
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ip_local_deliver+0x1e9/0x520 net/ipv4/ip_input.c:252
 dst_input include/net/dst.h:442 [inline]
 ip_rcv_finish+0x1db/0x2f0 net/ipv4/ip_input.c:428
 NF_HOOK include/linux/netfilter.h:307 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ip_rcv+0xe8/0x3f0 net/ipv4/ip_input.c:538
 __netif_receive_skb_one_core+0x113/0x1a0 net/core/dev.c:5148
 __netif_receive_skb+0x2c/0x1d0 net/core/dev.c:5262
 process_backlog+0x206/0x750 net/core/dev.c:6093
 napi_poll net/core/dev.c:6530 [inline]
 net_rx_action+0x508/0x1120 net/core/dev.c:6598
 __do_softirq+0x262/0x98c kernel/softirq.c:292
 run_ksoftirqd kernel/softirq.c:603 [inline]
 run_ksoftirqd+0x8e/0x110 kernel/softirq.c:595
 smpboot_thread_fn+0x6a3/0xa40 kernel/smpboot.c:165
 kthread+0x361/0x430 kernel/kthread.c:255
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

Allocated by task 10091:
 save_stack+0x23/0x90 mm/kasan/common.c:72
 set_track mm/kasan/common.c:80 [inline]
 __kasan_kmalloc mm/kasan/common.c:513 [inline]
 __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:486
 kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:521
 slab_post_alloc_hook mm/slab.h:584 [inline]
 slab_alloc_node mm/slab.c:3263 [inline]
 kmem_cache_alloc_node+0x138/0x740 mm/slab.c:3575
 __alloc_skb+0xd5/0x5e0 net/core/skbuff.c:198
 alloc_skb_fclone include/linux/skbuff.h:1099 [inline]
 sk_stream_alloc_skb net/ipv4/tcp.c:875 [inline]
 sk_stream_alloc_skb+0x113/0xc90 net/ipv4/tcp.c:852
 tcp_sendmsg_locked+0xcf9/0x3470 net/ipv4/tcp.c:1282
 tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1432
 inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xd7/0x130 net/socket.c:672
 __sys_sendto+0x262/0x380 net/socket.c:1998
 __do_sys_sendto net/socket.c:2010 [inline]
 __se_sys_sendto net/socket.c:2006 [inline]
 __x64_sys_sendto+0xe1/0x1a0 net/socket.c:2006
 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 10095:
 save_stack+0x23/0x90 mm/kasan/common.c:72
 set_track mm/kasan/common.c:80 [inline]
 kasan_set_free_info mm/kasan/common.c:335 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/common.c:474
 kasan_slab_free+0xe/0x10 mm/kasan/common.c:483
 __cache_free mm/slab.c:3426 [inline]
 kmem_cache_free+0x86/0x320 mm/slab.c:3694
 kfree_skbmem+0x178/0x1c0 net/core/skbuff.c:645
 __kfree_skb+0x1e/0x30 net/core/skbuff.c:681
 sk_eat_skb include/net/sock.h:2453 [inline]
 tcp_recvmsg+0x1252/0x2930 net/ipv4/tcp.c:2166
 inet_recvmsg+0x136/0x610 net/ipv4/af_inet.c:838
 sock_recvmsg_nosec net/socket.c:886 [inline]
 sock_recvmsg net/socket.c:904 [inline]
 sock_recvmsg+0xce/0x110 net/socket.c:900
 __sys_recvfrom+0x1ff/0x350 net/socket.c:2055
 __do_sys_recvfrom net/socket.c:2073 [inline]
 __se_sys_recvfrom net/socket.c:2069 [inline]
 __x64_sys_recvfrom+0xe1/0x1a0 net/socket.c:2069
 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8880a488d040
 which belongs to the cache skbuff_fclone_cache of size 456
The buggy address is located 40 bytes inside of
 456-byte region [ffff8880a488d040, ffff8880a488d208)
The buggy address belongs to the page:
page:ffffea0002922340 refcount:1 mapcount:0 mapping:ffff88821b057000 index:0x0
raw: 00fffe0000000200 ffffea00022a5788 ffffea0002624a48 ffff88821b057000
raw: 0000000000000000 ffff8880a488d040 0000000100000006 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8880a488cf00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff8880a488cf80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8880a488d000: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
                                                          ^
 ffff8880a488d080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8880a488d100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: 853697504de0 ("tcp: Fix highest_sack and highest_sack_seq")
Fixes: 50895b9de1d3 ("tcp: highest_sack fix")
Fixes: 737ff314563c ("tcp: use sequence distance to detect reordering")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Cambda Zhu <cambda@linux.alibaba.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/tcp.c        |    1 +
 net/ipv4/tcp_input.c  |    1 +
 net/ipv4/tcp_output.c |    1 +
 3 files changed, 3 insertions(+)

--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2507,6 +2507,7 @@ static void tcp_rtx_queue_purge(struct s
 {
 	struct rb_node *p = rb_first(&sk->tcp_rtx_queue);
 
+	tcp_sk(sk)->highest_sack = NULL;
 	while (p) {
 		struct sk_buff *skb = rb_to_skb(p);
 
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3149,6 +3149,7 @@ static int tcp_clean_rtx_queue(struct so
 			tp->retransmit_skb_hint = NULL;
 		if (unlikely(skb == tp->lost_skb_hint))
 			tp->lost_skb_hint = NULL;
+		tcp_highest_sack_replace(sk, skb, next);
 		tcp_rtx_queue_unlink_and_free(skb, sk);
 	}
 
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3165,6 +3165,7 @@ int tcp_send_synack(struct sock *sk)
 			if (!nskb)
 				return -ENOMEM;
 			INIT_LIST_HEAD(&nskb->tcp_tsorted_anchor);
+			tcp_highest_sack_replace(sk, skb, nskb);
 			tcp_rtx_queue_unlink_and_free(skb, sk);
 			__skb_header_release(nskb);
 			tcp_rbtree_insert(&sk->tcp_rtx_queue, nskb);



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 21/92] tun: add mutex_unlock() call and napi.skb clearing in tun_get_user()
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (19 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 20/92] tcp: do not leave dangling pointers in tp->highest_sack Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 22/92] afs: Fix characters allowed into cell names Greg Kroah-Hartman
                   ` (67 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, syzbot, Petar Penkov,
	Willem de Bruijn, David S. Miller

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 1efba987c48629c0c64703bb4ea76ca1a3771d17 ]

If both IFF_NAPI_FRAGS mode and XDP are enabled, and the XDP program
consumes the skb, we need to clear the napi.skb (or risk
a use-after-free) and release the mutex (or risk a deadlock)

WARNING: lock held when returning to user space!
5.5.0-rc6-syzkaller #0 Not tainted
------------------------------------------------
syz-executor.0/455 is leaving the kernel with locks still held!
1 lock held by syz-executor.0/455:
 #0: ffff888098f6e748 (&tfile->napi_mutex){+.+.}, at: tun_get_user+0x1604/0x3fc0 drivers/net/tun.c:1835

Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Petar Penkov <ppenkov@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/tun.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1900,6 +1900,10 @@ drop:
 			if (ret != XDP_PASS) {
 				rcu_read_unlock();
 				local_bh_enable();
+				if (frags) {
+					tfile->napi.skb = NULL;
+					mutex_unlock(&tfile->napi_mutex);
+				}
 				return total_len;
 			}
 		}



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 22/92] afs: Fix characters allowed into cell names
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (20 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 21/92] tun: add mutex_unlock() call and napi.skb clearing in tun_get_user() Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 23/92] hwmon: (adt7475) Make volt2reg return same reg as reg2volt input Greg Kroah-Hartman
                   ` (66 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+b904ba7c947a37b4b291, stable,
	David Howells, Linus Torvalds

From: David Howells <dhowells@redhat.com>

commit a45ea48e2bcd92c1f678b794f488ca0bda9835b8 upstream.

The afs filesystem needs to prohibit certain characters from cell names,
such as '/', as these are used to form filenames in procfs, leading to
the following warning being generated:

	WARNING: CPU: 0 PID: 3489 at fs/proc/generic.c:178

Fix afs_alloc_cell() to disallow nonprintable characters, '/', '@' and
names that begin with a dot.

Remove the check for "@cell" as that is then redundant.

This can be tested by running:

	echo add foo/.bar 1.2.3.4 >/proc/fs/afs/cells

Note that we will also need to deal with:

 - Names ending in ".invalid" shouldn't be passed to the DNS.

 - Names that contain non-valid domainname chars shouldn't be passed to
   the DNS.

 - DNS replies that say "your-dns-needs-immediate-attention.<gTLD>" and
   replies containing A records that say 127.0.53.53 should be
   considered invalid.
   [https://www.icann.org/en/system/files/files/name-collision-mitigation-01aug14-en.pdf]

but these need to be dealt with by the kafs-client DNS program rather
than the kernel.

Reported-by: syzbot+b904ba7c947a37b4b291@syzkaller.appspotmail.com
Cc: stable@kernel.org
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/afs/cell.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/fs/afs/cell.c
+++ b/fs/afs/cell.c
@@ -135,8 +135,17 @@ static struct afs_cell *afs_alloc_cell(s
 		_leave(" = -ENAMETOOLONG");
 		return ERR_PTR(-ENAMETOOLONG);
 	}
-	if (namelen == 5 && memcmp(name, "@cell", 5) == 0)
+
+	/* Prohibit cell names that contain unprintable chars, '/' and '@' or
+	 * that begin with a dot.  This also precludes "@cell".
+	 */
+	if (name[0] == '.')
 		return ERR_PTR(-EINVAL);
+	for (i = 0; i < namelen; i++) {
+		char ch = name[i];
+		if (!isprint(ch) || ch == '/' || ch == '@')
+			return ERR_PTR(-EINVAL);
+	}
 
 	_enter("%*.*s,%s", namelen, namelen, name, vllist);
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 23/92] hwmon: (adt7475) Make volt2reg return same reg as reg2volt input
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (21 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 22/92] afs: Fix characters allowed into cell names Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 24/92] hwmon: (core) Do not use device managed functions for memory allocations Greg Kroah-Hartman
                   ` (65 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Luuk Paulussen, Guenter Roeck

From: Luuk Paulussen <luuk.paulussen@alliedtelesis.co.nz>

commit cf3ca1877574a306c0207cbf7fdf25419d9229df upstream.

reg2volt returns the voltage that matches a given register value.
Converting this back the other way with volt2reg didn't return the same
register value because it used truncation instead of rounding.

This meant that values read from sysfs could not be written back to sysfs
to set back the same register value.

With this change, volt2reg will return the same value for every voltage
previously returned by reg2volt (for the set of possible input values)

Signed-off-by: Luuk Paulussen <luuk.paulussen@alliedtelesis.co.nz>
Link: https://lore.kernel.org/r/20191205231659.1301-1-luuk.paulussen@alliedtelesis.co.nz
cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/hwmon/adt7475.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/hwmon/adt7475.c
+++ b/drivers/hwmon/adt7475.c
@@ -296,9 +296,10 @@ static inline u16 volt2reg(int channel,
 	long reg;
 
 	if (bypass_attn & (1 << channel))
-		reg = (volt * 1024) / 2250;
+		reg = DIV_ROUND_CLOSEST(volt * 1024, 2250);
 	else
-		reg = (volt * r[1] * 1024) / ((r[0] + r[1]) * 2250);
+		reg = DIV_ROUND_CLOSEST(volt * r[1] * 1024,
+					(r[0] + r[1]) * 2250);
 	return clamp_val(reg, 0, 1023) & (0xff << 2);
 }
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 24/92] hwmon: (core) Do not use device managed functions for memory allocations
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (22 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 23/92] hwmon: (adt7475) Make volt2reg return same reg as reg2volt input Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 25/92] PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken Greg Kroah-Hartman
                   ` (64 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Martin K. Petersen, Guenter Roeck

From: Guenter Roeck <linux@roeck-us.net>

commit 3bf8bdcf3bada771eb12b57f2a30caee69e8ab8d upstream.

The hwmon core uses device managed functions, tied to the hwmon parent
device, for various internal memory allocations. This is problematic
since hwmon device lifetime does not necessarily match its parent's
device lifetime. If there is a mismatch, memory leaks will accumulate
until the parent device is released.

Fix the problem by managing all memory allocations internally. The only
exception is memory allocation for thermal device registration, which
can be tied to the hwmon device, along with thermal device registration
itself.

Fixes: d560168b5d0f ("hwmon: (core) New hwmon registration API")
Cc: stable@vger.kernel.org # v4.14.x: 47c332deb8e8: hwmon: Deal with errors from the thermal subsystem
Cc: stable@vger.kernel.org # v4.14.x: 74e3512731bd: hwmon: (core) Fix double-free in __hwmon_device_register()
Cc: stable@vger.kernel.org # v4.9.x: 3a412d5e4a1c: hwmon: (core) Simplify sysfs attribute name allocation
Cc: stable@vger.kernel.org # v4.9.x: 47c332deb8e8: hwmon: Deal with errors from the thermal subsystem
Cc: stable@vger.kernel.org # v4.9.x: 74e3512731bd: hwmon: (core) Fix double-free in __hwmon_device_register()
Cc: stable@vger.kernel.org # v4.9+
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/hwmon/hwmon.c |   68 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 41 insertions(+), 27 deletions(-)

--- a/drivers/hwmon/hwmon.c
+++ b/drivers/hwmon/hwmon.c
@@ -51,6 +51,7 @@ struct hwmon_device_attribute {
 
 #define to_hwmon_attr(d) \
 	container_of(d, struct hwmon_device_attribute, dev_attr)
+#define to_dev_attr(a) container_of(a, struct device_attribute, attr)
 
 /*
  * Thermal zone information
@@ -58,7 +59,7 @@ struct hwmon_device_attribute {
  * also provides the sensor index.
  */
 struct hwmon_thermal_data {
-	struct hwmon_device *hwdev;	/* Reference to hwmon device */
+	struct device *dev;		/* Reference to hwmon device */
 	int index;			/* sensor index */
 };
 
@@ -95,9 +96,27 @@ static const struct attribute_group *hwm
 	NULL
 };
 
+static void hwmon_free_attrs(struct attribute **attrs)
+{
+	int i;
+
+	for (i = 0; attrs[i]; i++) {
+		struct device_attribute *dattr = to_dev_attr(attrs[i]);
+		struct hwmon_device_attribute *hattr = to_hwmon_attr(dattr);
+
+		kfree(hattr);
+	}
+	kfree(attrs);
+}
+
 static void hwmon_dev_release(struct device *dev)
 {
-	kfree(to_hwmon_device(dev));
+	struct hwmon_device *hwdev = to_hwmon_device(dev);
+
+	if (hwdev->group.attrs)
+		hwmon_free_attrs(hwdev->group.attrs);
+	kfree(hwdev->groups);
+	kfree(hwdev);
 }
 
 static struct class hwmon_class = {
@@ -121,11 +140,11 @@ static DEFINE_IDA(hwmon_ida);
 static int hwmon_thermal_get_temp(void *data, int *temp)
 {
 	struct hwmon_thermal_data *tdata = data;
-	struct hwmon_device *hwdev = tdata->hwdev;
+	struct hwmon_device *hwdev = to_hwmon_device(tdata->dev);
 	int ret;
 	long t;
 
-	ret = hwdev->chip->ops->read(&hwdev->dev, hwmon_temp, hwmon_temp_input,
+	ret = hwdev->chip->ops->read(tdata->dev, hwmon_temp, hwmon_temp_input,
 				     tdata->index, &t);
 	if (ret < 0)
 		return ret;
@@ -139,8 +158,7 @@ static const struct thermal_zone_of_devi
 	.get_temp = hwmon_thermal_get_temp,
 };
 
-static int hwmon_thermal_add_sensor(struct device *dev,
-				    struct hwmon_device *hwdev, int index)
+static int hwmon_thermal_add_sensor(struct device *dev, int index)
 {
 	struct hwmon_thermal_data *tdata;
 	struct thermal_zone_device *tzd;
@@ -149,10 +167,10 @@ static int hwmon_thermal_add_sensor(stru
 	if (!tdata)
 		return -ENOMEM;
 
-	tdata->hwdev = hwdev;
+	tdata->dev = dev;
 	tdata->index = index;
 
-	tzd = devm_thermal_zone_of_sensor_register(&hwdev->dev, index, tdata,
+	tzd = devm_thermal_zone_of_sensor_register(dev, index, tdata,
 						   &hwmon_thermal_ops);
 	/*
 	 * If CONFIG_THERMAL_OF is disabled, this returns -ENODEV,
@@ -164,8 +182,7 @@ static int hwmon_thermal_add_sensor(stru
 	return 0;
 }
 #else
-static int hwmon_thermal_add_sensor(struct device *dev,
-				    struct hwmon_device *hwdev, int index)
+static int hwmon_thermal_add_sensor(struct device *dev, int index)
 {
 	return 0;
 }
@@ -242,8 +259,7 @@ static bool is_string_attr(enum hwmon_se
 	       (type == hwmon_fan && attr == hwmon_fan_label);
 }
 
-static struct attribute *hwmon_genattr(struct device *dev,
-				       const void *drvdata,
+static struct attribute *hwmon_genattr(const void *drvdata,
 				       enum hwmon_sensor_types type,
 				       u32 attr,
 				       int index,
@@ -271,7 +287,7 @@ static struct attribute *hwmon_genattr(s
 	if ((mode & S_IWUGO) && !ops->write)
 		return ERR_PTR(-EINVAL);
 
-	hattr = devm_kzalloc(dev, sizeof(*hattr), GFP_KERNEL);
+	hattr = kzalloc(sizeof(*hattr), GFP_KERNEL);
 	if (!hattr)
 		return ERR_PTR(-ENOMEM);
 
@@ -478,8 +494,7 @@ static int hwmon_num_channel_attrs(const
 	return n;
 }
 
-static int hwmon_genattrs(struct device *dev,
-			  const void *drvdata,
+static int hwmon_genattrs(const void *drvdata,
 			  struct attribute **attrs,
 			  const struct hwmon_ops *ops,
 			  const struct hwmon_channel_info *info)
@@ -505,7 +520,7 @@ static int hwmon_genattrs(struct device
 			attr_mask &= ~BIT(attr);
 			if (attr >= template_size)
 				return -EINVAL;
-			a = hwmon_genattr(dev, drvdata, info->type, attr, i,
+			a = hwmon_genattr(drvdata, info->type, attr, i,
 					  templates[attr], ops);
 			if (IS_ERR(a)) {
 				if (PTR_ERR(a) != -ENOENT)
@@ -519,8 +534,7 @@ static int hwmon_genattrs(struct device
 }
 
 static struct attribute **
-__hwmon_create_attrs(struct device *dev, const void *drvdata,
-		     const struct hwmon_chip_info *chip)
+__hwmon_create_attrs(const void *drvdata, const struct hwmon_chip_info *chip)
 {
 	int ret, i, aindex = 0, nattrs = 0;
 	struct attribute **attrs;
@@ -531,15 +545,17 @@ __hwmon_create_attrs(struct device *dev,
 	if (nattrs == 0)
 		return ERR_PTR(-EINVAL);
 
-	attrs = devm_kcalloc(dev, nattrs + 1, sizeof(*attrs), GFP_KERNEL);
+	attrs = kcalloc(nattrs + 1, sizeof(*attrs), GFP_KERNEL);
 	if (!attrs)
 		return ERR_PTR(-ENOMEM);
 
 	for (i = 0; chip->info[i]; i++) {
-		ret = hwmon_genattrs(dev, drvdata, &attrs[aindex], chip->ops,
+		ret = hwmon_genattrs(drvdata, &attrs[aindex], chip->ops,
 				     chip->info[i]);
-		if (ret < 0)
+		if (ret < 0) {
+			hwmon_free_attrs(attrs);
 			return ERR_PTR(ret);
+		}
 		aindex += ret;
 	}
 
@@ -581,14 +597,13 @@ __hwmon_device_register(struct device *d
 			for (i = 0; groups[i]; i++)
 				ngroups++;
 
-		hwdev->groups = devm_kcalloc(dev, ngroups, sizeof(*groups),
-					     GFP_KERNEL);
+		hwdev->groups = kcalloc(ngroups, sizeof(*groups), GFP_KERNEL);
 		if (!hwdev->groups) {
 			err = -ENOMEM;
 			goto free_hwmon;
 		}
 
-		attrs = __hwmon_create_attrs(dev, drvdata, chip);
+		attrs = __hwmon_create_attrs(drvdata, chip);
 		if (IS_ERR(attrs)) {
 			err = PTR_ERR(attrs);
 			goto free_hwmon;
@@ -633,8 +648,7 @@ __hwmon_device_register(struct device *d
 							   hwmon_temp_input, j))
 					continue;
 				if (info[i]->config[j] & HWMON_T_INPUT) {
-					err = hwmon_thermal_add_sensor(dev,
-								hwdev, j);
+					err = hwmon_thermal_add_sensor(hdev, j);
 					if (err) {
 						device_unregister(hdev);
 						goto ida_remove;
@@ -647,7 +661,7 @@ __hwmon_device_register(struct device *d
 	return hdev;
 
 free_hwmon:
-	kfree(hwdev);
+	hwmon_dev_release(hdev);
 ida_remove:
 	ida_simple_remove(&hwmon_ida, id);
 	return ERR_PTR(err);



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 25/92] PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (23 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 24/92] hwmon: (core) Do not use device managed functions for memory allocations Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 26/92] tracing: trigger: Replace unneeded RCU-list traversals Greg Kroah-Hartman
                   ` (63 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Alex Deucher, Bjorn Helgaas

From: Alex Deucher <alexander.deucher@amd.com>

commit 5e89cd303e3a4505752952259b9f1ba036632544 upstream.

To account for parts of the chip that are "harvested" (disabled) due to
silicon flaws, caches on some AMD GPUs must be initialized before ATS is
enabled.

ATS is normally enabled by the IOMMU driver before the GPU driver loads, so
this cache initialization would have to be done in a quirk, but that's too
complex to be practical.

For Navi14 (device ID 0x7340), this initialization is done by the VBIOS,
but apparently some boards went to production with an older VBIOS that
doesn't do it.  Disable ATS for those boards.

Link: https://lore.kernel.org/r/20200114205523.1054271-3-alexander.deucher@amd.com
Bug: https://gitlab.freedesktop.org/drm/amd/issues/1015
See-also: d28ca864c493 ("PCI: Mark AMD Stoney Radeon R7 GPU ATS as broken")
See-also: 9b44b0b09dec ("PCI: Mark AMD Stoney GPU ATS as broken")
[bhelgaas: squash into one patch, simplify slightly, commit log]
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/pci/quirks.c |   19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4891,18 +4891,25 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SE
 
 #ifdef CONFIG_PCI_ATS
 /*
- * Some devices have a broken ATS implementation causing IOMMU stalls.
- * Don't use ATS for those devices.
+ * Some devices require additional driver setup to enable ATS.  Don't use
+ * ATS for those devices as ATS will be enabled before the driver has had a
+ * chance to load and configure the device.
  */
-static void quirk_no_ats(struct pci_dev *pdev)
+static void quirk_amd_harvest_no_ats(struct pci_dev *pdev)
 {
-	pci_info(pdev, "disabling ATS (broken on this device)\n");
+	if (pdev->device == 0x7340 && pdev->revision != 0xc5)
+		return;
+
+	pci_info(pdev, "disabling ATS\n");
 	pdev->ats_cap = 0;
 }
 
 /* AMD Stoney platform GPU */
-DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x98e4, quirk_no_ats);
-DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x6900, quirk_no_ats);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x98e4, quirk_amd_harvest_no_ats);
+/* AMD Iceland dGPU */
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x6900, quirk_amd_harvest_no_ats);
+/* AMD Navi14 dGPU */
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7340, quirk_amd_harvest_no_ats);
 #endif /* CONFIG_PCI_ATS */
 
 /* Freescale PCIe doesn't support MSI in RC mode */



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 26/92] tracing: trigger: Replace unneeded RCU-list traversals
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (24 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 25/92] PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 27/92] Input: keyspan-remote - fix control-message timeouts Greg Kroah-Hartman
                   ` (62 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tom Zanussi, Masami Hiramatsu,
	Steven Rostedt (VMware)

From: Masami Hiramatsu <mhiramat@kernel.org>

commit aeed8aa3874dc15b9d82a6fe796fd7cfbb684448 upstream.

With CONFIG_PROVE_RCU_LIST, I had many suspicious RCU warnings
when I ran ftracetest trigger testcases.

-----
  # dmesg -c > /dev/null
  # ./ftracetest test.d/trigger
  ...
  # dmesg | grep "RCU-list traversed" | cut -f 2 -d ] | cut -f 2 -d " "
  kernel/trace/trace_events_hist.c:6070
  kernel/trace/trace_events_hist.c:1760
  kernel/trace/trace_events_hist.c:5911
  kernel/trace/trace_events_trigger.c:504
  kernel/trace/trace_events_hist.c:1810
  kernel/trace/trace_events_hist.c:3158
  kernel/trace/trace_events_hist.c:3105
  kernel/trace/trace_events_hist.c:5518
  kernel/trace/trace_events_hist.c:5998
  kernel/trace/trace_events_hist.c:6019
  kernel/trace/trace_events_hist.c:6044
  kernel/trace/trace_events_trigger.c:1500
  kernel/trace/trace_events_trigger.c:1540
  kernel/trace/trace_events_trigger.c:539
  kernel/trace/trace_events_trigger.c:584
-----

I investigated those warnings and found that the RCU-list
traversals in event trigger and hist didn't need to use
RCU version because those were called only under event_mutex.

I also checked other RCU-list traversals related to event
trigger list, and found that most of them were called from
event_hist_trigger_func() or hist_unregister_trigger() or
register/unregister functions except for a few cases.

Replace these unneeded RCU-list traversals with normal list
traversal macro and lockdep_assert_held() to check the
event_mutex is held.

Link: http://lkml.kernel.org/r/157680910305.11685.15110237954275915782.stgit@devnote2

Cc: stable@vger.kernel.org
Fixes: 30350d65ac567 ("tracing: Add variable support to hist triggers")
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


---
 kernel/trace/trace_events_hist.c    |   38 ++++++++++++++++++++++++++----------
 kernel/trace/trace_events_trigger.c |   20 ++++++++++++++----
 2 files changed, 43 insertions(+), 15 deletions(-)

--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -1511,11 +1511,13 @@ static struct hist_field *find_var(struc
 	struct event_trigger_data *test;
 	struct hist_field *hist_field;
 
+	lockdep_assert_held(&event_mutex);
+
 	hist_field = find_var_field(hist_data, var_name);
 	if (hist_field)
 		return hist_field;
 
-	list_for_each_entry_rcu(test, &file->triggers, list) {
+	list_for_each_entry(test, &file->triggers, list) {
 		if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) {
 			test_data = test->private_data;
 			hist_field = find_var_field(test_data, var_name);
@@ -1565,7 +1567,9 @@ static struct hist_field *find_file_var(
 	struct event_trigger_data *test;
 	struct hist_field *hist_field;
 
-	list_for_each_entry_rcu(test, &file->triggers, list) {
+	lockdep_assert_held(&event_mutex);
+
+	list_for_each_entry(test, &file->triggers, list) {
 		if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) {
 			test_data = test->private_data;
 			hist_field = find_var_field(test_data, var_name);
@@ -2828,7 +2832,9 @@ static char *find_trigger_filter(struct
 {
 	struct event_trigger_data *test;
 
-	list_for_each_entry_rcu(test, &file->triggers, list) {
+	lockdep_assert_held(&event_mutex);
+
+	list_for_each_entry(test, &file->triggers, list) {
 		if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) {
 			if (test->private_data == hist_data)
 				return test->filter_str;
@@ -2879,9 +2885,11 @@ find_compatible_hist(struct hist_trigger
 	struct event_trigger_data *test;
 	unsigned int n_keys;
 
+	lockdep_assert_held(&event_mutex);
+
 	n_keys = target_hist_data->n_fields - target_hist_data->n_vals;
 
-	list_for_each_entry_rcu(test, &file->triggers, list) {
+	list_for_each_entry(test, &file->triggers, list) {
 		if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) {
 			hist_data = test->private_data;
 
@@ -4905,7 +4913,7 @@ static int hist_show(struct seq_file *m,
 		goto out_unlock;
 	}
 
-	list_for_each_entry_rcu(data, &event_file->triggers, list) {
+	list_for_each_entry(data, &event_file->triggers, list) {
 		if (data->cmd_ops->trigger_type == ETT_EVENT_HIST)
 			hist_trigger_show(m, data, n++);
 	}
@@ -5296,7 +5304,9 @@ static int hist_register_trigger(char *g
 	if (hist_data->attrs->name && !named_data)
 		goto new;
 
-	list_for_each_entry_rcu(test, &file->triggers, list) {
+	lockdep_assert_held(&event_mutex);
+
+	list_for_each_entry(test, &file->triggers, list) {
 		if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) {
 			if (!hist_trigger_match(data, test, named_data, false))
 				continue;
@@ -5380,10 +5390,12 @@ static bool have_hist_trigger_match(stru
 	struct event_trigger_data *test, *named_data = NULL;
 	bool match = false;
 
+	lockdep_assert_held(&event_mutex);
+
 	if (hist_data->attrs->name)
 		named_data = find_named_trigger(hist_data->attrs->name);
 
-	list_for_each_entry_rcu(test, &file->triggers, list) {
+	list_for_each_entry(test, &file->triggers, list) {
 		if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) {
 			if (hist_trigger_match(data, test, named_data, false)) {
 				match = true;
@@ -5401,10 +5413,12 @@ static bool hist_trigger_check_refs(stru
 	struct hist_trigger_data *hist_data = data->private_data;
 	struct event_trigger_data *test, *named_data = NULL;
 
+	lockdep_assert_held(&event_mutex);
+
 	if (hist_data->attrs->name)
 		named_data = find_named_trigger(hist_data->attrs->name);
 
-	list_for_each_entry_rcu(test, &file->triggers, list) {
+	list_for_each_entry(test, &file->triggers, list) {
 		if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) {
 			if (!hist_trigger_match(data, test, named_data, false))
 				continue;
@@ -5426,10 +5440,12 @@ static void hist_unregister_trigger(char
 	struct event_trigger_data *test, *named_data = NULL;
 	bool unregistered = false;
 
+	lockdep_assert_held(&event_mutex);
+
 	if (hist_data->attrs->name)
 		named_data = find_named_trigger(hist_data->attrs->name);
 
-	list_for_each_entry_rcu(test, &file->triggers, list) {
+	list_for_each_entry(test, &file->triggers, list) {
 		if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) {
 			if (!hist_trigger_match(data, test, named_data, false))
 				continue;
@@ -5455,7 +5471,9 @@ static bool hist_file_check_refs(struct
 	struct hist_trigger_data *hist_data;
 	struct event_trigger_data *test;
 
-	list_for_each_entry_rcu(test, &file->triggers, list) {
+	lockdep_assert_held(&event_mutex);
+
+	list_for_each_entry(test, &file->triggers, list) {
 		if (test->cmd_ops->trigger_type == ETT_EVENT_HIST) {
 			hist_data = test->private_data;
 			if (check_var_refs(hist_data))
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -495,7 +495,9 @@ void update_cond_flag(struct trace_event
 	struct event_trigger_data *data;
 	bool set_cond = false;
 
-	list_for_each_entry_rcu(data, &file->triggers, list) {
+	lockdep_assert_held(&event_mutex);
+
+	list_for_each_entry(data, &file->triggers, list) {
 		if (data->filter || event_command_post_trigger(data->cmd_ops) ||
 		    event_command_needs_rec(data->cmd_ops)) {
 			set_cond = true;
@@ -530,7 +532,9 @@ static int register_trigger(char *glob,
 	struct event_trigger_data *test;
 	int ret = 0;
 
-	list_for_each_entry_rcu(test, &file->triggers, list) {
+	lockdep_assert_held(&event_mutex);
+
+	list_for_each_entry(test, &file->triggers, list) {
 		if (test->cmd_ops->trigger_type == data->cmd_ops->trigger_type) {
 			ret = -EEXIST;
 			goto out;
@@ -575,7 +579,9 @@ static void unregister_trigger(char *glo
 	struct event_trigger_data *data;
 	bool unregistered = false;
 
-	list_for_each_entry_rcu(data, &file->triggers, list) {
+	lockdep_assert_held(&event_mutex);
+
+	list_for_each_entry(data, &file->triggers, list) {
 		if (data->cmd_ops->trigger_type == test->cmd_ops->trigger_type) {
 			unregistered = true;
 			list_del_rcu(&data->list);
@@ -1490,7 +1496,9 @@ int event_enable_register_trigger(char *
 	struct event_trigger_data *test;
 	int ret = 0;
 
-	list_for_each_entry_rcu(test, &file->triggers, list) {
+	lockdep_assert_held(&event_mutex);
+
+	list_for_each_entry(test, &file->triggers, list) {
 		test_enable_data = test->private_data;
 		if (test_enable_data &&
 		    (test->cmd_ops->trigger_type ==
@@ -1530,7 +1538,9 @@ void event_enable_unregister_trigger(cha
 	struct event_trigger_data *data;
 	bool unregistered = false;
 
-	list_for_each_entry_rcu(data, &file->triggers, list) {
+	lockdep_assert_held(&event_mutex);
+
+	list_for_each_entry(data, &file->triggers, list) {
 		enable_data = data->private_data;
 		if (enable_data &&
 		    (data->cmd_ops->trigger_type ==



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 27/92] Input: keyspan-remote - fix control-message timeouts
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (25 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 26/92] tracing: trigger: Replace unneeded RCU-list traversals Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 28/92] Revert "Input: synaptics-rmi4 - dont increment rmiaddr for SMBus transfers" Greg Kroah-Hartman
                   ` (61 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Johan Hovold, Dmitry Torokhov

From: Johan Hovold <johan@kernel.org>

commit ba9a103f40fc4a3ec7558ec9b0b97d4f92034249 upstream.

The driver was issuing synchronous uninterruptible control requests
without using a timeout. This could lead to the driver hanging on probe
due to a malfunctioning (or malicious) device until the device is
physically disconnected. While sleeping in probe the driver prevents
other devices connected to the same hub from being added to (or removed
from) the bus.

The USB upper limit of five seconds per request should be more than
enough.

Fixes: 99f83c9c9ac9 ("[PATCH] USB: add driver for Keyspan Digital Remote")
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable <stable@vger.kernel.org>     # 2.6.13
Link: https://lore.kernel.org/r/20200113171715.30621-1-johan@kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/misc/keyspan_remote.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--- a/drivers/input/misc/keyspan_remote.c
+++ b/drivers/input/misc/keyspan_remote.c
@@ -339,7 +339,8 @@ static int keyspan_setup(struct usb_devi
 	int retval = 0;
 
 	retval = usb_control_msg(dev, usb_sndctrlpipe(dev, 0),
-				 0x11, 0x40, 0x5601, 0x0, NULL, 0, 0);
+				 0x11, 0x40, 0x5601, 0x0, NULL, 0,
+				 USB_CTRL_SET_TIMEOUT);
 	if (retval) {
 		dev_dbg(&dev->dev, "%s - failed to set bit rate due to error: %d\n",
 			__func__, retval);
@@ -347,7 +348,8 @@ static int keyspan_setup(struct usb_devi
 	}
 
 	retval = usb_control_msg(dev, usb_sndctrlpipe(dev, 0),
-				 0x44, 0x40, 0x0, 0x0, NULL, 0, 0);
+				 0x44, 0x40, 0x0, 0x0, NULL, 0,
+				 USB_CTRL_SET_TIMEOUT);
 	if (retval) {
 		dev_dbg(&dev->dev, "%s - failed to set resume sensitivity due to error: %d\n",
 			__func__, retval);
@@ -355,7 +357,8 @@ static int keyspan_setup(struct usb_devi
 	}
 
 	retval = usb_control_msg(dev, usb_sndctrlpipe(dev, 0),
-				 0x22, 0x40, 0x0, 0x0, NULL, 0, 0);
+				 0x22, 0x40, 0x0, 0x0, NULL, 0,
+				 USB_CTRL_SET_TIMEOUT);
 	if (retval) {
 		dev_dbg(&dev->dev, "%s - failed to turn receive on due to error: %d\n",
 			__func__, retval);



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 28/92] Revert "Input: synaptics-rmi4 - dont increment rmiaddr for SMBus transfers"
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (26 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 27/92] Input: keyspan-remote - fix control-message timeouts Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 29/92] ARM: 8950/1: ftrace/recordmcount: filter relocation types Greg Kroah-Hartman
                   ` (60 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hans Verkuil, Timo Kaufmann, Dmitry Torokhov

From: Hans Verkuil <hverkuil-cisco@xs4all.nl>

commit 8ff771f8c8d55d95f102cf88a970e541a8bd6bcf upstream.

This reverts commit a284e11c371e446371675668d8c8120a27227339.

This causes problems (drifting cursor) with at least the F11 function that
reads more than 32 bytes.

The real issue is in the F54 driver, and so this should be fixed there, and
not in rmi_smbus.c.

So first revert this bad commit, then fix the real problem in F54 in another
patch.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reported-by: Timo Kaufmann <timokau@zoho.com>
Fixes: a284e11c371e ("Input: synaptics-rmi4 - don't increment rmiaddr for SMBus transfers")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200115124819.3191024-2-hverkuil-cisco@xs4all.nl
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/rmi4/rmi_smbus.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/input/rmi4/rmi_smbus.c
+++ b/drivers/input/rmi4/rmi_smbus.c
@@ -166,6 +166,7 @@ static int rmi_smb_write_block(struct rm
 		/* prepare to write next block of bytes */
 		cur_len -= SMB_MAX_COUNT;
 		databuff += SMB_MAX_COUNT;
+		rmiaddr += SMB_MAX_COUNT;
 	}
 exit:
 	mutex_unlock(&rmi_smb->page_mutex);
@@ -217,6 +218,7 @@ static int rmi_smb_read_block(struct rmi
 		/* prepare to read next block of bytes */
 		cur_len -= SMB_MAX_COUNT;
 		databuff += SMB_MAX_COUNT;
+		rmiaddr += SMB_MAX_COUNT;
 	}
 
 	retval = 0;



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 29/92] ARM: 8950/1: ftrace/recordmcount: filter relocation types
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (27 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 28/92] Revert "Input: synaptics-rmi4 - dont increment rmiaddr for SMBus transfers" Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 30/92] mmc: tegra: fix SDR50 tuning override Greg Kroah-Hartman
                   ` (59 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alexander Sverdlin,
	Steven Rostedt (VMware),
	Russell King

From: Alex Sverdlin <alexander.sverdlin@nokia.com>

commit 927d780ee371d7e121cea4fc7812f6ef2cea461c upstream.

Scenario 1, ARMv7
=================

If code in arch/arm/kernel/ftrace.c would operate on mcount() pointer
the following may be generated:

00000230 <prealloc_fixed_plts>:
 230:   b5f8            push    {r3, r4, r5, r6, r7, lr}
 232:   b500            push    {lr}
 234:   f7ff fffe       bl      0 <__gnu_mcount_nc>
                        234: R_ARM_THM_CALL     __gnu_mcount_nc
 238:   f240 0600       movw    r6, #0
                        238: R_ARM_THM_MOVW_ABS_NC      __gnu_mcount_nc
 23c:   f8d0 1180       ldr.w   r1, [r0, #384]  ; 0x180

FTRACE currently is not able to deal with it:

WARNING: CPU: 0 PID: 0 at .../kernel/trace/ftrace.c:1979 ftrace_bug+0x1ad/0x230()
...
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.116-... #1
...
[<c0314e3d>] (unwind_backtrace) from [<c03115e9>] (show_stack+0x11/0x14)
[<c03115e9>] (show_stack) from [<c051a7f1>] (dump_stack+0x81/0xa8)
[<c051a7f1>] (dump_stack) from [<c0321c5d>] (warn_slowpath_common+0x69/0x90)
[<c0321c5d>] (warn_slowpath_common) from [<c0321cf3>] (warn_slowpath_null+0x17/0x1c)
[<c0321cf3>] (warn_slowpath_null) from [<c038ee9d>] (ftrace_bug+0x1ad/0x230)
[<c038ee9d>] (ftrace_bug) from [<c038f1f9>] (ftrace_process_locs+0x27d/0x444)
[<c038f1f9>] (ftrace_process_locs) from [<c08915bd>] (ftrace_init+0x91/0xe8)
[<c08915bd>] (ftrace_init) from [<c0885a67>] (start_kernel+0x34b/0x358)
[<c0885a67>] (start_kernel) from [<00308095>] (0x308095)
---[ end trace cb88537fdc8fa200 ]---
ftrace failed to modify [<c031266c>] prealloc_fixed_plts+0x8/0x60
 actual: 44:f2:e1:36
ftrace record flags: 0
 (0)   expected tramp: c03143e9

Scenario 2, ARMv4T
==================

ftrace: allocating 14435 entries in 43 pages
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2029 ftrace_bug+0x204/0x310
CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.5 #1
Hardware name: Cirrus Logic EDB9302 Evaluation Board
[<c0010a24>] (unwind_backtrace) from [<c000ecb0>] (show_stack+0x20/0x2c)
[<c000ecb0>] (show_stack) from [<c03c72e8>] (dump_stack+0x20/0x30)
[<c03c72e8>] (dump_stack) from [<c0021c18>] (__warn+0xdc/0x104)
[<c0021c18>] (__warn) from [<c0021d7c>] (warn_slowpath_null+0x4c/0x5c)
[<c0021d7c>] (warn_slowpath_null) from [<c0095360>] (ftrace_bug+0x204/0x310)
[<c0095360>] (ftrace_bug) from [<c04dabac>] (ftrace_init+0x3b4/0x4d4)
[<c04dabac>] (ftrace_init) from [<c04cef4c>] (start_kernel+0x20c/0x410)
[<c04cef4c>] (start_kernel) from [<00000000>] (  (null))
---[ end trace 0506a2f5dae6b341 ]---
ftrace failed to modify
[<c000c350>] perf_trace_sys_exit+0x5c/0xe8
 actual:   1e:ff:2f:e1
Initializing ftrace call sites
ftrace record flags: 0
 (0)
 expected tramp: c000fb24

The analysis for this problem has been already performed previously,
refer to the link below.

Fix the above problems by allowing only selected reloc types in
__mcount_loc. The list itself comes from the legacy recordmcount.pl
script.

Link: https://lore.kernel.org/lkml/56961010.6000806@pengutronix.de/
Cc: stable@vger.kernel.org
Fixes: ed60453fa8f8 ("ARM: 6511/1: ftrace: add ARM support for C version of recordmcount")
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 scripts/recordmcount.c |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

--- a/scripts/recordmcount.c
+++ b/scripts/recordmcount.c
@@ -39,6 +39,10 @@
 #define R_AARCH64_ABS64	257
 #endif
 
+#define R_ARM_PC24		1
+#define R_ARM_THM_CALL		10
+#define R_ARM_CALL		28
+
 static int fd_map;	/* File descriptor for file being modified. */
 static int mmap_failed; /* Boolean flag. */
 static char gpfx;	/* prefix for global symbol name (sometimes '_') */
@@ -414,6 +418,18 @@ is_mcounted_section_name(char const *con
 #define RECORD_MCOUNT_64
 #include "recordmcount.h"
 
+static int arm_is_fake_mcount(Elf32_Rel const *rp)
+{
+	switch (ELF32_R_TYPE(w(rp->r_info))) {
+	case R_ARM_THM_CALL:
+	case R_ARM_CALL:
+	case R_ARM_PC24:
+		return 0;
+	}
+
+	return 1;
+}
+
 /* 64-bit EM_MIPS has weird ELF64_Rela.r_info.
  * http://techpubs.sgi.com/library/manuals/4000/007-4658-001/pdf/007-4658-001.pdf
  * We interpret Table 29 Relocation Operation (Elf64_Rel, Elf64_Rela) [p.40]
@@ -515,6 +531,7 @@ do_file(char const *const fname)
 			 altmcount = "__gnu_mcount_nc";
 			 make_nop = make_nop_arm;
 			 rel_type_nop = R_ARM_NONE;
+			 is_fake_mcount32 = arm_is_fake_mcount;
 			 break;
 	case EM_AARCH64:
 			reltype = R_AARCH64_ABS64;



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 30/92] mmc: tegra: fix SDR50 tuning override
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (28 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 29/92] ARM: 8950/1: ftrace/recordmcount: filter relocation types Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:07 ` [PATCH 4.19 31/92] mmc: sdhci: fix minimum clock rate for v3 controller Greg Kroah-Hartman
                   ` (58 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Adrian Hunter, Thierry Reding,
	Michał Mirosław, Ulf Hansson

From: Michał Mirosław <mirq-linux@rere.qmqm.pl>

commit f571389c0b015e76f91c697c4c1700aba860d34f upstream.

Commit 7ad2ed1dfcbe inadvertently mixed up a quirk flag's name and
broke SDR50 tuning override. Use correct NVQUIRK_ name.

Fixes: 7ad2ed1dfcbe ("mmc: tegra: enable UHS-I modes")
Cc: <stable@vger.kernel.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Link: https://lore.kernel.org/r/9aff1d859935e59edd81e4939e40d6c55e0b55f6.1578390388.git.mirq-linux@rere.qmqm.pl
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/mmc/host/sdhci-tegra.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/mmc/host/sdhci-tegra.c
+++ b/drivers/mmc/host/sdhci-tegra.c
@@ -177,7 +177,7 @@ static void tegra_sdhci_reset(struct sdh
 			misc_ctrl |= SDHCI_MISC_CTRL_ENABLE_DDR50;
 		if (soc_data->nvquirks & NVQUIRK_ENABLE_SDR104)
 			misc_ctrl |= SDHCI_MISC_CTRL_ENABLE_SDR104;
-		if (soc_data->nvquirks & SDHCI_MISC_CTRL_ENABLE_SDR50)
+		if (soc_data->nvquirks & NVQUIRK_ENABLE_SDR50)
 			clk_ctrl |= SDHCI_CLOCK_CTRL_SDR50_TUNING_OVERRIDE;
 	}
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 31/92] mmc: sdhci: fix minimum clock rate for v3 controller
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (29 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 30/92] mmc: tegra: fix SDR50 tuning override Greg Kroah-Hartman
@ 2020-01-28 14:07 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 32/92] Documentation: Document arm64 kpti control Greg Kroah-Hartman
                   ` (57 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Michał Mirosław,
	Adrian Hunter, Ulf Hansson

From: Michał Mirosław <mirq-linux@rere.qmqm.pl>

commit 2a187d03352086e300daa2044051db00044cd171 upstream.

For SDHCIv3+ with programmable clock mode, minimal clock frequency is
still base clock / max(divider). Minimal programmable clock frequency is
always greater than minimal divided clock frequency. Without this patch,
SDHCI uses out-of-spec initial frequency when multiplier is big enough:

mmc1: mmc_rescan_try_freq: trying to init card at 468750 Hz
[for 480 MHz source clock divided by 1024]

The code in sdhci_calc_clk() already chooses a correct SDCLK clock mode.

Fixes: c3ed3877625f ("mmc: sdhci: add support for programmable clock mode")
Cc: <stable@vger.kernel.org> # 4f6aa3264af4: mmc: tegra: Only advertise UHS modes if IO regulator is present
Cc: <stable@vger.kernel.org>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/ffb489519a446caffe7a0a05c4b9372bd52397bb.1579082031.git.mirq-linux@rere.qmqm.pl
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/mmc/host/sdhci.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -3700,11 +3700,13 @@ int sdhci_setup_host(struct sdhci_host *
 	if (host->ops->get_min_clock)
 		mmc->f_min = host->ops->get_min_clock(host);
 	else if (host->version >= SDHCI_SPEC_300) {
-		if (host->clk_mul) {
-			mmc->f_min = (host->max_clk * host->clk_mul) / 1024;
+		if (host->clk_mul)
 			max_clk = host->max_clk * host->clk_mul;
-		} else
-			mmc->f_min = host->max_clk / SDHCI_MAX_DIV_SPEC_300;
+		/*
+		 * Divided Clock Mode minimum clock rate is always less than
+		 * Programmable Clock Mode minimum clock rate.
+		 */
+		mmc->f_min = host->max_clk / SDHCI_MAX_DIV_SPEC_300;
 	} else
 		mmc->f_min = host->max_clk / SDHCI_MAX_DIV_SPEC_200;
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 32/92] Documentation: Document arm64 kpti control
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (30 preceding siblings ...)
  2020-01-28 14:07 ` [PATCH 4.19 31/92] mmc: sdhci: fix minimum clock rate for v3 controller Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 33/92] Input: pm8xxx-vib - fix handling of separate enable register Greg Kroah-Hartman
                   ` (56 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jeremy Linton, Andre Przywara,
	Jonathan Corbet

From: Jeremy Linton <jeremy.linton@arm.com>

commit de19055564c8f8f9d366f8db3395836da0b2176c upstream.

For a while Arm64 has been capable of force enabling
or disabling the kpti mitigations. Lets make sure the
documentation reflects that.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 Documentation/admin-guide/kernel-parameters.txt |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1946,6 +1946,12 @@
 			Built with CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y,
 			the default is off.
 
+	kpti=		[ARM64] Control page table isolation of user
+			and kernel address spaces.
+			Default: enabled on cores which need mitigation.
+			0: force disabled
+			1: force enabled
+
 	kvm.ignore_msrs=[KVM] Ignore guest accesses to unhandled MSRs.
 			Default is 0 (don't ignore, but inject #GP)
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 33/92] Input: pm8xxx-vib - fix handling of separate enable register
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (31 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 32/92] Documentation: Document arm64 kpti control Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 34/92] Input: sur40 - fix interface sanity checks Greg Kroah-Hartman
                   ` (55 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Stephan Gerhold, Dmitry Torokhov

From: Stephan Gerhold <stephan@gerhold.net>

commit 996d5d5f89a558a3608a46e73ccd1b99f1b1d058 upstream.

Setting the vibrator enable_mask is not implemented correctly:

For regmap_update_bits(map, reg, mask, val) we give in either
regs->enable_mask or 0 (= no-op) as mask and "val" as value.
But "val" actually refers to the vibrator voltage control register,
which has nothing to do with the enable_mask.

So we usually end up doing nothing when we really wanted
to enable the vibrator.

We want to set or clear the enable_mask (to enable/disable the vibrator).
Therefore, change the call to always modify the enable_mask
and set the bits only if we want to enable the vibrator.

Fixes: d4c7c5c96c92 ("Input: pm8xxx-vib - handle separate enable register")
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Link: https://lore.kernel.org/r/20200114183442.45720-1-stephan@gerhold.net
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/misc/pm8xxx-vibrator.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/input/misc/pm8xxx-vibrator.c
+++ b/drivers/input/misc/pm8xxx-vibrator.c
@@ -98,7 +98,7 @@ static int pm8xxx_vib_set(struct pm8xxx_
 
 	if (regs->enable_mask)
 		rc = regmap_update_bits(vib->regmap, regs->enable_addr,
-					on ? regs->enable_mask : 0, val);
+					regs->enable_mask, on ? ~0 : 0);
 
 	return rc;
 }



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 34/92] Input: sur40 - fix interface sanity checks
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (32 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 33/92] Input: pm8xxx-vib - fix handling of separate enable register Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 35/92] Input: gtco - fix endpoint sanity check Greg Kroah-Hartman
                   ` (54 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Johan Hovold, Vladis Dronov, Dmitry Torokhov

From: Johan Hovold <johan@kernel.org>

commit 6b32391ed675827f8425a414abbc6fbd54ea54fe upstream.

Make sure to use the current alternate setting when verifying the
interface descriptors to avoid binding to an invalid interface.

This in turn could cause the driver to misbehave or trigger a WARN() in
usb_submit_urb() that kernels with panic_on_warn set would choke on.

Fixes: bdb5c57f209c ("Input: add sur40 driver for Samsung SUR40 (aka MS Surface 2.0/Pixelsense)")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Vladis Dronov <vdronov@redhat.com>
Link: https://lore.kernel.org/r/20191210113737.4016-8-johan@kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/touchscreen/sur40.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/input/touchscreen/sur40.c
+++ b/drivers/input/touchscreen/sur40.c
@@ -657,7 +657,7 @@ static int sur40_probe(struct usb_interf
 	int error;
 
 	/* Check if we really have the right interface. */
-	iface_desc = &interface->altsetting[0];
+	iface_desc = interface->cur_altsetting;
 	if (iface_desc->desc.bInterfaceClass != 0xFF)
 		return -ENODEV;
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 35/92] Input: gtco - fix endpoint sanity check
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (33 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 34/92] Input: sur40 - fix interface sanity checks Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 36/92] Input: aiptek " Greg Kroah-Hartman
                   ` (53 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Johan Hovold, Vladis Dronov, Dmitry Torokhov

From: Johan Hovold <johan@kernel.org>

commit a8eeb74df5a6bdb214b2b581b14782c5f5a0cf83 upstream.

The driver was checking the number of endpoints of the first alternate
setting instead of the current one, something which could lead to the
driver binding to an invalid interface.

This in turn could cause the driver to misbehave or trigger a WARN() in
usb_submit_urb() that kernels with panic_on_warn set would choke on.

Fixes: 162f98dea487 ("Input: gtco - fix crash on detecting device without endpoints")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Vladis Dronov <vdronov@redhat.com>
Link: https://lore.kernel.org/r/20191210113737.4016-5-johan@kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/tablet/gtco.c |   10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

--- a/drivers/input/tablet/gtco.c
+++ b/drivers/input/tablet/gtco.c
@@ -875,18 +875,14 @@ static int gtco_probe(struct usb_interfa
 	}
 
 	/* Sanity check that a device has an endpoint */
-	if (usbinterface->altsetting[0].desc.bNumEndpoints < 1) {
+	if (usbinterface->cur_altsetting->desc.bNumEndpoints < 1) {
 		dev_err(&usbinterface->dev,
 			"Invalid number of endpoints\n");
 		error = -EINVAL;
 		goto err_free_urb;
 	}
 
-	/*
-	 * The endpoint is always altsetting 0, we know this since we know
-	 * this device only has one interrupt endpoint
-	 */
-	endpoint = &usbinterface->altsetting[0].endpoint[0].desc;
+	endpoint = &usbinterface->cur_altsetting->endpoint[0].desc;
 
 	/* Some debug */
 	dev_dbg(&usbinterface->dev, "gtco # interfaces: %d\n", usbinterface->num_altsetting);
@@ -973,7 +969,7 @@ static int gtco_probe(struct usb_interfa
 	input_dev->dev.parent = &usbinterface->dev;
 
 	/* Setup the URB, it will be posted later on open of input device */
-	endpoint = &usbinterface->altsetting[0].endpoint[0].desc;
+	endpoint = &usbinterface->cur_altsetting->endpoint[0].desc;
 
 	usb_fill_int_urb(gtco->urbinfo,
 			 udev,



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 36/92] Input: aiptek - fix endpoint sanity check
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (34 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 35/92] Input: gtco - fix endpoint sanity check Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 37/92] Input: pegasus_notetaker " Greg Kroah-Hartman
                   ` (52 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Johan Hovold, Vladis Dronov, Dmitry Torokhov

From: Johan Hovold <johan@kernel.org>

commit 3111491fca4f01764e0c158c5e0f7ced808eef51 upstream.

The driver was checking the number of endpoints of the first alternate
setting instead of the current one, something which could lead to the
driver binding to an invalid interface.

This in turn could cause the driver to misbehave or trigger a WARN() in
usb_submit_urb() that kernels with panic_on_warn set would choke on.

Fixes: 8e20cf2bce12 ("Input: aiptek - fix crash on detecting device without endpoints")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Vladis Dronov <vdronov@redhat.com>
Link: https://lore.kernel.org/r/20191210113737.4016-3-johan@kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/tablet/aiptek.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/input/tablet/aiptek.c
+++ b/drivers/input/tablet/aiptek.c
@@ -1815,14 +1815,14 @@ aiptek_probe(struct usb_interface *intf,
 	input_set_abs_params(inputdev, ABS_WHEEL, AIPTEK_WHEEL_MIN, AIPTEK_WHEEL_MAX - 1, 0, 0);
 
 	/* Verify that a device really has an endpoint */
-	if (intf->altsetting[0].desc.bNumEndpoints < 1) {
+	if (intf->cur_altsetting->desc.bNumEndpoints < 1) {
 		dev_err(&intf->dev,
 			"interface has %d endpoints, but must have minimum 1\n",
-			intf->altsetting[0].desc.bNumEndpoints);
+			intf->cur_altsetting->desc.bNumEndpoints);
 		err = -EINVAL;
 		goto fail3;
 	}
-	endpoint = &intf->altsetting[0].endpoint[0].desc;
+	endpoint = &intf->cur_altsetting->endpoint[0].desc;
 
 	/* Go set up our URB, which is called when the tablet receives
 	 * input.



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 37/92] Input: pegasus_notetaker - fix endpoint sanity check
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (35 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 36/92] Input: aiptek " Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 38/92] Input: sun4i-ts - add a check for devm_thermal_zone_of_sensor_register Greg Kroah-Hartman
                   ` (51 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Johan Hovold, Martin Kepplinger,
	Vladis Dronov, Dmitry Torokhov

From: Johan Hovold <johan@kernel.org>

commit bcfcb7f9b480dd0be8f0df2df17340ca92a03b98 upstream.

The driver was checking the number of endpoints of the first alternate
setting instead of the current one, something which could be used by a
malicious device (or USB descriptor fuzzer) to trigger a NULL-pointer
dereference.

Fixes: 1afca2b66aac ("Input: add Pegasus Notetaker tablet driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Martin Kepplinger <martink@posteo.de>
Acked-by: Vladis Dronov <vdronov@redhat.com>
Link: https://lore.kernel.org/r/20191210113737.4016-2-johan@kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/tablet/pegasus_notetaker.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/input/tablet/pegasus_notetaker.c
+++ b/drivers/input/tablet/pegasus_notetaker.c
@@ -274,7 +274,7 @@ static int pegasus_probe(struct usb_inte
 		return -ENODEV;
 
 	/* Sanity check that the device has an endpoint */
-	if (intf->altsetting[0].desc.bNumEndpoints < 1) {
+	if (intf->cur_altsetting->desc.bNumEndpoints < 1) {
 		dev_err(&intf->dev, "Invalid number of endpoints\n");
 		return -EINVAL;
 	}



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 38/92] Input: sun4i-ts - add a check for devm_thermal_zone_of_sensor_register
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (36 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 37/92] Input: pegasus_notetaker " Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 39/92] netfilter: nft_osf: add missing check for DREG attribute Greg Kroah-Hartman
                   ` (50 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Chuhong Yuan, Dmitry Torokhov

From: Chuhong Yuan <hslester96@gmail.com>

commit 97e24b095348a15ec08c476423c3b3b939186ad7 upstream.

The driver misses a check for devm_thermal_zone_of_sensor_register().
Add a check to fix it.

Fixes: e28d0c9cd381 ("input: convert sun4i-ts to use devm_thermal_zone_of_sensor_register")
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/input/touchscreen/sun4i-ts.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/drivers/input/touchscreen/sun4i-ts.c
+++ b/drivers/input/touchscreen/sun4i-ts.c
@@ -246,6 +246,7 @@ static int sun4i_ts_probe(struct platfor
 	struct device *dev = &pdev->dev;
 	struct device_node *np = dev->of_node;
 	struct device *hwmon;
+	struct thermal_zone_device *thermal;
 	int error;
 	u32 reg;
 	bool ts_attached;
@@ -365,7 +366,10 @@ static int sun4i_ts_probe(struct platfor
 	if (IS_ERR(hwmon))
 		return PTR_ERR(hwmon);
 
-	devm_thermal_zone_of_sensor_register(ts->dev, 0, ts, &sun4i_ts_tz_ops);
+	thermal = devm_thermal_zone_of_sensor_register(ts->dev, 0, ts,
+						       &sun4i_ts_tz_ops);
+	if (IS_ERR(thermal))
+		return PTR_ERR(thermal);
 
 	writel(TEMP_IRQ_EN(1), ts->base + TP_INT_FIFOC);
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 39/92] netfilter: nft_osf: add missing check for DREG attribute
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (37 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 38/92] Input: sun4i-ts - add a check for devm_thermal_zone_of_sensor_register Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 40/92] hwmon: (nct7802) Fix voltage limits to wrong registers Greg Kroah-Hartman
                   ` (49 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+cf23983d697c26c34f60,
	Florian Westphal, Pablo Neira Ayuso

From: Florian Westphal <fw@strlen.de>

commit 7eaecf7963c1c8f62d62c6a8e7c439b0e7f2d365 upstream.

syzbot reports just another NULL deref crash because of missing test
for presence of the attribute.

Reported-by: syzbot+cf23983d697c26c34f60@syzkaller.appspotmail.com
Fixes:  b96af92d6eaf9fadd ("netfilter: nf_tables: implement Passive OS fingerprint module in nft_osf")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/netfilter/nft_osf.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/net/netfilter/nft_osf.c
+++ b/net/netfilter/nft_osf.c
@@ -47,6 +47,9 @@ static int nft_osf_init(const struct nft
 	struct nft_osf *priv = nft_expr_priv(expr);
 	int err;
 
+	if (!tb[NFTA_OSF_DREG])
+		return -EINVAL;
+
 	priv->dreg = nft_parse_register(tb[NFTA_OSF_DREG]);
 	err = nft_validate_register_store(ctx, priv->dreg, NULL,
 					  NFT_DATA_VALUE, NFT_OSF_MAXGENRELEN);



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 40/92] hwmon: (nct7802) Fix voltage limits to wrong registers
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (38 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 39/92] netfilter: nft_osf: add missing check for DREG attribute Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 41/92] scsi: RDMA/isert: Fix a recently introduced regression related to logout Greg Kroah-Hartman
                   ` (48 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Gilles Buloz, Guenter Roeck

From: Gilles Buloz <gilles.buloz@kontron.com>

commit 7713e62c8623c54dac88d1fa724aa487a38c3efb upstream.

in0 thresholds are written to the in2 thresholds registers
in2 thresholds to in3 thresholds
in3 thresholds to in4 thresholds
in4 thresholds to in0 thresholds

Signed-off-by: Gilles Buloz <gilles.buloz@kontron.com>
Link: https://lore.kernel.org/r/5de0f509.rc0oEvPOMjbfPW1w%gilles.buloz@kontron.com
Fixes: 3434f3783580 ("hwmon: Driver for Nuvoton NCT7802Y")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/hwmon/nct7802.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/hwmon/nct7802.c
+++ b/drivers/hwmon/nct7802.c
@@ -32,8 +32,8 @@
 static const u8 REG_VOLTAGE[5] = { 0x09, 0x0a, 0x0c, 0x0d, 0x0e };
 
 static const u8 REG_VOLTAGE_LIMIT_LSB[2][5] = {
-	{ 0x40, 0x00, 0x42, 0x44, 0x46 },
-	{ 0x3f, 0x00, 0x41, 0x43, 0x45 },
+	{ 0x46, 0x00, 0x40, 0x42, 0x44 },
+	{ 0x45, 0x00, 0x3f, 0x41, 0x43 },
 };
 
 static const u8 REG_VOLTAGE_LIMIT_MSB[5] = { 0x48, 0x00, 0x47, 0x47, 0x48 };



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 41/92] scsi: RDMA/isert: Fix a recently introduced regression related to logout
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (39 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 40/92] hwmon: (nct7802) Fix voltage limits to wrong registers Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 42/92] tracing: xen: Ordered comparison of function pointers Greg Kroah-Hartman
                   ` (47 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Rahul Kundu, Bart Van Assche,
	Mike Marciniszyn, Sagi Grimberg, Martin K. Petersen

From: Bart Van Assche <bvanassche@acm.org>

commit 04060db41178c7c244f2c7dcd913e7fd331de915 upstream.

iscsit_close_connection() calls isert_wait_conn(). Due to commit
e9d3009cb936 both functions call target_wait_for_sess_cmds() although that
last function should be called only once. Fix this by removing the
target_wait_for_sess_cmds() call from isert_wait_conn() and by only calling
isert_wait_conn() after target_wait_for_sess_cmds().

Fixes: e9d3009cb936 ("scsi: target: iscsi: Wait for all commands to finish before freeing a session").
Link: https://lore.kernel.org/r/20200116044737.19507-1-bvanassche@acm.org
Reported-by: Rahul Kundu <rahul.kundu@chelsio.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/infiniband/ulp/isert/ib_isert.c |   12 ------------
 drivers/target/iscsi/iscsi_target.c     |    6 +++---
 2 files changed, 3 insertions(+), 15 deletions(-)

--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -2584,17 +2584,6 @@ isert_wait4logout(struct isert_conn *ise
 	}
 }
 
-static void
-isert_wait4cmds(struct iscsi_conn *conn)
-{
-	isert_info("iscsi_conn %p\n", conn);
-
-	if (conn->sess) {
-		target_sess_cmd_list_set_waiting(conn->sess->se_sess);
-		target_wait_for_sess_cmds(conn->sess->se_sess);
-	}
-}
-
 /**
  * isert_put_unsol_pending_cmds() - Drop commands waiting for
  *     unsolicitate dataout
@@ -2642,7 +2631,6 @@ static void isert_wait_conn(struct iscsi
 
 	ib_drain_qp(isert_conn->qp);
 	isert_put_unsol_pending_cmds(conn);
-	isert_wait4cmds(conn);
 	isert_wait4logout(isert_conn);
 
 	queue_work(isert_release_wq, &isert_conn->release_work);
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -4123,9 +4123,6 @@ int iscsit_close_connection(
 	iscsit_stop_nopin_response_timer(conn);
 	iscsit_stop_nopin_timer(conn);
 
-	if (conn->conn_transport->iscsit_wait_conn)
-		conn->conn_transport->iscsit_wait_conn(conn);
-
 	/*
 	 * During Connection recovery drop unacknowledged out of order
 	 * commands for this connection, and prepare the other commands
@@ -4211,6 +4208,9 @@ int iscsit_close_connection(
 	target_sess_cmd_list_set_waiting(sess->se_sess);
 	target_wait_for_sess_cmds(sess->se_sess);
 
+	if (conn->conn_transport->iscsit_wait_conn)
+		conn->conn_transport->iscsit_wait_conn(conn);
+
 	ahash_request_free(conn->conn_tx_hash);
 	if (conn->conn_rx_hash) {
 		struct crypto_ahash *tfm;



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 42/92] tracing: xen: Ordered comparison of function pointers
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (40 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 41/92] scsi: RDMA/isert: Fix a recently introduced regression related to logout Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 43/92] do_last(): fetch directory ->i_mode and ->i_uid before its too late Greg Kroah-Hartman
                   ` (46 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Changbin Du, Steven Rostedt (VMware)

From: Changbin Du <changbin.du@gmail.com>

commit d0695e2351102affd8efae83989056bc4b275917 upstream.

Just as commit 0566e40ce7 ("tracing: initcall: Ordered comparison of
function pointers"), this patch fixes another remaining one in xen.h
found by clang-9.

In file included from arch/x86/xen/trace.c:21:
In file included from ./include/trace/events/xen.h:475:
In file included from ./include/trace/define_trace.h:102:
In file included from ./include/trace/trace_events.h:473:
./include/trace/events/xen.h:69:7: warning: ordered comparison of function \
pointers ('xen_mc_callback_fn_t' (aka 'void (*)(void *)') and 'xen_mc_callback_fn_t') [-Wordered-compare-function-pointers]
                    __field(xen_mc_callback_fn_t, fn)
                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/trace/trace_events.h:421:29: note: expanded from macro '__field'
                                ^
./include/trace/trace_events.h:407:6: note: expanded from macro '__field_ext'
                                 is_signed_type(type), filter_type);    \
                                 ^
./include/linux/trace_events.h:554:44: note: expanded from macro 'is_signed_type'
                                              ^

Fixes: c796f213a6934 ("xen/trace: add multicall tracing")
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/trace/events/xen.h |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/include/trace/events/xen.h
+++ b/include/trace/events/xen.h
@@ -66,7 +66,11 @@ TRACE_EVENT(xen_mc_callback,
 	    TP_PROTO(xen_mc_callback_fn_t fn, void *data),
 	    TP_ARGS(fn, data),
 	    TP_STRUCT__entry(
-		    __field(xen_mc_callback_fn_t, fn)
+		    /*
+		     * Use field_struct to avoid is_signed_type()
+		     * comparison of a function pointer.
+		     */
+		    __field_struct(xen_mc_callback_fn_t, fn)
 		    __field(void *, data)
 		    ),
 	    TP_fast_assign(



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 43/92] do_last(): fetch directory ->i_mode and ->i_uid before its too late
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (41 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 42/92] tracing: xen: Ordered comparison of function pointers Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-31 10:08   ` Rantala, Tommi T. (Nokia - FI/Espoo)
  2020-01-28 14:08 ` [PATCH 4.19 44/92] net/sonic: Add mutual exclusion for accessing shared state Greg Kroah-Hartman
                   ` (45 subsequent siblings)
  88 siblings, 1 reply; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Al Viro

From: Al Viro <viro@zeniv.linux.org.uk>

commit d0cb50185ae942b03c4327be322055d622dc79f6 upstream.

may_create_in_sticky() call is done when we already have dropped the
reference to dir.

Fixes: 30aba6656f61e (namei: allow restricted O_CREAT of FIFOs and regular files)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/namei.c |   17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1009,7 +1009,8 @@ static int may_linkat(struct path *link)
  * may_create_in_sticky - Check whether an O_CREAT open in a sticky directory
  *			  should be allowed, or not, on files that already
  *			  exist.
- * @dir: the sticky parent directory
+ * @dir_mode: mode bits of directory
+ * @dir_uid: owner of directory
  * @inode: the inode of the file to open
  *
  * Block an O_CREAT open of a FIFO (or a regular file) when:
@@ -1025,18 +1026,18 @@ static int may_linkat(struct path *link)
  *
  * Returns 0 if the open is allowed, -ve on error.
  */
-static int may_create_in_sticky(struct dentry * const dir,
+static int may_create_in_sticky(umode_t dir_mode, kuid_t dir_uid,
 				struct inode * const inode)
 {
 	if ((!sysctl_protected_fifos && S_ISFIFO(inode->i_mode)) ||
 	    (!sysctl_protected_regular && S_ISREG(inode->i_mode)) ||
-	    likely(!(dir->d_inode->i_mode & S_ISVTX)) ||
-	    uid_eq(inode->i_uid, dir->d_inode->i_uid) ||
+	    likely(!(dir_mode & S_ISVTX)) ||
+	    uid_eq(inode->i_uid, dir_uid) ||
 	    uid_eq(current_fsuid(), inode->i_uid))
 		return 0;
 
-	if (likely(dir->d_inode->i_mode & 0002) ||
-	    (dir->d_inode->i_mode & 0020 &&
+	if (likely(dir_mode & 0002) ||
+	    (dir_mode & 0020 &&
 	     ((sysctl_protected_fifos >= 2 && S_ISFIFO(inode->i_mode)) ||
 	      (sysctl_protected_regular >= 2 && S_ISREG(inode->i_mode))))) {
 		return -EACCES;
@@ -3258,6 +3259,8 @@ static int do_last(struct nameidata *nd,
 		   struct file *file, const struct open_flags *op)
 {
 	struct dentry *dir = nd->path.dentry;
+	kuid_t dir_uid = dir->d_inode->i_uid;
+	umode_t dir_mode = dir->d_inode->i_mode;
 	int open_flag = op->open_flag;
 	bool will_truncate = (open_flag & O_TRUNC) != 0;
 	bool got_write = false;
@@ -3393,7 +3396,7 @@ finish_open:
 		error = -EISDIR;
 		if (d_is_dir(nd->path.dentry))
 			goto out;
-		error = may_create_in_sticky(dir,
+		error = may_create_in_sticky(dir_mode, dir_uid,
 					     d_backing_inode(nd->path.dentry));
 		if (unlikely(error))
 			goto out;



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 44/92] net/sonic: Add mutual exclusion for accessing shared state
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (42 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 43/92] do_last(): fetch directory ->i_mode and ->i_uid before its too late Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 45/92] net/sonic: Clear interrupt flags immediately Greg Kroah-Hartman
                   ` (44 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stan Johnson, Finn Thain, David S. Miller

From: Finn Thain <fthain@telegraphics.com.au>

commit 865ad2f2201dc18685ba2686f13217f8b3a9c52c upstream.

The netif_stop_queue() call in sonic_send_packet() races with the
netif_wake_queue() call in sonic_interrupt(). This causes issues
like "NETDEV WATCHDOG: eth0 (macsonic): transmit queue 0 timed out".
Fix this by disabling interrupts when accessing tx_skb[] and next_tx.
Update a comment to clarify the synchronization properties.

Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/ethernet/natsemi/sonic.c |   49 +++++++++++++++++++++++++----------
 drivers/net/ethernet/natsemi/sonic.h |    1 
 2 files changed, 36 insertions(+), 14 deletions(-)

--- a/drivers/net/ethernet/natsemi/sonic.c
+++ b/drivers/net/ethernet/natsemi/sonic.c
@@ -63,6 +63,8 @@ static int sonic_open(struct net_device
 
 	netif_dbg(lp, ifup, dev, "%s: initializing sonic driver\n", __func__);
 
+	spin_lock_init(&lp->lock);
+
 	for (i = 0; i < SONIC_NUM_RRS; i++) {
 		struct sk_buff *skb = netdev_alloc_skb(dev, SONIC_RBSIZE + 2);
 		if (skb == NULL) {
@@ -205,8 +207,6 @@ static void sonic_tx_timeout(struct net_
  *   wake the tx queue
  * Concurrently with all of this, the SONIC is potentially writing to
  * the status flags of the TDs.
- * Until some mutual exclusion is added, this code will not work with SMP. However,
- * MIPS Jazz machines and m68k Macs were all uni-processor machines.
  */
 
 static int sonic_send_packet(struct sk_buff *skb, struct net_device *dev)
@@ -214,7 +214,8 @@ static int sonic_send_packet(struct sk_b
 	struct sonic_local *lp = netdev_priv(dev);
 	dma_addr_t laddr;
 	int length;
-	int entry = lp->next_tx;
+	int entry;
+	unsigned long flags;
 
 	netif_dbg(lp, tx_queued, dev, "%s: skb=%p\n", __func__, skb);
 
@@ -236,6 +237,10 @@ static int sonic_send_packet(struct sk_b
 		return NETDEV_TX_OK;
 	}
 
+	spin_lock_irqsave(&lp->lock, flags);
+
+	entry = lp->next_tx;
+
 	sonic_tda_put(dev, entry, SONIC_TD_STATUS, 0);       /* clear status */
 	sonic_tda_put(dev, entry, SONIC_TD_FRAG_COUNT, 1);   /* single fragment */
 	sonic_tda_put(dev, entry, SONIC_TD_PKTSIZE, length); /* length of packet */
@@ -245,10 +250,6 @@ static int sonic_send_packet(struct sk_b
 	sonic_tda_put(dev, entry, SONIC_TD_LINK,
 		sonic_tda_get(dev, entry, SONIC_TD_LINK) | SONIC_EOL);
 
-	/*
-	 * Must set tx_skb[entry] only after clearing status, and
-	 * before clearing EOL and before stopping queue
-	 */
 	wmb();
 	lp->tx_len[entry] = length;
 	lp->tx_laddr[entry] = laddr;
@@ -271,6 +272,8 @@ static int sonic_send_packet(struct sk_b
 
 	SONIC_WRITE(SONIC_CMD, SONIC_CR_TXP);
 
+	spin_unlock_irqrestore(&lp->lock, flags);
+
 	return NETDEV_TX_OK;
 }
 
@@ -283,9 +286,21 @@ static irqreturn_t sonic_interrupt(int i
 	struct net_device *dev = dev_id;
 	struct sonic_local *lp = netdev_priv(dev);
 	int status;
+	unsigned long flags;
+
+	/* The lock has two purposes. Firstly, it synchronizes sonic_interrupt()
+	 * with sonic_send_packet() so that the two functions can share state.
+	 * Secondly, it makes sonic_interrupt() re-entrant, as that is required
+	 * by macsonic which must use two IRQs with different priority levels.
+	 */
+	spin_lock_irqsave(&lp->lock, flags);
+
+	status = SONIC_READ(SONIC_ISR) & SONIC_IMR_DEFAULT;
+	if (!status) {
+		spin_unlock_irqrestore(&lp->lock, flags);
 
-	if (!(status = SONIC_READ(SONIC_ISR) & SONIC_IMR_DEFAULT))
 		return IRQ_NONE;
+	}
 
 	do {
 		if (status & SONIC_INT_PKTRX) {
@@ -299,11 +314,12 @@ static irqreturn_t sonic_interrupt(int i
 			int td_status;
 			int freed_some = 0;
 
-			/* At this point, cur_tx is the index of a TD that is one of:
-			 *   unallocated/freed                          (status set   & tx_skb[entry] clear)
-			 *   allocated and sent                         (status set   & tx_skb[entry] set  )
-			 *   allocated and not yet sent                 (status clear & tx_skb[entry] set  )
-			 *   still being allocated by sonic_send_packet (status clear & tx_skb[entry] clear)
+			/* The state of a Transmit Descriptor may be inferred
+			 * from { tx_skb[entry], td_status } as follows.
+			 * { clear, clear } => the TD has never been used
+			 * { set,   clear } => the TD was handed to SONIC
+			 * { set,   set   } => the TD was handed back
+			 * { clear, set   } => the TD is available for re-use
 			 */
 
 			netif_dbg(lp, intr, dev, "%s: tx done\n", __func__);
@@ -405,7 +421,12 @@ static irqreturn_t sonic_interrupt(int i
 		/* load CAM done */
 		if (status & SONIC_INT_LCD)
 			SONIC_WRITE(SONIC_ISR, SONIC_INT_LCD); /* clear the interrupt */
-	} while((status = SONIC_READ(SONIC_ISR) & SONIC_IMR_DEFAULT));
+
+		status = SONIC_READ(SONIC_ISR) & SONIC_IMR_DEFAULT;
+	} while (status);
+
+	spin_unlock_irqrestore(&lp->lock, flags);
+
 	return IRQ_HANDLED;
 }
 
--- a/drivers/net/ethernet/natsemi/sonic.h
+++ b/drivers/net/ethernet/natsemi/sonic.h
@@ -322,6 +322,7 @@ struct sonic_local {
 	int msg_enable;
 	struct device *device;         /* generic device */
 	struct net_device_stats stats;
+	spinlock_t lock;
 };
 
 #define TX_TIMEOUT (3 * HZ)



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 45/92] net/sonic: Clear interrupt flags immediately
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (43 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 44/92] net/sonic: Add mutual exclusion for accessing shared state Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 46/92] net/sonic: Use MMIO accessors Greg Kroah-Hartman
                   ` (43 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stan Johnson, Finn Thain, David S. Miller

From: Finn Thain <fthain@telegraphics.com.au>

commit 5fedabf5a70be26b19d7520f09f12a62274317c6 upstream.

The chip can change a packet's descriptor status flags at any time.
However, an active interrupt flag gets cleared rather late. This
allows a race condition that could theoretically lose an interrupt.
Fix this by clearing asserted interrupt flags immediately.

Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/ethernet/natsemi/sonic.c |   28 ++++++----------------------
 1 file changed, 6 insertions(+), 22 deletions(-)

--- a/drivers/net/ethernet/natsemi/sonic.c
+++ b/drivers/net/ethernet/natsemi/sonic.c
@@ -303,10 +303,11 @@ static irqreturn_t sonic_interrupt(int i
 	}
 
 	do {
+		SONIC_WRITE(SONIC_ISR, status); /* clear the interrupt(s) */
+
 		if (status & SONIC_INT_PKTRX) {
 			netif_dbg(lp, intr, dev, "%s: packet rx\n", __func__);
 			sonic_rx(dev);	/* got packet(s) */
-			SONIC_WRITE(SONIC_ISR, SONIC_INT_PKTRX); /* clear the interrupt */
 		}
 
 		if (status & SONIC_INT_TXDN) {
@@ -361,7 +362,6 @@ static irqreturn_t sonic_interrupt(int i
 			if (freed_some || lp->tx_skb[entry] == NULL)
 				netif_wake_queue(dev);  /* The ring is no longer full */
 			lp->cur_tx = entry;
-			SONIC_WRITE(SONIC_ISR, SONIC_INT_TXDN); /* clear the interrupt */
 		}
 
 		/*
@@ -371,42 +371,31 @@ static irqreturn_t sonic_interrupt(int i
 			netif_dbg(lp, rx_err, dev, "%s: rx fifo overrun\n",
 				  __func__);
 			lp->stats.rx_fifo_errors++;
-			SONIC_WRITE(SONIC_ISR, SONIC_INT_RFO); /* clear the interrupt */
 		}
 		if (status & SONIC_INT_RDE) {
 			netif_dbg(lp, rx_err, dev, "%s: rx descriptors exhausted\n",
 				  __func__);
 			lp->stats.rx_dropped++;
-			SONIC_WRITE(SONIC_ISR, SONIC_INT_RDE); /* clear the interrupt */
 		}
 		if (status & SONIC_INT_RBAE) {
 			netif_dbg(lp, rx_err, dev, "%s: rx buffer area exceeded\n",
 				  __func__);
 			lp->stats.rx_dropped++;
-			SONIC_WRITE(SONIC_ISR, SONIC_INT_RBAE); /* clear the interrupt */
 		}
 
 		/* counter overruns; all counters are 16bit wide */
-		if (status & SONIC_INT_FAE) {
+		if (status & SONIC_INT_FAE)
 			lp->stats.rx_frame_errors += 65536;
-			SONIC_WRITE(SONIC_ISR, SONIC_INT_FAE); /* clear the interrupt */
-		}
-		if (status & SONIC_INT_CRC) {
+		if (status & SONIC_INT_CRC)
 			lp->stats.rx_crc_errors += 65536;
-			SONIC_WRITE(SONIC_ISR, SONIC_INT_CRC); /* clear the interrupt */
-		}
-		if (status & SONIC_INT_MP) {
+		if (status & SONIC_INT_MP)
 			lp->stats.rx_missed_errors += 65536;
-			SONIC_WRITE(SONIC_ISR, SONIC_INT_MP); /* clear the interrupt */
-		}
 
 		/* transmit error */
-		if (status & SONIC_INT_TXER) {
+		if (status & SONIC_INT_TXER)
 			if (SONIC_READ(SONIC_TCR) & SONIC_TCR_FU)
 				netif_dbg(lp, tx_err, dev, "%s: tx fifo underrun\n",
 					  __func__);
-			SONIC_WRITE(SONIC_ISR, SONIC_INT_TXER); /* clear the interrupt */
-		}
 
 		/* bus retry */
 		if (status & SONIC_INT_BR) {
@@ -415,13 +404,8 @@ static irqreturn_t sonic_interrupt(int i
 			/* ... to help debug DMA problems causing endless interrupts. */
 			/* Bounce the eth interface to turn on the interrupt again. */
 			SONIC_WRITE(SONIC_IMR, 0);
-			SONIC_WRITE(SONIC_ISR, SONIC_INT_BR); /* clear the interrupt */
 		}
 
-		/* load CAM done */
-		if (status & SONIC_INT_LCD)
-			SONIC_WRITE(SONIC_ISR, SONIC_INT_LCD); /* clear the interrupt */
-
 		status = SONIC_READ(SONIC_ISR) & SONIC_IMR_DEFAULT;
 	} while (status);
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 46/92] net/sonic: Use MMIO accessors
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (44 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 45/92] net/sonic: Clear interrupt flags immediately Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 47/92] net/sonic: Fix interface error stats collection Greg Kroah-Hartman
                   ` (42 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stan Johnson, Finn Thain, David S. Miller

From: Finn Thain <fthain@telegraphics.com.au>

commit e3885f576196ddfc670b3d53e745de96ffcb49ab upstream.

The driver accesses descriptor memory which is simultaneously accessed by
the chip, so the compiler must not be allowed to re-order CPU accesses.
sonic_buf_get() used 'volatile' to prevent that. sonic_buf_put() should
have done so too but was overlooked.

Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/ethernet/natsemi/sonic.h |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

--- a/drivers/net/ethernet/natsemi/sonic.h
+++ b/drivers/net/ethernet/natsemi/sonic.h
@@ -345,30 +345,30 @@ static void sonic_msg_init(struct net_de
    as far as we can tell. */
 /* OpenBSD calls this "SWO".  I'd like to think that sonic_buf_put()
    is a much better name. */
-static inline void sonic_buf_put(void* base, int bitmode,
+static inline void sonic_buf_put(u16 *base, int bitmode,
 				 int offset, __u16 val)
 {
 	if (bitmode)
 #ifdef __BIG_ENDIAN
-		((__u16 *) base + (offset*2))[1] = val;
+		__raw_writew(val, base + (offset * 2) + 1);
 #else
-		((__u16 *) base + (offset*2))[0] = val;
+		__raw_writew(val, base + (offset * 2) + 0);
 #endif
 	else
-	 	((__u16 *) base)[offset] = val;
+		__raw_writew(val, base + (offset * 1) + 0);
 }
 
-static inline __u16 sonic_buf_get(void* base, int bitmode,
+static inline __u16 sonic_buf_get(u16 *base, int bitmode,
 				  int offset)
 {
 	if (bitmode)
 #ifdef __BIG_ENDIAN
-		return ((volatile __u16 *) base + (offset*2))[1];
+		return __raw_readw(base + (offset * 2) + 1);
 #else
-		return ((volatile __u16 *) base + (offset*2))[0];
+		return __raw_readw(base + (offset * 2) + 0);
 #endif
 	else
-		return ((volatile __u16 *) base)[offset];
+		return __raw_readw(base + (offset * 1) + 0);
 }
 
 /* Inlines that you should actually use for reading/writing DMA buffers */



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 47/92] net/sonic: Fix interface error stats collection
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (45 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 46/92] net/sonic: Use MMIO accessors Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 48/92] net/sonic: Fix receive buffer handling Greg Kroah-Hartman
                   ` (41 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stan Johnson, Finn Thain, David S. Miller

From: Finn Thain <fthain@telegraphics.com.au>

commit 427db97df1ee721c20bdc9a66db8a9e1da719855 upstream.

The tx_aborted_errors statistic should count packets flagged with EXD,
EXC, FU, or BCM bits because those bits denote an aborted transmission.
That corresponds to the bitmask 0x0446, not 0x0642. Use macros for these
constants to avoid mistakes. Better to leave out FIFO Underruns (FU) as
there's a separate counter for that purpose.

Don't lump all these errors in with the general tx_errors counter as
that's used for tx timeout events.

On the rx side, don't count RDE and RBAE interrupts as dropped packets.
These interrupts don't indicate a lost packet, just a lack of resources.
When a lack of resources results in a lost packet, this gets reported
in the rx_missed_errors counter (along with RFO events).

Don't double-count rx_frame_errors and rx_crc_errors.

Don't use the general rx_errors counter for events that already have
special counters.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/ethernet/natsemi/sonic.c |   21 +++++++--------------
 drivers/net/ethernet/natsemi/sonic.h |    1 +
 2 files changed, 8 insertions(+), 14 deletions(-)

--- a/drivers/net/ethernet/natsemi/sonic.c
+++ b/drivers/net/ethernet/natsemi/sonic.c
@@ -329,18 +329,19 @@ static irqreturn_t sonic_interrupt(int i
 				if ((td_status = sonic_tda_get(dev, entry, SONIC_TD_STATUS)) == 0)
 					break;
 
-				if (td_status & 0x0001) {
+				if (td_status & SONIC_TCR_PTX) {
 					lp->stats.tx_packets++;
 					lp->stats.tx_bytes += sonic_tda_get(dev, entry, SONIC_TD_PKTSIZE);
 				} else {
-					lp->stats.tx_errors++;
-					if (td_status & 0x0642)
+					if (td_status & (SONIC_TCR_EXD |
+					    SONIC_TCR_EXC | SONIC_TCR_BCM))
 						lp->stats.tx_aborted_errors++;
-					if (td_status & 0x0180)
+					if (td_status &
+					    (SONIC_TCR_NCRS | SONIC_TCR_CRLS))
 						lp->stats.tx_carrier_errors++;
-					if (td_status & 0x0020)
+					if (td_status & SONIC_TCR_OWC)
 						lp->stats.tx_window_errors++;
-					if (td_status & 0x0004)
+					if (td_status & SONIC_TCR_FU)
 						lp->stats.tx_fifo_errors++;
 				}
 
@@ -370,17 +371,14 @@ static irqreturn_t sonic_interrupt(int i
 		if (status & SONIC_INT_RFO) {
 			netif_dbg(lp, rx_err, dev, "%s: rx fifo overrun\n",
 				  __func__);
-			lp->stats.rx_fifo_errors++;
 		}
 		if (status & SONIC_INT_RDE) {
 			netif_dbg(lp, rx_err, dev, "%s: rx descriptors exhausted\n",
 				  __func__);
-			lp->stats.rx_dropped++;
 		}
 		if (status & SONIC_INT_RBAE) {
 			netif_dbg(lp, rx_err, dev, "%s: rx buffer area exceeded\n",
 				  __func__);
-			lp->stats.rx_dropped++;
 		}
 
 		/* counter overruns; all counters are 16bit wide */
@@ -472,11 +470,6 @@ static void sonic_rx(struct net_device *
 			sonic_rra_put(dev, entry, SONIC_RR_BUFADR_H, bufadr_h);
 		} else {
 			/* This should only happen, if we enable accepting broken packets. */
-			lp->stats.rx_errors++;
-			if (status & SONIC_RCR_FAER)
-				lp->stats.rx_frame_errors++;
-			if (status & SONIC_RCR_CRCR)
-				lp->stats.rx_crc_errors++;
 		}
 		if (status & SONIC_RCR_LPKT) {
 			/*
--- a/drivers/net/ethernet/natsemi/sonic.h
+++ b/drivers/net/ethernet/natsemi/sonic.h
@@ -175,6 +175,7 @@
 #define SONIC_TCR_NCRS          0x0100
 #define SONIC_TCR_CRLS          0x0080
 #define SONIC_TCR_EXC           0x0040
+#define SONIC_TCR_OWC           0x0020
 #define SONIC_TCR_PMB           0x0008
 #define SONIC_TCR_FU            0x0004
 #define SONIC_TCR_BCM           0x0002



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 48/92] net/sonic: Fix receive buffer handling
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (46 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 47/92] net/sonic: Fix interface error stats collection Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 49/92] net/sonic: Avoid needless receive descriptor EOL flag updates Greg Kroah-Hartman
                   ` (40 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stan Johnson, Finn Thain, David S. Miller

From: Finn Thain <fthain@telegraphics.com.au>

commit 9e311820f67e740f4fb8dcb82b4c4b5b05bdd1a5 upstream.

The SONIC can sometimes advance its rx buffer pointer (RRP register)
without advancing its rx descriptor pointer (CRDA register). As a result
the index of the current rx descriptor may not equal that of the current
rx buffer. The driver mistakenly assumes that they are always equal.
This assumption leads to incorrect packet lengths and possible packet
duplication. Avoid this by calling a new function to locate the buffer
corresponding to a given descriptor.

Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/ethernet/natsemi/sonic.c |   35 ++++++++++++++++++++++++++++++-----
 drivers/net/ethernet/natsemi/sonic.h |    5 +++--
 2 files changed, 33 insertions(+), 7 deletions(-)

--- a/drivers/net/ethernet/natsemi/sonic.c
+++ b/drivers/net/ethernet/natsemi/sonic.c
@@ -412,6 +412,21 @@ static irqreturn_t sonic_interrupt(int i
 	return IRQ_HANDLED;
 }
 
+/* Return the array index corresponding to a given Receive Buffer pointer. */
+static int index_from_addr(struct sonic_local *lp, dma_addr_t addr,
+			   unsigned int last)
+{
+	unsigned int i = last;
+
+	do {
+		i = (i + 1) & SONIC_RRS_MASK;
+		if (addr == lp->rx_laddr[i])
+			return i;
+	} while (i != last);
+
+	return -ENOENT;
+}
+
 /*
  * We have a good packet(s), pass it/them up the network stack.
  */
@@ -431,6 +446,16 @@ static void sonic_rx(struct net_device *
 
 		status = sonic_rda_get(dev, entry, SONIC_RD_STATUS);
 		if (status & SONIC_RCR_PRX) {
+			u32 addr = (sonic_rda_get(dev, entry,
+						  SONIC_RD_PKTPTR_H) << 16) |
+				   sonic_rda_get(dev, entry, SONIC_RD_PKTPTR_L);
+			int i = index_from_addr(lp, addr, entry);
+
+			if (i < 0) {
+				WARN_ONCE(1, "failed to find buffer!\n");
+				break;
+			}
+
 			/* Malloc up new buffer. */
 			new_skb = netdev_alloc_skb(dev, SONIC_RBSIZE + 2);
 			if (new_skb == NULL) {
@@ -452,7 +477,7 @@ static void sonic_rx(struct net_device *
 
 			/* now we have a new skb to replace it, pass the used one up the stack */
 			dma_unmap_single(lp->device, lp->rx_laddr[entry], SONIC_RBSIZE, DMA_FROM_DEVICE);
-			used_skb = lp->rx_skb[entry];
+			used_skb = lp->rx_skb[i];
 			pkt_len = sonic_rda_get(dev, entry, SONIC_RD_PKTLEN);
 			skb_trim(used_skb, pkt_len);
 			used_skb->protocol = eth_type_trans(used_skb, dev);
@@ -461,13 +486,13 @@ static void sonic_rx(struct net_device *
 			lp->stats.rx_bytes += pkt_len;
 
 			/* and insert the new skb */
-			lp->rx_laddr[entry] = new_laddr;
-			lp->rx_skb[entry] = new_skb;
+			lp->rx_laddr[i] = new_laddr;
+			lp->rx_skb[i] = new_skb;
 
 			bufadr_l = (unsigned long)new_laddr & 0xffff;
 			bufadr_h = (unsigned long)new_laddr >> 16;
-			sonic_rra_put(dev, entry, SONIC_RR_BUFADR_L, bufadr_l);
-			sonic_rra_put(dev, entry, SONIC_RR_BUFADR_H, bufadr_h);
+			sonic_rra_put(dev, i, SONIC_RR_BUFADR_L, bufadr_l);
+			sonic_rra_put(dev, i, SONIC_RR_BUFADR_H, bufadr_h);
 		} else {
 			/* This should only happen, if we enable accepting broken packets. */
 		}
--- a/drivers/net/ethernet/natsemi/sonic.h
+++ b/drivers/net/ethernet/natsemi/sonic.h
@@ -275,8 +275,9 @@
 #define SONIC_NUM_RDS   SONIC_NUM_RRS /* number of receive descriptors */
 #define SONIC_NUM_TDS   16            /* number of transmit descriptors */
 
-#define SONIC_RDS_MASK  (SONIC_NUM_RDS-1)
-#define SONIC_TDS_MASK  (SONIC_NUM_TDS-1)
+#define SONIC_RRS_MASK  (SONIC_NUM_RRS - 1)
+#define SONIC_RDS_MASK  (SONIC_NUM_RDS - 1)
+#define SONIC_TDS_MASK  (SONIC_NUM_TDS - 1)
 
 #define SONIC_RBSIZE	1520          /* size of one resource buffer */
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 49/92] net/sonic: Avoid needless receive descriptor EOL flag updates
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (47 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 48/92] net/sonic: Fix receive buffer handling Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 50/92] net/sonic: Improve receive descriptor status flag check Greg Kroah-Hartman
                   ` (39 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stan Johnson, Finn Thain, David S. Miller

From: Finn Thain <fthain@telegraphics.com.au>

commit eaabfd19b2c787bbe88dc32424b9a43d67293422 upstream.

The while loop in sonic_rx() traverses the rx descriptor ring. It stops
when it reaches a descriptor that the SONIC has not used. Each iteration
advances the EOL flag so the SONIC can keep using more descriptors.
Therefore, the while loop has no definite termination condition.

The algorithm described in the National Semiconductor literature is quite
different. It consumes descriptors up to the one with its EOL flag set
(which will also have its "in use" flag set). All freed descriptors are
then returned to the ring at once, by adjusting the EOL flags (and link
pointers).

Adopt the algorithm from datasheet as it's simpler, terminates quickly
and avoids a lot of pointless descriptor EOL flag changes.

Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/ethernet/natsemi/sonic.c |   21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

--- a/drivers/net/ethernet/natsemi/sonic.c
+++ b/drivers/net/ethernet/natsemi/sonic.c
@@ -435,6 +435,7 @@ static void sonic_rx(struct net_device *
 	struct sonic_local *lp = netdev_priv(dev);
 	int status;
 	int entry = lp->cur_rx;
+	int prev_entry = lp->eol_rx;
 
 	while (sonic_rda_get(dev, entry, SONIC_RD_IN_USE) == 0) {
 		struct sk_buff *used_skb;
@@ -515,13 +516,21 @@ static void sonic_rx(struct net_device *
 		/*
 		 * give back the descriptor
 		 */
-		sonic_rda_put(dev, entry, SONIC_RD_LINK,
-			sonic_rda_get(dev, entry, SONIC_RD_LINK) | SONIC_EOL);
 		sonic_rda_put(dev, entry, SONIC_RD_IN_USE, 1);
-		sonic_rda_put(dev, lp->eol_rx, SONIC_RD_LINK,
-			sonic_rda_get(dev, lp->eol_rx, SONIC_RD_LINK) & ~SONIC_EOL);
-		lp->eol_rx = entry;
-		lp->cur_rx = entry = (entry + 1) & SONIC_RDS_MASK;
+
+		prev_entry = entry;
+		entry = (entry + 1) & SONIC_RDS_MASK;
+	}
+
+	lp->cur_rx = entry;
+
+	if (prev_entry != lp->eol_rx) {
+		/* Advance the EOL flag to put descriptors back into service */
+		sonic_rda_put(dev, prev_entry, SONIC_RD_LINK, SONIC_EOL |
+			      sonic_rda_get(dev, prev_entry, SONIC_RD_LINK));
+		sonic_rda_put(dev, lp->eol_rx, SONIC_RD_LINK, ~SONIC_EOL &
+			      sonic_rda_get(dev, lp->eol_rx, SONIC_RD_LINK));
+		lp->eol_rx = prev_entry;
 	}
 	/*
 	 * If any worth-while packets have been received, netif_rx()



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 50/92] net/sonic: Improve receive descriptor status flag check
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (48 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 49/92] net/sonic: Avoid needless receive descriptor EOL flag updates Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 51/92] net/sonic: Fix receive buffer replenishment Greg Kroah-Hartman
                   ` (38 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stan Johnson, Finn Thain, David S. Miller

From: Finn Thain <fthain@telegraphics.com.au>

commit 94b166349503957079ef5e7d6f667f157aea014a upstream.

After sonic_tx_timeout() calls sonic_init(), it can happen that
sonic_rx() will subsequently encounter a receive descriptor with no
flags set. Remove the comment that says that this can't happen.

When giving a receive descriptor to the SONIC, clear the descriptor
status field. That way, any rx descriptor with flags set can only be
a newly received packet.

Don't process a descriptor without the LPKT bit set. The buffer is
still in use by the SONIC.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/ethernet/natsemi/sonic.c |   15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

--- a/drivers/net/ethernet/natsemi/sonic.c
+++ b/drivers/net/ethernet/natsemi/sonic.c
@@ -433,7 +433,6 @@ static int index_from_addr(struct sonic_
 static void sonic_rx(struct net_device *dev)
 {
 	struct sonic_local *lp = netdev_priv(dev);
-	int status;
 	int entry = lp->cur_rx;
 	int prev_entry = lp->eol_rx;
 
@@ -444,9 +443,10 @@ static void sonic_rx(struct net_device *
 		u16 bufadr_l;
 		u16 bufadr_h;
 		int pkt_len;
+		u16 status = sonic_rda_get(dev, entry, SONIC_RD_STATUS);
 
-		status = sonic_rda_get(dev, entry, SONIC_RD_STATUS);
-		if (status & SONIC_RCR_PRX) {
+		/* If the RD has LPKT set, the chip has finished with the RB */
+		if ((status & SONIC_RCR_PRX) && (status & SONIC_RCR_LPKT)) {
 			u32 addr = (sonic_rda_get(dev, entry,
 						  SONIC_RD_PKTPTR_H) << 16) |
 				   sonic_rda_get(dev, entry, SONIC_RD_PKTPTR_L);
@@ -494,10 +494,6 @@ static void sonic_rx(struct net_device *
 			bufadr_h = (unsigned long)new_laddr >> 16;
 			sonic_rra_put(dev, i, SONIC_RR_BUFADR_L, bufadr_l);
 			sonic_rra_put(dev, i, SONIC_RR_BUFADR_H, bufadr_h);
-		} else {
-			/* This should only happen, if we enable accepting broken packets. */
-		}
-		if (status & SONIC_RCR_LPKT) {
 			/*
 			 * this was the last packet out of the current receive buffer
 			 * give the buffer back to the SONIC
@@ -510,12 +506,11 @@ static void sonic_rx(struct net_device *
 					  __func__);
 				SONIC_WRITE(SONIC_ISR, SONIC_INT_RBE); /* clear the flag */
 			}
-		} else
-			printk(KERN_ERR "%s: rx desc without RCR_LPKT. Shouldn't happen !?\n",
-			     dev->name);
+		}
 		/*
 		 * give back the descriptor
 		 */
+		sonic_rda_put(dev, entry, SONIC_RD_STATUS, 0);
 		sonic_rda_put(dev, entry, SONIC_RD_IN_USE, 1);
 
 		prev_entry = entry;



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 51/92] net/sonic: Fix receive buffer replenishment
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (49 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 50/92] net/sonic: Improve receive descriptor status flag check Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 52/92] net/sonic: Quiesce SONIC before re-initializing descriptor memory Greg Kroah-Hartman
                   ` (37 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stan Johnson, Finn Thain, David S. Miller

From: Finn Thain <fthain@telegraphics.com.au>

commit 89ba879e95582d3bba55081e45b5409e883312ca upstream.

As soon as the driver is finished with a receive buffer it allocs a new
one and overwrites the corresponding RRA entry with a new buffer pointer.

Problem is, the buffer pointer is split across two word-sized registers.
It can't be updated in one atomic store. So this operation races with the
chip while it stores received packets and advances its RRP register.
This could result in memory corruption by a DMA write.

Avoid this problem by adding buffers only at the location given by the
RWP register, in accordance with the National Semiconductor datasheet.

Re-factor this code into separate functions to calculate a RRA pointer
and to update the RWP.

Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/ethernet/natsemi/sonic.c |  150 ++++++++++++++++++++---------------
 drivers/net/ethernet/natsemi/sonic.h |   18 +++-
 2 files changed, 105 insertions(+), 63 deletions(-)

--- a/drivers/net/ethernet/natsemi/sonic.c
+++ b/drivers/net/ethernet/natsemi/sonic.c
@@ -427,6 +427,59 @@ static int index_from_addr(struct sonic_
 	return -ENOENT;
 }
 
+/* Allocate and map a new skb to be used as a receive buffer. */
+static bool sonic_alloc_rb(struct net_device *dev, struct sonic_local *lp,
+			   struct sk_buff **new_skb, dma_addr_t *new_addr)
+{
+	*new_skb = netdev_alloc_skb(dev, SONIC_RBSIZE + 2);
+	if (!*new_skb)
+		return false;
+
+	if (SONIC_BUS_SCALE(lp->dma_bitmode) == 2)
+		skb_reserve(*new_skb, 2);
+
+	*new_addr = dma_map_single(lp->device, skb_put(*new_skb, SONIC_RBSIZE),
+				   SONIC_RBSIZE, DMA_FROM_DEVICE);
+	if (!*new_addr) {
+		dev_kfree_skb(*new_skb);
+		*new_skb = NULL;
+		return false;
+	}
+
+	return true;
+}
+
+/* Place a new receive resource in the Receive Resource Area and update RWP. */
+static void sonic_update_rra(struct net_device *dev, struct sonic_local *lp,
+			     dma_addr_t old_addr, dma_addr_t new_addr)
+{
+	unsigned int entry = sonic_rr_entry(dev, SONIC_READ(SONIC_RWP));
+	unsigned int end = sonic_rr_entry(dev, SONIC_READ(SONIC_RRP));
+	u32 buf;
+
+	/* The resources in the range [RRP, RWP) belong to the SONIC. This loop
+	 * scans the other resources in the RRA, those in the range [RWP, RRP).
+	 */
+	do {
+		buf = (sonic_rra_get(dev, entry, SONIC_RR_BUFADR_H) << 16) |
+		      sonic_rra_get(dev, entry, SONIC_RR_BUFADR_L);
+
+		if (buf == old_addr)
+			break;
+
+		entry = (entry + 1) & SONIC_RRS_MASK;
+	} while (entry != end);
+
+	WARN_ONCE(buf != old_addr, "failed to find resource!\n");
+
+	sonic_rra_put(dev, entry, SONIC_RR_BUFADR_H, new_addr >> 16);
+	sonic_rra_put(dev, entry, SONIC_RR_BUFADR_L, new_addr & 0xffff);
+
+	entry = (entry + 1) & SONIC_RRS_MASK;
+
+	SONIC_WRITE(SONIC_RWP, sonic_rr_addr(dev, entry));
+}
+
 /*
  * We have a good packet(s), pass it/them up the network stack.
  */
@@ -435,18 +488,15 @@ static void sonic_rx(struct net_device *
 	struct sonic_local *lp = netdev_priv(dev);
 	int entry = lp->cur_rx;
 	int prev_entry = lp->eol_rx;
+	bool rbe = false;
 
 	while (sonic_rda_get(dev, entry, SONIC_RD_IN_USE) == 0) {
-		struct sk_buff *used_skb;
-		struct sk_buff *new_skb;
-		dma_addr_t new_laddr;
-		u16 bufadr_l;
-		u16 bufadr_h;
-		int pkt_len;
 		u16 status = sonic_rda_get(dev, entry, SONIC_RD_STATUS);
 
 		/* If the RD has LPKT set, the chip has finished with the RB */
 		if ((status & SONIC_RCR_PRX) && (status & SONIC_RCR_LPKT)) {
+			struct sk_buff *new_skb;
+			dma_addr_t new_laddr;
 			u32 addr = (sonic_rda_get(dev, entry,
 						  SONIC_RD_PKTPTR_H) << 16) |
 				   sonic_rda_get(dev, entry, SONIC_RD_PKTPTR_L);
@@ -457,55 +507,35 @@ static void sonic_rx(struct net_device *
 				break;
 			}
 
-			/* Malloc up new buffer. */
-			new_skb = netdev_alloc_skb(dev, SONIC_RBSIZE + 2);
-			if (new_skb == NULL) {
+			if (sonic_alloc_rb(dev, lp, &new_skb, &new_laddr)) {
+				struct sk_buff *used_skb = lp->rx_skb[i];
+				int pkt_len;
+
+				/* Pass the used buffer up the stack */
+				dma_unmap_single(lp->device, addr, SONIC_RBSIZE,
+						 DMA_FROM_DEVICE);
+
+				pkt_len = sonic_rda_get(dev, entry,
+							SONIC_RD_PKTLEN);
+				skb_trim(used_skb, pkt_len);
+				used_skb->protocol = eth_type_trans(used_skb,
+								    dev);
+				netif_rx(used_skb);
+				lp->stats.rx_packets++;
+				lp->stats.rx_bytes += pkt_len;
+
+				lp->rx_skb[i] = new_skb;
+				lp->rx_laddr[i] = new_laddr;
+			} else {
+				/* Failed to obtain a new buffer so re-use it */
+				new_laddr = addr;
 				lp->stats.rx_dropped++;
-				break;
 			}
-			/* provide 16 byte IP header alignment unless DMA requires otherwise */
-			if(SONIC_BUS_SCALE(lp->dma_bitmode) == 2)
-				skb_reserve(new_skb, 2);
-
-			new_laddr = dma_map_single(lp->device, skb_put(new_skb, SONIC_RBSIZE),
-		                               SONIC_RBSIZE, DMA_FROM_DEVICE);
-			if (!new_laddr) {
-				dev_kfree_skb(new_skb);
-				printk(KERN_ERR "%s: Failed to map rx buffer, dropping packet.\n", dev->name);
-				lp->stats.rx_dropped++;
-				break;
-			}
-
-			/* now we have a new skb to replace it, pass the used one up the stack */
-			dma_unmap_single(lp->device, lp->rx_laddr[entry], SONIC_RBSIZE, DMA_FROM_DEVICE);
-			used_skb = lp->rx_skb[i];
-			pkt_len = sonic_rda_get(dev, entry, SONIC_RD_PKTLEN);
-			skb_trim(used_skb, pkt_len);
-			used_skb->protocol = eth_type_trans(used_skb, dev);
-			netif_rx(used_skb);
-			lp->stats.rx_packets++;
-			lp->stats.rx_bytes += pkt_len;
-
-			/* and insert the new skb */
-			lp->rx_laddr[i] = new_laddr;
-			lp->rx_skb[i] = new_skb;
-
-			bufadr_l = (unsigned long)new_laddr & 0xffff;
-			bufadr_h = (unsigned long)new_laddr >> 16;
-			sonic_rra_put(dev, i, SONIC_RR_BUFADR_L, bufadr_l);
-			sonic_rra_put(dev, i, SONIC_RR_BUFADR_H, bufadr_h);
-			/*
-			 * this was the last packet out of the current receive buffer
-			 * give the buffer back to the SONIC
+			/* If RBE is already asserted when RWP advances then
+			 * it's safe to clear RBE after processing this packet.
 			 */
-			lp->cur_rwp += SIZEOF_SONIC_RR * SONIC_BUS_SCALE(lp->dma_bitmode);
-			if (lp->cur_rwp >= lp->rra_end) lp->cur_rwp = lp->rra_laddr & 0xffff;
-			SONIC_WRITE(SONIC_RWP, lp->cur_rwp);
-			if (SONIC_READ(SONIC_ISR) & SONIC_INT_RBE) {
-				netif_dbg(lp, rx_err, dev, "%s: rx buffer exhausted\n",
-					  __func__);
-				SONIC_WRITE(SONIC_ISR, SONIC_INT_RBE); /* clear the flag */
-			}
+			rbe = rbe || SONIC_READ(SONIC_ISR) & SONIC_INT_RBE;
+			sonic_update_rra(dev, lp, addr, new_laddr);
 		}
 		/*
 		 * give back the descriptor
@@ -527,6 +557,9 @@ static void sonic_rx(struct net_device *
 			      sonic_rda_get(dev, lp->eol_rx, SONIC_RD_LINK));
 		lp->eol_rx = prev_entry;
 	}
+
+	if (rbe)
+		SONIC_WRITE(SONIC_ISR, SONIC_INT_RBE);
 	/*
 	 * If any worth-while packets have been received, netif_rx()
 	 * has done a mark_bh(NET_BH) for us and will work on them
@@ -641,15 +674,10 @@ static int sonic_init(struct net_device
 	}
 
 	/* initialize all RRA registers */
-	lp->rra_end = (lp->rra_laddr + SONIC_NUM_RRS * SIZEOF_SONIC_RR *
-					SONIC_BUS_SCALE(lp->dma_bitmode)) & 0xffff;
-	lp->cur_rwp = (lp->rra_laddr + (SONIC_NUM_RRS - 1) * SIZEOF_SONIC_RR *
-					SONIC_BUS_SCALE(lp->dma_bitmode)) & 0xffff;
-
-	SONIC_WRITE(SONIC_RSA, lp->rra_laddr & 0xffff);
-	SONIC_WRITE(SONIC_REA, lp->rra_end);
-	SONIC_WRITE(SONIC_RRP, lp->rra_laddr & 0xffff);
-	SONIC_WRITE(SONIC_RWP, lp->cur_rwp);
+	SONIC_WRITE(SONIC_RSA, sonic_rr_addr(dev, 0));
+	SONIC_WRITE(SONIC_REA, sonic_rr_addr(dev, SONIC_NUM_RRS));
+	SONIC_WRITE(SONIC_RRP, sonic_rr_addr(dev, 0));
+	SONIC_WRITE(SONIC_RWP, sonic_rr_addr(dev, SONIC_NUM_RRS - 1));
 	SONIC_WRITE(SONIC_URRA, lp->rra_laddr >> 16);
 	SONIC_WRITE(SONIC_EOBC, (SONIC_RBSIZE >> 1) - (lp->dma_bitmode ? 2 : 1));
 
--- a/drivers/net/ethernet/natsemi/sonic.h
+++ b/drivers/net/ethernet/natsemi/sonic.h
@@ -314,8 +314,6 @@ struct sonic_local {
 	u32 rda_laddr;              /* logical DMA address of RDA */
 	dma_addr_t rx_laddr[SONIC_NUM_RRS]; /* logical DMA addresses of rx skbuffs */
 	dma_addr_t tx_laddr[SONIC_NUM_TDS]; /* logical DMA addresses of tx skbuffs */
-	unsigned int rra_end;
-	unsigned int cur_rwp;
 	unsigned int cur_rx;
 	unsigned int cur_tx;           /* first unacked transmit packet */
 	unsigned int eol_rx;
@@ -450,6 +448,22 @@ static inline __u16 sonic_rra_get(struct
 			     (entry * SIZEOF_SONIC_RR) + offset);
 }
 
+static inline u16 sonic_rr_addr(struct net_device *dev, int entry)
+{
+	struct sonic_local *lp = netdev_priv(dev);
+
+	return lp->rra_laddr +
+	       entry * SIZEOF_SONIC_RR * SONIC_BUS_SCALE(lp->dma_bitmode);
+}
+
+static inline u16 sonic_rr_entry(struct net_device *dev, u16 addr)
+{
+	struct sonic_local *lp = netdev_priv(dev);
+
+	return (addr - (u16)lp->rra_laddr) / (SIZEOF_SONIC_RR *
+					      SONIC_BUS_SCALE(lp->dma_bitmode));
+}
+
 static const char version[] =
     "sonic.c:v0.92 20.9.98 tsbogend@alpha.franken.de\n";
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 52/92] net/sonic: Quiesce SONIC before re-initializing descriptor memory
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (50 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 51/92] net/sonic: Fix receive buffer replenishment Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 53/92] net/sonic: Fix command register usage Greg Kroah-Hartman
                   ` (36 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stan Johnson, Finn Thain, David S. Miller

From: Finn Thain <fthain@telegraphics.com.au>

commit 3f4b7e6a2be982fd8820a2b54d46dd9c351db899 upstream.

Make sure the SONIC's DMA engine is idle before altering the transmit
and receive descriptors. Add a helper for this as it will be needed
again.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/ethernet/natsemi/sonic.c |   25 +++++++++++++++++++++++++
 drivers/net/ethernet/natsemi/sonic.h |    3 +++
 2 files changed, 28 insertions(+)

--- a/drivers/net/ethernet/natsemi/sonic.c
+++ b/drivers/net/ethernet/natsemi/sonic.c
@@ -115,6 +115,24 @@ static int sonic_open(struct net_device
 	return 0;
 }
 
+/* Wait for the SONIC to become idle. */
+static void sonic_quiesce(struct net_device *dev, u16 mask)
+{
+	struct sonic_local * __maybe_unused lp = netdev_priv(dev);
+	int i;
+	u16 bits;
+
+	for (i = 0; i < 1000; ++i) {
+		bits = SONIC_READ(SONIC_CMD) & mask;
+		if (!bits)
+			return;
+		if (irqs_disabled() || in_interrupt())
+			udelay(20);
+		else
+			usleep_range(100, 200);
+	}
+	WARN_ONCE(1, "command deadline expired! 0x%04x\n", bits);
+}
 
 /*
  * Close the SONIC device
@@ -131,6 +149,9 @@ static int sonic_close(struct net_device
 	/*
 	 * stop the SONIC, disable interrupts
 	 */
+	SONIC_WRITE(SONIC_CMD, SONIC_CR_RXDIS);
+	sonic_quiesce(dev, SONIC_CR_ALL);
+
 	SONIC_WRITE(SONIC_IMR, 0);
 	SONIC_WRITE(SONIC_ISR, 0x7fff);
 	SONIC_WRITE(SONIC_CMD, SONIC_CR_RST);
@@ -170,6 +191,9 @@ static void sonic_tx_timeout(struct net_
 	 * put the Sonic into software-reset mode and
 	 * disable all interrupts before releasing DMA buffers
 	 */
+	SONIC_WRITE(SONIC_CMD, SONIC_CR_RXDIS);
+	sonic_quiesce(dev, SONIC_CR_ALL);
+
 	SONIC_WRITE(SONIC_IMR, 0);
 	SONIC_WRITE(SONIC_ISR, 0x7fff);
 	SONIC_WRITE(SONIC_CMD, SONIC_CR_RST);
@@ -657,6 +681,7 @@ static int sonic_init(struct net_device
 	 */
 	SONIC_WRITE(SONIC_CMD, 0);
 	SONIC_WRITE(SONIC_CMD, SONIC_CR_RXDIS);
+	sonic_quiesce(dev, SONIC_CR_ALL);
 
 	/*
 	 * initialize the receive resource area
--- a/drivers/net/ethernet/natsemi/sonic.h
+++ b/drivers/net/ethernet/natsemi/sonic.h
@@ -110,6 +110,9 @@
 #define SONIC_CR_TXP            0x0002
 #define SONIC_CR_HTX            0x0001
 
+#define SONIC_CR_ALL (SONIC_CR_LCAM | SONIC_CR_RRRA | \
+		      SONIC_CR_RXEN | SONIC_CR_TXP)
+
 /*
  * SONIC data configuration bits
  */



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 53/92] net/sonic: Fix command register usage
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (51 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 52/92] net/sonic: Quiesce SONIC before re-initializing descriptor memory Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 54/92] net/sonic: Fix CAM initialization Greg Kroah-Hartman
                   ` (35 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stan Johnson, Finn Thain, David S. Miller

From: Finn Thain <fthain@telegraphics.com.au>

commit 27e0c31c5f27c1d1a1d9d135c123069f60dcf97b upstream.

There are several issues relating to command register usage during
chip initialization.

Firstly, the SONIC sometimes comes out of software reset with the
Start Timer bit set. This gets logged as,

    macsonic macsonic eth0: sonic_init: status=24, i=101

Avoid this by giving the Stop Timer command earlier than later.

Secondly, the loop that waits for the Read RRA command to complete has
the break condition inverted. That's why the for loop iterates until
its termination condition. Call the helper for this instead.

Finally, give the Receiver Enable command after clearing interrupts,
not before, to avoid the possibility of losing an interrupt.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/ethernet/natsemi/sonic.c |   18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

--- a/drivers/net/ethernet/natsemi/sonic.c
+++ b/drivers/net/ethernet/natsemi/sonic.c
@@ -663,7 +663,6 @@ static void sonic_multicast_list(struct
  */
 static int sonic_init(struct net_device *dev)
 {
-	unsigned int cmd;
 	struct sonic_local *lp = netdev_priv(dev);
 	int i;
 
@@ -680,7 +679,7 @@ static int sonic_init(struct net_device
 	 * enable interrupts, then completely initialize the SONIC
 	 */
 	SONIC_WRITE(SONIC_CMD, 0);
-	SONIC_WRITE(SONIC_CMD, SONIC_CR_RXDIS);
+	SONIC_WRITE(SONIC_CMD, SONIC_CR_RXDIS | SONIC_CR_STP);
 	sonic_quiesce(dev, SONIC_CR_ALL);
 
 	/*
@@ -710,14 +709,7 @@ static int sonic_init(struct net_device
 	netif_dbg(lp, ifup, dev, "%s: issuing RRRA command\n", __func__);
 
 	SONIC_WRITE(SONIC_CMD, SONIC_CR_RRRA);
-	i = 0;
-	while (i++ < 100) {
-		if (SONIC_READ(SONIC_CMD) & SONIC_CR_RRRA)
-			break;
-	}
-
-	netif_dbg(lp, ifup, dev, "%s: status=%x, i=%d\n", __func__,
-		  SONIC_READ(SONIC_CMD), i);
+	sonic_quiesce(dev, SONIC_CR_RRRA);
 
 	/*
 	 * Initialize the receive descriptors so that they
@@ -805,15 +797,11 @@ static int sonic_init(struct net_device
 	 * enable receiver, disable loopback
 	 * and enable all interrupts
 	 */
-	SONIC_WRITE(SONIC_CMD, SONIC_CR_RXEN | SONIC_CR_STP);
 	SONIC_WRITE(SONIC_RCR, SONIC_RCR_DEFAULT);
 	SONIC_WRITE(SONIC_TCR, SONIC_TCR_DEFAULT);
 	SONIC_WRITE(SONIC_ISR, 0x7fff);
 	SONIC_WRITE(SONIC_IMR, SONIC_IMR_DEFAULT);
-
-	cmd = SONIC_READ(SONIC_CMD);
-	if ((cmd & SONIC_CR_RXEN) == 0 || (cmd & SONIC_CR_STP) == 0)
-		printk(KERN_ERR "sonic_init: failed, status=%x\n", cmd);
+	SONIC_WRITE(SONIC_CMD, SONIC_CR_RXEN);
 
 	netif_dbg(lp, ifup, dev, "%s: new status=%x\n", __func__,
 		  SONIC_READ(SONIC_CMD));



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 54/92] net/sonic: Fix CAM initialization
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (52 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 53/92] net/sonic: Fix command register usage Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 55/92] net/sonic: Prevent tx watchdog timeout Greg Kroah-Hartman
                   ` (34 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stan Johnson, Finn Thain, David S. Miller

From: Finn Thain <fthain@telegraphics.com.au>

commit 772f66421d5aa0b9f256056f513bbc38ac132271 upstream.

Section 4.3.1 of the datasheet says,

    This bit [TXP] must not be set if a Load CAM operation is in
    progress (LCAM is set). The SONIC will lock up if both bits are
    set simultaneously.

Testing has shown that the driver sometimes attempts to set LCAM
while TXP is set. Avoid this by waiting for command completion
before and after giving the LCAM command.

After issuing the Load CAM command, poll for !SONIC_CR_LCAM rather than
SONIC_INT_LCD, because the SONIC_CR_TXP bit can't be used until
!SONIC_CR_LCAM.

When in reset mode, take the opportunity to reset the CAM Enable
register.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/ethernet/natsemi/sonic.c |   21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

--- a/drivers/net/ethernet/natsemi/sonic.c
+++ b/drivers/net/ethernet/natsemi/sonic.c
@@ -633,6 +633,8 @@ static void sonic_multicast_list(struct
 		    (netdev_mc_count(dev) > 15)) {
 			rcr |= SONIC_RCR_AMC;
 		} else {
+			unsigned long flags;
+
 			netif_dbg(lp, ifup, dev, "%s: mc_count %d\n", __func__,
 				  netdev_mc_count(dev));
 			sonic_set_cam_enable(dev, 1);  /* always enable our own address */
@@ -646,9 +648,14 @@ static void sonic_multicast_list(struct
 				i++;
 			}
 			SONIC_WRITE(SONIC_CDC, 16);
-			/* issue Load CAM command */
 			SONIC_WRITE(SONIC_CDP, lp->cda_laddr & 0xffff);
+
+			/* LCAM and TXP commands can't be used simultaneously */
+			spin_lock_irqsave(&lp->lock, flags);
+			sonic_quiesce(dev, SONIC_CR_TXP);
 			SONIC_WRITE(SONIC_CMD, SONIC_CR_LCAM);
+			sonic_quiesce(dev, SONIC_CR_LCAM);
+			spin_unlock_irqrestore(&lp->lock, flags);
 		}
 	}
 
@@ -674,6 +681,9 @@ static int sonic_init(struct net_device
 	SONIC_WRITE(SONIC_ISR, 0x7fff);
 	SONIC_WRITE(SONIC_CMD, SONIC_CR_RST);
 
+	/* While in reset mode, clear CAM Enable register */
+	SONIC_WRITE(SONIC_CE, 0);
+
 	/*
 	 * clear software reset flag, disable receiver, clear and
 	 * enable interrupts, then completely initialize the SONIC
@@ -784,14 +794,7 @@ static int sonic_init(struct net_device
 	 * load the CAM
 	 */
 	SONIC_WRITE(SONIC_CMD, SONIC_CR_LCAM);
-
-	i = 0;
-	while (i++ < 100) {
-		if (SONIC_READ(SONIC_ISR) & SONIC_INT_LCD)
-			break;
-	}
-	netif_dbg(lp, ifup, dev, "%s: CMD=%x, ISR=%x, i=%d\n", __func__,
-		  SONIC_READ(SONIC_CMD), SONIC_READ(SONIC_ISR), i);
+	sonic_quiesce(dev, SONIC_CR_LCAM);
 
 	/*
 	 * enable receiver, disable loopback



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 55/92] net/sonic: Prevent tx watchdog timeout
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (53 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 54/92] net/sonic: Fix CAM initialization Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 56/92] tracing: Use hist triggers var_ref array to destroy var_refs Greg Kroah-Hartman
                   ` (33 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stan Johnson, Finn Thain, David S. Miller

From: Finn Thain <fthain@telegraphics.com.au>

commit 686f85d71d095f1d26b807e23b0f0bfd22042c45 upstream.

Section 5.5.3.2 of the datasheet says,

    If FIFO Underrun, Byte Count Mismatch, Excessive Collision, or
    Excessive Deferral (if enabled) errors occur, transmission ceases.

In this situation, the chip asserts a TXER interrupt rather than TXDN.
But the handler for the TXDN is the only way that the transmit queue
gets restarted. Hence, an aborted transmission can result in a watchdog
timeout.

This problem can be reproduced on congested link, as that can result in
excessive transmitter collisions. Another way to reproduce this is with
a FIFO Underrun, which may be caused by DMA latency.

In event of a TXER interrupt, prevent a watchdog timeout by restarting
transmission.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/ethernet/natsemi/sonic.c |   17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

--- a/drivers/net/ethernet/natsemi/sonic.c
+++ b/drivers/net/ethernet/natsemi/sonic.c
@@ -414,10 +414,19 @@ static irqreturn_t sonic_interrupt(int i
 			lp->stats.rx_missed_errors += 65536;
 
 		/* transmit error */
-		if (status & SONIC_INT_TXER)
-			if (SONIC_READ(SONIC_TCR) & SONIC_TCR_FU)
-				netif_dbg(lp, tx_err, dev, "%s: tx fifo underrun\n",
-					  __func__);
+		if (status & SONIC_INT_TXER) {
+			u16 tcr = SONIC_READ(SONIC_TCR);
+
+			netif_dbg(lp, tx_err, dev, "%s: TXER intr, TCR %04x\n",
+				  __func__, tcr);
+
+			if (tcr & (SONIC_TCR_EXD | SONIC_TCR_EXC |
+				   SONIC_TCR_FU | SONIC_TCR_BCM)) {
+				/* Aborted transmission. Try again. */
+				netif_stop_queue(dev);
+				SONIC_WRITE(SONIC_CMD, SONIC_CR_TXP);
+			}
+		}
 
 		/* bus retry */
 		if (status & SONIC_INT_BR) {



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 56/92] tracing: Use hist triggers var_ref array to destroy var_refs
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (54 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 55/92] net/sonic: Prevent tx watchdog timeout Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 57/92] tracing: Remove open-coding of hist trigger var_ref management Greg Kroah-Hartman
                   ` (32 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Namhyung Kim, Masami Hiramatsu,
	Tom Zanussi, Steven Rostedt (VMware)

From: Tom Zanussi <tom.zanussi@linux.intel.com>

commit 656fe2ba85e81d00e4447bf77b8da2be3c47acb2 upstream.

Since every var ref for a trigger has an entry in the var_ref[] array,
use that to destroy the var_refs, instead of piecemeal via the field
expressions.

This allows us to avoid having to keep and treat differently separate
lists for the action-related references, which future patches will
remove.

Link: http://lkml.kernel.org/r/fad1a164f0e257c158e70d6eadbf6c586e04b2a2.1545161087.git.tom.zanussi@linux.intel.com

Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/trace/trace_events_hist.c |   24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -2175,6 +2175,15 @@ static int contains_operator(char *str)
 	return field_op;
 }
 
+static void __destroy_hist_field(struct hist_field *hist_field)
+{
+	kfree(hist_field->var.name);
+	kfree(hist_field->name);
+	kfree(hist_field->type);
+
+	kfree(hist_field);
+}
+
 static void destroy_hist_field(struct hist_field *hist_field,
 			       unsigned int level)
 {
@@ -2186,14 +2195,13 @@ static void destroy_hist_field(struct hi
 	if (!hist_field)
 		return;
 
+	if (hist_field->flags & HIST_FIELD_FL_VAR_REF)
+		return; /* var refs will be destroyed separately */
+
 	for (i = 0; i < HIST_FIELD_OPERANDS_MAX; i++)
 		destroy_hist_field(hist_field->operands[i], level + 1);
 
-	kfree(hist_field->var.name);
-	kfree(hist_field->name);
-	kfree(hist_field->type);
-
-	kfree(hist_field);
+	__destroy_hist_field(hist_field);
 }
 
 static struct hist_field *create_hist_field(struct hist_trigger_data *hist_data,
@@ -2320,6 +2328,12 @@ static void destroy_hist_fields(struct h
 			hist_data->fields[i] = NULL;
 		}
 	}
+
+	for (i = 0; i < hist_data->n_var_refs; i++) {
+		WARN_ON(!(hist_data->var_refs[i]->flags & HIST_FIELD_FL_VAR_REF));
+		__destroy_hist_field(hist_data->var_refs[i]);
+		hist_data->var_refs[i] = NULL;
+	}
 }
 
 static int init_var_ref(struct hist_field *ref_field,



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 57/92] tracing: Remove open-coding of hist trigger var_ref management
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (55 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 56/92] tracing: Use hist triggers var_ref array to destroy var_refs Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 58/92] tracing: Fix histogram code when expression has same var as value Greg Kroah-Hartman
                   ` (31 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Namhyung Kim, Masami Hiramatsu,
	Tom Zanussi, Steven Rostedt (VMware)

From: Tom Zanussi <tom.zanussi@linux.intel.com>

commit de40f033d4e84e843d6a12266e3869015ea9097c upstream.

Have create_var_ref() manage the hist trigger's var_ref list, rather
than having similar code doing it in multiple places.  This cleans up
the code and makes sure var_refs are always accounted properly.

Also, document the var_ref-related functions to make what their
purpose clearer.

Link: http://lkml.kernel.org/r/05ddae93ff514e66fc03897d6665231892939913.1545161087.git.tom.zanussi@linux.intel.com

Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/trace/trace_events_hist.c |   93 +++++++++++++++++++++++++++++++--------
 1 file changed, 75 insertions(+), 18 deletions(-)

--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -1274,6 +1274,17 @@ static u64 hist_field_cpu(struct hist_fi
 	return cpu;
 }
 
+/**
+ * check_field_for_var_ref - Check if a VAR_REF field references a variable
+ * @hist_field: The VAR_REF field to check
+ * @var_data: The hist trigger that owns the variable
+ * @var_idx: The trigger variable identifier
+ *
+ * Check the given VAR_REF field to see whether or not it references
+ * the given variable associated with the given trigger.
+ *
+ * Return: The VAR_REF field if it does reference the variable, NULL if not
+ */
 static struct hist_field *
 check_field_for_var_ref(struct hist_field *hist_field,
 			struct hist_trigger_data *var_data,
@@ -1324,6 +1335,18 @@ check_field_for_var_refs(struct hist_tri
 	return found;
 }
 
+/**
+ * find_var_ref - Check if a trigger has a reference to a trigger variable
+ * @hist_data: The hist trigger that might have a reference to the variable
+ * @var_data: The hist trigger that owns the variable
+ * @var_idx: The trigger variable identifier
+ *
+ * Check the list of var_refs[] on the first hist trigger to see
+ * whether any of them are references to the variable on the second
+ * trigger.
+ *
+ * Return: The VAR_REF field referencing the variable if so, NULL if not
+ */
 static struct hist_field *find_var_ref(struct hist_trigger_data *hist_data,
 				       struct hist_trigger_data *var_data,
 				       unsigned int var_idx)
@@ -1350,6 +1373,20 @@ static struct hist_field *find_var_ref(s
 	return found;
 }
 
+/**
+ * find_any_var_ref - Check if there is a reference to a given trigger variable
+ * @hist_data: The hist trigger
+ * @var_idx: The trigger variable identifier
+ *
+ * Check to see whether the given variable is currently referenced by
+ * any other trigger.
+ *
+ * The trigger the variable is defined on is explicitly excluded - the
+ * assumption being that a self-reference doesn't prevent a trigger
+ * from being removed.
+ *
+ * Return: The VAR_REF field referencing the variable if so, NULL if not
+ */
 static struct hist_field *find_any_var_ref(struct hist_trigger_data *hist_data,
 					   unsigned int var_idx)
 {
@@ -1368,6 +1405,19 @@ static struct hist_field *find_any_var_r
 	return found;
 }
 
+/**
+ * check_var_refs - Check if there is a reference to any of trigger's variables
+ * @hist_data: The hist trigger
+ *
+ * A trigger can define one or more variables.  If any one of them is
+ * currently referenced by any other trigger, this function will
+ * determine that.
+
+ * Typically used to determine whether or not a trigger can be removed
+ * - if there are any references to a trigger's variables, it cannot.
+ *
+ * Return: True if there is a reference to any of trigger's variables
+ */
 static bool check_var_refs(struct hist_trigger_data *hist_data)
 {
 	struct hist_field *field;
@@ -2392,7 +2442,23 @@ static int init_var_ref(struct hist_fiel
 	goto out;
 }
 
-static struct hist_field *create_var_ref(struct hist_field *var_field,
+/**
+ * create_var_ref - Create a variable reference and attach it to trigger
+ * @hist_data: The trigger that will be referencing the variable
+ * @var_field: The VAR field to create a reference to
+ * @system: The optional system string
+ * @event_name: The optional event_name string
+ *
+ * Given a variable hist_field, create a VAR_REF hist_field that
+ * represents a reference to it.
+ *
+ * This function also adds the reference to the trigger that
+ * now references the variable.
+ *
+ * Return: The VAR_REF field if successful, NULL if not
+ */
+static struct hist_field *create_var_ref(struct hist_trigger_data *hist_data,
+					 struct hist_field *var_field,
 					 char *system, char *event_name)
 {
 	unsigned long flags = HIST_FIELD_FL_VAR_REF;
@@ -2404,6 +2470,9 @@ static struct hist_field *create_var_ref
 			destroy_hist_field(ref_field, 0);
 			return NULL;
 		}
+
+		hist_data->var_refs[hist_data->n_var_refs] = ref_field;
+		ref_field->var_ref_idx = hist_data->n_var_refs++;
 	}
 
 	return ref_field;
@@ -2477,7 +2546,8 @@ static struct hist_field *parse_var_ref(
 
 	var_field = find_event_var(hist_data, system, event_name, var_name);
 	if (var_field)
-		ref_field = create_var_ref(var_field, system, event_name);
+		ref_field = create_var_ref(hist_data, var_field,
+					   system, event_name);
 
 	if (!ref_field)
 		hist_err_event("Couldn't find variable: $",
@@ -2597,8 +2667,6 @@ static struct hist_field *parse_atom(str
 	if (!s) {
 		hist_field = parse_var_ref(hist_data, ref_system, ref_event, ref_var);
 		if (hist_field) {
-			hist_data->var_refs[hist_data->n_var_refs] = hist_field;
-			hist_field->var_ref_idx = hist_data->n_var_refs++;
 			if (var_name) {
 				hist_field = create_alias(hist_data, hist_field, var_name);
 				if (!hist_field) {
@@ -3376,7 +3444,6 @@ static int onmax_create(struct hist_trig
 	unsigned int var_ref_idx = hist_data->n_var_refs;
 	struct field_var *field_var;
 	char *onmax_var_str, *param;
-	unsigned long flags;
 	unsigned int i;
 	int ret = 0;
 
@@ -3393,18 +3460,10 @@ static int onmax_create(struct hist_trig
 		return -EINVAL;
 	}
 
-	flags = HIST_FIELD_FL_VAR_REF;
-	ref_field = create_hist_field(hist_data, NULL, flags, NULL);
+	ref_field = create_var_ref(hist_data, var_field, NULL, NULL);
 	if (!ref_field)
 		return -ENOMEM;
 
-	if (init_var_ref(ref_field, var_field, NULL, NULL)) {
-		destroy_hist_field(ref_field, 0);
-		ret = -ENOMEM;
-		goto out;
-	}
-	hist_data->var_refs[hist_data->n_var_refs] = ref_field;
-	ref_field->var_ref_idx = hist_data->n_var_refs++;
 	data->onmax.var = ref_field;
 
 	data->fn = onmax_save;
@@ -3595,9 +3654,6 @@ static void save_synth_var_ref(struct hi
 			 struct hist_field *var_ref)
 {
 	hist_data->synth_var_refs[hist_data->n_synth_var_refs++] = var_ref;
-
-	hist_data->var_refs[hist_data->n_var_refs] = var_ref;
-	var_ref->var_ref_idx = hist_data->n_var_refs++;
 }
 
 static int check_synth_field(struct synth_event *event,
@@ -3752,7 +3808,8 @@ static int onmatch_create(struct hist_tr
 		}
 
 		if (check_synth_field(event, hist_field, field_pos) == 0) {
-			var_ref = create_var_ref(hist_field, system, event_name);
+			var_ref = create_var_ref(hist_data, hist_field,
+						 system, event_name);
 			if (!var_ref) {
 				kfree(p);
 				ret = -ENOMEM;



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 58/92] tracing: Fix histogram code when expression has same var as value
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (56 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 57/92] tracing: Remove open-coding of hist trigger var_ref management Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 59/92] sd: Fix REQ_OP_ZONE_REPORT completion handling Greg Kroah-Hartman
                   ` (30 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tom Zanuss, Steven Rostedt (VMware)

From: Steven Rostedt (VMware) <rostedt@goodmis.org>

commit 8bcebc77e85f3d7536f96845a0fe94b1dddb6af0 upstream.

While working on a tool to convert SQL syntex into the histogram language of
the kernel, I discovered the following bug:

 # echo 'first u64 start_time u64 end_time pid_t pid u64 delta' >> synthetic_events
 # echo 'hist:keys=pid:start=common_timestamp' > events/sched/sched_waking/trigger
 # echo 'hist:keys=next_pid:delta=common_timestamp-$start,start2=$start:onmatch(sched.sched_waking).trace(first,$start2,common_timestamp,next_pid,$delta)' > events/sched/sched_switch/trigger

Would not display any histograms in the sched_switch histogram side.

But if I were to swap the location of

  "delta=common_timestamp-$start" with "start2=$start"

Such that the last line had:

 # echo 'hist:keys=next_pid:start2=$start,delta=common_timestamp-$start:onmatch(sched.sched_waking).trace(first,$start2,common_timestamp,next_pid,$delta)' > events/sched/sched_switch/trigger

The histogram works as expected.

What I found out is that the expressions clear out the value once it is
resolved. As the variables are resolved in the order listed, when
processing:

  delta=common_timestamp-$start

The $start is cleared. When it gets to "start2=$start", it errors out with
"unresolved symbol" (which is silent as this happens at the location of the
trace), and the histogram is dropped.

When processing the histogram for variable references, instead of adding a
new reference for a variable used twice, use the same reference. That way,
not only is it more efficient, but the order will no longer matter in
processing of the variables.

>From Tom Zanussi:

 "Just to clarify some more about what the problem was is that without
  your patch, we would have two separate references to the same variable,
  and during resolve_var_refs(), they'd both want to be resolved
  separately, so in this case, since the first reference to start wasn't
  part of an expression, it wouldn't get the read-once flag set, so would
  be read normally, and then the second reference would do the read-once
  read and also be read but using read-once.  So everything worked and
  you didn't see a problem:

   from: start2=$start,delta=common_timestamp-$start

  In the second case, when you switched them around, the first reference
  would be resolved by doing the read-once, and following that the second
  reference would try to resolve and see that the variable had already
  been read, so failed as unset, which caused it to short-circuit out and
  not do the trigger action to generate the synthetic event:

   to: delta=common_timestamp-$start,start2=$start

  With your patch, we only have the single resolution which happens
  correctly the one time it's resolved, so this can't happen."

Link: https://lore.kernel.org/r/20200116154216.58ca08eb@gandalf.local.home

Cc: stable@vger.kernel.org
Fixes: 067fe038e70f6 ("tracing: Add variable reference handling to hist triggers")
Reviewed-by: Tom Zanuss <zanussi@kernel.org>
Tested-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/trace/trace_events_hist.c |   22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -49,6 +49,7 @@ struct hist_field {
 	struct ftrace_event_field	*field;
 	unsigned long			flags;
 	hist_field_fn_t			fn;
+	unsigned int			ref;
 	unsigned int			size;
 	unsigned int			offset;
 	unsigned int                    is_signed;
@@ -2225,8 +2226,16 @@ static int contains_operator(char *str)
 	return field_op;
 }
 
+static void get_hist_field(struct hist_field *hist_field)
+{
+	hist_field->ref++;
+}
+
 static void __destroy_hist_field(struct hist_field *hist_field)
 {
+	if (--hist_field->ref > 1)
+		return;
+
 	kfree(hist_field->var.name);
 	kfree(hist_field->name);
 	kfree(hist_field->type);
@@ -2268,6 +2277,8 @@ static struct hist_field *create_hist_fi
 	if (!hist_field)
 		return NULL;
 
+	hist_field->ref = 1;
+
 	hist_field->hist_data = hist_data;
 
 	if (flags & HIST_FIELD_FL_EXPR || flags & HIST_FIELD_FL_ALIAS)
@@ -2463,6 +2474,17 @@ static struct hist_field *create_var_ref
 {
 	unsigned long flags = HIST_FIELD_FL_VAR_REF;
 	struct hist_field *ref_field;
+	int i;
+
+	/* Check if the variable already exists */
+	for (i = 0; i < hist_data->n_var_refs; i++) {
+		ref_field = hist_data->var_refs[i];
+		if (ref_field->var.idx == var_field->var.idx &&
+		    ref_field->var.hist_data == var_field->hist_data) {
+			get_hist_field(ref_field);
+			return ref_field;
+		}
+	}
 
 	ref_field = create_hist_field(var_field->hist_data, NULL, flags, NULL);
 	if (ref_field) {



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 59/92] sd: Fix REQ_OP_ZONE_REPORT completion handling
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (57 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 58/92] tracing: Fix histogram code when expression has same var as value Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 18:02   ` Pavel Machek
  2020-01-28 14:08 ` [PATCH 4.19 60/92] crypto: geode-aes - switch to skcipher for cbc(aes) fallback Greg Kroah-Hartman
                   ` (29 subsequent siblings)
  88 siblings, 1 reply; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Damien Le Moal, Masato Suzuki,
	Martin K. Petersen

From: Masato Suzuki <masato.suzuki@wdc.com>


ZBC/ZAC report zones command may return less bytes than requested if the
number of matching zones for the report request is small. However, unlike
read or write commands, the remainder of incomplete report zones commands
cannot be automatically requested by the block layer: the start sector of
the next report cannot be known, and the report reply may not be 512B
aligned for SAS drives (a report zone reply size is always a multiple of
64B). The regular request completion code executing bio_advance() and
restart of the command remainder part currently causes invalid zone
descriptor data to be reported to the caller if the report zone size is
smaller than 512B (a case that can happen easily for a report of the last
zones of a SAS drive for example).

Since blkdev_report_zones() handles report zone command processing in a
loop until completion (no more zones are being reported), we can safely
avoid that the block layer performs an incorrect bio_advance() call and
restart of the remainder of incomplete report zone BIOs. To do so, always
indicate a full completion of REQ_OP_ZONE_REPORT by setting good_bytes to
the request buffer size and by setting the command resid to 0. This does
not affect the post processing of the report zone reply done by
sd_zbc_complete() since the reply header indicates the number of zones
reported.

Fixes: 89d947561077 ("sd: Implement support for ZBC devices")
Cc: <stable@vger.kernel.org> # 4.19
Cc: <stable@vger.kernel.org> # 4.14
Signed-off-by: Masato Suzuki <masato.suzuki@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/scsi/sd.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1969,9 +1969,13 @@ static int sd_done(struct scsi_cmnd *SCp
 		}
 		break;
 	case REQ_OP_ZONE_REPORT:
+		/* To avoid that the block layer performs an incorrect
+		 * bio_advance() call and restart of the remainder of
+		 * incomplete report zone BIOs, always indicate a full
+		 * completion of REQ_OP_ZONE_REPORT.
+		 */
 		if (!result) {
-			good_bytes = scsi_bufflen(SCpnt)
-				- scsi_get_resid(SCpnt);
+			good_bytes = scsi_bufflen(SCpnt);
 			scsi_set_resid(SCpnt, 0);
 		} else {
 			good_bytes = 0;



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 60/92] crypto: geode-aes - switch to skcipher for cbc(aes) fallback
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (58 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 59/92] sd: Fix REQ_OP_ZONE_REPORT completion handling Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 61/92] coresight: etb10: Do not call smp_processor_id from preemptible Greg Kroah-Hartman
                   ` (28 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Ard Biesheuvel, Florian Bezdeka, Herbert Xu

From: Ard Biesheuvel <ard.biesheuvel@linaro.org>

commit 504582e8e40b90b8f8c58783e2d1e4f6a2b71a3a upstream.

Commit 79c65d179a40e145 ("crypto: cbc - Convert to skcipher") updated
the generic CBC template wrapper from a blkcipher to a skcipher algo,
to get away from the deprecated blkcipher interface. However, as a side
effect, drivers that instantiate CBC transforms using the blkcipher as
a fallback no longer work, since skciphers can wrap blkciphers but not
the other way around. This broke the geode-aes driver.

So let's fix it by moving to the sync skcipher interface when allocating
the fallback. At the same time, align with the generic API for ECB and
CBC by rejecting inputs that are not a multiple of the AES block size.

Fixes: 79c65d179a40e145 ("crypto: cbc - Convert to skcipher")
Cc: <stable@vger.kernel.org> # v4.20+ ONLY
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Florian Bezdeka <florian@bezdeka.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Florian Bezdeka <florian@bezdeka.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/crypto/geode-aes.c |   57 ++++++++++++++++++++++++++-------------------
 drivers/crypto/geode-aes.h |    2 -
 2 files changed, 35 insertions(+), 24 deletions(-)

--- a/drivers/crypto/geode-aes.c
+++ b/drivers/crypto/geode-aes.c
@@ -14,6 +14,7 @@
 #include <linux/spinlock.h>
 #include <crypto/algapi.h>
 #include <crypto/aes.h>
+#include <crypto/skcipher.h>
 
 #include <linux/io.h>
 #include <linux/delay.h>
@@ -170,13 +171,15 @@ static int geode_setkey_blk(struct crypt
 	/*
 	 * The requested key size is not supported by HW, do a fallback
 	 */
-	op->fallback.blk->base.crt_flags &= ~CRYPTO_TFM_REQ_MASK;
-	op->fallback.blk->base.crt_flags |= (tfm->crt_flags & CRYPTO_TFM_REQ_MASK);
+	crypto_skcipher_clear_flags(op->fallback.blk, CRYPTO_TFM_REQ_MASK);
+	crypto_skcipher_set_flags(op->fallback.blk,
+				  tfm->crt_flags & CRYPTO_TFM_REQ_MASK);
 
-	ret = crypto_blkcipher_setkey(op->fallback.blk, key, len);
+	ret = crypto_skcipher_setkey(op->fallback.blk, key, len);
 	if (ret) {
 		tfm->crt_flags &= ~CRYPTO_TFM_RES_MASK;
-		tfm->crt_flags |= (op->fallback.blk->base.crt_flags & CRYPTO_TFM_RES_MASK);
+		tfm->crt_flags |= crypto_skcipher_get_flags(op->fallback.blk) &
+				  CRYPTO_TFM_RES_MASK;
 	}
 	return ret;
 }
@@ -185,33 +188,28 @@ static int fallback_blk_dec(struct blkci
 		struct scatterlist *dst, struct scatterlist *src,
 		unsigned int nbytes)
 {
-	unsigned int ret;
-	struct crypto_blkcipher *tfm;
 	struct geode_aes_op *op = crypto_blkcipher_ctx(desc->tfm);
+	SKCIPHER_REQUEST_ON_STACK(req, op->fallback.blk);
 
-	tfm = desc->tfm;
-	desc->tfm = op->fallback.blk;
-
-	ret = crypto_blkcipher_decrypt_iv(desc, dst, src, nbytes);
+	skcipher_request_set_tfm(req, op->fallback.blk);
+	skcipher_request_set_callback(req, 0, NULL, NULL);
+	skcipher_request_set_crypt(req, src, dst, nbytes, desc->info);
 
-	desc->tfm = tfm;
-	return ret;
+	return crypto_skcipher_decrypt(req);
 }
+
 static int fallback_blk_enc(struct blkcipher_desc *desc,
 		struct scatterlist *dst, struct scatterlist *src,
 		unsigned int nbytes)
 {
-	unsigned int ret;
-	struct crypto_blkcipher *tfm;
 	struct geode_aes_op *op = crypto_blkcipher_ctx(desc->tfm);
+	SKCIPHER_REQUEST_ON_STACK(req, op->fallback.blk);
 
-	tfm = desc->tfm;
-	desc->tfm = op->fallback.blk;
-
-	ret = crypto_blkcipher_encrypt_iv(desc, dst, src, nbytes);
+	skcipher_request_set_tfm(req, op->fallback.blk);
+	skcipher_request_set_callback(req, 0, NULL, NULL);
+	skcipher_request_set_crypt(req, src, dst, nbytes, desc->info);
 
-	desc->tfm = tfm;
-	return ret;
+	return crypto_skcipher_encrypt(req);
 }
 
 static void
@@ -311,6 +309,9 @@ geode_cbc_decrypt(struct blkcipher_desc
 	struct blkcipher_walk walk;
 	int err, ret;
 
+	if (nbytes % AES_BLOCK_SIZE)
+		return -EINVAL;
+
 	if (unlikely(op->keylen != AES_KEYSIZE_128))
 		return fallback_blk_dec(desc, dst, src, nbytes);
 
@@ -343,6 +344,9 @@ geode_cbc_encrypt(struct blkcipher_desc
 	struct blkcipher_walk walk;
 	int err, ret;
 
+	if (nbytes % AES_BLOCK_SIZE)
+		return -EINVAL;
+
 	if (unlikely(op->keylen != AES_KEYSIZE_128))
 		return fallback_blk_enc(desc, dst, src, nbytes);
 
@@ -370,8 +374,9 @@ static int fallback_init_blk(struct cryp
 	const char *name = crypto_tfm_alg_name(tfm);
 	struct geode_aes_op *op = crypto_tfm_ctx(tfm);
 
-	op->fallback.blk = crypto_alloc_blkcipher(name, 0,
-			CRYPTO_ALG_ASYNC | CRYPTO_ALG_NEED_FALLBACK);
+	op->fallback.blk = crypto_alloc_skcipher(name, 0,
+						 CRYPTO_ALG_ASYNC |
+						 CRYPTO_ALG_NEED_FALLBACK);
 
 	if (IS_ERR(op->fallback.blk)) {
 		printk(KERN_ERR "Error allocating fallback algo %s\n", name);
@@ -385,7 +390,7 @@ static void fallback_exit_blk(struct cry
 {
 	struct geode_aes_op *op = crypto_tfm_ctx(tfm);
 
-	crypto_free_blkcipher(op->fallback.blk);
+	crypto_free_skcipher(op->fallback.blk);
 	op->fallback.blk = NULL;
 }
 
@@ -424,6 +429,9 @@ geode_ecb_decrypt(struct blkcipher_desc
 	struct blkcipher_walk walk;
 	int err, ret;
 
+	if (nbytes % AES_BLOCK_SIZE)
+		return -EINVAL;
+
 	if (unlikely(op->keylen != AES_KEYSIZE_128))
 		return fallback_blk_dec(desc, dst, src, nbytes);
 
@@ -454,6 +462,9 @@ geode_ecb_encrypt(struct blkcipher_desc
 	struct blkcipher_walk walk;
 	int err, ret;
 
+	if (nbytes % AES_BLOCK_SIZE)
+		return -EINVAL;
+
 	if (unlikely(op->keylen != AES_KEYSIZE_128))
 		return fallback_blk_enc(desc, dst, src, nbytes);
 
--- a/drivers/crypto/geode-aes.h
+++ b/drivers/crypto/geode-aes.h
@@ -64,7 +64,7 @@ struct geode_aes_op {
 	u8 *iv;
 
 	union {
-		struct crypto_blkcipher *blk;
+		struct crypto_skcipher *blk;
 		struct crypto_cipher *cip;
 	} fallback;
 	u32 keylen;



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 61/92] coresight: etb10: Do not call smp_processor_id from preemptible
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (59 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 60/92] crypto: geode-aes - switch to skcipher for cbc(aes) fallback Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 62/92] coresight: tmc-etf: " Greg Kroah-Hartman
                   ` (27 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mathieu Poirier, Suzuki K Poulose

From: Suzuki K Poulose <suzuki.poulose@arm.com>

commit 730766bae3280a25d40ea76a53dc6342e84e6513 upstream.

During a perf session we try to allocate buffers on the "node" associated
with the CPU the event is bound to. If it is not bound to a CPU, we
use the current CPU node, using smp_processor_id(). However this is unsafe
in a pre-emptible context and could generate the splats as below :

 BUG: using smp_processor_id() in preemptible [00000000] code: perf/2544

Use NUMA_NO_NODE hint instead of using the current node for events
not bound to CPUs.

Fixes: 2997aa4063d97fdb39 ("coresight: etb10: implementing AUX API")
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: stable <stable@vger.kernel.org> # 4.6+
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20190620221237.3536-5-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


---
 drivers/hwtracing/coresight/coresight-etb10.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -275,9 +275,7 @@ static void *etb_alloc_buffer(struct cor
 	int node;
 	struct cs_buffers *buf;
 
-	if (cpu == -1)
-		cpu = smp_processor_id();
-	node = cpu_to_node(cpu);
+	node = (cpu == -1) ? NUMA_NO_NODE : cpu_to_node(cpu);
 
 	buf = kzalloc_node(sizeof(struct cs_buffers), GFP_KERNEL, node);
 	if (!buf)



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 62/92] coresight: tmc-etf: Do not call smp_processor_id from preemptible
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (60 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 61/92] coresight: etb10: Do not call smp_processor_id from preemptible Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 63/92] libertas: Fix two buffer overflows at parsing bss descriptor Greg Kroah-Hartman
                   ` (26 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Mathieu Poirier, Suzuki K Poulose

From: Suzuki K Poulose <suzuki.poulose@arm.com>

commit 024c1fd9dbcc1d8a847f1311f999d35783921b7f upstream.

During a perf session we try to allocate buffers on the "node" associated
with the CPU the event is bound to. If it is not bound to a CPU, we
use the current CPU node, using smp_processor_id(). However this is unsafe
in a pre-emptible context and could generate the splats as below :

 BUG: using smp_processor_id() in preemptible [00000000] code: perf/2544
 caller is tmc_alloc_etf_buffer+0x5c/0x60
 CPU: 2 PID: 2544 Comm: perf Not tainted 5.1.0-rc6-147786-g116841e #344
 Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Feb  1 2019
 Call trace:
  dump_backtrace+0x0/0x150
  show_stack+0x14/0x20
  dump_stack+0x9c/0xc4
  debug_smp_processor_id+0x10c/0x110
  tmc_alloc_etf_buffer+0x5c/0x60
  etm_setup_aux+0x1c4/0x230
  rb_alloc_aux+0x1b8/0x2b8
  perf_mmap+0x35c/0x478
  mmap_region+0x34c/0x4f0
  do_mmap+0x2d8/0x418
  vm_mmap_pgoff+0xd0/0xf8
  ksys_mmap_pgoff+0x88/0xf8
  __arm64_sys_mmap+0x28/0x38
  el0_svc_handler+0xd8/0x138
  el0_svc+0x8/0xc

Use NUMA_NO_NODE hint instead of using the current node for events
not bound to CPUs.

Fixes: 2e499bbc1a929ac ("coresight: tmc: implementing TMC-ETF AUX space API")
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: stable <stable@vger.kernel.org> # 4.7+
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20190620221237.3536-4-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/hwtracing/coresight/coresight-tmc-etf.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -304,9 +304,7 @@ static void *tmc_alloc_etf_buffer(struct
 	int node;
 	struct cs_buffers *buf;
 
-	if (cpu == -1)
-		cpu = smp_processor_id();
-	node = cpu_to_node(cpu);
+	node = (cpu == -1) ? NUMA_NO_NODE : cpu_to_node(cpu);
 
 	/* Allocate memory structure for interaction with Perf */
 	buf = kzalloc_node(sizeof(struct cs_buffers), GFP_KERNEL, node);



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 63/92] libertas: Fix two buffer overflows at parsing bss descriptor
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (61 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 62/92] coresight: tmc-etf: " Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-29 22:47   ` Pavel Machek
  2020-01-28 14:08 ` [PATCH 4.19 64/92] media: v4l2-ioctl.c: zero reserved fields for S/TRY_FMT Greg Kroah-Hartman
                   ` (25 subsequent siblings)
  88 siblings, 1 reply; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, kbuild test robot, Wen Huang, Kalle Valo

From: Wen Huang <huangwenabc@gmail.com>

commit e5e884b42639c74b5b57dc277909915c0aefc8bb upstream.

add_ie_rates() copys rates without checking the length
in bss descriptor from remote AP.when victim connects to
remote attacker, this may trigger buffer overflow.
lbs_ibss_join_existing() copys rates without checking the length
in bss descriptor from remote IBSS node.when victim connects to
remote attacker, this may trigger buffer overflow.
Fix them by putting the length check before performing copy.

This fix addresses CVE-2019-14896 and CVE-2019-14897.
This also fix build warning of mixed declarations and code.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Wen Huang <huangwenabc@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/wireless/marvell/libertas/cfg.c |   16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

--- a/drivers/net/wireless/marvell/libertas/cfg.c
+++ b/drivers/net/wireless/marvell/libertas/cfg.c
@@ -273,6 +273,10 @@ add_ie_rates(u8 *tlv, const u8 *ie, int
 	int hw, ap, ap_max = ie[1];
 	u8 hw_rate;
 
+	if (ap_max > MAX_RATES) {
+		lbs_deb_assoc("invalid rates\n");
+		return tlv;
+	}
 	/* Advance past IE header */
 	ie += 2;
 
@@ -1717,6 +1721,9 @@ static int lbs_ibss_join_existing(struct
 	struct cmd_ds_802_11_ad_hoc_join cmd;
 	u8 preamble = RADIO_PREAMBLE_SHORT;
 	int ret = 0;
+	int hw, i;
+	u8 rates_max;
+	u8 *rates;
 
 	/* TODO: set preamble based on scan result */
 	ret = lbs_set_radio(priv, preamble, 1);
@@ -1775,9 +1782,12 @@ static int lbs_ibss_join_existing(struct
 	if (!rates_eid) {
 		lbs_add_rates(cmd.bss.rates);
 	} else {
-		int hw, i;
-		u8 rates_max = rates_eid[1];
-		u8 *rates = cmd.bss.rates;
+		rates_max = rates_eid[1];
+		if (rates_max > MAX_RATES) {
+			lbs_deb_join("invalid rates");
+			goto out;
+		}
+		rates = cmd.bss.rates;
 		for (hw = 0; hw < ARRAY_SIZE(lbs_rates); hw++) {
 			u8 hw_rate = lbs_rates[hw].bitrate / 5;
 			for (i = 0; i < rates_max; i++) {



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 64/92] media: v4l2-ioctl.c: zero reserved fields for S/TRY_FMT
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (62 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 63/92] libertas: Fix two buffer overflows at parsing bss descriptor Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 65/92] scsi: iscsi: Avoid potential deadlock in iscsi_if_rx func Greg Kroah-Hartman
                   ` (24 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Hans Verkuil, Mauro Carvalho Chehab

From: Hans Verkuil <hverkuil-cisco@xs4all.nl>

commit ee8951e56c0f960b9621636603a822811cef3158 upstream.

v4l2_vbi_format, v4l2_sliced_vbi_format and v4l2_sdr_format
have a reserved array at the end that should be zeroed by drivers
as per the V4L2 spec. Older drivers often do not do this, so just
handle this in the core.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/media/v4l2-core/v4l2-ioctl.c |   24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

--- a/drivers/media/v4l2-core/v4l2-ioctl.c
+++ b/drivers/media/v4l2-core/v4l2-ioctl.c
@@ -1548,12 +1548,12 @@ static int v4l_s_fmt(const struct v4l2_i
 	case V4L2_BUF_TYPE_VBI_CAPTURE:
 		if (unlikely(!ops->vidioc_s_fmt_vbi_cap))
 			break;
-		CLEAR_AFTER_FIELD(p, fmt.vbi);
+		CLEAR_AFTER_FIELD(p, fmt.vbi.flags);
 		return ops->vidioc_s_fmt_vbi_cap(file, fh, arg);
 	case V4L2_BUF_TYPE_SLICED_VBI_CAPTURE:
 		if (unlikely(!ops->vidioc_s_fmt_sliced_vbi_cap))
 			break;
-		CLEAR_AFTER_FIELD(p, fmt.sliced);
+		CLEAR_AFTER_FIELD(p, fmt.sliced.io_size);
 		return ops->vidioc_s_fmt_sliced_vbi_cap(file, fh, arg);
 	case V4L2_BUF_TYPE_VIDEO_OUTPUT:
 		if (unlikely(!ops->vidioc_s_fmt_vid_out))
@@ -1576,22 +1576,22 @@ static int v4l_s_fmt(const struct v4l2_i
 	case V4L2_BUF_TYPE_VBI_OUTPUT:
 		if (unlikely(!ops->vidioc_s_fmt_vbi_out))
 			break;
-		CLEAR_AFTER_FIELD(p, fmt.vbi);
+		CLEAR_AFTER_FIELD(p, fmt.vbi.flags);
 		return ops->vidioc_s_fmt_vbi_out(file, fh, arg);
 	case V4L2_BUF_TYPE_SLICED_VBI_OUTPUT:
 		if (unlikely(!ops->vidioc_s_fmt_sliced_vbi_out))
 			break;
-		CLEAR_AFTER_FIELD(p, fmt.sliced);
+		CLEAR_AFTER_FIELD(p, fmt.sliced.io_size);
 		return ops->vidioc_s_fmt_sliced_vbi_out(file, fh, arg);
 	case V4L2_BUF_TYPE_SDR_CAPTURE:
 		if (unlikely(!ops->vidioc_s_fmt_sdr_cap))
 			break;
-		CLEAR_AFTER_FIELD(p, fmt.sdr);
+		CLEAR_AFTER_FIELD(p, fmt.sdr.buffersize);
 		return ops->vidioc_s_fmt_sdr_cap(file, fh, arg);
 	case V4L2_BUF_TYPE_SDR_OUTPUT:
 		if (unlikely(!ops->vidioc_s_fmt_sdr_out))
 			break;
-		CLEAR_AFTER_FIELD(p, fmt.sdr);
+		CLEAR_AFTER_FIELD(p, fmt.sdr.buffersize);
 		return ops->vidioc_s_fmt_sdr_out(file, fh, arg);
 	case V4L2_BUF_TYPE_META_CAPTURE:
 		if (unlikely(!ops->vidioc_s_fmt_meta_cap))
@@ -1635,12 +1635,12 @@ static int v4l_try_fmt(const struct v4l2
 	case V4L2_BUF_TYPE_VBI_CAPTURE:
 		if (unlikely(!ops->vidioc_try_fmt_vbi_cap))
 			break;
-		CLEAR_AFTER_FIELD(p, fmt.vbi);
+		CLEAR_AFTER_FIELD(p, fmt.vbi.flags);
 		return ops->vidioc_try_fmt_vbi_cap(file, fh, arg);
 	case V4L2_BUF_TYPE_SLICED_VBI_CAPTURE:
 		if (unlikely(!ops->vidioc_try_fmt_sliced_vbi_cap))
 			break;
-		CLEAR_AFTER_FIELD(p, fmt.sliced);
+		CLEAR_AFTER_FIELD(p, fmt.sliced.io_size);
 		return ops->vidioc_try_fmt_sliced_vbi_cap(file, fh, arg);
 	case V4L2_BUF_TYPE_VIDEO_OUTPUT:
 		if (unlikely(!ops->vidioc_try_fmt_vid_out))
@@ -1663,22 +1663,22 @@ static int v4l_try_fmt(const struct v4l2
 	case V4L2_BUF_TYPE_VBI_OUTPUT:
 		if (unlikely(!ops->vidioc_try_fmt_vbi_out))
 			break;
-		CLEAR_AFTER_FIELD(p, fmt.vbi);
+		CLEAR_AFTER_FIELD(p, fmt.vbi.flags);
 		return ops->vidioc_try_fmt_vbi_out(file, fh, arg);
 	case V4L2_BUF_TYPE_SLICED_VBI_OUTPUT:
 		if (unlikely(!ops->vidioc_try_fmt_sliced_vbi_out))
 			break;
-		CLEAR_AFTER_FIELD(p, fmt.sliced);
+		CLEAR_AFTER_FIELD(p, fmt.sliced.io_size);
 		return ops->vidioc_try_fmt_sliced_vbi_out(file, fh, arg);
 	case V4L2_BUF_TYPE_SDR_CAPTURE:
 		if (unlikely(!ops->vidioc_try_fmt_sdr_cap))
 			break;
-		CLEAR_AFTER_FIELD(p, fmt.sdr);
+		CLEAR_AFTER_FIELD(p, fmt.sdr.buffersize);
 		return ops->vidioc_try_fmt_sdr_cap(file, fh, arg);
 	case V4L2_BUF_TYPE_SDR_OUTPUT:
 		if (unlikely(!ops->vidioc_try_fmt_sdr_out))
 			break;
-		CLEAR_AFTER_FIELD(p, fmt.sdr);
+		CLEAR_AFTER_FIELD(p, fmt.sdr.buffersize);
 		return ops->vidioc_try_fmt_sdr_out(file, fh, arg);
 	case V4L2_BUF_TYPE_META_CAPTURE:
 		if (unlikely(!ops->vidioc_try_fmt_meta_cap))



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 65/92] scsi: iscsi: Avoid potential deadlock in iscsi_if_rx func
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (63 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 64/92] media: v4l2-ioctl.c: zero reserved fields for S/TRY_FMT Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 66/92] netfilter: ipset: use bitmap infrastructure completely Greg Kroah-Hartman
                   ` (23 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Bo Wu, Zhiqiang Liu, Lee Duncan,
	Martin K. Petersen

From: Bo Wu <wubo40@huawei.com>

commit bba340c79bfe3644829db5c852fdfa9e33837d6d upstream.

In iscsi_if_rx func, after receiving one request through
iscsi_if_recv_msg func, iscsi_if_send_reply will be called to try to
reply to the request in a do-while loop.  If the iscsi_if_send_reply
function keeps returning -EAGAIN, a deadlock will occur.

For example, a client only send msg without calling recvmsg func, then
it will result in the watchdog soft lockup.  The details are given as
follows:

	sock_fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ISCSI);
	retval = bind(sock_fd, (struct sock addr*) & src_addr, sizeof(src_addr);
	while (1) {
		state_msg = sendmsg(sock_fd, &msg, 0);
		//Note: recvmsg(sock_fd, &msg, 0) is not processed here.
	}
	close(sock_fd);

watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [netlink_test:253305] Sample time: 4000897528 ns(HZ: 250) Sample stat:
curr: user: 675503481560, nice: 321724050, sys: 448689506750, idle: 4654054240530, iowait: 40885550700, irq: 14161174020, softirq: 8104324140, st: 0
deta: user: 0, nice: 0, sys: 3998210100, idle: 0, iowait: 0, irq: 1547170, softirq: 242870, st: 0 Sample softirq:
         TIMER:        992
         SCHED:          8
Sample irqstat:
         irq    2: delta       1003, curr:    3103802, arch_timer
CPU: 7 PID: 253305 Comm: netlink_test Kdump: loaded Tainted: G           OE
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 40400005 (nZcv daif +PAN -UAO)
pc : __alloc_skb+0x104/0x1b0
lr : __alloc_skb+0x9c/0x1b0
sp : ffff000033603a30
x29: ffff000033603a30 x28: 00000000000002dd
x27: ffff800b34ced810 x26: ffff800ba7569f00
x25: 00000000ffffffff x24: 0000000000000000
x23: ffff800f7c43f600 x22: 0000000000480020
x21: ffff0000091d9000 x20: ffff800b34eff200
x19: ffff800ba7569f00 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 0001000101000100
x13: 0000000101010000 x12: 0101000001010100
x11: 0001010101010001 x10: 00000000000002dd
x9 : ffff000033603d58 x8 : ffff800b34eff400
x7 : ffff800ba7569200 x6 : ffff800b34eff400
x5 : 0000000000000000 x4 : 00000000ffffffff
x3 : 0000000000000000 x2 : 0000000000000001
x1 : ffff800b34eff2c0 x0 : 0000000000000300 Call trace:
__alloc_skb+0x104/0x1b0
iscsi_if_rx+0x144/0x12bc [scsi_transport_iscsi]
netlink_unicast+0x1e0/0x258
netlink_sendmsg+0x310/0x378
sock_sendmsg+0x4c/0x70
sock_write_iter+0x90/0xf0
__vfs_write+0x11c/0x190
vfs_write+0xac/0x1c0
ksys_write+0x6c/0xd8
__arm64_sys_write+0x24/0x30
el0_svc_common+0x78/0x130
el0_svc_handler+0x38/0x78
el0_svc+0x8/0xc

Link: https://lore.kernel.org/r/EDBAAA0BBBA2AC4E9C8B6B81DEEE1D6915E3D4D2@dggeml505-mbx.china.huawei.com
Signed-off-by: Bo Wu <wubo40@huawei.com>
Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/scsi/scsi_transport_iscsi.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -37,6 +37,8 @@
 
 #define ISCSI_TRANSPORT_VERSION "2.0-870"
 
+#define ISCSI_SEND_MAX_ALLOWED  10
+
 static int dbg_session;
 module_param_named(debug_session, dbg_session, int,
 		   S_IRUGO | S_IWUSR);
@@ -3680,6 +3682,7 @@ iscsi_if_rx(struct sk_buff *skb)
 		struct nlmsghdr	*nlh;
 		struct iscsi_uevent *ev;
 		uint32_t group;
+		int retries = ISCSI_SEND_MAX_ALLOWED;
 
 		nlh = nlmsg_hdr(skb);
 		if (nlh->nlmsg_len < sizeof(*nlh) + sizeof(*ev) ||
@@ -3710,6 +3713,10 @@ iscsi_if_rx(struct sk_buff *skb)
 				break;
 			err = iscsi_if_send_reply(portid, nlh->nlmsg_type,
 						  ev, sizeof(*ev));
+			if (err == -EAGAIN && --retries < 0) {
+				printk(KERN_WARNING "Send reply failed, error %d\n", err);
+				break;
+			}
 		} while (err < 0 && err != -ECONNREFUSED && err != -ESRCH);
 		skb_pull(skb, rlen);
 	}



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 66/92] netfilter: ipset: use bitmap infrastructure completely
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (64 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 65/92] scsi: iscsi: Avoid potential deadlock in iscsi_if_rx func Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 67/92] netfilter: nf_tables: add __nft_chain_type_get() Greg Kroah-Hartman
                   ` (22 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+fabca5cbf5e54f3fe2de,
	syzbot+827ced406c9a1d9570ed, syzbot+190d63957b22ef673ea5,
	syzbot+dfccdb2bdb4a12ad425e, syzbot+df0d0f5895ef1f41a65b,
	syzbot+b08bd19bb37513357fd4, syzbot+53cdd0ec0bbabd53370a,
	Jozsef Kadlecsik, Pablo Neira Ayuso

From: Kadlecsik József <kadlec@blackhole.kfki.hu>

commit 32c72165dbd0e246e69d16a3ad348a4851afd415 upstream.

The bitmap allocation did not use full unsigned long sizes
when calculating the required size and that was triggered by KASAN
as slab-out-of-bounds read in several places. The patch fixes all
of them.

Reported-by: syzbot+fabca5cbf5e54f3fe2de@syzkaller.appspotmail.com
Reported-by: syzbot+827ced406c9a1d9570ed@syzkaller.appspotmail.com
Reported-by: syzbot+190d63957b22ef673ea5@syzkaller.appspotmail.com
Reported-by: syzbot+dfccdb2bdb4a12ad425e@syzkaller.appspotmail.com
Reported-by: syzbot+df0d0f5895ef1f41a65b@syzkaller.appspotmail.com
Reported-by: syzbot+b08bd19bb37513357fd4@syzkaller.appspotmail.com
Reported-by: syzbot+53cdd0ec0bbabd53370a@syzkaller.appspotmail.com
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/linux/netfilter/ipset/ip_set.h    |    7 -------
 net/netfilter/ipset/ip_set_bitmap_gen.h   |    2 +-
 net/netfilter/ipset/ip_set_bitmap_ip.c    |    6 +++---
 net/netfilter/ipset/ip_set_bitmap_ipmac.c |    6 +++---
 net/netfilter/ipset/ip_set_bitmap_port.c  |    6 +++---
 5 files changed, 10 insertions(+), 17 deletions(-)

--- a/include/linux/netfilter/ipset/ip_set.h
+++ b/include/linux/netfilter/ipset/ip_set.h
@@ -451,13 +451,6 @@ ip6addrptr(const struct sk_buff *skb, bo
 	       sizeof(*addr));
 }
 
-/* Calculate the bytes required to store the inclusive range of a-b */
-static inline int
-bitmap_bytes(u32 a, u32 b)
-{
-	return 4 * ((((b - a + 8) / 8) + 3) / 4);
-}
-
 #include <linux/netfilter/ipset/ip_set_timeout.h>
 #include <linux/netfilter/ipset/ip_set_comment.h>
 #include <linux/netfilter/ipset/ip_set_counter.h>
--- a/net/netfilter/ipset/ip_set_bitmap_gen.h
+++ b/net/netfilter/ipset/ip_set_bitmap_gen.h
@@ -79,7 +79,7 @@ mtype_flush(struct ip_set *set)
 
 	if (set->extensions & IPSET_EXT_DESTROY)
 		mtype_ext_cleanup(set);
-	memset(map->members, 0, map->memsize);
+	bitmap_zero(map->members, map->elements);
 	set->elements = 0;
 	set->ext_size = 0;
 }
--- a/net/netfilter/ipset/ip_set_bitmap_ip.c
+++ b/net/netfilter/ipset/ip_set_bitmap_ip.c
@@ -40,7 +40,7 @@ MODULE_ALIAS("ip_set_bitmap:ip");
 
 /* Type structure */
 struct bitmap_ip {
-	void *members;		/* the set members */
+	unsigned long *members;	/* the set members */
 	u32 first_ip;		/* host byte order, included in range */
 	u32 last_ip;		/* host byte order, included in range */
 	u32 elements;		/* number of max elements in the set */
@@ -223,7 +223,7 @@ init_map_ip(struct ip_set *set, struct b
 	    u32 first_ip, u32 last_ip,
 	    u32 elements, u32 hosts, u8 netmask)
 {
-	map->members = ip_set_alloc(map->memsize);
+	map->members = bitmap_zalloc(elements, GFP_KERNEL | __GFP_NOWARN);
 	if (!map->members)
 		return false;
 	map->first_ip = first_ip;
@@ -313,7 +313,7 @@ bitmap_ip_create(struct net *net, struct
 	if (!map)
 		return -ENOMEM;
 
-	map->memsize = bitmap_bytes(0, elements - 1);
+	map->memsize = BITS_TO_LONGS(elements) * sizeof(unsigned long);
 	set->variant = &bitmap_ip;
 	if (!init_map_ip(set, map, first_ip, last_ip,
 			 elements, hosts, netmask)) {
--- a/net/netfilter/ipset/ip_set_bitmap_ipmac.c
+++ b/net/netfilter/ipset/ip_set_bitmap_ipmac.c
@@ -46,7 +46,7 @@ enum {
 
 /* Type structure */
 struct bitmap_ipmac {
-	void *members;		/* the set members */
+	unsigned long *members;	/* the set members */
 	u32 first_ip;		/* host byte order, included in range */
 	u32 last_ip;		/* host byte order, included in range */
 	u32 elements;		/* number of max elements in the set */
@@ -303,7 +303,7 @@ static bool
 init_map_ipmac(struct ip_set *set, struct bitmap_ipmac *map,
 	       u32 first_ip, u32 last_ip, u32 elements)
 {
-	map->members = ip_set_alloc(map->memsize);
+	map->members = bitmap_zalloc(elements, GFP_KERNEL | __GFP_NOWARN);
 	if (!map->members)
 		return false;
 	map->first_ip = first_ip;
@@ -364,7 +364,7 @@ bitmap_ipmac_create(struct net *net, str
 	if (!map)
 		return -ENOMEM;
 
-	map->memsize = bitmap_bytes(0, elements - 1);
+	map->memsize = BITS_TO_LONGS(elements) * sizeof(unsigned long);
 	set->variant = &bitmap_ipmac;
 	if (!init_map_ipmac(set, map, first_ip, last_ip, elements)) {
 		kfree(map);
--- a/net/netfilter/ipset/ip_set_bitmap_port.c
+++ b/net/netfilter/ipset/ip_set_bitmap_port.c
@@ -34,7 +34,7 @@ MODULE_ALIAS("ip_set_bitmap:port");
 
 /* Type structure */
 struct bitmap_port {
-	void *members;		/* the set members */
+	unsigned long *members;	/* the set members */
 	u16 first_port;		/* host byte order, included in range */
 	u16 last_port;		/* host byte order, included in range */
 	u32 elements;		/* number of max elements in the set */
@@ -208,7 +208,7 @@ static bool
 init_map_port(struct ip_set *set, struct bitmap_port *map,
 	      u16 first_port, u16 last_port)
 {
-	map->members = ip_set_alloc(map->memsize);
+	map->members = bitmap_zalloc(map->elements, GFP_KERNEL | __GFP_NOWARN);
 	if (!map->members)
 		return false;
 	map->first_port = first_port;
@@ -248,7 +248,7 @@ bitmap_port_create(struct net *net, stru
 		return -ENOMEM;
 
 	map->elements = elements;
-	map->memsize = bitmap_bytes(0, map->elements);
+	map->memsize = BITS_TO_LONGS(elements) * sizeof(unsigned long);
 	set->variant = &bitmap_port;
 	if (!init_map_port(set, map, first_port, last_port)) {
 		kfree(map);



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 67/92] netfilter: nf_tables: add __nft_chain_type_get()
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (65 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 66/92] netfilter: ipset: use bitmap infrastructure completely Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 18:13   ` Pavel Machek
  2020-01-28 14:08 ` [PATCH 4.19 68/92] net/x25: fix nonblocking connect Greg Kroah-Hartman
                   ` (21 subsequent siblings)
  88 siblings, 1 reply; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+156a04714799b1d480bc,
	Pablo Neira Ayuso

From: Pablo Neira Ayuso <pablo@netfilter.org>

commit 826035498ec14b77b62a44f0cb6b94d45530db6f upstream.

This new helper function validates that unknown family and chain type
coming from userspace do not trigger an out-of-bound array access. Bail
out in case __nft_chain_type_get() returns NULL from
nft_chain_parse_hook().

Fixes: 9370761c56b6 ("netfilter: nf_tables: convert built-in tables/chains to chain types")
Reported-by: syzbot+156a04714799b1d480bc@syzkaller.appspotmail.com
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/netfilter/nf_tables_api.c |   29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -472,14 +472,27 @@ static inline u64 nf_tables_alloc_handle
 static const struct nft_chain_type *chain_type[NFPROTO_NUMPROTO][NFT_CHAIN_T_MAX];
 
 static const struct nft_chain_type *
+__nft_chain_type_get(u8 family, enum nft_chain_types type)
+{
+	if (family >= NFPROTO_NUMPROTO ||
+	    type >= NFT_CHAIN_T_MAX)
+		return NULL;
+
+	return chain_type[family][type];
+}
+
+static const struct nft_chain_type *
 __nf_tables_chain_type_lookup(const struct nlattr *nla, u8 family)
 {
+	const struct nft_chain_type *type;
 	int i;
 
 	for (i = 0; i < NFT_CHAIN_T_MAX; i++) {
-		if (chain_type[family][i] != NULL &&
-		    !nla_strcmp(nla, chain_type[family][i]->name))
-			return chain_type[family][i];
+		type = __nft_chain_type_get(family, i);
+		if (!type)
+			continue;
+		if (!nla_strcmp(nla, type->name))
+			return type;
 	}
 	return NULL;
 }
@@ -1050,11 +1063,8 @@ static void nf_tables_table_destroy(stru
 
 void nft_register_chain_type(const struct nft_chain_type *ctype)
 {
-	if (WARN_ON(ctype->family >= NFPROTO_NUMPROTO))
-		return;
-
 	nfnl_lock(NFNL_SUBSYS_NFTABLES);
-	if (WARN_ON(chain_type[ctype->family][ctype->type] != NULL)) {
+	if (WARN_ON(__nft_chain_type_get(ctype->family, ctype->type))) {
 		nfnl_unlock(NFNL_SUBSYS_NFTABLES);
 		return;
 	}
@@ -1511,7 +1521,10 @@ static int nft_chain_parse_hook(struct n
 	hook->num = ntohl(nla_get_be32(ha[NFTA_HOOK_HOOKNUM]));
 	hook->priority = ntohl(nla_get_be32(ha[NFTA_HOOK_PRIORITY]));
 
-	type = chain_type[family][NFT_CHAIN_T_DEFAULT];
+	type = __nft_chain_type_get(family, NFT_CHAIN_T_DEFAULT);
+	if (!type)
+		return -EOPNOTSUPP;
+
 	if (nla[NFTA_CHAIN_TYPE]) {
 		type = nf_tables_chain_type_lookup(net, nla[NFTA_CHAIN_TYPE],
 						   family, autoload);



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 68/92] net/x25: fix nonblocking connect
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (66 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 67/92] netfilter: nf_tables: add __nft_chain_type_get() Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 69/92] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock Greg Kroah-Hartman
                   ` (20 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Martin Schiller,
	syzbot+429c200ffc8772bfe070, syzbot+eec0c87f31a7c3b66f7b,
	David S. Miller

From: Martin Schiller <ms@dev.tdt.de>

commit e21dba7a4df4d93da237da65a096084b4f2e87b4 upstream.

This patch fixes 2 issues in x25_connect():

1. It makes absolutely no sense to reset the neighbour and the
connection state after a (successful) nonblocking call of x25_connect.
This prevents any connection from being established, since the response
(call accept) cannot be processed.

2. Any further calls to x25_connect() while a call is pending should
simply return, instead of creating new Call Request (on different
logical channels).

This patch should also fix the "KASAN: null-ptr-deref Write in
x25_connect" and "BUG: unable to handle kernel NULL pointer dereference
in x25_connect" bugs reported by syzbot.

Signed-off-by: Martin Schiller <ms@dev.tdt.de>
Reported-by: syzbot+429c200ffc8772bfe070@syzkaller.appspotmail.com
Reported-by: syzbot+eec0c87f31a7c3b66f7b@syzkaller.appspotmail.com
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/x25/af_x25.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -765,6 +765,10 @@ static int x25_connect(struct socket *so
 	if (sk->sk_state == TCP_ESTABLISHED)
 		goto out;
 
+	rc = -EALREADY;	/* Do nothing if call is already in progress */
+	if (sk->sk_state == TCP_SYN_SENT)
+		goto out;
+
 	sk->sk_state   = TCP_CLOSE;
 	sock->state = SS_UNCONNECTED;
 
@@ -811,7 +815,7 @@ static int x25_connect(struct socket *so
 	/* Now the loop */
 	rc = -EINPROGRESS;
 	if (sk->sk_state != TCP_ESTABLISHED && (flags & O_NONBLOCK))
-		goto out_put_neigh;
+		goto out;
 
 	rc = x25_wait_for_connection_establishment(sk);
 	if (rc)



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 69/92] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (67 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 68/92] net/x25: fix nonblocking connect Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 70/92] mm, sparse: drop pgdat_resize_lock in sparse_add/remove_one_section() Greg Kroah-Hartman
                   ` (19 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, David Hildenbrand, Pavel Tatashin,
	Rafael J. Wysocki, Rashmica Gupta, Oscar Salvador,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Rafael J. Wysocki, Len Brown, Michael Neuling, Balbir Singh,
	Nathan Fontenot, John Allen, Michal Hocko, Dan Williams,
	Joonsoo Kim, Vlastimil Babka, YASUAKI ISHIMATSU,
	Mathieu Malaterre, Boris Ostrovsky, Haiyang Zhang,
	Heiko Carstens, Jonathan Corbet, Juergen Gross, Kate Stewart,
	K. Y. Srinivasan, Martin Schwidefsky, Philippe Ombredanne,
	Stephen Hemminger, Thomas Gleixner, Andrew Morton,
	Linus Torvalds

From: David Hildenbrand <david@redhat.com>

commit d15e59260f62bd5e0f625cf5f5240f6ffac78ab6 upstream.

Patch series "mm: online/offline_pages called w.o. mem_hotplug_lock", v3.

Reading through the code and studying how mem_hotplug_lock is to be used,
I noticed that there are two places where we can end up calling
device_online()/device_offline() - online_pages()/offline_pages() without
the mem_hotplug_lock.  And there are other places where we call
device_online()/device_offline() without the device_hotplug_lock.

While e.g.
	echo "online" > /sys/devices/system/memory/memory9/state
is fine, e.g.
	echo 1 > /sys/devices/system/memory/memory9/online
Will not take the mem_hotplug_lock. However the device_lock() and
device_hotplug_lock.

E.g.  via memory_probe_store(), we can end up calling
add_memory()->online_pages() without the device_hotplug_lock.  So we can
have concurrent callers in online_pages().  We e.g.  touch in
online_pages() basically unprotected zone->present_pages then.

Looks like there is a longer history to that (see Patch #2 for details),
and fixing it to work the way it was intended is not really possible.  We
would e.g.  have to take the mem_hotplug_lock in device/base/core.c, which
sounds wrong.

Summary: We had a lock inversion on mem_hotplug_lock and device_lock().
More details can be found in patch 3 and patch 6.

I propose the general rules (documentation added in patch 6):

1. add_memory/add_memory_resource() must only be called with
   device_hotplug_lock.
2. remove_memory() must only be called with device_hotplug_lock. This is
   already documented and holds for all callers.
3. device_online()/device_offline() must only be called with
   device_hotplug_lock. This is already documented and true for now in core
   code. Other callers (related to memory hotplug) have to be fixed up.
4. mem_hotplug_lock is taken inside of add_memory/remove_memory/
   online_pages/offline_pages.

To me, this looks way cleaner than what we have right now (and easier to
verify).  And looking at the documentation of remove_memory, using
lock_device_hotplug also for add_memory() feels natural.

This patch (of 6):

remove_memory() is exported right now but requires the
device_hotplug_lock, which is not exported.  So let's provide a variant
that takes the lock and only export that one.

The lock is already held in
	arch/powerpc/platforms/pseries/hotplug-memory.c
	drivers/acpi/acpi_memhotplug.c
	arch/powerpc/platforms/powernv/memtrace.c

Apart from that, there are not other users in the tree.

Link: http://lkml.kernel.org/r/20180925091457.28651-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/powerpc/platforms/powernv/memtrace.c       |    2 +-
 arch/powerpc/platforms/pseries/hotplug-memory.c |    6 +++---
 drivers/acpi/acpi_memhotplug.c                  |    2 +-
 include/linux/memory_hotplug.h                  |    3 ++-
 mm/memory_hotplug.c                             |    9 ++++++++-
 5 files changed, 15 insertions(+), 7 deletions(-)

--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -122,7 +122,7 @@ static u64 memtrace_alloc_node(u32 nid,
 			 */
 			end_pfn = base_pfn + nr_pages;
 			for (pfn = base_pfn; pfn < end_pfn; pfn += bytes>> PAGE_SHIFT) {
-				remove_memory(nid, pfn << PAGE_SHIFT, bytes);
+				__remove_memory(nid, pfn << PAGE_SHIFT, bytes);
 			}
 			unlock_device_hotplug();
 			return base_pfn << PAGE_SHIFT;
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -301,7 +301,7 @@ static int pseries_remove_memblock(unsig
 	nid = memory_add_physaddr_to_nid(base);
 
 	for (i = 0; i < sections_per_block; i++) {
-		remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE);
+		__remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE);
 		base += MIN_MEMORY_BLOCK_SIZE;
 	}
 
@@ -393,7 +393,7 @@ static int dlpar_remove_lmb(struct drmem
 	block_sz = pseries_memory_block_size();
 	nid = memory_add_physaddr_to_nid(lmb->base_addr);
 
-	remove_memory(nid, lmb->base_addr, block_sz);
+	__remove_memory(nid, lmb->base_addr, block_sz);
 
 	/* Update memory regions for memory remove */
 	memblock_remove(lmb->base_addr, block_sz);
@@ -680,7 +680,7 @@ static int dlpar_add_lmb(struct drmem_lm
 
 	rc = dlpar_online_lmb(lmb);
 	if (rc) {
-		remove_memory(nid, lmb->base_addr, block_sz);
+		__remove_memory(nid, lmb->base_addr, block_sz);
 		invalidate_lmb_associativity_index(lmb);
 	} else {
 		lmb->flags |= DRCONF_MEM_ASSIGNED;
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -282,7 +282,7 @@ static void acpi_memory_remove_memory(st
 			nid = memory_add_physaddr_to_nid(info->start_addr);
 
 		acpi_unbind_memory_blocks(info);
-		remove_memory(nid, info->start_addr, info->length);
+		__remove_memory(nid, info->start_addr, info->length);
 		list_del(&info->list);
 		kfree(info);
 	}
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -303,6 +303,7 @@ extern bool is_mem_section_removable(uns
 extern void try_offline_node(int nid);
 extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern void remove_memory(int nid, u64 start, u64 size);
+extern void __remove_memory(int nid, u64 start, u64 size);
 
 #else
 static inline bool is_mem_section_removable(unsigned long pfn,
@@ -319,6 +320,7 @@ static inline int offline_pages(unsigned
 }
 
 static inline void remove_memory(int nid, u64 start, u64 size) {}
+static inline void __remove_memory(int nid, u64 start, u64 size) {}
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
 extern void __ref free_area_init_core_hotplug(int nid);
@@ -333,7 +335,6 @@ extern void move_pfn_range_to_zone(struc
 		unsigned long nr_pages, struct vmem_altmap *altmap);
 extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern bool is_memblock_offlined(struct memory_block *mem);
-extern void remove_memory(int nid, u64 start, u64 size);
 extern int sparse_add_one_section(struct pglist_data *pgdat,
 		unsigned long start_pfn, struct vmem_altmap *altmap);
 extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1893,7 +1893,7 @@ EXPORT_SYMBOL(try_offline_node);
  * and online/offline operations before this call, as required by
  * try_offline_node().
  */
-void __ref remove_memory(int nid, u64 start, u64 size)
+void __ref __remove_memory(int nid, u64 start, u64 size)
 {
 	int ret;
 
@@ -1922,5 +1922,12 @@ void __ref remove_memory(int nid, u64 st
 
 	mem_hotplug_done();
 }
+
+void remove_memory(int nid, u64 start, u64 size)
+{
+	lock_device_hotplug();
+	__remove_memory(nid, start, size);
+	unlock_device_hotplug();
+}
 EXPORT_SYMBOL_GPL(remove_memory);
 #endif /* CONFIG_MEMORY_HOTREMOVE */



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 70/92] mm, sparse: drop pgdat_resize_lock in sparse_add/remove_one_section()
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (68 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 69/92] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 71/92] mm, sparse: pass nid instead of pgdat to sparse_add_one_section() Greg Kroah-Hartman
                   ` (18 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Wei Yang, David Hildenbrand, Michal Hocko,
	Dave Hansen, Oscar Salvador, Andrew Morton, Linus Torvalds

From: Wei Yang <richard.weiyang@gmail.com>

commit 83af658898cb292a32d8b6cd9b51266d7cfc4b6a upstream.

pgdat_resize_lock is used to protect pgdat's memory region information
like: node_start_pfn, node_present_pages, etc.  While in function
sparse_add/remove_one_section(), pgdat_resize_lock is used to protect
initialization/release of one mem_section.  This looks not proper.

These code paths are currently protected by mem_hotplug_lock currently but
should there ever be any reason for locking at the sparse layer a
dedicated lock should be introduced.

Following is the current call trace of sparse_add/remove_one_section()

    mem_hotplug_begin()
    arch_add_memory()
       add_pages()
           __add_pages()
               __add_section()
                   sparse_add_one_section()
    mem_hotplug_done()

    mem_hotplug_begin()
    arch_remove_memory()
        __remove_pages()
            __remove_section()
                sparse_remove_one_section()
    mem_hotplug_done()

The comment above the pgdat_resize_lock also mentions "Holding this will
also guarantee that any pfn_valid() stays that way.", which is true with
the current implementation and false after this patch.  But current
implementation doesn't meet this comment.  There isn't any pfn walkers to
take the lock so this looks like a relict from the past.  This patch also
removes this comment.

[richard.weiyang@gmail.com: v4]
  Link: http://lkml.kernel.org/r/20181204085657.20472-1-richard.weiyang@gmail.com
[mhocko@suse.com: changelog suggestion]
Link: http://lkml.kernel.org/r/20181128091243.19249-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/mmzone.h |    3 +--
 mm/sparse.c            |    9 +--------
 2 files changed, 2 insertions(+), 10 deletions(-)

--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -637,8 +637,7 @@ typedef struct pglist_data {
 #if defined(CONFIG_MEMORY_HOTPLUG) || defined(CONFIG_DEFERRED_STRUCT_PAGE_INIT)
 	/*
 	 * Must be held any time you expect node_start_pfn, node_present_pages
-	 * or node_spanned_pages stay constant.  Holding this will also
-	 * guarantee that any pfn_valid() stays that way.
+	 * or node_spanned_pages stay constant.
 	 *
 	 * pgdat_resize_lock() and pgdat_resize_unlock() are provided to
 	 * manipulate node_size_lock without checking for CONFIG_MEMORY_HOTPLUG
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -668,7 +668,6 @@ int __meminit sparse_add_one_section(str
 	struct mem_section *ms;
 	struct page *memmap;
 	unsigned long *usemap;
-	unsigned long flags;
 	int ret;
 
 	/*
@@ -688,8 +687,6 @@ int __meminit sparse_add_one_section(str
 		return -ENOMEM;
 	}
 
-	pgdat_resize_lock(pgdat, &flags);
-
 	ms = __pfn_to_section(start_pfn);
 	if (ms->section_mem_map & SECTION_MARKED_PRESENT) {
 		ret = -EEXIST;
@@ -708,7 +705,6 @@ int __meminit sparse_add_one_section(str
 	sparse_init_one_section(ms, section_nr, memmap, usemap);
 
 out:
-	pgdat_resize_unlock(pgdat, &flags);
 	if (ret < 0) {
 		kfree(usemap);
 		__kfree_section_memmap(memmap, altmap);
@@ -770,10 +766,8 @@ void sparse_remove_one_section(struct zo
 		unsigned long map_offset, struct vmem_altmap *altmap)
 {
 	struct page *memmap = NULL;
-	unsigned long *usemap = NULL, flags;
-	struct pglist_data *pgdat = zone->zone_pgdat;
+	unsigned long *usemap = NULL;
 
-	pgdat_resize_lock(pgdat, &flags);
 	if (ms->section_mem_map) {
 		usemap = ms->pageblock_flags;
 		memmap = sparse_decode_mem_map(ms->section_mem_map,
@@ -781,7 +775,6 @@ void sparse_remove_one_section(struct zo
 		ms->section_mem_map = 0;
 		ms->pageblock_flags = NULL;
 	}
-	pgdat_resize_unlock(pgdat, &flags);
 
 	clear_hwpoisoned_pages(memmap + map_offset,
 			PAGES_PER_SECTION - map_offset);



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 71/92] mm, sparse: pass nid instead of pgdat to sparse_add_one_section()
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (69 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 70/92] mm, sparse: drop pgdat_resize_lock in sparse_add/remove_one_section() Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 72/92] drivers/base/memory.c: remove an unnecessary check on NR_MEM_SECTIONS Greg Kroah-Hartman
                   ` (17 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Wei Yang, David Hildenbrand, Michal Hocko,
	Dave Hansen, Oscar Salvador, Andrew Morton, Linus Torvalds

From: Wei Yang <richard.weiyang@gmail.com>

commit 4e0d2e7ef14d9e1c900dac909db45263822b824f upstream.

Since the information needed in sparse_add_one_section() is node id to
allocate proper memory, it is not necessary to pass its pgdat.

This patch changes the prototype of sparse_add_one_section() to pass node
id directly.  This is intended to reduce misleading that
sparse_add_one_section() would touch pgdat.

Link: http://lkml.kernel.org/r/20181204085657.20472-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/memory_hotplug.h |    4 ++--
 mm/memory_hotplug.c            |    2 +-
 mm/sparse.c                    |    8 ++++----
 3 files changed, 7 insertions(+), 7 deletions(-)

--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -335,8 +335,8 @@ extern void move_pfn_range_to_zone(struc
 		unsigned long nr_pages, struct vmem_altmap *altmap);
 extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern bool is_memblock_offlined(struct memory_block *mem);
-extern int sparse_add_one_section(struct pglist_data *pgdat,
-		unsigned long start_pfn, struct vmem_altmap *altmap);
+extern int sparse_add_one_section(int nid, unsigned long start_pfn,
+				  struct vmem_altmap *altmap);
 extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
 		unsigned long map_offset, struct vmem_altmap *altmap);
 extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map,
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -255,7 +255,7 @@ static int __meminit __add_section(int n
 	if (pfn_valid(phys_start_pfn))
 		return -EEXIST;
 
-	ret = sparse_add_one_section(NODE_DATA(nid), phys_start_pfn, altmap);
+	ret = sparse_add_one_section(nid, phys_start_pfn, altmap);
 	if (ret < 0)
 		return ret;
 
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -661,8 +661,8 @@ static void free_map_bootmem(struct page
  * set.  If this is <=0, then that means that the passed-in
  * map was not consumed and must be freed.
  */
-int __meminit sparse_add_one_section(struct pglist_data *pgdat,
-		unsigned long start_pfn, struct vmem_altmap *altmap)
+int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
+				     struct vmem_altmap *altmap)
 {
 	unsigned long section_nr = pfn_to_section_nr(start_pfn);
 	struct mem_section *ms;
@@ -674,11 +674,11 @@ int __meminit sparse_add_one_section(str
 	 * no locking for this, because it does its own
 	 * plus, it does a kmalloc
 	 */
-	ret = sparse_index_init(section_nr, pgdat->node_id);
+	ret = sparse_index_init(section_nr, nid);
 	if (ret < 0 && ret != -EEXIST)
 		return ret;
 	ret = 0;
-	memmap = kmalloc_section_memmap(section_nr, pgdat->node_id, altmap);
+	memmap = kmalloc_section_memmap(section_nr, nid, altmap);
 	if (!memmap)
 		return -ENOMEM;
 	usemap = __kmalloc_section_usemap();



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 72/92] drivers/base/memory.c: remove an unnecessary check on NR_MEM_SECTIONS
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (70 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 71/92] mm, sparse: pass nid instead of pgdat to sparse_add_one_section() Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 73/92] mm, memory_hotplug: add nid parameter to arch_remove_memory Greg Kroah-Hartman
                   ` (16 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Wei Yang, Andrew Morton, Rafael J. Wysocki,
	Seth Jennings, Linus Torvalds, David Hildenbrand

From: Wei Yang <richard.weiyang@gmail.com>

commit 3b6fd6ffb27c2efa003c6d4d15ca72c054b71d7c upstream.

In cb5e39b8038b ("drivers: base: refactor add_memory_section() to
add_memory_block()"), add_memory_block() is introduced, which is only
invoked in memory_dev_init().

When combining these two loops in memory_dev_init() and
add_memory_block(), they looks like this:

    for (i = 0; i < NR_MEM_SECTIONS; i += sections_per_block)
        for (j = i;
	    (j < i + sections_per_block) && j < NR_MEM_SECTIONS;
	    j++)

Since it is sure the (i < NR_MEM_SECTIONS) and j sits in its own memory
block, the check of (j < NR_MEM_SECTIONS) is not necessary.

This patch just removes this check.

Link: http://lkml.kernel.org/r/20181123222811.18216-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Seth Jennings <sjenning@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/base/memory.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -691,7 +691,7 @@ static int add_memory_block(int base_sec
 	int i, ret, section_count = 0, section_nr;
 
 	for (i = base_section_nr;
-	     (i < base_section_nr + sections_per_block) && i < NR_MEM_SECTIONS;
+	     i < base_section_nr + sections_per_block;
 	     i++) {
 		if (!present_section_nr(i))
 			continue;



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 73/92] mm, memory_hotplug: add nid parameter to arch_remove_memory
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (71 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 72/92] drivers/base/memory.c: remove an unnecessary check on NR_MEM_SECTIONS Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 74/92] mm/memory_hotplug: release memory resource after arch_remove_memory() Greg Kroah-Hartman
                   ` (15 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Oscar Salvador, David Hildenbrand,
	Pavel Tatashin, Michal Hocko, Dan Williams, Jerome Glisse,
	Jonathan Cameron, Rafael J. Wysocki, Andrew Morton,
	Linus Torvalds

From: Oscar Salvador <osalvador@suse.com>

commit 2c2a5af6fed20cf74401c9d64319c76c5ff81309 upstream.

-- snip --

Missing unification of mm/hmm.c and kernel/memremap.c

-- snip --

Patch series "Do not touch pages in hot-remove path", v2.

This patchset aims for two things:

 1) A better definition about offline and hot-remove stage
 2) Solving bugs where we can access non-initialized pages
    during hot-remove operations [2] [3].

This is achieved by moving all page/zone handling to the offline
stage, so we do not need to access pages when hot-removing memory.

[1] https://patchwork.kernel.org/cover/10691415/
[2] https://patchwork.kernel.org/patch/10547445/
[3] https://www.spinics.net/lists/linux-mm/msg161316.html

This patch (of 5):

This is a preparation for the following-up patches.  The idea of passing
the nid is that it will allow us to get rid of the zone parameter
afterwards.

Link: http://lkml.kernel.org/r/20181127162005.15833-2-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/ia64/mm/init.c            |    2 +-
 arch/powerpc/mm/mem.c          |    3 ++-
 arch/s390/mm/init.c            |    2 +-
 arch/sh/mm/init.c              |    2 +-
 arch/x86/mm/init_32.c          |    2 +-
 arch/x86/mm/init_64.c          |    3 ++-
 include/linux/memory_hotplug.h |    4 ++--
 kernel/memremap.c              |    5 ++++-
 mm/hmm.c                       |    4 +++-
 mm/memory_hotplug.c            |    2 +-
 10 files changed, 18 insertions(+), 11 deletions(-)

--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -662,7 +662,7 @@ int arch_add_memory(int nid, u64 start,
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-int arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -140,7 +140,8 @@ int __meminit arch_add_memory(int nid, u
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-int __meminit arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+int __meminit arch_remove_memory(int nid, u64 start, u64 size,
+					struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -240,7 +240,7 @@ int arch_add_memory(int nid, u64 start,
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-int arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap)
 {
 	/*
 	 * There is no hardware or firmware interface which could trigger a
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -444,7 +444,7 @@ EXPORT_SYMBOL_GPL(memory_add_physaddr_to
 #endif
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-int arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = PFN_DOWN(start);
 	unsigned long nr_pages = size >> PAGE_SHIFT;
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -861,7 +861,7 @@ int arch_add_memory(int nid, u64 start,
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-int arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1142,7 +1142,8 @@ kernel_physical_mapping_remove(unsigned
 	remove_pagetable(start, end, true, NULL);
 }
 
-int __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+int __ref arch_remove_memory(int nid, u64 start, u64 size,
+				struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -109,8 +109,8 @@ static inline bool movable_node_is_enabl
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-extern int arch_remove_memory(u64 start, u64 size,
-		struct vmem_altmap *altmap);
+extern int arch_remove_memory(int nid, u64 start, u64 size,
+				struct vmem_altmap *altmap);
 extern int __remove_pages(struct zone *zone, unsigned long start_pfn,
 	unsigned long nr_pages, struct vmem_altmap *altmap);
 #endif /* CONFIG_MEMORY_HOTREMOVE */
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -121,6 +121,7 @@ static void devm_memremap_pages_release(
 	struct resource *res = &pgmap->res;
 	resource_size_t align_start, align_size;
 	unsigned long pfn;
+	int nid;
 
 	pgmap->kill(pgmap->ref);
 	for_each_device_pfn(pfn, pgmap)
@@ -131,13 +132,15 @@ static void devm_memremap_pages_release(
 	align_size = ALIGN(res->start + resource_size(res), SECTION_SIZE)
 		- align_start;
 
+	nid = page_to_nid(pfn_to_page(align_start >> PAGE_SHIFT));
+
 	mem_hotplug_begin();
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
 		pfn = align_start >> PAGE_SHIFT;
 		__remove_pages(page_zone(pfn_to_page(pfn)), pfn,
 				align_size >> PAGE_SHIFT, NULL);
 	} else {
-		arch_remove_memory(align_start, align_size,
+		arch_remove_memory(nid, align_start, align_size,
 				pgmap->altmap_valid ? &pgmap->altmap : NULL);
 		kasan_remove_zero_shadow(__va(align_start), align_size);
 	}
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -999,6 +999,7 @@ static void hmm_devmem_release(void *dat
 	unsigned long start_pfn, npages;
 	struct zone *zone;
 	struct page *page;
+	int nid;
 
 	/* pages are dead and unused, undo the arch mapping */
 	start_pfn = (resource->start & ~(PA_SECTION_SIZE - 1)) >> PAGE_SHIFT;
@@ -1006,12 +1007,13 @@ static void hmm_devmem_release(void *dat
 
 	page = pfn_to_page(start_pfn);
 	zone = page_zone(page);
+	nid = page_to_nid(page);
 
 	mem_hotplug_begin();
 	if (resource->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY)
 		__remove_pages(zone, start_pfn, npages, NULL);
 	else
-		arch_remove_memory(start_pfn << PAGE_SHIFT,
+		arch_remove_memory(nid, start_pfn << PAGE_SHIFT,
 				   npages << PAGE_SHIFT, NULL);
 	mem_hotplug_done();
 
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1916,7 +1916,7 @@ void __ref __remove_memory(int nid, u64
 	memblock_free(start, size);
 	memblock_remove(start, size);
 
-	arch_remove_memory(start, size, NULL);
+	arch_remove_memory(nid, start, size, NULL);
 
 	try_offline_node(nid);
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 74/92] mm/memory_hotplug: release memory resource after arch_remove_memory()
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (72 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 73/92] mm, memory_hotplug: add nid parameter to arch_remove_memory Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 75/92] drivers/base/memory.c: clean up relics in function parameters Greg Kroah-Hartman
                   ` (14 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, David Hildenbrand, Oscar Salvador,
	Michal Hocko, Pavel Tatashin, Wei Yang, Qian Cai, Arun KS,
	Mathieu Malaterre, Andrew Banman, Andy Lutomirski,
	Benjamin Herrenschmidt, Borislav Petkov, Christophe Leroy,
	Dave Hansen, Fenghua Yu, Geert Uytterhoeven, Heiko Carstens,
	H. Peter Anvin, Ingo Molnar, Ingo Molnar, Joonsoo Kim,
	Kirill A. Shutemov, Martin Schwidefsky, Masahiro Yamada,
	Michael Ellerman, Mike Rapoport, Mike Travis, Nicholas Piggin,
	Oscar Salvador, Paul Mackerras, Peter Zijlstra,
	Rafael J. Wysocki, Rich Felker, Rob Herring, Stefan Agner,
	Thomas Gleixner, Tony Luck, Vasily Gorbik, Yoshinori Sato,
	Andrew Morton, Linus Torvalds

From: David Hildenbrand <david@redhat.com>

commit d9eb1417c77df7ce19abd2e41619e9dceccbdf2a upstream.

Patch series "mm/memory_hotplug: Better error handling when removing
memory", v1.

Error handling when removing memory is somewhat messed up right now.  Some
errors result in warnings, others are completely ignored.  Memory unplug
code can essentially not deal with errors properly as of now.
remove_memory() will never fail.

We have basically two choices:
1. Allow arch_remov_memory() and friends to fail, propagating errors via
   remove_memory(). Might be problematic (e.g. DIMMs consisting of multiple
   pieces added/removed separately).
2. Don't allow the functions to fail, handling errors in a nicer way.

It seems like most errors that can theoretically happen are really corner
cases and mostly theoretical (e.g.  "section not valid").  However e.g.
aborting removal of sections while all callers simply continue in case of
errors is not nice.

If we can gurantee that removal of memory always works (and WARN/skip in
case of theoretical errors so we can figure out what is going on), we can
go ahead and implement better error handling when adding memory.

E.g. via add_memory():

arch_add_memory()
ret = do_stuff()
if (ret) {
	arch_remove_memory();
	goto error;
}

Handling here that arch_remove_memory() might fail is basically
impossible.  So I suggest, let's avoid reporting errors while removing
memory, warning on theoretical errors instead and continuing instead of
aborting.

This patch (of 4):

__add_pages() doesn't add the memory resource, so __remove_pages()
shouldn't remove it.  Let's factor it out.  Especially as it is a special
case for memory used as system memory, added via add_memory() and friends.

We now remove the resource after removing the sections instead of doing it
the other way around.  I don't think this change is problematic.

add_memory()
	register memory resource
	arch_add_memory()

remove_memory
	arch_remove_memory()
	release memory resource

While at it, explain why we ignore errors and that it only happeny if
we remove memory in a different granularity as we added it.

[david@redhat.com: fix printk warning]
  Link: http://lkml.kernel.org/r/20190417120204.6997-1-david@redhat.com
Link: http://lkml.kernel.org/r/20190409100148.24703-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Arun KS <arunks@codeaurora.org>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Andrew Banman <andrew.banman@hpe.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/memory_hotplug.c |   35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -523,20 +523,6 @@ int __remove_pages(struct zone *zone, un
 	if (is_dev_zone(zone)) {
 		if (altmap)
 			map_offset = vmem_altmap_offset(altmap);
-	} else {
-		resource_size_t start, size;
-
-		start = phys_start_pfn << PAGE_SHIFT;
-		size = nr_pages * PAGE_SIZE;
-
-		ret = release_mem_region_adjustable(&iomem_resource, start,
-					size);
-		if (ret) {
-			resource_size_t endres = start + size - 1;
-
-			pr_warn("Unable to release resource <%pa-%pa> (%d)\n",
-					&start, &endres, ret);
-		}
 	}
 
 	clear_zone_contiguous(zone);
@@ -1883,6 +1869,26 @@ void try_offline_node(int nid)
 }
 EXPORT_SYMBOL(try_offline_node);
 
+static void __release_memory_resource(resource_size_t start,
+				      resource_size_t size)
+{
+	int ret;
+
+	/*
+	 * When removing memory in the same granularity as it was added,
+	 * this function never fails. It might only fail if resources
+	 * have to be adjusted or split. We'll ignore the error, as
+	 * removing of memory cannot fail.
+	 */
+	ret = release_mem_region_adjustable(&iomem_resource, start, size);
+	if (ret) {
+		resource_size_t endres = start + size - 1;
+
+		pr_warn("Unable to release resource <%pa-%pa> (%d)\n",
+			&start, &endres, ret);
+	}
+}
+
 /**
  * remove_memory
  * @nid: the node ID
@@ -1917,6 +1923,7 @@ void __ref __remove_memory(int nid, u64
 	memblock_remove(start, size);
 
 	arch_remove_memory(nid, start, size, NULL);
+	__release_memory_resource(start, size);
 
 	try_offline_node(nid);
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 75/92] drivers/base/memory.c: clean up relics in function parameters
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (73 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 74/92] mm/memory_hotplug: release memory resource after arch_remove_memory() Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 76/92] mm, memory_hotplug: update a comment in unregister_memory() Greg Kroah-Hartman
                   ` (13 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Baoquan He, Michal Hocko, Rafael J. Wysocki,
	Mukesh Ojha, Oscar Salvador, Andrew Morton, Linus Torvalds,
	David Hildenbrand

From: Baoquan He <bhe@redhat.com>

commit 063b8a4cee8088224bcdb79bcd08db98df16178e upstream.

The input parameter 'phys_index' of memory_block_action() is actually the
section number, but not the phys_index of memory_block.  This is a relic
from the past when one memory block could only contain one section.
Rename it to start_section_nr.

And also in remove_memory_section(), the 'node_id' and 'phys_device'
arguments are not used by anyone.  Remove them.

Link: http://lkml.kernel.org/r/20190329144250.14315-2-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/base/memory.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -230,13 +230,14 @@ static bool pages_correctly_probed(unsig
  * OK to have direct references to sparsemem variables in here.
  */
 static int
-memory_block_action(unsigned long phys_index, unsigned long action, int online_type)
+memory_block_action(unsigned long start_section_nr, unsigned long action,
+		    int online_type)
 {
 	unsigned long start_pfn;
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
 	int ret;
 
-	start_pfn = section_nr_to_pfn(phys_index);
+	start_pfn = section_nr_to_pfn(start_section_nr);
 
 	switch (action) {
 	case MEM_ONLINE:
@@ -250,7 +251,7 @@ memory_block_action(unsigned long phys_i
 		break;
 	default:
 		WARN(1, KERN_WARNING "%s(%ld, %ld) unknown action: "
-		     "%ld\n", __func__, phys_index, action, action);
+		     "%ld\n", __func__, start_section_nr, action, action);
 		ret = -EINVAL;
 	}
 
@@ -747,8 +748,7 @@ unregister_memory(struct memory_block *m
 	device_unregister(&memory->dev);
 }
 
-static int remove_memory_section(unsigned long node_id,
-			       struct mem_section *section, int phys_device)
+static int remove_memory_section(struct mem_section *section)
 {
 	struct memory_block *mem;
 
@@ -780,7 +780,7 @@ int unregister_memory_section(struct mem
 	if (!present_section(section))
 		return -EINVAL;
 
-	return remove_memory_section(0, section, 0);
+	return remove_memory_section(section);
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 76/92] mm, memory_hotplug: update a comment in unregister_memory()
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (74 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 75/92] drivers/base/memory.c: clean up relics in function parameters Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 77/92] mm/memory_hotplug: make unregister_memory_section() never fail Greg Kroah-Hartman
                   ` (12 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Greg Kroah-Hartman, Dan Carpenter, David Hildenbrand

From: Dan Carpenter <dan.carpenter@oracle.com>

commit 16df1456aa858a86f398dbc7d27649eb6662b0cc upstream.

The remove_memory_block() function was renamed to in commit
cc292b0b4302 ("drivers/base/memory.c: rename remove_memory_block() to
remove_memory_section()").

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/base/memory.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -743,7 +743,7 @@ unregister_memory(struct memory_block *m
 {
 	BUG_ON(memory->dev.bus != &memory_subsys);
 
-	/* drop the ref. we got in remove_memory_block() */
+	/* drop the ref. we got in remove_memory_section() */
 	put_device(&memory->dev);
 	device_unregister(&memory->dev);
 }



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 77/92] mm/memory_hotplug: make unregister_memory_section() never fail
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (75 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 76/92] mm, memory_hotplug: update a comment in unregister_memory() Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 78/92] mm/memory_hotplug: make __remove_section() " Greg Kroah-Hartman
                   ` (11 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, David Hildenbrand, Rafael J. Wysocki,
	Ingo Molnar, Andrew Banman, Mike Travis, Oscar Salvador,
	Michal Hocko, Pavel Tatashin, Qian Cai, Wei Yang, Arun KS,
	Mathieu Malaterre, Andy Lutomirski, Benjamin Herrenschmidt,
	Borislav Petkov, Christophe Leroy, Dave Hansen, Fenghua Yu,
	Geert Uytterhoeven, Heiko Carstens, H. Peter Anvin, Ingo Molnar,
	Joonsoo Kim, Kirill A. Shutemov, Martin Schwidefsky,
	Masahiro Yamada, Michael Ellerman, Mike Rapoport,
	Nicholas Piggin, Oscar Salvador, Paul Mackerras, Peter Zijlstra,
	Rich Felker, Rob Herring, Stefan Agner, Thomas Gleixner,
	Tony Luck, Vasily Gorbik, Yoshinori Sato, Andrew Morton,
	Linus Torvalds

From: David Hildenbrand <david@redhat.com>

commit cb7b3a3685b20d3b5900ff24b2cb96d002960189 upstream.

Failing while removing memory is mostly ignored and cannot really be
handled.  Let's treat errors in unregister_memory_section() in a nice way,
warning, but continuing.

Link: http://lkml.kernel.org/r/20190409100148.24703-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Banman <andrew.banman@hpe.com>
Cc: Mike Travis <mike.travis@hpe.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Arun KS <arunks@codeaurora.org>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/base/memory.c  |   16 +++++-----------
 include/linux/memory.h |    2 +-
 mm/memory_hotplug.c    |    4 +---
 3 files changed, 7 insertions(+), 15 deletions(-)

--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -743,15 +743,18 @@ unregister_memory(struct memory_block *m
 {
 	BUG_ON(memory->dev.bus != &memory_subsys);
 
-	/* drop the ref. we got in remove_memory_section() */
+	/* drop the ref. we got via find_memory_block() */
 	put_device(&memory->dev);
 	device_unregister(&memory->dev);
 }
 
-static int remove_memory_section(struct mem_section *section)
+void unregister_memory_section(struct mem_section *section)
 {
 	struct memory_block *mem;
 
+	if (WARN_ON_ONCE(!present_section(section)))
+		return;
+
 	mutex_lock(&mem_sysfs_mutex);
 
 	/*
@@ -772,15 +775,6 @@ static int remove_memory_section(struct
 
 out_unlock:
 	mutex_unlock(&mem_sysfs_mutex);
-	return 0;
-}
-
-int unregister_memory_section(struct mem_section *section)
-{
-	if (!present_section(section))
-		return -EINVAL;
-
-	return remove_memory_section(section);
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -113,7 +113,7 @@ extern int register_memory_isolate_notif
 extern void unregister_memory_isolate_notifier(struct notifier_block *nb);
 int hotplug_memory_register(int nid, struct mem_section *section);
 #ifdef CONFIG_MEMORY_HOTREMOVE
-extern int unregister_memory_section(struct mem_section *);
+extern void unregister_memory_section(struct mem_section *);
 #endif
 extern int memory_dev_init(void);
 extern int memory_notify(unsigned long val, void *v);
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -488,9 +488,7 @@ static int __remove_section(struct zone
 	if (!valid_section(ms))
 		return ret;
 
-	ret = unregister_memory_section(ms);
-	if (ret)
-		return ret;
+	unregister_memory_section(ms);
 
 	scn_nr = __section_nr(ms);
 	start_pfn = section_nr_to_pfn((unsigned long)scn_nr);



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 78/92] mm/memory_hotplug: make __remove_section() never fail
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (76 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 77/92] mm/memory_hotplug: make unregister_memory_section() never fail Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 79/92] powerpc/mm: Fix section mismatch warning Greg Kroah-Hartman
                   ` (10 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, David Hildenbrand, Oscar Salvador,
	Michal Hocko, Pavel Tatashin, Qian Cai, Wei Yang, Arun KS,
	Mathieu Malaterre, Andrew Banman, Andy Lutomirski,
	Benjamin Herrenschmidt, Borislav Petkov, Christophe Leroy,
	Dave Hansen, Fenghua Yu, Geert Uytterhoeven, Heiko Carstens,
	H. Peter Anvin, Ingo Molnar, Ingo Molnar, Joonsoo Kim,
	Kirill A. Shutemov, Martin Schwidefsky, Masahiro Yamada,
	Michael Ellerman, Mike Rapoport, Mike Travis, Nicholas Piggin,
	Oscar Salvador, Paul Mackerras, Peter Zijlstra,
	Rafael J. Wysocki, Rich Felker, Rob Herring, Stefan Agner,
	Thomas Gleixner, Tony Luck, Vasily Gorbik, Yoshinori Sato,
	Andrew Morton, Linus Torvalds

From: David Hildenbrand <david@redhat.com>

commit 9d1d887d785b4fe0590bd3c5e71acaa3908044e2 upstream.

Let's just warn in case a section is not valid instead of failing to
remove somewhere in the middle of the process, returning an error that
will be mostly ignored by callers.

Link: http://lkml.kernel.org/r/20190409100148.24703-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Arun KS <arunks@codeaurora.org>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Andrew Banman <andrew.banman@hpe.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/memory_hotplug.c |   22 +++++++++-------------
 1 file changed, 9 insertions(+), 13 deletions(-)

--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -478,15 +478,15 @@ static void __remove_zone(struct zone *z
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
 }
 
-static int __remove_section(struct zone *zone, struct mem_section *ms,
-		unsigned long map_offset, struct vmem_altmap *altmap)
+static void __remove_section(struct zone *zone, struct mem_section *ms,
+			     unsigned long map_offset,
+			     struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn;
 	int scn_nr;
-	int ret = -EINVAL;
 
-	if (!valid_section(ms))
-		return ret;
+	if (WARN_ON_ONCE(!valid_section(ms)))
+		return;
 
 	unregister_memory_section(ms);
 
@@ -495,7 +495,6 @@ static int __remove_section(struct zone
 	__remove_zone(zone, start_pfn);
 
 	sparse_remove_one_section(zone, ms, map_offset, altmap);
-	return 0;
 }
 
 /**
@@ -515,7 +514,7 @@ int __remove_pages(struct zone *zone, un
 {
 	unsigned long i;
 	unsigned long map_offset = 0;
-	int sections_to_remove, ret = 0;
+	int sections_to_remove;
 
 	/* In the ZONE_DEVICE case device driver owns the memory region */
 	if (is_dev_zone(zone)) {
@@ -536,16 +535,13 @@ int __remove_pages(struct zone *zone, un
 		unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION;
 
 		cond_resched();
-		ret = __remove_section(zone, __pfn_to_section(pfn), map_offset,
-				altmap);
+		__remove_section(zone, __pfn_to_section(pfn), map_offset,
+				 altmap);
 		map_offset = 0;
-		if (ret)
-			break;
 	}
 
 	set_zone_contiguous(zone);
-
-	return ret;
+	return 0;
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 79/92] powerpc/mm: Fix section mismatch warning
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (77 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 78/92] mm/memory_hotplug: make __remove_section() " Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 80/92] mm/memory_hotplug: make __remove_pages() and arch_remove_memory() never fail Greg Kroah-Hartman
                   ` (9 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Aneesh Kumar K.V, Michael Ellerman,
	David Hildenbrand

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>

commit 26ad26718dfaa7cf49d106d212ebf2370076c253 upstream.

This patch fix the below section mismatch warnings.

WARNING: vmlinux.o(.text+0x2d1f44): Section mismatch in reference from the function devm_memremap_pages_release() to the function .meminit.text:arch_remove_memory()
WARNING: vmlinux.o(.text+0x2d265c): Section mismatch in reference from the function devm_memremap_pages() to the function .meminit.text:arch_add_memory()

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/powerpc/mm/mem.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -118,8 +118,8 @@ int __weak remove_section_mapping(unsign
 	return -ENODEV;
 }
 
-int __meminit arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap,
-		bool want_memblock)
+int __ref arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap,
+			  bool want_memblock)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
@@ -140,8 +140,8 @@ int __meminit arch_add_memory(int nid, u
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-int __meminit arch_remove_memory(int nid, u64 start, u64 size,
-					struct vmem_altmap *altmap)
+int __ref arch_remove_memory(int nid, u64 start, u64 size,
+			     struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 80/92] mm/memory_hotplug: make __remove_pages() and arch_remove_memory() never fail
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (78 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 79/92] powerpc/mm: Fix section mismatch warning Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 88/92] mm/hotplug: kill is_dev_zone() usage in __remove_pages() Greg Kroah-Hartman
                   ` (8 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, David Hildenbrand, Tony Luck, Fenghua Yu,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Martin Schwidefsky, Heiko Carstens, Yoshinori Sato, Rich Felker,
	Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, Michal Hocko,
	Mike Rapoport, Oscar Salvador, Kirill A. Shutemov,
	Christophe Leroy, Stefan Agner, Nicholas Piggin, Pavel Tatashin,
	Vasily Gorbik, Arun KS, Geert Uytterhoeven, Masahiro Yamada,
	Rob Herring, Joonsoo Kim, Wei Yang, Qian Cai, Mathieu Malaterre,
	Andrew Banman, Ingo Molnar, Mike Travis, Oscar Salvador,
	Rafael J. Wysocki, Andrew Morton, Linus Torvalds

From: David Hildenbrand <david@redhat.com>

commit ac5c94264580f498e484c854031d0226b3c1038f upstream.

-- snip --

Minor conflict in arch/powerpc/mm/mem.c

-- snip --

All callers of arch_remove_memory() ignore errors.  And we should really
try to remove any errors from the memory removal path.  No more errors are
reported from __remove_pages().  BUG() in s390x code in case
arch_remove_memory() is triggered.  We may implement that properly later.
WARN in case powerpc code failed to remove the section mapping, which is
better than ignoring the error completely right now.

Link: http://lkml.kernel.org/r/20190409100148.24703-5-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Arun KS <arunks@codeaurora.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Andrew Banman <andrew.banman@hpe.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/ia64/mm/init.c            |   11 +++--------
 arch/powerpc/mm/mem.c          |    9 +++------
 arch/s390/mm/init.c            |    5 +++--
 arch/sh/mm/init.c              |   11 +++--------
 arch/x86/mm/init_32.c          |    5 +++--
 arch/x86/mm/init_64.c          |   10 +++-------
 include/linux/memory_hotplug.h |    8 ++++----
 mm/memory_hotplug.c            |    5 ++---
 8 files changed, 24 insertions(+), 40 deletions(-)

--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -662,20 +662,15 @@ int arch_add_memory(int nid, u64 start,
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap)
+void arch_remove_memory(int nid, u64 start, u64 size,
+			struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	struct zone *zone;
-	int ret;
 
 	zone = page_zone(pfn_to_page(start_pfn));
-	ret = __remove_pages(zone, start_pfn, nr_pages, altmap);
-	if (ret)
-		pr_warn("%s: Problem encountered in __remove_pages() as"
-			" ret=%d\n", __func__,  ret);
-
-	return ret;
+	__remove_pages(zone, start_pfn, nr_pages, altmap);
 }
 #endif
 #endif
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -140,7 +140,7 @@ int __ref arch_add_memory(int nid, u64 s
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-int __ref arch_remove_memory(int nid, u64 start, u64 size,
+void __ref arch_remove_memory(int nid, u64 start, u64 size,
 			     struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
@@ -156,14 +156,13 @@ int __ref arch_remove_memory(int nid, u6
 	if (altmap)
 		page += vmem_altmap_offset(altmap);
 
-	ret = __remove_pages(page_zone(page), start_pfn, nr_pages, altmap);
-	if (ret)
-		return ret;
+	__remove_pages(page_zone(page), start_pfn, nr_pages, altmap);
 
 	/* Remove htab bolted mappings for this section of memory */
 	start = (unsigned long)__va(start);
 	flush_inval_dcache_range(start, start + size);
 	ret = remove_section_mapping(start, start + size);
+	WARN_ON_ONCE(ret);
 
 	/* Ensure all vmalloc mappings are flushed in case they also
 	 * hit that section of memory
@@ -171,8 +170,6 @@ int __ref arch_remove_memory(int nid, u6
 	vm_unmap_aliases();
 
 	resize_hpt_for_hotplug(memblock_phys_mem_size());
-
-	return ret;
 }
 #endif
 #endif /* CONFIG_MEMORY_HOTPLUG */
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -240,14 +240,15 @@ int arch_add_memory(int nid, u64 start,
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap)
+void arch_remove_memory(int nid, u64 start, u64 size,
+			struct vmem_altmap *altmap)
 {
 	/*
 	 * There is no hardware or firmware interface which could trigger a
 	 * hot memory remove on s390. So there is nothing that needs to be
 	 * implemented.
 	 */
-	return -EBUSY;
+	BUG();
 }
 #endif
 #endif /* CONFIG_MEMORY_HOTPLUG */
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -444,20 +444,15 @@ EXPORT_SYMBOL_GPL(memory_add_physaddr_to
 #endif
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap)
+void arch_remove_memory(int nid, u64 start, u64 size,
+			struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = PFN_DOWN(start);
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	struct zone *zone;
-	int ret;
 
 	zone = page_zone(pfn_to_page(start_pfn));
-	ret = __remove_pages(zone, start_pfn, nr_pages, altmap);
-	if (unlikely(ret))
-		pr_warn("%s: Failed, __remove_pages() == %d\n", __func__,
-			ret);
-
-	return ret;
+	__remove_pages(zone, start_pfn, nr_pages, altmap);
 }
 #endif
 #endif /* CONFIG_MEMORY_HOTPLUG */
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -861,14 +861,15 @@ int arch_add_memory(int nid, u64 start,
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap)
+void arch_remove_memory(int nid, u64 start, u64 size,
+			struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	struct zone *zone;
 
 	zone = page_zone(pfn_to_page(start_pfn));
-	return __remove_pages(zone, start_pfn, nr_pages, altmap);
+	__remove_pages(zone, start_pfn, nr_pages, altmap);
 }
 #endif
 #endif
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1142,24 +1142,20 @@ kernel_physical_mapping_remove(unsigned
 	remove_pagetable(start, end, true, NULL);
 }
 
-int __ref arch_remove_memory(int nid, u64 start, u64 size,
-				struct vmem_altmap *altmap)
+void __ref arch_remove_memory(int nid, u64 start, u64 size,
+			      struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	struct page *page = pfn_to_page(start_pfn);
 	struct zone *zone;
-	int ret;
 
 	/* With altmap the first mapped page is offset from @start */
 	if (altmap)
 		page += vmem_altmap_offset(altmap);
 	zone = page_zone(page);
-	ret = __remove_pages(zone, start_pfn, nr_pages, altmap);
-	WARN_ON_ONCE(ret);
+	__remove_pages(zone, start_pfn, nr_pages, altmap);
 	kernel_physical_mapping_remove(start, start + size);
-
-	return ret;
 }
 #endif
 #endif /* CONFIG_MEMORY_HOTPLUG */
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -109,10 +109,10 @@ static inline bool movable_node_is_enabl
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-extern int arch_remove_memory(int nid, u64 start, u64 size,
-				struct vmem_altmap *altmap);
-extern int __remove_pages(struct zone *zone, unsigned long start_pfn,
-	unsigned long nr_pages, struct vmem_altmap *altmap);
+extern void arch_remove_memory(int nid, u64 start, u64 size,
+			       struct vmem_altmap *altmap);
+extern void __remove_pages(struct zone *zone, unsigned long start_pfn,
+			   unsigned long nr_pages, struct vmem_altmap *altmap);
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
 /* reasonably generic interface to expand the physical pages */
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -509,8 +509,8 @@ static void __remove_section(struct zone
  * sure that pages are marked reserved and zones are adjust properly by
  * calling offline_pages().
  */
-int __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
-		 unsigned long nr_pages, struct vmem_altmap *altmap)
+void __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
+		    unsigned long nr_pages, struct vmem_altmap *altmap)
 {
 	unsigned long i;
 	unsigned long map_offset = 0;
@@ -541,7 +541,6 @@ int __remove_pages(struct zone *zone, un
 	}
 
 	set_zone_contiguous(zone);
-	return 0;
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 88/92] mm/hotplug: kill is_dev_zone() usage in __remove_pages()
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (79 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 80/92] mm/memory_hotplug: make __remove_pages() and arch_remove_memory() never fail Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 89/92] drivers/base/node.c: simplify unregister_memory_block_under_nodes() Greg Kroah-Hartman
                   ` (7 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, Dan Williams, David Hildenbrand,
	Oscar Salvador, Michal Hocko, Logan Gunthorpe, Pavel Tatashin,
	Jane Chu, Jeff Moyer, Jérôme Glisse, Jonathan Corbet,
	Mike Rapoport, Toshi Kani, Vlastimil Babka, Wei Yang,
	Jason Gunthorpe, Christoph Hellwig, Andrew Morton,
	Linus Torvalds, Aneesh Kumar K . V

From: Dan Williams <dan.j.williams@intel.com>

commit 96da4350000973ef9310a10d077d65bbc017f093 upstream.

-- snip --

Minor conflict, keep the altmap check.

-- snip --

The zone type check was a leftover from the cleanup that plumbed altmap
through the memory hotplug path, i.e.  commit da024512a1fa "mm: pass the
vmem_altmap to arch_remove_memory and __remove_pages".

Link: http://lkml.kernel.org/r/156092352642.979959.6664333788149363039.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>	[ppc64]
Cc: Michal Hocko <mhocko@suse.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/memory_hotplug.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -507,11 +507,8 @@ void __remove_pages(struct zone *zone, u
 	unsigned long map_offset = 0;
 	int sections_to_remove;
 
-	/* In the ZONE_DEVICE case device driver owns the memory region */
-	if (is_dev_zone(zone)) {
-		if (altmap)
-			map_offset = vmem_altmap_offset(altmap);
-	}
+	if (altmap)
+		map_offset = vmem_altmap_offset(altmap);
 
 	clear_zone_contiguous(zone);
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 89/92] drivers/base/node.c: simplify unregister_memory_block_under_nodes()
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (80 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 88/92] mm/hotplug: kill is_dev_zone() usage in __remove_pages() Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:08 ` [PATCH 4.19 91/92] mm/memory_hotplug: fix try_offline_node() Greg Kroah-Hartman
                   ` (6 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, David Hildenbrand, Rafael J. Wysocki,
	Stephen Rothwell, Pavel Tatashin, Michal Hocko, Oscar Salvador,
	Andrew Morton, Linus Torvalds

From: David Hildenbrand <david@redhat.com>

commit d84f2f5a755208da3f93e17714631485cb3da11c upstream.

We don't allow to offline memory block devices that belong to multiple
numa nodes.  Therefore, such devices can never get removed.  It is
sufficient to process a single node when removing the memory block.  No
need to iterate over each and every PFN.

We already have the nid stored for each memory block.  Make sure that the
nid always has a sane value.

Please note that checking for node_online(nid) is not required.  If we
would have a memory block belonging to a node that is no longer offline,
then we would have a BUG in the node offlining code.

Link: http://lkml.kernel.org/r/20190719135244.15242-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/base/memory.c |    1 +
 drivers/base/node.c   |   39 +++++++++++++++------------------------
 2 files changed, 16 insertions(+), 24 deletions(-)

--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -693,6 +693,7 @@ static int init_memory_block(struct memo
 	mem->state = state;
 	start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	mem->phys_device = arch_get_memory_phys_device(start_pfn);
+	mem->nid = NUMA_NO_NODE;
 
 	ret = register_memory(mem);
 
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -409,8 +409,6 @@ int register_mem_sect_under_node(struct
 	int ret, nid = *(int *)arg;
 	unsigned long pfn, sect_start_pfn, sect_end_pfn;
 
-	mem_blk->nid = nid;
-
 	sect_start_pfn = section_nr_to_pfn(mem_blk->start_section_nr);
 	sect_end_pfn = section_nr_to_pfn(mem_blk->end_section_nr);
 	sect_end_pfn += PAGES_PER_SECTION - 1;
@@ -439,6 +437,13 @@ int register_mem_sect_under_node(struct
 			if (page_nid != nid)
 				continue;
 		}
+
+		/*
+		 * If this memory block spans multiple nodes, we only indicate
+		 * the last processed node.
+		 */
+		mem_blk->nid = nid;
+
 		ret = sysfs_create_link_nowarn(&node_devices[nid]->dev.kobj,
 					&mem_blk->dev.kobj,
 					kobject_name(&mem_blk->dev.kobj));
@@ -454,32 +459,18 @@ int register_mem_sect_under_node(struct
 }
 
 /*
- * Unregister memory block device under all nodes that it spans.
- * Has to be called with mem_sysfs_mutex held (due to unlinked_nodes).
+ * Unregister a memory block device under the node it spans. Memory blocks
+ * with multiple nodes cannot be offlined and therefore also never be removed.
  */
 void unregister_memory_block_under_nodes(struct memory_block *mem_blk)
 {
-	unsigned long pfn, sect_start_pfn, sect_end_pfn;
-	static nodemask_t unlinked_nodes;
-
-	nodes_clear(unlinked_nodes);
-	sect_start_pfn = section_nr_to_pfn(mem_blk->start_section_nr);
-	sect_end_pfn = section_nr_to_pfn(mem_blk->end_section_nr);
-	for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) {
-		int nid;
+	if (mem_blk->nid == NUMA_NO_NODE)
+		return;
 
-		nid = get_nid_for_pfn(pfn);
-		if (nid < 0)
-			continue;
-		if (!node_online(nid))
-			continue;
-		if (node_test_and_set(nid, unlinked_nodes))
-			continue;
-		sysfs_remove_link(&node_devices[nid]->dev.kobj,
-			 kobject_name(&mem_blk->dev.kobj));
-		sysfs_remove_link(&mem_blk->dev.kobj,
-			 kobject_name(&node_devices[nid]->dev.kobj));
-	}
+	sysfs_remove_link(&node_devices[mem_blk->nid]->dev.kobj,
+			  kobject_name(&mem_blk->dev.kobj));
+	sysfs_remove_link(&mem_blk->dev.kobj,
+			  kobject_name(&node_devices[mem_blk->nid]->dev.kobj));
 }
 
 int link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn)



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 91/92] mm/memory_hotplug: fix try_offline_node()
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (81 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 89/92] drivers/base/node.c: simplify unregister_memory_block_under_nodes() Greg Kroah-Hartman
@ 2020-01-28 14:08 ` Greg Kroah-Hartman
  2020-01-28 14:09 ` [PATCH 4.19 92/92] mm/memory_hotplug: shrink zones when offlining memory Greg Kroah-Hartman
                   ` (5 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:08 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, David Hildenbrand, Masayoshi Mizuma,
	Tang Chen, Rafael J. Wysocki, Keith Busch, Jiri Olsa,
	Peter Zijlstra (Intel),
	Jani Nikula, Nayna Jain, Michal Hocko, Oscar Salvador,
	Stephen Rothwell, Dan Williams, Pavel Tatashin, Andrew Morton,
	Linus Torvalds

From: David Hildenbrand <david@redhat.com>

commit 2c91f8fc6c999fe10185d8ad99fda1759f662f70 upstream.

-- snip --

Only contextual issues:
- Unrelated check_and_unmap_cpu_on_node() changes are missing.
- Unrelated walk_memory_blocks() has not been moved/refactored yet.

-- snip --

try_offline_node() is pretty much broken right now:

 - The node span is updated when onlining memory, not when adding it. We
   ignore memory that was mever onlined. Bad.

 - We touch possible garbage memmaps. The pfn_to_nid(pfn) can easily
   trigger a kernel panic. Bad for memory that is offline but also bad
   for subsection hotadd with ZONE_DEVICE, whereby the memmap of the
   first PFN of a section might contain garbage.

 - Sections belonging to mixed nodes are not properly considered.

As memory blocks might belong to multiple nodes, we would have to walk
all pageblocks (or at least subsections) within present sections.
However, we don't have a way to identify whether a memmap that is not
online was initialized (relevant for ZONE_DEVICE).  This makes things
more complicated.

Luckily, we can piggy pack on the node span and the nid stored in memory
blocks.  Currently, the node span is grown when calling
move_pfn_range_to_zone() - e.g., when onlining memory, and shrunk when
removing memory, before calling try_offline_node().  Sysfs links are
created via link_mem_sections(), e.g., during boot or when adding
memory.

If the node still spans memory or if any memory block belongs to the
nid, we don't set the node offline.  As memory blocks that span multiple
nodes cannot get offlined, the nid stored in memory blocks is reliable
enough (for such online memory blocks, the node still spans the memory).

Introduce for_each_memory_block() to efficiently walk all memory blocks.

Note: We will soon stop shrinking the ZONE_DEVICE zone and the node span
when removing ZONE_DEVICE memory to fix similar issues (access of
garbage memmaps) - until we have a reliable way to identify whether
these memmaps were properly initialized.  This implies later, that once
a node had ZONE_DEVICE memory, we won't be able to set a node offline -
which should be acceptable.

Since commit f1dd2cd13c4b ("mm, memory_hotplug: do not associate
hotadded memory to zones until online") memory that is added is not
assoziated with a zone/node (memmap not initialized).  The introducing
commit 60a5a19e7419 ("memory-hotplug: remove sysfs file of node")
already missed that we could have multiple nodes for a section and that
the zone/node span is updated when onlining pages, not when adding them.

I tested this by hotplugging two DIMMs to a memory-less and cpu-less
NUMA node.  The node is properly onlined when adding the DIMMs.  When
removing the DIMMs, the node is properly offlined.

Masayoshi Mizuma reported:

: Without this patch, memory hotplug fails as panic:
:
:  BUG: kernel NULL pointer dereference, address: 0000000000000000
:  ...
:  Call Trace:
:   remove_memory_block_devices+0x81/0xc0
:   try_remove_memory+0xb4/0x130
:   __remove_memory+0xa/0x20
:   acpi_memory_device_remove+0x84/0x100
:   acpi_bus_trim+0x57/0x90
:   acpi_bus_trim+0x2e/0x90
:   acpi_device_hotplug+0x2b2/0x4d0
:   acpi_hotplug_work_fn+0x1a/0x30
:   process_one_work+0x171/0x380
:   worker_thread+0x49/0x3f0
:   kthread+0xf8/0x130
:   ret_from_fork+0x35/0x40

[david@redhat.com: v3]
  Link: http://lkml.kernel.org/r/20191102120221.7553-1-david@redhat.com
Link: http://lkml.kernel.org/r/20191028105458.28320-1-david@redhat.com
Fixes: 60a5a19e7419 ("memory-hotplug: remove sysfs file of node")
Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") # visiable after d0dc12e86b319
Signed-off-by: David Hildenbrand <david@redhat.com>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Nayna Jain <nayna@linux.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/base/memory.c  |   36 ++++++++++++++++++++++++++++++++++++
 include/linux/memory.h |    2 ++
 mm/memory_hotplug.c    |   47 +++++++++++++++++++++++++++++------------------
 3 files changed, 67 insertions(+), 18 deletions(-)

--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -862,3 +862,39 @@ out:
 		printk(KERN_ERR "%s() failed: %d\n", __func__, ret);
 	return ret;
 }
+
+struct for_each_memory_block_cb_data {
+	walk_memory_blocks_func_t func;
+	void *arg;
+};
+
+static int for_each_memory_block_cb(struct device *dev, void *data)
+{
+	struct memory_block *mem = to_memory_block(dev);
+	struct for_each_memory_block_cb_data *cb_data = data;
+
+	return cb_data->func(mem, cb_data->arg);
+}
+
+/**
+ * for_each_memory_block - walk through all present memory blocks
+ *
+ * @arg: argument passed to func
+ * @func: callback for each memory block walked
+ *
+ * This function walks through all present memory blocks, calling func on
+ * each memory block.
+ *
+ * In case func() returns an error, walking is aborted and the error is
+ * returned.
+ */
+int for_each_memory_block(void *arg, walk_memory_blocks_func_t func)
+{
+	struct for_each_memory_block_cb_data cb_data = {
+		.func = func,
+		.arg = arg,
+	};
+
+	return bus_for_each_dev(&memory_subsys, NULL, &cb_data,
+				for_each_memory_block_cb);
+}
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -119,6 +119,8 @@ extern int memory_isolate_notify(unsigne
 extern struct memory_block *find_memory_block_hinted(struct mem_section *,
 							struct memory_block *);
 extern struct memory_block *find_memory_block(struct mem_section *);
+typedef int (*walk_memory_blocks_func_t)(struct memory_block *, void *);
+extern int for_each_memory_block(void *arg, walk_memory_blocks_func_t func);
 #define CONFIG_MEM_BLOCK_SIZE	(PAGES_PER_SECTION<<PAGE_SHIFT)
 #endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
 
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1812,6 +1812,18 @@ static int check_and_unmap_cpu_on_node(p
 	return 0;
 }
 
+static int check_no_memblock_for_node_cb(struct memory_block *mem, void *arg)
+{
+	int nid = *(int *)arg;
+
+	/*
+	 * If a memory block belongs to multiple nodes, the stored nid is not
+	 * reliable. However, such blocks are always online (e.g., cannot get
+	 * offlined) and, therefore, are still spanned by the node.
+	 */
+	return mem->nid == nid ? -EEXIST : 0;
+}
+
 /**
  * try_offline_node
  * @nid: the node ID
@@ -1824,25 +1836,24 @@ static int check_and_unmap_cpu_on_node(p
 void try_offline_node(int nid)
 {
 	pg_data_t *pgdat = NODE_DATA(nid);
-	unsigned long start_pfn = pgdat->node_start_pfn;
-	unsigned long end_pfn = start_pfn + pgdat->node_spanned_pages;
-	unsigned long pfn;
-
-	for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
-		unsigned long section_nr = pfn_to_section_nr(pfn);
-
-		if (!present_section_nr(section_nr))
-			continue;
-
-		if (pfn_to_nid(pfn) != nid)
-			continue;
-
-		/*
-		 * some memory sections of this node are not removed, and we
-		 * can't offline node now.
-		 */
+	int rc;
+
+	/*
+	 * If the node still spans pages (especially ZONE_DEVICE), don't
+	 * offline it. A node spans memory after move_pfn_range_to_zone(),
+	 * e.g., after the memory block was onlined.
+	 */
+	if (pgdat->node_spanned_pages)
+		return;
+
+	/*
+	 * Especially offline memory blocks might not be spanned by the
+	 * node. They will get spanned by the node once they get onlined.
+	 * However, they link to the node in sysfs and can get onlined later.
+	 */
+	rc = for_each_memory_block(&nid, check_no_memblock_for_node_cb);
+	if (rc)
 		return;
-	}
 
 	if (check_and_unmap_cpu_on_node(pgdat))
 		return;



^ permalink raw reply	[flat|nested] 100+ messages in thread

* [PATCH 4.19 92/92] mm/memory_hotplug: shrink zones when offlining memory
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (82 preceding siblings ...)
  2020-01-28 14:08 ` [PATCH 4.19 91/92] mm/memory_hotplug: fix try_offline_node() Greg Kroah-Hartman
@ 2020-01-28 14:09 ` Greg Kroah-Hartman
  2020-01-28 23:03 ` [PATCH 4.19 00/92] 4.19.100-stable review shuah
                   ` (4 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 14:09 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Greg Kroah-Hartman, David Hildenbrand, Oscar Salvador,
	Michal Hocko, Matthew Wilcox (Oracle),
	Aneesh Kumar K.V, Pavel Tatashin, Dan Williams, Logan Gunthorpe,
	Andrew Morton, Linus Torvalds

From: David Hildenbrand <david@redhat.com>

commit feee6b2989165631b17ac6d4ccdbf6759254e85a upstream.

-- snip --

- Missing arm64 hot(un)plug support
- Missing some vmem_altmap_offset() cleanups
- Missing sub-section hotadd support
- Missing unification of mm/hmm.c and kernel/memremap.c

-- snip --

We currently try to shrink a single zone when removing memory.  We use
the zone of the first page of the memory we are removing.  If that
memmap was never initialized (e.g., memory was never onlined), we will
read garbage and can trigger kernel BUGs (due to a stale pointer):

    BUG: unable to handle page fault for address: 000000000000353d
    #PF: supervisor write access in kernel mode
    #PF: error_code(0x0002) - not-present page
    PGD 0 P4D 0
    Oops: 0002 [#1] SMP PTI
    CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
    Workqueue: kacpi_hotplug acpi_hotplug_work_fn
    RIP: 0010:clear_zone_contiguous+0x5/0x10
    Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
    RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
    RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
    RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
    R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
    FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     __remove_pages+0x4b/0x640
     arch_remove_memory+0x63/0x8d
     try_remove_memory+0xdb/0x130
     __remove_memory+0xa/0x11
     acpi_memory_device_remove+0x70/0x100
     acpi_bus_trim+0x55/0x90
     acpi_device_hotplug+0x227/0x3a0
     acpi_hotplug_work_fn+0x1a/0x30
     process_one_work+0x221/0x550
     worker_thread+0x50/0x3b0
     kthread+0x105/0x140
     ret_from_fork+0x3a/0x50
    Modules linked in:
    CR2: 000000000000353d

Instead, shrink the zones when offlining memory or when onlining failed.
Introduce and use remove_pfn_range_from_zone(() for that.  We now
properly shrink the zones, even if we have DIMMs whereby

 - Some memory blocks fall into no zone (never onlined)

 - Some memory blocks fall into multiple zones (offlined+re-onlined)

 - Multiple memory blocks that fall into different zones

Drop the zone parameter (with a potential dubious value) from
__remove_pages() and __remove_section().

Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com
Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e86b319]
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: <stable@vger.kernel.org>	[5.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/ia64/mm/init.c            |    4 +---
 arch/powerpc/mm/mem.c          |   11 +----------
 arch/s390/mm/init.c            |    4 +---
 arch/sh/mm/init.c              |    4 +---
 arch/x86/mm/init_32.c          |    4 +---
 arch/x86/mm/init_64.c          |    8 +-------
 include/linux/memory_hotplug.h |    7 +++++--
 kernel/memremap.c              |    3 +--
 mm/hmm.c                       |    4 +---
 mm/memory_hotplug.c            |   29 ++++++++++++++---------------
 10 files changed, 27 insertions(+), 51 deletions(-)

--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -666,9 +666,7 @@ void arch_remove_memory(int nid, u64 sta
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
-	struct zone *zone;
 
-	zone = page_zone(pfn_to_page(start_pfn));
-	__remove_pages(zone, start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap);
 }
 #endif
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -144,18 +144,9 @@ void __ref arch_remove_memory(int nid, u
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
-	struct page *page;
 	int ret;
 
-	/*
-	 * If we have an altmap then we need to skip over any reserved PFNs
-	 * when querying the zone.
-	 */
-	page = pfn_to_page(start_pfn);
-	if (altmap)
-		page += vmem_altmap_offset(altmap);
-
-	__remove_pages(page_zone(page), start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap);
 
 	/* Remove htab bolted mappings for this section of memory */
 	start = (unsigned long)__va(start);
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -244,10 +244,8 @@ void arch_remove_memory(int nid, u64 sta
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
-	struct zone *zone;
 
-	zone = page_zone(pfn_to_page(start_pfn));
-	__remove_pages(zone, start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap);
 	vmem_remove_mapping(start, size);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -448,9 +448,7 @@ void arch_remove_memory(int nid, u64 sta
 {
 	unsigned long start_pfn = PFN_DOWN(start);
 	unsigned long nr_pages = size >> PAGE_SHIFT;
-	struct zone *zone;
 
-	zone = page_zone(pfn_to_page(start_pfn));
-	__remove_pages(zone, start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -865,10 +865,8 @@ void arch_remove_memory(int nid, u64 sta
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
-	struct zone *zone;
 
-	zone = page_zone(pfn_to_page(start_pfn));
-	__remove_pages(zone, start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap);
 }
 #endif
 
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1146,14 +1146,8 @@ void __ref arch_remove_memory(int nid, u
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
-	struct page *page = pfn_to_page(start_pfn);
-	struct zone *zone;
 
-	/* With altmap the first mapped page is offset from @start */
-	if (altmap)
-		page += vmem_altmap_offset(altmap);
-	zone = page_zone(page);
-	__remove_pages(zone, start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap);
 	kernel_physical_mapping_remove(start, start + size);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -110,8 +110,8 @@ static inline bool movable_node_is_enabl
 
 extern void arch_remove_memory(int nid, u64 start, u64 size,
 			       struct vmem_altmap *altmap);
-extern void __remove_pages(struct zone *zone, unsigned long start_pfn,
-			   unsigned long nr_pages, struct vmem_altmap *altmap);
+extern void __remove_pages(unsigned long start_pfn, unsigned long nr_pages,
+			   struct vmem_altmap *altmap);
 
 /* reasonably generic interface to expand the physical pages */
 extern int __add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
@@ -331,6 +331,9 @@ extern int arch_add_memory(int nid, u64
 		struct vmem_altmap *altmap, bool want_memblock);
 extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap);
+extern void remove_pfn_range_from_zone(struct zone *zone,
+				       unsigned long start_pfn,
+				       unsigned long nr_pages);
 extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern bool is_memblock_offlined(struct memory_block *mem);
 extern int sparse_add_one_section(int nid, unsigned long start_pfn,
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -141,8 +141,7 @@ static void devm_memremap_pages_release(
 	mem_hotplug_begin();
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
 		pfn = align_start >> PAGE_SHIFT;
-		__remove_pages(page_zone(first_page), pfn,
-			       align_size >> PAGE_SHIFT, NULL);
+		__remove_pages(pfn, align_size >> PAGE_SHIFT, NULL);
 	} else {
 		arch_remove_memory(nid, align_start, align_size,
 				pgmap->altmap_valid ? &pgmap->altmap : NULL);
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -997,7 +997,6 @@ static void hmm_devmem_release(void *dat
 	struct hmm_devmem *devmem = data;
 	struct resource *resource = devmem->resource;
 	unsigned long start_pfn, npages;
-	struct zone *zone;
 	struct page *page;
 	int nid;
 
@@ -1006,12 +1005,11 @@ static void hmm_devmem_release(void *dat
 	npages = ALIGN(resource_size(resource), PA_SECTION_SIZE) >> PAGE_SHIFT;
 
 	page = pfn_to_page(start_pfn);
-	zone = page_zone(page);
 	nid = page_to_nid(page);
 
 	mem_hotplug_begin();
 	if (resource->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY)
-		__remove_pages(zone, start_pfn, npages, NULL);
+		__remove_pages(start_pfn, npages, NULL);
 	else
 		arch_remove_memory(nid, start_pfn << PAGE_SHIFT,
 				   npages << PAGE_SHIFT, NULL);
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -449,10 +449,11 @@ static void update_pgdat_span(struct pgl
 	pgdat->node_spanned_pages = node_end_pfn - node_start_pfn;
 }
 
-static void __remove_zone(struct zone *zone, unsigned long start_pfn)
+void __ref remove_pfn_range_from_zone(struct zone *zone,
+				      unsigned long start_pfn,
+				      unsigned long nr_pages)
 {
 	struct pglist_data *pgdat = zone->zone_pgdat;
-	int nr_pages = PAGES_PER_SECTION;
 	unsigned long flags;
 
 #ifdef CONFIG_ZONE_DEVICE
@@ -465,14 +466,17 @@ static void __remove_zone(struct zone *z
 		return;
 #endif
 
+	clear_zone_contiguous(zone);
+
 	pgdat_resize_lock(zone->zone_pgdat, &flags);
 	shrink_zone_span(zone, start_pfn, start_pfn + nr_pages);
 	update_pgdat_span(pgdat);
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
+
+	set_zone_contiguous(zone);
 }
 
-static void __remove_section(struct zone *zone, struct mem_section *ms,
-			     unsigned long map_offset,
+static void __remove_section(struct mem_section *ms, unsigned long map_offset,
 			     struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn;
@@ -483,14 +487,12 @@ static void __remove_section(struct zone
 
 	scn_nr = __section_nr(ms);
 	start_pfn = section_nr_to_pfn((unsigned long)scn_nr);
-	__remove_zone(zone, start_pfn);
 
 	sparse_remove_one_section(ms, map_offset, altmap);
 }
 
 /**
- * __remove_pages() - remove sections of pages from a zone
- * @zone: zone from which pages need to be removed
+ * __remove_pages() - remove sections of pages
  * @phys_start_pfn: starting pageframe (must be aligned to start of a section)
  * @nr_pages: number of pages to remove (must be multiple of section size)
  * @altmap: alternative device page map or %NULL if default memmap is used
@@ -500,8 +502,8 @@ static void __remove_section(struct zone
  * sure that pages are marked reserved and zones are adjust properly by
  * calling offline_pages().
  */
-void __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
-		    unsigned long nr_pages, struct vmem_altmap *altmap)
+void __remove_pages(unsigned long phys_start_pfn, unsigned long nr_pages,
+		    struct vmem_altmap *altmap)
 {
 	unsigned long i;
 	unsigned long map_offset = 0;
@@ -510,8 +512,6 @@ void __remove_pages(struct zone *zone, u
 	if (altmap)
 		map_offset = vmem_altmap_offset(altmap);
 
-	clear_zone_contiguous(zone);
-
 	/*
 	 * We can only remove entire sections
 	 */
@@ -523,12 +523,9 @@ void __remove_pages(struct zone *zone, u
 		unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION;
 
 		cond_resched();
-		__remove_section(zone, __pfn_to_section(pfn), map_offset,
-				 altmap);
+		__remove_section(__pfn_to_section(pfn), map_offset, altmap);
 		map_offset = 0;
 	}
-
-	set_zone_contiguous(zone);
 }
 
 int set_online_page_callback(online_page_callback_t callback)
@@ -898,6 +895,7 @@ failed_addition:
 		 (unsigned long long) pfn << PAGE_SHIFT,
 		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
 	memory_notify(MEM_CANCEL_ONLINE, &arg);
+	remove_pfn_range_from_zone(zone, pfn, nr_pages);
 	mem_hotplug_done();
 	return ret;
 }
@@ -1682,6 +1680,7 @@ repeat:
 	writeback_set_ratelimit();
 
 	memory_notify(MEM_OFFLINE, &arg);
+	remove_pfn_range_from_zone(zone, start_pfn, nr_pages);
 	mem_hotplug_done();
 	return 0;
 



^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4.19 59/92] sd: Fix REQ_OP_ZONE_REPORT completion handling
  2020-01-28 14:08 ` [PATCH 4.19 59/92] sd: Fix REQ_OP_ZONE_REPORT completion handling Greg Kroah-Hartman
@ 2020-01-28 18:02   ` Pavel Machek
  2020-01-28 18:15     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 100+ messages in thread
From: Pavel Machek @ 2020-01-28 18:02 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, Damien Le Moal, Masato Suzuki, Martin K. Petersen

[-- Attachment #1: Type: text/plain, Size: 1922 bytes --]

Hi!

> From: Masato Suzuki <masato.suzuki@wdc.com>
> 
>

> ZBC/ZAC report zones command may return less bytes than requested if the
> number of matching zones for the report request is small. However, unlike
> read or write commands, the remainder of incomplete report zones commands
> cannot be automatically requested by the block layer: the start sector of
> the next report cannot be known, and the report reply may not be 512B
> aligned for SAS drives (a report zone reply size is always a multiple of
> 64B). The regular request completion code executing bio_advance() and
> restart of the command remainder part currently causes invalid zone
> descriptor data to be reported to the caller if the report zone size is
> smaller than 512B (a case that can happen easily for a report of the last
> zones of a SAS drive for example).

What is the story here? Mainline does not seem to have this patch, so
this is not the case of "upstream commit xxx" line simply missing. If
the same bug is fixed in mainline different way, it would be nice to
point to that commit..

Mainline does not handle REQ_OP_ZONE_REPORT in sd.c at all..

Best regards,
								Pavel


> +++ b/drivers/scsi/sd.c
> @@ -1969,9 +1969,13 @@ static int sd_done(struct scsi_cmnd *SCp
>  		}
>  		break;
>  	case REQ_OP_ZONE_REPORT:
> +		/* To avoid that the block layer performs an incorrect
> +		 * bio_advance() call and restart of the remainder of
> +		 * incomplete report zone BIOs, always indicate a full
> +		 * completion of REQ_OP_ZONE_REPORT.
> +		 */
>  		if (!result) {
> -			good_bytes = scsi_bufflen(SCpnt)
> -				- scsi_get_resid(SCpnt);
> +			good_bytes = scsi_bufflen(SCpnt);
>  			scsi_set_resid(SCpnt, 0);
>  		} else {
>  			good_bytes = 0;
> 

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4.19 67/92] netfilter: nf_tables: add __nft_chain_type_get()
  2020-01-28 14:08 ` [PATCH 4.19 67/92] netfilter: nf_tables: add __nft_chain_type_get() Greg Kroah-Hartman
@ 2020-01-28 18:13   ` Pavel Machek
  0 siblings, 0 replies; 100+ messages in thread
From: Pavel Machek @ 2020-01-28 18:13 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, syzbot+156a04714799b1d480bc, Pablo Neira Ayuso

[-- Attachment #1: Type: text/plain, Size: 1149 bytes --]

Hi!

> From: Pablo Neira Ayuso <pablo@netfilter.org>
> 
> commit 826035498ec14b77b62a44f0cb6b94d45530db6f upstream.
> 
> This new helper function validates that unknown family and chain type
> coming from userspace do not trigger an out-of-bound array access. Bail
> out in case __nft_chain_type_get() returns NULL from
> nft_chain_parse_hook().

> +++ b/net/netfilter/nf_tables_api.c
> @@ -472,14 +472,27 @@ static inline u64 nf_tables_alloc_handle
>  static const struct nft_chain_type *chain_type[NFPROTO_NUMPROTO][NFT_CHAIN_T_MAX];
>  
>  static const struct nft_chain_type *
> +__nft_chain_type_get(u8 family, enum nft_chain_types type)
> +{
> +	if (family >= NFPROTO_NUMPROTO ||
> +	    type >= NFT_CHAIN_T_MAX)
> +		return NULL;
> +
> +	return chain_type[family][type];
> +}

Are enum types guaranteed to be unsigned on all compilers we care
about? Google says they can be signed, too. So, should the test be
"((unsigned int) type) >= ..." ?

Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4.19 59/92] sd: Fix REQ_OP_ZONE_REPORT completion handling
  2020-01-28 18:02   ` Pavel Machek
@ 2020-01-28 18:15     ` Greg Kroah-Hartman
  2020-01-29  1:05       ` Damien Le Moal
  0 siblings, 1 reply; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-28 18:15 UTC (permalink / raw)
  To: Pavel Machek
  Cc: linux-kernel, stable, Damien Le Moal, Masato Suzuki, Martin K. Petersen

On Tue, Jan 28, 2020 at 07:02:31PM +0100, Pavel Machek wrote:
> Hi!
> 
> > From: Masato Suzuki <masato.suzuki@wdc.com>
> > 
> >
> 
> > ZBC/ZAC report zones command may return less bytes than requested if the
> > number of matching zones for the report request is small. However, unlike
> > read or write commands, the remainder of incomplete report zones commands
> > cannot be automatically requested by the block layer: the start sector of
> > the next report cannot be known, and the report reply may not be 512B
> > aligned for SAS drives (a report zone reply size is always a multiple of
> > 64B). The regular request completion code executing bio_advance() and
> > restart of the command remainder part currently causes invalid zone
> > descriptor data to be reported to the caller if the report zone size is
> > smaller than 512B (a case that can happen easily for a report of the last
> > zones of a SAS drive for example).
> 
> What is the story here? Mainline does not seem to have this patch, so
> this is not the case of "upstream commit xxx" line simply missing. If
> the same bug is fixed in mainline different way, it would be nice to
> point to that commit..

Yes, this is not needed in 5.4 as it was rewritten differently there.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4.19 00/92] 4.19.100-stable review
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (83 preceding siblings ...)
  2020-01-28 14:09 ` [PATCH 4.19 92/92] mm/memory_hotplug: shrink zones when offlining memory Greg Kroah-Hartman
@ 2020-01-28 23:03 ` shuah
  2020-01-29  4:54 ` Naresh Kamboju
                   ` (3 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: shuah @ 2020-01-28 23:03 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, linux, patches, ben.hutchings, lkft-triage,
	stable, shuah

On 1/28/20 7:07 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.100 release.
> There are 92 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu, 30 Jan 2020 13:57:09 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.100-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4.19 59/92] sd: Fix REQ_OP_ZONE_REPORT completion handling
  2020-01-28 18:15     ` Greg Kroah-Hartman
@ 2020-01-29  1:05       ` Damien Le Moal
  0 siblings, 0 replies; 100+ messages in thread
From: Damien Le Moal @ 2020-01-29  1:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Pavel Machek
  Cc: linux-kernel, stable, Masato Suzuki, Martin K. Petersen

On 2020/01/29 3:15, Greg Kroah-Hartman wrote:
> On Tue, Jan 28, 2020 at 07:02:31PM +0100, Pavel Machek wrote:
>> Hi!
>>
>>> From: Masato Suzuki <masato.suzuki@wdc.com>
>>>
>>>
>>
>>> ZBC/ZAC report zones command may return less bytes than requested if the
>>> number of matching zones for the report request is small. However, unlike
>>> read or write commands, the remainder of incomplete report zones commands
>>> cannot be automatically requested by the block layer: the start sector of
>>> the next report cannot be known, and the report reply may not be 512B
>>> aligned for SAS drives (a report zone reply size is always a multiple of
>>> 64B). The regular request completion code executing bio_advance() and
>>> restart of the command remainder part currently causes invalid zone
>>> descriptor data to be reported to the caller if the report zone size is
>>> smaller than 512B (a case that can happen easily for a report of the last
>>> zones of a SAS drive for example).
>>
>> What is the story here? Mainline does not seem to have this patch, so
>> this is not the case of "upstream commit xxx" line simply missing. If
>> the same bug is fixed in mainline different way, it would be nice to
>> point to that commit..
> 
> Yes, this is not needed in 5.4 as it was rewritten differently there.

Correct. REQ_OP_ZONE_REPORT was dropped with kernel 4.20 and report zones
is now handled through a device file method with scsi_execute() instead of
a BIO. So this bug does not exist in kernels 4.20 and upward.

> 
> thanks,
> 
> greg k-h
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4.19 00/92] 4.19.100-stable review
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (84 preceding siblings ...)
  2020-01-28 23:03 ` [PATCH 4.19 00/92] 4.19.100-stable review shuah
@ 2020-01-29  4:54 ` Naresh Kamboju
  2020-01-29 11:31 ` Pavel Machek
                   ` (2 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Naresh Kamboju @ 2020-01-29  4:54 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: open list, Linus Torvalds, Andrew Morton, Guenter Roeck,
	Shuah Khan, patches, Ben Hutchings, lkft-triage, linux- stable

On Tue, 28 Jan 2020 at 19:57, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> This is the start of the stable review cycle for the 4.19.100 release.
> There are 92 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu, 30 Jan 2020 13:57:09 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>         https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.100-rc1.gz
> or in the git tree and branch at:
>         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Summary
------------------------------------------------------------------------

kernel: 4.19.100-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.19.y
git commit: 26d743063ae6ea030faef12264de0a0156d5452d
git describe: v4.19.99-93-g26d743063ae6
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.19-oe/build/v4.19.99-93-g26d743063ae6


No regressions (compared to build v4.19.99)


No fixes (compared to build v4.19.99)

Ran 24366 total tests in the following environments and test suites.

Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64

Test Suites
-----------
* build
* install-android-platform-tools-r2600
* kselftest
* libhugetlbfs
* linux-log-parser
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-cpuhotplug-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* perf
* spectre-meltdown-checker-test
* v4l2-compliance
* ltp-fs-tests
* network-basic-tests
* ltp-open-posix-tests
* kvm-unit-tests
* ssuite
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none

-- 
Linaro LKFT
https://lkft.linaro.org

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4.19 00/92] 4.19.100-stable review
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (85 preceding siblings ...)
  2020-01-29  4:54 ` Naresh Kamboju
@ 2020-01-29 11:31 ` Pavel Machek
  2020-01-29 12:57   ` Greg Kroah-Hartman
       [not found] ` <20200128135809.344954797-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org>
  2020-01-29 14:43 ` Guenter Roeck
  88 siblings, 1 reply; 100+ messages in thread
From: Pavel Machek @ 2020-01-29 11:31 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, linux, shuah, patches,
	ben.hutchings, lkft-triage, stable

[-- Attachment #1: Type: text/plain, Size: 685 bytes --]

Hi!

> This is the start of the stable review cycle for the 4.19.100 release.
> There are 92 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu, 30 Jan 2020 13:57:09 +0000.
> Anything received after that time might be too late.

It builds and basic tests work in our configurations.

https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/pipelines/112957173

Best regards,
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4.19 00/92] 4.19.100-stable review
  2020-01-29 11:31 ` Pavel Machek
@ 2020-01-29 12:57   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 100+ messages in thread
From: Greg Kroah-Hartman @ 2020-01-29 12:57 UTC (permalink / raw)
  To: Pavel Machek
  Cc: linux-kernel, torvalds, akpm, linux, shuah, patches,
	ben.hutchings, lkft-triage, stable

On Wed, Jan 29, 2020 at 12:31:06PM +0100, Pavel Machek wrote:
> Hi!
> 
> > This is the start of the stable review cycle for the 4.19.100 release.
> > There are 92 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Thu, 30 Jan 2020 13:57:09 +0000.
> > Anything received after that time might be too late.
> 
> It builds and basic tests work in our configurations.
> 
> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/pipelines/112957173

That's nice to see, thanks for testing and letting me know.

greg k-h

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4.19 00/92] 4.19.100-stable review
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
@ 2020-01-29 13:16     ` Jon Hunter
  2020-01-28 14:07 ` [PATCH 4.19 02/92] firestream: fix memory leaks Greg Kroah-Hartman
                       ` (87 subsequent siblings)
  88 siblings, 0 replies; 100+ messages in thread
From: Jon Hunter @ 2020-01-29 13:16 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	linux-0h96xk9xTtrk1uMJSBkQmQ, shuah-DgEjT+Ai2ygdnm+yROfE0A,
	patches-ssFOTAMYnuFg9hUCZPvPmw,
	ben.hutchings-4yDnlxn2s6sWdaTGBSpHTA,
	lkft-triage-cunTk1MwBs8s++Sfvej+rw,
	stable-u79uwXL29TY76Z2rM5mHXA, linux-tegra


On 28/01/2020 14:07, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.100 release.
> There are 92 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu, 30 Jan 2020 13:57:09 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.100-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

All tests are passing for Tegra ...

Test results for stable-v4.19:
    11 builds:	11 pass, 0 fail
    22 boots:	22 pass, 0 fail
    32 tests:	32 pass, 0 fail

Linux version:	4.19.100-rc1-g26d743063ae6
Boards tested:	tegra124-jetson-tk1, tegra186-p2771-0000,
                tegra194-p2972-0000, tegra20-ventana,
                tegra210-p2371-2180, tegra30-cardhu-a04

Cheers
Jon

-- 
nvpublic

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4.19 00/92] 4.19.100-stable review
@ 2020-01-29 13:16     ` Jon Hunter
  0 siblings, 0 replies; 100+ messages in thread
From: Jon Hunter @ 2020-01-29 13:16 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, linux, shuah, patches, ben.hutchings,
	lkft-triage, stable, linux-tegra


On 28/01/2020 14:07, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.100 release.
> There are 92 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu, 30 Jan 2020 13:57:09 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.100-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

All tests are passing for Tegra ...

Test results for stable-v4.19:
    11 builds:	11 pass, 0 fail
    22 boots:	22 pass, 0 fail
    32 tests:	32 pass, 0 fail

Linux version:	4.19.100-rc1-g26d743063ae6
Boards tested:	tegra124-jetson-tk1, tegra186-p2771-0000,
                tegra194-p2972-0000, tegra20-ventana,
                tegra210-p2371-2180, tegra30-cardhu-a04

Cheers
Jon

-- 
nvpublic

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4.19 00/92] 4.19.100-stable review
  2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
                   ` (87 preceding siblings ...)
       [not found] ` <20200128135809.344954797-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org>
@ 2020-01-29 14:43 ` Guenter Roeck
  88 siblings, 0 replies; 100+ messages in thread
From: Guenter Roeck @ 2020-01-29 14:43 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, shuah, patches, ben.hutchings,
	lkft-triage, stable

On Tue, Jan 28, 2020 at 03:07:28PM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.100 release.
> There are 92 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu, 30 Jan 2020 13:57:09 +0000.
> Anything received after that time might be too late.
> 

Build results:
	total: 156 pass: 156 fail: 0
Qemu test results:
	total: 382 pass: 382 fail: 0

Guenter

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4.19 63/92] libertas: Fix two buffer overflows at parsing bss descriptor
  2020-01-28 14:08 ` [PATCH 4.19 63/92] libertas: Fix two buffer overflows at parsing bss descriptor Greg Kroah-Hartman
@ 2020-01-29 22:47   ` Pavel Machek
  0 siblings, 0 replies; 100+ messages in thread
From: Pavel Machek @ 2020-01-29 22:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, kbuild test robot, Wen Huang, Kalle Valo

[-- Attachment #1: Type: text/plain, Size: 1308 bytes --]

Hi!

> From: Wen Huang <huangwenabc@gmail.com>
> 
> commit e5e884b42639c74b5b57dc277909915c0aefc8bb upstream.

> --- a/drivers/net/wireless/marvell/libertas/cfg.c
> +++ b/drivers/net/wireless/marvell/libertas/cfg.c
> @@ -1717,6 +1721,9 @@ static int lbs_ibss_join_existing(struct
>  	struct cmd_ds_802_11_ad_hoc_join cmd;
>  	u8 preamble = RADIO_PREAMBLE_SHORT;
>  	int ret = 0;
> +	int hw, i;
> +	u8 rates_max;
> +	u8 *rates;
>  
>  	/* TODO: set preamble based on scan result */
>  	ret = lbs_set_radio(priv, preamble, 1);
> @@ -1775,9 +1782,12 @@ static int lbs_ibss_join_existing(struct
>  	if (!rates_eid) {
>  		lbs_add_rates(cmd.bss.rates);
>  	} else {
> -		int hw, i;
> -		u8 rates_max = rates_eid[1];
> -		u8 *rates = cmd.bss.rates;
> +		rates_max = rates_eid[1];

I believe original version (with variables being local to the else)
was better.

> +		if (rates_max > MAX_RATES) {
> +			lbs_deb_join("invalid rates");
> +			goto out;
> +		}
> +		rates = cmd.bss.rates;

"goto out" goes to "return ret". ret will be 0 at this point, so this
will return success. I don't think that's right.

Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4.19 43/92] do_last(): fetch directory ->i_mode and ->i_uid before its too late
  2020-01-28 14:08 ` [PATCH 4.19 43/92] do_last(): fetch directory ->i_mode and ->i_uid before its too late Greg Kroah-Hartman
@ 2020-01-31 10:08   ` Rantala, Tommi T. (Nokia - FI/Espoo)
  2020-01-31 12:20     ` Al Viro
  0 siblings, 1 reply; 100+ messages in thread
From: Rantala, Tommi T. (Nokia - FI/Espoo) @ 2020-01-31 10:08 UTC (permalink / raw)
  To: gregkh, linux-kernel; +Cc: stable, viro

On Tue, 2020-01-28 at 15:08 +0100, Greg Kroah-Hartman wrote:
> From: Al Viro <viro@zeniv.linux.org.uk>
> 
> commit d0cb50185ae942b03c4327be322055d622dc79f6 upstream.
> 
> may_create_in_sticky() call is done when we already have dropped the
> reference to dir.
> 
> Fixes: 30aba6656f61e (namei: allow restricted O_CREAT of FIFOs and
> regular files)
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> ---
>  fs/namei.c |   17 ++++++++++-------
>  1 file changed, 10 insertions(+), 7 deletions(-)
> 
> --- a/fs/namei.c
> +++ b/fs/namei.c
> [...]
> @@ -3258,6 +3259,8 @@ static int do_last(struct nameidata *nd,
>  		   struct file *file, const struct open_flags *op)
>  {
>  	struct dentry *dir = nd->path.dentry;
> +	kuid_t dir_uid = dir->d_inode->i_uid;

I hit the following oops in 4.19.100 while running kselftests.

fs/namei.c:3262 matches the line above.

Any ideas?
-Tommi


[  224.682489] test_firmware: batched loading 'tmp.YlpRSimRmc' custom
fallback mechanism 4 times
[  224.686656] misc test_firmware: Direct firmware load for tmp.YlpRSimRmc
failed with error -2
[  224.691091] misc test_firmware: Direct firmware load for tmp.YlpRSimRmc
failed with error -2
[  228.593757] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked
and stat_runtime require the kernel parameter schedstats=enable or
kernel.sched_schedstats=1
[  239.451836] BUG: unable to handle kernel NULL pointer dereference at
0000000000000004
[  239.460454] PGD 0 P4D 0
[  239.461875] Oops: 0000 [#1] SMP PTI
[  239.463696] CPU: 0 PID: 13450 Comm: ftracetest Not tainted 4.19.100-
1.x86_64 #1
[  239.467604] Hardware name: Red Hat OpenStack Compute, BIOS 1.11.0-2.el7 
04/01/2014
[  239.471254] RIP: 0010:path_openat (fs/namei.c:3262 fs/namei.c:3537) 
[  239.482135] RSP: 0018:ffffb301c10e7cf8 EFLAGS: 00010246
[  239.484757] RAX: 0000000000000000 RBX: ffff99a8f57e1dc0 RCX:
0000000000000004
[  239.488229] RDX: 0000000000000000 RSI: 0000000c00000000 RDI:
ffff99a881d49900
[  239.507724] RBP: ffffb301c10e7de0 R08: ffffb301c10e7c44 R09:
61c8864680b583eb
[  239.511222] R10: 8080807fffffffff R11: d0ffffff979c8b96 R12:
0000000000008241
[  239.514709] R13: 0000000000000000 R14: ffff99a881d49900 R15:
ffffb301c10e7eec
[  239.518287] FS:  00007f1bda771740(0000) GS:ffff99a8f7a00000(0000)
knlGS:0000000000000000
[  239.522233] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  239.524964] CR2: 0000000000000004 CR3: 00000001c2196005 CR4:
00000000001606f0
[  239.528309] DR0: 0000000000405118 DR1: 0000000000000000 DR2:
0000000000000000
[  239.531599] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000600
[  239.534946] Call Trace:
[  239.536265] ? __switch_to_asm (arch/x86/entry/entry_64.S:374) 
[  239.538164] do_filp_open (fs/namei.c:3568) 
[  239.539885] ? __switch_to_asm (arch/x86/entry/entry_64.S:374) 
[  239.541747] ? __switch_to_asm (arch/x86/entry/entry_64.S:374) 
[  239.543633] do_sys_open (fs/open.c:1089) 
[  239.545392] do_syscall_64 (arch/x86/entry/common.c:293) 
[  239.547204] ? prepare_exit_to_usermode (arch/x86/entry/common.c:198) 
[  239.549424] entry_SYSCALL_64_after_hwframe
(arch/x86/entry/entry_64.S:247) 
[  239.551815] RIP: 0033:0x7f1bda86617b
[  239.561833] RSP: 002b:00007fffbd62ff10 EFLAGS: 00000246 ORIG_RAX:
0000000000000101
[  239.565335] RAX: ffffffffffffffda RBX: 0000559cde8e9480 RCX:
00007f1bda86617b
[  239.568576] RDX: 0000000000000241 RSI: 0000559cde908d90 RDI:
00000000ffffff9c
[  239.571810] RBP: 0000559cde908d90 R08: 0000000000000000 R09:
0000000000000020
[  239.575054] R10: 00000000000001b6 R11: 0000000000000246 R12:
0000000000000241
[  239.578277] R13: 0000000000000000 R14: 0000000000000001 R15:
0000559cde908d90
[  239.581512] Modules linked in: test_firmware kvm_intel kvm irqbypass
sch_fq_codel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper
ata_piix dm_mirror dm_region_hash dm_log dm_mod autofs4
[  239.589275] CR2: 0000000000000004
[  239.590972] ---[ end trace 4b84bf2ccd45a421 ]---
[  239.593205] RIP: 0010:path_openat (fs/namei.c:3262 fs/namei.c:3537)
[  239.603919] RSP: 0018:ffffb301c10e7cf8 EFLAGS: 00010246
[  239.606473] RAX: 0000000000000000 RBX: ffff99a8f57e1dc0 RCX:
0000000000000004
[  239.609775] RDX: 0000000000000000 RSI: 0000000c00000000 RDI:
ffff99a881d49900
[  239.613014] RBP: ffffb301c10e7de0 R08: ffffb301c10e7c44 R09:
61c8864680b583eb
[  239.616244] R10: 8080807fffffffff R11: d0ffffff979c8b96 R12:
0000000000008241
[  239.619457] R13: 0000000000000000 R14: ffff99a881d49900 R15:
ffffb301c10e7eec
[  239.622911] FS:  00007f1bda771740(0000) GS:ffff99a8f7a00000(0000)
knlGS:0000000000000000
[  239.626611] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  239.629254] CR2: 0000000000000004 CR3: 00000001c2196005 CR4:
00000000001606f0
[  239.632512] DR0: 0000000000405118 DR1: 0000000000000000 DR2:
0000000000000000
[  239.635745] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000600
[  243.805958] trace_kprobe: Parse error at argument[0]. (-22)
[  243.879983] trace_probe: x64 type has no corresponding fetch method.
[  243.882922] trace_kprobe: Parse error at argument[0]. (-22)




> +	umode_t dir_mode = dir->d_inode->i_mode;
>  	int open_flag = op->open_flag;
>  	bool will_truncate = (open_flag & O_TRUNC) != 0;
>  	bool got_write = false;
> @@ -3393,7 +3396,7 @@ finish_open:
>  		error = -EISDIR;
>  		if (d_is_dir(nd->path.dentry))
>  			goto out;
> -		error = may_create_in_sticky(dir,
> +		error = may_create_in_sticky(dir_mode, dir_uid,
>  					     d_backing_inode(nd-
> >path.dentry));
>  		if (unlikely(error))
>  			goto out;
> 
> 


^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4.19 43/92] do_last(): fetch directory ->i_mode and ->i_uid before its too late
  2020-01-31 10:08   ` Rantala, Tommi T. (Nokia - FI/Espoo)
@ 2020-01-31 12:20     ` Al Viro
  2020-01-31 13:57       ` Rantala, Tommi T. (Nokia - FI/Espoo)
  0 siblings, 1 reply; 100+ messages in thread
From: Al Viro @ 2020-01-31 12:20 UTC (permalink / raw)
  To: Rantala, Tommi T. (Nokia - FI/Espoo); +Cc: gregkh, linux-kernel, stable

On Fri, Jan 31, 2020 at 10:08:37AM +0000, Rantala, Tommi T. (Nokia - FI/Espoo) wrote:
> On Tue, 2020-01-28 at 15:08 +0100, Greg Kroah-Hartman wrote:
> > From: Al Viro <viro@zeniv.linux.org.uk>
> > 
> > commit d0cb50185ae942b03c4327be322055d622dc79f6 upstream.
> > 
> > may_create_in_sticky() call is done when we already have dropped the
> > reference to dir.
> > 
> > Fixes: 30aba6656f61e (namei: allow restricted O_CREAT of FIFOs and
> > regular files)
> > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > 
> > ---
> >  fs/namei.c |   17 ++++++++++-------
> >  1 file changed, 10 insertions(+), 7 deletions(-)
> > 
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > [...]
> > @@ -3258,6 +3259,8 @@ static int do_last(struct nameidata *nd,
> >  		   struct file *file, const struct open_flags *op)
> >  {
> >  	struct dentry *dir = nd->path.dentry;
> > +	kuid_t dir_uid = dir->d_inode->i_uid;
> 
> I hit the following oops in 4.19.100 while running kselftests.
> 
> fs/namei.c:3262 matches the line above.
> 
> Any ideas?

Yes.  Make those two line
	kuid_t dir_uid = nd->inode->i_uid;
	umode_t dir_mode = nd->inode->i_mode;

I'm pretty sure that I know which way I'd fucked up there; we can
get here in RCU mode with stale nd->path.dentry (that would make
the thing fail with -ECHILD. with retry in non-RCU mode).  In
non-stale case nd->inode is the same as nd->path.dentry->d_inode
and it's always pointing to a struct inode that hadn't been
freed yet.

^ permalink raw reply	[flat|nested] 100+ messages in thread

* Re: [PATCH 4.19 43/92] do_last(): fetch directory ->i_mode and ->i_uid before its too late
  2020-01-31 12:20     ` Al Viro
@ 2020-01-31 13:57       ` Rantala, Tommi T. (Nokia - FI/Espoo)
  0 siblings, 0 replies; 100+ messages in thread
From: Rantala, Tommi T. (Nokia - FI/Espoo) @ 2020-01-31 13:57 UTC (permalink / raw)
  To: viro; +Cc: gregkh, stable, linux-kernel

On Fri, 2020-01-31 at 12:20 +0000, Al Viro wrote:
> On Fri, Jan 31, 2020 at 10:08:37AM +0000, Rantala, Tommi T. (Nokia -
> FI/Espoo) wrote:
> > On Tue, 2020-01-28 at 15:08 +0100, Greg Kroah-Hartman wrote:
> > > From: Al Viro <viro@zeniv.linux.org.uk>
> > > 
> > > @@ -3258,6 +3259,8 @@ static int do_last(struct nameidata *nd,
> > >  		   struct file *file, const struct open_flags *op)
> > >  {
> > >  	struct dentry *dir = nd->path.dentry;
> > > +	kuid_t dir_uid = dir->d_inode->i_uid;
> > 
> > I hit the following oops in 4.19.100 while running kselftests.
> > 
> > fs/namei.c:3262 matches the line above.
> > 
> > Any ideas?
> 
> Yes.  Make those two line
> 	kuid_t dir_uid = nd->inode->i_uid;
> 	umode_t dir_mode = nd->inode->i_mode;
> 
> I'm pretty sure that I know which way I'd fucked up there; we can
> get here in RCU mode with stale nd->path.dentry (that would make
> the thing fail with -ECHILD. with retry in non-RCU mode).  In
> non-stale case nd->inode is the same as nd->path.dentry->d_inode
> and it's always pointing to a struct inode that hadn't been
> freed yet.

The oops does not seem to reproduce easily with kselftests, I've only hit
it once with 4.19.100.

Made the change in fs/namei.c, so far no ill effects.
I'll continue running the kselftests...

-Tommi


^ permalink raw reply	[flat|nested] 100+ messages in thread

end of thread, other threads:[~2020-01-31 13:57 UTC | newest]

Thread overview: 100+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-28 14:07 [PATCH 4.19 00/92] 4.19.100-stable review Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 01/92] can, slip: Protect tty->disc_data in write_wakeup and close with RCU Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 02/92] firestream: fix memory leaks Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 03/92] gtp: make sure only SOCK_DGRAM UDP sockets are accepted Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 04/92] ipv6: sr: remove SKB_GSO_IPXIP6 on End.D* actions Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 05/92] net: bcmgenet: Use netif_tx_napi_add() for TX NAPI Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 06/92] net: cxgb3_main: Add CAP_NET_ADMIN check to CHELSIO_GET_MEM Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 07/92] net: ip6_gre: fix moving ip6gre between namespaces Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 08/92] net, ip6_tunnel: fix namespaces move Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 09/92] net, ip_tunnel: " Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 10/92] net: rtnetlink: validate IFLA_MTU attribute in rtnl_create_link() Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 11/92] net_sched: fix datalen for ematch Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 12/92] net-sysfs: Fix reference count leak in rx|netdev_queue_add_kobject Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 13/92] net-sysfs: fix netdev_queue_add_kobject() breakage Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 14/92] net-sysfs: Call dev_hold always in netdev_queue_add_kobject Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 15/92] net-sysfs: Call dev_hold always in rx_queue_add_kobject Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 16/92] net-sysfs: Fix reference count leak Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 17/92] net: usb: lan78xx: Add .ndo_features_check Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 18/92] Revert "udp: do rmem bulk free even if the rx sk queue is empty" Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 19/92] tcp_bbr: improve arithmetic division in bbr_update_bw() Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 20/92] tcp: do not leave dangling pointers in tp->highest_sack Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 21/92] tun: add mutex_unlock() call and napi.skb clearing in tun_get_user() Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 22/92] afs: Fix characters allowed into cell names Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 23/92] hwmon: (adt7475) Make volt2reg return same reg as reg2volt input Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 24/92] hwmon: (core) Do not use device managed functions for memory allocations Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 25/92] PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 26/92] tracing: trigger: Replace unneeded RCU-list traversals Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 27/92] Input: keyspan-remote - fix control-message timeouts Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 28/92] Revert "Input: synaptics-rmi4 - dont increment rmiaddr for SMBus transfers" Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 29/92] ARM: 8950/1: ftrace/recordmcount: filter relocation types Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 30/92] mmc: tegra: fix SDR50 tuning override Greg Kroah-Hartman
2020-01-28 14:07 ` [PATCH 4.19 31/92] mmc: sdhci: fix minimum clock rate for v3 controller Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 32/92] Documentation: Document arm64 kpti control Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 33/92] Input: pm8xxx-vib - fix handling of separate enable register Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 34/92] Input: sur40 - fix interface sanity checks Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 35/92] Input: gtco - fix endpoint sanity check Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 36/92] Input: aiptek " Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 37/92] Input: pegasus_notetaker " Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 38/92] Input: sun4i-ts - add a check for devm_thermal_zone_of_sensor_register Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 39/92] netfilter: nft_osf: add missing check for DREG attribute Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 40/92] hwmon: (nct7802) Fix voltage limits to wrong registers Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 41/92] scsi: RDMA/isert: Fix a recently introduced regression related to logout Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 42/92] tracing: xen: Ordered comparison of function pointers Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 43/92] do_last(): fetch directory ->i_mode and ->i_uid before its too late Greg Kroah-Hartman
2020-01-31 10:08   ` Rantala, Tommi T. (Nokia - FI/Espoo)
2020-01-31 12:20     ` Al Viro
2020-01-31 13:57       ` Rantala, Tommi T. (Nokia - FI/Espoo)
2020-01-28 14:08 ` [PATCH 4.19 44/92] net/sonic: Add mutual exclusion for accessing shared state Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 45/92] net/sonic: Clear interrupt flags immediately Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 46/92] net/sonic: Use MMIO accessors Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 47/92] net/sonic: Fix interface error stats collection Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 48/92] net/sonic: Fix receive buffer handling Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 49/92] net/sonic: Avoid needless receive descriptor EOL flag updates Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 50/92] net/sonic: Improve receive descriptor status flag check Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 51/92] net/sonic: Fix receive buffer replenishment Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 52/92] net/sonic: Quiesce SONIC before re-initializing descriptor memory Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 53/92] net/sonic: Fix command register usage Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 54/92] net/sonic: Fix CAM initialization Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 55/92] net/sonic: Prevent tx watchdog timeout Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 56/92] tracing: Use hist triggers var_ref array to destroy var_refs Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 57/92] tracing: Remove open-coding of hist trigger var_ref management Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 58/92] tracing: Fix histogram code when expression has same var as value Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 59/92] sd: Fix REQ_OP_ZONE_REPORT completion handling Greg Kroah-Hartman
2020-01-28 18:02   ` Pavel Machek
2020-01-28 18:15     ` Greg Kroah-Hartman
2020-01-29  1:05       ` Damien Le Moal
2020-01-28 14:08 ` [PATCH 4.19 60/92] crypto: geode-aes - switch to skcipher for cbc(aes) fallback Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 61/92] coresight: etb10: Do not call smp_processor_id from preemptible Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 62/92] coresight: tmc-etf: " Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 63/92] libertas: Fix two buffer overflows at parsing bss descriptor Greg Kroah-Hartman
2020-01-29 22:47   ` Pavel Machek
2020-01-28 14:08 ` [PATCH 4.19 64/92] media: v4l2-ioctl.c: zero reserved fields for S/TRY_FMT Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 65/92] scsi: iscsi: Avoid potential deadlock in iscsi_if_rx func Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 66/92] netfilter: ipset: use bitmap infrastructure completely Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 67/92] netfilter: nf_tables: add __nft_chain_type_get() Greg Kroah-Hartman
2020-01-28 18:13   ` Pavel Machek
2020-01-28 14:08 ` [PATCH 4.19 68/92] net/x25: fix nonblocking connect Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 69/92] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 70/92] mm, sparse: drop pgdat_resize_lock in sparse_add/remove_one_section() Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 71/92] mm, sparse: pass nid instead of pgdat to sparse_add_one_section() Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 72/92] drivers/base/memory.c: remove an unnecessary check on NR_MEM_SECTIONS Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 73/92] mm, memory_hotplug: add nid parameter to arch_remove_memory Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 74/92] mm/memory_hotplug: release memory resource after arch_remove_memory() Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 75/92] drivers/base/memory.c: clean up relics in function parameters Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 76/92] mm, memory_hotplug: update a comment in unregister_memory() Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 77/92] mm/memory_hotplug: make unregister_memory_section() never fail Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 78/92] mm/memory_hotplug: make __remove_section() " Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 79/92] powerpc/mm: Fix section mismatch warning Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 80/92] mm/memory_hotplug: make __remove_pages() and arch_remove_memory() never fail Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 88/92] mm/hotplug: kill is_dev_zone() usage in __remove_pages() Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 89/92] drivers/base/node.c: simplify unregister_memory_block_under_nodes() Greg Kroah-Hartman
2020-01-28 14:08 ` [PATCH 4.19 91/92] mm/memory_hotplug: fix try_offline_node() Greg Kroah-Hartman
2020-01-28 14:09 ` [PATCH 4.19 92/92] mm/memory_hotplug: shrink zones when offlining memory Greg Kroah-Hartman
2020-01-28 23:03 ` [PATCH 4.19 00/92] 4.19.100-stable review shuah
2020-01-29  4:54 ` Naresh Kamboju
2020-01-29 11:31 ` Pavel Machek
2020-01-29 12:57   ` Greg Kroah-Hartman
     [not found] ` <20200128135809.344954797-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r@public.gmane.org>
2020-01-29 13:16   ` Jon Hunter
2020-01-29 13:16     ` Jon Hunter
2020-01-29 14:43 ` Guenter Roeck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.