* [PATCH 4.19 00/72] 4.19.57-stable review
@ 2019-07-02  8:01 Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 01/72] perf ui helpline: Use strlcpy() as a shorter form of strncpy() + explicit set nul Greg Kroah-Hartman
                   ` (78 more replies)
  0 siblings, 79 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuah, patches,
	ben.hutchings, lkft-triage, stable

This is the start of the stable review cycle for the 4.19.57 release.
There are 72 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu 04 Jul 2019 07:59:45 AM UTC.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.57-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 4.19.57-rc1

Xin Long <lucien.xin@gmail.com>
    tipc: pass tunnel dev as NULL to udp_tunnel(6)_xmit_skb

Jason Gunthorpe <jgg@ziepe.ca>
    RDMA: Directly cast the sockaddr union to sockaddr

Will Deacon <will.deacon@arm.com>
    futex: Update comments and docs about return values of arch futex code

Daniel Borkmann <daniel@iogearbox.net>
    bpf, arm64: use more scalable stadd over ldxr / stxr loop in xadd

Will Deacon <will.deacon@arm.com>
    arm64: futex: Avoid copying out uninitialised stack in failed cmpxchg()

Martin KaFai Lau <kafai@fb.com>
    bpf: udp: ipv6: Avoid running reuseport's bpf_prog from __udp6_lib_err

Martin KaFai Lau <kafai@fb.com>
    bpf: udp: Avoid calling reuseport's bpf_prog from udp_gro

Daniel Borkmann <daniel@iogearbox.net>
    bpf: fix unconnected udp hooks

Matt Mullins <mmullins@fb.com>
    bpf: fix nested bpf tracepoints with per-cpu data

Jonathan Lemon <jonathan.lemon@gmail.com>
    bpf: lpm_trie: check left child of last leftmost node for NULL

Martynas Pumputis <m@lambda.lt>
    bpf: simplify definition of BPF_FIB_LOOKUP related flags

Fei Li <lifei.shirley@bytedance.com>
    tun: wake up waitqueues after IFF_UP is set

Xin Long <lucien.xin@gmail.com>
    tipc: check msg->req data len in tipc_nl_compat_bearer_disable

Xin Long <lucien.xin@gmail.com>
    tipc: change to use register_pernet_device

YueHaibing <yuehaibing@huawei.com>
    team: Always enable vlan tx offload

Xin Long <lucien.xin@gmail.com>
    sctp: change to hold sk after auth shkey is created successfully

Roland Hii <roland.king.guan.hii@intel.com>
    net: stmmac: set IC bit when transmitting frames with HW timestamp

Roland Hii <roland.king.guan.hii@intel.com>
    net: stmmac: fixed new system time seconds value calculation

JingYi Hou <houjingyi647@gmail.com>
    net: remove duplicate fetch in sock_getsockopt

Eric Dumazet <edumazet@google.com>
    net/packet: fix memory leak in packet_set_ring()

Stephen Suryaputra <ssuryaextr@gmail.com>
    ipv4: Use return value of inet_iif() for __raw_v4_lookup in the while loop

YueHaibing <yuehaibing@huawei.com>
    bonding: Always enable vlan tx offload

Neil Horman <nhorman@tuxdriver.com>
    af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET

Wang Xin <xin.wang7@cn.bosch.com>
    eeprom: at24: fix unexpected timeout under high load

Paul Burton <paul.burton@mips.com>
    irqchip/mips-gic: Use the correct local interrupt map registers

Trond Myklebust <trond.myklebust@hammerspace.com>
    SUNRPC: Clean up initialisation of the struct rpc_rqst

Geert Uytterhoeven <geert@linux-m68k.org>
    cpu/speculation: Warn on unsupported mitigations= parameter

Trond Myklebust <trondmy@gmail.com>
    NFS/flexfiles: Use the correct TCP timeout for flexfiles I/O

Sean Christopherson <sean.j.christopherson@intel.com>
    KVM: x86/mmu: Allocate PAE root array when using SVM's 32-bit NPT

Reinette Chatre <reinette.chatre@intel.com>
    x86/resctrl: Prevent possible overrun during bitmap operations

Thomas Gleixner <tglx@linutronix.de>
    x86/microcode: Fix the microcode load on CPU hotplug for real

Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
    x86/speculation: Allow guests to use SSBD even if host does not

Jan Kara <jack@suse.cz>
    scsi: vmw_pscsi: Fix use-after-free in pvscsi_queue_lck()

zhangyi (F) <yi.zhang@huawei.com>
    dm log writes: make sure super sector log updates are written in order

Colin Ian King <colin.king@canonical.com>
    mm/page_idle.c: fix oops because end_pfn is larger than max_pfn

Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHuge

Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails

Dinh Nguyen <dinguyen@kernel.org>
    clk: socfpga: stratix10: fix divider entry for the emac clocks

Jann Horn <jannh@google.com>
    fs/binfmt_flat.c: make load_flat_shared_library() work

zhong jiang <zhongjiang@huawei.com>
    mm/mempolicy.c: fix an incorrect rebind node in mpol_rebind_nodemask

John Ogness <john.ogness@linutronix.de>
    fs/proc/array.c: allow reporting eip/esp for all coredumping threads

Jack Pham <jackp@codeaurora.org>
    usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup

Felipe Balbi <felipe.balbi@linux.intel.com>
    usb: dwc3: gadget: remove wait_end_transfer

Felipe Balbi <felipe.balbi@linux.intel.com>
    usb: dwc3: gadget: move requests to cancelled_list

Felipe Balbi <felipe.balbi@linux.intel.com>
    usb: dwc3: gadget: introduce cancelled_list

Felipe Balbi <felipe.balbi@linux.intel.com>
    usb: dwc3: gadget: extract dwc3_gadget_ep_skip_trbs()

Felipe Balbi <felipe.balbi@linux.intel.com>
    usb: dwc3: gadget: use num_trbs when skipping TRBs on ->dequeue()

Felipe Balbi <felipe.balbi@linux.intel.com>
    usb: dwc3: gadget: track number of TRBs per request

Felipe Balbi <felipe.balbi@linux.intel.com>
    usb: dwc3: gadget: combine unaligned and zero flags

John Stultz <john.stultz@linaro.org>
    Revert "usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup"

Bjørn Mork <bjorn@mork.no>
    qmi_wwan: Fix out-of-bounds read

Adeodato Simó <dato@net.com.org.es>
    net/9p: include trans_common.h to fix missing prototype warning.

Dominique Martinet <dominique.martinet@cea.fr>
    9p/trans_fd: put worker reqs on destroy

Dominique Martinet <dominique.martinet@cea.fr>
    9p/trans_fd: abort p9_read_work if req status changed

Dan Carpenter <dan.carpenter@oracle.com>
    9p: potential NULL dereference

Dominique Martinet <dominique.martinet@cea.fr>
    9p: p9dirent_read: check network-provided name length

Dominique Martinet <dominique.martinet@cea.fr>
    9p/rdma: remove useless check in cm_event_handler

Dominique Martinet <dominique.martinet@cea.fr>
    9p: acl: fix uninitialized iattr access

Tomas Bortoli <tomasbortoli@gmail.com>
    9p: Rename req to rreq in trans_fd

Dominique Martinet <dominique.martinet@cea.fr>
    9p/rdma: do not disconnect on down_interruptible EAGAIN

Tomas Bortoli <tomasbortoli@gmail.com>
    9p: Add refcount to p9_req_t

Tomas Bortoli <tomasbortoli@gmail.com>
    9p: rename p9_free_req() function

Dominique Martinet <dominique.martinet@cea.fr>
    9p: add a per-client fcall kmem_cache

Dominique Martinet <dominique.martinet@cea.fr>
    9p: embed fcall in req to round down buffer allocs

Matthew Wilcox <willy@infradead.org>
    9p: Use a slab for allocating requests

Dominique Martinet <dominique.martinet@cea.fr>
    9p/xen: fix check for xenbus_read error in front_probe

Mike Marciniszyn <mike.marciniszyn@intel.com>
    IB/hfi1: Close PSM sdma_progress sleep window

Sasha Levin <sashal@kernel.org>
    Revert "x86/uaccess, ftrace: Fix ftrace_likely_update() vs. SMAP"

Nathan Chancellor <natechancellor@gmail.com>
    arm64: Don't unconditionally add -Wno-psabi to KBUILD_CFLAGS

Arnaldo Carvalho de Melo <acme@redhat.com>
    perf header: Fix unchecked usage of strncpy()

Arnaldo Carvalho de Melo <acme@redhat.com>
    perf help: Remove needless use of strncpy()

Arnaldo Carvalho de Melo <acme@redhat.com>
    perf ui helpline: Use strlcpy() as a shorter form of strncpy() + explicit set nul


-------------

Diffstat:

 Documentation/robust-futexes.txt                   |   3 +-
 Makefile                                           |   4 +-
 arch/arm64/Makefile                                |   2 +-
 arch/arm64/include/asm/futex.h                     |   4 +-
 arch/arm64/include/asm/insn.h                      |   8 +
 arch/arm64/kernel/insn.c                           |  40 ++
 arch/arm64/net/bpf_jit.h                           |   4 +
 arch/arm64/net/bpf_jit_comp.c                      |  28 +-
 arch/mips/include/asm/mips-gic.h                   |  30 ++
 arch/x86/kernel/cpu/bugs.c                         |  11 +-
 arch/x86/kernel/cpu/intel_rdt_rdtgroup.c           |  35 +-
 arch/x86/kernel/cpu/microcode/core.c               |  15 +-
 arch/x86/kvm/mmu.c                                 |  11 +-
 drivers/clk/socfpga/clk-s10.c                      |   4 +-
 drivers/infiniband/core/addr.c                     |  10 +-
 drivers/infiniband/hw/hfi1/user_sdma.c             |  12 +-
 drivers/infiniband/hw/hfi1/user_sdma.h             |   1 -
 drivers/infiniband/hw/ocrdma/ocrdma_ah.c           |   5 +-
 drivers/infiniband/hw/ocrdma/ocrdma_hw.c           |   5 +-
 drivers/irqchip/irq-mips-gic.c                     |   4 +-
 drivers/md/dm-log-writes.c                         |  23 +-
 drivers/misc/eeprom/at24.c                         |  43 +-
 drivers/net/bonding/bond_main.c                    |   2 +-
 .../net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c  |   2 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |  22 +-
 drivers/net/team/team.c                            |   2 +-
 drivers/net/tun.c                                  |  19 +-
 drivers/net/usb/qmi_wwan.c                         |   2 +-
 drivers/scsi/vmw_pvscsi.c                          |   6 +-
 drivers/usb/dwc3/core.h                            |  15 +-
 drivers/usb/dwc3/gadget.c                          | 158 ++----
 drivers/usb/dwc3/gadget.h                          |  15 +
 fs/9p/acl.c                                        |   2 +-
 fs/binfmt_flat.c                                   |  23 +-
 fs/nfs/flexfilelayout/flexfilelayoutdev.c          |   2 +-
 fs/proc/array.c                                    |   2 +-
 include/asm-generic/futex.h                        |   8 +-
 include/linux/bpf-cgroup.h                         |   8 +
 include/linux/sunrpc/xprt.h                        |   1 -
 include/net/9p/9p.h                                |   4 +
 include/net/9p/client.h                            |  71 +--
 include/uapi/linux/bpf.h                           |   6 +-
 kernel/bpf/lpm_trie.c                              |   9 +-
 kernel/bpf/syscall.c                               |   8 +
 kernel/bpf/verifier.c                              |  12 +-
 kernel/cpu.c                                       |   3 +
 kernel/trace/bpf_trace.c                           | 100 +++-
 kernel/trace/trace_branch.c                        |   4 -
 mm/hugetlb.c                                       |  29 +-
 mm/memory-failure.c                                |   7 +-
 mm/mempolicy.c                                     |   2 +-
 mm/page_idle.c                                     |   4 +-
 net/9p/client.c                                    | 551 +++++++++++----------
 net/9p/mod.c                                       |   9 +-
 net/9p/protocol.c                                  |  12 +-
 net/9p/trans_common.c                              |   1 +
 net/9p/trans_fd.c                                  |  64 ++-
 net/9p/trans_rdma.c                                |  37 +-
 net/9p/trans_virtio.c                              |  44 +-
 net/9p/trans_xen.c                                 |  17 +-
 net/core/filter.c                                  |   2 +
 net/core/sock.c                                    |   3 -
 net/ipv4/raw.c                                     |   2 +-
 net/ipv4/udp.c                                     |  10 +-
 net/ipv6/udp.c                                     |   8 +-
 net/packet/af_packet.c                             |  23 +-
 net/packet/internal.h                              |   1 +
 net/sctp/endpointola.c                             |   8 +-
 net/sunrpc/clnt.c                                  |   1 -
 net/sunrpc/xprt.c                                  |  91 ++--
 net/tipc/core.c                                    |  12 +-
 net/tipc/netlink_compat.c                          |  18 +-
 net/tipc/udp_media.c                               |   8 +-
 tools/perf/builtin-help.c                          |   2 +-
 tools/perf/ui/tui/helpline.c                       |   2 +-
 tools/perf/util/header.c                           |   2 +-
 tools/testing/selftests/bpf/test_lpm_map.c         |  41 +-
 77 files changed, 1072 insertions(+), 747 deletions(-)




* [PATCH 4.19 01/72] perf ui helpline: Use strlcpy() as a shorter form of strncpy() + explicit set nul
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 02/72] perf help: Remove needless use of strncpy() Greg Kroah-Hartman
                   ` (77 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Adrian Hunter, Jiri Olsa,
	Namhyung Kim, Arnaldo Carvalho de Melo

From: Arnaldo Carvalho de Melo <acme@redhat.com>

commit 4d0f16d059ddb91424480d88473f7392f24aebdc upstream.

The strncpy() function may leave the destination string buffer
unterminated, so it is better to use strlcpy(), for which we have a
__weak fallback implementation on systems without it.

In this case we are actually setting the null byte at the right place,
but since we pass the buffer size as the limit to strncpy() and not
it minus one, gcc ends up warning us about that, see below. So, let's
just switch to the shorter form provided by strlcpy().

This fixes this warning on an Alpine Linux Edge system with gcc 8.2:

  ui/tui/helpline.c: In function 'tui_helpline__push':
  ui/tui/helpline.c:27:2: error: 'strncpy' specified bound 512 equals destination size [-Werror=stringop-truncation]
    strncpy(ui_helpline__current, msg, sz)[sz - 1] = '\0';
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors
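
As a rough userspace illustration of the difference described above,
the sketch below compares strncpy() called with the full buffer size
against a strlcpy()-style copy. The demo_* names are hypothetical
helpers written for this note, not perf or libc functions, and
demo_strlcpy() is only a minimal stand-in for the real strlcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for strlcpy(): copies at most size - 1 bytes
 * and always terminates dst when size > 0. Returns strlen(src) so the
 * caller can detect truncation.
 */
static size_t demo_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size != 0) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

/* strncpy() with the whole buffer size as the bound, as in the old code. */
static void demo_strncpy_full(char *dst, const char *src, size_t size)
{
	strncpy(dst, src, size);
}

/* Returns 1 if buf[0..size) contains a terminating nul byte. */
static int demo_is_terminated(const char *buf, size_t size)
{
	return memchr(buf, '\0', size) != NULL;
}
```

With a 4-byte buffer and the source "abcdefgh", demo_strncpy_full()
fills all four bytes and leaves no terminator, which is why the old
code had to set one by hand, while demo_strlcpy() stores "abc" plus
the nul and returns 8.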

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: e6e904687949 ("perf ui: Introduce struct ui_helpline")
Link: https://lkml.kernel.org/n/tip-d1wz0hjjsh19xbalw69qpytj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 tools/perf/ui/tui/helpline.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/tools/perf/ui/tui/helpline.c
+++ b/tools/perf/ui/tui/helpline.c
@@ -24,7 +24,7 @@ static void tui_helpline__push(const cha
 	SLsmg_set_color(0);
 	SLsmg_write_nstring((char *)msg, SLtt_Screen_Cols);
 	SLsmg_refresh();
-	strncpy(ui_helpline__current, msg, sz)[sz - 1] = '\0';
+	strlcpy(ui_helpline__current, msg, sz);
 }
 
 static int tui_helpline__show(const char *format, va_list ap)




* [PATCH 4.19 02/72] perf help: Remove needless use of strncpy()
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 01/72] perf ui helpline: Use strlcpy() as a shorter form of strncpy() + explicit set nul Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 03/72] perf header: Fix unchecked usage " Greg Kroah-Hartman
                   ` (76 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Adrian Hunter, Jiri Olsa,
	Namhyung Kim, Arnaldo Carvalho de Melo

From: Arnaldo Carvalho de Melo <acme@redhat.com>

commit b6313899f4ed2e76b8375cf8069556f5b94fbff0 upstream.

Since we make sure the destination buffer has at least strlen(orig) + 1
bytes, there is no need to do a strncpy(dest, orig, strlen(orig)); just
use strcpy(dest, orig).

This silences this gcc 8.2 warning on Alpine Linux:

  In function 'add_man_viewer',
      inlined from 'perf_help_config' at builtin-help.c:284:3:
  builtin-help.c:192:2: error: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]
    strncpy((*p)->name, name, len);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  builtin-help.c: In function 'perf_help_config':
  builtin-help.c:187:15: note: length computed here
    size_t len = strlen(name);
                 ^~~~~~~~~~~~
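
A minimal userspace sketch of the allocation pattern discussed above:
the buffer is sized as strlen(name) + 1 up front, so a plain strcpy()
cannot overflow it and copies the terminator as well. struct
demo_viewer and demo_viewer_new() are hypothetical names, calloc()
stands in for perf's zalloc(), and the NULL check is an addition that
the original function does not have.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node with a trailing, variable-length name. */
struct demo_viewer {
	struct demo_viewer *next;
	char name[];
};

/* Allocate a zeroed node with room for name plus its terminating nul. */
static struct demo_viewer *demo_viewer_new(const char *name)
{
	size_t len = strlen(name);
	struct demo_viewer *v = calloc(1, sizeof(*v) + len + 1);

	if (!v)
		return NULL;
	/* Safe: the destination holds exactly len + 1 bytes. */
	strcpy(v->name, name);
	return v;
}
```

The old strncpy(dest, orig, strlen(orig)) call only produced a valid
string because the allocation was zeroed; strcpy() does not rely on
that.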

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 078006012401 ("perf_counter tools: add in basic glue from Git")
Link: https://lkml.kernel.org/n/tip-2f69l7drca427ob4km8i7kvo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 tools/perf/builtin-help.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/tools/perf/builtin-help.c
+++ b/tools/perf/builtin-help.c
@@ -189,7 +189,7 @@ static void add_man_viewer(const char *n
 	while (*p)
 		p = &((*p)->next);
 	*p = zalloc(sizeof(**p) + len + 1);
-	strncpy((*p)->name, name, len);
+	strcpy((*p)->name, name);
 }
 
 static int supported_man_viewer(const char *name, size_t len)




* [PATCH 4.19 03/72] perf header: Fix unchecked usage of strncpy()
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 01/72] perf ui helpline: Use strlcpy() as a shorter form of strncpy() + explicit set nul Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 02/72] perf help: Remove needless use of strncpy() Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 04/72] arm64: Dont unconditionally add -Wno-psabi to KBUILD_CFLAGS Greg Kroah-Hartman
                   ` (75 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Adrian Hunter, Jiri Olsa,
	Namhyung Kim, Arnaldo Carvalho de Melo

From: Arnaldo Carvalho de Melo <acme@redhat.com>

commit 5192bde7d98c99f2cd80225649e3c2e7493722f7 upstream.

The strncpy() function may leave the destination string buffer
unterminated, so it is better to use strlcpy(), for which we have a
__weak fallback implementation on systems without it.

This fixes this warning on an Alpine Linux Edge system with gcc 8.2:

  util/header.c: In function 'perf_event__synthesize_event_update_name':
  util/header.c:3625:2: error: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]
    strncpy(ev->data, evsel->name, len);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  util/header.c:3618:15: note: length computed here
    size_t len = strlen(evsel->name);
                 ^~~~~~~~~~~~~~~~~~~

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: a6e5281780d1 ("perf tools: Add event_update event unit type")
Link: https://lkml.kernel.org/n/tip-wycz66iy8dl2z3yifgqf894p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 tools/perf/util/header.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3562,7 +3562,7 @@ perf_event__synthesize_event_update_name
 	if (ev == NULL)
 		return -ENOMEM;
 
-	strncpy(ev->data, evsel->name, len);
+	strlcpy(ev->data, evsel->name, len + 1);
 	err = process(tool, (union perf_event*) ev, NULL, NULL);
 	free(ev);
 	return err;




* [PATCH 4.19 04/72] arm64: Dont unconditionally add -Wno-psabi to KBUILD_CFLAGS
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 03/72] perf header: Fix unchecked usage " Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 05/72] Revert "x86/uaccess, ftrace: Fix ftrace_likely_update() vs. SMAP" Greg Kroah-Hartman
                   ` (74 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Qian Cai, Dave Martin,
	Nick Desaulniers, Nathan Chancellor, Will Deacon

From: Nathan Chancellor <natechancellor@gmail.com>

commit fa63da2ab046b885a7f70291aafc4e8ce015429b upstream.

This is a GCC only option, which warns about ABI changes within GCC, so
unconditionally adding it breaks Clang with tons of:

warning: unknown warning option '-Wno-psabi' [-Wunknown-warning-option]

and link time failures:

ld.lld: error: undefined symbol: __efistub___stack_chk_guard
>>> referenced by arm-stub.c:73
(/home/nathan/cbl/linux/drivers/firmware/efi/libstub/arm-stub.c:73)
>>>               arm-stub.stub.o:(__efistub_install_memreserve_table)
in archive ./drivers/firmware/efi/libstub/lib.a

These failures come from the lack of -fno-stack-protector, which is
added via cc-option in drivers/firmware/efi/libstub/Makefile. When an
unknown flag is added to KBUILD_CFLAGS, clang will noisily warn that it
is ignoring the option like above, unlike gcc, which will just error out.

$ echo "int main() { return 0; }" > tmp.c

$ clang -Wno-psabi tmp.c; echo $?
warning: unknown warning option '-Wno-psabi' [-Wunknown-warning-option]
1 warning generated.
0

$ gcc -Wsometimes-uninitialized tmp.c; echo $?
gcc: error: unrecognized command line option
‘-Wsometimes-uninitialized’; did you mean ‘-Wmaybe-uninitialized’?
1

For cc-option to work properly with clang and behave like gcc, -Werror
is needed, which was done in commit c3f0d0bc5b01 ("kbuild, LLVMLinux:
Add -Werror to cc-option to support clang").

$ clang -Werror -Wno-psabi tmp.c; echo $?
error: unknown warning option '-Wno-psabi'
[-Werror,-Wunknown-warning-option]
1

As a consequence of this, when an unknown flag is unconditionally added
to KBUILD_CFLAGS, it will cause cc-option to always fail and those flags
will never get added:

$ clang -Werror -Wno-psabi -fno-stack-protector tmp.c; echo $?
error: unknown warning option '-Wno-psabi'
[-Werror,-Wunknown-warning-option]
1

This can be seen when compiling the whole kernel as some warnings that
are normally disabled (see below) show up. The full list of flags
missing from drivers/firmware/efi/libstub is the following (gathered
from diffing .arm64-stub.o.cmd):

-fno-delete-null-pointer-checks
-Wno-address-of-packed-member
-Wframe-larger-than=2048
-Wno-unused-const-variable
-fno-strict-overflow
-fno-merge-all-constants
-fno-stack-check
-Werror=date-time
-Werror=incompatible-pointer-types
-ffreestanding
-fno-stack-protector

Use cc-disable-warning so that it gets disabled for GCC and does nothing
for Clang.

Fixes: ebcc5928c5d9 ("arm64: Silence gcc warnings about arch ABI drift")
Link: https://github.com/ClangBuiltLinux/linux/issues/511
Reported-by: Qian Cai <cai@lca.pw>
Acked-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arm64/Makefile |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -51,7 +51,7 @@ endif
 
 KBUILD_CFLAGS	+= -mgeneral-regs-only $(lseinstr) $(brokengasinst)
 KBUILD_CFLAGS	+= -fno-asynchronous-unwind-tables
-KBUILD_CFLAGS	+= -Wno-psabi
+KBUILD_CFLAGS	+= $(call cc-disable-warning, psabi)
 KBUILD_AFLAGS	+= $(lseinstr) $(brokengasinst)
 
 KBUILD_CFLAGS	+= $(call cc-option,-mabi=lp64)




* [PATCH 4.19 05/72] Revert "x86/uaccess, ftrace: Fix ftrace_likely_update() vs. SMAP"
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (3 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 04/72] arm64: Dont unconditionally add -Wno-psabi to KBUILD_CFLAGS Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 06/72] IB/hfi1: Close PSM sdma_progress sleep window Greg Kroah-Hartman
                   ` (73 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Sasha Levin

This reverts commit 1a3188d737ceb922166d8fe78a5fc4f89907e31b, which was
upstream commit 4a6c91fbdef846ec7250b82f2eeeb87ac5f18cf9.

On Tue, Jun 25, 2019 at 09:39:45AM +0200, Sebastian Andrzej Siewior wrote:
>Please backport commit e74deb11931ff682b59d5b9d387f7115f689698e to
>stable _or_ revert the backport of commit 4a6c91fbdef84 ("x86/uaccess,
>ftrace: Fix ftrace_likely_update() vs. SMAP"). It uses
>user_access_{save|restore}() which has been introduced in the following
>commit.

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/trace/trace_branch.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/kernel/trace/trace_branch.c b/kernel/trace/trace_branch.c
index 3ea65cdff30d..4ad967453b6f 100644
--- a/kernel/trace/trace_branch.c
+++ b/kernel/trace/trace_branch.c
@@ -205,8 +205,6 @@ void trace_likely_condition(struct ftrace_likely_data *f, int val, int expect)
 void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 			  int expect, int is_constant)
 {
-	unsigned long flags = user_access_save();
-
 	/* A constant is always correct */
 	if (is_constant) {
 		f->constant++;
@@ -225,8 +223,6 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 		f->data.correct++;
 	else
 		f->data.incorrect++;
-
-	user_access_restore(flags);
 }
 EXPORT_SYMBOL(ftrace_likely_update);
 
-- 
2.20.1





* [PATCH 4.19 06/72] IB/hfi1: Close PSM sdma_progress sleep window
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 05/72] Revert "x86/uaccess, ftrace: Fix ftrace_likely_update() vs. SMAP" Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 07/72] 9p/xen: fix check for xenbus_read error in front_probe Greg Kroah-Hartman
                   ` (72 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Gary Leshner, Mike Marciniszyn,
	Dennis Dalessandro, Jason Gunthorpe, Sasha Levin

commit da9de5f8527f4b9efc82f967d29a583318c034c7 upstream.

sdma_progress() is called outside the wait lock.

In this case, there is a race condition where sdma_progress() can return
false and the sdma_engine can idle.  If that happens, there will be no
more sdma interrupts to cause the wakeup and the user_sdma xmit will hang.

Fix by moving the lock to enclose the sdma_progress() call.
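
The locking rule behind this fix can be modelled in userspace roughly
as follows. This is only a sketch under stated assumptions: a pthread
mutex stands in for the iowait seqlock, a sequence counter stands in
for what sdma_progress() checks, and struct demo_engine, demo_defer()
and demo_complete() are hypothetical names, not hfi1 code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical engine state; all fields are protected by wait_lock. */
struct demo_engine {
	pthread_mutex_t wait_lock;
	unsigned int seq;	/* bumped on every completion */
	int nr_waiters;		/* submitters parked on the wait list */
};

/*
 * Submitter side: check for progress and queue ourselves under the
 * same lock, so a completion cannot slip in between the two steps.
 * Returns -EAGAIN if the engine advanced since seen_seq, otherwise
 * queues the caller and returns -EBUSY.
 */
static int demo_defer(struct demo_engine *e, unsigned int seen_seq)
{
	int ret;

	pthread_mutex_lock(&e->wait_lock);
	if (e->seq != seen_seq) {
		ret = -EAGAIN;
	} else {
		e->nr_waiters++;
		ret = -EBUSY;
	}
	pthread_mutex_unlock(&e->wait_lock);
	return ret;
}

/* Completion side: record progress and wake everyone who is queued. */
static int demo_complete(struct demo_engine *e)
{
	int woken;

	pthread_mutex_lock(&e->wait_lock);
	e->seq++;
	woken = e->nr_waiters;
	e->nr_waiters = 0;
	pthread_mutex_unlock(&e->wait_lock);
	return woken;
}
```

If the progress check ran before taking the lock, demo_complete()
could run between the check and the enqueue, find no waiters, and
leave the submitter queued with nothing left to wake it.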

Also, delete busycount. The need for this was removed by:
commit bcad29137a97 ("IB/hfi1: Serve the most starved iowait entry first")

Ported to linux-4.19.y.

Cc: <stable@vger.kernel.org>
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Reviewed-by: Gary Leshner <Gary.S.Leshner@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/infiniband/hw/hfi1/user_sdma.c | 12 ++++--------
 drivers/infiniband/hw/hfi1/user_sdma.h |  1 -
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index 51831bfbf90f..cbff746d9e9d 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -132,25 +132,22 @@ static int defer_packet_queue(
 	struct hfi1_user_sdma_pkt_q *pq =
 		container_of(wait, struct hfi1_user_sdma_pkt_q, busy);
 	struct hfi1_ibdev *dev = &pq->dd->verbs_dev;
-	struct user_sdma_txreq *tx =
-		container_of(txreq, struct user_sdma_txreq, txreq);
 
-	if (sdma_progress(sde, seq, txreq)) {
-		if (tx->busycount++ < MAX_DEFER_RETRY_COUNT)
-			goto eagain;
-	}
+	write_seqlock(&dev->iowait_lock);
+	if (sdma_progress(sde, seq, txreq))
+		goto eagain;
 	/*
 	 * We are assuming that if the list is enqueued somewhere, it
 	 * is to the dmawait list since that is the only place where
 	 * it is supposed to be enqueued.
 	 */
 	xchg(&pq->state, SDMA_PKT_Q_DEFERRED);
-	write_seqlock(&dev->iowait_lock);
 	if (list_empty(&pq->busy.list))
 		iowait_queue(pkts_sent, &pq->busy, &sde->dmawait);
 	write_sequnlock(&dev->iowait_lock);
 	return -EBUSY;
 eagain:
+	write_sequnlock(&dev->iowait_lock);
 	return -EAGAIN;
 }
 
@@ -803,7 +800,6 @@ static int user_sdma_send_pkts(struct user_sdma_request *req, unsigned maxpkts)
 
 		tx->flags = 0;
 		tx->req = req;
-		tx->busycount = 0;
 		INIT_LIST_HEAD(&tx->list);
 
 		/*
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.h b/drivers/infiniband/hw/hfi1/user_sdma.h
index 91c343f91776..2c056702d975 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.h
+++ b/drivers/infiniband/hw/hfi1/user_sdma.h
@@ -245,7 +245,6 @@ struct user_sdma_txreq {
 	struct list_head list;
 	struct user_sdma_request *req;
 	u16 flags;
-	unsigned int busycount;
 	u64 seqnum;
 };
 
-- 
2.20.1





* [PATCH 4.19 07/72] 9p/xen: fix check for xenbus_read error in front_probe
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (5 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 06/72] IB/hfi1: Close PSM sdma_progress sleep window Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 08/72] 9p: Use a slab for allocating requests Greg Kroah-Hartman
                   ` (71 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dominique Martinet,
	Stefano Stabellini, Eric Van Hensbergen, Latchesar Ionkov,
	Sasha Levin

[ Upstream commit 2f9ad0ac947ccbe3ffe7c6229c9330f2a7755f64 ]

If the xen bus exists but does not expose the proper interface,
xenbus_read() can report a non-zero length and still return an error
pointer, so strcmp() faults trying to read from an invalid address such
as fffffffffffffffe.

There is then no need to check length when there is no error, as the
xenbus driver guarantees that the string is nul-terminated.
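
For readers unfamiliar with the error-pointer convention, here is a
userspace sketch of how an errno value can be folded into a pointer
and recognised again. The demo_* helpers are hypothetical
re-implementations written for this note; they mirror the idea behind
the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(), with the 4095 limit
matching the kernel's MAX_ERRNO. demo_check_versions() is an
illustrative stand-in for the check in the probe function.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the negative errno value from an error pointer. */
static inline long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top DEMO_MAX_ERRNO addresses. */
static inline int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Test the pointer before dereferencing it, then compare the string. */
static long demo_check_versions(const char *versions)
{
	if (demo_is_err(versions))
		return demo_ptr_err(versions);
	return strcmp(versions, "1") ? -EINVAL : 0;
}
```

On a 64-bit system demo_err_ptr(-ENOENT) has the bit pattern
fffffffffffffffe, since ENOENT is 2 on Linux, which would be
consistent with the address quoted above; a length check alone cannot
tell such a value apart from a valid string.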

Link: http://lkml.kernel.org/r/1534236007-10170-1-git-send-email-asmadeus@codewreck.org
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/9p/trans_xen.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
index c2d54ac76bfd..843cb823d9b9 100644
--- a/net/9p/trans_xen.c
+++ b/net/9p/trans_xen.c
@@ -391,8 +391,8 @@ static int xen_9pfs_front_probe(struct xenbus_device *dev,
 	unsigned int max_rings, max_ring_order, len = 0;
 
 	versions = xenbus_read(XBT_NIL, dev->otherend, "versions", &len);
-	if (!len)
-		return -EINVAL;
+	if (IS_ERR(versions))
+		return PTR_ERR(versions);
 	if (strcmp(versions, "1")) {
 		kfree(versions);
 		return -EINVAL;
-- 
2.20.1





* [PATCH 4.19 08/72] 9p: Use a slab for allocating requests
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (6 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 07/72] 9p/xen: fix check for xenbus_read error in front_probe Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 09/72] 9p: embed fcall in req to round down buffer allocs Greg Kroah-Hartman
                   ` (70 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Matthew Wilcox, Eric Van Hensbergen,
	Ron Minnich, Latchesar Ionkov, Dominique Martinet, Sasha Levin

[ Upstream commit 996d5b4db4b191f2676cf8775565cab8a5e2753b ]

Replace the custom batch allocation with a slab.  Use an IDR to store
pointers to the active requests instead of an array.  We don't try to
handle P9_NOTAG specially; the IDR will happily shrink all the way back
once the TVERSION call has completed.
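
The tag ranges used by the new allocator can be sketched in userspace as
below. This is not the kernel IDR: struct tag_table and the tag_*() helpers
are hypothetical, the table is a fixed array, NOTAG_SKETCH is a small
stand-in for P9_NOTAG, and failure is reported as -1 rather than a real
errno. It only illustrates how a version request is given the reserved tag
while every other request takes the lowest free tag below it.

```c
#include <assert.h>
#include <stddef.h>

#define NOTAG_SKETCH 8			/* small stand-in for P9_NOTAG */
#define TABLE_SIZE (NOTAG_SKETCH + 1)

struct tag_table {
	void *slot[TABLE_SIZE];		/* NULL means the tag is free */
};

/* Claim the lowest free id in [start, end), or return -1 if full. */
static int tag_alloc_range(struct tag_table *t, void *req, int start, int end)
{
	for (int id = start; id < end && id < TABLE_SIZE; id++) {
		if (!t->slot[id]) {
			t->slot[id] = req;
			return id;
		}
	}
	return -1;
}

/* Version requests get the reserved tag; all others stay below it. */
static int tag_alloc(struct tag_table *t, void *req, int is_version)
{
	if (is_version)
		return tag_alloc_range(t, req, NOTAG_SKETCH, NOTAG_SKETCH + 1);
	return tag_alloc_range(t, req, 0, NOTAG_SKETCH);
}

/* Return the request stored under id, or NULL if there is none. */
static void *tag_lookup(const struct tag_table *t, int id)
{
	if (id < 0 || id >= TABLE_SIZE)
		return NULL;
	return t->slot[id];
}

/* Release a tag so that a later allocation can reuse it. */
static void tag_free(struct tag_table *t, int id)
{
	if (id >= 0 && id < TABLE_SIZE)
		t->slot[id] = NULL;
}
```

In this sketch a freed tag is handed out again by the next allocation, and a
lookup of an unused tag returns NULL, so callers have to check the result.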

Link: http://lkml.kernel.org/r/20180711210225.19730-6-willy@infradead.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/9p/client.h |  51 ++-------
 net/9p/client.c         | 238 ++++++++++++++--------------------------
 net/9p/mod.c            |   9 +-
 3 files changed, 102 insertions(+), 196 deletions(-)

diff --git a/include/net/9p/client.h b/include/net/9p/client.h
index 0fa0fbab33b0..a4dc42c53d18 100644
--- a/include/net/9p/client.h
+++ b/include/net/9p/client.h
@@ -64,22 +64,15 @@ enum p9_trans_status {
 
 /**
  * enum p9_req_status_t - status of a request
- * @REQ_STATUS_IDLE: request slot unused
  * @REQ_STATUS_ALLOC: request has been allocated but not sent
  * @REQ_STATUS_UNSENT: request waiting to be sent
  * @REQ_STATUS_SENT: request sent to server
  * @REQ_STATUS_RCVD: response received from server
  * @REQ_STATUS_FLSHD: request has been flushed
  * @REQ_STATUS_ERROR: request encountered an error on the client side
- *
- * The @REQ_STATUS_IDLE state is used to mark a request slot as unused
- * but use is actually tracked by the idpool structure which handles tag
- * id allocation.
- *
  */
 
 enum p9_req_status_t {
-	REQ_STATUS_IDLE,
 	REQ_STATUS_ALLOC,
 	REQ_STATUS_UNSENT,
 	REQ_STATUS_SENT,
@@ -92,24 +85,12 @@ enum p9_req_status_t {
  * struct p9_req_t - request slots
  * @status: status of this request slot
  * @t_err: transport error
- * @flush_tag: tag of request being flushed (for flush requests)
  * @wq: wait_queue for the client to block on for this request
  * @tc: the request fcall structure
  * @rc: the response fcall structure
  * @aux: transport specific data (provided for trans_fd migration)
  * @req_list: link for higher level objects to chain requests
- *
- * Transport use an array to track outstanding requests
- * instead of a list.  While this may incurr overhead during initial
- * allocation or expansion, it makes request lookup much easier as the
- * tag id is a index into an array.  (We use tag+1 so that we can accommodate
- * the -1 tag for the T_VERSION request).
- * This also has the nice effect of only having to allocate wait_queues
- * once, instead of constantly allocating and freeing them.  Its possible
- * other resources could benefit from this scheme as well.
- *
  */
-
 struct p9_req_t {
 	int status;
 	int t_err;
@@ -117,40 +98,26 @@ struct p9_req_t {
 	struct p9_fcall *tc;
 	struct p9_fcall *rc;
 	void *aux;
-
 	struct list_head req_list;
 };
 
 /**
  * struct p9_client - per client instance state
- * @lock: protect @fidlist
+ * @lock: protect @fids and @reqs
  * @msize: maximum data size negotiated by protocol
- * @dotu: extension flags negotiated by protocol
  * @proto_version: 9P protocol version to use
  * @trans_mod: module API instantiated with this client
+ * @status: connection state
  * @trans: tranport instance state and API
  * @fids: All active FID handles
- * @tagpool - transaction id accounting for session
- * @reqs - 2D array of requests
- * @max_tag - current maximum tag id allocated
- * @name - node name used as client id
+ * @reqs: All active requests.
+ * @name: node name used as client id
  *
  * The client structure is used to keep track of various per-client
  * state that has been instantiated.
- * In order to minimize per-transaction overhead we use a
- * simple array to lookup requests instead of a hash table
- * or linked list.  In order to support larger number of
- * transactions, we make this a 2D array, allocating new rows
- * when we need to grow the total number of the transactions.
- *
- * Each row is 256 requests and we'll support up to 256 rows for
- * a total of 64k concurrent requests per session.
- *
- * Bugs: duplicated data and potentially unnecessary elements.
  */
-
 struct p9_client {
-	spinlock_t lock; /* protect client structure */
+	spinlock_t lock;
 	unsigned int msize;
 	unsigned char proto_version;
 	struct p9_trans_module *trans_mod;
@@ -170,10 +137,7 @@ struct p9_client {
 	} trans_opts;
 
 	struct idr fids;
-
-	struct p9_idpool *tagpool;
-	struct p9_req_t *reqs[P9_ROW_MAXTAG];
-	int max_tag;
+	struct idr reqs;
 
 	char name[__NEW_UTS_LEN + 1];
 };
@@ -279,4 +243,7 @@ struct p9_fid *p9_client_xattrwalk(struct p9_fid *, const char *, u64 *);
 int p9_client_xattrcreate(struct p9_fid *, const char *, u64, int);
 int p9_client_readlink(struct p9_fid *fid, char **target);
 
+int p9_client_init(void);
+void p9_client_exit(void);
+
 #endif /* NET_9P_CLIENT_H */
diff --git a/net/9p/client.c b/net/9p/client.c
index 23ec6187dc07..d8949c59d46e 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -248,132 +248,102 @@ static struct p9_fcall *p9_fcall_alloc(int alloc_msize)
 	return fc;
 }
 
+static struct kmem_cache *p9_req_cache;
+
 /**
- * p9_tag_alloc - lookup/allocate a request by tag
- * @c: client session to lookup tag within
- * @tag: numeric id for transaction
- *
- * this is a simple array lookup, but will grow the
- * request_slots as necessary to accommodate transaction
- * ids which did not previously have a slot.
- *
- * this code relies on the client spinlock to manage locks, its
- * possible we should switch to something else, but I'd rather
- * stick with something low-overhead for the common case.
+ * p9_req_alloc - Allocate a new request.
+ * @c: Client session.
+ * @type: Transaction type.
+ * @max_size: Maximum packet size for this request.
  *
+ * Context: Process context.
+ * Return: Pointer to new request.
  */
-
 static struct p9_req_t *
-p9_tag_alloc(struct p9_client *c, u16 tag, unsigned int max_size)
+p9_tag_alloc(struct p9_client *c, int8_t type, unsigned int max_size)
 {
-	unsigned long flags;
-	int row, col;
-	struct p9_req_t *req;
+	struct p9_req_t *req = kmem_cache_alloc(p9_req_cache, GFP_NOFS);
 	int alloc_msize = min(c->msize, max_size);
+	int tag;
 
-	/* This looks up the original request by tag so we know which
-	 * buffer to read the data into */
-	tag++;
-
-	if (tag >= c->max_tag) {
-		spin_lock_irqsave(&c->lock, flags);
-		/* check again since original check was outside of lock */
-		while (tag >= c->max_tag) {
-			row = (tag / P9_ROW_MAXTAG);
-			c->reqs[row] = kcalloc(P9_ROW_MAXTAG,
-					sizeof(struct p9_req_t), GFP_ATOMIC);
-
-			if (!c->reqs[row]) {
-				pr_err("Couldn't grow tag array\n");
-				spin_unlock_irqrestore(&c->lock, flags);
-				return ERR_PTR(-ENOMEM);
-			}
-			for (col = 0; col < P9_ROW_MAXTAG; col++) {
-				req = &c->reqs[row][col];
-				req->status = REQ_STATUS_IDLE;
-				init_waitqueue_head(&req->wq);
-			}
-			c->max_tag += P9_ROW_MAXTAG;
-		}
-		spin_unlock_irqrestore(&c->lock, flags);
-	}
-	row = tag / P9_ROW_MAXTAG;
-	col = tag % P9_ROW_MAXTAG;
+	if (!req)
+		return NULL;
 
-	req = &c->reqs[row][col];
-	if (!req->tc)
-		req->tc = p9_fcall_alloc(alloc_msize);
-	if (!req->rc)
-		req->rc = p9_fcall_alloc(alloc_msize);
+	req->tc = p9_fcall_alloc(alloc_msize);
+	req->rc = p9_fcall_alloc(alloc_msize);
 	if (!req->tc || !req->rc)
-		goto grow_failed;
+		goto free;
 
 	p9pdu_reset(req->tc);
 	p9pdu_reset(req->rc);
-
-	req->tc->tag = tag-1;
 	req->status = REQ_STATUS_ALLOC;
+	init_waitqueue_head(&req->wq);
+	INIT_LIST_HEAD(&req->req_list);
+
+	idr_preload(GFP_NOFS);
+	spin_lock_irq(&c->lock);
+	if (type == P9_TVERSION)
+		tag = idr_alloc(&c->reqs, req, P9_NOTAG, P9_NOTAG + 1,
+				GFP_NOWAIT);
+	else
+		tag = idr_alloc(&c->reqs, req, 0, P9_NOTAG, GFP_NOWAIT);
+	req->tc->tag = tag;
+	spin_unlock_irq(&c->lock);
+	idr_preload_end();
+	if (tag < 0)
+		goto free;
 
 	return req;
 
-grow_failed:
-	pr_err("Couldn't grow tag array\n");
+free:
 	kfree(req->tc);
 	kfree(req->rc);
-	req->tc = req->rc = NULL;
+	kmem_cache_free(p9_req_cache, req);
 	return ERR_PTR(-ENOMEM);
 }
 
 /**
- * p9_tag_lookup - lookup a request by tag
- * @c: client session to lookup tag within
- * @tag: numeric id for transaction
+ * p9_tag_lookup - Look up a request by tag.
+ * @c: Client session.
+ * @tag: Transaction ID.
  *
+ * Context: Any context.
+ * Return: A request, or %NULL if there is no request with that tag.
  */
-
 struct p9_req_t *p9_tag_lookup(struct p9_client *c, u16 tag)
 {
-	int row, col;
-
-	/* This looks up the original request by tag so we know which
-	 * buffer to read the data into */
-	tag++;
-
-	if (tag >= c->max_tag)
-		return NULL;
+	struct p9_req_t *req;
 
-	row = tag / P9_ROW_MAXTAG;
-	col = tag % P9_ROW_MAXTAG;
+	rcu_read_lock();
+	req = idr_find(&c->reqs, tag);
+	/* There's no refcount on the req; a malicious server could cause
+	 * us to dereference a NULL pointer
+	 */
+	rcu_read_unlock();
 
-	return &c->reqs[row][col];
+	return req;
 }
 EXPORT_SYMBOL(p9_tag_lookup);
 
 /**
- * p9_tag_init - setup tags structure and contents
- * @c:  v9fs client struct
- *
- * This initializes the tags structure for each client instance.
+ * p9_free_req - Free a request.
+ * @c: Client session.
+ * @r: Request to free.
  *
+ * Context: Any context.
  */
-
-static int p9_tag_init(struct p9_client *c)
+static void p9_free_req(struct p9_client *c, struct p9_req_t *r)
 {
-	int err = 0;
+	unsigned long flags;
+	u16 tag = r->tc->tag;
 
-	c->tagpool = p9_idpool_create();
-	if (IS_ERR(c->tagpool)) {
-		err = PTR_ERR(c->tagpool);
-		goto error;
-	}
-	err = p9_idpool_get(c->tagpool); /* reserve tag 0 */
-	if (err < 0) {
-		p9_idpool_destroy(c->tagpool);
-		goto error;
-	}
-	c->max_tag = 0;
-error:
-	return err;
+	p9_debug(P9_DEBUG_MUX, "clnt %p req %p tag: %d\n", c, r, tag);
+	spin_lock_irqsave(&c->lock, flags);
+	idr_remove(&c->reqs, tag);
+	spin_unlock_irqrestore(&c->lock, flags);
+	kfree(r->tc);
+	kfree(r->rc);
+	kmem_cache_free(p9_req_cache, r);
 }
 
 /**
@@ -385,52 +355,15 @@ static int p9_tag_init(struct p9_client *c)
  */
 static void p9_tag_cleanup(struct p9_client *c)
 {
-	int row, col;
-
-	/* check to insure all requests are idle */
-	for (row = 0; row < (c->max_tag/P9_ROW_MAXTAG); row++) {
-		for (col = 0; col < P9_ROW_MAXTAG; col++) {
-			if (c->reqs[row][col].status != REQ_STATUS_IDLE) {
-				p9_debug(P9_DEBUG_MUX,
-					 "Attempting to cleanup non-free tag %d,%d\n",
-					 row, col);
-				/* TODO: delay execution of cleanup */
-				return;
-			}
-		}
-	}
-
-	if (c->tagpool) {
-		p9_idpool_put(0, c->tagpool); /* free reserved tag 0 */
-		p9_idpool_destroy(c->tagpool);
-	}
+	struct p9_req_t *req;
+	int id;
 
-	/* free requests associated with tags */
-	for (row = 0; row < (c->max_tag/P9_ROW_MAXTAG); row++) {
-		for (col = 0; col < P9_ROW_MAXTAG; col++) {
-			kfree(c->reqs[row][col].tc);
-			kfree(c->reqs[row][col].rc);
-		}
-		kfree(c->reqs[row]);
+	rcu_read_lock();
+	idr_for_each_entry(&c->reqs, req, id) {
+		pr_info("Tag %d still in use\n", id);
+		p9_free_req(c, req);
 	}
-	c->max_tag = 0;
-}
-
-/**
- * p9_free_req - free a request and clean-up as necessary
- * c: client state
- * r: request to release
- *
- */
-
-static void p9_free_req(struct p9_client *c, struct p9_req_t *r)
-{
-	int tag = r->tc->tag;
-	p9_debug(P9_DEBUG_MUX, "clnt %p req %p tag: %d\n", c, r, tag);
-
-	r->status = REQ_STATUS_IDLE;
-	if (tag != P9_NOTAG && p9_idpool_check(tag, c->tagpool))
-		p9_idpool_put(tag, c->tagpool);
+	rcu_read_unlock();
 }
 
 /**
@@ -704,7 +637,7 @@ static struct p9_req_t *p9_client_prepare_req(struct p9_client *c,
 					      int8_t type, int req_size,
 					      const char *fmt, va_list ap)
 {
-	int tag, err;
+	int err;
 	struct p9_req_t *req;
 
 	p9_debug(P9_DEBUG_MUX, "client %p op %d\n", c, type);
@@ -717,24 +650,17 @@ static struct p9_req_t *p9_client_prepare_req(struct p9_client *c,
 	if ((c->status == BeginDisconnect) && (type != P9_TCLUNK))
 		return ERR_PTR(-EIO);
 
-	tag = P9_NOTAG;
-	if (type != P9_TVERSION) {
-		tag = p9_idpool_get(c->tagpool);
-		if (tag < 0)
-			return ERR_PTR(-ENOMEM);
-	}
-
-	req = p9_tag_alloc(c, tag, req_size);
+	req = p9_tag_alloc(c, type, req_size);
 	if (IS_ERR(req))
 		return req;
 
 	/* marshall the data */
-	p9pdu_prepare(req->tc, tag, type);
+	p9pdu_prepare(req->tc, req->tc->tag, type);
 	err = p9pdu_vwritef(req->tc, c->proto_version, fmt, ap);
 	if (err)
 		goto reterr;
 	p9pdu_finalize(c, req->tc);
-	trace_9p_client_req(c, type, tag);
+	trace_9p_client_req(c, type, req->tc->tag);
 	return req;
 reterr:
 	p9_free_req(c, req);
@@ -1040,14 +966,11 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
 
 	spin_lock_init(&clnt->lock);
 	idr_init(&clnt->fids);
-
-	err = p9_tag_init(clnt);
-	if (err < 0)
-		goto free_client;
+	idr_init(&clnt->reqs);
 
 	err = parse_opts(options, clnt);
 	if (err < 0)
-		goto destroy_tagpool;
+		goto free_client;
 
 	if (!clnt->trans_mod)
 		clnt->trans_mod = v9fs_get_default_trans();
@@ -1056,7 +979,7 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
 		err = -EPROTONOSUPPORT;
 		p9_debug(P9_DEBUG_ERROR,
 			 "No transport defined or default transport\n");
-		goto destroy_tagpool;
+		goto free_client;
 	}
 
 	p9_debug(P9_DEBUG_MUX, "clnt %p trans %p msize %d protocol %d\n",
@@ -1086,8 +1009,6 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
 	clnt->trans_mod->close(clnt);
 put_trans:
 	v9fs_put_trans(clnt->trans_mod);
-destroy_tagpool:
-	p9_idpool_destroy(clnt->tagpool);
 free_client:
 	kfree(clnt);
 	return ERR_PTR(err);
@@ -2303,3 +2224,14 @@ int p9_client_readlink(struct p9_fid *fid, char **target)
 	return err;
 }
 EXPORT_SYMBOL(p9_client_readlink);
+
+int __init p9_client_init(void)
+{
+	p9_req_cache = KMEM_CACHE(p9_req_t, 0);
+	return p9_req_cache ? 0 : -ENOMEM;
+}
+
+void __exit p9_client_exit(void)
+{
+	kmem_cache_destroy(p9_req_cache);
+}
diff --git a/net/9p/mod.c b/net/9p/mod.c
index 253ba824a325..0da56d6af73b 100644
--- a/net/9p/mod.c
+++ b/net/9p/mod.c
@@ -171,11 +171,17 @@ void v9fs_put_trans(struct p9_trans_module *m)
  */
 static int __init init_p9(void)
 {
+	int ret;
+
+	ret = p9_client_init();
+	if (ret)
+		return ret;
+
 	p9_error_init();
 	pr_info("Installing 9P2000 support\n");
 	p9_trans_fd_init();
 
-	return 0;
+	return ret;
 }
 
 /**
@@ -188,6 +194,7 @@ static void __exit exit_p9(void)
 	pr_info("Unloading 9P2000 support\n");
 
 	p9_trans_fd_exit();
+	p9_client_exit();
 }
 
 module_init(init_p9)
-- 
2.20.1





* [PATCH 4.19 09/72] 9p: embed fcall in req to round down buffer allocs
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (7 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 08/72] 9p: Use a slab for allocating requests Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 10/72] 9p: add a per-client fcall kmem_cache Greg Kroah-Hartman
                   ` (69 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Matthew Wilcox, Dominique Martinet,
	Greg Kurz, Jun Piao, Sasha Levin

[ Upstream commit 523adb6cc10b48655c0abe556505240741425b49 ]

'msize' is often a power of two, or at least page-aligned, so avoiding
an overhead of two dozen bytes for each allocation will help the
allocator do its work and reduce memory fragmentation.
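
As a rough illustration of the arithmetic, the sketch below assumes that
large allocations are served from power-of-two size classes and that the
header costs 24 bytes (the "two dozen bytes" above). Both are assumptions
made for this example, and size_class(), combined_footprint() and
split_footprint() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed header size, taken from "two dozen bytes" in the text above. */
#define FCALL_HDR_SKETCH 24

/* Round up to the next power of two, standing in for a size class. */
static size_t size_class(size_t n)
{
	size_t c = 1;

	while (c < n)
		c <<= 1;
	return c;
}

/* Old layout: header and payload share a single allocation. */
static size_t combined_footprint(size_t msize)
{
	return size_class(FCALL_HDR_SKETCH + msize);
}

/* New layout: the payload is allocated alone, the header is embedded. */
static size_t split_footprint(size_t msize)
{
	return size_class(msize);
}
```

Under these assumptions, an msize of 8192 occupies a 16384-byte class when
the header is prepended and an 8192-byte class when it is not.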

Link: http://lkml.kernel.org/r/1533825236-22896-1-git-send-email-asmadeus@codewreck.org
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Jun Piao <piaojun@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/9p/client.h |   5 +-
 net/9p/client.c         | 167 +++++++++++++++++++++-------------------
 net/9p/trans_fd.c       |  12 +--
 net/9p/trans_rdma.c     |  29 +++----
 net/9p/trans_virtio.c   |  18 ++---
 net/9p/trans_xen.c      |  12 +--
 6 files changed, 125 insertions(+), 118 deletions(-)

diff --git a/include/net/9p/client.h b/include/net/9p/client.h
index a4dc42c53d18..c2671d40bb6b 100644
--- a/include/net/9p/client.h
+++ b/include/net/9p/client.h
@@ -95,8 +95,8 @@ struct p9_req_t {
 	int status;
 	int t_err;
 	wait_queue_head_t wq;
-	struct p9_fcall *tc;
-	struct p9_fcall *rc;
+	struct p9_fcall tc;
+	struct p9_fcall rc;
 	void *aux;
 	struct list_head req_list;
 };
@@ -230,6 +230,7 @@ int p9_client_mkdir_dotl(struct p9_fid *fid, const char *name, int mode,
 				kgid_t gid, struct p9_qid *);
 int p9_client_lock_dotl(struct p9_fid *fid, struct p9_flock *flock, u8 *status);
 int p9_client_getlock_dotl(struct p9_fid *fid, struct p9_getlock *fl);
+void p9_fcall_fini(struct p9_fcall *fc);
 struct p9_req_t *p9_tag_lookup(struct p9_client *, u16);
 void p9_client_cb(struct p9_client *c, struct p9_req_t *req, int status);
 
diff --git a/net/9p/client.c b/net/9p/client.c
index d8949c59d46e..83e39fef58e1 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -237,16 +237,20 @@ static int parse_opts(char *opts, struct p9_client *clnt)
 	return ret;
 }
 
-static struct p9_fcall *p9_fcall_alloc(int alloc_msize)
+static int p9_fcall_init(struct p9_fcall *fc, int alloc_msize)
 {
-	struct p9_fcall *fc;
-	fc = kmalloc(sizeof(struct p9_fcall) + alloc_msize, GFP_NOFS);
-	if (!fc)
-		return NULL;
+	fc->sdata = kmalloc(alloc_msize, GFP_NOFS);
+	if (!fc->sdata)
+		return -ENOMEM;
 	fc->capacity = alloc_msize;
-	fc->sdata = (char *) fc + sizeof(struct p9_fcall);
-	return fc;
+	return 0;
+}
+
+void p9_fcall_fini(struct p9_fcall *fc)
+{
+	kfree(fc->sdata);
 }
+EXPORT_SYMBOL(p9_fcall_fini);
 
 static struct kmem_cache *p9_req_cache;
 
@@ -269,13 +273,13 @@ p9_tag_alloc(struct p9_client *c, int8_t type, unsigned int max_size)
 	if (!req)
 		return NULL;
 
-	req->tc = p9_fcall_alloc(alloc_msize);
-	req->rc = p9_fcall_alloc(alloc_msize);
-	if (!req->tc || !req->rc)
+	if (p9_fcall_init(&req->tc, alloc_msize))
+		goto free_req;
+	if (p9_fcall_init(&req->rc, alloc_msize))
 		goto free;
 
-	p9pdu_reset(req->tc);
-	p9pdu_reset(req->rc);
+	p9pdu_reset(&req->tc);
+	p9pdu_reset(&req->rc);
 	req->status = REQ_STATUS_ALLOC;
 	init_waitqueue_head(&req->wq);
 	INIT_LIST_HEAD(&req->req_list);
@@ -287,7 +291,7 @@ p9_tag_alloc(struct p9_client *c, int8_t type, unsigned int max_size)
 				GFP_NOWAIT);
 	else
 		tag = idr_alloc(&c->reqs, req, 0, P9_NOTAG, GFP_NOWAIT);
-	req->tc->tag = tag;
+	req->tc.tag = tag;
 	spin_unlock_irq(&c->lock);
 	idr_preload_end();
 	if (tag < 0)
@@ -296,8 +300,9 @@ p9_tag_alloc(struct p9_client *c, int8_t type, unsigned int max_size)
 	return req;
 
 free:
-	kfree(req->tc);
-	kfree(req->rc);
+	p9_fcall_fini(&req->tc);
+	p9_fcall_fini(&req->rc);
+free_req:
 	kmem_cache_free(p9_req_cache, req);
 	return ERR_PTR(-ENOMEM);
 }
@@ -335,14 +340,14 @@ EXPORT_SYMBOL(p9_tag_lookup);
 static void p9_free_req(struct p9_client *c, struct p9_req_t *r)
 {
 	unsigned long flags;
-	u16 tag = r->tc->tag;
+	u16 tag = r->tc.tag;
 
 	p9_debug(P9_DEBUG_MUX, "clnt %p req %p tag: %d\n", c, r, tag);
 	spin_lock_irqsave(&c->lock, flags);
 	idr_remove(&c->reqs, tag);
 	spin_unlock_irqrestore(&c->lock, flags);
-	kfree(r->tc);
-	kfree(r->rc);
+	p9_fcall_fini(&r->tc);
+	p9_fcall_fini(&r->rc);
 	kmem_cache_free(p9_req_cache, r);
 }
 
@@ -374,7 +379,7 @@ static void p9_tag_cleanup(struct p9_client *c)
  */
 void p9_client_cb(struct p9_client *c, struct p9_req_t *req, int status)
 {
-	p9_debug(P9_DEBUG_MUX, " tag %d\n", req->tc->tag);
+	p9_debug(P9_DEBUG_MUX, " tag %d\n", req->tc.tag);
 
 	/*
 	 * This barrier is needed to make sure any change made to req before
@@ -384,7 +389,7 @@ void p9_client_cb(struct p9_client *c, struct p9_req_t *req, int status)
 	req->status = status;
 
 	wake_up(&req->wq);
-	p9_debug(P9_DEBUG_MUX, "wakeup: %d\n", req->tc->tag);
+	p9_debug(P9_DEBUG_MUX, "wakeup: %d\n", req->tc.tag);
 }
 EXPORT_SYMBOL(p9_client_cb);
 
@@ -455,18 +460,18 @@ static int p9_check_errors(struct p9_client *c, struct p9_req_t *req)
 	int err;
 	int ecode;
 
-	err = p9_parse_header(req->rc, NULL, &type, NULL, 0);
-	if (req->rc->size >= c->msize) {
+	err = p9_parse_header(&req->rc, NULL, &type, NULL, 0);
+	if (req->rc.size >= c->msize) {
 		p9_debug(P9_DEBUG_ERROR,
 			 "requested packet size too big: %d\n",
-			 req->rc->size);
+			 req->rc.size);
 		return -EIO;
 	}
 	/*
 	 * dump the response from server
 	 * This should be after check errors which poplulate pdu_fcall.
 	 */
-	trace_9p_protocol_dump(c, req->rc);
+	trace_9p_protocol_dump(c, &req->rc);
 	if (err) {
 		p9_debug(P9_DEBUG_ERROR, "couldn't parse header %d\n", err);
 		return err;
@@ -476,7 +481,7 @@ static int p9_check_errors(struct p9_client *c, struct p9_req_t *req)
 
 	if (!p9_is_proto_dotl(c)) {
 		char *ename;
-		err = p9pdu_readf(req->rc, c->proto_version, "s?d",
+		err = p9pdu_readf(&req->rc, c->proto_version, "s?d",
 				  &ename, &ecode);
 		if (err)
 			goto out_err;
@@ -492,7 +497,7 @@ static int p9_check_errors(struct p9_client *c, struct p9_req_t *req)
 		}
 		kfree(ename);
 	} else {
-		err = p9pdu_readf(req->rc, c->proto_version, "d", &ecode);
+		err = p9pdu_readf(&req->rc, c->proto_version, "d", &ecode);
 		err = -ecode;
 
 		p9_debug(P9_DEBUG_9P, "<<< RLERROR (%d)\n", -ecode);
@@ -526,12 +531,12 @@ static int p9_check_zc_errors(struct p9_client *c, struct p9_req_t *req,
 	int8_t type;
 	char *ename = NULL;
 
-	err = p9_parse_header(req->rc, NULL, &type, NULL, 0);
+	err = p9_parse_header(&req->rc, NULL, &type, NULL, 0);
 	/*
 	 * dump the response from server
 	 * This should be after parse_header which poplulate pdu_fcall.
 	 */
-	trace_9p_protocol_dump(c, req->rc);
+	trace_9p_protocol_dump(c, &req->rc);
 	if (err) {
 		p9_debug(P9_DEBUG_ERROR, "couldn't parse header %d\n", err);
 		return err;
@@ -546,13 +551,13 @@ static int p9_check_zc_errors(struct p9_client *c, struct p9_req_t *req,
 		/* 7 = header size for RERROR; */
 		int inline_len = in_hdrlen - 7;
 
-		len =  req->rc->size - req->rc->offset;
+		len = req->rc.size - req->rc.offset;
 		if (len > (P9_ZC_HDR_SZ - 7)) {
 			err = -EFAULT;
 			goto out_err;
 		}
 
-		ename = &req->rc->sdata[req->rc->offset];
+		ename = &req->rc.sdata[req->rc.offset];
 		if (len > inline_len) {
 			/* We have error in external buffer */
 			if (!copy_from_iter_full(ename + inline_len,
@@ -562,7 +567,7 @@ static int p9_check_zc_errors(struct p9_client *c, struct p9_req_t *req,
 			}
 		}
 		ename = NULL;
-		err = p9pdu_readf(req->rc, c->proto_version, "s?d",
+		err = p9pdu_readf(&req->rc, c->proto_version, "s?d",
 				  &ename, &ecode);
 		if (err)
 			goto out_err;
@@ -578,7 +583,7 @@ static int p9_check_zc_errors(struct p9_client *c, struct p9_req_t *req,
 		}
 		kfree(ename);
 	} else {
-		err = p9pdu_readf(req->rc, c->proto_version, "d", &ecode);
+		err = p9pdu_readf(&req->rc, c->proto_version, "d", &ecode);
 		err = -ecode;
 
 		p9_debug(P9_DEBUG_9P, "<<< RLERROR (%d)\n", -ecode);
@@ -611,7 +616,7 @@ static int p9_client_flush(struct p9_client *c, struct p9_req_t *oldreq)
 	int16_t oldtag;
 	int err;
 
-	err = p9_parse_header(oldreq->tc, NULL, NULL, &oldtag, 1);
+	err = p9_parse_header(&oldreq->tc, NULL, NULL, &oldtag, 1);
 	if (err)
 		return err;
 
@@ -655,12 +660,12 @@ static struct p9_req_t *p9_client_prepare_req(struct p9_client *c,
 		return req;
 
 	/* marshall the data */
-	p9pdu_prepare(req->tc, req->tc->tag, type);
-	err = p9pdu_vwritef(req->tc, c->proto_version, fmt, ap);
+	p9pdu_prepare(&req->tc, req->tc.tag, type);
+	err = p9pdu_vwritef(&req->tc, c->proto_version, fmt, ap);
 	if (err)
 		goto reterr;
-	p9pdu_finalize(c, req->tc);
-	trace_9p_client_req(c, type, req->tc->tag);
+	p9pdu_finalize(c, &req->tc);
+	trace_9p_client_req(c, type, req->tc.tag);
 	return req;
 reterr:
 	p9_free_req(c, req);
@@ -745,7 +750,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
 		goto reterr;
 
 	err = p9_check_errors(c, req);
-	trace_9p_client_res(c, type, req->rc->tag, err);
+	trace_9p_client_res(c, type, req->rc.tag, err);
 	if (!err)
 		return req;
 reterr:
@@ -827,7 +832,7 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type,
 		goto reterr;
 
 	err = p9_check_zc_errors(c, req, uidata, in_hdrlen);
-	trace_9p_client_res(c, type, req->rc->tag, err);
+	trace_9p_client_res(c, type, req->rc.tag, err);
 	if (!err)
 		return req;
 reterr:
@@ -910,10 +915,10 @@ static int p9_client_version(struct p9_client *c)
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
-	err = p9pdu_readf(req->rc, c->proto_version, "ds", &msize, &version);
+	err = p9pdu_readf(&req->rc, c->proto_version, "ds", &msize, &version);
 	if (err) {
 		p9_debug(P9_DEBUG_9P, "version error %d\n", err);
-		trace_9p_protocol_dump(c, req->rc);
+		trace_9p_protocol_dump(c, &req->rc);
 		goto error;
 	}
 
@@ -1077,9 +1082,9 @@ struct p9_fid *p9_client_attach(struct p9_client *clnt, struct p9_fid *afid,
 		goto error;
 	}
 
-	err = p9pdu_readf(req->rc, clnt->proto_version, "Q", &qid);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "Q", &qid);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		p9_free_req(clnt, req);
 		goto error;
 	}
@@ -1134,9 +1139,9 @@ struct p9_fid *p9_client_walk(struct p9_fid *oldfid, uint16_t nwname,
 		goto error;
 	}
 
-	err = p9pdu_readf(req->rc, clnt->proto_version, "R", &nwqids, &wqids);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "R", &nwqids, &wqids);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		p9_free_req(clnt, req);
 		goto clunk_fid;
 	}
@@ -1201,9 +1206,9 @@ int p9_client_open(struct p9_fid *fid, int mode)
 		goto error;
 	}
 
-	err = p9pdu_readf(req->rc, clnt->proto_version, "Qd", &qid, &iounit);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "Qd", &qid, &iounit);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		goto free_and_error;
 	}
 
@@ -1245,9 +1250,9 @@ int p9_client_create_dotl(struct p9_fid *ofid, const char *name, u32 flags, u32
 		goto error;
 	}
 
-	err = p9pdu_readf(req->rc, clnt->proto_version, "Qd", qid, &iounit);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "Qd", qid, &iounit);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		goto free_and_error;
 	}
 
@@ -1290,9 +1295,9 @@ int p9_client_fcreate(struct p9_fid *fid, const char *name, u32 perm, int mode,
 		goto error;
 	}
 
-	err = p9pdu_readf(req->rc, clnt->proto_version, "Qd", &qid, &iounit);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "Qd", &qid, &iounit);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		goto free_and_error;
 	}
 
@@ -1329,9 +1334,9 @@ int p9_client_symlink(struct p9_fid *dfid, const char *name,
 		goto error;
 	}
 
-	err = p9pdu_readf(req->rc, clnt->proto_version, "Q", qid);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "Q", qid);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		goto free_and_error;
 	}
 
@@ -1527,10 +1532,10 @@ p9_client_read(struct p9_fid *fid, u64 offset, struct iov_iter *to, int *err)
 			break;
 		}
 
-		*err = p9pdu_readf(req->rc, clnt->proto_version,
+		*err = p9pdu_readf(&req->rc, clnt->proto_version,
 				   "D", &count, &dataptr);
 		if (*err) {
-			trace_9p_protocol_dump(clnt, req->rc);
+			trace_9p_protocol_dump(clnt, &req->rc);
 			p9_free_req(clnt, req);
 			break;
 		}
@@ -1600,9 +1605,9 @@ p9_client_write(struct p9_fid *fid, u64 offset, struct iov_iter *from, int *err)
 			break;
 		}
 
-		*err = p9pdu_readf(req->rc, clnt->proto_version, "d", &count);
+		*err = p9pdu_readf(&req->rc, clnt->proto_version, "d", &count);
 		if (*err) {
-			trace_9p_protocol_dump(clnt, req->rc);
+			trace_9p_protocol_dump(clnt, &req->rc);
 			p9_free_req(clnt, req);
 			break;
 		}
@@ -1644,9 +1649,9 @@ struct p9_wstat *p9_client_stat(struct p9_fid *fid)
 		goto error;
 	}
 
-	err = p9pdu_readf(req->rc, clnt->proto_version, "wS", &ignored, ret);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "wS", &ignored, ret);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		p9_free_req(clnt, req);
 		goto error;
 	}
@@ -1697,9 +1702,9 @@ struct p9_stat_dotl *p9_client_getattr_dotl(struct p9_fid *fid,
 		goto error;
 	}
 
-	err = p9pdu_readf(req->rc, clnt->proto_version, "A", ret);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "A", ret);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		p9_free_req(clnt, req);
 		goto error;
 	}
@@ -1849,11 +1854,11 @@ int p9_client_statfs(struct p9_fid *fid, struct p9_rstatfs *sb)
 		goto error;
 	}
 
-	err = p9pdu_readf(req->rc, clnt->proto_version, "ddqqqqqqd", &sb->type,
-		&sb->bsize, &sb->blocks, &sb->bfree, &sb->bavail,
-		&sb->files, &sb->ffree, &sb->fsid, &sb->namelen);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "ddqqqqqqd", &sb->type,
+			  &sb->bsize, &sb->blocks, &sb->bfree, &sb->bavail,
+			  &sb->files, &sb->ffree, &sb->fsid, &sb->namelen);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		p9_free_req(clnt, req);
 		goto error;
 	}
@@ -1957,9 +1962,9 @@ struct p9_fid *p9_client_xattrwalk(struct p9_fid *file_fid,
 		err = PTR_ERR(req);
 		goto error;
 	}
-	err = p9pdu_readf(req->rc, clnt->proto_version, "q", attr_size);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "q", attr_size);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		p9_free_req(clnt, req);
 		goto clunk_fid;
 	}
@@ -2045,9 +2050,9 @@ int p9_client_readdir(struct p9_fid *fid, char *data, u32 count, u64 offset)
 		goto error;
 	}
 
-	err = p9pdu_readf(req->rc, clnt->proto_version, "D", &count, &dataptr);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "D", &count, &dataptr);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		goto free_and_error;
 	}
 	if (rsize < count) {
@@ -2086,9 +2091,9 @@ int p9_client_mknod_dotl(struct p9_fid *fid, const char *name, int mode,
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
-	err = p9pdu_readf(req->rc, clnt->proto_version, "Q", qid);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "Q", qid);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		goto error;
 	}
 	p9_debug(P9_DEBUG_9P, "<<< RMKNOD qid %x.%llx.%x\n", qid->type,
@@ -2117,9 +2122,9 @@ int p9_client_mkdir_dotl(struct p9_fid *fid, const char *name, int mode,
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
-	err = p9pdu_readf(req->rc, clnt->proto_version, "Q", qid);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "Q", qid);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		goto error;
 	}
 	p9_debug(P9_DEBUG_9P, "<<< RMKDIR qid %x.%llx.%x\n", qid->type,
@@ -2152,9 +2157,9 @@ int p9_client_lock_dotl(struct p9_fid *fid, struct p9_flock *flock, u8 *status)
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
-	err = p9pdu_readf(req->rc, clnt->proto_version, "b", status);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "b", status);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		goto error;
 	}
 	p9_debug(P9_DEBUG_9P, "<<< RLOCK status %i\n", *status);
@@ -2183,11 +2188,11 @@ int p9_client_getlock_dotl(struct p9_fid *fid, struct p9_getlock *glock)
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
-	err = p9pdu_readf(req->rc, clnt->proto_version, "bqqds", &glock->type,
-			&glock->start, &glock->length, &glock->proc_id,
-			&glock->client_id);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "bqqds", &glock->type,
+			  &glock->start, &glock->length, &glock->proc_id,
+			  &glock->client_id);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		goto error;
 	}
 	p9_debug(P9_DEBUG_9P, "<<< RGETLOCK type %i start %lld length %lld "
@@ -2213,9 +2218,9 @@ int p9_client_readlink(struct p9_fid *fid, char **target)
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
-	err = p9pdu_readf(req->rc, clnt->proto_version, "s", target);
+	err = p9pdu_readf(&req->rc, clnt->proto_version, "s", target);
 	if (err) {
-		trace_9p_protocol_dump(clnt, req->rc);
+		trace_9p_protocol_dump(clnt, &req->rc);
 		goto error;
 	}
 	p9_debug(P9_DEBUG_9P, "<<< RREADLINK target %s\n", *target);
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index e2ef3c782c53..51615c0fb744 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -354,7 +354,7 @@ static void p9_read_work(struct work_struct *work)
 			goto error;
 		}
 
-		if (m->req->rc == NULL) {
+		if (!m->req->rc.sdata) {
 			p9_debug(P9_DEBUG_ERROR,
 				 "No recv fcall for tag %d (req %p), disconnecting!\n",
 				 m->rc.tag, m->req);
@@ -362,7 +362,7 @@ static void p9_read_work(struct work_struct *work)
 			err = -EIO;
 			goto error;
 		}
-		m->rc.sdata = (char *)m->req->rc + sizeof(struct p9_fcall);
+		m->rc.sdata = m->req->rc.sdata;
 		memcpy(m->rc.sdata, m->tmp_buf, m->rc.capacity);
 		m->rc.capacity = m->rc.size;
 	}
@@ -372,7 +372,7 @@ static void p9_read_work(struct work_struct *work)
 	 */
 	if ((m->req) && (m->rc.offset == m->rc.capacity)) {
 		p9_debug(P9_DEBUG_TRANS, "got new packet\n");
-		m->req->rc->size = m->rc.offset;
+		m->req->rc.size = m->rc.offset;
 		spin_lock(&m->client->lock);
 		if (m->req->status != REQ_STATUS_ERROR)
 			status = REQ_STATUS_RCVD;
@@ -469,8 +469,8 @@ static void p9_write_work(struct work_struct *work)
 		p9_debug(P9_DEBUG_TRANS, "move req %p\n", req);
 		list_move_tail(&req->req_list, &m->req_list);
 
-		m->wbuf = req->tc->sdata;
-		m->wsize = req->tc->size;
+		m->wbuf = req->tc.sdata;
+		m->wsize = req->tc.size;
 		m->wpos = 0;
 		spin_unlock(&m->client->lock);
 	}
@@ -663,7 +663,7 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req)
 	struct p9_conn *m = &ts->conn;
 
 	p9_debug(P9_DEBUG_TRANS, "mux %p task %p tcall %p id %d\n",
-		 m, current, req->tc, req->tc->id);
+		 m, current, &req->tc, req->tc.id);
 	if (m->err < 0)
 		return m->err;
 
diff --git a/net/9p/trans_rdma.c b/net/9p/trans_rdma.c
index b513cffeeb3c..5b0cda1aaa7a 100644
--- a/net/9p/trans_rdma.c
+++ b/net/9p/trans_rdma.c
@@ -122,7 +122,7 @@ struct p9_rdma_context {
 	dma_addr_t busa;
 	union {
 		struct p9_req_t *req;
-		struct p9_fcall *rc;
+		struct p9_fcall rc;
 	};
 };
 
@@ -320,8 +320,8 @@ recv_done(struct ib_cq *cq, struct ib_wc *wc)
 	if (wc->status != IB_WC_SUCCESS)
 		goto err_out;
 
-	c->rc->size = wc->byte_len;
-	err = p9_parse_header(c->rc, NULL, NULL, &tag, 1);
+	c->rc.size = wc->byte_len;
+	err = p9_parse_header(&c->rc, NULL, NULL, &tag, 1);
 	if (err)
 		goto err_out;
 
@@ -331,12 +331,13 @@ recv_done(struct ib_cq *cq, struct ib_wc *wc)
 
 	/* Check that we have not yet received a reply for this request.
 	 */
-	if (unlikely(req->rc)) {
+	if (unlikely(req->rc.sdata)) {
 		pr_err("Duplicate reply for request %d", tag);
 		goto err_out;
 	}
 
-	req->rc = c->rc;
+	req->rc.size = c->rc.size;
+	req->rc.sdata = c->rc.sdata;
 	p9_client_cb(client, req, REQ_STATUS_RCVD);
 
  out:
@@ -361,7 +362,7 @@ send_done(struct ib_cq *cq, struct ib_wc *wc)
 		container_of(wc->wr_cqe, struct p9_rdma_context, cqe);
 
 	ib_dma_unmap_single(rdma->cm_id->device,
-			    c->busa, c->req->tc->size,
+			    c->busa, c->req->tc.size,
 			    DMA_TO_DEVICE);
 	up(&rdma->sq_sem);
 	kfree(c);
@@ -401,7 +402,7 @@ post_recv(struct p9_client *client, struct p9_rdma_context *c)
 	struct ib_sge sge;
 
 	c->busa = ib_dma_map_single(rdma->cm_id->device,
-				    c->rc->sdata, client->msize,
+				    c->rc.sdata, client->msize,
 				    DMA_FROM_DEVICE);
 	if (ib_dma_mapping_error(rdma->cm_id->device, c->busa))
 		goto error;
@@ -443,9 +444,9 @@ static int rdma_request(struct p9_client *client, struct p9_req_t *req)
 	 **/
 	if (unlikely(atomic_read(&rdma->excess_rc) > 0)) {
 		if ((atomic_sub_return(1, &rdma->excess_rc) >= 0)) {
-			/* Got one ! */
-			kfree(req->rc);
-			req->rc = NULL;
+			/* Got one! */
+			p9_fcall_fini(&req->rc);
+			req->rc.sdata = NULL;
 			goto dont_need_post_recv;
 		} else {
 			/* We raced and lost. */
@@ -459,7 +460,7 @@ static int rdma_request(struct p9_client *client, struct p9_req_t *req)
 		err = -ENOMEM;
 		goto recv_error;
 	}
-	rpl_context->rc = req->rc;
+	rpl_context->rc.sdata = req->rc.sdata;
 
 	/*
 	 * Post a receive buffer for this request. We need to ensure
@@ -479,7 +480,7 @@ static int rdma_request(struct p9_client *client, struct p9_req_t *req)
 		goto recv_error;
 	}
 	/* remove posted receive buffer from request structure */
-	req->rc = NULL;
+	req->rc.sdata = NULL;
 
 dont_need_post_recv:
 	/* Post the request */
@@ -491,7 +492,7 @@ static int rdma_request(struct p9_client *client, struct p9_req_t *req)
 	c->req = req;
 
 	c->busa = ib_dma_map_single(rdma->cm_id->device,
-				    c->req->tc->sdata, c->req->tc->size,
+				    c->req->tc.sdata, c->req->tc.size,
 				    DMA_TO_DEVICE);
 	if (ib_dma_mapping_error(rdma->cm_id->device, c->busa)) {
 		err = -EIO;
@@ -501,7 +502,7 @@ static int rdma_request(struct p9_client *client, struct p9_req_t *req)
 	c->cqe.done = send_done;
 
 	sge.addr = c->busa;
-	sge.length = c->req->tc->size;
+	sge.length = c->req->tc.size;
 	sge.lkey = rdma->pd->local_dma_lkey;
 
 	wr.next = NULL;
diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index 7728b0acde09..3dd6ce1c0f2d 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -155,7 +155,7 @@ static void req_done(struct virtqueue *vq)
 		}
 
 		if (len) {
-			req->rc->size = len;
+			req->rc.size = len;
 			p9_client_cb(chan->client, req, REQ_STATUS_RCVD);
 		}
 	}
@@ -273,12 +273,12 @@ p9_virtio_request(struct p9_client *client, struct p9_req_t *req)
 	out_sgs = in_sgs = 0;
 	/* Handle out VirtIO ring buffers */
 	out = pack_sg_list(chan->sg, 0,
-			   VIRTQUEUE_NUM, req->tc->sdata, req->tc->size);
+			   VIRTQUEUE_NUM, req->tc.sdata, req->tc.size);
 	if (out)
 		sgs[out_sgs++] = chan->sg;
 
 	in = pack_sg_list(chan->sg, out,
-			  VIRTQUEUE_NUM, req->rc->sdata, req->rc->capacity);
+			  VIRTQUEUE_NUM, req->rc.sdata, req->rc.capacity);
 	if (in)
 		sgs[out_sgs + in_sgs++] = chan->sg + out;
 
@@ -416,15 +416,15 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 		out_nr_pages = DIV_ROUND_UP(n + offs, PAGE_SIZE);
 		if (n != outlen) {
 			__le32 v = cpu_to_le32(n);
-			memcpy(&req->tc->sdata[req->tc->size - 4], &v, 4);
+			memcpy(&req->tc.sdata[req->tc.size - 4], &v, 4);
 			outlen = n;
 		}
 		/* The size field of the message must include the length of the
 		 * header and the length of the data.  We didn't actually know
 		 * the length of the data until this point so add it in now.
 		 */
-		sz = cpu_to_le32(req->tc->size + outlen);
-		memcpy(&req->tc->sdata[0], &sz, sizeof(sz));
+		sz = cpu_to_le32(req->tc.size + outlen);
+		memcpy(&req->tc.sdata[0], &sz, sizeof(sz));
 	} else if (uidata) {
 		int n = p9_get_mapped_pages(chan, &in_pages, uidata,
 					    inlen, &offs, &need_drop);
@@ -433,7 +433,7 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 		in_nr_pages = DIV_ROUND_UP(n + offs, PAGE_SIZE);
 		if (n != inlen) {
 			__le32 v = cpu_to_le32(n);
-			memcpy(&req->tc->sdata[req->tc->size - 4], &v, 4);
+			memcpy(&req->tc.sdata[req->tc.size - 4], &v, 4);
 			inlen = n;
 		}
 	}
@@ -445,7 +445,7 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 
 	/* out data */
 	out = pack_sg_list(chan->sg, 0,
-			   VIRTQUEUE_NUM, req->tc->sdata, req->tc->size);
+			   VIRTQUEUE_NUM, req->tc.sdata, req->tc.size);
 
 	if (out)
 		sgs[out_sgs++] = chan->sg;
@@ -464,7 +464,7 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 	 * alloced memory and payload onto the user buffer.
 	 */
 	in = pack_sg_list(chan->sg, out,
-			  VIRTQUEUE_NUM, req->rc->sdata, in_hdr_len);
+			  VIRTQUEUE_NUM, req->rc.sdata, in_hdr_len);
 	if (in)
 		sgs[out_sgs + in_sgs++] = chan->sg + out;
 
diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
index 843cb823d9b9..782a07f2ad0c 100644
--- a/net/9p/trans_xen.c
+++ b/net/9p/trans_xen.c
@@ -141,7 +141,7 @@ static int p9_xen_request(struct p9_client *client, struct p9_req_t *p9_req)
 	struct xen_9pfs_front_priv *priv = NULL;
 	RING_IDX cons, prod, masked_cons, masked_prod;
 	unsigned long flags;
-	u32 size = p9_req->tc->size;
+	u32 size = p9_req->tc.size;
 	struct xen_9pfs_dataring *ring;
 	int num;
 
@@ -154,7 +154,7 @@ static int p9_xen_request(struct p9_client *client, struct p9_req_t *p9_req)
 	if (!priv || priv->client != client)
 		return -EINVAL;
 
-	num = p9_req->tc->tag % priv->num_rings;
+	num = p9_req->tc.tag % priv->num_rings;
 	ring = &priv->rings[num];
 
 again:
@@ -176,7 +176,7 @@ static int p9_xen_request(struct p9_client *client, struct p9_req_t *p9_req)
 	masked_prod = xen_9pfs_mask(prod, XEN_9PFS_RING_SIZE);
 	masked_cons = xen_9pfs_mask(cons, XEN_9PFS_RING_SIZE);
 
-	xen_9pfs_write_packet(ring->data.out, p9_req->tc->sdata, size,
+	xen_9pfs_write_packet(ring->data.out, p9_req->tc.sdata, size,
 			      &masked_prod, masked_cons, XEN_9PFS_RING_SIZE);
 
 	p9_req->status = REQ_STATUS_SENT;
@@ -229,12 +229,12 @@ static void p9_xen_response(struct work_struct *work)
 			continue;
 		}
 
-		memcpy(req->rc, &h, sizeof(h));
-		req->rc->offset = 0;
+		memcpy(&req->rc, &h, sizeof(h));
+		req->rc.offset = 0;
 
 		masked_cons = xen_9pfs_mask(cons, XEN_9PFS_RING_SIZE);
 		/* Then, read the whole packet (including the header) */
-		xen_9pfs_read_packet(req->rc->sdata, ring->data.in, h.size,
+		xen_9pfs_read_packet(req->rc.sdata, ring->data.in, h.size,
 				     masked_prod, &masked_cons,
 				     XEN_9PFS_RING_SIZE);
 
-- 
2.20.1





* [PATCH 4.19 10/72] 9p: add a per-client fcall kmem_cache
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (8 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 09/72] 9p: embed fcall in req to round down buffer allocs Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 11/72] 9p: rename p9_free_req() function Greg Kroah-Hartman
                   ` (68 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dominique Martinet, Jun Piao,
	Matthew Wilcox, Greg Kurz, Sasha Levin

[ Upstream commit 91a76be37ff89795526c452a6799576b03bec501 ]

Having a specific cache for the fcall allocations helps speed up
end-to-end latency.

The slab allocator automatically merges caches whose objects have the
same size, so we do not need to share a cache explicitly between
different clients that use the same msize.

Since the msize is negotiated with the server, only allocate the cache
after that negotiation has happened - previous allocations or
allocations of different sizes (e.g. zero-copy fcall) are made with
kmalloc directly.
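
The allocation rule described above can be illustrated with a small
user-space sketch. It is not the kernel code: toy_cache, toy_fcall and
the toy_* helpers are hypothetical names, and malloc()/free() stand in
for the slab allocator. It only shows the idea of using the cache when
the requested size matches exactly, and of recording which allocator
produced the buffer so that it is released through the same one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a slab cache: every object has one fixed size. */
struct toy_cache {
	size_t obj_size;
	unsigned int live;	/* objects currently handed out */
};

/* Same idea as struct p9_fcall after this patch: remember the origin. */
struct toy_fcall {
	size_t capacity;
	struct toy_cache *cache;	/* NULL when the buffer came from malloc() */
	unsigned char *sdata;
};

static void *toy_cache_alloc(struct toy_cache *c)
{
	void *p = malloc(c->obj_size);

	if (p)
		c->live++;
	return p;
}

static void toy_cache_free(struct toy_cache *c, void *p)
{
	c->live--;
	free(p);
}

/* Use the cache only when it exists and the size matches exactly. */
static int toy_fcall_init(struct toy_cache *c, struct toy_fcall *fc, size_t size)
{
	if (c && size == c->obj_size) {
		fc->sdata = toy_cache_alloc(c);
		fc->cache = c;
	} else {
		fc->sdata = malloc(size);
		fc->cache = NULL;
	}
	if (!fc->sdata)
		return -1;
	fc->capacity = size;
	return 0;
}

/* Release through whichever allocator produced the buffer. */
static void toy_fcall_fini(struct toy_fcall *fc)
{
	if (!fc->sdata)
		return;
	if (fc->cache)
		toy_cache_free(fc->cache, fc->sdata);
	else
		free(fc->sdata);
	fc->sdata = NULL;
}
```

In this sketch, a buffer requested before the cache exists (cache
pointer NULL) or with a different size falls back to malloc(), which
corresponds to the pre-negotiation and zero-copy cases mentioned above.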

Some figures from two beefy VMs with Connect-IB (SR-IOV) and trans=rdma,
with ior running 32 processes in parallel doing small 32-byte I/Os:
 - no alloc (4.18-rc7 request cache): 65.4k req/s
 - non-power of two alloc, no patch: 61.6k req/s
 - power of two alloc, no patch: 62.2k req/s
 - non-power of two alloc, with patch: 64.7k req/s
 - power of two alloc, with patch: 65.1k req/s

Link: http://lkml.kernel.org/r/1532943263-24378-2-git-send-email-asmadeus@codewreck.org
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Acked-by: Jun Piao <piaojun@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Greg Kurz <groug@kaod.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/9p/9p.h     |  4 ++++
 include/net/9p/client.h |  1 +
 net/9p/client.c         | 37 ++++++++++++++++++++++++++++++++-----
 3 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/include/net/9p/9p.h b/include/net/9p/9p.h
index b8eb51a661e5..4ab293f574e0 100644
--- a/include/net/9p/9p.h
+++ b/include/net/9p/9p.h
@@ -336,6 +336,9 @@ enum p9_qid_t {
 #define P9_NOFID	(u32)(~0)
 #define P9_MAXWELEM	16
 
+/* Minimal header size: size[4] type[1] tag[2] */
+#define P9_HDRSZ	7
+
 /* ample room for Twrite/Rread header */
 #define P9_IOHDRSZ	24
 
@@ -558,6 +561,7 @@ struct p9_fcall {
 	size_t offset;
 	size_t capacity;
 
+	struct kmem_cache *cache;
 	u8 *sdata;
 };
 
diff --git a/include/net/9p/client.h b/include/net/9p/client.h
index c2671d40bb6b..735f3979d559 100644
--- a/include/net/9p/client.h
+++ b/include/net/9p/client.h
@@ -123,6 +123,7 @@ struct p9_client {
 	struct p9_trans_module *trans_mod;
 	enum p9_trans_status status;
 	void *trans;
+	struct kmem_cache *fcall_cache;
 
 	union {
 		struct {
diff --git a/net/9p/client.c b/net/9p/client.c
index 83e39fef58e1..7ef54719c6f7 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -237,9 +237,16 @@ static int parse_opts(char *opts, struct p9_client *clnt)
 	return ret;
 }
 
-static int p9_fcall_init(struct p9_fcall *fc, int alloc_msize)
+static int p9_fcall_init(struct p9_client *c, struct p9_fcall *fc,
+			 int alloc_msize)
 {
-	fc->sdata = kmalloc(alloc_msize, GFP_NOFS);
+	if (likely(c->fcall_cache) && alloc_msize == c->msize) {
+		fc->sdata = kmem_cache_alloc(c->fcall_cache, GFP_NOFS);
+		fc->cache = c->fcall_cache;
+	} else {
+		fc->sdata = kmalloc(alloc_msize, GFP_NOFS);
+		fc->cache = NULL;
+	}
 	if (!fc->sdata)
 		return -ENOMEM;
 	fc->capacity = alloc_msize;
@@ -248,7 +255,16 @@ static int p9_fcall_init(struct p9_fcall *fc, int alloc_msize)
 
 void p9_fcall_fini(struct p9_fcall *fc)
 {
-	kfree(fc->sdata);
+	/* sdata can be NULL for interrupted requests in trans_rdma,
+	 * and kmem_cache_free does not do NULL-check for us
+	 */
+	if (unlikely(!fc->sdata))
+		return;
+
+	if (fc->cache)
+		kmem_cache_free(fc->cache, fc->sdata);
+	else
+		kfree(fc->sdata);
 }
 EXPORT_SYMBOL(p9_fcall_fini);
 
@@ -273,9 +289,9 @@ p9_tag_alloc(struct p9_client *c, int8_t type, unsigned int max_size)
 	if (!req)
 		return NULL;
 
-	if (p9_fcall_init(&req->tc, alloc_msize))
+	if (p9_fcall_init(c, &req->tc, alloc_msize))
 		goto free_req;
-	if (p9_fcall_init(&req->rc, alloc_msize))
+	if (p9_fcall_init(c, &req->rc, alloc_msize))
 		goto free;
 
 	p9pdu_reset(&req->tc);
@@ -965,6 +981,7 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
 
 	clnt->trans_mod = NULL;
 	clnt->trans = NULL;
+	clnt->fcall_cache = NULL;
 
 	client_id = utsname()->nodename;
 	memcpy(clnt->name, client_id, strlen(client_id) + 1);
@@ -1008,6 +1025,15 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
 	if (err)
 		goto close_trans;
 
+	/* P9_HDRSZ + 4 is the smallest packet header we can have that is
+	 * followed by data accessed from userspace by read
+	 */
+	clnt->fcall_cache =
+		kmem_cache_create_usercopy("9p-fcall-cache", clnt->msize,
+					   0, 0, P9_HDRSZ + 4,
+					   clnt->msize - (P9_HDRSZ + 4),
+					   NULL);
+
 	return clnt;
 
 close_trans:
@@ -1039,6 +1065,7 @@ void p9_client_destroy(struct p9_client *clnt)
 
 	p9_tag_cleanup(clnt);
 
+	kmem_cache_destroy(clnt->fcall_cache);
 	kfree(clnt);
 }
 EXPORT_SYMBOL(p9_client_destroy);
-- 
2.20.1





* [PATCH 4.19 11/72] 9p: rename p9_free_req() function
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (9 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 10/72] 9p: add a per-client fcall kmem_cache Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 12/72] 9p: Add refcount to p9_req_t Greg Kroah-Hartman
                   ` (67 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tomas Bortoli, Jun Piao,
	Dominique Martinet, Sasha Levin

[ Upstream commit 43cbcbee9938b17f77cf34f1bc12d302f456810f ]

In preparation for the next patch, which adds a refcount to p9_req_t,
rename the p9_free_req() function to p9_tag_remove().

In the next patch the actual kfree will be moved to another function.

Link: http://lkml.kernel.org/r/20180811144254.23665-1-tomasbortoli@gmail.com
Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Acked-by: Jun Piao <piaojun@huawei.com>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/9p/client.c | 100 ++++++++++++++++++++++++------------------------
 1 file changed, 50 insertions(+), 50 deletions(-)

diff --git a/net/9p/client.c b/net/9p/client.c
index 7ef54719c6f7..3cde9f619980 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -347,13 +347,13 @@ struct p9_req_t *p9_tag_lookup(struct p9_client *c, u16 tag)
 EXPORT_SYMBOL(p9_tag_lookup);
 
 /**
- * p9_free_req - Free a request.
+ * p9_tag_remove - Remove a tag.
  * @c: Client session.
- * @r: Request to free.
+ * @r: Request of reference.
  *
  * Context: Any context.
  */
-static void p9_free_req(struct p9_client *c, struct p9_req_t *r)
+static void p9_tag_remove(struct p9_client *c, struct p9_req_t *r)
 {
 	unsigned long flags;
 	u16 tag = r->tc.tag;
@@ -382,7 +382,7 @@ static void p9_tag_cleanup(struct p9_client *c)
 	rcu_read_lock();
 	idr_for_each_entry(&c->reqs, req, id) {
 		pr_info("Tag %d still in use\n", id);
-		p9_free_req(c, req);
+		p9_tag_remove(c, req);
 	}
 	rcu_read_unlock();
 }
@@ -650,7 +650,7 @@ static int p9_client_flush(struct p9_client *c, struct p9_req_t *oldreq)
 		if (c->trans_mod->cancelled)
 			c->trans_mod->cancelled(c, oldreq);
 
-	p9_free_req(c, req);
+	p9_tag_remove(c, req);
 	return 0;
 }
 
@@ -684,7 +684,7 @@ static struct p9_req_t *p9_client_prepare_req(struct p9_client *c,
 	trace_9p_client_req(c, type, req->tc.tag);
 	return req;
 reterr:
-	p9_free_req(c, req);
+	p9_tag_remove(c, req);
 	return ERR_PTR(err);
 }
 
@@ -694,7 +694,7 @@ static struct p9_req_t *p9_client_prepare_req(struct p9_client *c,
  * @type: type of request
  * @fmt: protocol format string (see protocol.c)
  *
- * Returns request structure (which client must free using p9_free_req)
+ * Returns request structure (which client must free using p9_tag_remove)
  */
 
 static struct p9_req_t *
@@ -770,7 +770,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
 	if (!err)
 		return req;
 reterr:
-	p9_free_req(c, req);
+	p9_tag_remove(c, req);
 	return ERR_PTR(safe_errno(err));
 }
 
@@ -785,7 +785,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
  * @hdrlen: reader header size, This is the size of response protocol data
  * @fmt: protocol format string (see protocol.c)
  *
- * Returns request structure (which client must free using p9_free_req)
+ * Returns request structure (which client must free using p9_tag_remove)
  */
 static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type,
 					 struct iov_iter *uidata,
@@ -852,7 +852,7 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type,
 	if (!err)
 		return req;
 reterr:
-	p9_free_req(c, req);
+	p9_tag_remove(c, req);
 	return ERR_PTR(safe_errno(err));
 }
 
@@ -963,7 +963,7 @@ static int p9_client_version(struct p9_client *c)
 
 error:
 	kfree(version);
-	p9_free_req(c, req);
+	p9_tag_remove(c, req);
 
 	return err;
 }
@@ -1112,7 +1112,7 @@ struct p9_fid *p9_client_attach(struct p9_client *clnt, struct p9_fid *afid,
 	err = p9pdu_readf(&req->rc, clnt->proto_version, "Q", &qid);
 	if (err) {
 		trace_9p_protocol_dump(clnt, &req->rc);
-		p9_free_req(clnt, req);
+		p9_tag_remove(clnt, req);
 		goto error;
 	}
 
@@ -1121,7 +1121,7 @@ struct p9_fid *p9_client_attach(struct p9_client *clnt, struct p9_fid *afid,
 
 	memmove(&fid->qid, &qid, sizeof(struct p9_qid));
 
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 	return fid;
 
 error:
@@ -1169,10 +1169,10 @@ struct p9_fid *p9_client_walk(struct p9_fid *oldfid, uint16_t nwname,
 	err = p9pdu_readf(&req->rc, clnt->proto_version, "R", &nwqids, &wqids);
 	if (err) {
 		trace_9p_protocol_dump(clnt, &req->rc);
-		p9_free_req(clnt, req);
+		p9_tag_remove(clnt, req);
 		goto clunk_fid;
 	}
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 
 	p9_debug(P9_DEBUG_9P, "<<< RWALK nwqid %d:\n", nwqids);
 
@@ -1247,7 +1247,7 @@ int p9_client_open(struct p9_fid *fid, int mode)
 	fid->iounit = iounit;
 
 free_and_error:
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 error:
 	return err;
 }
@@ -1292,7 +1292,7 @@ int p9_client_create_dotl(struct p9_fid *ofid, const char *name, u32 flags, u32
 	ofid->iounit = iounit;
 
 free_and_error:
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 error:
 	return err;
 }
@@ -1337,7 +1337,7 @@ int p9_client_fcreate(struct p9_fid *fid, const char *name, u32 perm, int mode,
 	fid->iounit = iounit;
 
 free_and_error:
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 error:
 	return err;
 }
@@ -1371,7 +1371,7 @@ int p9_client_symlink(struct p9_fid *dfid, const char *name,
 			qid->type, (unsigned long long)qid->path, qid->version);
 
 free_and_error:
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 error:
 	return err;
 }
@@ -1391,7 +1391,7 @@ int p9_client_link(struct p9_fid *dfid, struct p9_fid *oldfid, const char *newna
 		return PTR_ERR(req);
 
 	p9_debug(P9_DEBUG_9P, "<<< RLINK\n");
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 	return 0;
 }
 EXPORT_SYMBOL(p9_client_link);
@@ -1415,7 +1415,7 @@ int p9_client_fsync(struct p9_fid *fid, int datasync)
 
 	p9_debug(P9_DEBUG_9P, "<<< RFSYNC fid %d\n", fid->fid);
 
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 
 error:
 	return err;
@@ -1450,7 +1450,7 @@ int p9_client_clunk(struct p9_fid *fid)
 
 	p9_debug(P9_DEBUG_9P, "<<< RCLUNK fid %d\n", fid->fid);
 
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 error:
 	/*
 	 * Fid is not valid even after a failed clunk
@@ -1484,7 +1484,7 @@ int p9_client_remove(struct p9_fid *fid)
 
 	p9_debug(P9_DEBUG_9P, "<<< RREMOVE fid %d\n", fid->fid);
 
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 error:
 	if (err == -ERESTARTSYS)
 		p9_client_clunk(fid);
@@ -1511,7 +1511,7 @@ int p9_client_unlinkat(struct p9_fid *dfid, const char *name, int flags)
 	}
 	p9_debug(P9_DEBUG_9P, "<<< RUNLINKAT fid %d %s\n", dfid->fid, name);
 
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 error:
 	return err;
 }
@@ -1563,7 +1563,7 @@ p9_client_read(struct p9_fid *fid, u64 offset, struct iov_iter *to, int *err)
 				   "D", &count, &dataptr);
 		if (*err) {
 			trace_9p_protocol_dump(clnt, &req->rc);
-			p9_free_req(clnt, req);
+			p9_tag_remove(clnt, req);
 			break;
 		}
 		if (rsize < count) {
@@ -1573,7 +1573,7 @@ p9_client_read(struct p9_fid *fid, u64 offset, struct iov_iter *to, int *err)
 
 		p9_debug(P9_DEBUG_9P, "<<< RREAD count %d\n", count);
 		if (!count) {
-			p9_free_req(clnt, req);
+			p9_tag_remove(clnt, req);
 			break;
 		}
 
@@ -1583,7 +1583,7 @@ p9_client_read(struct p9_fid *fid, u64 offset, struct iov_iter *to, int *err)
 			offset += n;
 			if (n != count) {
 				*err = -EFAULT;
-				p9_free_req(clnt, req);
+				p9_tag_remove(clnt, req);
 				break;
 			}
 		} else {
@@ -1591,7 +1591,7 @@ p9_client_read(struct p9_fid *fid, u64 offset, struct iov_iter *to, int *err)
 			total += count;
 			offset += count;
 		}
-		p9_free_req(clnt, req);
+		p9_tag_remove(clnt, req);
 	}
 	return total;
 }
@@ -1635,7 +1635,7 @@ p9_client_write(struct p9_fid *fid, u64 offset, struct iov_iter *from, int *err)
 		*err = p9pdu_readf(&req->rc, clnt->proto_version, "d", &count);
 		if (*err) {
 			trace_9p_protocol_dump(clnt, &req->rc);
-			p9_free_req(clnt, req);
+			p9_tag_remove(clnt, req);
 			break;
 		}
 		if (rsize < count) {
@@ -1645,7 +1645,7 @@ p9_client_write(struct p9_fid *fid, u64 offset, struct iov_iter *from, int *err)
 
 		p9_debug(P9_DEBUG_9P, "<<< RWRITE count %d\n", count);
 
-		p9_free_req(clnt, req);
+		p9_tag_remove(clnt, req);
 		iov_iter_advance(from, count);
 		total += count;
 		offset += count;
@@ -1679,7 +1679,7 @@ struct p9_wstat *p9_client_stat(struct p9_fid *fid)
 	err = p9pdu_readf(&req->rc, clnt->proto_version, "wS", &ignored, ret);
 	if (err) {
 		trace_9p_protocol_dump(clnt, &req->rc);
-		p9_free_req(clnt, req);
+		p9_tag_remove(clnt, req);
 		goto error;
 	}
 
@@ -1696,7 +1696,7 @@ struct p9_wstat *p9_client_stat(struct p9_fid *fid)
 		from_kgid(&init_user_ns, ret->n_gid),
 		from_kuid(&init_user_ns, ret->n_muid));
 
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 	return ret;
 
 error:
@@ -1732,7 +1732,7 @@ struct p9_stat_dotl *p9_client_getattr_dotl(struct p9_fid *fid,
 	err = p9pdu_readf(&req->rc, clnt->proto_version, "A", ret);
 	if (err) {
 		trace_9p_protocol_dump(clnt, &req->rc);
-		p9_free_req(clnt, req);
+		p9_tag_remove(clnt, req);
 		goto error;
 	}
 
@@ -1757,7 +1757,7 @@ struct p9_stat_dotl *p9_client_getattr_dotl(struct p9_fid *fid,
 		ret->st_ctime_nsec, ret->st_btime_sec, ret->st_btime_nsec,
 		ret->st_gen, ret->st_data_version);
 
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 	return ret;
 
 error:
@@ -1826,7 +1826,7 @@ int p9_client_wstat(struct p9_fid *fid, struct p9_wstat *wst)
 
 	p9_debug(P9_DEBUG_9P, "<<< RWSTAT fid %d\n", fid->fid);
 
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 error:
 	return err;
 }
@@ -1858,7 +1858,7 @@ int p9_client_setattr(struct p9_fid *fid, struct p9_iattr_dotl *p9attr)
 		goto error;
 	}
 	p9_debug(P9_DEBUG_9P, "<<< RSETATTR fid %d\n", fid->fid);
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 error:
 	return err;
 }
@@ -1886,7 +1886,7 @@ int p9_client_statfs(struct p9_fid *fid, struct p9_rstatfs *sb)
 			  &sb->files, &sb->ffree, &sb->fsid, &sb->namelen);
 	if (err) {
 		trace_9p_protocol_dump(clnt, &req->rc);
-		p9_free_req(clnt, req);
+		p9_tag_remove(clnt, req);
 		goto error;
 	}
 
@@ -1897,7 +1897,7 @@ int p9_client_statfs(struct p9_fid *fid, struct p9_rstatfs *sb)
 		sb->blocks, sb->bfree, sb->bavail, sb->files,  sb->ffree,
 		sb->fsid, (long int)sb->namelen);
 
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 error:
 	return err;
 }
@@ -1925,7 +1925,7 @@ int p9_client_rename(struct p9_fid *fid,
 
 	p9_debug(P9_DEBUG_9P, "<<< RRENAME fid %d\n", fid->fid);
 
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 error:
 	return err;
 }
@@ -1955,7 +1955,7 @@ int p9_client_renameat(struct p9_fid *olddirfid, const char *old_name,
 	p9_debug(P9_DEBUG_9P, "<<< RRENAMEAT newdirfid %d new name %s\n",
 		   newdirfid->fid, new_name);
 
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 error:
 	return err;
 }
@@ -1992,10 +1992,10 @@ struct p9_fid *p9_client_xattrwalk(struct p9_fid *file_fid,
 	err = p9pdu_readf(&req->rc, clnt->proto_version, "q", attr_size);
 	if (err) {
 		trace_9p_protocol_dump(clnt, &req->rc);
-		p9_free_req(clnt, req);
+		p9_tag_remove(clnt, req);
 		goto clunk_fid;
 	}
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 	p9_debug(P9_DEBUG_9P, "<<<  RXATTRWALK fid %d size %llu\n",
 		attr_fid->fid, *attr_size);
 	return attr_fid;
@@ -2029,7 +2029,7 @@ int p9_client_xattrcreate(struct p9_fid *fid, const char *name,
 		goto error;
 	}
 	p9_debug(P9_DEBUG_9P, "<<< RXATTRCREATE fid %d\n", fid->fid);
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 error:
 	return err;
 }
@@ -2092,11 +2092,11 @@ int p9_client_readdir(struct p9_fid *fid, char *data, u32 count, u64 offset)
 	if (non_zc)
 		memmove(data, dataptr, count);
 
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 	return count;
 
 free_and_error:
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 error:
 	return err;
 }
@@ -2127,7 +2127,7 @@ int p9_client_mknod_dotl(struct p9_fid *fid, const char *name, int mode,
 				(unsigned long long)qid->path, qid->version);
 
 error:
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 	return err;
 
 }
@@ -2158,7 +2158,7 @@ int p9_client_mkdir_dotl(struct p9_fid *fid, const char *name, int mode,
 				(unsigned long long)qid->path, qid->version);
 
 error:
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 	return err;
 
 }
@@ -2191,7 +2191,7 @@ int p9_client_lock_dotl(struct p9_fid *fid, struct p9_flock *flock, u8 *status)
 	}
 	p9_debug(P9_DEBUG_9P, "<<< RLOCK status %i\n", *status);
 error:
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 	return err;
 
 }
@@ -2226,7 +2226,7 @@ int p9_client_getlock_dotl(struct p9_fid *fid, struct p9_getlock *glock)
 		"proc_id %d client_id %s\n", glock->type, glock->start,
 		glock->length, glock->proc_id, glock->client_id);
 error:
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 	return err;
 }
 EXPORT_SYMBOL(p9_client_getlock_dotl);
@@ -2252,7 +2252,7 @@ int p9_client_readlink(struct p9_fid *fid, char **target)
 	}
 	p9_debug(P9_DEBUG_9P, "<<< RREADLINK target %s\n", *target);
 error:
-	p9_free_req(clnt, req);
+	p9_tag_remove(clnt, req);
 	return err;
 }
 EXPORT_SYMBOL(p9_client_readlink);
-- 
2.20.1





* [PATCH 4.19 12/72] 9p: Add refcount to p9_req_t
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (10 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 11/72] 9p: rename p9_free_req() function Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 13/72] 9p/rdma: do not disconnect on down_interruptible EAGAIN Greg Kroah-Hartman
                   ` (66 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tomas Bortoli,
	syzbot+467050c1ce275af2a5b8, Dominique Martinet, Sasha Levin

[ Upstream commit 728356dedeff8ef999cb436c71333ef4ac51a81c ]

To avoid use-after-free(s), use a refcount to keep track of the
usable references to any instantiated struct p9_req_t.

This commit adds p9_req_put(), p9_req_get() and p9_req_try_get() as
wrappers to kref_put(), kref_get() and kref_get_unless_zero().
These are used by the client and the transports to keep track of
valid requests' references.

p9_req_free() is added and used as the release callback by kref_put().
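
As a rough user-space illustration of the get / try_get / put pattern
described above (not the kernel code: struct demo_req, the demo_*
helpers and the demo_freed counter are made-up names, and C11 atomics
stand in for struct kref), a minimal sketch could look like this:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct p9_req_t; not the kernel type. */
struct demo_req {
	atomic_int refcount;
	int tag;
};

static int demo_freed;	/* counts releases, for illustration only */

static struct demo_req *demo_req_alloc(int tag, int initial_refs)
{
	struct demo_req *r = malloc(sizeof(*r));

	assert(initial_refs > 0);
	if (!r)
		return NULL;
	atomic_init(&r->refcount, initial_refs);
	r->tag = tag;
	return r;
}

/* Like kref_get(): the caller must already hold a reference. */
static void demo_req_get(struct demo_req *r)
{
	assert(r != NULL);
	atomic_fetch_add(&r->refcount, 1);
}

/* Like kref_get_unless_zero(): fails once the count has reached zero. */
static bool demo_req_try_get(struct demo_req *r)
{
	int old = atomic_load(&r->refcount);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&r->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Like kref_put(): returns 1 if this call dropped the last reference. */
static int demo_req_put(struct demo_req *r)
{
	assert(r != NULL);
	if (atomic_fetch_sub(&r->refcount, 1) == 1) {
		demo_freed++;
		free(r);
		return 1;
	}
	return 0;
}
```

The non-zero return from the final put is what lets a caller such as
a cleanup path notice whether other references are still outstanding.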

Add SLAB_TYPESAFE_BY_RCU as it ensures that the memory freed by
kmem_cache_free() will not be reused for another type until the rcu
synchronisation period is over, so it is safe to try to take a
reference on an address obtained under the rcu read lock, without
corrupting random memory, while the lock is held.

Link: http://lkml.kernel.org/r/1535626341-20693-1-git-send-email-asmadeus@codewreck.org
Co-developed-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Reported-by: syzbot+467050c1ce275af2a5b8@syzkaller.appspotmail.com
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/9p/client.h | 14 ++++++++++
 net/9p/client.c         | 57 ++++++++++++++++++++++++++++++++++++-----
 net/9p/trans_fd.c       | 11 +++++++-
 net/9p/trans_rdma.c     |  1 +
 net/9p/trans_virtio.c   | 26 ++++++++++++++++---
 net/9p/trans_xen.c      |  1 +
 6 files changed, 98 insertions(+), 12 deletions(-)

diff --git a/include/net/9p/client.h b/include/net/9p/client.h
index 735f3979d559..947a570307a6 100644
--- a/include/net/9p/client.h
+++ b/include/net/9p/client.h
@@ -94,6 +94,7 @@ enum p9_req_status_t {
 struct p9_req_t {
 	int status;
 	int t_err;
+	struct kref refcount;
 	wait_queue_head_t wq;
 	struct p9_fcall tc;
 	struct p9_fcall rc;
@@ -233,6 +234,19 @@ int p9_client_lock_dotl(struct p9_fid *fid, struct p9_flock *flock, u8 *status);
 int p9_client_getlock_dotl(struct p9_fid *fid, struct p9_getlock *fl);
 void p9_fcall_fini(struct p9_fcall *fc);
 struct p9_req_t *p9_tag_lookup(struct p9_client *, u16);
+
+static inline void p9_req_get(struct p9_req_t *r)
+{
+	kref_get(&r->refcount);
+}
+
+static inline int p9_req_try_get(struct p9_req_t *r)
+{
+	return kref_get_unless_zero(&r->refcount);
+}
+
+int p9_req_put(struct p9_req_t *r);
+
 void p9_client_cb(struct p9_client *c, struct p9_req_t *req, int status);
 
 int p9_parse_header(struct p9_fcall *, int32_t *, int8_t *, int16_t *, int);
diff --git a/net/9p/client.c b/net/9p/client.c
index 3cde9f619980..4becde979462 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -313,6 +313,18 @@ p9_tag_alloc(struct p9_client *c, int8_t type, unsigned int max_size)
 	if (tag < 0)
 		goto free;
 
+	/* Init ref to two because in the general case there is one ref
+	 * that is put asynchronously by a writer thread, one ref
+	 * temporarily given by p9_tag_lookup and put by p9_client_cb
+	 * in the recv thread, and one ref put by p9_tag_remove in the
+	 * main thread. The only exception is virtio that does not use
+	 * p9_tag_lookup but does not have a writer thread either
+	 * (the write happens synchronously in the request/zc_request
+	 * callback), so p9_client_cb eats the second ref there
+	 * as the pointer is duplicated directly by virtqueue_add_sgs()
+	 */
+	refcount_set(&req->refcount.refcount, 2);
+
 	return req;
 
 free:
@@ -336,10 +348,21 @@ struct p9_req_t *p9_tag_lookup(struct p9_client *c, u16 tag)
 	struct p9_req_t *req;
 
 	rcu_read_lock();
+again:
 	req = idr_find(&c->reqs, tag);
-	/* There's no refcount on the req; a malicious server could cause
-	 * us to dereference a NULL pointer
-	 */
+	if (req) {
+		/* We have to be careful with the req found under rcu_read_lock
+		 * Thanks to SLAB_TYPESAFE_BY_RCU we can safely try to get the
+		 * ref again without corrupting other data, then check again
+		 * that the tag matches once we have the ref
+		 */
+		if (!p9_req_try_get(req))
+			goto again;
+		if (req->tc.tag != tag) {
+			p9_req_put(req);
+			goto again;
+		}
+	}
 	rcu_read_unlock();
 
 	return req;
@@ -353,7 +376,7 @@ EXPORT_SYMBOL(p9_tag_lookup);
  *
  * Context: Any context.
  */
-static void p9_tag_remove(struct p9_client *c, struct p9_req_t *r)
+static int p9_tag_remove(struct p9_client *c, struct p9_req_t *r)
 {
 	unsigned long flags;
 	u16 tag = r->tc.tag;
@@ -362,11 +385,23 @@ static void p9_tag_remove(struct p9_client *c, struct p9_req_t *r)
 	spin_lock_irqsave(&c->lock, flags);
 	idr_remove(&c->reqs, tag);
 	spin_unlock_irqrestore(&c->lock, flags);
+	return p9_req_put(r);
+}
+
+static void p9_req_free(struct kref *ref)
+{
+	struct p9_req_t *r = container_of(ref, struct p9_req_t, refcount);
 	p9_fcall_fini(&r->tc);
 	p9_fcall_fini(&r->rc);
 	kmem_cache_free(p9_req_cache, r);
 }
 
+int p9_req_put(struct p9_req_t *r)
+{
+	return kref_put(&r->refcount, p9_req_free);
+}
+EXPORT_SYMBOL(p9_req_put);
+
 /**
  * p9_tag_cleanup - cleans up tags structure and reclaims resources
  * @c:  v9fs client struct
@@ -382,7 +417,9 @@ static void p9_tag_cleanup(struct p9_client *c)
 	rcu_read_lock();
 	idr_for_each_entry(&c->reqs, req, id) {
 		pr_info("Tag %d still in use\n", id);
-		p9_tag_remove(c, req);
+		if (p9_tag_remove(c, req) == 0)
+			pr_warn("Packet with tag %d has still references",
+				req->tc.tag);
 	}
 	rcu_read_unlock();
 }
@@ -406,6 +443,7 @@ void p9_client_cb(struct p9_client *c, struct p9_req_t *req, int status)
 
 	wake_up(&req->wq);
 	p9_debug(P9_DEBUG_MUX, "wakeup: %d\n", req->tc.tag);
+	p9_req_put(req);
 }
 EXPORT_SYMBOL(p9_client_cb);
 
@@ -646,9 +684,10 @@ static int p9_client_flush(struct p9_client *c, struct p9_req_t *oldreq)
 	 * if we haven't received a response for oldreq,
 	 * remove it from the list
 	 */
-	if (oldreq->status == REQ_STATUS_SENT)
+	if (oldreq->status == REQ_STATUS_SENT) {
 		if (c->trans_mod->cancelled)
 			c->trans_mod->cancelled(c, oldreq);
+	}
 
 	p9_tag_remove(c, req);
 	return 0;
@@ -685,6 +724,8 @@ static struct p9_req_t *p9_client_prepare_req(struct p9_client *c,
 	return req;
 reterr:
 	p9_tag_remove(c, req);
+	/* We have to put also the 2nd reference as it won't be used */
+	p9_req_put(req);
 	return ERR_PTR(err);
 }
 
@@ -719,6 +760,8 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
 
 	err = c->trans_mod->request(c, req);
 	if (err < 0) {
+		/* write won't happen */
+		p9_req_put(req);
 		if (err != -ERESTARTSYS && err != -EFAULT)
 			c->status = Disconnected;
 		goto recalc_sigpending;
@@ -2259,7 +2302,7 @@ EXPORT_SYMBOL(p9_client_readlink);
 
 int __init p9_client_init(void)
 {
-	p9_req_cache = KMEM_CACHE(p9_req_t, 0);
+	p9_req_cache = KMEM_CACHE(p9_req_t, SLAB_TYPESAFE_BY_RCU);
 	return p9_req_cache ? 0 : -ENOMEM;
 }
 
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index 51615c0fb744..aca528722183 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -132,6 +132,7 @@ struct p9_conn {
 	struct list_head req_list;
 	struct list_head unsent_req_list;
 	struct p9_req_t *req;
+	struct p9_req_t *wreq;
 	char tmp_buf[7];
 	struct p9_fcall rc;
 	int wpos;
@@ -383,6 +384,7 @@ static void p9_read_work(struct work_struct *work)
 		m->rc.sdata = NULL;
 		m->rc.offset = 0;
 		m->rc.capacity = 0;
+		p9_req_put(m->req);
 		m->req = NULL;
 	}
 
@@ -472,6 +474,8 @@ static void p9_write_work(struct work_struct *work)
 		m->wbuf = req->tc.sdata;
 		m->wsize = req->tc.size;
 		m->wpos = 0;
+		p9_req_get(req);
+		m->wreq = req;
 		spin_unlock(&m->client->lock);
 	}
 
@@ -492,8 +496,11 @@ static void p9_write_work(struct work_struct *work)
 	}
 
 	m->wpos += err;
-	if (m->wpos == m->wsize)
+	if (m->wpos == m->wsize) {
 		m->wpos = m->wsize = 0;
+		p9_req_put(m->wreq);
+		m->wreq = NULL;
+	}
 
 end_clear:
 	clear_bit(Wworksched, &m->wsched);
@@ -694,6 +701,7 @@ static int p9_fd_cancel(struct p9_client *client, struct p9_req_t *req)
 	if (req->status == REQ_STATUS_UNSENT) {
 		list_del(&req->req_list);
 		req->status = REQ_STATUS_FLSHD;
+		p9_req_put(req);
 		ret = 0;
 	}
 	spin_unlock(&client->lock);
@@ -711,6 +719,7 @@ static int p9_fd_cancelled(struct p9_client *client, struct p9_req_t *req)
 	spin_lock(&client->lock);
 	list_del(&req->req_list);
 	spin_unlock(&client->lock);
+	p9_req_put(req);
 
 	return 0;
 }
diff --git a/net/9p/trans_rdma.c b/net/9p/trans_rdma.c
index 5b0cda1aaa7a..9cc9b3a19ee7 100644
--- a/net/9p/trans_rdma.c
+++ b/net/9p/trans_rdma.c
@@ -365,6 +365,7 @@ send_done(struct ib_cq *cq, struct ib_wc *wc)
 			    c->busa, c->req->tc.size,
 			    DMA_TO_DEVICE);
 	up(&rdma->sq_sem);
+	p9_req_put(c->req);
 	kfree(c);
 }
 
diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index 3dd6ce1c0f2d..eb596c2ed546 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -207,6 +207,13 @@ static int p9_virtio_cancel(struct p9_client *client, struct p9_req_t *req)
 	return 1;
 }
 
+/* Reply won't come, so drop req ref */
+static int p9_virtio_cancelled(struct p9_client *client, struct p9_req_t *req)
+{
+	p9_req_put(req);
+	return 0;
+}
+
 /**
  * pack_sg_list_p - Just like pack_sg_list. Instead of taking a buffer,
  * this takes a list of pages.
@@ -404,6 +411,7 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 	struct scatterlist *sgs[4];
 	size_t offs;
 	int need_drop = 0;
+	int kicked = 0;
 
 	p9_debug(P9_DEBUG_TRANS, "virtio request\n");
 
@@ -411,8 +419,10 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 		__le32 sz;
 		int n = p9_get_mapped_pages(chan, &out_pages, uodata,
 					    outlen, &offs, &need_drop);
-		if (n < 0)
-			return n;
+		if (n < 0) {
+			err = n;
+			goto err_out;
+		}
 		out_nr_pages = DIV_ROUND_UP(n + offs, PAGE_SIZE);
 		if (n != outlen) {
 			__le32 v = cpu_to_le32(n);
@@ -428,8 +438,10 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 	} else if (uidata) {
 		int n = p9_get_mapped_pages(chan, &in_pages, uidata,
 					    inlen, &offs, &need_drop);
-		if (n < 0)
-			return n;
+		if (n < 0) {
+			err = n;
+			goto err_out;
+		}
 		in_nr_pages = DIV_ROUND_UP(n + offs, PAGE_SIZE);
 		if (n != inlen) {
 			__le32 v = cpu_to_le32(n);
@@ -498,6 +510,7 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 	}
 	virtqueue_kick(chan->vq);
 	spin_unlock_irqrestore(&chan->lock, flags);
+	kicked = 1;
 	p9_debug(P9_DEBUG_TRANS, "virtio request kicked\n");
 	err = wait_event_killable(req->wq, req->status >= REQ_STATUS_RCVD);
 	/*
@@ -518,6 +531,10 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 	}
 	kvfree(in_pages);
 	kvfree(out_pages);
+	if (!kicked) {
+		/* reply won't come */
+		p9_req_put(req);
+	}
 	return err;
 }
 
@@ -750,6 +767,7 @@ static struct p9_trans_module p9_virtio_trans = {
 	.request = p9_virtio_request,
 	.zc_request = p9_virtio_zc_request,
 	.cancel = p9_virtio_cancel,
+	.cancelled = p9_virtio_cancelled,
 	/*
 	 * We leave one entry for input and one entry for response
 	 * headers. We also skip one more entry to accomodate, address
diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
index 782a07f2ad0c..e2fbf3677b9b 100644
--- a/net/9p/trans_xen.c
+++ b/net/9p/trans_xen.c
@@ -185,6 +185,7 @@ static int p9_xen_request(struct p9_client *client, struct p9_req_t *p9_req)
 	ring->intf->out_prod = prod;
 	spin_unlock_irqrestore(&ring->lock, flags);
 	notify_remote_via_irq(ring->irq);
+	p9_req_put(p9_req);
 
 	return 0;
 }
-- 
2.20.1





* [PATCH 4.19 13/72] 9p/rdma: do not disconnect on down_interruptible EAGAIN
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (11 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 12/72] 9p: Add refcount to p9_req_t Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 14/72] 9p: Rename req to rreq in trans_fd Greg Kroah-Hartman
                   ` (65 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Dominique Martinet, Sasha Levin

[ Upstream commit 8b894adb2b7e1d1e64b8954569c761eaf3d51ab5 ]

9p/rdma would sometimes drop the connection and display errors in
recv_done when the user does ^C.
The errors were caused by recv buffers that were posted at the time
of disconnect, and we just do not want to disconnect when
down_interruptible is... interrupted.

Link: http://lkml.kernel.org/r/1535625307-18019-1-git-send-email-asmadeus@codewreck.org
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/9p/trans_rdma.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/9p/trans_rdma.c b/net/9p/trans_rdma.c
index 9cc9b3a19ee7..9719bc4d9424 100644
--- a/net/9p/trans_rdma.c
+++ b/net/9p/trans_rdma.c
@@ -477,7 +477,7 @@ static int rdma_request(struct p9_client *client, struct p9_req_t *req)
 
 	err = post_recv(client, rpl_context);
 	if (err) {
-		p9_debug(P9_DEBUG_FCALL, "POST RECV failed\n");
+		p9_debug(P9_DEBUG_ERROR, "POST RECV failed: %d\n", err);
 		goto recv_error;
 	}
 	/* remove posted receive buffer from request structure */
@@ -546,7 +546,7 @@ static int rdma_request(struct p9_client *client, struct p9_req_t *req)
  recv_error:
 	kfree(rpl_context);
 	spin_lock_irqsave(&rdma->req_lock, flags);
-	if (rdma->state < P9_RDMA_CLOSING) {
+	if (err != -EINTR && rdma->state < P9_RDMA_CLOSING) {
 		rdma->state = P9_RDMA_CLOSING;
 		spin_unlock_irqrestore(&rdma->req_lock, flags);
 		rdma_disconnect(rdma->cm_id);
-- 
2.20.1





* [PATCH 4.19 14/72] 9p: Rename req to rreq in trans_fd
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (12 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 13/72] 9p/rdma: do not disconnect on down_interruptible EAGAIN Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 15/72] 9p: acl: fix uninitialized iattr access Greg Kroah-Hartman
                   ` (64 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tomas Bortoli, Jun Piao,
	Dominique Martinet, Sasha Levin

[ Upstream commit 6d35190f395316916c8bb4aabd35a182890bf856 ]

In struct p9_conn, rename req to rreq as it is used by the read routine.

Link: http://lkml.kernel.org/r/20180903160321.2181-1-tomasbortoli@gmail.com
Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Suggested-by: Jun Piao <piaojun@huawei.com>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/9p/trans_fd.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index aca528722183..12559c474dde 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -131,7 +131,7 @@ struct p9_conn {
 	int err;
 	struct list_head req_list;
 	struct list_head unsent_req_list;
-	struct p9_req_t *req;
+	struct p9_req_t *rreq;
 	struct p9_req_t *wreq;
 	char tmp_buf[7];
 	struct p9_fcall rc;
@@ -323,7 +323,7 @@ static void p9_read_work(struct work_struct *work)
 	m->rc.offset += err;
 
 	/* header read in */
-	if ((!m->req) && (m->rc.offset == m->rc.capacity)) {
+	if ((!m->rreq) && (m->rc.offset == m->rc.capacity)) {
 		p9_debug(P9_DEBUG_TRANS, "got new header\n");
 
 		/* Header size */
@@ -347,23 +347,23 @@ static void p9_read_work(struct work_struct *work)
 			 "mux %p pkt: size: %d bytes tag: %d\n",
 			 m, m->rc.size, m->rc.tag);
 
-		m->req = p9_tag_lookup(m->client, m->rc.tag);
-		if (!m->req || (m->req->status != REQ_STATUS_SENT)) {
+		m->rreq = p9_tag_lookup(m->client, m->rc.tag);
+		if (!m->rreq || (m->rreq->status != REQ_STATUS_SENT)) {
 			p9_debug(P9_DEBUG_ERROR, "Unexpected packet tag %d\n",
 				 m->rc.tag);
 			err = -EIO;
 			goto error;
 		}
 
-		if (!m->req->rc.sdata) {
+		if (!m->rreq->rc.sdata) {
 			p9_debug(P9_DEBUG_ERROR,
 				 "No recv fcall for tag %d (req %p), disconnecting!\n",
-				 m->rc.tag, m->req);
-			m->req = NULL;
+				 m->rc.tag, m->rreq);
+			m->rreq = NULL;
 			err = -EIO;
 			goto error;
 		}
-		m->rc.sdata = m->req->rc.sdata;
+		m->rc.sdata = m->rreq->rc.sdata;
 		memcpy(m->rc.sdata, m->tmp_buf, m->rc.capacity);
 		m->rc.capacity = m->rc.size;
 	}
@@ -371,21 +371,21 @@ static void p9_read_work(struct work_struct *work)
 	/* packet is read in
 	 * not an else because some packets (like clunk) have no payload
 	 */
-	if ((m->req) && (m->rc.offset == m->rc.capacity)) {
+	if ((m->rreq) && (m->rc.offset == m->rc.capacity)) {
 		p9_debug(P9_DEBUG_TRANS, "got new packet\n");
-		m->req->rc.size = m->rc.offset;
+		m->rreq->rc.size = m->rc.offset;
 		spin_lock(&m->client->lock);
-		if (m->req->status != REQ_STATUS_ERROR)
+		if (m->rreq->status != REQ_STATUS_ERROR)
 			status = REQ_STATUS_RCVD;
-		list_del(&m->req->req_list);
+		list_del(&m->rreq->req_list);
 		/* update req->status while holding client->lock  */
-		p9_client_cb(m->client, m->req, status);
+		p9_client_cb(m->client, m->rreq, status);
 		spin_unlock(&m->client->lock);
 		m->rc.sdata = NULL;
 		m->rc.offset = 0;
 		m->rc.capacity = 0;
-		p9_req_put(m->req);
-		m->req = NULL;
+		p9_req_put(m->rreq);
+		m->rreq = NULL;
 	}
 
 end_clear:
-- 
2.20.1





* [PATCH 4.19 15/72] 9p: acl: fix uninitialized iattr access
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (13 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 14/72] 9p: Rename req to rreq in trans_fd Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 16/72] 9p/rdma: remove useless check in cm_event_handler Greg Kroah-Hartman
                   ` (63 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Dominique Martinet, Sasha Levin

[ Upstream commit e02a53d92e197706cad1627bd84705d4aa20a145 ]

iattr is passed to v9fs_vfs_setattr_dotl, which sends various values
from iattr over the wire. Even though it tells the server to look only
at the fields flagged in iattr.ia_valid, sending the uninitialized
fields could leak some stack data.

Link: http://lkml.kernel.org/r/1536339057-21974-2-git-send-email-asmadeus@codewreck.org
Addresses-Coverity-ID: 1195601 ("Uninitalized scalar variable")
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/9p/acl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/9p/acl.c b/fs/9p/acl.c
index 082d227fa56b..6261719f6f2a 100644
--- a/fs/9p/acl.c
+++ b/fs/9p/acl.c
@@ -276,7 +276,7 @@ static int v9fs_xattr_set_acl(const struct xattr_handler *handler,
 	switch (handler->flags) {
 	case ACL_TYPE_ACCESS:
 		if (acl) {
-			struct iattr iattr;
+			struct iattr iattr = { 0 };
 			struct posix_acl *old_acl = acl;
 
 			retval = posix_acl_update_mode(inode, &iattr.ia_mode, &acl);
-- 
2.20.1





* [PATCH 4.19 16/72] 9p/rdma: remove useless check in cm_event_handler
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (14 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 15/72] 9p: acl: fix uninitialized iattr access Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 17/72] 9p: p9dirent_read: check network-provided name length Greg Kroah-Hartman
                   ` (62 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Dominique Martinet, Sasha Levin

[ Upstream commit 473c7dd1d7b59ff8f88a5154737e3eac78a96e5b ]

The client c is always dereferenced to get the rdma struct, so c has to
be a valid pointer at this point.
GCC would optimize the check away, but let's make Coverity happy...

Link: http://lkml.kernel.org/r/1536339057-21974-3-git-send-email-asmadeus@codewreck.org
Addresses-Coverity-ID: 102778 ("Dereference before null check")
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/9p/trans_rdma.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/9p/trans_rdma.c b/net/9p/trans_rdma.c
index 9719bc4d9424..119103bfa82e 100644
--- a/net/9p/trans_rdma.c
+++ b/net/9p/trans_rdma.c
@@ -274,8 +274,7 @@ p9_cm_event_handler(struct rdma_cm_id *id, struct rdma_cm_event *event)
 	case RDMA_CM_EVENT_DISCONNECTED:
 		if (rdma)
 			rdma->state = P9_RDMA_CLOSED;
-		if (c)
-			c->status = Disconnected;
+		c->status = Disconnected;
 		break;
 
 	case RDMA_CM_EVENT_TIMEWAIT_EXIT:
-- 
2.20.1





* [PATCH 4.19 17/72] 9p: p9dirent_read: check network-provided name length
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (15 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 16/72] 9p/rdma: remove useless check in cm_event_handler Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 18/72] 9p: potential NULL dereference Greg Kroah-Hartman
                   ` (61 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Dominique Martinet, Sasha Levin

[ Upstream commit ef5305f1f72eb1cfcda25c382bb0368509c0385b ]

strcpy to dirent->d_name could overflow the buffer; use strscpy to check
the provided string length and error out if the string is too long.

While we are here, make the function return an error when the pdu
parsing fails, instead of returning the pdu offset as if it had been a
success...
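
The bounded-copy behaviour relied on here can be approximated in user
space as follows. This is only a sketch: demo_strscpy() is a made-up
helper that mimics the documented contract of the kernel's strscpy()
(always NUL-terminate, return the copied length, or -E2BIG when the
source does not fit), not the kernel implementation itself:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical approximation of strscpy(): copy at most size - 1 bytes,
 * always terminate dst, and report truncation with -E2BIG.
 */
static long demo_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	if (size == 0)
		return -E2BIG;

	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/* If we stopped before the end of src, the name was too long. */
	return src[i] == '\0' ? (long)i : -E2BIG;
}
```

A caller can then treat a negative return as "name from the wire too
long" and bail out, rather than silently overflowing or truncating.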

Link: http://lkml.kernel.org/r/1536339057-21974-4-git-send-email-asmadeus@codewreck.org
Addresses-Coverity-ID: 139133 ("Copy into fixed size buffer")
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/9p/protocol.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/net/9p/protocol.c b/net/9p/protocol.c
index b4d80c533f89..462ba144cb39 100644
--- a/net/9p/protocol.c
+++ b/net/9p/protocol.c
@@ -623,13 +623,19 @@ int p9dirent_read(struct p9_client *clnt, char *buf, int len,
 	if (ret) {
 		p9_debug(P9_DEBUG_9P, "<<< p9dirent_read failed: %d\n", ret);
 		trace_9p_protocol_dump(clnt, &fake_pdu);
-		goto out;
+		return ret;
 	}
 
-	strcpy(dirent->d_name, nameptr);
+	ret = strscpy(dirent->d_name, nameptr, sizeof(dirent->d_name));
+	if (ret < 0) {
+		p9_debug(P9_DEBUG_ERROR,
+			 "On the wire dirent name too long: %s\n",
+			 nameptr);
+		kfree(nameptr);
+		return ret;
+	}
 	kfree(nameptr);
 
-out:
 	return fake_pdu.offset;
 }
 EXPORT_SYMBOL(p9dirent_read);
-- 
2.20.1





* [PATCH 4.19 18/72] 9p: potential NULL dereference
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (16 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 17/72] 9p: p9dirent_read: check network-provided name length Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 19/72] 9p/trans_fd: abort p9_read_work if req status changed Greg Kroah-Hartman
                   ` (60 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dan Carpenter, Dominique Martinet,
	Sasha Levin

[ Upstream commit 72ea0321088df2c41eca8cc6160c24bcceb56ac7 ]

p9_tag_alloc() is supposed to return error pointers, but we accidentally
return a NULL here.  It would cause a NULL dereference in the caller.
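
For readers less familiar with the error-pointer convention, the
following user-space sketch shows why a stray NULL slips past the
caller's check. The demo_* helpers are hypothetical imitations of the
kernel's ERR_PTR(), PTR_ERR() and IS_ERR(), written under the
assumption that the top 4095 address values are never valid pointers:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer, like ERR_PTR(). */
static void *demo_err_ptr(long err)
{
	assert(err < 0 && err >= -DEMO_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* Recover the errno value, like PTR_ERR(). */
static long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Like IS_ERR(): true only for the encoded error range, NOT for NULL. */
static int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Hypothetical allocator that follows the error-pointer convention. */
static int *demo_alloc(int fail)
{
	int *p;

	if (fail)
		return demo_err_ptr(-ENOMEM);
	p = malloc(sizeof(*p));
	if (!p)
		return demo_err_ptr(-ENOMEM);
	*p = 0;
	return p;
}
```

Because demo_is_err(NULL) is false, a caller that only tests for an
error pointer would go on to dereference a NULL return value.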

Link: http://lkml.kernel.org/m/20180926103934.GA14535@mwanda
Fixes: 996d5b4db4b1 ("9p: Use a slab for allocating requests")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/9p/client.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/9p/client.c b/net/9p/client.c
index 4becde979462..b615aae5a0f8 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -287,7 +287,7 @@ p9_tag_alloc(struct p9_client *c, int8_t type, unsigned int max_size)
 	int tag;
 
 	if (!req)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	if (p9_fcall_init(c, &req->tc, alloc_msize))
 		goto free_req;
-- 
2.20.1





* [PATCH 4.19 19/72] 9p/trans_fd: abort p9_read_work if req status changed
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (17 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 18/72] 9p: potential NULL dereference Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 20/72] 9p/trans_fd: put worker reqs on destroy Greg Kroah-Hartman
                   ` (59 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dominique Martinet,
	syzbot+2222c34dc40b515f30dc, Eric Van Hensbergen,
	Latchesar Ionkov, Sasha Levin

[ Upstream commit e4ca13f7d075e551dc158df6af18fb412a1dba0a ]

p9_read_work would try to handle an errored req even if it had been put
into the error state by another thread between the (successful) lookup
and the time the reply had been fully read.
The request itself is safe to use because we hold a ref to it from the
lookup (for m->rreq, so it was safe to read into the request data buffer
until this point), but the request was removed from req_list at the same
time its status changed, and client_cb has already been called as well,
so we should not do either again.

Link: http://lkml.kernel.org/r/1539057956-23741-1-git-send-email-asmadeus@codewreck.org
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Reported-by: syzbot+2222c34dc40b515f30dc@syzkaller.appspotmail.com
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/9p/trans_fd.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index 12559c474dde..a0317d459cde 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -292,7 +292,6 @@ static void p9_read_work(struct work_struct *work)
 	__poll_t n;
 	int err;
 	struct p9_conn *m;
-	int status = REQ_STATUS_ERROR;
 
 	m = container_of(work, struct p9_conn, rq);
 
@@ -375,11 +374,17 @@ static void p9_read_work(struct work_struct *work)
 		p9_debug(P9_DEBUG_TRANS, "got new packet\n");
 		m->rreq->rc.size = m->rc.offset;
 		spin_lock(&m->client->lock);
-		if (m->rreq->status != REQ_STATUS_ERROR)
-			status = REQ_STATUS_RCVD;
-		list_del(&m->rreq->req_list);
-		/* update req->status while holding client->lock  */
-		p9_client_cb(m->client, m->rreq, status);
+		if (m->rreq->status == REQ_STATUS_SENT) {
+			list_del(&m->rreq->req_list);
+			p9_client_cb(m->client, m->rreq, REQ_STATUS_RCVD);
+		} else {
+			spin_unlock(&m->client->lock);
+			p9_debug(P9_DEBUG_ERROR,
+				 "Request tag %d errored out while we were reading the reply\n",
+				 m->rc.tag);
+			err = -EIO;
+			goto error;
+		}
 		spin_unlock(&m->client->lock);
 		m->rc.sdata = NULL;
 		m->rc.offset = 0;
-- 
2.20.1





* [PATCH 4.19 20/72] 9p/trans_fd: put worker reqs on destroy
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (18 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 19/72] 9p/trans_fd: abort p9_read_work if req status changed Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 21/72] net/9p: include trans_common.h to fix missing prototype warning Greg Kroah-Hartman
                   ` (58 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Dominique Martinet,
	Eric Van Hensbergen, Latchesar Ionkov, Tomas Bortoli,
	Sasha Levin

[ Upstream commit fb488fc1f2b4c5128540b032892ddec91edaf8d9 ]

p9_read_work/p9_write_work might still hold references to a req after
having been cancelled; make sure we put any of these to avoid a
potential request leak on disconnect.

Fixes: 728356dedeff8 ("9p: Add refcount to p9_req_t")
Link: http://lkml.kernel.org/r/1539057956-23741-2-git-send-email-asmadeus@codewreck.org
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Reviewed-by: Tomas Bortoli <tomasbortoli@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/9p/trans_fd.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index a0317d459cde..f868cf6fba79 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -876,7 +876,15 @@ static void p9_conn_destroy(struct p9_conn *m)
 
 	p9_mux_poll_stop(m);
 	cancel_work_sync(&m->rq);
+	if (m->rreq) {
+		p9_req_put(m->rreq);
+		m->rreq = NULL;
+	}
 	cancel_work_sync(&m->wq);
+	if (m->wreq) {
+		p9_req_put(m->wreq);
+		m->wreq = NULL;
+	}
 
 	p9_conn_cancel(m, -ECONNRESET);
 
-- 
2.20.1





* [PATCH 4.19 21/72] net/9p: include trans_common.h to fix missing prototype warning.
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (19 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 20/72] 9p/trans_fd: put worker reqs on destroy Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 22/72] qmi_wwan: Fix out-of-bounds read Greg Kroah-Hartman
                   ` (57 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Adeodato Simó,
	Dominique Martinet, Sasha Levin

[ Upstream commit 52ad259eaac0454c1ac7123e7148cf8d6e6f5301 ]

This silences -Wmissing-prototypes when defining p9_release_pages.

Link: http://lkml.kernel.org/r/b1c4df8f21689b10d451c28fe38e860722d20e71.1542089696.git.dato@net.com.org.es
Signed-off-by: Adeodato Simó <dato@net.com.org.es>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/9p/trans_common.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/9p/trans_common.c b/net/9p/trans_common.c
index b718db2085b2..3dff68f05fb9 100644
--- a/net/9p/trans_common.c
+++ b/net/9p/trans_common.c
@@ -14,6 +14,7 @@
 
 #include <linux/mm.h>
 #include <linux/module.h>
+#include "trans_common.h"
 
 /**
  *  p9_release_pages - Release pages after the transaction.
-- 
2.20.1





* [PATCH 4.19 22/72] qmi_wwan: Fix out-of-bounds read
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (20 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 21/72] net/9p: include trans_common.h to fix missing prototype warning Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 23/72] Revert "usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup" Greg Kroah-Hartman
                   ` (56 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+b68605d7fadd21510de1,
	Kristian Evensen, Bjørn Mork, David S. Miller, Sasha Levin

[ Upstream commit 904d88d743b0c94092c5117955eab695df8109e8 ]

syzbot reported:

 Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0xca/0x13e lib/dump_stack.c:113
  print_address_description+0x67/0x231 mm/kasan/report.c:188
  __kasan_report.cold+0x1a/0x32 mm/kasan/report.c:317
  kasan_report+0xe/0x20 mm/kasan/common.c:614
  qmi_wwan_probe+0x342/0x360 drivers/net/usb/qmi_wwan.c:1417
  usb_probe_interface+0x305/0x7a0 drivers/usb/core/driver.c:361
  really_probe+0x281/0x660 drivers/base/dd.c:509
  driver_probe_device+0x104/0x210 drivers/base/dd.c:670
  __device_attach_driver+0x1c2/0x220 drivers/base/dd.c:777
  bus_for_each_drv+0x15c/0x1e0 drivers/base/bus.c:454

This was caused by too many confusing indirections and casts.
id->driver_info is a pointer stored in a long.  We want the
pointer itself here, not the address of the field that holds it.

Thanks-to: Hillf Danton <hdanton@sina.com>
Reported-by: syzbot+b68605d7fadd21510de1@syzkaller.appspotmail.com
Cc: Kristian Evensen <kristian.evensen@gmail.com>
Fixes: e4bf63482c30 ("qmi_wwan: Add quirk for Quectel dynamic config")
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/usb/qmi_wwan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index d9a6699abe59..e657d8947125 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -1412,7 +1412,7 @@ static int qmi_wwan_probe(struct usb_interface *intf,
 	 * different. Ignore the current interface if the number of endpoints
 	 * equals the number for the diag interface (two).
 	 */
-	info = (void *)&id->driver_info;
+	info = (void *)id->driver_info;
 
 	if (info->data & QMI_WWAN_QUIRK_QUECTEL_DYNCFG) {
 		if (desc->bNumEndpoints == 2)
-- 
2.20.1
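
The difference between the two casts in this patch can be reproduced in user space. The sketch below is a hypothetical model rather than driver code: demo_device_id, demo_quirk_info and both helpers are made-up names, and uintptr_t stands in for the kernel's pointer-sized long.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-device quirk data. */
struct demo_quirk_info {
	unsigned long data;
};

/* Hypothetical stand-in for a device ID: an integer field carries a pointer. */
struct demo_device_id {
	uintptr_t driver_info;
};

/* Fixed form: convert the stored integer value back into the pointer. */
static const struct demo_quirk_info *
demo_info_fixed(const struct demo_device_id *id)
{
	return (const struct demo_quirk_info *)id->driver_info;
}

/*
 * Buggy form: this is the address of the field inside the ID entry, not
 * the pointer stored in it. Dereferencing it as quirk data reads the
 * wrong memory, which is what KASAN flagged in the report above.
 */
static const void *demo_info_buggy(const struct demo_device_id *id)
{
	return (const void *)&id->driver_info;
}
```

With an ID whose driver_info holds the address of a demo_quirk_info object, demo_info_fixed() returns that object, while demo_info_buggy() returns a location inside the ID itself.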





* [PATCH 4.19 23/72] Revert "usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup"
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (21 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 22/72] qmi_wwan: Fix out-of-bounds read Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 24/72] usb: dwc3: gadget: combine unaligned and zero flags Greg Kroah-Hartman
                   ` (55 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Fei Yang, Sam Protsenko,
	Felipe Balbi, linux-usb, John Stultz, Sasha Levin

This reverts commit 25ad17d692ad54c3c33b2a31e5ce2a82e38de14e,
as we will be cherry-picking a number of changes from upstream
that allow us to later cherry-pick the same fix from upstream
rather than using this modified backported version.

Cc: Fei Yang <fei.yang@intel.com>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: linux-usb@vger.kernel.org
Cc: stable@vger.kernel.org # 4.19.y
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/usb/dwc3/gadget.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 65ba1038b111..eaa78e6c972c 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -177,8 +177,6 @@ static void dwc3_gadget_del_and_unmap_request(struct dwc3_ep *dep,
 	req->started = false;
 	list_del(&req->list);
 	req->remaining = 0;
-	req->unaligned = false;
-	req->zero = false;
 
 	if (req->request.status == -EINPROGRESS)
 		req->request.status = status;
-- 
2.20.1





* [PATCH 4.19 24/72] usb: dwc3: gadget: combine unaligned and zero flags
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (22 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 23/72] Revert "usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup" Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 25/72] usb: dwc3: gadget: track number of TRBs per request Greg Kroah-Hartman
                   ` (54 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Fei Yang, Sam Protsenko,
	Felipe Balbi, linux-usb, Felipe Balbi, John Stultz, Sasha Levin

commit 1a22ec643580626f439c8583edafdcc73798f2fb upstream

Both flags are used for the same purpose in dwc3: appending an extra
TRB at the end to deal with controller requirements. By combining both
flags into one, we make it clear that the situation is the same and
that they should be treated equally.

Cc: Fei Yang <fei.yang@intel.com>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: linux-usb@vger.kernel.org
Cc: stable@vger.kernel.org # 4.19.y
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
(cherry picked from commit 1a22ec643580626f439c8583edafdcc73798f2fb)
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/usb/dwc3/core.h   |  7 +++----
 drivers/usb/dwc3/gadget.c | 18 +++++++++---------
 2 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 5bfb62533e0f..4872cba8699b 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -847,11 +847,11 @@ struct dwc3_hwparams {
  * @epnum: endpoint number to which this request refers
  * @trb: pointer to struct dwc3_trb
  * @trb_dma: DMA address of @trb
- * @unaligned: true for OUT endpoints with length not divisible by maxp
+ * @needs_extra_trb: true when request needs one extra TRB (either due to ZLP
+ *	or unaligned OUT)
  * @direction: IN or OUT direction flag
  * @mapped: true when request has been dma-mapped
  * @started: request is started
- * @zero: wants a ZLP
  */
 struct dwc3_request {
 	struct usb_request	request;
@@ -867,11 +867,10 @@ struct dwc3_request {
 	struct dwc3_trb		*trb;
 	dma_addr_t		trb_dma;
 
-	unsigned		unaligned:1;
+	unsigned		needs_extra_trb:1;
 	unsigned		direction:1;
 	unsigned		mapped:1;
 	unsigned		started:1;
-	unsigned		zero:1;
 };
 
 /*
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index eaa78e6c972c..8db7466e4f76 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1068,7 +1068,7 @@ static void dwc3_prepare_one_trb_sg(struct dwc3_ep *dep,
 			struct dwc3	*dwc = dep->dwc;
 			struct dwc3_trb	*trb;
 
-			req->unaligned = true;
+			req->needs_extra_trb = true;
 
 			/* prepare normal TRB */
 			dwc3_prepare_one_trb(dep, req, true, i);
@@ -1112,7 +1112,7 @@ static void dwc3_prepare_one_trb_linear(struct dwc3_ep *dep,
 		struct dwc3	*dwc = dep->dwc;
 		struct dwc3_trb	*trb;
 
-		req->unaligned = true;
+		req->needs_extra_trb = true;
 
 		/* prepare normal TRB */
 		dwc3_prepare_one_trb(dep, req, true, 0);
@@ -1128,7 +1128,7 @@ static void dwc3_prepare_one_trb_linear(struct dwc3_ep *dep,
 		struct dwc3	*dwc = dep->dwc;
 		struct dwc3_trb	*trb;
 
-		req->zero = true;
+		req->needs_extra_trb = true;
 
 		/* prepare normal TRB */
 		dwc3_prepare_one_trb(dep, req, true, 0);
@@ -1410,7 +1410,7 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
 					dwc3_ep_inc_deq(dep);
 				}
 
-				if (r->unaligned || r->zero) {
+				if (r->needs_extra_trb) {
 					trb = r->trb + r->num_pending_sgs + 1;
 					trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
 					dwc3_ep_inc_deq(dep);
@@ -1421,7 +1421,7 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
 				trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
 				dwc3_ep_inc_deq(dep);
 
-				if (r->unaligned || r->zero) {
+				if (r->needs_extra_trb) {
 					trb = r->trb + 1;
 					trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
 					dwc3_ep_inc_deq(dep);
@@ -2250,7 +2250,8 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
 	 * with one TRB pending in the ring. We need to manually clear HWO bit
 	 * from that TRB.
 	 */
-	if ((req->zero || req->unaligned) && !(trb->ctrl & DWC3_TRB_CTRL_CHN)) {
+
+	if (req->needs_extra_trb && !(trb->ctrl & DWC3_TRB_CTRL_CHN)) {
 		trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
 		return 1;
 	}
@@ -2327,11 +2328,10 @@ static int dwc3_gadget_ep_cleanup_completed_request(struct dwc3_ep *dep,
 		ret = dwc3_gadget_ep_reclaim_trb_linear(dep, req, event,
 				status);
 
-	if (req->unaligned || req->zero) {
+	if (req->needs_extra_trb) {
 		ret = dwc3_gadget_ep_reclaim_trb_linear(dep, req, event,
 				status);
-		req->unaligned = false;
-		req->zero = false;
+		req->needs_extra_trb = false;
 	}
 
 	req->request.actual = req->request.length - req->remaining;
-- 
2.20.1
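
As a rough illustration of why one flag is enough, here is a simplified user-space model. All demo_* names are hypothetical, and the exact conditions are assumptions based only on the kernel-doc text in the diff ("OUT endpoints with length not divisible by maxp" and "wants a ZLP"); the real driver checks more than this.

```c
#include <stdbool.h>

/* Hypothetical, stripped-down request; not the real struct dwc3_request. */
struct demo_request {
	unsigned int length;      /* transfer length in bytes */
	bool is_out;              /* OUT direction */
	bool want_zlp;            /* caller asked for a zero-length packet */
	unsigned needs_extra_trb:1;
};

/* Both former cases end up setting the same single flag. */
static void demo_prepare(struct demo_request *req, unsigned int maxp)
{
	unsigned int rem = req->length % maxp;

	if (req->is_out && rem)
		req->needs_extra_trb = true;   /* assumed: pad OUT up to maxp */
	else if (req->want_zlp && req->length && !rem)
		req->needs_extra_trb = true;   /* assumed: trailing ZLP */
}

/* Cleanup only has to ask one question: was an extra TRB queued? */
static unsigned int demo_trbs_to_reclaim(const struct demo_request *req)
{
	return 1u + (req->needs_extra_trb ? 1u : 0u);
}
```

The cleanup side no longer cares which of the two reasons applied, which matches the commit message's point that both situations should be treated equally.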





* [PATCH 4.19 25/72] usb: dwc3: gadget: track number of TRBs per request
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (23 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 24/72] usb: dwc3: gadget: combine unaligned and zero flags Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 26/72] usb: dwc3: gadget: use num_trbs when skipping TRBs on ->dequeue() Greg Kroah-Hartman
                   ` (53 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Fei Yang, Sam Protsenko,
	Felipe Balbi, linux-usb, Felipe Balbi, John Stultz, Sasha Levin

commit 09fe1f8d7e2f461275b1cdd832f2cfa5e9be346d upstream

This will help us remove the wait_event() from our ->dequeue().

Cc: Fei Yang <fei.yang@intel.com>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: linux-usb@vger.kernel.org
Cc: stable@vger.kernel.org # 4.19.y
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
(cherry picked from commit 09fe1f8d7e2f461275b1cdd832f2cfa5e9be346d)
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/usb/dwc3/core.h   | 3 +++
 drivers/usb/dwc3/gadget.c | 6 ++++++
 2 files changed, 9 insertions(+)

diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 4872cba8699b..0de78cb29f2c 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -847,6 +847,7 @@ struct dwc3_hwparams {
  * @epnum: endpoint number to which this request refers
  * @trb: pointer to struct dwc3_trb
  * @trb_dma: DMA address of @trb
+ * @num_trbs: number of TRBs used by this request
  * @needs_extra_trb: true when request needs one extra TRB (either due to ZLP
  *	or unaligned OUT)
  * @direction: IN or OUT direction flag
@@ -867,6 +868,8 @@ struct dwc3_request {
 	struct dwc3_trb		*trb;
 	dma_addr_t		trb_dma;
 
+	unsigned		num_trbs;
+
 	unsigned		needs_extra_trb:1;
 	unsigned		direction:1;
 	unsigned		mapped:1;
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 8db7466e4f76..fd91c494307c 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1041,6 +1041,8 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
 		req->trb_dma = dwc3_trb_dma_offset(dep, trb);
 	}
 
+	req->num_trbs++;
+
 	__dwc3_prepare_one_trb(dep, trb, dma, length, chain, node,
 			stream_id, short_not_ok, no_interrupt);
 }
@@ -1075,6 +1077,7 @@ static void dwc3_prepare_one_trb_sg(struct dwc3_ep *dep,
 
 			/* Now prepare one extra TRB to align transfer size */
 			trb = &dep->trb_pool[dep->trb_enqueue];
+			req->num_trbs++;
 			__dwc3_prepare_one_trb(dep, trb, dwc->bounce_addr,
 					maxp - rem, false, 1,
 					req->request.stream_id,
@@ -1119,6 +1122,7 @@ static void dwc3_prepare_one_trb_linear(struct dwc3_ep *dep,
 
 		/* Now prepare one extra TRB to align transfer size */
 		trb = &dep->trb_pool[dep->trb_enqueue];
+		req->num_trbs++;
 		__dwc3_prepare_one_trb(dep, trb, dwc->bounce_addr, maxp - rem,
 				false, 1, req->request.stream_id,
 				req->request.short_not_ok,
@@ -1135,6 +1139,7 @@ static void dwc3_prepare_one_trb_linear(struct dwc3_ep *dep,
 
 		/* Now prepare one extra TRB to handle ZLP */
 		trb = &dep->trb_pool[dep->trb_enqueue];
+		req->num_trbs++;
 		__dwc3_prepare_one_trb(dep, trb, dwc->bounce_addr, 0,
 				false, 1, req->request.stream_id,
 				req->request.short_not_ok,
@@ -2231,6 +2236,7 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
 	dwc3_ep_inc_deq(dep);
 
 	trace_dwc3_complete_trb(dep, trb);
+	req->num_trbs--;
 
 	/*
 	 * If we're in the middle of series of chained TRBs and we
-- 
2.20.1





* [PATCH 4.19 26/72] usb: dwc3: gadget: use num_trbs when skipping TRBs on ->dequeue()
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (24 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 25/72] usb: dwc3: gadget: track number of TRBs per request Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-03  2:03   ` Sasha Levin
  2019-07-02  8:01 ` [PATCH 4.19 27/72] usb: dwc3: gadget: extract dwc3_gadget_ep_skip_trbs() Greg Kroah-Hartman
                   ` (52 subsequent siblings)
  78 siblings, 1 reply; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Fei Yang, Sam Protsenko,
	Felipe Balbi, linux-usb, Felipe Balbi, John Stultz, Sasha Levin

commit c3acd59014148470dc58519870fbc779785b4bf7 upstream

Now that we track how many TRBs a request uses, it's easier to skip
over them in case of a call to usb_ep_dequeue(). Let's do so and
simplify the code a bit.

Cc: Fei Yang <fei.yang@intel.com>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: linux-usb@vger.kernel.org
Cc: stable@vger.kernel.org # 4.19.y
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
(cherry picked from commit c3acd59014148470dc58519870fbc779785b4bf7)
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/usb/dwc3/gadget.c | 28 ++++------------------------
 1 file changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index fd91c494307c..4e08904890ed 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1368,6 +1368,8 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
 				break;
 		}
 		if (r == req) {
+			int i;
+
 			/* wait until it is processed */
 			dwc3_stop_active_transfer(dep, true);
 
@@ -1405,32 +1407,12 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
 			if (!r->trb)
 				goto out0;
 
-			if (r->num_pending_sgs) {
+			for (i = 0; i < r->num_trbs; i++) {
 				struct dwc3_trb *trb;
-				int i = 0;
-
-				for (i = 0; i < r->num_pending_sgs; i++) {
-					trb = r->trb + i;
-					trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
-					dwc3_ep_inc_deq(dep);
-				}
-
-				if (r->needs_extra_trb) {
-					trb = r->trb + r->num_pending_sgs + 1;
-					trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
-					dwc3_ep_inc_deq(dep);
-				}
-			} else {
-				struct dwc3_trb *trb = r->trb;
 
+				trb = r->trb + i;
 				trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
 				dwc3_ep_inc_deq(dep);
-
-				if (r->needs_extra_trb) {
-					trb = r->trb + 1;
-					trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
-					dwc3_ep_inc_deq(dep);
-				}
 			}
 			goto out1;
 		}
@@ -1441,8 +1423,6 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
 	}
 
 out1:
-	/* giveback the request */
-
 	dwc3_gadget_giveback(dep, req, -ECONNRESET);
 
 out0:
-- 
2.20.1
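
The counting and skipping logic from this patch and the previous one can be modelled in user space. This is a minimal sketch under stated assumptions: every demo_* name is hypothetical, the ring is a plain array without a link TRB, and, like the loop in the patch, it does not handle a request whose TRBs wrap past the end of the pool.

```c
#include <stdint.h>

#define DEMO_TRB_CTRL_HWO	(1u << 0)	/* stand-in for DWC3_TRB_CTRL_HWO */
#define DEMO_TRB_NUM		8

struct demo_trb {
	uint32_t ctrl;
};

struct demo_ep {
	struct demo_trb pool[DEMO_TRB_NUM];
	unsigned int enqueue;
	unsigned int dequeue;
};

struct demo_request {
	struct demo_trb *trb;		/* first TRB used by this request */
	unsigned int num_trbs;		/* how many TRBs the request owns */
};

/* Queue one TRB for the request and count it. */
static void demo_ep_prepare_trb(struct demo_ep *ep, struct demo_request *req)
{
	struct demo_trb *trb = &ep->pool[ep->enqueue];

	if (!req->trb)
		req->trb = trb;

	trb->ctrl |= DEMO_TRB_CTRL_HWO;	/* hand the TRB to "hardware" */
	ep->enqueue = (ep->enqueue + 1) % DEMO_TRB_NUM;
	req->num_trbs++;
}

/* Skip every TRB the request owns: clear HWO and advance dequeue. */
static void demo_ep_skip_trbs(struct demo_ep *ep, struct demo_request *req)
{
	unsigned int i;

	for (i = 0; i < req->num_trbs; i++) {
		struct demo_trb *trb = req->trb + i;

		trb->ctrl &= ~DEMO_TRB_CTRL_HWO;
		ep->dequeue = (ep->dequeue + 1) % DEMO_TRB_NUM;
	}
}
```

Because the request carries its own TRB count, the skip loop needs no separate branches for linear versus scatter-gather requests or for the extra alignment TRB.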





* [PATCH 4.19 27/72] usb: dwc3: gadget: extract dwc3_gadget_ep_skip_trbs()
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (25 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 26/72] usb: dwc3: gadget: use num_trbs when skipping TRBs on ->dequeue() Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 28/72] usb: dwc3: gadget: introduce cancelled_list Greg Kroah-Hartman
                   ` (51 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Fei Yang, Sam Protsenko,
	Felipe Balbi, linux-usb, Felipe Balbi, John Stultz, Sasha Levin

commit 7746a8dfb3f9c91b3a0b63a1d5c2664410e6498d upstream

Extract the logic for skipping over TRBs to its own function. This
makes the code slightly more readable and makes it easier to move this
call to its final resting place in a following patch.

Cc: Fei Yang <fei.yang@intel.com>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: linux-usb@vger.kernel.org
Cc: stable@vger.kernel.org # 4.19.y
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
(cherry picked from commit 7746a8dfb3f9c91b3a0b63a1d5c2664410e6498d)
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/usb/dwc3/gadget.c | 61 +++++++++++++++------------------------
 1 file changed, 24 insertions(+), 37 deletions(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 4e08904890ed..46aa20b376cd 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1341,6 +1341,29 @@ static int dwc3_gadget_ep_queue(struct usb_ep *ep, struct usb_request *request,
 	return ret;
 }
 
+static void dwc3_gadget_ep_skip_trbs(struct dwc3_ep *dep, struct dwc3_request *req)
+{
+	int i;
+
+	/*
+	 * If request was already started, this means we had to
+	 * stop the transfer. With that we also need to ignore
+	 * all TRBs used by the request, however TRBs can only
+	 * be modified after completion of END_TRANSFER
+	 * command. So what we do here is that we wait for
+	 * END_TRANSFER completion and only after that, we jump
+	 * over TRBs by clearing HWO and incrementing dequeue
+	 * pointer.
+	 */
+	for (i = 0; i < req->num_trbs; i++) {
+		struct dwc3_trb *trb;
+
+		trb = req->trb + i;
+		trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
+		dwc3_ep_inc_deq(dep);
+	}
+}
+
 static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
 		struct usb_request *request)
 {
@@ -1368,38 +1391,8 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
 				break;
 		}
 		if (r == req) {
-			int i;
-
 			/* wait until it is processed */
 			dwc3_stop_active_transfer(dep, true);
-
-			/*
-			 * If request was already started, this means we had to
-			 * stop the transfer. With that we also need to ignore
-			 * all TRBs used by the request, however TRBs can only
-			 * be modified after completion of END_TRANSFER
-			 * command. So what we do here is that we wait for
-			 * END_TRANSFER completion and only after that, we jump
-			 * over TRBs by clearing HWO and incrementing dequeue
-			 * pointer.
-			 *
-			 * Note that we have 2 possible types of transfers here:
-			 *
-			 * i) Linear buffer request
-			 * ii) SG-list based request
-			 *
-			 * SG-list based requests will have r->num_pending_sgs
-			 * set to a valid number (> 0). Linear requests,
-			 * normally use a single TRB.
-			 *
-			 * For each of these two cases, if r->unaligned flag is
-			 * set, one extra TRB has been used to align transfer
-			 * size to wMaxPacketSize.
-			 *
-			 * All of these cases need to be taken into
-			 * consideration so we don't mess up our TRB ring
-			 * pointers.
-			 */
 			wait_event_lock_irq(dep->wait_end_transfer,
 					!(dep->flags & DWC3_EP_END_TRANSFER_PENDING),
 					dwc->lock);
@@ -1407,13 +1400,7 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
 			if (!r->trb)
 				goto out0;
 
-			for (i = 0; i < r->num_trbs; i++) {
-				struct dwc3_trb *trb;
-
-				trb = r->trb + i;
-				trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
-				dwc3_ep_inc_deq(dep);
-			}
+			dwc3_gadget_ep_skip_trbs(dep, r);
 			goto out1;
 		}
 		dev_err(dwc->dev, "request %pK was not queued to %s\n",
-- 
2.20.1





* [PATCH 4.19 28/72] usb: dwc3: gadget: introduce cancelled_list
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (26 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 27/72] usb: dwc3: gadget: extract dwc3_gadget_ep_skip_trbs() Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 29/72] usb: dwc3: gadget: move requests to cancelled_list Greg Kroah-Hartman
                   ` (50 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Fei Yang, Sam Protsenko,
	Felipe Balbi, linux-usb, Felipe Balbi, John Stultz, Sasha Levin

commit d5443bbf5fc8f8389cce146b1fc2987cdd229d12 upstream

This list will host cancelled requests that still have TRBs being
processed.

Cc: Fei Yang <fei.yang@intel.com>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: linux-usb@vger.kernel.org
Cc: stable@vger.kernel.org # 4.19.y
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
(cherry picked from commit d5443bbf5fc8f8389cce146b1fc2987cdd229d12)
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/usb/dwc3/core.h   |  2 ++
 drivers/usb/dwc3/gadget.c |  1 +
 drivers/usb/dwc3/gadget.h | 15 +++++++++++++++
 3 files changed, 18 insertions(+)

diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 0de78cb29f2c..24f0b108b7f6 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -636,6 +636,7 @@ struct dwc3_event_buffer {
 /**
  * struct dwc3_ep - device side endpoint representation
  * @endpoint: usb endpoint
+ * @cancelled_list: list of cancelled requests for this endpoint
  * @pending_list: list of pending requests for this endpoint
  * @started_list: list of started requests on this endpoint
  * @wait_end_transfer: wait_queue_head_t for waiting on End Transfer complete
@@ -659,6 +660,7 @@ struct dwc3_event_buffer {
  */
 struct dwc3_ep {
 	struct usb_ep		endpoint;
+	struct list_head	cancelled_list;
 	struct list_head	pending_list;
 	struct list_head	started_list;
 
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 46aa20b376cd..c2169bc626c8 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2144,6 +2144,7 @@ static int dwc3_gadget_init_endpoint(struct dwc3 *dwc, u8 epnum)
 
 	INIT_LIST_HEAD(&dep->pending_list);
 	INIT_LIST_HEAD(&dep->started_list);
+	INIT_LIST_HEAD(&dep->cancelled_list);
 
 	return 0;
 }
diff --git a/drivers/usb/dwc3/gadget.h b/drivers/usb/dwc3/gadget.h
index 2aacd1afd9ff..023a473648eb 100644
--- a/drivers/usb/dwc3/gadget.h
+++ b/drivers/usb/dwc3/gadget.h
@@ -79,6 +79,21 @@ static inline void dwc3_gadget_move_started_request(struct dwc3_request *req)
 	list_move_tail(&req->list, &dep->started_list);
 }
 
+/**
+ * dwc3_gadget_move_cancelled_request - move @req to the cancelled_list
+ * @req: the request to be moved
+ *
+ * Caller should take care of locking. This function will move @req from its
+ * current list to the endpoint's cancelled_list.
+ */
+static inline void dwc3_gadget_move_cancelled_request(struct dwc3_request *req)
+{
+	struct dwc3_ep		*dep = req->dep;
+
+	req->started = false;
+	list_move_tail(&req->list, &dep->cancelled_list);
+}
+
 void dwc3_gadget_giveback(struct dwc3_ep *dep, struct dwc3_request *req,
 		int status);
 
-- 
2.20.1





* [PATCH 4.19 29/72] usb: dwc3: gadget: move requests to cancelled_list
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (27 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 28/72] usb: dwc3: gadget: introduce cancelled_list Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 30/72] usb: dwc3: gadget: remove wait_end_transfer Greg Kroah-Hartman
                   ` (49 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Fei Yang, Sam Protsenko,
	Felipe Balbi, linux-usb, Felipe Balbi, John Stultz, Sasha Levin

commit d4f1afe5e896c18ae01099a85dab5e1a198bd2a8 upstream

Whenever we have a request in flight, we can move it to the cancelled
list and later simply iterate over that list and skip over any TRBs we
find.

Cc: Fei Yang <fei.yang@intel.com>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: linux-usb@vger.kernel.org
Cc: stable@vger.kernel.org # 4.19.y
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
(cherry picked from commit d4f1afe5e896c18ae01099a85dab5e1a198bd2a8)
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/usb/dwc3/gadget.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index c2169bc626c8..8291fa1624e1 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1364,6 +1364,17 @@ static void dwc3_gadget_ep_skip_trbs(struct dwc3_ep *dep, struct dwc3_request *r
 	}
 }
 
+static void dwc3_gadget_ep_cleanup_cancelled_requests(struct dwc3_ep *dep)
+{
+	struct dwc3_request		*req;
+	struct dwc3_request		*tmp;
+
+	list_for_each_entry_safe(req, tmp, &dep->cancelled_list, list) {
+		dwc3_gadget_ep_skip_trbs(dep, req);
+		dwc3_gadget_giveback(dep, req, -ECONNRESET);
+	}
+}
+
 static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
 		struct usb_request *request)
 {
@@ -1400,8 +1411,9 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
 			if (!r->trb)
 				goto out0;
 
-			dwc3_gadget_ep_skip_trbs(dep, r);
-			goto out1;
+			dwc3_gadget_move_cancelled_request(req);
+			dwc3_gadget_ep_cleanup_cancelled_requests(dep);
+			goto out0;
 		}
 		dev_err(dwc->dev, "request %pK was not queued to %s\n",
 				request, ep->name);
@@ -1409,7 +1421,6 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
 		goto out0;
 	}
 
-out1:
 	dwc3_gadget_giveback(dep, req, -ECONNRESET);
 
 out0:
-- 
2.20.1





* [PATCH 4.19 30/72] usb: dwc3: gadget: remove wait_end_transfer
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (28 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 29/72] usb: dwc3: gadget: move requests to cancelled_list Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 31/72] usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup Greg Kroah-Hartman
                   ` (48 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Fei Yang, Sam Protsenko,
	Felipe Balbi, linux-usb, Felipe Balbi, John Stultz, Sasha Levin

commit fec9095bdef4e7c988adb603d0d4f92ee735d4a1 upstream

Now that we have a list of cancelled requests, we can skip over TRBs
when END_TRANSFER command completes.

Cc: Fei Yang <fei.yang@intel.com>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: linux-usb@vger.kernel.org
Cc: stable@vger.kernel.org # 4.19.y
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
(cherry picked from commit fec9095bdef4e7c988adb603d0d4f92ee735d4a1)
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/usb/dwc3/core.h   |  3 ---
 drivers/usb/dwc3/gadget.c | 40 +--------------------------------------
 2 files changed, 1 insertion(+), 42 deletions(-)

diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 24f0b108b7f6..131028501752 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -639,7 +639,6 @@ struct dwc3_event_buffer {
  * @cancelled_list: list of cancelled requests for this endpoint
  * @pending_list: list of pending requests for this endpoint
  * @started_list: list of started requests on this endpoint
- * @wait_end_transfer: wait_queue_head_t for waiting on End Transfer complete
  * @lock: spinlock for endpoint request queue traversal
  * @regs: pointer to first endpoint register
  * @trb_pool: array of transaction buffers
@@ -664,8 +663,6 @@ struct dwc3_ep {
 	struct list_head	pending_list;
 	struct list_head	started_list;
 
-	wait_queue_head_t	wait_end_transfer;
-
 	spinlock_t		lock;
 	void __iomem		*regs;
 
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 8291fa1624e1..843586f20572 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -638,8 +638,6 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, unsigned int action)
 		reg |= DWC3_DALEPENA_EP(dep->number);
 		dwc3_writel(dwc->regs, DWC3_DALEPENA, reg);
 
-		init_waitqueue_head(&dep->wait_end_transfer);
-
 		if (usb_endpoint_xfer_control(desc))
 			goto out;
 
@@ -1404,15 +1402,11 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
 		if (r == req) {
 			/* wait until it is processed */
 			dwc3_stop_active_transfer(dep, true);
-			wait_event_lock_irq(dep->wait_end_transfer,
-					!(dep->flags & DWC3_EP_END_TRANSFER_PENDING),
-					dwc->lock);
 
 			if (!r->trb)
 				goto out0;
 
 			dwc3_gadget_move_cancelled_request(req);
-			dwc3_gadget_ep_cleanup_cancelled_requests(dep);
 			goto out0;
 		}
 		dev_err(dwc->dev, "request %pK was not queued to %s\n",
@@ -1913,8 +1907,6 @@ static int dwc3_gadget_stop(struct usb_gadget *g)
 {
 	struct dwc3		*dwc = gadget_to_dwc(g);
 	unsigned long		flags;
-	int			epnum;
-	u32			tmo_eps = 0;
 
 	spin_lock_irqsave(&dwc->lock, flags);
 
@@ -1923,36 +1915,6 @@ static int dwc3_gadget_stop(struct usb_gadget *g)
 
 	__dwc3_gadget_stop(dwc);
 
-	for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
-		struct dwc3_ep  *dep = dwc->eps[epnum];
-		int ret;
-
-		if (!dep)
-			continue;
-
-		if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING))
-			continue;
-
-		ret = wait_event_interruptible_lock_irq_timeout(dep->wait_end_transfer,
-			    !(dep->flags & DWC3_EP_END_TRANSFER_PENDING),
-			    dwc->lock, msecs_to_jiffies(5));
-
-		if (ret <= 0) {
-			/* Timed out or interrupted! There's nothing much
-			 * we can do so we just log here and print which
-			 * endpoints timed out at the end.
-			 */
-			tmo_eps |= 1 << epnum;
-			dep->flags &= DWC3_EP_END_TRANSFER_PENDING;
-		}
-	}
-
-	if (tmo_eps) {
-		dev_err(dwc->dev,
-			"end transfer timed out on endpoints 0x%x [bitmap]\n",
-			tmo_eps);
-	}
-
 out:
 	dwc->gadget_driver	= NULL;
 	spin_unlock_irqrestore(&dwc->lock, flags);
@@ -2449,7 +2411,7 @@ static void dwc3_endpoint_interrupt(struct dwc3 *dwc,
 
 		if (cmd == DWC3_DEPCMD_ENDTRANSFER) {
 			dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
-			wake_up(&dep->wait_end_transfer);
+			dwc3_gadget_ep_cleanup_cancelled_requests(dep);
 		}
 		break;
 	case DWC3_DEPEVT_STREAMEVT:
-- 
2.20.1
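
The flow this series arrives at (park the request at dequeue time, give it back when the End Transfer command completes) can be sketched as a user-space model. Everything named demo_* is hypothetical, the list helpers are a tiny stand-in for the kernel's list_head API, and locking is left out entirely.

```c
#include <errno.h>

/* Minimal circular doubly linked list, standing in for struct list_head. */
struct demo_list {
	struct demo_list *prev, *next;
};

static void demo_list_init(struct demo_list *h)
{
	h->prev = h->next = h;
}

static void demo_list_unlink(struct demo_list *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

static void demo_list_move_tail(struct demo_list *n, struct demo_list *h)
{
	demo_list_unlink(n);
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

struct demo_request {
	struct demo_list list;	/* kept first so the cast below is valid */
	int started;
	int status;
};

struct demo_ep {
	struct demo_list started_list;
	struct demo_list cancelled_list;
	int end_transfer_pending;
};

/* Dequeue path: request the stop and park the request. No waiting here. */
static void demo_ep_dequeue(struct demo_ep *ep, struct demo_request *req)
{
	ep->end_transfer_pending = 1;
	req->started = 0;
	demo_list_move_tail(&req->list, &ep->cancelled_list);
}

/* Command-complete path: only now are parked requests given back. */
static void demo_ep_end_transfer_complete(struct demo_ep *ep)
{
	struct demo_list *pos = ep->cancelled_list.next;

	ep->end_transfer_pending = 0;
	while (pos != &ep->cancelled_list) {
		struct demo_list *next = pos->next;
		struct demo_request *req = (struct demo_request *)pos;

		demo_list_unlink(pos);
		req->status = -ECONNRESET;	/* model of the giveback */
		pos = next;
	}
}
```

In this model the dequeue path never blocks, which is the property the series is after in dropping wait_end_transfer; the real driver additionally skips each request's TRBs before giving it back.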





* [PATCH 4.19 31/72] usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (29 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 30/72] usb: dwc3: gadget: remove wait_end_transfer Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 32/72] fs/proc/array.c: allow reporting eip/esp for all coredumping threads Greg Kroah-Hartman
                   ` (47 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Fei Yang, Sam Protsenko,
	Felipe Balbi, linux-usb, Jack Pham, Felipe Balbi, John Stultz,
	Sasha Levin

commit bd6742249b9ca918565e4e3abaa06665e587f4b5 upstream

OUT endpoint requests may sometimes have this flag set when
preparing to be submitted to HW indicating that there is an
additional TRB chained to the request for alignment purposes.
If that request is removed before the controller can execute the
transfer (e.g. ep_dequeue/ep_disable), the request will not go
through the dwc3_gadget_ep_cleanup_completed_request() handler
and will not have its needs_extra_trb flag cleared when
dwc3_gadget_giveback() is called.  This same request could be
later requeued for a new transfer that does not require an
extra TRB and if it is successfully completed, the cleanup
and TRB reclamation will incorrectly process the additional TRB
which belongs to the next request, and incorrectly advances the
TRB dequeue pointer, thereby messing up calculation of the next
request's actual/remaining count when it completes.

The right thing to do here is to ensure that the flag is cleared
before it is given back to the function driver.  A good place
to do that is in dwc3_gadget_del_and_unmap_request().
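
As a rough userspace illustration of the stale-flag problem described
above (not the driver code itself), the sketch below models a request that
is dequeued while an extra TRB is chained and then requeued for a transfer
that needs only one TRB. All sketch_* names and the struct layout are
hypothetical; only the needs_extra_trb, started and remaining fields are
taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct dwc3_request. */
struct sketch_request {
	bool started;
	unsigned int remaining;
	bool needs_extra_trb;
};

/* Models the reset performed when a request is removed from its list. */
void sketch_del_request(struct sketch_request *req, bool clear_flag)
{
	req->started = false;
	req->remaining = 0;
	if (clear_flag)
		req->needs_extra_trb = false;	/* the one-line fix */
}

/* Number of TRBs the completion path would reclaim for this request. */
unsigned int sketch_trbs_to_reclaim(const struct sketch_request *req)
{
	return req->needs_extra_trb ? 2 : 1;
}

/*
 * Dequeue a request that had an extra TRB chained, requeue it for a
 * transfer that needs a single TRB, and report how many TRBs would be
 * reclaimed on completion.
 */
unsigned int sketch_requeue_after_dequeue(bool clear_flag)
{
	struct sketch_request req = {
		.started = true,
		.remaining = 512,
		.needs_extra_trb = true,
	};

	sketch_del_request(&req, clear_flag);
	req.started = true;	/* requeued; no extra TRB prepared this time */
	return sketch_trbs_to_reclaim(&req);
}
```

Without the clear, the model reclaims two TRBs for a one-TRB transfer,
which corresponds to consuming the TRB that belongs to the next request.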

Fixes: c6267a51639b ("usb: dwc3: gadget: align transfers to wMaxPacketSize")
Cc: Fei Yang <fei.yang@intel.com>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: linux-usb@vger.kernel.org
Cc: stable@vger.kernel.org # 4.19.y
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
(cherry picked from commit bd6742249b9ca918565e4e3abaa06665e587f4b5)
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/usb/dwc3/gadget.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 843586f20572..e7122b5199d2 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -177,6 +177,7 @@ static void dwc3_gadget_del_and_unmap_request(struct dwc3_ep *dep,
 	req->started = false;
 	list_del(&req->list);
 	req->remaining = 0;
+	req->needs_extra_trb = false;
 
 	if (req->request.status == -EINPROGRESS)
 		req->request.status = status;
-- 
2.20.1





* [PATCH 4.19 32/72] fs/proc/array.c: allow reporting eip/esp for all coredumping threads
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (30 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 31/72] usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 33/72] mm/mempolicy.c: fix an incorrect rebind node in mpol_rebind_nodemask Greg Kroah-Hartman
                   ` (46 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, John Ogness, Jan Luebbe,
	Alexey Dobriyan, Andy Lutomirski, Andrew Morton, Linus Torvalds

From: John Ogness <john.ogness@linutronix.de>

commit cb8f381f1613cafe3aec30809991cd56e7135d92 upstream.

0a1eb2d474ed ("fs/proc: Stop reporting eip and esp in /proc/PID/stat")
stopped reporting eip/esp and fd7d56270b52 ("fs/proc: Report eip/esp in
/prod/PID/stat for coredumping") reintroduced the feature to fix a
regression with userspace core dump handlers (such as minicoredumper).

Because PF_DUMPCORE is only set for the primary thread, this didn't fix
the original problem for secondary threads.  Allow reporting the eip/esp
for all threads by checking for PF_EXITING as well.  This is set for all
the other threads when they are killed.  coredump_wait() waits for all the
tasks to become inactive before proceeding to invoke a core dumper.
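
The effect of the widened flag test can be sketched in plain C. The
SKETCH_PF_* values below are illustrative placeholders rather than a
quotation of the kernel headers, and both helper functions are
hypothetical; only the shape of the condition mirrors the patch.

```c
#include <stdbool.h>

/* Illustrative flag bits; treat the exact values as assumptions. */
#define SKETCH_PF_EXITING	0x4u
#define SKETCH_PF_DUMPCORE	0x200u

/* Old check: only the thread carrying PF_DUMPCORE gets eip/esp. */
bool sketch_report_regs_old(bool permitted, unsigned int flags)
{
	return permitted && (flags & SKETCH_PF_DUMPCORE);
}

/* New check: threads being killed for the dump (PF_EXITING) qualify too. */
bool sketch_report_regs_new(bool permitted, unsigned int flags)
{
	return permitted && (flags & (SKETCH_PF_EXITING | SKETCH_PF_DUMPCORE));
}
```

A secondary thread that only has the exiting flag set is rejected by the
old check and accepted by the new one.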

Link: http://lkml.kernel.org/r/87y32p7i7a.fsf@linutronix.de
Link: http://lkml.kernel.org/r/20190522161614.628-1-jlu@pengutronix.de
Fixes: fd7d56270b526ca3 ("fs/proc: Report eip/esp in /prod/PID/stat for coredumping")
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reported-by: Jan Luebbe <jlu@pengutronix.de>
Tested-by: Jan Luebbe <jlu@pengutronix.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/proc/array.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -452,7 +452,7 @@ static int do_task_stat(struct seq_file
 		 * a program is not able to use ptrace(2) in that case. It is
 		 * safe because the task has stopped executing permanently.
 		 */
-		if (permitted && (task->flags & PF_DUMPCORE)) {
+		if (permitted && (task->flags & (PF_EXITING|PF_DUMPCORE))) {
 			if (try_get_task_stack(task)) {
 				eip = KSTK_EIP(task);
 				esp = KSTK_ESP(task);




* [PATCH 4.19 33/72] mm/mempolicy.c: fix an incorrect rebind node in mpol_rebind_nodemask
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (31 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 32/72] fs/proc/array.c: allow reporting eip/esp for all coredumping threads Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 34/72] fs/binfmt_flat.c: make load_flat_shared_library() work Greg Kroah-Hartman
                   ` (45 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, zhong jiang, Vlastimil Babka,
	Oscar Salvador, Anshuman Khandual, Michal Hocko, Mel Gorman,
	Andrea Arcangeli, Ralph Campbell, Andrew Morton, Linus Torvalds

From: zhong jiang <zhongjiang@huawei.com>

commit 29b190fa774dd1b72a1a6f19687d55dc72ea83be upstream.

mpol_rebind_nodemask() is called for MPOL_BIND and MPOL_INTERLEAVE
mempolicies when the task's cpuset's mems_allowed changes.  For
policies created without MPOL_F_STATIC_NODES or MPOL_F_RELATIVE_NODES,
it works by remapping the policy's allowed nodes (stored in v.nodes)
using the previous value of mems_allowed (stored in
w.cpuset_mems_allowed) as the domain of map and the new mems_allowed
(passed as nodes) as the range of the map (see the comment of
bitmap_remap() for details).

The result of remapping is stored back as policy's nodemask in v.nodes,
and the new value of mems_allowed should be stored in
w.cpuset_mems_allowed to facilitate the next rebind, if it happens.

However, 213980c0f23b ("mm, mempolicy: simplify rebinding mempolicies
when updating cpusets") introduced a bug where the result of remapping
is stored in w.cpuset_mems_allowed instead.  Thus, a mempolicy's
allowed nodes can evolve in an unexpected way after a series of
rebinding due to cpuset mems_allowed changes, possibly binding to a
wrong node or a smaller number of nodes which may e.g.  overload them.
This patch fixes the bug so rebinding again works as intended.
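
To make the drift concrete, here is a small userspace model of two
consecutive rebinds on a 64-bit node mask. It is only a sketch:
sketch_remap() follows the ordinal-mapping behaviour described in the
bitmap_remap() comment, and every sketch_* name, the struct, and the
chosen node numbers are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of set bits in mask. */
int sketch_weight(uint64_t mask)
{
	int w = 0;

	for (; mask; mask &= mask - 1)
		w++;
	return w;
}

/* Position of the n-th (0-based) set bit of mask, or -1 if absent. */
int sketch_nth_set_bit(uint64_t mask, int n)
{
	for (int bit = 0; bit < 64; bit++) {
		if (mask & (1ULL << bit)) {
			if (n == 0)
				return bit;
			n--;
		}
	}
	return -1;
}

/* Map each bit of src from the old domain onto the new range. */
uint64_t sketch_remap(uint64_t src, uint64_t old_mask, uint64_t new_mask)
{
	int w = sketch_weight(new_mask);
	uint64_t dst = 0;

	for (int bit = 0; bit < 64; bit++) {
		if (!(src & (1ULL << bit)))
			continue;
		if (!(old_mask & (1ULL << bit)) || w == 0) {
			dst |= 1ULL << bit;	/* outside the domain: keep as is */
		} else {
			int ord = sketch_weight(old_mask & ((1ULL << bit) - 1));

			dst |= 1ULL << sketch_nth_set_bit(new_mask, ord % w);
		}
	}
	return dst;
}

/* Hypothetical stand-in for the two mempolicy fields involved. */
struct sketch_policy {
	uint64_t nodes;			/* v.nodes */
	uint64_t cpuset_mems_allowed;	/* w.cpuset_mems_allowed */
};

void sketch_rebind(struct sketch_policy *pol, uint64_t nodes, bool fixed)
{
	uint64_t tmp = sketch_remap(pol->nodes, pol->cpuset_mems_allowed, nodes);

	/* Buggy code stored the remap result here instead of *nodes. */
	pol->cpuset_mems_allowed = fixed ? nodes : tmp;
	if (tmp == 0)
		tmp = nodes;
	pol->nodes = tmp;
}

/* Policy on node 1 of cpuset {0,1}; move to {2,3}, then back to {0,1}. */
uint64_t sketch_two_rebinds(bool fixed)
{
	struct sketch_policy pol = { .nodes = 0x2, .cpuset_mems_allowed = 0x3 };

	sketch_rebind(&pol, 0xc, fixed);
	sketch_rebind(&pol, 0x3, fixed);
	return pol.nodes;
}
```

In this model the fixed variant returns to node 1 after the round trip,
while the buggy variant ends up bound to node 0.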

[vbabka@suse.cz: new changelog]
  Link: http://lkml.kernel.org/r/ef6a69c6-c052-b067-8f2c-9d615c619bb9@suse.cz
Link: http://lkml.kernel.org/r/1558768043-23184-1-git-send-email-zhongjiang@huawei.com
Fixes: 213980c0f23b ("mm, mempolicy: simplify rebinding mempolicies when updating cpusets")
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/mempolicy.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -306,7 +306,7 @@ static void mpol_rebind_nodemask(struct
 	else {
 		nodes_remap(tmp, pol->v.nodes,pol->w.cpuset_mems_allowed,
 								*nodes);
-		pol->w.cpuset_mems_allowed = tmp;
+		pol->w.cpuset_mems_allowed = *nodes;
 	}
 
 	if (nodes_empty(tmp))




* [PATCH 4.19 34/72] fs/binfmt_flat.c: make load_flat_shared_library() work
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (32 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 33/72] mm/mempolicy.c: fix an incorrect rebind node in mpol_rebind_nodemask Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 35/72] clk: socfpga: stratix10: fix divider entry for the emac clocks Greg Kroah-Hartman
                   ` (44 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jann Horn, Alexander Viro, Kees Cook,
	Nicolas Pitre, Arnd Bergmann, Geert Uytterhoeven, Russell King,
	Greg Ungerer, Andrew Morton, Linus Torvalds

From: Jann Horn <jannh@google.com>

commit 867bfa4a5fcee66f2b25639acae718e8b28b25a5 upstream.

load_flat_shared_library() is broken: It only calls load_flat_file() if
prepare_binprm() returns zero, but prepare_binprm() returns the number of
bytes read - so this only happens if the file is empty.

Instead, call into load_flat_file() if the number of bytes read is
non-negative. (Even if the number of bytes is zero - in that case,
load_flat_file() will see null bytes and return a nice -ENOEXEC.)

In addition, remove the code related to bprm creds and stop using
prepare_binprm() - this code is loading a library, not a main executable,
and it only actually uses the members "buf", "file" and "filename" of the
linux_binprm struct. Instead, call kernel_read() directly.
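
The return-value mix-up can be shown with a short userspace sketch. The
sketch_* functions are hypothetical stand-ins: read_result plays the role
of the byte count (or negative errno) returned by the read step, and
sketch_load_flat_file() merely imitates a loader that rejects an empty
buffer.

```c
#include <errno.h>

/* Stand-in loader: succeeds only if some header bytes were read. */
int sketch_load_flat_file(int bytes_in_buf)
{
	return bytes_in_buf > 0 ? 0 : -ENOEXEC;
}

/* Old logic: the loader runs only when the read returned exactly 0. */
int sketch_load_library_old(int read_result)
{
	int res = read_result;

	if (!res)
		res = sketch_load_flat_file(res);
	return res;
}

/* New logic: any non-negative byte count proceeds to the loader. */
int sketch_load_library_new(int read_result)
{
	int res = read_result;

	if (res >= 0)
		res = sketch_load_flat_file(res);
	return res;
}
```

With a successful 128-byte read, the old variant skips the loader and
passes the byte count back as its result, whereas the new variant loads
the file; read errors are still propagated unchanged.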

Link: http://lkml.kernel.org/r/20190524201817.16509-1-jannh@google.com
Fixes: 287980e49ffc ("remove lots of IS_ERR_VALUE abuses")
Signed-off-by: Jann Horn <jannh@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/binfmt_flat.c |   23 +++++++----------------
 1 file changed, 7 insertions(+), 16 deletions(-)

--- a/fs/binfmt_flat.c
+++ b/fs/binfmt_flat.c
@@ -856,9 +856,14 @@ err:
 
 static int load_flat_shared_library(int id, struct lib_info *libs)
 {
+	/*
+	 * This is a fake bprm struct; only the members "buf", "file" and
+	 * "filename" are actually used.
+	 */
 	struct linux_binprm bprm;
 	int res;
 	char buf[16];
+	loff_t pos = 0;
 
 	memset(&bprm, 0, sizeof(bprm));
 
@@ -872,25 +877,11 @@ static int load_flat_shared_library(int
 	if (IS_ERR(bprm.file))
 		return res;
 
-	bprm.cred = prepare_exec_creds();
-	res = -ENOMEM;
-	if (!bprm.cred)
-		goto out;
-
-	/* We don't really care about recalculating credentials at this point
-	 * as we're past the point of no return and are dealing with shared
-	 * libraries.
-	 */
-	bprm.called_set_creds = 1;
-
-	res = prepare_binprm(&bprm);
+	res = kernel_read(bprm.file, bprm.buf, BINPRM_BUF_SIZE, &pos);
 
-	if (!res)
+	if (res >= 0)
 		res = load_flat_file(&bprm, libs, id, NULL);
 
-	abort_creds(bprm.cred);
-
-out:
 	allow_write_access(bprm.file);
 	fput(bprm.file);
 




* [PATCH 4.19 35/72] clk: socfpga: stratix10: fix divider entry for the emac clocks
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (33 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 34/72] fs/binfmt_flat.c: make load_flat_shared_library() work Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 36/72] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails Greg Kroah-Hartman
                   ` (43 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Dinh Nguyen, Stephen Boyd

From: Dinh Nguyen <dinguyen@kernel.org>

commit 74684cce5ebd567b01e9bc0e9a1945c70a32f32f upstream.

The fixed dividers for the emac clocks should be 2 not 4.

Cc: stable@vger.kernel.org
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/clk/socfpga/clk-s10.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/clk/socfpga/clk-s10.c
+++ b/drivers/clk/socfpga/clk-s10.c
@@ -103,9 +103,9 @@ static const struct stratix10_perip_cnt_
 	{ STRATIX10_NOC_CLK, "noc_clk", NULL, noc_mux, ARRAY_SIZE(noc_mux),
 	  0, 0, 0, 0x3C, 1},
 	{ STRATIX10_EMAC_A_FREE_CLK, "emaca_free_clk", NULL, emaca_free_mux, ARRAY_SIZE(emaca_free_mux),
-	  0, 0, 4, 0xB0, 0},
+	  0, 0, 2, 0xB0, 0},
 	{ STRATIX10_EMAC_B_FREE_CLK, "emacb_free_clk", NULL, emacb_free_mux, ARRAY_SIZE(emacb_free_mux),
-	  0, 0, 4, 0xB0, 1},
+	  0, 0, 2, 0xB0, 1},
 	{ STRATIX10_EMAC_PTP_FREE_CLK, "emac_ptp_free_clk", NULL, emac_ptp_free_mux,
 	  ARRAY_SIZE(emac_ptp_free_mux), 0, 0, 4, 0xB0, 2},
 	{ STRATIX10_GPIO_DB_FREE_CLK, "gpio_db_free_clk", NULL, gpio_db_free_mux,




* [PATCH 4.19 36/72] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (34 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 35/72] clk: socfpga: stratix10: fix divider entry for the emac clocks Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 37/72] mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHuge Greg Kroah-Hartman
                   ` (42 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Naoya Horiguchi, Mike Kravetz,
	Oscar Salvador, Michal Hocko, Xishi Qiu, Chen, Jerry T, Zhuo,
	Qiuxu, Andrew Morton, Linus Torvalds

From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

commit b38e5962f8ed0d2a2b28a887fc2221f7f41db119 upstream.

The pass/fail of soft offline should be judged by checking whether the
raw error page was finally contained or not (i.e.  the result of
set_hwpoison_free_buddy_page()), but the current code does not work like
that.  It might lead us to misjudge the test result when
set_hwpoison_free_buddy_page() fails.

Without this fix, there are cases where madvise(MADV_SOFT_OFFLINE) may
not offline the original page and will not return an error.

Link: http://lkml.kernel.org/r/1560154686-18497-2-git-send-email-n-horiguchi@ah.jp.nec.com
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining")
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Xishi Qiu <xishi.qiuxishi@alibaba-inc.com>
Cc: "Chen, Jerry T" <jerry.t.chen@intel.com>
Cc: "Zhuo, Qiuxu" <qiuxu.zhuo@intel.com>
Cc: <stable@vger.kernel.org>	[4.19+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/memory-failure.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1731,6 +1731,8 @@ static int soft_offline_huge_page(struct
 		if (!ret) {
 			if (set_hwpoison_free_buddy_page(page))
 				num_poisoned_pages_inc();
+			else
+				ret = -EBUSY;
 		}
 	}
 	return ret;




* [PATCH 4.19 37/72] mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHuge
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (35 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 36/72] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 38/72] mm/page_idle.c: fix oops because end_pfn is larger than max_pfn Greg Kroah-Hartman
                   ` (41 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Naoya Horiguchi, Chen, Jerry T,
	Mike Kravetz, Oscar Salvador, Michal Hocko, Xishi Qiu, Zhuo,
	Qiuxu, Andrew Morton, Linus Torvalds

From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

commit faf53def3b143df11062d87c12afe6afeb6f8cc7 upstream.

madvise(MADV_SOFT_OFFLINE) often returns -EBUSY when calling soft offline
for hugepages with overcommitting enabled.  This is caused by suboptimal
logic in the current soft-offline code.  See the following part:

    ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
                            MIGRATE_SYNC, MR_MEMORY_FAILURE);
    if (ret) {
            ...
    } else {
            /*
             * We set PG_hwpoison only when the migration source hugepage
             * was successfully dissolved, because otherwise hwpoisoned
             * hugepage remains on free hugepage list, then userspace will
             * find it as SIGBUS by allocation failure. That's not expected
             * in soft-offlining.
             */
            ret = dissolve_free_huge_page(page);
            if (!ret) {
                    if (set_hwpoison_free_buddy_page(page))
                            num_poisoned_pages_inc();
            }
    }
    return ret;

Here dissolve_free_huge_page() returns -EBUSY if the migration source page
was freed into buddy in migrate_pages(), but even in that case there is
actually a chance that set_hwpoison_free_buddy_page() succeeds.  That means
the current code gives up offlining too early.

dissolve_free_huge_page() checks that a given hugepage is suitable for
dissolving, where we should return success for !PageHuge() case because
the given hugepage is considered as already dissolved.

This change also affects other callers of dissolve_free_huge_page(), which
are cleaned up together.
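
The change in return semantics can be modelled in userspace as follows.
This is a sketch under simplifying assumptions: struct sketch_page and all
sketch_* functions are hypothetical, locking is omitted, and poisoning is
assumed to always succeed once dissolution reports success.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified page state. */
struct sketch_page {
	bool is_huge;
	int refcount;
	bool poisoned;
};

/* Old semantics: anything that is not a free hugepage yields -EBUSY. */
int sketch_dissolve_old(struct sketch_page *p)
{
	if (p->is_huge && p->refcount == 0) {
		p->is_huge = false;
		return 0;
	}
	return -EBUSY;
}

/* New semantics: a non-hugepage counts as already dissolved. */
int sketch_dissolve_new(struct sketch_page *p)
{
	if (!p->is_huge)
		return 0;
	if (p->refcount == 0) {
		p->is_huge = false;
		return 0;
	}
	return -EBUSY;
}

/* Tail of the soft-offline path: poison only after dissolution succeeds. */
int sketch_soft_offline(struct sketch_page *p, bool use_new)
{
	int ret = use_new ? sketch_dissolve_new(p) : sketch_dissolve_old(p);

	if (!ret)
		p->poisoned = true;
	return ret;
}

/* The source page was already freed into buddy during migration. */
int sketch_offline_freed_page(bool use_new)
{
	struct sketch_page page = { .is_huge = false, .refcount = 0 };

	return sketch_soft_offline(&page, use_new);
}
```

For a page that migration has already returned to the buddy allocator, the
old variant reports -EBUSY without trying to poison it, while the new
variant proceeds and returns 0.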

[n-horiguchi@ah.jp.nec.com: v3]
  Link: http://lkml.kernel.org/r/1560761476-4651-3-git-send-email-n-horiguchi@ah.jp.nec.com
Link: http://lkml.kernel.org/r/1560154686-18497-3-git-send-email-n-horiguchi@ah.jp.nec.com
Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reported-by: Chen, Jerry T <jerry.t.chen@intel.com>
Tested-by: Chen, Jerry T <jerry.t.chen@intel.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Xishi Qiu <xishi.qiuxishi@alibaba-inc.com>
Cc: "Chen, Jerry T" <jerry.t.chen@intel.com>
Cc: "Zhuo, Qiuxu" <qiuxu.zhuo@intel.com>
Cc: <stable@vger.kernel.org>	[4.19+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/hugetlb.c        |   29 ++++++++++++++++++++---------
 mm/memory-failure.c |    5 +----
 2 files changed, 21 insertions(+), 13 deletions(-)

--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1489,16 +1489,29 @@ static int free_pool_huge_page(struct hs
 
 /*
  * Dissolve a given free hugepage into free buddy pages. This function does
- * nothing for in-use (including surplus) hugepages. Returns -EBUSY if the
- * dissolution fails because a give page is not a free hugepage, or because
- * free hugepages are fully reserved.
+ * nothing for in-use hugepages and non-hugepages.
+ * This function returns values like below:
+ *
+ *  -EBUSY: failed to dissolved free hugepages or the hugepage is in-use
+ *          (allocated or reserved.)
+ *       0: successfully dissolved free hugepages or the page is not a
+ *          hugepage (considered as already dissolved)
  */
 int dissolve_free_huge_page(struct page *page)
 {
 	int rc = -EBUSY;
 
+	/* Not to disrupt normal path by vainly holding hugetlb_lock */
+	if (!PageHuge(page))
+		return 0;
+
 	spin_lock(&hugetlb_lock);
-	if (PageHuge(page) && !page_count(page)) {
+	if (!PageHuge(page)) {
+		rc = 0;
+		goto out;
+	}
+
+	if (!page_count(page)) {
 		struct page *head = compound_head(page);
 		struct hstate *h = page_hstate(head);
 		int nid = page_to_nid(head);
@@ -1543,11 +1556,9 @@ int dissolve_free_huge_pages(unsigned lo
 
 	for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order) {
 		page = pfn_to_page(pfn);
-		if (PageHuge(page) && !page_count(page)) {
-			rc = dissolve_free_huge_page(page);
-			if (rc)
-				break;
-		}
+		rc = dissolve_free_huge_page(page);
+		if (rc)
+			break;
 	}
 
 	return rc;
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1857,11 +1857,8 @@ static int soft_offline_in_use_page(stru
 
 static int soft_offline_free_page(struct page *page)
 {
-	int rc = 0;
-	struct page *head = compound_head(page);
+	int rc = dissolve_free_huge_page(page);
 
-	if (PageHuge(head))
-		rc = dissolve_free_huge_page(page);
 	if (!rc) {
 		if (set_hwpoison_free_buddy_page(page))
 			num_poisoned_pages_inc();




* [PATCH 4.19 38/72] mm/page_idle.c: fix oops because end_pfn is larger than max_pfn
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (36 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 37/72] mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHuge Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 39/72] dm log writes: make sure super sector log updates are written in order Greg Kroah-Hartman
                   ` (40 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Colin Ian King, Andrew Morton,
	Vladimir Davydov, Michal Hocko, Mike Rapoport, Mel Gorman,
	Stephen Rothwell, Andrey Ryabinin, Linus Torvalds

From: Colin Ian King <colin.king@canonical.com>

commit 7298e3b0a149c91323b3205d325e942c3b3b9ef6 upstream.

Currently the calculation of end_pfn can round up the pfn number to more
than the actual maximum number of pfns, causing an Oops.  Fix this by
ensuring end_pfn is never more than max_pfn.
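
The rounding problem is plain arithmetic and can be sketched in userspace.
The chunk size of 64 bits and the example max_pfn of 1000 are assumptions
chosen for illustration, and the sketch_* helpers are hypothetical.

```c
/* Assumed number of pfns covered by one bitmap chunk. */
#define SKETCH_CHUNK_BITS	64UL
#define SKETCH_BITS_PER_BYTE	8UL

/* Round x up to a multiple of a (a must be a power of two). */
unsigned long sketch_align_up(unsigned long x, unsigned long a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Old clamp: rounds max_pfn up, so end_pfn may exceed max_pfn. */
unsigned long sketch_end_pfn_old(unsigned long pfn, unsigned long count,
				 unsigned long max_pfn)
{
	unsigned long end_pfn = pfn + count * SKETCH_BITS_PER_BYTE;

	if (end_pfn > max_pfn)
		end_pfn = sketch_align_up(max_pfn, SKETCH_CHUNK_BITS);
	return end_pfn;
}

/* New clamp: end_pfn never exceeds max_pfn. */
unsigned long sketch_end_pfn_new(unsigned long pfn, unsigned long count,
				 unsigned long max_pfn)
{
	unsigned long end_pfn = pfn + count * SKETCH_BITS_PER_BYTE;

	if (end_pfn > max_pfn)
		end_pfn = max_pfn;
	return end_pfn;
}
```

With max_pfn = 1000, a request starting at pfn 960 for 8 bytes yields an
end_pfn of 1024 under the old clamp, so the loop would touch pfns 1000 to
1023 that lie past the end of memory; the new clamp stops at 1000.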

This can be easily triggered on systems where the end_pfn gets
rounded up to more than max_pfn using the idle-page stress-ng stress test:

sudo stress-ng --idle-page 0

  BUG: unable to handle kernel paging request at 00000000000020d8
  #PF error: [normal kernel read fault]
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 1 PID: 11039 Comm: stress-ng-idle- Not tainted 5.0.0-5-generic #6-Ubuntu
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
  RIP: 0010:page_idle_get_page+0xc8/0x1a0
  Code: 0f b1 0a 75 7d 48 8b 03 48 89 c2 48 c1 e8 33 83 e0 07 48 c1 ea 36 48 8d 0c 40 4c 8d 24 88 49 c1 e4 07 4c 03 24 d5 00 89 c3 be <49> 8b 44 24 58 48 8d b8 80 a1 02 00 e8 07 d5 77 00 48 8b 53 08 48
  RSP: 0018:ffffafd7c672fde8 EFLAGS: 00010202
  RAX: 0000000000000005 RBX: ffffe36341fff700 RCX: 000000000000000f
  RDX: 0000000000000284 RSI: 0000000000000275 RDI: 0000000001fff700
  RBP: ffffafd7c672fe00 R08: ffffa0bc34056410 R09: 0000000000000276
  R10: ffffa0bc754e9b40 R11: ffffa0bc330f6400 R12: 0000000000002080
  R13: ffffe36341fff700 R14: 0000000000080000 R15: ffffa0bc330f6400
  FS: 00007f0ec1ea5740(0000) GS:ffffa0bc7db00000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000020d8 CR3: 0000000077d68000 CR4: 00000000000006e0
  Call Trace:
    page_idle_bitmap_write+0x8c/0x140
    sysfs_kf_bin_write+0x5c/0x70
    kernfs_fop_write+0x12e/0x1b0
    __vfs_write+0x1b/0x40
    vfs_write+0xab/0x1b0
    ksys_write+0x55/0xc0
    __x64_sys_write+0x1a/0x20
    do_syscall_64+0x5a/0x110
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

Link: http://lkml.kernel.org/r/20190618124352.28307-1-colin.king@canonical.com
Fixes: 33c3fc71c8cf ("mm: introduce idle page tracking")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/page_idle.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -136,7 +136,7 @@ static ssize_t page_idle_bitmap_read(str
 
 	end_pfn = pfn + count * BITS_PER_BYTE;
 	if (end_pfn > max_pfn)
-		end_pfn = ALIGN(max_pfn, BITMAP_CHUNK_BITS);
+		end_pfn = max_pfn;
 
 	for (; pfn < end_pfn; pfn++) {
 		bit = pfn % BITMAP_CHUNK_BITS;
@@ -181,7 +181,7 @@ static ssize_t page_idle_bitmap_write(st
 
 	end_pfn = pfn + count * BITS_PER_BYTE;
 	if (end_pfn > max_pfn)
-		end_pfn = ALIGN(max_pfn, BITMAP_CHUNK_BITS);
+		end_pfn = max_pfn;
 
 	for (; pfn < end_pfn; pfn++) {
 		bit = pfn % BITMAP_CHUNK_BITS;




* [PATCH 4.19 39/72] dm log writes: make sure super sector log updates are written in order
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (37 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 38/72] mm/page_idle.c: fix oops because end_pfn is larger than max_pfn Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 40/72] scsi: vmw_pscsi: Fix use-after-free in pvscsi_queue_lck() Greg Kroah-Hartman
                   ` (39 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, zhangyi (F), Josef Bacik, Mike Snitzer

From: zhangyi (F) <yi.zhang@huawei.com>

commit 211ad4b733037f66f9be0a79eade3da7ab11cbb8 upstream.

Currently, although we submit super bios in order (and super.nr_entries
is incremented by each logged entry), submit_bio() is async so each
super sector may not be written to log device in order and then the
final nr_entries may be smaller than it should be.

This problem can be reproduced by the xfstests generic/455 with ext4:

  QA output created by 455
 -Silence is golden
 +mark 'end' does not exist

Fix this by serializing submission of super sectors to make sure each
is written to the log disk in order.
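
A deliberately simple userspace model of the ordering hazard: each super
write carries the nr_entries value at submission time, and whichever write
lands last determines what stays on disk. The sketch_* functions, the
three-write scenario and the chosen completion order are all hypothetical;
real bio completion is of course not a fixed permutation.

```c
/* Final on-disk nr_entries when writes land in the given order. */
unsigned long long
sketch_final_nr_entries(const unsigned long long *submitted,
			const int *completion_order, int n)
{
	unsigned long long on_disk = 0;

	for (int i = 0; i < n; i++)
		on_disk = submitted[completion_order[i]];
	return on_disk;
}

/*
 * Three super writes with nr_entries 1, 2, 3. Without serialization the
 * second write may land last; waiting for each write before submitting
 * the next forces completion order to match submission order.
 */
unsigned long long sketch_demo_nr_entries(int serialized)
{
	static const unsigned long long submitted[] = { 1, 2, 3 };
	static const int reordered[] = { 0, 2, 1 };
	static const int in_order[] = { 0, 1, 2 };

	return sketch_final_nr_entries(submitted,
				       serialized ? in_order : reordered, 3);
}
```

In the reordered case the model leaves nr_entries at 2 although three
entries were logged, which matches the "smaller than it should be" symptom
described above.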

Fixes: 0e9cebe724597 ("dm: add log writes target")
Cc: stable@vger.kernel.org
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Suggested-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/md/dm-log-writes.c |   23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

--- a/drivers/md/dm-log-writes.c
+++ b/drivers/md/dm-log-writes.c
@@ -60,6 +60,7 @@
 
 #define WRITE_LOG_VERSION 1ULL
 #define WRITE_LOG_MAGIC 0x6a736677736872ULL
+#define WRITE_LOG_SUPER_SECTOR 0
 
 /*
  * The disk format for this is braindead simple.
@@ -115,6 +116,7 @@ struct log_writes_c {
 	struct list_head logging_blocks;
 	wait_queue_head_t wait;
 	struct task_struct *log_kthread;
+	struct completion super_done;
 };
 
 struct pending_block {
@@ -180,6 +182,14 @@ static void log_end_io(struct bio *bio)
 	bio_put(bio);
 }
 
+static void log_end_super(struct bio *bio)
+{
+	struct log_writes_c *lc = bio->bi_private;
+
+	complete(&lc->super_done);
+	log_end_io(bio);
+}
+
 /*
  * Meant to be called if there is an error, it will free all the pages
  * associated with the block.
@@ -215,7 +225,8 @@ static int write_metadata(struct log_wri
 	bio->bi_iter.bi_size = 0;
 	bio->bi_iter.bi_sector = sector;
 	bio_set_dev(bio, lc->logdev->bdev);
-	bio->bi_end_io = log_end_io;
+	bio->bi_end_io = (sector == WRITE_LOG_SUPER_SECTOR) ?
+			  log_end_super : log_end_io;
 	bio->bi_private = lc;
 	bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
 
@@ -418,11 +429,18 @@ static int log_super(struct log_writes_c
 	super.nr_entries = cpu_to_le64(lc->logged_entries);
 	super.sectorsize = cpu_to_le32(lc->sectorsize);
 
-	if (write_metadata(lc, &super, sizeof(super), NULL, 0, 0)) {
+	if (write_metadata(lc, &super, sizeof(super), NULL, 0,
+			   WRITE_LOG_SUPER_SECTOR)) {
 		DMERR("Couldn't write super");
 		return -1;
 	}
 
+	/*
+	 * Super sector should be writen in-order, otherwise the
+	 * nr_entries could be rewritten incorrectly by an old bio.
+	 */
+	wait_for_completion_io(&lc->super_done);
+
 	return 0;
 }
 
@@ -531,6 +549,7 @@ static int log_writes_ctr(struct dm_targ
 	INIT_LIST_HEAD(&lc->unflushed_blocks);
 	INIT_LIST_HEAD(&lc->logging_blocks);
 	init_waitqueue_head(&lc->wait);
+	init_completion(&lc->super_done);
 	atomic_set(&lc->io_blocks, 0);
 	atomic_set(&lc->pending_blocks, 0);
 




* [PATCH 4.19 40/72] scsi: vmw_pscsi: Fix use-after-free in pvscsi_queue_lck()
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (38 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 39/72] dm log writes: make sure super sector log updates are written in order Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 41/72] x86/speculation: Allow guests to use SSBD even if host does not Greg Kroah-Hartman
                   ` (38 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jan Kara, Ewan D. Milne, Martin K. Petersen

From: Jan Kara <jack@suse.cz>

commit 240b4cc8fd5db138b675297d4226ec46594d9b3b upstream.

Once we unlock adapter->hw_lock in pvscsi_queue_lck(), nothing prevents the
just-queued scsi_cmnd from completing and freeing the request. Thus the
cmd->cmnd[0] dereference can access an already freed request, leading to
kernel crashes or other issues (which one of our customers observed). Store
cmd->cmnd[0]
in a local variable before unlocking adapter->hw_lock to fix the issue.
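
The pattern of the fix, copying the field while the object is still
guaranteed to be valid, can be sketched in userspace. To keep the example
well defined, completion is modelled by overwriting the command bytes with
0xff instead of actually freeing memory; struct sketch_cmd and all
sketch_* functions are hypothetical.

```c
#include <string.h>

/* Hypothetical stand-in for the command descriptor. */
struct sketch_cmd {
	unsigned char cmnd[16];
};

/* Models completion racing in after the unlock: contents become garbage. */
void sketch_complete(struct sketch_cmd *cmd)
{
	memset(cmd->cmnd, 0xff, sizeof(cmd->cmnd));
}

/* Old pattern: the opcode is read after the lock has been dropped. */
unsigned char sketch_queue_old(struct sketch_cmd *cmd)
{
	/* ...lock dropped here; completion may run at any time... */
	sketch_complete(cmd);
	return cmd->cmnd[0];	/* stale read; a use-after-free in the driver */
}

/* New pattern: the opcode is saved while the lock is still held. */
unsigned char sketch_queue_new(struct sketch_cmd *cmd)
{
	unsigned char op = cmd->cmnd[0];

	/* ...lock dropped here; completion may run at any time... */
	sketch_complete(cmd);
	return op;
}

/* Queue a command with opcode 0x28 and return the opcode later used. */
unsigned char sketch_demo_op(int fixed)
{
	struct sketch_cmd cmd = { .cmnd = { 0x28 } };

	return fixed ? sketch_queue_new(&cmd) : sketch_queue_old(&cmd);
}
```

In the worst-case interleaving modelled here, the old pattern observes the
overwritten byte while the new pattern still sees the original opcode.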

CC: <stable@vger.kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/scsi/vmw_pvscsi.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/scsi/vmw_pvscsi.c
+++ b/drivers/scsi/vmw_pvscsi.c
@@ -763,6 +763,7 @@ static int pvscsi_queue_lck(struct scsi_
 	struct pvscsi_adapter *adapter = shost_priv(host);
 	struct pvscsi_ctx *ctx;
 	unsigned long flags;
+	unsigned char op;
 
 	spin_lock_irqsave(&adapter->hw_lock, flags);
 
@@ -775,13 +776,14 @@ static int pvscsi_queue_lck(struct scsi_
 	}
 
 	cmd->scsi_done = done;
+	op = cmd->cmnd[0];
 
 	dev_dbg(&cmd->device->sdev_gendev,
-		"queued cmd %p, ctx %p, op=%x\n", cmd, ctx, cmd->cmnd[0]);
+		"queued cmd %p, ctx %p, op=%x\n", cmd, ctx, op);
 
 	spin_unlock_irqrestore(&adapter->hw_lock, flags);
 
-	pvscsi_kick_io(adapter, cmd->cmnd[0]);
+	pvscsi_kick_io(adapter, op);
 
 	return 0;
 }
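
The pattern in this fix can be shown outside the kernel. What follows is a
minimal userspace sketch, not driver code: fake_cmd, fake_kick_io(),
fake_complete(), queue_cmd() and submit_opcode() are hypothetical names, and
the lock is only indicated by comments. The point is that the opcode is
copied while the command is still known to be alive, and only the copy is
used afterwards.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a SCSI command; cdb[0] plays cmd->cmnd[0]. */
struct fake_cmd {
	unsigned char cdb[16];
};

static unsigned char last_kicked_op;

/* Stand-in for pvscsi_kick_io(): only records the opcode it was given. */
static void fake_kick_io(unsigned char op)
{
	last_kicked_op = op;
}

/* Stand-in for a completion that races with the submitter and frees cmd. */
static void fake_complete(struct fake_cmd *cmd)
{
	memset(cmd, 0xff, sizeof(*cmd));	/* poison before freeing */
	free(cmd);
}

static void queue_cmd(struct fake_cmd *cmd)
{
	unsigned char op;

	/* "lock held": cmd cannot complete yet, so reading it is safe */
	op = cmd->cdb[0];

	/* "lock dropped": the command may complete and be freed right away */
	fake_complete(cmd);

	/* Only the saved copy is used; cmd is never touched again. */
	fake_kick_io(op);
}

/* Allocate a command with the given opcode, queue it, report what was kicked. */
static unsigned char submit_opcode(unsigned char opcode)
{
	struct fake_cmd *cmd = calloc(1, sizeof(*cmd));

	assert(cmd != NULL);
	cmd->cdb[0] = opcode;
	queue_cmd(cmd);
	return last_kicked_op;
}
```

Calling submit_opcode(0x2a) reports 0x2a even though the command has been
poisoned and freed before the kick, which is the property the patch
restores.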




* [PATCH 4.19 41/72] x86/speculation: Allow guests to use SSBD even if host does not
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Alejandro Jimenez, Thomas Gleixner,
	Liam Merwick, Mark Kanda, Paolo Bonzini, bp, rkrcmar, kvm

From: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>

commit c1f7fec1eb6a2c86d01bc22afce772c743451d88 upstream.

The bits set in x86_spec_ctrl_mask are used to calculate the guest's value
of SPEC_CTRL that is written to the MSR before VMENTRY, and control which
mitigations the guest can enable.  In the case of SSBD, unless the host has
enabled SSBD always on mode (by passing "spec_store_bypass_disable=on" in
the kernel parameters), the SSBD bit is not set in the mask and the guest
cannot properly enable the SSBD always on mitigation mode.

This has been confirmed by running the SSBD PoC on a guest using the SSBD
always on mitigation mode (booted with kernel parameter
"spec_store_bypass_disable=on"), and verifying that the guest is vulnerable
unless the host is also using SSBD always on mode. In addition, the guest
OS incorrectly reports the SSB vulnerability as mitigated.

Always set the SSBD bit in x86_spec_ctrl_mask when the host CPU supports
it, allowing the guest to use SSBD whether or not the host has chosen to
enable the mitigation in any of its modes.

Fixes: be6fcb5478e9 ("x86/bugs: Rework spec_ctrl base and mask logic")
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: bp@alien8.de
Cc: rkrcmar@redhat.com
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1560187210-11054-1-git-send-email-alejandro.j.jimenez@oracle.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/cpu/bugs.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -821,6 +821,16 @@ static enum ssb_mitigation __init __ssb_
 	}
 
 	/*
+	 * If SSBD is controlled by the SPEC_CTRL MSR, then set the proper
+	 * bit in the mask to allow guests to use the mitigation even in the
+	 * case where the host does not enable it.
+	 */
+	if (static_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) ||
+	    static_cpu_has(X86_FEATURE_AMD_SSBD)) {
+		x86_spec_ctrl_mask |= SPEC_CTRL_SSBD;
+	}
+
+	/*
 	 * We have three CPU feature flags that are in play here:
 	 *  - X86_BUG_SPEC_STORE_BYPASS - CPU is susceptible.
 	 *  - X86_FEATURE_SSBD - CPU is able to turn off speculative store bypass
@@ -837,7 +847,6 @@ static enum ssb_mitigation __init __ssb_
 			x86_amd_ssb_disable();
 		} else {
 			x86_spec_ctrl_base |= SPEC_CTRL_SSBD;
-			x86_spec_ctrl_mask |= SPEC_CTRL_SSBD;
 			wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
 		}
 	}
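
To make the role of the mask concrete, here is a small userspace sketch of
how a guest SPEC_CTRL value could be derived from a host base value, a mask
and the value the guest asked for. The combining formula and the function
name guest_spec_ctrl() are assumptions made for illustration; only the
SPEC_CTRL bit positions are taken as given.

```c
#include <assert.h>
#include <stdint.h>

#define SPEC_CTRL_IBRS	(1ULL << 0)
#define SPEC_CTRL_STIBP	(1ULL << 1)
#define SPEC_CTRL_SSBD	(1ULL << 2)

/*
 * Assumed model: bits outside the mask are taken from the host base
 * value, bits inside the mask are taken from the guest's request.
 */
static uint64_t guest_spec_ctrl(uint64_t host_base, uint64_t mask,
				uint64_t guest_req)
{
	uint64_t val = host_base & ~mask;

	val |= guest_req & mask;
	return val;
}
```

With SSBD missing from the mask, a guest request for SSBD is dropped and the
result is 0. With SSBD in the mask, the same request yields SPEC_CTRL_SSBD
even though the host base value leaves it clear.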




* [PATCH 4.19 42/72] x86/microcode: Fix the microcode load on CPU hotplug for real
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, Ingo Molnar, x86-ml

From: Thomas Gleixner <tglx@linutronix.de>

commit 5423f5ce5ca410b3646f355279e4e937d452e622 upstream.

A recent change moved the microcode loader hotplug callback into the early
startup phase which is running with interrupts disabled. It missed that
the callbacks invoke sysfs functions which might sleep causing nice 'might
sleep' splats with proper debugging enabled.

Split the callbacks and only load the microcode in the early startup phase
and move the sysfs handling back into the later threaded and preemptible
bringup phase where it was before.

Fixes: 78f4e932f776 ("x86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable@vger.kernel.org
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1906182228350.1766@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/cpu/microcode/core.c |   15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -790,13 +790,16 @@ static struct syscore_ops mc_syscore_ops
 	.resume			= mc_bp_resume,
 };
 
-static int mc_cpu_online(unsigned int cpu)
+static int mc_cpu_starting(unsigned int cpu)
 {
-	struct device *dev;
-
-	dev = get_cpu_device(cpu);
 	microcode_update_cpu(cpu);
 	pr_debug("CPU%d added\n", cpu);
+	return 0;
+}
+
+static int mc_cpu_online(unsigned int cpu)
+{
+	struct device *dev = get_cpu_device(cpu);
 
 	if (sysfs_create_group(&dev->kobj, &mc_attr_group))
 		pr_err("Failed to create group for CPU%d\n", cpu);
@@ -873,7 +876,9 @@ int __init microcode_init(void)
 		goto out_ucode_group;
 
 	register_syscore_ops(&mc_syscore_ops);
-	cpuhp_setup_state_nocalls(CPUHP_AP_MICROCODE_LOADER, "x86/microcode:online",
+	cpuhp_setup_state_nocalls(CPUHP_AP_MICROCODE_LOADER, "x86/microcode:starting",
+				  mc_cpu_starting, NULL);
+	cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "x86/microcode:online",
 				  mc_cpu_online, mc_cpu_down_prep);
 
 	pr_info("Microcode Update Driver: v%s.", DRIVER_VERSION);




* [PATCH 4.19 43/72] x86/resctrl: Prevent possible overrun during bitmap operations
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Reinette Chatre, Borislav Petkov,
	Fenghua Yu, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Tony Luck, x86-ml

From: Reinette Chatre <reinette.chatre@intel.com>

commit 32f010deab575199df4ebe7b6aec20c17bb7eccd upstream.

While the DOC at the beginning of lib/bitmap.c explicitly states that
"The number of valid bits in a given bitmap does _not_ need to be an
exact multiple of BITS_PER_LONG.", some of the bitmap operations do
indeed access BITS_PER_LONG portions of the provided bitmap no matter
the size of the provided bitmap.

For example, if find_first_bit() is provided with an 8 bit bitmap the
operation will access BITS_PER_LONG bits from the provided bitmap. While
the operation ensures that these extra bits do not affect the result,
the memory is still accessed.

The capacity bitmasks (CBMs) are typically stored in u32 since they
can never exceed 32 bits. A few instances exist where a bitmap_*
operation is performed on a CBM by simply pointing the bitmap operation
to the stored u32 value.

The consequence of this pattern is that some bitmap_* operations will
access out-of-bounds memory when interacting with the provided CBM.

This same issue has previously been addressed with commit 49e00eee0061
("x86/intel_rdt: Fix out-of-bounds memory access in CBM tests")
but at that time not all instances of the issue were fixed.

Fix this by using an unsigned long to store the capacity bitmask data
that is passed to bitmap functions.

Fixes: e651901187ab ("x86/intel_rdt: Introduce "bit_usage" to display cache allocations details")
Fixes: f4e80d67a527 ("x86/intel_rdt: Resctrl files reflect pseudo-locked information")
Fixes: 95f0b77efa57 ("x86/intel_rdt: Initialize new resource group with sane defaults")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/58c9b6081fd9bf599af0dfc01a6fdd335768efef.1560975645.git.reinette.chatre@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kernel/cpu/intel_rdt_rdtgroup.c |   35 ++++++++++++++-----------------
 1 file changed, 16 insertions(+), 19 deletions(-)

--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -792,8 +792,12 @@ static int rdt_bit_usage_show(struct ker
 			      struct seq_file *seq, void *v)
 {
 	struct rdt_resource *r = of->kn->parent->priv;
-	u32 sw_shareable = 0, hw_shareable = 0;
-	u32 exclusive = 0, pseudo_locked = 0;
+	/*
+	 * Use unsigned long even though only 32 bits are used to ensure
+	 * test_bit() is used safely.
+	 */
+	unsigned long sw_shareable = 0, hw_shareable = 0;
+	unsigned long exclusive = 0, pseudo_locked = 0;
 	struct rdt_domain *dom;
 	int i, hwb, swb, excl, psl;
 	enum rdtgrp_mode mode;
@@ -838,10 +842,10 @@ static int rdt_bit_usage_show(struct ker
 		}
 		for (i = r->cache.cbm_len - 1; i >= 0; i--) {
 			pseudo_locked = dom->plr ? dom->plr->cbm : 0;
-			hwb = test_bit(i, (unsigned long *)&hw_shareable);
-			swb = test_bit(i, (unsigned long *)&sw_shareable);
-			excl = test_bit(i, (unsigned long *)&exclusive);
-			psl = test_bit(i, (unsigned long *)&pseudo_locked);
+			hwb = test_bit(i, &hw_shareable);
+			swb = test_bit(i, &sw_shareable);
+			excl = test_bit(i, &exclusive);
+			psl = test_bit(i, &pseudo_locked);
 			if (hwb && swb)
 				seq_putc(seq, 'X');
 			else if (hwb && !swb)
@@ -2320,26 +2324,19 @@ out_destroy:
  */
 static void cbm_ensure_valid(u32 *_val, struct rdt_resource *r)
 {
-	/*
-	 * Convert the u32 _val to an unsigned long required by all the bit
-	 * operations within this function. No more than 32 bits of this
-	 * converted value can be accessed because all bit operations are
-	 * additionally provided with cbm_len that is initialized during
-	 * hardware enumeration using five bits from the EAX register and
-	 * thus never can exceed 32 bits.
-	 */
-	unsigned long *val = (unsigned long *)_val;
+	unsigned long val = *_val;
 	unsigned int cbm_len = r->cache.cbm_len;
 	unsigned long first_bit, zero_bit;
 
-	if (*val == 0)
+	if (val == 0)
 		return;
 
-	first_bit = find_first_bit(val, cbm_len);
-	zero_bit = find_next_zero_bit(val, cbm_len, first_bit);
+	first_bit = find_first_bit(&val, cbm_len);
+	zero_bit = find_next_zero_bit(&val, cbm_len, first_bit);
 
 	/* Clear any remaining bits to ensure contiguous region */
-	bitmap_clear(val, zero_bit, cbm_len - zero_bit);
+	bitmap_clear(&val, zero_bit, cbm_len - zero_bit);
+	*_val = (u32)val;
 }
 
 /**
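
The approach taken in cbm_ensure_valid() can be sketched in plain userspace
C. This is an illustration only: first_set_bit() and next_zero_bit() are
simplified single-word stand-ins written for this example, not the kernel's
find_first_bit() and find_next_zero_bit(), and the sketch assumes a bitmask
length of at most 32. What it shows is the widening by value into an
unsigned long, so that no bit helper is ever handed a pointer to a 32-bit
object.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in: index of the first set bit below len, else len. */
static unsigned int first_set_bit(unsigned long word, unsigned int len)
{
	unsigned int i;

	for (i = 0; i < len; i++)
		if (word & (1UL << i))
			return i;
	return len;
}

/* Simplified stand-in: index of the first clear bit in [start, len), else len. */
static unsigned int next_zero_bit(unsigned long word, unsigned int len,
				  unsigned int start)
{
	unsigned int i;

	for (i = start; i < len; i++)
		if (!(word & (1UL << i)))
			return i;
	return len;
}

static void cbm_ensure_valid_sketch(uint32_t *cbm, unsigned int cbm_len)
{
	unsigned long val = *cbm;	/* copy the value, never cast the pointer */
	unsigned int first, zero, i;

	assert(cbm_len >= 1 && cbm_len <= 32);
	if (val == 0)
		return;

	first = first_set_bit(val, cbm_len);
	zero = next_zero_bit(val, cbm_len, first);

	/* Clear everything from the first gap upwards to keep one contiguous run. */
	for (i = zero; i < cbm_len; i++)
		val &= ~(1UL << i);

	*cbm = (uint32_t)val;
}

/* Convenience wrapper for trying values. */
static uint32_t ensure_valid(uint32_t cbm, unsigned int cbm_len)
{
	cbm_ensure_valid_sketch(&cbm, cbm_len);
	return cbm;
}
```

For example, ensure_valid(0x0f3, 12) keeps only the lowest contiguous run
and returns 0x003, while an already contiguous mask such as 0x0c is returned
unchanged.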




* [PATCH 4.19 44/72] KVM: x86/mmu: Allocate PAE root array when using SVMs 32-bit NPT
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jiri Palecek, Sean Christopherson,
	Paolo Bonzini

From: Sean Christopherson <sean.j.christopherson@intel.com>

commit b6b80c78af838bef17501416d5d383fedab0010a upstream.

SVM's Nested Page Tables (NPT) reuses x86 paging for the host-controlled
page walk.  For 32-bit KVM, this means PAE paging is used even when TDP
is enabled, i.e. the PAE root array needs to be allocated.

Fixes: ee6268ba3a68 ("KVM: x86: Skip pae_root shadow allocation if tdp enabled")
Cc: stable@vger.kernel.org
Reported-by: Jiri Palecek <jpalecek@web.de>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jiri Palecek <jpalecek@web.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kvm/mmu.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -5386,7 +5386,16 @@ static int alloc_mmu_pages(struct kvm_vc
 	struct page *page;
 	int i;
 
-	if (tdp_enabled)
+	/*
+	 * When using PAE paging, the four PDPTEs are treated as 'root' pages,
+	 * while the PDP table is a per-vCPU construct that's allocated at MMU
+	 * creation.  When emulating 32-bit mode, cr3 is only 32 bits even on
+	 * x86_64.  Therefore we need to allocate the PDP table in the first
+	 * 4GB of memory, which happens to fit the DMA32 zone.  Except for
+	 * SVM's 32-bit NPT support, TDP paging doesn't use PAE paging and can
+	 * skip allocating the PDP table.
+	 */
+	if (tdp_enabled && kvm_x86_ops->get_tdp_level(vcpu) > PT32E_ROOT_LEVEL)
 		return 0;
 
 	/*
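
The changed condition can be read as a small predicate. The sketch below is
a standalone illustration: needs_pae_root() is a hypothetical helper, and
the TDP level is passed in as a plain integer instead of being queried
through kvm_x86_ops. PT32E_ROOT_LEVEL is assumed to be 3, the PAE paging
level.

```c
#include <assert.h>
#include <stdbool.h>

#define PT32E_ROOT_LEVEL 3	/* assumed: PAE paging uses three levels */

/*
 * Hypothetical helper: the PAE root array may be skipped only when TDP is
 * enabled and the TDP page tables are deeper than PAE.
 */
static bool needs_pae_root(bool tdp_enabled, int tdp_level)
{
	assert(tdp_level >= 1);
	return !(tdp_enabled && tdp_level > PT32E_ROOT_LEVEL);
}
```

Under this reading, shadow paging always needs the array, 4-level TDP does
not, and TDP running at the PAE level (the 32-bit NPT case described above)
needs it again.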




* [PATCH 4.19 45/72] NFS/flexfiles: Use the correct TCP timeout for flexfiles I/O
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Trond Myklebust, Anna Schumaker

From: Trond Myklebust <trondmy@gmail.com>

commit 68f461593f76bd5f17e87cdd0bea28f4278c7268 upstream.

Fix a typo where we're confusing the default TCP retrans value
(NFS_DEF_TCP_RETRANS) for the default TCP timeout value.

Fixes: 15d03055cf39f ("pNFS/flexfiles: Set reasonable default ...")
Cc: stable@vger.kernel.org # 4.8+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/nfs/flexfilelayout/flexfilelayoutdev.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/nfs/flexfilelayout/flexfilelayoutdev.c
+++ b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
@@ -18,7 +18,7 @@
 
 #define NFSDBG_FACILITY		NFSDBG_PNFS_LD
 
-static unsigned int dataserver_timeo = NFS_DEF_TCP_RETRANS;
+static unsigned int dataserver_timeo = NFS_DEF_TCP_TIMEO;
 static unsigned int dataserver_retrans;
 
 static bool ff_layout_has_available_ds(struct pnfs_layout_segment *lseg);




* [PATCH 4.19 46/72] cpu/speculation: Warn on unsupported mitigations= parameter
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Geert Uytterhoeven, Thomas Gleixner,
	Josh Poimboeuf, Peter Zijlstra, Jiri Kosina, Ben Hutchings

From: Geert Uytterhoeven <geert@linux-m68k.org>

commit 1bf72720281770162c87990697eae1ba2f1d917a upstream.

Currently, if the user specifies an unsupported mitigation strategy on the
kernel command line, it will be ignored silently.  The code will fall back
to the default strategy, possibly leaving the system more vulnerable than
expected.

This may happen due to e.g. a simple typo, or, for a stable kernel release,
because not all mitigation strategies have been backported.

Inform the user by printing a message.

Fixes: 98af8452945c5565 ("cpu/speculation: Add 'mitigations=' cmdline option")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190516070935.22546-1-geert@linux-m68k.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/cpu.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2289,6 +2289,9 @@ static int __init mitigations_parse_cmdl
 		cpu_mitigations = CPU_MITIGATIONS_AUTO;
 	else if (!strcmp(arg, "auto,nosmt"))
 		cpu_mitigations = CPU_MITIGATIONS_AUTO_NOSMT;
+	else
+		pr_crit("Unsupported mitigations=%s, system may still be vulnerable\n",
+			arg);
 
 	return 0;
 }
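
The behaviour after this change can be mimicked in userspace. The sketch
below is an approximation for illustration: the enum, mode_for() and the use
of stderr are assumptions, and unlike the kernel handler it returns the
resulting mode so the fallback is easy to observe. Unknown strings leave
the default in place but are no longer silent.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of the mitigation modes. */
enum mitigations_mode {
	MODE_OFF,
	MODE_AUTO,
	MODE_AUTO_NOSMT,
};

/* Parse a mitigations= argument; unknown values keep the default and warn. */
static enum mitigations_mode mode_for(const char *arg)
{
	enum mitigations_mode mode = MODE_AUTO;	/* default strategy */

	assert(arg != NULL);
	if (!strcmp(arg, "off"))
		mode = MODE_OFF;
	else if (!strcmp(arg, "auto"))
		mode = MODE_AUTO;
	else if (!strcmp(arg, "auto,nosmt"))
		mode = MODE_AUTO_NOSMT;
	else
		fprintf(stderr,
			"Unsupported mitigations=%s, system may still be vulnerable\n",
			arg);

	return mode;
}
```

A typo such as mode_for("atuo") falls back to MODE_AUTO and prints the
warning, which is the case the commit message is concerned with.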




* [PATCH 4.19 47/72] SUNRPC: Clean up initialisation of the struct rpc_rqst
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Trond Myklebust, Yihao Wu, Caspar Zhang

From: Trond Myklebust <trond.myklebust@hammerspace.com>

commit 9dc6edcf676fe188430e8b119f91280bbf285163 upstream.

Move the initialisation back into xprt.c.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Yihao Wu <wuyihao@linux.alibaba.com>
Cc: Caspar Zhang <caspar@linux.alibaba.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/linux/sunrpc/xprt.h |    1 
 net/sunrpc/clnt.c           |    1 
 net/sunrpc/xprt.c           |   91 ++++++++++++++++++++++++--------------------
 3 files changed, 51 insertions(+), 42 deletions(-)

--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -325,7 +325,6 @@ struct xprt_class {
 struct rpc_xprt		*xprt_create_transport(struct xprt_create *args);
 void			xprt_connect(struct rpc_task *task);
 void			xprt_reserve(struct rpc_task *task);
-void			xprt_request_init(struct rpc_task *task);
 void			xprt_retry_reserve(struct rpc_task *task);
 int			xprt_reserve_xprt(struct rpc_xprt *xprt, struct rpc_task *task);
 int			xprt_reserve_xprt_cong(struct rpc_xprt *xprt, struct rpc_task *task);
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1558,7 +1558,6 @@ call_reserveresult(struct rpc_task *task
 	task->tk_status = 0;
 	if (status >= 0) {
 		if (task->tk_rqstp) {
-			xprt_request_init(task);
 			task->tk_action = call_refresh;
 			return;
 		}
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1257,6 +1257,55 @@ void xprt_free(struct rpc_xprt *xprt)
 }
 EXPORT_SYMBOL_GPL(xprt_free);
 
+static __be32
+xprt_alloc_xid(struct rpc_xprt *xprt)
+{
+	__be32 xid;
+
+	spin_lock(&xprt->reserve_lock);
+	xid = (__force __be32)xprt->xid++;
+	spin_unlock(&xprt->reserve_lock);
+	return xid;
+}
+
+static void
+xprt_init_xid(struct rpc_xprt *xprt)
+{
+	xprt->xid = prandom_u32();
+}
+
+static void
+xprt_request_init(struct rpc_task *task)
+{
+	struct rpc_xprt *xprt = task->tk_xprt;
+	struct rpc_rqst	*req = task->tk_rqstp;
+
+	INIT_LIST_HEAD(&req->rq_list);
+	req->rq_timeout = task->tk_client->cl_timeout->to_initval;
+	req->rq_task	= task;
+	req->rq_xprt    = xprt;
+	req->rq_buffer  = NULL;
+	req->rq_xid	= xprt_alloc_xid(xprt);
+	req->rq_connect_cookie = xprt->connect_cookie - 1;
+	req->rq_bytes_sent = 0;
+	req->rq_snd_buf.len = 0;
+	req->rq_snd_buf.buflen = 0;
+	req->rq_rcv_buf.len = 0;
+	req->rq_rcv_buf.buflen = 0;
+	req->rq_release_snd_buf = NULL;
+	xprt_reset_majortimeo(req);
+	dprintk("RPC: %5u reserved req %p xid %08x\n", task->tk_pid,
+			req, ntohl(req->rq_xid));
+}
+
+static void
+xprt_do_reserve(struct rpc_xprt *xprt, struct rpc_task *task)
+{
+	xprt->ops->alloc_slot(xprt, task);
+	if (task->tk_rqstp != NULL)
+		xprt_request_init(task);
+}
+
 /**
  * xprt_reserve - allocate an RPC request slot
  * @task: RPC task requesting a slot allocation
@@ -1276,7 +1325,7 @@ void xprt_reserve(struct rpc_task *task)
 	task->tk_timeout = 0;
 	task->tk_status = -EAGAIN;
 	if (!xprt_throttle_congested(xprt, task))
-		xprt->ops->alloc_slot(xprt, task);
+		xprt_do_reserve(xprt, task);
 }
 
 /**
@@ -1298,45 +1347,7 @@ void xprt_retry_reserve(struct rpc_task
 
 	task->tk_timeout = 0;
 	task->tk_status = -EAGAIN;
-	xprt->ops->alloc_slot(xprt, task);
-}
-
-static inline __be32 xprt_alloc_xid(struct rpc_xprt *xprt)
-{
-	__be32 xid;
-
-	spin_lock(&xprt->reserve_lock);
-	xid = (__force __be32)xprt->xid++;
-	spin_unlock(&xprt->reserve_lock);
-	return xid;
-}
-
-static inline void xprt_init_xid(struct rpc_xprt *xprt)
-{
-	xprt->xid = prandom_u32();
-}
-
-void xprt_request_init(struct rpc_task *task)
-{
-	struct rpc_xprt *xprt = task->tk_xprt;
-	struct rpc_rqst	*req = task->tk_rqstp;
-
-	INIT_LIST_HEAD(&req->rq_list);
-	req->rq_timeout = task->tk_client->cl_timeout->to_initval;
-	req->rq_task	= task;
-	req->rq_xprt    = xprt;
-	req->rq_buffer  = NULL;
-	req->rq_xid	= xprt_alloc_xid(xprt);
-	req->rq_connect_cookie = xprt->connect_cookie - 1;
-	req->rq_bytes_sent = 0;
-	req->rq_snd_buf.len = 0;
-	req->rq_snd_buf.buflen = 0;
-	req->rq_rcv_buf.len = 0;
-	req->rq_rcv_buf.buflen = 0;
-	req->rq_release_snd_buf = NULL;
-	xprt_reset_majortimeo(req);
-	dprintk("RPC: %5u reserved req %p xid %08x\n", task->tk_pid,
-			req, ntohl(req->rq_xid));
+	xprt_do_reserve(xprt, task);
 }
 
 /**




* [PATCH 4.19 48/72] irqchip/mips-gic: Use the correct local interrupt map registers
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Paul Burton, Thomas Gleixner,
	Jason Cooper, Marc Zyngier, Archer Yan

From: Paul Burton <paul.burton@mips.com>

commit 6d4d367d0e9ffab4d64a3436256a6a052dc1195d upstream.

The MIPS GIC contains a block of registers used to map local interrupts
to a particular CPU interrupt pin. Since these registers are found at a
consecutive range of addresses we access them using an index, via the
(read|write)_gic_v[lo]_map accessor functions. We currently use values
from enum mips_gic_local_interrupt as those indices.

Unfortunately whilst enum mips_gic_local_interrupt provides the correct
offsets for bits in the pending & mask registers, the ordering of the
map registers is subtly different... Compared with the ordering of
pending & mask bits, the map registers move the FDC from the end of the
list to index 3 after the timer interrupt. As a result the performance
counter & software interrupts are at indices 4-6 rather than
indices 3-5.

Notably this causes problems with performance counter interrupts being
incorrectly mapped on some systems, and presumably will also cause
problems for FDC interrupts.

Introduce a function to map from enum mips_gic_local_interrupt to the
index of the corresponding map register, and use it to ensure we access
the map registers for the correct interrupts.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: a0dc5cb5e31b ("irqchip: mips-gic: Simplify gic_local_irq_domain_map()")
Fixes: da61fcf9d62a ("irqchip: mips-gic: Use irq_cpu_online to (un)mask all-VP(E) IRQs")
Reported-and-tested-by: Archer Yan <ayan@wavecomp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/mips/include/asm/mips-gic.h |   30 ++++++++++++++++++++++++++++++
 drivers/irqchip/irq-mips-gic.c   |    4 ++--
 2 files changed, 32 insertions(+), 2 deletions(-)

--- a/arch/mips/include/asm/mips-gic.h
+++ b/arch/mips/include/asm/mips-gic.h
@@ -315,6 +315,36 @@ static inline bool mips_gic_present(void
 }
 
 /**
+ * mips_gic_vx_map_reg() - Return GIC_Vx_<intr>_MAP register offset
+ * @intr: A GIC local interrupt
+ *
+ * Determine the index of the GIC_VL_<intr>_MAP or GIC_VO_<intr>_MAP register
+ * within the block of GIC map registers. This is almost the same as the order
+ * of interrupts in the pending & mask registers, as used by enum
+ * mips_gic_local_interrupt, but moves the FDC interrupt & thus offsets the
+ * interrupts after it...
+ *
+ * Return: The map register index corresponding to @intr.
+ *
+ * The return value is suitable for use with the (read|write)_gic_v[lo]_map
+ * accessor functions.
+ */
+static inline unsigned int
+mips_gic_vx_map_reg(enum mips_gic_local_interrupt intr)
+{
+	/* WD, Compare & Timer are 1:1 */
+	if (intr <= GIC_LOCAL_INT_TIMER)
+		return intr;
+
+	/* FDC moves to after Timer... */
+	if (intr == GIC_LOCAL_INT_FDC)
+		return GIC_LOCAL_INT_TIMER + 1;
+
+	/* As a result everything else is offset by 1 */
+	return intr + 1;
+}
+
+/**
  * gic_get_c0_compare_int() - Return cp0 count/compare interrupt virq
  *
  * Determine the virq number to use for the coprocessor 0 count/compare
--- a/drivers/irqchip/irq-mips-gic.c
+++ b/drivers/irqchip/irq-mips-gic.c
@@ -388,7 +388,7 @@ static void gic_all_vpes_irq_cpu_online(
 	intr = GIC_HWIRQ_TO_LOCAL(d->hwirq);
 	cd = irq_data_get_irq_chip_data(d);
 
-	write_gic_vl_map(intr, cd->map);
+	write_gic_vl_map(mips_gic_vx_map_reg(intr), cd->map);
 	if (cd->mask)
 		write_gic_vl_smask(BIT(intr));
 }
@@ -517,7 +517,7 @@ static int gic_irq_domain_map(struct irq
 	spin_lock_irqsave(&gic_lock, flags);
 	for_each_online_cpu(cpu) {
 		write_gic_vl_other(mips_cm_vp_id(cpu));
-		write_gic_vo_map(intr, map);
+		write_gic_vo_map(mips_gic_vx_map_reg(intr), map);
 	}
 	spin_unlock_irqrestore(&gic_lock, flags);
 
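
The index translation can be checked in isolation. The sketch below repeats
the logic of mips_gic_vx_map_reg() in a standalone form. The enum is a
hypothetical mirror of enum mips_gic_local_interrupt, with an ordering
inferred from the commit message (watchdog, compare and timer first, FDC
last); it is an assumption, not a copy of the kernel header.

```c
#include <assert.h>

/* Assumed ordering of the pending & mask bits, per the commit message. */
enum local_intr {
	LOCAL_INT_WD,		/* 0 */
	LOCAL_INT_COMPARE,	/* 1 */
	LOCAL_INT_TIMER,	/* 2 */
	LOCAL_INT_PERFCTR,	/* 3 */
	LOCAL_INT_SWINT0,	/* 4 */
	LOCAL_INT_SWINT1,	/* 5 */
	LOCAL_INT_FDC,		/* 6 */
};

/* Translate a pending/mask bit index into a map register index. */
static unsigned int map_reg_index(enum local_intr intr)
{
	/* WD, Compare & Timer are 1:1 */
	if (intr <= LOCAL_INT_TIMER)
		return intr;

	/* FDC moves to the slot right after Timer */
	if (intr == LOCAL_INT_FDC)
		return LOCAL_INT_TIMER + 1;

	/* Everything in between shifts up by one */
	return intr + 1;
}
```

Under that assumed ordering, FDC lands on index 3 and the performance
counter and software interrupts land on indices 4-6, matching the
description in the commit message.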




* [PATCH 4.19 49/72] eeprom: at24: fix unexpected timeout under high load
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Wang Xin, Mark Jonas, Bartosz Golaszewski

From: Wang Xin <xin.wang7@cn.bosch.com>

commit 9a9e295e7c5c0409c020088b0ae017e6c2b7df6e upstream.

Within at24_loop_until_timeout the timestamp used for timeout checking
is recorded after the I2C transfer and usleep_range(). Under high CPU
load either the execution time for the I2C transfer or usleep_range()
could actually be larger than the timeout value. In the worst case the
I2C transfer is only tried once because the loop will exit due to the
timeout although the EEPROM is now ready.

To fix this issue the timestamp is recorded at the beginning of each
iteration. That is, before I2C transfer and sleep. Then the timeout
is actually checked against the timestamp of the previous iteration.
This makes sure that even if the timeout is reached, there is still one
more chance to try the I2C transfer in case the EEPROM is ready.

Example:

If you have a system which combines high CPU load with repeated EEPROM
writes you will run into the following scenario.

 - System makes a successful regmap_bulk_write() to EEPROM.
 - System wants to perform another write to EEPROM but EEPROM is still
   busy with the last write.
 - Because of high CPU load the usleep_range() will sleep more than
   25 ms (at24_write_timeout).
 - Within the over-long sleeping the EEPROM finished the previous write
   operation and is ready again.
 - at24_loop_until_timeout() will detect timeout and won't try to write.
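
The scenario above can be replayed with a simulated clock. The sketch below
is a userspace model, not driver code: START_MS, old_loop() and new_loop()
are hypothetical, time is counted in plain milliseconds without wraparound
handling, and the transfer is assumed to succeed as soon as the simulated
device is ready. Both functions return 0 on success and -1 on timeout.

```c
#include <assert.h>

#define START_MS 1000UL	/* arbitrary non-zero start of the simulated clock */

/* Old scheme: the timestamp is taken after the sleep. */
static int old_loop(unsigned long ready_after, unsigned long sleep_ms,
		    unsigned long timeout_ms)
{
	unsigned long now = START_MS;
	unsigned long ready_at = START_MS + ready_after;
	unsigned long tout = now + timeout_ms;
	unsigned long op_time = 0;	/* 0 means "first iteration" */

	assert(sleep_ms > 0);
	while (op_time == 0 || op_time < tout) {
		if (now >= ready_at)	/* simulated transfer */
			return 0;
		now += sleep_ms;	/* simulated (possibly over-long) sleep */
		op_time = now;
	}
	return -1;
}

/* New scheme: the timestamp is taken before the transfer. */
static int new_loop(unsigned long ready_after, unsigned long sleep_ms,
		    unsigned long timeout_ms)
{
	unsigned long now = START_MS;
	unsigned long ready_at = START_MS + ready_after;
	unsigned long tout = now + timeout_ms;
	unsigned long op_time;

	assert(sleep_ms > 0);
	do {
		op_time = now;
		if (now >= ready_at)	/* simulated transfer */
			return 0;
		now += sleep_ms;	/* simulated (possibly over-long) sleep */
	} while (op_time < tout);

	return -1;
}
```

With a device that becomes ready after 10 ms, a sleep that overshoots to
40 ms and a 25 ms timeout, old_loop(10, 40, 25) gives up after a single
attempt while new_loop(10, 40, 25) retries once more and succeeds. A device
that never becomes ready still times out in both variants.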

Signed-off-by: Wang Xin <xin.wang7@cn.bosch.com>
Signed-off-by: Mark Jonas <mark.jonas@de.bosch.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/misc/eeprom/at24.c |   43 ++++++++++++++++++++++---------------------
 1 file changed, 22 insertions(+), 21 deletions(-)

--- a/drivers/misc/eeprom/at24.c
+++ b/drivers/misc/eeprom/at24.c
@@ -106,23 +106,6 @@ static unsigned int at24_write_timeout =
 module_param_named(write_timeout, at24_write_timeout, uint, 0);
 MODULE_PARM_DESC(at24_write_timeout, "Time (in ms) to try writes (default 25)");
 
-/*
- * Both reads and writes fail if the previous write didn't complete yet. This
- * macro loops a few times waiting at least long enough for one entire page
- * write to work while making sure that at least one iteration is run before
- * checking the break condition.
- *
- * It takes two parameters: a variable in which the future timeout in jiffies
- * will be stored and a temporary variable holding the time of the last
- * iteration of processing the request. Both should be unsigned integers
- * holding at least 32 bits.
- */
-#define at24_loop_until_timeout(tout, op_time)				\
-	for (tout = jiffies + msecs_to_jiffies(at24_write_timeout),	\
-	     op_time = 0;						\
-	     op_time ? time_before(op_time, tout) : true;		\
-	     usleep_range(1000, 1500), op_time = jiffies)
-
 struct at24_chip_data {
 	/*
 	 * these fields mirror their equivalents in
@@ -311,13 +294,22 @@ static ssize_t at24_regmap_read(struct a
 	/* adjust offset for mac and serial read ops */
 	offset += at24->offset_adj;
 
-	at24_loop_until_timeout(timeout, read_time) {
+	timeout = jiffies + msecs_to_jiffies(at24_write_timeout);
+	do {
+		/*
+		 * The timestamp shall be taken before the actual operation
+		 * to avoid a premature timeout in case of high CPU load.
+		 */
+		read_time = jiffies;
+
 		ret = regmap_bulk_read(regmap, offset, buf, count);
 		dev_dbg(&client->dev, "read %zu@%d --> %d (%ld)\n",
 			count, offset, ret, jiffies);
 		if (!ret)
 			return count;
-	}
+
+		usleep_range(1000, 1500);
+	} while (time_before(read_time, timeout));
 
 	return -ETIMEDOUT;
 }
@@ -361,14 +353,23 @@ static ssize_t at24_regmap_write(struct
 	regmap = at24_client->regmap;
 	client = at24_client->client;
 	count = at24_adjust_write_count(at24, offset, count);
+	timeout = jiffies + msecs_to_jiffies(at24_write_timeout);
+
+	do {
+		/*
+		 * The timestamp shall be taken before the actual operation
+		 * to avoid a premature timeout in case of high CPU load.
+		 */
+		write_time = jiffies;
 
-	at24_loop_until_timeout(timeout, write_time) {
 		ret = regmap_bulk_write(regmap, offset, buf, count);
 		dev_dbg(&client->dev, "write %zu@%d --> %d (%ld)\n",
 			count, offset, ret, jiffies);
 		if (!ret)
 			return count;
-	}
+
+		usleep_range(1000, 1500);
+	} while (time_before(write_time, timeout));
 
 	return -ETIMEDOUT;
 }




* [PATCH 4.19 50/72] af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (48 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 49/72] eeprom: at24: fix unexpected timeout under high load Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 51/72] bonding: Always enable vlan tx offload Greg Kroah-Hartman
                   ` (28 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Willem de Bruijn, Neil Horman,
	Matteo Croce, David S. Miller

From: Neil Horman <nhorman@tuxdriver.com>

[ Upstream commit 89ed5b519004a7706f50b70f611edbd3aaacff2c ]

When an application is run that:
a) Sets its scheduler to be SCHED_FIFO
and
b) Opens a memory mapped AF_PACKET socket, and sends frames with the
MSG_DONTWAIT flag cleared, it's possible for the application to hang
forever in the kernel.  This occurs because when waiting, the code in
tpacket_snd calls schedule, which under normal circumstances allows
other tasks to run, including ksoftirqd, which in some cases is
responsible for freeing the transmitted skb (which in AF_PACKET calls a
destructor that flips the status bit of the transmitted frame back to
available, allowing the transmitting task to complete).

However, when the calling application is SCHED_FIFO, its priority is
such that the schedule call immediately places the task back on the cpu,
preventing ksoftirqd from freeing the skb, which in turn prevents the
transmitting task from detecting that the transmission is complete.

We can fix this by converting the schedule call to a completion
mechanism.  By using a completion queue, we force the calling task, when
it detects there are no more frames to send, to schedule itself off the
cpu until such time as the last transmitted skb is freed, allowing
forward progress to be made.

Tested by myself and the reporter, with good results.

Change Notes:

V1->V2:
	Enhance the sleep logic to support being interruptible and
allowing for honoring SK_SNDTIMEO (Willem de Bruijn)

V2->V3:
	Rearrange the point at which we wait for the completion queue, to
avoid needing to check for ph/skb being null at the end of the loop.
Also move the complete call to the skb destructor to avoid needing to
modify __packet_set_status.  Also gate calling complete on
packet_read_pending returning zero to avoid multiple calls to complete.
(Willem de Bruijn)

	Move timeo computation within loop, to re-fetch the socket
timeout since we also use the timeo variable to record the return code
from the wait_for_complete call (Neil Horman)

V3->V4:
	Willem has requested that the control flow be restored to the
previous state.  Doing so lets us eliminate the need for the
po->wait_on_complete flag variable, and lets us get rid of the
packet_next_frame function, but introduces another complexity.
Specifically, because we use the packet pending count, if an
application calls sendmsg multiple times with MSG_DONTWAIT set, each
set of transmitted frames, when complete, will cause
tpacket_destruct_skb to issue a complete call for which there will
never be a wait_for_completion call.  This imbalance will cause any
future call to wait_for_completion here to return early, when the frames
it sent may not have completed.  To correct this, we need to re-init
the completion queue on every call to tpacket_snd before we enter the
loop so as to ensure we wait properly for the frames we send in this
iteration.
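
The imbalance described in the V3->V4 note can be illustrated with a toy,
single-threaded model of a completion's counter. This is a sketch only:
struct model_completion and the model_*() helpers are hypothetical
stand-ins, not the kernel's completion API, and there is no real blocking.

```c
/* Toy model of a completion: 'done' counts signals not yet consumed. */
struct model_completion {
	unsigned int done;
};

/* Stand-in for complete(): record one more signal. */
static void model_complete(struct model_completion *c)
{
	c->done++;
}

/* Stand-in for reinit_completion(): forget all earlier signals. */
static void model_reinit(struct model_completion *c)
{
	c->done = 0;
}

/*
 * Returns 1 if a waiter would return immediately (consuming one signal),
 * 0 if it would have to sleep.
 */
static int model_try_wait(struct model_completion *c)
{
	if (!c->done)
		return 0;
	c->done--;
	return 1;
}
```

A signal left over from an earlier non-blocking send makes the next waiter
return at once; resetting the counter before the send loop avoids that.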

	Change the timeout and interrupted gotos to out_put rather than
out_status so that we don't try to free a non-existent skb
	Clean up some extra newlines (Willem de Bruijn)

Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/packet/af_packet.c |   20 +++++++++++++++++---
 net/packet/internal.h  |    1 +
 2 files changed, 18 insertions(+), 3 deletions(-)

--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2399,6 +2399,9 @@ static void tpacket_destruct_skb(struct
 
 		ts = __packet_set_timestamp(po, ph, skb);
 		__packet_set_status(po, ph, TP_STATUS_AVAILABLE | ts);
+
+		if (!packet_read_pending(&po->tx_ring))
+			complete(&po->skb_completion);
 	}
 
 	sock_wfree(skb);
@@ -2594,7 +2597,7 @@ static int tpacket_parse_header(struct p
 
 static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
 {
-	struct sk_buff *skb;
+	struct sk_buff *skb = NULL;
 	struct net_device *dev;
 	struct virtio_net_hdr *vnet_hdr = NULL;
 	struct sockcm_cookie sockc;
@@ -2609,6 +2612,7 @@ static int tpacket_snd(struct packet_soc
 	int len_sum = 0;
 	int status = TP_STATUS_AVAILABLE;
 	int hlen, tlen, copylen = 0;
+	long timeo = 0;
 
 	mutex_lock(&po->pg_vec_lock);
 
@@ -2655,12 +2659,21 @@ static int tpacket_snd(struct packet_soc
 	if ((size_max > dev->mtu + reserve + VLAN_HLEN) && !po->has_vnet_hdr)
 		size_max = dev->mtu + reserve + VLAN_HLEN;
 
+	reinit_completion(&po->skb_completion);
+
 	do {
 		ph = packet_current_frame(po, &po->tx_ring,
 					  TP_STATUS_SEND_REQUEST);
 		if (unlikely(ph == NULL)) {
-			if (need_wait && need_resched())
-				schedule();
+			if (need_wait && skb) {
+				timeo = sock_sndtimeo(&po->sk, msg->msg_flags & MSG_DONTWAIT);
+				timeo = wait_for_completion_interruptible_timeout(&po->skb_completion, timeo);
+				if (timeo <= 0) {
+					err = !timeo ? -ETIMEDOUT : -ERESTARTSYS;
+					goto out_put;
+				}
+			}
+			/* check for additional frames */
 			continue;
 		}
 
@@ -3216,6 +3229,7 @@ static int packet_create(struct net *net
 	sock_init_data(sock, sk);
 
 	po = pkt_sk(sk);
+	init_completion(&po->skb_completion);
 	sk->sk_family = PF_PACKET;
 	po->num = proto;
 	po->xmit = dev_queue_xmit;
--- a/net/packet/internal.h
+++ b/net/packet/internal.h
@@ -128,6 +128,7 @@ struct packet_sock {
 	unsigned int		tp_hdrlen;
 	unsigned int		tp_reserve;
 	unsigned int		tp_tstamp;
+	struct completion	skb_completion;
 	struct net_device __rcu	*cached_dev;
 	int			(*xmit)(struct sk_buff *skb);
 	struct packet_type	prot_hook ____cacheline_aligned_in_smp;




* [PATCH 4.19 51/72] bonding: Always enable vlan tx offload
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (49 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 50/72] af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 52/72] ipv4: Use return value of inet_iif() for __raw_v4_lookup in the while loop Greg Kroah-Hartman
                   ` (27 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jiri Pirko, YueHaibing, Jiri Pirko,
	David S. Miller

From: YueHaibing <yuehaibing@huawei.com>

[ Upstream commit 30d8177e8ac776d89d387fad547af6a0f599210e ]

We build a vlan on top of a bonding interface whose vlan offload
is off; the bond mode is 802.3ad (LACP) and xmit_hash_policy is
BOND_XMIT_POLICY_ENCAP34.

Because vlan tx offload is off, the vlan tci is cleared and the vlan
header is pushed into the skb in validate_xmit_vlan() while sending
from vlan devices. Then in bond_xmit_hash, __skb_flow_dissect() fails to
get information from protocol headers encapsulated within vlan,
because 'nhoff' points to the IP header, so bond hashing is based
on layer 2 info, which fails to distribute packets across slaves.

This patch always enables bonding's vlan tx offload and passes the vlan
packets to the slave devices with the vlan tci, letting them handle
the vlan implementation.

Fixes: 278339a42a1b ("bonding: propogate vlan_features to bonding master")
Suggested-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/bonding/bond_main.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4307,12 +4307,12 @@ void bond_setup(struct net_device *bond_
 	bond_dev->features |= NETIF_F_NETNS_LOCAL;
 
 	bond_dev->hw_features = BOND_VLAN_FEATURES |
-				NETIF_F_HW_VLAN_CTAG_TX |
 				NETIF_F_HW_VLAN_CTAG_RX |
 				NETIF_F_HW_VLAN_CTAG_FILTER;
 
 	bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL | NETIF_F_GSO_UDP_L4;
 	bond_dev->features |= bond_dev->hw_features;
+	bond_dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
 }
 
 /* Destroy a bonding device.




* [PATCH 4.19 52/72] ipv4: Use return value of inet_iif() for __raw_v4_lookup in the while loop
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (50 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 51/72] bonding: Always enable vlan tx offload Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 53/72] net/packet: fix memory leak in packet_set_ring() Greg Kroah-Hartman
                   ` (26 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Stephen Suryaputra, David Ahern,
	David S. Miller

From: Stephen Suryaputra <ssuryaextr@gmail.com>

[ Upstream commit 38c73529de13e1e10914de7030b659a2f8b01c3b ]

In commit 19e4e768064a8 ("ipv4: Fix raw socket lookup for local
traffic"), the dif argument to __raw_v4_lookup() is coming from the
returned value of inet_iif() but the change was done only for the first
lookup. Subsequent lookups in the while loop still use skb->dev->ifindex.

Fixes: 19e4e768064a8 ("ipv4: Fix raw socket lookup for local traffic")
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/raw.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -202,7 +202,7 @@ static int raw_v4_input(struct sk_buff *
 		}
 		sk = __raw_v4_lookup(net, sk_next(sk), iph->protocol,
 				     iph->saddr, iph->daddr,
-				     skb->dev->ifindex, sdif);
+				     dif, sdif);
 	}
 out:
 	read_unlock(&raw_v4_hashinfo.lock);




* [PATCH 4.19 53/72] net/packet: fix memory leak in packet_set_ring()
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (51 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 52/72] ipv4: Use return value of inet_iif() for __raw_v4_lookup in the while loop Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 54/72] net: remove duplicate fetch in sock_getsockopt Greg Kroah-Hartman
                   ` (25 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Eric Dumazet, Sowmini Varadhan,
	syzbot, David S. Miller

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 55655e3d1197fff16a7a05088fb0e5eba50eac55 ]

syzbot found we can leak memory in packet_set_ring() if a user application
provides buggy parameters.

Fixes: 7f953ab2ba46 ("af_packet: TX_RING support for TPACKET_V3")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/packet/af_packet.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4316,7 +4316,7 @@ static int packet_set_ring(struct sock *
 				    req3->tp_sizeof_priv ||
 				    req3->tp_feature_req_word) {
 					err = -EINVAL;
-					goto out;
+					goto out_free_pg_vec;
 				}
 			}
 			break;
@@ -4380,6 +4380,7 @@ static int packet_set_ring(struct sock *
 			prb_shutdown_retire_blk_timer(po, rb_queue);
 	}
 
+out_free_pg_vec:
 	if (pg_vec)
 		free_pg_vec(pg_vec, order, req->tp_block_nr);
 out:




* [PATCH 4.19 54/72] net: remove duplicate fetch in sock_getsockopt
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (52 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 53/72] net/packet: fix memory leak in packet_set_ring() Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 55/72] net: stmmac: fixed new system time seconds value calculation Greg Kroah-Hartman
                   ` (24 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, JingYi Hou, David S. Miller

From: JingYi Hou <houjingyi647@gmail.com>

[ Upstream commit d0bae4a0e3d8c5690a885204d7eb2341a5b4884d ]

In sock_getsockopt(), 'optlen' is fetched the first time from userspace.
'len < 0' is then checked. Then in condition 'SO_MEMINFO', 'optlen' is
fetched the second time from userspace.

If userspace changes it between the two fetches, this may cause security
problems or unexpected behavior, and there is no reason to fetch it a
second time.

To fix this, we need to remove the second fetch.
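
The double-fetch pattern can be modelled in user space as follows. This is
a minimal sketch, not the sock_getsockopt() code: struct fake_user and
fetch() are hypothetical stand-ins for a userspace location and
get_user(), with the second read returning a different value to mimic a
concurrent change.

```c
/* Hypothetical userspace location whose value changes between reads. */
struct fake_user {
	int values[2]; /* value seen by the first and by any later read */
	int reads;     /* number of reads so far */
};

/* Stand-in for get_user(): returns the value currently visible. */
static int fetch(struct fake_user *u)
{
	int v = u->values[u->reads < 1 ? 0 : 1];

	u->reads++;
	return v;
}

/* Double fetch: the first read is validated, the second one is used. */
static int len_double_fetch(struct fake_user *u)
{
	int len = fetch(u);

	if (len < 0)
		return -1;
	len = fetch(u); /* may no longer match the validated value */
	return len;
}

/* Single fetch: the validated value is the one that is used. */
static int len_single_fetch(struct fake_user *u)
{
	int len = fetch(u);

	if (len < 0)
		return -1;
	return len;
}
```

If the location holds 8 on the first read and -5 on the second,
len_double_fetch() hands back the unchecked -5 while len_single_fetch()
returns the validated 8.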

Signed-off-by: JingYi Hou <houjingyi647@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/sock.c |    3 ---
 1 file changed, 3 deletions(-)

--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1348,9 +1348,6 @@ int sock_getsockopt(struct socket *sock,
 	{
 		u32 meminfo[SK_MEMINFO_VARS];
 
-		if (get_user(len, optlen))
-			return -EFAULT;
-
 		sk_get_meminfo(sk, meminfo);
 
 		len = min_t(unsigned int, len, sizeof(meminfo));




* [PATCH 4.19 55/72] net: stmmac: fixed new system time seconds value calculation
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (53 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 54/72] net: remove duplicate fetch in sock_getsockopt Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 56/72] net: stmmac: set IC bit when transmitting frames with HW timestamp Greg Kroah-Hartman
                   ` (23 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Roland Hii, Ong Boon Leong,
	Voon Weifeng, David S. Miller

From: Roland Hii <roland.king.guan.hii@intel.com>

[ Upstream commit a1e5388b4d5fc78688e5e9ee6641f779721d6291 ]

When ADDSUB bit is set, the system time seconds field is calculated as
the complement of the seconds part of the update value.

For example, if 3.000000001 seconds need to be subtracted from the
system time, this field is calculated as
2^32 - 3 = 4294967296 - 3 = 0x100000000 - 3 = 0xFFFFFFFD

Previously, the 0x100000000 was mistakenly written as 100000000.

This is further simplified from
  sec = (0x100000000ULL - sec);
to
  sec = -sec;
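
The equivalence of the two forms, and the effect of the missing 0x prefix,
can be checked with a few lines of standalone C. This sketch assumes the
seconds value is held in a 32-bit unsigned variable; the function names
are illustrative only.

```c
#include <stdint.h>

/* Intended computation: 2^32 - sec, truncated to 32 bits. */
static uint32_t complement_explicit(uint32_t sec)
{
	return (uint32_t)(0x100000000ULL - sec);
}

/* Simplified form: unsigned negation is arithmetic modulo 2^32. */
static uint32_t complement_negate(uint32_t sec)
{
	return -sec;
}

/* Buggy form: decimal 100000000 instead of hexadecimal 0x100000000. */
static uint32_t complement_buggy(uint32_t sec)
{
	return (uint32_t)(100000000ULL - sec);
}
```

For sec = 3 the first two functions return 0xFFFFFFFD, while the buggy
variant returns 99999997.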

Fixes: ba1ffd74df74 ("stmmac: fix PTP support for GMAC4")
Signed-off-by: Roland Hii <roland.king.guan.hii@intel.com>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
@@ -122,7 +122,7 @@ static int adjust_systime(void __iomem *
 		 * programmed with (2^32 – <new_sec_value>)
 		 */
 		if (gmac4)
-			sec = (100000000ULL - sec);
+			sec = -sec;
 
 		value = readl(ioaddr + PTP_TCR);
 		if (value & PTP_TCR_TSCTRLSSR)




* [PATCH 4.19 56/72] net: stmmac: set IC bit when transmitting frames with HW timestamp
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (54 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 55/72] net: stmmac: fixed new system time seconds value calculation Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 57/72] sctp: change to hold sk after auth shkey is created successfully Greg Kroah-Hartman
                   ` (22 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Roland Hii, Ong Boon Leong,
	Voon Weifeng, David S. Miller

From: Roland Hii <roland.king.guan.hii@intel.com>

[ Upstream commit d0bb82fd60183868f46c8ccc595a3d61c3334a18 ]

When transmitting certain PTP frames, e.g. SYNC and DELAY_REQ, the
PTP daemon, e.g. ptp4l, is polling the driver for the frame transmit
hardware timestamp. The polling will most likely time out if tx
coalescing is enabled, because the Interrupt-on-Completion (IC) bit is
not set in the tx descriptor for those frames.

This patch will ignore the tx coalesce parameter and set the IC bit
when transmitting PTP frames which need to report out the frame
transmit hardware timestamp to user space.
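
The resulting decision can be written as a small predicate. This is a
sketch that mirrors the condition in the diff below; struct tx_state and
should_set_ic() are hypothetical names, and the boolean fields stand in
for the driver checks named in the comments.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the values the transmit path looks at. */
struct tx_state {
	unsigned int coal_frames;  /* priv->tx_coal_frames */
	unsigned int count_frames; /* tx_q->tx_count_frames, already updated */
	bool core_is_gmac4;        /* priv->synopsys_id >= DWMAC_CORE_4_00 */
	bool skb_wants_hw_tstamp;  /* SKBTX_HW_TSTAMP set in tx_flags */
	bool hwts_tx_en;           /* priv->hwts_tx_en */
};

/*
 * Returns true when the IC bit should be set for this frame, false when
 * the coalesce timer should be armed instead.
 */
static bool should_set_ic(const struct tx_state *s)
{
	bool below_threshold = s->coal_frames > s->count_frames;
	bool needs_tstamp = s->core_is_gmac4 &&
			    s->skb_wants_hw_tstamp &&
			    s->hwts_tx_en;

	return !below_threshold || needs_tstamp;
}
```

A frame below the coalesce threshold normally only arms the timer; the
same frame with a pending hardware timestamp request gets the IC bit.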

Fixes: f748be531d70 ("net: stmmac: Rework coalesce timer and fix multi-queue races")
Signed-off-by: Roland Hii <roland.king.guan.hii@intel.com>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |   22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2938,12 +2938,15 @@ static netdev_tx_t stmmac_tso_xmit(struc
 
 	/* Manage tx mitigation */
 	tx_q->tx_count_frames += nfrags + 1;
-	if (priv->tx_coal_frames <= tx_q->tx_count_frames) {
+	if (likely(priv->tx_coal_frames > tx_q->tx_count_frames) &&
+	    !(priv->synopsys_id >= DWMAC_CORE_4_00 &&
+	    (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
+	    priv->hwts_tx_en)) {
+		stmmac_tx_timer_arm(priv, queue);
+	} else {
+		tx_q->tx_count_frames = 0;
 		stmmac_set_tx_ic(priv, desc);
 		priv->xstats.tx_set_ic_bit++;
-		tx_q->tx_count_frames = 0;
-	} else {
-		stmmac_tx_timer_arm(priv, queue);
 	}
 
 	skb_tx_timestamp(skb);
@@ -3157,12 +3160,15 @@ static netdev_tx_t stmmac_xmit(struct sk
 	 * element in case of no SG.
 	 */
 	tx_q->tx_count_frames += nfrags + 1;
-	if (priv->tx_coal_frames <= tx_q->tx_count_frames) {
+	if (likely(priv->tx_coal_frames > tx_q->tx_count_frames) &&
+	    !(priv->synopsys_id >= DWMAC_CORE_4_00 &&
+	    (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
+	    priv->hwts_tx_en)) {
+		stmmac_tx_timer_arm(priv, queue);
+	} else {
+		tx_q->tx_count_frames = 0;
 		stmmac_set_tx_ic(priv, desc);
 		priv->xstats.tx_set_ic_bit++;
-		tx_q->tx_count_frames = 0;
-	} else {
-		stmmac_tx_timer_arm(priv, queue);
 	}
 
 	skb_tx_timestamp(skb);




* [PATCH 4.19 57/72] sctp: change to hold sk after auth shkey is created successfully
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (55 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 56/72] net: stmmac: set IC bit when transmitting frames with HW timestamp Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:01 ` [PATCH 4.19 58/72] team: Always enable vlan tx offload Greg Kroah-Hartman
                   ` (21 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+afabda3890cc2f765041,
	syzbot+276ca1c77a19977c0130, Xin Long, Neil Horman,
	David S. Miller

From: Xin Long <lucien.xin@gmail.com>

[ Upstream commit 25bff6d5478b2a02368097015b7d8eb727c87e16 ]

Currently sctp_endpoint_init() holds the sk and then creates the auth
shkey. But when the creation fails, it doesn't release the sk,
which causes a sk refcnt leak.

Fix it by only holding the sk when the auth shkey is created
successfully.
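
The ordering issue can be shown with a toy reference counter. This is an
illustrative sketch only: struct fake_sock, struct fake_ep and both init
functions are hypothetical, and key_ok stands in for whether the auth
shkey creation succeeds.

```c
/* Hypothetical refcounted object and the endpoint that points at it. */
struct fake_sock {
	int refcnt;
};

struct fake_ep {
	struct fake_sock *sk;
};

/* Old ordering: take the reference first, then run the step that can fail. */
static int init_hold_early(struct fake_ep *ep, struct fake_sock *sk, int key_ok)
{
	ep->sk = sk;
	sk->refcnt++;
	if (!key_ok)
		return -1; /* error path never drops the reference */
	return 0;
}

/* New ordering: take the reference only after the fallible step succeeded. */
static int init_hold_late(struct fake_ep *ep, struct fake_sock *sk, int key_ok)
{
	if (!key_ok)
		return -1; /* nothing was taken, nothing to release */
	ep->sk = sk;
	sk->refcnt++;
	return 0;
}
```

On failure init_hold_early() leaves the count one too high, while
init_hold_late() leaves it unchanged; on success both take one reference.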

Fixes: a29a5bd4f5c3 ("[SCTP]: Implement SCTP-AUTH initializations.")
Reported-by: syzbot+afabda3890cc2f765041@syzkaller.appspotmail.com
Reported-by: syzbot+276ca1c77a19977c0130@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sctp/endpointola.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/net/sctp/endpointola.c
+++ b/net/sctp/endpointola.c
@@ -126,10 +126,6 @@ static struct sctp_endpoint *sctp_endpoi
 	/* Initialize the bind addr area */
 	sctp_bind_addr_init(&ep->base.bind_addr, 0);
 
-	/* Remember who we are attached to.  */
-	ep->base.sk = sk;
-	sock_hold(ep->base.sk);
-
 	/* Create the lists of associations.  */
 	INIT_LIST_HEAD(&ep->asocs);
 
@@ -167,6 +163,10 @@ static struct sctp_endpoint *sctp_endpoi
 	ep->prsctp_enable = net->sctp.prsctp_enable;
 	ep->reconf_enable = net->sctp.reconf_enable;
 
+	/* Remember who we are attached to.  */
+	ep->base.sk = sk;
+	sock_hold(ep->base.sk);
+
 	return ep;
 
 nomem_hmacs:




* [PATCH 4.19 58/72] team: Always enable vlan tx offload
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (56 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 57/72] sctp: change to hold sk after auth shkey is created successfully Greg Kroah-Hartman
@ 2019-07-02  8:01 ` Greg Kroah-Hartman
  2019-07-02  8:02 ` [PATCH 4.19 59/72] tipc: change to use register_pernet_device Greg Kroah-Hartman
                   ` (20 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jiri Pirko, YueHaibing, David S. Miller

From: YueHaibing <yuehaibing@huawei.com>

[ Upstream commit ee4297420d56a0033a8593e80b33fcc93fda8509 ]

We should rather have vlan_tci filled all the way down
to the transmitting netdevice and let it do the hw/sw
vlan implementation.

Suggested-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/team/team.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -2139,12 +2139,12 @@ static void team_setup(struct net_device
 	dev->features |= NETIF_F_NETNS_LOCAL;
 
 	dev->hw_features = TEAM_VLAN_FEATURES |
-			   NETIF_F_HW_VLAN_CTAG_TX |
 			   NETIF_F_HW_VLAN_CTAG_RX |
 			   NETIF_F_HW_VLAN_CTAG_FILTER;
 
 	dev->hw_features |= NETIF_F_GSO_ENCAP_ALL | NETIF_F_GSO_UDP_L4;
 	dev->features |= dev->hw_features;
+	dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
 }
 
 static int team_newlink(struct net *src_net, struct net_device *dev,




* [PATCH 4.19 59/72] tipc: change to use register_pernet_device
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (57 preceding siblings ...)
  2019-07-02  8:01 ` [PATCH 4.19 58/72] team: Always enable vlan tx offload Greg Kroah-Hartman
@ 2019-07-02  8:02 ` Greg Kroah-Hartman
  2019-07-02  8:02 ` [PATCH 4.19 60/72] tipc: check msg->req data len in tipc_nl_compat_bearer_disable Greg Kroah-Hartman
                   ` (19 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Li Shuang, Xin Long, Jon Maloy,
	David S. Miller

From: Xin Long <lucien.xin@gmail.com>

[ Upstream commit c492d4c74dd3f87559883ffa0f94a8f1ae3fe5f5 ]

This patch is to fix a dst refcnt leak, which can be reproduced by doing:

  # ip net a c; ip net a s; modprobe tipc
  # ip net e s ip l a n eth1 type veth peer n eth1 netns c
  # ip net e c ip l s lo up; ip net e c ip l s eth1 up
  # ip net e s ip l s lo up; ip net e s ip l s eth1 up
  # ip net e c ip a a 1.1.1.2/8 dev eth1
  # ip net e s ip a a 1.1.1.1/8 dev eth1
  # ip net e c tipc b e m udp n u1 localip 1.1.1.2
  # ip net e s tipc b e m udp n u1 localip 1.1.1.1
  # ip net d c; ip net d s; rmmod tipc

and it will get stuck and keep logging the error:

  unregister_netdevice: waiting for lo to become free. Usage count = 1

The cause is that a dst is held by the udp sock's sk_rx_dst set on udp rx
path with udp_early_demux == 1, and this dst (eventually holding lo dev)
can't be released as bearer's removal in tipc pernet .exit happens after
lo dev's removal, default_device pernet .exit.

 "There are two distinct types of pernet_operations recognized: subsys and
  device.  At creation all subsys init functions are called before device
  init functions, and at destruction all device exit functions are called
  before subsys exit function."

So by calling register_pernet_device instead to register tipc_net_ops, the
pernet .exit() will be invoked earlier than loopback dev's removal when a
netns is being destroyed, as fou/gue does.

Note that vxlan and geneve udp tunnels don't have this issue, as the udp
sock is released in their device ndo_stop().

This fix is also necessary for tipc dst_cache, which will hold dsts on the
tx path and which I will introduce in my next patch.

Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/tipc/core.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -132,7 +132,7 @@ static int __init tipc_init(void)
 	if (err)
 		goto out_sysctl;
 
-	err = register_pernet_subsys(&tipc_net_ops);
+	err = register_pernet_device(&tipc_net_ops);
 	if (err)
 		goto out_pernet;
 
@@ -140,7 +140,7 @@ static int __init tipc_init(void)
 	if (err)
 		goto out_socket;
 
-	err = register_pernet_subsys(&tipc_topsrv_net_ops);
+	err = register_pernet_device(&tipc_topsrv_net_ops);
 	if (err)
 		goto out_pernet_topsrv;
 
@@ -151,11 +151,11 @@ static int __init tipc_init(void)
 	pr_info("Started in single node mode\n");
 	return 0;
 out_bearer:
-	unregister_pernet_subsys(&tipc_topsrv_net_ops);
+	unregister_pernet_device(&tipc_topsrv_net_ops);
 out_pernet_topsrv:
 	tipc_socket_stop();
 out_socket:
-	unregister_pernet_subsys(&tipc_net_ops);
+	unregister_pernet_device(&tipc_net_ops);
 out_pernet:
 	tipc_unregister_sysctl();
 out_sysctl:
@@ -170,9 +170,9 @@ out_netlink:
 static void __exit tipc_exit(void)
 {
 	tipc_bearer_cleanup();
-	unregister_pernet_subsys(&tipc_topsrv_net_ops);
+	unregister_pernet_device(&tipc_topsrv_net_ops);
 	tipc_socket_stop();
-	unregister_pernet_subsys(&tipc_net_ops);
+	unregister_pernet_device(&tipc_net_ops);
 	tipc_netlink_stop();
 	tipc_netlink_compat_stop();
 	tipc_unregister_sysctl();




* [PATCH 4.19 60/72] tipc: check msg->req data len in tipc_nl_compat_bearer_disable
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (58 preceding siblings ...)
  2019-07-02  8:02 ` [PATCH 4.19 59/72] tipc: change to use register_pernet_device Greg Kroah-Hartman
@ 2019-07-02  8:02 ` Greg Kroah-Hartman
  2019-07-02  8:02 ` [PATCH 4.19 61/72] tun: wake up waitqueues after IFF_UP is set Greg Kroah-Hartman
                   ` (18 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+30eaa8bf392f7fafffaf,
	Xin Long, David S. Miller

From: Xin Long <lucien.xin@gmail.com>

[ Upstream commit 4f07b80c973348a99b5d2a32476a2e7877e94a05 ]

This patch is to fix an uninit-value issue, reported by syzbot:

  BUG: KMSAN: uninit-value in memchr+0xce/0x110 lib/string.c:981
  Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x191/0x1f0 lib/dump_stack.c:113
    kmsan_report+0x130/0x2a0 mm/kmsan/kmsan.c:622
    __msan_warning+0x75/0xe0 mm/kmsan/kmsan_instr.c:310
    memchr+0xce/0x110 lib/string.c:981
    string_is_valid net/tipc/netlink_compat.c:176 [inline]
    tipc_nl_compat_bearer_disable+0x2a1/0x480 net/tipc/netlink_compat.c:449
    __tipc_nl_compat_doit net/tipc/netlink_compat.c:327 [inline]
    tipc_nl_compat_doit+0x3ac/0xb00 net/tipc/netlink_compat.c:360
    tipc_nl_compat_handle net/tipc/netlink_compat.c:1178 [inline]
    tipc_nl_compat_recv+0x1b1b/0x27b0 net/tipc/netlink_compat.c:1281

TLV_GET_DATA_LEN() may return a negative int value, which is then
converted to size_t (becoming a very large unsigned long) when passed
to memchr(), causing this issue.

Similar to what tipc_nl_compat_bearer_enable() does, this fix returns
-EINVAL when TLV_GET_DATA_LEN() is negative in
tipc_nl_compat_bearer_disable(), as well as in
tipc_nl_compat_link_stat_dump() and tipc_nl_compat_link_reset_stats().
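
The conversion problem and the guard can be illustrated with a small
userspace sketch. This is not the TIPC code: MAX_NAME, has_terminator(),
check_name() and as_size() are hypothetical names chosen for illustration,
and -1 stands in for -EINVAL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAME 32   /* hypothetical cap, stands in for TIPC_MAX_BEARER_NAME */

/* Returns 1 if buf[0..len) contains a NUL terminator. */
int has_terminator(const char *buf, size_t len)
{
	return memchr(buf, '\0', len) != NULL;
}

/*
 * Validate a name whose length came from an untrusted header.
 * Returns 0 on success and -1 (standing in for -EINVAL) on failure.
 */
int check_name(const char *buf, int claimed_len)
{
	int len = claimed_len;

	/* Reject before any conversion to size_t can happen. */
	if (len <= 0)
		return -1;

	/* Clamp only after the sign has been checked. */
	if (len > MAX_NAME)
		len = MAX_NAME;

	return has_terminator(buf, (size_t)len) ? 0 : -1;
}

/* Shows what an unchecked int-to-size_t conversion produces. */
size_t as_size(int len)
{
	return (size_t)len;
}
```

Clamping with a signed minimum alone does not help, because the smaller of
a negative length and the cap is still the negative length; the sign check
has to come first.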

v1->v2:
  - add the missing Fixes tags per Eric's request.

Fixes: 0762216c0ad2 ("tipc: fix uninit-value in tipc_nl_compat_bearer_enable")
Fixes: 8b66fee7f8ee ("tipc: fix uninit-value in tipc_nl_compat_link_reset_stats")
Reported-by: syzbot+30eaa8bf392f7fafffaf@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/tipc/netlink_compat.c |   18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

--- a/net/tipc/netlink_compat.c
+++ b/net/tipc/netlink_compat.c
@@ -445,7 +445,11 @@ static int tipc_nl_compat_bearer_disable
 	if (!bearer)
 		return -EMSGSIZE;
 
-	len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_BEARER_NAME);
+	len = TLV_GET_DATA_LEN(msg->req);
+	if (len <= 0)
+		return -EINVAL;
+
+	len = min_t(int, len, TIPC_MAX_BEARER_NAME);
 	if (!string_is_valid(name, len))
 		return -EINVAL;
 
@@ -537,7 +541,11 @@ static int tipc_nl_compat_link_stat_dump
 
 	name = (char *)TLV_DATA(msg->req);
 
-	len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_LINK_NAME);
+	len = TLV_GET_DATA_LEN(msg->req);
+	if (len <= 0)
+		return -EINVAL;
+
+	len = min_t(int, len, TIPC_MAX_BEARER_NAME);
 	if (!string_is_valid(name, len))
 		return -EINVAL;
 
@@ -815,7 +823,11 @@ static int tipc_nl_compat_link_reset_sta
 	if (!link)
 		return -EMSGSIZE;
 
-	len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_LINK_NAME);
+	len = TLV_GET_DATA_LEN(msg->req);
+	if (len <= 0)
+		return -EINVAL;
+
+	len = min_t(int, len, TIPC_MAX_BEARER_NAME);
 	if (!string_is_valid(name, len))
 		return -EINVAL;
 




* [PATCH 4.19 61/72] tun: wake up waitqueues after IFF_UP is set
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (59 preceding siblings ...)
  2019-07-02  8:02 ` [PATCH 4.19 60/72] tipc: check msg->req data len in tipc_nl_compat_bearer_disable Greg Kroah-Hartman
@ 2019-07-02  8:02 ` Greg Kroah-Hartman
  2019-07-02  8:02 ` [PATCH 4.19 62/72] bpf: simplify definition of BPF_FIB_LOOKUP related flags Greg Kroah-Hartman
                   ` (17 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Fei Li, Jason Wang, David S. Miller

From: Fei Li <lifei.shirley@bytedance.com>

[ Upstream commit 72b319dc08b4924a29f5e2560ef6d966fa54c429 ]

Currently, after setting the tap0 link up, the tun code wakes up the
tx/rx wait queues in tun_net_open() when .ndo_open() is called, but the
IFF_UP flag has not been set yet at that point. If a waiter is already
queued, it fails to transmit when tun_sendmsg() checks the IFF_UP flag.
The subsequent vhost_poll_start() then adds the wq back into the wqh
until it is woken up again. This works if IFF_UP has already been set
by the time tun_chr_poll() checks it, but not if the flag is still
unset at that time. Sadly the latter case is a fatal error, as the wq
will never be woken up again unless the link is later set up manually
on purpose.

Fix this by moving the wakeup into the NETDEV_UP event notifier, which
makes sure IFF_UP has been set before any of the wait queues are woken
up.
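
The ordering issue can be shown with a deliberately simplified,
single-threaded toy model. Nothing here is the tun driver's API: struct
tap_model, wake_waiter(), bring_up_old() and bring_up_fixed() are
hypothetical names, and the model assumes exactly one blocked sender.

```c
#include <assert.h>

/* Toy model of one tap device with a single blocked sender. */
struct tap_model {
	int iff_up;           /* models the IFF_UP flag */
	int waiter_sleeping;  /* 1 while the sender is parked on the wait queue */
	int sent;             /* packets the sender managed to transmit */
};

/* Wake the sender; it transmits only if the device is already up. */
void wake_waiter(struct tap_model *t)
{
	if (!t->waiter_sleeping)
		return;
	if (t->iff_up) {
		t->sent++;
		t->waiter_sleeping = 0;
	}
	/* otherwise the sender re-queues itself and keeps sleeping */
}

/* Old order: wake-up from the open callback, flag set afterwards. */
void bring_up_old(struct tap_model *t)
{
	wake_waiter(t);
	t->iff_up = 1;
}

/* New order: flag set first, wake-up from the NETDEV_UP notification. */
void bring_up_fixed(struct tap_model *t)
{
	t->iff_up = 1;
	wake_waiter(t);
}
```

In the old order the model ends with the device up but the sender still
asleep, which corresponds to the stuck wait queue described above.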

Signed-off-by: Fei Li <lifei.shirley@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/tun.c |   19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1024,18 +1024,8 @@ static void tun_net_uninit(struct net_de
 /* Net device open. */
 static int tun_net_open(struct net_device *dev)
 {
-	struct tun_struct *tun = netdev_priv(dev);
-	int i;
-
 	netif_tx_start_all_queues(dev);
 
-	for (i = 0; i < tun->numqueues; i++) {
-		struct tun_file *tfile;
-
-		tfile = rtnl_dereference(tun->tfiles[i]);
-		tfile->socket.sk->sk_write_space(tfile->socket.sk);
-	}
-
 	return 0;
 }
 
@@ -3443,6 +3433,7 @@ static int tun_device_event(struct notif
 {
 	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
 	struct tun_struct *tun = netdev_priv(dev);
+	int i;
 
 	if (dev->rtnl_link_ops != &tun_link_ops)
 		return NOTIFY_DONE;
@@ -3452,6 +3443,14 @@ static int tun_device_event(struct notif
 		if (tun_queue_resize(tun))
 			return NOTIFY_BAD;
 		break;
+	case NETDEV_UP:
+		for (i = 0; i < tun->numqueues; i++) {
+			struct tun_file *tfile;
+
+			tfile = rtnl_dereference(tun->tfiles[i]);
+			tfile->socket.sk->sk_write_space(tfile->socket.sk);
+		}
+		break;
 	default:
 		break;
 	}




* [PATCH 4.19 62/72] bpf: simplify definition of BPF_FIB_LOOKUP related flags
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (60 preceding siblings ...)
  2019-07-02  8:02 ` [PATCH 4.19 61/72] tun: wake up waitqueues after IFF_UP is set Greg Kroah-Hartman
@ 2019-07-02  8:02 ` Greg Kroah-Hartman
  2019-07-02  8:02 ` [PATCH 4.19 63/72] bpf: lpm_trie: check left child of last leftmost node for NULL Greg Kroah-Hartman
                   ` (16 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Martynas Pumputis, Andrii Nakryiko,
	Daniel Borkmann

From: Martynas Pumputis <m@lambda.lt>

commit b1d6c15b9d824a58c5415673f374fac19e8eccdf upstream.

Previously, the BPF_FIB_LOOKUP_{DIRECT,OUTPUT} flags in the BPF UAPI
were defined with the help of BIT macro. This had the following issues:

- In order to use any of the flags, a user was required to depend
  on <linux/bits.h>.
- No other flag in bpf.h uses the macro, so it seems that an unwritten
  convention is to use (1 << (nr)) to define BPF-related flags.
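
As a minimal illustration of the shift-based convention, the sketch below
defines two flags without relying on any kernel header. The EXAMPLE_*
names and flags_are_valid() are hypothetical and only modelled on the
BPF_FIB_LOOKUP_{DIRECT,OUTPUT} flags.

```c
#include <assert.h>

/* Hypothetical flag names modelled on BPF_FIB_LOOKUP_{DIRECT,OUTPUT}. */
#define EXAMPLE_LOOKUP_DIRECT  (1U << 0)
#define EXAMPLE_LOOKUP_OUTPUT  (1U << 1)

/* Returns 1 if flags contains only known bits, 0 otherwise. */
int flags_are_valid(unsigned int flags)
{
	return (flags & ~(EXAMPLE_LOOKUP_DIRECT | EXAMPLE_LOOKUP_OUTPUT)) == 0;
}
```

A userspace program can compile this with no extra includes, which is the
property the BIT()-based definitions lacked.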

Fixes: 87f5fc7e48dd ("bpf: Provide helper to do forwarding lookups in kernel FIB table")
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/uapi/linux/bpf.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2705,8 +2705,8 @@ struct bpf_raw_tracepoint_args {
 /* DIRECT:  Skip the FIB rules and go to FIB table associated with device
  * OUTPUT:  Do lookup from egress perspective; default is ingress
  */
-#define BPF_FIB_LOOKUP_DIRECT  BIT(0)
-#define BPF_FIB_LOOKUP_OUTPUT  BIT(1)
+#define BPF_FIB_LOOKUP_DIRECT  (1U << 0)
+#define BPF_FIB_LOOKUP_OUTPUT  (1U << 1)
 
 enum {
 	BPF_FIB_LKUP_RET_SUCCESS,      /* lookup successful */




* [PATCH 4.19 63/72] bpf: lpm_trie: check left child of last leftmost node for NULL
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (61 preceding siblings ...)
  2019-07-02  8:02 ` [PATCH 4.19 62/72] bpf: simplify definition of BPF_FIB_LOOKUP related flags Greg Kroah-Hartman
@ 2019-07-02  8:02 ` Greg Kroah-Hartman
  2019-07-02  8:02 ` [PATCH 4.19 64/72] bpf: fix nested bpf tracepoints with per-cpu data Greg Kroah-Hartman
                   ` (15 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jonathan Lemon, Martin KaFai Lau,
	Daniel Borkmann

From: Jonathan Lemon <jonathan.lemon@gmail.com>

commit da2577fdd0932ea4eefe73903f1130ee366767d2 upstream.

If the leftmost parent node of the tree does not have a child
on the left side, then trie_get_next_key (and bpftool map dump) will
not look at the child on the right.  This leads to the traversal
missing elements.

Lookup is not affected.

Update selftest to handle this case.

Reproducer:

 bpftool map create /sys/fs/bpf/lpm type lpm_trie key 6 \
     value 1 entries 256 name test_lpm flags 1
 bpftool map update pinned /sys/fs/bpf/lpm key  8 0 0 0  0   0 value 1
 bpftool map update pinned /sys/fs/bpf/lpm key 16 0 0 0  0 128 value 2
 bpftool map dump   pinned /sys/fs/bpf/lpm

Returns only 1 element. (2 expected)
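
The difference between the two walks can be sketched in plain C on a
stripped-down node type. This is an illustration only: struct node, its
id field, and the leftmost_old() / leftmost_fixed() helpers are
hypothetical, and RCU is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define NODE_FLAG_IM 1   /* intermediate node: holds no user data */

struct node {
	int flags;
	int id;                  /* hypothetical label to tell nodes apart */
	struct node *child[2];
};

/* Pre-fix walk: always descends through child[0]. */
struct node *leftmost_old(struct node *root)
{
	struct node *next = NULL;
	struct node *n;

	for (n = root; n;) {
		if (!(n->flags & NODE_FLAG_IM))
			next = n;
		n = n->child[0];
	}
	return next;
}

/* Post-fix walk: a data node with no left child falls back to its right child. */
struct node *leftmost_fixed(struct node *root)
{
	struct node *next = NULL;
	struct node *n;

	for (n = root; n;) {
		if (n->flags & NODE_FLAG_IM) {
			n = n->child[0];
		} else {
			next = n;
			n = n->child[0];
			if (!n)
				n = next->child[1];
		}
	}
	return next;
}
```

With a data node whose only child hangs on the right, which is the shape
the reproducer builds, the old walk stops at the parent while the fixed
walk reaches the child first.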

Fixes: b471f2f1de8b ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE")
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/bpf/lpm_trie.c                      |    9 ++++--
 tools/testing/selftests/bpf/test_lpm_map.c |   41 ++++++++++++++++++++++++++---
 2 files changed, 45 insertions(+), 5 deletions(-)

--- a/kernel/bpf/lpm_trie.c
+++ b/kernel/bpf/lpm_trie.c
@@ -676,9 +676,14 @@ find_leftmost:
 	 * have exact two children, so this function will never return NULL.
 	 */
 	for (node = search_root; node;) {
-		if (!(node->flags & LPM_TREE_NODE_FLAG_IM))
+		if (node->flags & LPM_TREE_NODE_FLAG_IM) {
+			node = rcu_dereference(node->child[0]);
+		} else {
 			next_node = node;
-		node = rcu_dereference(node->child[0]);
+			node = rcu_dereference(node->child[0]);
+			if (!node)
+				node = rcu_dereference(next_node->child[1]);
+		}
 	}
 do_copy:
 	next_key->prefixlen = next_node->prefixlen;
--- a/tools/testing/selftests/bpf/test_lpm_map.c
+++ b/tools/testing/selftests/bpf/test_lpm_map.c
@@ -573,13 +573,13 @@ static void test_lpm_get_next_key(void)
 
 	/* add one more element (total two) */
 	key_p->prefixlen = 24;
-	inet_pton(AF_INET, "192.168.0.0", key_p->data);
+	inet_pton(AF_INET, "192.168.128.0", key_p->data);
 	assert(bpf_map_update_elem(map_fd, key_p, &value, 0) == 0);
 
 	memset(key_p, 0, key_size);
 	assert(bpf_map_get_next_key(map_fd, NULL, key_p) == 0);
 	assert(key_p->prefixlen == 24 && key_p->data[0] == 192 &&
-	       key_p->data[1] == 168 && key_p->data[2] == 0);
+	       key_p->data[1] == 168 && key_p->data[2] == 128);
 
 	memset(next_key_p, 0, key_size);
 	assert(bpf_map_get_next_key(map_fd, key_p, next_key_p) == 0);
@@ -592,7 +592,7 @@ static void test_lpm_get_next_key(void)
 
 	/* Add one more element (total three) */
 	key_p->prefixlen = 24;
-	inet_pton(AF_INET, "192.168.128.0", key_p->data);
+	inet_pton(AF_INET, "192.168.0.0", key_p->data);
 	assert(bpf_map_update_elem(map_fd, key_p, &value, 0) == 0);
 
 	memset(key_p, 0, key_size);
@@ -628,6 +628,41 @@ static void test_lpm_get_next_key(void)
 	assert(bpf_map_get_next_key(map_fd, key_p, next_key_p) == 0);
 	assert(next_key_p->prefixlen == 24 && next_key_p->data[0] == 192 &&
 	       next_key_p->data[1] == 168 && next_key_p->data[2] == 1);
+
+	memcpy(key_p, next_key_p, key_size);
+	assert(bpf_map_get_next_key(map_fd, key_p, next_key_p) == 0);
+	assert(next_key_p->prefixlen == 24 && next_key_p->data[0] == 192 &&
+	       next_key_p->data[1] == 168 && next_key_p->data[2] == 128);
+
+	memcpy(key_p, next_key_p, key_size);
+	assert(bpf_map_get_next_key(map_fd, key_p, next_key_p) == 0);
+	assert(next_key_p->prefixlen == 16 && next_key_p->data[0] == 192 &&
+	       next_key_p->data[1] == 168);
+
+	memcpy(key_p, next_key_p, key_size);
+	assert(bpf_map_get_next_key(map_fd, key_p, next_key_p) == -1 &&
+	       errno == ENOENT);
+
+	/* Add one more element (total five) */
+	key_p->prefixlen = 28;
+	inet_pton(AF_INET, "192.168.1.128", key_p->data);
+	assert(bpf_map_update_elem(map_fd, key_p, &value, 0) == 0);
+
+	memset(key_p, 0, key_size);
+	assert(bpf_map_get_next_key(map_fd, NULL, key_p) == 0);
+	assert(key_p->prefixlen == 24 && key_p->data[0] == 192 &&
+	       key_p->data[1] == 168 && key_p->data[2] == 0);
+
+	memset(next_key_p, 0, key_size);
+	assert(bpf_map_get_next_key(map_fd, key_p, next_key_p) == 0);
+	assert(next_key_p->prefixlen == 28 && next_key_p->data[0] == 192 &&
+	       next_key_p->data[1] == 168 && next_key_p->data[2] == 1 &&
+	       next_key_p->data[3] == 128);
+
+	memcpy(key_p, next_key_p, key_size);
+	assert(bpf_map_get_next_key(map_fd, key_p, next_key_p) == 0);
+	assert(next_key_p->prefixlen == 24 && next_key_p->data[0] == 192 &&
+	       next_key_p->data[1] == 168 && next_key_p->data[2] == 1);
 
 	memcpy(key_p, next_key_p, key_size);
 	assert(bpf_map_get_next_key(map_fd, key_p, next_key_p) == 0);




* [PATCH 4.19 64/72] bpf: fix nested bpf tracepoints with per-cpu data
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (62 preceding siblings ...)
  2019-07-02  8:02 ` [PATCH 4.19 63/72] bpf: lpm_trie: check left child of last leftmost node for NULL Greg Kroah-Hartman
@ 2019-07-02  8:02 ` Greg Kroah-Hartman
  2019-07-02  8:02 ` [PATCH 4.19 65/72] bpf: fix unconnected udp hooks Greg Kroah-Hartman
                   ` (14 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Matt Mullins, Andrii Nakryiko,
	Daniel Borkmann, Alexei Starovoitov

From: Matt Mullins <mmullins@fb.com>

commit 9594dc3c7e71b9f52bee1d7852eb3d4e3aea9e99 upstream.

BPF_PROG_TYPE_RAW_TRACEPOINTs can be executed nested on the same CPU, as
they do not increment bpf_prog_active while executing.

This enables three levels of nesting, to support
  - a kprobe or raw tp or perf event,
  - another one of the above that irq context happens to call, and
  - another one in nmi context
(at most one of which may be a kprobe or perf event).
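
The nesting scheme amounts to a small stack of scratch buffers indexed by
a depth counter. The sketch below models a single CPU in userspace with
ordinary globals; struct scratch, get_slot() and put_slot() are
hypothetical names, and the real code uses per-CPU variables and
this_cpu_inc_return() rather than a plain increment.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NEST 3   /* normal, irq and nmi context */

struct scratch {
	int payload;     /* hypothetical stand-in for struct perf_sample_data */
};

/* One CPU's worth of state; the kernel uses DEFINE_PER_CPU for this. */
static struct scratch slots[MAX_NEST];
static int nest_level;

/* Claim the slot for the current nesting depth, or NULL if too deep. */
struct scratch *get_slot(void)
{
	int level = ++nest_level;

	if (level > MAX_NEST) {
		nest_level--;
		return NULL;
	}
	return &slots[level - 1];
}

/* Release the most recently claimed slot. */
void put_slot(void)
{
	nest_level--;
}
```

Each nested caller gets a distinct buffer, so an inner invocation cannot
overwrite data that an interrupted outer invocation is still using.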

Fixes: 20b9d7ac4852 ("bpf: avoid excessive stack usage for perf_sample_data")
Signed-off-by: Matt Mullins <mmullins@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/trace/bpf_trace.c |  100 +++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 84 insertions(+), 16 deletions(-)

--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -365,8 +365,6 @@ static const struct bpf_func_proto bpf_p
 	.arg4_type	= ARG_CONST_SIZE,
 };
 
-static DEFINE_PER_CPU(struct perf_sample_data, bpf_trace_sd);
-
 static __always_inline u64
 __bpf_perf_event_output(struct pt_regs *regs, struct bpf_map *map,
 			u64 flags, struct perf_sample_data *sd)
@@ -398,24 +396,50 @@ __bpf_perf_event_output(struct pt_regs *
 	return 0;
 }
 
+/*
+ * Support executing tracepoints in normal, irq, and nmi context that each call
+ * bpf_perf_event_output
+ */
+struct bpf_trace_sample_data {
+	struct perf_sample_data sds[3];
+};
+
+static DEFINE_PER_CPU(struct bpf_trace_sample_data, bpf_trace_sds);
+static DEFINE_PER_CPU(int, bpf_trace_nest_level);
 BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,
 	   u64, flags, void *, data, u64, size)
 {
-	struct perf_sample_data *sd = this_cpu_ptr(&bpf_trace_sd);
+	struct bpf_trace_sample_data *sds = this_cpu_ptr(&bpf_trace_sds);
+	int nest_level = this_cpu_inc_return(bpf_trace_nest_level);
 	struct perf_raw_record raw = {
 		.frag = {
 			.size = size,
 			.data = data,
 		},
 	};
+	struct perf_sample_data *sd;
+	int err;
 
-	if (unlikely(flags & ~(BPF_F_INDEX_MASK)))
-		return -EINVAL;
+	if (WARN_ON_ONCE(nest_level > ARRAY_SIZE(sds->sds))) {
+		err = -EBUSY;
+		goto out;
+	}
+
+	sd = &sds->sds[nest_level - 1];
+
+	if (unlikely(flags & ~(BPF_F_INDEX_MASK))) {
+		err = -EINVAL;
+		goto out;
+	}
 
 	perf_sample_data_init(sd, 0, 0);
 	sd->raw = &raw;
 
-	return __bpf_perf_event_output(regs, map, flags, sd);
+	err = __bpf_perf_event_output(regs, map, flags, sd);
+
+out:
+	this_cpu_dec(bpf_trace_nest_level);
+	return err;
 }
 
 static const struct bpf_func_proto bpf_perf_event_output_proto = {
@@ -772,16 +796,48 @@ pe_prog_func_proto(enum bpf_func_id func
 /*
  * bpf_raw_tp_regs are separate from bpf_pt_regs used from skb/xdp
  * to avoid potential recursive reuse issue when/if tracepoints are added
- * inside bpf_*_event_output, bpf_get_stackid and/or bpf_get_stack
+ * inside bpf_*_event_output, bpf_get_stackid and/or bpf_get_stack.
+ *
+ * Since raw tracepoints run despite bpf_prog_active, support concurrent usage
+ * in normal, irq, and nmi context.
  */
-static DEFINE_PER_CPU(struct pt_regs, bpf_raw_tp_regs);
+struct bpf_raw_tp_regs {
+	struct pt_regs regs[3];
+};
+static DEFINE_PER_CPU(struct bpf_raw_tp_regs, bpf_raw_tp_regs);
+static DEFINE_PER_CPU(int, bpf_raw_tp_nest_level);
+static struct pt_regs *get_bpf_raw_tp_regs(void)
+{
+	struct bpf_raw_tp_regs *tp_regs = this_cpu_ptr(&bpf_raw_tp_regs);
+	int nest_level = this_cpu_inc_return(bpf_raw_tp_nest_level);
+
+	if (WARN_ON_ONCE(nest_level > ARRAY_SIZE(tp_regs->regs))) {
+		this_cpu_dec(bpf_raw_tp_nest_level);
+		return ERR_PTR(-EBUSY);
+	}
+
+	return &tp_regs->regs[nest_level - 1];
+}
+
+static void put_bpf_raw_tp_regs(void)
+{
+	this_cpu_dec(bpf_raw_tp_nest_level);
+}
+
 BPF_CALL_5(bpf_perf_event_output_raw_tp, struct bpf_raw_tracepoint_args *, args,
 	   struct bpf_map *, map, u64, flags, void *, data, u64, size)
 {
-	struct pt_regs *regs = this_cpu_ptr(&bpf_raw_tp_regs);
+	struct pt_regs *regs = get_bpf_raw_tp_regs();
+	int ret;
+
+	if (IS_ERR(regs))
+		return PTR_ERR(regs);
 
 	perf_fetch_caller_regs(regs);
-	return ____bpf_perf_event_output(regs, map, flags, data, size);
+	ret = ____bpf_perf_event_output(regs, map, flags, data, size);
+
+	put_bpf_raw_tp_regs();
+	return ret;
 }
 
 static const struct bpf_func_proto bpf_perf_event_output_proto_raw_tp = {
@@ -798,12 +854,18 @@ static const struct bpf_func_proto bpf_p
 BPF_CALL_3(bpf_get_stackid_raw_tp, struct bpf_raw_tracepoint_args *, args,
 	   struct bpf_map *, map, u64, flags)
 {
-	struct pt_regs *regs = this_cpu_ptr(&bpf_raw_tp_regs);
+	struct pt_regs *regs = get_bpf_raw_tp_regs();
+	int ret;
+
+	if (IS_ERR(regs))
+		return PTR_ERR(regs);
 
 	perf_fetch_caller_regs(regs);
 	/* similar to bpf_perf_event_output_tp, but pt_regs fetched differently */
-	return bpf_get_stackid((unsigned long) regs, (unsigned long) map,
-			       flags, 0, 0);
+	ret = bpf_get_stackid((unsigned long) regs, (unsigned long) map,
+			      flags, 0, 0);
+	put_bpf_raw_tp_regs();
+	return ret;
 }
 
 static const struct bpf_func_proto bpf_get_stackid_proto_raw_tp = {
@@ -818,11 +880,17 @@ static const struct bpf_func_proto bpf_g
 BPF_CALL_4(bpf_get_stack_raw_tp, struct bpf_raw_tracepoint_args *, args,
 	   void *, buf, u32, size, u64, flags)
 {
-	struct pt_regs *regs = this_cpu_ptr(&bpf_raw_tp_regs);
+	struct pt_regs *regs = get_bpf_raw_tp_regs();
+	int ret;
+
+	if (IS_ERR(regs))
+		return PTR_ERR(regs);
 
 	perf_fetch_caller_regs(regs);
-	return bpf_get_stack((unsigned long) regs, (unsigned long) buf,
-			     (unsigned long) size, flags, 0);
+	ret = bpf_get_stack((unsigned long) regs, (unsigned long) buf,
+			    (unsigned long) size, flags, 0);
+	put_bpf_raw_tp_regs();
+	return ret;
 }
 
 static const struct bpf_func_proto bpf_get_stack_proto_raw_tp = {




* [PATCH 4.19 65/72] bpf: fix unconnected udp hooks
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (63 preceding siblings ...)
  2019-07-02  8:02 ` [PATCH 4.19 64/72] bpf: fix nested bpf tracepoints with per-cpu data Greg Kroah-Hartman
@ 2019-07-02  8:02 ` Greg Kroah-Hartman
  2019-07-02  8:02 ` [PATCH 4.19 66/72] bpf: udp: Avoid calling reuseports bpf_prog from udp_gro Greg Kroah-Hartman
                   ` (13 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniel Borkmann, Andrey Ignatov,
	Martin KaFai Lau, Martynas Pumputis, Alexei Starovoitov

From: Daniel Borkmann <daniel@iogearbox.net>

commit 983695fa676568fc0fe5ddd995c7267aabc24632 upstream.

Intention of cgroup bind/connect/sendmsg BPF hooks is to act transparently
to applications as also stated in original motivation in 7828f20e3779 ("Merge
branch 'bpf-cgroup-bind-connect'"). When recently integrating the latter
two hooks into Cilium to enable host based load-balancing with Kubernetes,
I ran into the issue that pods couldn't start up as DNS got broken. Kubernetes
typically sets up DNS as a service and is thus subject to load-balancing.

Upon further debugging, it turns out that the cgroupv2 sendmsg BPF hooks API
is currently insufficient and thus not usable as-is for standard applications
shipped with most distros. To break down the issue we ran into with a simple
example:

  # cat /etc/resolv.conf
  nameserver 147.75.207.207
  nameserver 147.75.207.208

For the purpose of a simple test, we set up above IPs as service IPs and
transparently redirect traffic to a different DNS backend server for that
node:

  # cilium service list
  ID   Frontend            Backend
  1    147.75.207.207:53   1 => 8.8.8.8:53
  2    147.75.207.208:53   1 => 8.8.8.8:53

The attached BPF program is basically selecting one of the backends if the
service IP/port matches on the cgroup hook. DNS breaks here, because the
hooks are not transparent enough to applications which have built-in msg_name
address checks:

  # nslookup 1.1.1.1
  ;; reply from unexpected source: 8.8.8.8#53, expected 147.75.207.207#53
  ;; reply from unexpected source: 8.8.8.8#53, expected 147.75.207.208#53
  ;; reply from unexpected source: 8.8.8.8#53, expected 147.75.207.207#53
  [...]
  ;; connection timed out; no servers could be reached

  # dig 1.1.1.1
  ;; reply from unexpected source: 8.8.8.8#53, expected 147.75.207.207#53
  ;; reply from unexpected source: 8.8.8.8#53, expected 147.75.207.208#53
  ;; reply from unexpected source: 8.8.8.8#53, expected 147.75.207.207#53
  [...]

  ; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> 1.1.1.1
  ;; global options: +cmd
  ;; connection timed out; no servers could be reached

For comparison, if none of the service IPs is used, and we tell nslookup
to use 8.8.8.8 directly it works just fine, of course:

  # nslookup 1.1.1.1 8.8.8.8
  1.1.1.1.in-addr.arpa	name = one.one.one.one.

In order to fix this and thus act more transparently to the application,
this needs reverse translation on the recvmsg() side. A minimal fix for this
API is to add similar recvmsg() hooks behind the BPF cgroups static key
such that the program can track state and replace the current sockaddr_in{,6}
with the original service IP. From BPF side, this basically tracks the
service tuple plus socket cookie in an LRU map where the reverse NAT can
then be retrieved via map value as one example. Side-note: the BPF cgroups
static key should be converted to a per-hook static key in future.
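
The state tracking described above can be sketched in userspace C as a
lookup table keyed by socket cookie and backend address. This is only an
illustration under simplifying assumptions: a fixed array replaces the LRU
map, addresses are host-order integers, and IP4(), struct rev_nat_entry,
remember_translation() and reverse_translate() are hypothetical names, not
BPF helpers.

```c
#include <assert.h>
#include <stdint.h>

#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))
#define TABLE_SIZE 16   /* fixed size, no eviction (unlike an LRU map) */

struct rev_nat_entry {
	int used;
	uint64_t cookie;         /* identifies the socket */
	uint32_t backend_ip;     /* where the packet was really sent */
	uint16_t backend_port;
	uint32_t frontend_ip;    /* service address the application used */
	uint16_t frontend_port;
};

static struct rev_nat_entry table[TABLE_SIZE];

/* Send side: record which frontend was rewritten to which backend. */
int remember_translation(uint64_t cookie,
			 uint32_t frontend_ip, uint16_t frontend_port,
			 uint32_t backend_ip, uint16_t backend_port)
{
	for (int i = 0; i < TABLE_SIZE; i++) {
		if (!table[i].used) {
			table[i].used = 1;
			table[i].cookie = cookie;
			table[i].backend_ip = backend_ip;
			table[i].backend_port = backend_port;
			table[i].frontend_ip = frontend_ip;
			table[i].frontend_port = frontend_port;
			return 0;
		}
	}
	return -1;   /* table full */
}

/* Receive side: replace the backend source address with the frontend. */
int reverse_translate(uint64_t cookie, uint32_t *ip, uint16_t *port)
{
	for (int i = 0; i < TABLE_SIZE; i++) {
		const struct rev_nat_entry *e = &table[i];

		if (e->used && e->cookie == cookie &&
		    e->backend_ip == *ip && e->backend_port == *port) {
			*ip = e->frontend_ip;
			*port = e->frontend_port;
			return 1;
		}
	}
	return 0;   /* no entry: leave the address untouched */
}
```

With such a mapping in place, the application sees the reply as coming
from the service address it originally sent to, which is what its source
address check expects.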

Same example after this fix:

  # cilium service list
  ID   Frontend            Backend
  1    147.75.207.207:53   1 => 8.8.8.8:53
  2    147.75.207.208:53   1 => 8.8.8.8:53

Lookups work fine now:

  # nslookup 1.1.1.1
  1.1.1.1.in-addr.arpa    name = one.one.one.one.

  Authoritative answers can be found from:

  # dig 1.1.1.1

  ; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> 1.1.1.1
  ;; global options: +cmd
  ;; Got answer:
  ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 51550
  ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

  ;; OPT PSEUDOSECTION:
  ; EDNS: version: 0, flags:; udp: 512
  ;; QUESTION SECTION:
  ;1.1.1.1.                       IN      A

  ;; AUTHORITY SECTION:
  .                       23426   IN      SOA     a.root-servers.net. nstld.verisign-grs.com. 2019052001 1800 900 604800 86400

  ;; Query time: 17 msec
  ;; SERVER: 147.75.207.207#53(147.75.207.207)
  ;; WHEN: Tue May 21 12:59:38 UTC 2019
  ;; MSG SIZE  rcvd: 111

And from an actual packet level it shows that we're using the back end
server when talking via 147.75.207.20{7,8} front end:

  # tcpdump -i any udp
  [...]
  12:59:52.698732 IP foo.42011 > google-public-dns-a.google.com.domain: 18803+ PTR? 1.1.1.1.in-addr.arpa. (38)
  12:59:52.698735 IP foo.42011 > google-public-dns-a.google.com.domain: 18803+ PTR? 1.1.1.1.in-addr.arpa. (38)
  12:59:52.701208 IP google-public-dns-a.google.com.domain > foo.42011: 18803 1/0/0 PTR one.one.one.one. (67)
  12:59:52.701208 IP google-public-dns-a.google.com.domain > foo.42011: 18803 1/0/0 PTR one.one.one.one. (67)
  [...]

In order to be flexible and to have same semantics as in sendmsg BPF
programs, we only allow return codes in [1,1] range. In the sendmsg case
the program is called if msg->msg_name is present which can be the case
in both, connected and unconnected UDP.

The former only relies on the sockaddr_in{,6} passed via connect(2) if
passed msg->msg_name was NULL. Therefore, on recvmsg side, we act in similar
way to call into the BPF program whenever a non-NULL msg->msg_name was
passed independent of sk->sk_state being TCP_ESTABLISHED or not. Note
that for TCP case, the msg->msg_name is ignored in the regular recvmsg
path and therefore not relevant.

For the case of ip{,v6}_recv_error() paths, picked up via MSG_ERRQUEUE,
the hook is not called. This is intentional as it aligns with the same
semantics as in case of TCP cgroup BPF hooks right now. This might be
better addressed in future through a different bpf_attach_type such
that this case can be distinguished from the regular recvmsg paths,
for example.

Fixes: 1cedee13d25a ("bpf: Hooks for sys_sendmsg")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/bpf-cgroup.h |    8 ++++++++
 include/uapi/linux/bpf.h   |    2 ++
 kernel/bpf/syscall.c       |    8 ++++++++
 kernel/bpf/verifier.c      |   12 ++++++++----
 net/core/filter.c          |    2 ++
 net/ipv4/udp.c             |    4 ++++
 net/ipv6/udp.c             |    4 ++++
 7 files changed, 36 insertions(+), 4 deletions(-)

--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -210,6 +210,12 @@ void bpf_cgroup_storage_release(struct b
 #define BPF_CGROUP_RUN_PROG_UDP6_SENDMSG_LOCK(sk, uaddr, t_ctx)		       \
 	BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_UDP6_SENDMSG, t_ctx)
 
+#define BPF_CGROUP_RUN_PROG_UDP4_RECVMSG_LOCK(sk, uaddr)			\
+	BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_UDP4_RECVMSG, NULL)
+
+#define BPF_CGROUP_RUN_PROG_UDP6_RECVMSG_LOCK(sk, uaddr)			\
+	BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_UDP6_RECVMSG, NULL)
+
 #define BPF_CGROUP_RUN_PROG_SOCK_OPS(sock_ops)				       \
 ({									       \
 	int __ret = 0;							       \
@@ -290,6 +296,8 @@ static inline void bpf_cgroup_storage_fr
 #define BPF_CGROUP_RUN_PROG_INET6_CONNECT_LOCK(sk, uaddr) ({ 0; })
 #define BPF_CGROUP_RUN_PROG_UDP4_SENDMSG_LOCK(sk, uaddr, t_ctx) ({ 0; })
 #define BPF_CGROUP_RUN_PROG_UDP6_SENDMSG_LOCK(sk, uaddr, t_ctx) ({ 0; })
+#define BPF_CGROUP_RUN_PROG_UDP4_RECVMSG_LOCK(sk, uaddr) ({ 0; })
+#define BPF_CGROUP_RUN_PROG_UDP6_RECVMSG_LOCK(sk, uaddr) ({ 0; })
 #define BPF_CGROUP_RUN_PROG_SOCK_OPS(sock_ops) ({ 0; })
 #define BPF_CGROUP_RUN_PROG_DEVICE_CGROUP(type,major,minor,access) ({ 0; })
 
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -172,6 +172,8 @@ enum bpf_attach_type {
 	BPF_CGROUP_UDP4_SENDMSG,
 	BPF_CGROUP_UDP6_SENDMSG,
 	BPF_LIRC_MODE2,
+	BPF_CGROUP_UDP4_RECVMSG = 19,
+	BPF_CGROUP_UDP6_RECVMSG,
 	__MAX_BPF_ATTACH_TYPE
 };
 
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1342,6 +1342,8 @@ bpf_prog_load_check_attach_type(enum bpf
 		case BPF_CGROUP_INET6_CONNECT:
 		case BPF_CGROUP_UDP4_SENDMSG:
 		case BPF_CGROUP_UDP6_SENDMSG:
+		case BPF_CGROUP_UDP4_RECVMSG:
+		case BPF_CGROUP_UDP6_RECVMSG:
 			return 0;
 		default:
 			return -EINVAL;
@@ -1622,6 +1624,8 @@ static int bpf_prog_attach(const union b
 	case BPF_CGROUP_INET6_CONNECT:
 	case BPF_CGROUP_UDP4_SENDMSG:
 	case BPF_CGROUP_UDP6_SENDMSG:
+	case BPF_CGROUP_UDP4_RECVMSG:
+	case BPF_CGROUP_UDP6_RECVMSG:
 		ptype = BPF_PROG_TYPE_CGROUP_SOCK_ADDR;
 		break;
 	case BPF_CGROUP_SOCK_OPS:
@@ -1698,6 +1702,8 @@ static int bpf_prog_detach(const union b
 	case BPF_CGROUP_INET6_CONNECT:
 	case BPF_CGROUP_UDP4_SENDMSG:
 	case BPF_CGROUP_UDP6_SENDMSG:
+	case BPF_CGROUP_UDP4_RECVMSG:
+	case BPF_CGROUP_UDP6_RECVMSG:
 		ptype = BPF_PROG_TYPE_CGROUP_SOCK_ADDR;
 		break;
 	case BPF_CGROUP_SOCK_OPS:
@@ -1744,6 +1750,8 @@ static int bpf_prog_query(const union bp
 	case BPF_CGROUP_INET6_CONNECT:
 	case BPF_CGROUP_UDP4_SENDMSG:
 	case BPF_CGROUP_UDP6_SENDMSG:
+	case BPF_CGROUP_UDP4_RECVMSG:
+	case BPF_CGROUP_UDP6_RECVMSG:
 	case BPF_CGROUP_SOCK_OPS:
 	case BPF_CGROUP_DEVICE:
 		break;
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4342,9 +4342,12 @@ static int check_return_code(struct bpf_
 	struct tnum range = tnum_range(0, 1);
 
 	switch (env->prog->type) {
+	case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
+		if (env->prog->expected_attach_type == BPF_CGROUP_UDP4_RECVMSG ||
+		    env->prog->expected_attach_type == BPF_CGROUP_UDP6_RECVMSG)
+			range = tnum_range(1, 1);
 	case BPF_PROG_TYPE_CGROUP_SKB:
 	case BPF_PROG_TYPE_CGROUP_SOCK:
-	case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
 	case BPF_PROG_TYPE_SOCK_OPS:
 	case BPF_PROG_TYPE_CGROUP_DEVICE:
 		break;
@@ -4360,16 +4363,17 @@ static int check_return_code(struct bpf_
 	}
 
 	if (!tnum_in(range, reg->var_off)) {
+		char tn_buf[48];
+
 		verbose(env, "At program exit the register R0 ");
 		if (!tnum_is_unknown(reg->var_off)) {
-			char tn_buf[48];
-
 			tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off);
 			verbose(env, "has value %s", tn_buf);
 		} else {
 			verbose(env, "has unknown scalar value");
 		}
-		verbose(env, " should have been 0 or 1\n");
+		tnum_strn(tn_buf, sizeof(tn_buf), range);
+		verbose(env, " should have been in %s\n", tn_buf);
 		return -EINVAL;
 	}
 	return 0;
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5558,6 +5558,7 @@ static bool sock_addr_is_valid_access(in
 		case BPF_CGROUP_INET4_BIND:
 		case BPF_CGROUP_INET4_CONNECT:
 		case BPF_CGROUP_UDP4_SENDMSG:
+		case BPF_CGROUP_UDP4_RECVMSG:
 			break;
 		default:
 			return false;
@@ -5568,6 +5569,7 @@ static bool sock_addr_is_valid_access(in
 		case BPF_CGROUP_INET6_BIND:
 		case BPF_CGROUP_INET6_CONNECT:
 		case BPF_CGROUP_UDP6_SENDMSG:
+		case BPF_CGROUP_UDP6_RECVMSG:
 			break;
 		default:
 			return false;
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1720,6 +1720,10 @@ try_again:
 		sin->sin_addr.s_addr = ip_hdr(skb)->saddr;
 		memset(sin->sin_zero, 0, sizeof(sin->sin_zero));
 		*addr_len = sizeof(*sin);
+
+		if (cgroup_bpf_enabled)
+			BPF_CGROUP_RUN_PROG_UDP4_RECVMSG_LOCK(sk,
+							(struct sockaddr *)sin);
 	}
 	if (inet->cmsg_flags)
 		ip_cmsg_recv_offset(msg, sk, skb, sizeof(struct udphdr), off);
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -419,6 +419,10 @@ try_again:
 						    inet6_iif(skb));
 		}
 		*addr_len = sizeof(*sin6);
+
+		if (cgroup_bpf_enabled)
+			BPF_CGROUP_RUN_PROG_UDP6_RECVMSG_LOCK(sk,
+						(struct sockaddr *)sin6);
 	}
 
 	if (np->rxopt.all)




* [PATCH 4.19 66/72] bpf: udp: Avoid calling reuseports bpf_prog from udp_gro
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (64 preceding siblings ...)
  2019-07-02  8:02 ` [PATCH 4.19 65/72] bpf: fix unconnected udp hooks Greg Kroah-Hartman
@ 2019-07-02  8:02 ` Greg Kroah-Hartman
  2019-07-02  8:02 ` [PATCH 4.19 67/72] bpf: udp: ipv6: Avoid running reuseport's bpf_prog from __udp6_lib_err Greg Kroah-Hartman
                   ` (12 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Tom Herbert, Martin KaFai Lau,
	Song Liu, Alexei Starovoitov

From: Martin KaFai Lau <kafai@fb.com>

commit 257a525fe2e49584842c504a92c27097407f778f upstream.

When commit a6024562ffd7 ("udp: Add GRO functions to UDP socket")
added udp[46]_lib_lookup_skb to the udp_gro code path, it broke
the reuseport_select_sock() assumption that skb->data points
to the transport header.

This patch follows an earlier __udp6_lib_err() fix by
passing a NULL skb to avoid calling the reuseport's bpf_prog.

Fixes: a6024562ffd7 ("udp: Add GRO functions to UDP socket")
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/ipv4/udp.c |    6 +++++-
 net/ipv6/udp.c |    2 +-
 2 files changed, 6 insertions(+), 2 deletions(-)

--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -542,7 +542,11 @@ static inline struct sock *__udp4_lib_lo
 struct sock *udp4_lib_lookup_skb(struct sk_buff *skb,
 				 __be16 sport, __be16 dport)
 {
-	return __udp4_lib_lookup_skb(skb, sport, dport, &udp_table);
+	const struct iphdr *iph = ip_hdr(skb);
+
+	return __udp4_lib_lookup(dev_net(skb->dev), iph->saddr, sport,
+				 iph->daddr, dport, inet_iif(skb),
+				 inet_sdif(skb), &udp_table, NULL);
 }
 EXPORT_SYMBOL_GPL(udp4_lib_lookup_skb);
 
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -282,7 +282,7 @@ struct sock *udp6_lib_lookup_skb(struct
 
 	return __udp6_lib_lookup(dev_net(skb->dev), &iph->saddr, sport,
 				 &iph->daddr, dport, inet6_iif(skb),
-				 inet6_sdif(skb), &udp_table, skb);
+				 inet6_sdif(skb), &udp_table, NULL);
 }
 EXPORT_SYMBOL_GPL(udp6_lib_lookup_skb);
 




* [PATCH 4.19 67/72] bpf: udp: ipv6: Avoid running reuseport's bpf_prog from __udp6_lib_err
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (65 preceding siblings ...)
  2019-07-02  8:02 ` [PATCH 4.19 66/72] bpf: udp: Avoid calling reuseport's bpf_prog from udp_gro Greg Kroah-Hartman
@ 2019-07-02  8:02 ` Greg Kroah-Hartman
  2019-07-02  8:02 ` [PATCH 4.19 68/72] arm64: futex: Avoid copying out uninitialised stack in failed cmpxchg() Greg Kroah-Hartman
                   ` (11 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Craig Gallek, Martin KaFai Lau,
	Song Liu, Alexei Starovoitov, Daniel Borkmann

From: Martin KaFai Lau <kafai@fb.com>

commit 4ac30c4b3659efac031818c418beb51e630d512d upstream.

__udp6_lib_err() may be called when handling an ICMPv6 message, for
example ICMPv6 Packet Too Big (type 2).  __udp6_lib_lookup() is then
called, which may call reuseport_select_sock().  reuseport_select_sock()
will call into a bpf_prog (if there is one).

reuseport_select_sock() expects skb->data to point to the transport
header (udphdr in this case).  For example, run_bpf_filter() pulls the
transport header.

However, in the __udp6_lib_err() path, skb->data points to the
ipv6hdr instead of the udphdr.

One option is to pull and push the ipv6hdr in __udp6_lib_err().
Instead of doing this, this patch follows what the original
commit 538950a1b752 ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF")
did for IPv4, which passes a NULL skb pointer to
reuseport_select_sock().
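
The failure mode described above can be sketched in user space. This is
only a toy model under stated assumptions: struct toy_skb,
toy_prog_read_dport() and toy_select_sock() are hypothetical stand-ins,
not kernel code. It shows a program that assumes data points at the UDP
header reading the wrong bytes when data still points at the IPv6 header,
and a selection step that skips the program when handed a NULL skb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for illustration only; none of these are kernel types. */
#define TOY_IPV6_HDR_LEN 40

struct toy_skb {
	const uint8_t *data;	/* models skb->data */
};

/*
 * Models an attached program that assumes data points at the UDP header
 * and reads the destination port (bytes 2..3, network byte order).
 */
static uint16_t toy_prog_read_dport(const struct toy_skb *skb)
{
	return (uint16_t)((skb->data[2] << 8) | skb->data[3]);
}

/*
 * Models the selection step: the program only runs when an skb is
 * supplied; with a NULL skb the choice falls back to the flow hash.
 */
static unsigned int toy_select_sock(const struct toy_skb *skb,
				    uint32_t hash, unsigned int num_socks)
{
	assert(num_socks > 0);

	if (skb)
		return toy_prog_read_dport(skb) % num_socks;

	return hash % num_socks;
}
```

With a buffer holding a 40-byte IPv6 header followed by a UDP header, the
toy program only sees the real destination port when data is advanced
past the IPv6 header. Passing NULL sidesteps the question entirely, which
is the approach the patch takes.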

Fixes: 538950a1b752 ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF")
Cc: Craig Gallek <kraig@google.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Craig Gallek <kraig@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv6/udp.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -482,7 +482,7 @@ void __udp6_lib_err(struct sk_buff *skb,
 	struct net *net = dev_net(skb->dev);
 
 	sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source,
-			       inet6_iif(skb), 0, udptable, skb);
+			       inet6_iif(skb), 0, udptable, NULL);
 	if (!sk) {
 		__ICMP6_INC_STATS(net, __in6_dev_get(skb->dev),
 				  ICMP6_MIB_INERRORS);




* [PATCH 4.19 68/72] arm64: futex: Avoid copying out uninitialised stack in failed cmpxchg()
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (66 preceding siblings ...)
  2019-07-02  8:02 ` [PATCH 4.19 67/72] bpf: udp: ipv6: Avoid running reuseport's bpf_prog from __udp6_lib_err Greg Kroah-Hartman
@ 2019-07-02  8:02 ` Greg Kroah-Hartman
  2019-07-02  8:02 ` [PATCH 4.19 69/72] bpf, arm64: use more scalable stadd over ldxr / stxr loop in xadd Greg Kroah-Hartman
                   ` (10 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Will Deacon

From: Will Deacon <will.deacon@arm.com>

commit 8e4e0ac02b449297b86498ac24db5786ddd9f647 upstream.

Returning an error code from futex_atomic_cmpxchg_inatomic() indicates
that the caller should not make any use of *uval, and should instead act
upon the value of the error code. Although this is implemented
correctly in our futex code, we needlessly copy uninitialised stack to
*uval in the error case, which can easily be avoided.
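
The contract described above (no write to *uval on failure) can be
sketched in user space. toy_cmpxchg() and its inject_fault parameter are
hypothetical; the real helper uses inline assembly and user access
primitives that are not modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical user-space stand-in for an arch cmpxchg helper.
 * inject_fault simulates a failed user access.
 */
static int toy_cmpxchg(uint32_t *uval, uint32_t *addr,
		       uint32_t oldval, uint32_t newval, int inject_fault)
{
	uint32_t val;	/* left uninitialised on the fault path, as in the patch */
	int ret = 0;

	assert(uval && addr);

	if (inject_fault) {
		ret = -EFAULT;
	} else {
		val = *addr;
		if (val == oldval)
			*addr = newval;
	}

	/* Only publish the observed value when the operation succeeded. */
	if (!ret)
		*uval = val;

	return ret;
}
```

On the fault path the caller's *uval keeps whatever it held before the
call, so no uninitialised stack contents are copied out.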

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arm64/include/asm/futex.h |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/arch/arm64/include/asm/futex.h
+++ b/arch/arm64/include/asm/futex.h
@@ -134,7 +134,9 @@ futex_atomic_cmpxchg_inatomic(u32 *uval,
 	: "memory");
 	uaccess_disable();
 
-	*uval = val;
+	if (!ret)
+		*uval = val;
+
 	return ret;
 }
 




* [PATCH 4.19 69/72] bpf, arm64: use more scalable stadd over ldxr / stxr loop in xadd
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (67 preceding siblings ...)
  2019-07-02  8:02 ` [PATCH 4.19 68/72] arm64: futex: Avoid copying out uninitialised stack in failed cmpxchg() Greg Kroah-Hartman
@ 2019-07-02  8:02 ` Greg Kroah-Hartman
  2019-07-02  8:02 ` [PATCH 4.19 70/72] futex: Update comments and docs about return values of arch futex code Greg Kroah-Hartman
                   ` (9 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Daniel Borkmann,
	Jean-Philippe Brucker, Will Deacon, Alexei Starovoitov

From: Daniel Borkmann <daniel@iogearbox.net>

commit 34b8ab091f9ef57a2bb3c8c8359a0a03a8abf2f9 upstream.

Since the ARMv8.1 supplement introduced LSE atomic instructions back in
2016, let's add support for STADD and use it in favor of the LDXR / STXR
loop for the XADD mapping if available. STADD is encoded as an alias for
LDADD with XZR as the destination register, therefore add LDADD to the
instruction encoder along with STADD as a special case and use it in the
JIT for CPUs that advertise LSE atomics in the CPUID register. If the
immediate offset in the BPF XADD insn is 0, then use the dst register
directly instead of a temporary one.
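
The alias relationship can be sketched as a stand-alone encoder. This is
a simplified illustration, not the kernel encoder: the toy_* names are
hypothetical, the base value 0xB8200000 is the one from the insn.h hunk
in this patch, and the field positions (size at bits 31:30, Rs at 20:16,
Rn at 9:5, Rt at 4:0) are assumed from the usual A64 load/store layout.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_LDADD_VALUE	0xB8200000u	/* base value from the insn.h hunk */
#define TOY_REG_ZR	31u		/* register number of XZR/WZR */

/* Size field values assumed for 32-bit and 64-bit accesses. */
enum toy_size { TOY_SIZE_32 = 2, TOY_SIZE_64 = 3 };

/* Encode LDADD Rs, Rt, [Rn] under the field layout assumed above. */
static uint32_t toy_gen_ldadd(uint32_t rt, uint32_t rn, uint32_t rs,
			      enum toy_size size)
{
	uint32_t insn = TOY_LDADD_VALUE;

	assert(rt < 32 && rn < 32 && rs < 32);

	insn &= ~(3u << 30);		/* clear the size field */
	insn |= (uint32_t)size << 30;	/* access size */
	insn |= rs << 16;		/* value to add */
	insn |= rn << 5;		/* address register */
	insn |= rt;			/* register receiving the old value */

	return insn;
}

/* STADD is LDADD with the zero register as the destination. */
static uint32_t toy_gen_stadd(uint32_t rn, uint32_t rs, enum toy_size size)
{
	return toy_gen_ldadd(TOY_REG_ZR, rn, rs, size);
}
```

Under these assumptions, toy_gen_stadd(0, 1, TOY_SIZE_64) evaluates to
0xF821001F and the 32-bit variant to 0xB821001F; the low five bits are
all ones because the old value is discarded into the zero register.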

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/arm64/include/asm/insn.h |    8 ++++++++
 arch/arm64/kernel/insn.c      |   40 ++++++++++++++++++++++++++++++++++++++++
 arch/arm64/net/bpf_jit.h      |    4 ++++
 arch/arm64/net/bpf_jit_comp.c |   28 +++++++++++++++++++---------
 4 files changed, 71 insertions(+), 9 deletions(-)

--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -272,6 +272,7 @@ __AARCH64_INSN_FUNCS(adrp,	0x9F000000, 0
 __AARCH64_INSN_FUNCS(prfm,	0x3FC00000, 0x39800000)
 __AARCH64_INSN_FUNCS(prfm_lit,	0xFF000000, 0xD8000000)
 __AARCH64_INSN_FUNCS(str_reg,	0x3FE0EC00, 0x38206800)
+__AARCH64_INSN_FUNCS(ldadd,	0x3F20FC00, 0xB8200000)
 __AARCH64_INSN_FUNCS(ldr_reg,	0x3FE0EC00, 0x38606800)
 __AARCH64_INSN_FUNCS(ldr_lit,	0xBF000000, 0x18000000)
 __AARCH64_INSN_FUNCS(ldrsw_lit,	0xFF000000, 0x98000000)
@@ -389,6 +390,13 @@ u32 aarch64_insn_gen_load_store_ex(enum
 				   enum aarch64_insn_register state,
 				   enum aarch64_insn_size_type size,
 				   enum aarch64_insn_ldst_type type);
+u32 aarch64_insn_gen_ldadd(enum aarch64_insn_register result,
+			   enum aarch64_insn_register address,
+			   enum aarch64_insn_register value,
+			   enum aarch64_insn_size_type size);
+u32 aarch64_insn_gen_stadd(enum aarch64_insn_register address,
+			   enum aarch64_insn_register value,
+			   enum aarch64_insn_size_type size);
 u32 aarch64_insn_gen_add_sub_imm(enum aarch64_insn_register dst,
 				 enum aarch64_insn_register src,
 				 int imm, enum aarch64_insn_variant variant,
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -734,6 +734,46 @@ u32 aarch64_insn_gen_load_store_ex(enum
 					    state);
 }
 
+u32 aarch64_insn_gen_ldadd(enum aarch64_insn_register result,
+			   enum aarch64_insn_register address,
+			   enum aarch64_insn_register value,
+			   enum aarch64_insn_size_type size)
+{
+	u32 insn = aarch64_insn_get_ldadd_value();
+
+	switch (size) {
+	case AARCH64_INSN_SIZE_32:
+	case AARCH64_INSN_SIZE_64:
+		break;
+	default:
+		pr_err("%s: unimplemented size encoding %d\n", __func__, size);
+		return AARCH64_BREAK_FAULT;
+	}
+
+	insn = aarch64_insn_encode_ldst_size(size, insn);
+
+	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RT, insn,
+					    result);
+
+	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn,
+					    address);
+
+	return aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RS, insn,
+					    value);
+}
+
+u32 aarch64_insn_gen_stadd(enum aarch64_insn_register address,
+			   enum aarch64_insn_register value,
+			   enum aarch64_insn_size_type size)
+{
+	/*
+	 * STADD is simply encoded as an alias for LDADD with XZR as
+	 * the destination register.
+	 */
+	return aarch64_insn_gen_ldadd(AARCH64_INSN_REG_ZR, address,
+				      value, size);
+}
+
 static u32 aarch64_insn_encode_prfm_imm(enum aarch64_insn_prfm_type type,
 					enum aarch64_insn_prfm_target target,
 					enum aarch64_insn_prfm_policy policy,
--- a/arch/arm64/net/bpf_jit.h
+++ b/arch/arm64/net/bpf_jit.h
@@ -100,6 +100,10 @@
 #define A64_STXR(sf, Rt, Rn, Rs) \
 	A64_LSX(sf, Rt, Rn, Rs, STORE_EX)
 
+/* LSE atomics */
+#define A64_STADD(sf, Rn, Rs) \
+	aarch64_insn_gen_stadd(Rn, Rs, A64_SIZE(sf))
+
 /* Add/subtract (immediate) */
 #define A64_ADDSUB_IMM(sf, Rd, Rn, imm12, type) \
 	aarch64_insn_gen_add_sub_imm(Rd, Rn, imm12, \
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -364,7 +364,7 @@ static int build_insn(const struct bpf_i
 	const int i = insn - ctx->prog->insnsi;
 	const bool is64 = BPF_CLASS(code) == BPF_ALU64;
 	const bool isdw = BPF_SIZE(code) == BPF_DW;
-	u8 jmp_cond;
+	u8 jmp_cond, reg;
 	s32 jmp_offset;
 
 #define check_imm(bits, imm) do {				\
@@ -730,18 +730,28 @@ emit_cond_jmp:
 			break;
 		}
 		break;
+
 	/* STX XADD: lock *(u32 *)(dst + off) += src */
 	case BPF_STX | BPF_XADD | BPF_W:
 	/* STX XADD: lock *(u64 *)(dst + off) += src */
 	case BPF_STX | BPF_XADD | BPF_DW:
-		emit_a64_mov_i(1, tmp, off, ctx);
-		emit(A64_ADD(1, tmp, tmp, dst), ctx);
-		emit(A64_LDXR(isdw, tmp2, tmp), ctx);
-		emit(A64_ADD(isdw, tmp2, tmp2, src), ctx);
-		emit(A64_STXR(isdw, tmp2, tmp, tmp3), ctx);
-		jmp_offset = -3;
-		check_imm19(jmp_offset);
-		emit(A64_CBNZ(0, tmp3, jmp_offset), ctx);
+		if (!off) {
+			reg = dst;
+		} else {
+			emit_a64_mov_i(1, tmp, off, ctx);
+			emit(A64_ADD(1, tmp, tmp, dst), ctx);
+			reg = tmp;
+		}
+		if (cpus_have_cap(ARM64_HAS_LSE_ATOMICS)) {
+			emit(A64_STADD(isdw, reg, src), ctx);
+		} else {
+			emit(A64_LDXR(isdw, tmp2, reg), ctx);
+			emit(A64_ADD(isdw, tmp2, tmp2, src), ctx);
+			emit(A64_STXR(isdw, tmp2, reg, tmp3), ctx);
+			jmp_offset = -3;
+			check_imm19(jmp_offset);
+			emit(A64_CBNZ(0, tmp3, jmp_offset), ctx);
+		}
 		break;
 
 	default:




* [PATCH 4.19 70/72] futex: Update comments and docs about return values of arch futex code
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (68 preceding siblings ...)
  2019-07-02  8:02 ` [PATCH 4.19 69/72] bpf, arm64: use more scalable stadd over ldxr / stxr loop in xadd Greg Kroah-Hartman
@ 2019-07-02  8:02 ` Greg Kroah-Hartman
  2019-07-02  8:02 ` [PATCH 4.19 71/72] RDMA: Directly cast the sockaddr union to sockaddr Greg Kroah-Hartman
                   ` (8 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Will Deacon

From: Will Deacon <will.deacon@arm.com>

commit 427503519739e779c0db8afe876c1b33f3ac60ae upstream.

The architecture implementations of 'arch_futex_atomic_op_inuser()' and
'futex_atomic_cmpxchg_inatomic()' are permitted to return only -EFAULT,
-EAGAIN or -ENOSYS in the case of failure.
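
With the failure codes limited to these three, a caller can dispatch on
them exhaustively. The sketch below is an assumption about how a caller
might react, not a copy of the generic futex code; enum toy_action and
toy_classify() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical reactions a caller might choose for each return value. */
enum toy_action {
	TOY_DONE,		/* 0: operation completed */
	TOY_RETRY,		/* -EAGAIN: contention, try again */
	TOY_FAULT_IN,		/* -EFAULT: fault the page in, then retry */
	TOY_UNSUPPORTED,	/* -ENOSYS: give up, operation unavailable */
};

static enum toy_action toy_classify(int ret)
{
	switch (ret) {
	case 0:
		return TOY_DONE;
	case -EAGAIN:
		return TOY_RETRY;
	case -EFAULT:
		return TOY_FAULT_IN;
	case -ENOSYS:
		return TOY_UNSUPPORTED;
	default:
		/* Any other value would violate the documented contract. */
		assert(!"unexpected return value from arch futex helper");
		return TOY_UNSUPPORTED;
	}
}
```

Documenting the closed set of error codes is what makes a switch like
this safe to write without a catch-all recovery path.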

Update the comments in the asm-generic/ implementation and also a stray
reference in the robust futex documentation.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 Documentation/robust-futexes.txt |    3 +--
 include/asm-generic/futex.h      |    8 ++++++--
 2 files changed, 7 insertions(+), 4 deletions(-)

--- a/Documentation/robust-futexes.txt
+++ b/Documentation/robust-futexes.txt
@@ -218,5 +218,4 @@ All other architectures should build jus
 the new syscalls yet.
 
 Architectures need to implement the new futex_atomic_cmpxchg_inatomic()
-inline function before writing up the syscalls (that function returns
--ENOSYS right now).
+inline function before writing up the syscalls.
--- a/include/asm-generic/futex.h
+++ b/include/asm-generic/futex.h
@@ -23,7 +23,9 @@
  *
  * Return:
  * 0 - On success
- * <0 - On error
+ * -EFAULT - User access resulted in a page fault
+ * -EAGAIN - Atomic operation was unable to complete due to contention
+ * -ENOSYS - Operation not supported
  */
 static inline int
 arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval, u32 __user *uaddr)
@@ -85,7 +87,9 @@ out_pagefault_enable:
  *
  * Return:
  * 0 - On success
- * <0 - On error
+ * -EFAULT - User access resulted in a page fault
+ * -EAGAIN - Atomic operation was unable to complete due to contention
+ * -ENOSYS - Function not implemented (only if !HAVE_FUTEX_CMPXCHG)
  */
 static inline int
 futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,




* [PATCH 4.19 71/72] RDMA: Directly cast the sockaddr union to sockaddr
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (69 preceding siblings ...)
  2019-07-02  8:02 ` [PATCH 4.19 70/72] futex: Update comments and docs about return values of arch futex code Greg Kroah-Hartman
@ 2019-07-02  8:02 ` Greg Kroah-Hartman
  2019-07-02  8:02 ` [PATCH 4.19 72/72] tipc: pass tunnel dev as NULL to udp_tunnel(6)_xmit_skb Greg Kroah-Hartman
                   ` (7 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: Greg Kroah-Hartman, stable, Jason Gunthorpe

From: Jason Gunthorpe <jgg@mellanox.com>

commit 641114d2af312d39ca9bbc2369d18a5823da51c6 upstream.

gcc 9 now does allocation size tracking and thinks that passing the member
of a union and then accessing beyond that member's bounds is an overflow.

Instead of using the union member, use the entire union with a cast to
get to the sockaddr. gcc will now know that the memory extends the full
size of the union.
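
The pattern can be reproduced in user space with the standard socket
headers. union toy_addr and toy_fill_v6() are hypothetical; the callee
only loosely models a helper such as rdma_gid2ip() that may write a full
sockaddr_in6 through a struct sockaddr pointer.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Storage large enough for either address family, with no sockaddr member. */
union toy_addr {
	struct sockaddr_in  in4;
	struct sockaddr_in6 in6;
};

/*
 * Hypothetical callee: takes a generic sockaddr pointer but writes a
 * complete sockaddr_in6, which is larger than struct sockaddr.
 */
static void toy_fill_v6(struct sockaddr *out)
{
	struct sockaddr_in6 *in6 = (struct sockaddr_in6 *)out;

	assert(out);
	memset(in6, 0, sizeof(*in6));
	in6->sin6_family = AF_INET6;
	in6->sin6_port = htons(1234);	/* arbitrary port for the example */
}
```

A call such as toy_fill_v6((struct sockaddr *)&addr) hands the compiler a
pointer derived from the whole union, so its size tracking sees storage
for the full sockaddr_in6 rather than only the smaller struct sockaddr
member.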

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/infiniband/core/addr.c           |   10 +++++-----
 drivers/infiniband/hw/ocrdma/ocrdma_ah.c |    5 ++---
 drivers/infiniband/hw/ocrdma/ocrdma_hw.c |    5 ++---
 3 files changed, 9 insertions(+), 11 deletions(-)

--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -716,22 +716,22 @@ int rdma_addr_find_l2_eth_by_grh(const u
 	struct rdma_dev_addr dev_addr;
 	struct resolve_cb_context ctx;
 	union {
-		struct sockaddr     _sockaddr;
 		struct sockaddr_in  _sockaddr_in;
 		struct sockaddr_in6 _sockaddr_in6;
 	} sgid_addr, dgid_addr;
 	int ret;
 
-	rdma_gid2ip(&sgid_addr._sockaddr, sgid);
-	rdma_gid2ip(&dgid_addr._sockaddr, dgid);
+	rdma_gid2ip((struct sockaddr *)&sgid_addr, sgid);
+	rdma_gid2ip((struct sockaddr *)&dgid_addr, dgid);
 
 	memset(&dev_addr, 0, sizeof(dev_addr));
 	dev_addr.bound_dev_if = ndev->ifindex;
 	dev_addr.net = &init_net;
 
 	init_completion(&ctx.comp);
-	ret = rdma_resolve_ip(&sgid_addr._sockaddr, &dgid_addr._sockaddr,
-			      &dev_addr, 1000, resolve_cb, &ctx);
+	ret = rdma_resolve_ip((struct sockaddr *)&sgid_addr,
+			      (struct sockaddr *)&dgid_addr, &dev_addr, 1000,
+			      resolve_cb, &ctx);
 	if (ret)
 		return ret;
 
--- a/drivers/infiniband/hw/ocrdma/ocrdma_ah.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_ah.c
@@ -83,7 +83,6 @@ static inline int set_av_attr(struct ocr
 	struct iphdr ipv4;
 	const struct ib_global_route *ib_grh;
 	union {
-		struct sockaddr     _sockaddr;
 		struct sockaddr_in  _sockaddr_in;
 		struct sockaddr_in6 _sockaddr_in6;
 	} sgid_addr, dgid_addr;
@@ -133,9 +132,9 @@ static inline int set_av_attr(struct ocr
 		ipv4.tot_len = htons(0);
 		ipv4.ttl = ib_grh->hop_limit;
 		ipv4.protocol = nxthdr;
-		rdma_gid2ip(&sgid_addr._sockaddr, sgid);
+		rdma_gid2ip((struct sockaddr *)&sgid_addr, sgid);
 		ipv4.saddr = sgid_addr._sockaddr_in.sin_addr.s_addr;
-		rdma_gid2ip(&dgid_addr._sockaddr, &ib_grh->dgid);
+		rdma_gid2ip((struct sockaddr*)&dgid_addr, &ib_grh->dgid);
 		ipv4.daddr = dgid_addr._sockaddr_in.sin_addr.s_addr;
 		memcpy((u8 *)ah->av + eth_sz, &ipv4, sizeof(struct iphdr));
 	} else {
--- a/drivers/infiniband/hw/ocrdma/ocrdma_hw.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_hw.c
@@ -2499,7 +2499,6 @@ static int ocrdma_set_av_params(struct o
 	u32 vlan_id = 0xFFFF;
 	u8 mac_addr[6], hdr_type;
 	union {
-		struct sockaddr     _sockaddr;
 		struct sockaddr_in  _sockaddr_in;
 		struct sockaddr_in6 _sockaddr_in6;
 	} sgid_addr, dgid_addr;
@@ -2541,8 +2540,8 @@ static int ocrdma_set_av_params(struct o
 
 	hdr_type = rdma_gid_attr_network_type(sgid_attr);
 	if (hdr_type == RDMA_NETWORK_IPV4) {
-		rdma_gid2ip(&sgid_addr._sockaddr, &sgid_attr->gid);
-		rdma_gid2ip(&dgid_addr._sockaddr, &grh->dgid);
+		rdma_gid2ip((struct sockaddr *)&sgid_addr, &sgid_attr->gid);
+		rdma_gid2ip((struct sockaddr *)&dgid_addr, &grh->dgid);
 		memcpy(&cmd->params.dgid[0],
 		       &dgid_addr._sockaddr_in.sin_addr.s_addr, 4);
 		memcpy(&cmd->params.sgid[0],




* [PATCH 4.19 72/72] tipc: pass tunnel dev as NULL to udp_tunnel(6)_xmit_skb
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (70 preceding siblings ...)
  2019-07-02  8:02 ` [PATCH 4.19 71/72] RDMA: Directly cast the sockaddr union to sockaddr Greg Kroah-Hartman
@ 2019-07-02  8:02 ` Greg Kroah-Hartman
  2019-07-02 12:32 ` [PATCH 4.19 00/72] 4.19.57-stable review kernelci.org bot
                   ` (6 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-02  8:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, syzbot+9d4c12bfd45a58738d0a,
	syzbot+a9e23ea2aa21044c2798, syzbot+c4c4b2bb358bb936ad7e,
	syzbot+0290d2290a607e035ba1, syzbot+a43d8d4e7e8a7a9e149e,
	syzbot+a47c5f4c6c00fc1ed16e, Xin Long, David S. Miller

From: Xin Long <lucien.xin@gmail.com>

commit c3bcde026684c62d7a2b6f626dc7cf763833875c upstream.

udp_tunnel(6)_xmit_skb() called by tipc_udp_xmit() expects a tunnel device
to count packets on dev->tstats, a percpu variable. However, TIPC is using
a udp tunnel with no tunnel device, and passes the lower dev, like a veth
device that only initializes dev->lstats (a percpu variable) when it is
created.

Later iptunnel_xmit_stats() called by ip(6)tunnel_xmit() treats the dev as
a tunnel device, and uses dev->tstats instead of dev->lstats. Each tstats
pointer points to a bigger struct than lstats, so when tstats->tx_bytes is
increased, members of other percpu variables could be overwritten.
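
The size mismatch can be illustrated with two toy layouts. These structs
are assumptions made for the example and are only loosely modelled on
per-CPU stats; the real kernel definitions carry additional fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy layouts for illustration; not the kernel's actual definitions. */
struct toy_lstats {
	uint64_t packets;
	uint64_t bytes;
};

struct toy_tstats {
	uint64_t rx_packets;
	uint64_t rx_bytes;
	uint64_t tx_packets;
	uint64_t tx_bytes;
};

/*
 * Returns nonzero if writing tx_bytes through a toy_tstats pointer would
 * land past the end of an allocation sized for toy_lstats.
 */
static int toy_tx_bytes_overruns_lstats(void)
{
	return offsetof(struct toy_tstats, tx_bytes) >=
	       sizeof(struct toy_lstats);
}
```

In this toy layout tx_bytes sits at offset 24 while the smaller struct
ends at byte 16, so an update through the wrong pointer type touches
memory belonging to whatever is allocated next.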

syzbot has reported quite a few crashes caused by the fib_nh_common percpu
member 'nhc_pcpu_rth_output' being overwritten. The call traces look like:

  BUG: KASAN: slab-out-of-bounds in rt_cache_valid+0x158/0x190
  net/ipv4/route.c:1556
    rt_cache_valid+0x158/0x190 net/ipv4/route.c:1556
    __mkroute_output net/ipv4/route.c:2332 [inline]
    ip_route_output_key_hash_rcu+0x819/0x2d50 net/ipv4/route.c:2564
    ip_route_output_key_hash+0x1ef/0x360 net/ipv4/route.c:2393
    __ip_route_output_key include/net/route.h:125 [inline]
    ip_route_output_flow+0x28/0xc0 net/ipv4/route.c:2651
    ip_route_output_key include/net/route.h:135 [inline]
  ...

or:

  kasan: GPF could be caused by NULL-ptr deref or user memory access
  RIP: 0010:dst_dev_put+0x24/0x290 net/core/dst.c:168
    <IRQ>
    rt_fibinfo_free_cpus net/ipv4/fib_semantics.c:200 [inline]
    free_fib_info_rcu+0x2e1/0x490 net/ipv4/fib_semantics.c:217
    __rcu_reclaim kernel/rcu/rcu.h:240 [inline]
    rcu_do_batch kernel/rcu/tree.c:2437 [inline]
    invoke_rcu_callbacks kernel/rcu/tree.c:2716 [inline]
    rcu_process_callbacks+0x100a/0x1ac0 kernel/rcu/tree.c:2697
  ...

The issue has existed since the tunnel stats update was moved to
iptunnel_xmit() by commit 039f50629b7f ("ip_tunnel: Move stats update to
iptunnel_xmit()"). Fix it by passing a NULL tunnel dev to
udp_tunnel(6)_xmit_skb so that packet counting won't happen on
dev->tstats.

Reported-by: syzbot+9d4c12bfd45a58738d0a@syzkaller.appspotmail.com
Reported-by: syzbot+a9e23ea2aa21044c2798@syzkaller.appspotmail.com
Reported-by: syzbot+c4c4b2bb358bb936ad7e@syzkaller.appspotmail.com
Reported-by: syzbot+0290d2290a607e035ba1@syzkaller.appspotmail.com
Reported-by: syzbot+a43d8d4e7e8a7a9e149e@syzkaller.appspotmail.com
Reported-by: syzbot+a47c5f4c6c00fc1ed16e@syzkaller.appspotmail.com
Fixes: 039f50629b7f ("ip_tunnel: Move stats update to iptunnel_xmit()")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/tipc/udp_media.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

--- a/net/tipc/udp_media.c
+++ b/net/tipc/udp_media.c
@@ -176,7 +176,6 @@ static int tipc_udp_xmit(struct net *net
 			goto tx_error;
 		}
 
-		skb->dev = rt->dst.dev;
 		ttl = ip4_dst_hoplimit(&rt->dst);
 		udp_tunnel_xmit_skb(rt, ub->ubsock->sk, skb, src->ipv4.s_addr,
 				    dst->ipv4.s_addr, 0, ttl, 0, src->port,
@@ -195,10 +194,9 @@ static int tipc_udp_xmit(struct net *net
 		if (err)
 			goto tx_error;
 		ttl = ip6_dst_hoplimit(ndst);
-		err = udp_tunnel6_xmit_skb(ndst, ub->ubsock->sk, skb,
-					   ndst->dev, &src->ipv6,
-					   &dst->ipv6, 0, ttl, 0, src->port,
-					   dst->port, false);
+		err = udp_tunnel6_xmit_skb(ndst, ub->ubsock->sk, skb, NULL,
+					   &src->ipv6, &dst->ipv6, 0, ttl, 0,
+					   src->port, dst->port, false);
 #endif
 	}
 	return err;




* Re: [PATCH 4.19 00/72] 4.19.57-stable review
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (71 preceding siblings ...)
  2019-07-02  8:02 ` [PATCH 4.19 72/72] tipc: pass tunnel dev as NULL to udp_tunnel(6)_xmit_skb Greg Kroah-Hartman
@ 2019-07-02 12:32 ` kernelci.org bot
  2019-07-02 16:54 ` Naresh Kamboju
                   ` (5 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: kernelci.org bot @ 2019-07-02 12:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuah, patches,
	ben.hutchings, lkft-triage, stable

stable-rc/linux-4.19.y boot: 131 boots: 2 failed, 128 passed with 1 offline (v4.19.56-72-g828a73287676)

Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.19.y/kernel/v4.19.56-72-g828a73287676/
Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-4.19.y/kernel/v4.19.56-72-g828a73287676/

Tree: stable-rc
Branch: linux-4.19.y
Git Describe: v4.19.56-72-g828a73287676
Git Commit: 828a732876760accbd58e1c3ce70be8b6ae0c03f
Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Tested: 73 unique boards, 26 SoC families, 16 builds out of 206

Boot Failures Detected:

arm:
    sunxi_defconfig:
        gcc-8:
            sun7i-a20-bananapi: 1 failed lab

    multi_v7_defconfig:
        gcc-8:
            sun7i-a20-bananapi: 1 failed lab

Offline Platforms:

arm:

    multi_v7_defconfig:
        gcc-8:
            stih410-b2120: 1 offline lab

---
For more info write to <info@kernelci.org>


* Re: [PATCH 4.19 00/72] 4.19.57-stable review
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (72 preceding siblings ...)
  2019-07-02 12:32 ` [PATCH 4.19 00/72] 4.19.57-stable review kernelci.org bot
@ 2019-07-02 16:54 ` Naresh Kamboju
  2019-07-02 20:23 ` Guenter Roeck
                   ` (4 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Naresh Kamboju @ 2019-07-02 16:54 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: open list, Linus Torvalds, Andrew Morton, Guenter Roeck,
	Shuah Khan, patches, Ben Hutchings, lkft-triage, linux- stable

On Tue, 2 Jul 2019 at 13:36, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> This is the start of the stable review cycle for the 4.19.57 release.
> There are 72 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu 04 Jul 2019 07:59:45 AM UTC.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>         https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.57-rc1.gz
> or in the git tree and branch at:
>         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Summary
------------------------------------------------------------------------

kernel: 4.19.57-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.19.y
git commit: 4d057dfd72c6b6b27f11e499fa7c9fc079fc62ef
git describe: v4.19.56-73-g4d057dfd72c6
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.19-oe/build/v4.19.56-73-g4d057dfd72c6


No regressions (compared to build v4.19.56)

No fixes (compared to build v4.19.56)

Ran 25160 total tests in the following environments and test suites.

Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64

Test Suites
-----------
* build
* install-android-platform-tools-r2600
* kselftest
* libgpiod
* libhugetlbfs
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-cpuhotplug-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-timers-tests
* perf
* spectre-meltdown-checker-test
* v4l2-compliance
* ltp-fs-tests
* network-basic-tests
* ltp-open-posix-tests
* kvm-unit-tests
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none

-- 
Linaro LKFT
https://lkft.linaro.org


* Re: [PATCH 4.19 00/72] 4.19.57-stable review
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (73 preceding siblings ...)
  2019-07-02 16:54 ` Naresh Kamboju
@ 2019-07-02 20:23 ` Guenter Roeck
  2019-07-03 14:46   ` Greg Kroah-Hartman
  2019-07-02 21:08 ` Kelsey Skunberg
                   ` (3 subsequent siblings)
  78 siblings, 1 reply; 84+ messages in thread
From: Guenter Roeck @ 2019-07-02 20:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, shuah, patches, ben.hutchings,
	lkft-triage, stable

On Tue, Jul 02, 2019 at 10:01:01AM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.57 release.
> There are 72 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu 04 Jul 2019 07:59:45 AM UTC.
> Anything received after that time might be too late.
> 
Build results:
	total: 156 pass: 156 fail: 0
Qemu test results:
	total: 364 pass: 364 fail: 0

Guenter


* Re: [PATCH 4.19 00/72] 4.19.57-stable review
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (74 preceding siblings ...)
  2019-07-02 20:23 ` Guenter Roeck
@ 2019-07-02 21:08 ` Kelsey Skunberg
  2019-07-02 22:52 ` shuah
                   ` (2 subsequent siblings)
  78 siblings, 0 replies; 84+ messages in thread
From: Kelsey Skunberg @ 2019-07-02 21:08 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, linux, shuah, patches,
	ben.hutchings, lkft-triage, stable

On Tue, Jul 02, 2019 at 10:01:01AM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.57 release.
> There are 72 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu 04 Jul 2019 07:59:45 AM UTC.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.57-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h


Compiled, booted, and no regressions on my system.

-Kelsey


* Re: [PATCH 4.19 00/72] 4.19.57-stable review
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (75 preceding siblings ...)
  2019-07-02 21:08 ` Kelsey Skunberg
@ 2019-07-02 22:52 ` shuah
  2019-07-03 10:21 ` Jon Hunter
  2019-07-04  5:29 ` Bharath Vedartham
  78 siblings, 0 replies; 84+ messages in thread
From: shuah @ 2019-07-02 22:52 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, linux, patches, ben.hutchings, lkft-triage,
	stable, shuah

On 7/2/19 2:01 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.57 release.
> There are 72 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu 04 Jul 2019 07:59:45 AM UTC.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.57-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

* Re: [PATCH 4.19 26/72] usb: dwc3: gadget: use num_trbs when skipping TRBs on ->dequeue()
  2019-07-02  8:01 ` [PATCH 4.19 26/72] usb: dwc3: gadget: use num_trbs when skipping TRBs on ->dequeue() Greg Kroah-Hartman
@ 2019-07-03  2:03   ` Sasha Levin
  2019-07-03  7:20     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 84+ messages in thread
From: Sasha Levin @ 2019-07-03  2:03 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, Fei Yang, Sam Protsenko, Felipe Balbi,
	linux-usb, Felipe Balbi, John Stultz

On Tue, Jul 02, 2019 at 10:01:27AM +0200, Greg Kroah-Hartman wrote:
>commit c3acd59014148470dc58519870fbc779785b4bf7 upstream
>
>Now that we track how many TRBs a request uses, it's easier to skip
>over them in case of a call to usb_ep_dequeue(). Let's do so and
>simplify the code a bit.
>
>Cc: Fei Yang <fei.yang@intel.com>
>Cc: Sam Protsenko <semen.protsenko@linaro.org>
>Cc: Felipe Balbi <balbi@kernel.org>
>Cc: linux-usb@vger.kernel.org
>Cc: stable@vger.kernel.org # 4.19.y
>Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
>(cherry picked from commit c3acd59014148470dc58519870fbc779785b4bf7)
>Signed-off-by: John Stultz <john.stultz@linaro.org>
>Signed-off-by: Sasha Levin <sashal@kernel.org>

This one has an upstream fix: c7152763f02e05567da27462b2277a554e507c89
("usb: dwc3: Reset num_trbs after skipping").

--
Thanks,
Sasha

* Re: [PATCH 4.19 26/72] usb: dwc3: gadget: use num_trbs when skipping TRBs on ->dequeue()
  2019-07-03  2:03   ` Sasha Levin
@ 2019-07-03  7:20     ` Greg Kroah-Hartman
  2019-07-03 19:59       ` Sasha Levin
  0 siblings, 1 reply; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-03  7:20 UTC (permalink / raw)
  To: Sasha Levin
  Cc: linux-kernel, stable, Fei Yang, Sam Protsenko, Felipe Balbi,
	linux-usb, Felipe Balbi, John Stultz

On Tue, Jul 02, 2019 at 10:03:12PM -0400, Sasha Levin wrote:
> On Tue, Jul 02, 2019 at 10:01:27AM +0200, Greg Kroah-Hartman wrote:
> > commit c3acd59014148470dc58519870fbc779785b4bf7 upstream
> > 
> > Now that we track how many TRBs a request uses, it's easier to skip
> > over them in case of a call to usb_ep_dequeue(). Let's do so and
> > simplify the code a bit.
> > 
> > Cc: Fei Yang <fei.yang@intel.com>
> > Cc: Sam Protsenko <semen.protsenko@linaro.org>
> > Cc: Felipe Balbi <balbi@kernel.org>
> > Cc: linux-usb@vger.kernel.org
> > Cc: stable@vger.kernel.org # 4.19.y
> > Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
> > (cherry picked from commit c3acd59014148470dc58519870fbc779785b4bf7)
> > Signed-off-by: John Stultz <john.stultz@linaro.org>
> > Signed-off-by: Sasha Levin <sashal@kernel.org>
> 
> This one has an upstream fix: c7152763f02e05567da27462b2277a554e507c89
> ("usb: dwc3: Reset num_trbs after skipping").

You were the one who queued this series up :)

I'll go add this one now...

thanks,

greg k-h

* Re: [PATCH 4.19 00/72] 4.19.57-stable review
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (76 preceding siblings ...)
  2019-07-02 22:52 ` shuah
@ 2019-07-03 10:21 ` Jon Hunter
  2019-07-04  5:29 ` Bharath Vedartham
  78 siblings, 0 replies; 84+ messages in thread
From: Jon Hunter @ 2019-07-03 10:21 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, linux, shuah, patches, ben.hutchings,
	lkft-triage, stable, linux-tegra


On 02/07/2019 09:01, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.57 release.
> There are 72 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu 04 Jul 2019 07:59:45 AM UTC.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.57-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

All tests are passing for Tegra ...

Test results for stable-v4.19:
    12 builds:	12 pass, 0 fail
    22 boots:	22 pass, 0 fail
    32 tests:	32 pass, 0 fail

Linux version:	4.19.57-rc1-g4d057dfd72c6
Boards tested:	tegra124-jetson-tk1, tegra186-p2771-0000,
                tegra194-p2972-0000, tegra20-ventana,
                tegra210-p2371-2180, tegra30-cardhu-a04

Cheers
Jon

-- 
nvpublic

* Re: [PATCH 4.19 00/72] 4.19.57-stable review
  2019-07-02 20:23 ` Guenter Roeck
@ 2019-07-03 14:46   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 84+ messages in thread
From: Greg Kroah-Hartman @ 2019-07-03 14:46 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-kernel, torvalds, akpm, shuah, patches, ben.hutchings,
	lkft-triage, stable

On Tue, Jul 02, 2019 at 01:23:00PM -0700, Guenter Roeck wrote:
> On Tue, Jul 02, 2019 at 10:01:01AM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.19.57 release.
> > There are 72 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Thu 04 Jul 2019 07:59:45 AM UTC.
> > Anything received after that time might be too late.
> > 
> Build results:
> 	total: 156 pass: 156 fail: 0
> Qemu test results:
> 	total: 364 pass: 364 fail: 0

Thanks for testing these and letting me know.

greg k-h

* Re: [PATCH 4.19 26/72] usb: dwc3: gadget: use num_trbs when skipping TRBs on ->dequeue()
  2019-07-03  7:20     ` Greg Kroah-Hartman
@ 2019-07-03 19:59       ` Sasha Levin
  0 siblings, 0 replies; 84+ messages in thread
From: Sasha Levin @ 2019-07-03 19:59 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, stable, Fei Yang, Sam Protsenko, Felipe Balbi,
	linux-usb, Felipe Balbi, John Stultz

On Wed, Jul 03, 2019 at 09:20:12AM +0200, Greg Kroah-Hartman wrote:
>On Tue, Jul 02, 2019 at 10:03:12PM -0400, Sasha Levin wrote:
>> On Tue, Jul 02, 2019 at 10:01:27AM +0200, Greg Kroah-Hartman wrote:
>> > commit c3acd59014148470dc58519870fbc779785b4bf7 upstream
>> >
>> > Now that we track how many TRBs a request uses, it's easier to skip
>> > over them in case of a call to usb_ep_dequeue(). Let's do so and
>> > simplify the code a bit.
>> >
>> > Cc: Fei Yang <fei.yang@intel.com>
>> > Cc: Sam Protsenko <semen.protsenko@linaro.org>
>> > Cc: Felipe Balbi <balbi@kernel.org>
>> > Cc: linux-usb@vger.kernel.org
>> > Cc: stable@vger.kernel.org # 4.19.y
>> > Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
>> > (cherry picked from commit c3acd59014148470dc58519870fbc779785b4bf7)
>> > Signed-off-by: John Stultz <john.stultz@linaro.org>
>> > Signed-off-by: Sasha Levin <sashal@kernel.org>
>>
>> This one has an upstream fix: c7152763f02e05567da27462b2277a554e507c89
>> ("usb: dwc3: Reset num_trbs after skipping").
>
>You were the one who queued this series up :)

Indeed, and I'm actually quite happy about this.

Even though I goofed up and didn't notice the fix when it got queued up,
the automation we have in place to catch these cases worked and we were
able to get the fix in as well before release.

--
Thanks,
Sasha

* Re: [PATCH 4.19 00/72] 4.19.57-stable review
  2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
                   ` (77 preceding siblings ...)
  2019-07-03 10:21 ` Jon Hunter
@ 2019-07-04  5:29 ` Bharath Vedartham
  78 siblings, 0 replies; 84+ messages in thread
From: Bharath Vedartham @ 2019-07-04  5:29 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, linux, shuah, patches,
	ben.hutchings, lkft-triage, stable

Tested and booted on my x86 system. No regressions.

Thread overview: 84+ messages
2019-07-02  8:01 [PATCH 4.19 00/72] 4.19.57-stable review Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 01/72] perf ui helpline: Use strlcpy() as a shorter form of strncpy() + explicit set nul Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 02/72] perf help: Remove needless use of strncpy() Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 03/72] perf header: Fix unchecked usage of strncpy() Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 04/72] arm64: Don't unconditionally add -Wno-psabi to KBUILD_CFLAGS Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 05/72] Revert "x86/uaccess, ftrace: Fix ftrace_likely_update() vs. SMAP" Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 06/72] IB/hfi1: Close PSM sdma_progress sleep window Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 07/72] 9p/xen: fix check for xenbus_read error in front_probe Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 08/72] 9p: Use a slab for allocating requests Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 09/72] 9p: embed fcall in req to round down buffer allocs Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 10/72] 9p: add a per-client fcall kmem_cache Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 11/72] 9p: rename p9_free_req() function Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 12/72] 9p: Add refcount to p9_req_t Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 13/72] 9p/rdma: do not disconnect on down_interruptible EAGAIN Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 14/72] 9p: Rename req to rreq in trans_fd Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 15/72] 9p: acl: fix uninitialized iattr access Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 16/72] 9p/rdma: remove useless check in cm_event_handler Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 17/72] 9p: p9dirent_read: check network-provided name length Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 18/72] 9p: potential NULL dereference Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 19/72] 9p/trans_fd: abort p9_read_work if req status changed Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 20/72] 9p/trans_fd: put worker reqs on destroy Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 21/72] net/9p: include trans_common.h to fix missing prototype warning Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 22/72] qmi_wwan: Fix out-of-bounds read Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 23/72] Revert "usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup" Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 24/72] usb: dwc3: gadget: combine unaligned and zero flags Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 25/72] usb: dwc3: gadget: track number of TRBs per request Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 26/72] usb: dwc3: gadget: use num_trbs when skipping TRBs on ->dequeue() Greg Kroah-Hartman
2019-07-03  2:03   ` Sasha Levin
2019-07-03  7:20     ` Greg Kroah-Hartman
2019-07-03 19:59       ` Sasha Levin
2019-07-02  8:01 ` [PATCH 4.19 27/72] usb: dwc3: gadget: extract dwc3_gadget_ep_skip_trbs() Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 28/72] usb: dwc3: gadget: introduce cancelled_list Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 29/72] usb: dwc3: gadget: move requests to cancelled_list Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 30/72] usb: dwc3: gadget: remove wait_end_transfer Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 31/72] usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 32/72] fs/proc/array.c: allow reporting eip/esp for all coredumping threads Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 33/72] mm/mempolicy.c: fix an incorrect rebind node in mpol_rebind_nodemask Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 34/72] fs/binfmt_flat.c: make load_flat_shared_library() work Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 35/72] clk: socfpga: stratix10: fix divider entry for the emac clocks Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 36/72] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 37/72] mm: hugetlb: soft-offline: dissolve_free_huge_page() return zero on !PageHuge Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 38/72] mm/page_idle.c: fix oops because end_pfn is larger than max_pfn Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 39/72] dm log writes: make sure super sector log updates are written in order Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 40/72] scsi: vmw_pscsi: Fix use-after-free in pvscsi_queue_lck() Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 41/72] x86/speculation: Allow guests to use SSBD even if host does not Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 42/72] x86/microcode: Fix the microcode load on CPU hotplug for real Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 43/72] x86/resctrl: Prevent possible overrun during bitmap operations Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 44/72] KVM: x86/mmu: Allocate PAE root array when using SVM's 32-bit NPT Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 45/72] NFS/flexfiles: Use the correct TCP timeout for flexfiles I/O Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 46/72] cpu/speculation: Warn on unsupported mitigations= parameter Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 47/72] SUNRPC: Clean up initialisation of the struct rpc_rqst Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 48/72] irqchip/mips-gic: Use the correct local interrupt map registers Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 49/72] eeprom: at24: fix unexpected timeout under high load Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 50/72] af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 51/72] bonding: Always enable vlan tx offload Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 52/72] ipv4: Use return value of inet_iif() for __raw_v4_lookup in the while loop Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 53/72] net/packet: fix memory leak in packet_set_ring() Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 54/72] net: remove duplicate fetch in sock_getsockopt Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 55/72] net: stmmac: fixed new system time seconds value calculation Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 56/72] net: stmmac: set IC bit when transmitting frames with HW timestamp Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 57/72] sctp: change to hold sk after auth shkey is created successfully Greg Kroah-Hartman
2019-07-02  8:01 ` [PATCH 4.19 58/72] team: Always enable vlan tx offload Greg Kroah-Hartman
2019-07-02  8:02 ` [PATCH 4.19 59/72] tipc: change to use register_pernet_device Greg Kroah-Hartman
2019-07-02  8:02 ` [PATCH 4.19 60/72] tipc: check msg->req data len in tipc_nl_compat_bearer_disable Greg Kroah-Hartman
2019-07-02  8:02 ` [PATCH 4.19 61/72] tun: wake up waitqueues after IFF_UP is set Greg Kroah-Hartman
2019-07-02  8:02 ` [PATCH 4.19 62/72] bpf: simplify definition of BPF_FIB_LOOKUP related flags Greg Kroah-Hartman
2019-07-02  8:02 ` [PATCH 4.19 63/72] bpf: lpm_trie: check left child of last leftmost node for NULL Greg Kroah-Hartman
2019-07-02  8:02 ` [PATCH 4.19 64/72] bpf: fix nested bpf tracepoints with per-cpu data Greg Kroah-Hartman
2019-07-02  8:02 ` [PATCH 4.19 65/72] bpf: fix unconnected udp hooks Greg Kroah-Hartman
2019-07-02  8:02 ` [PATCH 4.19 66/72] bpf: udp: Avoid calling reuseport's bpf_prog from udp_gro Greg Kroah-Hartman
2019-07-02  8:02 ` [PATCH 4.19 67/72] bpf: udp: ipv6: Avoid running reuseport's bpf_prog from __udp6_lib_err Greg Kroah-Hartman
2019-07-02  8:02 ` [PATCH 4.19 68/72] arm64: futex: Avoid copying out uninitialised stack in failed cmpxchg() Greg Kroah-Hartman
2019-07-02  8:02 ` [PATCH 4.19 69/72] bpf, arm64: use more scalable stadd over ldxr / stxr loop in xadd Greg Kroah-Hartman
2019-07-02  8:02 ` [PATCH 4.19 70/72] futex: Update comments and docs about return values of arch futex code Greg Kroah-Hartman
2019-07-02  8:02 ` [PATCH 4.19 71/72] RDMA: Directly cast the sockaddr union to sockaddr Greg Kroah-Hartman
2019-07-02  8:02 ` [PATCH 4.19 72/72] tipc: pass tunnel dev as NULL to udp_tunnel(6)_xmit_skb Greg Kroah-Hartman
2019-07-02 12:32 ` [PATCH 4.19 00/72] 4.19.57-stable review kernelci.org bot
2019-07-02 16:54 ` Naresh Kamboju
2019-07-02 20:23 ` Guenter Roeck
2019-07-03 14:46   ` Greg Kroah-Hartman
2019-07-02 21:08 ` Kelsey Skunberg
2019-07-02 22:52 ` shuah
2019-07-03 10:21 ` Jon Hunter
2019-07-04  5:29 ` Bharath Vedartham
