From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Eric Dumazet <edumazet@google.com>,
	Jason Baron <jbaron@akamai.com>,
	Vladimir Rutsky <rutsky@google.com>,
	Soheil Hassas Yeganeh <soheil@google.com>,
	Neal Cardwell <ncardwell@google.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 5.2 07/94] tcp: remove empty skb from write queue in error cases
Date: Sun,  8 Sep 2019 13:41:03 +0100
Message-ID: <20190908121150.642000321@linuxfoundation.org>
In-Reply-To: <20190908121150.420989666@linuxfoundation.org>

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit fdfc5c8594c24c5df883583ebd286321a80e0a67 ]

Vladimir Rutsky reported stuck TCP sessions after memory pressure
events. An edge-triggered epoll() user would never receive the EPOLLOUT
notification that allows them to retry a sendmsg().

Jason tested the case of sk_stream_alloc_skb() returning NULL,
but there are other paths that could lead both sendmsg() and sendpage()
to return -1 (EAGAIN), with an empty skb queued on the write queue.

This patch makes sure we remove this empty skb so that
Jason's code can detect that the queue is empty, and
call sk->sk_write_space(sk) accordingly.
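
For readers less familiar with the user-space side, the sketch below
shows the edge-triggered pattern that depends on this wakeup. It is an
illustration only, not taken from the report: the helper names
(set_nonblocking, watch_for_epollout, fill_until_eagain, drain_all,
wait_writable) are hypothetical, and it is meant to be run against a
local AF_UNIX socketpair rather than a TCP socket under memory
pressure. The sender writes until EAGAIN and then relies on a fresh
EPOLLOUT edge before retrying; if that edge never arrives, the session
is stuck.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Create an epoll instance watching fd for edge-triggered EPOLLOUT. */
static int watch_for_epollout(int fd)
{
	struct epoll_event ev = { .events = EPOLLOUT | EPOLLET };
	int epfd = epoll_create1(0);

	if (epfd < 0)
		return -1;
	ev.data.fd = fd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
		close(epfd);
		return -1;
	}
	return epfd;
}

/* Send until the kernel reports EAGAIN; returns bytes queued, or -1. */
static long fill_until_eagain(int fd)
{
	char buf[4096] = { 0 };
	long total = 0;

	for (;;) {
		ssize_t n = send(fd, buf, sizeof(buf), MSG_NOSIGNAL);

		if (n > 0) {
			total += n;
			continue;
		}
		if (n < 0 && errno == EINTR)
			continue;
		if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
			return total;
		return -1;
	}
}

/* Read everything currently queued on a non-blocking fd. */
static long drain_all(int fd)
{
	char buf[4096];
	long total = 0;

	for (;;) {
		ssize_t n = recv(fd, buf, sizeof(buf), 0);

		if (n > 0) {
			total += n;
			continue;
		}
		if (n < 0 && errno == EINTR)
			continue;
		if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
			return total;
		return n == 0 ? total : -1;
	}
}

/* Returns 1 if EPOLLOUT was reported, 0 on timeout, -1 on error. */
static int wait_writable(int epfd, int timeout_ms)
{
	struct epoll_event ev;
	int n = epoll_wait(epfd, &ev, 1, timeout_ms);

	if (n < 0)
		return -1;
	return (n == 1 && (ev.events & EPOLLOUT)) ? 1 : 0;
}
```

A caller would typically loop: fill_until_eagain(), then
wait_writable(), then retry the send. With EPOLLET there is no
level-triggered fallback, so the retry depends entirely on the kernel
calling its write-space callback once room is available again.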

Fixes: ce5ec440994b ("tcp: ensure epoll edge trigger wakeup when write queue is empty")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jason Baron <jbaron@akamai.com>
Reported-by: Vladimir Rutsky <rutsky@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/tcp.c |   30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -935,6 +935,22 @@ static int tcp_send_mss(struct sock *sk,
 	return mss_now;
 }
 
+/* In some cases, both sendpage() and sendmsg() could have added
+ * an skb to the write queue, but failed adding payload on it.
+ * We need to remove it to consume less memory, but more
+ * importantly be able to generate EPOLLOUT for Edge Trigger epoll()
+ * users.
+ */
+static void tcp_remove_empty_skb(struct sock *sk, struct sk_buff *skb)
+{
+	if (skb && !skb->len) {
+		tcp_unlink_write_queue(skb, sk);
+		if (tcp_write_queue_empty(sk))
+			tcp_chrono_stop(sk, TCP_CHRONO_BUSY);
+		sk_wmem_free_skb(sk, skb);
+	}
+}
+
 ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
 			 size_t size, int flags)
 {
@@ -1064,6 +1080,7 @@ out:
 	return copied;
 
 do_error:
+	tcp_remove_empty_skb(sk, tcp_write_queue_tail(sk));
 	if (copied)
 		goto out;
 out_err:
@@ -1388,18 +1405,11 @@ out_nopush:
 	sock_zerocopy_put(uarg);
 	return copied + copied_syn;
 
+do_error:
+	skb = tcp_write_queue_tail(sk);
 do_fault:
-	if (!skb->len) {
-		tcp_unlink_write_queue(skb, sk);
-		/* It is the one place in all of TCP, except connection
-		 * reset, where we can be unlinking the send_head.
-		 */
-		if (tcp_write_queue_empty(sk))
-			tcp_chrono_stop(sk, TCP_CHRONO_BUSY);
-		sk_wmem_free_skb(sk, skb);
-	}
+	tcp_remove_empty_skb(sk, skb);
 
-do_error:
 	if (copied + copied_syn)
 		goto out;
 out_err:
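
To make the effect of the new helper easier to follow outside the
kernel tree, here is a small user-space model of the same logic. It is
a sketch under stated assumptions, not kernel code: toy_skb, toy_queue
and the toy_* functions are hypothetical stand-ins for struct sk_buff,
the socket write queue, tcp_write_queue_tail(), tcp_unlink_write_queue()
and sk_wmem_free_skb(), and the "busy" flag loosely models the
TCP_CHRONO_BUSY accounting.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sk_buff: only the payload length. */
struct toy_skb {
	size_t len;
	struct toy_skb *prev;
};

/* Hypothetical stand-in for the socket write queue. */
struct toy_queue {
	struct toy_skb *tail;
	size_t count;
	bool busy;	/* loosely models TCP_CHRONO_BUSY being active */
};

/* Queue a new buffer at the tail; returns it, or NULL on allocation failure. */
static struct toy_skb *toy_queue_append(struct toy_queue *q, size_t len)
{
	struct toy_skb *skb = malloc(sizeof(*skb));

	if (!skb)
		return NULL;
	skb->len = len;
	skb->prev = q->tail;
	q->tail = skb;
	q->count++;
	q->busy = true;
	return skb;
}

/*
 * Model of tcp_remove_empty_skb(sk, tcp_write_queue_tail(sk)):
 * drop the tail buffer only if it exists and carries no payload,
 * and clear the busy state once the queue is empty.
 * Returns true if a buffer was removed.
 */
static bool toy_remove_empty_tail(struct toy_queue *q)
{
	struct toy_skb *skb = q->tail;

	if (!skb || skb->len)
		return false;

	q->tail = skb->prev;		/* unlink from the queue */
	q->count--;
	if (q->count == 0)
		q->busy = false;	/* queue is empty again */
	free(skb);			/* release the buffer */
	return true;
}
```

In this model, an error path that leaves a zero-length buffer at the
tail would call toy_remove_empty_tail() before returning, so that a
later "is the queue empty?" check sees the true state of the queue.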


