From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Eric Dumazet <edumazet@google.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 3.14 03/92] tcp: avoid looping in tcp_send_fin()
Date: Sat,  2 May 2015 21:02:18 +0200
Message-ID: <20150502190110.551132561@linuxfoundation.org>
In-Reply-To: <20150502190109.683061482@linuxfoundation.org>

3.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 845704a535e9b3c76448f52af1b70e4422ea03fd ]

The presence of an unbounded loop in tcp_send_fin() has always been hard
to explain when analyzing crash dumps involving gigantic dying processes
with millions of sockets.

Let's try a different strategy:

Under memory pressure, try to add the FIN flag to the last packet
in the write queue, even if that packet was already sent. The TCP stack
will be able to deliver this FIN after a timeout event. Note that because
this FIN is delivered by a retransmit, it also carries a Push flag
given our current implementation.

By checking sk_under_memory_pressure(), we anticipate that cooking
many FIN packets might deplete tcp memory.

If we cannot allocate a packet, even with a __GFP_WAIT allocation,
then not sending a FIN seems quite reasonable if it lets us get rid
of this socket, free memory, and not block the process from
eventually doing other useful work.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/tcp_output.c |   50 +++++++++++++++++++++++++++++---------------------
 1 file changed, 29 insertions(+), 21 deletions(-)
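
Editorial note (not part of the upstream commit): the decision flow
described in the changelog can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not kernel code.
The names model_sock, fin_action, model_coalesce() and model_send_fin()
are hypothetical, and the boolean fields stand in for the results of
tcp_write_queue_tail(), tcp_send_head(), sk_under_memory_pressure() and
alloc_skb_fclone().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of the patched tcp_send_fin() flow. */
enum fin_action {
	FIN_COALESCED_UNSENT,	/* FIN added to an unsent tail, pushed now */
	FIN_COALESCED_SENT,	/* FIN added to a sent tail, left to a retransmit */
	FIN_NEW_PACKET,		/* dedicated FIN packet queued */
	FIN_DROPPED		/* no tail and no memory: give up */
};

struct model_sock {
	bool has_tail;		/* stands in for tcp_write_queue_tail(sk) != NULL */
	bool has_send_head;	/* stands in for tcp_send_head(sk) != NULL */
	bool memory_pressure;	/* stands in for sk_under_memory_pressure(sk) */
	bool alloc_ok;		/* stands in for alloc_skb_fclone() succeeding */
	unsigned int write_seq;
	unsigned int snd_nxt;
	unsigned int tail_end_seq;
};

/* Mirrors the "coalesce:" label: tack the FIN onto the tail packet. */
static enum fin_action model_coalesce(struct model_sock *sk)
{
	sk->tail_end_seq++;
	sk->write_seq++;
	if (!sk->has_send_head) {
		/* Tail was already sent: pretend the FIN went out with it. */
		sk->snd_nxt++;
		return FIN_COALESCED_SENT;
	}
	return FIN_COALESCED_UNSENT;
}

static enum fin_action model_send_fin(struct model_sock *sk)
{
	assert(sk != NULL);

	if (sk->has_tail && (sk->has_send_head || sk->memory_pressure))
		return model_coalesce(sk);

	/* Single allocation attempt, no retry loop. */
	if (!sk->alloc_ok)
		return sk->has_tail ? model_coalesce(sk) : FIN_DROPPED;

	sk->write_seq++;	/* the FIN consumes one sequence number */
	return FIN_NEW_PACKET;
}
```

In this model, a socket whose tail packet was already sent and which is
under memory pressure yields FIN_COALESCED_SENT, with snd_nxt advanced by
one. A socket with an empty write queue and a failed allocation yields
FIN_DROPPED, which is the "eventually give up" case.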

--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2597,7 +2597,8 @@ begin_fwd:
 
 /* We allow to exceed memory limits for FIN packets to expedite
  * connection tear down and (memory) recovery.
- * Otherwise tcp_send_fin() could loop forever.
+ * Otherwise tcp_send_fin() could be tempted to either delay FIN
+ * or even be forced to close flow without any FIN.
  */
 static void sk_forced_wmem_schedule(struct sock *sk, int size)
 {
@@ -2610,33 +2611,40 @@ static void sk_forced_wmem_schedule(stru
 	sk_memory_allocated_add(sk, amt, &status);
 }
 
-/* Send a fin.  The caller locks the socket for us.  This cannot be
- * allowed to fail queueing a FIN frame under any circumstances.
+/* Send a FIN. The caller locks the socket for us.
+ * We should try to send a FIN packet really hard, but eventually give up.
  */
 void tcp_send_fin(struct sock *sk)
 {
+	struct sk_buff *skb, *tskb = tcp_write_queue_tail(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
-	struct sk_buff *skb = tcp_write_queue_tail(sk);
-	int mss_now;
 
-	/* Optimization, tack on the FIN if we have a queue of
-	 * unsent frames.  But be careful about outgoing SACKS
-	 * and IP options.
+	/* Optimization, tack on the FIN if we have one skb in write queue and
+	 * this skb was not yet sent, or we are under memory pressure.
+	 * Note: in the latter case, FIN packet will be sent after a timeout,
+	 * as TCP stack thinks it has already been transmitted.
 	 */
-	mss_now = tcp_current_mss(sk);
-
-	if (tcp_send_head(sk) != NULL) {
-		TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_FIN;
-		TCP_SKB_CB(skb)->end_seq++;
+	if (tskb && (tcp_send_head(sk) || sk_under_memory_pressure(sk))) {
+coalesce:
+		TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;
+		TCP_SKB_CB(tskb)->end_seq++;
 		tp->write_seq++;
+		if (!tcp_send_head(sk)) {
+			/* This means tskb was already sent.
+			 * Pretend we included the FIN on previous transmit.
+			 * We need to set tp->snd_nxt to the value it would have
+			 * if FIN had been sent. This is because retransmit path
+			 * does not change tp->snd_nxt.
+			 */
+			tp->snd_nxt++;
+			return;
+		}
 	} else {
-		/* Socket is locked, keep trying until memory is available. */
-		for (;;) {
-			skb = alloc_skb_fclone(MAX_TCP_HEADER,
-					       sk->sk_allocation);
-			if (skb)
-				break;
-			yield();
+		skb = alloc_skb_fclone(MAX_TCP_HEADER, sk->sk_allocation);
+		if (unlikely(!skb)) {
+			if (tskb)
+				goto coalesce;
+			return;
 		}
 		skb_reserve(skb, MAX_TCP_HEADER);
 		sk_forced_wmem_schedule(sk, skb->truesize);
@@ -2645,7 +2653,7 @@ void tcp_send_fin(struct sock *sk)
 				     TCPHDR_ACK | TCPHDR_FIN);
 		tcp_queue_skb(sk, skb);
 	}
-	__tcp_push_pending_frames(sk, mss_now, TCP_NAGLE_OFF);
+	__tcp_push_pending_frames(sk, tcp_current_mss(sk), TCP_NAGLE_OFF);
 }
 
 /* We get here when a process closes a file descriptor (either due to


