All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Neal Cardwell <ncardwell@google.com>,
	Yuchung Cheng <ycheng@google.com>,
	Eric Dumazet <edumazet@google.com>,
	"David S. Miller" <davem@davemloft.net>,
	Vinson Lee <vlee@freedesktop.org>
Subject: [PATCH 4.4 52/57] tcp: fix tcp_mark_head_lost to check skb len before fragmenting
Date: Thu, 13 Jul 2017 17:43:07 +0200	[thread overview]
Message-ID: <20170713154001.128393519@linuxfoundation.org> (raw)
In-Reply-To: <20170713153957.515045341@linuxfoundation.org>

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Neal Cardwell <ncardwell@google.com>

commit d88270eef4b56bd7973841dd1fed387ccfa83709 upstream.

This commit fixes a corner case in tcp_mark_head_lost() which was
causing the WARN_ON(len > skb->len) in tcp_fragment() to fire.

tcp_mark_head_lost() was assuming that if a packet has
tcp_skb_pcount(skb) of N, then it's safe to fragment off a prefix of
M*mss bytes, for any M < N. But with the tricky way TCP pcounts are
maintained, this is not always true.

For example, suppose the sender sends 4 1-byte packets and have the
last 3 packet sacked. It will merge the last 3 packets in the write
queue into an skb with pcount = 3 and len = 3 bytes. If another
recovery happens after a sack reneging event, tcp_mark_head_lost()
may attempt to split the skb assuming it has more than 2*MSS bytes.

This sounds very counterintuitive, but as the commit description for
the related commit c0638c247f55 ("tcp: don't fragment SACKed skbs in
tcp_mark_head_lost()") notes, this is because tcp_shifted_skb()
coalesces adjacent regions of SACKed skbs, and when doing this it
preserves the sum of their packet counts in order to reflect the
real-world dynamics on the wire. The c0638c247f55 commit tried to
avoid problems by not fragmenting SACKed skbs, since SACKed skbs are
where the non-proportionality between pcount and skb->len/mss is known
to be possible. However, that commit did not handle the case where
during a reneging event one of these weird SACKed skbs becomes an
un-SACKed skb, which tcp_mark_head_lost() can then try to fragment.

The fix is to simply mark the entire skb lost when this happens.
This makes the recovery slightly more aggressive in such corner
cases before we detect reordering. But once we detect reordering
this code path is by-passed because FACK is disabled.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 net/ipv4/tcp_input.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2165,8 +2165,7 @@ static void tcp_mark_head_lost(struct so
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *skb;
-	int cnt, oldcnt;
-	int err;
+	int cnt, oldcnt, lost;
 	unsigned int mss;
 	/* Use SACK to deduce losses of new sequences sent during recovery */
 	const u32 loss_high = tcp_is_sack(tp) ?  tp->snd_nxt : tp->high_seq;
@@ -2206,9 +2205,10 @@ static void tcp_mark_head_lost(struct so
 				break;
 
 			mss = tcp_skb_mss(skb);
-			err = tcp_fragment(sk, skb, (packets - oldcnt) * mss,
-					   mss, GFP_ATOMIC);
-			if (err < 0)
+			/* If needed, chop off the prefix to mark as lost. */
+			lost = (packets - oldcnt) * mss;
+			if (lost < skb->len &&
+			    tcp_fragment(sk, skb, lost, mss, GFP_ATOMIC) < 0)
 				break;
 			cnt = packets;
 		}

  parent reply	other threads:[~2017-07-13 15:45 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-13 15:42 [PATCH 4.4 00/57] 4.4.77-stable review Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 01/57] fs: add a VALID_OPEN_FLAGS Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 02/57] fs: completely ignore unknown open flags Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 03/57] driver core: platform: fix race condition with driver_override Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 05/57] mm: fix classzone_idx underflow in shrink_zones() Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 06/57] tracing/kprobes: Allow to create probe with a module name starting with a digit Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 08/57] usb: dwc3: replace %p with %pK Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 09/57] USB: serial: cp210x: add ID for CEL EM3588 USB ZigBee stick Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 10/57] Add USB quirk for HVR-950q to avoid intermittent device resets Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 11/57] usb: usbip: set buffer pointers to NULL after free Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 12/57] usb: Fix typo in the definition of Endpoint[out]Request Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 13/57] mac80211_hwsim: Replace bogus hrtimer clockid Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 14/57] sysctl: dont print negative flag for proc_douintvec Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 15/57] sysctl: report EINVAL if value is larger than UINT_MAX " Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 16/57] pinctrl: sh-pfc: r8a7791: Fix SCIF2 pinmux data Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 17/57] pinctrl: sh-pfc: r8a7791: Add missing DVC_MUTE signal Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 18/57] pinctrl: meson: meson8b: fix the NAND DQS pins Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 19/57] pinctrl: sunxi: Fix SPDIF function name for A83T Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 21/57] pinctrl: sh-pfc: Update info pointer after SoC-specific init Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 24/57] gfs2: Fix glock rhashtable rcu bug Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 26/57] x86/uaccess: Optimize copy_user_enhanced_fast_string() for short strings Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 27/57] ath10k: override CE5 config for QCA9377 Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 28/57] KEYS: Fix an error code in request_master_key() Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 29/57] RDMA/uverbs: Check port number supplied by user verbs cmds Greg Kroah-Hartman
2017-07-13 15:54   ` Ismail, Mustafa
     [not found]     ` <5C5647B5F4794941BAAC07CF28785CBC33E4E81F-96pTJSsuoYSkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-07-13 16:25       ` Greg Kroah-Hartman
2017-07-13 16:25         ` Greg Kroah-Hartman
     [not found]         ` <20170713162540.GB22870-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2017-07-13 18:44           ` Ismail, Mustafa
2017-07-13 18:44             ` Ismail, Mustafa
     [not found]             ` <5C5647B5F4794941BAAC07CF28785CBC33E4EC25-96pTJSsuoYSkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-07-14  6:49               ` Greg Kroah-Hartman
2017-07-14  6:49                 ` Greg Kroah-Hartman
     [not found]                 ` <20170714064908.GC16816-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2017-07-14 14:54                   ` Ismail, Mustafa
2017-07-14 14:54                     ` Ismail, Mustafa
2017-07-17 17:30           ` Marciniszyn, Mike
2017-07-17 17:30             ` Marciniszyn, Mike
2017-07-17 19:22             ` Greg Kroah-Hartman
2017-07-17 19:24               ` Marciniszyn, Mike
2017-07-13 15:42 ` [PATCH 4.4 30/57] mqueue: fix a use-after-free in sys_mq_notify() Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 31/57] tools include: Add a __fallthrough statement Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 32/57] tools string: Use __fallthrough in perf_atoll() Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 33/57] tools strfilter: Use __fallthrough Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 34/57] perf top: " Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 35/57] perf intel-pt: " Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 36/57] perf thread_map: Correctly size buffer used with dirent->dt_name Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 37/57] perf scripting perl: Fix compile error with some perl5 versions Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 38/57] perf tests: Avoid possible truncation with dirent->d_name + snprintf Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 39/57] perf bench numa: Avoid possible truncation when using snprintf() Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 40/57] perf tools: Use readdir() instead of deprecated readdir_r() Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 41/57] perf thread_map: " Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 42/57] perf script: " Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 43/57] perf tools: Remove duplicate const qualifier Greg Kroah-Hartman
2017-07-13 15:42 ` [PATCH 4.4 44/57] perf annotate browser: Fix behaviour of Shift-Tab with nothing focussed Greg Kroah-Hartman
2017-07-13 15:43 ` [PATCH 4.4 45/57] perf pmu: Fix misleadingly indented assignment (whitespace) Greg Kroah-Hartman
2017-07-13 15:43 ` [PATCH 4.4 46/57] perf dwarf: Guard !x86_64 definitions under #ifdef else clause Greg Kroah-Hartman
2017-07-13 15:43 ` [PATCH 4.4 47/57] perf trace: Do not process PERF_RECORD_LOST twice Greg Kroah-Hartman
2017-07-13 15:43 ` [PATCH 4.4 48/57] perf tests: Remove wrong semicolon in while loop in CQM test Greg Kroah-Hartman
2017-07-13 15:43 ` [PATCH 4.4 49/57] perf tools: Use readdir() instead of deprecated readdir_r() again Greg Kroah-Hartman
2017-07-13 15:43 ` [PATCH 4.4 50/57] md: fix incorrect use of lexx_to_cpu in does_sb_need_changing Greg Kroah-Hartman
2017-07-13 15:43 ` [PATCH 4.4 51/57] md: fix super_offset endianness in super_1_rdev_size_change Greg Kroah-Hartman
2017-07-13 15:43 ` Greg Kroah-Hartman [this message]
2017-07-13 15:43 ` [PATCH 4.4 53/57] staging: vt6556: vnt_start Fix missing call to vnt_key_init_table Greg Kroah-Hartman
2017-07-13 15:43 ` [PATCH 4.4 54/57] staging: comedi: fix clean-up of comedi_class in comedi_init() Greg Kroah-Hartman
2017-07-13 15:43 ` [PATCH 4.4 55/57] ext4: check return value of kstrtoull correctly in reserved_clusters_store Greg Kroah-Hartman
2017-07-13 15:43 ` [PATCH 4.4 56/57] x86/mm/pat: Dont report PAT on CPUs that dont support it Greg Kroah-Hartman
2017-07-13 15:43 ` [PATCH 4.4 57/57] [media] saa7134: fix warm Medion 7134 EEPROM read Greg Kroah-Hartman
2017-07-14  1:33 ` [PATCH 4.4 00/57] 4.4.77-stable review Guenter Roeck
2017-07-14  9:50   ` Greg Kroah-Hartman
2017-07-14 19:23     ` Guenter Roeck
2017-07-15  8:10       ` Greg Kroah-Hartman
     [not found] ` <5967e121.9fb6df0a.979fe.f2ca@mx.google.com>
2017-07-14  9:51   ` Greg Kroah-Hartman
2017-07-14 12:21     ` Arnd Bergmann
2017-07-14 13:26       ` Greg Kroah-Hartman
2017-07-14 19:54         ` Arnd Bergmann
2017-07-18 22:56           ` Kevin Hilman
2017-07-15 11:16         ` Geert Uytterhoeven
2017-07-15 11:22           ` Greg Kroah-Hartman
2017-07-14 12:35     ` Mark Brown
2017-07-14 13:26       ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170713154001.128393519@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ncardwell@google.com \
    --cc=stable@vger.kernel.org \
    --cc=vlee@freedesktop.org \
    --cc=ycheng@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.