linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Eric Dumazet <edumazet@google.com>,
	Juha-Matti Tilli <juha-matti.tilli@iki.fi>,
	Yuchung Cheng <ycheng@google.com>,
	Soheil Hassas Yeganeh <soheil@google.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.17 42/66] tcp: free batches of packets in tcp_prune_ofo_queue()
Date: Fri, 27 Jul 2018 11:45:35 +0200	[thread overview]
Message-ID: <20180727093813.840716869@linuxfoundation.org> (raw)
In-Reply-To: <20180727093809.043856530@linuxfoundation.org>

4.17-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 72cd43ba64fc172a443410ce01645895850844c8 ]

Juha-Matti Tilli reported that malicious peers could inject tiny
packets in out_of_order_queue, forcing very expensive calls
to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for
every incoming packet. out_of_order_queue rb-tree can contain
thousands of nodes, iterating over all of them is not nice.

Before linux-4.9, we would have pruned all packets in ofo_queue
in one go, every XXXX packets. XXXX depends on sk_rcvbuf and skbs
truesize, but is about 7000 packets with tcp_rmem[2] default of 6 MB.

Since we plan to increase tcp_rmem[2] in the future to cope with
modern BDP, can not revert to the old behavior, without great pain.

Strategy taken in this patch is to purge ~12.5 % of the queue capacity.

Fixes: 36a6503fedda ("tcp: refine tcp_prune_ofo_queue() to not drop all packets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/tcp_input.c |   15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4874,6 +4874,7 @@ new_range:
  * 2) not add too big latencies if thousands of packets sit there.
  *    (But if application shrinks SO_RCVBUF, we could still end up
  *     freeing whole queue here)
+ * 3) Drop at least 12.5 % of sk_rcvbuf to avoid malicious attacks.
  *
  * Return true if queue has shrunk.
  */
@@ -4881,20 +4882,26 @@ static bool tcp_prune_ofo_queue(struct s
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct rb_node *node, *prev;
+	int goal;
 
 	if (RB_EMPTY_ROOT(&tp->out_of_order_queue))
 		return false;
 
 	NET_INC_STATS(sock_net(sk), LINUX_MIB_OFOPRUNED);
+	goal = sk->sk_rcvbuf >> 3;
 	node = &tp->ooo_last_skb->rbnode;
 	do {
 		prev = rb_prev(node);
 		rb_erase(node, &tp->out_of_order_queue);
+		goal -= rb_to_skb(node)->truesize;
 		tcp_drop(sk, rb_to_skb(node));
-		sk_mem_reclaim(sk);
-		if (atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf &&
-		    !tcp_under_memory_pressure(sk))
-			break;
+		if (!prev || goal <= 0) {
+			sk_mem_reclaim(sk);
+			if (atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf &&
+			    !tcp_under_memory_pressure(sk))
+				break;
+			goal = sk->sk_rcvbuf >> 3;
+		}
 		node = prev;
 	} while (node);
 	tp->ooo_last_skb = rb_to_skb(prev);



  parent reply	other threads:[~2018-07-27  9:48 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-27  9:44 [PATCH 4.17 00/66] 4.17.11-stable review Greg Kroah-Hartman
2018-07-27  9:44 ` [PATCH 4.17 01/66] KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR Greg Kroah-Hartman
2018-07-27  9:44 ` [PATCH 4.17 02/66] Revert "iommu/intel-iommu: Enable CONFIG_DMA_DIRECT_OPS=y and clean up intel_{alloc,free}_coherent()" Greg Kroah-Hartman
2018-07-27  9:44 ` [PATCH 4.17 03/66] MIPS: ath79: fix register address in ath79_ddr_wb_flush() Greg Kroah-Hartman
2018-07-27  9:44 ` [PATCH 4.17 04/66] MIPS: Fix off-by-one in pci_resource_to_user() Greg Kroah-Hartman
2018-07-27  9:44 ` [PATCH 4.17 05/66] clk: mvebu: armada-37xx-periph: Fix switching CPU rate from 300Mhz to 1.2GHz Greg Kroah-Hartman
2018-07-27  9:44 ` [PATCH 4.17 06/66] clk: aspeed: Mark bclk (PCIe) and dclk (VGA) as critical Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 08/66] xen/PVH: Set up GS segment for stack canary Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 09/66] KVM: PPC: Check if IOMMU page is contained in the pinned physical page Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 10/66] drm/nouveau/drm/nouveau: Fix runtime PM leak in nv50_disp_atomic_commit() Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 11/66] drm/nouveau: Set DRIVER_ATOMIC cap earlier to fix debugfs Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 12/66] clk: meson-gxbb: set fclk_div2 as CLK_IS_CRITICAL Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 13/66] bonding: set default miimon value for non-arp modes if not set Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 14/66] ip: hash fragments consistently Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 15/66] ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 17/66] net/mlx4_core: Save the qpn from the input modifier in RST2INIT wrapper Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 18/66] net-next/hinic: fix a problem in hinic_xmit_frame() Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 19/66] net: skb_segment() should not return NULL Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 20/66] tcp: fix dctcp delayed ACK schedule Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 21/66] tcp: helpers to send special DCTCP ack Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 22/66] tcp: do not cancel delay-AcK on DCTCP special ACK Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 23/66] tcp: do not delay ACK in DCTCP upon CE status change Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 24/66] net/mlx5: E-Switch, UBSAN fix undefined behavior in mlx5_eswitch_mode Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 25/66] r8169: restore previous behavior to accept BIOS WoL settings Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 26/66] tls: check RCV_SHUTDOWN in tls_wait_data Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 27/66] net/mlx5e: Add ingress/egress indication for offloaded TC flows Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 28/66] net/mlx5e: Only allow offloading decap egress (egdev) flows Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 29/66] net/mlx5e: Refine ets validation function Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 30/66] nfp: flower: ensure dead neighbour entries are not offloaded Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 31/66] sock: fix sg page frag coalescing in sk_alloc_sg Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 32/66] net: phy: consider PHY_IGNORE_INTERRUPT in phy_start_aneg_priv Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 33/66] multicast: do not restore deleted record source filter mode to new one Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 34/66] net/ipv6: Fix linklocal to global address with VRF Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 35/66] net/mlx5e: Dont allow aRFS for encapsulated packets Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 36/66] net/mlx5e: Fix quota counting in aRFS expire flow Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 37/66] net/mlx5: Adjust clock overflow work period Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 38/66] rtnetlink: add rtnl_link_state check in rtnl_configure_link Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 39/66] vxlan: add new fdb alloc and create helpers Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 40/66] vxlan: make netlink notify in vxlan_fdb_destroy optional Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 41/66] vxlan: fix default fdb entry netlink notify ordering during netdev create Greg Kroah-Hartman
2018-07-27  9:45 ` Greg Kroah-Hartman [this message]
2018-07-27  9:45 ` [PATCH 4.17 43/66] tcp: avoid collapses in tcp_prune_queue() if possible Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 44/66] tcp: detect malicious patterns in tcp_collapse_ofo_queue() Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 45/66] tcp: call tcp_drop() from tcp_data_queue_ofo() Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 46/66] tcp: add tcp_ooo_try_coalesce() helper Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 47/66] Revert "staging:r8188eu: Use lib80211 to support TKIP" Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 48/66] staging: speakup: fix wraparound in uaccess length check Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 49/66] usb: cdc_acm: Add quirk for Castles VEGA3000 Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 50/66] usb: core: handle hub C_PORT_OVER_CURRENT condition Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 52/66] usb: xhci: Fix memory leak in xhci_endpoint_reset() Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 53/66] usb: gadget: Fix OS descriptors support Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 54/66] usb: gadget: f_fs: Only return delayed status when len is 0 Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 55/66] ACPICA: AML Parser: ignore dispatcher error status during table load Greg Kroah-Hartman
2018-07-30  9:52   ` Rafael J. Wysocki
2018-07-30 11:44     ` Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 56/66] driver core: Partially revert "driver core: correct devices shutdown order" Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 57/66] can: xilinx_can: fix RX loop if RXNEMP is asserted without RXOK Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 58/66] can: xilinx_can: fix power management handling Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 59/66] can: xilinx_can: fix recovery from error states not being propagated Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 60/66] can: xilinx_can: fix device dropping off bus on RX overrun Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 61/66] can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accounting Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 62/66] can: xilinx_can: fix incorrect clear of non-processed interrupts Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 63/66] can: xilinx_can: fix RX overflow interrupt not being enabled Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 64/66] can: peak_canfd: fix firmware < v3.3.0: limit allocation to 32-bit DMA addr only Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 65/66] can: m_can: Fix runtime resume call Greg Kroah-Hartman
2018-07-27  9:45 ` [PATCH 4.17 66/66] can: m_can.c: fix setup of CCCR register: clear CCCR NISO bit before checking can.ctrlmode Greg Kroah-Hartman
2018-07-27 17:31 ` [PATCH 4.17 00/66] 4.17.11-stable review Guenter Roeck
2018-07-28  5:41   ` Greg Kroah-Hartman
2018-07-27 19:49 ` Shuah Khan
2018-07-28  5:41   ` Greg Kroah-Hartman
2018-07-28  6:54 ` Naresh Kamboju
2018-07-28  7:20   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180727093813.840716869@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=juha-matti.tilli@iki.fi \
    --cc=linux-kernel@vger.kernel.org \
    --cc=soheil@google.com \
    --cc=stable@vger.kernel.org \
    --cc=ycheng@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).