From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Eric Dumazet <edumazet@google.com>,
Juha-Matti Tilli <juha-matti.tilli@iki.fi>,
Yuchung Cheng <ycheng@google.com>,
Soheil Hassas Yeganeh <soheil@google.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.14 28/48] tcp: free batches of packets in tcp_prune_ofo_queue()
Date: Fri, 27 Jul 2018 12:00:13 +0200
Message-ID: <20180727095921.457100398@linuxfoundation.org>
In-Reply-To: <20180727095918.503549522@linuxfoundation.org>
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet <edumazet@google.com>
[ Upstream commit 72cd43ba64fc172a443410ce01645895850844c8 ]
Juha-Matti Tilli reported that malicious peers could inject tiny
packets in out_of_order_queue, forcing very expensive calls
to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for
every incoming packet. The out_of_order_queue rb-tree can contain
thousands of nodes, and iterating over all of them is expensive.

Before linux-4.9, we would have pruned all packets in ofo_queue
in one go, every XXXX packets. XXXX depends on sk_rcvbuf and skb
truesize, but is about 7000 packets with the tcp_rmem[2] default
of 6 MB.

Since we plan to increase tcp_rmem[2] in the future to cope with
modern BDP, we cannot revert to the old behavior without great pain.

The strategy taken in this patch is to purge ~12.5 % of the queue
capacity.

Fixes: 36a6503fedda ("tcp: refine tcp_prune_ofo_queue() to not drop all packets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
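Note for readers: the batching strategy described in the commit message
can be sketched in user space as below. This is only an illustration
under stated assumptions, not the kernel code: struct fake_skb,
struct prune_stats and prune_batched() are hypothetical names, the
rb-tree is replaced by a plain array walked from the tail (highest
sequence first), and the tcp_under_memory_pressure() test is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb sitting in the out-of-order queue. */
struct fake_skb {
	int truesize;	/* memory charged to the socket for this packet */
};

struct prune_stats {
	size_t dropped;	/* packets freed */
	size_t checks;	/* how often the "back under the limit?" test ran */
};

/*
 * Drop packets from the tail of queue[0..n) in batches of roughly
 * rcvbuf / 8 bytes, and only test the memory limit once per batch
 * (or when the queue is exhausted), mirroring the loop in the patch.
 */
static struct prune_stats prune_batched(const struct fake_skb *queue, size_t n,
					int rmem_alloc, int rcvbuf)
{
	struct prune_stats st = { 0, 0 };
	int goal = rcvbuf >> 3;	/* 12.5 % of the receive buffer */
	size_t i = n;

	assert(rcvbuf > 0);
	while (i > 0) {
		i--;
		goal -= queue[i].truesize;
		rmem_alloc -= queue[i].truesize;
		st.dropped++;
		/* i == 0 plays the role of "!prev" in the kernel loop. */
		if (i == 0 || goal <= 0) {
			st.checks++;
			if (rmem_alloc <= rcvbuf)
				break;
			goal = rcvbuf >> 3;
		}
	}
	return st;
}
```

With 100 queued packets of truesize 100, rcvbuf = 8000 and
rmem_alloc = 10000, this drops 20 packets and runs the limit test
twice. With rmem_alloc = 8001 it still drops a full batch of 10
packets, whereas a per-packet test would stop after freeing a single
packet and let the next tiny packet trigger another prune.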
include/linux/skbuff.h | 2 ++
net/ipv4/tcp_input.c | 15 +++++++++++----
2 files changed, 13 insertions(+), 4 deletions(-)
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3167,6 +3167,8 @@ static inline int __skb_grow_rcsum(struc
 	return __skb_grow(skb, len);
 }
 
+#define rb_to_skb(rb) rb_entry_safe(rb, struct sk_buff, rbnode)
+
 #define skb_queue_walk(queue, skb) \
 		for (skb = (queue)->next;					\
 		     skb != (struct sk_buff *)(queue);				\
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4924,6 +4924,7 @@ new_range:
  * 2) not add too big latencies if thousands of packets sit there.
  *    (But if application shrinks SO_RCVBUF, we could still end up
  *     freeing whole queue here)
+ * 3) Drop at least 12.5 % of sk_rcvbuf to avoid malicious attacks.
  *
  * Return true if queue has shrunk.
  */
@@ -4931,20 +4932,26 @@ static bool tcp_prune_ofo_queue(struct s
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct rb_node *node, *prev;
+	int goal;
 
 	if (RB_EMPTY_ROOT(&tp->out_of_order_queue))
 		return false;
 
 	NET_INC_STATS(sock_net(sk), LINUX_MIB_OFOPRUNED);
+	goal = sk->sk_rcvbuf >> 3;
 	node = &tp->ooo_last_skb->rbnode;
 	do {
 		prev = rb_prev(node);
 		rb_erase(node, &tp->out_of_order_queue);
+		goal -= rb_to_skb(node)->truesize;
 		tcp_drop(sk, rb_entry(node, struct sk_buff, rbnode));
-		sk_mem_reclaim(sk);
-		if (atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf &&
-		    !tcp_under_memory_pressure(sk))
-			break;
+		if (!prev || goal <= 0) {
+			sk_mem_reclaim(sk);
+			if (atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf &&
+			    !tcp_under_memory_pressure(sk))
+				break;
+			goal = sk->sk_rcvbuf >> 3;
+		}
 		node = prev;
 	} while (node);
 	tp->ooo_last_skb = rb_entry(prev, struct sk_buff, rbnode);