From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
Denys Fedoryshchenko <nuclearcat@nuclearcat.com>,
Eric Dumazet <edumazet@google.com>,
Yuchung Cheng <ycheng@google.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 3.14 17/79] tcp: make connect() mem charging friendly
Date: Tue, 24 Mar 2015 16:45:28 +0100 [thread overview]
Message-ID: <20150324154421.680350004@linuxfoundation.org> (raw)
In-Reply-To: <20150324154420.803073211@linuxfoundation.org>
3.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet <edumazet@google.com>
[ Upstream commit 355a901e6cf1b2b763ec85caa2a9f04fbcc4ab4a ]
While working on sk_forward_alloc problems reported by Denys
Fedoryshchenko, we found that tcp connect() (and fastopen) do not call
sk_wmem_schedule() for SYN packet (and/or SYN/DATA packet), so
sk_forward_alloc is negative while connect is in progress.
We can fix this by calling regular sk_stream_alloc_skb() both for the
SYN packet (in tcp_connect()) and the syn_data packet in
tcp_send_syn_data()
Then, tcp_send_syn_data() can avoid copying syn_data as we simply
can manipulate syn_data->cb[] to remove SYN flag (and increment seq)
Instead of open coding memcpy_fromiovecend(), simply use this helper.
This leaves in socket write queue clean fast clone skbs.
This was tested against our fastopen packetdrill tests.
Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/ipv4/tcp_output.c | 66 +++++++++++++++++++++-----------------------------
1 file changed, 29 insertions(+), 37 deletions(-)
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2895,9 +2895,9 @@ static int tcp_send_syn_data(struct sock
{
struct tcp_sock *tp = tcp_sk(sk);
struct tcp_fastopen_request *fo = tp->fastopen_req;
- int syn_loss = 0, space, i, err = 0, iovlen = fo->data->msg_iovlen;
- struct sk_buff *syn_data = NULL, *data;
+ int syn_loss = 0, space, err = 0;
unsigned long last_syn_loss = 0;
+ struct sk_buff *syn_data;
tp->rx_opt.mss_clamp = tp->advmss; /* If MSS is not cached */
tcp_fastopen_cache_get(sk, &tp->rx_opt.mss_clamp, &fo->cookie,
@@ -2928,42 +2928,38 @@ static int tcp_send_syn_data(struct sock
/* limit to order-0 allocations */
space = min_t(size_t, space, SKB_MAX_HEAD(MAX_TCP_HEADER));
- syn_data = skb_copy_expand(syn, MAX_TCP_HEADER, space,
- sk->sk_allocation);
- if (syn_data == NULL)
+ syn_data = sk_stream_alloc_skb(sk, space, sk->sk_allocation);
+ if (!syn_data)
goto fallback;
-
- for (i = 0; i < iovlen && syn_data->len < space; ++i) {
- struct iovec *iov = &fo->data->msg_iov[i];
- unsigned char __user *from = iov->iov_base;
- int len = iov->iov_len;
-
- if (syn_data->len + len > space)
- len = space - syn_data->len;
- else if (i + 1 == iovlen)
- /* No more data pending in inet_wait_for_connect() */
- fo->data = NULL;
-
- if (skb_add_data(syn_data, from, len))
- goto fallback;
- }
-
- /* Queue a data-only packet after the regular SYN for retransmission */
- data = pskb_copy(syn_data, sk->sk_allocation);
- if (data == NULL)
+ syn_data->ip_summed = CHECKSUM_PARTIAL;
+ memcpy(syn_data->cb, syn->cb, sizeof(syn->cb));
+ if (unlikely(memcpy_fromiovecend(skb_put(syn_data, space),
+ fo->data->msg_iov, 0, space))) {
+ kfree_skb(syn_data);
goto fallback;
- TCP_SKB_CB(data)->seq++;
- TCP_SKB_CB(data)->tcp_flags &= ~TCPHDR_SYN;
- TCP_SKB_CB(data)->tcp_flags = (TCPHDR_ACK|TCPHDR_PSH);
- tcp_connect_queue_skb(sk, data);
- fo->copied = data->len;
+ }
- if (tcp_transmit_skb(sk, syn_data, 0, sk->sk_allocation) == 0) {
+ /* No more data pending in inet_wait_for_connect() */
+ if (space == fo->size)
+ fo->data = NULL;
+ fo->copied = space;
+
+ tcp_connect_queue_skb(sk, syn_data);
+
+ err = tcp_transmit_skb(sk, syn_data, 1, sk->sk_allocation);
+
+ /* Now full SYN+DATA was cloned and sent (or not),
+ * remove the SYN from the original skb (syn_data)
+ * we keep in write queue in case of a retransmit, as we
+ * also have the SYN packet (with no data) in the same queue.
+ */
+ TCP_SKB_CB(syn_data)->seq++;
+ TCP_SKB_CB(syn_data)->tcp_flags = TCPHDR_ACK | TCPHDR_PSH;
+ if (!err) {
tp->syn_data = (fo->copied > 0);
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPFASTOPENACTIVE);
goto done;
}
- syn_data = NULL;
fallback:
/* Send a regular SYN with Fast Open cookie request option */
@@ -2972,7 +2968,6 @@ fallback:
err = tcp_transmit_skb(sk, syn, 1, sk->sk_allocation);
if (err)
tp->syn_fastopen = 0;
- kfree_skb(syn_data);
done:
fo->cookie.len = -1; /* Exclude Fast Open option for SYN retries */
return err;
@@ -2992,13 +2987,10 @@ int tcp_connect(struct sock *sk)
return 0;
}
- buff = alloc_skb_fclone(MAX_TCP_HEADER + 15, sk->sk_allocation);
- if (unlikely(buff == NULL))
+ buff = sk_stream_alloc_skb(sk, 0, sk->sk_allocation);
+ if (unlikely(!buff))
return -ENOBUFS;
- /* Reserve space for headers. */
- skb_reserve(buff, MAX_TCP_HEADER);
-
tcp_init_nondata_skb(buff, tp->write_seq++, TCPHDR_SYN);
tp->retrans_stamp = TCP_SKB_CB(buff)->when = tcp_time_stamp;
tcp_connect_queue_skb(sk, buff);
next prev parent reply other threads:[~2015-03-24 17:00 UTC|newest]
Thread overview: 81+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-24 15:45 [PATCH 3.14 00/79] 3.14.37-stable review Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 02/79] sparc32: destroy_context() and switch_mm() needs to disable interrupts Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 03/79] sparc: semtimedop() unreachable due to comparison error Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 04/79] sparc: perf: Remove redundant perf_pmu_{en|dis}able calls Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 05/79] sparc: perf: Make counting mode actually work Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 06/79] sparc: Touch NMI watchdog when walking cpus and calling printk Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 07/79] sparc64: Fix several bugs in memmove() Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 08/79] net: sysctl_net_core: check SNDBUF and RCVBUF for min length Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 09/79] rds: avoid potential stack overflow Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 10/79] inet_diag: fix possible overflow in inet_diag_dump_one_icsk() Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 11/79] caif: fix MSG_OOB test in caif_seqpkt_recvmsg() Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 12/79] rxrpc: bogus MSG_PEEK test in rxrpc_recvmsg() Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 13/79] Revert "net: cx82310_eth: use common match macro" Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 14/79] ipv6: fix backtracking for throw routes Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 15/79] tcp: fix tcp fin memory accounting Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 16/79] net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour Greg Kroah-Hartman
2015-03-24 15:45 ` Greg Kroah-Hartman [this message]
2015-03-24 15:45 ` [PATCH 3.14 19/79] drm/radeon: do a posting read in evergreen_set_irq Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 20/79] drm/radeon: do a posting read in r100_set_irq Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 21/79] drm/radeon: do a posting read in r600_set_irq Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 22/79] drm/radeon: do a posting read in cik_set_irq Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 23/79] drm/radeon: do a posting read in si_set_irq Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 24/79] drm/radeon: do a posting read in rs600_set_irq Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 25/79] drm/radeon: fix interlaced modes on DCE8 Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 27/79] LZ4 : fix the data abort issue Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 28/79] fuse: set stolen page uptodate Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 29/79] fuse: notify: dont move pages Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 30/79] console: Fix console name size mismatch Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 31/79] virtio_console: init work unconditionally Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 32/79] virtio_console: avoid config access from irq Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 33/79] Change email address for 8250_pci Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 34/79] can: add missing initialisations in CAN related skbuffs Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 35/79] workqueue: fix hang involving racing cancel[_delayed]_work_sync()s for PREEMPT_NONE Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 36/79] cpuset: Fix cpuset sched_relax_domain_level Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 37/79] tpm/ibmvtpm: Additional LE support for tpm_ibmvtpm_send Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 38/79] spi: atmel: Fix interrupt setup for PDC transfers Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 39/79] spi: pl022: Fix race in giveback() leading to driver lock-up Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 41/79] ALSA: control: Add sanity checks for user ctl id name string Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 42/79] ALSA: hda - Fix built-in mic on Compaq Presario CQ60 Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 43/79] ALSA: hda - Dont access stereo amps for mono channel widgets Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 44/79] ALSA: hda - Set single_adc_amp flag for CS420x codecs Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 45/79] ALSA: hda - Add workaround for MacBook Air 5,2 built-in mic Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 46/79] ALSA: hda - Fix regression of HD-audio controller fallback modes Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 47/79] ALSA: hda - Treat stereo-to-mono mix properly Greg Kroah-Hartman
2015-03-24 15:45 ` [PATCH 3.14 48/79] mtd: nand: pxa3xx: Fix PIO FIFO draining Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 49/79] bnx2x: Force fundamental reset for EEH recovery Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 50/79] regulator: Only enable disabled regulators on resume Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 51/79] regulator: core: Fix enable GPIO reference counting Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 52/79] nilfs2: fix deadlock of segment constructor during recovery Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 53/79] drm/vmwgfx: Reorder device takedown somewhat Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 54/79] xen/events: avoid NULL pointer dereference in dom0 on large machines Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 55/79] xen-pciback: limit guest control of command register Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 56/79] libsas: Fix Kernel Crash in smp_execute_task Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 57/79] pagemap: do not leak physical addresses to non-privileged userspace Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 58/79] crypto: arm/aes update NEON AES module to latest OpenSSL version Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 59/79] crypto: aesni - fix memory usage in GCM decryption Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 60/79] x86/fpu: Avoid math_state_restore() without used_math() in __restore_xstate_sig() Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 61/79] x86/fpu: Drop_fpu() should not assume that tsk equals current Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 62/79] x86/vdso: Fix the build on GCC5 Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 63/79] ipvs: add missing ip_vs_pe_put in sync code Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 64/79] ipvs: rerouting to local clients is not needed anymore Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 65/79] netfilter: nft_compat: fix module refcount underflow Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 66/79] netfilter: xt_socket: fix a stack corruption bug Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 67/79] ARM: imx6sl-evk: set swbst_reg as vbuss parent reg Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 68/79] arm64: Honor __GFP_ZERO in dma allocations Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 69/79] ARM: imx6qdl-sabresd: set swbst_reg as vbuss parent reg Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 70/79] ARM: at91: pm: fix at91rm9200 standby Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 71/79] ARM: dts: DRA7x: Fix the bypass clock source for dpll_iva and others Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 72/79] target: Fix reference leak in target_get_sess_cmd() error path Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 73/79] target: Fix virtual LUN=0 target_configure_device failure OOPs Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 74/79] iscsi-target: Avoid early conn_logout_comp for iser connections Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 75/79] target/pscsi: Fix NULL pointer dereference in get_device_type Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 76/79] target: Fix R_HOLDER bit usage for AllRegistrants Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 77/79] target: Avoid dropping AllRegistrants reservation during unregister Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 78/79] target: Allow AllRegistrants to re-RESERVE existing reservation Greg Kroah-Hartman
2015-03-24 15:46 ` [PATCH 3.14 79/79] target: Allow Write Exclusive non-reservation holders to READ Greg Kroah-Hartman
2015-03-25 2:50 ` [PATCH 3.14 00/79] 3.14.37-stable review Guenter Roeck
2015-03-25 8:30 ` Greg Kroah-Hartman
2015-03-25 13:03 ` Guenter Roeck
2015-03-25 13:07 ` Guenter Roeck
2015-03-26 14:00 ` Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150324154421.680350004@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=nuclearcat@nuclearcat.com \
--cc=stable@vger.kernel.org \
--cc=ycheng@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).