* [PATCH v2 net-next] tcp: fix zerocopy and notsent_lowat issues
From: Eric Dumazet @ 2019-03-26 15:34 UTC (permalink / raw)
To: David S . Miller
Cc: netdev, Eric Dumazet, Eric Dumazet, Willem de Bruijn,
Soheil Hassas Yeganeh

My recent patch had at least three problems:

1) TX zerocopy wants notification when skb is acknowledged,
   thus we need to call skb_zcopy_clear() if the skb is
   cached in sk->sk_tx_skb_cache.

2) Some applications might expect precise EPOLLOUT
   notifications, so we need to update sk->sk_wmem_queued
   and call sk_mem_uncharge() from sk_wmem_free_skb()
   in all cases. The SOCK_QUEUE_SHRUNK flag must also be set.

3) Reuse of saved skb should have used skb_cloned() instead
   of simply checking if the fast clone has been freed.

Fixes: 472c2e07eef0 ("tcp: add one skb cache for tx")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
---
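Points 1) and 2) both concern the order of operations in sk_wmem_free_skb(). The following is a minimal userspace sketch of that ordering, not kernel code: `toy_skb`, `toy_sock`, `toy_zcopy_clear()`, `toy_kfree_skb()`, `free_skb_v1()` and `free_skb_v2()` are hypothetical names invented for illustration, sk_mem_uncharge() is folded into the `wmem_queued` update, and a zerocopy completion is reduced to a counter.

```c
#include <assert.h>   /* only needed by callers that check the results */
#include <stdbool.h>

/* Toy stand-ins for struct sk_buff and struct sock; NOT the kernel types. */
struct toy_skb {
	int truesize;     /* bytes charged to the socket for this buffer */
	bool zc_pending;  /* a zerocopy completion is still owed to userspace */
};

struct toy_sock {
	int wmem_queued;          /* models sk->sk_wmem_queued */
	bool queue_shrunk;        /* models the SOCK_QUEUE_SHRUNK flag */
	int zc_notified;          /* zerocopy completions delivered so far */
	struct toy_skb *tx_cache; /* models sk->sk_tx_skb_cache */
};

/* Models skb_zcopy_clear(skb, true): deliver a pending completion once. */
static inline void toy_zcopy_clear(struct toy_sock *sk, struct toy_skb *skb)
{
	if (skb->zc_pending) {
		skb->zc_pending = false;
		sk->zc_notified++;
	}
}

/* Models __kfree_skb(): releasing the buffer also completes zerocopy. */
static inline void toy_kfree_skb(struct toy_sock *sk, struct toy_skb *skb)
{
	toy_zcopy_clear(sk, skb);
}

/* v1 ordering: park the skb and return before any accounting is done. */
static inline void free_skb_v1(struct toy_sock *sk, struct toy_skb *skb)
{
	if (!sk->tx_cache) {
		sk->tx_cache = skb;
		return;
	}
	sk->queue_shrunk = true;
	sk->wmem_queued -= skb->truesize;
	toy_kfree_skb(sk, skb);
}

/* v2 ordering: always account first, then clear zerocopy state and park. */
static inline void free_skb_v2(struct toy_sock *sk, struct toy_skb *skb)
{
	sk->queue_shrunk = true;
	sk->wmem_queued -= skb->truesize;
	if (!sk->tx_cache) {
		toy_zcopy_clear(sk, skb);
		sk->tx_cache = skb;
		return;
	}
	toy_kfree_skb(sk, skb);
}
```

Starting from wmem_queued = 4096 with an empty cache and freeing one 4096-byte zerocopy skb, the v1 ordering leaves wmem_queued at 4096 with no flag set and no completion delivered, while the v2 ordering drops it to 0, sets the shrunk flag and delivers one completion. In this model that is the difference a poller waiting on EPOLLOUT or on a zerocopy notification would observe.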
 include/net/sock.h |  9 +++++----
 net/ipv4/tcp.c     | 13 +++----------
 2 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 577d91fb56267371c6bc5ae65f7454deba726bd6..7fa2232785226bcafd46b230559964fd16f3c4f4 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1465,13 +1465,14 @@ static inline void sk_mem_uncharge(struct sock *sk, int size)
 
 static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb)
 {
-	if (!sk->sk_tx_skb_cache) {
-		sk->sk_tx_skb_cache = skb;
-		return;
-	}
 	sock_set_flag(sk, SOCK_QUEUE_SHRUNK);
 	sk->sk_wmem_queued -= skb->truesize;
 	sk_mem_uncharge(sk, skb->truesize);
+	if (!sk->sk_tx_skb_cache) {
+		skb_zcopy_clear(skb, true);
+		sk->sk_tx_skb_cache = skb;
+		return;
+	}
 	__kfree_skb(skb);
 }
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 29b94edf05f9357d3a33744d677827ce624738ae..82bd707c03472f2cebb1a90d5f1c13acc821468f 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -865,14 +865,9 @@ struct sk_buff *sk_stream_alloc_skb(struct sock *sk, int size, gfp_t gfp,
 {
 	struct sk_buff *skb;
 
-	skb = sk->sk_tx_skb_cache;
-	if (skb && !size) {
-		const struct sk_buff_fclones *fclones;
-
-		fclones = container_of(skb, struct sk_buff_fclones, skb1);
-		if (refcount_read(&fclones->fclone_ref) == 1) {
-			sk->sk_wmem_queued -= skb->truesize;
-			sk_mem_uncharge(sk, skb->truesize);
+	if (likely(!size)) {
+		skb = sk->sk_tx_skb_cache;
+		if (skb && !skb_cloned(skb)) {
 			skb->truesize -= skb->data_len;
 			sk->sk_tx_skb_cache = NULL;
 			pskb_trim(skb, 0);
@@ -2543,8 +2538,6 @@ void tcp_write_queue_purge(struct sock *sk)
 	tcp_rtx_queue_purge(sk);
 	skb = sk->sk_tx_skb_cache;
 	if (skb) {
-		sk->sk_wmem_queued -= skb->truesize;
-		sk_mem_uncharge(sk, skb->truesize);
 		__kfree_skb(skb);
 		sk->sk_tx_skb_cache = NULL;
 	}
--
2.21.0.392.gf8f6787159e-goog
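
Point 3) replaces a check on the fast-clone refcount with skb_cloned(). The following is a rough userspace sketch of why the two can disagree, not kernel code. It assumes a simplified model in which skb_cloned() reduces to "the cloned bit is set and the shared data refcount is not 1" (the real helper also masks the refcount), and all `toy_*` names plus `v1_reusable()` are hypothetical. The scenario, a second clone taken from the fast clone (for instance by a packet tap), is one plausible case chosen for illustration.

```c
#include <assert.h>   /* only needed by callers that check the results */
#include <stdbool.h>

/* Toy stand-ins; NOT the kernel's skb_shared_info / sk_buff_fclones. */
struct toy_shinfo {
	int dataref;               /* number of skbs sharing the data area */
};

struct toy_skb {
	bool cloned;               /* models skb->cloned */
	struct toy_shinfo *shinfo; /* shared data descriptor */
};

struct toy_fclones {
	struct toy_skb skb1;       /* original skb */
	struct toy_skb skb2;       /* companion fast clone */
	int fclone_ref;            /* 1 = only skb1 alive, 2 = skb2 alive too */
};

/* v1-style test: "has the fast clone been freed?" */
static inline bool v1_reusable(const struct toy_fclones *f)
{
	return f->fclone_ref == 1;
}

/* Simplified skb_cloned(): is the data still shared with anyone? */
static inline bool toy_skb_cloned(const struct toy_skb *skb)
{
	return skb->cloned && skb->shinfo->dataref != 1;
}

/* Fast clone: skb2 becomes a copy of skb1 sharing the same data. */
static inline void toy_fast_clone(struct toy_fclones *f)
{
	f->skb1.cloned = true;
	f->skb2 = f->skb1;
	f->skb1.shinfo->dataref++;
	f->fclone_ref++;
}

/* Ordinary clone into separately allocated storage. */
static inline void toy_slow_clone(struct toy_skb *src, struct toy_skb *dst)
{
	src->cloned = true;
	*dst = *src;
	src->shinfo->dataref++;
}

/* Free the fast clone: drops one data reference and one fclone reference. */
static inline void toy_free_fast_clone(struct toy_fclones *f)
{
	f->skb2.shinfo->dataref--;
	f->fclone_ref--;
}

/* True if the v1 test says "reusable" while the data is still shared. */
static inline bool toy_checks_disagree(void)
{
	struct toy_shinfo sh = { .dataref = 1 };
	struct toy_fclones f = {
		.skb1 = { .cloned = false, .shinfo = &sh },
		.fclone_ref = 1,
	};
	struct toy_skb tap;

	toy_fast_clone(&f);            /* transmit path clones skb1 into skb2 */
	toy_slow_clone(&f.skb2, &tap); /* a second clone is taken from skb2 */
	toy_free_fast_clone(&f);       /* skb2 is freed after transmission */

	/* fclone_ref is back to 1, yet 'tap' still references the data. */
	return v1_reusable(&f) && toy_skb_cloned(&f.skb1);
}
```

In this model toy_checks_disagree() returns true: the refcount test would allow the cached skb to be trimmed and reused while another holder still reads its data, which is the situation the skb_cloned() test rejects.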
* Re: [PATCH v2 net-next] tcp: fix zerocopy and notsent_lowat issues
From: Soheil Hassas Yeganeh @ 2019-03-26 16:30 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David S . Miller, netdev, Eric Dumazet, Willem de Bruijn
On Tue, Mar 26, 2019 at 11:35 AM Eric Dumazet <edumazet@google.com> wrote:
>
> My recent patch had at least three problems :
>
> 1) TX zerocopy wants notification when skb is acknowledged,
> thus we need to call skb_zcopy_clear() if the skb is
> cached into sk->sk_tx_skb_cache
>
> 2) Some applications might expect precise EPOLLOUT
> notifications, so we need to update sk->sk_wmem_queued
> and call sk_mem_uncharge() from sk_wmem_free_skb()
> in all cases. The SOCK_QUEUE_SHRUNK flag must also be set.
>
> 3) Reuse of saved skb should have used skb_cloned() instead
> of simply checking if the fast clone has been freed.
>
> Fixes: 472c2e07eef0 ("tcp: add one skb cache for tx")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
I can't think of other corner cases. Thanks!
* Re: [PATCH v2 net-next] tcp: fix zerocopy and notsent_lowat issues
From: Holger Hoffstätte @ 2019-03-26 19:52 UTC (permalink / raw)
To: netdev
On Tue, 26 Mar 2019 08:34:55 -0700, Eric Dumazet wrote:
> My recent patch had at least three problems :
>
> 1) TX zerocopy wants notification when skb is acknowledged,
> thus we need to call skb_zcopy_clear() if the skb is
> cached into sk->sk_tx_skb_cache
>
> 2) Some applications might expect precise EPOLLOUT
> notifications, so we need to update sk->sk_wmem_queued
> and call sk_mem_uncharge() from sk_wmem_free_skb()
> in all cases. The SOCK_QUEUE_SHRUNK flag must also be set.
>
> 3) Reuse of saved skb should have used skb_cloned() instead
> of simply checking if the fast clone has been freed.
>
> Fixes: 472c2e07eef0 ("tcp: add one skb cache for tx")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Soheil Hassas Yeganeh <soheil@google.com>
The v1 of this patch caused a lot of oopsies, but this one
works fine. So:
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
cheers,
Holger
* Re: [PATCH v2 net-next] tcp: fix zerocopy and notsent_lowat issues
From: David Miller @ 2019-03-27 20:59 UTC (permalink / raw)
To: edumazet; +Cc: netdev, eric.dumazet, willemb, soheil
From: Eric Dumazet <edumazet@google.com>
Date: Tue, 26 Mar 2019 08:34:55 -0700
> My recent patch had at least three problems :
>
> 1) TX zerocopy wants notification when skb is acknowledged,
> thus we need to call skb_zcopy_clear() if the skb is
> cached into sk->sk_tx_skb_cache
>
> 2) Some applications might expect precise EPOLLOUT
> notifications, so we need to update sk->sk_wmem_queued
> and call sk_mem_uncharge() from sk_wmem_free_skb()
> in all cases. The SOCK_QUEUE_SHRUNK flag must also be set.
>
> 3) Reuse of saved skb should have used skb_cloned() instead
> of simply checking if the fast clone has been freed.
>
> Fixes: 472c2e07eef0 ("tcp: add one skb cache for tx")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Applied, thanks Eric.