All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next] tcp: drop dst in tcp_add_backlog()
@ 2018-11-20  1:45 Eric Dumazet
  2018-11-20 18:01 ` David Miller
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2018-11-20  1:45 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, Eric Dumazet, Eric Dumazet

Under stress, softirq rx handler often hits a socket owned by the user,
and has to queue the packet into socket backlog.

When this happens, skb dst refcount is taken before we escape rcu
protected region. This is done from __sk_add_backlog() calling
skb_dst_force().

Consumer will have to perform the opposite costly operation.

AFAIK nothing in tcp stack requests the dst after skb was stored
in the backlog. If this was the case, we would have had failures
already since skb_dst_force() can end up clearing skb dst anyway.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_ipv4.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 0952d4b772e7aa3cfb936596aa586dbd2dd6efde..795605a2327504b8a025405826e7e0ca8dc8501d 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1634,6 +1634,8 @@ bool tcp_add_backlog(struct sock *sk, struct sk_buff *skb)
 	 */
 	skb_condense(skb);
 
+	skb_dst_drop(skb);
+
 	if (unlikely(sk_add_backlog(sk, skb, limit))) {
 		bh_unlock_sock(sk);
 		__NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPBACKLOGDROP);
-- 
2.19.1.1215.g8438c0b245-goog

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next] tcp: drop dst in tcp_add_backlog()
  2018-11-20  1:45 [PATCH net-next] tcp: drop dst in tcp_add_backlog() Eric Dumazet
@ 2018-11-20 18:01 ` David Miller
  2018-11-20 18:08   ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: David Miller @ 2018-11-20 18:01 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet

From: Eric Dumazet <edumazet@google.com>
Date: Mon, 19 Nov 2018 17:45:55 -0800

> Under stress, softirq rx handler often hits a socket owned by the user,
> and has to queue the packet into socket backlog.
> 
> When this happens, skb dst refcount is taken before we escape rcu
> protected region. This is done from __sk_add_backlog() calling
> skb_dst_force().
> 
> Consumer will have to perform the opposite costly operation.
> 
> AFAIK nothing in tcp stack requests the dst after skb was stored
> in the backlog. If this was the case, we would have had failures
> already since skb_dst_force() can end up clearing skb dst anyway.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Hmmm, it seems to be used by connection completion code to setup the
socket cached rx dst, right?

For example tcp_finish_connect() --> icsk->icsk_af_ops->sk_rx_dst_set(sk, skb)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next] tcp: drop dst in tcp_add_backlog()
  2018-11-20 18:01 ` David Miller
@ 2018-11-20 18:08   ` Eric Dumazet
  2018-11-20 18:16     ` David Miller
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2018-11-20 18:08 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Eric Dumazet

On Tue, Nov 20, 2018 at 10:01 AM David Miller <davem@davemloft.net> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
> Date: Mon, 19 Nov 2018 17:45:55 -0800
>
> > Under stress, softirq rx handler often hits a socket owned by the user,
> > and has to queue the packet into socket backlog.
> >
> > When this happens, skb dst refcount is taken before we escape rcu
> > protected region. This is done from __sk_add_backlog() calling
> > skb_dst_force().
> >
> > Consumer will have to perform the opposite costly operation.
> >
> > AFAIK nothing in tcp stack requests the dst after skb was stored
> > in the backlog. If this was the case, we would have had failures
> > already since skb_dst_force() can end up clearing skb dst anyway.
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
>
> Hmmm, it seems to be used by connection completion code to setup the
> socket cached rx dst, right?
>
> For example tcp_finish_connect() --> icsk->icsk_af_ops->sk_rx_dst_set(sk, skb)

We already cope with skb->dst being NULL there I believe.

For reference look at

commit 5037e9ef9454917b047f9f3a19b4dd179fbf7cd4    net: fix IP early demux races

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next] tcp: drop dst in tcp_add_backlog()
  2018-11-20 18:08   ` Eric Dumazet
@ 2018-11-20 18:16     ` David Miller
  2018-11-20 18:21       ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: David Miller @ 2018-11-20 18:16 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet

From: Eric Dumazet <edumazet@google.com>
Date: Tue, 20 Nov 2018 10:08:12 -0800

> On Tue, Nov 20, 2018 at 10:01 AM David Miller <davem@davemloft.net> wrote:
>>
>> From: Eric Dumazet <edumazet@google.com>
>> Date: Mon, 19 Nov 2018 17:45:55 -0800
>>
>> > Under stress, softirq rx handler often hits a socket owned by the user,
>> > and has to queue the packet into socket backlog.
>> >
>> > When this happens, skb dst refcount is taken before we escape rcu
>> > protected region. This is done from __sk_add_backlog() calling
>> > skb_dst_force().
>> >
>> > Consumer will have to perform the opposite costly operation.
>> >
>> > AFAIK nothing in tcp stack requests the dst after skb was stored
>> > in the backlog. If this was the case, we would have had failures
>> > already since skb_dst_force() can end up clearing skb dst anyway.
>> >
>> > Signed-off-by: Eric Dumazet <edumazet@google.com>
>>
>> Hmmm, it seems to be used by connection completion code to setup the
>> socket cached rx dst, right?
>>
>> For example tcp_finish_connect() --> icsk->icsk_af_ops->sk_rx_dst_set(sk, skb)
> 
> We already cope with skb->dst being NULL there I believe.
> 
> For reference look at
> 
> commit 5037e9ef9454917b047f9f3a19b4dd179fbf7cd4    net: fix IP early demux races

Well, I'm sure we "handle" it.  But I was more asking about the performance
tradeoff, which probably is on the side of your change but I wanted to
just be sure.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next] tcp: drop dst in tcp_add_backlog()
  2018-11-20 18:16     ` David Miller
@ 2018-11-20 18:21       ` Eric Dumazet
  2018-11-20 18:26         ` David Miller
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2018-11-20 18:21 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Eric Dumazet

On Tue, Nov 20, 2018 at 10:16 AM David Miller <davem@davemloft.net> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
> Date: Tue, 20 Nov 2018 10:08:12 -0800
>
> > On Tue, Nov 20, 2018 at 10:01 AM David Miller <davem@davemloft.net> wrote:
> >>
> >> From: Eric Dumazet <edumazet@google.com>
> >> Date: Mon, 19 Nov 2018 17:45:55 -0800
> >>
> >> > Under stress, softirq rx handler often hits a socket owned by the user,
> >> > and has to queue the packet into socket backlog.
> >> >
> >> > When this happens, skb dst refcount is taken before we escape rcu
> >> > protected region. This is done from __sk_add_backlog() calling
> >> > skb_dst_force().
> >> >
> >> > Consumer will have to perform the opposite costly operation.
> >> >
> >> > AFAIK nothing in tcp stack requests the dst after skb was stored
> >> > in the backlog. If this was the case, we would have had failures
> >> > already since skb_dst_force() can end up clearing skb dst anyway.
> >> >
> >> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> >>
> >> Hmmm, it seems to be used by connection completion code to setup the
> >> socket cached rx dst, right?
> >>
> >> For example tcp_finish_connect() --> icsk->icsk_af_ops->sk_rx_dst_set(sk, skb)
> >
> > We already cope with skb->dst being NULL there I believe.
> >
> > For reference look at
> >
> > commit 5037e9ef9454917b047f9f3a19b4dd179fbf7cd4    net: fix IP early demux races
>
> Well, I'm sure we "handle" it.  But I was more asking about the performance
> tradeoff, which probably is on the side of your change but I wanted to
> just be sure.

Ah sorry for misunderstanding.
Yes, that should be fine, backlog is not generally used at the
beginning of a flow.

I am working on adding coalescing support to tcp_add_backlog() to balance
time spent in softirq and time spent from process context, hoping to reduce
probability of having user threads trapped for a while in __release_sock()

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next] tcp: drop dst in tcp_add_backlog()
  2018-11-20 18:21       ` Eric Dumazet
@ 2018-11-20 18:26         ` David Miller
  0 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2018-11-20 18:26 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet

From: Eric Dumazet <edumazet@google.com>
Date: Tue, 20 Nov 2018 10:21:22 -0800

> On Tue, Nov 20, 2018 at 10:16 AM David Miller <davem@davemloft.net> wrote:
>>
>> Well, I'm sure we "handle" it.  But I was more asking about the performance
>> tradeoff, which probably is on the side of your change but I wanted to
>> just be sure.
> 
> Ah sorry for misunderstanding.
> Yes, that should be fine, backlog is not generally used at the
> beginning of a flow.
> 
> I am working on adding coalescing support to tcp_add_backlog() to balance
> time spent in softirq and time spent from process context, hoping to reduce
> probability of having user threads trapped for a while in __release_sock()

Ok, meanwhile I applied the patch under discussion.

Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-11-21  4:56 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-20  1:45 [PATCH net-next] tcp: drop dst in tcp_add_backlog() Eric Dumazet
2018-11-20 18:01 ` David Miller
2018-11-20 18:08   ` Eric Dumazet
2018-11-20 18:16     ` David Miller
2018-11-20 18:21       ` Eric Dumazet
2018-11-20 18:26         ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.