All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] ip: hash fragments consistently
@ 2018-07-23 14:50 Paolo Abeni
  2018-07-23 16:26 ` Eric Dumazet
  2018-07-23 18:40 ` David Miller
  0 siblings, 2 replies; 6+ messages in thread
From: Paolo Abeni @ 2018-07-23 14:50 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller, Tom Herbert

The skb hash for locally generated ip[v6] fragments belonging
to the same datagram can vary in several circumstances:
* for connected UDP[v6] sockets, the first fragment get its hash
  via set_owner_w()/skb_set_hash_from_sk()
* for unconnected IPv6 UDPv6 sockets, the first fragment can get
  its hash via ip6_make_flowlabel()/skb_get_hash_flowi6(), if
  auto_flowlabel is enabled

For the following frags the hash is usually computed via
skb_get_hash().
The above can cause OoO for unconnected IPv6 UDPv6 socket: in that
scenario the egress tx queue can be selected on a per packet basis
via the skb hash.
It may also fool flow-oriented schedulers to place fragments belonging
to the same datagram in different flows.

Fix the issue by copying the skb hash from the head frag into
the others at fragmentation time.

Before this commit:
perf probe -a "dev_queue_xmit skb skb->hash skb->l4_hash:b1@0/8 skb->sw_hash:b1@1/8"
netperf -H $IPV4 -t UDP_STREAM -l 5 -- -m 2000 -n &
perf record -e probe:dev_queue_xmit -e probe:skb_set_owner_w -a sleep 0.1
perf script
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=3713014309 l4_hash=1 sw_hash=0
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=0 l4_hash=0 sw_hash=0

After this commit:
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0

Fixes: b73c3d0e4f0e ("net: Save TX flow hash in sock and set in skbuf on xmit")
Fixes: 67800f9b1f4e ("ipv6: Call skb_get_hash_flowi6 to get skb->hash in ip6_make_flowlabel")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/ipv4/ip_output.c  | 2 ++
 net/ipv6/ip6_output.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index b3308e9d9762..0e3edd25f881 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -523,6 +523,8 @@ static void ip_copy_metadata(struct sk_buff *to, struct sk_buff *from)
 	to->dev = from->dev;
 	to->mark = from->mark;
 
+	skb_copy_hash(to, from);
+
 	/* Copy the flags to each fragment. */
 	IPCB(to)->flags = IPCB(from)->flags;
 
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index a14fb4fcdf18..3168847c30d1 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -570,6 +570,8 @@ static void ip6_copy_metadata(struct sk_buff *to, struct sk_buff *from)
 	to->dev = from->dev;
 	to->mark = from->mark;
 
+	skb_copy_hash(to, from);
+
 #ifdef CONFIG_NET_SCHED
 	to->tc_index = from->tc_index;
 #endif
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net] ip: hash fragments consistently
  2018-07-23 14:50 [PATCH net] ip: hash fragments consistently Paolo Abeni
@ 2018-07-23 16:26 ` Eric Dumazet
  2018-11-09 19:44   ` Eric Dumazet
       [not found]   ` <CGME20181109194407epcas2p3793beaca271f6a5dcd067cedd853b41d@epcms1p3>
  2018-07-23 18:40 ` David Miller
  1 sibling, 2 replies; 6+ messages in thread
From: Eric Dumazet @ 2018-07-23 16:26 UTC (permalink / raw)
  To: Paolo Abeni, netdev; +Cc: David S. Miller, Tom Herbert



On 07/23/2018 07:50 AM, Paolo Abeni wrote:
> The skb hash for locally generated ip[v6] fragments belonging
> to the same datagram can vary in several circumstances:
> * for connected UDP[v6] sockets, the first fragment get its hash
>   via set_owner_w()/skb_set_hash_from_sk()
> * for unconnected IPv6 UDPv6 sockets, the first fragment can get
>   its hash via ip6_make_flowlabel()/skb_get_hash_flowi6(), if
>   auto_flowlabel is enabled
> 
> For the following frags the hash is usually computed via
> skb_get_hash().
> The above can cause OoO for unconnected IPv6 UDPv6 socket: in that
> scenario the egress tx queue can be selected on a per packet basis
> via the skb hash.
> It may also fool flow-oriented schedulers to place fragments belonging
> to the same datagram in different flows.
>

It also fools bond_xmit_hash(), packets of the same datagram can be sent on
two bonding slaves instead of one, meaning adding pressure on the defrag unit
in receiver.

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] ip: hash fragments consistently
  2018-07-23 14:50 [PATCH net] ip: hash fragments consistently Paolo Abeni
  2018-07-23 16:26 ` Eric Dumazet
@ 2018-07-23 18:40 ` David Miller
  1 sibling, 0 replies; 6+ messages in thread
From: David Miller @ 2018-07-23 18:40 UTC (permalink / raw)
  To: pabeni; +Cc: netdev, tom

From: Paolo Abeni <pabeni@redhat.com>
Date: Mon, 23 Jul 2018 16:50:48 +0200

> The skb hash for locally generated ip[v6] fragments belonging
> to the same datagram can vary in several circumstances:
> * for connected UDP[v6] sockets, the first fragment get its hash
>   via set_owner_w()/skb_set_hash_from_sk()
> * for unconnected IPv6 UDPv6 sockets, the first fragment can get
>   its hash via ip6_make_flowlabel()/skb_get_hash_flowi6(), if
>   auto_flowlabel is enabled
> 
> For the following frags the hash is usually computed via
> skb_get_hash().
> The above can cause OoO for unconnected IPv6 UDPv6 socket: in that
> scenario the egress tx queue can be selected on a per packet basis
> via the skb hash.
> It may also fool flow-oriented schedulers to place fragments belonging
> to the same datagram in different flows.
> 
> Fix the issue by copying the skb hash from the head frag into
> the others at fragmentation time.
> 
> Before this commit:
> perf probe -a "dev_queue_xmit skb skb->hash skb->l4_hash:b1@0/8 skb->sw_hash:b1@1/8"
> netperf -H $IPV4 -t UDP_STREAM -l 5 -- -m 2000 -n &
> perf record -e probe:dev_queue_xmit -e probe:skb_set_owner_w -a sleep 0.1
> perf script
> probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=3713014309 l4_hash=1 sw_hash=0
> probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=0 l4_hash=0 sw_hash=0
> 
> After this commit:
> probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0
> probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0
> 
> Fixes: b73c3d0e4f0e ("net: Save TX flow hash in sock and set in skbuf on xmit")
> Fixes: 67800f9b1f4e ("ipv6: Call skb_get_hash_flowi6 to get skb->hash in ip6_make_flowlabel")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Good catch!

Applied and queued up for -stable, thanks!

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net] ip: hash fragments consistently
  2018-07-23 16:26 ` Eric Dumazet
@ 2018-11-09 19:44   ` Eric Dumazet
       [not found]   ` <CGME20181109194407epcas2p3793beaca271f6a5dcd067cedd853b41d@epcms1p3>
  1 sibling, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2018-11-09 19:44 UTC (permalink / raw)
  To: Paolo Abeni, netdev; +Cc: David S. Miller, soukjin bae



On 07/23/2018 09:26 AM, Eric Dumazet wrote:
> 
> 
> On 07/23/2018 07:50 AM, Paolo Abeni wrote:
>> The skb hash for locally generated ip[v6] fragments belonging
>> to the same datagram can vary in several circumstances:
>> * for connected UDP[v6] sockets, the first fragment get its hash
>>   via set_owner_w()/skb_set_hash_from_sk()
>> * for unconnected IPv6 UDPv6 sockets, the first fragment can get
>>   its hash via ip6_make_flowlabel()/skb_get_hash_flowi6(), if
>>   auto_flowlabel is enabled
>>
>> For the following frags the hash is usually computed via
>> skb_get_hash().
>> The above can cause OoO for unconnected IPv6 UDPv6 socket: in that
>> scenario the egress tx queue can be selected on a per packet basis
>> via the skb hash.
>> It may also fool flow-oriented schedulers to place fragments belonging
>> to the same datagram in different flows.
>>
> 
> It also fools bond_xmit_hash(), packets of the same datagram can be sent on
> two bonding slaves instead of one, meaning adding pressure on the defrag unit
> in receiver.
> 
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> 

Also we might note that flow dissector itself is buggy as
found by Soukjin Bae ( https://patchwork.ozlabs.org/patch/994601/ )

I will send a v2 of his patch with a different changelog.

Defrag is fixed [1] but the bug in flow dissector is adding
extra work and hash inconsistencies.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=0d5b9311baf27bb545f187f12ecfd558220c607d

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE:(2) [PATCH net] ip: hash fragments consistently
       [not found]   ` <CGME20181109194407epcas2p3793beaca271f6a5dcd067cedd853b41d@epcms1p3>
@ 2018-11-12  0:40     ` 배석진
  2018-11-12  3:37       ` (2) " Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: 배석진 @ 2018-11-12  0:40 UTC (permalink / raw)
  To: Eric Dumazet, Paolo Abeni, netdev; +Cc: David S. Miller

> Also we might note that flow dissector itself is buggy as
> found by Soukjin Bae ( https://patchwork.ozlabs.org/patch/994601/ )
> 
> I will send a v2 of his patch with a different changelog.
>  
> Defrag is fixed [1] but the bug in flow dissector is adding
> extra work and hash inconsistencies.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=0d5b9311baf27bb545f187f12ecfd558220c607d


Dear all,

it's good news to mee too.
thanks for accept my work :)

 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: (2) [PATCH net] ip: hash fragments consistently
  2018-11-12  0:40     ` 배석진
@ 2018-11-12  3:37       ` Eric Dumazet
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2018-11-12  3:37 UTC (permalink / raw)
  To: soukjin.bae, Paolo Abeni, netdev; +Cc: David S. Miller



On 11/11/2018 04:40 PM, 배석진 wrote:
>> Also we might note that flow dissector itself is buggy as
>> found by Soukjin Bae ( https://patchwork.ozlabs.org/patch/994601/ )
>>  
>> I will send a v2 of his patch with a different changelog.
>>  
>> Defrag is fixed [1] but the bug in flow dissector is adding
>> extra work and hash inconsistencies.
>>  
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=0d5b9311baf27bb545f187f12ecfd558220c607d
> 
> 
> Dear all,
> 
> it's good news to mee too.
> thanks for accept my work :)

Sure thing ;)

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-11-12 13:28 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-23 14:50 [PATCH net] ip: hash fragments consistently Paolo Abeni
2018-07-23 16:26 ` Eric Dumazet
2018-11-09 19:44   ` Eric Dumazet
     [not found]   ` <CGME20181109194407epcas2p3793beaca271f6a5dcd067cedd853b41d@epcms1p3>
2018-11-12  0:40     ` 배석진
2018-11-12  3:37       ` (2) " Eric Dumazet
2018-07-23 18:40 ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.