* some failures with vxlan offloads..
@ 2014-10-26 13:36 Or Gerlitz
  2014-10-26 15:29 ` Tom Herbert
  0 siblings, 1 reply; 9+ messages in thread
From: Or Gerlitz @ 2014-10-26 13:36 UTC
  To: Tom Herbert; +Cc: netdev, John Fastabend, Jeff Kirsher

Hi all, Tom..

Running VXLAN traffic using a driver/NIC which supports offloads (mlx4 
driver, ConnectX-3 pro NIC), I see some configurations that don't really 
work. The testing was done over the net tree, 3.17.0+, as of commit 
d10845f "Merge branch 'gso_encap_fixes'". When I say "breaks", it means 
that encapsulated ping works, but encapsulated TCP (netperf) doesn't.

conf    client          server          status
----------------------------------------------
1       offloaded       offloaded       works
2       non-offloaded   non-offloaded   works
3       non-offloaded   offloaded       breaks
4       offloaded       non-offloaded   breaks


In the cases where it breaks I can see

         UDP: bad checksum. From 192.168.31.18:54748 to 
192.168.31.17:4789 ulen 726

prints from __udp4_lib_rcv() in the kernel log of the node where 
offloads are OFF, where the bad packet is sent from the host where 
offloading is enabled. I guess the packet is just dropped:

# dmesg -c ; nstat
UDP: bad checksum. From 192.168.31.18:45084 to 192.168.31.17:4789 ulen 78
UDP: bad checksum. From 192.168.31.18:45084 to 192.168.31.17:4789 ulen 78
#kernel
IpInReceives                    18                 0.0
IpInDelivers                    18                 0.0
IpOutRequests                   17                 0.0
TcpInSegs                       15                 0.0
TcpOutSegs                      12                 0.0
TcpRetransSegs                  1                  0.0
UdpInDatagrams                  1                  0.0
UdpInErrors                     2                  0.0
UdpOutDatagrams                 3                  0.0
UdpInCsumErrors                 2                  0.0
TcpExtTCPHPHits                 1                  0.0
TcpExtTCPHPAcks                 12                 0.0
TcpExtTCPAutoCorking            5                  0.0
TcpExtTCPSynRetrans             1                  0.0
TcpExtTCPOrigDataSent           12                 0.0
IpExtInOctets                   1068               0.0
IpExtOutOctets                  3174               0.0
IpExtInNoECTPkts                18                 0.0
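
The two "UDP: bad checksum" log lines above line up with UdpInCsumErrors 2 and UdpInErrors 2 in the counters. As a minimal sketch (the helper name parse_nstat is hypothetical, and the sample is a subset of the output above), the counters can be pulled out of saved nstat output like this:

```python
def parse_nstat(text):
    """Parse saved `nstat` output into {counter_name: int}.

    Lines that are not "<name> <value> <rate>" (the "#kernel" header,
    interleaved dmesg lines) are skipped.
    """
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if line.startswith("#") or len(fields) != 3:
            continue
        name, value, _rate = fields
        if value.isdigit():
            counters[name] = int(value)
    return counters


sample = """\
UDP: bad checksum. From 192.168.31.18:45084 to 192.168.31.17:4789 ulen 78
#kernel
IpInReceives                    18                 0.0
UdpInDatagrams                  1                  0.0
UdpInErrors                     2                  0.0
UdpInCsumErrors                 2                  0.0
"""

stats = parse_nstat(sample)
print(stats["UdpInCsumErrors"], stats["UdpInErrors"])
```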



The mlx4 driver advertizes NETIF_F_GSO_UDP_TUNNEL but 
not NETIF_F_GSO_UDP_TUNNEL_CSUM

I wonder if such or similar configs work for people with other 
drivers/NIC that supports offloads?

Tom, I think you were testing your changes with bnx2x

Or.

Setup details: I use OVS with VXLAN, create a <veth0,veth1> pair, plug 
veth1 to OVS and set an IP address on veth0, run ping and later netperf over 
the veth interfaces IP subnet (192.168.52/24 in this case) which goes 
through VXLAN encapsulation over the host subnet (192.168.31/24 in this 
case).

client: host 192.168.31.17 / inner 192.168.52.17
server: host 192.168.31.18 / inner 192.168.52.18

output from config #3

the client side has these messages printed from __udp4_lib_rcv()
on the csum_error label

UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789 ulen 70
UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789 ulen 726
UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789 ulen 70
UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789 ulen 726
UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789 ulen 70
UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789 ulen 70
UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789 ulen 726


output from config #4

the server side has these messages printed from __udp4_lib_rcv()
on the csum_error label and the below warning

UDP: bad checksum. From 192.168.31.17:34521 to 192.168.31.18:4789 ulen 1480
UDP: bad checksum. From 192.168.31.17:34521 to 192.168.31.18:4789 ulen 1480
UDP: bad checksum. From 192.168.31.17:34521 to 192.168.31.18:4789 ulen 1480
UDP: bad checksum. From 192.168.31.17:34521 to 192.168.31.18:4789 ulen 1480
UDP: bad checksum. From 192.168.31.17:60499 to 192.168.31.18:4789 ulen 78
UDP: bad checksum. From 192.168.31.17:36909 to 192.168.31.18:4789 ulen 78
UDP: bad checksum. From 192.168.31.17:36909 to 192.168.31.18:4789 ulen 78
UDP: bad checksum. From 192.168.31.17:36909 to 192.168.31.18:4789 ulen 78


------------[ cut here ]------------
WARNING: CPU: 0 PID: 5427 at net/core/skbuff.c:4006 
skb_try_coalesce+0x25e/0x395()
Modules linked in: mlx4_ib mlx4_en mlx4_core veth ib_ipoib ib_cm ib_umad 
ib_sa ib_mad ib_core ib_addr igb dca ptp pps_core hwmon autofs4 sunrpc 
target_core_mod configfs ipmi_devintf ipmi_si ipmi_msghandler ipv6 
openvswitch vxlan geneve udp_tunnel ip6_udp_tunnel gre crc32c_generic 
libcrc32c dm_mirror dm_region_hash dm_log uinput dm_mod microcode sr_mod 
ext3 jbd usb_storage floppy sd_mod ata_piix libata scsi_mod uhci_hcd 
[last unloaded: mlx4_core]
CPU: 0 PID: 5427 Comm: netserver Not tainted 3.17.0+ #172
Hardware name: Supermicro X7DWU/X7DWU, BIOS  1.1 04/30/2008
  0000000000000fa6 ffff8802156039b8 ffffffff813f6da9 0000000000000fa6
  0000000000000000 ffff8802156039f8 ffffffff8103dc38 ffff8802239b40c0
  ffffffff81352a6a ffff8800c5530e00 ffff880215fd1200 ffff880215603a74
Call Trace:
  [<ffffffff813f6da9>] dump_stack+0x51/0x70
  [<ffffffff8103dc38>] warn_slowpath_common+0x7c/0x96
  [<ffffffff81352a6a>] ? skb_try_coalesce+0x25e/0x395
  [<ffffffff8103dc67>] warn_slowpath_null+0x15/0x17
  [<ffffffff81352a6a>] skb_try_coalesce+0x25e/0x395
  [<ffffffff813a0468>] tcp_try_coalesce+0x35/0x91
  [<ffffffff813a0525>] tcp_queue_rcv+0x61/0x101
  [<ffffffff813a344f>] tcp_rcv_established+0x3b9/0x602
  [<ffffffff8134d7e1>] ? release_sock+0x30/0x1b0
  [<ffffffff813aa139>] tcp_v4_do_rcv+0x105/0x41a
  [<ffffffff8134d8b6>] release_sock+0x105/0x1b0
  [<ffffffff8139a9cf>] tcp_recvmsg+0x912/0xa5b
  [<ffffffff81084cef>] ? rcu_irq_exit+0x7d/0x8f
  [<ffffffff813fcea0>] ? retint_restore_args+0xe/0xe
  [<ffffffff813bc82d>] inet_recvmsg+0xd1/0xeb
  [<ffffffff81349c0e>] sock_recvmsg+0x94/0xb2
  [<ffffffff81070cd4>] ? trace_hardirqs_on+0xd/0xf
  [<ffffffff813fbd99>] ? _raw_spin_unlock_irq+0x2b/0x38
  [<ffffffff8113b882>] ? __fdget+0xe/0x10
  [<ffffffff81349ceb>] SyS_recvfrom+0xbf/0x10f
  [<ffffffff811e405e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
  [<ffffffff81062070>] ? try_to_wake_up+0x2d0/0x317
  [<ffffffff8134d7e1>] ? release_sock+0x30/0x1b0
  [<ffffffff813fc2d2>] system_call_fastpath+0x12/0x17
---[ end trace d39905841ae018aa ]---


* Re: some failures with vxlan offloads..
  2014-10-26 13:36 some failures with vxlan offloads Or Gerlitz
@ 2014-10-26 15:29 ` Tom Herbert
  2014-10-26 22:23   ` Or Gerlitz
  0 siblings, 1 reply; 9+ messages in thread
From: Tom Herbert @ 2014-10-26 15:29 UTC
  To: Or Gerlitz; +Cc: netdev, John Fastabend, Jeff Kirsher

On Sun, Oct 26, 2014 at 6:36 AM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
> Hi all, Tom..
>
> Running VXLAN traffic using driver/NIC which support offloads (mlx4 driver,
> ConnectX-3 pro NIC), I see some configurations that don't really work. The
> testing was done over the net tree, 3.17.0+, as of commit d10845f "Merge
> branch 'gso_encap_fixes'", when I say breaks, it means that encapsulated ping
> works, but encapsulated TCP (netperf) doesn't.
>
> conf    client          server          status
> ----------------------------------------------
> 1       offloaded       offloaded       works
> 2       non-offloaded   non-offloaded   works
> 3       non-offloaded   offloaded       breaks
> 4       offloaded       non-offloaded   breaks
>
>
> In the cases where it breaks I can see
>
>         UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789
> ulen 726
>
> prints from __udp4_lib_rcv() in the kernel log of the node where offloads
> are OFF, where the bad packet is sent from the host where offloading is
> enabled. I guess the packet is just dropped:
>
Can you determine what the TSO HW engine is setting in UDP checksum
field? tcpdump -vv might be able to show this. The symptoms seem to
indicate that it may not be zero.
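
A minimal sketch of reading that field out of a captured frame, assuming an untagged Ethernet II frame carrying IPv4 and a hex dump in the layout tcpdump prints with -x/-xx. The function name outer_udp_header is hypothetical; the sample bytes are the first 48 bytes of two frames captured later in this thread (one after the driver fix, one before it):

```python
import struct


def outer_udp_header(frame_hex):
    """Return (src_port, dst_port, ulen, checksum) of the outer UDP header.

    Assumes untagged Ethernet II + IPv4; the IP header length is taken
    from the IHL field rather than hard-coded.
    """
    frame = bytes.fromhex("".join(frame_hex.split()))
    eth_len = 14
    if frame[12:14] != b"\x08\x00":
        raise ValueError("not an IPv4 frame")
    ihl = (frame[eth_len] & 0x0F) * 4
    if frame[eth_len + 9] != 17:
        raise ValueError("outer protocol is not UDP")
    udp_off = eth_len + ihl
    return struct.unpack("!HHHH", frame[udp_off:udp_off + 8])


AFTER_FIX = """
f452 1401 da82 0002 c9e9 bf32 0800 4500
05dc ef5b 4000 4011 8641 c0a8 1f11 c0a8
1f12 b14b 12b5 05c8 0000 0800 0000 0000
"""
BEFORE_FIX = """
f452 1401 da82 0002 c9e9 bf32 0800 4500
02ea ccac 4000 4011 abe2 c0a8 1f11 c0a8
1f12 dd6e 12b5 02d6 0c1b 0800 0000 0000
"""

for label, dump in (("after fix", AFTER_FIX), ("before fix", BEFORE_FIX)):
    sport, dport, ulen, check = outer_udp_header(dump)
    print(f"{label}: {sport} > {dport} ulen {ulen} udp checksum 0x{check:04x}")
```

A checksum field of 0x0000 means the sender did not compute a UDP checksum; any other value is expected to verify on the receiver.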

> # dmesg -c ; nstat
> UDP: bad checksum. From 192.168.31.18:45084 to 192.168.31.17:4789 ulen 78
> UDP: bad checksum. From 192.168.31.18:45084 to 192.168.31.17:4789 ulen 78
> #kernel
> IpInReceives                    18                 0.0
> IpInDelivers                    18                 0.0
> IpOutRequests                   17                 0.0
> TcpInSegs                       15                 0.0
> TcpOutSegs                      12                 0.0
> TcpRetransSegs                  1                  0.0
> UdpInDatagrams                  1                  0.0
> UdpInErrors                     2                  0.0
> UdpOutDatagrams                 3                  0.0
> UdpInCsumErrors                 2                  0.0
> TcpExtTCPHPHits                 1                  0.0
> TcpExtTCPHPAcks                 12                 0.0
> TcpExtTCPAutoCorking            5                  0.0
> TcpExtTCPSynRetrans             1                  0.0
> TcpExtTCPOrigDataSent           12                 0.0
> IpExtInOctets                   1068               0.0
> IpExtOutOctets                  3174               0.0
> IpExtInNoECTPkts                18                 0.0
>
>
>
> The mlx4 driver advertizes NETIF_F_GSO_UDP_TUNNEL but
> not NETIF_F_GSO_UDP_TUNNEL_CSUM
>
> I wonder if such or similar configs work for people with other drivers/NIC
> that supports offloads?
>
> Tom, I think you were testing your changes with bnx2x
>
> Or.
>
> Setup details: I use OVS with VXLAN, create <veth0,veth1> pair, plug veth1 to
> OVS and as ip address on veth0, run ping and later netperf over the veth
> interfaces IP subnet (192.168.52/24 in this case) which goes through VXLAN
> encapsulation over the host subnet (192.168.31/24 in this case).
>
> client: host 192.168.31.17 / inner 192.168.52.17
> server: host 192.168.31.18 / inner 192.168.52.18
>
> output from config #3
>
> the client side has these messages printed from __udp4_lib_rcv()
> on the csum_error label
>
> UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789 ulen 70
> UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789 ulen 726
> UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789 ulen 70
> UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789 ulen 726
> UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789 ulen 70
> UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789 ulen 70
> UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789 ulen 726
>
>
> output from config #4
>
> the server side has these messages printed from __udp4_lib_rcv()
> on the csum_error label and the below warning
>
> UDP: bad checksum. From 192.168.31.17:34521 to 192.168.31.18:4789 ulen 1480
> UDP: bad checksum. From 192.168.31.17:34521 to 192.168.31.18:4789 ulen 1480
> UDP: bad checksum. From 192.168.31.17:34521 to 192.168.31.18:4789 ulen 1480
> UDP: bad checksum. From 192.168.31.17:34521 to 192.168.31.18:4789 ulen 1480
> UDP: bad checksum. From 192.168.31.17:60499 to 192.168.31.18:4789 ulen 78
> UDP: bad checksum. From 192.168.31.17:36909 to 192.168.31.18:4789 ulen 78
> UDP: bad checksum. From 192.168.31.17:36909 to 192.168.31.18:4789 ulen 78
> UDP: bad checksum. From 192.168.31.17:36909 to 192.168.31.18:4789 ulen 78
>
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 5427 at net/core/skbuff.c:4006
> skb_try_coalesce+0x25e/0x395()
> Modules linked in: mlx4_ib mlx4_en mlx4_core veth ib_ipoib ib_cm ib_umad
> ib_sa ib_mad ib_core ib_addr igb dca ptp pps_core hwmon autofs4 sunrpc
> target_core_mod configfs ipmi_devintf ipmi_si ipmi_msghandler ipv6
> openvswitch vxlan geneve udp_tunnel ip6_udp_tunnel gre crc32c_generic
> libcrc32c dm_mirror dm_region_hash dm_log uinput dm_mod microcode sr_mod ext3
> jbd usb_storage floppy sd_mod ata_piix libata scsi_mod uhci_hcd [last
> unloaded: mlx4_core]
> CPU: 0 PID: 5427 Comm: netserver Not tainted 3.17.0+ #172
> Hardware name: Supermicro X7DWU/X7DWU, BIOS  1.1 04/30/2008
>  0000000000000fa6 ffff8802156039b8 ffffffff813f6da9 0000000000000fa6
>  0000000000000000 ffff8802156039f8 ffffffff8103dc38 ffff8802239b40c0
>  ffffffff81352a6a ffff8800c5530e00 ffff880215fd1200 ffff880215603a74
> Call Trace:
>  [<ffffffff813f6da9>] dump_stack+0x51/0x70
>  [<ffffffff8103dc38>] warn_slowpath_common+0x7c/0x96
>  [<ffffffff81352a6a>] ? skb_try_coalesce+0x25e/0x395
>  [<ffffffff8103dc67>] warn_slowpath_null+0x15/0x17
>  [<ffffffff81352a6a>] skb_try_coalesce+0x25e/0x395
>  [<ffffffff813a0468>] tcp_try_coalesce+0x35/0x91
>  [<ffffffff813a0525>] tcp_queue_rcv+0x61/0x101
>  [<ffffffff813a344f>] tcp_rcv_established+0x3b9/0x602
>  [<ffffffff8134d7e1>] ? release_sock+0x30/0x1b0
>  [<ffffffff813aa139>] tcp_v4_do_rcv+0x105/0x41a
>  [<ffffffff8134d8b6>] release_sock+0x105/0x1b0
>  [<ffffffff8139a9cf>] tcp_recvmsg+0x912/0xa5b
>  [<ffffffff81084cef>] ? rcu_irq_exit+0x7d/0x8f
>  [<ffffffff813fcea0>] ? retint_restore_args+0xe/0xe
>  [<ffffffff813bc82d>] inet_recvmsg+0xd1/0xeb
>  [<ffffffff81349c0e>] sock_recvmsg+0x94/0xb2
>  [<ffffffff81070cd4>] ? trace_hardirqs_on+0xd/0xf
>  [<ffffffff813fbd99>] ? _raw_spin_unlock_irq+0x2b/0x38
>  [<ffffffff8113b882>] ? __fdget+0xe/0x10
>  [<ffffffff81349ceb>] SyS_recvfrom+0xbf/0x10f
>  [<ffffffff811e405e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>  [<ffffffff81062070>] ? try_to_wake_up+0x2d0/0x317
>  [<ffffffff8134d7e1>] ? release_sock+0x30/0x1b0
>  [<ffffffff813fc2d2>] system_call_fastpath+0x12/0x17
> ---[ end trace d39905841ae018aa ]---
>


* Re: some failures with vxlan offloads..
  2014-10-26 15:29 ` Tom Herbert
@ 2014-10-26 22:23   ` Or Gerlitz
  2014-10-27  1:23     ` Tom Herbert
  0 siblings, 1 reply; 9+ messages in thread
From: Or Gerlitz @ 2014-10-26 22:23 UTC
  To: Tom Herbert; +Cc: Or Gerlitz, netdev, John Fastabend, Jeff Kirsher

On Sun, Oct 26, 2014 at 5:29 PM, Tom Herbert <therbert@google.com> wrote:
> On Sun, Oct 26, 2014 at 6:36 AM, Or Gerlitz <ogerlitz@mellanox.com> wrote:

>> In the cases where it breaks I can see
>>         UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789
>> ulen 726
>> prints from __udp4_lib_rcv() in the kernel log of the node where offloads
>> are OFF, where the bad packet is sent from the host where offloading is
>> enabled. I guess the packet is just dropped:

> Can you determine what the TSO HW engine is setting in UDP checksum
> field? tcpdump -vv might be able to show this. The symptoms seem to
> indicate that it may not be zero.

Thanks for the quick response. I'll check what is placed in the UDP
checksum field for packets that went through the offloading HW and let
you know.

BTW, following the direction you proposed, I wonder why this works
(i.e. the kernel doesn't drop the encapsulated TCP packets) when both
sides are offloaded?

Tomorrow (Monday) I am OOO so will be able to do these further tests Tuesday.

Or.


* Re: some failures with vxlan offloads..
  2014-10-26 22:23   ` Or Gerlitz
@ 2014-10-27  1:23     ` Tom Herbert
  2014-10-28 15:27       ` Or Gerlitz
  0 siblings, 1 reply; 9+ messages in thread
From: Tom Herbert @ 2014-10-27  1:23 UTC
  To: Or Gerlitz; +Cc: Or Gerlitz, netdev, John Fastabend, Jeff Kirsher

On Sun, Oct 26, 2014 at 3:23 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Sun, Oct 26, 2014 at 5:29 PM, Tom Herbert <therbert@google.com> wrote:
>> On Sun, Oct 26, 2014 at 6:36 AM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
>
>>> In the cases where it breaks I can see
>>>         UDP: bad checksum. From 192.168.31.18:54748 to 192.168.31.17:4789
>>> ulen 726
>>> prints from __udp4_lib_rcv() in the kernel log of the node where offloads
>>> are OFF, where the bad packet is sent from the host where offloading is
>>> enabled. I guess the packet is just dropped:
>
>> Can you determine what the TSO HW engine is setting in UDP checksum
>> field? tcpdump -vv might be able to show this. The symptoms seem to
>> indicate that it may not be zero.
>
> Thanks for the quick response. I'll check what is placed in the UDP
> checksum field for packets that went through the offloading HW and let
> you know.
>
> BTW, if following the direction you proposed, I wonder why this works
> (e.g the kernel doesn't drops the encapsulated TCP packets) when both
> sides are offloaded?
>
I'm just speculating, but the device may be returning checksum
unnecessary for the UDP checksum without actually checking it.
Technically, VXLAN RFC7348 allows an implementation to ignore the UDP
checksum, although this clearly violates RFC1122 UDP checksum
requirements. In the stack we now checksum all non-zero checksums
including UDP checksum in VXLAN if it's not marked
checksum-unnecessary.

Tom


> Tomorrow (Monday) I am OOO so will be able to do these further tests Tuesday.
>
> Or.


* Re: some failures with vxlan offloads..
  2014-10-27  1:23     ` Tom Herbert
@ 2014-10-28 15:27       ` Or Gerlitz
  2014-10-28 15:36         ` Tom Herbert
  0 siblings, 1 reply; 9+ messages in thread
From: Or Gerlitz @ 2014-10-28 15:27 UTC
  To: Tom Herbert; +Cc: netdev, John Fastabend, Jeff Kirsher

On 10/27/2014 3:23 AM, Tom Herbert wrote:
> On Sun, Oct 26, 2014 at 3:23 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>> On Sun, Oct 26, 2014 at 5:29 PM, Tom Herbert <therbert@google.com> wrote:
>>> Can you determine what the TSO HW engine is setting in UDP checksum
>>> field? tcpdump -vv might be able to show this. The symptoms seem to
>>> indicate that it may not be zero.
>> Thanks for the quick response. I'll check what is placed in the UDP
>> checksum field for packets that went through the offloading HW and let
>> you know.
>>
>> BTW, if following the direction you proposed, I wonder why this works
>> (e.g the kernel doesn't drops the encapsulated TCP packets) when both
>> sides are offloaded?
>>
> I'm just speculating, but the device may be returning checksum unnecessary for the UDP checksum without actually checking it. Technically, VXLAN RFC7348 allows an implementation to ignore the UDP checksum, although this clearly violates RFC1122 UDP checksum
> requirements. In the stack we now checksum all non-zero checksums including UDP checksum in VXLAN if it's not marked checksum-unnecessary.

OK, I found something (it's always a bad habit to try and potentially 
blame someone else for your bugs...) -- as I wrote here earlier, the 
current HW doesn't support checksum generation for both the inner (say 
TCP) and outer (UDP) packet (and indeed we don't advertize 
SKB_GSO_UDP_TUNNEL_CSUM).

So if we tell the HW to offload the inner TCP checksum we must **not** 
tell it to attempt to offload the outer checksum too, and I wrongly 
did that... once I stopped doing so, mixed configurations (one 
side offloaded, the peer not offloaded) work. I will submit an mlx4 fix 
for that.
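
The rule described above can be sketched as a small model. This is not the mlx4 driver code: the TxCsum flag names, the function tx_csum_request, and its hw_outer_udp_csum parameter are all hypothetical and only illustrate which checksum the driver would ask the HW to generate.

```python
from enum import Flag


class TxCsum(Flag):
    """Hypothetical per-packet checksum requests handed to the HW."""
    NONE = 0
    OUTER_L4 = 1   # outer (or only) TCP/UDP checksum
    INNER_L4 = 2   # TCP/UDP checksum of the encapsulated packet


def tx_csum_request(needs_csum, encapsulated, hw_outer_udp_csum=False):
    """Pick the checksum offloads to request for one transmitted packet."""
    if not needs_csum:
        return TxCsum.NONE
    if not encapsulated:
        return TxCsum.OUTER_L4
    request = TxCsum.INNER_L4
    # Only ask for the outer UDP checksum as well if the HW can really do
    # both; otherwise the outer checksum field has to be left at zero.
    if hw_outer_udp_csum:
        request |= TxCsum.OUTER_L4
    return request


print(tx_csum_request(needs_csum=True, encapsulated=True))
print(tx_csum_request(needs_csum=True, encapsulated=False))
```

In this model the bug corresponds to requesting OUTER_L4 together with INNER_L4 on HW that cannot generate both.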

I wonder if we have another bug somewhere... when both sides were 
offloaded, it works even with the mlx4 bug, can you explain that? Is it 
possible that the GRO stack somehow covers up the bug when both sides 
are offloaded and GRO/VXLAN comes into play?

Or.

after the fix, packets sent by the offloaded side (192.168.31.17) carry 
zero UDP checksum

17:20:44.445866 IP (tos 0x0, ttl 64, id 61275, offset 0, flags [DF], 
proto UDP (17), length 1500)
     192.168.31.17.45387 > 192.168.31.18.4789: [no cksum] UDP, length 1472
         0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500
         0x0010:  05dc ef5b 4000 4011 8641 c0a8 1f11 c0a8
         0x0020:  1f12 b14b 12b5 05c8 0000 0800 0000 0000
         0x0030:  6300 b2c7 81db e850 7a83 2ecb 8c68 0800
         0x0040:  4500 05aa bb84 4000 4006 9055 c0a8 3411
         0x0050:  c0a8 3412 e479 9116 553a e008 f28e 6268
         0x0060:  5010 0038 88e3 0000 6600 6e65 7470 6572
         0x0070:  6600 6e65 7470 6572 6600
17:20:44.445871 IP (tos 0x0, ttl 64, id 61276, offset 0, flags [DF], 
proto UDP (17), length 1500)
     192.168.31.17.45387 > 192.168.31.18.4789: [no cksum] UDP, length 1472
         0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500
         0x0010:  05dc ef5c 4000 4011 8640 c0a8 1f11 c0a8
         0x0020:  1f12 b14b 12b5 05c8 0000 0800 0000 0000
         0x0030:  6300 b2c7 81db e850 7a83 2ecb 8c68 0800
         0x0040:  4500 05aa bb85 4000 4006 9054 c0a8 3411
         0x0050:  c0a8 3412 e479 9116 553a e58a f28e 6268
         0x0060:  5010 0038 7afc 0000 6e65 7470 6572 6600
         0x0070:  6e65 7470 6572 6600 6e65
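
For readers following the hex, a minimal sketch of decoding the encapsulation in the first frame above. It assumes the fixed layout seen in this capture (14-byte Ethernet, 20-byte outer IPv4, 8-byte UDP, 8-byte VXLAN header as in RFC 7348, then the inner Ethernet/IPv4 headers); the function name decode_vxlan is hypothetical.

```python
import ipaddress
import struct

FRAME = """
f452 1401 da82 0002 c9e9 bf32 0800 4500
05dc ef5b 4000 4011 8641 c0a8 1f11 c0a8
1f12 b14b 12b5 05c8 0000 0800 0000 0000
6300 b2c7 81db e850 7a83 2ecb 8c68 0800
4500 05aa bb84 4000 4006 9055 c0a8 3411
c0a8 3412 e479 9116 553a e008 f28e 6268
"""


def decode_vxlan(frame_hex, outer_udp_off=34):
    """Decode the VXLAN header and the inner IPv4 addresses of one frame."""
    frame = bytes.fromhex("".join(frame_hex.split()))
    vx_off = outer_udp_off + 8
    # VXLAN header: 8 flag bits, 24 reserved bits, 24-bit VNI, 8 reserved bits.
    flags, vni_field = struct.unpack("!B3xI", frame[vx_off:vx_off + 8])
    inner_ip = vx_off + 8 + 14          # skip VXLAN header + inner Ethernet
    return {
        "vni_valid": bool(flags & 0x08),
        "vni": vni_field >> 8,
        "inner_proto": frame[inner_ip + 9],
        "inner_src": str(ipaddress.IPv4Address(frame[inner_ip + 12:inner_ip + 16])),
        "inner_dst": str(ipaddress.IPv4Address(frame[inner_ip + 16:inner_ip + 20])),
    }


info = decode_vxlan(FRAME)
print(info)
```

The inner addresses come out as the 192.168.52.x endpoints from the setup details, carried as TCP (protocol 6) inside the outer UDP datagram to port 4789.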


before the fix, packets sent by the offloaded side (192.168.31.17) carry 
junk UDP checksum

Also note that on one of the packets sent by the offloaded part, we 
don't see the "bad udp cksum" scream from tcpdump, which is weird...

17:03:08.765845 IP (tos 0x0, ttl 64, id 52396, offset 0, flags [DF], 
proto UDP (17), length 746)
     192.168.31.17.56686 > 192.168.31.18.4789: UDP, length 718
         0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500
         0x0010:  02ea ccac 4000 4011 abe2 c0a8 1f11 c0a8
         0x0020:  1f12 dd6e 12b5 02d6 0c1b 0800 0000 0000
         0x0030:  6300 b2c7 81db e850 7a83 2ecb 8c68 0800
         0x0040:  4500 02b8 6357 4000 4006 eb74 c0a8 3411
         0x0050:  c0a8 3412 86f2 3241 d67e f47d e5a9 d041
         0x0060:  5018 0038 871c 0000 0000 0258 ffff ffff
         0x0070:  0000 0000 0000 0000 0000
17:03:09.336285 IP (tos 0x0, ttl 64, id 52536, offset 0, flags [DF], 
proto UDP (17), length 90)
     192.168.31.17.56686 > 192.168.31.18.4789: [bad udp cksum 1360!] 
UDP, length 62
         0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500
         0x0010:  005a cd38 4000 4011 ade6 c0a8 1f11 c0a8
         0x0020:  1f12 dd6e 12b5 0046 139a 0800 0000 0000
         0x0030:  6300 b2c7 81db e850 7a83 2ecb 8c68 0800
         0x0040:  4500 0028 6358 4000 4006 ee03 c0a8 3411
         0x0050:  c0a8 3412 86f2 3241 d67e f70d e5a9 d041
         0x0060:  5011 0038 897b 0000
17:03:10.335074 IP (tos 0x0, ttl 64, id 40045, offset 0, flags [DF], 
proto UDP (17), length 98)
     192.168.31.18.48861 > 192.168.31.17.4789: [no cksum] UDP, length 70
         0x0000:  0002 c9e9 bf32 f452 1401 da82 0800 4500
         0x0010:  0062 9c6d 4000 4011 dea9 c0a8 1f12 c0a8
         0x0020:  1f11 bedd 12b5 004e 0000 0800 0000 0000
         0x0030:  6300 7a83 2ecb 8c68 b2c7 81db e850 0800
         0x0040:  4500 0030 0000 4000 4006 5154 c0a8 3412
         0x0050:  c0a8 3411 3241 86f2 e5a9 d040 d67e f47d
         0x0060:  7012 6e28 f282 0000 0204 0582 0103 0307
17:03:10.335110 IP (tos 0x0, ttl 64, id 52764, offset 0, flags [DF], 
proto UDP (17), length 90)
     192.168.31.17.56686 > 192.168.31.18.4789: [bad udp cksum 1360!] 
UDP, length 62
         0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500
         0x0010:  005a ce1c 4000 4011 ad02 c0a8 1f11 c0a8
         0x0020:  1f12 dd6e 12b5 0046 139a 0800 0000 0000
         0x0030:  6300 b2c7 81db e850 7a83 2ecb 8c68 0800
         0x0040:  4500 0028 6359 4000 4006 ee02 c0a8 3411
         0x0050:  c0a8 3412 86f2 3241 d67e f70e e5a9 d041
         0x0060:  5010 0038 897b 0000
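
The outer UDP checksum of the fully captured 90-byte packets above can be re-checked offline. The sketch below is an assumption-laden helper (the names ones_complement_sum and udp_checksum_residue are hypothetical): it sums the IPv4 pseudo-header and the UDP datagram as RFC 768 describes, and returns None when the checksum field is zero or the capture holds fewer bytes than the UDP length. The latter is possibly why the 746-byte packet, of which only 122 bytes are shown, carries no verdict from tcpdump.

```python
import struct


def ones_complement_sum(data):
    """16-bit one's-complement sum of data, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_checksum_residue(frame_hex):
    """Return 0 for a valid UDP checksum, a non-zero residue for a bad one,
    or None if it cannot be checked (zero checksum or truncated capture)."""
    frame = bytes.fromhex("".join(frame_hex.split()))
    ip_off = 14                                  # untagged Ethernet II assumed
    ihl = (frame[ip_off] & 0x0F) * 4
    udp = frame[ip_off + ihl:]
    ulen, check = struct.unpack("!HH", udp[4:8])
    if check == 0 or len(udp) < ulen:
        return None
    # Pseudo-header: source address, destination address, zero, protocol, length.
    pseudo = frame[ip_off + 12:ip_off + 20] + struct.pack("!BBH", 0, 17, ulen)
    return ones_complement_sum(pseudo + udp[:ulen]) ^ 0xFFFF


BAD_FRAME = """
f452 1401 da82 0002 c9e9 bf32 0800 4500
005a cd38 4000 4011 ade6 c0a8 1f11 c0a8
1f12 dd6e 12b5 0046 139a 0800 0000 0000
6300 b2c7 81db e850 7a83 2ecb 8c68 0800
4500 0028 6358 4000 4006 ee03 c0a8 3411
c0a8 3412 86f2 3241 d67e f70d e5a9 d041
5011 0038 897b 0000
"""
TRUNCATED_FRAME = """
f452 1401 da82 0002 c9e9 bf32 0800 4500
02ea ccac 4000 4011 abe2 c0a8 1f11 c0a8
1f12 dd6e 12b5 02d6 0c1b 0800 0000 0000
"""

print(hex(udp_checksum_residue(BAD_FRAME)))      # 0x6013
print(udp_checksum_residue(TRUNCATED_FRAME))     # None
```

The non-zero residue 0x6013 confirms the checksum does not verify; it also looks like the byte-swapped form of the 1360 that tcpdump printed, though that reading of tcpdump's output is an inference.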


* Re: some failures with vxlan offloads..
  2014-10-28 15:27       ` Or Gerlitz
@ 2014-10-28 15:36         ` Tom Herbert
  2014-10-29  5:50           ` Or Gerlitz
  0 siblings, 1 reply; 9+ messages in thread
From: Tom Herbert @ 2014-10-28 15:36 UTC
  To: Or Gerlitz; +Cc: netdev, John Fastabend, Jeff Kirsher

On Tue, Oct 28, 2014 at 8:27 AM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
> On 10/27/2014 3:23 AM, Tom Herbert wrote:
>>
>> On Sun, Oct 26, 2014 at 3:23 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>>
>>> On Sun, Oct 26, 2014 at 5:29 PM, Tom Herbert <therbert@google.com> wrote:
>>>>
>>>> Can you determine what the TSO HW engine is setting in UDP checksum
>>>> field? tcpdump -vv might be able to show this. The symptoms seem to
>>>> indicate that it may not be zero.
>>>
>>> Thanks for the quick response. I'll check what is placed in the UDP
>>> checksum field for packets that went through the offloading HW and let
>>> you know.
>>>
>>> BTW, if following the direction you proposed, I wonder why this works
>>> (e.g the kernel doesn't drops the encapsulated TCP packets) when both
>>> sides are offloaded?
>>>
>> I'm just speculating, but the device may be returning checksum unnecessary
>> for the UDP checksum without actually checking it. Technically, VXLAN
>> RFC7348 allows an implementation to ignore the UDP checksum, although this
>> clearly violates RFC1122 UDP checksum
>> requirements. In the stack we now checksum all non-zero checksums
>> including UDP checksum in VXLAN if it's not marked checksum-unnecessary.
>
>
> OK, I found something (it's always bad habit to try and potentially blame
> someone else for your bugs...) -- as I wrote here earlier, the current HW
> doesn't support checksum generation for both the inner (say TCP) and outer
> (UDP) packet (and indeed we don't advertize SKB_GSO_UDP_TUNNEL_CSUM).
>
> So if we tell them to offload the inner TCP checksum we must **not** tell
> them to attempt and offload the outer checksum too, and I wrongly did
> that... once I stopped doing so, I get mixed configurations (one side
> offloaded the peer not offloaded) to work. I will submit mlx4 fix for that.
>
> I wonder if we have another bug somewhere... when both sides were offloaded,
> it works even with the mlx4 bug, can you explain that? is it possible that the
> GRO stack somehow covers on the bug when both sides are offloaded and
> GRO/VXLAN comes into play?
>
Look at the receive side. As I mentioned, if the device is returning
checksum-unnecessary and setting csum_level to 1 (inner checksum was
validated) then stack won't try to verify the outer checksum. So in
this case if outer checksum is incorrect nobody complains about it.
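
A simplified model of this explanation, not the kernel code. It assumes, as reported in this thread, that an offloaded sender with the driver bug emits a junk non-zero outer UDP checksum, that a non-offloaded sender emits a zero one, and that an offloaded receiver reports checksum-unnecessary. The names RxPacket, stack_accepts, direction_works and tcp_works are hypothetical, and csum_level is left out for brevity.

```python
from dataclasses import dataclass

CHECKSUM_NONE = "none"
CHECKSUM_UNNECESSARY = "unnecessary"


@dataclass
class RxPacket:
    outer_udp_check: int            # value found in the outer UDP checksum field
    outer_check_valid: bool         # would it verify if someone computed it?
    ip_summed: str = CHECKSUM_NONE  # what the receiving device reported


def stack_accepts(pkt):
    """Would the receive path deliver this encapsulated packet?"""
    if pkt.outer_udp_check == 0:
        return True                 # zero means the sender sent no checksum
    if pkt.ip_summed == CHECKSUM_UNNECESSARY:
        return True                 # device vouched, the stack does not re-check
    return pkt.outer_check_valid    # software verification decides


def direction_works(sender_offloaded, receiver_offloaded):
    if sender_offloaded:            # buggy sender: junk, non-zero checksum
        pkt = RxPacket(outer_udp_check=0x139A, outer_check_valid=False)
    else:                           # software sender: zero checksum
        pkt = RxPacket(outer_udp_check=0, outer_check_valid=True)
    if receiver_offloaded:
        pkt.ip_summed = CHECKSUM_UNNECESSARY
    return stack_accepts(pkt)


def tcp_works(client_offloaded, server_offloaded):
    # TCP needs both directions of the tunnel to deliver packets.
    return (direction_works(client_offloaded, server_offloaded)
            and direction_works(server_offloaded, client_offloaded))


for conf, (client, server) in enumerate(
        [(True, True), (False, False), (False, True), (True, False)], start=1):
    print(conf, "works" if tcp_works(client, server) else "breaks")
```

Under these assumptions the model reproduces the four-row table from the first message: only the mixed configurations break, because only there does a junk checksum reach a receiver that verifies it in software.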


> Or.
>
> after the fix, packets sent by the offloaded side (192.168.31.17) carry zero
> udp checksum
>
> 17:20:44.445866 IP (tos 0x0, ttl 64, id 61275, offset 0, flags [DF], proto
> UDP (17), length 1500)
>     192.168.31.17.45387 > 192.168.31.18.4789: [no cksum] UDP, length 1472
>         0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500
>         0x0010:  05dc ef5b 4000 4011 8641 c0a8 1f11 c0a8
>         0x0020:  1f12 b14b 12b5 05c8 0000 0800 0000 0000
>         0x0030:  6300 b2c7 81db e850 7a83 2ecb 8c68 0800
>         0x0040:  4500 05aa bb84 4000 4006 9055 c0a8 3411
>         0x0050:  c0a8 3412 e479 9116 553a e008 f28e 6268
>         0x0060:  5010 0038 88e3 0000 6600 6e65 7470 6572
>         0x0070:  6600 6e65 7470 6572 6600
> 17:20:44.445871 IP (tos 0x0, ttl 64, id 61276, offset 0, flags [DF], proto
> UDP (17), length 1500)
>     192.168.31.17.45387 > 192.168.31.18.4789: [no cksum] UDP, length 1472
>         0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500
>         0x0010:  05dc ef5c 4000 4011 8640 c0a8 1f11 c0a8
>         0x0020:  1f12 b14b 12b5 05c8 0000 0800 0000 0000
>         0x0030:  6300 b2c7 81db e850 7a83 2ecb 8c68 0800
>         0x0040:  4500 05aa bb85 4000 4006 9054 c0a8 3411
>         0x0050:  c0a8 3412 e479 9116 553a e58a f28e 6268
>         0x0060:  5010 0038 7afc 0000 6e65 7470 6572 6600
>         0x0070:  6e65 7470 6572 6600 6e65
>
>
> before the fix, packets sent by the offloaded side (192.168.31.17) carry
> junk udp checksum
>
> Also note that on one of the packets sent by the offloaded part, we don't
> see the "bad udp cksum" scream from tcpdump, which is weird...
>
> 17:03:08.765845 IP (tos 0x0, ttl 64, id 52396, offset 0, flags [DF], proto
> UDP (17), length 746)
>     192.168.31.17.56686 > 192.168.31.18.4789: UDP, length 718
>         0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500
>         0x0010:  02ea ccac 4000 4011 abe2 c0a8 1f11 c0a8
>         0x0020:  1f12 dd6e 12b5 02d6 0c1b 0800 0000 0000
>         0x0030:  6300 b2c7 81db e850 7a83 2ecb 8c68 0800
>         0x0040:  4500 02b8 6357 4000 4006 eb74 c0a8 3411
>         0x0050:  c0a8 3412 86f2 3241 d67e f47d e5a9 d041
>         0x0060:  5018 0038 871c 0000 0000 0258 ffff ffff
>         0x0070:  0000 0000 0000 0000 0000
> 17:03:09.336285 IP (tos 0x0, ttl 64, id 52536, offset 0, flags [DF], proto
> UDP (17), length 90)
>     192.168.31.17.56686 > 192.168.31.18.4789: [bad udp cksum 1360!] UDP,
> length 62
>         0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500
>         0x0010:  005a cd38 4000 4011 ade6 c0a8 1f11 c0a8
>         0x0020:  1f12 dd6e 12b5 0046 139a 0800 0000 0000
>         0x0030:  6300 b2c7 81db e850 7a83 2ecb 8c68 0800
>         0x0040:  4500 0028 6358 4000 4006 ee03 c0a8 3411
>         0x0050:  c0a8 3412 86f2 3241 d67e f70d e5a9 d041
>         0x0060:  5011 0038 897b 0000
> 17:03:10.335074 IP (tos 0x0, ttl 64, id 40045, offset 0, flags [DF], proto
> UDP (17), length 98)
>     192.168.31.18.48861 > 192.168.31.17.4789: [no cksum] UDP, length 70
>         0x0000:  0002 c9e9 bf32 f452 1401 da82 0800 4500
>         0x0010:  0062 9c6d 4000 4011 dea9 c0a8 1f12 c0a8
>         0x0020:  1f11 bedd 12b5 004e 0000 0800 0000 0000
>         0x0030:  6300 7a83 2ecb 8c68 b2c7 81db e850 0800
>         0x0040:  4500 0030 0000 4000 4006 5154 c0a8 3412
>         0x0050:  c0a8 3411 3241 86f2 e5a9 d040 d67e f47d
>         0x0060:  7012 6e28 f282 0000 0204 0582 0103 0307
> 17:03:10.335110 IP (tos 0x0, ttl 64, id 52764, offset 0, flags [DF], proto
> UDP (17), length 90)
>     192.168.31.17.56686 > 192.168.31.18.4789: [bad udp cksum 1360!] UDP,
> length 62
>         0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500
>         0x0010:  005a ce1c 4000 4011 ad02 c0a8 1f11 c0a8
>         0x0020:  1f12 dd6e 12b5 0046 139a 0800 0000 0000
>         0x0030:  6300 b2c7 81db e850 7a83 2ecb 8c68 0800
>         0x0040:  4500 0028 6359 4000 4006 ee02 c0a8 3411
>         0x0050:  c0a8 3412 86f2 3241 d67e f70e e5a9 d041
>         0x0060:  5010 0038 897b 0000
>
>


* Re: some failures with vxlan offloads..
  2014-10-28 15:36         ` Tom Herbert
@ 2014-10-29  5:50           ` Or Gerlitz
  2014-10-29 14:59             ` Tom Herbert
  0 siblings, 1 reply; 9+ messages in thread
From: Or Gerlitz @ 2014-10-29  5:50 UTC
  To: Tom Herbert; +Cc: netdev

On 10/28/2014 5:36 PM, Tom Herbert wrote:
>> I wonder if we have another bug somewhere... when both sides were offloaded,
>> it works even with the mlx4 bug, can you explain that? is it possible that the
>> GRO stack somehow covers on the bug when both sides are offloaded and
>> GRO/VXLAN comes into play?
>>
> Look at the receive side. As I mentioned, if the device is returning
> checksum-unnecessary and setting csum_level to 1 (inner checksum was
> validated) then stack won't try to verify the outer checksum. So in
> this case if outer checksum is incorrect nobody complains about it.

OK, I'll look there. Is there anything that should worry us in the stack 
trace I sent in my initial email of this thread, or do you think it is 
related to the mlx4 driver checksum bug?

Or.

>


* Re: some failures with vxlan offloads..
  2014-10-29  5:50           ` Or Gerlitz
@ 2014-10-29 14:59             ` Tom Herbert
  2014-10-29 15:56               ` Or Gerlitz
  0 siblings, 1 reply; 9+ messages in thread
From: Tom Herbert @ 2014-10-29 14:59 UTC
  To: Or Gerlitz; +Cc: netdev

On Tue, Oct 28, 2014 at 10:50 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
> On 10/28/2014 5:36 PM, Tom Herbert wrote:
>>>
>>> I wonder if we have another bug somewhere... when both sides were
>>> offloaded, it works even with the mlx4 bug, can you explain that? is it
>>> possible that the GRO stack somehow covers on the bug when both sides
>>> are offloaded and GRO/VXLAN comes into play?
>>>
>> Look at the receive side. As I mentioned, if the device is returning
>> checksum-unnecessary and setting csum_level to 1 (inner checksum was
>> validated) then stack won't try to verify the outer checksum. So in
>> this case if outer checksum is incorrect nobody complains about it.
>
>
> OK, I'll look there. Anything that should worries us at that stack trace I
> sent in my initial email of this thread, or you think this is related to the
> mlx4 driver checksum bug?
>
The trace doesn't seem like it would be related to a checksum bug. Do
you only see this with offload enabled?

> Or.
>
>>
>


* Re: some failures with vxlan offloads..
  2014-10-29 14:59             ` Tom Herbert
@ 2014-10-29 15:56               ` Or Gerlitz
  0 siblings, 0 replies; 9+ messages in thread
From: Or Gerlitz @ 2014-10-29 15:56 UTC
  To: Tom Herbert; +Cc: netdev

On 10/29/2014 4:59 PM, Tom Herbert wrote:
>> OK, I'll look there. Anything that should worries us at that stack trace I
>> sent in my initial email of this thread, or you think this is related to the
>> mlx4 driver checksum bug?
>>
> The trace doesn't seem like it would be related to a checksum bug. Do you only see this with offload enabled?

no, it happened on the server side of configuration #4 in my original 
email, which is offloaded client and non-offloaded server.

Or.

