* [BUG] mlx5 have problems with ipv4-ipv6 tunnels in linux 4.4
@ 2018-07-04  5:45 Konstantin Khlebnikov
  2018-07-09 22:31 ` Saeed Mahameed
  0 siblings, 1 reply; 4+ messages in thread
From: Konstantin Khlebnikov @ 2018-07-04  5:45 UTC (permalink / raw)
  Cc: netdev, Saeed Mahameed, Or Gerlitz, Tariq Toukan, Gal Pressman

I'm seeing problems with tunnelled traffic on a Mellanox Technologies MT27710 Family [ConnectX-4 Lx] using the vanilla driver from Linux 4.4.y.

Packets with a payload bigger than 116 bytes are not transmitted.
Smaller packets and normal IPv6 work fine.

In Linux 4.9, 4.14 and the out-of-tree driver everything seems fine for now.
It's hard to guess or bisect the commit: there are a lot of changes, and something is wrong with the driver or swiotlb in 4.7..4.8.
4.6 is affected too, so the fix should be somewhere between 4.6 and 4.9.

Probably this case was fixed indirectly by adding some kind of offload, and the non-offloaded path is still broken.
Please give me a hint as to which commit it could be.

In 4.4 ethtool shows this:

# ethtool -i eth0
driver: mlx5_core
version: 3.0-1 (January 2015)
firmware-version: 14.21.2500
expansion-rom-version:
bus-info: 0000:81:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

# ethtool -k eth0
Features for eth0:
rx-checksumming: on
tx-checksumming: on
	tx-checksum-ipv4: on
	tx-checksum-ip-generic: off [fixed]
	tx-checksum-ipv6: on
	tx-checksum-fcoe-crc: off [fixed]
	tx-checksum-sctp: off [fixed]
scatter-gather: on
	tx-scatter-gather: on
	tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
	tx-tcp-segmentation: off
	tx-tcp-ecn-segmentation: off [fixed]
	tx-tcp6-segmentation: off
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]



In 4.9 ethtool shows this:


# ethtool -i eth0
driver: mlx5_core
version: 3.0-1 (January 2015)
firmware-version: 14.21.2500
expansion-rom-version:
bus-info: 0000:81:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes


# ethtool -k eth0
Features for eth0:
rx-checksumming: on
tx-checksumming: on
	tx-checksum-ipv4: on
	tx-checksum-ip-generic: off [fixed]
	tx-checksum-ipv6: on
	tx-checksum-fcoe-crc: off [fixed]
	tx-checksum-sctp: off [fixed]
scatter-gather: on
	tx-scatter-gather: on
	tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
	tx-tcp-segmentation: off
	tx-tcp-ecn-segmentation: off [fixed]
	tx-tcp-mangleid-segmentation: off
	tx-tcp6-segmentation: off
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-sctp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]
hw-tc-offload: off

Diff of the "ethtool -k" output (4.4 vs 4.9):

@@ -12,6 +12,7 @@
  tcp-segmentation-offload: off
  	tx-tcp-segmentation: off
  	tx-tcp-ecn-segmentation: off [fixed]
+	tx-tcp-mangleid-segmentation: off
  	tx-tcp6-segmentation: off
  udp-fragmentation-offload: off [fixed]
  generic-segmentation-offload: on
@@ -19,7 +20,7 @@
  large-receive-offload: off
  rx-vlan-offload: on
  tx-vlan-offload: on
-ntuple-filters: off [fixed]
+ntuple-filters: off
  receive-hashing: on
  highdma: on [fixed]
  rx-vlan-filter: on
@@ -29,16 +30,21 @@
  tx-gso-robust: off [fixed]
  tx-fcoe-segmentation: off [fixed]
  tx-gre-segmentation: off [fixed]
-tx-ipip-segmentation: off [fixed]
-tx-sit-segmentation: off [fixed]
-tx-udp_tnl-segmentation: off [fixed]
+tx-gre-csum-segmentation: off [fixed]
+tx-ipxip4-segmentation: off [fixed]
+tx-ipxip6-segmentation: off [fixed]
+tx-udp_tnl-segmentation: on
+tx-udp_tnl-csum-segmentation: on
+tx-gso-partial: on
+tx-sctp-segmentation: off [fixed]
  fcoe-mtu: off [fixed]
  tx-nocache-copy: off
  loopback: off [fixed]
  rx-fcs: off [fixed]
-rx-all: off [fixed]
+rx-all: off
  tx-vlan-stag-hw-insert: off [fixed]
  rx-vlan-stag-hw-parse: off [fixed]
  rx-vlan-stag-filter: off [fixed]
  l2-fwd-offload: off [fixed]
  busy-poll: off [fixed]
+hw-tc-offload: off


* Re: [BUG] mlx5 have problems with ipv4-ipv6 tunnels in linux 4.4
  2018-07-04  5:45 [BUG] mlx5 have problems with ipv4-ipv6 tunnels in linux 4.4 Konstantin Khlebnikov
@ 2018-07-09 22:31 ` Saeed Mahameed
  2018-07-10  9:19   ` Konstantin Khlebnikov
  0 siblings, 1 reply; 4+ messages in thread
From: Saeed Mahameed @ 2018-07-09 22:31 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: netdev, Saeed Mahameed, Or Gerlitz, Tariq Toukan, Gal Pressman

On Tue, Jul 3, 2018 at 10:45 PM, Konstantin Khlebnikov
<khlebnikov@yandex-team.ru> wrote:
> I'm seeing problems with tunnelled traffic with Mellanox Technologies
> MT27710 Family [ConnectX-4 Lx] using vanilla driver from linux 4.4.y
>
> Packets with payload bigger than 116 bytes are not transmitted.
> Smaller packets and normal ipv6 work fine.
>

Hi Konstantin,

Is this true for all IPv6 traffic or just for ipv4-ipv6 tunnels?

What is skb_network_offset(skb) for such a packet?

> In linux 4.9, 4.14 and out-of-tree driver everything seems fine for now.
> It's hard to guess or bisect commit: there are a lot of changes and
> something wrong with driver or swiotlb in 4.7..4.8.
> 4.6 is affected too - so this should be something between 4.6 and 4.9
>
> Probably this case was fixed indirectly by adding some kind of offload and
> non-offloaded path is still broken.
> Please give me a hint: which commit could it be.
>

I suspect it works in newer kernels because of these commits, which we introduced in 4.7/4.8:

commit e3a19b53cbb0e6738b7a547f262179065b72e3fa
Author: Matthew Finlay <matt@mellanox.com>
Date:   Thu Jun 30 17:34:47 2016 +0300

    net/mlx5e: Copy all L2 headers into inline segment

    ConnectX4-Lx uses an inline wqe mode that currently defaults to
    requiring the entire L2 header be included in the wqe.
    This patch fixes mlx5e_get_inline_hdr_size() to account for
    all L2 headers (VLAN, QinQ, etc) using skb_network_offset(skb).

    Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files")
    Signed-off-by: Matthew Finlay <matt@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>



commit ae76715d153e33c249b6850361e4d8d775388b5a
Author: Hadar Hen Zion <hadarh@mellanox.com>
Date:   Sun Jul 24 16:12:39 2016 +0300

    net/mlx5e: Check the minimum inline header mode before xmit

and then some fixes on top of it, such as:

commit f600c6088018d1dbc5777d18daa83660f7ea4a64
Author: Eran Ben Elisha <eranbe@mellanox.com>
Date:   Thu Jan 25 11:18:09 2018 +0200

    net/mlx5e: Verify inline header size do not exceed SKB linear size


Anyhow, can you try the above patches one by one on 4.4.y and see if they help?


Thanks,
Saeed


* Re: [BUG] mlx5 have problems with ipv4-ipv6 tunnels in linux 4.4
  2018-07-09 22:31 ` Saeed Mahameed
@ 2018-07-10  9:19   ` Konstantin Khlebnikov
  2018-07-12 18:03     ` Or Gerlitz
  0 siblings, 1 reply; 4+ messages in thread
From: Konstantin Khlebnikov @ 2018-07-10  9:19 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: netdev, Saeed Mahameed, Or Gerlitz, Tariq Toukan, Gal Pressman

On 10.07.2018 01:31, Saeed Mahameed wrote:
> On Tue, Jul 3, 2018 at 10:45 PM, Konstantin Khlebnikov
> <khlebnikov@yandex-team.ru> wrote:
>> I'm seeing problems with tunnelled traffic with Mellanox Technologies
>> MT27710 Family [ConnectX-4 Lx] using vanilla driver from linux 4.4.y
>>
>> Packets with payload bigger than 116 bytes are not transmitted.
>> Smaller packets and normal ipv6 work fine.
>>
> 
> Hi Konstantin,
> 
> Is this true for all ipv6 traffic or just ipv4-ipv6 tunnels ?
> 
> what is the skb_network_offset(skb) for such packet ?
> 
>> In linux 4.9, 4.14 and out-of-tree driver everything seems fine for now.
>> It's hard to guess or bisect commit: there are a lot of changes and
>> something wrong with driver or swiotlb in 4.7..4.8.
>> 4.6 is affected too - so this should be something between 4.6 and 4.9
>>
>> Probably this case was fixed indirectly by adding some kind of offload and
>> non-offloaded path is still broken.
>> Please give me a hint: which commit could it be.
>>
> 
> I suspect it works in a newer kernel since we introduced on 4.7/4.8:

Yes, this works. Thank you.

The problem was with the VLAN rather than the tunnel.

This hunk from the first patch is enough:
-#define MLX5E_MIN_INLINE ETH_HLEN
+#define MLX5E_MIN_INLINE (ETH_HLEN + VLAN_HLEN)
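
To illustrate why the extra VLAN_HLEN matters, here is a minimal userspace sketch; it is not driver code. The ETH_HLEN (14) and VLAN_HLEN (4) values mirror the kernel constants, while l2_header_len() and min_inline_covers_l2() are hypothetical helpers. They only model the requirement quoted in the commit message that the entire L2 header be present in the inline segment.

```c
#include <assert.h>
#include <stdbool.h>

/* Values mirror the kernel's ETH_HLEN and VLAN_HLEN constants. */
#define ETH_HLEN  14
#define VLAN_HLEN 4

/* Hypothetical helper: L2 header length of a frame carrying n_vlan_tags tags. */
int l2_header_len(int n_vlan_tags)
{
	assert(n_vlan_tags >= 0);
	return ETH_HLEN + n_vlan_tags * VLAN_HLEN;
}

/* Hypothetical helper: does a fixed minimum inline size cover the L2 header? */
bool min_inline_covers_l2(int min_inline, int n_vlan_tags)
{
	return min_inline >= l2_header_len(n_vlan_tags);
}
```

Under this model, min_inline_covers_l2(ETH_HLEN, 1) is false and min_inline_covers_l2(ETH_HLEN + VLAN_HLEN, 1) is true, which matches the one-line hunk being enough for a single VLAN tag. A double-tagged (QinQ) frame would still not be covered by the fixed constant, which is presumably why the full patch uses skb_network_offset(skb) instead.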

In my case the full data path looks like this:

( tcp -> ipip6 -> veth ) -> netns-to-host -> ( veth -> vlan at mlx5 )

Tunnelled traffic also goes to the VLAN, while most other traffic goes
through a non-tagged interface and worked fine.

max_inline is 226 so (226 - vlan - ethernet - ipv6 - ipv4 - tcp)
leaves exactly 116 bytes for payload.
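
The arithmetic above can be checked with a small sketch. The individual header sizes are assumptions, since the thread does not list them: 14 bytes of Ethernet, a 4-byte VLAN tag, 40 bytes of IPv6, 20 bytes of IPv4 without options, and a 32-byte TCP header (20 bytes plus a 12-byte timestamp option). The TCP timestamp assumption is what makes the total come out to 116; a bare 20-byte TCP header would leave 128 instead. The function name payload_room() is hypothetical.

```c
#include <assert.h>

/* Header sizes in bytes. HDR_TCP_TS assumes the TCP timestamp option
 * is in use (an assumption, not stated in the thread). */
enum {
	HDR_ETH    = 14,
	HDR_VLAN   = 4,
	HDR_IPV6   = 40,
	HDR_IPV4   = 20,
	HDR_TCP_TS = 32,
};

/* Bytes left for TCP payload when the whole packet must fit in max_inline. */
int payload_room(int max_inline)
{
	int headers = HDR_ETH + HDR_VLAN + HDR_IPV6 + HDR_IPV4 + HDR_TCP_TS;

	assert(max_inline >= headers);
	return max_inline - headers;
}
```

With max_inline = 226, the headers add up to 110 bytes and payload_room(226) returns 116, the threshold observed in the original report.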



* Re: [BUG] mlx5 have problems with ipv4-ipv6 tunnels in linux 4.4
  2018-07-10  9:19   ` Konstantin Khlebnikov
@ 2018-07-12 18:03     ` Or Gerlitz
  0 siblings, 0 replies; 4+ messages in thread
From: Or Gerlitz @ 2018-07-12 18:03 UTC (permalink / raw)
  To: Konstantin Khlebnikov, Saeed Mahameed
  Cc: Saeed Mahameed, netdev, Or Gerlitz, Tariq Toukan

On Tue, Jul 10, 2018 at 12:19 PM, Konstantin Khlebnikov
<khlebnikov@yandex-team.ru> wrote:
> On 10.07.2018 01:31, Saeed Mahameed wrote:
>>
>> On Tue, Jul 3, 2018 at 10:45 PM, Konstantin Khlebnikov
>> <khlebnikov@yandex-team.ru> wrote:
>>>
>>> I'm seeing problems with tunnelled traffic with Mellanox Technologies
>>> MT27710 Family [ConnectX-4 Lx] using vanilla driver from linux 4.4.y
>>>
>>> Packets with payload bigger than 116 bytes are not transmitted.
>>> Smaller packets and normal ipv6 work fine.
>>>
>>
>> Hi Konstantin,
>>
>> Is this true for all ipv6 traffic or just ipv4-ipv6 tunnels ?
>>
>> what is the skb_network_offset(skb) for such packet ?
>>
>>> In linux 4.9, 4.14 and out-of-tree driver everything seems fine for now.
>>> It's hard to guess or bisect commit: there are a lot of changes and
>>> something wrong with driver or swiotlb in 4.7..4.8.
>>> 4.6 is affected too - so this should be something between 4.6 and 4.9
>>>
>>> Probably this case was fixed indirectly by adding some kind of offload
>>> and
>>> non-offloaded path is still broken.
>>> Please give me a hint: which commit could it be.
>>>
>>
>> I suspect it works in a newer kernel since we introduced on 4.7/4.8:
>
>
> Yes, this works. Thank you.
>
> Problem was with VLAN rather than tunnel.
>
> This hunk from first patch is enough:
> -#define MLX5E_MIN_INLINE ETH_HLEN
> +#define MLX5E_MIN_INLINE (ETH_HLEN + VLAN_HLEN)


So... what should we do to fix 4.4-stable? Just push the 1st patch there?

Saeed, 4.4 is LTS, let's fix it there.

Or.



