* Bug or mis configuration for mlx5e lag and multipath
@ 2019-05-20  1:53 wenxu
  2019-05-23 15:15 ` Roi Dayan
  0 siblings, 1 reply; 4+ messages in thread
From: wenxu @ 2019-05-20  1:53 UTC (permalink / raw)
  To: Roi Dayan, Saeed Mahameed; +Cc: netdev

Hi Roi & Saeed,

I have just tested the mlx5e LAG and multipath features. In some situations, outgoing traffic cannot be offloaded.

The OVS configuration is as follows.

# ovs-vsctl show
dfd71dfb-6e22-423e-b088-d2022103af6b
    Bridge "br0"
        Port "mlx_pf0vf0"
            Interface "mlx_pf0vf0"
        Port gre
            Interface gre
                type: gre
                options: {key="1000", local_ip="172.168.152.75", remote_ip="172.168.152.241"}
        Port "br0"
            Interface "br0"
                type: internal

The mlx5e driver setup:


modprobe mlx5_core
echo 0 > /sys/class/net/eth2/device/sriov_numvfs
echo 0 > /sys/class/net/eth3/device/sriov_numvfs
echo 2 > /sys/class/net/eth2/device/sriov_numvfs
echo 2 > /sys/class/net/eth3/device/sriov_numvfs
lspci -nn | grep Mellanox
echo 0000:81:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:81:00.3 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:81:03.6 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:81:03.7 > /sys/bus/pci/drivers/mlx5_core/unbind

devlink dev eswitch set pci/0000:81:00.0  mode switchdev encap enable
devlink dev eswitch set pci/0000:81:00.1  mode switchdev encap enable

modprobe bonding mode=802.3ad miimon=100 lacp_rate=1
ip l del dev bond0
ifconfig mlx_p0 down
ifconfig mlx_p1 down
ip l add dev bond0 type bond mode 802.3ad
ifconfig bond0 172.168.152.75/24 up
echo 1 > /sys/class/net/bond0/bonding/xmit_hash_policy
ip l set dev mlx_p0 master bond0
ip l set dev mlx_p1 master bond0
ifconfig mlx_p0 up
ifconfig mlx_p1 up

systemctl start openvswitch
ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
systemctl restart openvswitch


mlx_pf0vf0 is assigned to the VM. The tc rule shows in_hw:

# tc filter ls dev mlx_pf0vf0 ingress
filter protocol ip pref 2 flower
filter protocol ip pref 2 flower handle 0x1
  dst_mac 8e:c0:bd:bf:72:c3
  src_mac 52:54:00:00:12:75
  eth_type ipv4
  ip_tos 0/3
  ip_flags nofrag
  in_hw
    action order 1: tunnel_key set
    src_ip 172.168.152.75
    dst_ip 172.168.152.241
    key_id 1000 pipe
    index 2 ref 1 bind 1
 
    action order 2: mirred (Egress Redirect to device gre_sys) stolen
     index 2 ref 1 bind 1

In the VM, the mlx5e driver enables XPS by default (by the way, I think it would be better not to enable XPS by default, so that the kernel can select the queue per flow). In LAG mode, different VF queues are associated with different hardware PFs.
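To see which CPU maps to which TX queue inside the VM, the XPS masks can be read from sysfs. This is only a minimal sketch: the interface name `eth0` is an assumption (the VF's name in the guest is not given in this report), and the second argument exists only so the function can be pointed at a different sysfs root.

```shell
# List the XPS CPU mask of every TX queue of a netdev.
# $1: interface name inside the VM (hypothetical, e.g. eth0)
# $2: sysfs root, overridable for testing (defaults to /sys/class/net)
show_xps_map() {
    dev=$1
    root=${2:-/sys/class/net}
    for q in "$root/$dev"/queues/tx-*; do
        # Skip queues without a readable xps_cpus file (or an unmatched glob).
        [ -r "$q/xps_cpus" ] || continue
        printf '%s %s\n' "${q##*/}" "$(cat "$q/xps_cpus")"
    done
}

# usage (inside the VM):
#   show_xps_map eth0
```

Each mask is a hexadecimal CPU bitmap, so CPU 1 corresponds to mask 2 and CPU 2 to mask 4. Comparing the masks with the CPU passed to `taskset -c` shows which queue a pinned ping will use.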

With the command taskset -c 2 ping 10.0.0.241

the packets are offloaded, and the outgoing PF is mlx_p0.

But with the command taskset -c 1 ping 10.0.0.241

the packets are not offloaded: I can capture them on mlx_pf0vf0, and the outgoing PF is mlx_p1, although the tc flower rule shows in_hw.
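The capture check described above can be scripted roughly as follows. It is a sketch, not a definitive test: the function names and the threshold of one packet are assumptions, based on the usual behaviour that only the first packet of a flow reaches the representor before the rule takes effect in hardware.

```shell
# Count ICMP packets seen on a representor during a capture window.
# $1: representor netdev (mlx_pf0vf0 in this report)
# $2: capture time in seconds (default 5)
count_rep_icmp() {
    timeout "${2:-5}" tcpdump -l -nn -i "$1" icmp 2>/dev/null | wc -l
}

# Turn the packet count into a rough verdict.
# Assumption: at most one packet (the initial hardware miss) is expected
# on the representor when the flow is really handled in hardware.
offload_verdict() {
    if [ "$1" -le 1 ]; then
        echo "likely offloaded"
    else
        echo "not offloaded"
    fi
}

# usage (as root on the host, while the ping runs in the VM):
#   offload_verdict "$(count_rep_icmp mlx_pf0vf0 5)"
```

Running it once per `taskset -c` value makes the difference between the two CPUs easy to compare.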


I checked in the driver: both mlx_pf0vf0 and the peer (mlx_p1) install the tc rule successfully.

I think it is a problem with LAG mode. Or am I missing some configuration?


BR

wenxu







* Re: Bug or mis configuration for mlx5e lag and multipath
  2019-05-20  1:53 Bug or mis configuration for mlx5e lag and multipath wenxu
@ 2019-05-23 15:15 ` Roi Dayan
  2019-05-24  1:02   ` wenxu
  0 siblings, 1 reply; 4+ messages in thread
From: Roi Dayan @ 2019-05-23 15:15 UTC (permalink / raw)
  To: wenxu, Saeed Mahameed; +Cc: netdev




Hi,

We need to verify that the driver detected LAG mode and
duplicated the offload rule to both eswitches.
Do you see LAG map messages in dmesg?
Something like "lag map port 1:1 port 2:2".
This is to make sure the driver is actually in LAG mode.
In this mode, a rule added to mlx_pf0vf0 will be added to the eswitch of pf0 and the eswitch of pf1.
Then, when you send a packet, it could be handled in esw0 or esw1.
If the rule is not in esw1, it won't be offloaded when using pf1.
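A quick way to look for those messages is to filter the kernel log. This is a minimal sketch; the function name is hypothetical, and the pattern is modelled only on the example message quoted above, so other kernel versions may word it differently.

```shell
# Print kernel log lines that look like an mlx5 LAG port mapping message.
# Reads the log from stdin; exits non-zero when nothing matches.
lag_map_lines() {
    grep -E 'lag map port [0-9]+:[0-9]+ port [0-9]+:[0-9]+'
}

# usage:
#   dmesg | lag_map_lines || echo "no lag map message found"
```

No output means no matching message was logged, which would suggest the driver did not enter LAG mode.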

thanks,
Roi


* Re: Bug or mis configuration for mlx5e lag and multipath
  2019-05-23 15:15 ` Roi Dayan
@ 2019-05-24  1:02   ` wenxu
  2019-05-28 10:48     ` Roi Dayan
  0 siblings, 1 reply; 4+ messages in thread
From: wenxu @ 2019-05-24  1:02 UTC (permalink / raw)
  To: Roi Dayan, Saeed Mahameed; +Cc: netdev

Hi,

I do see the expected log in dmesg:

 mlx5_core 0000:81:00.0: modify lag map port 1:1 port 2:2


I debugged the driver and found that the rule is added on mlx_pf0vf0 and on the peer, pf1,

so I think esw0 and esw1 both have the rule.

The test case is based on the master branch of the net git tree.



* Re: Bug or mis configuration for mlx5e lag and multipath
  2019-05-24  1:02   ` wenxu
@ 2019-05-28 10:48     ` Roi Dayan
  0 siblings, 0 replies; 4+ messages in thread
From: Roi Dayan @ 2019-05-28 10:48 UTC (permalink / raw)
  To: wenxu, Saeed Mahameed; +Cc: netdev



On 24/05/2019 04:02, wenxu wrote:
> Hi,
> 
> I do see the expected log in dmesg:
> 
>  mlx5_core 0000:81:00.0: modify lag map port 1:1 port 2:2
> 
> 
> I debugged the driver and found that the rule is added on mlx_pf0vf0 and on the peer, pf1,
> 
> so I think esw0 and esw1 both have the rule.
> 
> The test case is based on the master branch of the net git tree.

OK, thanks for reporting. We'll have to check it.

