* LRO: creating vlan subports affects parent port's LRO settings
@ 2020-11-20  1:37 Limin Wang
From: Limin Wang @ 2020-11-20  1:37 UTC
  To: netdev

Under relatively recent kernels (v4.4+), creating a vlan subport on an
LRO-capable parent NIC may turn LRO off on the parent port and leave
the parent's LRO setting practically unchangeable.

This is easily reproduced on different distros and is independent of
the NIC vendor.
Hopefully this is not a repeat post of a known issue.

The example below is on Ubuntu 18.04 LTS. (CentOS 7.6 behaves slightly
differently, but the end result is the same; that trace is attached at
the end.)
===========================================================================
# Ubuntu 18.04 LTS
root@server1:# uname -a
Linux server1 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:16:15 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

# Mellanox NIC
root@server1:# /sbin/ethtool -i ens4f0
driver: mlx5_core
version: 5.0-2.1.8

# enable LRO on the NIC
root@server1:# /sbin/ethtool -k ens4f0 | grep large
large-receive-offload: off
root@server1:# /sbin/ethtool -K ens4f0 lro on
root@server1:# /sbin/ethtool -k ens4f0 | grep large
large-receive-offload: on

# create a vlan subport, once subport is up, parent port LRO is disabled
root@server1:# ip link add link ens4f0 name ens4f0.50 type vlan id 50
root@server1:# ifconfig ens4f0.50 up
root@server1:# ethtool -k ens4f0.50 | grep large
large-receive-offload: off [fixed]
root@server1:# ethtool -k ens4f0 | grep large
large-receive-offload: off

# manually enabling LRO on the parent port no longer works
root@server1:# /sbin/ethtool -K ens4f0 lro on
Could not change any device features
root@server1:# /sbin/ethtool -K ens4f0.50 lro on
Cannot change large-receive-offload
Could not change any device features
root@server1:# /sbin/ethtool -K ens4f0 lro on
Could not change any device features
root@server1:# ethtool -k ens4f0 | grep large
large-receive-offload: off [requested on]

# Now the only way to re-enable LRO on the parent port is to remove the subport
root@server1:# ip link del ens4f0.50
root@server1:# /sbin/ethtool -k ens4f0 | grep large
large-receive-offload: off [requested on]
root@server1:# /sbin/ethtool -K ens4f0 lro on
root@server1:# ethtool -k ens4f0 | grep large
large-receive-offload: on
===========================================================================
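
When reproducing this on other hosts, it can help to classify the LRO
line of `ethtool -k` automatically instead of reading it by eye. The
following is a minimal sketch that only parses text in the format shown
in the trace above; the function names and the status labels ("blocked",
"fixed off") are made up for illustration and are not part of ethtool.

```python
import re

# One feature line of `ethtool -k` output, for example:
#   "large-receive-offload: off [requested on]"
_FEATURE_LINE = re.compile(r"^\s*([\w-]+):\s+(on|off)(?:\s+\[(.+)\])?\s*$")


def parse_features(text):
    """Map feature name -> (enabled, annotation or None)."""
    features = {}
    for line in text.splitlines():
        match = _FEATURE_LINE.match(line)
        if match:
            name, state, note = match.groups()
            features[name] = (state == "on", note)
    return features


def lro_status(text):
    """Classify the LRO line; raises KeyError if the line is absent."""
    enabled, note = parse_features(text)["large-receive-offload"]
    if enabled:
        return "on"
    if note == "fixed":
        return "fixed off"   # the device cannot toggle LRO at all
    if note == "requested on":
        return "blocked"     # the user asked for LRO but it stays off
    return "off"


print(lro_status("large-receive-offload: off [requested on]"))  # blocked
```

A status of "blocked" on the parent port, together with "fixed off" on
the vlan subport, is the combination seen in the trace above.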

LRO may have its own implications or issues in practice, but this seems
like a simple use case that should be expected to work: enabling LRO on
the physical NIC while also having vlans on the same NIC port.
Note that here neither the parent port nor the vlan subport is attached
to any bridge, bond, team or ovs device; both are standalone.

This issue does not seem to be driver or distro related; it appears to
lie in the kernel network stack.
When netdev features are changed (via userspace ethtool or other
in-kernel processing), in the end __netdev_update_features() does the
job, and it calls netdev_sync_upper_features() and
netdev_sync_lower_features().
Both sync functions basically do one thing: make sure
NETIF_F_UPPER_DISABLES is consistently enforced among upper and lower
net devices.
Currently NETIF_F_UPPER_DISABLES only includes NETIF_F_LRO.

A lot of thought must have gone into this logic, and many situations
are considered for upper_devs such as bond, team, bridge etc.
However, vlan_dev, which is an upper_dev of its parent real_dev, may
have been overlooked.
A vlan_dev is created with LRO unsupported by default (the NETIF_F_LRO
bit is not set in hw_features), as the "fixed" flag shows in:
root@server1:# ethtool -k ens4f0.50 | grep large
large-receive-offload: off [fixed]

Therefore, following the upper_sync and lower_sync code paths above,
once a vlan_dev is created, LRO can no longer be turned on for the
parent real_dev.
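
The interaction described above can be sketched as a toy model. This is
not kernel code: the Dev class, its fields and the helper functions are
hypothetical stand-ins, and the model assumes that the upper sync drops
an UPPER_DISABLES feature from a lower device whenever an upper device
does not want it. It also assumes the parent is re-synced when the vlan
is added, as in the Ubuntu trace, and is not re-synced when the vlan is
removed.

```python
LRO = "lro"
UPPER_DISABLES = {LRO}  # per the post, only NETIF_F_LRO is in this set


class Dev:
    """Toy stand-in for a net_device; only the fields this sketch needs."""

    def __init__(self, name, hw_features=()):
        self.name = name
        self.hw_features = set(hw_features)  # features the user may toggle
        self.wanted = set()                  # features the user asked for
        self.features = set()                # features actually active
        self.uppers = []                     # devices stacked on this one


def update_features(dev):
    """Recompute dev.features; return True if the active set changed."""
    features = dev.wanted & dev.hw_features
    for upper in dev.uppers:
        # Upper sync: drop any UPPER_DISABLES feature the upper has off.
        features -= UPPER_DISABLES - upper.wanted
    changed = features != dev.features
    dev.features = features
    return changed


def set_feature(dev, feature, on):
    """Toy `ethtool -K`: False means nothing could be changed."""
    if feature not in dev.hw_features:
        return False  # reported as "[fixed]" by `ethtool -k`
    if on:
        dev.wanted.add(feature)
    else:
        dev.wanted.discard(feature)
    return update_features(dev)


def add_vlan(real, vid):
    """Stack a vlan device with no LRO in hw_features on top of `real`."""
    vlan = Dev(f"{real.name}.{vid}")
    real.uppers.append(vlan)
    update_features(real)  # assumption: the parent is re-synced right away
    return vlan


def del_vlan(real, vlan):
    """Unlink the vlan; as in the trace, the parent is not re-synced."""
    real.uppers.remove(vlan)


nic = Dev("ens4f0", hw_features={LRO})
print(set_feature(nic, LRO, True))  # True: LRO turns on
vlan = add_vlan(nic, 50)
print(LRO in nic.features)          # False: the vlan upper forced it off
print(set_feature(nic, LRO, True))  # False: "off [requested on]"
del_vlan(nic, vlan)
print(set_feature(nic, LRO, True))  # True: works again without the vlan
```

In this model the "off [requested on]" state corresponds to LRO being
present in the wanted set but absent from the active set.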

Honestly, treating a vlan_dev as an upper_dev of the real_dev is a bit
counter-intuitive at first, since people call them vlan subports.
But from the perspective that a vlan_dev is a virtual device created
on top of the real_dev, it does have some "upper_dev" flavor, similar
to bond/team devices.
The kernel also associates upper_dev with a "master" role, which makes
perfect sense for bond/team/bridge/ovs.
A vlan_dev, however, sounds more like a slave dev of the real_dev
(some people call the real_dev the parent port).
As a secondary point, an upper_dev (bond/team/bridge) typically has
more than one lower_dev, so upper:lower is normally a 1:N relationship.
A vlan_dev has only one lower_dev, and a real_dev can carry several
vlan_devs, so upper:lower is often an N:1 relationship.

The upper/lower sync logic above probably stems from the "master" role
of upper_dev, and vlan_dev may simply not be a good fit for it.
That is probably where the confusion lies.

Maybe I missed something, but this logic has been there for quite some
time, from v4.4 onwards. I did not try the latest kernel, but I did try
pre-v4.4 kernels, and they do not show this issue.

Feel free to correct me.

Now, two possible proposals to fix this (if it is considered an issue):
1. When creating/initializing a vlan_dev, set the NETIF_F_LRO bit in its
hw_features based on the NETIF_F_LRO bit in the underlying real_dev's
hw_features.
  (Maybe not just hw_features; set wanted_features as well?)
2. In netdev_sync_upper_features() and netdev_sync_lower_features(),
exclude any upper_dev that is also a vlan_dev.
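
The two proposals can be compared in a toy model. Again this is a sketch
and not kernel code: the Dev class, the flag names inherit_lro, want_lro
and skip_vlan_uppers, and the helper parent_keeps_lro() are all
hypothetical, and the model assumes the upper sync looks at what the
upper device wants.

```python
LRO = "lro"
UPPER_DISABLES = {LRO}  # per the post, only NETIF_F_LRO is in this set


class Dev:
    """Toy stand-in for a net_device."""

    def __init__(self, name, hw_features=(), is_vlan=False):
        self.name = name
        self.hw_features = set(hw_features)
        self.wanted = set()
        self.features = set()
        self.uppers = []
        self.is_vlan = is_vlan


def update_features(dev, skip_vlan_uppers=False):
    """Recompute dev.features from what is wanted and what uppers allow."""
    features = dev.wanted & dev.hw_features
    for upper in dev.uppers:
        if skip_vlan_uppers and upper.is_vlan:
            continue  # proposal 2: vlan uppers do not constrain the parent
        features -= UPPER_DISABLES - upper.wanted
    dev.features = features


def add_vlan(real, vid, inherit_lro=False, want_lro=False):
    """Create a vlan upper; proposal 1 copies the parent's LRO capability."""
    hw = (real.hw_features & {LRO}) if inherit_lro else set()
    vlan = Dev(f"{real.name}.{vid}", hw, is_vlan=True)
    if want_lro:
        vlan.wanted |= hw  # the parenthetical: set wanted_features as well
    real.uppers.append(vlan)
    return vlan


def parent_keeps_lro(inherit_lro=False, want_lro=False, skip_vlan_uppers=False):
    """Return True if the parent still has LRO on after adding a vlan."""
    nic = Dev("ens4f0", {LRO})
    nic.wanted.add(LRO)
    add_vlan(nic, 50, inherit_lro, want_lro)
    update_features(nic, skip_vlan_uppers)
    return LRO in nic.features


print(parent_keeps_lro())                                  # False
print(parent_keeps_lro(inherit_lro=True))                  # False
print(parent_keeps_lro(inherit_lro=True, want_lro=True))   # True
print(parent_keeps_lro(skip_vlan_uppers=True))             # True
```

In this model, advertising LRO in the vlan's hw_features alone does not
unblock the parent, because the sync consults the wanted set. That is a
property of the sketch; whether the same holds in the kernel depends on
how the vlan_dev initialization populates wanted_features, which is not
verified here.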

Thanks for the attention.
Limin

P.S. Another example, on CentOS 7.6 with a VMXNET3 port:
===========================================================================
# CentOS Linux release 7.6.1810 (Core)
[root@esxi-server]# uname -a
Linux esxi-server 3.10.0-957.27.2.el7.x86_64 #1 SMP Mon Jul 29 17:46:05 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

# VMXNET3 NIC
[root@esxi-server]# ethtool -i ens224
driver: vmxnet3
version: 1.4.14.0-k-NAPI

# LRO enabled on the NIC
[root@esxi-server]# ethtool -k ens224 | grep large
large-receive-offload: on

# create a vlan subport, NIC LRO still on
[root@esxi-server]# ip link add link ens224 name ens224.50 type vlan id 50
[root@esxi-server]# ifconfig ens224.50 up
[root@esxi-server]# ethtool -k ens224 | grep large
large-receive-offload: on
[root@esxi-server]# ethtool -k ens224.50 | grep large
large-receive-offload: off [fixed]

# now turn LRO off, and after that, LRO cannot be turned on any longer
[root@esxi-server]# ethtool -K ens224 lro off
[root@esxi-server]# ethtool -k ens224 | grep large
large-receive-offload: off
[root@esxi-server]# ethtool -k ens224.50 | grep large
large-receive-offload: off [fixed]
[root@esxi-server]# ethtool -K ens224 lro on
Could not change any device features
[root@esxi-server]# ethtool -k ens224 | grep large
large-receive-offload: off [requested on]
[root@esxi-server]# ethtool -k ens224.50 | grep large
large-receive-offload: off [fixed]

# Now the only way to re-enable LRO on the parent port is to remove the subport
[root@esxi-server]# ip link del ens224.50
[root@esxi-server]# ethtool -k ens224 | grep large
large-receive-offload: off [requested on]
[root@esxi-server]# ethtool -K ens224 lro on
[root@esxi-server]# ethtool -k ens224 | grep large
large-receive-offload: on
===========================================================================

Thread overview: 7+ messages
2020-11-20  1:37 LRO: creating vlan subports affects parent port's LRO settings Limin Wang
2020-11-24  0:26 ` Jakub Kicinski
2020-12-06  0:04   ` Jarod Wilson
2020-12-06 16:49     ` Michal Kubecek
2020-12-06 22:58       ` Jarod Wilson
2020-12-07  3:19         ` Limin Wang
2020-12-07  3:04       ` Limin Wang
