All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v2 0/5] Introduce NETIF_F_GRO_HW
@ 2017-12-07  8:03 Michael Chan
  2017-12-07  8:03 ` [PATCH net-next v2 1/5] net: " Michael Chan
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Michael Chan @ 2017-12-07  8:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, andrew.gospodarek

Introduce NETIF_F_GRO_HW feature flag and convert drivers that support
hardware GRO to use the new flag.

v2:
- NETIF_F_GRO_HW flag propagation between upper and lower devices not
required (see patch 1).
- NETIF_F_GRO_HW depends on NETIF_F_GRO and NETIF_F_RXCSUM.
- Add dev_disable_gro_hw() to disable GRO_HW for generic XDP.
- Use ndo_fix_features() on all 3 drivers to drop GRO_HW when it is not
supported

Michael Chan (5):
  net: Introduce NETIF_F_GRO_HW.
  net: Disable GRO_HW when generic XDP is installed on a device.
  bnxt_en: Use NETIF_F_GRO_HW.
  bnx2x: Use NETIF_F_GRO_HW.
  qede: Use NETIF_F_GRO_HW.

 Documentation/networking/netdev-features.txt     |  8 +++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c  | 17 ++++++----
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |  4 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c        | 21 ++++++++----
 drivers/net/ethernet/qlogic/qede/qede.h          |  2 ++
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c  |  3 ++
 drivers/net/ethernet/qlogic/qede/qede_filter.c   | 20 ++++++++----
 drivers/net/ethernet/qlogic/qede/qede_main.c     | 17 ++++------
 include/linux/netdev_features.h                  |  3 ++
 net/core/dev.c                                   | 41 ++++++++++++++++++++++++
 net/core/ethtool.c                               |  1 +
 11 files changed, 104 insertions(+), 33 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH net-next v2 1/5] net: Introduce NETIF_F_GRO_HW.
  2017-12-07  8:03 [PATCH net-next v2 0/5] Introduce NETIF_F_GRO_HW Michael Chan
@ 2017-12-07  8:03 ` Michael Chan
  2017-12-07 18:13   ` Alexander Duyck
  2017-12-07  8:03 ` [PATCH net-next v2 2/5] net: Disable GRO_HW when generic XDP is installed on a device Michael Chan
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Michael Chan @ 2017-12-07  8:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, andrew.gospodarek, Ariel Elior, everest-linux-l2

Introduce NETIF_F_GRO_HW feature flag for NICs that support hardware
GRO.  With this flag, we can now independently turn on or off hardware
GRO when GRO is on.  Previously, drivers were using NETIF_F_GRO to
control hardware GRO and so it cannot be independently turned on or
off without affecting GRO.

Hardware GRO (just like GRO) guarantees that packets can be re-segmented
by TSO/GSO to reconstruct the original packet stream.  It is a subset of
NETIF_F_GRO and depends on it

Since NETIF_F_GRO is not propagated between upper and lower devices,
NETIF_F_GRO_HW should follow suit since it is a subset of GRO.  In other
words, a lower device can independent have GRO/GRO_HW enabled or disabled
and no feature propagation is required.  This will preserve the current
GRO behavior.

Cc: Ariel Elior <Ariel.Elior@cavium.com>
Cc: everest-linux-l2@cavium.com
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 Documentation/networking/netdev-features.txt |  8 ++++++++
 include/linux/netdev_features.h              |  3 +++
 net/core/dev.c                               | 17 +++++++++++++++++
 net/core/ethtool.c                           |  1 +
 4 files changed, 29 insertions(+)

diff --git a/Documentation/networking/netdev-features.txt b/Documentation/networking/netdev-features.txt
index 7413eb0..8f36527 100644
--- a/Documentation/networking/netdev-features.txt
+++ b/Documentation/networking/netdev-features.txt
@@ -163,3 +163,11 @@ This requests that the NIC receive all possible frames, including errored
 frames (such as bad FCS, etc).  This can be helpful when sniffing a link with
 bad packets on it.  Some NICs may receive more packets if also put into normal
 PROMISC mode.
+
+*  rx-gro-hw
+
+This requests that the NIC enables Hardware GRO (generic receive offload).
+Hardware GRO is basically the exact reverse of TSO, and is generally
+stricter than Hardware LRO.  A packet stream merged by Hardware GRO must
+be re-segmentable by GSO or TSO back to the exact original packet stream.
+Hardware GRO is dependent on GRO and RXCSUM.
diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index b1b0ca7..db84c51 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -78,6 +78,8 @@ enum {
 	NETIF_F_HW_ESP_TX_CSUM_BIT,	/* ESP with TX checksum offload */
 	NETIF_F_RX_UDP_TUNNEL_PORT_BIT, /* Offload of RX port for UDP tunnels */
 
+	NETIF_F_GRO_HW_BIT,		/* Hardware Generic receive offload */
+
 	/*
 	 * Add your fresh new feature above and remember to update
 	 * netdev_features_strings[] in net/core/ethtool.c and maybe
@@ -97,6 +99,7 @@ enum {
 #define NETIF_F_FRAGLIST	__NETIF_F(FRAGLIST)
 #define NETIF_F_FSO		__NETIF_F(FSO)
 #define NETIF_F_GRO		__NETIF_F(GRO)
+#define NETIF_F_GRO_HW		__NETIF_F(GRO_HW)
 #define NETIF_F_GSO		__NETIF_F(GSO)
 #define NETIF_F_GSO_ROBUST	__NETIF_F(GSO_ROBUST)
 #define NETIF_F_HIGHDMA		__NETIF_F(HIGHDMA)
diff --git a/net/core/dev.c b/net/core/dev.c
index 6bea893..7242e5e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7405,6 +7405,23 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
 		features &= ~dev->gso_partial_features;
 	}
 
+	if (features & NETIF_F_GRO_HW) {
+		/* Hardware GRO depends on GRO and RXCSUM. */
+		if (!(features & NETIF_F_GRO)) {
+			netdev_dbg(dev, "Dropping NETIF_F_GSO_HW since no GRO feature.\n");
+			features &= ~NETIF_F_GRO_HW;
+		}
+		if (!(features & NETIF_F_RXCSUM)) {
+			netdev_dbg(dev, "Dropping NETIF_F_GSO_HW since no RXCSUM feature.\n");
+			features &= ~NETIF_F_GRO_HW;
+		}
+		/* Hardware GRO and LRO are mutually exclusive. */
+		if (features & NETIF_F_LRO) {
+			netdev_dbg(dev, "Dropping NETIF_F_LRO since GRO_HW is set.\n");
+			features &= ~NETIF_F_LRO;
+		}
+	}
+
 	return features;
 }
 
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index f8fcf45..50a7920 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -73,6 +73,7 @@ int ethtool_op_get_ts_info(struct net_device *dev, struct ethtool_ts_info *info)
 	[NETIF_F_LLTX_BIT] =             "tx-lockless",
 	[NETIF_F_NETNS_LOCAL_BIT] =      "netns-local",
 	[NETIF_F_GRO_BIT] =              "rx-gro",
+	[NETIF_F_GRO_HW_BIT] =           "rx-gro-hw",
 	[NETIF_F_LRO_BIT] =              "rx-lro",
 
 	[NETIF_F_TSO_BIT] =              "tx-tcp-segmentation",
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH net-next v2 2/5] net: Disable GRO_HW when generic XDP is installed on a device.
  2017-12-07  8:03 [PATCH net-next v2 0/5] Introduce NETIF_F_GRO_HW Michael Chan
  2017-12-07  8:03 ` [PATCH net-next v2 1/5] net: " Michael Chan
@ 2017-12-07  8:03 ` Michael Chan
  2017-12-07  8:03 ` [PATCH net-next v2 3/5] bnxt_en: Use NETIF_F_GRO_HW Michael Chan
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Michael Chan @ 2017-12-07  8:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, andrew.gospodarek, Ariel Elior, everest-linux-l2

Hardware should not aggregate any packets when generic XDP is installed.

Cc: Ariel Elior <Ariel.Elior@cavium.com>
Cc: everest-linux-l2@cavium.com
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 net/core/dev.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index 7242e5e..7087f34 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1542,6 +1542,29 @@ void dev_disable_lro(struct net_device *dev)
 }
 EXPORT_SYMBOL(dev_disable_lro);
 
+/**
+ *	dev_disable_gro_hw - disable HW Generic Receive Offload on a device
+ *	@dev: device
+ *
+ *	Disable HW Generic Receive Offload (GRO_HW) on a net device.  Must be
+ *	called under RTNL.  This is needed if Generic XDP is installed on
+ *	the device.
+ */
+static void dev_disable_gro_hw(struct net_device *dev)
+{
+	struct net_device *lower_dev;
+	struct list_head *iter;
+
+	dev->wanted_features &= ~NETIF_F_GRO_HW;
+	netdev_update_features(dev);
+
+	if (unlikely(dev->features & NETIF_F_GRO_HW))
+		netdev_WARN(dev, "failed to disable GRO_HW!\n");
+
+	netdev_for_each_lower_dev(dev, lower_dev, iter)
+		dev_disable_gro_hw(lower_dev);
+}
+
 static int call_netdevice_notifier(struct notifier_block *nb, unsigned long val,
 				   struct net_device *dev)
 {
@@ -4545,6 +4568,7 @@ static int generic_xdp_install(struct net_device *dev, struct netdev_bpf *xdp)
 		} else if (new && !old) {
 			static_key_slow_inc(&generic_xdp_needed);
 			dev_disable_lro(dev);
+			dev_disable_gro_hw(dev);
 		}
 		break;
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH net-next v2 3/5] bnxt_en: Use NETIF_F_GRO_HW.
  2017-12-07  8:03 [PATCH net-next v2 0/5] Introduce NETIF_F_GRO_HW Michael Chan
  2017-12-07  8:03 ` [PATCH net-next v2 1/5] net: " Michael Chan
  2017-12-07  8:03 ` [PATCH net-next v2 2/5] net: Disable GRO_HW when generic XDP is installed on a device Michael Chan
@ 2017-12-07  8:03 ` Michael Chan
  2017-12-07  8:03 ` [PATCH net-next v2 4/5] bnx2x: " Michael Chan
  2017-12-07  8:03 ` [PATCH net-next v2 5/5] qede: " Michael Chan
  4 siblings, 0 replies; 19+ messages in thread
From: Michael Chan @ 2017-12-07  8:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, andrew.gospodarek

Advertise NETIF_F_GRO_HW in hw_features if hardware GRO is supported.
In bnxt_fix_features(), disable GRO_HW and LRO if current hardware
configuration does not allow it.  XDP setup will now rely on
bnxt_fix_features() to turn off aggregation.  During chip init, turn on
or off hardware GRO based on NETIF_F_GRO_HW in features flag.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 24fe676..8227fa0 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2751,7 +2751,7 @@ void bnxt_set_tpa_flags(struct bnxt *bp)
 		return;
 	if (bp->dev->features & NETIF_F_LRO)
 		bp->flags |= BNXT_FLAG_LRO;
-	if (bp->dev->features & NETIF_F_GRO)
+	else if (bp->dev->features & NETIF_F_GRO_HW)
 		bp->flags |= BNXT_FLAG_GRO;
 }
 
@@ -2839,10 +2839,10 @@ int bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode)
 			min_t(u16, bp->max_mtu, BNXT_MAX_PAGE_MODE_MTU);
 		bp->flags &= ~BNXT_FLAG_AGG_RINGS;
 		bp->flags |= BNXT_FLAG_NO_AGG_RINGS | BNXT_FLAG_RX_PAGE_MODE;
-		bp->dev->hw_features &= ~NETIF_F_LRO;
-		bp->dev->features &= ~NETIF_F_LRO;
 		bp->rx_dir = DMA_BIDIRECTIONAL;
 		bp->rx_skb_func = bnxt_rx_page_skb;
+		/* Disable LRO or GRO_HW */
+		netdev_update_features(bp->dev);
 	} else {
 		bp->dev->max_mtu = bp->max_mtu;
 		bp->flags &= ~BNXT_FLAG_RX_PAGE_MODE;
@@ -6784,6 +6784,9 @@ static netdev_features_t bnxt_fix_features(struct net_device *dev,
 	if ((features & NETIF_F_NTUPLE) && !bnxt_rfs_capable(bp))
 		features &= ~NETIF_F_NTUPLE;
 
+	if (bp->flags & BNXT_FLAG_NO_AGG_RINGS)
+		features &= ~(NETIF_F_LRO | NETIF_F_GRO_HW);
+
 	/* Both CTAG and STAG VLAN accelaration on the RX side have to be
 	 * turned on or off together.
 	 */
@@ -6817,9 +6820,9 @@ static int bnxt_set_features(struct net_device *dev, netdev_features_t features)
 	bool update_tpa = false;
 
 	flags &= ~BNXT_FLAG_ALL_CONFIG_FEATS;
-	if ((features & NETIF_F_GRO) && !BNXT_CHIP_TYPE_NITRO_A0(bp))
+	if (features & NETIF_F_GRO_HW)
 		flags |= BNXT_FLAG_GRO;
-	if (features & NETIF_F_LRO)
+	else if (features & NETIF_F_LRO)
 		flags |= BNXT_FLAG_LRO;
 
 	if (bp->flags & BNXT_FLAG_NO_AGG_RINGS)
@@ -7920,8 +7923,8 @@ static int bnxt_get_dflt_rings(struct bnxt *bp, int *max_rx, int *max_tx,
 		if (rc)
 			return rc;
 		bp->flags |= BNXT_FLAG_NO_AGG_RINGS;
-		bp->dev->hw_features &= ~NETIF_F_LRO;
-		bp->dev->features &= ~NETIF_F_LRO;
+		bp->dev->hw_features &= ~(NETIF_F_LRO | NETIF_F_GRO_HW);
+		bp->dev->features &= ~(NETIF_F_LRO | NETIF_F_GRO_HW);
 		bnxt_set_ring_params(bp);
 	}
 
@@ -8104,7 +8107,11 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	dev->vlan_features = dev->hw_features | NETIF_F_HIGHDMA;
 	dev->hw_features |= NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_CTAG_TX |
 			    NETIF_F_HW_VLAN_STAG_RX | NETIF_F_HW_VLAN_STAG_TX;
+	if (!BNXT_CHIP_TYPE_NITRO_A0(bp))
+		dev->hw_features |= NETIF_F_GRO_HW;
 	dev->features |= dev->hw_features | NETIF_F_HIGHDMA;
+	if (dev->features & NETIF_F_GRO_HW)
+		dev->features &= ~NETIF_F_LRO;
 	dev->priv_flags |= IFF_UNICAST_FLT;
 
 #ifdef CONFIG_BNXT_SRIOV
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH net-next v2 4/5] bnx2x: Use NETIF_F_GRO_HW.
  2017-12-07  8:03 [PATCH net-next v2 0/5] Introduce NETIF_F_GRO_HW Michael Chan
                   ` (2 preceding siblings ...)
  2017-12-07  8:03 ` [PATCH net-next v2 3/5] bnxt_en: Use NETIF_F_GRO_HW Michael Chan
@ 2017-12-07  8:03 ` Michael Chan
  2017-12-07  8:03 ` [PATCH net-next v2 5/5] qede: " Michael Chan
  4 siblings, 0 replies; 19+ messages in thread
From: Michael Chan @ 2017-12-07  8:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, andrew.gospodarek, Ariel Elior, everest-linux-l2

Advertise NETIF_F_GRO_HW and turn on TPA_MODE_GRO when NETIF_F_GRO_HW
is set.  Disable NETIF_F_GRO_HW in bnx2x_fix_features() if the MTU
does not support TPA_MODE_GRO.  bnx2x_change_mtu() also needs to disable
NETIF_F_GRO_HW if the MTU does not support it.

Cc: Ariel Elior <Ariel.Elior@cavium.com>
Cc: everest-linux-l2@cavium.com
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c  | 17 ++++++++++-------
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |  4 +++-
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 4c739d5..a464c98 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -2482,8 +2482,7 @@ static void bnx2x_bz_fp(struct bnx2x *bp, int index)
 	 */
 	if (bp->dev->features & NETIF_F_LRO)
 		fp->mode = TPA_MODE_LRO;
-	else if (bp->dev->features & NETIF_F_GRO &&
-		 bnx2x_mtu_allows_gro(bp->dev->mtu))
+	else if (bp->dev->features & NETIF_F_GRO_HW)
 		fp->mode = TPA_MODE_GRO;
 	else
 		fp->mode = TPA_MODE_DISABLED;
@@ -4874,6 +4873,9 @@ int bnx2x_change_mtu(struct net_device *dev, int new_mtu)
 	 */
 	dev->mtu = new_mtu;
 
+	if (!bnx2x_mtu_allows_gro(new_mtu))
+		dev->features &= ~NETIF_F_GRO_HW;
+
 	if (IS_PF(bp) && SHMEM2_HAS(bp, curr_cfg))
 		SHMEM2_WR(bp, curr_cfg, CURR_CFG_MET_OS);
 
@@ -4907,6 +4909,8 @@ netdev_features_t bnx2x_fix_features(struct net_device *dev,
 		features &= ~NETIF_F_LRO;
 		features &= ~NETIF_F_GRO;
 	}
+	if (!bnx2x_mtu_allows_gro(dev->mtu))
+		features &= ~NETIF_F_GRO_HW;
 
 	return features;
 }
@@ -4933,13 +4937,12 @@ int bnx2x_set_features(struct net_device *dev, netdev_features_t features)
 		}
 	}
 
-	/* if GRO is changed while LRO is enabled, don't force a reload */
-	if ((changes & NETIF_F_GRO) && (features & NETIF_F_LRO))
-		changes &= ~NETIF_F_GRO;
+	/* Don't care about GRO changes */
+	changes &= ~NETIF_F_GRO;
 
 	/* if GRO is changed while HW TPA is off, don't force a reload */
-	if ((changes & NETIF_F_GRO) && bp->disable_tpa)
-		changes &= ~NETIF_F_GRO;
+	if ((changes & NETIF_F_GRO_HW) && bp->disable_tpa)
+		changes &= ~NETIF_F_GRO_HW;
 
 	if (changes)
 		bnx2x_reload = true;
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 91e2a75..1c7f821 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -13273,7 +13273,7 @@ static int bnx2x_init_dev(struct bnx2x *bp, struct pci_dev *pdev,
 
 	dev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
 		NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6 |
-		NETIF_F_RXCSUM | NETIF_F_LRO | NETIF_F_GRO |
+		NETIF_F_RXCSUM | NETIF_F_LRO | NETIF_F_GRO | NETIF_F_GRO_HW |
 		NETIF_F_RXHASH | NETIF_F_HW_VLAN_CTAG_TX;
 	if (!chip_is_e1x) {
 		dev->hw_features |= NETIF_F_GSO_GRE | NETIF_F_GSO_GRE_CSUM |
@@ -13309,6 +13309,8 @@ static int bnx2x_init_dev(struct bnx2x *bp, struct pci_dev *pdev,
 
 	dev->features |= dev->hw_features | NETIF_F_HW_VLAN_CTAG_RX;
 	dev->features |= NETIF_F_HIGHDMA;
+	if (dev->features & NETIF_F_GRO_HW)
+		dev->features &= ~NETIF_F_LRO;
 
 	/* Add Loopback capability to the device */
 	dev->hw_features |= NETIF_F_LOOPBACK;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH net-next v2 5/5] qede: Use NETIF_F_GRO_HW.
  2017-12-07  8:03 [PATCH net-next v2 0/5] Introduce NETIF_F_GRO_HW Michael Chan
                   ` (3 preceding siblings ...)
  2017-12-07  8:03 ` [PATCH net-next v2 4/5] bnx2x: " Michael Chan
@ 2017-12-07  8:03 ` Michael Chan
  2017-12-08 17:02   ` Chopra, Manish
  2017-12-08 22:09   ` Marcelo Ricardo Leitner
  4 siblings, 2 replies; 19+ messages in thread
From: Michael Chan @ 2017-12-07  8:03 UTC (permalink / raw)
  To: davem; +Cc: netdev, andrew.gospodarek, Ariel Elior, everest-linux-l2

Advertise NETIF_F_GRO_HW and set edev->gro_disable according to the
feature flag.  Add qede_fix_features() to drop NETIF_F_GRO_HW if
XDP is running or MTU does not support GRO_HW.  qede_change_mtu()
also checks and disables GRO_HW if MTU is not supported.

Cc: Ariel Elior <Ariel.Elior@cavium.com>
Cc: everest-linux-l2@cavium.com
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/qlogic/qede/qede.h         |  2 ++
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c |  3 +++
 drivers/net/ethernet/qlogic/qede/qede_filter.c  | 20 +++++++++++++-------
 drivers/net/ethernet/qlogic/qede/qede_main.c    | 17 ++++++-----------
 4 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h
index a3a70ad..8a33651 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -494,6 +494,8 @@ int qede_free_tx_pkt(struct qede_dev *edev,
 void qede_vlan_mark_nonconfigured(struct qede_dev *edev);
 int qede_configure_vlan_filters(struct qede_dev *edev);
 
+netdev_features_t qede_fix_features(struct net_device *dev,
+				    netdev_features_t features);
 int qede_set_features(struct net_device *dev, netdev_features_t features);
 void qede_set_rx_mode(struct net_device *ndev);
 void qede_config_rx_mode(struct net_device *ndev);
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index dae7412..4ca3847 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -940,6 +940,9 @@ int qede_change_mtu(struct net_device *ndev, int new_mtu)
 	DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
 		   "Configuring MTU size of %d\n", new_mtu);
 
+	if (new_mtu > PAGE_SIZE)
+		ndev->features &= ~NETIF_F_GRO_HW;
+
 	/* Set the mtu field and re-start the interface if needed */
 	args.u.mtu = new_mtu;
 	args.func = &qede_update_mtu;
diff --git a/drivers/net/ethernet/qlogic/qede/qede_filter.c b/drivers/net/ethernet/qlogic/qede/qede_filter.c
index c1a0708..2de947e 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_filter.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_filter.c
@@ -895,19 +895,25 @@ static void qede_set_features_reload(struct qede_dev *edev,
 	edev->ndev->features = args->u.features;
 }
 
+netdev_features_t qede_fix_features(struct net_device *dev,
+				    netdev_features_t features)
+{
+	struct qede_dev *edev = netdev_priv(dev);
+
+	if (edev->xdp_prog || edev->ndev->mtu > PAGE_SIZE)
+		features &= ~NETIF_F_GRO_HW;
+
+	return features;
+}
+
 int qede_set_features(struct net_device *dev, netdev_features_t features)
 {
 	struct qede_dev *edev = netdev_priv(dev);
 	netdev_features_t changes = features ^ dev->features;
 	bool need_reload = false;
 
-	/* No action needed if hardware GRO is disabled during driver load */
-	if (changes & NETIF_F_GRO) {
-		if (dev->features & NETIF_F_GRO)
-			need_reload = !edev->gro_disable;
-		else
-			need_reload = edev->gro_disable;
-	}
+	if (changes & NETIF_F_GRO_HW)
+		need_reload = true;
 
 	if (need_reload) {
 		struct qede_reload_args args;
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index 57332b3..90d79ae 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -545,6 +545,7 @@ static int qede_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 #endif
 	.ndo_vlan_rx_add_vid = qede_vlan_rx_add_vid,
 	.ndo_vlan_rx_kill_vid = qede_vlan_rx_kill_vid,
+	.ndo_fix_features = qede_fix_features,
 	.ndo_set_features = qede_set_features,
 	.ndo_get_stats64 = qede_get_stats64,
 #ifdef CONFIG_QED_SRIOV
@@ -572,6 +573,7 @@ static int qede_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 	.ndo_change_mtu = qede_change_mtu,
 	.ndo_vlan_rx_add_vid = qede_vlan_rx_add_vid,
 	.ndo_vlan_rx_kill_vid = qede_vlan_rx_kill_vid,
+	.ndo_fix_features = qede_fix_features,
 	.ndo_set_features = qede_set_features,
 	.ndo_get_stats64 = qede_get_stats64,
 	.ndo_udp_tunnel_add = qede_udp_tunnel_add,
@@ -589,6 +591,7 @@ static int qede_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 	.ndo_change_mtu = qede_change_mtu,
 	.ndo_vlan_rx_add_vid = qede_vlan_rx_add_vid,
 	.ndo_vlan_rx_kill_vid = qede_vlan_rx_kill_vid,
+	.ndo_fix_features = qede_fix_features,
 	.ndo_set_features = qede_set_features,
 	.ndo_get_stats64 = qede_get_stats64,
 	.ndo_udp_tunnel_add = qede_udp_tunnel_add,
@@ -676,7 +679,7 @@ static void qede_init_ndev(struct qede_dev *edev)
 	ndev->priv_flags |= IFF_UNICAST_FLT;
 
 	/* user-changeble features */
-	hw_features = NETIF_F_GRO | NETIF_F_SG |
+	hw_features = NETIF_F_GRO | NETIF_F_GRO_HW | NETIF_F_SG |
 		      NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
 		      NETIF_F_TSO | NETIF_F_TSO6;
 
@@ -1228,18 +1231,9 @@ static int qede_alloc_sge_mem(struct qede_dev *edev, struct qede_rx_queue *rxq)
 	dma_addr_t mapping;
 	int i;
 
-	/* Don't perform FW aggregations in case of XDP */
-	if (edev->xdp_prog)
-		edev->gro_disable = 1;
-
 	if (edev->gro_disable)
 		return 0;
 
-	if (edev->ndev->mtu > PAGE_SIZE) {
-		edev->gro_disable = 1;
-		return 0;
-	}
-
 	for (i = 0; i < ETH_TPA_MAX_AGGS_NUM; i++) {
 		struct qede_agg_info *tpa_info = &rxq->tpa_info[i];
 		struct sw_rx_data *replace_buf = &tpa_info->buffer;
@@ -1269,6 +1263,7 @@ static int qede_alloc_sge_mem(struct qede_dev *edev, struct qede_rx_queue *rxq)
 err:
 	qede_free_sge_mem(edev, rxq);
 	edev->gro_disable = 1;
+	edev->ndev->features &= ~NETIF_F_GRO_HW;
 	return -ENOMEM;
 }
 
@@ -1511,7 +1506,7 @@ static void qede_init_fp(struct qede_dev *edev)
 			 edev->ndev->name, queue_id);
 	}
 
-	edev->gro_disable = !(edev->ndev->features & NETIF_F_GRO);
+	edev->gro_disable = !(edev->ndev->features & NETIF_F_GRO_HW);
 }
 
 static int qede_set_real_num_queues(struct qede_dev *edev)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 1/5] net: Introduce NETIF_F_GRO_HW.
  2017-12-07  8:03 ` [PATCH net-next v2 1/5] net: " Michael Chan
@ 2017-12-07 18:13   ` Alexander Duyck
  2017-12-07 18:44     ` Michael Chan
  0 siblings, 1 reply; 19+ messages in thread
From: Alexander Duyck @ 2017-12-07 18:13 UTC (permalink / raw)
  To: Michael Chan
  Cc: David Miller, Netdev, andrew.gospodarek, Ariel Elior, everest-linux-l2

On Thu, Dec 7, 2017 at 12:03 AM, Michael Chan <michael.chan@broadcom.com> wrote:
> Introduce NETIF_F_GRO_HW feature flag for NICs that support hardware
> GRO.  With this flag, we can now independently turn on or off hardware
> GRO when GRO is on.  Previously, drivers were using NETIF_F_GRO to
> control hardware GRO and so it cannot be independently turned on or
> off without affecting GRO.
>
> Hardware GRO (just like GRO) guarantees that packets can be re-segmented
> by TSO/GSO to reconstruct the original packet stream.  It is a subset of
> NETIF_F_GRO and depends on it
>
> Since NETIF_F_GRO is not propagated between upper and lower devices,
> NETIF_F_GRO_HW should follow suit since it is a subset of GRO.  In other
> words, a lower device can independent have GRO/GRO_HW enabled or disabled
> and no feature propagation is required.  This will preserve the current
> GRO behavior.
>
> Cc: Ariel Elior <Ariel.Elior@cavium.com>
> Cc: everest-linux-l2@cavium.com
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>
> ---
>  Documentation/networking/netdev-features.txt |  8 ++++++++
>  include/linux/netdev_features.h              |  3 +++
>  net/core/dev.c                               | 17 +++++++++++++++++
>  net/core/ethtool.c                           |  1 +
>  4 files changed, 29 insertions(+)
>
> diff --git a/Documentation/networking/netdev-features.txt b/Documentation/networking/netdev-features.txt
> index 7413eb0..8f36527 100644
> --- a/Documentation/networking/netdev-features.txt
> +++ b/Documentation/networking/netdev-features.txt
> @@ -163,3 +163,11 @@ This requests that the NIC receive all possible frames, including errored
>  frames (such as bad FCS, etc).  This can be helpful when sniffing a link with
>  bad packets on it.  Some NICs may receive more packets if also put into normal
>  PROMISC mode.
> +
> +*  rx-gro-hw
> +
> +This requests that the NIC enables Hardware GRO (generic receive offload).
> +Hardware GRO is basically the exact reverse of TSO, and is generally
> +stricter than Hardware LRO.  A packet stream merged by Hardware GRO must
> +be re-segmentable by GSO or TSO back to the exact original packet stream.
> +Hardware GRO is dependent on GRO and RXCSUM.
> diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
> index b1b0ca7..db84c51 100644
> --- a/include/linux/netdev_features.h
> +++ b/include/linux/netdev_features.h
> @@ -78,6 +78,8 @@ enum {
>         NETIF_F_HW_ESP_TX_CSUM_BIT,     /* ESP with TX checksum offload */
>         NETIF_F_RX_UDP_TUNNEL_PORT_BIT, /* Offload of RX port for UDP tunnels */
>
> +       NETIF_F_GRO_HW_BIT,             /* Hardware Generic receive offload */
> +
>         /*
>          * Add your fresh new feature above and remember to update
>          * netdev_features_strings[] in net/core/ethtool.c and maybe
> @@ -97,6 +99,7 @@ enum {
>  #define NETIF_F_FRAGLIST       __NETIF_F(FRAGLIST)
>  #define NETIF_F_FSO            __NETIF_F(FSO)
>  #define NETIF_F_GRO            __NETIF_F(GRO)
> +#define NETIF_F_GRO_HW         __NETIF_F(GRO_HW)
>  #define NETIF_F_GSO            __NETIF_F(GSO)
>  #define NETIF_F_GSO_ROBUST     __NETIF_F(GSO_ROBUST)
>  #define NETIF_F_HIGHDMA                __NETIF_F(HIGHDMA)
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 6bea893..7242e5e 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -7405,6 +7405,23 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
>                 features &= ~dev->gso_partial_features;
>         }
>
> +       if (features & NETIF_F_GRO_HW) {
> +               /* Hardware GRO depends on GRO and RXCSUM. */
> +               if (!(features & NETIF_F_GRO)) {
> +                       netdev_dbg(dev, "Dropping NETIF_F_GSO_HW since no GRO feature.\n");
> +                       features &= ~NETIF_F_GRO_HW;
> +               }

I'm not sure I agree with this part. The hardware offload should not
be dependent on the software offload. If you are concerned with the
XDP case and such you should probably just look at handling it more
like how TSO is handled with a group of flags that aggregate GRO, LRO,
and GRO_HW into a group of features that must be disabled to support
XDP.

> +               if (!(features & NETIF_F_RXCSUM)) {
> +                       netdev_dbg(dev, "Dropping NETIF_F_GSO_HW since no RXCSUM feature.\n");
> +                       features &= ~NETIF_F_GRO_HW;
> +               }
> +               /* Hardware GRO and LRO are mutually exclusive. */
> +               if (features & NETIF_F_LRO) {
> +                       netdev_dbg(dev, "Dropping NETIF_F_LRO since GRO_HW is set.\n");
> +                       features &= ~NETIF_F_LRO;
> +               }

This is going to be problematic. What if you bond one interface that
has LRO and one that has GRO_HW? It seems like this is going to cause
the bond interface to lie and say LRO isn't there when it actually is,
or worse yet it will force features off when they shouldn't be.

> +       }
> +
>         return features;
>  }
>
> diff --git a/net/core/ethtool.c b/net/core/ethtool.c
> index f8fcf45..50a7920 100644
> --- a/net/core/ethtool.c
> +++ b/net/core/ethtool.c
> @@ -73,6 +73,7 @@ int ethtool_op_get_ts_info(struct net_device *dev, struct ethtool_ts_info *info)
>         [NETIF_F_LLTX_BIT] =             "tx-lockless",
>         [NETIF_F_NETNS_LOCAL_BIT] =      "netns-local",
>         [NETIF_F_GRO_BIT] =              "rx-gro",
> +       [NETIF_F_GRO_HW_BIT] =           "rx-gro-hw",
>         [NETIF_F_LRO_BIT] =              "rx-lro",
>
>         [NETIF_F_TSO_BIT] =              "tx-tcp-segmentation",
> --
> 1.8.3.1
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 1/5] net: Introduce NETIF_F_GRO_HW.
  2017-12-07 18:13   ` Alexander Duyck
@ 2017-12-07 18:44     ` Michael Chan
  2017-12-07 21:35       ` Alexander Duyck
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Chan @ 2017-12-07 18:44 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David Miller, Netdev, Andrew Gospodarek, Ariel Elior, everest-linux-l2

On Thu, Dec 7, 2017 at 10:13 AM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On Thu, Dec 7, 2017 at 12:03 AM, Michael Chan <michael.chan@broadcom.com> wrote:
>> @@ -7405,6 +7405,23 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
>>                 features &= ~dev->gso_partial_features;
>>         }
>>
>> +       if (features & NETIF_F_GRO_HW) {
>> +               /* Hardware GRO depends on GRO and RXCSUM. */
>> +               if (!(features & NETIF_F_GRO)) {
>> +                       netdev_dbg(dev, "Dropping NETIF_F_GSO_HW since no GRO feature.\n");
>> +                       features &= ~NETIF_F_GRO_HW;
>> +               }
>
> I'm not sure I agree with this part. The hardware offload should not
> be dependent on the software offload.

As I said before, GRO_HW is basically hardware accelerated GRO. To me,
it makes perfect sense that GRO_HW depends on GRO.

> If you are concerned with the
> XDP case and such you should probably just look at handling it more
> like how TSO is handled with a group of flags that aggregate GRO, LRO,
> and GRO_HW into a group of features that must be disabled to support
> XDP.
>
>> +               if (!(features & NETIF_F_RXCSUM)) {
>> +                       netdev_dbg(dev, "Dropping NETIF_F_GSO_HW since no RXCSUM feature.\n");
>> +                       features &= ~NETIF_F_GRO_HW;
>> +               }
>> +               /* Hardware GRO and LRO are mutually exclusive. */
>> +               if (features & NETIF_F_LRO) {
>> +                       netdev_dbg(dev, "Dropping NETIF_F_LRO since GRO_HW is set.\n");
>> +                       features &= ~NETIF_F_LRO;
>> +               }
>
> This is going to be problematic. What if you bond one interface that
> has LRO and one that has GRO_HW? It seems like this is going to cause
> the bond interface to lie and say LRO isn't there when it actually is,
> or worse yet it will force features off when they shouldn't be.

This is just dropping incompatible features similar to how other
incompatible feature flags are dropped in netdev_fix_features().
Hardware cannot aggregate in both LRO and GRO_HW modes at the same
time.  It's one or the other.

LRO and GRO_HW are different.  We may never agree on this but they are
different.  If a bond needs to disable LRO, it will be propagated to
all slaves.  But each slave can have GRO enabled.  If GRO is enabled,
GRO_HW, being a subset of GRO, can be enabled.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 1/5] net: Introduce NETIF_F_GRO_HW.
  2017-12-07 18:44     ` Michael Chan
@ 2017-12-07 21:35       ` Alexander Duyck
  2017-12-07 22:08         ` Michael Chan
  0 siblings, 1 reply; 19+ messages in thread
From: Alexander Duyck @ 2017-12-07 21:35 UTC (permalink / raw)
  To: Michael Chan
  Cc: David Miller, Netdev, Andrew Gospodarek, Ariel Elior, everest-linux-l2

On Thu, Dec 7, 2017 at 10:44 AM, Michael Chan <michael.chan@broadcom.com> wrote:
> On Thu, Dec 7, 2017 at 10:13 AM, Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>> On Thu, Dec 7, 2017 at 12:03 AM, Michael Chan <michael.chan@broadcom.com> wrote:
>>> @@ -7405,6 +7405,23 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
>>>                 features &= ~dev->gso_partial_features;
>>>         }
>>>
>>> +       if (features & NETIF_F_GRO_HW) {
>>> +               /* Hardware GRO depends on GRO and RXCSUM. */
>>> +               if (!(features & NETIF_F_GRO)) {
>>> +                       netdev_dbg(dev, "Dropping NETIF_F_GSO_HW since no GRO feature.\n");
>>> +                       features &= ~NETIF_F_GRO_HW;
>>> +               }
>>
>> I'm not sure I agree with this part. The hardware offload should not
>> be dependent on the software offload.
>
> As I said before, GRO_HW is basically hardware accelerated GRO. To me,
> it makes perfect sense that GRO_HW depends on GRO.
>
>> If you are concerned with the
>> XDP case and such you should probably just look at handling it more
>> like how TSO is handled with a group of flags that aggregate GRO, LRO,
>> and GRO_HW into a group of features that must be disabled to support
>> XDP.
>>
>>> +               if (!(features & NETIF_F_RXCSUM)) {
>>> +                       netdev_dbg(dev, "Dropping NETIF_F_GSO_HW since no RXCSUM feature.\n");
>>> +                       features &= ~NETIF_F_GRO_HW;
>>> +               }
>>> +               /* Hardware GRO and LRO are mutually exclusive. */
>>> +               if (features & NETIF_F_LRO) {
>>> +                       netdev_dbg(dev, "Dropping NETIF_F_LRO since GRO_HW is set.\n");
>>> +                       features &= ~NETIF_F_LRO;
>>> +               }
>>
>> This is going to be problematic. What if you bond one interface that
>> has LRO and one that has GRO_HW? It seems like this is going to cause
>> the bond interface to lie and say LRO isn't there when it actually is,
>> or worse yet it will force features off when they shouldn't be.
>
> This is just dropping incompatible features similar to how other
> incompatible feature flags are dropped in netdev_fix_features().
> Hardware cannot aggregate in both LRO and GRO_HW modes at the same
> time.  It's one or the other.

I'm not saying this is on a single device. The problem is in the case
of bonding you have combined 2 devices. We don't want to have an upper
device saying LRO isn't there when it is. A bonding setup is supposed
to advertise LRO as being enabled if a lower device has it present.
You are now making this an either/or thing at the driver level
generically when that isn't the case due to things like bridging and
bonding. This is why I was arguing that it would be easier to just
make this a super-set of LRO with one form that is reversible and one
that isn't. As is you are introducing all sorts of regressions that
are going to be problematic in any sort of mixed environment.

> LRO and GRO_HW are different.  We may never agree on this but they are
> different.  If a bond needs to disable LRO, it will be propagated to
> all slaves.  But each slave can have GRO enabled.  If GRO is enabled,
> GRO_HW, being a subset of GRO, can be enabled.

I get that they are very different. At a driver level it makes sense
to say you get one or the other at a physical driver level. The point
I was trying to make is that the bond is only able to set one or the
other when you might have a mix of devices. On the bonding device you
should be able to have both LRO and GRO_HW enabled.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 1/5] net: Introduce NETIF_F_GRO_HW.
  2017-12-07 21:35       ` Alexander Duyck
@ 2017-12-07 22:08         ` Michael Chan
  2017-12-07 22:43           ` Alexander Duyck
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Chan @ 2017-12-07 22:08 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David Miller, Netdev, Andrew Gospodarek, Ariel Elior, everest-linux-l2

On Thu, Dec 7, 2017 at 1:35 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On Thu, Dec 7, 2017 at 10:44 AM, Michael Chan <michael.chan@broadcom.com> wrote:
>> On Thu, Dec 7, 2017 at 10:13 AM, Alexander Duyck
>> <alexander.duyck@gmail.com> wrote:
>>> On Thu, Dec 7, 2017 at 12:03 AM, Michael Chan <michael.chan@broadcom.com> wrote:
>>>> @@ -7405,6 +7405,23 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
>>>>                 features &= ~dev->gso_partial_features;
>>>>         }
>>>>
>>>> +       if (features & NETIF_F_GRO_HW) {
>>>> +               /* Hardware GRO depends on GRO and RXCSUM. */
>>>> +               if (!(features & NETIF_F_GRO)) {
>>>> +                       netdev_dbg(dev, "Dropping NETIF_F_GSO_HW since no GRO feature.\n");
>>>> +                       features &= ~NETIF_F_GRO_HW;
>>>> +               }
>>>
>>> I'm not sure I agree with this part. The hardware offload should not
>>> be dependent on the software offload.
>>
>> As I said before, GRO_HW is basically hardware accelerated GRO. To me,
>> it makes perfect sense that GRO_HW depends on GRO.
>>
>>> If you are concerned with the
>>> XDP case and such you should probably just look at handling it more
>>> like how TSO is handled with a group of flags that aggregate GRO, LRO,
>>> and GRO_HW into a group of features that must be disabled to support
>>> XDP.
>>>
>>>> +               if (!(features & NETIF_F_RXCSUM)) {
>>>> +                       netdev_dbg(dev, "Dropping NETIF_F_GSO_HW since no RXCSUM feature.\n");
>>>> +                       features &= ~NETIF_F_GRO_HW;
>>>> +               }
>>>> +               /* Hardware GRO and LRO are mutually exclusive. */
>>>> +               if (features & NETIF_F_LRO) {
>>>> +                       netdev_dbg(dev, "Dropping NETIF_F_LRO since GRO_HW is set.\n");
>>>> +                       features &= ~NETIF_F_LRO;
>>>> +               }
>>>
>>> This is going to be problematic. What if you bond one interface that
>>> has LRO and one that has GRO_HW? It seems like this is going to cause
>>> the bond interface to lie and say LRO isn't there when it actually is,
>>> or worse yet it will force features off when they shouldn't be.
>>
>> This is just dropping incompatible features similar to how other
>> incompatible feature flags are dropped in netdev_fix_features().
>> Hardware cannot aggregate in both LRO and GRO_HW modes at the same
>> time.  It's one or the other.
>
> I'm not saying this is on a single device. The problem is in the case
> of bonding you have combined 2 devices. We don't want to have an upper
> device saying LRO isn't there when it is. A bonding setup is supposed
> to advertise LRO as being enabled if a lower device has it present.
> You are now making this an either/or thing at the driver level
> generically when that isn't the case due to things like bridging and
> bonding. This is why I was arguing that it would be easier to just
> make this a super-set of LRO with one form that is reversible and one
> that isn't. As is you are introducing all sorts of regressions that
> are going to be problematic in any sort of mixed environment.

Are you saying that if LRO is dropped on a upper device (by the new
code added in this patch), all lower devices will need to drop it too?

If that's what you are saying, it's already taken care of.  There is
netdev_sync_upper_features() and netdev_sync_lower_features() that
will automatically do that as part of __netdev_update_features().

>
>> LRO and GRO_HW are different.  We may never agree on this but they are
>> different.  If a bond needs to disable LRO, it will be propagated to
>> all slaves.  But each slave can have GRO enabled.  If GRO is enabled,
>> GRO_HW, being a subset of GRO, can be enabled.
>
> I get that they are very different. At a driver level it makes sense
> to say you get one or the other at a physical driver level. The point
> I was trying to make is that the bond is only able to set one or the
> other when you might have a mix of devices. On the bonding device you
> should be able to have both LRO and GRO_HW enabled.

On the bond, you can have LRO enabled and it is propagated to lower
devices so that lower devices will enable LRO if it is supported.  If
LRO is disabled on the bond (e.g. due to bridging), It will be
propagated so all lower devices will disable LRO if it is enabled.

GRO/GRO_HW is different and it is not propagated from upper devices to
lower devices like LRO.  Lower devices can have GRO/GRO_HW
enabled/disabled regardless of upper device.  This is no different
from say RXCSUM.  Lower devices can have RXCSUM on/off regardless.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 1/5] net: Introduce NETIF_F_GRO_HW.
  2017-12-07 22:08         ` Michael Chan
@ 2017-12-07 22:43           ` Alexander Duyck
  2017-12-07 23:17             ` Michael Chan
  0 siblings, 1 reply; 19+ messages in thread
From: Alexander Duyck @ 2017-12-07 22:43 UTC (permalink / raw)
  To: Michael Chan
  Cc: David Miller, Netdev, Andrew Gospodarek, Ariel Elior, everest-linux-l2

On Thu, Dec 7, 2017 at 2:08 PM, Michael Chan <michael.chan@broadcom.com> wrote:
> On Thu, Dec 7, 2017 at 1:35 PM, Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>> On Thu, Dec 7, 2017 at 10:44 AM, Michael Chan <michael.chan@broadcom.com> wrote:
>>> On Thu, Dec 7, 2017 at 10:13 AM, Alexander Duyck
>>> <alexander.duyck@gmail.com> wrote:
>>>> On Thu, Dec 7, 2017 at 12:03 AM, Michael Chan <michael.chan@broadcom.com> wrote:
>>>>> @@ -7405,6 +7405,23 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
>>>>>                 features &= ~dev->gso_partial_features;
>>>>>         }
>>>>>
>>>>> +       if (features & NETIF_F_GRO_HW) {
>>>>> +               /* Hardware GRO depends on GRO and RXCSUM. */
>>>>> +               if (!(features & NETIF_F_GRO)) {
>>>>> +                       netdev_dbg(dev, "Dropping NETIF_F_GSO_HW since no GRO feature.\n");
>>>>> +                       features &= ~NETIF_F_GRO_HW;
>>>>> +               }
>>>>
>>>> I'm not sure I agree with this part. The hardware offload should not
>>>> be dependent on the software offload.
>>>
>>> As I said before, GRO_HW is basically hardware accelerated GRO. To me,
>>> it makes perfect sense that GRO_HW depends on GRO.
>>>
>>>> If you are concerned with the
>>>> XDP case and such you should probably just look at handling it more
>>>> like how TSO is handled with a group of flags that aggregate GRO, LRO,
>>>> and GRO_HW into a group of features that must be disabled to support
>>>> XDP.
>>>>
>>>>> +               if (!(features & NETIF_F_RXCSUM)) {
>>>>> +                       netdev_dbg(dev, "Dropping NETIF_F_GSO_HW since no RXCSUM feature.\n");
>>>>> +                       features &= ~NETIF_F_GRO_HW;
>>>>> +               }
>>>>> +               /* Hardware GRO and LRO are mutually exclusive. */
>>>>> +               if (features & NETIF_F_LRO) {
>>>>> +                       netdev_dbg(dev, "Dropping NETIF_F_LRO since GRO_HW is set.\n");
>>>>> +                       features &= ~NETIF_F_LRO;
>>>>> +               }
>>>>
>>>> This is going to be problematic. What if you bond one interface that
>>>> has LRO and one that has GRO_HW? It seems like this is going to cause
>>>> the bond interface to lie and say LRO isn't there when it actually is,
>>>> or worse yet it will force features off when they shouldn't be.
>>>
>>> This is just dropping incompatible features similar to how other
>>> incompatible feature flags are dropped in netdev_fix_features().
>>> Hardware cannot aggregate in both LRO and GRO_HW modes at the same
>>> time.  It's one or the other.
>>
>> I'm not saying this is on a single device. The problem is in the case
>> of bonding you have combined 2 devices. We don't want to have an upper
>> device saying LRO isn't there when it is. A bonding setup is supposed
>> to advertise LRO as being enabled if a lower device has it present.
>> You are now making this an either/or thing at the driver level
>> generically when that isn't the case due to things like bridging and
>> bonding. This is why I was arguing that it would be easier to just
>> make this a super-set of LRO with one form that is reversible and one
>> that isn't. As is you are introducing all sorts of regressions that
>> are going to be problematic in any sort of mixed environment.
>
> Are you saying that if LRO is dropped on a upper device (by the new
> code added in this patch), all lower devices will need to drop it too?
>
> If that's what you are saying, it's already taken care of.  There is
> netdev_sync_upper_features() and netdev_sync_lower_features() that
> will automatically do that as part of __netdev_update_features().
>
>>
>>> LRO and GRO_HW are different.  We may never agree on this but they are
>>> different.  If a bond needs to disable LRO, it will be propagated to
>>> all slaves.  But each slave can have GRO enabled.  If GRO is enabled,
>>> GRO_HW, being a subset of GRO, can be enabled.
>>
>> I get that they are very different. At a driver level it makes sense
>> to say you get one or the other at a physical driver level. The point
>> I was trying to make is that the bond is only able to set one or the
>> other when you might have a mix of devices. On the bonding device you
>> should be able to have both LRO and GRO_HW enabled.
>
> On the bond, you can have LRO enabled and it is propagated to lower
> devices so that lower devices will enable LRO if it is supported.  If
> LRO is disabled on the bond (e.g. due to bridging), It will be
> propagated so all lower devices will disable LRO if it is enabled.
>
> GRO/GRO_HW is different and it is not propagated from upper devices to
> lower devices like LRO.  Lower devices can have GRO/GRO_HW
> enabled/disabled regardless of upper device.  This is no different
> from say RXCSUM.  Lower devices can have RXCSUM on/off regardless.

So on the bond device you should be able to have LRO, GRO, and GRO_HW
enabled all at the same time and that means devices below it can have
any combination of those enabled. If I recall enabling it on the lower
device shouldn't be allowed if an upper device has it disabled. If it
is enabled on the upper device it can be either enabled or disabled on
the lower device depending on the state the device prefers.

The GRO and GRO_HW should be propagated just like RXCSUM and
everything else. Basically if I turn it off on the bond all of the
lower devices should have it disabled. Only if it is enabled should
lower devices be allowed to have it either enabled or disabled. The
easiest way to think of it is on the bond the features are like a mask
of what can be enabled/disabled for the lower devices. If a feature is
not present in the bond it should not be present on any of the devices
below it. That is why I am saying you cannot say LRO and GRO_HW are
mutually exclusive for all devices. For physical devices I would say
yes it can be mutually exclusive, but for things like bond it cannot
be since you can have one device doing LRO and one device doing GRO_HW
getting added to a bond and both features should be advertised there.

My concern is if I have a bond with both GRO_HW and LRO enabled on
lower devices, and then disable something like TSO we are going to see
LRO disabled on the bond and propagated to all the lower devices. That
is a bad behavior and shouldn't happen. In addition I would want to be
able to disable GRO_HW from the bond should I encounter a buggy
implementation of the GRO hardware offload at some point in the
future. I don't like it being tied so closely with GRO since all it
takes is one hardware failure and then we are dealing with people
disabling GRO due to reports of it being buggy instead of it being
tied specifically to GRO_HW.

- Alex

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 1/5] net: Introduce NETIF_F_GRO_HW.
  2017-12-07 22:43           ` Alexander Duyck
@ 2017-12-07 23:17             ` Michael Chan
  2017-12-07 23:35               ` Alexander Duyck
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Chan @ 2017-12-07 23:17 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David Miller, Netdev, Andrew Gospodarek, Ariel Elior, everest-linux-l2

On Thu, Dec 7, 2017 at 2:43 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On Thu, Dec 7, 2017 at 2:08 PM, Michael Chan <michael.chan@broadcom.com> wrote:
>> On the bond, you can have LRO enabled and it is propagated to lower
>> devices so that lower devices will enable LRO if it is supported.  If
>> LRO is disabled on the bond (e.g. due to bridging), It will be
>> propagated so all lower devices will disable LRO if it is enabled.
>>
>> GRO/GRO_HW is different and it is not propagated from upper devices to
>> lower devices like LRO.  Lower devices can have GRO/GRO_HW
>> enabled/disabled regardless of upper device.  This is no different
>> from say RXCSUM.  Lower devices can have RXCSUM on/off regardless.
>
> So on the bond device you should be able to have LRO, GRO, and GRO_HW
> enabled all at the same time and that means devices below it can have
> any combination of those enabled. If I recall enabling it on the lower
> device shouldn't be allowed if an upper device has it disabled. If it
> is enabled on the upper device it can be either enabled or disabled on
> the lower device depending on the state the device prefers.

Only LRO behaves the way you describe.  I believe LRO is treated like
this differently because bridging or routing needs to be able to
disable it completely.

GRO is not like that.  RXCSUM is not like that.

>
> The GRO and GRO_HW should be propagated just like RXCSUM and
> everything else. Basically if I turn it off on the bond all of the
> lower devices should have it disabled. Only if it is enabled should
> lower devices be allowed to have it either enabled or disabled. The
> easiest way to think of it is on the bond the features are like a mask
> of what can be enabled/disabled for the lower devices. If a feature is
> not present in the bond it should not be present on any of the devices
> below it. That is why I am saying you cannot say LRO and GRO_HW are
> mutually exclusive for all devices. For physical devices I would say
> yes it can be mutually exclusive, but for things like bond it cannot
> be since you can have one device doing LRO and one device doing GRO_HW
> getting added to a bond and both features should be advertised there.

I think you keep thinking that whatever we do with LRO, we must do the
same thing with GRO_HW.

But that's not true.  GRO/GRO_HW don't have to be disabled when
bridging/routing are enabled.  That's why upper devices don't have to
treat GRO/GRO_HW like LRO.  It can be treated just like RXCSUM which
is don't care.

>
> My concern is if I have a bond with both GRO_HW and LRO enabled on
> lower devices, and then disable something like TSO we are going to see
> LRO disabled on the bond and propagated to all the lower devices.

I don't get this.  I don't see how TSO is related.

> That
> is a bad behavior and shouldn't happen. In addition I would want to be
> able to disable GRO_HW from the bond should I encounter a buggy
> implementation of the GRO hardware offload at some point in the
> future. I don't like it being tied so closely with GRO since all it
> takes is one hardware failure and then we are dealing with people
> disabling GRO due to reports of it being buggy instead of it being
> tied specifically to GRO_HW.
>

Today, GRO, RXCSUM, NTUPLE, etc on a bond do not get propagated.  You
can propose and make them propagate if you want.

So GRO_HW just naturally follows GRO since it is a true subset of it.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 1/5] net: Introduce NETIF_F_GRO_HW.
  2017-12-07 23:17             ` Michael Chan
@ 2017-12-07 23:35               ` Alexander Duyck
  2017-12-08  0:05                 ` Michael Chan
  0 siblings, 1 reply; 19+ messages in thread
From: Alexander Duyck @ 2017-12-07 23:35 UTC (permalink / raw)
  To: Michael Chan
  Cc: David Miller, Netdev, Andrew Gospodarek, Ariel Elior, everest-linux-l2

On Thu, Dec 7, 2017 at 3:17 PM, Michael Chan <michael.chan@broadcom.com> wrote:
> On Thu, Dec 7, 2017 at 2:43 PM, Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>> On Thu, Dec 7, 2017 at 2:08 PM, Michael Chan <michael.chan@broadcom.com> wrote:
>>> On the bond, you can have LRO enabled and it is propagated to lower
>>> devices so that lower devices will enable LRO if it is supported.  If
>>> LRO is disabled on the bond (e.g. due to bridging), It will be
>>> propagated so all lower devices will disable LRO if it is enabled.
>>>
>>> GRO/GRO_HW is different and it is not propagated from upper devices to
>>> lower devices like LRO.  Lower devices can have GRO/GRO_HW
>>> enabled/disabled regardless of upper device.  This is no different
>>> from say RXCSUM.  Lower devices can have RXCSUM on/off regardless.
>>
>> So on the bond device you should be able to have LRO, GRO, and GRO_HW
>> enabled all at the same time and that means devices below it can have
>> any combination of those enabled. If I recall enabling it on the lower
>> device shouldn't be allowed if an upper device has it disabled. If it
>> is enabled on the upper device it can be either enabled or disabled on
>> the lower device depending on the state the device prefers.
>
> Only LRO behaves the way you describe.  I believe LRO is treated like
> this differently because bridging or routing needs to be able to
> disable it completely.
>
> GRO is not like that.  RXCSUM is not like that.

I think that what your saying is your patch has exposed a bug. If I
disable GRO or GRO_HW on the upper device I would expect to have it
disabled on all lower devices. If anything it should probably be added
to NETIF_F_UPPER_DISABLES. Otherwise we are going to be breaking
things like XPS which you pointed out earlier.

>> The GRO and GRO_HW should be propagated just like RXCSUM and
>> everything else. Basically if I turn it off on the bond all of the
>> lower devices should have it disabled. Only if it is enabled should
>> lower devices be allowed to have it either enabled or disabled. The
>> easiest way to think of it is on the bond the features are like a mask
>> of what can be enabled/disabled for the lower devices. If a feature is
>> not present in the bond it should not be present on any of the devices
>> below it. That is why I am saying you cannot say LRO and GRO_HW are
>> mutually exclusive for all devices. For physical devices I would say
>> yes it can be mutually exclusive, but for things like bond it cannot
>> be since you can have one device doing LRO and one device doing GRO_HW
>> getting added to a bond and both features should be advertised there.
>
> I think you keep thinking that whatever we do with LRO, we must do the
> same thing with GRO_HW.
>
> But that's not true.  GRO/GRO_HW don't have to be disabled when
> bridging/routing are enabled.  That's why upper devices don't have to
> treat GRO/GRO_HW like LRO.  It can be treated just like RXCSUM which
> is don't care.
>
>>
>> My concern is if I have a bond with both GRO_HW and LRO enabled on
>> lower devices, and then disable something like TSO we are going to see
>> LRO disabled on the bond and propagated to all the lower devices.
>
> I don't get this.  I don't see how TSO is related.

It isn't. That is the point. If I change ANY feature it will trigger
fix_features to get called. It will then see we have GRO_HW and LRO
since we have one interface that supports one and one that supports
another. It will trigger this code and cause LRO to be disabled which
then will be propagated down.

>> That
>> is a bad behavior and shouldn't happen. In addition I would want to be
>> able to disable GRO_HW from the bond should I encounter a buggy
>> implementation of the GRO hardware offload at some point in the
>> future. I don't like it being tied so closely with GRO since all it
>> takes is one hardware failure and then we are dealing with people
>> disabling GRO due to reports of it being buggy instead of it being
>> tied specifically to GRO_HW.
>>
>
> Today, GRO, RXCSUM, NTUPLE, etc on a bond do not get propagated.  You
> can propose and make them propagate if you want.

Or you can fix your patches so you take it into account now instead of
putting things in a broken state and asking others to fix it.

> So GRO_HW just naturally follows GRO since it is a true subset of it.

Right, and I am saying you have exposed a bug. If I don't want GRO on
the bond it should be propagated down. It doesn't make sense to allow
lower devices to assemble frames when the bond doesn't want them and
GSO won't kick in before it gets to the bond.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 1/5] net: Introduce NETIF_F_GRO_HW.
  2017-12-07 23:35               ` Alexander Duyck
@ 2017-12-08  0:05                 ` Michael Chan
  2017-12-08  2:36                   ` Alexander Duyck
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Chan @ 2017-12-08  0:05 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David Miller, Netdev, Andrew Gospodarek, Ariel Elior, everest-linux-l2

On Thu, Dec 7, 2017 at 3:35 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On Thu, Dec 7, 2017 at 3:17 PM, Michael Chan <michael.chan@broadcom.com> wrote:
>> I don't get this.  I don't see how TSO is related.
>
> It isn't. That is the point. If I change ANY feature it will trigger
> fix_features to get called. It will then see we have GRO_HW and LRO
> since we have one interface that supports one and one that supports
> another. It will trigger this code and cause LRO to be disabled which
> then will be propagated down.

I see.  But this won't happen.  Because the bonding driver is not
advertising NETIF_F_GRO_HW in its hw_features.  It is not advertising
NETIF_F_GRO either but I think it gets added automatically since it is
a software feature.  So LRO won't get disabled on the bond when a
unrelated feature is changed.

But I think I see your point.  I can make it so that it is up to
individual driver's .ndo_fix_features() to drop LRO/GRO_HW as it sees
fit, instead of doing it in the common netdev_fix_features().  That
way, it is more flexible at least.

>
>>> That
>>> is a bad behavior and shouldn't happen. In addition I would want to be
>>> able to disable GRO_HW from the bond should I encounter a buggy
>>> implementation of the GRO hardware offload at some point in the
>>> future. I don't like it being tied so closely with GRO since all it
>>> takes is one hardware failure and then we are dealing with people
>>> disabling GRO due to reports of it being buggy instead of it being
>>> tied specifically to GRO_HW.
>>>
>>
>> Today, GRO, RXCSUM, NTUPLE, etc on a bond do not get propagated.  You
>> can propose and make them propagate if you want.
>
> Or you can fix your patches so you take it into account now instead of
> putting things in a broken state and asking others to fix it.

I don't think that things are necessarily broken today.  LRO truly
needs to be propagated.  It's debatable whether other features like
GRO/RXCSUM/NTUPLE should be centrally set by the upper device or not.

>
>> So GRO_HW just naturally follows GRO since it is a true subset of it.
>
> Right, and I am saying you have exposed a bug. If I don't want GRO on
> the bond it should be propagated down. It doesn't make sense to allow
> lower devices to assemble frames when the bond doesn't want them and
> GSO won't kick in before it gets to the bond.

GRO kicks in at the lower device before it gets to the bond if the
lower device calls napi_gro_receive() and GRO is enabled.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 1/5] net: Introduce NETIF_F_GRO_HW.
  2017-12-08  0:05                 ` Michael Chan
@ 2017-12-08  2:36                   ` Alexander Duyck
  2017-12-08  4:02                     ` Michael Chan
  0 siblings, 1 reply; 19+ messages in thread
From: Alexander Duyck @ 2017-12-08  2:36 UTC (permalink / raw)
  To: Michael Chan
  Cc: David Miller, Netdev, Andrew Gospodarek, Ariel Elior, everest-linux-l2

On Thu, Dec 7, 2017 at 4:05 PM, Michael Chan <michael.chan@broadcom.com> wrote:
> On Thu, Dec 7, 2017 at 3:35 PM, Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>> On Thu, Dec 7, 2017 at 3:17 PM, Michael Chan <michael.chan@broadcom.com> wrote:
>>> I don't get this.  I don't see how TSO is related.
>>
>> It isn't. That is the point. If I change ANY feature it will trigger
>> fix_features to get called. It will then see we have GRO_HW and LRO
>> since we have one interface that supports one and one that supports
>> another. It will trigger this code and cause LRO to be disabled which
>> then will be propagated down.
>
> I see.  But this won't happen.  Because the bonding driver is not
> advertising NETIF_F_GRO_HW in its hw_features.  It is not advertising
> NETIF_F_GRO either but I think it gets added automatically since it is
> a software feature.  So LRO won't get disabled on the bond when a
> unrelated feature is changed.
>
> But I think I see your point.  I can make it so that it is up to
> individual driver's .ndo_fix_features() to drop LRO/GRO_HW as it sees
> fit, instead of doing it in the common netdev_fix_features().  That
> way, it is more flexible at least.

Thank you.

>>
>>>> That
>>>> is a bad behavior and shouldn't happen. In addition I would want to be
>>>> able to disable GRO_HW from the bond should I encounter a buggy
>>>> implementation of the GRO hardware offload at some point in the
>>>> future. I don't like it being tied so closely with GRO since all it
>>>> takes is one hardware failure and then we are dealing with people
>>>> disabling GRO due to reports of it being buggy instead of it being
>>>> tied specifically to GRO_HW.
>>>>
>>>
>>> Today, GRO, RXCSUM, NTUPLE, etc on a bond do not get propagated.  You
>>> can propose and make them propagate if you want.
>>
>> Or you can fix your patches so you take it into account now instead of
>> putting things in a broken state and asking others to fix it.
>
> I don't think that things are necessarily broken today.  LRO truly
> needs to be propagated.  It's debatable whether other features like
> GRO/RXCSUM/NTUPLE should be centrally set by the upper device or not.

So I can agree with the NTUPLE not being propagated since it doesn't
actually effect upper devices. Really the functionality only really
has effects locally since the functionality consists of route to a
specific queue/device or drop the packet.

I'm not sure why RXCSUM isn't being propagated. It seems like that is
something that would make sense to have passed all the way down to the
lower devices since a single device that is doing bad Rx checksum
offloads could potentially corrupt all traffic in a bond. Seems like
that one should definitely be included.

>>
>>> So GRO_HW just naturally follows GRO since it is a true subset of it.
>>
>> Right, and I am saying you have exposed a bug. If I don't want GRO on
>> the bond it should be propagated down. It doesn't make sense to allow
>> lower devices to assemble frames when the bond doesn't want them and
>> GSO won't kick in before it gets to the bond.
>
> GRO kicks in at the lower device before it gets to the bond if the
> lower device calls napi_gro_receive() and GRO is enabled.

I get that. I assume the reason why the bond doesn't have it enabled
is because we don't want it to kick in at every given netdev, there
isn't any point to do GRO more than once. The problem is GRO_HW isn't
a pure software offload like GRO is. Call me a pessimist, but when we
end up encountering a buggy implementation that has to be disabled we
will want the right infrastructure in place to handle it. It becomes
another argument for why we might want to split GRO_HW and GRO without
tying them together. It would make sense to expose GRO_HW in a bond,
but not GRO. It might be something where we want to do any close tying
together of the GRO flag and GRO_HW at the driver as well. Basically
the legacy devices that transition over to GRO_HW from using just the
GRO flag could do that to maintain existing functionality, and new
drivers that implement it could opt in to the same behavior or just
handle GRO_HW as a separate flag.

Actually I just had a thought. What if we consider this a separate GRO
stage instead of just a hardware offload? Our standard GRO is a post
receive from the driver perspective, basically the packet is assembled
after we have handed it to the stack. What you are doing with GRO_HW
is essentially providing an early reassembly before it is handed to
the stack. What if we were to rename GRO_HW to something like
GRO_LOWER, GRO_EARLY, GRO_PRE, or pick your name (I'm lousy at
naming), and used it as a way to indicate that we want to perform GRO
before we begin receive processing on the frame in our driver? Then
for stacked devices you could use this new flag to indicate you don't
want to perform GRO on the lower levels below this device, and could
then use the regular GRO flag to control if we do it ourselves. Doing
that should provide stacked devices with a good way to control GRO on
the lower devices and would resolve what you need to indicate as well.
The only real changes needed might be a rename and to add the
necessary bit shifting for the upper and lower dev sync code. If you
aren't interested in the idea I can probably spend a couple of hours
getting to it tomorrow since I think this might be a much better way
to go as it solves multiple issues.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 1/5] net: Introduce NETIF_F_GRO_HW.
  2017-12-08  2:36                   ` Alexander Duyck
@ 2017-12-08  4:02                     ` Michael Chan
  0 siblings, 0 replies; 19+ messages in thread
From: Michael Chan @ 2017-12-08  4:02 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David Miller, Netdev, Andrew Gospodarek, Ariel Elior, everest-linux-l2

On Thu, Dec 7, 2017 at 6:36 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> On Thu, Dec 7, 2017 at 4:05 PM, Michael Chan <michael.chan@broadcom.com> wrote:
>> I see.  But this won't happen.  Because the bonding driver is not
>> advertising NETIF_F_GRO_HW in its hw_features.  It is not advertising
>> NETIF_F_GRO either but I think it gets added automatically since it is
>> a software feature.  So LRO won't get disabled on the bond when a
>> unrelated feature is changed.
>>
>> But I think I see your point.  I can make it so that it is up to
>> individual driver's .ndo_fix_features() to drop LRO/GRO_HW as it sees
>> fit, instead of doing it in the common netdev_fix_features().  That
>> way, it is more flexible at least.
>
> Thank you.

OK.  I will make this change for V3.

>>
>> I don't think that things are necessarily broken today.  LRO truly
>> needs to be propagated.  It's debatable whether other features like
>> GRO/RXCSUM/NTUPLE should be centrally set by the upper device or not.
>
> So I can agree with the NTUPLE not being propagated since it doesn't
> actually effect upper devices. Really the functionality only really
> has effects locally since the functionality consists of route to a
> specific queue/device or drop the packet.
>
> I'm not sure why RXCSUM isn't being propagated. It seems like that is
> something that would make sense to have passed all the way down to the
> lower devices since a single device that is doing bad Rx checksum
> offloads could potentially corrupt all traffic in a bond. Seems like
> that one should definitely be included.

This is a separate discussion that goes beyond GRO.  There are pros
and cons for the upper device to propagate every single feature flag.

>>
>> GRO kicks in at the lower device before it gets to the bond if the
>> lower device calls napi_gro_receive() and GRO is enabled.
>
> I get that. I assume the reason why the bond doesn't have it enabled
> is because we don't want it to kick in at every given netdev, there
> isn't any point to do GRO more than once. The problem is GRO_HW isn't
> a pure software offload like GRO is. Call me a pessimist, but when we
> end up encountering a buggy implementation that has to be disabled we
> will want the right infrastructure in place to handle it. It becomes
> another argument for why we might want to split GRO_HW and GRO without
> tying them together. It would make sense to expose GRO_HW in a bond,
> but not GRO. It might be something where we want to do any close tying
> together of the GRO flag and GRO_HW at the driver as well. Basically
> the legacy devices that transition over to GRO_HW from using just the
> GRO flag could do that to maintain existing functionality, and new
> drivers that implement it could opt in to the same behavior or just
> handle GRO_HW as a separate flag.

To me, making GRO_HW dependent on GRO makes the most intuitive sense.
Separating them is just confusing.  The possibility of GRO_HW being
enabled without GRO enabled makes no sense to me.

>
> Actually I just had a thought. What if we consider this a separate GRO
> stage instead of just a hardware offload? Our standard GRO is a post
> receive from the driver perspective, basically the packet is assembled
> after we have handed it to the stack. What you are doing with GRO_HW
> is essentially providing an early reassembly before it is handed to
> the stack. What if we were to rename GRO_HW to something like
> GRO_LOWER, GRO_EARLY, GRO_PRE, or pick your name (I'm lousy at
> naming), and used it as a way to indicate that we want to perform GRO
> before we begin receive processing on the frame in our driver? Then
> for stacked devices you could use this new flag to indicate you don't
> want to perform GRO on the lower levels below this device, and could
> then use the regular GRO flag to control if we do it ourselves. Doing
> that should provide stacked devices with a good way to control GRO on
> the lower devices and would resolve what you need to indicate as well.
> The only real changes needed might be a rename and to add the
> necessary bit shifting for the upper and lower dev sync code. If you
> aren't interested in the idea I can probably spend a couple of hours
> getting to it tomorrow since I think this might be a much better way
> to go as it solves multiple issues.

I really don't see what a different name will buy us.  If you want to
propagate GRO/GRO_HW, we can do that if others agree.  I only feel
strongly that GRO/GRO_HW should be tied.  I don't feel strongly
whether GRO/GRO_HW should be propagated or not propagated.  Again, LRO
needs to be propagated out of necessity (e.g. when a bond is added to
a bridge).

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH net-next v2 5/5] qede: Use NETIF_F_GRO_HW.
  2017-12-07  8:03 ` [PATCH net-next v2 5/5] qede: " Michael Chan
@ 2017-12-08 17:02   ` Chopra, Manish
  2017-12-08 22:09   ` Marcelo Ricardo Leitner
  1 sibling, 0 replies; 19+ messages in thread
From: Chopra, Manish @ 2017-12-08 17:02 UTC (permalink / raw)
  To: Michael Chan, davem
  Cc: netdev, andrew.gospodarek, Elior, Ariel, Dept-Eng Everest Linux L2

> -----Original Message-----
> From: Michael Chan [mailto:michael.chan@broadcom.com]
> Sent: Thursday, December 07, 2017 1:34 PM
> To: davem@davemloft.net
> Cc: netdev@vger.kernel.org; andrew.gospodarek@broadcom.com; Elior, Ariel
> <Ariel.Elior@cavium.com>; Dept-Eng Everest Linux L2 <Dept-
> EngEverestLinuxL2@cavium.com>
> Subject: [PATCH net-next v2 5/5] qede: Use NETIF_F_GRO_HW.
> 
> Advertise NETIF_F_GRO_HW and set edev->gro_disable according to the feature
> flag.  Add qede_fix_features() to drop NETIF_F_GRO_HW if XDP is running or
> MTU does not support GRO_HW.  qede_change_mtu() also checks and disables
> GRO_HW if MTU is not supported.
> 
> Cc: Ariel Elior <Ariel.Elior@cavium.com>
> Cc: everest-linux-l2@cavium.com
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>
> ---
>  drivers/net/ethernet/qlogic/qede/qede.h         |  2 ++
>  drivers/net/ethernet/qlogic/qede/qede_ethtool.c |  3 +++
> drivers/net/ethernet/qlogic/qede/qede_filter.c  | 20 +++++++++++++-------
>  drivers/net/ethernet/qlogic/qede/qede_main.c    | 17 ++++++-----------
>  4 files changed, 24 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/net/ethernet/qlogic/qede/qede.h
> b/drivers/net/ethernet/qlogic/qede/qede.h
> index a3a70ad..8a33651 100644
> --- a/drivers/net/ethernet/qlogic/qede/qede.h
> +++ b/drivers/net/ethernet/qlogic/qede/qede.h
> @@ -494,6 +494,8 @@ int qede_free_tx_pkt(struct qede_dev *edev,  void
> qede_vlan_mark_nonconfigured(struct qede_dev *edev);  int
> qede_configure_vlan_filters(struct qede_dev *edev);
> 
> +netdev_features_t qede_fix_features(struct net_device *dev,
> +				    netdev_features_t features);
>  int qede_set_features(struct net_device *dev, netdev_features_t features);
> void qede_set_rx_mode(struct net_device *ndev);  void
> qede_config_rx_mode(struct net_device *ndev); diff --git
> a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
> b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
> index dae7412..4ca3847 100644
> --- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
> +++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
> @@ -940,6 +940,9 @@ int qede_change_mtu(struct net_device *ndev, int
> new_mtu)
>  	DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
>  		   "Configuring MTU size of %d\n", new_mtu);
> 
> +	if (new_mtu > PAGE_SIZE)
> +		ndev->features &= ~NETIF_F_GRO_HW;
> +
>  	/* Set the mtu field and re-start the interface if needed */
>  	args.u.mtu = new_mtu;
>  	args.func = &qede_update_mtu;
> diff --git a/drivers/net/ethernet/qlogic/qede/qede_filter.c
> b/drivers/net/ethernet/qlogic/qede/qede_filter.c
> index c1a0708..2de947e 100644
> --- a/drivers/net/ethernet/qlogic/qede/qede_filter.c
> +++ b/drivers/net/ethernet/qlogic/qede/qede_filter.c
> @@ -895,19 +895,25 @@ static void qede_set_features_reload(struct
> qede_dev *edev,
>  	edev->ndev->features = args->u.features;  }
> 
> +netdev_features_t qede_fix_features(struct net_device *dev,
> +				    netdev_features_t features)
> +{
> +	struct qede_dev *edev = netdev_priv(dev);
> +
> +	if (edev->xdp_prog || edev->ndev->mtu > PAGE_SIZE)
> +		features &= ~NETIF_F_GRO_HW;
> +
> +	return features;
> +}
> +
>  int qede_set_features(struct net_device *dev, netdev_features_t features)  {
>  	struct qede_dev *edev = netdev_priv(dev);
>  	netdev_features_t changes = features ^ dev->features;
>  	bool need_reload = false;
> 
> -	/* No action needed if hardware GRO is disabled during driver load */
> -	if (changes & NETIF_F_GRO) {
> -		if (dev->features & NETIF_F_GRO)
> -			need_reload = !edev->gro_disable;
> -		else
> -			need_reload = edev->gro_disable;
> -	}
> +	if (changes & NETIF_F_GRO_HW)
> +		need_reload = true;
> 
>  	if (need_reload) {
>  		struct qede_reload_args args;
> diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c
> b/drivers/net/ethernet/qlogic/qede/qede_main.c
> index 57332b3..90d79ae 100644
> --- a/drivers/net/ethernet/qlogic/qede/qede_main.c
> +++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
> @@ -545,6 +545,7 @@ static int qede_ioctl(struct net_device *dev, struct ifreq
> *ifr, int cmd)  #endif
>  	.ndo_vlan_rx_add_vid = qede_vlan_rx_add_vid,
>  	.ndo_vlan_rx_kill_vid = qede_vlan_rx_kill_vid,
> +	.ndo_fix_features = qede_fix_features,
>  	.ndo_set_features = qede_set_features,
>  	.ndo_get_stats64 = qede_get_stats64,
>  #ifdef CONFIG_QED_SRIOV
> @@ -572,6 +573,7 @@ static int qede_ioctl(struct net_device *dev, struct ifreq
> *ifr, int cmd)
>  	.ndo_change_mtu = qede_change_mtu,
>  	.ndo_vlan_rx_add_vid = qede_vlan_rx_add_vid,
>  	.ndo_vlan_rx_kill_vid = qede_vlan_rx_kill_vid,
> +	.ndo_fix_features = qede_fix_features,
>  	.ndo_set_features = qede_set_features,
>  	.ndo_get_stats64 = qede_get_stats64,
>  	.ndo_udp_tunnel_add = qede_udp_tunnel_add, @@ -589,6 +591,7 @@
> static int qede_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
>  	.ndo_change_mtu = qede_change_mtu,
>  	.ndo_vlan_rx_add_vid = qede_vlan_rx_add_vid,
>  	.ndo_vlan_rx_kill_vid = qede_vlan_rx_kill_vid,
> +	.ndo_fix_features = qede_fix_features,
>  	.ndo_set_features = qede_set_features,
>  	.ndo_get_stats64 = qede_get_stats64,
>  	.ndo_udp_tunnel_add = qede_udp_tunnel_add, @@ -676,7 +679,7 @@
> static void qede_init_ndev(struct qede_dev *edev)
>  	ndev->priv_flags |= IFF_UNICAST_FLT;
> 
>  	/* user-changeble features */
> -	hw_features = NETIF_F_GRO | NETIF_F_SG |
> +	hw_features = NETIF_F_GRO | NETIF_F_GRO_HW | NETIF_F_SG |
>  		      NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
>  		      NETIF_F_TSO | NETIF_F_TSO6;
> 
> @@ -1228,18 +1231,9 @@ static int qede_alloc_sge_mem(struct qede_dev
> *edev, struct qede_rx_queue *rxq)
>  	dma_addr_t mapping;
>  	int i;
> 
> -	/* Don't perform FW aggregations in case of XDP */
> -	if (edev->xdp_prog)
> -		edev->gro_disable = 1;
> -
>  	if (edev->gro_disable)
>  		return 0;
> 
> -	if (edev->ndev->mtu > PAGE_SIZE) {
> -		edev->gro_disable = 1;
> -		return 0;
> -	}
> -
>  	for (i = 0; i < ETH_TPA_MAX_AGGS_NUM; i++) {
>  		struct qede_agg_info *tpa_info = &rxq->tpa_info[i];
>  		struct sw_rx_data *replace_buf = &tpa_info->buffer; @@ -
> 1269,6 +1263,7 @@ static int qede_alloc_sge_mem(struct qede_dev *edev,
> struct qede_rx_queue *rxq)
>  err:
>  	qede_free_sge_mem(edev, rxq);
>  	edev->gro_disable = 1;
> +	edev->ndev->features &= ~NETIF_F_GRO_HW;
>  	return -ENOMEM;
>  }
> 
> @@ -1511,7 +1506,7 @@ static void qede_init_fp(struct qede_dev *edev)
>  			 edev->ndev->name, queue_id);
>  	}
> 
> -	edev->gro_disable = !(edev->ndev->features & NETIF_F_GRO);
> +	edev->gro_disable = !(edev->ndev->features & NETIF_F_GRO_HW);
>  }
> 
>  static int qede_set_real_num_queues(struct qede_dev *edev)
> --
> 1.8.3.1

Hi Michael, we have currently tested changes for qede driver and that overall looks good.
One small observation though in following steps -

1) HW GRO on initially as per "ethtool -k".
2) change MTU to 9000 which will not be supported with HW gro [HW gro gets off as per "ethtool -k"].
3) Change MTU back to 1500 which is supported mtu for HW gro but HW gro remains off as per "ethtool -k".

I don't have any strong recommendation that in step 3 if user fallback to supported mtu for HW gro then HW gro should
be again turned on automatically. I will leave up to user to control it back through ethtool.

Please note that we are yet to test bnx2x but I do see you are going to send V3. Hopefully, we will be able to provide feedback on bnx2x by that time.
Thanks for making this change.

Acked-by: Manish Chopra <manish.chopra@cavium.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 5/5] qede: Use NETIF_F_GRO_HW.
  2017-12-07  8:03 ` [PATCH net-next v2 5/5] qede: " Michael Chan
  2017-12-08 17:02   ` Chopra, Manish
@ 2017-12-08 22:09   ` Marcelo Ricardo Leitner
  2017-12-08 22:40     ` Michael Chan
  1 sibling, 1 reply; 19+ messages in thread
From: Marcelo Ricardo Leitner @ 2017-12-08 22:09 UTC (permalink / raw)
  To: Michael Chan
  Cc: davem, netdev, andrew.gospodarek, Ariel Elior, everest-linux-l2

Hi,

On Thu, Dec 07, 2017 at 03:03:35AM -0500, Michael Chan wrote:
> --- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
> +++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
> @@ -940,6 +940,9 @@ int qede_change_mtu(struct net_device *ndev, int new_mtu)
>  	DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
>  		   "Configuring MTU size of %d\n", new_mtu);
>  
> +	if (new_mtu > PAGE_SIZE)

I don't know the specs for this card but if it needs to fit the whole
packet in a page, maybe it should consider the ethernet header size in
such checks?

  Marcelo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH net-next v2 5/5] qede: Use NETIF_F_GRO_HW.
  2017-12-08 22:09   ` Marcelo Ricardo Leitner
@ 2017-12-08 22:40     ` Michael Chan
  0 siblings, 0 replies; 19+ messages in thread
From: Michael Chan @ 2017-12-08 22:40 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: David Miller, Netdev, Andrew Gospodarek, Ariel Elior, everest-linux-l2

On Fri, Dec 8, 2017 at 2:09 PM, Marcelo Ricardo Leitner
<marcelo.leitner@gmail.com> wrote:
> Hi,
>
> On Thu, Dec 07, 2017 at 03:03:35AM -0500, Michael Chan wrote:
>> --- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
>> +++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
>> @@ -940,6 +940,9 @@ int qede_change_mtu(struct net_device *ndev, int new_mtu)
>>       DP_VERBOSE(edev, (NETIF_MSG_IFUP | NETIF_MSG_IFDOWN),
>>                  "Configuring MTU size of %d\n", new_mtu);
>>
>> +     if (new_mtu > PAGE_SIZE)
>
> I don't know the specs for this card but if it needs to fit the whole
> packet in a page, maybe it should consider the ethernet header size in
> such checks?
>

I am not changing the logic in this patch, just moving the check from
qede_alloc_sge_mem() to qede_fix_features().

Typically, the chip will also do header-data split when doing GRO_HW,
so that's probably why it is checking the MTU and not the total packet
size against page size.  Ariel can confirm.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-12-08 22:40 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-07  8:03 [PATCH net-next v2 0/5] Introduce NETIF_F_GRO_HW Michael Chan
2017-12-07  8:03 ` [PATCH net-next v2 1/5] net: " Michael Chan
2017-12-07 18:13   ` Alexander Duyck
2017-12-07 18:44     ` Michael Chan
2017-12-07 21:35       ` Alexander Duyck
2017-12-07 22:08         ` Michael Chan
2017-12-07 22:43           ` Alexander Duyck
2017-12-07 23:17             ` Michael Chan
2017-12-07 23:35               ` Alexander Duyck
2017-12-08  0:05                 ` Michael Chan
2017-12-08  2:36                   ` Alexander Duyck
2017-12-08  4:02                     ` Michael Chan
2017-12-07  8:03 ` [PATCH net-next v2 2/5] net: Disable GRO_HW when generic XDP is installed on a device Michael Chan
2017-12-07  8:03 ` [PATCH net-next v2 3/5] bnxt_en: Use NETIF_F_GRO_HW Michael Chan
2017-12-07  8:03 ` [PATCH net-next v2 4/5] bnx2x: " Michael Chan
2017-12-07  8:03 ` [PATCH net-next v2 5/5] qede: " Michael Chan
2017-12-08 17:02   ` Chopra, Manish
2017-12-08 22:09   ` Marcelo Ricardo Leitner
2017-12-08 22:40     ` Michael Chan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.