* [RFC PATCH next 0/2] Add support for macvlan offload
@ 2017-10-17 21:18 ` Shannon Nelson
  0 siblings, 0 replies; 10+ messages in thread
From: Shannon Nelson @ 2017-10-17 21:18 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher; +Cc: netdev

The XL710 family was originally designed to support growing "cloud"
networking needs.  With its large number of queues, filters, VFs, and
other features, it can be a very handy device for sorting traffic in a
variety of ways.  However, one early design goal was to support macvlan
offloads, and this was never fully worked out; as the Intel group knows,
this has bothered me for a rather long time.

The original intent was to use a separate VSI for each macvlan offloaded.
This would make multiple queues and various other features available for
the new pseudo-device.  Unfortunately, there are two problems with this
approach: (1) the interaction between the stack and the driver makes it
hard to figure out which VSI:queue pair to transmit through, and (2) there
are a lot more queues available for offload duties than there are VSIs.

Using a simpler design, we can partition off some of the queues in the
PF's primary VSI and use the XL710's macaddr-to-queue filtering capability
to make a large number of macvlan offload channels available.
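
As a rough illustration of the partitioning idea above, the sketch below
reserves the top queues of a VSI for offload and maps an offload slot to a
queue number.  The struct, the function name, and the "top of the range"
layout are assumptions made for this example only; they are not taken from
the patches.  A return value of 0 follows the series' convention that queue
0 means "no dedicated queue".

```c
#include <stdint.h>

/* Hypothetical description of how a VSI's queues are split up. */
struct queue_partition {
	uint16_t total_queues;   /* all queues in the PF's primary VSI */
	uint16_t offload_queues; /* queues set aside for macvlan offload */
};

/* Map an offload slot (0-based) to a queue in the reserved range.
 * Returns 0 when the slot is out of range or the partition leaves no
 * queue for normal traffic; 0 is never a valid offload queue here.
 */
static uint16_t offload_slot_to_queue(const struct queue_partition *p,
				      uint16_t slot)
{
	if (p->offload_queues >= p->total_queues)
		return 0;
	if (slot >= p->offload_queues)
		return 0;
	return (uint16_t)(p->total_queues - p->offload_queues + slot);
}
```

With 64 queues and 16 reserved for offload, slot 0 maps to queue 48 and
slot 15 to queue 63, while queues 0-47 stay with the regular stack traffic.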

This RFC contains code that has been shown to get packets in and out of
the right queues, but it has gone through very little testing.  In the spirit
of fail fast, I wanted to get this out quickly for comments and get the
rework cycle started.

Shannon Nelson (2):
  i40e: add ToQueue specific handling for mac filters
  i40e: add support for macvlan hardware offload

 drivers/net/ethernet/intel/i40e/i40e.h             |   27 ++-
 drivers/net/ethernet/intel/i40e/i40e_debugfs.c     |    4 +-
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c     |   15 +
 drivers/net/ethernet/intel/i40e/i40e_main.c        |  311 ++++++++++++++++++--
 drivers/net/ethernet/intel/i40e/i40e_txrx.h        |    1 +
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |   10 +-
 6 files changed, 327 insertions(+), 41 deletions(-)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFC PATCH next 1/2] i40e: add ToQueue specific handling for mac filters
  2017-10-17 21:18 ` [Intel-wired-lan] " Shannon Nelson
@ 2017-10-17 21:18   ` Shannon Nelson
  -1 siblings, 0 replies; 10+ messages in thread
From: Shannon Nelson @ 2017-10-17 21:18 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher; +Cc: netdev

Add the concept of queue-specific filters to the filter handling.  This
will be used in the near future for macvlan offload filters.  In
general, filters for standard use will use a queue of 0, which we'll
take to mean the filter applies to the whole VSI.  Only the filters for
macvlan offload will use a non-zero queue.
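
The key layout this implies can be sketched in userspace as follows.  This
is a minimal stand-in for the patched i40e_addr_to_hkey(), assuming the
same byte layout as the diff below (MAC address in bytes 0-5 of the u64,
queue in bytes 6-7); addr_to_hkey() is an illustrative name and memcpy()
replaces the kernel's ether_addr_copy().

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Build a 64-bit hash key from a MAC address and a queue number.
 * Bytes 0-5 carry the MAC address and bytes 6-7 carry the queue, so
 * a queue of 0 yields the same key as a MAC-only key.
 */
static uint64_t addr_to_hkey(const uint8_t *macaddr, uint16_t queue)
{
	uint64_t key = 0;

	memcpy(&key, macaddr, ETH_ALEN);
	memcpy((uint8_t *)&key + ETH_ALEN, &queue, sizeof(queue));
	return key;
}
```

Because the queue is part of the key, the same MAC address with two
different queues can land in different hash buckets, so a lookup that
builds its key with queue 0 is only guaranteed to probe the bucket holding
the whole-VSI filters.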

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/i40e/i40e.h             |   17 +++--
 drivers/net/ethernet/intel/i40e/i40e_debugfs.c     |    4 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c        |   72 ++++++++++++-------
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |   10 ++--
 4 files changed, 63 insertions(+), 40 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 18c453a..a187f53 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -539,14 +539,17 @@ struct i40e_pf {
 /**
  * i40e_mac_to_hkey - Convert a 6-byte MAC Address to a u64 hash key
  * @macaddr: the MAC Address as the base key
+ * @queue: if non-zero, the queue to receive packets with this mac address
  *
  * Simply copies the address and returns it as a u64 for hashing
  **/
-static inline u64 i40e_addr_to_hkey(const u8 *macaddr)
+static inline u64 i40e_addr_to_hkey(const u8 *macaddr, u16 queue)
 {
 	u64 key = 0;
+	u16 *k = (u16 *)&key;
 
 	ether_addr_copy((u8 *)&key, macaddr);
+	k[3] = queue;
 	return key;
 }
 
@@ -563,6 +566,7 @@ struct i40e_mac_filter {
 	u8 macaddr[ETH_ALEN];
 #define I40E_VLAN_ANY -1
 	s16 vlan;
+	u16 queue;
 	enum i40e_filter_state state;
 };
 
@@ -892,10 +896,11 @@ int i40e_add_del_fdir(struct i40e_vsi *vsi,
 u32 i40e_get_global_fd_count(struct i40e_pf *pf);
 bool i40e_set_ntuple(struct i40e_pf *pf, netdev_features_t features);
 void i40e_set_ethtool_ops(struct net_device *netdev);
-struct i40e_mac_filter *i40e_add_filter(struct i40e_vsi *vsi,
-					const u8 *macaddr, s16 vlan);
+struct i40e_mac_filter *i40e_add_filter(struct i40e_vsi *vsi, const u8 *macaddr,
+					s16 vlan, u16 queue);
 void __i40e_del_filter(struct i40e_vsi *vsi, struct i40e_mac_filter *f);
-void i40e_del_filter(struct i40e_vsi *vsi, const u8 *macaddr, s16 vlan);
+void i40e_del_filter(struct i40e_vsi *vsi, const u8 *macaddr,
+		     s16 vlan, u16 queue);
 int i40e_sync_vsi_filters(struct i40e_vsi *vsi);
 struct i40e_vsi *i40e_vsi_setup(struct i40e_pf *pf, u8 type,
 				u16 uplink, u32 param1);
@@ -971,8 +976,8 @@ static inline void i40e_irq_dynamic_enable(struct i40e_vsi *vsi, int vector)
 void i40e_rm_vlan_all_mac(struct i40e_vsi *vsi, s16 vid);
 void i40e_vsi_kill_vlan(struct i40e_vsi *vsi, u16 vid);
 struct i40e_mac_filter *i40e_add_mac_filter(struct i40e_vsi *vsi,
-					    const u8 *macaddr);
-int i40e_del_mac_filter(struct i40e_vsi *vsi, const u8 *macaddr);
+					    const u8 *macaddr, u16 queue);
+int i40e_del_mac_filter(struct i40e_vsi *vsi, const u8 *macaddr, u16 queue);
 bool i40e_is_vsi_in_vlan(struct i40e_vsi *vsi);
 struct i40e_mac_filter *i40e_find_mac(struct i40e_vsi *vsi, const u8 *macaddr);
 void i40e_vlan_stripping_enable(struct i40e_vsi *vsi);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
index 6f2725f..cf173e1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
@@ -171,8 +171,8 @@ static void i40e_dbg_dump_vsi_seid(struct i40e_pf *pf, int seid)
 			 pf->hw.mac.port_addr);
 	hash_for_each(vsi->mac_filter_hash, bkt, f, hlist) {
 		dev_info(&pf->pdev->dev,
-			 "    mac_filter_hash: %pM vid=%d, state %s\n",
-			 f->macaddr, f->vlan,
+			 "    mac_filter_hash: %pM vid=%d q=%d, state %s\n",
+			 f->macaddr, f->vlan, f->queue,
 			 i40e_filter_state_string[f->state]);
 	}
 	dev_info(&pf->pdev->dev, "    active_filters %u, promisc_threshold %u, overflow promisc %s\n",
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 84c5087..e4b8a4b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -1114,11 +1114,13 @@ void i40e_update_stats(struct i40e_vsi *vsi)
  * @vsi: the VSI to be searched
  * @macaddr: the MAC address
  * @vlan: the vlan
+ * @queue: the queue
  *
  * Returns ptr to the filter object or NULL
  **/
 static struct i40e_mac_filter *i40e_find_filter(struct i40e_vsi *vsi,
-						const u8 *macaddr, s16 vlan)
+						const u8 *macaddr, s16 vlan,
+						u16 queue)
 {
 	struct i40e_mac_filter *f;
 	u64 key;
@@ -1126,10 +1128,10 @@ void i40e_update_stats(struct i40e_vsi *vsi)
 	if (!vsi || !macaddr)
 		return NULL;
 
-	key = i40e_addr_to_hkey(macaddr);
+	key = i40e_addr_to_hkey(macaddr, queue);
 	hash_for_each_possible(vsi->mac_filter_hash, f, hlist, key) {
 		if ((ether_addr_equal(macaddr, f->macaddr)) &&
-		    (vlan == f->vlan))
+		    (vlan == f->vlan) && (queue == f->queue))
 			return f;
 	}
 	return NULL;
@@ -1151,7 +1153,7 @@ struct i40e_mac_filter *i40e_find_mac(struct i40e_vsi *vsi, const u8 *macaddr)
 	if (!vsi || !macaddr)
 		return NULL;
 
-	key = i40e_addr_to_hkey(macaddr);
+	key = i40e_addr_to_hkey(macaddr, 0);
 	hash_for_each_possible(vsi->mac_filter_hash, f, hlist, key) {
 		if ((ether_addr_equal(macaddr, f->macaddr)))
 			return f;
@@ -1277,7 +1279,8 @@ static int i40e_correct_mac_vlan_filters(struct i40e_vsi *vsi,
 				new_vlan = I40E_VLAN_ANY;
 
 			/* Create the new filter */
-			add_head = i40e_add_filter(vsi, f->macaddr, new_vlan);
+			add_head = i40e_add_filter(vsi, f->macaddr,
+						   new_vlan, f->queue);
 			if (!add_head)
 				return -ENOMEM;
 
@@ -1342,14 +1345,15 @@ static void i40e_rm_default_mac_filter(struct i40e_vsi *vsi, u8 *macaddr)
  * @vsi: the VSI to be searched
  * @macaddr: the MAC address
  * @vlan: the vlan
+ * @queue: if non-zero, the specific queue to receive for this mac address
  *
  * Returns ptr to the filter object or NULL when no memory available.
  *
  * NOTE: This function is expected to be called with mac_filter_hash_lock
  * being held.
  **/
-struct i40e_mac_filter *i40e_add_filter(struct i40e_vsi *vsi,
-					const u8 *macaddr, s16 vlan)
+struct i40e_mac_filter *i40e_add_filter(struct i40e_vsi *vsi, const u8 *macaddr,
+					s16 vlan, u16 queue)
 {
 	struct i40e_mac_filter *f;
 	u64 key;
@@ -1357,7 +1361,7 @@ struct i40e_mac_filter *i40e_add_filter(struct i40e_vsi *vsi,
 	if (!vsi || !macaddr)
 		return NULL;
 
-	f = i40e_find_filter(vsi, macaddr, vlan);
+	f = i40e_find_filter(vsi, macaddr, vlan, queue);
 	if (!f) {
 		f = kzalloc(sizeof(*f), GFP_ATOMIC);
 		if (!f)
@@ -1371,6 +1375,7 @@ struct i40e_mac_filter *i40e_add_filter(struct i40e_vsi *vsi,
 
 		ether_addr_copy(f->macaddr, macaddr);
 		f->vlan = vlan;
+		f->queue = queue;
 		/* If we're in overflow promisc mode, set the state directly
 		 * to failed, so we don't bother to try sending the filter
 		 * to the hardware.
@@ -1381,7 +1386,7 @@ struct i40e_mac_filter *i40e_add_filter(struct i40e_vsi *vsi,
 			f->state = I40E_FILTER_NEW;
 		INIT_HLIST_NODE(&f->hlist);
 
-		key = i40e_addr_to_hkey(macaddr);
+		key = i40e_addr_to_hkey(macaddr, queue);
 		hash_add(vsi->mac_filter_hash, &f->hlist, key);
 
 		vsi->flags |= I40E_VSI_FLAG_FILTER_CHANGED;
@@ -1443,6 +1448,7 @@ void __i40e_del_filter(struct i40e_vsi *vsi, struct i40e_mac_filter *f)
  * @vsi: the VSI to be searched
  * @macaddr: the MAC address
  * @vlan: the VLAN
+ * @queue: if non-zero, the specific queue to receive for this mac address
  *
  * NOTE: This function is expected to be called with mac_filter_hash_lock
  * being held.
@@ -1450,14 +1456,15 @@ void __i40e_del_filter(struct i40e_vsi *vsi, struct i40e_mac_filter *f)
  * the "safe" variants of any list iterators, e.g. list_for_each_entry_safe()
  * instead of list_for_each_entry().
  **/
-void i40e_del_filter(struct i40e_vsi *vsi, const u8 *macaddr, s16 vlan)
+void i40e_del_filter(struct i40e_vsi *vsi, const u8 *macaddr,
+		     s16 vlan, u16 queue)
 {
 	struct i40e_mac_filter *f;
 
 	if (!vsi || !macaddr)
 		return;
 
-	f = i40e_find_filter(vsi, macaddr, vlan);
+	f = i40e_find_filter(vsi, macaddr, vlan, queue);
 	__i40e_del_filter(vsi, f);
 }
 
@@ -1465,6 +1472,7 @@ void i40e_del_filter(struct i40e_vsi *vsi, const u8 *macaddr, s16 vlan)
  * i40e_add_mac_filter - Add a MAC filter for all active VLANs
  * @vsi: the VSI to be searched
  * @macaddr: the mac address to be filtered
+ * @queue: if non-zero, the target ToQueue
  *
  * If we're not in VLAN mode, just add the filter to I40E_VLAN_ANY. Otherwise,
  * go through all the macvlan filters and add a macvlan filter for each
@@ -1474,7 +1482,7 @@ void i40e_del_filter(struct i40e_vsi *vsi, const u8 *macaddr, s16 vlan)
  * Returns last filter added on success, else NULL
  **/
 struct i40e_mac_filter *i40e_add_mac_filter(struct i40e_vsi *vsi,
-					    const u8 *macaddr)
+					    const u8 *macaddr, u16 queue)
 {
 	struct i40e_mac_filter *f, *add = NULL;
 	struct hlist_node *h;
@@ -1482,15 +1490,15 @@ struct i40e_mac_filter *i40e_add_mac_filter(struct i40e_vsi *vsi,
 
 	if (vsi->info.pvid)
 		return i40e_add_filter(vsi, macaddr,
-				       le16_to_cpu(vsi->info.pvid));
+				       le16_to_cpu(vsi->info.pvid), queue);
 
 	if (!i40e_is_vsi_in_vlan(vsi))
-		return i40e_add_filter(vsi, macaddr, I40E_VLAN_ANY);
+		return i40e_add_filter(vsi, macaddr, I40E_VLAN_ANY, queue);
 
 	hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist) {
 		if (f->state == I40E_FILTER_REMOVE)
 			continue;
-		add = i40e_add_filter(vsi, macaddr, f->vlan);
+		add = i40e_add_filter(vsi, macaddr, f->vlan, queue);
 		if (!add)
 			return NULL;
 	}
@@ -1502,13 +1510,14 @@ struct i40e_mac_filter *i40e_add_mac_filter(struct i40e_vsi *vsi,
  * i40e_del_mac_filter - Remove a MAC filter from all VLANs
  * @vsi: the VSI to be searched
  * @macaddr: the mac address to be removed
+ * @queue: if non-zero, the target ToQueue
  *
  * Removes a given MAC address from a VSI regardless of what VLAN it has been
  * associated with.
  *
  * Returns 0 for success, or error
  **/
-int i40e_del_mac_filter(struct i40e_vsi *vsi, const u8 *macaddr)
+int i40e_del_mac_filter(struct i40e_vsi *vsi, const u8 *macaddr, u16 queue)
 {
 	struct i40e_mac_filter *f;
 	struct hlist_node *h;
@@ -1518,7 +1527,8 @@ int i40e_del_mac_filter(struct i40e_vsi *vsi, const u8 *macaddr)
 	WARN(!spin_is_locked(&vsi->mac_filter_hash_lock),
 	     "Missing mac_filter_hash_lock\n");
 	hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist) {
-		if (ether_addr_equal(macaddr, f->macaddr)) {
+		if (ether_addr_equal(macaddr, f->macaddr) &&
+		    queue == f->queue) {
 			__i40e_del_filter(vsi, f);
 			found = true;
 		}
@@ -1565,8 +1575,8 @@ static int i40e_set_mac(struct net_device *netdev, void *p)
 		netdev_info(netdev, "set new mac address %pM\n", addr->sa_data);
 
 	spin_lock_bh(&vsi->mac_filter_hash_lock);
-	i40e_del_mac_filter(vsi, netdev->dev_addr);
-	i40e_add_mac_filter(vsi, addr->sa_data);
+	i40e_del_mac_filter(vsi, netdev->dev_addr, 0);
+	i40e_add_mac_filter(vsi, addr->sa_data, 0);
 	spin_unlock_bh(&vsi->mac_filter_hash_lock);
 	ether_addr_copy(netdev->dev_addr, addr->sa_data);
 	if (vsi->type == I40E_VSI_MAIN) {
@@ -1731,7 +1741,7 @@ static int i40e_addr_sync(struct net_device *netdev, const u8 *addr)
 	struct i40e_netdev_priv *np = netdev_priv(netdev);
 	struct i40e_vsi *vsi = np->vsi;
 
-	if (i40e_add_mac_filter(vsi, addr))
+	if (i40e_add_mac_filter(vsi, addr, 0))
 		return 0;
 	else
 		return -ENOMEM;
@@ -1750,7 +1760,7 @@ static int i40e_addr_unsync(struct net_device *netdev, const u8 *addr)
 	struct i40e_netdev_priv *np = netdev_priv(netdev);
 	struct i40e_vsi *vsi = np->vsi;
 
-	i40e_del_mac_filter(vsi, addr);
+	i40e_del_mac_filter(vsi, addr, 0);
 
 	return 0;
 }
@@ -1793,7 +1803,7 @@ static void i40e_undo_del_filter_entries(struct i40e_vsi *vsi,
 	struct hlist_node *h;
 
 	hlist_for_each_entry_safe(f, h, from, hlist) {
-		u64 key = i40e_addr_to_hkey(f->macaddr);
+		u64 key = i40e_addr_to_hkey(f->macaddr, f->queue);
 
 		/* Move the element back into MAC filter list*/
 		hlist_del(&f->hlist);
@@ -2194,7 +2204,15 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi)
 				add_list[num_add].vlan_tag =
 					cpu_to_le16((u16)(new->f->vlan));
 			}
-			add_list[num_add].queue_number = 0;
+
+			if (new->f->queue) {
+				add_list[num_add].queue_number =
+					cpu_to_le16(new->f->queue);
+				cmd_flags |= I40E_AQC_MACVLAN_ADD_TO_QUEUE;
+			} else {
+				add_list[num_add].queue_number = 0;
+			}
+
 			/* set invalid match method for later detection */
 			add_list[num_add].match_method = I40E_AQC_MM_ERR_NO_RES;
 			cmd_flags |= I40E_AQC_MACVLAN_ADD_PERFECT_MATCH;
@@ -2580,7 +2598,7 @@ int i40e_add_vlan_all_mac(struct i40e_vsi *vsi, s16 vid)
 	hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist) {
 		if (f->state == I40E_FILTER_REMOVE)
 			continue;
-		add_f = i40e_add_filter(vsi, f->macaddr, vid);
+		add_f = i40e_add_filter(vsi, f->macaddr, vid, 0);
 		if (!add_f) {
 			dev_info(&vsi->back->pdev->dev,
 				 "Could not add vlan filter %d for %pM\n",
@@ -9772,7 +9790,7 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
 		 */
 		i40e_rm_default_mac_filter(vsi, mac_addr);
 		spin_lock_bh(&vsi->mac_filter_hash_lock);
-		i40e_add_mac_filter(vsi, mac_addr);
+		i40e_add_mac_filter(vsi, mac_addr, 0);
 		spin_unlock_bh(&vsi->mac_filter_hash_lock);
 	} else {
 		/* Relate the VSI_VMDQ name to the VSI_MAIN name. Note that we
@@ -9786,7 +9804,7 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
 		random_ether_addr(mac_addr);
 
 		spin_lock_bh(&vsi->mac_filter_hash_lock);
-		i40e_add_mac_filter(vsi, mac_addr);
+		i40e_add_mac_filter(vsi, mac_addr, 0);
 		spin_unlock_bh(&vsi->mac_filter_hash_lock);
 	}
 
@@ -9805,7 +9823,7 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
 	 */
 	eth_broadcast_addr(broadcast);
 	spin_lock_bh(&vsi->mac_filter_hash_lock);
-	i40e_add_mac_filter(vsi, broadcast);
+	i40e_add_mac_filter(vsi, broadcast, 0);
 	spin_unlock_bh(&vsi->mac_filter_hash_lock);
 
 	ether_addr_copy(netdev->dev_addr, mac_addr);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 0456813..d2ed218 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -709,14 +709,14 @@ static int i40e_alloc_vsi_res(struct i40e_vf *vf, enum i40e_vsi_type type)
 		spin_lock_bh(&vsi->mac_filter_hash_lock);
 		if (is_valid_ether_addr(vf->default_lan_addr.addr)) {
 			f = i40e_add_mac_filter(vsi,
-						vf->default_lan_addr.addr);
+						vf->default_lan_addr.addr, 0);
 			if (!f)
 				dev_info(&pf->pdev->dev,
 					 "Could not add MAC filter %pM for VF %d\n",
 					vf->default_lan_addr.addr, vf->vf_id);
 		}
 		eth_broadcast_addr(broadcast);
-		f = i40e_add_mac_filter(vsi, broadcast);
+		f = i40e_add_mac_filter(vsi, broadcast, 0);
 		if (!f)
 			dev_info(&pf->pdev->dev,
 				 "Could not allocate VF broadcast filter\n");
@@ -2217,7 +2217,7 @@ static int i40e_vc_add_mac_addr_msg(struct i40e_vf *vf, u8 *msg, u16 msglen)
 
 		f = i40e_find_mac(vsi, al->list[i].addr);
 		if (!f)
-			f = i40e_add_mac_filter(vsi, al->list[i].addr);
+			f = i40e_add_mac_filter(vsi, al->list[i].addr, 0);
 
 		if (!f) {
 			dev_err(&pf->pdev->dev,
@@ -2282,7 +2282,7 @@ static int i40e_vc_del_mac_addr_msg(struct i40e_vf *vf, u8 *msg, u16 msglen)
 	spin_lock_bh(&vsi->mac_filter_hash_lock);
 	/* delete addresses from the list */
 	for (i = 0; i < al->num_elements; i++)
-		if (i40e_del_mac_filter(vsi, al->list[i].addr)) {
+		if (i40e_del_mac_filter(vsi, al->list[i].addr, 0)) {
 			ret = I40E_ERR_INVALID_MAC_ADDR;
 			spin_unlock_bh(&vsi->mac_filter_hash_lock);
 			goto error_param;
@@ -2916,7 +2916,7 @@ int i40e_ndo_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac)
 
 	/* delete the temporary mac address */
 	if (!is_zero_ether_addr(vf->default_lan_addr.addr))
-		i40e_del_mac_filter(vsi, vf->default_lan_addr.addr);
+		i40e_del_mac_filter(vsi, vf->default_lan_addr.addr, 0);
 
 	/* Delete all the filters for this VSI - we're going to kill it
 	 * anyway.
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

 		if (f->state == I40E_FILTER_REMOVE)
 			continue;
-		add_f = i40e_add_filter(vsi, f->macaddr, vid);
+		add_f = i40e_add_filter(vsi, f->macaddr, vid, 0);
 		if (!add_f) {
 			dev_info(&vsi->back->pdev->dev,
 				 "Could not add vlan filter %d for %pM\n",
@@ -9772,7 +9790,7 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
 		 */
 		i40e_rm_default_mac_filter(vsi, mac_addr);
 		spin_lock_bh(&vsi->mac_filter_hash_lock);
-		i40e_add_mac_filter(vsi, mac_addr);
+		i40e_add_mac_filter(vsi, mac_addr, 0);
 		spin_unlock_bh(&vsi->mac_filter_hash_lock);
 	} else {
 		/* Relate the VSI_VMDQ name to the VSI_MAIN name. Note that we
@@ -9786,7 +9804,7 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
 		random_ether_addr(mac_addr);
 
 		spin_lock_bh(&vsi->mac_filter_hash_lock);
-		i40e_add_mac_filter(vsi, mac_addr);
+		i40e_add_mac_filter(vsi, mac_addr, 0);
 		spin_unlock_bh(&vsi->mac_filter_hash_lock);
 	}
 
@@ -9805,7 +9823,7 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
 	 */
 	eth_broadcast_addr(broadcast);
 	spin_lock_bh(&vsi->mac_filter_hash_lock);
-	i40e_add_mac_filter(vsi, broadcast);
+	i40e_add_mac_filter(vsi, broadcast, 0);
 	spin_unlock_bh(&vsi->mac_filter_hash_lock);
 
 	ether_addr_copy(netdev->dev_addr, mac_addr);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 0456813..d2ed218 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -709,14 +709,14 @@ static int i40e_alloc_vsi_res(struct i40e_vf *vf, enum i40e_vsi_type type)
 		spin_lock_bh(&vsi->mac_filter_hash_lock);
 		if (is_valid_ether_addr(vf->default_lan_addr.addr)) {
 			f = i40e_add_mac_filter(vsi,
-						vf->default_lan_addr.addr);
+						vf->default_lan_addr.addr, 0);
 			if (!f)
 				dev_info(&pf->pdev->dev,
 					 "Could not add MAC filter %pM for VF %d\n",
 					vf->default_lan_addr.addr, vf->vf_id);
 		}
 		eth_broadcast_addr(broadcast);
-		f = i40e_add_mac_filter(vsi, broadcast);
+		f = i40e_add_mac_filter(vsi, broadcast, 0);
 		if (!f)
 			dev_info(&pf->pdev->dev,
 				 "Could not allocate VF broadcast filter\n");
@@ -2217,7 +2217,7 @@ static int i40e_vc_add_mac_addr_msg(struct i40e_vf *vf, u8 *msg, u16 msglen)
 
 		f = i40e_find_mac(vsi, al->list[i].addr);
 		if (!f)
-			f = i40e_add_mac_filter(vsi, al->list[i].addr);
+			f = i40e_add_mac_filter(vsi, al->list[i].addr, 0);
 
 		if (!f) {
 			dev_err(&pf->pdev->dev,
@@ -2282,7 +2282,7 @@ static int i40e_vc_del_mac_addr_msg(struct i40e_vf *vf, u8 *msg, u16 msglen)
 	spin_lock_bh(&vsi->mac_filter_hash_lock);
 	/* delete addresses from the list */
 	for (i = 0; i < al->num_elements; i++)
-		if (i40e_del_mac_filter(vsi, al->list[i].addr)) {
+		if (i40e_del_mac_filter(vsi, al->list[i].addr, 0)) {
 			ret = I40E_ERR_INVALID_MAC_ADDR;
 			spin_unlock_bh(&vsi->mac_filter_hash_lock);
 			goto error_param;
@@ -2916,7 +2916,7 @@ int i40e_ndo_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac)
 
 	/* delete the temporary mac address */
 	if (!is_zero_ether_addr(vf->default_lan_addr.addr))
-		i40e_del_mac_filter(vsi, vf->default_lan_addr.addr);
+		i40e_del_mac_filter(vsi, vf->default_lan_addr.addr, 0);
 
 	/* Delete all the filters for this VSI - we're going to kill it
 	 * anyway.
-- 
1.7.1
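
The delete path in the patch above now matches a filter on both the MAC address and the target queue, so a ToQueue filter can be removed without touching a default (queue 0) filter for the same address.  A minimal standalone sketch of that matching rule follows.  The table layout, struct filter, and del_filters() are hypothetical and are not driver code; unlike i40e_del_mac_filter(), this helper returns the number of entries removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical flat filter table; the driver uses a hash list instead. */
struct filter {
	uint8_t mac[ETH_ALEN];
	uint16_t queue;		/* 0 means "no ToQueue target" */
	bool in_use;
};

/* Remove every in-use entry whose MAC *and* queue both match. */
static int del_filters(struct filter *tbl, size_t n,
		       const uint8_t *mac, uint16_t queue)
{
	int removed = 0;
	size_t i;

	assert(tbl && mac);

	for (i = 0; i < n; i++) {
		if (tbl[i].in_use &&
		    !memcmp(tbl[i].mac, mac, ETH_ALEN) &&
		    tbl[i].queue == queue) {
			tbl[i].in_use = false;
			removed++;
		}
	}
	return removed;
}
```

With the same address present once for queue 0 and once for queue 70, a delete for queue 70 leaves the queue 0 entry in place.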


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC PATCH next 2/2] i40e: add support for macvlan hardware offload
  2017-10-17 21:18 ` [Intel-wired-lan] " Shannon Nelson
@ 2017-10-17 21:18   ` Shannon Nelson
  -1 siblings, 0 replies; 10+ messages in thread
From: Shannon Nelson @ 2017-10-17 21:18 UTC (permalink / raw)
  To: intel-wired-lan, jeffrey.t.kirsher; +Cc: netdev

This patch adds support for the macvlan hardware offload (l2-fwd-offload)
feature using the XL710's macvlan-to-queue filtering mechanism.  This is
most useful for supporting separate MAC addresses for container
virtualization using Docker and similar configurations.

The basic design is to partition off some of the PF's general LAN queues
outside of the standard RSS pool and use them as the offload queues.
This especially makes sense on machines with more than 64 CPUs: since
the RSS pool is limited to a maximum of 64, the queues assigned to the
remaining CPUs essentially go unused.  When on a machine with fewer than
64 CPUs, we shrink the RSS pool and use the upper queues for the offload.
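
The queue split described above can be sketched as plain arithmetic.  This is a simplified standalone model, not the driver code: struct macvlan_split and split_queues() are hypothetical names, and a single total_queues count stands in for the VSI's queue counts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type; not a driver structure. */
struct macvlan_split {
	uint16_t rss;		/* queues left in the RSS pool */
	uint16_t offload;	/* queues reserved for macvlan offload */
};

/*
 * total_queues: queue pairs owned by the PF's main VSI (assumed)
 * rss_size:     current RSS pool size, at most total_queues
 */
static struct macvlan_split split_queues(uint16_t total_queues,
					 uint16_t rss_size)
{
	struct macvlan_split s;

	assert(rss_size <= total_queues);

	s.rss = rss_size;
	s.offload = total_queues - rss_size;

	/* Too few spare queues: halve the RSS pool and take the rest. */
	if (s.offload < rss_size / 2) {
		s.rss = rss_size / 2;
		s.offload = total_queues - s.rss;
	}
	return s;
}
```

For example, with 96 queues and an RSS pool of 64 the pool is left alone and 32 queues go to offload; with 16 queues and an RSS pool of 16 the pool is halved to 8.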

If the user has added Flow Director filters that steer traffic to queues
that would be taken out of the RSS pool, enabling macvlan offload is
disallowed until those filters are removed.
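
As a rough illustration of the Flow Director check, the sketch below reports whether any non-drop rule targets a queue at or above the proposed RSS pool size.  The struct fdir_rule layout and fdir_blocks_offload() are hypothetical simplifications, not the driver's types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a Flow Director rule. */
struct fdir_rule {
	uint16_t q_index;	/* destination queue */
	bool drop;		/* rule drops the packet, queue is ignored */
};

/*
 * Returns true if at least one rule would steer packets into a queue
 * that is about to leave the RSS pool (q_index >= new_rss).
 */
static bool fdir_blocks_offload(const struct fdir_rule *rules, size_t n,
				uint16_t new_rss)
{
	size_t i;

	assert(rules || n == 0);

	for (i = 0; i < n; i++) {
		if (!rules[i].drop && rules[i].q_index >= new_rss)
			return true;
	}
	return false;
}
```

Drop rules never block the offload in this model, since they do not deliver to any queue.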

To use this feature, use ethtool to enable l2-fwd-offload:
	ethtool -K ethX l2-fwd-offload on
When macvlan devices are subsequently created on ethX, the macvlan driver
will automatically attempt to set up the hardware offload.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
---
 drivers/net/ethernet/intel/i40e/i40e.h         |   10 +
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c |   15 ++
 drivers/net/ethernet/intel/i40e/i40e_main.c    |  239 +++++++++++++++++++++++-
 drivers/net/ethernet/intel/i40e/i40e_txrx.h    |    1 +
 4 files changed, 264 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index a187f53..4868ae2 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -365,6 +365,10 @@ struct i40e_pf {
 	u8 atr_sample_rate;
 	bool wol_en;
 
+	u16 macvlan_hint;
+	u16 macvlan_used;
+	u16 macvlan_num;
+
 	struct hlist_head fdir_filter_list;
 	u16 fdir_pf_active_filters;
 	unsigned long fd_flush_timestamp;
@@ -712,6 +716,12 @@ struct i40e_netdev_priv {
 	struct i40e_vsi *vsi;
 };
 
+struct i40e_fwd {
+	struct net_device *vdev;
+	u16 tx_base_queue;
+	/* future expansion here might include number of queues */
+};
+
 /* struct that defines an interrupt vector */
 struct i40e_q_vector {
 	struct i40e_vsi *vsi;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index afd3ca8..e1628c1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -3817,6 +3817,13 @@ static int i40e_set_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd)
 	struct i40e_pf *pf = vsi->back;
 	int ret = -EOPNOTSUPP;
 
+	if (pf->macvlan_num) {
+		dev_warn(&pf->pdev->dev,
+			 "Remove %d remaining macvlan offloads to change filter options\n",
+			 pf->macvlan_used);
+		return -EBUSY;
+	}
+
 	switch (cmd->cmd) {
 	case ETHTOOL_SRXFH:
 		ret = i40e_set_rss_hash_opt(pf, cmd);
@@ -3909,6 +3916,14 @@ static int i40e_set_channels(struct net_device *dev,
 	if (count > i40e_max_channels(vsi))
 		return -EINVAL;
 
+	/* verify that macvlan offloads are not in use */
+	if (pf->macvlan_num) {
+		dev_warn(&pf->pdev->dev,
+			 "Remove %d remaining macvlan offloads to change channel count\n",
+			 pf->macvlan_used);
+		return -EBUSY;
+	}
+
 	/* verify that the number of channels does not invalidate any current
 	 * flow director rules
 	 */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index e4b8a4b..7b26c6f 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -9221,6 +9221,66 @@ static void i40e_clear_rss_lut(struct i40e_vsi *vsi)
 }
 
 /**
+ * i40e_fix_features - fix the proposed netdev feature flags
+ * @netdev: ptr to the netdev being adjusted
+ * @features: the feature set that the stack is suggesting
+ * Note: expects to be called while under rtnl_lock()
+ **/
+static netdev_features_t i40e_fix_features(struct net_device *netdev,
+					   netdev_features_t features)
+{
+	struct i40e_netdev_priv *np = netdev_priv(netdev);
+	struct i40e_pf *pf = np->vsi->back;
+	struct i40e_vsi *vsi = np->vsi;
+
+	/* make sure there are queues to be used for macvlan offload */
+	if (features & NETIF_F_HW_L2FW_DOFFLOAD &&
+	    !(netdev->features & NETIF_F_HW_L2FW_DOFFLOAD)) {
+		const u8 drop = I40E_FILTER_PROGRAM_DESC_DEST_DROP_PACKET;
+		struct i40e_fdir_filter *rule;
+		struct hlist_node *node2;
+		u16 rss, unused;
+
+		/* Find a set of queues to be used for macvlan offload.
+		 * If there aren't many queues outside of the RSS set
+		 * that could be used for macvlan, try shrinking the
+		 * set to free up some queues, after checking if there
+		 * are any Flow Director rules we might break.
+		 */
+
+		rss = vsi->rss_size;
+		unused = vsi->num_queue_pairs - rss;
+		if (unused < (vsi->rss_size / 2)) {
+			rss = vsi->rss_size / 2;
+			unused = vsi->num_q_vectors - rss;
+		}
+		pf->macvlan_num = unused;
+
+		/* check the flow director rules */
+		hlist_for_each_entry_safe(rule, node2,
+					  &pf->fdir_filter_list, fdir_node) {
+			if (rule->dest_ctl != drop && rss <= rule->q_index) {
+				dev_warn(&pf->pdev->dev,
+					 "Remove user defined filter %d to enable macvlan offload\n",
+					 rule->fd_id);
+				features &= ~NETIF_F_HW_L2FW_DOFFLOAD;
+				pf->macvlan_num = 0;
+			}
+		}
+	} else if (!(features & NETIF_F_HW_L2FW_DOFFLOAD) &&
+		    netdev->features & NETIF_F_HW_L2FW_DOFFLOAD) {
+		if (pf->macvlan_used) {
+			dev_warn(&pf->pdev->dev,
+				 "Remove %d remaining macvlan offloads to disable macvlan offload\n",
+				 pf->macvlan_used);
+			features |= NETIF_F_HW_L2FW_DOFFLOAD;
+		}
+	}
+
+	return features;
+}
+
+/**
  * i40e_set_features - set the netdev feature flags
  * @netdev: ptr to the netdev being adjusted
  * @features: the feature set that the stack is suggesting
@@ -9247,6 +9307,45 @@ static int i40e_set_features(struct net_device *netdev,
 
 	need_reset = i40e_set_ntuple(pf, features);
 
+	/* keep this section last in this function as it
+	 * might take care of the need_reset for the others
+	 */
+	if (features & NETIF_F_HW_L2FW_DOFFLOAD &&
+	    !(netdev->features & NETIF_F_HW_L2FW_DOFFLOAD)) {
+		/* reserve queues for macvlan use */
+		u16 rss = vsi->num_q_vectors - pf->macvlan_num;
+
+		if (rss != vsi->rss_size) {
+			if (i40e_reconfig_rss_queues(pf, rss))
+				need_reset = false;
+		}
+
+		pf->macvlan_hint = rss;
+		pf->macvlan_used = 0;
+
+	} else if (!(features & NETIF_F_HW_L2FW_DOFFLOAD) &&
+		    netdev->features & NETIF_F_HW_L2FW_DOFFLOAD) {
+		/* return macvlan queues to general use */
+		int num_qs = vsi->rss_size + pf->macvlan_num;
+		int i;
+
+		/* stop the upperdev queues if not already stopped */
+		for (i = vsi->rss_size; i < num_qs; i++) {
+			struct i40e_fwd *fwd = vsi->tx_rings[i]->fwd;
+
+			if (fwd)
+				netif_tx_stop_all_queues(fwd->vdev);
+		}
+
+		/* rebuild the rss layout with the restored queues */
+		if (i40e_reconfig_rss_queues(pf, num_qs))
+			need_reset = false;
+
+		pf->macvlan_hint = 0;
+		pf->macvlan_used = 0;
+		pf->macvlan_num = 0;
+	}
+
 	if (need_reset)
 		i40e_do_reset(pf, BIT_ULL(__I40E_PF_RESET_REQUESTED), true);
 
@@ -9674,6 +9773,137 @@ static int i40e_xdp(struct net_device *dev,
 	}
 }
 
+/**
+ * i40e_select_queue - select the Tx queue, watching for macvlan offloads
+ * @dev: netdevice
+ * @skb: packet to be sent
+ * @accel_priv: hint for offloading macvlan
+ * @fallback: alternative function to use if we don't care which Tx queue
+ **/
+static u16 i40e_select_queue(struct net_device *dev, struct sk_buff *skb,
+			     void *accel_priv, select_queue_fallback_t fallback)
+{
+	struct i40e_fwd *fwd = accel_priv;
+
+	if (fwd)
+		return fwd->tx_base_queue;
+
+	return fallback(dev, skb);
+}
+
+/**
+ * i40e_fwd_add - add a macvlan offload
+ * @pdev: the lower physical device
+ * @vdev: the upper macvlan device
+ **/
+static void *i40e_fwd_add(struct net_device *pdev, struct net_device *vdev)
+{
+	struct i40e_netdev_priv *np = netdev_priv(pdev);
+	struct i40e_pf *pf = np->vsi->back;
+	struct i40e_vsi *vsi = np->vsi;
+	struct i40e_fwd *fwd = NULL;
+	struct i40e_mac_filter *f;
+	int i;
+
+	if (vdev->num_tx_queues != 1 ||
+	    vdev->num_rx_queues != vdev->num_tx_queues) {
+		netdev_info(pdev, "Macvlan offload for Rx/Tx single queue only\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	if (!(pf->macvlan_num - pf->macvlan_used)) {
+		netdev_err(pdev, "No macvlan offload slots left\n");
+		return ERR_PTR(-ENOMEM);
+	}
+
+	if (i40e_find_mac(vsi, vdev->dev_addr)) {
+		netdev_err(pdev, "MAC address %pM already in use\n",
+			   vdev->dev_addr);
+		return ERR_PTR(-EINVAL);
+	}
+
+	/* create the fwd struct */
+	fwd = kzalloc(sizeof(*fwd), GFP_KERNEL);
+	if (!fwd)
+		return ERR_PTR(-ENOMEM);
+
+	/* find the next available macvlan queue */
+	if (!pf->macvlan_hint)
+		pf->macvlan_hint = vsi->rss_size;
+	for (i = pf->macvlan_hint; i < vsi->alloc_queue_pairs; i++) {
+		if (!vsi->tx_rings[i]->fwd) {
+			vsi->tx_rings[i]->fwd = fwd;
+
+			fwd->tx_base_queue = i;
+			fwd->vdev = vdev;
+
+			pf->macvlan_hint = i + 1;
+			break;
+		}
+	}
+	if (!fwd->tx_base_queue) {
+		netdev_err(pdev, "No available queue found for macvlan %s\n",
+			   vdev->name);
+		goto no_queue;
+	}
+	pf->macvlan_used++;
+
+	/* set the mac address */
+	spin_lock_bh(&vsi->mac_filter_hash_lock);
+	f = i40e_add_mac_filter(vsi, vdev->dev_addr, fwd->tx_base_queue);
+	spin_unlock_bh(&vsi->mac_filter_hash_lock);
+	if (!f) {
+		netdev_err(pdev, "Failed to add macaddr %pM for macvlan %s\n",
+			   vdev->dev_addr, vdev->name);
+		goto no_open;
+	}
+
+	netdev_info(pdev, "%s: queue %d for macvlan %s\n",
+		    __func__, fwd->tx_base_queue, vdev->name);
+
+	if (netif_running(pdev))
+		netif_tx_start_all_queues(vdev);
+	else
+		netdev_info(pdev, "Macvlan %s offload start pending\n",
+			    vdev->name);
+
+	return fwd;
+
+no_open:
+	vsi->tx_rings[fwd->tx_base_queue]->fwd = NULL;
+no_queue:
+	fwd->vdev = NULL;
+	kfree(fwd);
+	return ERR_PTR(-EBUSY);
+}
+
+/**
+ * i40e_fwd_del - remove a macvlan offload
+ * @pdev: the lower physical device
+ * @priv: the private pointer for the offload information
+ **/
+static void i40e_fwd_del(struct net_device *pdev, void *priv)
+{
+	struct i40e_netdev_priv *np = netdev_priv(pdev);
+	struct i40e_pf *pf = np->vsi->back;
+	struct i40e_vsi *vsi = np->vsi;
+	struct i40e_fwd *fwd = priv;
+
+	spin_lock_bh(&vsi->mac_filter_hash_lock);
+	i40e_del_mac_filter(vsi, fwd->vdev->dev_addr, fwd->tx_base_queue);
+	spin_unlock_bh(&vsi->mac_filter_hash_lock);
+
+	vsi->tx_rings[fwd->tx_base_queue]->fwd = NULL;
+	fwd->tx_base_queue = 0;
+	fwd->vdev = NULL;
+
+	if (!pf->macvlan_hint || pf->macvlan_hint > fwd->tx_base_queue)
+		pf->macvlan_hint = fwd->tx_base_queue;
+	pf->macvlan_used--;
+
+	kfree(fwd);
+}
+
 static const struct net_device_ops i40e_netdev_ops = {
 	.ndo_open		= i40e_open,
 	.ndo_stop		= i40e_close,
@@ -9691,6 +9921,7 @@ static int i40e_xdp(struct net_device *dev,
 	.ndo_poll_controller	= i40e_netpoll,
 #endif
 	.ndo_setup_tc		= __i40e_setup_tc,
+	.ndo_fix_features	= i40e_fix_features,
 	.ndo_set_features	= i40e_set_features,
 	.ndo_set_vf_mac		= i40e_ndo_set_vf_mac,
 	.ndo_set_vf_vlan	= i40e_ndo_set_vf_port_vlan,
@@ -9707,6 +9938,9 @@ static int i40e_xdp(struct net_device *dev,
 	.ndo_bridge_getlink	= i40e_ndo_bridge_getlink,
 	.ndo_bridge_setlink	= i40e_ndo_bridge_setlink,
 	.ndo_xdp		= i40e_xdp,
+	.ndo_select_queue	= i40e_select_queue,
+	.ndo_dfwd_add_station	= i40e_fwd_add,
+	.ndo_dfwd_del_station	= i40e_fwd_del,
 };
 
 /**
@@ -9776,6 +10010,8 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
 	netdev->hw_enc_features |= NETIF_F_TSO_MANGLEID;
 
 	if (vsi->type == I40E_VSI_MAIN) {
+		netdev->hw_features |= NETIF_F_HW_L2FW_DOFFLOAD;
+
 		SET_NETDEV_DEV(netdev, &pf->pdev->dev);
 		ether_addr_copy(mac_addr, hw->mac.perm_addr);
 		/* The following steps are necessary for two reasons. First,
@@ -11209,7 +11445,8 @@ static void i40e_determine_queue_usage(struct i40e_pf *pf)
 		/* limit lan qps to the smaller of qps, cpus or msix */
 		q_max = max_t(int, pf->rss_size_max, num_online_cpus());
 		q_max = min_t(int, q_max, pf->hw.func_caps.num_tx_qp);
-		q_max = min_t(int, q_max, pf->hw.func_caps.num_msix_vectors);
+		q_max = min_t(int, q_max,
+			      (pf->hw.func_caps.num_msix_vectors - 1));
 		pf->num_lan_qps = q_max;
 
 		queues_left -= pf->num_lan_qps;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index a4e3e66..8a0ea20 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -363,6 +363,7 @@ struct i40e_ring {
 	struct device *dev;		/* Used for DMA mapping */
 	struct net_device *netdev;	/* netdev ring maps to */
 	struct bpf_prog *xdp_prog;
+	struct i40e_fwd *fwd;		/* macvlan forwarding */
 	union {
 		struct i40e_tx_buffer *tx_bi;
 		struct i40e_rx_buffer *rx_bi;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Intel-wired-lan] [RFC PATCH next 2/2] i40e: add support for macvlan hardware offload
  2017-10-17 21:18   ` [Intel-wired-lan] " Shannon Nelson
@ 2017-10-17 21:32     ` Alexander Duyck
  -1 siblings, 0 replies; 10+ messages in thread
From: Alexander Duyck @ 2017-10-17 21:32 UTC (permalink / raw)
  To: Shannon Nelson; +Cc: intel-wired-lan, Jeff Kirsher, Netdev

On Tue, Oct 17, 2017 at 2:18 PM, Shannon Nelson
<shannon.nelson@oracle.com> wrote:
> This patch adds support for macvlan hardware offload (l2-fwd-offload)
> feature using the XL710's macvlan-to-queue filtering mechanism.  These
> are most useful for supporting separate mac addresses for Container
> virtualization using Docker and similar configurations.
>
> The basic design is to partition off some of the PF's general LAN queues
> outside of the standard RSS pool and use them as the offload queues.
> This especially makes sense on machines with more than 64 CPUs: since
> the RSS pool is limited to a maximum of 64, the queues assigned to the
> remaining CPUs essentially go unused.  When on a machine with fewer than
> 64 CPUs, we shrink the RSS pool and use the upper queues for the offload.
>
> If the user has added Flow Director filters, enabling of macvlan offload
> is disallowed.
>
> To use this feature, use ethtool to enable l2-fwd-offload
>         ethtool -K ethX l2-fwd-offload on
> When the next macvlan devices are created on ethX, the macvlan driver
> will automatically attempt to set up the hardware offload.
>
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e.h         |   10 +
>  drivers/net/ethernet/intel/i40e/i40e_ethtool.c |   15 ++
>  drivers/net/ethernet/intel/i40e/i40e_main.c    |  239 +++++++++++++++++++++++-
>  drivers/net/ethernet/intel/i40e/i40e_txrx.h    |    1 +
>  4 files changed, 264 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
> index a187f53..4868ae2 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e.h
> +++ b/drivers/net/ethernet/intel/i40e/i40e.h
> @@ -365,6 +365,10 @@ struct i40e_pf {
>         u8 atr_sample_rate;
>         bool wol_en;
>
> +       u16 macvlan_hint;
> +       u16 macvlan_used;
> +       u16 macvlan_num;
> +
>         struct hlist_head fdir_filter_list;
>         u16 fdir_pf_active_filters;
>         unsigned long fd_flush_timestamp;
> @@ -712,6 +716,12 @@ struct i40e_netdev_priv {
>         struct i40e_vsi *vsi;
>  };
>
> +struct i40e_fwd {
> +       struct net_device *vdev;
> +       u16 tx_base_queue;
> +       /* future expansion here might include number of queues */
> +};
> +
>  /* struct that defines an interrupt vector */
>  struct i40e_q_vector {
>         struct i40e_vsi *vsi;
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
> index afd3ca8..e1628c1 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
> @@ -3817,6 +3817,13 @@ static int i40e_set_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd)
>         struct i40e_pf *pf = vsi->back;
>         int ret = -EOPNOTSUPP;
>
> +       if (pf->macvlan_num) {
> +               dev_warn(&pf->pdev->dev,
> +                        "Remove %d remaining macvlan offloads to change filter options\n",
> +                        pf->macvlan_used);
> +               return -EBUSY;
> +       }
> +
>         switch (cmd->cmd) {
>         case ETHTOOL_SRXFH:
>                 ret = i40e_set_rss_hash_opt(pf, cmd);
> @@ -3909,6 +3916,14 @@ static int i40e_set_channels(struct net_device *dev,
>         if (count > i40e_max_channels(vsi))
>                 return -EINVAL;
>
> +       /* verify that macvlan offloads are not in use */
> +       if (pf->macvlan_num) {
> +               dev_warn(&pf->pdev->dev,
> +                        "Remove %d remaining macvlan offloads to change channel count\n",
> +                        pf->macvlan_used);
> +               return -EBUSY;
> +       }
> +
>         /* verify that the number of channels does not invalidate any current
>          * flow director rules
>          */
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index e4b8a4b..7b26c6f 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -9221,6 +9221,66 @@ static void i40e_clear_rss_lut(struct i40e_vsi *vsi)
>  }
>
>  /**
> + * i40e_fix_features - fix the proposed netdev feature flags
> + * @netdev: ptr to the netdev being adjusted
> + * @features: the feature set that the stack is suggesting
> + * Note: expects to be called while under rtnl_lock()
> + **/
> +static netdev_features_t i40e_fix_features(struct net_device *netdev,
> +                                          netdev_features_t features)
> +{
> +       struct i40e_netdev_priv *np = netdev_priv(netdev);
> +       struct i40e_pf *pf = np->vsi->back;
> +       struct i40e_vsi *vsi = np->vsi;
> +
> +       /* make sure there are queues to be used for macvlan offload */
> +       if (features & NETIF_F_HW_L2FW_DOFFLOAD &&
> +           !(netdev->features & NETIF_F_HW_L2FW_DOFFLOAD)) {
> +               const u8 drop = I40E_FILTER_PROGRAM_DESC_DEST_DROP_PACKET;
> +               struct i40e_fdir_filter *rule;
> +               struct hlist_node *node2;
> +               u16 rss, unused;
> +
> +               /* Find a set of queues to be used for macvlan offload.
> +                * If there aren't many queues outside of the RSS set
> +                * that could be used for macvlan, try shrinking the
> +                * set to free up some queues, after checking if there
> +                * are any Flow Director rules we might break.
> +                */
> +
> +               rss = vsi->rss_size;
> +               unused = vsi->num_queue_pairs - rss;
> +               if (unused < (vsi->rss_size / 2)) {
> +                       rss = vsi->rss_size / 2;
> +                       unused = vsi->num_q_vectors - rss;
> +               }
> +               pf->macvlan_num = unused;
> +
> +               /* check the flow director rules */
> +               hlist_for_each_entry_safe(rule, node2,
> +                                         &pf->fdir_filter_list, fdir_node) {
> +                       if (rule->dest_ctl != drop && rss <= rule->q_index) {
> +                               dev_warn(&pf->pdev->dev,
> +                                        "Remove user defined filter %d to enable macvlan offload\n",
> +                                        rule->fd_id);
> +                               features &= ~NETIF_F_HW_L2FW_DOFFLOAD;
> +                               pf->macvlan_num = 0;
> +                       }
> +               }
> +       } else if (!(features & NETIF_F_HW_L2FW_DOFFLOAD) &&
> +                   netdev->features & NETIF_F_HW_L2FW_DOFFLOAD) {
> +               if (pf->macvlan_used) {
> +                       dev_warn(&pf->pdev->dev,
> +                                "Remove %d remaining macvlan offloads to disable macvlan offload\n",
> +                                pf->macvlan_used);
> +                       features |= NETIF_F_HW_L2FW_DOFFLOAD;
> +               }
> +       }
> +
> +       return features;
> +}
> +
> +/**
>   * i40e_set_features - set the netdev feature flags
>   * @netdev: ptr to the netdev being adjusted
>   * @features: the feature set that the stack is suggesting
> @@ -9247,6 +9307,45 @@ static int i40e_set_features(struct net_device *netdev,
>
>         need_reset = i40e_set_ntuple(pf, features);
>
> +       /* keep this section last in this function as it
> +        * might take care of the need_reset for the others
> +        */
> +       if (features & NETIF_F_HW_L2FW_DOFFLOAD &&
> +           !(netdev->features & NETIF_F_HW_L2FW_DOFFLOAD)) {
> +               /* reserve queues for macvlan use */
> +               u16 rss = vsi->num_q_vectors - pf->macvlan_num;
> +
> +               if (rss != vsi->rss_size) {
> +                       if (i40e_reconfig_rss_queues(pf, rss))
> +                               need_reset = false;
> +               }
> +
> +               pf->macvlan_hint = rss;
> +               pf->macvlan_used = 0;
> +
> +       } else if (!(features & NETIF_F_HW_L2FW_DOFFLOAD) &&
> +                   netdev->features & NETIF_F_HW_L2FW_DOFFLOAD) {
> +               /* return macvlan queues to general use */
> +               int num_qs = vsi->rss_size + pf->macvlan_num;
> +               int i;
> +
> +               /* stop the upperdev queues if not already stopped */
> +               for (i = vsi->rss_size; i < num_qs; i++) {
> +                       struct i40e_fwd *fwd = vsi->tx_rings[i]->fwd;
> +
> +                       if (fwd)
> +                               netif_tx_stop_all_queues(fwd->vdev);
> +               }
> +
> +               /* rebuild the rss layout with the restored queues */
> +               if (i40e_reconfig_rss_queues(pf, num_qs))
> +                       need_reset = false;
> +
> +               pf->macvlan_hint = 0;
> +               pf->macvlan_used = 0;
> +               pf->macvlan_num = 0;
> +       }
> +
>         if (need_reset)
>                 i40e_do_reset(pf, BIT_ULL(__I40E_PF_RESET_REQUESTED), true);
>
> @@ -9674,6 +9773,137 @@ static int i40e_xdp(struct net_device *dev,
>         }
>  }
>
> +/**
> + * i40e_select_queue - select the Tx queue, watching for macvlan offloads
> + * @dev: netdevice
> + * @skb: packet to be sent
> + * @accel_priv: hint for offloading macvlan
> + * @fallback: alternative function to use if we don't care which Tx queue
> + **/
> +static u16 i40e_select_queue(struct net_device *dev, struct sk_buff *skb,
> +                            void *accel_priv, select_queue_fallback_t fallback)
> +{
> +       struct i40e_fwd *fwd = accel_priv;
> +
> +       if (fwd)
> +               return fwd->tx_base_queue;
> +
> +       return fallback(dev, skb);
> +}
> +

So the select_queue function being needed is the deal breaker on all
of this as far as I am concerned. We aren't allowed to use it under
other cases so why should macvlan be an exception to the rule?

I think we should probably look at a different approach for this. For
example why is it we need to use a different transmit path for a
macvlan packet vs any other packet? On the Rx side we get the
advantage of avoiding the software hashing and demux. What do we get
for reserving queues for transmit?

My plan for this is to go back and "fix" ixgbe so we can get it away
from having to use the select_queue call for the macvlan offload and
then maybe look at providing a few select NDO operations for allowing
macvlans that are being offloaded to make specific calls into the
hardware to perform tasks as needed.

- Alex

* Re: [Intel-wired-lan] [RFC PATCH next 2/2] i40e: add support for macvlan hardware offload
  2017-10-17 21:32     ` Alexander Duyck
@ 2017-10-17 23:12       ` Shannon Nelson
  -1 siblings, 0 replies; 10+ messages in thread
From: Shannon Nelson @ 2017-10-17 23:12 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: intel-wired-lan, Jeff Kirsher, Netdev

On 10/17/2017 2:32 PM, Alexander Duyck wrote:
> 
> So the select_queue function being needed is the deal breaker on all
> of this as far as I am concerned. We aren't allowed to use it under
> other cases so why should macvlan be an exception to the rule?

I realize that the stack is pretty good at choosing the "right" queue,
which is my understanding as to why we shouldn't use select_queue(), but 
it doesn't know how to use the accel_priv context associated with the 
macvlan offload.

I saw DaveM's guidance to the HiNIC folks when they tried to add 
select_queue(): "do not implement this function unless you absolutely 
need to do something custom in your driver".  I can see where this might 
be the exception.

When originally thinking about how to do this, I wanted to use the 
accel_priv as a pointer to the VSI to be used for the offload, then we 
could have multiple queues and use all the VSI specific tuning 
operations that XL710 has available.  It can work when selecting the 
queue, but by the time you get to start_xmit(), you no longer have that 
context and only have the queue number.  You can't do any fancy encoding 
in the queue number because the value has to be within 
dev->num_tx_queues.  Maybe we can add accel_priv to the start_xmit 
interface?  (I can hear the groans already...)
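
As a rough illustration of the constraint described above (this is not code
from the patch; the structure names, function names, and NUM_TX_QUEUES are
made up for the sketch), the following models how a one-queue-per-offload
design can get back to the offload context at transmit time.  The
accel_priv pointer is visible only when the queue is selected, so the
context is recovered later through a per-ring back-pointer, similar in
spirit to the fwd member this patch adds to struct i40e_ring:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_TX_QUEUES 8	/* hypothetical; stands in for dev->num_tx_queues */

/* Toy stand-ins for the driver structures; only the fields needed here. */
struct fwd_ctx {
	const char *vdev_name;	/* the macvlan device that owns the queue */
	uint16_t tx_base_queue;	/* the single Tx queue given to the offload */
};

struct tx_ring {
	struct fwd_ctx *fwd;	/* NULL for an ordinary RSS queue */
};

static struct tx_ring tx_rings[NUM_TX_QUEUES];

/* Bind an offload context to its queue, as a station-add step might. */
static void bind_fwd(struct fwd_ctx *fwd)
{
	assert(fwd->tx_base_queue < NUM_TX_QUEUES);
	tx_rings[fwd->tx_base_queue].fwd = fwd;
}

/* Queue selection: the only place where accel_priv is visible. */
static uint16_t select_queue(const struct fwd_ctx *accel_priv,
			     uint16_t fallback)
{
	return accel_priv ? accel_priv->tx_base_queue : fallback;
}

/* Transmit path: only the queue number survives, so the context is
 * recovered through the ring's back-pointer.
 */
static struct fwd_ctx *xmit_lookup(uint16_t queue)
{
	assert(queue < NUM_TX_QUEUES);
	return tx_rings[queue].fwd;
}
```

This only works because each offload owns exactly one queue.  With a
VSI per macvlan and several queues per VSI, the queue number alone would
have to identify both the VSI and the queue within it, which is the
encoding problem described above.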

However... for our case, you might be right anyway.  If the stack is 
doing its job at keeping the conversation on the one queue/irq/cpu 
combination, any Tx following the offloaded Rx might already be headed 
for the right Tx queue.  I'll check on that.

> I think we should probably look at a different approach for this. For
> example why is it we need to use a different transmit path for a
> macvlan packet vs any other packet? On the Rx side we get the
> advantage of avoiding the software hashing and demux. What do we get
> for reserving queues for transmit?

There are a few reasons I can think of to keep the Tx on the
specific queue pair:

- Keep the Tx traffic on the same CPU and irq as the Rx traffic

- Don't let the flow get interrupted, slowed, or otherwise perturbed by 
other traffic flows.

- Allow for adding hardware assisted bandwidth constraints to the 
offloaded flow without bothering the rest of the NIC's traffic

Are these enough to want to guarantee the Tx queue?

> My plan for this is to go back and "fix" ixgbe so we can get it away
> from having to use the select_queue call for the macvlan offload and
> then maybe look at providing a few select NDO operations for allowing
> macvlans that are being offloaded to make specific calls into the
> hardware to perform tasks as needed.

The ixgbe implementation can certainly be improved.  I think its biggest 
failing is that the rest of the general traffic gets constrained to a 
single queue - no more RSS for load balancing.
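
For comparison, the sizing logic in this patch's i40e_fix_features() keeps
an RSS pool for general traffic and only shrinks it when too few spare
queues exist.  A simplified sketch of that arithmetic follows; the struct
and function names are hypothetical, and it assumes a single queue total
with rss_size <= total, whereas the patch itself reads both
num_queue_pairs and num_q_vectors:

```c
#include <stdint.h>

/* Hypothetical result type; not part of the i40e driver. */
struct queue_split {
	uint16_t rss;		/* queues left in the RSS pool */
	uint16_t offload;	/* queues reserved for macvlan offload */
};

/* If fewer than rss_size / 2 queues sit outside the RSS pool, halve the
 * pool and hand everything above it to the offload.  Otherwise leave the
 * pool alone and use only the queues that were already outside it.
 * Assumes rss_size <= total.
 */
static struct queue_split split_queues(uint16_t total, uint16_t rss_size)
{
	struct queue_split s;

	s.rss = rss_size;
	s.offload = total - rss_size;
	if (s.offload < rss_size / 2) {
		s.rss = rss_size / 2;
		s.offload = total - s.rss;
	}
	return s;
}
```

With 128 queues and a 64-queue RSS pool nothing shrinks and 64 queues go
to the offload; with 16 queues all in the RSS pool, the pool is halved to
8 and the other 8 become offload queues.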

sln

