Linux-RDMA Archive on lore.kernel.org
 help / color / Atom feed
* [RFC PATCH 0/4] Introduce master_xmit_slave_get
@ 2020-01-26 13:21 Maor Gottlieb
  2020-01-26 13:21 ` [RFC PATCH 1/4] net/core: " Maor Gottlieb
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Maor Gottlieb @ 2020-01-26 13:21 UTC (permalink / raw)
  To: j.vosburgh, vfalico, andy, jiri, davem
  Cc: Maor Gottlieb, netdev, saeedm, jgg, leonro, alexr, markz, parav,
	eranbe, linux-rdma

Hi,

This patch series add support to get the LAG master xmit slave by
introducing new .ndo - ndo_xmit_slave_get. Every LAG module can
implement it. In this RFC, we added the support to the bond module.

The main motivation for doing this is for drivers that offload part
of the LAG functionality [1]. For example, Mellanox Connect-X hardware
implements RoCE LAG which selects the TX affinity when the resources
are created and port is remapped when it goes down [2].

Because of that and the fact that the RDMA frames are bypass the bonding
driver completely, we need a function to get the xmit slave assume
all the slaves are active.

The idea is that the same UDP header will get the same hash result so
they will be transmitted from the same port.


Thanks

[1] https://www.spinics.net/lists/netdev/msg624832.html
[2] https://www.spinics.net/lists/netdev/msg626758.html

Maor Gottlieb (4):
  net/core: Introduce master_xmit_slave_get
  bonding: Rename slave_arr to active_slaves
  bonding: Add helpers to get xmit slave
  bonding: Implement ndo_xmit_slave_get

 drivers/net/bonding/bond_alb.c  |  41 ++++--
 drivers/net/bonding/bond_main.c | 231 ++++++++++++++++++++++----------
 include/linux/netdevice.h       |   3 +
 include/net/bond_alb.h          |   4 +
 include/net/bonding.h           |   3 +-
 include/net/lag.h               |  19 +++
 6 files changed, 215 insertions(+), 86 deletions(-)

-- 
2.17.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFC PATCH 1/4] net/core: Introduce master_xmit_slave_get
  2020-01-26 13:21 [RFC PATCH 0/4] Introduce master_xmit_slave_get Maor Gottlieb
@ 2020-01-26 13:21 ` " Maor Gottlieb
  2020-01-29  9:57   ` Gal Pressman
  2020-02-12 20:33   ` Saeed Mahameed
  2020-01-26 13:21 ` [RFC PATCH 2/4] bonding: Rename slave_arr to active_slaves Maor Gottlieb
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 10+ messages in thread
From: Maor Gottlieb @ 2020-01-26 13:21 UTC (permalink / raw)
  To: j.vosburgh, vfalico, andy, jiri, davem
  Cc: Maor Gottlieb, netdev, saeedm, jgg, leonro, alexr, markz, parav,
	eranbe, linux-rdma

Add new ndo to get the xmit slave of master device.
When slave selection method is based on hash, then the user can ask to
get the xmit slave assume all the slaves can transmit by setting the
LAG_FLAGS_HASH_ALL_SLAVES bit in the flags argument.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
---
 include/linux/netdevice.h |  3 +++
 include/net/lag.h         | 19 +++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 11bdf6cb30bd..faba4aa094e5 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1379,6 +1379,9 @@ struct net_device_ops {
 						 struct netlink_ext_ack *extack);
 	int			(*ndo_del_slave)(struct net_device *dev,
 						 struct net_device *slave_dev);
+	struct net_device*	(*ndo_xmit_slave_get)(struct net_device *master_dev,
+						      struct sk_buff *skb,
+						      int lag);
 	netdev_features_t	(*ndo_fix_features)(struct net_device *dev,
 						    netdev_features_t features);
 	int			(*ndo_set_features)(struct net_device *dev,
diff --git a/include/net/lag.h b/include/net/lag.h
index 95b880e6fdde..c710daf8f57a 100644
--- a/include/net/lag.h
+++ b/include/net/lag.h
@@ -6,6 +6,25 @@
 #include <linux/if_team.h>
 #include <net/bonding.h>
 
+enum lag_get_slaves_flags {
+	LAG_FLAGS_HASH_ALL_SLAVES = 1<<0
+};
+
+static inline
+struct net_device *master_xmit_slave_get(struct net_device *master_dev,
+					 struct sk_buff *skb,
+					 int flags)
+{
+	const struct net_device_ops *ops = master_dev->netdev_ops;
+	struct net_device *slave = NULL;
+
+	rcu_read_lock();
+	if (ops->ndo_xmit_slave_get)
+		slave = ops->ndo_xmit_slave_get(master_dev, skb, flags);
+	rcu_read_unlock();
+	return slave;
+}
+
 static inline bool net_lag_port_dev_txable(const struct net_device *port_dev)
 {
 	if (netif_is_team_port(port_dev))
-- 
2.17.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFC PATCH 2/4] bonding: Rename slave_arr to active_slaves
  2020-01-26 13:21 [RFC PATCH 0/4] Introduce master_xmit_slave_get Maor Gottlieb
  2020-01-26 13:21 ` [RFC PATCH 1/4] net/core: " Maor Gottlieb
@ 2020-01-26 13:21 ` Maor Gottlieb
  2020-01-26 13:21 ` [RFC PATCH 3/4] bonding: Add helpers to get xmit slave Maor Gottlieb
  2020-01-26 13:21 ` [RFC PATCH 4/4] bonding: Implement ndo_xmit_slave_get Maor Gottlieb
  3 siblings, 0 replies; 10+ messages in thread
From: Maor Gottlieb @ 2020-01-26 13:21 UTC (permalink / raw)
  To: j.vosburgh, vfalico, andy, jiri, davem
  Cc: Maor Gottlieb, netdev, saeedm, jgg, leonro, alexr, markz, parav,
	eranbe, linux-rdma

This patch does renaming of slave_arr to active_slaves, since we will
have two arrays, one for the active slaves and the second to all
slaves.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
---
 drivers/net/bonding/bond_alb.c  |  4 +-
 drivers/net/bonding/bond_main.c | 84 ++++++++++++++++++---------------
 include/net/bonding.h           |  2 +-
 3 files changed, 48 insertions(+), 42 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 4f2e6910c623..15ff1d1999f7 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1360,7 +1360,7 @@ netdev_tx_t bond_tlb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 				struct bond_up_slave *slaves;
 				unsigned int count;
 
-				slaves = rcu_dereference(bond->slave_arr);
+				slaves = rcu_dereference(bond->active_slaves);
 				count = slaves ? READ_ONCE(slaves->count) : 0;
 				if (likely(count))
 					tx_slave = slaves->arr[hash_index %
@@ -1474,7 +1474,7 @@ netdev_tx_t bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 			struct bond_up_slave *slaves;
 			unsigned int count;
 
-			slaves = rcu_dereference(bond->slave_arr);
+			slaves = rcu_dereference(bond->active_slaves);
 			count = slaves ? READ_ONCE(slaves->count) : 0;
 			if (likely(count))
 				tx_slave = slaves->arr[bond_xmit_hash(bond, skb) %
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 48d5ec770b94..14a592824f0c 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4038,6 +4038,29 @@ static void bond_slave_arr_handler(struct work_struct *work)
 	bond_slave_arr_work_rearm(bond, 1);
 }
 
+static void bond_skip_slave(struct bond_up_slave *slaves,
+			    struct slave *skipslave)
+{
+	int idx;
+
+	/* Rare situation where caller has asked to skip a specific
+	 * slave but allocation failed (most likely!). BTW this is
+	 * only possible when the call is initiated from
+	 * __bond_release_one(). In this situation; overwrite the
+	 * skipslave entry in the array with the last entry from the
+	 * array to avoid a situation where the xmit path may choose
+	 * this to-be-skipped slave to send a packet out.
+	 */
+	for (idx = 0; slaves && idx < slaves->count; idx++) {
+		if (skipslave == slaves->arr[idx]) {
+			slaves->arr[idx] =
+				slaves->arr[slaves->count - 1];
+			slaves->count--;
+			break;
+		}
+	}
+}
+
 /* Build the usable slaves array in control path for modes that use xmit-hash
  * to determine the slave interface -
  * (a) BOND_MODE_8023AD
@@ -4048,9 +4071,9 @@ static void bond_slave_arr_handler(struct work_struct *work)
  */
 int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 {
+	struct bond_up_slave *active_slaves, *old_active_slaves;
 	struct slave *slave;
 	struct list_head *iter;
-	struct bond_up_slave *new_arr, *old_arr;
 	int agg_id = 0;
 	int ret = 0;
 
@@ -4058,9 +4081,9 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 	WARN_ON(lockdep_is_held(&bond->mode_lock));
 #endif
 
-	new_arr = kzalloc(offsetof(struct bond_up_slave, arr[bond->slave_cnt]),
-			  GFP_KERNEL);
-	if (!new_arr) {
+	active_slaves = kzalloc(struct_size(active_slaves, arr,
+					    bond->slave_cnt), GFP_KERNEL);
+	if (!active_slaves) {
 		ret = -ENOMEM;
 		pr_err("Failed to build slave-array.\n");
 		goto out;
@@ -4070,14 +4093,14 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 
 		if (bond_3ad_get_active_agg_info(bond, &ad_info)) {
 			pr_debug("bond_3ad_get_active_agg_info failed\n");
-			kfree_rcu(new_arr, rcu);
+			kfree_rcu(active_slaves, rcu);
 			/* No active aggragator means it's not safe to use
 			 * the previous array.
 			 */
-			old_arr = rtnl_dereference(bond->slave_arr);
-			if (old_arr) {
-				RCU_INIT_POINTER(bond->slave_arr, NULL);
-				kfree_rcu(old_arr, rcu);
+			old_active_slaves = rtnl_dereference(bond->active_slaves);
+			if (old_active_slaves) {
+				RCU_INIT_POINTER(bond->active_slaves, NULL);
+				kfree_rcu(old_active_slaves, rcu);
 			}
 			goto out;
 		}
@@ -4097,37 +4120,20 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 			continue;
 
 		slave_dbg(bond->dev, slave->dev, "Adding slave to tx hash array[%d]\n",
-			  new_arr->count);
+			  active_slaves->count);
 
-		new_arr->arr[new_arr->count++] = slave;
+		active_slaves->arr[active_slaves->count++] = slave;
 	}
 
-	old_arr = rtnl_dereference(bond->slave_arr);
-	rcu_assign_pointer(bond->slave_arr, new_arr);
-	if (old_arr)
-		kfree_rcu(old_arr, rcu);
+	old_active_slaves = rtnl_dereference(bond->active_slaves);
+	rcu_assign_pointer(bond->active_slaves, active_slaves);
+	if (old_active_slaves)
+		kfree_rcu(old_active_slaves, rcu);
 out:
-	if (ret != 0 && skipslave) {
-		int idx;
-
-		/* Rare situation where caller has asked to skip a specific
-		 * slave but allocation failed (most likely!). BTW this is
-		 * only possible when the call is initiated from
-		 * __bond_release_one(). In this situation; overwrite the
-		 * skipslave entry in the array with the last entry from the
-		 * array to avoid a situation where the xmit path may choose
-		 * this to-be-skipped slave to send a packet out.
-		 */
-		old_arr = rtnl_dereference(bond->slave_arr);
-		for (idx = 0; old_arr != NULL && idx < old_arr->count; idx++) {
-			if (skipslave == old_arr->arr[idx]) {
-				old_arr->arr[idx] =
-				    old_arr->arr[old_arr->count-1];
-				old_arr->count--;
-				break;
-			}
-		}
-	}
+	if (ret != 0 && skipslave)
+		bond_skip_slave(rtnl_dereference(bond->active_slaves),
+				skipslave);
+
 	return ret;
 }
 
@@ -4143,7 +4149,7 @@ static netdev_tx_t bond_3ad_xor_xmit(struct sk_buff *skb,
 	struct bond_up_slave *slaves;
 	unsigned int count;
 
-	slaves = rcu_dereference(bond->slave_arr);
+	slaves = rcu_dereference(bond->active_slaves);
 	count = slaves ? READ_ONCE(slaves->count) : 0;
 	if (likely(count)) {
 		slave = slaves->arr[bond_xmit_hash(bond, skb) % count];
@@ -4435,9 +4441,9 @@ static void bond_uninit(struct net_device *bond_dev)
 		__bond_release_one(bond_dev, slave->dev, true, true);
 	netdev_info(bond_dev, "Released all slaves\n");
 
-	arr = rtnl_dereference(bond->slave_arr);
+	arr = rtnl_dereference(bond->active_slaves);
 	if (arr) {
-		RCU_INIT_POINTER(bond->slave_arr, NULL);
+		RCU_INIT_POINTER(bond->active_slaves, NULL);
 		kfree_rcu(arr, rcu);
 	}
 
diff --git a/include/net/bonding.h b/include/net/bonding.h
index 3d56b026bb9e..b77daffc1b52 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -200,7 +200,7 @@ struct bonding {
 	struct   slave __rcu *curr_active_slave;
 	struct   slave __rcu *current_arp_slave;
 	struct   slave __rcu *primary_slave;
-	struct   bond_up_slave __rcu *slave_arr; /* Array of usable slaves */
+	struct   bond_up_slave __rcu *active_slaves; /* Array of usable slaves */
 	bool     force_primary;
 	s32      slave_cnt; /* never change this value outside the attach/detach wrappers */
 	int     (*recv_probe)(const struct sk_buff *, struct bonding *,
-- 
2.17.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFC PATCH 3/4] bonding: Add helpers to get xmit slave
  2020-01-26 13:21 [RFC PATCH 0/4] Introduce master_xmit_slave_get Maor Gottlieb
  2020-01-26 13:21 ` [RFC PATCH 1/4] net/core: " Maor Gottlieb
  2020-01-26 13:21 ` [RFC PATCH 2/4] bonding: Rename slave_arr to active_slaves Maor Gottlieb
@ 2020-01-26 13:21 ` Maor Gottlieb
  2020-01-26 13:21 ` [RFC PATCH 4/4] bonding: Implement ndo_xmit_slave_get Maor Gottlieb
  3 siblings, 0 replies; 10+ messages in thread
From: Maor Gottlieb @ 2020-01-26 13:21 UTC (permalink / raw)
  To: j.vosburgh, vfalico, andy, jiri, davem
  Cc: Maor Gottlieb, netdev, saeedm, jgg, leonro, alexr, markz, parav,
	eranbe, linux-rdma

This helpers will be used by both the xmit function
and the get xmit slave ndo.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
---
 drivers/net/bonding/bond_alb.c  | 37 +++++++++----
 drivers/net/bonding/bond_main.c | 94 +++++++++++++++++++++------------
 include/net/bond_alb.h          |  4 ++
 3 files changed, 90 insertions(+), 45 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 15ff1d1999f7..5bf097404062 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1334,11 +1334,11 @@ static netdev_tx_t bond_do_alb_xmit(struct sk_buff *skb, struct bonding *bond,
 	return NETDEV_TX_OK;
 }
 
-netdev_tx_t bond_tlb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
+struct slave *bond_xmit_tlb_slave_get(struct bonding *bond,
+				      struct sk_buff *skb)
 {
-	struct bonding *bond = netdev_priv(bond_dev);
-	struct ethhdr *eth_data;
 	struct slave *tx_slave = NULL;
+	struct ethhdr *eth_data;
 	u32 hash_index;
 
 	skb_reset_mac_header(skb);
@@ -1369,21 +1369,30 @@ netdev_tx_t bond_tlb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 			break;
 		}
 	}
-	return bond_do_alb_xmit(skb, bond, tx_slave);
+	return tx_slave;
 }
 
-netdev_tx_t bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
+netdev_tx_t bond_tlb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 {
 	struct bonding *bond = netdev_priv(bond_dev);
-	struct ethhdr *eth_data;
+	struct slave *tx_slave;
+
+	tx_slave = bond_xmit_tlb_slave_get(bond, skb);
+	return bond_do_alb_xmit(skb, bond, tx_slave);
+}
+
+struct slave *bond_xmit_alb_slave_get(struct bonding *bond,
+				      struct sk_buff *skb)
+{
 	struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
-	struct slave *tx_slave = NULL;
 	static const __be32 ip_bcast = htonl(0xffffffff);
-	int hash_size = 0;
-	bool do_tx_balance = true;
-	u32 hash_index = 0;
+	struct slave *tx_slave = NULL;
 	const u8 *hash_start = NULL;
+	bool do_tx_balance = true;
+	struct ethhdr *eth_data;
 	struct ipv6hdr *ip6hdr;
+	u32 hash_index = 0;
+	int hash_size = 0;
 
 	skb_reset_mac_header(skb);
 	eth_data = eth_hdr(skb);
@@ -1481,7 +1490,15 @@ netdev_tx_t bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 						       count];
 		}
 	}
+	return tx_slave;
+}
+
+netdev_tx_t bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
+{
+	struct bonding *bond = netdev_priv(bond_dev);
+	struct slave *tx_slave = NULL;
 
+	tx_slave = bond_xmit_alb_slave_get(bond, skb);
 	return bond_do_alb_xmit(skb, bond, tx_slave);
 }
 
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 14a592824f0c..adab1e3549ff 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -82,6 +82,7 @@
 #include <net/bonding.h>
 #include <net/bond_3ad.h>
 #include <net/bond_alb.h>
+#include <net/lag.h>
 
 #include "bonding_priv.h"
 
@@ -3406,10 +3407,26 @@ u32 bond_xmit_hash(struct bonding *bond, struct sk_buff *skb)
 		(__force u32)flow_get_u32_src(&flow);
 	hash ^= (hash >> 16);
 	hash ^= (hash >> 8);
-
 	return hash >> 1;
 }
 
+static struct slave *bond_xmit_3ad_xor_slave_get(struct bonding *bond,
+						 struct sk_buff *skb,
+						 struct bond_up_slave *slaves)
+{
+	struct slave *slave;
+	unsigned int count;
+	u32 hash;
+
+	hash = bond_xmit_hash(bond, skb);
+	count = slaves ? READ_ONCE(slaves->count) : 0;
+	if (unlikely(!count))
+		return NULL;
+
+	slave = slaves->arr[hash % count];
+	return slave;
+}
+
 /*-------------------------- Device entry points ----------------------------*/
 
 void bond_work_init_all(struct bonding *bond)
@@ -3874,16 +3891,15 @@ static int bond_set_mac_address(struct net_device *bond_dev, void *addr)
 }
 
 /**
- * bond_xmit_slave_id - transmit skb through slave with slave_id
+ * bond_get_slave_by_id - get xmit slave with slave_id
  * @bond: bonding device that is transmitting
- * @skb: buffer to transmit
  * @slave_id: slave id up to slave_cnt-1 through which to transmit
  *
- * This function tries to transmit through slave with slave_id but in case
+ * This function tries to get slave with slave_id but in case
  * it fails, it tries to find the first available slave for transmission.
- * The skb is consumed in all cases, thus the function is void.
  */
-static void bond_xmit_slave_id(struct bonding *bond, struct sk_buff *skb, int slave_id)
+static struct slave *bond_get_slave_by_id(struct bonding *bond,
+					  int slave_id)
 {
 	struct list_head *iter;
 	struct slave *slave;
@@ -3892,10 +3908,8 @@ static void bond_xmit_slave_id(struct bonding *bond, struct sk_buff *skb, int sl
 	/* Here we start from the slave with slave_id */
 	bond_for_each_slave_rcu(bond, slave, iter) {
 		if (--i < 0) {
-			if (bond_slave_can_tx(slave)) {
-				bond_dev_queue_xmit(bond, skb, slave->dev);
-				return;
-			}
+			if (bond_slave_can_tx(slave))
+				return slave;
 		}
 	}
 
@@ -3904,13 +3918,11 @@ static void bond_xmit_slave_id(struct bonding *bond, struct sk_buff *skb, int sl
 	bond_for_each_slave_rcu(bond, slave, iter) {
 		if (--i < 0)
 			break;
-		if (bond_slave_can_tx(slave)) {
-			bond_dev_queue_xmit(bond, skb, slave->dev);
-			return;
-		}
+		if (bond_slave_can_tx(slave))
+			return slave;
 	}
-	/* no slave that can tx has been found */
-	bond_tx_drop(bond->dev, skb);
+
+	return NULL;
 }
 
 /**
@@ -3946,10 +3958,9 @@ static u32 bond_rr_gen_slave_id(struct bonding *bond)
 	return slave_id;
 }
 
-static netdev_tx_t bond_xmit_roundrobin(struct sk_buff *skb,
-					struct net_device *bond_dev)
+static struct slave *bond_xmit_roundrobin_slave_get(struct bonding *bond,
+						    struct sk_buff *skb)
 {
-	struct bonding *bond = netdev_priv(bond_dev);
 	struct slave *slave;
 	int slave_cnt;
 	u32 slave_id;
@@ -3971,24 +3982,40 @@ static netdev_tx_t bond_xmit_roundrobin(struct sk_buff *skb,
 		if (iph->protocol == IPPROTO_IGMP) {
 			slave = rcu_dereference(bond->curr_active_slave);
 			if (slave)
-				bond_dev_queue_xmit(bond, skb, slave->dev);
-			else
-				bond_xmit_slave_id(bond, skb, 0);
-			return NETDEV_TX_OK;
+				return slave;
+			return bond_get_slave_by_id(bond, 0);
 		}
 	}
 
 non_igmp:
 	slave_cnt = READ_ONCE(bond->slave_cnt);
 	if (likely(slave_cnt)) {
-		slave_id = bond_rr_gen_slave_id(bond);
-		bond_xmit_slave_id(bond, skb, slave_id % slave_cnt);
-	} else {
-		bond_tx_drop(bond_dev, skb);
+		slave_id = bond_rr_gen_slave_id(bond) % slave_cnt;
+		return bond_get_slave_by_id(bond, slave_id);
 	}
+	return NULL;
+}
+
+static netdev_tx_t bond_xmit_roundrobin(struct sk_buff *skb,
+					struct net_device *bond_dev)
+{
+	struct bonding *bond = netdev_priv(bond_dev);
+	struct slave *slave;
+
+	slave = bond_xmit_roundrobin_slave_get(bond, skb);
+	if (slave)
+		bond_dev_queue_xmit(bond, skb, slave->dev);
+	else
+		bond_tx_drop(bond_dev, skb);
 	return NETDEV_TX_OK;
 }
 
+static struct slave *bond_xmit_activebackup_slave_get(struct bonding *bond,
+						      struct sk_buff *skb)
+{
+	return rcu_dereference(bond->curr_active_slave);
+}
+
 /* In active-backup mode, we know that bond->curr_active_slave is always valid if
  * the bond has a usable interface.
  */
@@ -3998,7 +4025,7 @@ static netdev_tx_t bond_xmit_activebackup(struct sk_buff *skb,
 	struct bonding *bond = netdev_priv(bond_dev);
 	struct slave *slave;
 
-	slave = rcu_dereference(bond->curr_active_slave);
+	slave = bond_xmit_activebackup_slave_get(bond, skb);
 	if (slave)
 		bond_dev_queue_xmit(bond, skb, slave->dev);
 	else
@@ -4145,18 +4172,15 @@ static netdev_tx_t bond_3ad_xor_xmit(struct sk_buff *skb,
 				     struct net_device *dev)
 {
 	struct bonding *bond = netdev_priv(dev);
-	struct slave *slave;
 	struct bond_up_slave *slaves;
-	unsigned int count;
+	struct slave *slave;
 
 	slaves = rcu_dereference(bond->active_slaves);
-	count = slaves ? READ_ONCE(slaves->count) : 0;
-	if (likely(count)) {
-		slave = slaves->arr[bond_xmit_hash(bond, skb) % count];
+	slave = bond_xmit_3ad_xor_slave_get(bond, skb, slaves);
+	if (likely(slave))
 		bond_dev_queue_xmit(bond, skb, slave->dev);
-	} else {
+	else
 		bond_tx_drop(dev, skb);
-	}
 
 	return NETDEV_TX_OK;
 }
diff --git a/include/net/bond_alb.h b/include/net/bond_alb.h
index b3504fcd773d..f6af76c87a6c 100644
--- a/include/net/bond_alb.h
+++ b/include/net/bond_alb.h
@@ -158,6 +158,10 @@ void bond_alb_handle_link_change(struct bonding *bond, struct slave *slave, char
 void bond_alb_handle_active_change(struct bonding *bond, struct slave *new_slave);
 int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev);
 int bond_tlb_xmit(struct sk_buff *skb, struct net_device *bond_dev);
+struct slave *bond_xmit_alb_slave_get(struct bonding *bond,
+				      struct sk_buff *skb);
+struct slave *bond_xmit_tlb_slave_get(struct bonding *bond,
+				      struct sk_buff *skb);
 void bond_alb_monitor(struct work_struct *);
 int bond_alb_set_mac_address(struct net_device *bond_dev, void *addr);
 void bond_alb_clear_vlan(struct bonding *bond, unsigned short vlan_id);
-- 
2.17.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFC PATCH 4/4] bonding: Implement ndo_xmit_slave_get
  2020-01-26 13:21 [RFC PATCH 0/4] Introduce master_xmit_slave_get Maor Gottlieb
                   ` (2 preceding siblings ...)
  2020-01-26 13:21 ` [RFC PATCH 3/4] bonding: Add helpers to get xmit slave Maor Gottlieb
@ 2020-01-26 13:21 ` Maor Gottlieb
  2020-01-29  2:08   ` Jay Vosburgh
  3 siblings, 1 reply; 10+ messages in thread
From: Maor Gottlieb @ 2020-01-26 13:21 UTC (permalink / raw)
  To: j.vosburgh, vfalico, andy, jiri, davem
  Cc: Maor Gottlieb, netdev, saeedm, jgg, leonro, alexr, markz, parav,
	eranbe, linux-rdma

Add implementation of ndo_xmit_slave_get.
When user set the LAG_FLAGS_HASH_ALL_SLAVES bit and the xmit slave
result is based on the hash, then the slave will be selected from the
array of all the slaves.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
---
 drivers/net/bonding/bond_main.c | 63 ++++++++++++++++++++++++++++++---
 include/net/bonding.h           |  1 +
 2 files changed, 60 insertions(+), 4 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index adab1e3549ff..c8f440d1b624 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4098,7 +4098,8 @@ static void bond_skip_slave(struct bond_up_slave *slaves,
  */
 int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 {
-	struct bond_up_slave *active_slaves, *old_active_slaves;
+	struct bond_up_slave *active_slaves = NULL, *all_slaves = NULL;
+	struct bond_up_slave *old_active_slaves, *old_all_slaves;
 	struct slave *slave;
 	struct list_head *iter;
 	int agg_id = 0;
@@ -4110,7 +4111,9 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 
 	active_slaves = kzalloc(struct_size(active_slaves, arr,
 					    bond->slave_cnt), GFP_KERNEL);
-	if (!active_slaves) {
+	all_slaves = kzalloc(struct_size(all_slaves, arr,
+					 bond->slave_cnt), GFP_KERNEL);
+	if (!active_slaves || !all_slaves) {
 		ret = -ENOMEM;
 		pr_err("Failed to build slave-array.\n");
 		goto out;
@@ -4141,14 +4144,17 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 			if (!agg || agg->aggregator_identifier != agg_id)
 				continue;
 		}
-		if (!bond_slave_can_tx(slave))
+		if (!bond_slave_can_tx(slave)) {
+			all_slaves->arr[all_slaves->count++] = slave;
 			continue;
+		}
 		if (skipslave == slave)
 			continue;
 
 		slave_dbg(bond->dev, slave->dev, "Adding slave to tx hash array[%d]\n",
 			  active_slaves->count);
 
+		all_slaves->arr[all_slaves->count++] = slave;
 		active_slaves->arr[active_slaves->count++] = slave;
 	}
 
@@ -4156,10 +4162,18 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 	rcu_assign_pointer(bond->active_slaves, active_slaves);
 	if (old_active_slaves)
 		kfree_rcu(old_active_slaves, rcu);
+
+	old_all_slaves = rtnl_dereference(bond->all_slaves);
+	rcu_assign_pointer(bond->all_slaves, all_slaves);
+	if (old_all_slaves)
+		kfree_rcu(old_all_slaves, rcu);
 out:
-	if (ret != 0 && skipslave)
+	if (ret != 0 && skipslave) {
 		bond_skip_slave(rtnl_dereference(bond->active_slaves),
 				skipslave);
+		kfree(all_slaves);
+		kfree(active_slaves);
+	}
 
 	return ret;
 }
@@ -4265,6 +4279,46 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb,
 	return txq;
 }
 
+static struct net_device *bond_xmit_slave_get(struct net_device *master_dev,
+					      struct sk_buff *skb,
+					      int flags)
+{
+	struct bonding *bond = netdev_priv(master_dev);
+	struct bond_up_slave *slaves;
+	struct slave *slave;
+
+	switch (BOND_MODE(bond)) {
+	case BOND_MODE_ROUNDROBIN:
+		slave = bond_xmit_roundrobin_slave_get(bond, skb);
+		break;
+	case BOND_MODE_ACTIVEBACKUP:
+		slave = bond_xmit_activebackup_slave_get(bond, skb);
+		break;
+	case BOND_MODE_8023AD:
+	case BOND_MODE_XOR:
+		if (flags & LAG_FLAGS_HASH_ALL_SLAVES)
+			slaves = rcu_dereference(bond->all_slaves);
+		else
+			slaves = rcu_dereference(bond->active_slaves);
+		slave = bond_xmit_3ad_xor_slave_get(bond, skb, slaves);
+		break;
+	case BOND_MODE_BROADCAST:
+		return ERR_PTR(-EOPNOTSUPP);
+	case BOND_MODE_ALB:
+		slave = bond_xmit_alb_slave_get(bond, skb);
+		break;
+	case BOND_MODE_TLB:
+		slave = bond_xmit_tlb_slave_get(bond, skb);
+		break;
+	default:
+		return NULL;
+	}
+
+	if (slave)
+		return slave->dev;
+	return NULL;
+}
+
 static netdev_tx_t __bond_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct bonding *bond = netdev_priv(dev);
@@ -4387,6 +4441,7 @@ static const struct net_device_ops bond_netdev_ops = {
 	.ndo_del_slave		= bond_release,
 	.ndo_fix_features	= bond_fix_features,
 	.ndo_features_check	= passthru_features_check,
+	.ndo_xmit_slave_get	= bond_xmit_slave_get,
 };
 
 static const struct device_type bond_type = {
diff --git a/include/net/bonding.h b/include/net/bonding.h
index b77daffc1b52..6dd970eb9d3f 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -201,6 +201,7 @@ struct bonding {
 	struct   slave __rcu *current_arp_slave;
 	struct   slave __rcu *primary_slave;
 	struct   bond_up_slave __rcu *active_slaves; /* Array of usable slaves */
+	struct   bond_up_slave __rcu *all_slaves; /* Array of all slaves */
 	bool     force_primary;
 	s32      slave_cnt; /* never change this value outside the attach/detach wrappers */
 	int     (*recv_probe)(const struct sk_buff *, struct bonding *,
-- 
2.17.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH 4/4] bonding: Implement ndo_xmit_slave_get
  2020-01-26 13:21 ` [RFC PATCH 4/4] bonding: Implement ndo_xmit_slave_get Maor Gottlieb
@ 2020-01-29  2:08   ` Jay Vosburgh
  2020-01-30 15:44     ` Maor Gottlieb
  0 siblings, 1 reply; 10+ messages in thread
From: Jay Vosburgh @ 2020-01-29  2:08 UTC (permalink / raw)
  To: Maor Gottlieb
  Cc: vfalico, andy, jiri, davem, netdev, saeedm, jgg, leonro, alexr,
	markz, parav, eranbe, linux-rdma

Maor Gottlieb <maorg@mellanox.com> wrote:

>Add implementation of ndo_xmit_slave_get.
>When user set the LAG_FLAGS_HASH_ALL_SLAVES bit and the xmit slave
>result is based on the hash, then the slave will be selected from the
>array of all the slaves.
>
>Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
>---
> drivers/net/bonding/bond_main.c | 63 ++++++++++++++++++++++++++++++---
> include/net/bonding.h           |  1 +
> 2 files changed, 60 insertions(+), 4 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index adab1e3549ff..c8f440d1b624 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -4098,7 +4098,8 @@ static void bond_skip_slave(struct bond_up_slave *slaves,
>  */
> int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
> {
>-	struct bond_up_slave *active_slaves, *old_active_slaves;
>+	struct bond_up_slave *active_slaves = NULL, *all_slaves = NULL;
>+	struct bond_up_slave *old_active_slaves, *old_all_slaves;
> 	struct slave *slave;
> 	struct list_head *iter;
> 	int agg_id = 0;
>@@ -4110,7 +4111,9 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
> 
> 	active_slaves = kzalloc(struct_size(active_slaves, arr,
> 					    bond->slave_cnt), GFP_KERNEL);
>-	if (!active_slaves) {
>+	all_slaves = kzalloc(struct_size(all_slaves, arr,
>+					 bond->slave_cnt), GFP_KERNEL);
>+	if (!active_slaves || !all_slaves) {
> 		ret = -ENOMEM;
> 		pr_err("Failed to build slave-array.\n");
> 		goto out;
>@@ -4141,14 +4144,17 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
> 			if (!agg || agg->aggregator_identifier != agg_id)
> 				continue;
> 		}
>-		if (!bond_slave_can_tx(slave))
>+		if (!bond_slave_can_tx(slave)) {
>+			all_slaves->arr[all_slaves->count++] = slave;
> 			continue;
>+		}
> 		if (skipslave == slave)
> 			continue;
> 
> 		slave_dbg(bond->dev, slave->dev, "Adding slave to tx hash array[%d]\n",
> 			  active_slaves->count);
> 
>+		all_slaves->arr[all_slaves->count++] = slave;
> 		active_slaves->arr[active_slaves->count++] = slave;
> 	}
> 
>@@ -4156,10 +4162,18 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
> 	rcu_assign_pointer(bond->active_slaves, active_slaves);
> 	if (old_active_slaves)
> 		kfree_rcu(old_active_slaves, rcu);
>+
>+	old_all_slaves = rtnl_dereference(bond->all_slaves);
>+	rcu_assign_pointer(bond->all_slaves, all_slaves);
>+	if (old_all_slaves)
>+		kfree_rcu(old_all_slaves, rcu);
> out:
>-	if (ret != 0 && skipslave)
>+	if (ret != 0 && skipslave) {
> 		bond_skip_slave(rtnl_dereference(bond->active_slaves),
> 				skipslave);
>+		kfree(all_slaves);
>+		kfree(active_slaves);
>+	}

	I'm still going through the patch set, but noticed this right
away: the above will leak memory if !skipslave and the allocation for
active_slaves succeeds, but the allocation for all_slaves fails.
> 
> 	return ret;
> }
>@@ -4265,6 +4279,46 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb,
> 	return txq;
> }
> 
>+static struct net_device *bond_xmit_slave_get(struct net_device *master_dev,
>+					      struct sk_buff *skb,
>+					      int flags)
>+{
>+	struct bonding *bond = netdev_priv(master_dev);
>+	struct bond_up_slave *slaves;
>+	struct slave *slave;
>+
>+	switch (BOND_MODE(bond)) {
>+	case BOND_MODE_ROUNDROBIN:
>+		slave = bond_xmit_roundrobin_slave_get(bond, skb);
>+		break;
>+	case BOND_MODE_ACTIVEBACKUP:
>+		slave = bond_xmit_activebackup_slave_get(bond, skb);
>+		break;
>+	case BOND_MODE_8023AD:
>+	case BOND_MODE_XOR:
>+		if (flags & LAG_FLAGS_HASH_ALL_SLAVES)
>+			slaves = rcu_dereference(bond->all_slaves);
>+		else
>+			slaves = rcu_dereference(bond->active_slaves);
>+		slave = bond_xmit_3ad_xor_slave_get(bond, skb, slaves);
>+		break;
>+	case BOND_MODE_BROADCAST:
>+		return ERR_PTR(-EOPNOTSUPP);
>+	case BOND_MODE_ALB:
>+		slave = bond_xmit_alb_slave_get(bond, skb);
>+		break;
>+	case BOND_MODE_TLB:
>+		slave = bond_xmit_tlb_slave_get(bond, skb);
>+		break;
>+	default:
>+		return NULL;

	I would argue this should (a) return an error (not NULL), and,
(b) ideally issue a netdev_err for this impossible situation, similar to
the other switch statements in bonding.

	-J
	
>+	}
>+
>+	if (slave)
>+		return slave->dev;
>+	return NULL;
>+}
>+
> static netdev_tx_t __bond_start_xmit(struct sk_buff *skb, struct net_device *dev)
> {
> 	struct bonding *bond = netdev_priv(dev);
>@@ -4387,6 +4441,7 @@ static const struct net_device_ops bond_netdev_ops = {
> 	.ndo_del_slave		= bond_release,
> 	.ndo_fix_features	= bond_fix_features,
> 	.ndo_features_check	= passthru_features_check,
>+	.ndo_xmit_slave_get	= bond_xmit_slave_get,
> };
> 
> static const struct device_type bond_type = {
>diff --git a/include/net/bonding.h b/include/net/bonding.h
>index b77daffc1b52..6dd970eb9d3f 100644
>--- a/include/net/bonding.h
>+++ b/include/net/bonding.h
>@@ -201,6 +201,7 @@ struct bonding {
> 	struct   slave __rcu *current_arp_slave;
> 	struct   slave __rcu *primary_slave;
> 	struct   bond_up_slave __rcu *active_slaves; /* Array of usable slaves */
>+	struct   bond_up_slave __rcu *all_slaves; /* Array of all slaves */
> 	bool     force_primary;
> 	s32      slave_cnt; /* never change this value outside the attach/detach wrappers */
> 	int     (*recv_probe)(const struct sk_buff *, struct bonding *,
>-- 
>2.17.2
>

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH 1/4] net/core: Introduce master_xmit_slave_get
  2020-01-26 13:21 ` [RFC PATCH 1/4] net/core: " Maor Gottlieb
@ 2020-01-29  9:57   ` Gal Pressman
  2020-01-30 15:40     ` Maor Gottlieb
  2020-02-12 20:33   ` Saeed Mahameed
  1 sibling, 1 reply; 10+ messages in thread
From: Gal Pressman @ 2020-01-29  9:57 UTC (permalink / raw)
  To: Maor Gottlieb
  Cc: j.vosburgh, vfalico, andy, jiri, davem, netdev, saeedm, jgg,
	leonro, alexr, markz, parav, eranbe, linux-rdma

On 26/01/2020 15:21, Maor Gottlieb wrote:
> Add new ndo to get the xmit slave of master device.
> When slave selection method is based on hash, then the user can ask to
> get the xmit slave assume all the slaves can transmit by setting the
> LAG_FLAGS_HASH_ALL_SLAVES bit in the flags argument.
> 
> Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
> ---
>  include/linux/netdevice.h |  3 +++
>  include/net/lag.h         | 19 +++++++++++++++++++
>  2 files changed, 22 insertions(+)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 11bdf6cb30bd..faba4aa094e5 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1379,6 +1379,9 @@ struct net_device_ops {
>  						 struct netlink_ext_ack *extack);
>  	int			(*ndo_del_slave)(struct net_device *dev,
>  						 struct net_device *slave_dev);
> +	struct net_device*	(*ndo_xmit_slave_get)(struct net_device *master_dev,
> +						      struct sk_buff *skb,
> +						      int lag);

Hey Maor,
Should lag be named flags?
Also, better to use unsigned type for it.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH 1/4] net/core: Introduce master_xmit_slave_get
  2020-01-29  9:57   ` Gal Pressman
@ 2020-01-30 15:40     ` Maor Gottlieb
  0 siblings, 0 replies; 10+ messages in thread
From: Maor Gottlieb @ 2020-01-30 15:40 UTC (permalink / raw)
  To: Gal Pressman
  Cc: j.vosburgh, vfalico, andy, Jiri Pirko, davem, netdev,
	Saeed Mahameed, Jason Gunthorpe, Leon Romanovsky, Alex Rosenbaum,
	Mark Zhang, Parav Pandit, Eran Ben Elisha, linux-rdma


On 1/29/2020 11:57 AM, Gal Pressman wrote:
> On 26/01/2020 15:21, Maor Gottlieb wrote:
>> Add new ndo to get the xmit slave of master device.
>> When slave selection method is based on hash, then the user can ask to
>> get the xmit slave assume all the slaves can transmit by setting the
>> LAG_FLAGS_HASH_ALL_SLAVES bit in the flags argument.
>>
>> Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
>> ---
>>   include/linux/netdevice.h |  3 +++
>>   include/net/lag.h         | 19 +++++++++++++++++++
>>   2 files changed, 22 insertions(+)
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index 11bdf6cb30bd..faba4aa094e5 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -1379,6 +1379,9 @@ struct net_device_ops {
>>   						 struct netlink_ext_ack *extack);
>>   	int			(*ndo_del_slave)(struct net_device *dev,
>>   						 struct net_device *slave_dev);
>> +	struct net_device*	(*ndo_xmit_slave_get)(struct net_device *master_dev,
>> +						      struct sk_buff *skb,
>> +						      int lag);
> Hey Maor,
> Should lag be named flags?
> Also, better to use unsigned type for it.

Yeah, will change it.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH 4/4] bonding: Implement ndo_xmit_slave_get
  2020-01-29  2:08   ` Jay Vosburgh
@ 2020-01-30 15:44     ` Maor Gottlieb
  0 siblings, 0 replies; 10+ messages in thread
From: Maor Gottlieb @ 2020-01-30 15:44 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: vfalico, andy, Jiri Pirko, davem, netdev, Saeed Mahameed,
	Jason Gunthorpe, Leon Romanovsky, Alex Rosenbaum, Mark Zhang,
	Parav Pandit, Eran Ben Elisha, linux-rdma


On 1/29/2020 4:08 AM, Jay Vosburgh wrote:
> Maor Gottlieb <maorg@mellanox.com> wrote:
>
>> Add implementation of ndo_xmit_slave_get.
>> When user set the LAG_FLAGS_HASH_ALL_SLAVES bit and the xmit slave
>> result is based on the hash, then the slave will be selected from the
>> array of all the slaves.
>>
>> Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
>> ---
>> drivers/net/bonding/bond_main.c | 63 ++++++++++++++++++++++++++++++---
>> include/net/bonding.h           |  1 +
>> 2 files changed, 60 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index adab1e3549ff..c8f440d1b624 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -4098,7 +4098,8 @@ static void bond_skip_slave(struct bond_up_slave *slaves,
>>   */
>> int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
>> {
>> -	struct bond_up_slave *active_slaves, *old_active_slaves;
>> +	struct bond_up_slave *active_slaves = NULL, *all_slaves = NULL;
>> +	struct bond_up_slave *old_active_slaves, *old_all_slaves;
>> 	struct slave *slave;
>> 	struct list_head *iter;
>> 	int agg_id = 0;
>> @@ -4110,7 +4111,9 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
>>
>> 	active_slaves = kzalloc(struct_size(active_slaves, arr,
>> 					    bond->slave_cnt), GFP_KERNEL);
>> -	if (!active_slaves) {
>> +	all_slaves = kzalloc(struct_size(all_slaves, arr,
>> +					 bond->slave_cnt), GFP_KERNEL);
>> +	if (!active_slaves || !all_slaves) {
>> 		ret = -ENOMEM;
>> 		pr_err("Failed to build slave-array.\n");
>> 		goto out;
>> @@ -4141,14 +4144,17 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
>> 			if (!agg || agg->aggregator_identifier != agg_id)
>> 				continue;
>> 		}
>> -		if (!bond_slave_can_tx(slave))
>> +		if (!bond_slave_can_tx(slave)) {
>> +			all_slaves->arr[all_slaves->count++] = slave;
>> 			continue;
>> +		}
>> 		if (skipslave == slave)
>> 			continue;
>>
>> 		slave_dbg(bond->dev, slave->dev, "Adding slave to tx hash array[%d]\n",
>> 			  active_slaves->count);
>>
>> +		all_slaves->arr[all_slaves->count++] = slave;
>> 		active_slaves->arr[active_slaves->count++] = slave;
>> 	}
>>
>> @@ -4156,10 +4162,18 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
>> 	rcu_assign_pointer(bond->active_slaves, active_slaves);
>> 	if (old_active_slaves)
>> 		kfree_rcu(old_active_slaves, rcu);
>> +
>> +	old_all_slaves = rtnl_dereference(bond->all_slaves);
>> +	rcu_assign_pointer(bond->all_slaves, all_slaves);
>> +	if (old_all_slaves)
>> +		kfree_rcu(old_all_slaves, rcu);
>> out:
>> -	if (ret != 0 && skipslave)
>> +	if (ret != 0 && skipslave) {
>> 		bond_skip_slave(rtnl_dereference(bond->active_slaves),
>> 				skipslave);
>> +		kfree(all_slaves);
>> +		kfree(active_slaves);
>> +	}
> 	I'm still going through the patch set, but noticed this right
> away: the above will leak memory if !skipslave and the allocation for
> active_slaves succeeds, but the allocation for all_slaves fails.
>> 	return ret;
>> }
>> @@ -4265,6 +4279,46 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb,
>> 	return txq;
>> }
>>
>> +static struct net_device *bond_xmit_slave_get(struct net_device *master_dev,
>> +					      struct sk_buff *skb,
>> +					      int flags)
>> +{
>> +	struct bonding *bond = netdev_priv(master_dev);
>> +	struct bond_up_slave *slaves;
>> +	struct slave *slave;
>> +
>> +	switch (BOND_MODE(bond)) {
>> +	case BOND_MODE_ROUNDROBIN:
>> +		slave = bond_xmit_roundrobin_slave_get(bond, skb);
>> +		break;
>> +	case BOND_MODE_ACTIVEBACKUP:
>> +		slave = bond_xmit_activebackup_slave_get(bond, skb);
>> +		break;
>> +	case BOND_MODE_8023AD:
>> +	case BOND_MODE_XOR:
>> +		if (flags & LAG_FLAGS_HASH_ALL_SLAVES)
>> +			slaves = rcu_dereference(bond->all_slaves);
>> +		else
>> +			slaves = rcu_dereference(bond->active_slaves);
>> +		slave = bond_xmit_3ad_xor_slave_get(bond, skb, slaves);
>> +		break;
>> +	case BOND_MODE_BROADCAST:
>> +		return ERR_PTR(-EOPNOTSUPP);
>> +	case BOND_MODE_ALB:
>> +		slave = bond_xmit_alb_slave_get(bond, skb);
>> +		break;
>> +	case BOND_MODE_TLB:
>> +		slave = bond_xmit_tlb_slave_get(bond, skb);
>> +		break;
>> +	default:
>> +		return NULL;
> 	I would argue this should (a) return an error (not NULL), and,
> (b) ideally issue a netdev_err for this impossible situation, similar to
> the other switch statements in bonding.
>
> 	-J
> 	
>> +	}
>> +
>> +	if (slave)
>> +		return slave->dev;
>> +	return NULL;
>> +}
>> +
>> static netdev_tx_t __bond_start_xmit(struct sk_buff *skb, struct net_device *dev)
>> {
>> 	struct bonding *bond = netdev_priv(dev);
>> @@ -4387,6 +4441,7 @@ static const struct net_device_ops bond_netdev_ops = {
>> 	.ndo_del_slave		= bond_release,
>> 	.ndo_fix_features	= bond_fix_features,
>> 	.ndo_features_check	= passthru_features_check,
>> +	.ndo_xmit_slave_get	= bond_xmit_slave_get,
>> };
>>
>> static const struct device_type bond_type = {
>> diff --git a/include/net/bonding.h b/include/net/bonding.h
>> index b77daffc1b52..6dd970eb9d3f 100644
>> --- a/include/net/bonding.h
>> +++ b/include/net/bonding.h
>> @@ -201,6 +201,7 @@ struct bonding {
>> 	struct   slave __rcu *current_arp_slave;
>> 	struct   slave __rcu *primary_slave;
>> 	struct   bond_up_slave __rcu *active_slaves; /* Array of usable slaves */
>> +	struct   bond_up_slave __rcu *all_slaves; /* Array of all slaves */
>> 	bool     force_primary;
>> 	s32      slave_cnt; /* never change this value outside the attach/detach wrappers */
>> 	int     (*recv_probe)(const struct sk_buff *, struct bonding *,
>> -- 
>> 2.17.2
>>
> ---
> 	-Jay Vosburgh, jay.vosburgh@canonical.com

Thanks Jay,
I will address the comments and submit this patch set along with the 
RoCE patches.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH 1/4] net/core: Introduce master_xmit_slave_get
  2020-01-26 13:21 ` [RFC PATCH 1/4] net/core: " Maor Gottlieb
  2020-01-29  9:57   ` Gal Pressman
@ 2020-02-12 20:33   ` Saeed Mahameed
  1 sibling, 0 replies; 10+ messages in thread
From: Saeed Mahameed @ 2020-02-12 20:33 UTC (permalink / raw)
  To: vfalico, Maor Gottlieb, andy, j.vosburgh, davem, Jiri Pirko
  Cc: Jason Gunthorpe, linux-rdma, Leon Romanovsky, Mark Zhang,
	Parav Pandit, Eran Ben Elisha, netdev, Alex Rosenbaum

On Sun, 2020-01-26 at 15:21 +0200, Maor Gottlieb wrote:
> Add new ndo to get the xmit slave of master device.
> When slave selection method is based on hash, then the user can ask
> to
> get the xmit slave assume all the slaves can transmit by setting the
> LAG_FLAGS_HASH_ALL_SLAVES bit in the flags argument.
> 
> Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
> ---
>  include/linux/netdevice.h |  3 +++
>  include/net/lag.h         | 19 +++++++++++++++++++
>  2 files changed, 22 insertions(+)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 11bdf6cb30bd..faba4aa094e5 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1379,6 +1379,9 @@ struct net_device_ops {
>  						 struct netlink_ext_ack
> *extack);
>  	int			(*ndo_del_slave)(struct net_device
> *dev,
>  						 struct net_device
> *slave_dev);
> +	struct net_device*	(*ndo_xmit_slave_get)(struct
> net_device *master_dev,
> +						      struct sk_buff
> *skb,
> +						      int lag);
>  	netdev_features_t	(*ndo_fix_features)(struct net_device *dev,
>  						    netdev_features_t
> features);
>  	int			(*ndo_set_features)(struct net_device
> *dev,
> diff --git a/include/net/lag.h b/include/net/lag.h
> index 95b880e6fdde..c710daf8f57a 100644
> --- a/include/net/lag.h
> +++ b/include/net/lag.h
> @@ -6,6 +6,25 @@
>  #include <linux/if_team.h>
>  #include <net/bonding.h>
>  
> +enum lag_get_slaves_flags {
> +	LAG_FLAGS_HASH_ALL_SLAVES = 1<<0
> +};
> +
> +static inline
> +struct net_device *master_xmit_slave_get(struct net_device
> *master_dev,
> +					 struct sk_buff *skb,
> +					 int flags)
> +{
> +	const struct net_device_ops *ops = master_dev->netdev_ops;
> +	struct net_device *slave = NULL;
> +
> +	rcu_read_lock();
> +	if (ops->ndo_xmit_slave_get)
> +		slave = ops->ndo_xmit_slave_get(master_dev, skb,
> flags);

what is the purpose of the rcu ? Aren't you supposed to dev_hold(slave)
under the rcu ?

and the caller should be responsible to issue the dev_put() .. 

otherwise slave is not guaranteed to stick around after this ndo
returns. or i am missing some assumptions that are not listed in this
patchset commit messages or cover letter.

Please clarify.


> +	rcu_read_unlock();
> +	return slave;
> +}
> +
>  static inline bool net_lag_port_dev_txable(const struct net_device
> *port_dev)
>  {
>  	if (netif_is_team_port(port_dev))

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, back to index

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-26 13:21 [RFC PATCH 0/4] Introduce master_xmit_slave_get Maor Gottlieb
2020-01-26 13:21 ` [RFC PATCH 1/4] net/core: " Maor Gottlieb
2020-01-29  9:57   ` Gal Pressman
2020-01-30 15:40     ` Maor Gottlieb
2020-02-12 20:33   ` Saeed Mahameed
2020-01-26 13:21 ` [RFC PATCH 2/4] bonding: Rename slave_arr to active_slaves Maor Gottlieb
2020-01-26 13:21 ` [RFC PATCH 3/4] bonding: Add helpers to get xmit slave Maor Gottlieb
2020-01-26 13:21 ` [RFC PATCH 4/4] bonding: Implement ndo_xmit_slave_get Maor Gottlieb
2020-01-29  2:08   ` Jay Vosburgh
2020-01-30 15:44     ` Maor Gottlieb

Linux-RDMA Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-rdma/0 linux-rdma/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-rdma linux-rdma/ https://lore.kernel.org/linux-rdma \
		linux-rdma@vger.kernel.org
	public-inbox-index linux-rdma

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-rdma


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git