All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v2 0/4] bonding: performance and reliability
@ 2018-05-14 18:48 Debabrata Banerjee
  2018-05-14 18:48 ` [PATCH net-next v2 1/4] bonding: don't queue up extraneous rlb updates Debabrata Banerjee
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Debabrata Banerjee @ 2018-05-14 18:48 UTC (permalink / raw)
  To: David S . Miller, netdev, Jay Vosburgh
  Cc: Veaceslav Falico, Andy Gospodarek, dbanerje

Series of fixes to how rlb updates are handled, code cleanup, allowing
higher performance tx hashing in balance-alb mode, and reliability of
link up/down monitoring.

v2: refactor bond_is_nondyn_tlb with inline fn, update log comment to
point out that multicast addresses will not get rlb updates.

Debabrata Banerjee (4):
  bonding: don't queue up extraneous rlb updates
  bonding: use common mac addr checks
  bonding: allow use of tx hashing in balance-alb
  bonding: allow carrier and link status to determine link state

 Documentation/networking/bonding.txt |  4 +--
 drivers/net/bonding/bond_alb.c       | 50 +++++++++++++++++-----------
 drivers/net/bonding/bond_main.c      | 37 ++++++++++++--------
 drivers/net/bonding/bond_options.c   |  9 ++---
 include/net/bonding.h                | 11 ++++--
 5 files changed, 70 insertions(+), 41 deletions(-)

-- 
2.17.0

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH net-next v2 1/4] bonding: don't queue up extraneous rlb updates
  2018-05-14 18:48 [PATCH net-next v2 0/4] bonding: performance and reliability Debabrata Banerjee
@ 2018-05-14 18:48 ` Debabrata Banerjee
  2018-05-14 18:48 ` [PATCH net-next v2 2/4] bonding: use common mac addr checks Debabrata Banerjee
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Debabrata Banerjee @ 2018-05-14 18:48 UTC (permalink / raw)
  To: David S . Miller, netdev, Jay Vosburgh
  Cc: Veaceslav Falico, Andy Gospodarek, dbanerje

arps for incomplete entries can't be sent anyway.

Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
---
 drivers/net/bonding/bond_alb.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 5eb0df2e5464..c2f6c58e4e6a 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -421,7 +421,8 @@ static void rlb_clear_slave(struct bonding *bond, struct slave *slave)
 			if (assigned_slave) {
 				rx_hash_table[index].slave = assigned_slave;
 				if (!ether_addr_equal_64bits(rx_hash_table[index].mac_dst,
-							     mac_bcast)) {
+							     mac_bcast) &&
+				    !is_zero_ether_addr(rx_hash_table[index].mac_dst)) {
 					bond_info->rx_hashtbl[index].ntt = 1;
 					bond_info->rx_ntt = 1;
 					/* A slave has been removed from the
@@ -524,7 +525,8 @@ static void rlb_req_update_slave_clients(struct bonding *bond, struct slave *sla
 		client_info = &(bond_info->rx_hashtbl[hash_index]);
 
 		if ((client_info->slave == slave) &&
-		    !ether_addr_equal_64bits(client_info->mac_dst, mac_bcast)) {
+		    !ether_addr_equal_64bits(client_info->mac_dst, mac_bcast) &&
+		    !is_zero_ether_addr(client_info->mac_dst)) {
 			client_info->ntt = 1;
 			ntt = 1;
 		}
@@ -565,7 +567,8 @@ static void rlb_req_update_subnet_clients(struct bonding *bond, __be32 src_ip)
 		if ((client_info->ip_src == src_ip) &&
 		    !ether_addr_equal_64bits(client_info->slave->dev->dev_addr,
 					     bond->dev->dev_addr) &&
-		    !ether_addr_equal_64bits(client_info->mac_dst, mac_bcast)) {
+		    !ether_addr_equal_64bits(client_info->mac_dst, mac_bcast) &&
+		    !is_zero_ether_addr(client_info->mac_dst)) {
 			client_info->ntt = 1;
 			bond_info->rx_ntt = 1;
 		}
@@ -641,7 +644,8 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
 		ether_addr_copy(client_info->mac_src, arp->mac_src);
 		client_info->slave = assigned_slave;
 
-		if (!ether_addr_equal_64bits(client_info->mac_dst, mac_bcast)) {
+		if (!ether_addr_equal_64bits(client_info->mac_dst, mac_bcast) &&
+		    !is_zero_ether_addr(client_info->mac_dst)) {
 			client_info->ntt = 1;
 			bond->alb_info.rx_ntt = 1;
 		} else {
@@ -733,8 +737,10 @@ static void rlb_rebalance(struct bonding *bond)
 		assigned_slave = __rlb_next_rx_slave(bond);
 		if (assigned_slave && (client_info->slave != assigned_slave)) {
 			client_info->slave = assigned_slave;
-			client_info->ntt = 1;
-			ntt = 1;
+			if (!is_zero_ether_addr(client_info->mac_dst)) {
+				client_info->ntt = 1;
+				ntt = 1;
+			}
 		}
 	}
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH net-next v2 2/4] bonding: use common mac addr checks
  2018-05-14 18:48 [PATCH net-next v2 0/4] bonding: performance and reliability Debabrata Banerjee
  2018-05-14 18:48 ` [PATCH net-next v2 1/4] bonding: don't queue up extraneous rlb updates Debabrata Banerjee
@ 2018-05-14 18:48 ` Debabrata Banerjee
  2018-05-14 18:48 ` [PATCH net-next v2 3/4] bonding: allow use of tx hashing in balance-alb Debabrata Banerjee
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Debabrata Banerjee @ 2018-05-14 18:48 UTC (permalink / raw)
  To: David S . Miller, netdev, Jay Vosburgh
  Cc: Veaceslav Falico, Andy Gospodarek, dbanerje

Replace homegrown mac addr checks with faster defs from etherdevice.h

Note that this will also prevent any rlb arp updates for multicast
addresses, however this should have been forbidden anyway.

Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
---
 drivers/net/bonding/bond_alb.c | 28 +++++++++-------------------
 1 file changed, 9 insertions(+), 19 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index c2f6c58e4e6a..180e50f7806f 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -40,11 +40,6 @@
 #include <net/bonding.h>
 #include <net/bond_alb.h>
 
-
-
-static const u8 mac_bcast[ETH_ALEN + 2] __long_aligned = {
-	0xff, 0xff, 0xff, 0xff, 0xff, 0xff
-};
 static const u8 mac_v6_allmcast[ETH_ALEN + 2] __long_aligned = {
 	0x33, 0x33, 0x00, 0x00, 0x00, 0x01
 };
@@ -420,9 +415,7 @@ static void rlb_clear_slave(struct bonding *bond, struct slave *slave)
 
 			if (assigned_slave) {
 				rx_hash_table[index].slave = assigned_slave;
-				if (!ether_addr_equal_64bits(rx_hash_table[index].mac_dst,
-							     mac_bcast) &&
-				    !is_zero_ether_addr(rx_hash_table[index].mac_dst)) {
+				if (is_valid_ether_addr(rx_hash_table[index].mac_dst)) {
 					bond_info->rx_hashtbl[index].ntt = 1;
 					bond_info->rx_ntt = 1;
 					/* A slave has been removed from the
@@ -525,8 +518,7 @@ static void rlb_req_update_slave_clients(struct bonding *bond, struct slave *sla
 		client_info = &(bond_info->rx_hashtbl[hash_index]);
 
 		if ((client_info->slave == slave) &&
-		    !ether_addr_equal_64bits(client_info->mac_dst, mac_bcast) &&
-		    !is_zero_ether_addr(client_info->mac_dst)) {
+		    is_valid_ether_addr(client_info->mac_dst)) {
 			client_info->ntt = 1;
 			ntt = 1;
 		}
@@ -567,8 +559,7 @@ static void rlb_req_update_subnet_clients(struct bonding *bond, __be32 src_ip)
 		if ((client_info->ip_src == src_ip) &&
 		    !ether_addr_equal_64bits(client_info->slave->dev->dev_addr,
 					     bond->dev->dev_addr) &&
-		    !ether_addr_equal_64bits(client_info->mac_dst, mac_bcast) &&
-		    !is_zero_ether_addr(client_info->mac_dst)) {
+		    is_valid_ether_addr(client_info->mac_dst)) {
 			client_info->ntt = 1;
 			bond_info->rx_ntt = 1;
 		}
@@ -596,7 +587,7 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
 		if ((client_info->ip_src == arp->ip_src) &&
 		    (client_info->ip_dst == arp->ip_dst)) {
 			/* the entry is already assigned to this client */
-			if (!ether_addr_equal_64bits(arp->mac_dst, mac_bcast)) {
+			if (!is_broadcast_ether_addr(arp->mac_dst)) {
 				/* update mac address from arp */
 				ether_addr_copy(client_info->mac_dst, arp->mac_dst);
 			}
@@ -644,8 +635,7 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
 		ether_addr_copy(client_info->mac_src, arp->mac_src);
 		client_info->slave = assigned_slave;
 
-		if (!ether_addr_equal_64bits(client_info->mac_dst, mac_bcast) &&
-		    !is_zero_ether_addr(client_info->mac_dst)) {
+		if (is_valid_ether_addr(client_info->mac_dst)) {
 			client_info->ntt = 1;
 			bond->alb_info.rx_ntt = 1;
 		} else {
@@ -1418,9 +1408,9 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 	case ETH_P_IP: {
 		const struct iphdr *iph = ip_hdr(skb);
 
-		if (ether_addr_equal_64bits(eth_data->h_dest, mac_bcast) ||
-		    (iph->daddr == ip_bcast) ||
-		    (iph->protocol == IPPROTO_IGMP)) {
+		if (is_broadcast_ether_addr(eth_data->h_dest) ||
+		    iph->daddr == ip_bcast ||
+		    iph->protocol == IPPROTO_IGMP) {
 			do_tx_balance = false;
 			break;
 		}
@@ -1432,7 +1422,7 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 		/* IPv6 doesn't really use broadcast mac address, but leave
 		 * that here just in case.
 		 */
-		if (ether_addr_equal_64bits(eth_data->h_dest, mac_bcast)) {
+		if (is_broadcast_ether_addr(eth_data->h_dest)) {
 			do_tx_balance = false;
 			break;
 		}
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH net-next v2 3/4] bonding: allow use of tx hashing in balance-alb
  2018-05-14 18:48 [PATCH net-next v2 0/4] bonding: performance and reliability Debabrata Banerjee
  2018-05-14 18:48 ` [PATCH net-next v2 1/4] bonding: don't queue up extraneous rlb updates Debabrata Banerjee
  2018-05-14 18:48 ` [PATCH net-next v2 2/4] bonding: use common mac addr checks Debabrata Banerjee
@ 2018-05-14 18:48 ` Debabrata Banerjee
  2018-05-14 18:48 ` [PATCH net-next v2 4/4] bonding: allow carrier and link status to determine link state Debabrata Banerjee
  2018-05-16 16:16 ` [PATCH net-next v2 0/4] bonding: performance and reliability David Miller
  4 siblings, 0 replies; 12+ messages in thread
From: Debabrata Banerjee @ 2018-05-14 18:48 UTC (permalink / raw)
  To: David S . Miller, netdev, Jay Vosburgh
  Cc: Veaceslav Falico, Andy Gospodarek, dbanerje

The rx load balancing provided by balance-alb is not mutually
exclusive with using hashing for tx selection, and should provide a decent
speed increase because this eliminates spinlocks and cache contention.

Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
---
 drivers/net/bonding/bond_alb.c     | 20 ++++++++++++++++++--
 drivers/net/bonding/bond_main.c    | 25 +++++++++++++++----------
 drivers/net/bonding/bond_options.c |  2 +-
 include/net/bonding.h              | 11 +++++++++--
 4 files changed, 43 insertions(+), 15 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 180e50f7806f..6228635880d5 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1478,8 +1478,24 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 	}
 
 	if (do_tx_balance) {
-		hash_index = _simple_hash(hash_start, hash_size);
-		tx_slave = tlb_choose_channel(bond, hash_index, skb->len);
+		if (bond->params.tlb_dynamic_lb) {
+			hash_index = _simple_hash(hash_start, hash_size);
+			tx_slave = tlb_choose_channel(bond, hash_index, skb->len);
+		} else {
+			/*
+			 * do_tx_balance means we are free to select the tx_slave
+			 * So we do exactly what tlb would do for hash selection
+			 */
+
+			struct bond_up_slave *slaves;
+			unsigned int count;
+
+			slaves = rcu_dereference(bond->slave_arr);
+			count = slaves ? READ_ONCE(slaves->count) : 0;
+			if (likely(count))
+				tx_slave = slaves->arr[bond_xmit_hash(bond, skb) %
+						       count];
+		}
 	}
 
 	return bond_do_alb_xmit(skb, bond, tx_slave);
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 1f1e97b26f95..f7f8a49cb32b 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -159,7 +159,7 @@ module_param(min_links, int, 0);
 MODULE_PARM_DESC(min_links, "Minimum number of available links before turning on carrier");
 
 module_param(xmit_hash_policy, charp, 0);
-MODULE_PARM_DESC(xmit_hash_policy, "balance-xor and 802.3ad hashing method; "
+MODULE_PARM_DESC(xmit_hash_policy, "balance-alb, balance-tlb, balance-xor, 802.3ad hashing method; "
 				   "0 for layer 2 (default), 1 for layer 3+4, "
 				   "2 for layer 2+3, 3 for encap layer 2+3, "
 				   "4 for encap layer 3+4");
@@ -1735,7 +1735,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
 		unblock_netpoll_tx();
 	}
 
-	if (bond_mode_uses_xmit_hash(bond))
+	if (bond_mode_can_use_xmit_hash(bond))
 		bond_update_slave_arr(bond, NULL);
 
 	bond->nest_level = dev_get_nest_level(bond_dev);
@@ -1870,7 +1870,7 @@ static int __bond_release_one(struct net_device *bond_dev,
 	if (BOND_MODE(bond) == BOND_MODE_8023AD)
 		bond_3ad_unbind_slave(slave);
 
-	if (bond_mode_uses_xmit_hash(bond))
+	if (bond_mode_can_use_xmit_hash(bond))
 		bond_update_slave_arr(bond, slave);
 
 	netdev_info(bond_dev, "Releasing %s interface %s\n",
@@ -3102,7 +3102,7 @@ static int bond_slave_netdev_event(unsigned long event,
 		 * events. If these (miimon/arpmon) parameters are configured
 		 * then array gets refreshed twice and that should be fine!
 		 */
-		if (bond_mode_uses_xmit_hash(bond))
+		if (bond_mode_can_use_xmit_hash(bond))
 			bond_update_slave_arr(bond, NULL);
 		break;
 	case NETDEV_CHANGEMTU:
@@ -3322,7 +3322,7 @@ static int bond_open(struct net_device *bond_dev)
 		 */
 		if (bond_alb_initialize(bond, (BOND_MODE(bond) == BOND_MODE_ALB)))
 			return -ENOMEM;
-		if (bond->params.tlb_dynamic_lb)
+		if (bond->params.tlb_dynamic_lb || BOND_MODE(bond) == BOND_MODE_ALB)
 			queue_delayed_work(bond->wq, &bond->alb_work, 0);
 	}
 
@@ -3341,7 +3341,7 @@ static int bond_open(struct net_device *bond_dev)
 		bond_3ad_initiate_agg_selection(bond, 1);
 	}
 
-	if (bond_mode_uses_xmit_hash(bond))
+	if (bond_mode_can_use_xmit_hash(bond))
 		bond_update_slave_arr(bond, NULL);
 
 	return 0;
@@ -3892,7 +3892,7 @@ static void bond_slave_arr_handler(struct work_struct *work)
  * to determine the slave interface -
  * (a) BOND_MODE_8023AD
  * (b) BOND_MODE_XOR
- * (c) BOND_MODE_TLB && tlb_dynamic_lb == 0
+ * (c) (BOND_MODE_TLB || BOND_MODE_ALB) && tlb_dynamic_lb == 0
  *
  * The caller is expected to hold RTNL only and NO other lock!
  */
@@ -3945,6 +3945,11 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 			continue;
 		if (skipslave == slave)
 			continue;
+
+		netdev_dbg(bond->dev,
+			   "Adding slave dev %s to tx hash array[%d]\n",
+			   slave->dev->name, new_arr->count);
+
 		new_arr->arr[new_arr->count++] = slave;
 	}
 
@@ -4320,9 +4325,9 @@ static int bond_check_params(struct bond_params *params)
 	}
 
 	if (xmit_hash_policy) {
-		if ((bond_mode != BOND_MODE_XOR) &&
-		    (bond_mode != BOND_MODE_8023AD) &&
-		    (bond_mode != BOND_MODE_TLB)) {
+		if (bond_mode == BOND_MODE_ROUNDROBIN ||
+		    bond_mode == BOND_MODE_ACTIVEBACKUP ||
+		    bond_mode == BOND_MODE_BROADCAST) {
 			pr_info("xmit_hash_policy param is irrelevant in mode %s\n",
 				bond_mode_name(bond_mode));
 		} else {
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 58c705f24f96..8a945c9341d6 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -395,7 +395,7 @@ static const struct bond_option bond_opts[BOND_OPT_LAST] = {
 		.id = BOND_OPT_TLB_DYNAMIC_LB,
 		.name = "tlb_dynamic_lb",
 		.desc = "Enable dynamic flow shuffling",
-		.unsuppmodes = BOND_MODE_ALL_EX(BIT(BOND_MODE_TLB)),
+		.unsuppmodes = BOND_MODE_ALL_EX(BIT(BOND_MODE_TLB) | BIT(BOND_MODE_ALB)),
 		.values = bond_tlb_dynamic_lb_tbl,
 		.flags = BOND_OPTFLAG_IFDOWN,
 		.set = bond_option_tlb_dynamic_lb_set,
diff --git a/include/net/bonding.h b/include/net/bonding.h
index b52235158836..808f1d167349 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -285,8 +285,15 @@ static inline bool bond_needs_speed_duplex(const struct bonding *bond)
 
 static inline bool bond_is_nondyn_tlb(const struct bonding *bond)
 {
-	return (BOND_MODE(bond) == BOND_MODE_TLB)  &&
-	       (bond->params.tlb_dynamic_lb == 0);
+	return (bond_is_lb(bond) && bond->params.tlb_dynamic_lb == 0);
+}
+
+static inline bool bond_mode_can_use_xmit_hash(const struct bonding *bond)
+{
+	return (BOND_MODE(bond) == BOND_MODE_8023AD ||
+		BOND_MODE(bond) == BOND_MODE_XOR ||
+		BOND_MODE(bond) == BOND_MODE_TLB ||
+		BOND_MODE(bond) == BOND_MODE_ALB);
 }
 
 static inline bool bond_mode_uses_xmit_hash(const struct bonding *bond)
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH net-next v2 4/4] bonding: allow carrier and link status to determine link state
  2018-05-14 18:48 [PATCH net-next v2 0/4] bonding: performance and reliability Debabrata Banerjee
                   ` (2 preceding siblings ...)
  2018-05-14 18:48 ` [PATCH net-next v2 3/4] bonding: allow use of tx hashing in balance-alb Debabrata Banerjee
@ 2018-05-14 18:48 ` Debabrata Banerjee
  2018-05-16 16:11   ` Jay Vosburgh
  2018-05-16 16:16 ` [PATCH net-next v2 0/4] bonding: performance and reliability David Miller
  4 siblings, 1 reply; 12+ messages in thread
From: Debabrata Banerjee @ 2018-05-14 18:48 UTC (permalink / raw)
  To: David S . Miller, netdev, Jay Vosburgh
  Cc: Veaceslav Falico, Andy Gospodarek, dbanerje

In a mixed environment it may be difficult to tell if your hardware
support carrier, if it does not it can always report true. With a new
use_carrier option of 2, we can check both carrier and link status
sequentially, instead of one or the other

Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
---
 Documentation/networking/bonding.txt |  4 ++--
 drivers/net/bonding/bond_main.c      | 12 ++++++++----
 drivers/net/bonding/bond_options.c   |  7 ++++---
 3 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 9ba04c0bab8d..f063730e7e73 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -828,8 +828,8 @@ use_carrier
 	MII / ETHTOOL ioctl method to determine the link state.
 
 	A value of 1 enables the use of netif_carrier_ok(), a value of
-	0 will use the deprecated MII / ETHTOOL ioctls.  The default
-	value is 1.
+	0 will use the deprecated MII / ETHTOOL ioctls. A value of 2
+	will check both.  The default value is 1.
 
 xmit_hash_policy
 
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index f7f8a49cb32b..7e9652c4b35c 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -132,7 +132,7 @@ MODULE_PARM_DESC(downdelay, "Delay before considering link down, "
 			    "in milliseconds");
 module_param(use_carrier, int, 0);
 MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; "
-			      "0 for off, 1 for on (default)");
+			      "0 for off, 1 for on (default), 2 for carrier then legacy checks");
 module_param(mode, charp, 0);
 MODULE_PARM_DESC(mode, "Mode of operation; 0 for balance-rr, "
 		       "1 for active-backup, 2 for balance-xor, "
@@ -434,12 +434,16 @@ static int bond_check_dev_link(struct bonding *bond,
 	int (*ioctl)(struct net_device *, struct ifreq *, int);
 	struct ifreq ifr;
 	struct mii_ioctl_data *mii;
+	bool carrier = true;
 
 	if (!reporting && !netif_running(slave_dev))
 		return 0;
 
 	if (bond->params.use_carrier)
-		return netif_carrier_ok(slave_dev) ? BMSR_LSTATUS : 0;
+		carrier = netif_carrier_ok(slave_dev) ? BMSR_LSTATUS : 0;
+
+	if (!carrier)
+		return carrier;
 
 	/* Try to get link status using Ethtool first. */
 	if (slave_dev->ethtool_ops->get_link)
@@ -4399,8 +4403,8 @@ static int bond_check_params(struct bond_params *params)
 		downdelay = 0;
 	}
 
-	if ((use_carrier != 0) && (use_carrier != 1)) {
-		pr_warn("Warning: use_carrier module parameter (%d), not of valid value (0/1), so it was set to 1\n",
+	if (use_carrier < 0 || use_carrier > 2) {
+		pr_warn("Warning: use_carrier module parameter (%d), not of valid value (0-2), so it was set to 1\n",
 			use_carrier);
 		use_carrier = 1;
 	}
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 8a945c9341d6..dba6cef05134 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -164,9 +164,10 @@ static const struct bond_opt_value bond_primary_reselect_tbl[] = {
 };
 
 static const struct bond_opt_value bond_use_carrier_tbl[] = {
-	{ "off", 0,  0},
-	{ "on",  1,  BOND_VALFLAG_DEFAULT},
-	{ NULL,  -1, 0}
+	{ "off",  0,  0},
+	{ "on",   1,  BOND_VALFLAG_DEFAULT},
+	{ "both", 2,  0},
+	{ NULL,  -1,  0}
 };
 
 static const struct bond_opt_value bond_all_slaves_active_tbl[] = {
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v2 4/4] bonding: allow carrier and link status to determine link state
  2018-05-14 18:48 ` [PATCH net-next v2 4/4] bonding: allow carrier and link status to determine link state Debabrata Banerjee
@ 2018-05-16 16:11   ` Jay Vosburgh
  2018-05-16 16:28     ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Jay Vosburgh @ 2018-05-16 16:11 UTC (permalink / raw)
  To: Debabrata Banerjee
  Cc: David S . Miller, netdev, Veaceslav Falico, Andy Gospodarek

Debabrata Banerjee <dbanerje@akamai.com> wrote:

>In a mixed environment it may be difficult to tell if your hardware
>support carrier, if it does not it can always report true. With a new
>use_carrier option of 2, we can check both carrier and link status
>sequentially, instead of one or the other

	Your reply in the prior discussion suggests to me that there is
a bug somewhere else (perhaps in bonding, perhaps in the network
driver), and this is just papering over it.  As I said, I don't believe
that an additional hack to bonding is the appropriate way to resolve
this.

	-J

>Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
>---
> Documentation/networking/bonding.txt |  4 ++--
> drivers/net/bonding/bond_main.c      | 12 ++++++++----
> drivers/net/bonding/bond_options.c   |  7 ++++---
> 3 files changed, 14 insertions(+), 9 deletions(-)
>
>diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
>index 9ba04c0bab8d..f063730e7e73 100644
>--- a/Documentation/networking/bonding.txt
>+++ b/Documentation/networking/bonding.txt
>@@ -828,8 +828,8 @@ use_carrier
> 	MII / ETHTOOL ioctl method to determine the link state.
> 
> 	A value of 1 enables the use of netif_carrier_ok(), a value of
>-	0 will use the deprecated MII / ETHTOOL ioctls.  The default
>-	value is 1.
>+	0 will use the deprecated MII / ETHTOOL ioctls. A value of 2
>+	will check both.  The default value is 1.
> 
> xmit_hash_policy
> 
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index f7f8a49cb32b..7e9652c4b35c 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -132,7 +132,7 @@ MODULE_PARM_DESC(downdelay, "Delay before considering link down, "
> 			    "in milliseconds");
> module_param(use_carrier, int, 0);
> MODULE_PARM_DESC(use_carrier, "Use netif_carrier_ok (vs MII ioctls) in miimon; "
>-			      "0 for off, 1 for on (default)");
>+			      "0 for off, 1 for on (default), 2 for carrier then legacy checks");
> module_param(mode, charp, 0);
> MODULE_PARM_DESC(mode, "Mode of operation; 0 for balance-rr, "
> 		       "1 for active-backup, 2 for balance-xor, "
>@@ -434,12 +434,16 @@ static int bond_check_dev_link(struct bonding *bond,
> 	int (*ioctl)(struct net_device *, struct ifreq *, int);
> 	struct ifreq ifr;
> 	struct mii_ioctl_data *mii;
>+	bool carrier = true;
> 
> 	if (!reporting && !netif_running(slave_dev))
> 		return 0;
> 
> 	if (bond->params.use_carrier)
>-		return netif_carrier_ok(slave_dev) ? BMSR_LSTATUS : 0;
>+		carrier = netif_carrier_ok(slave_dev) ? BMSR_LSTATUS : 0;
>+
>+	if (!carrier)
>+		return carrier;
> 
> 	/* Try to get link status using Ethtool first. */
> 	if (slave_dev->ethtool_ops->get_link)
>@@ -4399,8 +4403,8 @@ static int bond_check_params(struct bond_params *params)
> 		downdelay = 0;
> 	}
> 
>-	if ((use_carrier != 0) && (use_carrier != 1)) {
>-		pr_warn("Warning: use_carrier module parameter (%d), not of valid value (0/1), so it was set to 1\n",
>+	if (use_carrier < 0 || use_carrier > 2) {
>+		pr_warn("Warning: use_carrier module parameter (%d), not of valid value (0-2), so it was set to 1\n",
> 			use_carrier);
> 		use_carrier = 1;
> 	}
>diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
>index 8a945c9341d6..dba6cef05134 100644
>--- a/drivers/net/bonding/bond_options.c
>+++ b/drivers/net/bonding/bond_options.c
>@@ -164,9 +164,10 @@ static const struct bond_opt_value bond_primary_reselect_tbl[] = {
> };
> 
> static const struct bond_opt_value bond_use_carrier_tbl[] = {
>-	{ "off", 0,  0},
>-	{ "on",  1,  BOND_VALFLAG_DEFAULT},
>-	{ NULL,  -1, 0}
>+	{ "off",  0,  0},
>+	{ "on",   1,  BOND_VALFLAG_DEFAULT},
>+	{ "both", 2,  0},
>+	{ NULL,  -1,  0}
> };
> 
> static const struct bond_opt_value bond_all_slaves_active_tbl[] = {
>-- 
>2.17.0

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v2 0/4] bonding: performance and reliability
  2018-05-14 18:48 [PATCH net-next v2 0/4] bonding: performance and reliability Debabrata Banerjee
                   ` (3 preceding siblings ...)
  2018-05-14 18:48 ` [PATCH net-next v2 4/4] bonding: allow carrier and link status to determine link state Debabrata Banerjee
@ 2018-05-16 16:16 ` David Miller
  4 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2018-05-16 16:16 UTC (permalink / raw)
  To: dbanerje; +Cc: netdev, j.vosburgh, vfalico, andy

From: Debabrata Banerjee <dbanerje@akamai.com>
Date: Mon, 14 May 2018 14:48:06 -0400

> Series of fixes to how rlb updates are handled, code cleanup, allowing
> higher performance tx hashing in balance-alb mode, and reliability of
> link up/down monitoring.
> 
> v2: refactor bond_is_nondyn_tlb with inline fn, update log comment to
> point out that multicast addresses will not get rlb updates.

Series applied, thank you.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v2 4/4] bonding: allow carrier and link status to determine link state
  2018-05-16 16:11   ` Jay Vosburgh
@ 2018-05-16 16:28     ` David Miller
  2018-05-16 16:54       ` Banerjee, Debabrata
  0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2018-05-16 16:28 UTC (permalink / raw)
  To: jay.vosburgh; +Cc: dbanerje, netdev, vfalico, andy

From: Jay Vosburgh <jay.vosburgh@canonical.com>
Date: Wed, 16 May 2018 18:11:08 +0200

> Debabrata Banerjee <dbanerje@akamai.com> wrote:
> 
>>In a mixed environment it may be difficult to tell if your hardware
>>support carrier, if it does not it can always report true. With a new
>>use_carrier option of 2, we can check both carrier and link status
>>sequentially, instead of one or the other
> 
> 	Your reply in the prior discussion suggests to me that there is
> a bug somewhere else (perhaps in bonding, perhaps in the network
> driver), and this is just papering over it.  As I said, I don't believe
> that an additional hack to bonding is the appropriate way to resolve
> this.

Sorry this crossed with my review and application of this series, my bad.

Debabrata please give a specific example of a case where the new "both"
mode actually is needed.

Thank you.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH net-next v2 4/4] bonding: allow carrier and link status to determine link state
  2018-05-16 16:28     ` David Miller
@ 2018-05-16 16:54       ` Banerjee, Debabrata
  2018-05-16 17:10         ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Banerjee, Debabrata @ 2018-05-16 16:54 UTC (permalink / raw)
  To: 'David Miller', jay.vosburgh; +Cc: netdev, vfalico, andy

> From: David Miller [mailto:davem@davemloft.net]
> From: Jay Vosburgh <jay.vosburgh@canonical.com>
> Date: Wed, 16 May 2018 18:11:08 +0200
> 
> > Debabrata Banerjee <dbanerje@akamai.com> wrote:
> >
> >>In a mixed environment it may be difficult to tell if your hardware
> >>support carrier, if it does not it can always report true. With a new
> >>use_carrier option of 2, we can check both carrier and link status
> >>sequentially, instead of one or the other
> >
> > 	Your reply in the prior discussion suggests to me that there is a bug
> > somewhere else (perhaps in bonding, perhaps in the network driver),
> > and this is just papering over it.  As I said, I don't believe that an
> > additional hack to bonding is the appropriate way to resolve this.
> 
> Sorry this crossed with my review and application of this series, my bad.
> 
> Debabrata please give a specific example of a case where the new "both"
> mode actually is needed.
> 
> Thank you.

See output of /proc/net/bonding/bond0 below, same content as the prior email. bnx2x driver, on ith1: "MII Status: up" is directly from netif_carrier_ok(bond->dev) , but speed and duplex are unknown, which come from the same call as would be used from bond_mii_monitor(), __ethtool_get_link_ksettings(slave_dev, &ecmd). Yes it's possible this is because of bugs in the drivers themselves, but without being able to readily reproduce the state below we can't debug it, nor do we know which combination of hardware and drivers are reliable at any given point in time. We can say though that if this had been set to "both", ith1 would have been correctly removed from the bond. That said if this is still controversial I can resubmit the series without this patch.

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: adaptive load balancing
Primary Slave: None
Currently Active Slave: ith1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: ith0
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:e0:81:e5:0e:16
Slave queue ID: 0

Slave Interface: ith1
MII Status: up
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: 00:e0:81:e5:0e:18
Slave queue ID: 0

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v2 4/4] bonding: allow carrier and link status to determine link state
  2018-05-16 16:54       ` Banerjee, Debabrata
@ 2018-05-16 17:10         ` David Miller
  2018-05-16 17:14           ` Banerjee, Debabrata
  0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2018-05-16 17:10 UTC (permalink / raw)
  To: dbanerje; +Cc: jay.vosburgh, netdev, vfalico, andy

From: "Banerjee, Debabrata" <dbanerje@akamai.com>
Date: Wed, 16 May 2018 16:54:40 +0000

> See output of /proc/net/bonding/bond0 below, same content as the
> prior email. bnx2x driver, on ith1: "MII Status: up" is directly
> from netif_carrier_ok(bond->dev) , but speed and duplex are unknown,
> which come from the same call as would be used from
> bond_mii_monitor(), __ethtool_get_link_ksettings(slave_dev,
> &ecmd). Yes it's possible this is because of bugs in the drivers
> themselves, but without being able to readily reproduce the state
> below we can't debug it, nor do we know which combination of
> hardware and drivers are reliable at any given point in time. We can
> say though that if this had been set to "both", ith1 would have been
> correctly removed from the bond. That said if this is still
> controversial I can resubmit the series without this patch.

Looking at the bnx2x driver, it's management of link state is very
convoluted.

The link parameters and the "link_up" state is maintained separately
from the values that are snapshot when the carrier is enabled.

I don't think we should reward drivers for behaving this way.

If the carrier is up, you should be able to provide the link speed and
duplex values immediately not some indeterminate amount of time later.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH net-next v2 4/4] bonding: allow carrier and link status to determine link state
  2018-05-16 17:10         ` David Miller
@ 2018-05-16 17:14           ` Banerjee, Debabrata
  2018-05-16 17:20             ` David Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Banerjee, Debabrata @ 2018-05-16 17:14 UTC (permalink / raw)
  To: 'David Miller'; +Cc: jay.vosburgh, netdev, vfalico, andy

> From: David Miller [mailto:davem@davemloft.net]
> 
> Looking at the bnx2x driver, it's management of link state is very convoluted.
> 
> The link parameters and the "link_up" state is maintained separately from
> the values that are snapshot when the carrier is enabled.
...
> If the carrier is up, you should be able to provide the link speed and duplex
> values immediately not some indeterminate amount of time later.

This was steady state, the machine was stuck like this for hours/days.


> I don't think we should reward drivers for behaving this way.

I'll send v3 without this patch.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH net-next v2 4/4] bonding: allow carrier and link status to determine link state
  2018-05-16 17:14           ` Banerjee, Debabrata
@ 2018-05-16 17:20             ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2018-05-16 17:20 UTC (permalink / raw)
  To: dbanerje; +Cc: jay.vosburgh, netdev, vfalico, andy

From: "Banerjee, Debabrata" <dbanerje@akamai.com>
Date: Wed, 16 May 2018 17:14:07 +0000

> I'll send v3 without this patch.

I already applied your series so you need to send me a revert of this
patch.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-05-16 17:20 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-14 18:48 [PATCH net-next v2 0/4] bonding: performance and reliability Debabrata Banerjee
2018-05-14 18:48 ` [PATCH net-next v2 1/4] bonding: don't queue up extraneous rlb updates Debabrata Banerjee
2018-05-14 18:48 ` [PATCH net-next v2 2/4] bonding: use common mac addr checks Debabrata Banerjee
2018-05-14 18:48 ` [PATCH net-next v2 3/4] bonding: allow use of tx hashing in balance-alb Debabrata Banerjee
2018-05-14 18:48 ` [PATCH net-next v2 4/4] bonding: allow carrier and link status to determine link state Debabrata Banerjee
2018-05-16 16:11   ` Jay Vosburgh
2018-05-16 16:28     ` David Miller
2018-05-16 16:54       ` Banerjee, Debabrata
2018-05-16 17:10         ` David Miller
2018-05-16 17:14           ` Banerjee, Debabrata
2018-05-16 17:20             ` David Miller
2018-05-16 16:16 ` [PATCH net-next v2 0/4] bonding: performance and reliability David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.