All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V5 0/11] net/bonding: ADD IPoIB support for the bonding driver
@ 2007-09-20 13:33 Moni Shoua
  2007-09-20 13:39 ` [ofa-general] [PATCH V5 1/11] net/core: add a netdev notification for slave detach Moni Shoua
                   ` (11 more replies)
  0 siblings, 12 replies; 17+ messages in thread
From: Moni Shoua @ 2007-09-20 13:33 UTC (permalink / raw)
  To: Roland Dreier, Jay Vosburgh; +Cc: OpenFabrics General, netdev

This patch series is the fifth version (see below link to V4) of the 
suggested changes to the bonding driver so it would be able to support 
non ARPHRD_ETHER netdevices for its High-Availability (active-backup) mode. 

Patches 1-10 were originally submitted in V4 and patch 11 is an addition by Jay.

Jay,
The bonding patches you acked remain unchanged while I guess I sitll need
to get an official ack by Roland for the IPoIB patches.
Is it OK with you to push the entire series to the networking tree?
Roland has already agreed to do so.


Major changes from the previous version:
----------------------------------------
1. Style changes
2. IPoIB - notify slave detach on vlan delete
3. Add function to net/core for slave detach instead of having it only in
   ib/ipoib
4. IPoIB - handle ib device and bonding device the same way in neigh_cleanup
   function

Links to earlier discussion:
----------------------------

1. A discussion in netdev about bonding support for IPoIB.
http://lists.openwall.net/netdev/2006/11/30/46

2. V4 series
http://lists.openfabrics.org/pipermail/general/2007-August/039825.html


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ofa-general] [PATCH V5 1/11] net/core: add a netdev notification for slave detach
  2007-09-20 13:33 [PATCH V5 0/11] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
@ 2007-09-20 13:39 ` Moni Shoua
  2007-09-20 13:40 ` [PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister Moni Shoua
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-09-20 13:39 UTC (permalink / raw)
  Cc: netdev, Roland Dreier, Jay Vosburgh, OpenFabrics General

A slave of a bonding master that wants to send a notification before
going down should call netdev_slave_detach(). The handling of this notification
will be done outside the context of unregister_netdevice() which is sometimes
necessary, as with IPoIB slave for example.

Signed-off-by: Moni Shoua <monis at voltaire.com>
---
 include/linux/if.h |    1 +
 net/core/dev.c     |   20 ++++++++++++++++++++
 2 files changed, 21 insertions(+)

Index: net-2.6/net/core/dev.c
===================================================================
--- net-2.6.orig/net/core/dev.c	2007-09-20 08:04:47.164051688 +0200
+++ net-2.6/net/core/dev.c	2007-09-20 09:20:21.493060579 +0200
@@ -2588,6 +2588,25 @@ int netdev_set_master(struct net_device 
 	return 0;
 }
 
+/**
+ *	netdev_slave_detach	-	notify that slave is about to detach from master
+ *	@slave: slave device
+ *
+ *	Raise a flag that slave is about to detach from master
+ *	and notify the netdev  chain.
+ *	The caller must hold the rtnl_mutex.
+ */
+
+int netdev_slave_detach(struct net_device *slave)
+{
+	int ret = 0;
+	if (slave->flags & IFF_SLAVE) {
+		slave->priv_flags |= IFF_SLAVE_DETACH;
+		ret = call_netdevice_notifiers(NETDEV_CHANGE, slave);
+	}
+	return ret;
+}
+
 static void __dev_set_promiscuity(struct net_device *dev, int inc)
 {
 	unsigned short old_flags = dev->flags;
@@ -4120,6 +4139,7 @@ EXPORT_SYMBOL(dev_set_mac_address);
 EXPORT_SYMBOL(free_netdev);
 EXPORT_SYMBOL(netdev_boot_setup_check);
 EXPORT_SYMBOL(netdev_set_master);
+EXPORT_SYMBOL(netdev_slave_detach);
 EXPORT_SYMBOL(netdev_state_change);
 EXPORT_SYMBOL(netif_receive_skb);
 EXPORT_SYMBOL(netif_rx);
Index: net-2.6/include/linux/if.h
===================================================================
--- net-2.6.orig/include/linux/if.h	2007-09-20 08:04:47.164051688 +0200
+++ net-2.6/include/linux/if.h	2007-09-20 08:15:29.577729301 +0200
@@ -61,6 +61,7 @@
 #define IFF_MASTER_ALB	0x10		/* bonding master, balance-alb.	*/
 #define IFF_BONDING	0x20		/* bonding master or slave	*/
 #define IFF_SLAVE_NEEDARP 0x40		/* need ARPs for validation	*/
+#define IFF_SLAVE_DETACH 0x80		/* slave is about to unregister */
 
 #define IF_GET_IFACE	0x0001		/* for querying only */
 #define IF_GET_PROTO	0x0002

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister
  2007-09-20 13:33 [PATCH V5 0/11] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
  2007-09-20 13:39 ` [ofa-general] [PATCH V5 1/11] net/core: add a netdev notification for slave detach Moni Shoua
@ 2007-09-20 13:40 ` Moni Shoua
  2007-09-20 16:20   ` [ofa-general] " Roland Dreier
  2007-09-20 13:41 ` [ofa-general] [PATCH V5 3/11] IB/ipoib: Bound the net device to the ipoib_neigh structue Moni Shoua
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 17+ messages in thread
From: Moni Shoua @ 2007-09-20 13:40 UTC (permalink / raw)
  Cc: Roland Dreier, Jay Vosburgh, OpenFabrics General, netdev

When the bonding device enslaves IPoIB devices it takes pointers to
functions in the ib_ipoib module. This is fine as long as the ib_ipoib
nodule remains loaded while the references to its functions exist.
So, to help bonding do a cleanup on time, when the IPoIB net device is a 
slave of a bonding master, let the master know that the IPoIB device is
about to unregister (but before calling unregister).

Signed-off-by: Moni Shoua <monis at voltaire.com>
---
 drivers/infiniband/ulp/ipoib/ipoib.h      |    7 +++++++
 drivers/infiniband/ulp/ipoib/ipoib_main.c |    3 +++
 drivers/infiniband/ulp/ipoib/ipoib_vlan.c |    1 +
 3 files changed, 11 insertions(+)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===================================================================
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-09-20 08:35:34.000000000 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-09-20 14:20:16.495147879 +0200
@@ -48,6 +48,7 @@
 #include <linux/in.h>
 
 #include <net/dst.h>
+#include <linux/netdevice.h>
 
 MODULE_AUTHOR("Roland Dreier");
 MODULE_DESCRIPTION("IP-over-InfiniBand net driver");
@@ -921,6 +922,7 @@ void ipoib_dev_cleanup(struct net_device
 
 	/* Delete any child interfaces first */
 	list_for_each_entry_safe(cpriv, tcpriv, &priv->child_intfs, list) {
+		ipoib_slave_detach(cpriv->dev);
 		unregister_netdev(cpriv->dev);
 		ipoib_dev_cleanup(cpriv->dev);
 		free_netdev(cpriv->dev);
@@ -1208,6 +1210,7 @@ static void ipoib_remove_one(struct ib_d
 		ib_unregister_event_handler(&priv->event_handler);
 		flush_scheduled_work();
 
+		ipoib_slave_detach(priv->dev);
 		unregister_netdev(priv->dev);
 		ipoib_dev_cleanup(priv->dev);
 		free_netdev(priv->dev);
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
===================================================================
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_vlan.c	2007-09-20 09:26:11.000000000 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_vlan.c	2007-09-20 09:27:20.182709679 +0200
@@ -157,6 +157,7 @@ int ipoib_vlan_delete(struct net_device 
 	mutex_lock(&ppriv->vlan_mutex);
 	list_for_each_entry_safe(priv, tpriv, &ppriv->child_intfs, list) {
 		if (priv->pkey == pkey) {
+			ipoib_slave_detach(priv->dev);
 			unregister_netdev(priv->dev);
 			ipoib_dev_cleanup(priv->dev);
 			list_del(&priv->list);
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h
===================================================================
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib.h	2007-09-20 12:18:56.000000000 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h	2007-09-20 14:21:47.385972207 +0200
@@ -570,6 +570,13 @@ static inline void ipoib_cm_handle_rx_wc
 
 #endif
 
+static inline void ipoib_slave_detach(struct net_device *dev)
+{
+	rtnl_lock();
+	netdev_slave_detach(dev);
+	rtnl_unlock();
+}
+
 #ifdef CONFIG_INFINIBAND_IPOIB_DEBUG
 void ipoib_create_debug_files(struct net_device *dev);
 void ipoib_delete_debug_files(struct net_device *dev);


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ofa-general] [PATCH V5 3/11] IB/ipoib: Bound the net device to the ipoib_neigh structue
  2007-09-20 13:33 [PATCH V5 0/11] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
  2007-09-20 13:39 ` [ofa-general] [PATCH V5 1/11] net/core: add a netdev notification for slave detach Moni Shoua
  2007-09-20 13:40 ` [PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister Moni Shoua
@ 2007-09-20 13:41 ` Moni Shoua
  2007-09-20 13:42 ` [PATCH V5 4/11] IB/ipoib: Verify address handle validity on send Moni Shoua
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-09-20 13:41 UTC (permalink / raw)
  Cc: netdev, Roland Dreier, Jay Vosburgh, OpenFabrics General

IPoIB uses a two layer neighboring scheme, such that for each struct neighbour
whose device is an ipoib one, there is a struct ipoib_neigh buddy which is
created on demand at the tx flow by an ipoib_neigh_alloc(skb->dst->neighbour)
call.

When using the bonding driver, neighbours are created by the net stack on behalf
of the bonding (master) device. On the tx flow the bonding code gets an skb such
that skb->dev points to the master device, it changes this skb to point on the
slave device and calls the slave hard_start_xmit function.

Under this scheme, ipoib_neigh_destructor assumption that for each struct
neighbour it gets, n->dev is an ipoib device and hence netdev_priv(n->dev)
can be casted to struct ipoib_dev_priv is buggy.

To fix it, this patch adds a dev field to struct ipoib_neigh which is used
instead of the struct neighbour dev one, when n->dev->flags has the
IFF_MASTER bit set.

Signed-off-by: Moni Shoua <monis at voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com>
---
 drivers/infiniband/ulp/ipoib/ipoib.h           |    4 +++-
 drivers/infiniband/ulp/ipoib/ipoib_main.c      |   24 +++++++++++++++---------
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |    3 ++-
 3 files changed, 20 insertions(+), 11 deletions(-)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h
===================================================================
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib.h	2007-09-18 17:08:53.245849217 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h	2007-09-18 17:09:26.534874404 +0200
@@ -328,6 +328,7 @@ struct ipoib_neigh {
 	struct sk_buff_head queue;
 
 	struct neighbour   *neighbour;
+	struct net_device *dev;
 
 	struct list_head    list;
 };
@@ -344,7 +345,8 @@ static inline struct ipoib_neigh **to_ip
 				     INFINIBAND_ALEN, sizeof(void *));
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh);
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh,
+				      struct net_device *dev);
 void ipoib_neigh_free(struct net_device *dev, struct ipoib_neigh *neigh);
 
 extern struct workqueue_struct *ipoib_workqueue;
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===================================================================
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-09-18 17:08:53.245849217 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-09-18 17:23:54.725744661 +0200
@@ -511,7 +511,7 @@ static void neigh_add_path(struct sk_buf
 	struct ipoib_path *path;
 	struct ipoib_neigh *neigh;
 
-	neigh = ipoib_neigh_alloc(skb->dst->neighbour);
+	neigh = ipoib_neigh_alloc(skb->dst->neighbour, skb->dev);
 	if (!neigh) {
 		++priv->stats.tx_dropped;
 		dev_kfree_skb_any(skb);
@@ -830,6 +830,13 @@ static void ipoib_neigh_cleanup(struct n
 	unsigned long flags;
 	struct ipoib_ah *ah = NULL;
 
+	neigh = *to_ipoib_neigh(n);
+	if (neigh) {
+		priv = netdev_priv(neigh->dev);
+		ipoib_dbg(priv, "neigh_destructor for bonding device: %s\n",
+			  n->dev->name);
+	} else
+		return;
 	ipoib_dbg(priv,
 		  "neigh_cleanup for %06x " IPOIB_GID_FMT "\n",
 		  IPOIB_QPN(n->ha),
@@ -837,13 +844,10 @@ static void ipoib_neigh_cleanup(struct n
 
 	spin_lock_irqsave(&priv->lock, flags);
 
-	neigh = *to_ipoib_neigh(n);
-	if (neigh) {
-		if (neigh->ah)
-			ah = neigh->ah;
-		list_del(&neigh->list);
-		ipoib_neigh_free(n->dev, neigh);
-	}
+	if (neigh->ah)
+		ah = neigh->ah;
+	list_del(&neigh->list);
+	ipoib_neigh_free(n->dev, neigh);
 
 	spin_unlock_irqrestore(&priv->lock, flags);
 
@@ -851,7 +855,8 @@ static void ipoib_neigh_cleanup(struct n
 		ipoib_put_ah(ah);
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour)
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour,
+				      struct net_device *dev)
 {
 	struct ipoib_neigh *neigh;
 
@@ -860,6 +865,7 @@ struct ipoib_neigh *ipoib_neigh_alloc(st
 		return NULL;
 
 	neigh->neighbour = neighbour;
+	neigh->dev = dev;
 	*to_ipoib_neigh(neighbour) = neigh;
 	skb_queue_head_init(&neigh->queue);
 	ipoib_cm_set(neigh, NULL);
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
===================================================================
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_multicast.c	2007-09-18 17:08:53.245849217 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_multicast.c	2007-09-18 17:09:26.536874045 +0200
@@ -727,7 +727,8 @@ out:
 		if (skb->dst            &&
 		    skb->dst->neighbour &&
 		    !*to_ipoib_neigh(skb->dst->neighbour)) {
-			struct ipoib_neigh *neigh = ipoib_neigh_alloc(skb->dst->neighbour);
+			struct ipoib_neigh *neigh = ipoib_neigh_alloc(skb->dst->neighbour,
+									skb->dev);
 
 			if (neigh) {
 				kref_get(&mcast->ah->ref);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V5 4/11] IB/ipoib: Verify address handle validity on send
  2007-09-20 13:33 [PATCH V5 0/11] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (2 preceding siblings ...)
  2007-09-20 13:41 ` [ofa-general] [PATCH V5 3/11] IB/ipoib: Bound the net device to the ipoib_neigh structue Moni Shoua
@ 2007-09-20 13:42 ` Moni Shoua
  2007-09-20 13:43 ` [ofa-general] [PATCH V5 5/11] net/bonding: Enable bonding to enslave non ARPHRD_ETHER Moni Shoua
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-09-20 13:42 UTC (permalink / raw)
  To: Roland Dreier, Jay Vosburgh; +Cc: OpenFabrics General, netdev

When the bonding device senses a carrier loss of its active slave it replaces
that slave with a new one. In between the times when the carrier of an IPoIB
device goes down and ipoib_neigh is destroyed, it is possible that the
bonding driver will send a packet on a new slave that uses an old ipoib_neigh.
This patch detects and prevents this from happenning.

Signed-off-by: Moni Shoua <monis at voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===================================================================
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-09-18 17:09:26.535874225 +0200
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-09-18 17:10:22.375853147 +0200
@@ -686,9 +686,10 @@ static int ipoib_start_xmit(struct sk_bu
 				goto out;
 			}
 		} else if (neigh->ah) {
-			if (unlikely(memcmp(&neigh->dgid.raw,
+			if (unlikely((memcmp(&neigh->dgid.raw,
 					    skb->dst->neighbour->ha + 4,
-					    sizeof(union ib_gid)))) {
+					    sizeof(union ib_gid))) ||
+					 (neigh->dev != dev))) {
 				spin_lock(&priv->lock);
 				/*
 				 * It's safe to call ipoib_put_ah() inside


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ofa-general] [PATCH V5 5/11] net/bonding: Enable bonding to enslave non ARPHRD_ETHER
  2007-09-20 13:33 [PATCH V5 0/11] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (3 preceding siblings ...)
  2007-09-20 13:42 ` [PATCH V5 4/11] IB/ipoib: Verify address handle validity on send Moni Shoua
@ 2007-09-20 13:43 ` Moni Shoua
  2007-09-20 13:45 ` [PATCH V5 6/11] net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address() Moni Shoua
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-09-20 13:43 UTC (permalink / raw)
  To: Roland Dreier, Jay Vosburgh; +Cc: netdev, OpenFabrics General

This patch changes some of the bond netdevice attributes and functions
to be that of the active slave for the case of the enslaved device not being
of ARPHRD_ETHER type. Basically it overrides those setting done by ether_setup(),
which are netdevice **type** dependent and hence might be not appropriate for
devices of other types. It also enforces mutual exclusion on bonding slaves
from dissimilar ether types, as was concluded over the v1 discussion.

IPoIB (see Documentation/infiniband/ipoib.txt) MAC address is made of a 3 bytes
IB QP (Queue Pair) number and 16 bytes IB port GID (Global ID) of the port this
IPoIB device is bounded to. The QP is a resource created by the IB HW and the
GID is an identifier burned into the HCA (i have omitted here some details which
are not important for the bonding RFC).

Signed-off-by: Moni Shoua <monis at voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com>
---
 drivers/net/bonding/bond_main.c |   39 +++++++++++++++++++++++++++++++++++++++
 1 files changed, 39 insertions(+)

Index: net-2.6/drivers/net/bonding/bond_main.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_main.c	2007-08-15 10:08:59.000000000 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c	2007-08-15 10:54:13.424688411 +0300
@@ -1237,6 +1237,26 @@ static int bond_compute_features(struct 
 	return 0;
 }
 
+
+static void bond_setup_by_slave(struct net_device *bond_dev,
+				struct net_device *slave_dev)
+{
+	bond_dev->hard_header	        = slave_dev->hard_header;
+	bond_dev->rebuild_header        = slave_dev->rebuild_header;
+	bond_dev->hard_header_cache	= slave_dev->hard_header_cache;
+	bond_dev->header_cache_update   = slave_dev->header_cache_update;
+	bond_dev->hard_header_parse	= slave_dev->hard_header_parse;
+
+	bond_dev->neigh_setup           = slave_dev->neigh_setup;
+
+	bond_dev->type		    = slave_dev->type;
+	bond_dev->hard_header_len   = slave_dev->hard_header_len;
+	bond_dev->addr_len	    = slave_dev->addr_len;
+
+	memcpy(bond_dev->broadcast, slave_dev->broadcast,
+		slave_dev->addr_len);
+}
+
 /* enslave device <slave> to bond device <master> */
 int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 {
@@ -1311,6 +1331,25 @@ int bond_enslave(struct net_device *bond
 		goto err_undo_flags;
 	}
 
+	/* set bonding device ether type by slave - bonding netdevices are
+	 * created with ether_setup, so when the slave type is not ARPHRD_ETHER
+	 * there is a need to override some of the type dependent attribs/funcs.
+	 *
+	 * bond ether type mutual exclusion - don't allow slaves of dissimilar
+	 * ether type (eg ARPHRD_ETHER and ARPHRD_INFINIBAND) share the same bond
+	 */
+	if (bond->slave_cnt == 0) {
+		if (slave_dev->type != ARPHRD_ETHER)
+			bond_setup_by_slave(bond_dev, slave_dev);
+	} else if (bond_dev->type != slave_dev->type) {
+		printk(KERN_ERR DRV_NAME ": %s ether type (%d) is different "
+			"from other slaves (%d), can not enslave it.\n",
+			slave_dev->name,
+			slave_dev->type, bond_dev->type);
+			res = -EINVAL;
+			goto err_undo_flags;
+	}
+
 	if (slave_dev->set_mac_address == NULL) {
 		printk(KERN_ERR DRV_NAME
 			": %s: Error: The slave device you specified does "

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V5 6/11] net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address()
  2007-09-20 13:33 [PATCH V5 0/11] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (4 preceding siblings ...)
  2007-09-20 13:43 ` [ofa-general] [PATCH V5 5/11] net/bonding: Enable bonding to enslave non ARPHRD_ETHER Moni Shoua
@ 2007-09-20 13:45 ` Moni Shoua
  2007-09-20 13:58 ` [ofa-general] [PATCH V5 7/11] net/bonding: Enable IP multicast for bonding IPoIB devices Moni Shoua
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-09-20 13:45 UTC (permalink / raw)
  To: Roland Dreier, Jay Vosburgh; +Cc: OpenFabrics General, netdev

This patch allows for enslaving netdevices which do not support
the set_mac_address() function. In that case the bond mac address is the one
of the active slave, where remote peers are notified on the mac address
(neighbour) change by Gratuitous ARP sent by bonding when fail-over occurs
(this is already done by the bonding code).

Signed-off-by: Moni Shoua <monis at voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com>
---
 drivers/net/bonding/bond_main.c |   87 +++++++++++++++++++++++++++-------------
 drivers/net/bonding/bonding.h   |    1 
 2 files changed, 60 insertions(+), 28 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_main.c	2007-08-15 10:54:13.000000000 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c	2007-08-15 10:54:41.971632881 +0300
@@ -1095,6 +1095,14 @@ void bond_change_active_slave(struct bon
 		if (new_active) {
 			bond_set_slave_active_flags(new_active);
 		}
+
+		/* when bonding does not set the slave MAC address, the bond MAC
+		 * address is the one of the active slave.
+		 */
+		if (new_active && !bond->do_set_mac_addr)
+			memcpy(bond->dev->dev_addr,  new_active->dev->dev_addr,
+				new_active->dev->addr_len);
+
 		bond_send_gratuitous_arp(bond);
 	}
 }
@@ -1351,13 +1359,22 @@ int bond_enslave(struct net_device *bond
 	}
 
 	if (slave_dev->set_mac_address == NULL) {
-		printk(KERN_ERR DRV_NAME
-			": %s: Error: The slave device you specified does "
-			"not support setting the MAC address. "
-			"Your kernel likely does not support slave "
-			"devices.\n", bond_dev->name);
-  		res = -EOPNOTSUPP;
-		goto err_undo_flags;
+		if (bond->slave_cnt == 0) {
+			printk(KERN_WARNING DRV_NAME
+				": %s: Warning: The first slave device you "
+				"specified does not support setting the MAC "
+				"address. This bond MAC address would be that "
+				"of the active slave.\n", bond_dev->name);
+			bond->do_set_mac_addr = 0;
+		} else if (bond->do_set_mac_addr) {
+			printk(KERN_ERR DRV_NAME
+				": %s: Error: The slave device you specified "
+				"does not support setting the MAC addres,."
+				"but this bond uses this practice. \n"
+				, bond_dev->name);
+			res = -EOPNOTSUPP;
+			goto err_undo_flags;
+		}
 	}
 
 	new_slave = kzalloc(sizeof(struct slave), GFP_KERNEL);
@@ -1378,16 +1395,18 @@ int bond_enslave(struct net_device *bond
 	 */
 	memcpy(new_slave->perm_hwaddr, slave_dev->dev_addr, ETH_ALEN);
 
-	/*
-	 * Set slave to master's mac address.  The application already
-	 * set the master's mac address to that of the first slave
-	 */
-	memcpy(addr.sa_data, bond_dev->dev_addr, bond_dev->addr_len);
-	addr.sa_family = slave_dev->type;
-	res = dev_set_mac_address(slave_dev, &addr);
-	if (res) {
-		dprintk("Error %d calling set_mac_address\n", res);
-		goto err_free;
+	if (bond->do_set_mac_addr) {
+		/*
+		 * Set slave to master's mac address.  The application already
+		 * set the master's mac address to that of the first slave
+		 */
+		memcpy(addr.sa_data, bond_dev->dev_addr, bond_dev->addr_len);
+		addr.sa_family = slave_dev->type;
+		res = dev_set_mac_address(slave_dev, &addr);
+		if (res) {
+			dprintk("Error %d calling set_mac_address\n", res);
+			goto err_free;
+		}
 	}
 
 	res = netdev_set_master(slave_dev, bond_dev);
@@ -1612,9 +1631,11 @@ err_close:
 	dev_close(slave_dev);
 
 err_restore_mac:
-	memcpy(addr.sa_data, new_slave->perm_hwaddr, ETH_ALEN);
-	addr.sa_family = slave_dev->type;
-	dev_set_mac_address(slave_dev, &addr);
+	if (bond->do_set_mac_addr) {
+		memcpy(addr.sa_data, new_slave->perm_hwaddr, ETH_ALEN);
+		addr.sa_family = slave_dev->type;
+		dev_set_mac_address(slave_dev, &addr);
+	}
 
 err_free:
 	kfree(new_slave);
@@ -1792,10 +1813,12 @@ int bond_release(struct net_device *bond
 	/* close slave before restoring its mac address */
 	dev_close(slave_dev);
 
-	/* restore original ("permanent") mac address */
-	memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
-	addr.sa_family = slave_dev->type;
-	dev_set_mac_address(slave_dev, &addr);
+	if (bond->do_set_mac_addr) {
+		/* restore original ("permanent") mac address */
+		memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
+		addr.sa_family = slave_dev->type;
+		dev_set_mac_address(slave_dev, &addr);
+	}
 
 	slave_dev->priv_flags &= ~(IFF_MASTER_8023AD | IFF_MASTER_ALB |
 				   IFF_SLAVE_INACTIVE | IFF_BONDING |
@@ -1882,10 +1905,12 @@ static int bond_release_all(struct net_d
 		/* close slave before restoring its mac address */
 		dev_close(slave_dev);
 
-		/* restore original ("permanent") mac address*/
-		memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
-		addr.sa_family = slave_dev->type;
-		dev_set_mac_address(slave_dev, &addr);
+		if (bond->do_set_mac_addr) {
+			/* restore original ("permanent") mac address*/
+			memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
+			addr.sa_family = slave_dev->type;
+			dev_set_mac_address(slave_dev, &addr);
+		}
 
 		slave_dev->priv_flags &= ~(IFF_MASTER_8023AD | IFF_MASTER_ALB |
 					   IFF_SLAVE_INACTIVE);
@@ -3922,6 +3947,9 @@ static int bond_set_mac_address(struct n
 
 	dprintk("bond=%p, name=%s\n", bond, (bond_dev ? bond_dev->name : "None"));
 
+	if (!bond->do_set_mac_addr)
+		return -EOPNOTSUPP;
+
 	if (!is_valid_ether_addr(sa->sa_data)) {
 		return -EADDRNOTAVAIL;
 	}
@@ -4312,6 +4340,9 @@ static int bond_init(struct net_device *
 	bond_create_proc_entry(bond);
 #endif
 
+	/* set do_set_mac_addr to true on startup */
+	bond->do_set_mac_addr = 1;
+
 	list_add_tail(&bond->bond_list, &bond_dev_list);
 
 	return 0;
Index: net-2.6/drivers/net/bonding/bonding.h
===================================================================
--- net-2.6.orig/drivers/net/bonding/bonding.h	2007-08-15 10:08:58.000000000 +0300
+++ net-2.6/drivers/net/bonding/bonding.h	2007-08-15 10:55:34.359354833 +0300
@@ -185,6 +185,7 @@ struct bonding {
 	struct   timer_list mii_timer;
 	struct   timer_list arp_timer;
 	s8       kill_timers;
+	s8       do_set_mac_addr;
 	struct   net_device_stats stats;
 #ifdef CONFIG_PROC_FS
 	struct   proc_dir_entry *proc_entry;


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ofa-general] [PATCH V5 7/11] net/bonding: Enable IP multicast for bonding IPoIB devices
  2007-09-20 13:33 [PATCH V5 0/11] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (5 preceding siblings ...)
  2007-09-20 13:45 ` [PATCH V5 6/11] net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address() Moni Shoua
@ 2007-09-20 13:58 ` Moni Shoua
  2007-09-20 14:00 ` [ofa-general] [PATCH V5 8/11] net/bonding: Handlle wrong assumptions that slave is always an Ethernet device Moni Shoua
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-09-20 13:58 UTC (permalink / raw)
  To: Roland Dreier, Jay Vosburgh; +Cc: netdev, OpenFabrics General

Allow to enslave devices when the bonding device is not up. Over the discussion
held at the previous post this seemed to be the most clean way to go, where it
is not expected to cause instabilities.

Normally, the bonding driver is UP before any enslavement takes place.
Once a netdevice is UP, the network stack acts to have it join some multicast groups
(eg the all-hosts 224.0.0.1). Now, since ether_setup() have set the bonding device
type to be ARPHRD_ETHER and address len to be ETHER_ALEN, the net core code
computes a wrong multicast link address. This is b/c ip_eth_mc_map() is called
where for multicast joins taking place after the enslavement another ip_xxx_mc_map()
is called (eg ip_ib_mc_map() when the bond type is ARPHRD_INFINIBAND)

Signed-off-by: Moni Shoua <monis at voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com>
---
 drivers/net/bonding/bond_main.c  |    5 +++--
 drivers/net/bonding/bond_sysfs.c |    6 ++----
 2 files changed, 5 insertions(+), 6 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_main.c	2007-08-15 10:54:41.000000000 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c	2007-08-15 10:55:48.431862446 +0300
@@ -1285,8 +1285,9 @@ int bond_enslave(struct net_device *bond
 
 	/* bond must be initialized by bond_open() before enslaving */
 	if (!(bond_dev->flags & IFF_UP)) {
-		dprintk("Error, master_dev is not up\n");
-		return -EPERM;
+		printk(KERN_WARNING DRV_NAME
+			" %s: master_dev is not up in bond_enslave\n",
+			bond_dev->name);
 	}
 
 	/* already enslaved */
Index: net-2.6/drivers/net/bonding/bond_sysfs.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_sysfs.c	2007-08-15 10:08:58.000000000 +0300
+++ net-2.6/drivers/net/bonding/bond_sysfs.c	2007-08-15 10:55:48.432862269 +0300
@@ -266,11 +266,9 @@ static ssize_t bonding_store_slaves(stru
 
 	/* Quick sanity check -- is the bond interface up? */
 	if (!(bond->dev->flags & IFF_UP)) {
-		printk(KERN_ERR DRV_NAME
-		       ": %s: Unable to update slaves because interface is down.\n",
+		printk(KERN_WARNING DRV_NAME
+		       ": %s: doing slave updates when interface is down.\n",
 		       bond->dev->name);
-		ret = -EPERM;
-		goto out;
 	}
 
 	/* Note:  We can't hold bond->lock here, as bond_create grabs it. */

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ofa-general] [PATCH V5 8/11] net/bonding: Handlle wrong assumptions that slave is always an Ethernet device
  2007-09-20 13:33 [PATCH V5 0/11] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (6 preceding siblings ...)
  2007-09-20 13:58 ` [ofa-general] [PATCH V5 7/11] net/bonding: Enable IP multicast for bonding IPoIB devices Moni Shoua
@ 2007-09-20 14:00 ` Moni Shoua
  2007-09-20 14:02 ` [ofa-general] PATCH V5 9/11] net/bonding: Delay sending of gratuitous ARP to avoid failure Moni Shoua
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-09-20 14:00 UTC (permalink / raw)
  To: Roland Dreier, Jay Vosburgh; +Cc: netdev, OpenFabrics General

bonding sometimes uses Ethernet constants (such as MTU and address length) which
are not good when it enslaves non Ethernet devices (such as InfiniBand).

Signed-off-by: Moni Shoua <monis at voltaire.com>
---
 drivers/net/bonding/bond_main.c  |    3 ++-
 drivers/net/bonding/bond_sysfs.c |   19 +++++++++++++------
 drivers/net/bonding/bonding.h    |    1 +
 3 files changed, 16 insertions(+), 7 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_main.c	2007-08-15 10:55:48.000000000 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c	2007-08-20 14:29:11.911298577 +0300
@@ -1224,7 +1224,8 @@ static int bond_compute_features(struct 
 	struct slave *slave;
 	struct net_device *bond_dev = bond->dev;
 	unsigned long features = bond_dev->features;
-	unsigned short max_hard_header_len = ETH_HLEN;
+	unsigned short max_hard_header_len = max((u16)ETH_HLEN,
+						bond_dev->hard_header_len);
 	int i;
 
 	features &= ~(NETIF_F_ALL_CSUM | BOND_VLAN_FEATURES);
Index: net-2.6/drivers/net/bonding/bond_sysfs.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_sysfs.c	2007-08-15 10:55:48.000000000 +0300
+++ net-2.6/drivers/net/bonding/bond_sysfs.c	2007-08-15 12:14:41.152469089 +0300
@@ -164,9 +164,7 @@ static ssize_t bonding_store_bonds(struc
 				printk(KERN_INFO DRV_NAME
 					": %s is being deleted...\n",
 					bond->dev->name);
-				bond_deinit(bond->dev);
-		        	bond_destroy_sysfs_entry(bond);
-				unregister_netdevice(bond->dev);
+				bond_destroy(bond);
 				rtnl_unlock();
 				goto out;
 			}
@@ -260,6 +258,7 @@ static ssize_t bonding_store_slaves(stru
 	char command[IFNAMSIZ + 1] = { 0, };
 	char *ifname;
 	int i, res, found, ret = count;
+	u32 original_mtu;
 	struct slave *slave;
 	struct net_device *dev = NULL;
 	struct bonding *bond = to_bond(d);
@@ -325,6 +324,7 @@ static ssize_t bonding_store_slaves(stru
 		}
 
 		/* Set the slave's MTU to match the bond */
+		original_mtu = dev->mtu;
 		if (dev->mtu != bond->dev->mtu) {
 			if (dev->change_mtu) {
 				res = dev->change_mtu(dev,
@@ -339,6 +339,9 @@ static ssize_t bonding_store_slaves(stru
 		}
 		rtnl_lock();
 		res = bond_enslave(bond->dev, dev);
+		bond_for_each_slave(bond, slave, i)
+			if (strnicmp(slave->dev->name, ifname, IFNAMSIZ) == 0)
+				slave->original_mtu = original_mtu;
 		rtnl_unlock();
 		if (res) {
 			ret = res;
@@ -351,13 +354,17 @@ static ssize_t bonding_store_slaves(stru
 		bond_for_each_slave(bond, slave, i)
 			if (strnicmp(slave->dev->name, ifname, IFNAMSIZ) == 0) {
 				dev = slave->dev;
+				original_mtu = slave->original_mtu;
 				break;
 			}
 		if (dev) {
 			printk(KERN_INFO DRV_NAME ": %s: Removing slave %s\n",
 				bond->dev->name, dev->name);
 			rtnl_lock();
-			res = bond_release(bond->dev, dev);
+			if (bond->setup_by_slave)
+				res = bond_release_and_destroy(bond->dev, dev);
+			else
+				res = bond_release(bond->dev, dev);
 			rtnl_unlock();
 			if (res) {
 				ret = res;
@@ -365,9 +372,9 @@ static ssize_t bonding_store_slaves(stru
 			}
 			/* set the slave MTU to the default */
 			if (dev->change_mtu) {
-				dev->change_mtu(dev, 1500);
+				dev->change_mtu(dev, original_mtu);
 			} else {
-				dev->mtu = 1500;
+				dev->mtu = original_mtu;
 			}
 		}
 		else {
Index: net-2.6/drivers/net/bonding/bonding.h
===================================================================
--- net-2.6.orig/drivers/net/bonding/bonding.h	2007-08-15 10:55:34.000000000 +0300
+++ net-2.6/drivers/net/bonding/bonding.h	2007-08-20 14:29:11.912298402 +0300
@@ -156,6 +156,7 @@ struct slave {
 	s8     link;    /* one of BOND_LINK_XXXX */
 	s8     state;   /* one of BOND_STATE_XXXX */
 	u32    original_flags;
+	u32    original_mtu;
 	u32    link_failure_count;
 	u16    speed;
 	u8     duplex;

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ofa-general] PATCH V5 9/11] net/bonding: Delay sending of gratuitous ARP to avoid failure
  2007-09-20 13:33 [PATCH V5 0/11] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (7 preceding siblings ...)
  2007-09-20 14:00 ` [ofa-general] [PATCH V5 8/11] net/bonding: Handlle wrong assumptions that slave is always an Ethernet device Moni Shoua
@ 2007-09-20 14:02 ` Moni Shoua
  2007-09-20 14:04 ` [PATCH V5 10/11] net/bonding: Destroy bonding master when last slave is gone Moni Shoua
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-09-20 14:02 UTC (permalink / raw)
  To: Roland Dreier, Jay Vosburgh; +Cc: netdev, OpenFabrics General

Delay sending a gratuitous_arp when LINK_STATE_LINKWATCH_PENDING bit
in dev->state field is on. This improves the chances for the arp packet to
be transmitted.

Signed-off-by: Moni Shoua <monis at voltaire.com>
---
 drivers/net/bonding/bond_main.c |   24 +++++++++++++++++++++---
 drivers/net/bonding/bonding.h   |    1 +
 2 files changed, 22 insertions(+), 3 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_main.c	2007-08-15 10:56:33.000000000 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c	2007-08-15 11:04:37.221123652 +0300
@@ -1102,8 +1102,14 @@ void bond_change_active_slave(struct bon
 		if (new_active && !bond->do_set_mac_addr)
 			memcpy(bond->dev->dev_addr,  new_active->dev->dev_addr,
 				new_active->dev->addr_len);
-
-		bond_send_gratuitous_arp(bond);
+		if (bond->curr_active_slave &&
+			test_bit(__LINK_STATE_LINKWATCH_PENDING,
+					&bond->curr_active_slave->dev->state)) {
+			dprintk("delaying gratuitous arp on %s\n",
+				bond->curr_active_slave->dev->name);
+			bond->send_grat_arp = 1;
+		} else
+			bond_send_gratuitous_arp(bond);
 	}
 }
 
@@ -2083,6 +2089,17 @@ void bond_mii_monitor(struct net_device 
 	 * program could monitor the link itself if needed.
 	 */
 
+	if (bond->send_grat_arp) {
+		if (bond->curr_active_slave && test_bit(__LINK_STATE_LINKWATCH_PENDING,
+				&bond->curr_active_slave->dev->state))
+			dprintk("Needs to send gratuitous arp but not yet\n");
+		else {
+			dprintk("sending delayed gratuitous arp on on %s\n",
+				bond->curr_active_slave->dev->name);
+			bond_send_gratuitous_arp(bond);
+			bond->send_grat_arp = 0;
+		}
+	}
 	read_lock(&bond->curr_slave_lock);
 	oldcurrent = bond->curr_active_slave;
 	read_unlock(&bond->curr_slave_lock);
@@ -2484,7 +2501,7 @@ static void bond_send_gratuitous_arp(str
 
 	if (bond->master_ip) {
 		bond_arp_send(slave->dev, ARPOP_REPLY, bond->master_ip,
-				  bond->master_ip, 0);
+				bond->master_ip, 0);
 	}
 
 	list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
@@ -4293,6 +4310,7 @@ static int bond_init(struct net_device *
 	bond->current_arp_slave = NULL;
 	bond->primary_slave = NULL;
 	bond->dev = bond_dev;
+	bond->send_grat_arp = 0;
 	INIT_LIST_HEAD(&bond->vlan_list);
 
 	/* Initialize the device entry points */
Index: net-2.6/drivers/net/bonding/bonding.h
===================================================================
--- net-2.6.orig/drivers/net/bonding/bonding.h	2007-08-15 10:56:33.000000000 +0300
+++ net-2.6/drivers/net/bonding/bonding.h	2007-08-15 11:05:41.516451497 +0300
@@ -187,6 +187,7 @@ struct bonding {
 	struct   timer_list arp_timer;
 	s8       kill_timers;
 	s8       do_set_mac_addr;
+	s8	 send_grat_arp;
 	struct   net_device_stats stats;
 #ifdef CONFIG_PROC_FS
 	struct   proc_dir_entry *proc_entry;

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V5 10/11] net/bonding: Destroy bonding master when last slave is gone
  2007-09-20 13:33 [PATCH V5 0/11] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (8 preceding siblings ...)
  2007-09-20 14:02 ` [ofa-general] PATCH V5 9/11] net/bonding: Delay sending of gratuitous ARP to avoid failure Moni Shoua
@ 2007-09-20 14:04 ` Moni Shoua
  2007-09-20 14:06 ` [PATCH V5 5/11] net/bonding: Enable bonding to enslave non ARPHRD_ETHER Moni Shoua
  2007-09-20 14:07 ` [PATCH 11/11] bonding: Optionally allow ethernet slaves to keep own MAC Moni Shoua
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-09-20 14:04 UTC (permalink / raw)
  To: Roland Dreier, Jay Vosburgh; +Cc: OpenFabrics General, netdev

When bonding enslaves non Ethernet devices it takes pointers to functions 
in the module that owns the slaves. In this case it becomes unsafe
to keep the bonding master registered after last slave was unenslaved 
because we don't know if the pointers are still valid.  Destroying the bond when slave_cnt is zero
ensures that these functions be used anymore.

Signed-off-by: Moni Shoua <monis at voltaire.com>
---
 drivers/net/bonding/bond_main.c |   45 +++++++++++++++++++++++++++++++++++++++-
 drivers/net/bonding/bonding.h   |    3 ++
 2 files changed, 47 insertions(+), 1 deletion(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_main.c	2007-08-20 14:43:17.123702132 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c	2007-08-20 14:43:17.850571535 +0300
@@ -1256,6 +1256,7 @@ static int bond_compute_features(struct 
 static void bond_setup_by_slave(struct net_device *bond_dev,
 				struct net_device *slave_dev)
 {
+	struct bonding *bond = bond_dev->priv;
 	bond_dev->hard_header	        = slave_dev->hard_header;
 	bond_dev->rebuild_header        = slave_dev->rebuild_header;
 	bond_dev->hard_header_cache	= slave_dev->hard_header_cache;
@@ -1270,6 +1271,7 @@ static void bond_setup_by_slave(struct n
 
 	memcpy(bond_dev->broadcast, slave_dev->broadcast,
 		slave_dev->addr_len);
+	bond->setup_by_slave = 1;
 }
 
 /* enslave device <slave> to bond device <master> */
@@ -1838,6 +1840,35 @@ int bond_release(struct net_device *bond
 }
 
 /*
+* Destroy a bonding device.
+* Must be under rtnl_lock when this function is called.
+*/
+void bond_destroy(struct bonding *bond)
+{
+	bond_deinit(bond->dev);
+	bond_destroy_sysfs_entry(bond);
+	unregister_netdevice(bond->dev);
+}
+
+/*
+* First release a slave and than destroy the bond if no more slaves iare left.
+* Must be under rtnl_lock when this function is called.
+*/
+int  bond_release_and_destroy(struct net_device *bond_dev, struct net_device *slave_dev)
+{
+	struct bonding *bond = bond_dev->priv;
+	int ret;
+
+	ret = bond_release(bond_dev, slave_dev);
+	if ((ret == 0) && (bond->slave_cnt == 0)) {
+		printk(KERN_INFO DRV_NAME " %s: destroying bond for.\n",
+					bond_dev->name);
+		bond_destroy(bond);
+	}
+	return ret;
+}
+
+/*
  * This function releases all slaves.
  */
 static int bond_release_all(struct net_device *bond_dev)
@@ -3322,7 +3353,11 @@ static int bond_slave_netdev_event(unsig
 	switch (event) {
 	case NETDEV_UNREGISTER:
 		if (bond_dev) {
-			bond_release(bond_dev, slave_dev);
+			dprintk("slave %s unregisters\n", slave_dev->name);
+			if (bond->setup_by_slave)
+				bond_release_and_destroy(bond_dev, slave_dev);
+			else
+				bond_release(bond_dev, slave_dev);
 		}
 		break;
 	case NETDEV_CHANGE:
@@ -3331,6 +3366,13 @@ static int bond_slave_netdev_event(unsig
 		 * sets up a hierarchical bond, then rmmod's
 		 * one of the slave bonding devices?
 		 */
+		if (slave_dev->priv_flags & IFF_SLAVE_DETACH) {
+			dprintk("slave %s detaching\n", slave_dev->name);
+			if (bond->setup_by_slave)
+				bond_release_and_destroy(bond_dev, slave_dev);
+			else
+				bond_release(bond_dev, slave_dev);
+		}
 		break;
 	case NETDEV_DOWN:
 		/*
@@ -4311,6 +4353,7 @@ static int bond_init(struct net_device *
 	bond->primary_slave = NULL;
 	bond->dev = bond_dev;
 	bond->send_grat_arp = 0;
+	bond->setup_by_slave = 0;
 	INIT_LIST_HEAD(&bond->vlan_list);
 
 	/* Initialize the device entry points */
Index: net-2.6/drivers/net/bonding/bonding.h
===================================================================
--- net-2.6.orig/drivers/net/bonding/bonding.h	2007-08-20 14:43:17.123702132 +0300
+++ net-2.6/drivers/net/bonding/bonding.h	2007-08-20 14:47:52.845180870 +0300
@@ -188,6 +188,7 @@ struct bonding {
 	s8       kill_timers;
 	s8       do_set_mac_addr;
 	s8	 send_grat_arp;
+	s8	 setup_by_slave;
 	struct   net_device_stats stats;
 #ifdef CONFIG_PROC_FS
 	struct   proc_dir_entry *proc_entry;
@@ -295,6 +296,8 @@ static inline void bond_unset_master_alb
 struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry *curr);
 int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb, struct net_device *slave_dev);
 int bond_create(char *name, struct bond_params *params, struct bonding **newbond);
+void bond_destroy(struct bonding *bond);
+int  bond_release_and_destroy(struct net_device *bond_dev, struct net_device *slave_dev);
 void bond_deinit(struct net_device *bond_dev);
 int bond_create_sysfs(void);
 void bond_destroy_sysfs(void);


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V5 5/11] net/bonding: Enable bonding to enslave non ARPHRD_ETHER
  2007-09-20 13:33 [PATCH V5 0/11] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (9 preceding siblings ...)
  2007-09-20 14:04 ` [PATCH V5 10/11] net/bonding: Destroy bonding master when last slave is gone Moni Shoua
@ 2007-09-20 14:06 ` Moni Shoua
  2007-09-20 14:07 ` [PATCH 11/11] bonding: Optionally allow ethernet slaves to keep own MAC Moni Shoua
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-09-20 14:06 UTC (permalink / raw)
  To: Roland Dreier, Jay Vosburgh; +Cc: OpenFabrics General, netdev

This patch changes some of the bond netdevice attributes and functions
to be that of the active slave for the case of the enslaved device not being
of ARPHRD_ETHER type. Basically it overrides those setting done by ether_setup(),
which are netdevice **type** dependent and hence might be not appropriate for
devices of other types. It also enforces mutual exclusion on bonding slaves
from dissimilar ether types, as was concluded over the v1 discussion.

IPoIB (see Documentation/infiniband/ipoib.txt) MAC address is made of a 3 bytes
IB QP (Queue Pair) number and 16 bytes IB port GID (Global ID) of the port this
IPoIB device is bounded to. The QP is a resource created by the IB HW and the
GID is an identifier burned into the HCA (i have omitted here some details which
are not important for the bonding RFC).

Signed-off-by: Moni Shoua <monis at voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz at voltaire.com>
---
 drivers/net/bonding/bond_main.c |   39 +++++++++++++++++++++++++++++++++++++++
 1 files changed, 39 insertions(+)

Index: net-2.6/drivers/net/bonding/bond_main.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_main.c	2007-08-15 10:08:59.000000000 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c	2007-08-15 10:54:13.424688411 +0300
@@ -1237,6 +1237,26 @@ static int bond_compute_features(struct 
 	return 0;
 }
 
+
+static void bond_setup_by_slave(struct net_device *bond_dev,
+				struct net_device *slave_dev)
+{
+	bond_dev->hard_header	        = slave_dev->hard_header;
+	bond_dev->rebuild_header        = slave_dev->rebuild_header;
+	bond_dev->hard_header_cache	= slave_dev->hard_header_cache;
+	bond_dev->header_cache_update   = slave_dev->header_cache_update;
+	bond_dev->hard_header_parse	= slave_dev->hard_header_parse;
+
+	bond_dev->neigh_setup           = slave_dev->neigh_setup;
+
+	bond_dev->type		    = slave_dev->type;
+	bond_dev->hard_header_len   = slave_dev->hard_header_len;
+	bond_dev->addr_len	    = slave_dev->addr_len;
+
+	memcpy(bond_dev->broadcast, slave_dev->broadcast,
+		slave_dev->addr_len);
+}
+
 /* enslave device <slave> to bond device <master> */
 int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 {
@@ -1311,6 +1331,25 @@ int bond_enslave(struct net_device *bond
 		goto err_undo_flags;
 	}
 
+	/* set bonding device ether type by slave - bonding netdevices are
+	 * created with ether_setup, so when the slave type is not ARPHRD_ETHER
+	 * there is a need to override some of the type dependent attribs/funcs.
+	 *
+	 * bond ether type mutual exclusion - don't allow slaves of dissimilar
+	 * ether type (eg ARPHRD_ETHER and ARPHRD_INFINIBAND) share the same bond
+	 */
+	if (bond->slave_cnt == 0) {
+		if (slave_dev->type != ARPHRD_ETHER)
+			bond_setup_by_slave(bond_dev, slave_dev);
+	} else if (bond_dev->type != slave_dev->type) {
+		printk(KERN_ERR DRV_NAME ": %s ether type (%d) is different "
+			"from other slaves (%d), can not enslave it.\n",
+			slave_dev->name,
+			slave_dev->type, bond_dev->type);
+			res = -EINVAL;
+			goto err_undo_flags;
+	}
+
 	if (slave_dev->set_mac_address == NULL) {
 		printk(KERN_ERR DRV_NAME
 			": %s: Error: The slave device you specified does "


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 11/11] bonding: Optionally allow ethernet slaves to keep own MAC
  2007-09-20 13:33 [PATCH V5 0/11] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (10 preceding siblings ...)
  2007-09-20 14:06 ` [PATCH V5 5/11] net/bonding: Enable bonding to enslave non ARPHRD_ETHER Moni Shoua
@ 2007-09-20 14:07 ` Moni Shoua
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-09-20 14:07 UTC (permalink / raw)
  To: Roland Dreier, Jay Vosburgh; +Cc: OpenFabrics General, netdev

Update the "don't change MAC of slaves" functionality added in
previous changes to be a generic option, rather than something tied to IB
devices, as it's occasionally useful for regular ethernet devices as well.

	Adds "fail_over_mac" option (which is automatically enabled for IB
slaves), applicable only to active-backup mode.

	Includes documentation update.

	Updates bonding driver version to 3.2.0.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
---
 Documentation/networking/bonding.txt |   33 +++++++++++++++++++
 drivers/net/bonding/bond_main.c      |   57 +++++++++++++++++++++------------
 drivers/net/bonding/bond_sysfs.c     |   49 +++++++++++++++++++++++++++++
 drivers/net/bonding/bonding.h        |    6 ++--
 4 files changed, 121 insertions(+), 24 deletions(-)

diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 1da5666..1134062 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -281,6 +281,39 @@ downdelay
 	will be rounded down to the nearest multiple.  The default
 	value is 0.
 
+fail_over_mac
+
+	Specifies whether active-backup mode should set all slaves to
+	the same MAC address (the traditional behavior), or, when
+	enabled, change the bond's MAC address when changing the
+	active interface (i.e., fail over the MAC address itself).
+
+	Fail over MAC is useful for devices that cannot ever alter
+	their MAC address, or for devices that refuse incoming
+	broadcasts with their own source MAC (which interferes with
+	the ARP monitor).
+
+	The down side of fail over MAC is that every device on the
+	network must be updated via gratuitous ARP, vs. just updating
+	a switch or set of switches (which often takes place for any
+	traffic, not just ARP traffic, if the switch snoops incoming
+	traffic to update its tables) for the traditional method.  If
+	the gratuitous ARP is lost, communication may be disrupted.
+
+	When fail over MAC is used in conjuction with the mii monitor,
+	devices which assert link up prior to being able to actually
+	transmit and receive are particularly susecptible to loss of
+	the gratuitous ARP, and an appropriate updelay setting may be
+	required.
+
+	A value of 0 disables fail over MAC, and is the default.  A
+	value of 1 enables fail over MAC.  This option is enabled
+	automatically if the first slave added cannot change its MAC
+	address.  This option may be modified via sysfs only when no
+	slaves are present in the bond.
+
+	This option was added in bonding version 3.2.0.
+
 lacp_rate
 
 	Option specifying the rate in which we'll ask our link partner
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 77caca3..c01ff9d 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -97,6 +97,7 @@ static char *xmit_hash_policy = NULL;
 static int arp_interval = BOND_LINK_ARP_INTERV;
 static char *arp_ip_target[BOND_MAX_ARP_TARGETS] = { NULL, };
 static char *arp_validate = NULL;
+static int fail_over_mac = 0;
 struct bond_params bonding_defaults;
 
 module_param(max_bonds, int, 0);
@@ -130,6 +131,8 @@ module_param_array(arp_ip_target, charp, NULL, 0);
 MODULE_PARM_DESC(arp_ip_target, "arp targets in n.n.n.n form");
 module_param(arp_validate, charp, 0);
 MODULE_PARM_DESC(arp_validate, "validate src/dst of ARP probes: none (default), active, backup or all");
+module_param(fail_over_mac, int, 0);
+MODULE_PARM_DESC(fail_over_mac, "For active-backup, do not set all slaves to the same MAC.  0 of off (default), 1 for on.");
 
 /*----------------------------- Global variables ----------------------------*/
 
@@ -1099,7 +1102,7 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active)
 		/* when bonding does not set the slave MAC address, the bond MAC
 		 * address is the one of the active slave.
 		 */
-		if (new_active && !bond->do_set_mac_addr)
+		if (new_active && bond->params.fail_over_mac)
 			memcpy(bond->dev->dev_addr,  new_active->dev->dev_addr,
 				new_active->dev->addr_len);
 		if (bond->curr_active_slave &&
@@ -1371,16 +1374,16 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 	if (slave_dev->set_mac_address == NULL) {
 		if (bond->slave_cnt == 0) {
 			printk(KERN_WARNING DRV_NAME
-				": %s: Warning: The first slave device you "
-				"specified does not support setting the MAC "
-				"address. This bond MAC address would be that "
-				"of the active slave.\n", bond_dev->name);
-			bond->do_set_mac_addr = 0;
-		} else if (bond->do_set_mac_addr) {
+			       ": %s: Warning: The first slave device "
+			       "specified does not support setting the MAC "
+			       "address. Enabling the fail_over_mac option.",
+			       bond_dev->name);
+			bond->params.fail_over_mac = 1;
+		} else if (!bond->params.fail_over_mac) {
 			printk(KERN_ERR DRV_NAME
-				": %s: Error: The slave device you specified "
-				"does not support setting the MAC addres,."
-				"but this bond uses this practice. \n"
+				": %s: Error: The slave device specified "
+				"does not support setting the MAC address, "
+				"but fail_over_mac is not enabled.\n"
 				, bond_dev->name);
 			res = -EOPNOTSUPP;
 			goto err_undo_flags;
@@ -1405,7 +1408,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 	 */
 	memcpy(new_slave->perm_hwaddr, slave_dev->dev_addr, ETH_ALEN);
 
-	if (bond->do_set_mac_addr) {
+	if (!bond->params.fail_over_mac) {
 		/*
 		 * Set slave to master's mac address.  The application already
 		 * set the master's mac address to that of the first slave
@@ -1641,7 +1644,7 @@ err_close:
 	dev_close(slave_dev);
 
 err_restore_mac:
-	if (bond->do_set_mac_addr) {
+	if (!bond->params.fail_over_mac) {
 		memcpy(addr.sa_data, new_slave->perm_hwaddr, ETH_ALEN);
 		addr.sa_family = slave_dev->type;
 		dev_set_mac_address(slave_dev, &addr);
@@ -1823,7 +1826,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
 	/* close slave before restoring its mac address */
 	dev_close(slave_dev);
 
-	if (bond->do_set_mac_addr) {
+	if (!bond->params.fail_over_mac) {
 		/* restore original ("permanent") mac address */
 		memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
 		addr.sa_family = slave_dev->type;
@@ -1944,7 +1947,7 @@ static int bond_release_all(struct net_device *bond_dev)
 		/* close slave before restoring its mac address */
 		dev_close(slave_dev);
 
-		if (bond->do_set_mac_addr) {
+		if (!bond->params.fail_over_mac) {
 			/* restore original ("permanent") mac address*/
 			memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
 			addr.sa_family = slave_dev->type;
@@ -3066,9 +3069,15 @@ static void bond_info_show_master(struct seq_file *seq)
 	curr = bond->curr_active_slave;
 	read_unlock(&bond->curr_slave_lock);
 
-	seq_printf(seq, "Bonding Mode: %s\n",
+	seq_printf(seq, "Bonding Mode: %s",
 		   bond_mode_name(bond->params.mode));
 
+	if (bond->params.mode == BOND_MODE_ACTIVEBACKUP &&
+	    bond->params.fail_over_mac)
+		seq_printf(seq, " (fail_over_mac)");
+
+	seq_printf(seq, "\n");
+
 	if (bond->params.mode == BOND_MODE_XOR ||
 		bond->params.mode == BOND_MODE_8023AD) {
 		seq_printf(seq, "Transmit Hash Policy: %s (%d)\n",
@@ -4008,8 +4017,12 @@ static int bond_set_mac_address(struct net_device *bond_dev, void *addr)
 
 	dprintk("bond=%p, name=%s\n", bond, (bond_dev ? bond_dev->name : "None"));
 
-	if (!bond->do_set_mac_addr)
-		return -EOPNOTSUPP;
+	/*
+	 * If fail_over_mac is enabled, do nothing and return success.
+	 * Returning an error causes ifenslave to fail.
+	 */
+	if (bond->params.fail_over_mac)
+		return 0;
 
 	if (!is_valid_ether_addr(sa->sa_data)) {
 		return -EADDRNOTAVAIL;
@@ -4402,10 +4415,6 @@ static int bond_init(struct net_device *bond_dev, struct bond_params *params)
 #ifdef CONFIG_PROC_FS
 	bond_create_proc_entry(bond);
 #endif
-
-	/* set do_set_mac_addr to true on startup */
-	bond->do_set_mac_addr = 1;
-
 	list_add_tail(&bond->bond_list, &bond_dev_list);
 
 	return 0;
@@ -4739,6 +4748,11 @@ static int bond_check_params(struct bond_params *params)
 		primary = NULL;
 	}
 
+	if (fail_over_mac && (bond_mode != BOND_MODE_ACTIVEBACKUP))
+		printk(KERN_WARNING DRV_NAME
+		       ": Warning: fail_over_mac only affects "
+		       "active-backup mode.\n");
+
 	/* fill params struct with the proper values */
 	params->mode = bond_mode;
 	params->xmit_policy = xmit_hashtype;
@@ -4750,6 +4764,7 @@ static int bond_check_params(struct bond_params *params)
 	params->use_carrier = use_carrier;
 	params->lacp_fast = lacp_fast;
 	params->primary[0] = 0;
+	params->fail_over_mac = fail_over_mac;
 
 	if (primary) {
 		strncpy(params->primary, primary, IFNAMSIZ);
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 71db5d9..a907b68 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -567,6 +567,54 @@ static ssize_t bonding_store_arp_validate(struct device *d,
 static DEVICE_ATTR(arp_validate, S_IRUGO | S_IWUSR, bonding_show_arp_validate, bonding_store_arp_validate);
 
 /*
+ * Show and store fail_over_mac.  User only allowed to change the
+ * value when there are no slaves.
+ */
+static ssize_t bonding_show_fail_over_mac(struct device *d, struct device_attribute *attr, char *buf)
+{
+	struct bonding *bond = to_bond(d);
+
+	return sprintf(buf, "%d\n", bond->params.fail_over_mac) + 1;
+}
+
+static ssize_t bonding_store_fail_over_mac(struct device *d, struct device_attribute *attr, const char *buf, size_t count)
+{
+	int new_value;
+	int ret = count;
+	struct bonding *bond = to_bond(d);
+
+	if (bond->slave_cnt != 0) {
+		printk(KERN_ERR DRV_NAME
+		       ": %s: Can't alter fail_over_mac with slaves in bond.\n",
+		       bond->dev->name);
+		ret = -EPERM;
+		goto out;
+	}
+
+	if (sscanf(buf, "%d", &new_value) != 1) {
+		printk(KERN_ERR DRV_NAME
+		       ": %s: no fail_over_mac value specified.\n",
+		       bond->dev->name);
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if ((new_value == 0) || (new_value == 1)) {
+		bond->params.fail_over_mac = new_value;
+		printk(KERN_INFO DRV_NAME ": %s: Setting fail_over_mac to %d.\n",
+		       bond->dev->name, new_value);
+	} else {
+		printk(KERN_INFO DRV_NAME
+		       ": %s: Ignoring invalid fail_over_mac value %d.\n",
+		       bond->dev->name, new_value);
+	}
+out:
+	return ret;
+}
+
+static DEVICE_ATTR(fail_over_mac, S_IRUGO | S_IWUSR, bonding_show_fail_over_mac, bonding_store_fail_over_mac);
+
+/*
  * Show and set the arp timer interval.  There are two tricky bits
  * here.  First, if ARP monitoring is activated, then we must disable
  * MII monitoring.  Second, if the ARP timer isn't running, we must
@@ -1390,6 +1438,7 @@ static DEVICE_ATTR(ad_partner_mac, S_IRUGO, bonding_show_ad_partner_mac, NULL);
 static struct attribute *per_bond_attrs[] = {
 	&dev_attr_slaves.attr,
 	&dev_attr_mode.attr,
+	&dev_attr_fail_over_mac.attr,
 	&dev_attr_arp_validate.attr,
 	&dev_attr_arp_interval.attr,
 	&dev_attr_arp_ip_target.attr,
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index ed0f587..9d6153e 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -22,8 +22,8 @@
 #include "bond_3ad.h"
 #include "bond_alb.h"
 
-#define DRV_VERSION	"3.1.3"
-#define DRV_RELDATE	"June 13, 2007"
+#define DRV_VERSION	"3.2.0"
+#define DRV_RELDATE	"September 13, 2007"
 #define DRV_NAME	"bonding"
 #define DRV_DESCRIPTION	"Ethernet Channel Bonding Driver"
 
@@ -128,6 +128,7 @@ struct bond_params {
 	int arp_interval;
 	int arp_validate;
 	int use_carrier;
+	int fail_over_mac;
 	int updelay;
 	int downdelay;
 	int lacp_fast;
@@ -186,7 +187,6 @@ struct bonding {
 	struct   timer_list mii_timer;
 	struct   timer_list arp_timer;
 	s8       kill_timers;
-	s8       do_set_mac_addr;
 	s8	 send_grat_arp;
 	s8	 setup_by_slave;
 	struct   net_device_stats stats;
-- 1.5.2-rc2.GIT 


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [ofa-general] Re: [PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister
  2007-09-20 13:40 ` [PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister Moni Shoua
@ 2007-09-20 16:20   ` Roland Dreier
  2007-09-23  7:55     ` Moni Shoua
  0 siblings, 1 reply; 17+ messages in thread
From: Roland Dreier @ 2007-09-20 16:20 UTC (permalink / raw)
  To: Moni Shoua; +Cc: netdev, Jay Vosburgh, OpenFabrics General

 > +		ipoib_slave_detach(cpriv->dev);
 >  		unregister_netdev(cpriv->dev);

Maybe you already answered this before, but I'm still not clear why
this notifier call can't just be added to the start of
unregister_netdevice(), so we can avoid having driver needing to know
anything about bonding internals?

 - R.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ofa-general] Re: [PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister
  2007-09-20 16:20   ` [ofa-general] " Roland Dreier
@ 2007-09-23  7:55     ` Moni Shoua
  2007-09-23 16:34       ` Roland Dreier
  0 siblings, 1 reply; 17+ messages in thread
From: Moni Shoua @ 2007-09-23  7:55 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Moni Shoua, netdev, Jay Vosburgh, OpenFabrics General

Roland Dreier wrote:
>  > +		ipoib_slave_detach(cpriv->dev);
>  >  		unregister_netdev(cpriv->dev);
> 
> Maybe you already answered this before, but I'm still not clear why
> this notifier call can't just be added to the start of
> unregister_netdevice(), so we can avoid having driver needing to know
> anything about bonding internals?
> 
>  - R.

The action in bonding to a detach of slave is to unregister the master (see patch 10).
This can't be done from the context of unregister_netdevice itself (it is protected by rtnl_lock).

That's why I had to notify the detach before unregister begins.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ofa-general] Re: [PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister
  2007-09-23  7:55     ` Moni Shoua
@ 2007-09-23 16:34       ` Roland Dreier
  2007-09-24 13:56         ` Moni Shoua
  0 siblings, 1 reply; 17+ messages in thread
From: Roland Dreier @ 2007-09-23 16:34 UTC (permalink / raw)
  To: Moni Shoua; +Cc: netdev, Jay Vosburgh, OpenFabrics General

 > The action in bonding to a detach of slave is to unregister the master (see patch 10).
 > This can't be done from the context of unregister_netdevice itself (it is protected by rtnl_lock).

I'm confused.  Your patch has:

 > +		ipoib_slave_detach(cpriv->dev);
 >  		unregister_netdev(cpriv->dev);

And ipoib_slave_detach() is:

 > +static inline void ipoib_slave_detach(struct net_device *dev)
 > +{
 > +	rtnl_lock();
 > +	netdev_slave_detach(dev);
 > +	rtnl_unlock();
 > +}

so you are calling netdev_slave_detach() with the rtnl lock held.
Why can't you make the same call from the start of unregister_netdevice()?

Anyway, if the rtnl lock is a problem, can you just add the call to
netdev_slave_detach() to unregister_netdev() before it takes the rtnl lock?

 - R.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ofa-general] Re: [PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister
  2007-09-23 16:34       ` Roland Dreier
@ 2007-09-24 13:56         ` Moni Shoua
  0 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-09-24 13:56 UTC (permalink / raw)
  To: Roland Dreier; +Cc: netdev, Jay Vosburgh, OpenFabrics General

Roland Dreier wrote:
>  > The action in bonding to a detach of slave is to unregister the master (see patch 10).
>  > This can't be done from the context of unregister_netdevice itself (it is protected by rtnl_lock).
> 
> I'm confused.  Your patch has:
> 
>  > +		ipoib_slave_detach(cpriv->dev);
>  >  		unregister_netdev(cpriv->dev);
> 
> And ipoib_slave_detach() is:
> 
>  > +static inline void ipoib_slave_detach(struct net_device *dev)
>  > +{
>  > +	rtnl_lock();
>  > +	netdev_slave_detach(dev);
>  > +	rtnl_unlock();
>  > +}
> 
> so you are calling netdev_slave_detach() with the rtnl lock held.
> Why can't you make the same call from the start of unregister_netdevice()?
> 
> Anyway, if the rtnl lock is a problem, can you just add the call to
> netdev_slave_detach() to unregister_netdev() before it takes the rtnl lock?
> 
>  - R.
> 

Your comment made me do a little rethinking.
In bonding, device is released by calling unregister_netdevice() that doesn't 
take the rtnl_lock (unlike unregister_netdev() that does). I guess that this made me 
confused to think that this is not possible. So, I guess I could put 
the detach notification in unregister_netedev() and the reaction to the notification 
in the bonding driver would not block.
However, I looked one more time at the code of unregister_netdevice() and found out that
nothing prevents from calling unregister_netdevice() again when the notification NETDEV_GOING_DOWN
is sent. I tried that and it works.
I have a new set of patches without sending a slave detach and I will send it soon.

Thanks for the comment Roland. It makes this patch simpler.

I'd also like to give a credit to Jay for the idea of using NETDEV_GOING_DOWN notification
instead of NETDEV_CHANGE+IFF_SLAVE_DETACH. He suggested it a while ago but I wrongly thought that
it wouldn't work.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2007-09-24 13:56 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-09-20 13:33 [PATCH V5 0/11] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
2007-09-20 13:39 ` [ofa-general] [PATCH V5 1/11] net/core: add a netdev notification for slave detach Moni Shoua
2007-09-20 13:40 ` [PATCH V5 2/11] IB/ipoib: Notify the world before doing unregister Moni Shoua
2007-09-20 16:20   ` [ofa-general] " Roland Dreier
2007-09-23  7:55     ` Moni Shoua
2007-09-23 16:34       ` Roland Dreier
2007-09-24 13:56         ` Moni Shoua
2007-09-20 13:41 ` [ofa-general] [PATCH V5 3/11] IB/ipoib: Bound the net device to the ipoib_neigh structue Moni Shoua
2007-09-20 13:42 ` [PATCH V5 4/11] IB/ipoib: Verify address handle validity on send Moni Shoua
2007-09-20 13:43 ` [ofa-general] [PATCH V5 5/11] net/bonding: Enable bonding to enslave non ARPHRD_ETHER Moni Shoua
2007-09-20 13:45 ` [PATCH V5 6/11] net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address() Moni Shoua
2007-09-20 13:58 ` [ofa-general] [PATCH V5 7/11] net/bonding: Enable IP multicast for bonding IPoIB devices Moni Shoua
2007-09-20 14:00 ` [ofa-general] [PATCH V5 8/11] net/bonding: Handlle wrong assumptions that slave is always an Ethernet device Moni Shoua
2007-09-20 14:02 ` [ofa-general] PATCH V5 9/11] net/bonding: Delay sending of gratuitous ARP to avoid failure Moni Shoua
2007-09-20 14:04 ` [PATCH V5 10/11] net/bonding: Destroy bonding master when last slave is gone Moni Shoua
2007-09-20 14:06 ` [PATCH V5 5/11] net/bonding: Enable bonding to enslave non ARPHRD_ETHER Moni Shoua
2007-09-20 14:07 ` [PATCH 11/11] bonding: Optionally allow ethernet slaves to keep own MAC Moni Shoua

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.