netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC v3 0/6] networking: address root block upon initialization
@ 2014-03-03 22:46 Luis R. Rodriguez
  2014-03-03 22:47 ` [RFC v3 1/6] bridge: preserve random init MAC address Luis R. Rodriguez
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Luis R. Rodriguez @ 2014-03-03 22:46 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, kvm, xen-devel, mcgrof

From: "Luis R. Rodriguez" <mcgrof@suse.com>

This is my third series on addressing removing the xen-netback hack of
using a high MAC address for a root block preference after feedback and
testing of the bridge feature Stephen mentioned. We want to remove that
hack as its possible to end up with IPv6 conflicts upon SLAAC and DAD.

As per Stephen we want to use the root_block preference but this is a
networking bridge port primitive, not a net_device feature. Since
ndo_add_slave() only lets us pass a net_device *and* since the bridge code
assumes outright that a slave devices can become a root port we have
to use a private flag on the net_device if a driver wants to enable
root block since initialization. If we don't address this preference
early the device may end up as the designated port and can run into
undesired link topology preferences.

As I tested using the root block preference I noticed that if a net_device
slave under the bridge gets the designated root port prior to setting in
userspace the root_block feature enabling the feature won't kick the
bridge to remove that net_device from the designated port. I addressed
that issue and also upkeeping the initial random MAC address given to
the bridge as if othwerwise we'd end up with a zero MAC address bridge
if we root block all ports. I have only done local tests I'd appreciate a
bit more wide test coverage and review.

I suspect the TAP devices used for virtualization can also benefit from this
flag so likewise since TAP devices can be tuned in userspace by design
we must enable an option to pass to userspace to let the tun driver set
the root block private flag prior to device registration. I'd be interested
to hear if folks usng kvm in a bridged topology have run into this issue
and if this addresses that issue there as well, I think it will.

If someone wants to test this in qemu the patch below can be used, but
note that this simply force-enables root block if the kernel has that
feature. Since I've heard claims that some topologies exist where
the virtualized backend device could be used as the designated port
on the bridge it may be desirable to enable qemu users to specify their
preference. If we however prove that we don't need this at all ever for
backends we likley want this enabled always.

These patches apply onto today's net-next.

diff --git a/net/tap-linux.c b/net/tap-linux.c
index 812bf2d..34be39d 100644
--- a/net/tap-linux.c
+++ b/net/tap-linux.c
@@ -61,6 +61,10 @@ int tap_open(char *ifname, int ifname_size, int *vnet_hdr,
         ifr.ifr_flags |= IFF_ONE_QUEUE;
     }
 
+    if (features & IFF_ROOT_BLOCK) {
+        ifr.ifr_flags |= IFF_ROOT_BLOCK;
+    }
+
     if (*vnet_hdr) {
         if (features & IFF_VNET_HDR) {
             *vnet_hdr = 1;
diff --git a/net/tap-linux.h b/net/tap-linux.h
index 1cf35d4..d59e477 100644
--- a/net/tap-linux.h
+++ b/net/tap-linux.h
@@ -42,6 +42,8 @@
 #define IFF_ATTACH_QUEUE 0x0200
 #define IFF_DETACH_QUEUE 0x0400
 
+#define IFF_ROOT_BLOCK   0x2000
+
 /* Features for GSO (TUNSETOFFLOAD). */
 #define TUN_F_CSUM	0x01	/* You can hand me unchecksummed packets. */
 #define TUN_F_TSO4	0x02	/* I can handle TSO for IPv4 packets */

Luis R. Rodriguez (6):
  bridge: preserve random init MAC address
  bridge: trigger a bridge calculation upon port changes
  bridge: fix bridge root block on designated port
  bridge: enable root block during device registration
  xen-netback: use a random MAC address and force bridge root block
  tun: add initialization root block support

 drivers/net/tun.c                   |  6 ++-
 drivers/net/xen-netback/interface.c | 14 +++----
 include/linux/netdevice.h           |  7 ++++
 include/uapi/linux/if_tun.h         |  1 +
 net/bridge/br_device.c              |  1 +
 net/bridge/br_if.c                  |  2 +
 net/bridge/br_netlink.c             | 24 ++++++++++++
 net/bridge/br_private.h             |  2 +
 net/bridge/br_stp.c                 | 73 ++++++++++++++++++++++++++++++++++---
 net/bridge/br_stp_if.c              |  6 ++-
 10 files changed, 119 insertions(+), 17 deletions(-)

-- 
1.9.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC v3 1/6] bridge: preserve random init MAC address
  2014-03-03 22:46 [RFC v3 0/6] networking: address root block upon initialization Luis R. Rodriguez
@ 2014-03-03 22:47 ` Luis R. Rodriguez
  2014-03-03 22:47 ` [RFC v3 2/6] bridge: trigger a bridge calculation upon port changes Luis R. Rodriguez
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Luis R. Rodriguez @ 2014-03-03 22:47 UTC (permalink / raw)
  To: netdev; +Cc: kvm, mcgrof, bridge, linux-kernel, Stephen Hemminger, xen-devel

From: "Luis R. Rodriguez" <mcgrof@suse.com>

As it is now if you add create a bridge it gets started
with a random MAC address and if you then add a net_device
as a slave but later kick it out you end up with a zero
MAC address. Instead preserve the original random MAC
address and use it.

If you manually set the bridge address that will always
be respected. This change only takes effect if at the time
of computing the new root port we determine we have found
no candidates.

Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: kvm@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 net/bridge/br_device.c  | 1 +
 net/bridge/br_private.h | 1 +
 net/bridge/br_stp_if.c  | 3 +++
 3 files changed, 5 insertions(+)

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index b063050..5f13eac 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -368,6 +368,7 @@ void br_dev_setup(struct net_device *dev)
 	br->bridge_id.prio[1] = 0x00;
 
 	ether_addr_copy(br->group_addr, eth_reserved_addr_base);
+	ether_addr_copy(br->random_init_addr, dev->dev_addr);
 
 	br->stp_enabled = BR_NO_STP;
 	br->group_fwd_mask = BR_GROUPFWD_DEFAULT;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index e1ca1dc..32a06da 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -240,6 +240,7 @@ struct net_bridge
 	unsigned long			bridge_hello_time;
 	unsigned long			bridge_forward_delay;
 
+	u8				random_init_addr[ETH_ALEN];
 	u8				group_addr[ETH_ALEN];
 	u16				root_port;
 
diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
index 189ba1e..4c9ad45 100644
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -239,6 +239,9 @@ bool br_stp_recalculate_bridge_id(struct net_bridge *br)
 	if (ether_addr_equal(br->bridge_id.addr, addr))
 		return false;	/* no change */
 
+	if (ether_addr_equal(addr, br_mac_zero))
+		addr = br->random_init_addr;
+
 	br_stp_change_bridge_id(br, addr);
 	return true;
 }
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC v3 2/6] bridge: trigger a bridge calculation upon port changes
  2014-03-03 22:46 [RFC v3 0/6] networking: address root block upon initialization Luis R. Rodriguez
  2014-03-03 22:47 ` [RFC v3 1/6] bridge: preserve random init MAC address Luis R. Rodriguez
@ 2014-03-03 22:47 ` Luis R. Rodriguez
  2014-03-03 22:47 ` [RFC v3 3/6] bridge: fix bridge root block on designated port Luis R. Rodriguez
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Luis R. Rodriguez @ 2014-03-03 22:47 UTC (permalink / raw)
  To: netdev; +Cc: kvm, mcgrof, bridge, linux-kernel, Stephen Hemminger, xen-devel

From: "Luis R. Rodriguez" <mcgrof@suse.com>

If netlink is used to tune a port we currently don't trigger a
new recalculation of the bridge id, ensure that happens just as
if we're adding a new net_device onto the bridge.

Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: kvm@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 net/bridge/br_netlink.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index e74b6d53..6f1b26d 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -364,6 +364,7 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh)
 	struct net_bridge_port *p;
 	struct nlattr *tb[IFLA_BRPORT_MAX + 1];
 	int err = 0;
+	bool changed;
 
 	protinfo = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_PROTINFO);
 	afspec = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_AF_SPEC);
@@ -386,7 +387,12 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh)
 
 			spin_lock_bh(&p->br->lock);
 			err = br_setport(p, tb);
+			changed = br_stp_recalculate_bridge_id(p->br);
 			spin_unlock_bh(&p->br->lock);
+			if (changed)
+				call_netdevice_notifiers(NETDEV_CHANGEADDR,
+							 p->br->dev);
+			netdev_update_features(p->br->dev);
 		} else {
 			/* Binary compatibility with old RSTP */
 			if (nla_len(protinfo) < sizeof(u8))
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC v3 3/6] bridge: fix bridge root block on designated port
  2014-03-03 22:46 [RFC v3 0/6] networking: address root block upon initialization Luis R. Rodriguez
  2014-03-03 22:47 ` [RFC v3 1/6] bridge: preserve random init MAC address Luis R. Rodriguez
  2014-03-03 22:47 ` [RFC v3 2/6] bridge: trigger a bridge calculation upon port changes Luis R. Rodriguez
@ 2014-03-03 22:47 ` Luis R. Rodriguez
  2014-03-03 22:47 ` [RFC v3 4/6] bridge: enable root block during device registration Luis R. Rodriguez
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Luis R. Rodriguez @ 2014-03-03 22:47 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, kvm, xen-devel, mcgrof, Stephen Hemminger, bridge

From: "Luis R. Rodriguez" <mcgrof@suse.com>

Root port blocking was designed so that a bridge port can opt
out of becoming the designated root port for a bridge. If a port
however first becomes the designated root port and we then toggle
the root port block on it we currently don't kick that port out of
the designated root port. This fixes that. This is particularly
important for net_devices that would wish to never become a root
port from the start, currently toggling that off will enable root
port flag but it won't really kick the bridge and do what you'd
expect, the MAC address is kept on the bridge of the toggled port
for root_block if it was the designated port.

In order to catch if a port with root port block preference is set
we need to move our check for root block prior to the root selection
so check for root blocked ports upon eveyr br_configuration_update().
We also simply just prevent the root-blocked ports from consideration
as root port candidates on br_should_become_root_port() and
br_stp_recalculate_bridge_id().

The issue that this patch is trying to address and fix can be tested
easily before and after this patch is applied using 2 TAP net_devices
and then toggling at will with the root_block knob.

ip tuntap add dev tap0 mode tap
ip tuntap add dev tap1 mode tap
ip link add dev br0 type bridge
ip link show br0
echo -------------------------------------------
ip link set dev tap0 master br0
ip link
echo -------------------------------------------
ip link set dev tap1 master br0
ip link
echo -------------------------------------------

Upon review at the above results you can toggle root_block
on each port to see if you see the results you expect.

bridge link set dev tap0 root_block on
ip link
bridge link set dev tap1 root_block on
ip link

Toggling off root_block on any port should will bring back the
port to be a candidate for designated root port:

bridge link set dev tap1 root_block off
ip link
bridge link set dev tap0 root_block off
ip link

To nuke:

ip tuntap del tap0 mode tap
ip tuntap del tap0 mode tap

Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: kvm@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 net/bridge/br_netlink.c | 18 ++++++++++++
 net/bridge/br_private.h |  1 +
 net/bridge/br_stp.c     | 73 +++++++++++++++++++++++++++++++++++++++++++++----
 net/bridge/br_stp_if.c  |  3 +-
 4 files changed, 88 insertions(+), 7 deletions(-)

diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 6f1b26d..fbec354 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -324,6 +324,21 @@ static void br_set_port_flag(struct net_bridge_port *p, struct nlattr *tb[],
 	}
 }
 
+static void br_kick_bridge_port(struct net_bridge_port *p)
+{
+	struct net_bridge *br = p->br;
+	bool wasroot;
+
+	wasroot = br_is_root_bridge(br);
+	br_become_designated_port(p);
+
+	br_configuration_update(br);
+	br_port_state_selection(br);
+
+	if (br_is_root_bridge(br) && !wasroot)
+		br_become_root_bridge(br);
+}
+
 /* Process bridge protocol info on port */
 static int br_setport(struct net_bridge_port *p, struct nlattr *tb[])
 {
@@ -353,6 +368,9 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[])
 		if (err)
 			return err;
 	}
+
+	br_kick_bridge_port(p);
+
 	return 0;
 }
 
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 32a06da..45d7917 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -150,6 +150,7 @@ struct net_bridge_port
 	u8				priority;
 	u8				state;
 	u16				port_no;
+	bool				root_block_enabled;
 	unsigned char			topology_change_ack;
 	unsigned char			config_pending;
 	port_id				port_id;
diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
index 3c86f05..f5741f3 100644
--- a/net/bridge/br_stp.c
+++ b/net/bridge/br_stp.c
@@ -59,6 +59,7 @@ static int br_should_become_root_port(const struct net_bridge_port *p,
 
 	br = p->br;
 	if (p->state == BR_STATE_DISABLED ||
+	   (p->flags & BR_ROOT_BLOCK) ||
 	    br_is_designated_port(p))
 		return 0;
 
@@ -104,7 +105,7 @@ static void br_root_port_block(const struct net_bridge *br,
 			       struct net_bridge_port *p)
 {
 
-	br_notice(br, "port %u(%s) tried to become root port (blocked)",
+	br_notice(br, "port %u (%s) is now root blocked",
 		  (unsigned int) p->port_no, p->dev->name);
 
 	p->state = BR_STATE_LISTENING;
@@ -124,11 +125,7 @@ static void br_root_selection(struct net_bridge *br)
 	list_for_each_entry(p, &br->port_list, list) {
 		if (!br_should_become_root_port(p, root_port))
 			continue;
-
-		if (p->flags & BR_ROOT_BLOCK)
-			br_root_port_block(br, p);
-		else
-			root_port = p->port_no;
+		root_port = p->port_no;
 	}
 
 	br->root_port = root_port;
@@ -358,10 +355,74 @@ static void br_reply(struct net_bridge_port *p)
 	br_transmit_config(p);
 }
 
+static bool br_root_port_block_check(struct net_bridge_port *p)
+{
+	bool designated = false;
+
+	if (p->root_block_enabled &&
+	    !(p->flags & BR_ROOT_BLOCK)) {
+		br_notice(p->br, "port %u (%s) is root block toggled off",
+			  (unsigned int) p->port_no, p->dev->name);
+		return false;
+	}
+
+	if (p->flags & BR_ROOT_BLOCK &&
+	    (p->root_block_enabled))
+		return false;
+
+	if (!(p->flags & BR_ROOT_BLOCK))
+		return false;
+
+	if (br_is_designated_port(p)) {
+		designated = true;
+		p->flags = BR_ROOT_BLOCK |
+			   BR_LEARNING |
+			   BR_FLOOD;
+		p->priority = 0x8000 >> BR_PORT_BITS;
+		br_init_port(p);
+	}
+
+	br_root_port_block(p->br, p);
+	p->root_block_enabled = true;
+
+	return designated;
+}
+
+/* Updates block ports as required and returns true if one was designated */
+static bool br_update_block_ports(struct net_bridge *br)
+{
+	struct net_bridge_port *p;
+	bool designated = false;
+
+	list_for_each_entry(p, &br->port_list, list) {
+		designated = br_root_port_block_check(p);
+		if (designated)
+			return designated;
+	}
+
+	return false;
+}
+
+static void br_update_designated_port(struct net_bridge *br)
+{
+	struct net_bridge_port *p;
+
+	list_for_each_entry(p, &br->port_list, list)
+		br_become_designated_port(p);
+}
+
 /* called under bridge lock */
 void br_configuration_update(struct net_bridge *br)
 {
+	bool designated_was_blocked = false;
+
+	designated_was_blocked = br_update_block_ports(br);
+
 	br_root_selection(br);
+
+	if (designated_was_blocked)
+		br_update_designated_port(br);
+
 	br_designated_port_selection(br);
 }
 
diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
index 4c9ad45..62b84af 100644
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -230,10 +230,11 @@ bool br_stp_recalculate_bridge_id(struct net_bridge *br)
 		return false;
 
 	list_for_each_entry(p, &br->port_list, list) {
+		if (p->flags & BR_ROOT_BLOCK)
+			continue;
 		if (addr == br_mac_zero ||
 		    memcmp(p->dev->dev_addr, addr, ETH_ALEN) < 0)
 			addr = p->dev->dev_addr;
-
 	}
 
 	if (ether_addr_equal(br->bridge_id.addr, addr))
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC v3 4/6] bridge: enable root block during device registration
  2014-03-03 22:46 [RFC v3 0/6] networking: address root block upon initialization Luis R. Rodriguez
                   ` (2 preceding siblings ...)
  2014-03-03 22:47 ` [RFC v3 3/6] bridge: fix bridge root block on designated port Luis R. Rodriguez
@ 2014-03-03 22:47 ` Luis R. Rodriguez
  2014-03-03 23:43   ` Stephen Hemminger
  2014-03-03 22:47 ` [RFC v3 5/6] xen-netback: use a random MAC address and force bridge root block Luis R. Rodriguez
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Luis R. Rodriguez @ 2014-03-03 22:47 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, kvm, xen-devel, mcgrof, Stephen Hemminger, bridge

From: "Luis R. Rodriguez" <mcgrof@suse.com>

root block support was added via 1007dd1a on v3.8 but toggling
this flag is only allowed after a device has been registered and
added to a bridge as its a bridge *port* primitive, not a *net_device*
feature. There are work arounds possible to account for the lack
of netlink tools to toggle root_block, such as using the root_block
syfs attribute [0] or using udev / the driver to set the MAC address
to something high such as FE:FF:FF:FF:FF:FF, but neither of these
ensure root block is respected _from_the_start_ through device
initialization.

In order to support the root_block feature from the start since device
initialization and in order to avoid having to require userspace
work arounds to existing deployments this exposes a private
net_device flag which enables drivers that know they want to
start with the root_block feature enabled form the start. The
only caveat with this is topologies that require STP or non-root
will either have to use sysfs [0] or netlink tools like the
iproute2 bridge util to toggle the flag off after initialization.
This is an accepted compromise.

This flag is required given that ndo_add_slave() currently does not
allow specifying any other parameters other than the net_device. We
could extend this but in order to do that properly we'd need to
evaluate all other types of master device implementations.

[0] echo 1 > /sys/devices/vif-2-0/net/vif2.0/brport/root_block

Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: kvm@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 include/linux/netdevice.h | 7 +++++++
 net/bridge/br_if.c        | 2 ++
 2 files changed, 9 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 1a86948..b17643a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1181,6 +1181,11 @@ struct net_device_ops {
  * @IFF_LIVE_ADDR_CHANGE: device supports hardware address
  *	change when it's running
  * @IFF_MACVLAN: Macvlan device
+ * @IFF_BRIDGE_ROOT_BLOCK: don't consider this net_device for root port
+ *	when this interface is added to a bridge. This makes use of the
+ *	root_block mechanism but since its a bridge port primitive this
+ *	flag can be used to instantiate the preference to have root block
+ *	enabled from the start since initialization.
  */
 enum netdev_priv_flags {
 	IFF_802_1Q_VLAN			= 1<<0,
@@ -1205,6 +1210,7 @@ enum netdev_priv_flags {
 	IFF_SUPP_NOFCS			= 1<<19,
 	IFF_LIVE_ADDR_CHANGE		= 1<<20,
 	IFF_MACVLAN			= 1<<21,
+	IFF_BRIDGE_ROOT_BLOCK		= 1<<22,
 };
 
 #define IFF_802_1Q_VLAN			IFF_802_1Q_VLAN
@@ -1229,6 +1235,7 @@ enum netdev_priv_flags {
 #define IFF_SUPP_NOFCS			IFF_SUPP_NOFCS
 #define IFF_LIVE_ADDR_CHANGE		IFF_LIVE_ADDR_CHANGE
 #define IFF_MACVLAN			IFF_MACVLAN
+#define IFF_BRIDGE_ROOT_BLOCK		IFF_BRIDGE_ROOT_BLOCK
 
 /*
  *	The DEVICE structure.
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 54d207d..4bee748 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -228,6 +228,8 @@ static struct net_bridge_port *new_nbp(struct net_bridge *br,
 	br_init_port(p);
 	p->state = BR_STATE_DISABLED;
 	br_stp_port_timer_init(p);
+	if (dev->priv_flags & IFF_BRIDGE_ROOT_BLOCK)
+		p->flags |= BR_ROOT_BLOCK;
 	br_multicast_add_port(p);
 
 	return p;
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC v3 5/6] xen-netback: use a random MAC address and force bridge root block
  2014-03-03 22:46 [RFC v3 0/6] networking: address root block upon initialization Luis R. Rodriguez
                   ` (3 preceding siblings ...)
  2014-03-03 22:47 ` [RFC v3 4/6] bridge: enable root block during device registration Luis R. Rodriguez
@ 2014-03-03 22:47 ` Luis R. Rodriguez
  2014-03-03 22:47 ` [RFC v3 6/6] tun: add initialization root block support Luis R. Rodriguez
  2014-03-04  1:05 ` [RFC v3 0/6] networking: address root block upon initialization Luis R. Rodriguez
  6 siblings, 0 replies; 13+ messages in thread
From: Luis R. Rodriguez @ 2014-03-03 22:47 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, kvm, xen-devel, mcgrof, Stephen Hemminger, bridge

From: "Luis R. Rodriguez" <mcgrof@suse.com>

The purpose of using a static MAC address of FE:FF:FF:FF:FF:FF
was to prevent our backend interfaces from being used by the
bridge and nominating our interface as a root port on the bridge.
This was possible given that the bridge code will use the lowest MAC
address for a port once a new interface gets added to the bridge.

Sticking to a static MAC address is undesirable for a few reasons:

  a) using a static MAC address by default on all interfaces can
     lead to possible conflicts with IPv6 SLAAC and DAD

  b) The bridge code has a generic bridge port 'root block' feature
     to allow interfaces to opt out from root bridge port nominations.
     We want to help stop spreading the use of a high MAC address
     as a hack to do root port block, and instead get folks to
     use the proper APIs for this.

This modifies xen-netback to use a random MAC address with the xen OUI
prefix and enables the bridge root block feature since initialization.
Although toggling the root block feature requires the iproute2 bridge
tool [0] or sysfs [1] xen-netback users would only need to toggle this
off in the case that the net_device is required to be a candidate for
root port nomination on the bridge. This is an acceptable compromise
in order to avoid the possible conflicts with IPv6 SLAAC and DAD.

[0] bridge link set dev vif2.0 root_block off
[1] echo 1 > /sys/devices/vif-2-0/net/vif2.0/brport/root_block

Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: kvm@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 drivers/net/xen-netback/interface.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 7669d49..79d71ec 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -42,6 +42,8 @@
 #define XENVIF_QUEUE_LENGTH 32
 #define XENVIF_NAPI_WEIGHT  64
 
+static const u8 xen_oui[3] = { 0x00, 0x16, 0x3e };
+
 int xenvif_schedulable(struct xenvif *vif)
 {
 	return netif_running(vif->dev) && netif_carrier_ok(vif->dev);
@@ -346,15 +348,9 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
 	for (i = 0; i < MAX_PENDING_REQS; i++)
 		vif->mmap_pages[i] = NULL;
 
-	/*
-	 * Initialise a dummy MAC address. We choose the numerically
-	 * largest non-broadcast address to prevent the address getting
-	 * stolen by an Ethernet bridge for STP purposes.
-	 * (FE:FF:FF:FF:FF:FF)
-	 */
-	memset(dev->dev_addr, 0xFF, ETH_ALEN);
-	dev->dev_addr[0] &= ~0x01;
-
+	eth_hw_addr_random(dev);
+	memcpy(dev->dev_addr, xen_oui, 3);
+	dev->priv_flags |= IFF_BRIDGE_ROOT_BLOCK;
 	netif_napi_add(dev, &vif->napi, xenvif_poll, XENVIF_NAPI_WEIGHT);
 
 	netif_carrier_off(dev);
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC v3 6/6] tun: add initialization root block support
  2014-03-03 22:46 [RFC v3 0/6] networking: address root block upon initialization Luis R. Rodriguez
                   ` (4 preceding siblings ...)
  2014-03-03 22:47 ` [RFC v3 5/6] xen-netback: use a random MAC address and force bridge root block Luis R. Rodriguez
@ 2014-03-03 22:47 ` Luis R. Rodriguez
  2014-03-04  1:05 ` [RFC v3 0/6] networking: address root block upon initialization Luis R. Rodriguez
  6 siblings, 0 replies; 13+ messages in thread
From: Luis R. Rodriguez @ 2014-03-03 22:47 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, kvm, xen-devel, mcgrof, Stephen Hemminger, bridge

From: "Luis R. Rodriguez" <mcgrof@suse.com>

The networking bridge module allows us to specify a
root block preference on net_devices but this feature
is a bridge port primitive. The bridge module assumes
that once a device is added as a slave to the brige
that it can be considered for the the root port.
Furthermore ndo_add_slave() only lets us pass on the
net_device, but drivers that want to root block
since the beginning need specify their root block
preference prior to register_netdevice(). This means
the tun driver must expose a knob to userspace to
enable toggling this feature prior to adding the
device to a bridge to prevent any undesired link
bouncing issues on the bridge.

This should be useful for TAP interfaces which are used
for virtualization.

Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: kvm@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 drivers/net/tun.c           | 6 +++++-
 include/uapi/linux/if_tun.h | 1 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 8fe9cb7..e763fad 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1693,6 +1693,9 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
 		if (err < 0)
 			goto err_free_flow;
 
+		if (ifr->ifr_flags & IFF_ROOT_BLOCK)
+			dev->priv_flags |= IFF_BRIDGE_ROOT_BLOCK;
+
 		err = register_netdevice(tun->dev);
 		if (err < 0)
 			goto err_detach;
@@ -1898,7 +1901,8 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 		 * This is needed because we never checked for invalid flags on
 		 * TUNSETIFF. */
 		return put_user(IFF_TUN | IFF_TAP | IFF_NO_PI | IFF_ONE_QUEUE |
-				IFF_VNET_HDR | IFF_MULTI_QUEUE,
+				IFF_VNET_HDR | IFF_MULTI_QUEUE |
+				IFF_ROOT_BLOCK,
 				(unsigned int __user*)argp);
 	} else if (cmd == TUNSETQUEUE)
 		return tun_set_queue(file, &ifr);
diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
index e9502dd..f814a85 100644
--- a/include/uapi/linux/if_tun.h
+++ b/include/uapi/linux/if_tun.h
@@ -73,6 +73,7 @@
 /* read-only flag */
 #define IFF_PERSIST	0x0800
 #define IFF_NOFILTER	0x1000
+#define IFF_ROOT_BLOCK	0x2000
 
 /* Socket options */
 #define TUN_TX_TIMESTAMP 1
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC v3 4/6] bridge: enable root block during device registration
  2014-03-03 22:47 ` [RFC v3 4/6] bridge: enable root block during device registration Luis R. Rodriguez
@ 2014-03-03 23:43   ` Stephen Hemminger
  2014-03-03 23:58     ` Luis R. Rodriguez
  0 siblings, 1 reply; 13+ messages in thread
From: Stephen Hemminger @ 2014-03-03 23:43 UTC (permalink / raw)
  To: Luis R. Rodriguez; +Cc: kvm, netdev, mcgrof, bridge, linux-kernel, xen-devel

On Mon,  3 Mar 2014 14:47:03 -0800
"Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote:

> From: "Luis R. Rodriguez" <mcgrof@suse.com>
> 
> root block support was added via 1007dd1a on v3.8 but toggling
> this flag is only allowed after a device has been registered and
> added to a bridge as its a bridge *port* primitive, not a *net_device*
> feature. There are work arounds possible to account for the lack
> of netlink tools to toggle root_block, such as using the root_block
> syfs attribute [0] or using udev / the driver to set the MAC address
> to something high such as FE:FF:FF:FF:FF:FF, but neither of these
> ensure root block is respected _from_the_start_ through device
> initialization.
> 
> In order to support the root_block feature from the start since device
> initialization and in order to avoid having to require userspace
> work arounds to existing deployments this exposes a private
> net_device flag which enables drivers that know they want to
> start with the root_block feature enabled form the start. The
> only caveat with this is topologies that require STP or non-root
> will either have to use sysfs [0] or netlink tools like the
> iproute2 bridge util to toggle the flag off after initialization.
> This is an accepted compromise.
> 
> This flag is required given that ndo_add_slave() currently does not
> allow specifying any other parameters other than the net_device. We
> could extend this but in order to do that properly we'd need to
> evaluate all other types of master device implementations.
> 
> [0] echo 1 > /sys/devices/vif-2-0/net/vif2.0/brport/root_block
> 
> Cc: Stephen Hemminger <stephen@networkplumber.org>
> Cc: bridge@lists.linux-foundation.org
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: xen-devel@lists.xenproject.org
> Cc: kvm@vger.kernel.org
> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
> ---
>  include/linux/netdevice.h | 7 +++++++
>  net/bridge/br_if.c        | 2 ++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 1a86948..b17643a 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1181,6 +1181,11 @@ struct net_device_ops {
>   * @IFF_LIVE_ADDR_CHANGE: device supports hardware address
>   *	change when it's running
>   * @IFF_MACVLAN: Macvlan device
> + * @IFF_BRIDGE_ROOT_BLOCK: don't consider this net_device for root port
> + *	when this interface is added to a bridge. This makes use of the
> + *	root_block mechanism but since its a bridge port primitive this
> + *	flag can be used to instantiate the preference to have root block
> + *	enabled from the start since initialization.
>   */

Doing this in priv flags bloats what is a limited resource (# of bits).
Plus there are issues (what if this is changed after adding to bridge)?

Maybe better to enhance existing netlink infrastructure to allow passing
flags on adding port to bridge.

Also, unless device is up, nothing will happen right away when added to bridge.
Root port status can be changed since device is disabled.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC v3 4/6] bridge: enable root block during device registration
  2014-03-03 23:43   ` Stephen Hemminger
@ 2014-03-03 23:58     ` Luis R. Rodriguez
  2014-03-04  0:31       ` Stephen Hemminger
  0 siblings, 1 reply; 13+ messages in thread
From: Luis R. Rodriguez @ 2014-03-03 23:58 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, bridge, linux-kernel, kvm, xen-devel

On Mon, Mar 3, 2014 at 3:43 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> Doing this in priv flags bloats what is a limited resource (# of bits).

Agreed. I tried to avoid it but saw no other option for addressing
this during initialization properly without requirng a userspace
upgrade.

> Plus there are issues (what if this is changed after adding to bridge)?

Its only considered when the net_device gets added to the bridge. If
you add it and then toggle root_block off that will be respected. If
you remove the net_device from the bridge and then add it back on the
priv flag will be respected but if desirable we could toggle it off if
userspace was used to disable root_block once on.

> Maybe better to enhance existing netlink infrastructure to allow passing
> flags on adding port to bridge.

Agreed, I looked into this and the limitation of using the existing
slave add request is that ndo_add_slave() only passes the net_device,
we'd have to either extend that (with collateral study on other slaves
as noted on my cover letter) or we'd have to make this a UAPI netlink
feature for the net_device. Some old userspace  may not use netlink
too, and they may be stuck with brctl, I tried to create a compromise
only to not affect old userspace if that kernel gets upgraded.

> Also, unless device is up, nothing will happen right away when added to bridge.

I get different results, the bridge immediately does compute whether
or not to nominate a device for the bridge interface and root port.
The commands provided as an example on 3/6 can be used to replicate
this.

> Root port status can be changed since device is disabled.

Agreed however the reason for the flag is to address removing high MAC
address hack-preference set on drivers that were merged prior to the
root_block feature. Without a proper way to address this upon
initialization we're stuck with requiring userspace upgrade if these
driver's hack is removed.

  Luis

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC v3 4/6] bridge: enable root block during device registration
  2014-03-03 23:58     ` Luis R. Rodriguez
@ 2014-03-04  0:31       ` Stephen Hemminger
  2014-03-04  0:53         ` Luis R. Rodriguez
  0 siblings, 1 reply; 13+ messages in thread
From: Stephen Hemminger @ 2014-03-04  0:31 UTC (permalink / raw)
  To: Luis R. Rodriguez; +Cc: netdev, bridge, linux-kernel, kvm, xen-devel

On Mon, 3 Mar 2014 15:58:50 -0800
"Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote:

> On Mon, Mar 3, 2014 at 3:43 PM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> > Doing this in priv flags bloats what is a limited resource (# of bits).  
> 
> Agreed. I tried to avoid it but saw no other option for addressing
> this during initialization properly without requirng a userspace
> upgrade.

Replacing one Xen hack for another doesn't seem like great progress.
I would rather see an API change as needed because there are other
things like setting STP parameters which might also want to be part
of initial device add.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC v3 4/6] bridge: enable root block during device registration
  2014-03-04  0:31       ` Stephen Hemminger
@ 2014-03-04  0:53         ` Luis R. Rodriguez
  0 siblings, 0 replies; 13+ messages in thread
From: Luis R. Rodriguez @ 2014-03-04  0:53 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, bridge, linux-kernel, kvm, xen-devel

On Mon, Mar 3, 2014 at 4:31 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Mon, 3 Mar 2014 15:58:50 -0800
> "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote:
>
>> On Mon, Mar 3, 2014 at 3:43 PM, Stephen Hemminger
>> <stephen@networkplumber.org> wrote:
>> > Doing this in priv flags bloats what is a limited resource (# of bits).
>>
>> Agreed. I tried to avoid it but saw no other option for addressing
>> this during initialization properly without requirng a userspace
>> upgrade.
>
> Replacing one Xen hack for another doesn't seem like great progress.

I'm the last person that wants to see any internal driver hack
propagate, the goal was to kill one here.

> I would rather see an API change as needed because there are other
> things like setting STP parameters which might also want to be part
> of initial device add.

As it stands right now ndo_add_slave() currently only allows us to
pass the net_device, that's it, so we're confined there. One way I can
think of to properly extend this interface is to allow extra
parameters to be passed. We may even be able to move some old
priv_flags for bonding onto this new interface but before I moved down
this more invasive approach I wanted to review this other approach.
Does extending ndo_add_slave() seems reasonable, or do you have any
other preferred alternatives?

Even though the high MAC address was a xen hack it is a hack put in
place prior to the root block feature being added, so I would like at
the very least a bit of consideration on userspace here. Its really
the only reason why I've been stubborn on looking for a compromise. In
the KVM case that doesn't use the root_block feature yet I would have
considered just extending the ndo_add_slave() from the start but
because xen-netback *does* have in place a hack removing it requires
userspace considerations.

  Luis

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC v3 0/6] networking: address root block upon initialization
  2014-03-03 22:46 [RFC v3 0/6] networking: address root block upon initialization Luis R. Rodriguez
                   ` (5 preceding siblings ...)
  2014-03-03 22:47 ` [RFC v3 6/6] tun: add initialization root block support Luis R. Rodriguez
@ 2014-03-04  1:05 ` Luis R. Rodriguez
  2014-03-04  1:44   ` Stephen Hemminger
  6 siblings, 1 reply; 13+ messages in thread
From: Luis R. Rodriguez @ 2014-03-04  1:05 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, kvm, xen-devel, Luis R. Rodriguez

On Mon, Mar 3, 2014 at 2:46 PM, Luis R. Rodriguez
<mcgrof@do-not-panic.com> wrote:
> From: "Luis R. Rodriguez" <mcgrof@suse.com>

<-- snip -->

> As I tested using the root block preference I noticed that if a net_device
> slave under the bridge gets the designated root port prior to setting in
> userspace the root_block feature enabling the feature won't kick the
> bridge to remove that net_device from the designated port. I addressed
> that issue and also upkeeping the initial random MAC address given to
> the bridge as if othwerwise we'd end up with a zero MAC address bridge
> if we root block all ports. I have only done local tests I'd appreciate a
> bit more wide test coverage and review.


Stephen,

I should note that even if we discard patches 4-6 patches for an
alternative implementation patches 1-3 should still be applicable for
review. Let me know what you think of those.

  Luis

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC v3 0/6] networking: address root block upon initialization
  2014-03-04  1:05 ` [RFC v3 0/6] networking: address root block upon initialization Luis R. Rodriguez
@ 2014-03-04  1:44   ` Stephen Hemminger
  0 siblings, 0 replies; 13+ messages in thread
From: Stephen Hemminger @ 2014-03-04  1:44 UTC (permalink / raw)
  To: Luis R. Rodriguez; +Cc: netdev, linux-kernel, kvm, xen-devel

On Mon, 3 Mar 2014 17:05:18 -0800
"Luis R. Rodriguez" <mcgrof@suse.com> wrote:

> On Mon, Mar 3, 2014 at 2:46 PM, Luis R. Rodriguez
> <mcgrof@do-not-panic.com> wrote:
> > From: "Luis R. Rodriguez" <mcgrof@suse.com>
> 
> <-- snip -->
> 
> > As I tested using the root block preference I noticed that if a net_device
> > slave under the bridge gets the designated root port prior to setting in
> > userspace the root_block feature enabling the feature won't kick the
> > bridge to remove that net_device from the designated port. I addressed
> > that issue and also upkeeping the initial random MAC address given to
> > the bridge as if othwerwise we'd end up with a zero MAC address bridge
> > if we root block all ports. I have only done local tests I'd appreciate a
> > bit more wide test coverage and review.
> 
> 
> Stephen,
> 
> I should note that even if we discard patches 4-6 patches for an
> alternative implementation patches 1-3 should still be applicable for
> review. Let me know what you think of those.
> 
>   Luis
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

I agree with 0-3 as normal improvements.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2014-03-04  1:44 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-03 22:46 [RFC v3 0/6] networking: address root block upon initialization Luis R. Rodriguez
2014-03-03 22:47 ` [RFC v3 1/6] bridge: preserve random init MAC address Luis R. Rodriguez
2014-03-03 22:47 ` [RFC v3 2/6] bridge: trigger a bridge calculation upon port changes Luis R. Rodriguez
2014-03-03 22:47 ` [RFC v3 3/6] bridge: fix bridge root block on designated port Luis R. Rodriguez
2014-03-03 22:47 ` [RFC v3 4/6] bridge: enable root block during device registration Luis R. Rodriguez
2014-03-03 23:43   ` Stephen Hemminger
2014-03-03 23:58     ` Luis R. Rodriguez
2014-03-04  0:31       ` Stephen Hemminger
2014-03-04  0:53         ` Luis R. Rodriguez
2014-03-03 22:47 ` [RFC v3 5/6] xen-netback: use a random MAC address and force bridge root block Luis R. Rodriguez
2014-03-03 22:47 ` [RFC v3 6/6] tun: add initialization root block support Luis R. Rodriguez
2014-03-04  1:05 ` [RFC v3 0/6] networking: address root block upon initialization Luis R. Rodriguez
2014-03-04  1:44   ` Stephen Hemminger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).