netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/3] net: bridge: propagate FDB table into hardware
@ 2012-02-29  7:17 John Fastabend
  2012-02-29  7:17 ` [RFC PATCH 1/3] net: refactor br_fdb_xxx rtnetlink routines John Fastabend
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: John Fastabend @ 2012-02-29  7:17 UTC (permalink / raw)
  To: jhs, shemminger, kernel
  Cc: hadi, bhutchings, roprabhu, mst, netdev, gregory.v.rose, davem

This series is a follow up to the RFC thread with the same title.
It is intended to manage the forwarding database in embedded
switches One specific example being SR-IOV NICs.

This adds a private netdev flag (IFF_OFFLOADED_FDB) to enable
management of the FDB and a ndm_flags bit to set in the PF_BRIDGE
netlink commands to push the command at the embedded device.

As it is currently implemented this requires loading the bridge
module but has the advantage that the HW and SW FDB commands look
the same. It was easy for example to modify existing user space
tools to support this.

At this point I've only lightly tested it and need to audit the
code in the light of day. But any comments would appreciated it. I
figured pushing code out was easier then continuing a thread about
hypothetical solutions.

Thanks!

---

John Fastabend (3):
      net: drivers: set IFF_OFFLOADED_FDB priv flag on ixgbe and igb
      net: expose ebridge FDB with priv flag IFF_OFFLOADED_FDB
      net: refactor br_fdb_xxx rtnetlink routines


 drivers/net/ethernet/intel/igb/igb_main.c     |    2 
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |    4 
 include/linux/if.h                            |    2 
 include/linux/neighbour.h                     |    2 
 net/bridge/br_fdb.c                           |   88 +++------
 net/bridge/br_netlink.c                       |  237 ++++++++++++++++++++++++-
 net/bridge/br_private.h                       |    9 +
 7 files changed, 269 insertions(+), 75 deletions(-)

-- 
Signature

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC PATCH 1/3] net: refactor br_fdb_xxx rtnetlink routines
  2012-02-29  7:17 [RFC PATCH 0/3] net: bridge: propagate FDB table into hardware John Fastabend
@ 2012-02-29  7:17 ` John Fastabend
  2012-02-29  7:17 ` [RFC PATCH 2/3] net: expose ebridge FDB with priv flag IFF_OFFLOADED_FDB John Fastabend
  2012-02-29  7:17 ` [RFC PATCH 3/3] net: drivers: set IFF_OFFLOADED_FDB priv flag on ixgbe and igb John Fastabend
  2 siblings, 0 replies; 7+ messages in thread
From: John Fastabend @ 2012-02-29  7:17 UTC (permalink / raw)
  To: jhs, shemminger, kernel
  Cc: hadi, bhutchings, roprabhu, mst, netdev, gregory.v.rose, davem

Refactor PF_BRIDGE:RTM_NEWNEIGH, PF_BRIDGE:RTM_DELNEIGH, and
PF_BRIDGE:RTM_GETNEIGH messages to be handled by other ports
besides IFF_BRIDGE_PORT.

This is setup work to manage embedded FDB tables.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---

 net/bridge/br_fdb.c     |   88 ++++++++++++--------------------------------
 net/bridge/br_netlink.c |   95 ++++++++++++++++++++++++++++++++++++++++++++---
 net/bridge/br_private.h |    9 +++-
 3 files changed, 119 insertions(+), 73 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 5ba0c84..888a15f 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -535,44 +535,34 @@ errout:
 }
 
 /* Dump information about entries, in response to GETNEIGH */
-int br_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb)
+int br_fdb_dump(struct sk_buff *skb,
+		struct netlink_callback *cb,
+		struct net_device *dev,
+		int idx)
 {
-	struct net *net = sock_net(skb->sk);
-	struct net_device *dev;
-	int idx = 0;
-
-	rcu_read_lock();
-	for_each_netdev_rcu(net, dev) {
-		struct net_bridge *br = netdev_priv(dev);
-		int i;
-
-		if (!(dev->priv_flags & IFF_EBRIDGE))
-			continue;
-
-		for (i = 0; i < BR_HASH_SIZE; i++) {
-			struct hlist_node *h;
-			struct net_bridge_fdb_entry *f;
+	struct net_bridge *br = netdev_priv(dev);
+	int i;
 
-			hlist_for_each_entry_rcu(f, h, &br->hash[i], hlist) {
-				if (idx < cb->args[0])
-					goto skip;
+	for (i = 0; i < BR_HASH_SIZE; i++) {
+		struct hlist_node *h;
+		struct net_bridge_fdb_entry *f;
 
-				if (fdb_fill_info(skb, br, f,
-						  NETLINK_CB(cb->skb).pid,
-						  cb->nlh->nlmsg_seq,
-						  RTM_NEWNEIGH,
-						  NLM_F_MULTI) < 0)
-					break;
+		hlist_for_each_entry_rcu(f, h, &br->hash[i], hlist) {
+			if (idx < cb->args[0])
+				goto skip;
+
+			if (fdb_fill_info(skb, br, f,
+					  NETLINK_CB(cb->skb).pid,
+					  cb->nlh->nlmsg_seq,
+					  RTM_NEWNEIGH,
+					  NLM_F_MULTI) < 0)
+				break;
 skip:
-				++idx;
-			}
+			++idx;
 		}
 	}
-	rcu_read_unlock();
-
-	cb->args[0] = idx;
 
-	return skb->len;
+	return idx;
 }
 
 /* Update (create or replace) forwarding database entry */
@@ -614,12 +604,10 @@ static int fdb_add_entry(struct net_bridge_port *source, const __u8 *addr,
 }
 
 /* Add new permanent fdb entry with RTM_NEWNEIGH */
-int br_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
+int br_fdb_add(struct nlmsghdr *nlh, struct net_device *dev)
 {
-	struct net *net = sock_net(skb->sk);
 	struct ndmsg *ndm;
 	struct nlattr *tb[NDA_MAX+1];
-	struct net_device *dev;
 	struct net_bridge_port *p;
 	const __u8 *addr;
 	int err;
@@ -630,17 +618,6 @@ int br_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 		return err;
 
 	ndm = nlmsg_data(nlh);
-	if (ndm->ndm_ifindex == 0) {
-		pr_info("bridge: RTM_NEWNEIGH with invalid ifindex\n");
-		return -EINVAL;
-	}
-
-	dev = __dev_get_by_index(net, ndm->ndm_ifindex);
-	if (dev == NULL) {
-		pr_info("bridge: RTM_NEWNEIGH with unknown ifindex\n");
-		return -ENODEV;
-	}
-
 	if (!tb[NDA_LLADDR] || nla_len(tb[NDA_LLADDR]) != ETH_ALEN) {
 		pr_info("bridge: RTM_NEWNEIGH with invalid address\n");
 		return -EINVAL;
@@ -692,33 +669,16 @@ static int fdb_delete_by_addr(struct net_bridge_port *p, const u8 *addr)
 }
 
 /* Remove neighbor entry with RTM_DELNEIGH */
-int br_fdb_delete(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
+int br_fdb_del(struct nlmsghdr *nlh, struct net_device *dev)
 {
-	struct net *net = sock_net(skb->sk);
-	struct ndmsg *ndm;
 	struct net_bridge_port *p;
 	struct nlattr *llattr;
 	const __u8 *addr;
-	struct net_device *dev;
 	int err;
 
 	ASSERT_RTNL();
-	if (nlmsg_len(nlh) < sizeof(*ndm))
-		return -EINVAL;
-
-	ndm = nlmsg_data(nlh);
-	if (ndm->ndm_ifindex == 0) {
-		pr_info("bridge: RTM_DELNEIGH with invalid ifindex\n");
-		return -EINVAL;
-	}
-
-	dev = __dev_get_by_index(net, ndm->ndm_ifindex);
-	if (dev == NULL) {
-		pr_info("bridge: RTM_DELNEIGH with unknown ifindex\n");
-		return -ENODEV;
-	}
 
-	llattr = nlmsg_find_attr(nlh, sizeof(*ndm), NDA_LLADDR);
+	llattr = nlmsg_find_attr(nlh, sizeof(struct ndmsg), NDA_LLADDR);
 	if (llattr == NULL || nla_len(llattr) != ETH_ALEN) {
 		pr_info("bridge: RTM_DELNEIGH with invalid address\n");
 		return -EINVAL;
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index a1daf82..9e70191 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -211,6 +211,86 @@ static int br_validate(struct nlattr *tb[], struct nlattr *data[])
 	return 0;
 }
 
+static int rtnl_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
+{
+	struct net *net = sock_net(skb->sk);
+	struct ndmsg *ndm;
+	struct nlattr *tb[NDA_MAX+1];
+	struct net_device *dev;
+	int err;
+
+	err = nlmsg_parse(nlh, sizeof(*ndm), tb, NDA_MAX, NULL);
+	if (err < 0)
+		return err;
+
+	ndm = nlmsg_data(nlh);
+	if (ndm->ndm_ifindex == 0) {
+		pr_info("bridge: RTM_NEWNEIGH with invalid ifindex\n");
+		return -EINVAL;
+	}
+
+	dev = __dev_get_by_index(net, ndm->ndm_ifindex);
+	if (dev == NULL) {
+		pr_info("bridge: RTM_NEWNEIGH with unknown ifindex\n");
+		return -ENODEV;
+	}
+
+	if (dev->priv_flags & IFF_BRIDGE_PORT)
+		err = br_fdb_add(nlh, dev);
+	else
+		err = -ENODEV;
+
+	return err;
+}
+
+static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
+{
+	struct net *net = sock_net(skb->sk);
+	struct ndmsg *ndm;
+	struct net_device *dev;
+	int err;
+
+	ASSERT_RTNL();
+	if (nlmsg_len(nlh) < sizeof(*ndm))
+		return -EINVAL;
+
+	ndm = nlmsg_data(nlh);
+	if (ndm->ndm_ifindex == 0) {
+		pr_info("bridge: RTM_DELNEIGH with invalid ifindex\n");
+		return -EINVAL;
+	}
+
+	dev = __dev_get_by_index(net, ndm->ndm_ifindex);
+	if (dev == NULL) {
+		pr_info("bridge: RTM_DELNEIGH with unknown ifindex\n");
+		return -ENODEV;
+	}
+
+	if (dev->priv_flags & IFF_BRIDGE_PORT)
+		err = br_fdb_del(nlh, dev);
+	else
+		err = -ENODEV;
+
+	return err;
+}
+
+static int rtnl_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	int idx = 0;
+	struct net *net = sock_net(skb->sk);
+	struct net_device *dev;
+
+	rcu_read_lock();
+	for_each_netdev_rcu(net, dev) {
+		if (dev->priv_flags & IFF_EBRIDGE)
+			idx = br_fdb_dump(skb, cb, dev, idx);
+	}
+	rcu_read_unlock();
+
+	cb->args[0] = idx;
+	return skb->len;
+}
+
 static struct rtnl_link_ops br_link_ops __read_mostly = {
 	.kind		= "bridge",
 	.priv_size	= sizeof(struct net_bridge),
@@ -235,18 +315,21 @@ int __init br_netlink_init(void)
 			      br_rtm_setlink, NULL, NULL);
 	if (err)
 		goto err3;
+
 	err = __rtnl_register(PF_BRIDGE, RTM_NEWNEIGH,
-			      br_fdb_add, NULL, NULL);
+			      rtnl_fdb_add, NULL, NULL);
 	if (err)
-		goto err3;
+		goto  err3;
+
 	err = __rtnl_register(PF_BRIDGE, RTM_DELNEIGH,
-			      br_fdb_delete, NULL, NULL);
+			      rtnl_fdb_del, NULL, NULL);
 	if (err)
-		goto err3;
+		goto  err3;
+
 	err = __rtnl_register(PF_BRIDGE, RTM_GETNEIGH,
-			      NULL, br_fdb_dump, NULL);
+			      NULL, rtnl_fdb_dump, NULL);
 	if (err)
-		goto err3;
+		goto  err3;
 
 	return 0;
 
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 0b67a63..d56518a 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -363,9 +363,12 @@ extern int br_fdb_insert(struct net_bridge *br,
 extern void br_fdb_update(struct net_bridge *br,
 			  struct net_bridge_port *source,
 			  const unsigned char *addr);
-extern int br_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb);
-extern int br_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg);
-extern int br_fdb_delete(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg);
+extern int br_fdb_dump(struct sk_buff *skb,
+		       struct netlink_callback *cb,
+		       struct net_device *dev,
+		       int idx);
+extern int br_fdb_add(struct nlmsghdr *nlh, struct net_device *dev);
+extern int br_fdb_del(struct nlmsghdr *nlh, struct net_device *dev);
 
 /* br_forward.c */
 extern void br_deliver(const struct net_bridge_port *to,

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH 2/3] net: expose ebridge FDB with priv flag IFF_OFFLOADED_FDB
  2012-02-29  7:17 [RFC PATCH 0/3] net: bridge: propagate FDB table into hardware John Fastabend
  2012-02-29  7:17 ` [RFC PATCH 1/3] net: refactor br_fdb_xxx rtnetlink routines John Fastabend
@ 2012-02-29  7:17 ` John Fastabend
  2012-03-02 19:56   ` Ben Hutchings
  2012-02-29  7:17 ` [RFC PATCH 3/3] net: drivers: set IFF_OFFLOADED_FDB priv flag on ixgbe and igb John Fastabend
  2 siblings, 1 reply; 7+ messages in thread
From: John Fastabend @ 2012-02-29  7:17 UTC (permalink / raw)
  To: jhs, shemminger, kernel
  Cc: hadi, bhutchings, roprabhu, mst, netdev, gregory.v.rose, davem

This adds a new private interface flag IFF_OFFLOADED_FDB and an
additional ndmsg flag NTF_EMBEDDED.

The private flag IFF_OFFLOADED_FDB should be set on devices to
indicate an embedded bridging component exists with a forwarding
database.

With this set PF_BRIDGE:{RTM_NEWNEIGH|RTM_DELNEIGH|RTM_GETNEIGH}
netlink msgs can manage the unicast address list on these devices
by setting the NTF_EMBEDDED flag in ndm_flags.

These commands are compatible with the SW bridge allowing the same
user space tools to be used with both SW bridges and HW bridges.

PF_BRIDGE:RTM_NEWNEIGH - adds an address to the unicast mac address
			 list. This fails if the address already
			 exists in the list. We do not allow user
			 space to bump the reference count on the
			 address list.
PF_BRIDGE:RTM_DELNEIGH - deletes an address in the unicast mac
			 address list if it exists.
PF_BRIDGE:RTM_GETNEIGH - dumps the unicast mac address list.

Examples session using the 'br'[1] tool to add, dump and then
delete a mac address with a new "embedded" option:

# ./br fdb add embedded 00:1b:21:55:23:ac dev eth6_rename
# ./br fdb
port			mac addr                flags
Embedded eth6_rename    00:1b:21:55:23:aa       local
Embedded eth6_rename    00:1b:21:55:23:ab       local
Embedded eth6_rename    00:1b:21:55:23:ac       local
	 veth2		76:99:5e:bf:e6:52       local
	 eth6_rename	00:1b:21:55:23:59       local
	 veth0		c6:09:6e:6c:7d:54       local

[1] 'br' tool was published as an RFC here and will be renamed 'bridge'
    http://patchwork.ozlabs.org/patch/117664/

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---

 include/linux/if.h        |    2 -
 include/linux/neighbour.h |    2 +
 net/bridge/br_netlink.c   |  144 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 146 insertions(+), 2 deletions(-)

diff --git a/include/linux/if.h b/include/linux/if.h
index f995c66..bd0efbc 100644
--- a/include/linux/if.h
+++ b/include/linux/if.h
@@ -81,7 +81,7 @@
 #define IFF_UNICAST_FLT	0x20000		/* Supports unicast filtering	*/
 #define IFF_TEAM_PORT	0x40000		/* device used as team port */
 #define IFF_SUPP_NOFCS	0x80000		/* device supports sending custom FCS */
-
+#define IFF_OFFLOADED_FDB 0x100000	/* devices used with HW FDB */
 
 #define IF_GET_IFACE	0x0001		/* for querying only */
 #define IF_GET_PROTO	0x0002
diff --git a/include/linux/neighbour.h b/include/linux/neighbour.h
index b188f68..9a478d3 100644
--- a/include/linux/neighbour.h
+++ b/include/linux/neighbour.h
@@ -33,6 +33,8 @@ enum {
 #define NTF_PROXY	0x08	/* == ATF_PUBL */
 #define NTF_ROUTER	0x80
 
+#define NTF_EMBEDDED	0x02	/* PF_BRIDGE embedded entry */
+
 /*
  *	Neighbor Cache Entry States.
  */
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 9e70191..7b1a581 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -211,6 +211,57 @@ static int br_validate(struct nlattr *tb[], struct nlattr *data[])
 	return 0;
 }
 
+static int rtnl_offloaded_fdb_add(struct nlmsghdr *nlh, struct net_device *dev)
+{
+	struct ndmsg *ndm;
+	struct netdev_hw_addr *ha;
+	struct nlattr *tb[NDA_MAX+1];
+	__u8 *addr;
+	int err;
+
+	ASSERT_RTNL();
+
+	if (!(dev->priv_flags & IFF_OFFLOADED_FDB))
+		return -ENODEV;
+
+	err = nlmsg_parse(nlh, sizeof(*ndm), tb, NDA_MAX, NULL);
+	if (err < 0)
+		return err;
+
+	ndm = nlmsg_data(nlh);
+	if (!tb[NDA_LLADDR] || nla_len(tb[NDA_LLADDR]) != ETH_ALEN) {
+		pr_info("fdb_add: RTM_NEWNEIGH with invalid address\n");
+		return -EINVAL;
+	}
+
+	addr = nla_data(tb[NDA_LLADDR]);
+	if (!is_valid_ether_addr(addr)) {
+		pr_info("fdb_add: RTM_NEWNEIGH with invalid ether address\n");
+		return -EINVAL;
+	}
+
+	if (ndm->ndm_state & NUD_PERMANENT) {
+		pr_info("fdb_add: RTM_NEWNEIGH with invalid state %#x\n",
+			ndm->ndm_state);
+		return -EINVAL;
+	}
+
+	if (is_multicast_ether_addr(addr))
+		return -EINVAL;
+
+	netif_addr_lock_bh(dev);
+	list_for_each_entry(ha, &dev->uc.list, list) {
+		if (!compare_ether_addr(ha->addr, addr)) {
+			netif_addr_unlock_bh(dev);
+			return -EEXIST;
+		}
+	}
+	netif_addr_unlock_bh(dev);
+
+	err = dev_uc_add(dev, addr);
+	return err;
+}
+
 static int rtnl_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 {
 	struct net *net = sock_net(skb->sk);
@@ -235,7 +286,9 @@ static int rtnl_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 		return -ENODEV;
 	}
 
-	if (dev->priv_flags & IFF_BRIDGE_PORT)
+	if (ndm->ndm_flags & NTF_EMBEDDED)
+		err = rtnl_offloaded_fdb_add(nlh, dev);
+	else if (dev->priv_flags & IFF_BRIDGE_PORT)
 		err = br_fdb_add(nlh, dev);
 	else
 		err = -ENODEV;
@@ -243,6 +296,43 @@ static int rtnl_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 	return err;
 }
 
+static int rtnl_offloaded_fdb_del(struct nlmsghdr *nlh, struct net_device *dev)
+{
+	struct ndmsg *ndm;
+	struct nlattr *tb[NDA_MAX+1];
+	__u8 *addr;
+	int err;
+
+	ASSERT_RTNL();
+	err = nlmsg_parse(nlh, sizeof(*ndm), tb, NDA_MAX, NULL);
+	if (err < 0)
+		return err;
+
+	ndm = nlmsg_data(nlh);
+	if (!tb[NDA_LLADDR] || nla_len(tb[NDA_LLADDR]) != ETH_ALEN) {
+		pr_info("fdb_add: RTM_NEWNEIGH with invalid address\n");
+		return -EINVAL;
+	}
+
+	addr = nla_data(tb[NDA_LLADDR]);
+	if (!is_valid_ether_addr(addr)) {
+		pr_info("fdb_add: RTM_NEWNEIGH with invalid ether address\n");
+		return -EINVAL;
+	}
+
+	if (ndm->ndm_state & NUD_PERMANENT) {
+		pr_info("fdb_add: RTM_NEWNEIGH with invalid state %#x\n",
+			ndm->ndm_state);
+		return -EINVAL;
+	}
+
+	if (is_multicast_ether_addr(addr))
+		return -EINVAL;
+
+	err = dev_uc_del(dev, addr);
+	return err;
+}
+
 static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 {
 	struct net *net = sock_net(skb->sk);
@@ -268,12 +358,61 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 
 	if (dev->priv_flags & IFF_BRIDGE_PORT)
 		err = br_fdb_del(nlh, dev);
+	else if (dev->priv_flags & IFF_OFFLOADED_FDB)
+		err = rtnl_offloaded_fdb_del(nlh, dev);
 	else
 		err = -ENODEV;
 
 	return err;
 }
 
+static int rtnl_offloaded_fdb_dump(struct sk_buff *skb,
+				   struct netlink_callback *cb,
+				   struct net_device *dev,
+				   int idx)
+{
+	struct netdev_hw_addr *ha;
+	struct nlmsghdr *nlh;
+	struct ndmsg *ndm;
+	u32 pid, seq;
+
+	pid = NETLINK_CB(cb->skb).pid;
+	seq = cb->nlh->nlmsg_seq;
+
+	netif_addr_lock_bh(dev);
+	list_for_each_entry(ha, &dev->uc.list, list) {
+		if (idx < cb->args[0])
+			goto skip;
+
+		nlh = nlmsg_put(skb, pid, seq,
+				RTM_NEWNEIGH, sizeof(*ndm), NLM_F_MULTI);
+		if (!nlh)
+			break;
+
+		ndm = nlmsg_data(nlh);
+		ndm->ndm_family  = AF_BRIDGE;
+		ndm->ndm_pad1	 = 0;
+		ndm->ndm_pad2    = 0;
+		ndm->ndm_flags	 = NTF_EMBEDDED;
+		ndm->ndm_type	 = 0;
+		ndm->ndm_ifindex = dev->ifindex;
+		ndm->ndm_state   = NUD_PERMANENT;
+
+		NLA_PUT(skb, NDA_LLADDR, ETH_ALEN, ha->addr);
+
+		nlmsg_end(skb, nlh);
+skip:
+		++idx;
+	}
+	netif_addr_unlock_bh(dev);
+
+	return idx;
+nla_put_failure:
+	netif_addr_unlock_bh(dev);
+	nlmsg_cancel(skb, nlh);
+	return idx;
+}
+
 static int rtnl_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb)
 {
 	int idx = 0;
@@ -284,6 +423,9 @@ static int rtnl_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb)
 	for_each_netdev_rcu(net, dev) {
 		if (dev->priv_flags & IFF_EBRIDGE)
 			idx = br_fdb_dump(skb, cb, dev, idx);
+
+		if (dev->priv_flags & IFF_OFFLOADED_FDB)
+			idx = rtnl_offloaded_fdb_dump(skb, cb, dev, idx);
 	}
 	rcu_read_unlock();
 

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC PATCH 3/3] net: drivers: set IFF_OFFLOADED_FDB priv flag on ixgbe and igb
  2012-02-29  7:17 [RFC PATCH 0/3] net: bridge: propagate FDB table into hardware John Fastabend
  2012-02-29  7:17 ` [RFC PATCH 1/3] net: refactor br_fdb_xxx rtnetlink routines John Fastabend
  2012-02-29  7:17 ` [RFC PATCH 2/3] net: expose ebridge FDB with priv flag IFF_OFFLOADED_FDB John Fastabend
@ 2012-02-29  7:17 ` John Fastabend
  2 siblings, 0 replies; 7+ messages in thread
From: John Fastabend @ 2012-02-29  7:17 UTC (permalink / raw)
  To: jhs, shemminger, kernel
  Cc: hadi, bhutchings, roprabhu, mst, netdev, gregory.v.rose, davem

This sets the IFF_OFFLOADED_FDB in the priv flags when SR-IOV is
enabled on ixgbe and igb devices.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---

 drivers/net/ethernet/intel/igb/igb_main.c     |    2 ++
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |    4 +++-
 2 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index fda8247..1b4b8de 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2272,6 +2272,8 @@ static void __devinit igb_probe_vfs(struct igb_adapter * adapter)
 	for (i = 0; i < adapter->vfs_allocated_count; i++)
 		igb_vf_configure(adapter, i);
 
+	adapter->netdev->priv_flags |= IFF_OFFLOADED_FDB;
+
 	/* DMA Coalescing is not supported in IOV mode. */
 	adapter->flags &= ~IGB_FLAG_DMAC;
 	goto out;
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 4e55860..ff5599b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -7753,9 +7753,11 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
 
 	netdev->priv_flags |= IFF_UNICAST_FLT;
 
-	if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED)
+	if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED) {
+		netdev->priv_flags |= IFF_OFFLOADED_FDB;
 		adapter->flags &= ~(IXGBE_FLAG_RSS_ENABLED |
 				    IXGBE_FLAG_DCB_ENABLED);
+	}
 
 #ifdef CONFIG_IXGBE_DCB
 	netdev->dcbnl_ops = &dcbnl_ops;

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 2/3] net: expose ebridge FDB with priv flag IFF_OFFLOADED_FDB
  2012-02-29  7:17 ` [RFC PATCH 2/3] net: expose ebridge FDB with priv flag IFF_OFFLOADED_FDB John Fastabend
@ 2012-03-02 19:56   ` Ben Hutchings
  2012-03-02 20:00     ` Ben Hutchings
  2012-03-06  3:34     ` John Fastabend
  0 siblings, 2 replies; 7+ messages in thread
From: Ben Hutchings @ 2012-03-02 19:56 UTC (permalink / raw)
  To: John Fastabend
  Cc: jhs, shemminger, kernel, hadi, roprabhu, mst, netdev,
	gregory.v.rose, davem

On Tue, 2012-02-28 at 23:17 -0800, John Fastabend wrote:
> This adds a new private interface flag IFF_OFFLOADED_FDB and an
> additional ndmsg flag NTF_EMBEDDED.
> 
> The private flag IFF_OFFLOADED_FDB should be set on devices to
> indicate an embedded bridging component exists with a forwarding
> database.
> 
> With this set PF_BRIDGE:{RTM_NEWNEIGH|RTM_DELNEIGH|RTM_GETNEIGH}
> netlink msgs can manage the unicast address list on these devices
> by setting the NTF_EMBEDDED flag in ndm_flags.
> 
> These commands are compatible with the SW bridge allowing the same
> user space tools to be used with both SW bridges and HW bridges.
[...]
> index 9e70191..7b1a581 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -211,6 +211,57 @@ static int br_validate(struct nlattr *tb[], struct nlattr *data[])
>  	return 0;
>  }
>  
> +static int rtnl_offloaded_fdb_add(struct nlmsghdr *nlh, struct net_device *dev)
> +{
> +	struct ndmsg *ndm;
> +	struct netdev_hw_addr *ha;
> +	struct nlattr *tb[NDA_MAX+1];
> +	__u8 *addr;
> +	int err;
> +
> +	ASSERT_RTNL();
> +
> +	if (!(dev->priv_flags & IFF_OFFLOADED_FDB))
> +		return -ENODEV;

EOPNOTSUPP

> +	err = nlmsg_parse(nlh, sizeof(*ndm), tb, NDA_MAX, NULL);
> +	if (err < 0)
> +		return err;
> +
> +	ndm = nlmsg_data(nlh);
> +	if (!tb[NDA_LLADDR] || nla_len(tb[NDA_LLADDR]) != ETH_ALEN) {
> +		pr_info("fdb_add: RTM_NEWNEIGH with invalid address\n");
> +		return -EINVAL;
> +	}
> +
> +	addr = nla_data(tb[NDA_LLADDR]);
> +	if (!is_valid_ether_addr(addr)) {
> +		pr_info("fdb_add: RTM_NEWNEIGH with invalid ether address\n");
> +		return -EINVAL;
> +	}
> +
> +	if (ndm->ndm_state & NUD_PERMANENT) {
> +		pr_info("fdb_add: RTM_NEWNEIGH with invalid state %#x\n",
> +			ndm->ndm_state);
> +		return -EINVAL;
> +	}
> +
> +	if (is_multicast_ether_addr(addr))
> +		return -EINVAL;

There is a pending change to the ndo_validate_addr interface which I
think would allow us to replace the Ethernet-specific checks with a
check against dev->addr_len and a call to ndo_validate_addr.  Not that I
really care about anything other than Ethernet, but it would be a little
more elegant.

> +	netif_addr_lock_bh(dev);
> +	list_for_each_entry(ha, &dev->uc.list, list) {
> +		if (!compare_ether_addr(ha->addr, addr)) {
> +			netif_addr_unlock_bh(dev);
> +			return -EEXIST;
> +		}
> +	}
> +	netif_addr_unlock_bh(dev);
> +
> +	err = dev_uc_add(dev, addr);
[...]

The software bridge makes the duplicate check conditional on NLM_F_EXCL.
Shouldn't this be consistent?

Also, this seems racy.  If all manipulation of the UC address list is
serialised by RTNL (which I think in practice it is) then there is no
race and we also don't need to take the netif_addr_lock(), but if we're
not allowed to assume that then we shouldn't drop the lock.  Perhaps the
check can be moved into a dev_uc_add_exclusive() or something like that.

General comment: why should the software bridge code be loaded in order
to control a hardware bridge?  (It's not even the only software bridge
we have!)  It would perhaps be neater to have separate modules for the
software bridge, the rtnetlink bridge ops, and the glue functions for
hardware bridges.  But that could be done later.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 2/3] net: expose ebridge FDB with priv flag IFF_OFFLOADED_FDB
  2012-03-02 19:56   ` Ben Hutchings
@ 2012-03-02 20:00     ` Ben Hutchings
  2012-03-06  3:34     ` John Fastabend
  1 sibling, 0 replies; 7+ messages in thread
From: Ben Hutchings @ 2012-03-02 20:00 UTC (permalink / raw)
  To: John Fastabend
  Cc: jhs, shemminger, kernel, hadi, roprabhu, mst, netdev,
	gregory.v.rose, davem

On Fri, 2012-03-02 at 19:56 +0000, Ben Hutchings wrote:
[...]
> General comment: why should the software bridge code be loaded in order
> to control a hardware bridge?  (It's not even the only software bridge
> we have!)  It would perhaps be neater to have separate modules for the
> software bridge, the rtnetlink bridge ops, and the glue functions for
> hardware bridges.  But that could be done later.

I see from the other thread that you're already well aware for this.
So, no need to answer it.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 2/3] net: expose ebridge FDB with priv flag IFF_OFFLOADED_FDB
  2012-03-02 19:56   ` Ben Hutchings
  2012-03-02 20:00     ` Ben Hutchings
@ 2012-03-06  3:34     ` John Fastabend
  1 sibling, 0 replies; 7+ messages in thread
From: John Fastabend @ 2012-03-06  3:34 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: jhs, shemminger, kernel, hadi, roprabhu, mst, netdev,
	gregory.v.rose, davem

On 3/2/2012 11:56 AM, Ben Hutchings wrote:
> On Tue, 2012-02-28 at 23:17 -0800, John Fastabend wrote:
>> This adds a new private interface flag IFF_OFFLOADED_FDB and an
>> additional ndmsg flag NTF_EMBEDDED.
>>
>> The private flag IFF_OFFLOADED_FDB should be set on devices to
>> indicate an embedded bridging component exists with a forwarding
>> database.
>>
>> With this set PF_BRIDGE:{RTM_NEWNEIGH|RTM_DELNEIGH|RTM_GETNEIGH}
>> netlink msgs can manage the unicast address list on these devices
>> by setting the NTF_EMBEDDED flag in ndm_flags.
>>
>> These commands are compatible with the SW bridge allowing the same
>> user space tools to be used with both SW bridges and HW bridges.
> [...]
>> index 9e70191..7b1a581 100644
>> --- a/net/bridge/br_netlink.c
>> +++ b/net/bridge/br_netlink.c
>> @@ -211,6 +211,57 @@ static int br_validate(struct nlattr *tb[], struct nlattr *data[])
>>  	return 0;
>>  }
>>  
>> +static int rtnl_offloaded_fdb_add(struct nlmsghdr *nlh, struct net_device *dev)
>> +{
>> +	struct ndmsg *ndm;
>> +	struct netdev_hw_addr *ha;
>> +	struct nlattr *tb[NDA_MAX+1];
>> +	__u8 *addr;
>> +	int err;
>> +
>> +	ASSERT_RTNL();
>> +
>> +	if (!(dev->priv_flags & IFF_OFFLOADED_FDB))
>> +		return -ENODEV;
> 
> EOPNOTSUPP
> 

Right thanks.

[...]

>> +	if (is_multicast_ether_addr(addr))
>> +		return -EINVAL;
> 
> There is a pending change to the ndo_validate_addr interface which I
> think would allow us to replace the Ethernet-specific checks with a
> check against dev->addr_len and a call to ndo_validate_addr.  Not that I
> really care about anything other than Ethernet, but it would be a little
> more elegant.
> 
>> +	netif_addr_lock_bh(dev);
>> +	list_for_each_entry(ha, &dev->uc.list, list) {
>> +		if (!compare_ether_addr(ha->addr, addr)) {
>> +			netif_addr_unlock_bh(dev);
>> +			return -EEXIST;
>> +		}
>> +	}
>> +	netif_addr_unlock_bh(dev);
>> +
>> +	err = dev_uc_add(dev, addr);
> [...]
> 
> The software bridge makes the duplicate check conditional on NLM_F_EXCL.
> Shouldn't this be consistent?

Agreed it should be consistent.

> 
> Also, this seems racy.  If all manipulation of the UC address list is
> serialised by RTNL (which I think in practice it is) then there is no
> race and we also don't need to take the netif_addr_lock(), but if we're
> not allowed to assume that then we shouldn't drop the lock.  Perhaps the
> check can be moved into a dev_uc_add_exclusive() or something like that.
> 

Yep I'll at least make this consistent with other usages. And maybe a
follow up patch to clean up the usage throughout.

Thanks,
John

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2012-03-06  3:34 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-29  7:17 [RFC PATCH 0/3] net: bridge: propagate FDB table into hardware John Fastabend
2012-02-29  7:17 ` [RFC PATCH 1/3] net: refactor br_fdb_xxx rtnetlink routines John Fastabend
2012-02-29  7:17 ` [RFC PATCH 2/3] net: expose ebridge FDB with priv flag IFF_OFFLOADED_FDB John Fastabend
2012-03-02 19:56   ` Ben Hutchings
2012-03-02 20:00     ` Ben Hutchings
2012-03-06  3:34     ` John Fastabend
2012-02-29  7:17 ` [RFC PATCH 3/3] net: drivers: set IFF_OFFLOADED_FDB priv flag on ixgbe and igb John Fastabend

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).