* [RFC net-next 0/6] offload linux bonding tc ingress rules
@ 2018-03-05 13:28 John Hurley
  2018-03-05 13:28 ` [RFC net-next 1/6] drivers: net: bonding: add tc offload infrastructure to bond John Hurley
                   ` (8 more replies)
  0 siblings, 9 replies; 16+ messages in thread
From: John Hurley @ 2018-03-05 13:28 UTC (permalink / raw)
  To: netdev; +Cc: jiri, ogerlitz, jakub.kicinski, simon.horman, John Hurley

Hi,

This RFC patchset adds support for offloading tc ingress rules applied to
linux bonds. The premise of these patches is that if a rule is applied to
a bond port then the rule should be applied to each slave of the bond.

The linux bond itself registers a cb for offloading tc rules. Potential
slave netdevs on offload devices can then register with the bond for a
further callback - this code is basically the same as registering for an
egress dev offload in TC. Then when a rule is offloaded to the bond, it
can be relayed to each netdev that has registered with the bond code and
which is a slave of the given bond.
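
As a purely illustrative sketch (function and callback names are
hypothetical, not part of the patches): an offload driver for a potential
slave netdev would hook into this relay roughly as follows, using the
helpers added in patch 2 - patch 5 does the equivalent for NFP reprs.

  /* driver init path for a netdev that may later be enslaved;
   * my_block_cb is the driver's usual tc_setup_cb_t block callback and
   * my_priv its private data - both are placeholders.
   */
  err = tc_setup_cb_bond_register(netdev, my_block_cb, my_priv);
  if (err)
          return err;

  /* ... and on netdev teardown ... */
  tc_setup_cb_bond_unregister(netdev, my_block_cb, my_priv);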

To prevent sync issues between the kernel and offload device, the linux
bond driver is effectively locked when it has offloaded rules, i.e. no new
ports can be enslaved and no slaves can be released until the offload
rules are removed. Similarly, if a port on a bond is deleted, the bond is
destroyed, forcing a flush of all offloaded rules.

Also included in the RFC are changes to the NFP driver to utilise the new
code by registering NFP port representors for bond offload rules and
modifying cookie handling to allow the relaying of a rule to multiple
ports.
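
As a usage sketch (interface names are hypothetical): with a bond over
offload-capable slaves, the series would be exercised with something like
"ethtool -K bond0 hw-tc-offload on" (and likewise on the slaves),
"tc qdisc add dev bond0 ingress", and then a rule such as
"tc filter add dev bond0 ingress protocol ip flower skip_sw dst_ip
10.0.0.1 action drop"; the bond then relays the flower offload request to
every registered slave.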

Thanks,
John

John Hurley (6):
  drivers: net: bonding: add tc offload infrastructure to bond
  driver: net: bonding: allow registration of tc offload callbacks in
    bond
  drivers: net: bonding: restrict bond mods when rules are offloaded
  nfp: add ndo_set_mac_address for representors
  nfp: register repr ports for bond offloads
  nfp: support offloading multiple rules with same cookie

 drivers/net/bonding/bond_main.c                    | 277 ++++++++++++++++++++-
 drivers/net/ethernet/netronome/nfp/flower/main.c   |  24 +-
 drivers/net/ethernet/netronome/nfp/flower/main.h   |  10 +-
 .../net/ethernet/netronome/nfp/flower/metadata.c   |  20 +-
 .../net/ethernet/netronome/nfp/flower/offload.c    |  33 ++-
 drivers/net/ethernet/netronome/nfp/nfp_net_repr.c  |   1 +
 include/net/bonding.h                              |   9 +
 7 files changed, 351 insertions(+), 23 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC net-next 1/6] drivers: net: bonding: add tc offload infrastructure to bond
  2018-03-05 13:28 [RFC net-next 0/6] offload linux bonding tc ingress rules John Hurley
@ 2018-03-05 13:28 ` John Hurley
  2018-03-05 13:28 ` [RFC net-next 2/6] driver: net: bonding: allow registration of tc offload callbacks in bond John Hurley
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: John Hurley @ 2018-03-05 13:28 UTC (permalink / raw)
  To: netdev; +Cc: jiri, ogerlitz, jakub.kicinski, simon.horman, John Hurley

Register an ndo and callback for linux bonds to offload TC block ingress
rules. Enable hw-tc-offload to be set by the user (defaults to off). When
on, the flag cannot be turned off if rules are offloaded.
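
For example (device name hypothetical), the flag would be toggled with
"ethtool -K bond0 hw-tc-offload on"; once rules are offloaded, turning it
back off is rejected with -EBUSY by bond_set_features() below.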

Signed-off-by: John Hurley <john.hurley@netronome.com>
---
 drivers/net/bonding/bond_main.c | 64 +++++++++++++++++++++++++++++++++++++++++
 include/net/bonding.h           |  2 ++
 2 files changed, 66 insertions(+)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 4c19d23..e6415f6 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -325,6 +325,55 @@ static int bond_vlan_rx_kill_vid(struct net_device *bond_dev,
 	return 0;
 }
 
+/*------------------------------ TC HW Offload ------------------------------*/
+
+static inline unsigned int bond_get_offload_cnt(struct bonding *bond)
+{
+	if (!bond->tc_block)
+		return 0;
+
+	return bond->tc_block->offloadcnt;
+}
+
+static int bond_tc_relay_cb(enum tc_setup_type type, void *type_data,
+			    void *cb_priv)
+{
+	return 0;
+}
+
+static int bond_setup_tc_block(struct net_device *netdev,
+			       struct tc_block_offload *f)
+{
+	struct bonding *bond = netdev_priv(netdev);
+
+	if (f->binder_type != TCF_BLOCK_BINDER_TYPE_CLSACT_INGRESS)
+		return -EOPNOTSUPP;
+
+	switch (f->command) {
+	case TC_BLOCK_BIND:
+		bond->tc_block = f->block;
+		return tcf_block_cb_register(f->block, bond_tc_relay_cb,
+					     netdev, netdev);
+	case TC_BLOCK_UNBIND:
+		bond->tc_block = NULL;
+		tcf_block_cb_unregister(f->block, bond_tc_relay_cb, netdev);
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int bond_setup_tc(struct net_device *netdev, enum tc_setup_type type,
+			 void *type_data)
+{
+	switch (type) {
+	case TC_SETUP_BLOCK:
+		return bond_setup_tc_block(netdev, type_data);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
 /*------------------------------- Link status -------------------------------*/
 
 /* Set the carrier state for the master according to the state of its
@@ -1065,6 +1114,17 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
 	return features;
 }
 
+int bond_set_features(struct net_device *netdev, netdev_features_t features)
+{
+	if ((features & NETIF_F_HW_TC) <
+	    !!bond_get_offload_cnt(netdev_priv(netdev))) {
+		netdev_err(netdev, "Cannot disable TC offload while active\n");
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
 #define BOND_VLAN_FEATURES	(NETIF_F_HW_CSUM | NETIF_F_SG | \
 				 NETIF_F_FRAGLIST | NETIF_F_ALL_TSO | \
 				 NETIF_F_HIGHDMA | NETIF_F_LRO)
@@ -4198,7 +4258,9 @@ static const struct net_device_ops bond_netdev_ops = {
 	.ndo_add_slave		= bond_enslave,
 	.ndo_del_slave		= bond_release,
 	.ndo_fix_features	= bond_fix_features,
+	.ndo_set_features	= bond_set_features,
 	.ndo_features_check	= passthru_features_check,
+	.ndo_setup_tc		= bond_setup_tc,
 };
 
 static const struct device_type bond_type = {
@@ -4259,6 +4321,8 @@ void bond_setup(struct net_device *bond_dev)
 
 	bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL;
 	bond_dev->features |= bond_dev->hw_features;
+	/* Advertise hw tc offload, but leave it disabled by default. */
+	bond_dev->hw_features |= NETIF_F_HW_TC;
 }
 
 /* Destroy a bonding device.
diff --git a/include/net/bonding.h b/include/net/bonding.h
index f801fc9..424b9ea 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -29,6 +29,7 @@
 #include <net/bond_3ad.h>
 #include <net/bond_alb.h>
 #include <net/bond_options.h>
+#include <net/pkt_cls.h>
 
 #define BOND_MAX_ARP_TARGETS	16
 
@@ -199,6 +200,7 @@ struct bonding {
 	struct   bond_up_slave __rcu *slave_arr; /* Array of usable slaves */
 	bool     force_primary;
 	s32      slave_cnt; /* never change this value outside the attach/detach wrappers */
+	struct   tcf_block *tc_block;
 	int     (*recv_probe)(const struct sk_buff *, struct bonding *,
 			      struct slave *);
 	/* mode_lock is used for mode-specific locking needs, currently used by:
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC net-next 2/6] driver: net: bonding: allow registration of tc offload callbacks in bond
  2018-03-05 13:28 [RFC net-next 0/6] offload linux bonding tc ingress rules John Hurley
  2018-03-05 13:28 ` [RFC net-next 1/6] drivers: net: bonding: add tc offload infrastructure to bond John Hurley
@ 2018-03-05 13:28 ` John Hurley
  2018-03-07 10:57   ` Jiri Pirko
  2018-03-05 13:28 ` [RFC net-next 3/6] drivers: net: bonding: restrict bond mods when rules are offloaded John Hurley
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: John Hurley @ 2018-03-05 13:28 UTC (permalink / raw)
  To: netdev; +Cc: jiri, ogerlitz, jakub.kicinski, simon.horman, John Hurley

Allow drivers to register netdev callbacks for tc offload in linux bonds.
If a netdev has registered and is a slave of a given bond, then any tc
rules offloaded to the bond will be relayed to it if both the bond and the
slave permit hw offload.

Because the bond itself is not offloaded, just the rules, we don't care
about whether the bond ports are on the same device or whether some of
slaves are representor ports and some are not.

Signed-off-by: John Hurley <john.hurley@netronome.com>
---
 drivers/net/bonding/bond_main.c | 195 +++++++++++++++++++++++++++++++++++++++-
 include/net/bonding.h           |   7 ++
 2 files changed, 201 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index e6415f6..d9e41cf 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -335,9 +335,201 @@ static inline unsigned int bond_get_offload_cnt(struct bonding *bond)
 	return bond->tc_block->offloadcnt;
 }
 
+struct tcf_bond_cb {
+	struct list_head list;
+	tc_setup_cb_t *cb;
+	void *cb_priv;
+};
+
+struct tcf_bond_off {
+	struct rhash_head ht_node;
+	const struct net_device *netdev;
+	unsigned int refcnt;
+	struct list_head cb_list;
+};
+
+static const struct rhashtable_params tcf_bond_ht_params = {
+	.key_offset = offsetof(struct tcf_bond_off, netdev),
+	.head_offset = offsetof(struct tcf_bond_off, ht_node),
+	.key_len = sizeof(const struct net_device *),
+};
+
+static struct tcf_bond_off *tcf_bond_off_lookup(const struct net_device *dev)
+{
+	struct bond_net *bn = net_generic(dev_net(dev), bond_net_id);
+
+	return rhashtable_lookup_fast(&bn->bond_offload_ht, &dev,
+				      tcf_bond_ht_params);
+}
+
+static struct tcf_bond_cb *tcf_bond_off_cb_lookup(struct tcf_bond_off *off,
+						  tc_setup_cb_t *cb,
+						  void *cb_priv)
+{
+	struct tcf_bond_cb *bond_cb;
+
+	list_for_each_entry(bond_cb, &off->cb_list, list)
+		if (bond_cb->cb == cb && bond_cb->cb_priv == cb_priv)
+			return bond_cb;
+	return NULL;
+}
+
+static struct tcf_bond_off *tcf_bond_off_get(const struct net_device *dev,
+					     tc_setup_cb_t *cb,
+					     void *cb_priv)
+{
+	struct tcf_bond_off *bond_off;
+	struct bond_net *bn;
+
+	bond_off = tcf_bond_off_lookup(dev);
+	if (bond_off)
+		goto inc_ref;
+
+	bond_off = kzalloc(sizeof(*bond_off), GFP_KERNEL);
+	if (!bond_off)
+		return NULL;
+	INIT_LIST_HEAD(&bond_off->cb_list);
+	bond_off->netdev = dev;
+	bn = net_generic(dev_net(dev), bond_net_id);
+	rhashtable_insert_fast(&bn->bond_offload_ht, &bond_off->ht_node,
+			       tcf_bond_ht_params);
+
+inc_ref:
+	bond_off->refcnt++;
+	return bond_off;
+}
+
+static void tcf_bond_off_put(struct tcf_bond_off *bond_off)
+{
+	struct bond_net *bn;
+
+	if (--bond_off->refcnt)
+		return;
+	bn = net_generic(dev_net(bond_off->netdev), bond_net_id);
+	rhashtable_remove_fast(&bn->bond_offload_ht, &bond_off->ht_node,
+			       tcf_bond_ht_params);
+	kfree(bond_off);
+}
+
+static int tcf_bond_off_cb_add(struct tcf_bond_off *bond_off,
+			       tc_setup_cb_t *cb, void *cb_priv)
+{
+	struct tcf_bond_cb *bond_cb;
+
+	bond_cb = tcf_bond_off_cb_lookup(bond_off, cb, cb_priv);
+	if (WARN_ON(bond_cb))
+		return -EEXIST;
+	bond_cb = kzalloc(sizeof(*bond_cb), GFP_KERNEL);
+	if (!bond_cb)
+		return -ENOMEM;
+	bond_cb->cb = cb;
+	bond_cb->cb_priv = cb_priv;
+	list_add(&bond_cb->list, &bond_off->cb_list);
+	return 0;
+}
+
+static void tcf_bond_off_cb_del(struct tcf_bond_off *bond_off,
+				tc_setup_cb_t *cb, void *cb_priv)
+{
+	struct tcf_bond_cb *bond_cb;
+
+	bond_cb = tcf_bond_off_cb_lookup(bond_off, cb, cb_priv);
+	if (WARN_ON(!bond_cb))
+		return;
+	list_del(&bond_cb->list);
+	kfree(bond_cb);
+}
+
+static int __tc_setup_cb_bond_register(const struct net_device *dev,
+				       tc_setup_cb_t *cb, void *cb_priv)
+{
+	struct tcf_bond_off *bond_off = tcf_bond_off_get(dev, cb, cb_priv);
+	int err;
+
+	if (!bond_off)
+		return -ENOMEM;
+	err = tcf_bond_off_cb_add(bond_off, cb, cb_priv);
+	if (err)
+		goto err_cb_add;
+	return 0;
+
+err_cb_add:
+	tcf_bond_off_put(bond_off);
+	return err;
+}
+
+int tc_setup_cb_bond_register(const struct net_device *dev, tc_setup_cb_t *cb,
+			      void *cb_priv)
+{
+	int err;
+
+	rtnl_lock();
+	err = __tc_setup_cb_bond_register(dev, cb, cb_priv);
+	rtnl_unlock();
+	return err;
+}
+EXPORT_SYMBOL_GPL(tc_setup_cb_bond_register);
+
+static void __tc_setup_cb_bond_unregister(const struct net_device *dev,
+					  tc_setup_cb_t *cb, void *cb_priv)
+{
+	struct tcf_bond_off *bond_off = tcf_bond_off_lookup(dev);
+
+	if (WARN_ON(!bond_off))
+		return;
+	tcf_bond_off_cb_del(bond_off, cb, cb_priv);
+	tcf_bond_off_put(bond_off);
+}
+
+void tc_setup_cb_bond_unregister(const struct net_device *dev,
+				 tc_setup_cb_t *cb, void *cb_priv)
+{
+	rtnl_lock();
+	__tc_setup_cb_bond_unregister(dev, cb, cb_priv);
+	rtnl_unlock();
+}
+EXPORT_SYMBOL_GPL(tc_setup_cb_bond_unregister);
+
 static int bond_tc_relay_cb(enum tc_setup_type type, void *type_data,
 			    void *cb_priv)
 {
+	struct net_device *bond_dev = cb_priv;
+	struct tcf_bond_off *bond_off;
+	struct tcf_bond_cb *bond_cb;
+	struct list_head *iter;
+	struct bonding *bond;
+	struct slave *slave;
+	int err;
+
+	bond = netdev_priv(bond_dev);
+
+	if (!tc_can_offload(bond_dev))
+		return -EOPNOTSUPP;
+
+	bond_for_each_slave(bond, slave, iter) {
+		if (!tc_can_offload(slave->dev))
+			continue;
+
+		bond_off = tcf_bond_off_lookup(slave->dev);
+		if (!bond_off)
+			continue;
+
+		list_for_each_entry(bond_cb, &bond_off->cb_list, list) {
+			err = bond_cb->cb(type, type_data, bond_cb->cb_priv);
+			/* Possible here that some of the relayed callbacks are
+			 * accepted before the error meaning a rule add may be
+			 * offloaded to some ports and not others.
+			 *
+			 * If skip_sw is set then the classifier will generate
+			 * a destroy message undoing the adds. If not set then
+			 * some of the relays exist in hw and some software
+			 * only.
+			 */
+			if (err)
+				return err;
+		}
+	}
+
 	return 0;
 }
 
@@ -4829,7 +5021,7 @@ static int __net_init bond_net_init(struct net *net)
 	bond_create_proc_dir(bn);
 	bond_create_sysfs(bn);
 
-	return 0;
+	return rhashtable_init(&bn->bond_offload_ht, &tcf_bond_ht_params);
 }
 
 static void __net_exit bond_net_exit(struct net *net)
@@ -4848,6 +5040,7 @@ static void __net_exit bond_net_exit(struct net *net)
 	rtnl_unlock();
 
 	bond_destroy_proc_dir(bn);
+	rhashtable_destroy(&bn->bond_offload_ht);
 }
 
 static struct pernet_operations bond_net_ops = {
diff --git a/include/net/bonding.h b/include/net/bonding.h
index 424b9ea..056f5fc 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -30,6 +30,7 @@
 #include <net/bond_alb.h>
 #include <net/bond_options.h>
 #include <net/pkt_cls.h>
+#include <net/act_api.h>
 
 #define BOND_MAX_ARP_TARGETS	16
 
@@ -584,6 +585,7 @@ struct bond_net {
 	struct proc_dir_entry	*proc_dir;
 #endif
 	struct class_attribute	class_attr_bonding_masters;
+	struct rhashtable	bond_offload_ht;
 };
 
 int bond_arp_rcv(const struct sk_buff *skb, struct bonding *bond, struct slave *slave);
@@ -620,6 +622,11 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave);
 void bond_slave_arr_work_rearm(struct bonding *bond, unsigned long delay);
 void bond_work_init_all(struct bonding *bond);
 
+int tc_setup_cb_bond_register(const struct net_device *dev, tc_setup_cb_t *cb,
+			      void *cb_priv);
+void tc_setup_cb_bond_unregister(const struct net_device *dev,
+				 tc_setup_cb_t *cb, void *cb_priv);
+
 #ifdef CONFIG_PROC_FS
 void bond_create_proc_entry(struct bonding *bond);
 void bond_remove_proc_entry(struct bonding *bond);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC net-next 3/6] drivers: net: bonding: restrict bond mods when rules are offloaded
  2018-03-05 13:28 [RFC net-next 0/6] offload linux bonding tc ingress rules John Hurley
  2018-03-05 13:28 ` [RFC net-next 1/6] drivers: net: bonding: add tc offload infrastructure to bond John Hurley
  2018-03-05 13:28 ` [RFC net-next 2/6] driver: net: bonding: allow registration of tc offload callbacks in bond John Hurley
@ 2018-03-05 13:28 ` John Hurley
  2018-03-05 13:28 ` [RFC net-next 4/6] nfp: add ndo_set_mac_address for representors John Hurley
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: John Hurley @ 2018-03-05 13:28 UTC (permalink / raw)
  To: netdev; +Cc: jiri, ogerlitz, jakub.kicinski, simon.horman, John Hurley

When a bond has tc rules offloaded to its slaves, prevent new slaves being
added. To remove a slave from a bond, the offloaded rules must first be
deleted. For the case where a slave port on a bond is unregistered from
the kernel, flush all offloaded rules and destroy the bond.

Signed-off-by: John Hurley <john.hurley@netronome.com>
---
 drivers/net/bonding/bond_main.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index d9e41cf..4c146b1 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1607,6 +1607,15 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
 		return -EPERM;
 	}
 
+	/* check for TC offloaded */
+	if (bond_get_offload_cnt(bond)) {
+		NL_SET_ERR_MSG(extack,
+			       "Cannot enslave - bond has offloaded rules.");
+		netdev_err(bond_dev,
+			   "cannot enslave - bond has offloaded rules.\n");
+		return -EPERM;
+	}
+
 	/* vlan challenged mutual exclusion */
 	/* no need to lock since we're protected by rtnl_lock */
 	if (slave_dev->features & NETIF_F_VLAN_CHALLENGED) {
@@ -2237,6 +2246,13 @@ static int __bond_release_one(struct net_device *bond_dev,
 /* A wrapper used because of ndo_del_link */
 int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
 {
+	if (bond_get_offload_cnt(netdev_priv(bond_dev))) {
+		netdev_err(bond_dev,
+			   "cannot release %s - has offloaded rules.\n",
+			   slave_dev->name);
+		return -EPERM;
+	}
+
 	return __bond_release_one(bond_dev, slave_dev, false, false);
 }
 
@@ -3325,6 +3341,8 @@ static int bond_slave_netdev_event(unsigned long event,
 	case NETDEV_UNREGISTER:
 		if (bond_dev->type != ARPHRD_ETHER)
 			bond_release_and_destroy(bond_dev, slave_dev);
+		else if (bond_get_offload_cnt(bond))
+			unregister_netdevice_queue(bond_dev, NULL);
 		else
 			__bond_release_one(bond_dev, slave_dev, false, true);
 		break;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC net-next 4/6] nfp: add ndo_set_mac_address for representors
  2018-03-05 13:28 [RFC net-next 0/6] offload linux bonding tc ingress rules John Hurley
                   ` (2 preceding siblings ...)
  2018-03-05 13:28 ` [RFC net-next 3/6] drivers: net: bonding: restrict bond mods when rules are offloaded John Hurley
@ 2018-03-05 13:28 ` John Hurley
  2018-03-05 21:39   ` Or Gerlitz
  2018-03-05 13:28 ` [RFC net-next 5/6] nfp: register repr ports for bond offloads John Hurley
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: John Hurley @ 2018-03-05 13:28 UTC (permalink / raw)
  To: netdev; +Cc: jiri, ogerlitz, jakub.kicinski, simon.horman, John Hurley

A representor hardware address does not have any meaning outside of the
kernel netdev/networking stack. Thus there is no need for any app specific
code for setting a representor's hardware address; the default eth_mac_addr
is sufficient.
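
(Context from the exchange with Or Gerlitz further down the thread: the
immediate use-case is enslaving a repr into a linux bond, which requires
the slave to support setting its MAC address, since the bonding driver
calls dev_set_mac_address() on the slave when it is enslaved.)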

Signed-off-by: John Hurley <john.hurley@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_net_repr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c b/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c
index 6195705..329e37d 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_repr.c
@@ -266,6 +266,7 @@ const struct net_device_ops nfp_repr_netdev_ops = {
 	.ndo_get_vf_config	= nfp_app_get_vf_config,
 	.ndo_set_vf_link_state	= nfp_app_set_vf_link_state,
 	.ndo_set_features	= nfp_port_set_features,
+	.ndo_set_mac_address    = eth_mac_addr,
 };
 
 static void nfp_repr_clean(struct nfp_repr *repr)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC net-next 5/6] nfp: register repr ports for bond offloads
  2018-03-05 13:28 [RFC net-next 0/6] offload linux bonding tc ingress rules John Hurley
                   ` (3 preceding siblings ...)
  2018-03-05 13:28 ` [RFC net-next 4/6] nfp: add ndo_set_mac_address for representors John Hurley
@ 2018-03-05 13:28 ` John Hurley
  2018-03-05 13:28 ` [RFC net-next 6/6] nfp: support offloading multiple rules with same cookie John Hurley
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: John Hurley @ 2018-03-05 13:28 UTC (permalink / raw)
  To: netdev; +Cc: jiri, ogerlitz, jakub.kicinski, simon.horman, John Hurley

On initialisation, register nfp repr ports to receive callbacks when tc
rules are offloaded to any bond they may be attached to. The callback
function is the same as that used for non-bonded port rule offload.

Signed-off-by: John Hurley <john.hurley@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/flower/main.c   | 24 +++++++++++++++++++---
 drivers/net/ethernet/netronome/nfp/flower/main.h   |  2 ++
 .../net/ethernet/netronome/nfp/flower/offload.c    |  4 ++--
 3 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.c b/drivers/net/ethernet/netronome/nfp/flower/main.c
index 742d6f1..0ca15f6 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.c
@@ -38,6 +38,7 @@
 #include <linux/vmalloc.h>
 #include <net/devlink.h>
 #include <net/dst_metadata.h>
+#include <net/bonding.h>
 
 #include "main.h"
 #include "../nfpcore/nfp_cpp.h"
@@ -177,14 +178,31 @@ nfp_flower_repr_netdev_stop(struct nfp_app *app, struct nfp_repr *repr)
 static int
 nfp_flower_repr_netdev_init(struct nfp_app *app, struct net_device *netdev)
 {
-	return tc_setup_cb_egdev_register(netdev,
-					  nfp_flower_setup_tc_egress_cb,
-					  netdev_priv(netdev));
+	int err;
+
+	err = tc_setup_cb_bond_register(netdev, nfp_flower_setup_tc_block_cb,
+					netdev_priv(netdev));
+	if (err)
+		return err;
+
+	err = tc_setup_cb_egdev_register(netdev, nfp_flower_setup_tc_egress_cb,
+					 netdev_priv(netdev));
+	if (err)
+		goto err_egdev;
+
+	return err;
+err_egdev:
+	tc_setup_cb_bond_unregister(netdev, nfp_flower_setup_tc_block_cb,
+				    netdev_priv(netdev));
+	return err;
 }
 
 static void
 nfp_flower_repr_netdev_clean(struct nfp_app *app, struct net_device *netdev)
 {
+	tc_setup_cb_bond_unregister(netdev, nfp_flower_setup_tc_block_cb,
+				    netdev_priv(netdev));
+
 	tc_setup_cb_egdev_unregister(netdev, nfp_flower_setup_tc_egress_cb,
 				     netdev_priv(netdev));
 }
diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.h b/drivers/net/ethernet/netronome/nfp/flower/main.h
index c5cebf6..5fd7c1f 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.h
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.h
@@ -213,5 +213,7 @@ void nfp_tunnel_request_route(struct nfp_app *app, struct sk_buff *skb);
 void nfp_tunnel_keep_alive(struct nfp_app *app, struct sk_buff *skb);
 int nfp_flower_setup_tc_egress_cb(enum tc_setup_type type, void *type_data,
 				  void *cb_priv);
+int nfp_flower_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
+				 void *cb_priv);
 
 #endif
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index f3586c5..eb8abeb 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -553,8 +553,8 @@ int nfp_flower_setup_tc_egress_cb(enum tc_setup_type type, void *type_data,
 	}
 }
 
-static int nfp_flower_setup_tc_block_cb(enum tc_setup_type type,
-					void *type_data, void *cb_priv)
+int nfp_flower_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
+				 void *cb_priv)
 {
 	struct nfp_repr *repr = cb_priv;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC net-next 6/6] nfp: support offloading multiple rules with same cookie
  2018-03-05 13:28 [RFC net-next 0/6] offload linux bonding tc ingress rules John Hurley
                   ` (4 preceding siblings ...)
  2018-03-05 13:28 ` [RFC net-next 5/6] nfp: register repr ports for bond offloads John Hurley
@ 2018-03-05 13:28 ` John Hurley
  2018-03-05 21:43 ` [RFC net-next 0/6] offload linux bonding tc ingress rules Or Gerlitz
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: John Hurley @ 2018-03-05 13:28 UTC (permalink / raw)
  To: netdev; +Cc: jiri, ogerlitz, jakub.kicinski, simon.horman, John Hurley

If ports are bonded, the same rule with the same cookie may be offloaded
to multiple ports. Modify the rule lookup function to optionally include
an ingress netdev and a host context along with the cookie value when
searching for a rule. When a new rule is passed to the driver, the netdev
the rule is to be attached to is considered when searching for duplicates.
Conversely, when a stats update is received from the NFP card, the host
context is used alongside the cookie to map to the correct host rule.
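
In short (illustration only, mirroring the calls in the diff below): rule
add/delete/stats requests coming from TC look up on cookie plus ingress
netdev, while stats updates coming from the NFP look up on cookie plus
host context:

  /* TC path: ingress netdev known, host context is a don't-care */
  nfp_flow = nfp_flower_search_fl_table(app, flow->cookie, ingr_dev,
                                        NFP_FL_STATS_CTX_DONT_CARE);

  /* firmware stats path: host context known, netdev is a don't-care */
  nfp_flow = nfp_flower_search_fl_table(app, flower_cookie, NULL,
                                        stats->stats_con_id);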

Signed-off-by: John Hurley <john.hurley@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/flower/main.h   |  8 ++++--
 .../net/ethernet/netronome/nfp/flower/metadata.c   | 20 +++++++++------
 .../net/ethernet/netronome/nfp/flower/offload.c    | 29 ++++++++++++++++------
 3 files changed, 40 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.h b/drivers/net/ethernet/netronome/nfp/flower/main.h
index 5fd7c1f..2c48e10 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.h
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.h
@@ -47,6 +47,7 @@
 struct net_device;
 struct nfp_app;
 
+#define NFP_FL_STATS_CTX_DONT_CARE	cpu_to_be32(0xffffffff)
 #define NFP_FL_STATS_ENTRY_RS		BIT(20)
 #define NFP_FL_STATS_ELEM_RS		4
 #define NFP_FL_REPEATED_HASH_MAX	BIT(17)
@@ -166,6 +167,7 @@ struct nfp_fl_payload {
 	spinlock_t lock; /* lock stats */
 	struct nfp_fl_stats stats;
 	__be32 nfp_tun_ipv4_addr;
+	struct net_device *ingress_dev;
 	char *unmasked_data;
 	char *mask_data;
 	char *action_data;
@@ -193,12 +195,14 @@ int nfp_flower_compile_action(struct tc_cls_flower_offload *flow,
 			      struct nfp_fl_payload *nfp_flow);
 int nfp_compile_flow_metadata(struct nfp_app *app,
 			      struct tc_cls_flower_offload *flow,
-			      struct nfp_fl_payload *nfp_flow);
+			      struct nfp_fl_payload *nfp_flow,
+			      struct net_device *netdev);
 int nfp_modify_flow_metadata(struct nfp_app *app,
 			     struct nfp_fl_payload *nfp_flow);
 
 struct nfp_fl_payload *
-nfp_flower_search_fl_table(struct nfp_app *app, unsigned long tc_flower_cookie);
+nfp_flower_search_fl_table(struct nfp_app *app, unsigned long tc_flower_cookie,
+			   struct net_device *netdev, __be32 host_ctx);
 struct nfp_fl_payload *
 nfp_flower_remove_fl_table(struct nfp_app *app, unsigned long tc_flower_cookie);
 
diff --git a/drivers/net/ethernet/netronome/nfp/flower/metadata.c b/drivers/net/ethernet/netronome/nfp/flower/metadata.c
index db977cf..21668aa 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/metadata.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/metadata.c
@@ -99,14 +99,18 @@ static int nfp_get_stats_entry(struct nfp_app *app, u32 *stats_context_id)
 
 /* Must be called with either RTNL or rcu_read_lock */
 struct nfp_fl_payload *
-nfp_flower_search_fl_table(struct nfp_app *app, unsigned long tc_flower_cookie)
+nfp_flower_search_fl_table(struct nfp_app *app, unsigned long tc_flower_cookie,
+			   struct net_device *netdev, __be32 host_ctx)
 {
 	struct nfp_flower_priv *priv = app->priv;
 	struct nfp_fl_payload *flower_entry;
 
 	hash_for_each_possible_rcu(priv->flow_table, flower_entry, link,
 				   tc_flower_cookie)
-		if (flower_entry->tc_flower_cookie == tc_flower_cookie)
+		if (flower_entry->tc_flower_cookie == tc_flower_cookie &&
+		    (!netdev || flower_entry->ingress_dev == netdev) &&
+		    (host_ctx == NFP_FL_STATS_CTX_DONT_CARE ||
+		     flower_entry->meta.host_ctx_id == host_ctx))
 			return flower_entry;
 
 	return NULL;
@@ -121,13 +125,11 @@ nfp_flower_update_stats(struct nfp_app *app, struct nfp_fl_stats_frame *stats)
 	flower_cookie = be64_to_cpu(stats->stats_cookie);
 
 	rcu_read_lock();
-	nfp_flow = nfp_flower_search_fl_table(app, flower_cookie);
+	nfp_flow = nfp_flower_search_fl_table(app, flower_cookie, NULL,
+					      stats->stats_con_id);
 	if (!nfp_flow)
 		goto exit_rcu_unlock;
 
-	if (nfp_flow->meta.host_ctx_id != stats->stats_con_id)
-		goto exit_rcu_unlock;
-
 	spin_lock(&nfp_flow->lock);
 	nfp_flow->stats.pkts += be32_to_cpu(stats->pkt_count);
 	nfp_flow->stats.bytes += be64_to_cpu(stats->byte_count);
@@ -317,7 +319,8 @@ nfp_check_mask_remove(struct nfp_app *app, char *mask_data, u32 mask_len,
 
 int nfp_compile_flow_metadata(struct nfp_app *app,
 			      struct tc_cls_flower_offload *flow,
-			      struct nfp_fl_payload *nfp_flow)
+			      struct nfp_fl_payload *nfp_flow,
+			      struct net_device *netdev)
 {
 	struct nfp_flower_priv *priv = app->priv;
 	struct nfp_fl_payload *check_entry;
@@ -348,7 +351,8 @@ int nfp_compile_flow_metadata(struct nfp_app *app,
 	nfp_flow->stats.bytes = 0;
 	nfp_flow->stats.used = jiffies;
 
-	check_entry = nfp_flower_search_fl_table(app, flow->cookie);
+	check_entry = nfp_flower_search_fl_table(app, flow->cookie, netdev,
+						 NFP_FL_STATS_CTX_DONT_CARE);
 	if (check_entry) {
 		if (nfp_release_stats_entry(app, stats_cxt))
 			return -EINVAL;
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index eb8abeb..3b554ff 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -404,6 +404,9 @@ nfp_flower_add_offload(struct nfp_app *app, struct net_device *netdev,
 		goto err_free_key_ls;
 	}
 
+	if (!egress)
+		flow_pay->ingress_dev = netdev;
+
 	err = nfp_flower_compile_flow_match(flow, key_layer, netdev, flow_pay,
 					    tun_type);
 	if (err)
@@ -413,7 +416,8 @@ nfp_flower_add_offload(struct nfp_app *app, struct net_device *netdev,
 	if (err)
 		goto err_destroy_flow;
 
-	err = nfp_compile_flow_metadata(app, flow, flow_pay);
+	err = nfp_compile_flow_metadata(app, flow, flow_pay,
+					flow_pay->ingress_dev);
 	if (err)
 		goto err_destroy_flow;
 
@@ -455,13 +459,18 @@ nfp_flower_add_offload(struct nfp_app *app, struct net_device *netdev,
  */
 static int
 nfp_flower_del_offload(struct nfp_app *app, struct net_device *netdev,
-		       struct tc_cls_flower_offload *flow)
+		       struct tc_cls_flower_offload *flow, bool egress)
 {
 	struct nfp_port *port = nfp_port_from_netdev(netdev);
+	struct net_device *ingr_dev = NULL;
 	struct nfp_fl_payload *nfp_flow;
 	int err;
 
-	nfp_flow = nfp_flower_search_fl_table(app, flow->cookie);
+	if (!egress)
+		ingr_dev = netdev;
+
+	nfp_flow = nfp_flower_search_fl_table(app, flow->cookie, ingr_dev,
+					      NFP_FL_STATS_CTX_DONT_CARE);
 	if (!nfp_flow)
 		return -ENOENT;
 
@@ -498,11 +507,17 @@ nfp_flower_del_offload(struct nfp_app *app, struct net_device *netdev,
  * Return: negative value on error, 0 if stats populated successfully.
  */
 static int
-nfp_flower_get_stats(struct nfp_app *app, struct tc_cls_flower_offload *flow)
+nfp_flower_get_stats(struct nfp_app *app, struct net_device *netdev,
+		     struct tc_cls_flower_offload *flow, bool egress)
 {
+	struct net_device *ingr_dev = NULL;
 	struct nfp_fl_payload *nfp_flow;
 
-	nfp_flow = nfp_flower_search_fl_table(app, flow->cookie);
+	if (!egress)
+		ingr_dev = netdev;
+
+	nfp_flow = nfp_flower_search_fl_table(app, flow->cookie, ingr_dev,
+					      NFP_FL_STATS_CTX_DONT_CARE);
 	if (!nfp_flow)
 		return -EINVAL;
 
@@ -528,9 +543,9 @@ nfp_flower_repr_offload(struct nfp_app *app, struct net_device *netdev,
 	case TC_CLSFLOWER_REPLACE:
 		return nfp_flower_add_offload(app, netdev, flower, egress);
 	case TC_CLSFLOWER_DESTROY:
-		return nfp_flower_del_offload(app, netdev, flower);
+		return nfp_flower_del_offload(app, netdev, flower, egress);
 	case TC_CLSFLOWER_STATS:
-		return nfp_flower_get_stats(app, flower);
+		return nfp_flower_get_stats(app, netdev, flower, egress);
 	}
 
 	return -EOPNOTSUPP;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC net-next 4/6] nfp: add ndo_set_mac_address for representors
  2018-03-05 13:28 ` [RFC net-next 4/6] nfp: add ndo_set_mac_address for representors John Hurley
@ 2018-03-05 21:39   ` Or Gerlitz
  2018-03-05 23:20     ` John Hurley
  0 siblings, 1 reply; 16+ messages in thread
From: Or Gerlitz @ 2018-03-05 21:39 UTC (permalink / raw)
  To: John Hurley
  Cc: Linux Netdev List, Jiri Pirko, Or Gerlitz, Jakub Kicinski, Simon Horman

On Mon, Mar 5, 2018 at 3:28 PM, John Hurley <john.hurley@netronome.com> wrote:
> A representor hardware address does not have any meaning outside of the
> kernel netdev/networking stack. Thus there is no need for any app specific
> code for setting a representors hardware address, the default eth_mac_addr
> is sufficient.

where did you need that? does libvirt attempt to change the mac address, or
is it for bonding to call? worth mentioning the use-case in the change log

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC net-next 0/6] offload linux bonding tc ingress rules
  2018-03-05 13:28 [RFC net-next 0/6] offload linux bonding tc ingress rules John Hurley
                   ` (5 preceding siblings ...)
  2018-03-05 13:28 ` [RFC net-next 6/6] nfp: support offloading multiple rules with same cookie John Hurley
@ 2018-03-05 21:43 ` Or Gerlitz
  2018-03-05 23:57   ` John Hurley
  2018-03-05 22:08 ` Jakub Kicinski
  2018-03-06  7:49 ` Ido Schimmel
  8 siblings, 1 reply; 16+ messages in thread
From: Or Gerlitz @ 2018-03-05 21:43 UTC (permalink / raw)
  To: John Hurley
  Cc: Linux Netdev List, Jiri Pirko, Or Gerlitz, Jakub Kicinski, Simon Horman

On Mon, Mar 5, 2018 at 3:28 PM, John Hurley <john.hurley@netronome.com> wrote:
> This RFC patchset adds support for offloading tc ingress rules applied to
> linux bonds. The premise of these patches is that if a rule is applied to
> a bond port then the rule should be applied to each slave of the bond.
>
> The linux bond itself registers a cb for offloading tc rules. Potential
> slave netdevs on offload devices can then register with the bond for a
> further callback - this code is basically the same as registering for an
> egress dev offload in TC. Then when a rule is offloaded to the bond, it
> can be relayed to each netdev that has registered with the bond code and
> which is a slave of the given bond.
>
> To prevent sync issues between the kernel and offload device, the linux
> bond driver is effectively locked when it has offloaded rules, i.e. no new
> ports can be enslaved and no slaves can be released until the offload
> rules are removed. Similarly, if a port on a bond is deleted, the bond is
> destroyed, forcing a flush of all offloaded rules.
>
> Also included in the RFC are changes to the NFP driver to utilise the new
> code by registering NFP port representors for bond offload rules and
> modifying cookie handling to allow the relaying of a rule to multiple ports.

what is your approach for rules whose bond is their egress device
(ingress = vf port
representor)?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC net-next 0/6] offload linux bonding tc ingress rules
  2018-03-05 13:28 [RFC net-next 0/6] offload linux bonding tc ingress rules John Hurley
                   ` (6 preceding siblings ...)
  2018-03-05 21:43 ` [RFC net-next 0/6] offload linux bonding tc ingress rules Or Gerlitz
@ 2018-03-05 22:08 ` Jakub Kicinski
  2018-03-06  2:34   ` Roopa Prabhu
  2018-03-06  7:49 ` Ido Schimmel
  8 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2018-03-05 22:08 UTC (permalink / raw)
  To: John Hurley; +Cc: netdev, jiri, ogerlitz, simon.horman

On Mon,  5 Mar 2018 13:28:28 +0000, John Hurley wrote:
> The linux bond itself registers a cb for offloading tc rules. Potential
> slave netdevs on offload devices can then register with the bond for a
> further callback - this code is basically the same as registering for an
> egress dev offload in TC. Then when a rule is offloaded to the bond, it
> can be relayed to each netdev that has registered with the bond code and
> which is a slave of the given bond.

As you know I would much rather see this handled in the TC core,
similarly to how blocks are shared.  We can add a new .ndo_setup_tc
notification like TC_MASTER_BLOCK_BIND and reuse the existing offload
tracking.  It would also fix the problem of freezing the bond and allow
better code reuse with team etc.

For tunnel offloads we necessarily have to stick to the weak offload
model, where any offload success satisfies skip_sw, but in case of
bonds we should strive for the strong model (as you are doing AFAICT).

The only difficulty seems to be replaying the bind commands when a port
joins?  I mean finding all blocks on a bond.  But that should be
surmountable...

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC net-next 4/6] nfp: add ndo_set_mac_address for representors
  2018-03-05 21:39   ` Or Gerlitz
@ 2018-03-05 23:20     ` John Hurley
  0 siblings, 0 replies; 16+ messages in thread
From: John Hurley @ 2018-03-05 23:20 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Linux Netdev List, Jiri Pirko, Or Gerlitz, Jakub Kicinski, Simon Horman

On Mon, Mar 5, 2018 at 9:39 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Mon, Mar 5, 2018 at 3:28 PM, John Hurley <john.hurley@netronome.com> wrote:
>> A representor hardware address does not have any meaning outside of the
>> kernel netdev/networking stack. Thus there is no need for any app specific
>> code for setting a representors hardware address, the default eth_mac_addr
>> is sufficient.
>
> where did you need that? does libvirt attempts to change the mac address or
> it's for bonding to call, worth mentioning the use-case in the change log

Hi Or,
yes, setting the mac is required to add the repr to a linux bond.
I agree, I should add the use case here. Thanks

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC net-next 0/6] offload linux bonding tc ingress rules
  2018-03-05 21:43 ` [RFC net-next 0/6] offload linux bonding tc ingress rules Or Gerlitz
@ 2018-03-05 23:57   ` John Hurley
  2018-03-06  0:16     ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: John Hurley @ 2018-03-05 23:57 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Linux Netdev List, Jiri Pirko, Or Gerlitz, Jakub Kicinski, Simon Horman

On Mon, Mar 5, 2018 at 9:43 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Mon, Mar 5, 2018 at 3:28 PM, John Hurley <john.hurley@netronome.com> wrote:
>> This RFC patchset adds support for offloading tc ingress rules applied to
>> linux bonds. The premise of these patches is that if a rule is applied to
>> a bond port then the rule should be applied to each slave of the bond.
>>
>> The linux bond itself registers a cb for offloading tc rules. Potential
>> slave netdevs on offload devices can then register with the bond for a
>> further callback - this code is basically the same as registering for an
>> egress dev offload in TC. Then when a rule is offloaded to the bond, it
>> can be relayed to each netdev that has registered with the bond code and
>> which is a slave of the given bond.
>>
>> To prevent sync issues between the kernel and offload device, the linux
>> bond driver is effectively locked when it has offloaded rules, i.e. no new
>> ports can be enslaved and no slaves can be released until the offload
>> rules are removed. Similarly, if a port on a bond is deleted, the bond is
>> destroyed, forcing a flush of all offloaded rules.
>>
>> Also included in the RFC are changes to the NFP driver to utilise the new
>> code by registering NFP port representors for bond offload rules and
>> modifying cookie handling to allow the relaying of a rule to multiple ports.
>
> what is your approach for rules whose bond is their egress device
> (ingress = vf port
> representor)?

Egress offload will be handled entirely by the NFP driver.
Basically, notifiers will track the slaves/masters and update the card
with any groups that consist entirely of reprs.
We then offload the TC rule outputting to the given group - because it
is an ingress match we can access the egress netdev in the block
callback.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC net-next 0/6] offload linux bonding tc ingress rules
  2018-03-05 23:57   ` John Hurley
@ 2018-03-06  0:16     ` Jakub Kicinski
  0 siblings, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2018-03-06  0:16 UTC (permalink / raw)
  To: John Hurley
  Cc: Or Gerlitz, Linux Netdev List, Jiri Pirko, Or Gerlitz, Simon Horman

On Mon, 5 Mar 2018 23:57:18 +0000, John Hurley wrote:
> > what is your approach for rules whose bond is their egress device
> > (ingress = vf port
> > representor)?  
> 
> Egress offload will be handled entirely by the NFP driver.
> Basically, notifiers will track the slaves/masters and update the card
> with any groups that consist entirely of reprs.
> We then offload the TC rule outputting to the given group - because it
> is an ingress match we can access the egress netdev in the block
> callback.

And you handle egdev call too?  Or are we hoping to get rid of that
before? :)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC net-next 0/6] offload linux bonding tc ingress rules
  2018-03-05 22:08 ` Jakub Kicinski
@ 2018-03-06  2:34   ` Roopa Prabhu
  0 siblings, 0 replies; 16+ messages in thread
From: Roopa Prabhu @ 2018-03-06  2:34 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: John Hurley, netdev, Jiri Pirko, Or Gerlitz, Simon Horman

On Mon, Mar 5, 2018 at 2:08 PM, Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
> On Mon,  5 Mar 2018 13:28:28 +0000, John Hurley wrote:
>> The linux bond itself registers a cb for offloading tc rules. Potential
>> slave netdevs on offload devices can then register with the bond for a
>> further callback - this code is basically the same as registering for an
>> egress dev offload in TC. Then when a rule is offloaded to the bond, it
>> can be relayed to each netdev that has registered with the bond code and
>> which is a slave of the given bond.
>
> As you know I would much rather see this handled in the TC core,
> similarly to how blocks are shared.  We can add a new .ndo_setup_tc
> notification like TC_MASTER_BLOCK_BIND and reuse the existing offload
> tracking.  It would also fix the problem of freezing the bond and allow
> better code reuse with team etc.

+1 to handle this in tc core. We will soon find that many other devices
will need to propagate rules down the netdev stack, and keeping it in tc
core allows re-use like you state above.
The switchdev APIs, before they moved to notifiers, in many cases had bond
and other netdev-stack offload traversal inside the switchdev layer. (In
the notifier world, I think a driver can still register and track rules
and other offloads on bonds with its own ports as slaves.)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC net-next 0/6] offload linux bonding tc ingress rules
  2018-03-05 13:28 [RFC net-next 0/6] offload linux bonding tc ingress rules John Hurley
                   ` (7 preceding siblings ...)
  2018-03-05 22:08 ` Jakub Kicinski
@ 2018-03-06  7:49 ` Ido Schimmel
  8 siblings, 0 replies; 16+ messages in thread
From: Ido Schimmel @ 2018-03-06  7:49 UTC (permalink / raw)
  To: John Hurley; +Cc: netdev, jiri, ogerlitz, jakub.kicinski, simon.horman

On Mon, Mar 05, 2018 at 01:28:28PM +0000, John Hurley wrote:
> To prevent sync issues between the kernel and offload device, the linux
> bond driver is effectively locked when it has offloaded rules, i.e. no new
> ports can be enslaved and no slaves can be released until the offload
> rules are removed. Similarly, if a port on a bond is deleted, the bond is
> destroyed, forcing a flush of all offloaded rules.

Hi John,

I understand where this is coming from, but I don't think these
semantics are acceptable. The part about not adding new slaves might
make sense, but one needs to be able to remove slaves at any time.

Anyway, it would be much better to handle this in a generic way that
team and other stacked devices can later re-use. There's a similar sync
issue with VLAN filtering, which is handled by bond/team by calling
vlan_vids_add_by_dev() and vlan_vids_del_by_dev() in their
ndo_add_slave() and ndo_del_slave(), respectively. You can do something
similar and call into TC to replay the necessary information to the
newly added slave?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC net-next 2/6] driver: net: bonding: allow registration of tc offload callbacks in bond
  2018-03-05 13:28 ` [RFC net-next 2/6] driver: net: bonding: allow registration of tc offload callbacks in bond John Hurley
@ 2018-03-07 10:57   ` Jiri Pirko
  0 siblings, 0 replies; 16+ messages in thread
From: Jiri Pirko @ 2018-03-07 10:57 UTC (permalink / raw)
  To: John Hurley; +Cc: netdev, jiri, ogerlitz, jakub.kicinski, simon.horman

Mon, Mar 05, 2018 at 02:28:30PM CET, john.hurley@netronome.com wrote:
>Allow drivers to register netdev callbacks for tc offload in linux bonds.
>If a netdev has registered and is a slave of a given bond, then any tc
>rules offloaded to the bond will be relayed to it if both the bond and the
>slave permit hw offload.
>
>Because the bond itself is not offloaded, just the rules, we don't care
>about whether the bond ports are on the same device or whether some of
>slaves are representor ports and some are not.
>
>Signed-off-by: John Hurley <john.hurley@netronome.com>
>---
> drivers/net/bonding/bond_main.c | 195 +++++++++++++++++++++++++++++++++++++++-
> include/net/bonding.h           |   7 ++
> 2 files changed, 201 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index e6415f6..d9e41cf 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c

[...]


>+EXPORT_SYMBOL_GPL(tc_setup_cb_bond_register);

Please, no "bond" specific calls from drivers. That would be wrong.
The idea behing block callbacks was that anyone who is interested could
register to receive those. In this case, slave device is interested.
So it should register to receive block callbacks in the same way as if
the block was directly on top of the slave device. The only thing you
need to handle is to propagate block bind/unbind from master down to the
slaves.
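
Purely as an illustration of the suggestion above, not code from this
thread: one hypothetical shape of propagating the block bind/unbind from
the bond down to its slaves could be

  static int bond_setup_tc_block(struct net_device *dev,
                                 struct tc_block_offload *f)
  {
          struct bonding *bond = netdev_priv(dev);
          struct list_head *iter;
          struct slave *slave;
          int err;

          bond_for_each_slave(bond, slave, iter) {
                  if (!slave->dev->netdev_ops->ndo_setup_tc)
                          continue;
                  /* relay the same block bind/unbind to the slave */
                  err = slave->dev->netdev_ops->ndo_setup_tc(slave->dev,
                                                             TC_SETUP_BLOCK,
                                                             f);
                  if (err && err != -EOPNOTSUPP)
                          return err;
          }
          return 0;
  }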

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2018-03-07 10:57 UTC | newest]

Thread overview: 16+ messages
-- links below jump to the message on this page --
2018-03-05 13:28 [RFC net-next 0/6] offload linux bonding tc ingress rules John Hurley
2018-03-05 13:28 ` [RFC net-next 1/6] drivers: net: bonding: add tc offload infrastructure to bond John Hurley
2018-03-05 13:28 ` [RFC net-next 2/6] driver: net: bonding: allow registration of tc offload callbacks in bond John Hurley
2018-03-07 10:57   ` Jiri Pirko
2018-03-05 13:28 ` [RFC net-next 3/6] drivers: net: bonding: restrict bond mods when rules are offloaded John Hurley
2018-03-05 13:28 ` [RFC net-next 4/6] nfp: add ndo_set_mac_address for representors John Hurley
2018-03-05 21:39   ` Or Gerlitz
2018-03-05 23:20     ` John Hurley
2018-03-05 13:28 ` [RFC net-next 5/6] nfp: register repr ports for bond offloads John Hurley
2018-03-05 13:28 ` [RFC net-next 6/6] nfp: support offloading multiple rules with same cookie John Hurley
2018-03-05 21:43 ` [RFC net-next 0/6] offload linux bonding tc ingress rules Or Gerlitz
2018-03-05 23:57   ` John Hurley
2018-03-06  0:16     ` Jakub Kicinski
2018-03-05 22:08 ` Jakub Kicinski
2018-03-06  2:34   ` Roopa Prabhu
2018-03-06  7:49 ` Ido Schimmel
