* [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev
@ 2016-09-28 12:42 Simon Horman
       [not found] ` <1475066582-1971-1-git-send-email-simon.horman-wFxRvT7yatFl57MIdRCFDg@public.gmane.org>
                   ` (11 more replies)
  0 siblings, 12 replies; 17+ messages in thread
From: Simon Horman @ 2016-09-28 12:42 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA, dev-yBygre7rU0TnMu66kgdUjQ; +Cc: Simon Horman

This series provides a prototype of programming Open vSwitch (-like) flows
into hardware using SwitchDev. It is a rework of an approach which I
previously posted in 2014, and which Netronome has been using in real-world
products for some time now.

Since that time upstream support for offloading flows has evolved somewhat,
as we can see in both the provision for TC to offload classifiers to hardware
and the evolution of eBPF. With Netdev 1.2 approaching, it seems timely
to revisit this approach.

In this approach flows are programmed into hardware by the kernel, and
the user of this facility provided by this patchset is the Open vSwitch
kernel datapath. By default the implementation tries to program flows into
both hardware and software, and fails only if programming into software is
not successful.

A netlink attribute, OVS_FLOW_ATTR_HW_STATUS, is provided to allow
user-space to determine if a flow was programmed into hardware or not.

User-space may ask for a flow not to be programmed into hardware using
OVS_FLOW_HW_REQ_SKIP_HW. There is also scope to skip adding flows into
software, that is, to program them only into hardware, but that is not
implemented at this time.
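
The policy described above can be sketched in user-space C as follows. This
is only an illustration under stated assumptions: the enum names, the
install_fn callbacks and the order of the two install steps are hypothetical
and are not taken from the patches; only the behaviour (software failure is
fatal, hardware failure is not, hardware may be skipped on request) comes
from the description above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical request flags and status values, for illustration only. */
enum hw_req { HW_REQ_DEFAULT, HW_REQ_SKIP_HW };
enum hw_status { HW_STATUS_NOT_PRESENT, HW_STATUS_PRESENT };

struct install_result {
	int err;               /* 0 on success, negative errno otherwise */
	enum hw_status status; /* what user-space could read back */
};

/* Stand-ins for the real software and hardware install paths. */
typedef int (*install_fn)(void);

static struct install_result flow_install(enum hw_req req,
					  install_fn sw_install,
					  install_fn hw_install)
{
	struct install_result res = { 0, HW_STATUS_NOT_PRESENT };

	assert(sw_install && hw_install);

	/* Software is mandatory: a failure here fails the whole request. */
	res.err = sw_install();
	if (res.err)
		return res;

	/* Hardware is best effort and may be skipped on request. */
	if (req != HW_REQ_SKIP_HW && hw_install() == 0)
		res.status = HW_STATUS_PRESENT;

	return res;
}

/* Trivial callbacks used to exercise the policy. */
static int install_ok(void) { return 0; }
static int install_unsupported(void) { return -EOPNOTSUPP; }
static int install_nomem(void) { return -ENOMEM; }
```

With these stand-ins, a hardware install that returns -EOPNOTSUPP still
yields a successful request whose status is "not present", while a software
install that returns -ENOMEM fails the request outright.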

This should allow existing users of that datapath, including but not
limited to the Open vSwitch user-space, to use these offloads with little
or no modification.

SwitchDev was chosen for this implementation as it already provides
offload of FDB and FIB entries, which are to some extent flows. So overall
the approach taken here is to add a new type of flow to SwitchDev.
Other options include NDOs and calling into TC, neither of which
I have prototyped but both of which seem entirely reasonable to me.


This prototype consists of three parts:
* Updates to SwitchDev to add support for a new flow object
* Implementation of support for the new flow object in Rocker and
  its OF-DPA world.
  - This is to provide a working example. In practice OF-DPA seems
    extremely limited in terms of its capacity to offload
    Open vSwitch (-like) flows.
* Updates to the Open vSwitch datapath to use the new SwitchDev flow
  objects

There are also minor enhancements to the QEMU implementation of rocker
to add byte and idle-time statistics to OF-DPA flows. This moves the
implementation outside the scope of OF-DPA, but it was the best mechanism
I came up with to exercise this approach.

They are here: https://github.com/horms/qemu rocker-stats-20160926

No changes to the Open vSwitch user-space are required to exercise this
code.


A different approach, not implemented by this patch-set, is for user-space
to program flows into hardware by some other means, for example TC, and/or
the (kernel) datapath. I believe that approach does not conflict with this
one. And there is some scope to share infrastructure in the kernel.


Simon Horman (12):
  sw_flow: make struct sw_flow_key available outside of net/openvswitch/
  switchdev: Add Open vSwitch (-like) flow object support
  switchdev: Add support for getting port object details
  rocker: Add Open vSwitch (-like) flow support
  rocker: Support Open vSwitch (-like) flow stats
  rocker: Add helper to check ports belong to the same rocker switch
  rocker: switchdev Add Open vSwitch (-like) flow support to OF-DPA
    world
  rocker: Support Open vSwitch (-like) flow stats in OF-DPA world
  openvswitch: Add key_attrs to struct sw_flow_match
  openvswitch: make get_dp_rcu() available outside datapath.c
  openvswitch: Support programming of flows into hardware
  hack: rocker: no ip frag match

 drivers/net/ethernet/rocker/rocker.h       |  11 +
 drivers/net/ethernet/rocker/rocker_hw.h    |   4 +
 drivers/net/ethernet/rocker/rocker_main.c  |  75 +++++++
 drivers/net/ethernet/rocker/rocker_ofdpa.c | 350 ++++++++++++++++++++++++++++-
 include/linux/sw_flow.h                    | 100 +++++++++
 include/net/switchdev.h                    |  74 ++++++
 include/uapi/linux/openvswitch.h           |  36 +++
 net/openvswitch/datapath.c                 |  77 ++++++-
 net/openvswitch/datapath.h                 |   2 +
 net/openvswitch/flow.c                     | 173 ++++++++++++++
 net/openvswitch/flow.h                     | 135 +++++------
 net/openvswitch/flow_netlink.c             |  56 ++++-
 net/openvswitch/flow_netlink.h             |   3 +
 net/openvswitch/vport-netdev.c             |  39 ++++
 net/switchdev/switchdev.c                  | 119 ++++++++++
 15 files changed, 1165 insertions(+), 89 deletions(-)
 create mode 100644 include/linux/sw_flow.h

-- 
2.7.0.rc3.207.g0ac5344

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


* [PATCH/RFC 01/12] sw_flow: make struct sw_flow_key available outside of net/openvswitch/
       [not found] ` <1475066582-1971-1-git-send-email-simon.horman-wFxRvT7yatFl57MIdRCFDg@public.gmane.org>
@ 2016-09-28 12:42   ` Simon Horman
  2016-09-28 13:54   ` [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev Or Gerlitz
  1 sibling, 0 replies; 17+ messages in thread
From: Simon Horman @ 2016-09-28 12:42 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA, dev-yBygre7rU0TnMu66kgdUjQ; +Cc: Simon Horman

This is preparation for using struct sw_flow_key as a structure
to describe Open vSwitch (-like) flows to hardware. This structure
was chosen because it has the required fields. It should also
be possible to use a different structure if desired.

There are a few fields and structures used in struct sw_flow_key which
have ovs in their name. Some consideration could be given to:

* Renaming them to make them more generic
* Providing a trimmed-down structure
* Using an alternate structure
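
As background on why the structure is suitable for describing flows, the
sketch below shows a masked key comparison done one long at a time, which is
what the alignment attribute on struct sw_flow_key is there to permit. The
struct toy_flow_key and toy_key_match() names are hypothetical, the key is
heavily trimmed, and this is plain user-space C rather than the datapath's
own matching code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily trimmed key; the real struct sw_flow_key is larger. */
struct toy_flow_key {
	uint32_t in_port;
	uint8_t  eth_src[6];
	uint8_t  eth_dst[6];
	uint16_t eth_type;
	uint16_t pad;
} __attribute__((aligned(sizeof(long))));

/* The alignment rounds the size up to a whole number of longs. */
_Static_assert(sizeof(struct toy_flow_key) % sizeof(long) == 0,
	       "key must be a whole number of longs");

/*
 * Return true if pkt matches key under mask. Callers are assumed to have
 * zeroed all three structures with memset() before filling them in, so
 * that padding bytes compare equal.
 */
static bool toy_key_match(const struct toy_flow_key *pkt,
			  const struct toy_flow_key *key,
			  const struct toy_flow_key *mask)
{
	const unsigned char *p = (const unsigned char *)pkt;
	const unsigned char *k = (const unsigned char *)key;
	const unsigned char *m = (const unsigned char *)mask;
	size_t i;

	assert(pkt && key && mask);

	for (i = 0; i < sizeof(*pkt); i += sizeof(long)) {
		unsigned long pw, kw, mw;

		/* memcpy avoids aliasing problems outside the kernel. */
		memcpy(&pw, p + i, sizeof(pw));
		memcpy(&kw, k + i, sizeof(kw));
		memcpy(&mw, m + i, sizeof(mw));
		if ((pw & mw) != (kw & mw))
			return false;
	}
	return true;
}
```

A mask with only eth_type set to all ones makes the comparison ignore every
other field, which is the wildcarding behaviour a hardware flow table would
need to reproduce.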

Signed-off-by: Simon Horman <simon.horman@netronome.com>
---
 include/linux/sw_flow.h | 100 ++++++++++++++++++++++++++++++++++++++++++++++++
 net/openvswitch/flow.h  |  75 +-----------------------------------
 2 files changed, 101 insertions(+), 74 deletions(-)
 create mode 100644 include/linux/sw_flow.h

diff --git a/include/linux/sw_flow.h b/include/linux/sw_flow.h
new file mode 100644
index 000000000000..17e25418dc56
--- /dev/null
+++ b/include/linux/sw_flow.h
@@ -0,0 +1,100 @@
+/*
+ * Copyright (c) 2007-2011 Nicira Networks.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+ * 02110-1301, USA
+ */
+
+#ifndef _LINUX_SW_FLOW_H
+#define _LINUX_SW_FLOW_H 1
+
+#include <net/ip_tunnels.h>
+#include <uapi/linux/openvswitch.h>
+
+struct vlan_head {
+	__be16 tpid; /* Vlan type. Generally 802.1q or 802.1ad.*/
+	__be16 tci;  /* 0 if no VLAN, VLAN_TAG_PRESENT set otherwise. */
+};
+
+struct sw_flow_key {
+	u8 tun_opts[IP_TUNNEL_OPTS_MAX];
+	u8 tun_opts_len;
+	struct ip_tunnel_key tun_key;	/* Encapsulating tunnel key. */
+	struct {
+		u32	priority;	/* Packet QoS priority. */
+		u32	skb_mark;	/* SKB mark. */
+		u16	in_port;	/* Input switch port (or DP_MAX_PORTS). */
+	} __packed phy; /* Safe when right after 'tun_key'. */
+	u8 tun_proto;			/* Protocol of encapsulating tunnel. */
+	u32 ovs_flow_hash;		/* Datapath computed hash value.  */
+	u32 recirc_id;			/* Recirculation ID.  */
+	struct {
+		u8     src[ETH_ALEN];	/* Ethernet source address. */
+		u8     dst[ETH_ALEN];	/* Ethernet destination address. */
+		struct vlan_head vlan;
+		struct vlan_head cvlan;
+		__be16 type;		/* Ethernet frame type. */
+	} eth;
+	union {
+		struct {
+			__be32 top_lse;	/* top label stack entry */
+		} mpls;
+		struct {
+			u8     proto;	/* IP protocol or lower 8 bits of ARP opcode. */
+			u8     tos;	    /* IP ToS. */
+			u8     ttl;	    /* IP TTL/hop limit. */
+			u8     frag;	/* One of OVS_FRAG_TYPE_*. */
+		} ip;
+	};
+	struct {
+		__be16 src;		/* TCP/UDP/SCTP source port. */
+		__be16 dst;		/* TCP/UDP/SCTP destination port. */
+		__be16 flags;		/* TCP flags. */
+	} tp;
+	union {
+		struct {
+			struct {
+				__be32 src;	/* IP source address. */
+				__be32 dst;	/* IP destination address. */
+			} addr;
+			struct {
+				u8 sha[ETH_ALEN];	/* ARP source hardware address. */
+				u8 tha[ETH_ALEN];	/* ARP target hardware address. */
+			} arp;
+		} ipv4;
+		struct {
+			struct {
+				struct in6_addr src;	/* IPv6 source address. */
+				struct in6_addr dst;	/* IPv6 destination address. */
+			} addr;
+			__be32 label;			/* IPv6 flow label. */
+			struct {
+				struct in6_addr target;	/* ND target address. */
+				u8 sll[ETH_ALEN];	/* ND source link layer address. */
+				u8 tll[ETH_ALEN];	/* ND target link layer address. */
+			} nd;
+		} ipv6;
+	};
+	struct {
+		/* Connection tracking fields. */
+		u16 zone;
+		u32 mark;
+		u8 state;
+		struct ovs_key_ct_labels labels;
+	} ct;
+
+} __aligned(BITS_PER_LONG/8); /* Ensure that we can do comparisons as longs. */
+
+
+#endif /* _LINUX_SW_FLOW_H */
diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index ae783f5c6695..0c70c3532469 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -31,6 +31,7 @@
 #include <linux/jiffies.h>
 #include <linux/time.h>
 #include <linux/flex_array.h>
+#include <linux/sw_flow.h>
 #include <net/inet_ecn.h>
 #include <net/ip_tunnels.h>
 #include <net/dst_metadata.h>
@@ -50,84 +51,10 @@ struct ovs_tunnel_info {
 	struct metadata_dst	*tun_dst;
 };
 
-struct vlan_head {
-	__be16 tpid; /* Vlan type. Generally 802.1q or 802.1ad.*/
-	__be16 tci;  /* 0 if no VLAN, VLAN_TAG_PRESENT set otherwise. */
-};
-
 #define OVS_SW_FLOW_KEY_METADATA_SIZE			\
 	(offsetof(struct sw_flow_key, recirc_id) +	\
 	FIELD_SIZEOF(struct sw_flow_key, recirc_id))
 
-struct sw_flow_key {
-	u8 tun_opts[IP_TUNNEL_OPTS_MAX];
-	u8 tun_opts_len;
-	struct ip_tunnel_key tun_key;	/* Encapsulating tunnel key. */
-	struct {
-		u32	priority;	/* Packet QoS priority. */
-		u32	skb_mark;	/* SKB mark. */
-		u16	in_port;	/* Input switch port (or DP_MAX_PORTS). */
-	} __packed phy; /* Safe when right after 'tun_key'. */
-	u8 tun_proto;			/* Protocol of encapsulating tunnel. */
-	u32 ovs_flow_hash;		/* Datapath computed hash value.  */
-	u32 recirc_id;			/* Recirculation ID.  */
-	struct {
-		u8     src[ETH_ALEN];	/* Ethernet source address. */
-		u8     dst[ETH_ALEN];	/* Ethernet destination address. */
-		struct vlan_head vlan;
-		struct vlan_head cvlan;
-		__be16 type;		/* Ethernet frame type. */
-	} eth;
-	union {
-		struct {
-			__be32 top_lse;	/* top label stack entry */
-		} mpls;
-		struct {
-			u8     proto;	/* IP protocol or lower 8 bits of ARP opcode. */
-			u8     tos;	    /* IP ToS. */
-			u8     ttl;	    /* IP TTL/hop limit. */
-			u8     frag;	/* One of OVS_FRAG_TYPE_*. */
-		} ip;
-	};
-	struct {
-		__be16 src;		/* TCP/UDP/SCTP source port. */
-		__be16 dst;		/* TCP/UDP/SCTP destination port. */
-		__be16 flags;		/* TCP flags. */
-	} tp;
-	union {
-		struct {
-			struct {
-				__be32 src;	/* IP source address. */
-				__be32 dst;	/* IP destination address. */
-			} addr;
-			struct {
-				u8 sha[ETH_ALEN];	/* ARP source hardware address. */
-				u8 tha[ETH_ALEN];	/* ARP target hardware address. */
-			} arp;
-		} ipv4;
-		struct {
-			struct {
-				struct in6_addr src;	/* IPv6 source address. */
-				struct in6_addr dst;	/* IPv6 destination address. */
-			} addr;
-			__be32 label;			/* IPv6 flow label. */
-			struct {
-				struct in6_addr target;	/* ND target address. */
-				u8 sll[ETH_ALEN];	/* ND source link layer address. */
-				u8 tll[ETH_ALEN];	/* ND target link layer address. */
-			} nd;
-		} ipv6;
-	};
-	struct {
-		/* Connection tracking fields. */
-		u16 zone;
-		u32 mark;
-		u8 state;
-		struct ovs_key_ct_labels labels;
-	} ct;
-
-} __aligned(BITS_PER_LONG/8); /* Ensure that we can do comparisons as longs. */
-
 struct sw_flow_key_range {
 	unsigned short int start;
 	unsigned short int end;
-- 
2.7.0.rc3.207.g0ac5344


* [PATCH/RFC 02/12] switchdev: Add Open vSwitch (-like) flow object support
  2016-09-28 12:42 [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev Simon Horman
       [not found] ` <1475066582-1971-1-git-send-email-simon.horman-wFxRvT7yatFl57MIdRCFDg@public.gmane.org>
@ 2016-09-28 12:42 ` Simon Horman
  2016-09-28 12:42 ` [PATCH/RFC 03/12] switchdev: Add support for getting port object details Simon Horman
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Simon Horman @ 2016-09-28 12:42 UTC (permalink / raw)
  To: netdev, dev; +Cc: Simon Horman

The motivation for this patch is to provide objects for Open vSwitch (-like)
flows so that they may be programmed into hardware using switchdev.

The structures used here may well prove to be too Open vSwitch centric, but
the purpose of the prototype of which this patch is part is to explore whether
switchdev is an appropriate mechanism for programming Open vSwitch (-like)
flows into hardware. The data structures can be tweaked/reworked as needed.
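
For readers unfamiliar with how switchdev objects are passed around, the
sketch below shows the embedding pattern in stand-alone C: a generic object
header is embedded in a type-specific structure, and the receiver recovers
the outer structure from the header. All toy_* names are hypothetical, and
toy_container_of is a simplified stand-in for the kernel's container_of().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's container_of(). */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

enum toy_obj_id { TOY_OBJ_ID_PORT_FDB, TOY_OBJ_SW_FLOW };

/* Generic header that every object type embeds. */
struct toy_obj {
	enum toy_obj_id id;
};

/* Flow object: header plus flow-specific fields (trimmed). */
struct toy_obj_sw_flow {
	struct toy_obj obj;
	const void *key;
	const void *mask;
	uint64_t attrs;
};

/*
 * Driver-side handler: it only ever sees the generic header and uses the
 * id to decide which outer structure to recover.
 */
static int toy_port_obj_add(const struct toy_obj *obj, uint64_t *attrs_out)
{
	const struct toy_obj_sw_flow *flow;

	assert(obj && attrs_out);

	switch (obj->id) {
	case TOY_OBJ_SW_FLOW:
		flow = toy_container_of(obj, struct toy_obj_sw_flow, obj);
		*attrs_out = flow->attrs;
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}
```

The caller builds the flow object on its own stack and passes only the
address of the embedded header, which keeps the generic add/del entry points
unaware of flow-specific fields.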

Signed-off-by: Simon Horman <simon.horman@netronome.com>
---
 include/net/switchdev.h   | 39 +++++++++++++++++++++++++++++++
 net/switchdev/switchdev.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 97 insertions(+)

diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 729fe1534160..8eda96e46f98 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -71,6 +71,7 @@ enum switchdev_obj_id {
 	SWITCHDEV_OBJ_ID_IPV4_FIB,
 	SWITCHDEV_OBJ_ID_PORT_FDB,
 	SWITCHDEV_OBJ_ID_PORT_MDB,
+	SWITCHDEV_OBJ_SW_FLOW,
 };
 
 struct switchdev_obj {
@@ -128,6 +129,19 @@ struct switchdev_obj_port_mdb {
 #define SWITCHDEV_OBJ_PORT_MDB(obj) \
 	container_of(obj, struct switchdev_obj_port_mdb, obj)
 
+/* SWITCHDEV_OBJ_SW_FLOW */
+struct switchdev_obj_sw_flow {
+	struct switchdev_obj obj;
+	const struct sw_flow_key *key;
+	const struct sw_flow_key *mask;
+	u64 attrs;
+	const struct nlattr *actions;
+	u32 actions_len;
+};
+
+#define SWITCHDEV_OBJ_SW_FLOW(obj) \
+	container_of(obj, struct switchdev_obj_sw_flow, obj)
+
 void switchdev_trans_item_enqueue(struct switchdev_trans *trans,
 				  void *data, void (*destructor)(void const *),
 				  struct switchdev_trans_item *tritem);
@@ -223,6 +237,13 @@ int switchdev_port_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
 int switchdev_port_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb,
 			    struct net_device *dev,
 			    struct net_device *filter_dev, int *idx);
+int switchdev_sw_flow_add(struct net_device *dev,
+			  const struct sw_flow_key *key,
+			  const struct sw_flow_key *mask, u64 attrs,
+			  const struct nlattr *actions, u32 actions_len);
+int switchdev_sw_flow_del(struct net_device *dev,
+			  const struct sw_flow_key *key,
+			  const struct sw_flow_key *mask, u64 attrs);
 void switchdev_port_fwd_mark_set(struct net_device *dev,
 				 struct net_device *group_dev,
 				 bool joining);
@@ -347,6 +368,24 @@ static inline int switchdev_port_fdb_dump(struct sk_buff *skb,
        return *idx;
 }
 
+static inline int switchdev_sw_flow_add(struct net_device *dev,
+					const struct sw_flow_key *key,
+					const struct sw_flow_key *mask,
+					u64 attrs,
+					const struct nlattr *actions,
+					u32 actions_len)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int switchdev_sw_flow_del(struct net_device *dev,
+					const struct sw_flow_key *key,
+					const struct sw_flow_key *mask,
+					u64 attrs)
+{
+	return -EOPNOTSUPP;
+}
+
 static inline bool switchdev_port_same_parent_id(struct net_device *a,
 						 struct net_device *b)
 {
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index 10b819308439..db96c3345129 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -1286,6 +1286,64 @@ void switchdev_fib_ipv4_abort(struct fib_info *fi)
 }
 EXPORT_SYMBOL_GPL(switchdev_fib_ipv4_abort);
 
+/**
+ *	switchdev_sw_flow_add - Program a flow into a switch port
+ *
+ *	@dev: port device
+ *      @key: flow key
+ *      @mask: flow mask
+ *      @attrs: attributes present in key
+ *	@actions: actions of the flow
+ *	@actions_len: length of @actions
+ *
+ *	Program a flow into a port device where the flow is expressed as
+ *	an Open vSwitch flow key, mask, attributes, and actions
+ */
+int switchdev_sw_flow_add(struct net_device *dev,
+			  const struct sw_flow_key *key,
+			  const struct sw_flow_key *mask,
+			  u64 attrs, const struct nlattr *actions,
+			  u32 actions_len)
+{
+	struct switchdev_obj_sw_flow sw_flow = {
+		.obj.id = SWITCHDEV_OBJ_SW_FLOW,
+		.key = key,
+		.mask = mask,
+		.attrs = attrs,
+		.actions = actions,
+		.actions_len = actions_len,
+	};
+
+	return switchdev_port_obj_add(dev, &sw_flow.obj);
+}
+EXPORT_SYMBOL_GPL(switchdev_sw_flow_add);
+
+/**
+ *	switchdev_sw_flow_del - Delete flow from switch
+ *
+ *	@dev: port device
+ *      @key: flow key
+ *      @mask: flow mask
+ *      @attrs: attributes present in key
+ *
+ *	Delete a flow from a device where the flow is expressed as
+ *	an Open vSwitch flow key, mask and attributes.
+ */
+int switchdev_sw_flow_del(struct net_device *dev,
+			  const struct sw_flow_key *key,
+			  const struct sw_flow_key *mask, u64 attrs)
+{
+	struct switchdev_obj_sw_flow sw_flow = {
+		.obj.id = SWITCHDEV_OBJ_SW_FLOW,
+		.key = key,
+		.mask = mask,
+		.attrs = attrs,
+	};
+
+	return switchdev_port_obj_del(dev, &sw_flow.obj);
+}
+EXPORT_SYMBOL_GPL(switchdev_sw_flow_del);
+
 bool switchdev_port_same_parent_id(struct net_device *a,
 				   struct net_device *b)
 {
-- 
2.7.0.rc3.207.g0ac5344


* [PATCH/RFC 03/12] switchdev: Add support for getting port object details
  2016-09-28 12:42 [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev Simon Horman
       [not found] ` <1475066582-1971-1-git-send-email-simon.horman-wFxRvT7yatFl57MIdRCFDg@public.gmane.org>
  2016-09-28 12:42 ` [PATCH/RFC 02/12] switchdev: Add Open vSwitch (-like) flow object support Simon Horman
@ 2016-09-28 12:42 ` Simon Horman
  2016-09-28 12:42 ` [PATCH/RFC 04/12] rocker: Add Open vSwitch (-like) flow support Simon Horman
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Simon Horman @ 2016-09-28 12:42 UTC (permalink / raw)
  To: netdev, dev; +Cc: Simon Horman

The motivation for this prototype is to allow the statistics - number of
hits - of Open vSwitch (-like) flows which have been programmed into
hardware to be retrieved.

This patch takes a generic approach by adding an SDO to allow retrieval of
object details.  The idea is that an object is passed in with sufficient
detail to allow it to be looked up and the driver may then fill in other
details of the object.
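
The "look up, then fill in" idea can be sketched in stand-alone C as below.
The flow table, the integer key and the toy_* names are hypothetical
simplifications; a real driver would look the flow up by key, mask and
attributes rather than by a single integer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the three counters proposed for the stats structure. */
struct toy_stats {
	unsigned long rx_packets;
	unsigned long rx_bytes;
	unsigned long last_used;
};

/* Hypothetical driver-side record of a flow programmed into hardware. */
struct toy_flow_entry {
	uint32_t key;           /* stand-in for key + mask + attrs */
	struct toy_stats stats; /* counters maintained by the device */
};

/*
 * The caller supplies enough detail to find the flow (here, key) and a
 * buffer; the driver fills in the remaining details on success.
 */
static int toy_flow_get_stats(const struct toy_flow_entry *table, size_t n,
			      uint32_t key, struct toy_stats *stats)
{
	size_t i;

	assert(table && stats);

	for (i = 0; i < n; i++) {
		if (table[i].key == key) {
			*stats = table[i].stats;
			return 0;
		}
	}
	return -ENOENT;
}
```

Returning a negative errno when the flow is unknown lets the caller tell
"no such flow in hardware" apart from "flow present with zero hits".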

A follow up patch will prototype using this new SDO to retrieve statistics
for Open vSwitch (-like) flows that have been programmed into hardware.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
---
 include/net/switchdev.h   | 39 ++++++++++++++++++++++++++--
 net/switchdev/switchdev.c | 65 +++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 100 insertions(+), 4 deletions(-)

diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 8eda96e46f98..92a9357e4fab 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -74,6 +74,12 @@ enum switchdev_obj_id {
 	SWITCHDEV_OBJ_SW_FLOW,
 };
 
+struct switchdev_obj_stats {
+	unsigned long rx_packets;
+	unsigned long rx_bytes;
+	unsigned long last_used;
+};
+
 struct switchdev_obj {
 	struct net_device *orig_dev;
 	enum switchdev_obj_id id;
@@ -135,8 +141,13 @@ struct switchdev_obj_sw_flow {
 	const struct sw_flow_key *key;
 	const struct sw_flow_key *mask;
 	u64 attrs;
-	const struct nlattr *actions;
-	u32 actions_len;
+	union {
+		struct {
+			const struct nlattr *actions;
+			u32 len;
+		} actions;
+		struct switchdev_obj_stats *stats;
+	};
 };
 
 #define SWITCHDEV_OBJ_SW_FLOW(obj) \
@@ -161,6 +172,8 @@ typedef int switchdev_obj_dump_cb_t(struct switchdev_obj *obj);
  * @switchdev_port_obj_del: Delete an object from port (see switchdev_obj_*).
  *
  * @switchdev_port_obj_dump: Dump port objects (see switchdev_obj_*).
+ *
+ * @switchdev_port_obj_get: Get an object from port (see switchdev_obj_*).
  */
 struct switchdev_ops {
 	int	(*switchdev_port_attr_get)(struct net_device *dev,
@@ -176,6 +189,8 @@ struct switchdev_ops {
 	int	(*switchdev_port_obj_dump)(struct net_device *dev,
 					   struct switchdev_obj *obj,
 					   switchdev_obj_dump_cb_t *cb);
+	int	(*switchdev_port_obj_get)(struct net_device *dev,
+					  struct switchdev_obj *obj);
 };
 
 enum switchdev_notifier_type {
@@ -212,6 +227,7 @@ int switchdev_port_obj_del(struct net_device *dev,
 			   const struct switchdev_obj *obj);
 int switchdev_port_obj_dump(struct net_device *dev, struct switchdev_obj *obj,
 			    switchdev_obj_dump_cb_t *cb);
+int switchdev_port_obj_get(struct net_device *dev, struct switchdev_obj *obj);
 int register_switchdev_notifier(struct notifier_block *nb);
 int unregister_switchdev_notifier(struct notifier_block *nb);
 int call_switchdev_notifiers(unsigned long val, struct net_device *dev,
@@ -244,6 +260,10 @@ int switchdev_sw_flow_add(struct net_device *dev,
 int switchdev_sw_flow_del(struct net_device *dev,
 			  const struct sw_flow_key *key,
 			  const struct sw_flow_key *mask, u64 attrs);
+int switchdev_sw_flow_get_stats(struct net_device *dev,
+				const struct sw_flow_key *key,
+				const struct sw_flow_key *mask, u64 attrs,
+				struct switchdev_obj_stats *stats);
 void switchdev_port_fwd_mark_set(struct net_device *dev,
 				 struct net_device *group_dev,
 				 bool joining);
@@ -287,6 +307,12 @@ static inline int switchdev_port_obj_dump(struct net_device *dev,
 	return -EOPNOTSUPP;
 }
 
+static inline int switchdev_port_obj_get(struct net_device *dev,
+					 struct switchdev_obj *obj)
+{
+	return -EOPNOTSUPP;
+}
+
 static inline int register_switchdev_notifier(struct notifier_block *nb)
 {
 	return 0;
@@ -386,6 +412,15 @@ static inline int switchdev_sw_flow_del(struct net_device *dev,
 	return -EOPNOTSUPP;
 }
 
+static inline int switchdev_sw_flow_get_stats(struct net_device *dev,
+					      const struct sw_flow_key *key,
+					      const struct sw_flow_key *mask,
+					      u64 attrs,
+					      struct switchdev_obj_stats *stats)
+{
+	return 0;
+}
+
 static inline bool switchdev_port_same_parent_id(struct net_device *a,
 						 struct net_device *b)
 {
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index db96c3345129..f89e6dc90eb6 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -574,6 +574,36 @@ int switchdev_port_obj_dump(struct net_device *dev, struct switchdev_obj *obj,
 }
 EXPORT_SYMBOL_GPL(switchdev_port_obj_dump);
 
+/**
+ *	switchdev_port_obj_get - Get port object
+ *
+ *	@dev: port device
+ *	@obj: object to get
+ */
+int switchdev_port_obj_get(struct net_device *dev, struct switchdev_obj *obj)
+{
+	const struct switchdev_ops *ops = dev->switchdev_ops;
+	struct net_device *lower_dev;
+	struct list_head *iter;
+	int err = -EOPNOTSUPP;
+
+	if (ops && ops->switchdev_port_obj_get)
+		return ops->switchdev_port_obj_get(dev, obj);
+
+	/* Switch device port(s) may be stacked under
+	 * bond/team/vlan dev, so recurse down to get objects on
+	 * first port at bottom of stack.
+	 */
+
+	netdev_for_each_lower_dev(dev, lower_dev, iter) {
+		err = switchdev_port_obj_get(lower_dev, obj);
+		break;
+	}
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(switchdev_port_obj_get);
+
 static RAW_NOTIFIER_HEAD(switchdev_notif_chain);
 
 /**
@@ -1310,8 +1340,10 @@ int switchdev_sw_flow_add(struct net_device *dev,
 		.key = key,
 		.mask = mask,
 		.attrs = attrs,
-		.actions = actions,
-		.actions_len = actions_len,
+		.actions = {
+			.actions = actions,
+			.len = actions_len,
+		},
 	};
 
 	return switchdev_port_obj_add(dev, &sw_flow.obj);
@@ -1344,6 +1376,35 @@ int switchdev_sw_flow_del(struct net_device *dev,
 }
 EXPORT_SYMBOL_GPL(switchdev_sw_flow_del);
 
+/**
+ *	switchdev_sw_flow_get_stats - Get statistics of a flow from switch
+ *
+ *	@dev: port device
+ *      @key: flow key
+ *      @mask: flow mask
+ *      @attrs: attributes present in key
+ *	@stats: statistics to fill in
+ *
+ *	Get statistics of a flow from a device where the flow is expressed as
+ *	an Open vSwitch flow key, mask and attributes.
+ */
+int switchdev_sw_flow_get_stats(struct net_device *dev,
+				 const struct sw_flow_key *key,
+				 const struct sw_flow_key *mask, u64 attrs,
+				 struct switchdev_obj_stats *stats)
+{
+	struct switchdev_obj_sw_flow sw_flow = {
+		.obj.id = SWITCHDEV_OBJ_SW_FLOW,
+		.key = key,
+		.mask = mask,
+		.attrs = attrs,
+		.stats = stats,
+	};
+
+	return switchdev_port_obj_get(dev, &sw_flow.obj);
+}
+EXPORT_SYMBOL_GPL(switchdev_sw_flow_get_stats);
+
 bool switchdev_port_same_parent_id(struct net_device *a,
 				   struct net_device *b)
 {
-- 
2.7.0.rc3.207.g0ac5344


* [PATCH/RFC 04/12] rocker: Add Open vSwitch (-like) flow support
  2016-09-28 12:42 [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev Simon Horman
                   ` (2 preceding siblings ...)
  2016-09-28 12:42 ` [PATCH/RFC 03/12] switchdev: Add support for getting port object details Simon Horman
@ 2016-09-28 12:42 ` Simon Horman
  2016-09-28 12:42 ` [PATCH/RFC 05/12] rocker: Support Open vSwitch (-like) flow stats Simon Horman
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Simon Horman @ 2016-09-28 12:42 UTC (permalink / raw)
  To: netdev, dev; +Cc: Simon Horman

Prototype programming of Open vSwitch (-like) flows into hardware
by handling SWITCHDEV_OBJ_SW_FLOW objects in the
rocker_port_obj_{add,del} SDOs. SWITCHDEV_OBJ_SW_FLOW is a new object type
that was added by an earlier patch that forms part of this prototype.
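
The dispatch added here follows the existing rocker "world ops" convention:
each world supplies a table of optional callbacks, and a missing callback is
reported as -EOPNOTSUPP. A stand-alone sketch of that convention follows;
the toy_* names, the integer port and flow identifiers, and the example
world are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Per-world callback table; any callback may be left NULL. */
struct toy_world_ops {
	int (*sw_flow_add)(int port, uint32_t flow_id);
	int (*sw_flow_del)(int port, uint32_t flow_id);
};

/* Wrapper: call the world's handler, or report that it is unsupported. */
static int toy_world_sw_flow_add(const struct toy_world_ops *wops,
				 int port, uint32_t flow_id)
{
	assert(wops);
	if (!wops->sw_flow_add)
		return -EOPNOTSUPP;
	return wops->sw_flow_add(port, flow_id);
}

static int toy_world_sw_flow_del(const struct toy_world_ops *wops,
				 int port, uint32_t flow_id)
{
	assert(wops);
	if (!wops->sw_flow_del)
		return -EOPNOTSUPP;
	return wops->sw_flow_del(port, flow_id);
}

/* Example world that implements add but deliberately not del. */
static int toy_example_flow_add(int port, uint32_t flow_id)
{
	(void)port;
	(void)flow_id;
	return 0;
}

static const struct toy_world_ops toy_example_ops = {
	.sw_flow_add = toy_example_flow_add,
	/* .sw_flow_del left NULL: wrapper returns -EOPNOTSUPP */
};
```

This keeps rocker_main.c free of world-specific knowledge: a world that
cannot offload flows simply leaves the callbacks unset.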

Signed-off-by: Simon Horman <simon.horman@netronome.com>
---
 drivers/net/ethernet/rocker/rocker.h      |  5 +++++
 drivers/net/ethernet/rocker/rocker_main.c | 30 ++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/drivers/net/ethernet/rocker/rocker.h b/drivers/net/ethernet/rocker/rocker.h
index 1ab995f7146b..ebcc863fd25e 100644
--- a/drivers/net/ethernet/rocker/rocker.h
+++ b/drivers/net/ethernet/rocker/rocker.h
@@ -117,6 +117,11 @@ struct rocker_world_ops {
 	int (*port_obj_vlan_dump)(const struct rocker_port *rocker_port,
 				  struct switchdev_obj_port_vlan *vlan,
 				  switchdev_obj_dump_cb_t *cb);
+	int (*port_obj_sw_flow_add)(struct rocker_port *rocker_port,
+				    const struct switchdev_obj_sw_flow *sw_flow,
+				    struct switchdev_trans *trans);
+	int (*port_obj_sw_flow_del)(struct rocker_port *rocker_port,
+				    const struct switchdev_obj_sw_flow *sw_flow);
 	int (*port_obj_fib4_add)(struct rocker_port *rocker_port,
 				 const struct switchdev_obj_ipv4_fib *fib4,
 				 struct switchdev_trans *trans);
diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c
index 1f0c08602eba..e019772689a6 100644
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -1682,6 +1682,27 @@ rocker_world_port_obj_fdb_dump(const struct rocker_port *rocker_port,
 	return wops->port_obj_fdb_dump(rocker_port, fdb, cb);
 }
 
+static int rocker_world_port_obj_sw_flow_add(struct rocker_port *rocker_port,
+				const struct switchdev_obj_sw_flow *sw_flow,
+				struct switchdev_trans *trans)
+{
+	struct rocker_world_ops *wops = rocker_port->rocker->wops;
+
+	if (!wops->port_obj_sw_flow_add)
+		return -EOPNOTSUPP;
+	return wops->port_obj_sw_flow_add(rocker_port, sw_flow, trans);
+}
+
+static int rocker_world_port_obj_sw_flow_del(struct rocker_port *rocker_port,
+				const struct switchdev_obj_sw_flow *sw_flow)
+{
+	struct rocker_world_ops *wops = rocker_port->rocker->wops;
+
+	if (!wops->port_obj_sw_flow_del)
+		return -EOPNOTSUPP;
+	return wops->port_obj_sw_flow_del(rocker_port, sw_flow);
+}
+
 static int rocker_world_port_master_linked(struct rocker_port *rocker_port,
 					   struct net_device *master)
 {
@@ -2106,6 +2127,11 @@ static int rocker_port_obj_add(struct net_device *dev,
 						    SWITCHDEV_OBJ_PORT_FDB(obj),
 						    trans);
 		break;
+	case SWITCHDEV_OBJ_SW_FLOW:
+		err = rocker_world_port_obj_sw_flow_add(rocker_port,
+							 SWITCHDEV_OBJ_SW_FLOW(obj),
+							 trans);
+		break;
 	default:
 		err = -EOPNOTSUPP;
 		break;
@@ -2133,6 +2159,10 @@ static int rocker_port_obj_del(struct net_device *dev,
 		err = rocker_world_port_obj_fdb_del(rocker_port,
 						    SWITCHDEV_OBJ_PORT_FDB(obj));
 		break;
+	case SWITCHDEV_OBJ_SW_FLOW:
+		err = rocker_world_port_obj_sw_flow_del(rocker_port,
+							 SWITCHDEV_OBJ_SW_FLOW(obj));
+		break;
 	default:
 		err = -EOPNOTSUPP;
 		break;
-- 
2.7.0.rc3.207.g0ac5344


* [PATCH/RFC 05/12] rocker: Support Open vSwitch (-like) flow stats
  2016-09-28 12:42 [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev Simon Horman
                   ` (3 preceding siblings ...)
  2016-09-28 12:42 ` [PATCH/RFC 04/12] rocker: Add Open vSwitch (-like) flow support Simon Horman
@ 2016-09-28 12:42 ` Simon Horman
  2016-09-28 12:42 ` [PATCH/RFC 06/12] rocker: Add helper to check ports belong to the same rocker switch Simon Horman
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Simon Horman @ 2016-09-28 12:42 UTC (permalink / raw)
  To: netdev, dev; +Cc: Simon Horman

Prototype an implementation of the new switchdev_port_obj_get SDO for the
SWITCHDEV_OBJ_SW_FLOW object type. This allows retrieval of
statistics for Open vSwitch (-like) flows which have been programmed into
hardware.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
---
 drivers/net/ethernet/rocker/rocker.h      |  2 ++
 drivers/net/ethernet/rocker/rocker_main.c | 30 ++++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/drivers/net/ethernet/rocker/rocker.h b/drivers/net/ethernet/rocker/rocker.h
index ebcc863fd25e..332adce701fa 100644
--- a/drivers/net/ethernet/rocker/rocker.h
+++ b/drivers/net/ethernet/rocker/rocker.h
@@ -122,6 +122,8 @@ struct rocker_world_ops {
 				    struct switchdev_trans *trans);
 	int (*port_obj_sw_flow_del)(struct rocker_port *rocker_port,
 				    const struct switchdev_obj_sw_flow *sw_flow);
+	int (*port_obj_sw_flow_get_stats)(struct rocker_port *rocker_port,
+					  const struct switchdev_obj_sw_flow *sw_flow);
 	int (*port_obj_fib4_add)(struct rocker_port *rocker_port,
 				 const struct switchdev_obj_ipv4_fib *fib4,
 				 struct switchdev_trans *trans);
diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c
index e019772689a6..6c6a486cced6 100644
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -1703,6 +1703,16 @@ static int rocker_world_port_obj_sw_flow_del(struct rocker_port *rocker_port,
 	return wops->port_obj_sw_flow_del(rocker_port, sw_flow);
 }
 
+static int rocker_world_port_obj_sw_flow_get_stats(struct rocker_port *rocker_port,
+				const struct switchdev_obj_sw_flow *sw_flow)
+{
+	struct rocker_world_ops *wops = rocker_port->rocker->wops;
+
+	if (!wops->port_obj_sw_flow_get_stats)
+		return -EOPNOTSUPP;
+	return wops->port_obj_sw_flow_get_stats(rocker_port, sw_flow);
+}
+
 static int rocker_world_port_master_linked(struct rocker_port *rocker_port,
 					   struct net_device *master)
 {
@@ -2197,12 +2207,32 @@ static int rocker_port_obj_dump(struct net_device *dev,
 	return err;
 }
 
+static int rocker_port_obj_get(struct net_device *dev,
+			       struct switchdev_obj *obj)
+{
+	struct rocker_port *rocker_port = netdev_priv(dev);
+	int err = 0;
+
+	switch (obj->id) {
+	case SWITCHDEV_OBJ_SW_FLOW:
+		err = rocker_world_port_obj_sw_flow_get_stats(rocker_port,
+						 SWITCHDEV_OBJ_SW_FLOW(obj));
+		break;
+	default:
+		err = -EOPNOTSUPP;
+		break;
+	}
+
+	return err;
+}
+
 static const struct switchdev_ops rocker_port_switchdev_ops = {
 	.switchdev_port_attr_get	= rocker_port_attr_get,
 	.switchdev_port_attr_set	= rocker_port_attr_set,
 	.switchdev_port_obj_add		= rocker_port_obj_add,
 	.switchdev_port_obj_del		= rocker_port_obj_del,
 	.switchdev_port_obj_dump	= rocker_port_obj_dump,
+	.switchdev_port_obj_get		= rocker_port_obj_get,
 };
 
 /********************
-- 
2.7.0.rc3.207.g0ac5344

* [PATCH/RFC 06/12] rocker: Add helper to check ports belong to the same rocker switch
  2016-09-28 12:42 [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev Simon Horman
                   ` (4 preceding siblings ...)
  2016-09-28 12:42 ` [PATCH/RFC 05/12] rocker: Support Open vSwitch (-like) flow stats Simon Horman
@ 2016-09-28 12:42 ` Simon Horman
  2016-09-28 12:42 ` [PATCH/RFC 07/12] rocker: switchdev Add Open vSwitch (-like) flow support to OF-DPA world Simon Horman
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Simon Horman @ 2016-09-28 12:42 UTC (permalink / raw)
  To: netdev, dev; +Cc: Simon Horman

This will be used by a follow-up patch to add Open vSwitch (-like) flow
support to the OF-DPA rocker world.
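
As a rough user-space sketch of the check the helper performs, two ports
are considered to be on the same switch when both are switch ports and
they share a parent. The types struct sw and struct sw_port are
hypothetical simplifications of struct rocker and struct rocker_port.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct rocker and rocker_port. */
struct sw {
	int id;
};

struct sw_port {
	struct sw *parent;	/* owning switch; NULL if not a switch port */
};

/* True only if both ports are switch ports with the same parent switch. */
static bool ports_on_same_switch(const struct sw_port *a,
				 const struct sw_port *b)
{
	if (!a || !b || !a->parent || !b->parent)
		return false;
	return a->parent == b->parent;
}
```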

Signed-off-by: Simon Horman <simon.horman@netronome.com>
---
 drivers/net/ethernet/rocker/rocker.h      |  4 ++++
 drivers/net/ethernet/rocker/rocker_main.c | 15 +++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/drivers/net/ethernet/rocker/rocker.h b/drivers/net/ethernet/rocker/rocker.h
index 332adce701fa..4b4a7d2af774 100644
--- a/drivers/net/ethernet/rocker/rocker.h
+++ b/drivers/net/ethernet/rocker/rocker.h
@@ -85,6 +85,10 @@ int rocker_cmd_exec(struct rocker_port *rocker_port, bool nowait,
 int rocker_port_set_learning(struct rocker_port *rocker_port,
 			     bool learning);
 
+/* True if a and b are ports on the same rocker switch */
+bool rocker_port_dev_cmp_rocker(const struct net_device *a,
+				const struct net_device *b);
+
 struct rocker_world_ops {
 	const char *kind;
 	size_t priv_size;
diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c
index 6c6a486cced6..02101f88dc08 100644
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -2859,6 +2859,21 @@ static bool rocker_port_dev_check(const struct net_device *dev)
 	return dev->netdev_ops == &rocker_port_netdev_ops;
 }
 
+bool rocker_port_dev_cmp_rocker(const struct net_device *a,
+				const struct net_device *b)
+{
+	struct rocker_port *rocker_port_a, *rocker_port_b;
+
+	if (!rocker_port_dev_check(a) || !rocker_port_dev_check(b))
+		return false;
+
+
+	rocker_port_a = netdev_priv(a);
+	rocker_port_b = netdev_priv(b);
+
+	return rocker_port_a->rocker == rocker_port_b->rocker;
+}
+
 static int rocker_netdevice_event(struct notifier_block *unused,
 				  unsigned long event, void *ptr)
 {
-- 
2.7.0.rc3.207.g0ac5344

* [PATCH/RFC 07/12] rocker: switchdev Add Open vSwitch (-like) flow support to OF-DPA world
  2016-09-28 12:42 [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev Simon Horman
                   ` (5 preceding siblings ...)
  2016-09-28 12:42 ` [PATCH/RFC 06/12] rocker: Add helper to check ports belong to the same rocker switch Simon Horman
@ 2016-09-28 12:42 ` Simon Horman
  2016-09-28 12:42 ` [PATCH/RFC 08/12] rocker: Support Open vSwitch (-like) flow stats in " Simon Horman
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Simon Horman @ 2016-09-28 12:42 UTC (permalink / raw)
  To: netdev, dev; +Cc: Simon Horman

Prototype programming of Open vSwitch (-like) flows into hardware
by implementing SWITCHDEV_OBJ_OVS_FLOW type objects in the
rocker_port_obj_{add,del} SDOs. This is a new object type that
was added by an earlier patch that forms part of this prototype.

Only a very limited subset of flows is accepted by this implementation.
However, it is sufficient for a working ping test: flows matching the ICMP
packets are programmed into hardware although those matching the ARP
packets are not.
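
The way the accepted subset is expressed, a set of required key
attributes plus a larger set of allowed ones, can be sketched in
user-space C as follows. The attribute numbers are hypothetical and
chosen only for the example; they are not the OVS_KEY_ATTR_* values.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT_ULL(n) (1ULL << (n))

/* Hypothetical attribute numbers, chosen only for this example. */
enum {
	ATTR_IN_PORT,
	ATTR_ETHERNET,
	ATTR_ETHERTYPE,
	ATTR_IPV4,
	ATTR_ICMP,
	ATTR_TCP,
};

/* Decide whether a flow's set of key attributes can be offloaded. */
static bool flow_attrs_offloadable(uint64_t attrs)
{
	const uint64_t required = BIT_ULL(ATTR_IN_PORT) |
				  BIT_ULL(ATTR_ETHERNET) |
				  BIT_ULL(ATTR_ETHERTYPE);
	const uint64_t allowed = required |
				 BIT_ULL(ATTR_IPV4) |
				 BIT_ULL(ATTR_ICMP);

	/* Every required attribute must be present ... */
	if ((attrs & required) != required)
		return false;
	/* ... and nothing outside the allowed set may be present. */
	return (attrs & ~allowed) == 0;
}
```

Note that testing whether a single attribute is present needs a bitwise
AND against the attribute's bit; a bitwise OR is always non-zero.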

Follow-up patches in this prototype will modify the Open vSwitch kernel
datapath to call the new SDOs and thus make use of the changes in this
patch.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
---
 drivers/net/ethernet/rocker/rocker_ofdpa.c | 261 +++++++++++++++++++++++++++++
 1 file changed, 261 insertions(+)

diff --git a/drivers/net/ethernet/rocker/rocker_ofdpa.c b/drivers/net/ethernet/rocker/rocker_ofdpa.c
index fcad907baecf..6669f2ba2f97 100644
--- a/drivers/net/ethernet/rocker/rocker_ofdpa.c
+++ b/drivers/net/ethernet/rocker/rocker_ofdpa.c
@@ -19,6 +19,7 @@
 #include <linux/inetdevice.h>
 #include <linux/if_vlan.h>
 #include <linux/if_bridge.h>
+#include <linux/sw_flow.h>
 #include <net/neighbour.h>
 #include <net/switchdev.h>
 #include <net/ip_fib.h>
@@ -2795,6 +2796,264 @@ static int ofdpa_port_obj_fdb_dump(const struct rocker_port *rocker_port,
 	return err;
 }
 
+#if IS_ENABLED(CONFIG_OPENVSWITCH)
+
+static int
+ofdpa_port_sw_flow_actions(struct ofdpa_port *ofdpa_port,
+			   const struct switchdev_obj_sw_flow *flow,
+			   struct ofdpa_flow_tbl_key *out_key)
+{
+	int rem, err, out_ifindex = -1, len = flow->actions.len;
+	const struct nlattr *a, *attr = flow->actions.actions;
+	struct net_device *out_dev, *dev = ofdpa_port->dev;
+	struct rocker_port *out_rocker_port;
+	struct ofdpa_port *out_ofdpa_port;
+	__be16 vlan_id;
+	u32 out_pport;
+
+	for (a = attr, rem = len; rem > 0; a = nla_next(a, &rem)) {
+		int type = nla_type(a);
+
+		switch (type) {
+		case OVS_ACTION_ATTR_OUTPUT:
+			/* Only unicast is supported at this time */
+			if (out_ifindex >= 0)
+				return -ENOTSUPP;
+			out_ifindex = nla_get_u32(a);
+			break;
+
+		case OVS_ACTION_ATTR_POP_VLAN:
+		case OVS_ACTION_ATTR_PUSH_VLAN:
+		case OVS_ACTION_ATTR_SET:
+		case OVS_ACTION_ATTR_USERSPACE:
+		case OVS_ACTION_ATTR_HASH:
+		case OVS_ACTION_ATTR_RECIRC:
+		case OVS_ACTION_ATTR_PUSH_MPLS:
+		case OVS_ACTION_ATTR_POP_MPLS:
+		case OVS_ACTION_ATTR_SET_MASKED:
+		case OVS_ACTION_ATTR_SAMPLE:
+			return -ENOTSUPP;
+
+		case OVS_ACTION_ATTR_UNSPEC:
+		default:
+			return -EINVAL;
+		}
+	}
+
+	/* No output */
+	if (out_ifindex == -1)
+		return -ENOTSUPP;
+
+	out_dev = dev_get_by_index(dev_net(dev), out_ifindex);
+	if (!out_dev)
+		return -EINVAL;
+
+	/* It is invalid to output to the input port */
+	if (dev == out_dev) {
+		err = -EINVAL;
+		goto err;
+	}
+
+	/* Only support flows whose input and output port are on
+	 * the same rocker switch.
+	 */
+	if (!rocker_port_dev_cmp_rocker(dev, out_dev)) {
+		err = -ENOTSUPP;
+		goto err;
+	}
+
+	out_rocker_port = netdev_priv(out_dev);
+	out_ofdpa_port = out_rocker_port->wpriv;
+	out_pport = out_ofdpa_port->pport;
+	vlan_id = out_ofdpa_port->internal_vlan_id;
+	out_key->acl.group_id = ROCKER_GROUP_L2_INTERFACE(vlan_id, out_pport);
+
+	err = 0;
+err:
+	dev_put(out_dev);
+	return err;
+}
+
+static int ofdpa_port_sw_flow_match(struct ofdpa_port *ofdpa_port,
+				    const struct switchdev_obj_sw_flow *flow,
+				    struct ofdpa_flow_tbl_key *out_key)
+{
+	const struct sw_flow_key *mask = flow->mask;
+	const struct sw_flow_key *key = flow->key;
+	const u8 *eth_dst = NULL, *eth_dst_mask = NULL;
+	u64 key_allowed, key_required;
+
+	key_required =
+		BIT_ULL(OVS_KEY_ATTR_IN_PORT) |
+		BIT_ULL(OVS_KEY_ATTR_ETHERNET) |
+		BIT_ULL(OVS_KEY_ATTR_ETHERTYPE);
+
+	/* TODO: Support more key fields as per those
+	 * permitted in the OF-DPA ACL Flow Table.
+	 */
+	key_allowed = key_required |
+		BIT_ULL(OVS_KEY_ATTR_PRIORITY) |	/* Only zero */
+		BIT_ULL(OVS_KEY_ATTR_IPV4) |
+		BIT_ULL(OVS_KEY_ATTR_ICMP) |
+		BIT_ULL(OVS_KEY_ATTR_SKB_MARK) |	/* Only zero */
+		BIT_ULL(OVS_KEY_ATTR_TUNNEL) |		/* Only zero */
+		BIT_ULL(OVS_KEY_ATTR_DP_HASH) |		/* Only zero */
+		BIT_ULL(OVS_KEY_ATTR_RECIRC_ID);	/* Only zero */
+
+	if ((flow->attrs & key_required) != key_required ||
+	    (flow->attrs & key_allowed) != flow->attrs)
+		return -ENOTSUPP;
+
+	/* Only support zero/no skb priority */
+	if (flow->attrs & BIT_ULL(OVS_KEY_ATTR_PRIORITY) &&
+	    key->phy.priority)
+		return -ENOTSUPP;
+
+	/* Only support zero/no tunnel id */
+	if (flow->attrs & BIT_ULL(OVS_KEY_ATTR_TUNNEL) &&
+	    key->tun_key.tun_id != cpu_to_be64(0))
+		return -ENOTSUPP;
+
+	/* Only support zero/no skb mark */
+	if (flow->attrs & BIT_ULL(OVS_KEY_ATTR_SKB_MARK) && key->phy.skb_mark)
+		return -ENOTSUPP;
+
+	/* Only support zero/no dp hash */
+	if (flow->attrs & BIT_ULL(OVS_KEY_ATTR_DP_HASH) && key->ovs_flow_hash)
+		return -ENOTSUPP;
+
+	/* Only support zero/no recirculation id */
+	if (flow->attrs & BIT_ULL(OVS_KEY_ATTR_RECIRC_ID) && key->recirc_id)
+		return -ENOTSUPP;
+
+	/* The OF-DPA ACL table requires an unmasked match on ethernet type */
+	if (mask->eth.type != cpu_to_be16(0xffff))
+		return -ENOTSUPP;
+
+	if (flow->attrs & BIT_ULL(OVS_KEY_ATTR_IPV4)) {
+		if (mask->ip.frag)
+			/* There is no IP frag match in OF-DPA */
+			return -ENOTSUPP;
+		if (mask->ipv4.addr.src != cpu_to_be32(0) ||
+		    mask->ipv4.addr.dst != cpu_to_be32(0))
+			/* Rocker doesn't implement these matches */
+			return -ENOTSUPP;
+		out_key->acl.ip_proto = key->ip.proto;
+		out_key->acl.ip_proto_mask = mask->ip.proto;
+		out_key->acl.ip_tos = key->ip.tos;
+		out_key->acl.ip_tos_mask = mask->ip.tos;
+	}
+
+	if (flow->attrs & BIT_ULL(OVS_KEY_ATTR_ICMP) &&
+	   (mask->tp.src != cpu_to_be16(0) || mask->tp.dst != cpu_to_be16(0)))
+		/* Rocker doesn't implement these matches */
+		return -ENOTSUPP;
+
+	out_key->acl.in_pport = ofdpa_port->pport;
+	out_key->acl.in_pport_mask = mask->phy.in_port;
+	ether_addr_copy(out_key->acl.eth_src, key->eth.src);
+	ether_addr_copy(out_key->acl.eth_src_mask, mask->eth.src);
+	ether_addr_copy(out_key->acl.eth_dst, key->eth.dst);
+	ether_addr_copy(out_key->acl.eth_dst_mask, mask->eth.dst);
+	out_key->acl.eth_type = key->eth.type;
+
+	if (!ether_addr_equal(out_key->acl.eth_dst, zero_mac)) {
+		eth_dst = key->eth.dst;
+		eth_dst_mask = mask->eth.dst;
+	}
+
+	out_key->acl.vlan_id = ofdpa_port->internal_vlan_id;
+	out_key->acl.vlan_id_mask = htons(0xffff);
+	out_key->priority = OFDPA_PRIORITY_ACL_CTRL;
+
+	return 0;
+}
+
+static struct ofdpa_flow_tbl_entry *
+ofdpa_port_sw_flow_entry(struct ofdpa_port *ofdpa_port,
+			 struct switchdev_trans *trans,
+			 const struct switchdev_obj_sw_flow *flow)
+{
+	struct ofdpa_flow_tbl_entry *entry;
+	int flags = 0;
+	int err;
+
+	entry = ofdpa_kzalloc(trans, flags, sizeof(*entry));
+	if (!entry)
+		return ERR_PTR(-ENOMEM);
+
+	entry->key.tbl_id = ROCKER_OF_DPA_TABLE_ID_ACL_POLICY;
+	entry->key_len = offsetof(struct ofdpa_flow_tbl_key, acl.group_id);
+
+	err = ofdpa_port_sw_flow_match(ofdpa_port, flow, &entry->key);
+	if (err)
+		goto err;
+
+	return entry;
+
+err:
+	ofdpa_kfree(trans, entry);
+	return ERR_PTR(err);
+}
+
+static int ofdpa_port_obj_sw_flow_add(struct rocker_port *rocker_port,
+				      const struct switchdev_obj_sw_flow *flow,
+				      struct switchdev_trans *trans)
+{
+	struct ofdpa_port *ofdpa_port = rocker_port->wpriv;
+	struct ofdpa_flow_tbl_entry *entry;
+	int err, flags = 0;
+
+	if (!ofdpa_port_is_ovsed(ofdpa_port))
+		return -EINVAL;
+
+	entry = ofdpa_port_sw_flow_entry(ofdpa_port, trans, flow);
+	if (IS_ERR(entry))
+		return PTR_ERR(entry);
+
+	err = ofdpa_port_sw_flow_actions(ofdpa_port, flow, &entry->key);
+	if (err) {
+		ofdpa_kfree(trans, entry);
+		return err;
+	}
+
+	return ofdpa_flow_tbl_add(ofdpa_port, trans, flags, entry);
+}
+
+static int ofdpa_port_obj_sw_flow_del(struct rocker_port *rocker_port,
+				       const struct switchdev_obj_sw_flow *flow)
+{
+	struct ofdpa_port *ofdpa_port = rocker_port->wpriv;
+	struct switchdev_trans *trans = NULL;
+	struct ofdpa_flow_tbl_entry *entry;
+	int flags = 0;
+
+	if (!ofdpa_port_is_ovsed(ofdpa_port))
+		return -EINVAL;
+
+	entry = ofdpa_port_sw_flow_entry(ofdpa_port, trans, flow);
+	if (IS_ERR(entry))
+		return PTR_ERR(entry);
+
+	return ofdpa_flow_tbl_del(ofdpa_port, trans, flags, entry);
+}
+
+#else
+
+static int ofdpa_port_obj_sw_flow_add(struct rocker_port *rocker_port,
+				       const struct switchdev_obj_sw_flow *flow,
+				       struct switchdev_trans *trans) {
+	return -ENOTSUPP;
+}
+
+static int ofdpa_port_obj_sw_flow_del(struct rocker_port *rocker_port,
+				       const struct switchdev_obj_sw_flow *flow)
+{
+	return -ENOTSUPP;
+}
+
+#endif
+
 static int ofdpa_port_bridge_join(struct ofdpa_port *ofdpa_port,
 				  struct net_device *bridge)
 {
@@ -2946,6 +3205,8 @@ struct rocker_world_ops rocker_ofdpa_ops = {
 	.port_obj_fdb_add = ofdpa_port_obj_fdb_add,
 	.port_obj_fdb_del = ofdpa_port_obj_fdb_del,
 	.port_obj_fdb_dump = ofdpa_port_obj_fdb_dump,
+	.port_obj_sw_flow_add = ofdpa_port_obj_sw_flow_add,
+	.port_obj_sw_flow_del = ofdpa_port_obj_sw_flow_del,
 	.port_master_linked = ofdpa_port_master_linked,
 	.port_master_unlinked = ofdpa_port_master_unlinked,
 	.port_neigh_update = ofdpa_port_neigh_update,
-- 
2.7.0.rc3.207.g0ac5344

* [PATCH/RFC 08/12] rocker: Support Open vSwitch (-like) flow stats in OF-DPA world
  2016-09-28 12:42 [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev Simon Horman
                   ` (6 preceding siblings ...)
  2016-09-28 12:42 ` [PATCH/RFC 07/12] rocker: switchdev Add Open vSwitch (-like) flow support to OF-DPA world Simon Horman
@ 2016-09-28 12:42 ` Simon Horman
  2016-09-28 12:42 ` [PATCH/RFC 09/12] openvswitch: Add key_attrs to struct sw_flow_match Simon Horman
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Simon Horman @ 2016-09-28 12:42 UTC (permalink / raw)
  To: netdev, dev; +Cc: Simon Horman

Prototype an implementation of the new switchdev_port_obj_get SDO for the
SWITCHDEV_OBJ_OVS_FLOW object type. This allows retrieval of statistics for
Open vSwitch (-like) flows which have been programmed into hardware.
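
The conversion of the idle time reported by hardware into a last-used
timestamp can be sketched in user-space C as below. EXAMPLE_HZ is an
assumed tick rate standing in for the kernel's HZ, and the clamp to zero
is a choice made for this example rather than something the patch does.

```c
#include <stdint.h>

/* Assumed tick rate for this example; the kernel's HZ is a build option. */
#define EXAMPLE_HZ 250

/* Convert "idle for idle_ms milliseconds" into a last-used tick count. */
static uint64_t last_used_ticks(uint64_t now_ticks, uint64_t idle_ms)
{
	uint64_t idle_ticks = idle_ms * EXAMPLE_HZ / 1000;

	/* Clamp rather than wrap if the idle time exceeds the uptime. */
	if (idle_ticks > now_ticks)
		return 0;
	return now_ticks - idle_ticks;
}
```

Idle times shorter than one tick round down to zero ticks, so such a
flow is reported as used "now".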

Signed-off-by: Simon Horman <simon.horman@netronome.com>
---
 drivers/net/ethernet/rocker/rocker_hw.h    |  4 ++
 drivers/net/ethernet/rocker/rocker_ofdpa.c | 92 ++++++++++++++++++++++++++++--
 2 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/rocker/rocker_hw.h b/drivers/net/ethernet/rocker/rocker_hw.h
index 2adfe88859f2..0a9d33bc80a0 100644
--- a/drivers/net/ethernet/rocker/rocker_hw.h
+++ b/drivers/net/ethernet/rocker/rocker_hw.h
@@ -373,6 +373,10 @@ enum {
 	ROCKER_TLV_OF_DPA_FLOW_STAT_RX_PKTS,	/* u64 */
 	ROCKER_TLV_OF_DPA_FLOW_STAT_TX_PKTS,	/* u64 */
 
+	ROCKER_TLV_OF_DPA_FLOW_STAT_RX_BYTES,	/* u64 */
+	ROCKER_TLV_OF_DPA_FLOW_STAT_TX_BYTES,	/* u64 */
+	ROCKER_TLV_OF_DPA_FLOW_STAT_IDLE,	/* u64 */
+
 	__ROCKER_TLV_OF_DPA_FLOW_STAT_MAX,
 	ROCKER_TLV_OF_DPA_FLOW_STAT_MAX = __ROCKER_TLV_OF_DPA_FLOW_STAT_MAX - 1,
 };
diff --git a/drivers/net/ethernet/rocker/rocker_ofdpa.c b/drivers/net/ethernet/rocker/rocker_ofdpa.c
index 6669f2ba2f97..3b441359a3a7 100644
--- a/drivers/net/ethernet/rocker/rocker_ofdpa.c
+++ b/drivers/net/ethernet/rocker/rocker_ofdpa.c
@@ -619,9 +619,9 @@ static int ofdpa_cmd_flow_tbl_add(const struct rocker_port *rocker_port,
 	return 0;
 }
 
-static int ofdpa_cmd_flow_tbl_del(const struct rocker_port *rocker_port,
-				  struct rocker_desc_info *desc_info,
-				  void *priv)
+static int ofdpa_cmd_flow_tbl_prep(const struct rocker_port *rocker_port,
+				   struct rocker_desc_info *desc_info,
+				   void *priv)
 {
 	const struct ofdpa_flow_tbl_entry *entry = priv;
 	struct rocker_tlv *cmd_info;
@@ -884,7 +884,7 @@ static int ofdpa_flow_tbl_del(struct ofdpa_port *ofdpa_port,
 		if (!switchdev_trans_ph_prepare(trans))
 			err = rocker_cmd_exec(ofdpa_port->rocker_port,
 					      ofdpa_flags_nowait(flags),
-					      ofdpa_cmd_flow_tbl_del,
+					      ofdpa_cmd_flow_tbl_prep,
 					      found, NULL, NULL);
 		ofdpa_kfree(trans, found);
 	}
@@ -3038,6 +3038,82 @@ static int ofdpa_port_obj_sw_flow_del(struct rocker_port *rocker_port,
 	return ofdpa_flow_tbl_del(ofdpa_port, trans, flags, entry);
 }
 
+static int
+ofdpa_cmd_flow_tbl_get_stats_proc(const struct rocker_port *rocker_port,
+				const struct rocker_desc_info *desc_info,
+				void *priv)
+{
+	const struct rocker_tlv *info_attrs[ROCKER_TLV_OF_DPA_FLOW_STAT_MAX + 1];
+	const struct rocker_tlv *attrs[ROCKER_TLV_CMD_MAX + 1];
+	struct switchdev_obj_stats *stats = priv;
+
+	rocker_tlv_parse_desc(attrs, ROCKER_TLV_CMD_MAX, desc_info);
+	if (!attrs[ROCKER_TLV_CMD_INFO])
+		return -EIO;
+
+	rocker_tlv_parse_nested(info_attrs, ROCKER_TLV_OF_DPA_FLOW_STAT_MAX,
+			       attrs[ROCKER_TLV_CMD_INFO]);
+
+	if (info_attrs[ROCKER_TLV_OF_DPA_FLOW_STAT_RX_PKTS])
+		stats->rx_packets = rocker_tlv_get_u64(info_attrs[ROCKER_TLV_OF_DPA_FLOW_STAT_RX_PKTS]);
+	if (info_attrs[ROCKER_TLV_OF_DPA_FLOW_STAT_RX_BYTES])
+		stats->rx_bytes = rocker_tlv_get_u64(info_attrs[ROCKER_TLV_OF_DPA_FLOW_STAT_RX_BYTES]);
+	if (info_attrs[ROCKER_TLV_OF_DPA_FLOW_STAT_IDLE]) {
+		u64 idle_ms;
+
+		idle_ms = rocker_tlv_get_u64(info_attrs[ROCKER_TLV_OF_DPA_FLOW_STAT_IDLE]);
+		stats->last_used = jiffies - (idle_ms * HZ / 1000);
+	}
+
+	return 0;
+}
+
+static int ofdpa_flow_tbl_get_stats(struct ofdpa_port *ofdpa_port,
+				    struct ofdpa_flow_tbl_entry *match,
+				    struct switchdev_obj_stats *stats)
+{
+	struct ofdpa *ofdpa = ofdpa_port->ofdpa;
+	struct ofdpa_flow_tbl_entry *found;
+	size_t key_len = match->key_len ? match->key_len : sizeof(found->key);
+	unsigned long lock_flags;
+	int err = 0;
+
+	match->key_crc32 = crc32(~0, &match->key, key_len);
+
+	spin_lock_irqsave(&ofdpa->flow_tbl_lock, lock_flags);
+
+	found = ofdpa_flow_tbl_find(ofdpa, match);
+
+	if (found)
+		found->cmd = ROCKER_TLV_CMD_TYPE_OF_DPA_GROUP_GET_STATS;
+
+	spin_unlock_irqrestore(&ofdpa->flow_tbl_lock, lock_flags);
+
+	ofdpa_kfree(NULL, match);
+
+	if (found) {
+		err = rocker_cmd_exec(ofdpa_port->rocker_port, 0,
+				      ofdpa_cmd_flow_tbl_prep, found,
+				      ofdpa_cmd_flow_tbl_get_stats_proc, stats);
+	}
+
+	return err;
+}
+
+static int ofdpa_port_obj_sw_flow_get_stats(struct rocker_port *rocker_port,
+				     const struct switchdev_obj_sw_flow *flow)
+{
+	struct ofdpa_port *ofdpa_port = rocker_port->wpriv;
+	struct switchdev_trans *trans = NULL;
+	struct ofdpa_flow_tbl_entry *entry;
+
+	entry = ofdpa_port_sw_flow_entry(ofdpa_port, trans, flow);
+	if (IS_ERR(entry))
+		return PTR_ERR(entry);
+
+	return ofdpa_flow_tbl_get_stats(ofdpa_port, entry, flow->stats);
+}
+
 #else
 
 static int ofdpa_port_obj_sw_flow_add(struct rocker_port *rocker_port,
@@ -3052,6 +3128,13 @@ static int ofdpa_port_obj_sw_flow_del(struct rocker_port *rocker_port,
 	return -ENOTSUPP;
 }
 
+static int ofdpa_port_obj_sw_flow_get_stats(struct rocker_port *rocker_port,
+				     const struct switchdev_obj_sw_flow *flow)
+{
+	/* Flow stats are unavailable without CONFIG_OPENVSWITCH */
+	return -ENOTSUPP;
+}
+
 #endif
 
 static int ofdpa_port_bridge_join(struct ofdpa_port *ofdpa_port,
@@ -3207,6 +3290,7 @@ struct rocker_world_ops rocker_ofdpa_ops = {
 	.port_obj_fdb_dump = ofdpa_port_obj_fdb_dump,
 	.port_obj_sw_flow_add = ofdpa_port_obj_sw_flow_add,
 	.port_obj_sw_flow_del = ofdpa_port_obj_sw_flow_del,
+	.port_obj_sw_flow_get_stats = ofdpa_port_obj_sw_flow_get_stats,
 	.port_master_linked = ofdpa_port_master_linked,
 	.port_master_unlinked = ofdpa_port_master_unlinked,
 	.port_neigh_update = ofdpa_port_neigh_update,
-- 
2.7.0.rc3.207.g0ac5344

* [PATCH/RFC 09/12] openvswitch: Add key_attrs to struct sw_flow_match
  2016-09-28 12:42 [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev Simon Horman
                   ` (7 preceding siblings ...)
  2016-09-28 12:42 ` [PATCH/RFC 08/12] rocker: Support Open vSwitch (-like) flow stats in " Simon Horman
@ 2016-09-28 12:42 ` Simon Horman
  2016-09-28 12:43 ` [PATCH/RFC 10/12] openvswitch: make get_dp_rcu() available outside datapath.c Simon Horman
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Simon Horman @ 2016-09-28 12:42 UTC (permalink / raw)
  To: netdev, dev; +Cc: Simon Horman

This is in preparation for using key_attrs outside of their current context
to allow quickly checking which attributes are set. This is in turn in
preparation for prototyping programming Open vSwitch (-like) flows into
hardware.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
---
 net/openvswitch/flow.h         |  1 +
 net/openvswitch/flow_netlink.c | 14 +++++++++-----
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index 0c70c3532469..eb6bb7908e2d 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -72,6 +72,7 @@ struct sw_flow_match {
 	struct sw_flow_key *key;
 	struct sw_flow_key_range range;
 	struct sw_flow_mask *mask;
+	u64 key_attrs;
 };
 
 #define MAX_UFID_LENGTH 16 /* 128 bits */
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index ae25ded82b3b..89c20bdc2cc7 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -121,10 +121,13 @@ static void update_range(struct sw_flow_match *match,
 	} while (0)
 
 static bool match_validate(const struct sw_flow_match *match,
-			   u64 key_attrs, u64 mask_attrs, bool log)
+			   u64 mask_attrs, bool log)
 {
 	u64 key_expected = 1 << OVS_KEY_ATTR_ETHERNET;
-	u64 mask_allowed = key_attrs;  /* At most allow all key attributes */
+	u64 mask_allowed;
+
+	/* At most allow all key attributes */
+	mask_allowed = match->key_attrs;
 
 	/* The following mask attributes allowed only if they
 	 * pass the validation tests. */
@@ -237,10 +240,10 @@ static bool match_validate(const struct sw_flow_match *match,
 		}
 	}
 
-	if ((key_attrs & key_expected) != key_expected) {
+	if ((match->key_attrs & key_expected) != key_expected) {
 		/* Key attributes check failed. */
 		OVS_NLERR(log, "Missing key (keys=%llx, expected=%llx)",
-			  (unsigned long long)key_attrs,
+			  (unsigned long long)match->key_attrs,
 			  (unsigned long long)key_expected);
 		return false;
 	}
@@ -1399,7 +1402,8 @@ int ovs_nla_get_match(struct net *net, struct sw_flow_match *match,
 			goto free_newmask;
 	}
 
-	if (!match_validate(match, key_attrs, mask_attrs, log))
+	match->key_attrs = key_attrs;
+	if (!match_validate(match, mask_attrs, log))
 		err = -EINVAL;
 
 free_newmask:
-- 
2.7.0.rc3.207.g0ac5344

* [PATCH/RFC 10/12] openvswitch: make get_dp_rcu() available outside datapath.c
  2016-09-28 12:42 [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev Simon Horman
                   ` (8 preceding siblings ...)
  2016-09-28 12:42 ` [PATCH/RFC 09/12] openvswitch: Add key_attrs to struct sw_flow_match Simon Horman
@ 2016-09-28 12:43 ` Simon Horman
  2016-09-28 12:43 ` [PATCH/RFC 11/12] openvswitch: Support programming of flows into hardware Simon Horman
  2016-09-28 12:43 ` [PATCH/RFC 12/12] hack: rocker: no ip frag match Simon Horman
  11 siblings, 0 replies; 17+ messages in thread
From: Simon Horman @ 2016-09-28 12:43 UTC (permalink / raw)
  To: netdev, dev; +Cc: Simon Horman

Make get_dp_rcu() available outside datapath.c and as a precaution add a
check to ensure that rcu_read_lock is held.

This is in preparation for calling get_dp_rcu() from other source files
which is in turn in preparation for prototyping allowing Open vSwitch to
program flows into hardware.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
---
 net/openvswitch/datapath.c | 4 +++-
 net/openvswitch/datapath.h | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 4d67ea856067..365d480031d3 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -145,10 +145,12 @@ static int queue_userspace_packet(struct datapath *dp, struct sk_buff *,
 				  uint32_t cutlen);
 
 /* Must be called with rcu_read_lock. */
-static struct datapath *get_dp_rcu(struct net *net, int dp_ifindex)
+struct datapath *get_dp_rcu(struct net *net, int dp_ifindex)
 {
 	struct net_device *dev = dev_get_by_index_rcu(net, dp_ifindex);
 
+	WARN_ON_ONCE(!rcu_read_lock_held());
+
 	if (dev) {
 		struct vport *vport = ovs_internal_dev_get_vport(dev);
 		if (vport)
diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
index ab85c1cae255..6094d912eaec 100644
--- a/net/openvswitch/datapath.h
+++ b/net/openvswitch/datapath.h
@@ -154,6 +154,8 @@ int lockdep_ovsl_is_held(void);
 #define lockdep_ovsl_is_held()	1
 #endif
 
+struct datapath *get_dp_rcu(struct net *net, int dp_ifindex);
+
 #define ASSERT_OVSL()		WARN_ON(!lockdep_ovsl_is_held())
 #define ovsl_dereference(p)					\
 	rcu_dereference_protected(p, lockdep_ovsl_is_held())
-- 
2.7.0.rc3.207.g0ac5344

* [PATCH/RFC 11/12] openvswitch: Support programming of flows into hardware
  2016-09-28 12:42 [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev Simon Horman
                   ` (9 preceding siblings ...)
  2016-09-28 12:43 ` [PATCH/RFC 10/12] openvswitch: make get_dp_rcu() available outside datapath.c Simon Horman
@ 2016-09-28 12:43 ` Simon Horman
  2016-09-28 12:43 ` [PATCH/RFC 12/12] hack: rocker: no ip frag match Simon Horman
  11 siblings, 0 replies; 17+ messages in thread
From: Simon Horman @ 2016-09-28 12:43 UTC (permalink / raw)
  To: netdev, dev; +Cc: Simon Horman

The purpose of this prototype is to attempt to further discussion of
how Open vSwitch and similar flows may be programmed into hardware.

The approach taken in this prototype is to always add flows to
software, the existing behaviour, and program flows into hardware when
possible.  As Open vSwitch datapath flows do not overlap this should be
safe even if some flows are programmed into hardware and some are not.

User-space is provided with the possibility of opting out
by setting the OVS_FLOW_ATTR_HW_REQ attribute of a request to
OVS_FLOW_HW_REQ_SKIP_HW. There is scope to add other modes as needed,
for example: skip adding flow to software.

User-space is also provided with feedback on whether a flow has been
programmed into hardware or not via the OVS_FLOW_ATTR_HW_STATUS attribute
of replies.

Overall the intention is to allow the kernel to manage resources,
including flows, and for user-space to have secondary control using
the above mentioned attributes.
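
The policy described above can be sketched in user-space C as follows.
The enums mirror the request and status values of this series but are
renamed, and hw_program() and flow_add() are hypothetical helpers
written for this illustration; they are not functions from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical mirrors of the request and status values in this series. */
enum hw_req { HW_REQ_DEFAULT, HW_REQ_SKIP_HW };
enum hw_status { HW_STATUS_NOT_PRESENT, HW_STATUS_PRESENT };

struct flow_result {
	int err;		/* 0, or a negative errno from software */
	enum hw_status status;	/* what would be reported to user-space */
};

/* Stand-in for a driver; it accepts only flows it claims to support. */
static int hw_program(bool hw_supports_flow)
{
	return hw_supports_flow ? 0 : -EOPNOTSUPP;
}

static struct flow_result flow_add(enum hw_req req, bool sw_ok,
				   bool hw_supports_flow)
{
	struct flow_result res = { 0, HW_STATUS_NOT_PRESENT };

	/* Software is authoritative: its failure fails the request. */
	if (!sw_ok) {
		res.err = -ENOMEM;
		return res;
	}

	/* Hardware is best effort; failure here is not an error. */
	if (req != HW_REQ_SKIP_HW && hw_program(hw_supports_flow) == 0)
		res.status = HW_STATUS_PRESENT;

	return res;
}
```

With this shape a request only fails when the software path fails, and
the status field alone tells user-space whether the flow also reached
hardware.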

Access to hardware, to program flows into hardware, remove them from
hardware and obtain the statistics of flows programmed into hardware is
done via SDOs as per switchdev patches earlier in the patchset which
comprise this prototype. The earlier patches also include an implementation
of the relevant SDOs for the Rocker switch.

Some implementation notes:

* ovs_hw_flow_stats_add should probably update tcp_flags.

  However an implication of that would be that the hardware to which
  offloads are being made a) supports tracking tcp_flags and b) by
  implication parses L4 headers.  If that is a requirement of allowing
  Open vSwitch flows to be programmed into hardware, then so be it. But if
  it is a hard requirement then it may eliminate some hardware options.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
---
 include/uapi/linux/openvswitch.h |  36 ++++++++
 net/openvswitch/datapath.c       |  73 +++++++++++++++--
 net/openvswitch/flow.c           | 173 +++++++++++++++++++++++++++++++++++++++
 net/openvswitch/flow.h           |  59 +++++++++++++
 net/openvswitch/flow_netlink.c   |  42 ++++++++++
 net/openvswitch/flow_netlink.h   |   3 +
 net/openvswitch/vport-netdev.c   |  39 +++++++++
 7 files changed, 420 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index 59ed3992c760..96e223a04b09 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -506,6 +506,11 @@ struct ovs_key_ct_labels {
  * @OVS_FLOW_ATTR_UFID_FLAGS: A 32-bit value of OR'd %OVS_UFID_F_*
  * flags that provide alternative semantics for flow installation and
  * retrieval. Optional for all requests.
+ * @OVS_FLOW_ATTR_HW_REQ: A 32-bit value giving an OVS_FLOW_HW_REQ_*.
+ * Present in requests if it would not be OVS_FLOW_HW_REQ_DEFAULT.
+ * @OVS_FLOW_ATTR_HW_STATUS: A 32-bit value giving an OVS_FLOW_HW_STATUS_*.
+ * Ignored in all requests. Present in notifications if it would not be
+ * OVS_FLOW_HW_STATUS_NOT_PRESENT.
  *
  * These attributes follow the &struct ovs_header within the Generic Netlink
  * payload for %OVS_FLOW_* commands.
@@ -524,12 +529,43 @@ enum ovs_flow_attr {
 	OVS_FLOW_ATTR_UFID,      /* Variable length unique flow identifier. */
 	OVS_FLOW_ATTR_UFID_FLAGS,/* u32 of OVS_UFID_F_*. */
 	OVS_FLOW_ATTR_PAD,
+	OVS_FLOW_ATTR_HW_REQ,    /* u32 which is one of OVS_FLOW_HW_REQ_*. */
+	OVS_FLOW_ATTR_HW_STATUS, /* s32 which is one of OVS_FLOW_HW_STATUS_* or
+				  * a negative errno. */
 	__OVS_FLOW_ATTR_MAX
 };
 
 #define OVS_FLOW_ATTR_MAX (__OVS_FLOW_ATTR_MAX - 1)
 
 /**
+ * enum ovs_flow_hw_req - Attributes for requesting programming of a flow into software and hardware.
+ * @OVS_FLOW_HW_REQ_DEFAULT: Use default determined by implementation
+ * @OVS_FLOW_HW_REQ_SKIP_HW: Do not program flow into hardware
+ *
+ * Influence programming of flow into software and hardware.
+ */
+enum ovs_flow_hw_req {
+	OVS_FLOW_HW_REQ_DEFAULT,
+	OVS_FLOW_HW_REQ_SKIP_HW,
+	__OVS_FLOW_HW_MAX,
+};
+
+/**
+ * enum ovs_flow_hw_status - Status of attempt to program flow into hardware
+ * @OVS_FLOW_HW_STATUS_NOT_PRESENT: Flow was not programmed into hardware
+ * because: it was not requested; it was removed from hardware as requested;
+ * programming flows into hardware is not supported by the datapath.
+ * @OVS_FLOW_HW_STATUS_PRESENT: Flow is programmed into hardware
+ *
+ * Status of a request to program a flow into hardware.
+ */
+enum ovs_flow_hw_status {
+	OVS_FLOW_HW_STATUS_NOT_PRESENT,
+	OVS_FLOW_HW_STATUS_PRESENT,
+	__OVS_FLOW_HW_STATUS_MAX,
+};
+
+/**
  * Omit attributes for notifications.
  *
  * If a datapath request contains an %OVS_UFID_F_OMIT_* flag, then the datapath
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 365d480031d3..2b06acce5ff5 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -50,6 +50,7 @@
 #include <net/genetlink.h>
 #include <net/net_namespace.h>
 #include <net/netns/generic.h>
+#include <net/switchdev.h>
 
 #include "datapath.h"
 #include "flow.h"
@@ -762,7 +763,7 @@ static size_t ovs_flow_cmd_msg_size(const struct sw_flow_actions *acts,
 }
 
 /* Called with ovs_mutex or RCU read lock. */
-static int ovs_flow_cmd_fill_stats(const struct sw_flow *flow,
+static int ovs_flow_cmd_fill_stats(struct sw_flow *flow, int dp_ifindex,
 				   struct sk_buff *skb)
 {
 	struct ovs_flow_stats stats;
@@ -770,6 +771,7 @@ static int ovs_flow_cmd_fill_stats(const struct sw_flow *flow,
 	unsigned long used;
 
 	ovs_flow_stats_get(flow, &stats, &used, &tcp_flags);
+	ovs_hw_flow_stats_add(flow, dp_ifindex, skb, &stats, &used);
 
 	if (used &&
 	    nla_put_u64_64bit(skb, OVS_FLOW_ATTR_USED, ovs_flow_used_time(used),
@@ -829,8 +831,20 @@ static int ovs_flow_cmd_fill_actions(const struct sw_flow *flow,
 	return 0;
 }
 
+static int ovs_hw_flow_put_status(const struct sw_flow *flow,
+				  struct sk_buff *skb)
+{
+#ifdef CONFIG_NET_SWITCHDEV
+	if (flow->hw_flow_present &&
+	    nla_put_s32(skb, OVS_FLOW_ATTR_HW_STATUS,
+			OVS_FLOW_HW_STATUS_PRESENT))
+		return -EMSGSIZE;
+#endif
+	return 0;
+}
+
 /* Called with ovs_mutex or RCU read lock. */
-static int ovs_flow_cmd_fill_info(const struct sw_flow *flow, int dp_ifindex,
+static int ovs_flow_cmd_fill_info(struct sw_flow *flow, int dp_ifindex,
 				  struct sk_buff *skb, u32 portid,
 				  u32 seq, u32 flags, u8 cmd, u32 ufid_flags)
 {
@@ -861,7 +875,11 @@ static int ovs_flow_cmd_fill_info(const struct sw_flow *flow, int dp_ifindex,
 			goto error;
 	}
 
-	err = ovs_flow_cmd_fill_stats(flow, skb);
+	err = ovs_flow_cmd_fill_stats(flow, dp_ifindex, skb);
+	if (err)
+		goto error;
+
+	err = ovs_hw_flow_put_status(flow, skb);
 	if (err)
 		goto error;
 
@@ -901,7 +919,7 @@ static struct sk_buff *ovs_flow_cmd_alloc_info(const struct sw_flow_actions *act
 }
 
 /* Called with ovs_mutex. */
-static struct sk_buff *ovs_flow_cmd_build_info(const struct sw_flow *flow,
+static struct sk_buff *ovs_flow_cmd_build_info(struct sw_flow *flow,
 					       int dp_ifindex,
 					       struct genl_info *info, u8 cmd,
 					       bool always, u32 ufid_flags)
@@ -933,6 +951,7 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct genl_info *info)
 	struct sw_flow_actions *acts;
 	struct sw_flow_match match;
 	u32 ufid_flags = ovs_nla_get_ufid_flags(a[OVS_FLOW_ATTR_UFID_FLAGS]);
+	enum ovs_flow_hw_req hw_req = OVS_FLOW_HW_REQ_DEFAULT;
 	int error;
 	bool log = !a[OVS_FLOW_ATTR_PROBE];
 
@@ -947,6 +966,15 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct genl_info *info)
 		goto error;
 	}
 
+	if (a[OVS_FLOW_ATTR_HW_REQ]) {
+		hw_req = nla_get_u32(a[OVS_FLOW_ATTR_HW_REQ]);
+
+		if (hw_req > OVS_FLOW_HW_REQ_SKIP_HW) {
+			OVS_NLERR(log, "Unsupported hardware flow request for new flow.");
+			goto error;
+		}
+	}
+
 	/* Most of the time we need to allocate a new flow, do it before
 	 * locking.
 	 */
@@ -1012,6 +1040,9 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct genl_info *info)
 			goto err_unlock_ovs;
 		}
 
+		if (hw_req == OVS_FLOW_HW_REQ_DEFAULT)
+			ovs_hw_flow_new(dp, new_flow, match.key_attrs, acts);
+
 		if (unlikely(reply)) {
 			error = ovs_flow_cmd_fill_info(new_flow,
 						       ovs_header->dp_ifindex,
@@ -1036,6 +1067,7 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct genl_info *info)
 			error = -EEXIST;
 			goto err_unlock_ovs;
 		}
+
 		/* The flow identifier has to be the same for flow updates.
 		 * Look for any overlapping flow.
 		 */
@@ -1050,6 +1082,12 @@ static int ovs_flow_cmd_new(struct sk_buff *skb, struct genl_info *info)
 				goto err_unlock_ovs;
 			}
 		}
+
+		if (hw_req == OVS_FLOW_HW_REQ_DEFAULT)
+			ovs_hw_flow_set(dp, flow, match.key_attrs, acts);
+		else
+			ovs_hw_flow_del(dp, flow);
+
 		/* Update actions. */
 		old_acts = ovsl_dereference(flow->sf_acts);
 		rcu_assign_pointer(flow->sf_acts, acts);
@@ -1120,10 +1158,27 @@ static int ovs_flow_cmd_set(struct sk_buff *skb, struct genl_info *info)
 	struct sw_flow_match match;
 	struct sw_flow_id sfid;
 	u32 ufid_flags = ovs_nla_get_ufid_flags(a[OVS_FLOW_ATTR_UFID_FLAGS]);
-	int error = 0;
+	enum ovs_flow_hw_req hw_req = OVS_FLOW_HW_REQ_DEFAULT;
+	int error;
 	bool log = !a[OVS_FLOW_ATTR_PROBE];
 	bool ufid_present;
 
+	/* Extract key. */
+	error = -EINVAL;
+	if (!a[OVS_FLOW_ATTR_KEY]) {
+		OVS_NLERR(log, "Flow key attribute not present in set flow.");
+		goto error;
+	}
+
+	if (a[OVS_FLOW_ATTR_HW_REQ]) {
+		hw_req = nla_get_u32(a[OVS_FLOW_ATTR_HW_REQ]);
+
+		if (hw_req > OVS_FLOW_HW_REQ_SKIP_HW) {
+			OVS_NLERR(log, "Unsupported hardware flow request for set flow.");
+			goto error;
+		}
+	}
+
 	ufid_present = ovs_nla_get_ufid(&sfid, a[OVS_FLOW_ATTR_UFID], log);
 	if (a[OVS_FLOW_ATTR_KEY]) {
 		ovs_match_init(&match, &key, true, &mask);
@@ -1207,6 +1262,12 @@ static int ovs_flow_cmd_set(struct sk_buff *skb, struct genl_info *info)
 	/* Clear stats. */
 	if (a[OVS_FLOW_ATTR_CLEAR])
 		ovs_flow_stats_clear(flow);
+
+	if (hw_req == OVS_FLOW_HW_REQ_DEFAULT)
+		ovs_hw_flow_set(dp, flow, match.key_attrs, acts);
+	else
+		ovs_hw_flow_del(dp, flow);
+
 	ovs_unlock();
 
 	if (reply)
@@ -1330,6 +1391,8 @@ static int ovs_flow_cmd_del(struct sk_buff *skb, struct genl_info *info)
 		goto unlock;
 	}
 
+	ovs_hw_flow_del(dp, flow);
+
 	ovs_flow_tbl_remove(&dp->table, flow);
 	ovs_unlock();
 
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index 634cc10d6dee..9b9bf924c489 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -53,6 +53,177 @@
 #include "flow_netlink.h"
 #include "vport.h"
 
+#ifdef CONFIG_NET_SWITCHDEV
+/* Must be called with ovs_mutex or rcu_read_lock. */
+static int ovs_hw_flow(const struct datapath *dp,
+		       struct sw_flow *flow, u64 key_attrs,
+		       const struct sw_flow_actions *acts)
+{
+	struct vport *vport;
+	int err;
+
+	if (!(key_attrs & BIT_ULL(OVS_KEY_ATTR_IN_PORT)))
+		return -ENOTSUPP;
+
+	vport = ovs_lookup_vport(dp, flow->key.phy.in_port);
+	if (!vport)
+		return -EINVAL;
+
+	if (acts) {
+		struct nlattr *actions;
+
+		actions = ovs_switchdev_flow_actions(dp, acts->actions,
+						     acts->actions_len);
+		if (IS_ERR(actions))
+			return PTR_ERR(actions);
+
+		rtnl_lock();
+		err = switchdev_sw_flow_add(vport->dev, &flow->key,
+					    &flow->mask->key, key_attrs,
+					    actions, acts->actions_len);
+		rtnl_unlock();
+		kfree(actions);
+
+		flow->key_attrs = key_attrs;
+	} else {
+		rtnl_lock();
+		err = switchdev_sw_flow_del(vport->dev, &flow->key,
+					    &flow->mask->key, key_attrs);
+		rtnl_unlock();
+	}
+
+	return err;
+}
+
+void ovs_hw_flow_new(const struct datapath *dp, struct sw_flow *flow,
+		     u64 key_attrs, const struct sw_flow_actions *acts)
+{
+	if (ovs_hw_flow(dp, flow, key_attrs, acts) < 0)
+		flow->hw_flow_present = false;
+	else
+		flow->hw_flow_present = true;
+
+	memset(&flow->hw_stats, 0, sizeof flow->hw_stats);
+	memset(&flow->hw_stats_offset, 0, sizeof flow->hw_stats_offset);
+	memset(&flow->hw_stats_base, 0, sizeof flow->hw_stats_base);
+}
+
+void ovs_hw_flow_del(const struct datapath *dp, struct sw_flow *flow)
+{
+	int err;
+
+	if (!flow->hw_flow_present)
+		return;
+
+	err = ovs_hw_flow(dp, flow, flow->key_attrs, NULL);
+	if (err) {
+		net_warn_ratelimited("openvswitch: could not delete hardware flow: %d\n", err);
+		return;
+	}
+
+	flow->hw_flow_present = false;
+
+	flow->hw_stats_base.rx_packets += flow->hw_stats.rx_packets -
+		flow->hw_stats_offset.rx_packets;
+	flow->hw_stats_base.rx_bytes += flow->hw_stats.rx_bytes -
+		flow->hw_stats_offset.rx_bytes;
+
+	/* In case flow is once again programmed into hardware by
+	 * ovs_hw_flow_set()
+	 */
+	memset(&flow->hw_stats, 0, sizeof flow->hw_stats);
+	memset(&flow->hw_stats_offset, 0, sizeof flow->hw_stats_offset);
+}
+
+void ovs_hw_flow_set(const struct datapath *dp, struct sw_flow *flow,
+		     u64 key_attrs, const struct sw_flow_actions *acts)
+{
+	int err;
+
+	/* Try to add flow to hardware.
+	 * This may succeed even if the flow was previously not added
+	 * to hardware. e.g. because the vport for the output action exists
+	 * but did not earlier.
+	 */
+	err = ovs_hw_flow(dp, flow, key_attrs, acts);
+	if (err < 0) {
+		ovs_hw_flow_del(dp, flow);
+		flow->hw_flow_present = false;
+	} else {
+		flow->hw_flow_present = true;
+	}
+}
+
+/* Must be called with ovs_mutex or rcu_read_lock. */
+/* XXX: ovs_hw_flow_stats_add should probably update tcp_flags.
+ *
+ * However an implication of that would be that the hardware to which
+ * offloads are being made a) supports tracking tcp_flags and b) by
+ * implication parses L4 headers.  If that is a requirement of allowing
+ * Open vSwitch flows to be programmed into hardware, then so be it. But if
+ * it is a hard requirement then it may eliminate some hardware options.
+ */
+int ovs_hw_flow_stats_add(struct sw_flow *flow, int dp_ifindex,
+			  struct sk_buff *skb, struct ovs_flow_stats *stats,
+			  unsigned long *used)
+{
+	struct net *net = sock_net(skb->sk);
+	const struct datapath *dp;
+	struct vport *vport;
+	int err;
+
+	/* Residual statistics from flow programmed into and then
+	 * removed from hardware. */
+	stats->n_packets += flow->hw_stats_base.rx_packets;
+	stats->n_bytes += flow->hw_stats_base.rx_bytes;
+
+	if (!flow->hw_flow_present)
+		return 0;
+
+	dp = get_dp_rcu(net, dp_ifindex);
+	if (!dp)
+		return -EINVAL;
+
+	/* This is not called unless ovs_hw_flow() has previously succeeded
+	 * and thus the flow has an in_port.
+	 */
+	vport = ovs_lookup_vport(dp, flow->key.phy.in_port);
+	if (!vport)
+		return -EINVAL;
+
+	err = switchdev_sw_flow_get_stats(vport->dev, &flow->key,
+					  &flow->mask->key, flow->key_attrs,
+					  &flow->hw_stats);
+	if (err)
+		return err;
+
+	stats->n_packets += flow->hw_stats.rx_packets -
+		flow->hw_stats_offset.rx_packets;
+	stats->n_bytes += flow->hw_stats.rx_bytes -
+		flow->hw_stats_offset.rx_bytes;
+	/* The aim of the condition here is to provide a zero value
+	 * if the flow programmed into hardware has not been used
+	 * since stats were last reset. This is in keeping with
+	 * the treatment of software flows.
+	 */
+	if (flow->hw_stats.last_used > flow->hw_stats_offset.last_used)
+		*used = max(*used, flow->hw_stats.last_used);
+
+	return 0;
+}
+
+/* Called with ovs_mutex. */
+static void ovs_hw_flow_stats_clear(struct sw_flow *flow)
+{
+	flow->hw_stats_offset = flow->hw_stats;
+}
+
+#else /* CONFIG_NET_SWITCHDEV */
+
+static void ovs_hw_flow_stats_clear(struct sw_flow *flow) {}
+
+#endif /* CONFIG_NET_SWITCHDEV */
+
 u64 ovs_flow_used_time(unsigned long flow_jiffies)
 {
 	struct timespec cur_ts;
@@ -181,6 +352,8 @@ void ovs_flow_stats_clear(struct sw_flow *flow)
 			spin_unlock_bh(&stats->lock);
 		}
 	}
+
+	ovs_hw_flow_stats_clear(flow);
 }
 
 static int check_header(struct sk_buff *skb, int len)
diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index eb6bb7908e2d..134538dbe518 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -35,8 +35,10 @@
 #include <net/inet_ecn.h>
 #include <net/ip_tunnels.h>
 #include <net/dst_metadata.h>
+#include <net/switchdev.h>
 
 struct sk_buff;
+struct datapath;
 
 /* Store options at the end of the array if they are less than the
  * maximum size. This allows us to get the benefits of variable length
@@ -113,6 +115,20 @@ struct sw_flow {
 	struct sw_flow_id id;
 	struct sw_flow_mask *mask;
 	struct sw_flow_actions __rcu *sf_acts;
+#ifdef CONFIG_NET_SWITCHDEV
+	bool hw_flow_present;		  /* true if flow is programmed
+					     into hardware. */
+	/* unused unless flow has been programmed in hardware. */
+	u64 key_attrs;
+	struct switchdev_obj_stats hw_stats; /* stats most recently read
+					      * from hardware, or zeroed if
+					      * not read yet. */
+	struct switchdev_obj_stats hw_stats_offset; /* Set to hw_stats when
+						     * stats are cleared. */
+	struct switchdev_obj_stats hw_stats_base; /* Accumulates hw_stats
+						   * (less offset) when flow is
+						   * removed from hardware. */
+#endif
 	struct flow_stats __rcu *stats[]; /* One for each CPU.  First one
 					   * is allocated at flow creation time,
 					   * the rest are allocated on demand
@@ -160,4 +176,47 @@ int ovs_flow_key_extract_userspace(struct net *net, const struct nlattr *attr,
 				   struct sk_buff *skb,
 				   struct sw_flow_key *key, bool log);
 
+#ifdef CONFIG_NET_SWITCHDEV
+/* Must be called with ovs_mutex or rcu_read_lock. */
+void ovs_hw_flow_new(const struct datapath *dp,
+		     struct sw_flow *flow, u64 key_attrs,
+		     const struct sw_flow_actions *acts);
+
+/* Must be called with ovs_mutex or rcu_read_lock. */
+void ovs_hw_flow_del(const struct datapath *dp, struct sw_flow *flow);
+
+/* Must be called with ovs_mutex or rcu_read_lock. */
+void ovs_hw_flow_set(const struct datapath *dp, struct sw_flow *flow,
+		     u64 key_attrs, const struct sw_flow_actions *acts);
+
+/* Must be called with ovs_mutex or rcu_read_lock. */
+int ovs_hw_flow_stats_add(struct sw_flow *flow, int dp_ifindex,
+			  struct sk_buff *skb, struct ovs_flow_stats *stats,
+			  unsigned long *used);
+#else /* CONFIG_NET_SWITCHDEV */
+static inline void ovs_hw_flow_new(const struct datapath *dp,
+				   struct sw_flow *flow, u64 key_attrs,
+				   const struct sw_flow_actions *acts)
+{
+	/* Hardware offload is compiled out: nothing to do. */
+}
+
+static inline void ovs_hw_flow_del(const struct datapath *dp,
+				   struct sw_flow *flow){}
+
+static inline void ovs_hw_flow_set(const struct datapath *dp,
+				   struct sw_flow *flow, u64 key_attrs,
+				   const struct sw_flow_actions *acts)
+{
+	/* Hardware offload is compiled out: nothing to do. */
+}
+
+static inline int ovs_hw_flow_stats_add(struct sw_flow *flow, int dp_ifindex,
+					struct sk_buff *skb,
+					struct ovs_flow_stats *stats,
+					unsigned long *used)
+{
+	return 0;
+}
+#endif /* CONFIG_NET_SWITCHDEV */
 #endif /* flow.h */
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index 89c20bdc2cc7..68096f26d6a1 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -50,6 +50,7 @@
 #include <net/vxlan.h>
 
 #include "flow_netlink.h"
+#include "vport.h"
 
 struct ovs_len_tbl {
 	int len;
@@ -2648,3 +2649,44 @@ int ovs_nla_put_actions(const struct nlattr *attr, int len, struct sk_buff *skb)
 
 	return 0;
 }
+
+#ifdef CONFIG_NET_SWITCHDEV
+struct nlattr *ovs_switchdev_flow_actions(const struct datapath *dp,
+					  const struct nlattr *acts, u32 len)
+{
+	struct nlattr *new_acts;
+	struct nlattr *a;
+	int rem, err;
+
+	new_acts = kmalloc(len, GFP_KERNEL);
+	if (!new_acts)
+		return ERR_PTR(-ENOMEM);
+
+	memcpy(new_acts, acts, len);
+
+	for (a = new_acts, rem = len; rem > 0; a = nla_next(a, &rem)) {
+		int type = nla_type(a);
+		struct vport *vport;
+
+		/* Only support output actions at this time */
+		if (type != OVS_ACTION_ATTR_OUTPUT) {
+			err = -ENOTSUPP;
+			goto err;
+		}
+
+		/* Convert ODP port number to ifindex. */
+		vport = ovs_lookup_vport(dp, nla_get_u32(a));
+		if (!vport) {
+			err = -ENOTSUPP;
+			goto err;
+		}
+		*(u32 *)nla_data(a) = vport->dev->ifindex;
+	}
+
+	return new_acts;
+
+err:
+	kfree(new_acts);
+	return ERR_PTR(err);
+}
+#endif
diff --git a/net/openvswitch/flow_netlink.h b/net/openvswitch/flow_netlink.h
index 45f9769e5aac..3622a8a10eb4 100644
--- a/net/openvswitch/flow_netlink.h
+++ b/net/openvswitch/flow_netlink.h
@@ -76,4 +76,7 @@ int ovs_nla_put_actions(const struct nlattr *attr,
 void ovs_nla_free_flow_actions(struct sw_flow_actions *);
 void ovs_nla_free_flow_actions_rcu(struct sw_flow_actions *);
 
+struct nlattr *ovs_switchdev_flow_actions(const struct datapath *dp,
+					  const struct nlattr *acts, u32 len);
+
 #endif /* flow_netlink.h */
diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 4e3972344aa6..78ff0c37df53 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -144,6 +144,44 @@ static struct vport *netdev_create(const struct vport_parms *parms)
 	return ovs_netdev_link(vport, parms->name);
 }
 
+
+#ifdef CONFIG_NET_SWITCHDEV
+void ovs_netdev_clear_hw_flows(struct vport *vport)
+{
+	struct net_device *upper_dev;
+	struct table_instance *ti;
+	struct datapath *dp;
+	u32 i;
+
+	upper_dev = netdev_master_upper_dev_get(vport->dev);
+
+	rcu_read_lock();
+	dp = get_dp_rcu(dev_net(upper_dev), upper_dev->ifindex);
+	if (!dp) {
+		net_warn_ratelimited("%s: could not get datapath\n",
+				     vport->dev->name);
+		goto err;
+	}
+
+	ti = rcu_dereference(dp->table.ti);
+
+	for (i = 0; i < ti->n_buckets; i++) {
+		struct hlist_head *head = flex_array_get(ti->buckets, i);
+		struct sw_flow *flow;
+
+		hlist_for_each_entry_rcu(flow, head,
+					 flow_table.node[ti->node_ver])
+			if (flow->key.phy.in_port == vport->port_no)
+				ovs_hw_flow_del(dp, flow);
+	}
+
+err:
+	rcu_read_unlock();
+}
+#else
+void ovs_netdev_clear_hw_flows(struct vport *vport) {}
+#endif
+
 static void vport_netdev_free(struct rcu_head *rcu)
 {
 	struct vport *vport = container_of(rcu, struct vport, rcu);
@@ -158,6 +196,7 @@ void ovs_netdev_detach_dev(struct vport *vport)
 	ASSERT_RTNL();
 	vport->dev->priv_flags &= ~IFF_OVS_DATAPATH;
 	netdev_rx_handler_unregister(vport->dev);
+	ovs_netdev_clear_hw_flows(vport);
 	netdev_upper_dev_unlink(vport->dev,
 				netdev_master_upper_dev_get(vport->dev));
 	dev_set_promiscuity(vport->dev, -1);
-- 
2.7.0.rc3.207.g0ac5344
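
The statistics bookkeeping in the patch above relies on three copies of
the hardware counters per flow (hw_stats, hw_stats_offset and
hw_stats_base). A minimal user-space sketch of that arithmetic follows.
It is not the patch's code: struct hw_counters, struct flow_hw_acct and
the acct_*() helpers are hypothetical names invented for illustration,
and only packet and byte counters are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct switchdev_obj_stats; only the two
 * counters used by the patch are modelled. */
struct hw_counters {
	uint64_t rx_packets;
	uint64_t rx_bytes;
};

/* Hypothetical per-flow bookkeeping mirroring the three fields the patch
 * adds to struct sw_flow. */
struct flow_hw_acct {
	int present;               /* is the flow currently in hardware? */
	struct hw_counters hw;     /* last raw reading from hardware */
	struct hw_counters offset; /* raw reading at the last clear */
	struct hw_counters base;   /* residue from earlier stints in hardware */
};

/* Record a fresh raw reading from (simulated) hardware. */
static void acct_read(struct flow_hw_acct *a, uint64_t pkts, uint64_t bytes)
{
	a->hw.rx_packets = pkts;
	a->hw.rx_bytes = bytes;
}

/* Packets to report: residue plus, if still offloaded, the live delta. */
static uint64_t acct_packets(const struct flow_hw_acct *a)
{
	uint64_t n = a->base.rx_packets;

	if (a->present)
		n += a->hw.rx_packets - a->offset.rx_packets;
	return n;
}

/* Clear: remember the current raw reading so the live delta restarts
 * from zero. As in the patch, base is left untouched. */
static void acct_clear(struct flow_hw_acct *a)
{
	a->offset = a->hw;
}

/* Removal from hardware: fold the live delta into base, reset the rest. */
static void acct_remove(struct flow_hw_acct *a)
{
	a->base.rx_packets += a->hw.rx_packets - a->offset.rx_packets;
	a->base.rx_bytes += a->hw.rx_bytes - a->offset.rx_bytes;
	a->present = 0;
	a->hw = (struct hw_counters){ 0, 0 };
	a->offset = (struct hw_counters){ 0, 0 };
}
```

The point of keeping an offset rather than zeroing the hardware counter
is that a clear then needs no write to the device; the cost is that every
report has to subtract the offset.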

* [PATCH/RFC 12/12] hack: rocker: no ip frag match
  2016-09-28 12:42 [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev Simon Horman
                   ` (10 preceding siblings ...)
  2016-09-28 12:43 ` [PATCH/RFC 11/12] openvswitch: Support programming of flows into hardware Simon Horman
@ 2016-09-28 12:43 ` Simon Horman
  11 siblings, 0 replies; 17+ messages in thread
From: Simon Horman @ 2016-09-28 12:43 UTC (permalink / raw)
  To: netdev, dev; +Cc: Simon Horman

*** hack; for informational purposes only; not for upstream merge ***

Open vSwitch expects to match on the fragmentation state; however, the
OF-DPA world of rocker does not implement such a field in its flow key. As
a work-around, ignore it to allow testing (in the absence of fragments).

Signed-off-by: Simon Horman <simon.horman@netronome.com>
---
 drivers/net/ethernet/rocker/rocker_ofdpa.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/ethernet/rocker/rocker_ofdpa.c b/drivers/net/ethernet/rocker/rocker_ofdpa.c
index 3b441359a3a7..2f20f1ded5bf 100644
--- a/drivers/net/ethernet/rocker/rocker_ofdpa.c
+++ b/drivers/net/ethernet/rocker/rocker_ofdpa.c
@@ -2931,9 +2931,6 @@ static int ofdpa_port_sw_flow_match(struct ofdpa_port *ofdpa_port,
 		return -ENOTSUPP;
 
 	if (flow->attrs | BIT_ULL(OVS_KEY_ATTR_IPV4)) {
-		if (mask->ip.frag)
-			/* There is no IP frag match in OF-DPA */
-			return -ENOTSUPP;
 		if (mask->ipv4.addr.src != cpu_to_be32(0) ||
 		    mask->ipv4.addr.dst != cpu_to_be32(0))
 			/* Rocker doesn't implement these matches */
-- 
2.7.0.rc3.207.g0ac5344

* Re: [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev
       [not found] ` <1475066582-1971-1-git-send-email-simon.horman-wFxRvT7yatFl57MIdRCFDg@public.gmane.org>
  2016-09-28 12:42   ` [PATCH/RFC 01/12] sw_flow: make struct sw_flow_key available outside of net/openvswitch/ Simon Horman
@ 2016-09-28 13:54   ` Or Gerlitz
  2016-09-29  8:09     ` Simon Horman
  1 sibling, 1 reply; 17+ messages in thread
From: Or Gerlitz @ 2016-09-28 13:54 UTC (permalink / raw)
  To: Simon Horman; +Cc: dev, Linux Netdev List, Rony Efraim

On Wed, Sep 28, 2016 at 3:42 PM, Simon Horman
<simon.horman@netronome.com> wrote:

> A different approach, not implemented by this patch-set, is for user-space
> to program flows into hardware by some other means, for example TC, and/or
> the (kernel) datapath.

Right, and we've submitted that code to the OVS community 24h ago [1].

This was done in line with the feedback we've received over the last two
years (since the LPC 2014 networking micro-conf). It allows offloading
from multiple user-space applications through a single UAPI, the TC one
(currently we did flower, but the OVSD patch set can be extended to use
whatever TC offloads are supported by the port driver, e.g. U32, eBPF),
and integration with 3rd party policy modules running in user-space.

Let's hear people's opinions and see where we go from here.

> I believe that approach does not conflict with this one.
>  And there is some scope to share infrastructure in the kernel

maybe, possibly

We're having a talk at netdev 1.2 on HW offloading of OVS and similar
applications. I would encourage people to come and approach me and/or
Rony Efraim from Mellanox before/after the talk to discuss that F2F; we
would love to get feedback, and also here...

Or.

[1] pointers to patches implementing the 2nd approach

cover-letter http://openvswitch.org/pipermail/dev/2016-September/079952.html

patches

https://patchwork.ozlabs.org/patch/675560/
https://patchwork.ozlabs.org/patch/675567/
https://patchwork.ozlabs.org/patch/675565/
https://patchwork.ozlabs.org/patch/675559/
https://patchwork.ozlabs.org/patch/675564/
https://patchwork.ozlabs.org/patch/675563/
https://patchwork.ozlabs.org/patch/675568/
https://patchwork.ozlabs.org/patch/675566/
https://patchwork.ozlabs.org/patch/675562/

[2] http://www.netdevconf.org/1.2/session.html?rony-efraim-1
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

* Re: [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev
  2016-09-28 13:54   ` [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev Or Gerlitz
@ 2016-09-29  8:09     ` Simon Horman
       [not found]       ` <20160929080904.GA24113-ucRxlxcrRFEsysjaEhV7d2ey4e3TpSOZIxS8c3vjKQDk1uMJSBkQmQ@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Simon Horman @ 2016-09-29  8:09 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Linux Netdev List, dev, Rony Efraim

Hi Or,

On Wed, Sep 28, 2016 at 04:54:40PM +0300, Or Gerlitz wrote:
> On Wed, Sep 28, 2016 at 3:42 PM, Simon Horman
> <simon.horman@netronome.com> wrote:
> 
> > A different approach, not implemented by this patch-set, is for user-space
> > to program flows into hardware by some other means, for example TC, and/or
> > the (kernel) datapath.
> 
> Right, and we've submitted that code to the OVS community 24h ago [1].
> 
> This was done in line with the feedback we've received over the last two
> years (since the LPC 2014 networking micro-conf). It allows offloading
> from multiple user-space applications through a single UAPI, the TC one
> (currently we did flower, but the OVSD patch set can be extended to use
> whatever TC offloads are supported by the port driver, e.g. U32, eBPF),
> and integration with 3rd party policy modules running in user-space.
> 
> Let's hear people's opinions and see where we go from here.
> 
> > I believe that approach does not conflict with this one.
> >  And there is some scope to share infrastructure in the kernel
> 
> maybe, possibly
> 
> We're having a talk at netdev 1.2 on HW offloading of OVS and similar
> applications. I would encourage people to come and approach me and/or
> Rony Efraim from Mellanox before/after the talk to discuss that F2F; we
> would love to get feedback, and also here...

Thanks for putting my post in context with the work you mention.
I am looking forward to some F2F discussions next week.

> Or.
> 
> [1] pointers to patches implementing the 2nd approach
> 
> cover-letter http://openvswitch.org/pipermail/dev/2016-September/079952.html
> 
> patches
> 
> https://patchwork.ozlabs.org/patch/675560/
> https://patchwork.ozlabs.org/patch/675567/
> https://patchwork.ozlabs.org/patch/675565/
> https://patchwork.ozlabs.org/patch/675559/
> https://patchwork.ozlabs.org/patch/675564/
> https://patchwork.ozlabs.org/patch/675563/
> https://patchwork.ozlabs.org/patch/675568/
> https://patchwork.ozlabs.org/patch/675566/
> https://patchwork.ozlabs.org/patch/675562/
> 
> [2] http://www.netdevconf.org/1.2/session.html?rony-efraim-1

* Re: [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev
       [not found]       ` <20160929080904.GA24113-ucRxlxcrRFEsysjaEhV7d2ey4e3TpSOZIxS8c3vjKQDk1uMJSBkQmQ@public.gmane.org>
@ 2016-09-30 22:12         ` pravin shelar
       [not found]           ` <CAOrHB_CG6wJ4t1zTaFZ8Uq5Ltoiqx+ctk-ZhqZ0sy3tF5-YXVQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: pravin shelar @ 2016-09-30 22:12 UTC (permalink / raw)
  To: Simon Horman
  Cc: dev, Linux Netdev List, Rony Efraim, Or Gerlitz

On Thu, Sep 29, 2016 at 1:09 AM, Simon Horman
<simon.horman@netronome.com> wrote:
> Hi Or,
>
> On Wed, Sep 28, 2016 at 04:54:40PM +0300, Or Gerlitz wrote:
>> On Wed, Sep 28, 2016 at 3:42 PM, Simon Horman
>> <simon.horman@netronome.com> wrote:
>>
>> > A different approach, not implemented by this patch-set, is for user-space
>> > to program flows into hardware by some other means, for example TC, and/or
>> > the (kernel) datapath.
>>
>> Right, and we've submitted that code to the OVS community 24h ago [1].
>>
>> This was done in line with the feedback we've received over the last two
>> years (since the LPC 2014 networking micro-conf). It allows offloading
>> from multiple user-space applications through a single UAPI, the TC one
>> (currently we did flower, but the OVSD patch set can be extended to use
>> whatever TC offloads are supported by the port driver, e.g. U32, eBPF),
>> and integration with 3rd party policy modules running in user-space.
>>
>> Let's hear people's opinions and see where we go from here.
>>
>> > I believe that approach does not conflict with this one.
>> >  And there is some scope to share infrastructure in the kernel
>>
>> maybe, possibly
>>
>> We're having a talk at netdev 1.2 on HW offloading of OVS and similar
>> applications. I would encourage people to come and approach me and/or
>> Rony Efraim from Mellanox before/after the talk to discuss that F2F; we
>> would love to get feedback, and also here...
>
> Thanks for putting my post in context with the work you mention.
> I am looking forward to some F2F discussions next week.
>

I will not be able to come to the netdev conference, so I wanted to post
my thoughts here.
I would like to thank you for sharing the code. But I see some issues with
this design for flow offloading.

1. This adds an OVS-specific flow-based offloading piece to switchdev
rather than a generic one. Any switchdev flow-based offloads
would be available to OVS only. Other software switches cannot make
use of them.
2. This complicates the OVS kernel module by adding hardware-related state
to the datapath. The OVS datapath is supposed to be a software switch.
3. Since this switchdev offload is controlled by the OVS kernel module,
any new API exposed by switchdev would need some changes to the OVS kernel
module and possibly the OVS API.
4. There is a project under way to rewrite the OVS kernel module in eBPF.
Plugging hardware offload into the OVS kernel module therefore makes it
obsolete once we move to the new implementation. We would need to rewrite
hardware offload again for eBPF OVS.
5. The OVS kernel module has much less context than userspace. A
userspace-based API can use that context and can be more effective when
offloading a flow to limited offloading resources.
6. This patch also increases the size of struct sw_flow, but most users
would not be using switchdev offloads.
7. This could just be a bug, but patch 11 changes the OVS_FLOW_CMD_SET
API so that the flow key becomes a required parameter rather than an
optional one.
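
Point 7 can be made concrete with a small user-space sketch. It assumes,
based on the surrounding context lines in patch 11 (the UFID lookup
followed by `if (a[OVS_FLOW_ATTR_KEY])`), that a set request may identify
a flow by UFID alone. struct set_request and both validate_*() helpers
are hypothetical names for illustration and are not part of the patch or
of the OVS API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical summary of which attributes a flow "set" request carries. */
struct set_request {
	bool has_key;  /* OVS_FLOW_ATTR_KEY present */
	bool has_ufid; /* OVS_FLOW_ATTR_UFID present */
};

/* Models the check added at the top of ovs_flow_cmd_set() in patch 11:
 * a request without a key is rejected outright. */
static int validate_key_required(const struct set_request *req)
{
	return req->has_key ? 0 : -EINVAL;
}

/* Models an alternative that keeps the key optional: the request is
 * rejected only when neither a key nor a UFID identifies the flow. */
static int validate_key_optional(const struct set_request *req)
{
	return (req->has_key || req->has_ufid) ? 0 : -EINVAL;
}
```

A request carrying only a UFID passes the second check but fails the
first, which is the change in behaviour described in point 7.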

Why not allow switchdev offload API for userspace similar to TC flower
offload? or we could use flower API for switchdev flow offload.

Thanks,
Pravin.

* Re: [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev
       [not found]           ` <CAOrHB_CG6wJ4t1zTaFZ8Uq5Ltoiqx+ctk-ZhqZ0sy3tF5-YXVQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-10-01  9:05             ` Or Gerlitz
  0 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2016-10-01  9:05 UTC (permalink / raw)
  To: pravin shelar
  Cc: dev, Simon Horman, Rony Efraim, Linux Netdev List

On Sat, Oct 1, 2016 at 1:12 AM, pravin shelar <pshelar@ovn.org> wrote:

[...]

> Why not allow switchdev offload API for userspace similar to TC flower
> offload? or we could use flower API for switchdev flow offload.

Hi Pravin,

Could you also share your thoughts on the RFC we posted a couple of
days ago to the OVS devel mailing list? The offloading there [1] is
done from OVSD using TC/Flower.

Or.

[1] http://openvswitch.org/pipermail/dev/2016-September/079952.html

Thread overview: 17+ messages
2016-09-28 12:42 [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev Simon Horman
     [not found] ` <1475066582-1971-1-git-send-email-simon.horman-wFxRvT7yatFl57MIdRCFDg@public.gmane.org>
2016-09-28 12:42   ` [PATCH/RFC 01/12] sw_flow: make struct sw_flow_key available outside of net/openvswitch/ Simon Horman
2016-09-28 13:54   ` [PATCH/RFC 00/12] Programming Open vSwitch (-like) flows into hardware using SwitchDev Or Gerlitz
2016-09-29  8:09     ` Simon Horman
     [not found]       ` <20160929080904.GA24113-ucRxlxcrRFEsysjaEhV7d2ey4e3TpSOZIxS8c3vjKQDk1uMJSBkQmQ@public.gmane.org>
2016-09-30 22:12         ` pravin shelar
     [not found]           ` <CAOrHB_CG6wJ4t1zTaFZ8Uq5Ltoiqx+ctk-ZhqZ0sy3tF5-YXVQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-10-01  9:05             ` Or Gerlitz
2016-09-28 12:42 ` [PATCH/RFC 02/12] switchdev: Add Open vSwitch (-like) flow object support Simon Horman
2016-09-28 12:42 ` [PATCH/RFC 03/12] switchdev: Add support for getting port object details Simon Horman
2016-09-28 12:42 ` [PATCH/RFC 04/12] rocker: Add Open vSwitch (-like) flow support Simon Horman
2016-09-28 12:42 ` [PATCH/RFC 05/12] rocker: Support Open vSwitch (-like) flow stats Simon Horman
2016-09-28 12:42 ` [PATCH/RFC 06/12] rocker: Add helper to check ports belong to the same rocker switch Simon Horman
2016-09-28 12:42 ` [PATCH/RFC 07/12] rocker: switchdev Add Open vSwitch (-like) flow support to OF-DPA world Simon Horman
2016-09-28 12:42 ` [PATCH/RFC 08/12] rocker: Support Open vSwitch (-like) flow stats in " Simon Horman
2016-09-28 12:42 ` [PATCH/RFC 09/12] openvswitch: Add key_attrs to struct sw_flow_match Simon Horman
2016-09-28 12:43 ` [PATCH/RFC 10/12] openvswitch: make get_dp_rcu() available outside datapath.c Simon Horman
2016-09-28 12:43 ` [PATCH/RFC 11/12] openvswitch: Support programming of flows into hardware Simon Horman
2016-09-28 12:43 ` [PATCH/RFC 12/12] hack: rocker: no ip frag match Simon Horman
