[PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets
@ 2016-10-17 13:02 Jiri Benc
       [not found] ` <cover.1476708212.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
                   ` (7 more replies)
  0 siblings, 8 replies; 27+ messages in thread
From: Jiri Benc @ 2016-10-17 13:02 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, Simon Horman

At the core of this patch set is removing the assumption in Open vSwitch
datapath that all packets have Ethernet header. Support for layer 3 GRE
tunnels is also added by this patchset.

The implementation relies on the presence of pop_eth and push_eth actions
in datapath flows to facilitate adding and removing Ethernet headers as
appropriate. The construction of such flows is left up to user-space.

This series is based on work by Simon Horman, Lorand Jakab, Thomas Morin and
others. I kept Lorand's and Simon's s-o-b in the patches that are derived
from v11 to record their authorship of parts of the code. Please let me know
if you disagree with this.

v12 differs from v11 a lot. The main changes are:

* The patches were restructured and split differently for easier review.
* They were rebased and adjusted to the current net-next. Especially MPLS
  handling is different (and easier) thanks to the recent MPLS GSO rework.
* Several bugs were discovered and fixed. The most notable is fragment
  handling: header adjustment for ARPHRD_NONE devices on tx needs to be done
  after refragmentation, not before it. This required significant changes in
  the patchset. Another one is stricter checking of attributes (match on L2
  vs. L3 packet) at the kernel level.
* Instead of is_layer3 bool, a mac_proto field is used. See patch 2. This is
  a matter of taste and alternate approaches are offered in patch 2
  description.

There is no change to uAPI since v11. The previously posted patchset for
Open vSwitch user space works with this submission unmodified.

Jiri Benc (8):
  openvswitch: use hard_header_len instead of hardcoded ETH_HLEN
  openvswitch: add mac_proto field to the flow key
  openvswitch: pass mac_proto to ovs_vport_send
  openvswitch: support MPLS push and pop for L3 packets
  openvswitch: add processing of L3 packets
  openvswitch: netlink: support L3 packets
  openvswitch: add Ethernet push and pop actions
  openvswitch: allow L3 netdev ports

Simon Horman (1):
  openvswitch: use ipgre tunnel rather than gretap tunnel

 include/net/gre.h                |   4 +-
 include/uapi/linux/openvswitch.h |  15 ++++
 net/ipv4/ip_gre.c                |   9 +-
 net/openvswitch/actions.c        | 111 +++++++++++++++++-------
 net/openvswitch/datapath.c       |  17 +---
 net/openvswitch/flow.c           | 100 ++++++++++++++++------
 net/openvswitch/flow.h           |  22 +++++
 net/openvswitch/flow_netlink.c   | 179 ++++++++++++++++++++++++++-------------
 net/openvswitch/vport-gre.c      |   2 +-
 net/openvswitch/vport-netdev.c   |   9 +-
 net/openvswitch/vport.c          |  28 ++++--
 net/openvswitch/vport.h          |   2 +-
 12 files changed, 356 insertions(+), 142 deletions(-)

-- 
1.8.3.1

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH net-next v12 1/9] openvswitch: use hard_header_len instead of hardcoded ETH_HLEN
       [not found] ` <cover.1476708212.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-10-17 13:02   ` Jiri Benc
  2016-10-19  5:13     ` Pravin Shelar
  2016-10-17 13:02   ` [PATCH net-next v12 2/9] openvswitch: add mac_proto field to the flow key Jiri Benc
  2016-10-19  5:11   ` [PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets Pravin Shelar
  2 siblings, 1 reply; 27+ messages in thread
From: Jiri Benc @ 2016-10-17 13:02 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, Simon Horman

On tx, use hard_header_len while deciding whether to refragment or drop the
packet. That way, all combinations are calculated correctly:

* L2 packet going to L2 interface (the L2 header len is subtracted),
* L2 packet going to L3 interface (the L2 header is included in the packet
  lenght),
* L3 packet going to L3 interface.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
 net/openvswitch/actions.c |  3 ++-
 net/openvswitch/vport.c   | 10 ++++++----
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 1105c4e29c62..49af167105d3 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -791,7 +791,8 @@ static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
 				pskb_trim(skb, ETH_HLEN);
 		}
 
-		if (likely(!mru || (skb->len <= mru + ETH_HLEN))) {
+		if (likely(!mru ||
+		           (skb->len <= mru + vport->dev->hard_header_len))) {
 			ovs_vport_send(vport, skb);
 		} else if (mru <= vport->dev->mtu) {
 			struct net *net = read_pnet(&dp->net);
diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index 7387418ac514..d5550fd6a013 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -481,9 +481,10 @@ void ovs_vport_deferred_free(struct vport *vport)
 }
 EXPORT_SYMBOL_GPL(ovs_vport_deferred_free);
 
-static unsigned int packet_length(const struct sk_buff *skb)
+static unsigned int packet_length(const struct sk_buff *skb,
+				  struct net_device *dev)
 {
-	unsigned int length = skb->len - ETH_HLEN;
+	unsigned int length = skb->len - dev->hard_header_len;
 
 	if (!skb_vlan_tag_present(skb) &&
 	    eth_type_vlan(skb->protocol))
@@ -501,10 +502,11 @@ void ovs_vport_send(struct vport *vport, struct sk_buff *skb)
 {
 	int mtu = vport->dev->mtu;
 
-	if (unlikely(packet_length(skb) > mtu && !skb_is_gso(skb))) {
+	if (unlikely(packet_length(skb, vport->dev) > mtu &&
+		     !skb_is_gso(skb))) {
 		net_warn_ratelimited("%s: dropped over-mtu packet: %d > %d\n",
 				     vport->dev->name,
-				     packet_length(skb), mtu);
+				     packet_length(skb, vport->dev), mtu);
 		vport->dev->stats.tx_errors++;
 		goto drop;
 	}
-- 
1.8.3.1

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next v12 2/9] openvswitch: add mac_proto field to the flow key
       [not found] ` <cover.1476708212.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2016-10-17 13:02   ` [PATCH net-next v12 1/9] openvswitch: use hard_header_len instead of hardcoded ETH_HLEN Jiri Benc
@ 2016-10-17 13:02   ` Jiri Benc
       [not found]     ` <098a6ab19e26cf7b1cc16f57f5fc0cc019ca44d6.1476708212.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2016-10-19  5:11   ` [PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets Pravin Shelar
  2 siblings, 1 reply; 27+ messages in thread
From: Jiri Benc @ 2016-10-17 13:02 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, Simon Horman

Use a hole in the structure. We support only Ethernet so far and will add
a support for L2-less packets shortly. We could use a bool to indicate
whether the Ethernet header is present or not but the approach with the
mac_proto field is more generic and occupies the same number of bytes in the
struct, while allowing later extensibility. It also makes the code in the
next patches more self explaining.

It would be nice to use ARPHRD_ constants but those are u16 which would be
waste. Thus define our own constants.

Another upside of this is that we can overload this new field to also denote
whether the flow key is valid. This has the advantage that on
refragmentation, we don't have to reparse the packet but can rely on the
stored eth.type. This is especially important for the next patches in this
series - instead of adding another branch for L2-less packets before calling
ovs_fragment, we can just remove all those branches completely.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
There are three possible approaches:

(1) The one in this patch.

(2) Use just a one bit flag indicating whether the packet is L3 or Ethernet
    (similar to the "is_layer3" bool in v11). The code would stay very
    similar to this patchset, the memory consumption would be the same.

(3) Use value of 14 for MAC_PROTO_ETHERNET. It would simplify things nicely,
    as ovs_mac_header_len would be identical to ovs_key_mac_proto, saving
    one comparison. Of course, this would mean that if other L2 protocols
    are added in the future, they can only have L2 header length different
    than 14. Sounds hacky, although I kind of like this.

After thinking about pros and cons, I implemented (1). Seems to be most
clear of the three options. But I'm happy to implement (2) or (3) if it's
deemed better.
---
 net/openvswitch/actions.c      | 14 +++-----------
 net/openvswitch/flow.c         |  1 +
 net/openvswitch/flow.h         | 22 ++++++++++++++++++++++
 net/openvswitch/flow_netlink.c |  5 +++++
 4 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 49af167105d3..44144f914920 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -137,12 +137,12 @@ static struct deferred_action *add_deferred_actions(struct sk_buff *skb,
 
 static void invalidate_flow_key(struct sw_flow_key *key)
 {
-	key->eth.type = htons(0);
+	key->mac_proto |= SW_FLOW_KEY_INVALID;
 }
 
 static bool is_flow_key_valid(const struct sw_flow_key *key)
 {
-	return !!key->eth.type;
+	return !(key->mac_proto & SW_FLOW_KEY_INVALID);
 }
 
 static void update_ethertype(struct sk_buff *skb, struct ethhdr *hdr,
@@ -796,16 +796,8 @@ static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
 			ovs_vport_send(vport, skb);
 		} else if (mru <= vport->dev->mtu) {
 			struct net *net = read_pnet(&dp->net);
-			__be16 ethertype = key->eth.type;
 
-			if (!is_flow_key_valid(key)) {
-				if (eth_p_mpls(skb->protocol))
-					ethertype = skb->inner_protocol;
-				else
-					ethertype = vlan_get_protocol(skb);
-			}
-
-			ovs_fragment(net, vport, skb, mru, ethertype);
+			ovs_fragment(net, vport, skb, mru, key->eth.type);
 		} else {
 			kfree_skb(skb);
 		}
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index 22087062bd10..96c8c4716603 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -751,6 +751,7 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
 	key->phy.skb_mark = skb->mark;
 	ovs_ct_fill_key(skb, key);
 	key->ovs_flow_hash = 0;
+	key->mac_proto = MAC_PROTO_ETHERNET;
 	key->recirc_id = 0;
 
 	return key_extract(skb, key);
diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index ae783f5c6695..f61cae7f9030 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -37,6 +37,12 @@
 
 struct sk_buff;
 
+enum sw_flow_mac_proto {
+	MAC_PROTO_NONE = 0,
+	MAC_PROTO_ETHERNET,
+};
+#define SW_FLOW_KEY_INVALID	0x80
+
 /* Store options at the end of the array if they are less than the
  * maximum size. This allows us to get the benefits of variable length
  * matching for small options.
@@ -68,6 +74,7 @@ struct sw_flow_key {
 		u32	skb_mark;	/* SKB mark. */
 		u16	in_port;	/* Input switch port (or DP_MAX_PORTS). */
 	} __packed phy; /* Safe when right after 'tun_key'. */
+	u8 mac_proto;			/* MAC layer protocol (e.g. Ethernet). */
 	u8 tun_proto;			/* Protocol of encapsulating tunnel. */
 	u32 ovs_flow_hash;		/* Datapath computed hash value.  */
 	u32 recirc_id;			/* Recirculation ID.  */
@@ -206,6 +213,21 @@ struct arp_eth_header {
 	unsigned char       ar_tip[4];		/* target IP address        */
 } __packed;
 
+static inline u8 ovs_key_mac_proto(const struct sw_flow_key *key)
+{
+	return key->mac_proto & ~SW_FLOW_KEY_INVALID;
+}
+
+static inline u16 __ovs_mac_header_len(u8 mac_proto)
+{
+	return mac_proto == MAC_PROTO_ETHERNET ? ETH_HLEN : 0;
+}
+
+static inline u16 ovs_mac_header_len(const struct sw_flow_key *key)
+{
+	return __ovs_mac_header_len(ovs_key_mac_proto(key));
+}
+
 static inline bool ovs_identifier_is_ufid(const struct sw_flow_id *sfid)
 {
 	return sfid->ufid_len;
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index ae25ded82b3b..ccb9900c5230 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -1059,6 +1059,11 @@ static int metadata_from_nlattrs(struct net *net, struct sw_flow_match *match,
 				   sizeof(*cl), is_mask);
 		*attrs &= ~(1ULL << OVS_KEY_ATTR_CT_LABELS);
 	}
+
+	/* Always exact match mac_proto */
+	SW_FLOW_KEY_PUT(match, mac_proto, is_mask ? 0xff : MAC_PROTO_ETHERNET,
+			is_mask);
+
 	return 0;
 }
 
-- 
1.8.3.1

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next v12 3/9] openvswitch: pass mac_proto to ovs_vport_send
  2016-10-17 13:02 [PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets Jiri Benc
       [not found] ` <cover.1476708212.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-10-17 13:02 ` Jiri Benc
  2016-10-19  5:11   ` Pravin Shelar
  2016-10-17 13:02 ` [PATCH net-next v12 4/9] openvswitch: support MPLS push and pop for L3 packets Jiri Benc
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Jiri Benc @ 2016-10-17 13:02 UTC (permalink / raw)
  To: netdev; +Cc: dev, Pravin Shelar, Lorand Jakab, Simon Horman

We'll need it to alter packets sent to ARPHRD_NONE interfaces.

Change do_output() to use the actual L2 header size of the packet when
deciding on the minimum cutlen. The assumption here is that what matters is
not the output interface hard_header_len but rather the L2 header of the
particular packet. For example, ARPHRD_NONE tunnels that encapsulate
Ethernet should get at least the Ethernet header.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
 net/openvswitch/actions.c | 29 +++++++++++++++++------------
 net/openvswitch/vport.c   |  2 +-
 net/openvswitch/vport.h   |  2 +-
 3 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 44144f914920..dc8bb97e2258 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -66,6 +66,7 @@ struct ovs_frag_data {
 	u16 vlan_tci;
 	__be16 vlan_proto;
 	unsigned int l2_len;
+	u8 mac_proto;
 	u8 l2_data[MAX_L2_LEN];
 };
 
@@ -673,7 +674,7 @@ static int ovs_vport_output(struct net *net, struct sock *sk, struct sk_buff *sk
 		skb_reset_mac_len(skb);
 	}
 
-	ovs_vport_send(vport, skb);
+	ovs_vport_send(vport, skb, data->mac_proto);
 	return 0;
 }
 
@@ -692,7 +693,7 @@ static int ovs_vport_output(struct net *net, struct sock *sk, struct sk_buff *sk
  * ovs_vport_output(), which is called once per fragmented packet.
  */
 static void prepare_frag(struct vport *vport, struct sk_buff *skb,
-			 u16 orig_network_offset)
+			 u16 orig_network_offset, u8 mac_proto)
 {
 	unsigned int hlen = skb_network_offset(skb);
 	struct ovs_frag_data *data;
@@ -705,6 +706,7 @@ static void prepare_frag(struct vport *vport, struct sk_buff *skb,
 	data->network_offset = orig_network_offset;
 	data->vlan_tci = skb->vlan_tci;
 	data->vlan_proto = skb->vlan_proto;
+	data->mac_proto = mac_proto;
 	data->l2_len = hlen;
 	memcpy(&data->l2_data, skb->data, hlen);
 
@@ -713,7 +715,8 @@ static void prepare_frag(struct vport *vport, struct sk_buff *skb,
 }
 
 static void ovs_fragment(struct net *net, struct vport *vport,
-			 struct sk_buff *skb, u16 mru, __be16 ethertype)
+			 struct sk_buff *skb, u16 mru,
+			 struct sw_flow_key *key)
 {
 	u16 orig_network_offset = 0;
 
@@ -727,11 +730,12 @@ static void ovs_fragment(struct net *net, struct vport *vport,
 		goto err;
 	}
 
-	if (ethertype == htons(ETH_P_IP)) {
+	if (key->eth.type == htons(ETH_P_IP)) {
 		struct dst_entry ovs_dst;
 		unsigned long orig_dst;
 
-		prepare_frag(vport, skb, orig_network_offset);
+		prepare_frag(vport, skb, orig_network_offset,
+			     ovs_key_mac_proto(key));
 		dst_init(&ovs_dst, &ovs_dst_ops, NULL, 1,
 			 DST_OBSOLETE_NONE, DST_NOCOUNT);
 		ovs_dst.dev = vport->dev;
@@ -742,7 +746,7 @@ static void ovs_fragment(struct net *net, struct vport *vport,
 
 		ip_do_fragment(net, skb->sk, skb, ovs_vport_output);
 		refdst_drop(orig_dst);
-	} else if (ethertype == htons(ETH_P_IPV6)) {
+	} else if (key->eth.type == htons(ETH_P_IPV6)) {
 		const struct nf_ipv6_ops *v6ops = nf_get_ipv6_ops();
 		unsigned long orig_dst;
 		struct rt6_info ovs_rt;
@@ -751,7 +755,8 @@ static void ovs_fragment(struct net *net, struct vport *vport,
 			goto err;
 		}
 
-		prepare_frag(vport, skb, orig_network_offset);
+		prepare_frag(vport, skb, orig_network_offset,
+			     ovs_key_mac_proto(key));
 		memset(&ovs_rt, 0, sizeof(ovs_rt));
 		dst_init(&ovs_rt.dst, &ovs_dst_ops, NULL, 1,
 			 DST_OBSOLETE_NONE, DST_NOCOUNT);
@@ -765,7 +770,7 @@ static void ovs_fragment(struct net *net, struct vport *vport,
 		refdst_drop(orig_dst);
 	} else {
 		WARN_ONCE(1, "Failed fragment ->%s: eth=%04x, MRU=%d, MTU=%d.",
-			  ovs_vport_name(vport), ntohs(ethertype), mru,
+			  ovs_vport_name(vport), ntohs(key->eth.type), mru,
 			  vport->dev->mtu);
 		goto err;
 	}
@@ -785,19 +790,19 @@ static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
 		u32 cutlen = OVS_CB(skb)->cutlen;
 
 		if (unlikely(cutlen > 0)) {
-			if (skb->len - cutlen > ETH_HLEN)
+			if (skb->len - cutlen > ovs_mac_header_len(key))
 				pskb_trim(skb, skb->len - cutlen);
 			else
-				pskb_trim(skb, ETH_HLEN);
+				pskb_trim(skb, ovs_mac_header_len(key));
 		}
 
 		if (likely(!mru ||
 		           (skb->len <= mru + vport->dev->hard_header_len))) {
-			ovs_vport_send(vport, skb);
+			ovs_vport_send(vport, skb, ovs_key_mac_proto(key));
 		} else if (mru <= vport->dev->mtu) {
 			struct net *net = read_pnet(&dp->net);
 
-			ovs_fragment(net, vport, skb, mru, key->eth.type);
+			ovs_fragment(net, vport, skb, mru, key);
 		} else {
 			kfree_skb(skb);
 		}
diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index d5550fd6a013..dc0e2212edfc 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -498,7 +498,7 @@ static unsigned int packet_length(const struct sk_buff *skb,
 	return length;
 }
 
-void ovs_vport_send(struct vport *vport, struct sk_buff *skb)
+void ovs_vport_send(struct vport *vport, struct sk_buff *skb, u8 mac_proto)
 {
 	int mtu = vport->dev->mtu;
 
diff --git a/net/openvswitch/vport.h b/net/openvswitch/vport.h
index f01f28a567ad..81925e21ff8e 100644
--- a/net/openvswitch/vport.h
+++ b/net/openvswitch/vport.h
@@ -198,6 +198,6 @@ static inline const char *ovs_vport_name(struct vport *vport)
 	})
 
 void ovs_vport_ops_unregister(struct vport_ops *ops);
-void ovs_vport_send(struct vport *vport, struct sk_buff *skb);
+void ovs_vport_send(struct vport *vport, struct sk_buff *skb, u8 mac_proto);
 
 #endif /* vport.h */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next v12 4/9] openvswitch: support MPLS push and pop for L3 packets
  2016-10-17 13:02 [PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets Jiri Benc
       [not found] ` <cover.1476708212.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2016-10-17 13:02 ` [PATCH net-next v12 3/9] openvswitch: pass mac_proto to ovs_vport_send Jiri Benc
@ 2016-10-17 13:02 ` Jiri Benc
       [not found]   ` <004e53f40443698cffdaf8f9839649cc4b9101cc.1476708213.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2016-10-17 13:02 ` [PATCH net-next v12 5/9] openvswitch: add processing of " Jiri Benc
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Jiri Benc @ 2016-10-17 13:02 UTC (permalink / raw)
  To: netdev; +Cc: dev, Pravin Shelar, Lorand Jakab, Simon Horman

Update Ethernet header only if there is one.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
 net/openvswitch/actions.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index dc8bb97e2258..064cbcb7b0c5 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -187,7 +187,8 @@ static int push_mpls(struct sk_buff *skb, struct sw_flow_key *key,
 
 	skb_postpush_rcsum(skb, new_mpls_lse, MPLS_HLEN);
 
-	update_ethertype(skb, eth_hdr(skb), mpls->mpls_ethertype);
+	if (ovs_key_mac_proto(key) == MAC_PROTO_ETHERNET)
+		update_ethertype(skb, eth_hdr(skb), mpls->mpls_ethertype);
 	skb->protocol = mpls->mpls_ethertype;
 
 	invalidate_flow_key(key);
@@ -197,7 +198,6 @@ static int push_mpls(struct sk_buff *skb, struct sw_flow_key *key,
 static int pop_mpls(struct sk_buff *skb, struct sw_flow_key *key,
 		    const __be16 ethertype)
 {
-	struct ethhdr *hdr;
 	int err;
 
 	err = skb_ensure_writable(skb, skb->mac_len + MPLS_HLEN);
@@ -213,11 +213,15 @@ static int pop_mpls(struct sk_buff *skb, struct sw_flow_key *key,
 	skb_reset_mac_header(skb);
 	skb_set_network_header(skb, skb->mac_len);
 
-	/* mpls_hdr() is used to locate the ethertype field correctly in the
-	 * presence of VLAN tags.
-	 */
-	hdr = (struct ethhdr *)((void *)mpls_hdr(skb) - ETH_HLEN);
-	update_ethertype(skb, hdr, ethertype);
+	if (ovs_key_mac_proto(key) == MAC_PROTO_ETHERNET) {
+		struct ethhdr *hdr;
+
+		/* mpls_hdr() is used to locate the ethertype field correctly in the
+		 * presence of VLAN tags.
+		 */
+		hdr = (struct ethhdr *)((void *)mpls_hdr(skb) - ETH_HLEN);
+		update_ethertype(skb, hdr, ethertype);
+	}
 	if (eth_p_mpls(skb->protocol))
 		skb->protocol = ethertype;
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next v12 5/9] openvswitch: add processing of L3 packets
  2016-10-17 13:02 [PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets Jiri Benc
                   ` (2 preceding siblings ...)
  2016-10-17 13:02 ` [PATCH net-next v12 4/9] openvswitch: support MPLS push and pop for L3 packets Jiri Benc
@ 2016-10-17 13:02 ` Jiri Benc
  2016-10-19  5:13   ` Pravin Shelar
  2016-10-21  4:19   ` Pravin Shelar
  2016-10-17 13:02 ` [PATCH net-next v12 6/9] openvswitch: netlink: support " Jiri Benc
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 27+ messages in thread
From: Jiri Benc @ 2016-10-17 13:02 UTC (permalink / raw)
  To: netdev; +Cc: dev, Pravin Shelar, Lorand Jakab, Simon Horman

Support receiving, extracting flow key and sending of L3 packets (packets
without an Ethernet header).

Note that even after this patch, non-Ethernet interfaces are still not
allowed to be added to bridges. Similarly, netlink interface for sending and
receiving L3 packets to/from user space is not in place yet.

Based on previous versions by Lorand Jakab and Simon Horman.

Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
 net/openvswitch/datapath.c |  17 ++------
 net/openvswitch/flow.c     | 101 ++++++++++++++++++++++++++++++++++-----------
 net/openvswitch/vport.c    |  16 +++++++
 3 files changed, 96 insertions(+), 38 deletions(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 4d67ea856067..c5b719fca8d4 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -562,7 +562,6 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
 	struct sw_flow *flow;
 	struct sw_flow_actions *sf_acts;
 	struct datapath *dp;
-	struct ethhdr *eth;
 	struct vport *input_vport;
 	u16 mru = 0;
 	int len;
@@ -583,17 +582,6 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
 
 	nla_memcpy(__skb_put(packet, len), a[OVS_PACKET_ATTR_PACKET], len);
 
-	skb_reset_mac_header(packet);
-	eth = eth_hdr(packet);
-
-	/* Normally, setting the skb 'protocol' field would be handled by a
-	 * call to eth_type_trans(), but it assumes there's a sending
-	 * device, which we may not have. */
-	if (eth_proto_is_802_3(eth->h_proto))
-		packet->protocol = eth->h_proto;
-	else
-		packet->protocol = htons(ETH_P_802_2);
-
 	/* Set packet's mru */
 	if (a[OVS_PACKET_ATTR_MRU]) {
 		mru = nla_get_u16(a[OVS_PACKET_ATTR_MRU]);
@@ -609,8 +597,10 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
 
 	err = ovs_flow_key_extract_userspace(net, a[OVS_PACKET_ATTR_KEY],
 					     packet, &flow->key, log);
-	if (err)
+	if (err) {
+		packet = NULL;
 		goto err_flow_free;
+	}
 
 	err = ovs_nla_copy_actions(net, a[OVS_PACKET_ATTR_ACTIONS],
 				   &flow->key, &acts, log);
@@ -620,6 +610,7 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
 	rcu_assign_pointer(flow->sf_acts, acts);
 	packet->priority = flow->key.phy.priority;
 	packet->mark = flow->key.phy.skb_mark;
+	packet->protocol = flow->key.eth.type;
 
 	rcu_read_lock();
 	dp = get_dp_rcu(net, ovs_header->dp_ifindex);
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index 96c8c4716603..7aee6e273765 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -334,14 +334,17 @@ static int parse_vlan_tag(struct sk_buff *skb, struct vlan_head *key_vh)
 	return 1;
 }
 
-static int parse_vlan(struct sk_buff *skb, struct sw_flow_key *key)
+static void clear_vlan(struct sw_flow_key *key)
 {
-	int res;
-
 	key->eth.vlan.tci = 0;
 	key->eth.vlan.tpid = 0;
 	key->eth.cvlan.tci = 0;
 	key->eth.cvlan.tpid = 0;
+}
+
+static int parse_vlan(struct sk_buff *skb, struct sw_flow_key *key)
+{
+	int res;
 
 	if (skb_vlan_tag_present(skb)) {
 		key->eth.vlan.tci = htons(skb->vlan_tci);
@@ -483,17 +486,20 @@ static int parse_icmpv6(struct sk_buff *skb, struct sw_flow_key *key,
  *
  * Returns 0 if successful, otherwise a negative errno value.
  *
- * Initializes @skb header pointers as follows:
+ * Initializes @skb header fields as follows:
  *
- *    - skb->mac_header: the Ethernet header.
+ *    - skb->mac_header: the L2 header.
  *
- *    - skb->network_header: just past the Ethernet header, or just past the
- *      VLAN header, to the first byte of the Ethernet payload.
+ *    - skb->network_header: just past the L2 header, or just past the
+ *      VLAN header, to the first byte of the L2 payload.
  *
  *    - skb->transport_header: If key->eth.type is ETH_P_IP or ETH_P_IPV6
  *      on output, then just past the IP header, if one is present and
  *      of a correct length, otherwise the same as skb->network_header.
  *      For other key->eth.type values it is left untouched.
+ *
+ *    - skb->protocol: the type of the data starting at skb->network_header.
+ *      Equals to key->eth.type.
  */
 static int key_extract(struct sk_buff *skb, struct sw_flow_key *key)
 {
@@ -505,28 +511,35 @@ static int key_extract(struct sk_buff *skb, struct sw_flow_key *key)
 
 	skb_reset_mac_header(skb);
 
-	/* Link layer.  We are guaranteed to have at least the 14 byte Ethernet
-	 * header in the linear data area.
-	 */
-	eth = eth_hdr(skb);
-	ether_addr_copy(key->eth.src, eth->h_source);
-	ether_addr_copy(key->eth.dst, eth->h_dest);
+	/* Link layer. */
+	clear_vlan(key);
+	if (key->mac_proto == MAC_PROTO_NONE) {
+		if (unlikely(eth_type_vlan(skb->protocol)))
+			return -EINVAL;
 
-	__skb_pull(skb, 2 * ETH_ALEN);
-	/* We are going to push all headers that we pull, so no need to
-	 * update skb->csum here.
-	 */
+		skb_reset_network_header(skb);
+	} else {
+		eth = eth_hdr(skb);
+		ether_addr_copy(key->eth.src, eth->h_source);
+		ether_addr_copy(key->eth.dst, eth->h_dest);
 
-	if (unlikely(parse_vlan(skb, key)))
-		return -ENOMEM;
+		__skb_pull(skb, 2 * ETH_ALEN);
+		/* We are going to push all headers that we pull, so no need to
+		* update skb->csum here.
+		*/
 
-	key->eth.type = parse_ethertype(skb);
-	if (unlikely(key->eth.type == htons(0)))
-		return -ENOMEM;
+		if (unlikely(parse_vlan(skb, key)))
+			return -ENOMEM;
 
-	skb_reset_network_header(skb);
+		skb->protocol = parse_ethertype(skb);
+		if (unlikely(skb->protocol == htons(0)))
+			return -ENOMEM;
+
+		skb_reset_network_header(skb);
+		__skb_push(skb, skb->data - skb_mac_header(skb));
+	}
 	skb_reset_mac_len(skb);
-	__skb_push(skb, skb->data - skb_mac_header(skb));
+	key->eth.type = skb->protocol;
 
 	/* Network layer. */
 	if (key->eth.type == htons(ETH_P_IP)) {
@@ -721,6 +734,20 @@ int ovs_flow_key_update(struct sk_buff *skb, struct sw_flow_key *key)
 	return key_extract(skb, key);
 }
 
+static u8 key_extract_mac_proto(struct sk_buff *skb)
+{
+	switch (skb->dev->type) {
+	case ARPHRD_ETHER:
+		return MAC_PROTO_ETHERNET;
+	case ARPHRD_NONE:
+		if (skb->protocol == htons(ETH_P_TEB))
+			return MAC_PROTO_ETHERNET;
+		return MAC_PROTO_NONE;
+	}
+	WARN_ON_ONCE(1);
+	return MAC_PROTO_ETHERNET;
+}
+
 int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
 			 struct sk_buff *skb, struct sw_flow_key *key)
 {
@@ -751,7 +778,7 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
 	key->phy.skb_mark = skb->mark;
 	ovs_ct_fill_key(skb, key);
 	key->ovs_flow_hash = 0;
-	key->mac_proto = MAC_PROTO_ETHERNET;
+	key->mac_proto = key_extract_mac_proto(skb);
 	key->recirc_id = 0;
 
 	return key_extract(skb, key);
@@ -768,5 +795,29 @@ int ovs_flow_key_extract_userspace(struct net *net, const struct nlattr *attr,
 	if (err)
 		return err;
 
+	if (ovs_key_mac_proto(key) == MAC_PROTO_NONE) {
+		/* key_extract assumes that skb->protocol is set-up for
+		 * layer 3 packets which is the case for other callers,
+		 * in particular packets recieved from the network stack.
+		 * Here the correct value can be set from the metadata
+		 * extracted above.
+		 */
+		skb->protocol = key->eth.type;
+	} else {
+		struct ethhdr *eth;
+
+		skb_reset_mac_header(skb);
+		eth = eth_hdr(skb);
+
+		/* Normally, setting the skb 'protocol' field would be
+		 * handled by a call to eth_type_trans(), but it assumes
+		 * there's a sending device, which we may not have.
+		 */
+		if (eth_proto_is_802_3(eth->h_proto))
+			skb->protocol = eth->h_proto;
+		else
+			skb->protocol = htons(ETH_P_802_2);
+	}
+
 	return key_extract(skb, key);
 }
diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index dc0e2212edfc..8361b62a47c2 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -502,6 +502,22 @@ void ovs_vport_send(struct vport *vport, struct sk_buff *skb, u8 mac_proto)
 {
 	int mtu = vport->dev->mtu;
 
+	switch (vport->dev->type) {
+	case ARPHRD_NONE:
+		if (mac_proto != MAC_PROTO_NONE) {
+			WARN_ON_ONCE(mac_proto != MAC_PROTO_ETHERNET);
+
+			skb_reset_network_header(skb);
+			skb_reset_mac_len(skb);
+			skb->protocol = htons(ETH_P_TEB);
+		}
+		break;
+	case ARPHRD_ETHER:
+		if (mac_proto != MAC_PROTO_ETHERNET)
+			goto drop;
+		break;
+	}
+
 	if (unlikely(packet_length(skb, vport->dev) > mtu &&
 		     !skb_is_gso(skb))) {
 		net_warn_ratelimited("%s: dropped over-mtu packet: %d > %d\n",
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next v12 6/9] openvswitch: netlink: support L3 packets
  2016-10-17 13:02 [PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets Jiri Benc
                   ` (3 preceding siblings ...)
  2016-10-17 13:02 ` [PATCH net-next v12 5/9] openvswitch: add processing of " Jiri Benc
@ 2016-10-17 13:02 ` Jiri Benc
  2016-10-21  4:20   ` Pravin Shelar
  2016-10-17 13:02 ` [PATCH net-next v12 7/9] openvswitch: add Ethernet push and pop actions Jiri Benc
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Jiri Benc @ 2016-10-17 13:02 UTC (permalink / raw)
  To: netdev; +Cc: dev, Pravin Shelar, Lorand Jakab, Simon Horman

Extend the ovs flow netlink protocol to support L3 packets. Packets without
OVS_KEY_ATTR_ETHERNET attribute specify L3 packets; for those, the
OVS_KEY_ATTR_ETHERTYPE attribute is mandatory.

Push/pop vlan actions are only supported for Ethernet packets.

Based on previous versions by Lorand Jakab and Simon Horman.

Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
 net/openvswitch/flow_netlink.c | 160 +++++++++++++++++++++++++----------------
 1 file changed, 99 insertions(+), 61 deletions(-)

diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index ccb9900c5230..c3d0cc4321c3 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -123,7 +123,7 @@ static void update_range(struct sw_flow_match *match,
 static bool match_validate(const struct sw_flow_match *match,
 			   u64 key_attrs, u64 mask_attrs, bool log)
 {
-	u64 key_expected = 1 << OVS_KEY_ATTR_ETHERNET;
+	u64 key_expected = 0;
 	u64 mask_allowed = key_attrs;  /* At most allow all key attributes */
 
 	/* The following mask attributes allowed only if they
@@ -969,10 +969,33 @@ static int parse_vlan_from_nlattrs(struct sw_flow_match *match,
 	return 0;
 }
 
+static int parse_eth_type_from_nlattrs(struct sw_flow_match *match,
+				       u64 *attrs, const struct nlattr **a,
+				       bool is_mask, bool log)
+{
+	__be16 eth_type;
+
+	eth_type = nla_get_be16(a[OVS_KEY_ATTR_ETHERTYPE]);
+	if (is_mask) {
+		/* Always exact match EtherType. */
+		eth_type = htons(0xffff);
+	} else if (!eth_proto_is_802_3(eth_type)) {
+		OVS_NLERR(log, "EtherType %x is less than min %x",
+				ntohs(eth_type), ETH_P_802_3_MIN);
+		return -EINVAL;
+	}
+
+	SW_FLOW_KEY_PUT(match, eth.type, eth_type, is_mask);
+	*attrs &= ~(1 << OVS_KEY_ATTR_ETHERTYPE);
+	return 0;
+}
+
 static int metadata_from_nlattrs(struct net *net, struct sw_flow_match *match,
 				 u64 *attrs, const struct nlattr **a,
 				 bool is_mask, bool log)
 {
+	u8 mac_proto = MAC_PROTO_ETHERNET;
+
 	if (*attrs & (1 << OVS_KEY_ATTR_DP_HASH)) {
 		u32 hash_val = nla_get_u32(a[OVS_KEY_ATTR_DP_HASH]);
 
@@ -1060,9 +1083,19 @@ static int metadata_from_nlattrs(struct net *net, struct sw_flow_match *match,
 		*attrs &= ~(1ULL << OVS_KEY_ATTR_CT_LABELS);
 	}
 
+	/* For layer 3 packets the Ethernet type is provided
+	 * and treated as metadata but no MAC addresses are provided.
+	 */
+	if (!(*attrs & (1ULL << OVS_KEY_ATTR_ETHERNET)) &&
+	    (*attrs & (1ULL << OVS_KEY_ATTR_ETHERTYPE)))
+		mac_proto = MAC_PROTO_NONE;
+
 	/* Always exact match mac_proto */
-	SW_FLOW_KEY_PUT(match, mac_proto, is_mask ? 0xff : MAC_PROTO_ETHERNET,
-			is_mask);
+	SW_FLOW_KEY_PUT(match, mac_proto, is_mask ? 0xff : mac_proto, is_mask);
+
+	if (mac_proto == MAC_PROTO_NONE)
+		return parse_eth_type_from_nlattrs(match, attrs, a, is_mask,
+						   log);
 
 	return 0;
 }
@@ -1086,33 +1119,26 @@ static int ovs_key_from_nlattrs(struct net *net, struct sw_flow_match *match,
 		SW_FLOW_KEY_MEMCPY(match, eth.dst,
 				eth_key->eth_dst, ETH_ALEN, is_mask);
 		attrs &= ~(1 << OVS_KEY_ATTR_ETHERNET);
-	}
-
-	if (attrs & (1 << OVS_KEY_ATTR_VLAN)) {
-		/* VLAN attribute is always parsed before getting here since it
-		 * may occur multiple times.
-		 */
-		OVS_NLERR(log, "VLAN attribute unexpected.");
-		return -EINVAL;
-	}
-
-	if (attrs & (1 << OVS_KEY_ATTR_ETHERTYPE)) {
-		__be16 eth_type;
 
-		eth_type = nla_get_be16(a[OVS_KEY_ATTR_ETHERTYPE]);
-		if (is_mask) {
-			/* Always exact match EtherType. */
-			eth_type = htons(0xffff);
-		} else if (!eth_proto_is_802_3(eth_type)) {
-			OVS_NLERR(log, "EtherType %x is less than min %x",
-				  ntohs(eth_type), ETH_P_802_3_MIN);
+		if (attrs & (1 << OVS_KEY_ATTR_VLAN)) {
+			/* VLAN attribute is always parsed before getting here since it
+			 * may occur multiple times.
+			 */
+			OVS_NLERR(log, "VLAN attribute unexpected.");
 			return -EINVAL;
 		}
 
-		SW_FLOW_KEY_PUT(match, eth.type, eth_type, is_mask);
-		attrs &= ~(1 << OVS_KEY_ATTR_ETHERTYPE);
-	} else if (!is_mask) {
-		SW_FLOW_KEY_PUT(match, eth.type, htons(ETH_P_802_2), is_mask);
+		if (attrs & (1 << OVS_KEY_ATTR_ETHERTYPE)) {
+			err = parse_eth_type_from_nlattrs(match, &attrs, a, is_mask,
+							  log);
+			if (err)
+				return err;
+		} else if (!is_mask) {
+			SW_FLOW_KEY_PUT(match, eth.type, htons(ETH_P_802_2), is_mask);
+		}
+	} else if (!match->key->eth.type) {
+		OVS_NLERR(log, "Either Ethernet header or EtherType is required.");
+		return -EINVAL;
 	}
 
 	if (attrs & (1 << OVS_KEY_ATTR_IPV4)) {
@@ -1561,42 +1587,44 @@ static int __ovs_nla_put_key(const struct sw_flow_key *swkey,
 	if (ovs_ct_put_key(output, skb))
 		goto nla_put_failure;
 
-	nla = nla_reserve(skb, OVS_KEY_ATTR_ETHERNET, sizeof(*eth_key));
-	if (!nla)
-		goto nla_put_failure;
-
-	eth_key = nla_data(nla);
-	ether_addr_copy(eth_key->eth_src, output->eth.src);
-	ether_addr_copy(eth_key->eth_dst, output->eth.dst);
-
-	if (swkey->eth.vlan.tci || eth_type_vlan(swkey->eth.type)) {
-		if (ovs_nla_put_vlan(skb, &output->eth.vlan, is_mask))
+	if (ovs_key_mac_proto(swkey) == MAC_PROTO_ETHERNET) {
+		nla = nla_reserve(skb, OVS_KEY_ATTR_ETHERNET, sizeof(*eth_key));
+		if (!nla)
 			goto nla_put_failure;
-		encap = nla_nest_start(skb, OVS_KEY_ATTR_ENCAP);
-		if (!swkey->eth.vlan.tci)
-			goto unencap;
 
-		if (swkey->eth.cvlan.tci || eth_type_vlan(swkey->eth.type)) {
-			if (ovs_nla_put_vlan(skb, &output->eth.cvlan, is_mask))
+		eth_key = nla_data(nla);
+		ether_addr_copy(eth_key->eth_src, output->eth.src);
+		ether_addr_copy(eth_key->eth_dst, output->eth.dst);
+
+		if (swkey->eth.vlan.tci || eth_type_vlan(swkey->eth.type)) {
+			if (ovs_nla_put_vlan(skb, &output->eth.vlan, is_mask))
 				goto nla_put_failure;
-			in_encap = nla_nest_start(skb, OVS_KEY_ATTR_ENCAP);
-			if (!swkey->eth.cvlan.tci)
+			encap = nla_nest_start(skb, OVS_KEY_ATTR_ENCAP);
+			if (!swkey->eth.vlan.tci)
 				goto unencap;
+
+			if (swkey->eth.cvlan.tci || eth_type_vlan(swkey->eth.type)) {
+				if (ovs_nla_put_vlan(skb, &output->eth.cvlan, is_mask))
+					goto nla_put_failure;
+				in_encap = nla_nest_start(skb, OVS_KEY_ATTR_ENCAP);
+				if (!swkey->eth.cvlan.tci)
+					goto unencap;
+			}
 		}
-	}
 
-	if (swkey->eth.type == htons(ETH_P_802_2)) {
-		/*
-		 * Ethertype 802.2 is represented in the netlink with omitted
-		 * OVS_KEY_ATTR_ETHERTYPE in the flow key attribute, and
-		 * 0xffff in the mask attribute.  Ethertype can also
-		 * be wildcarded.
-		 */
-		if (is_mask && output->eth.type)
-			if (nla_put_be16(skb, OVS_KEY_ATTR_ETHERTYPE,
-						output->eth.type))
-				goto nla_put_failure;
-		goto unencap;
+		if (swkey->eth.type == htons(ETH_P_802_2)) {
+			/*
+			* Ethertype 802.2 is represented in the netlink with omitted
+			* OVS_KEY_ATTR_ETHERTYPE in the flow key attribute, and
+			* 0xffff in the mask attribute.  Ethertype can also
+			* be wildcarded.
+			*/
+			if (is_mask && output->eth.type)
+				if (nla_put_be16(skb, OVS_KEY_ATTR_ETHERTYPE,
+							output->eth.type))
+					goto nla_put_failure;
+			goto unencap;
+		}
 	}
 
 	if (nla_put_be16(skb, OVS_KEY_ATTR_ETHERTYPE, output->eth.type))
@@ -2131,8 +2159,8 @@ static bool validate_masked(u8 *data, int len)
 
 static int validate_set(const struct nlattr *a,
 			const struct sw_flow_key *flow_key,
-			struct sw_flow_actions **sfa,
-			bool *skip_copy, __be16 eth_type, bool masked, bool log)
+			struct sw_flow_actions **sfa, bool *skip_copy,
+			u8 mac_proto, __be16 eth_type, bool masked, bool log)
 {
 	const struct nlattr *ovs_key = nla_data(a);
 	int key_type = nla_type(ovs_key);
@@ -2162,9 +2190,12 @@ static int validate_set(const struct nlattr *a,
 	case OVS_KEY_ATTR_SKB_MARK:
 	case OVS_KEY_ATTR_CT_MARK:
 	case OVS_KEY_ATTR_CT_LABELS:
-	case OVS_KEY_ATTR_ETHERNET:
 		break;
 
+	case OVS_KEY_ATTR_ETHERNET:
+		if (mac_proto != MAC_PROTO_ETHERNET)
+			return -EINVAL;
+
 	case OVS_KEY_ATTR_TUNNEL:
 		if (masked)
 			return -EINVAL; /* Masked tunnel set not supported. */
@@ -2329,6 +2360,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 				  int depth, struct sw_flow_actions **sfa,
 				  __be16 eth_type, __be16 vlan_tci, bool log)
 {
+	u8 mac_proto = ovs_key_mac_proto(key);
 	const struct nlattr *a;
 	int rem, err;
 
@@ -2399,10 +2431,14 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 		}
 
 		case OVS_ACTION_ATTR_POP_VLAN:
+			if (mac_proto != MAC_PROTO_ETHERNET)
+				return -EINVAL;
 			vlan_tci = htons(0);
 			break;
 
 		case OVS_ACTION_ATTR_PUSH_VLAN:
+			if (mac_proto != MAC_PROTO_ETHERNET)
+				return -EINVAL;
 			vlan = nla_data(a);
 			if (!eth_type_vlan(vlan->vlan_tpid))
 				return -EINVAL;
@@ -2452,14 +2488,16 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 
 		case OVS_ACTION_ATTR_SET:
 			err = validate_set(a, key, sfa,
-					   &skip_copy, eth_type, false, log);
+					   &skip_copy, mac_proto, eth_type,
+					   false, log);
 			if (err)
 				return err;
 			break;
 
 		case OVS_ACTION_ATTR_SET_MASKED:
 			err = validate_set(a, key, sfa,
-					   &skip_copy, eth_type, true, log);
+					   &skip_copy, mac_proto, eth_type,
+					   true, log);
 			if (err)
 				return err;
 			break;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next v12 7/9] openvswitch: add Ethernet push and pop actions
  2016-10-17 13:02 [PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets Jiri Benc
                   ` (4 preceding siblings ...)
  2016-10-17 13:02 ` [PATCH net-next v12 6/9] openvswitch: netlink: support " Jiri Benc
@ 2016-10-17 13:02 ` Jiri Benc
  2016-10-21  4:22   ` Pravin Shelar
  2016-10-17 13:02 ` [PATCH net-next v12 8/9] openvswitch: allow L3 netdev ports Jiri Benc
  2016-10-17 13:02 ` [PATCH net-next v12 9/9] openvswitch: use ipgre tunnel rather than gretap tunnel Jiri Benc
  7 siblings, 1 reply; 27+ messages in thread
From: Jiri Benc @ 2016-10-17 13:02 UTC (permalink / raw)
  To: netdev; +Cc: dev, Pravin Shelar, Lorand Jakab, Simon Horman

It's not allowed to push Ethernet header in front of another Ethernet
header.

It's not allowed to pop Ethernet header if there's a vlan tag. This
preserves the invariant that L3 packet never has a vlan tag.

Based on previous versions by Lorand Jakab and Simon Horman.

Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
 include/uapi/linux/openvswitch.h | 15 ++++++++++++
 net/openvswitch/actions.c        | 49 ++++++++++++++++++++++++++++++++++++++++
 net/openvswitch/flow_netlink.c   | 18 +++++++++++++++
 3 files changed, 82 insertions(+)

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index 59ed3992c760..375d812fea36 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -705,6 +705,15 @@ enum ovs_nat_attr {
 
 #define OVS_NAT_ATTR_MAX (__OVS_NAT_ATTR_MAX - 1)
 
+/*
+ * struct ovs_action_push_eth - %OVS_ACTION_ATTR_PUSH_ETH action argument.
+ * @addresses: Source and destination MAC addresses.
+ * @eth_type: Ethernet type
+ */
+struct ovs_action_push_eth {
+	struct ovs_key_ethernet addresses;
+};
+
 /**
  * enum ovs_action_attr - Action types.
  *
@@ -738,6 +747,10 @@ enum ovs_nat_attr {
  * is no MPLS label stack, as determined by ethertype, no action is taken.
  * @OVS_ACTION_ATTR_CT: Track the connection. Populate the conntrack-related
  * entries in the flow key.
+ * @OVS_ACTION_ATTR_PUSH_ETH: Push a new outermost Ethernet header onto the
+ * packet.
+ * @OVS_ACTION_ATTR_POP_ETH: Pop the outermost Ethernet header off the
+ * packet.
  *
  * Only a single header can be set with a single %OVS_ACTION_ATTR_SET.  Not all
  * fields within a header are modifiable, e.g. the IPv4 protocol and fragment
@@ -765,6 +778,8 @@ enum ovs_action_attr {
 				       * bits. */
 	OVS_ACTION_ATTR_CT,           /* Nested OVS_CT_ATTR_* . */
 	OVS_ACTION_ATTR_TRUNC,        /* u32 struct ovs_action_trunc. */
+	OVS_ACTION_ATTR_PUSH_ETH,     /* struct ovs_action_push_eth. */
+	OVS_ACTION_ATTR_POP_ETH,      /* No argument. */
 
 	__OVS_ACTION_ATTR_MAX,	      /* Nothing past this will be accepted
 				       * from userspace. */
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 064cbcb7b0c5..a63572fb878e 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -317,6 +317,47 @@ static int set_eth_addr(struct sk_buff *skb, struct sw_flow_key *flow_key,
 	return 0;
 }
 
+/* pop_eth does not support VLAN packets as this action is never called
+ * for them.
+ */
+static int pop_eth(struct sk_buff *skb, struct sw_flow_key *key)
+{
+	skb_pull_rcsum(skb, ETH_HLEN);
+	skb_reset_mac_header(skb);
+	skb->mac_len -= ETH_HLEN;
+
+	/* safe right before invalidate_flow_key */
+	key->mac_proto = MAC_PROTO_NONE;
+	invalidate_flow_key(key);
+	return 0;
+}
+
+static int push_eth(struct sk_buff *skb, struct sw_flow_key *key,
+		    const struct ovs_action_push_eth *ethh)
+{
+	struct ethhdr *hdr;
+
+	/* Add the new Ethernet header */
+	if (skb_cow_head(skb, ETH_HLEN) < 0)
+		return -ENOMEM;
+
+	skb_push(skb, ETH_HLEN);
+	skb_reset_mac_header(skb);
+	skb->mac_len = ETH_HLEN;
+
+	hdr = eth_hdr(skb);
+	ether_addr_copy(hdr->h_source, ethh->addresses.eth_src);
+	ether_addr_copy(hdr->h_dest, ethh->addresses.eth_dst);
+	hdr->h_proto = skb->protocol;
+
+	skb_postpush_rcsum(skb, hdr, ETH_HLEN);
+
+	/* safe right before invalidate_flow_key */
+	key->mac_proto = MAC_PROTO_ETHERNET;
+	invalidate_flow_key(key);
+	return 0;
+}
+
 static void update_ip_l4_checksum(struct sk_buff *skb, struct iphdr *nh,
 				  __be32 addr, __be32 new_addr)
 {
@@ -1200,6 +1241,14 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
 			if (err)
 				return err == -EINPROGRESS ? 0 : err;
 			break;
+
+		case OVS_ACTION_ATTR_PUSH_ETH:
+			err = push_eth(skb, key, nla_data(a));
+			break;
+
+		case OVS_ACTION_ATTR_POP_ETH:
+			err = pop_eth(skb, key);
+			break;
 		}
 
 		if (unlikely(err)) {
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index c3d0cc4321c3..d19044f2b1f4 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -2383,6 +2383,8 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 			[OVS_ACTION_ATTR_HASH] = sizeof(struct ovs_action_hash),
 			[OVS_ACTION_ATTR_CT] = (u32)-1,
 			[OVS_ACTION_ATTR_TRUNC] = sizeof(struct ovs_action_trunc),
+			[OVS_ACTION_ATTR_PUSH_ETH] = sizeof(struct ovs_action_push_eth),
+			[OVS_ACTION_ATTR_POP_ETH] = 0,
 		};
 		const struct ovs_action_push_vlan *vlan;
 		int type = nla_type(a);
@@ -2517,6 +2519,22 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 			skip_copy = true;
 			break;
 
+		case OVS_ACTION_ATTR_PUSH_ETH:
+			/* Disallow pushing an Ethernet header if one
+			 * is already present */
+			if (mac_proto != MAC_PROTO_NONE)
+				return -EINVAL;
+			mac_proto = MAC_PROTO_NONE;
+			break;
+
+		case OVS_ACTION_ATTR_POP_ETH:
+			if (mac_proto != MAC_PROTO_ETHERNET)
+				return -EINVAL;
+			if (vlan_tci & htons(VLAN_TAG_PRESENT))
+				return -EINVAL;
+			mac_proto = MAC_PROTO_ETHERNET;
+			break;
+
 		default:
 			OVS_NLERR(log, "Unknown Action type %d", type);
 			return -EINVAL;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next v12 8/9] openvswitch: allow L3 netdev ports
  2016-10-17 13:02 [PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets Jiri Benc
                   ` (5 preceding siblings ...)
  2016-10-17 13:02 ` [PATCH net-next v12 7/9] openvswitch: add Ethernet push and pop actions Jiri Benc
@ 2016-10-17 13:02 ` Jiri Benc
  2016-10-21  4:22   ` Pravin Shelar
  2016-10-17 13:02 ` [PATCH net-next v12 9/9] openvswitch: use ipgre tunnel rather than gretap tunnel Jiri Benc
  7 siblings, 1 reply; 27+ messages in thread
From: Jiri Benc @ 2016-10-17 13:02 UTC (permalink / raw)
  To: netdev; +Cc: dev, Pravin Shelar, Lorand Jakab, Simon Horman

Allow ARPHRD_NONE interfaces to be added to ovs bridge.

Based on previous versions by Lorand Jakab and Simon Horman.

Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
 net/openvswitch/vport-netdev.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 4e3972344aa6..4309796bc870 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -57,8 +57,10 @@ static void netdev_port_receive(struct sk_buff *skb)
 	if (unlikely(!skb))
 		return;
 
-	skb_push(skb, ETH_HLEN);
-	skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
+	if (skb->dev->type == ARPHRD_ETHER) {
+		skb_push(skb, ETH_HLEN);
+		skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
+	}
 	ovs_vport_receive(vport, skb, skb_tunnel_info(skb));
 	return;
 error:
@@ -97,7 +99,8 @@ struct vport *ovs_netdev_link(struct vport *vport, const char *name)
 	}
 
 	if (vport->dev->flags & IFF_LOOPBACK ||
-	    vport->dev->type != ARPHRD_ETHER ||
+	    (vport->dev->type != ARPHRD_ETHER &&
+	     vport->dev->type != ARPHRD_NONE) ||
 	    ovs_is_internal_dev(vport->dev)) {
 		err = -EINVAL;
 		goto error_put;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH net-next v12 9/9] openvswitch: use ipgre tunnel rather than gretap tunnel
  2016-10-17 13:02 [PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets Jiri Benc
                   ` (6 preceding siblings ...)
  2016-10-17 13:02 ` [PATCH net-next v12 8/9] openvswitch: allow L3 netdev ports Jiri Benc
@ 2016-10-17 13:02 ` Jiri Benc
       [not found]   ` <4e5bfb307ee1e419f973a100e8acc7556311b64a.1476708213.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  7 siblings, 1 reply; 27+ messages in thread
From: Jiri Benc @ 2016-10-17 13:02 UTC (permalink / raw)
  To: netdev; +Cc: dev, Pravin Shelar, Lorand Jakab, Simon Horman

From: Simon Horman <simon.horman@netronome.com>

This allows GRE tunnels to send and receive both
layer 2 packets (packets with an ethernet header) and
layer 3 packets (packets without an ethernet header).

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
v12: removed the non-gre hunks (now part of previous patches in this
     patchset)
---
 include/net/gre.h           | 4 ++--
 net/ipv4/ip_gre.c           | 9 +++++----
 net/openvswitch/vport-gre.c | 2 +-
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/net/gre.h b/include/net/gre.h
index d25d836c129b..1a0bb1cefa60 100644
--- a/include/net/gre.h
+++ b/include/net/gre.h
@@ -31,8 +31,8 @@ struct gre_protocol {
 int gre_add_protocol(const struct gre_protocol *proto, u8 version);
 int gre_del_protocol(const struct gre_protocol *proto, u8 version);
 
-struct net_device *gretap_fb_dev_create(struct net *net, const char *name,
-				       u8 name_assign_type);
+struct net_device *gre_fb_dev_create(struct net *net, const char *name,
+				     u8 name_assign_type);
 int gre_parse_header(struct sk_buff *skb, struct tnl_ptk_info *tpi,
 		     bool *csum_err, __be16 proto, int nhs);
 
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 576f705d8180..18caea5c6d09 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -1125,8 +1125,8 @@ static int ipgre_fill_info(struct sk_buff *skb, const struct net_device *dev)
 	.get_link_net	= ip_tunnel_get_link_net,
 };
 
-struct net_device *gretap_fb_dev_create(struct net *net, const char *name,
-					u8 name_assign_type)
+struct net_device *gre_fb_dev_create(struct net *net, const char *name,
+				     u8 name_assign_type)
 {
 	struct nlattr *tb[IFLA_MAX + 1];
 	struct net_device *dev;
@@ -1137,13 +1137,14 @@ struct net_device *gretap_fb_dev_create(struct net *net, const char *name,
 	memset(&tb, 0, sizeof(tb));
 
 	dev = rtnl_create_link(net, name, name_assign_type,
-			       &ipgre_tap_ops, tb);
+			       &ipgre_link_ops, tb);
 	if (IS_ERR(dev))
 		return dev;
 
 	/* Configure flow based GRE device. */
 	t = netdev_priv(dev);
 	t->collect_md = true;
+	dev->type = ARPHRD_NONE;
 
 	err = ipgre_newlink(net, dev, tb, NULL);
 	if (err < 0) {
@@ -1168,7 +1169,7 @@ struct net_device *gretap_fb_dev_create(struct net *net, const char *name,
 	unregister_netdevice_many(&list_kill);
 	return ERR_PTR(err);
 }
-EXPORT_SYMBOL_GPL(gretap_fb_dev_create);
+EXPORT_SYMBOL_GPL(gre_fb_dev_create);
 
 static int __net_init ipgre_tap_init_net(struct net *net)
 {
diff --git a/net/openvswitch/vport-gre.c b/net/openvswitch/vport-gre.c
index 0e72d95b0e8f..3fc3014bf924 100644
--- a/net/openvswitch/vport-gre.c
+++ b/net/openvswitch/vport-gre.c
@@ -61,7 +61,7 @@ static struct vport *gre_tnl_create(const struct vport_parms *parms)
 		return vport;
 
 	rtnl_lock();
-	dev = gretap_fb_dev_create(net, parms->name, NET_NAME_USER);
+	dev = gre_fb_dev_create(net, parms->name, NET_NAME_USER);
 	if (IS_ERR(dev)) {
 		rtnl_unlock();
 		ovs_vport_free(vport);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets
       [not found] ` <cover.1476708212.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2016-10-17 13:02   ` [PATCH net-next v12 1/9] openvswitch: use hard_header_len instead of hardcoded ETH_HLEN Jiri Benc
  2016-10-17 13:02   ` [PATCH net-next v12 2/9] openvswitch: add mac_proto field to the flow key Jiri Benc
@ 2016-10-19  5:11   ` Pravin Shelar
  2016-10-19 16:59     ` Jiri Benc
  2 siblings, 1 reply; 27+ messages in thread
From: Pravin Shelar @ 2016-10-19  5:11 UTC (permalink / raw)
  To: Jiri Benc; +Cc: ovs dev, Linux Kernel Network Developers, Simon Horman

On Mon, Oct 17, 2016 at 6:02 AM, Jiri Benc <jbenc@redhat.com> wrote:
> At the core of this patch set is removing the assumption in Open vSwitch
> datapath that all packets have Ethernet header. Support for layer 3 GRE
> tunnels is also added by this patchset.
>
> The implementation relies on the presence of pop_eth and push_eth actions
> in datapath flows to facilitate adding and removing Ethernet headers as
> appropriate. The construction of such flows is left up to user-space.
>
> This series is based on work by Simon Horman, Lorand Jakab, Thomas Morin and
> others. I kept Lorand's and Simon's s-o-b in the patches that are derived
> from v11 to record their authorship of parts of the code. Please let me know
> if you disagree with this.
>
> v12 differs from v11 a lot. The main changes are:
>
> * The patches were restructured and split differently for easier review.
> * They were rebased and adjusted to the current net-next. Especially MPLS
>   handling is different (and easier) thanks to the recent MPLS GSO rework.
> * Several bugs were discovered and fixed. The most notable is fragment
>   handling: header adjustment for ARPHRD_NONE devices on tx needs to be done
>   after refragmentation, not before it. This required significant changes in
>   the patchset. Another one is stricter checking of attributes (match on L2
>   vs. L3 packet) at the kernel level.
> * Instead of is_layer3 bool, a mac_proto field is used. See patch 2. This is
>   a matter of taste and alternate approaches are offered in patch 2
>   description.
>
> There is no change to uAPI since v11. The previously posted patchset for
> Open vSwitch user space works with this submission unmodified.
>

I have not finished the review yet, but most of patches looks good to
me. Can you send userspace patches against latest master so that I can
try the patches with tunnel setup?
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 2/9] openvswitch: add mac_proto field to the flow key
       [not found]     ` <098a6ab19e26cf7b1cc16f57f5fc0cc019ca44d6.1476708212.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-10-19  5:11       ` Pravin Shelar
  0 siblings, 0 replies; 27+ messages in thread
From: Pravin Shelar @ 2016-10-19  5:11 UTC (permalink / raw)
  To: Jiri Benc; +Cc: ovs dev, Linux Kernel Network Developers, Simon Horman

On Mon, Oct 17, 2016 at 6:02 AM, Jiri Benc <jbenc@redhat.com> wrote:
> Use a hole in the structure. We support only Ethernet so far and will add
> a support for L2-less packets shortly. We could use a bool to indicate
> whether the Ethernet header is present or not but the approach with the
> mac_proto field is more generic and occupies the same number of bytes in the
> struct, while allowing later extensibility. It also makes the code in the
> next patches more self explaining.
>
> It would be nice to use ARPHRD_ constants but those are u16 which would be
> waste. Thus define our own constants.
>
> Another upside of this is that we can overload this new field to also denote
> whether the flow key is valid. This has the advantage that on
> refragmentation, we don't have to reparse the packet but can rely on the
> stored eth.type. This is especially important for the next patches in this
> series - instead of adding another branch for L2-less packets before calling
> ovs_fragment, we can just remove all those branches completely.
>
> Signed-off-by: Jiri Benc <jbenc@redhat.com>
> ---
> There are three possible approaches:
>
> (1) The one in this patch.
>
> (2) Use just a one bit flag indicating whether the packet is L3 or Ethernet
>     (similar to the "is_layer3" bool in v11). The code would stay very
>     similar to this patchset, the memory consumption would be the same.
>
> (3) Use value of 14 for MAC_PROTO_ETHERNET. It would simplify things nicely,
>     as ovs_mac_header_len would be identical to ovs_key_mac_proto, saving
>     one comparison. Of course, this would mean that if other L2 protocols
>     are added in the future, they can only have L2 header length different
>     than 14. Sounds hacky, although I kind of like this.
>
> After thinking about pros and cons, I implemented (1). Seems to be most
> clear of the three options. But I'm happy to implement (2) or (3) if it's
> deemed better.

I like approach taken by this patch.

Acked-by: Pravin B Shelar <pshelar@ovn.org>
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 3/9] openvswitch: pass mac_proto to ovs_vport_send
  2016-10-17 13:02 ` [PATCH net-next v12 3/9] openvswitch: pass mac_proto to ovs_vport_send Jiri Benc
@ 2016-10-19  5:11   ` Pravin Shelar
  0 siblings, 0 replies; 27+ messages in thread
From: Pravin Shelar @ 2016-10-19  5:11 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Linux Kernel Network Developers, ovs dev, Lorand Jakab, Simon Horman

On Mon, Oct 17, 2016 at 6:02 AM, Jiri Benc <jbenc@redhat.com> wrote:
> We'll need it to alter packets sent to ARPHRD_NONE interfaces.
>
> Change do_output() to use the actual L2 header size of the packet when
> deciding on the minimum cutlen. The assumption here is that what matters is
> not the output interface hard_header_len but rather the L2 header of the
> particular packet. For example, ARPHRD_NONE tunnels that encapsulate
> Ethernet should get at least the Ethernet header.
>
> Signed-off-by: Jiri Benc <jbenc@redhat.com>

Acked-by: Pravin B Shelar <pshelar@ovn.org>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 4/9] openvswitch: support MPLS push and pop for L3 packets
       [not found]   ` <004e53f40443698cffdaf8f9839649cc4b9101cc.1476708213.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-10-19  5:11     ` Pravin Shelar
  0 siblings, 0 replies; 27+ messages in thread
From: Pravin Shelar @ 2016-10-19  5:11 UTC (permalink / raw)
  To: Jiri Benc; +Cc: ovs dev, Linux Kernel Network Developers, Simon Horman

On Mon, Oct 17, 2016 at 6:02 AM, Jiri Benc <jbenc@redhat.com> wrote:
> Update Ethernet header only if there is one.
>
> Signed-off-by: Jiri Benc <jbenc@redhat.com>

Acked-by: Pravin B Shelar <pshelar@ovn.org>
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 5/9] openvswitch: add processing of L3 packets
  2016-10-17 13:02 ` [PATCH net-next v12 5/9] openvswitch: add processing of " Jiri Benc
@ 2016-10-19  5:13   ` Pravin Shelar
  2016-10-19 16:52     ` Jiri Benc
  2016-10-21  4:19   ` Pravin Shelar
  1 sibling, 1 reply; 27+ messages in thread
From: Pravin Shelar @ 2016-10-19  5:13 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Linux Kernel Network Developers, ovs dev, Lorand Jakab, Simon Horman

On Mon, Oct 17, 2016 at 6:02 AM, Jiri Benc <jbenc@redhat.com> wrote:
> Support receiving, extracting flow key and sending of L3 packets (packets
> without an Ethernet header).
>
> Note that even after this patch, non-Ethernet interfaces are still not
> allowed to be added to bridges. Similarly, netlink interface for sending and
> receiving L3 packets to/from user space is not in place yet.
>
> Based on previous versions by Lorand Jakab and Simon Horman.
>
> Signed-off-by: Lorand Jakab <lojakab@cisco.com>
> Signed-off-by: Simon Horman <simon.horman@netronome.com>
> Signed-off-by: Jiri Benc <jbenc@redhat.com>
> ---
>  net/openvswitch/datapath.c |  17 ++------
>  net/openvswitch/flow.c     | 101 ++++++++++++++++++++++++++++++++++-----------
>  net/openvswitch/vport.c    |  16 +++++++
>  3 files changed, 96 insertions(+), 38 deletions(-)
>
...

> @@ -505,28 +511,35 @@ static int key_extract(struct sk_buff *skb, struct sw_flow_key *key)
>
>         skb_reset_mac_header(skb);
>
> -       /* Link layer.  We are guaranteed to have at least the 14 byte Ethernet
> -        * header in the linear data area.
> -        */
> -       eth = eth_hdr(skb);
> -       ether_addr_copy(key->eth.src, eth->h_source);
> -       ether_addr_copy(key->eth.dst, eth->h_dest);
> +       /* Link layer. */
> +       clear_vlan(key);
> +       if (key->mac_proto == MAC_PROTO_NONE) {
> +               if (unlikely(eth_type_vlan(skb->protocol)))
> +                       return -EINVAL;
>
> -       __skb_pull(skb, 2 * ETH_ALEN);
> -       /* We are going to push all headers that we pull, so no need to
> -        * update skb->csum here.
> -        */
> +               skb_reset_network_header(skb);
> +       } else {
> +               eth = eth_hdr(skb);
> +               ether_addr_copy(key->eth.src, eth->h_source);
> +               ether_addr_copy(key->eth.dst, eth->h_dest);
>
> -       if (unlikely(parse_vlan(skb, key)))
> -               return -ENOMEM;
> +               __skb_pull(skb, 2 * ETH_ALEN);
> +               /* We are going to push all headers that we pull, so no need to
> +               * update skb->csum here.
> +               */
>
> -       key->eth.type = parse_ethertype(skb);
> -       if (unlikely(key->eth.type == htons(0)))
> -               return -ENOMEM;
> +               if (unlikely(parse_vlan(skb, key)))
> +                       return -ENOMEM;
>
> -       skb_reset_network_header(skb);
> +               skb->protocol = parse_ethertype(skb);

I am not sure about changing skb->protocol here.
By changing this skb loosing information about packet type. Therefore
if packet re-enters OVS (through different bridge), this packet would
look like L3 packet. function key_extract_mac_proto() would not see
TEB type packet.

> +               if (unlikely(skb->protocol == htons(0)))
> +                       return -ENOMEM;
> +
> +               skb_reset_network_header(skb);
> +               __skb_push(skb, skb->data - skb_mac_header(skb));
> +       }
>         skb_reset_mac_len(skb);
> -       __skb_push(skb, skb->data - skb_mac_header(skb));
> +       key->eth.type = skb->protocol;
>
>         /* Network layer. */
>         if (key->eth.type == htons(ETH_P_IP)) {
> @@ -721,6 +734,20 @@ int ovs_flow_key_update(struct sk_buff *skb, struct sw_flow_key *key)
>         return key_extract(skb, key);
>  }
>
> +static u8 key_extract_mac_proto(struct sk_buff *skb)
> +{
> +       switch (skb->dev->type) {
> +       case ARPHRD_ETHER:
> +               return MAC_PROTO_ETHERNET;
> +       case ARPHRD_NONE:
> +               if (skb->protocol == htons(ETH_P_TEB))
> +                       return MAC_PROTO_ETHERNET;
> +               return MAC_PROTO_NONE;
> +       }
> +       WARN_ON_ONCE(1);
> +       return MAC_PROTO_ETHERNET;
> +}
> +

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 1/9] openvswitch: use hard_header_len instead of hardcoded ETH_HLEN
  2016-10-17 13:02   ` [PATCH net-next v12 1/9] openvswitch: use hard_header_len instead of hardcoded ETH_HLEN Jiri Benc
@ 2016-10-19  5:13     ` Pravin Shelar
  0 siblings, 0 replies; 27+ messages in thread
From: Pravin Shelar @ 2016-10-19  5:13 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Linux Kernel Network Developers, ovs dev, Lorand Jakab, Simon Horman

On Mon, Oct 17, 2016 at 6:02 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On tx, use hard_header_len while deciding whether to refragment or drop the
> packet. That way, all combinations are calculated correctly:
>
> * L2 packet going to L2 interface (the L2 header len is subtracted),
> * L2 packet going to L3 interface (the L2 header is included in the packet
>   lenght),
> * L3 packet going to L3 interface.
>
> Signed-off-by: Jiri Benc <jbenc@redhat.com>

Acked-by: Pravin B Shelar <pshelar@ovn.org>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 9/9] openvswitch: use ipgre tunnel rather than gretap tunnel
       [not found]   ` <4e5bfb307ee1e419f973a100e8acc7556311b64a.1476708213.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2016-10-19  5:14     ` Pravin Shelar
  2016-10-19 16:58       ` Jiri Benc
  0 siblings, 1 reply; 27+ messages in thread
From: Pravin Shelar @ 2016-10-19  5:14 UTC (permalink / raw)
  To: Jiri Benc; +Cc: ovs dev, Linux Kernel Network Developers, Simon Horman

On Mon, Oct 17, 2016 at 6:02 AM, Jiri Benc <jbenc@redhat.com> wrote:
> From: Simon Horman <simon.horman@netronome.com>
>
> This allows GRE tunnels to send and receive both
> layer 2 packets (packets with an ethernet header) and
> layer 3 packets (packets without an ethernet header).
>
> Signed-off-by: Simon Horman <simon.horman@netronome.com>
> Signed-off-by: Jiri Benc <jbenc@redhat.com>
> ---
> v12: removed the non-gre hunks (now part of previous patches in this
>      patchset)
> ---
>  include/net/gre.h           | 4 ++--
>  net/ipv4/ip_gre.c           | 9 +++++----
>  net/openvswitch/vport-gre.c | 2 +-
>  3 files changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/include/net/gre.h b/include/net/gre.h
> index d25d836c129b..1a0bb1cefa60 100644
> --- a/include/net/gre.h
> +++ b/include/net/gre.h
> @@ -31,8 +31,8 @@ struct gre_protocol {
>  int gre_add_protocol(const struct gre_protocol *proto, u8 version);
>  int gre_del_protocol(const struct gre_protocol *proto, u8 version);
>
> -struct net_device *gretap_fb_dev_create(struct net *net, const char *name,
> -                                      u8 name_assign_type);
> +struct net_device *gre_fb_dev_create(struct net *net, const char *name,
> +                                    u8 name_assign_type);
>  int gre_parse_header(struct sk_buff *skb, struct tnl_ptk_info *tpi,
>                      bool *csum_err, __be16 proto, int nhs);
>
> diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
> index 576f705d8180..18caea5c6d09 100644
> --- a/net/ipv4/ip_gre.c
> +++ b/net/ipv4/ip_gre.c
> @@ -1125,8 +1125,8 @@ static int ipgre_fill_info(struct sk_buff *skb, const struct net_device *dev)
>         .get_link_net   = ip_tunnel_get_link_net,
>  };
>
> -struct net_device *gretap_fb_dev_create(struct net *net, const char *name,
> -                                       u8 name_assign_type)
> +struct net_device *gre_fb_dev_create(struct net *net, const char *name,
> +                                    u8 name_assign_type)
>  {
>         struct nlattr *tb[IFLA_MAX + 1];
>         struct net_device *dev;
> @@ -1137,13 +1137,14 @@ struct net_device *gretap_fb_dev_create(struct net *net, const char *name,
>         memset(&tb, 0, sizeof(tb));
>
>         dev = rtnl_create_link(net, name, name_assign_type,
> -                              &ipgre_tap_ops, tb);
> +                              &ipgre_link_ops, tb);
>         if (IS_ERR(dev))
>                 return dev;
>
>         /* Configure flow based GRE device. */
>         t = netdev_priv(dev);
>         t->collect_md = true;
> +       dev->type = ARPHRD_NONE;
>

This is OVS tunnel compatibility code. We are not suppose to add new
features to compat code. Just provide a way to configure such device
over rtnl.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 5/9] openvswitch: add processing of L3 packets
  2016-10-19  5:13   ` Pravin Shelar
@ 2016-10-19 16:52     ` Jiri Benc
  2016-10-20 18:48       ` Pravin Shelar
  0 siblings, 1 reply; 27+ messages in thread
From: Jiri Benc @ 2016-10-19 16:52 UTC (permalink / raw)
  To: Pravin Shelar
  Cc: Linux Kernel Network Developers, ovs dev, Lorand Jakab, Simon Horman

On Tue, 18 Oct 2016 22:13:45 -0700, Pravin Shelar wrote:
> On Mon, Oct 17, 2016 at 6:02 AM, Jiri Benc <jbenc@redhat.com> wrote:
> > -       skb_reset_network_header(skb);
> > +               skb->protocol = parse_ethertype(skb);
> 
> I am not sure about changing skb->protocol here.
> By changing this skb loosing information about packet type. Therefore
> if packet re-enters OVS (through different bridge), this packet would
> look like L3 packet. function key_extract_mac_proto() would not see
> TEB type packet.

This should be okay. If the packet is sent out to an Ethernet interface
(whatever interface it is), skb->protocol needs to contain the payload
type. We're not interested in ETH_P_TEB. If the packet is sent out to
an ARPHRD_NONE interface, ETH_P_TEB is pushed back.

Basically, what we're doing here is unconditionally converting
ETH_P_TEB packets *coming from ARPHRD_NONE interfaces* (this is
important) into regular Ethernet packets. Which is exactly what we want.

Am I missing something?

 Jiri

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 9/9] openvswitch: use ipgre tunnel rather than gretap tunnel
  2016-10-19  5:14     ` Pravin Shelar
@ 2016-10-19 16:58       ` Jiri Benc
  0 siblings, 0 replies; 27+ messages in thread
From: Jiri Benc @ 2016-10-19 16:58 UTC (permalink / raw)
  To: Pravin Shelar
  Cc: Linux Kernel Network Developers, ovs dev, Lorand Jakab, Simon Horman

On Tue, 18 Oct 2016 22:14:06 -0700, Pravin Shelar wrote:
> This is OVS tunnel compatibility code. We are not suppose to add new
> features to compat code. Just provide a way to configure such device
> over rtnl.

Makes sense. Please consider this a test-only patch for now. I won't
include it in v13.

rtnetlink already supports configuration of such devices, I added that
some time ago. We'll just need to make the user space aware of that.
But at the kernel side, just dropping this patch is enough.

 Jiri

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets
  2016-10-19  5:11   ` [PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets Pravin Shelar
@ 2016-10-19 16:59     ` Jiri Benc
  0 siblings, 0 replies; 27+ messages in thread
From: Jiri Benc @ 2016-10-19 16:59 UTC (permalink / raw)
  To: Pravin Shelar
  Cc: Linux Kernel Network Developers, ovs dev, Lorand Jakab, Simon Horman

On Tue, 18 Oct 2016 22:11:21 -0700, Pravin Shelar wrote:
> I have not finished the review yet, but most of patches looks good to
> me. Can you send userspace patches against latest master so that I can
> try the patches with tunnel setup?

Will do in a few minutes.

Thanks for the review,

 Jiri

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 5/9] openvswitch: add processing of L3 packets
  2016-10-19 16:52     ` Jiri Benc
@ 2016-10-20 18:48       ` Pravin Shelar
  0 siblings, 0 replies; 27+ messages in thread
From: Pravin Shelar @ 2016-10-20 18:48 UTC (permalink / raw)
  To: Jiri Benc; +Cc: ovs dev, Linux Kernel Network Developers, Simon Horman

On Wed, Oct 19, 2016 at 9:52 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Tue, 18 Oct 2016 22:13:45 -0700, Pravin Shelar wrote:
>> On Mon, Oct 17, 2016 at 6:02 AM, Jiri Benc <jbenc@redhat.com> wrote:
>> > -       skb_reset_network_header(skb);
>> > +               skb->protocol = parse_ethertype(skb);
>>
>> I am not sure about changing skb->protocol here.
>> By changing this skb loosing information about packet type. Therefore
>> if packet re-enters OVS (through different bridge), this packet would
>> look like L3 packet. function key_extract_mac_proto() would not see
>> TEB type packet.
>
> This should be okay. If the packet is sent out to an Ethernet interface
> (whatever interface it is), skb->protocol needs to contain the payload
> type. We're not interested in ETH_P_TEB. If the packet is sent out to
> an ARPHRD_NONE interface, ETH_P_TEB is pushed back.
>
I see, vport send is restoring the skb protocol field. It should be fine then.

> Basically, what we're doing here is unconditionally converting
> ETH_P_TEB packets *coming from ARPHRD_NONE interfaces* (this is
> important) into regular Ethernet packets. Which is exactly what we want.
>
> Am I missing something?
>
>  Jiri
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 5/9] openvswitch: add processing of L3 packets
  2016-10-17 13:02 ` [PATCH net-next v12 5/9] openvswitch: add processing of " Jiri Benc
  2016-10-19  5:13   ` Pravin Shelar
@ 2016-10-21  4:19   ` Pravin Shelar
  2016-10-21 11:59     ` Jiri Benc
  1 sibling, 1 reply; 27+ messages in thread
From: Pravin Shelar @ 2016-10-21  4:19 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Linux Kernel Network Developers, ovs dev, Lorand Jakab, Simon Horman

On Mon, Oct 17, 2016 at 6:02 AM, Jiri Benc <jbenc@redhat.com> wrote:
> Support receiving, extracting flow key and sending of L3 packets (packets
> without an Ethernet header).
>
> Note that even after this patch, non-Ethernet interfaces are still not
> allowed to be added to bridges. Similarly, netlink interface for sending and
> receiving L3 packets to/from user space is not in place yet.
>
> Based on previous versions by Lorand Jakab and Simon Horman.
>
> Signed-off-by: Lorand Jakab <lojakab@cisco.com>
> Signed-off-by: Simon Horman <simon.horman@netronome.com>
> Signed-off-by: Jiri Benc <jbenc@redhat.com>
> ---
>  net/openvswitch/datapath.c |  17 ++------
>  net/openvswitch/flow.c     | 101 ++++++++++++++++++++++++++++++++++-----------
>  net/openvswitch/vport.c    |  16 +++++++
>  3 files changed, 96 insertions(+), 38 deletions(-)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index 4d67ea856067..c5b719fca8d4 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
...
...
> @@ -609,8 +597,10 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
>
>         err = ovs_flow_key_extract_userspace(net, a[OVS_PACKET_ATTR_KEY],
>                                              packet, &flow->key, log);
> -       if (err)
> +       if (err) {
> +               packet = NULL;
>                 goto err_flow_free;
> +       }
>
Why packet is set to NULL? This would leak skb memory in case of error.

>         err = ovs_nla_copy_actions(net, a[OVS_PACKET_ATTR_ACTIONS],
>                                    &flow->key, &acts, log);
...
> diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
> index 96c8c4716603..7aee6e273765 100644
> --- a/net/openvswitch/flow.c
> +++ b/net/openvswitch/flow.c
...

> @@ -721,6 +734,20 @@ int ovs_flow_key_update(struct sk_buff *skb, struct sw_flow_key *key)
>         return key_extract(skb, key);
>  }
>
> +static u8 key_extract_mac_proto(struct sk_buff *skb)
> +{
> +       switch (skb->dev->type) {
> +       case ARPHRD_ETHER:
> +               return MAC_PROTO_ETHERNET;
> +       case ARPHRD_NONE:
> +               if (skb->protocol == htons(ETH_P_TEB))
> +                       return MAC_PROTO_ETHERNET;
> +               return MAC_PROTO_NONE;
> +       }
> +       WARN_ON_ONCE(1);
> +       return MAC_PROTO_ETHERNET;
This packet should be dropped.

> +}
> +
>  int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
>                          struct sk_buff *skb, struct sw_flow_key *key)
>  {
....

> diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
> index dc0e2212edfc..8361b62a47c2 100644
> --- a/net/openvswitch/vport.c
> +++ b/net/openvswitch/vport.c
> @@ -502,6 +502,22 @@ void ovs_vport_send(struct vport *vport, struct sk_buff *skb, u8 mac_proto)
>  {
>         int mtu = vport->dev->mtu;
>
> +       switch (vport->dev->type) {
> +       case ARPHRD_NONE:
> +               if (mac_proto != MAC_PROTO_NONE) {
> +                       WARN_ON_ONCE(mac_proto != MAC_PROTO_ETHERNET);
> +
It would be easy to read if you check mac_proto for MAC_PROTO_ETHERNET
and then warn otherwise.
another issue is the packet is not dropped if mac_proto is not
MAC_PROTO_ETHERNET

> +                       skb_reset_network_header(skb);
> +                       skb_reset_mac_len(skb);
> +                       skb->protocol = htons(ETH_P_TEB);
> +               }
> +               break;
> +       case ARPHRD_ETHER:
> +               if (mac_proto != MAC_PROTO_ETHERNET)
> +                       goto drop;
> +               break;
> +       }
> +
>         if (unlikely(packet_length(skb, vport->dev) > mtu &&
>                      !skb_is_gso(skb))) {
>                 net_warn_ratelimited("%s: dropped over-mtu packet: %d > %d\n",
> --
> 1.8.3.1
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 6/9] openvswitch: netlink: support L3 packets
  2016-10-17 13:02 ` [PATCH net-next v12 6/9] openvswitch: netlink: support " Jiri Benc
@ 2016-10-21  4:20   ` Pravin Shelar
  0 siblings, 0 replies; 27+ messages in thread
From: Pravin Shelar @ 2016-10-21  4:20 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Linux Kernel Network Developers, ovs dev, Lorand Jakab, Simon Horman

On Mon, Oct 17, 2016 at 6:02 AM, Jiri Benc <jbenc@redhat.com> wrote:
> Extend the ovs flow netlink protocol to support L3 packets. Packets without
> OVS_KEY_ATTR_ETHERNET attribute specify L3 packets; for those, the
> OVS_KEY_ATTR_ETHERTYPE attribute is mandatory.
>
> Push/pop vlan actions are only supported for Ethernet packets.
>
> Based on previous versions by Lorand Jakab and Simon Horman.
>
> Signed-off-by: Lorand Jakab <lojakab@cisco.com>
> Signed-off-by: Simon Horman <simon.horman@netronome.com>
> Signed-off-by: Jiri Benc <jbenc@redhat.com>

Acked-by: Pravin B Shelar <pshelar@ovn.org>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 7/9] openvswitch: add Ethernet push and pop actions
  2016-10-17 13:02 ` [PATCH net-next v12 7/9] openvswitch: add Ethernet push and pop actions Jiri Benc
@ 2016-10-21  4:22   ` Pravin Shelar
  2016-10-21 12:21     ` Jiri Benc
  0 siblings, 1 reply; 27+ messages in thread
From: Pravin Shelar @ 2016-10-21  4:22 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Linux Kernel Network Developers, ovs dev, Lorand Jakab, Simon Horman

On Mon, Oct 17, 2016 at 6:02 AM, Jiri Benc <jbenc@redhat.com> wrote:
> It's not allowed to push Ethernet header in front of another Ethernet
> header.
>
> It's not allowed to pop Ethernet header if there's a vlan tag. This
> preserves the invariant that L3 packet never has a vlan tag.
>
> Based on previous versions by Lorand Jakab and Simon Horman.
>
> Signed-off-by: Lorand Jakab <lojakab@cisco.com>
> Signed-off-by: Simon Horman <simon.horman@netronome.com>
> Signed-off-by: Jiri Benc <jbenc@redhat.com>
> ---
>  include/uapi/linux/openvswitch.h | 15 ++++++++++++
>  net/openvswitch/actions.c        | 49 ++++++++++++++++++++++++++++++++++++++++
>  net/openvswitch/flow_netlink.c   | 18 +++++++++++++++
>  3 files changed, 82 insertions(+)
>
...
...
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 064cbcb7b0c5..a63572fb878e 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -317,6 +317,47 @@ static int set_eth_addr(struct sk_buff *skb, struct sw_flow_key *flow_key,
>         return 0;
>  }
>
> +/* pop_eth does not support VLAN packets as this action is never called
> + * for them.
> + */
> +static int pop_eth(struct sk_buff *skb, struct sw_flow_key *key)
> +{
> +       skb_pull_rcsum(skb, ETH_HLEN);
> +       skb_reset_mac_header(skb);
> +       skb->mac_len -= ETH_HLEN;
> +
> +       /* safe right before invalidate_flow_key */
> +       key->mac_proto = MAC_PROTO_NONE;
> +       invalidate_flow_key(key);
> +       return 0;
> +}
> +
> +static int push_eth(struct sk_buff *skb, struct sw_flow_key *key,
> +                   const struct ovs_action_push_eth *ethh)
> +{
> +       struct ethhdr *hdr;
> +
> +       /* Add the new Ethernet header */
> +       if (skb_cow_head(skb, ETH_HLEN) < 0)
> +               return -ENOMEM;
> +
> +       skb_push(skb, ETH_HLEN);
> +       skb_reset_mac_header(skb);
> +       skb->mac_len = ETH_HLEN;
> +

The eth pop substracts ETH_HLEN but here the length is set. I think it
should be consistent with respect to eth-pop.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 8/9] openvswitch: allow L3 netdev ports
  2016-10-17 13:02 ` [PATCH net-next v12 8/9] openvswitch: allow L3 netdev ports Jiri Benc
@ 2016-10-21  4:22   ` Pravin Shelar
  0 siblings, 0 replies; 27+ messages in thread
From: Pravin Shelar @ 2016-10-21  4:22 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Linux Kernel Network Developers, ovs dev, Lorand Jakab, Simon Horman

On Mon, Oct 17, 2016 at 6:02 AM, Jiri Benc <jbenc@redhat.com> wrote:
> Allow ARPHRD_NONE interfaces to be added to ovs bridge.
>
> Based on previous versions by Lorand Jakab and Simon Horman.
>
> Signed-off-by: Lorand Jakab <lojakab@cisco.com>
> Signed-off-by: Simon Horman <simon.horman@netronome.com>
> Signed-off-by: Jiri Benc <jbenc@redhat.com>

Acked-by: Pravin B Shelar <pshelar@ovn.org>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 5/9] openvswitch: add processing of L3 packets
  2016-10-21  4:19   ` Pravin Shelar
@ 2016-10-21 11:59     ` Jiri Benc
  0 siblings, 0 replies; 27+ messages in thread
From: Jiri Benc @ 2016-10-21 11:59 UTC (permalink / raw)
  To: Pravin Shelar
  Cc: Linux Kernel Network Developers, ovs dev, Lorand Jakab, Simon Horman

On Thu, 20 Oct 2016 21:19:14 -0700, Pravin Shelar wrote:
> On Mon, Oct 17, 2016 at 6:02 AM, Jiri Benc <jbenc@redhat.com> wrote:
> > @@ -609,8 +597,10 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
> >
> >         err = ovs_flow_key_extract_userspace(net, a[OVS_PACKET_ATTR_KEY],
> >                                              packet, &flow->key, log);
> > -       if (err)
> > +       if (err) {
> > +               packet = NULL;
> >                 goto err_flow_free;
> > +       }
> >
> Why packet is set to NULL? This would leak skb memory in case of error.

Leftover from the times when this was based on top of the patchset that
unified vlan handling. Thanks for catching it.

> > +static u8 key_extract_mac_proto(struct sk_buff *skb)
> > +{
> > +       switch (skb->dev->type) {
> > +       case ARPHRD_ETHER:
> > +               return MAC_PROTO_ETHERNET;
> > +       case ARPHRD_NONE:
> > +               if (skb->protocol == htons(ETH_P_TEB))
> > +                       return MAC_PROTO_ETHERNET;
> > +               return MAC_PROTO_NONE;
> > +       }
> > +       WARN_ON_ONCE(1);
> > +       return MAC_PROTO_ETHERNET;
> This packet should be dropped.

This should never happen, I put WARN_ON_ONCE there to catch impossible
to happen bugs (and BUG_ON is really too strong here). But sure, why
not, will drop the packet in v13.

> > +       switch (vport->dev->type) {
> > +       case ARPHRD_NONE:
> > +               if (mac_proto != MAC_PROTO_NONE) {
> > +                       WARN_ON_ONCE(mac_proto != MAC_PROTO_ETHERNET);
> > +
> It would be easy to read if you check mac_proto for MAC_PROTO_ETHERNET
> and then warn otherwise.

Okay.

> another issue is the packet is not dropped if mac_proto is not
> MAC_PROTO_ETHERNET

Likewise, impossible to happen, but will drop the packet.

Thanks,

 Jiri

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH net-next v12 7/9] openvswitch: add Ethernet push and pop actions
  2016-10-21  4:22   ` Pravin Shelar
@ 2016-10-21 12:21     ` Jiri Benc
  0 siblings, 0 replies; 27+ messages in thread
From: Jiri Benc @ 2016-10-21 12:21 UTC (permalink / raw)
  To: Pravin Shelar
  Cc: Linux Kernel Network Developers, ovs dev, Lorand Jakab, Simon Horman

On Thu, 20 Oct 2016 21:22:21 -0700, Pravin Shelar wrote:
> The eth pop substracts ETH_HLEN but here the length is set. I think it
> should be consistent with respect to eth-pop.

Agreed. Will use skb_reset_mac_len in both, that better reflects what's
going on.

Thanks,

 Jiri

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2016-10-21 12:21 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-17 13:02 [PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets Jiri Benc
     [not found] ` <cover.1476708212.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-17 13:02   ` [PATCH net-next v12 1/9] openvswitch: use hard_header_len instead of hardcoded ETH_HLEN Jiri Benc
2016-10-19  5:13     ` Pravin Shelar
2016-10-17 13:02   ` [PATCH net-next v12 2/9] openvswitch: add mac_proto field to the flow key Jiri Benc
     [not found]     ` <098a6ab19e26cf7b1cc16f57f5fc0cc019ca44d6.1476708212.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-19  5:11       ` Pravin Shelar
2016-10-19  5:11   ` [PATCH net-next v12 0/9] openvswitch: support for layer 3 encapsulated packets Pravin Shelar
2016-10-19 16:59     ` Jiri Benc
2016-10-17 13:02 ` [PATCH net-next v12 3/9] openvswitch: pass mac_proto to ovs_vport_send Jiri Benc
2016-10-19  5:11   ` Pravin Shelar
2016-10-17 13:02 ` [PATCH net-next v12 4/9] openvswitch: support MPLS push and pop for L3 packets Jiri Benc
     [not found]   ` <004e53f40443698cffdaf8f9839649cc4b9101cc.1476708213.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-19  5:11     ` Pravin Shelar
2016-10-17 13:02 ` [PATCH net-next v12 5/9] openvswitch: add processing of " Jiri Benc
2016-10-19  5:13   ` Pravin Shelar
2016-10-19 16:52     ` Jiri Benc
2016-10-20 18:48       ` Pravin Shelar
2016-10-21  4:19   ` Pravin Shelar
2016-10-21 11:59     ` Jiri Benc
2016-10-17 13:02 ` [PATCH net-next v12 6/9] openvswitch: netlink: support " Jiri Benc
2016-10-21  4:20   ` Pravin Shelar
2016-10-17 13:02 ` [PATCH net-next v12 7/9] openvswitch: add Ethernet push and pop actions Jiri Benc
2016-10-21  4:22   ` Pravin Shelar
2016-10-21 12:21     ` Jiri Benc
2016-10-17 13:02 ` [PATCH net-next v12 8/9] openvswitch: allow L3 netdev ports Jiri Benc
2016-10-21  4:22   ` Pravin Shelar
2016-10-17 13:02 ` [PATCH net-next v12 9/9] openvswitch: use ipgre tunnel rather than gretap tunnel Jiri Benc
     [not found]   ` <4e5bfb307ee1e419f973a100e8acc7556311b64a.1476708213.git.jbenc-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-19  5:14     ` Pravin Shelar
2016-10-19 16:58       ` Jiri Benc

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.