All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v2 0/7] vxlan: General improvements.
@ 2016-11-05 18:45 Pravin B Shelar
  2016-11-05 18:45 ` [PATCH net-next v2 1/7] vxlan: avoid vlan processing in vxlan device Pravin B Shelar
                   ` (6 more replies)
  0 siblings, 7 replies; 31+ messages in thread
From: Pravin B Shelar @ 2016-11-05 18:45 UTC (permalink / raw)
  To: netdev; +Cc: Pravin B Shelar

Following patch series improves vxlan fast path, removes
duplicate code and simplifies vxlan xmit code path.

v1-v2:
Fix compilation error when IPv6 support is not enabled.

Pravin B Shelar (7):
  vxlan: avoid vlan processing in vxlan device.
  vxlan: simplify exception handling
  vxlan: avoid checking socket multiple times.
  vxlan: improve vxlan route lookup checks.
  vxlan: simplify RTF_LOCAL handling.
  vxlan: simplify vxlan xmit
  vxlan: remove unsed vxlan_dev_dst_port()

 drivers/net/vxlan.c     | 275 +++++++++++++++++++++++-------------------------
 include/linux/if_vlan.h |  16 ---
 include/net/vxlan.h     |  10 --
 3 files changed, 133 insertions(+), 168 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH net-next v2 1/7] vxlan: avoid vlan processing in vxlan device.
  2016-11-05 18:45 [PATCH net-next v2 0/7] vxlan: General improvements Pravin B Shelar
@ 2016-11-05 18:45 ` Pravin B Shelar
  2016-11-06  9:57   ` Sergei Shtylyov
  2016-11-09 16:04   ` Jiri Benc
  2016-11-05 18:45 ` [PATCH net-next v2 2/7] vxlan: simplify exception handling Pravin B Shelar
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 31+ messages in thread
From: Pravin B Shelar @ 2016-11-05 18:45 UTC (permalink / raw)
  To: netdev; +Cc: Pravin B Shelar

VxLan device does not have special handling for vlan taging on egress.
Therefore it does not make sense to expose vlan offloading feature.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
 drivers/net/vxlan.c     |  9 +--------
 include/linux/if_vlan.h | 16 ----------------
 2 files changed, 1 insertion(+), 24 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index cb5cc7c..756d826 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1748,18 +1748,13 @@ static int vxlan_build_skb(struct sk_buff *skb, struct dst_entry *dst,
 	}
 
 	min_headroom = LL_RESERVED_SPACE(dst->dev) + dst->header_len
-			+ VXLAN_HLEN + iphdr_len
-			+ (skb_vlan_tag_present(skb) ? VLAN_HLEN : 0);
+			+ VXLAN_HLEN + iphdr_len;
 
 	/* Need space for new headers (invalidates iph ptr) */
 	err = skb_cow_head(skb, min_headroom);
 	if (unlikely(err))
 		goto out_free;
 
-	skb = vlan_hwaccel_push_inside(skb);
-	if (WARN_ON(!skb))
-		return -ENOMEM;
-
 	err = iptunnel_handle_offloads(skb, type);
 	if (err)
 		goto out_free;
@@ -2527,10 +2522,8 @@ static void vxlan_setup(struct net_device *dev)
 	dev->features   |= NETIF_F_GSO_SOFTWARE;
 
 	dev->vlan_features = dev->features;
-	dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
 	dev->hw_features |= NETIF_F_SG | NETIF_F_HW_CSUM | NETIF_F_RXCSUM;
 	dev->hw_features |= NETIF_F_GSO_SOFTWARE;
-	dev->hw_features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
 	netif_keep_dst(dev);
 	dev->priv_flags |= IFF_NO_QUEUE;
 
diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 3319d97..8d5fcd6 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -399,22 +399,6 @@ static inline struct sk_buff *__vlan_hwaccel_push_inside(struct sk_buff *skb)
 		skb->vlan_tci = 0;
 	return skb;
 }
-/*
- * vlan_hwaccel_push_inside - pushes vlan tag to the payload
- * @skb: skbuff to tag
- *
- * Checks is tag is present in @skb->vlan_tci and if it is, it pushes the
- * VLAN tag from @skb->vlan_tci inside to the payload.
- *
- * Following the skb_unshare() example, in case of error, the calling function
- * doesn't have to worry about freeing the original skb.
- */
-static inline struct sk_buff *vlan_hwaccel_push_inside(struct sk_buff *skb)
-{
-	if (skb_vlan_tag_present(skb))
-		skb = __vlan_hwaccel_push_inside(skb);
-	return skb;
-}
 
 /**
  * __vlan_hwaccel_put_tag - hardware accelerated VLAN inserting
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next v2 2/7] vxlan: simplify exception handling
  2016-11-05 18:45 [PATCH net-next v2 0/7] vxlan: General improvements Pravin B Shelar
  2016-11-05 18:45 ` [PATCH net-next v2 1/7] vxlan: avoid vlan processing in vxlan device Pravin B Shelar
@ 2016-11-05 18:45 ` Pravin B Shelar
  2016-11-09 16:10   ` Jiri Benc
  2016-11-05 18:45 ` [PATCH net-next v2 3/7] vxlan: avoid checking socket multiple times Pravin B Shelar
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Pravin B Shelar @ 2016-11-05 18:45 UTC (permalink / raw)
  To: netdev; +Cc: Pravin B Shelar

vxlan egress path error handling has became complicated, it
need to handle IPv4 and IPv6 tunnel cases.
Earlier patch removes vlan handling from vxlan_build_skb(), so
vxlan_build_skb does not need to free skb and we can simplify
the xmit path by having single error handling for both type of
tunnels.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
 drivers/net/vxlan.c | 28 +++++++++++-----------------
 1 file changed, 11 insertions(+), 17 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 756d826..a1e707f 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1789,7 +1789,7 @@ static int vxlan_build_skb(struct sk_buff *skb, struct dst_entry *dst,
 	return 0;
 
 out_free:
-	kfree_skb(skb);
+	dst_release(dst);
 	return err;
 }
 
@@ -1927,7 +1927,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 	struct ip_tunnel_info *info;
 	struct vxlan_dev *vxlan = netdev_priv(dev);
 	struct sock *sk;
-	struct rtable *rt = NULL;
 	const struct iphdr *old_iph;
 	union vxlan_addr *dst;
 	union vxlan_addr remote_ip, local_ip;
@@ -2009,6 +2008,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 
 	if (dst->sa.sa_family == AF_INET) {
 		struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock);
+		struct rtable *rt;
 
 		if (!sock4)
 			goto drop;
@@ -2030,7 +2030,8 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 			netdev_dbg(dev, "circular route to %pI4\n",
 				   &dst->sin.sin_addr.s_addr);
 			dev->stats.collisions++;
-			goto rt_tx_error;
+			ip_rt_put(rt);
+			goto tx_error;
 		}
 
 		/* Bypass encapsulation if the destination is local */
@@ -2058,7 +2059,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		err = vxlan_build_skb(skb, &rt->dst, sizeof(struct iphdr),
 				      vni, md, flags, udp_sum);
 		if (err < 0)
-			goto xmit_tx_error;
+			goto tx_error;
 
 		udp_tunnel_xmit_skb(rt, sk, skb, src->sin.sin_addr.s_addr,
 				    dst->sin.sin_addr.s_addr, tos, ttl, df,
@@ -2117,11 +2118,9 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		skb_scrub_packet(skb, xnet);
 		err = vxlan_build_skb(skb, ndst, sizeof(struct ipv6hdr),
 				      vni, md, flags, udp_sum);
-		if (err < 0) {
-			dst_release(ndst);
-			dev->stats.tx_errors++;
-			return;
-		}
+		if (err < 0)
+			goto tx_error;
+
 		udp_tunnel6_xmit_skb(ndst, sk, skb, dev,
 				     &src->sin6.sin6_addr,
 				     &dst->sin6.sin6_addr, tos, ttl,
@@ -2133,17 +2132,12 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 
 drop:
 	dev->stats.tx_dropped++;
-	goto tx_free;
+	dev_kfree_skb(skb);
+	return;
 
-xmit_tx_error:
-	/* skb is already freed. */
-	skb = NULL;
-rt_tx_error:
-	ip_rt_put(rt);
 tx_error:
 	dev->stats.tx_errors++;
-tx_free:
-	dev_kfree_skb(skb);
+	kfree_skb(skb);
 }
 
 /* Transmit local packets over Vxlan
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next v2 3/7] vxlan: avoid checking socket multiple times.
  2016-11-05 18:45 [PATCH net-next v2 0/7] vxlan: General improvements Pravin B Shelar
  2016-11-05 18:45 ` [PATCH net-next v2 1/7] vxlan: avoid vlan processing in vxlan device Pravin B Shelar
  2016-11-05 18:45 ` [PATCH net-next v2 2/7] vxlan: simplify exception handling Pravin B Shelar
@ 2016-11-05 18:45 ` Pravin B Shelar
  2016-11-09 16:16   ` Jiri Benc
  2016-11-09 16:34   ` Jiri Benc
  2016-11-05 18:45 ` [PATCH net-next v2 4/7] vxlan: improve vxlan route lookup checks Pravin B Shelar
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 31+ messages in thread
From: Pravin B Shelar @ 2016-11-05 18:45 UTC (permalink / raw)
  To: netdev; +Cc: Pravin B Shelar

Check the vxlan socket in vxlan6_getroute().

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
 drivers/net/vxlan.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index a1e707f..bd17ab5 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -944,7 +944,9 @@ static bool vxlan_group_used(struct vxlan_net *vn, struct vxlan_dev *dev)
 {
 	struct vxlan_dev *vxlan;
 	struct vxlan_sock *sock4;
-	struct vxlan_sock *sock6 = NULL;
+#if IS_ENABLED(CONFIG_IPV6)
+	struct vxlan_sock *sock6;
+#endif
 	unsigned short family = dev->default_dst.remote_ip.sa.sa_family;
 
 	sock4 = rtnl_dereference(dev->vn4_sock);
@@ -1830,6 +1832,7 @@ static struct rtable *vxlan_get_route(struct vxlan_dev *vxlan,
 
 #if IS_ENABLED(CONFIG_IPV6)
 static struct dst_entry *vxlan6_get_route(struct vxlan_dev *vxlan,
+					  struct vxlan_sock *sock6,
 					  struct sk_buff *skb, int oif, u8 tos,
 					  __be32 label,
 					  const struct in6_addr *daddr,
@@ -1837,7 +1840,6 @@ static struct dst_entry *vxlan6_get_route(struct vxlan_dev *vxlan,
 					  struct dst_cache *dst_cache,
 					  const struct ip_tunnel_info *info)
 {
-	struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock);
 	bool use_cache = ip_tunnel_dst_cache_usable(skb, info);
 	struct dst_entry *ndst;
 	struct flowi6 fl6;
@@ -2070,11 +2072,9 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		struct dst_entry *ndst;
 		u32 rt6i_flags;
 
-		if (!sock6)
-			goto drop;
 		sk = sock6->sock->sk;
 
-		ndst = vxlan6_get_route(vxlan, skb,
+		ndst = vxlan6_get_route(vxlan, sock6, skb,
 					rdst ? rdst->remote_ifindex : 0, tos,
 					label, &dst->sin6.sin6_addr,
 					&src->sin6.sin6_addr,
@@ -2426,9 +2426,10 @@ static int vxlan_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
 		ip_rt_put(rt);
 	} else {
 #if IS_ENABLED(CONFIG_IPV6)
+		struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock);
 		struct dst_entry *ndst;
 
-		ndst = vxlan6_get_route(vxlan, skb, 0, info->key.tos,
+		ndst = vxlan6_get_route(vxlan, sock6, skb, 0, info->key.tos,
 					info->key.label, &info->key.u.ipv6.dst,
 					&info->key.u.ipv6.src, NULL, info);
 		if (IS_ERR(ndst))
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next v2 4/7] vxlan: improve vxlan route lookup checks.
  2016-11-05 18:45 [PATCH net-next v2 0/7] vxlan: General improvements Pravin B Shelar
                   ` (2 preceding siblings ...)
  2016-11-05 18:45 ` [PATCH net-next v2 3/7] vxlan: avoid checking socket multiple times Pravin B Shelar
@ 2016-11-05 18:45 ` Pravin B Shelar
  2016-11-09 16:41   ` Jiri Benc
  2016-11-05 18:45 ` [PATCH net-next v2 5/7] vxlan: simplify RTF_LOCAL handling Pravin B Shelar
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Pravin B Shelar @ 2016-11-05 18:45 UTC (permalink / raw)
  To: netdev; +Cc: Pravin B Shelar

Move route sanity check to respective vxlan[4/6]_get_route functions.
This allows us to perform all sanity checks before caching the dst so
that we can avoid these checks on subsequent packets.
This give move accurate metadata information for packet from
fill_metadata_dst().

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
 drivers/net/vxlan.c | 71 +++++++++++++++++++++++++----------------------------
 1 file changed, 34 insertions(+), 37 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index bd17ab5..f106178 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1795,7 +1795,8 @@ static int vxlan_build_skb(struct sk_buff *skb, struct dst_entry *dst,
 	return err;
 }
 
-static struct rtable *vxlan_get_route(struct vxlan_dev *vxlan,
+static struct rtable *vxlan_get_route(struct vxlan_dev *vxlan, struct net_device *dev,
+				      struct vxlan_sock *sock4,
 				      struct sk_buff *skb, int oif, u8 tos,
 				      __be32 daddr, __be32 *saddr,
 				      struct dst_cache *dst_cache,
@@ -1805,6 +1806,9 @@ static struct rtable *vxlan_get_route(struct vxlan_dev *vxlan,
 	struct rtable *rt = NULL;
 	struct flowi4 fl4;
 
+	if (!sock4)
+		return ERR_PTR(-EIO);
+
 	if (tos && !info)
 		use_cache = false;
 	if (use_cache) {
@@ -1822,16 +1826,27 @@ static struct rtable *vxlan_get_route(struct vxlan_dev *vxlan,
 	fl4.saddr = *saddr;
 
 	rt = ip_route_output_key(vxlan->net, &fl4);
-	if (!IS_ERR(rt)) {
+	if (likely(!IS_ERR(rt))) {
+		if (rt->dst.dev == dev) {
+			netdev_dbg(dev, "circular route to %pI4\n", &daddr);
+			dev->stats.collisions++;
+			ip_rt_put(rt);
+			return ERR_PTR(-ELOOP);
+		}
+
 		*saddr = fl4.saddr;
 		if (use_cache)
 			dst_cache_set_ip4(dst_cache, &rt->dst, fl4.saddr);
+	} else {
+		netdev_dbg(dev, "no route to %pI4\n", &daddr);
+		dev->stats.tx_carrier_errors++;
 	}
 	return rt;
 }
 
 #if IS_ENABLED(CONFIG_IPV6)
 static struct dst_entry *vxlan6_get_route(struct vxlan_dev *vxlan,
+					  struct net_device *dev,
 					  struct vxlan_sock *sock6,
 					  struct sk_buff *skb, int oif, u8 tos,
 					  __be32 label,
@@ -1867,8 +1882,18 @@ static struct dst_entry *vxlan6_get_route(struct vxlan_dev *vxlan,
 	err = ipv6_stub->ipv6_dst_lookup(vxlan->net,
 					 sock6->sock->sk,
 					 &ndst, &fl6);
-	if (err < 0)
+	if (unlikely(err < 0)) {
+		netdev_dbg(dev, "no route to %pI6\n", daddr);
+		dev->stats.tx_carrier_errors++;
 		return ERR_PTR(err);
+	}
+
+	if (unlikely(ndst->dev == dev)) {
+		netdev_dbg(dev, "circular route to %pI6\n", daddr);
+		dst_release(ndst);
+		dev->stats.collisions++;
+		return ERR_PTR(-ELOOP);
+	}
 
 	*saddr = fl6.saddr;
 	if (use_cache)
@@ -2012,29 +2037,15 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock);
 		struct rtable *rt;
 
-		if (!sock4)
-			goto drop;
 		sk = sock4->sock->sk;
 
-		rt = vxlan_get_route(vxlan, skb,
+		rt = vxlan_get_route(vxlan, dev, sock4, skb,
 				     rdst ? rdst->remote_ifindex : 0, tos,
 				     dst->sin.sin_addr.s_addr,
 				     &src->sin.sin_addr.s_addr,
 				     dst_cache, info);
-		if (IS_ERR(rt)) {
-			netdev_dbg(dev, "no route to %pI4\n",
-				   &dst->sin.sin_addr.s_addr);
-			dev->stats.tx_carrier_errors++;
-			goto tx_error;
-		}
-
-		if (rt->dst.dev == dev) {
-			netdev_dbg(dev, "circular route to %pI4\n",
-				   &dst->sin.sin_addr.s_addr);
-			dev->stats.collisions++;
-			ip_rt_put(rt);
+		if (IS_ERR(rt))
 			goto tx_error;
-		}
 
 		/* Bypass encapsulation if the destination is local */
 		if (!info && rt->rt_flags & RTCF_LOCAL &&
@@ -2074,25 +2085,13 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 
 		sk = sock6->sock->sk;
 
-		ndst = vxlan6_get_route(vxlan, sock6, skb,
+		ndst = vxlan6_get_route(vxlan, dev, sock6, skb,
 					rdst ? rdst->remote_ifindex : 0, tos,
 					label, &dst->sin6.sin6_addr,
 					&src->sin6.sin6_addr,
 					dst_cache, info);
-		if (IS_ERR(ndst)) {
-			netdev_dbg(dev, "no route to %pI6\n",
-				   &dst->sin6.sin6_addr);
-			dev->stats.tx_carrier_errors++;
-			goto tx_error;
-		}
-
-		if (ndst->dev == dev) {
-			netdev_dbg(dev, "circular route to %pI6\n",
-				   &dst->sin6.sin6_addr);
-			dst_release(ndst);
-			dev->stats.collisions++;
+		if (IS_ERR(ndst))
 			goto tx_error;
-		}
 
 		/* Bypass encapsulation if the destination is local */
 		rt6i_flags = ((struct rt6_info *)ndst)->rt6i_flags;
@@ -2416,9 +2415,7 @@ static int vxlan_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
 		struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock);
 		struct rtable *rt;
 
-		if (!sock4)
-			return -EINVAL;
-		rt = vxlan_get_route(vxlan, skb, 0, info->key.tos,
+		rt = vxlan_get_route(vxlan, dev, sock4, skb, 0, info->key.tos,
 				     info->key.u.ipv4.dst,
 				     &info->key.u.ipv4.src, NULL, info);
 		if (IS_ERR(rt))
@@ -2429,7 +2426,7 @@ static int vxlan_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
 		struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock);
 		struct dst_entry *ndst;
 
-		ndst = vxlan6_get_route(vxlan, sock6, skb, 0, info->key.tos,
+		ndst = vxlan6_get_route(vxlan, dev, sock6, skb, 0, info->key.tos,
 					info->key.label, &info->key.u.ipv6.dst,
 					&info->key.u.ipv6.src, NULL, info);
 		if (IS_ERR(ndst))
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next v2 5/7] vxlan: simplify RTF_LOCAL handling.
  2016-11-05 18:45 [PATCH net-next v2 0/7] vxlan: General improvements Pravin B Shelar
                   ` (3 preceding siblings ...)
  2016-11-05 18:45 ` [PATCH net-next v2 4/7] vxlan: improve vxlan route lookup checks Pravin B Shelar
@ 2016-11-05 18:45 ` Pravin B Shelar
  2016-11-09 16:53   ` Jiri Benc
  2016-11-05 18:45 ` [PATCH net-next v2 6/7] vxlan: simplify vxlan xmit Pravin B Shelar
  2016-11-05 18:45 ` [PATCH net-next v2 7/7] vxlan: remove unsed vxlan_dev_dst_port() Pravin B Shelar
  6 siblings, 1 reply; 31+ messages in thread
From: Pravin B Shelar @ 2016-11-05 18:45 UTC (permalink / raw)
  To: netdev; +Cc: Pravin B Shelar

Avoid code duplicate code for handling RTF_LOCAL routes.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
 drivers/net/vxlan.c | 85 +++++++++++++++++++++++++++++++----------------------
 1 file changed, 50 insertions(+), 35 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index f106178..838be4c 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1947,6 +1947,40 @@ static void vxlan_encap_bypass(struct sk_buff *skb, struct vxlan_dev *src_vxlan,
 	}
 }
 
+static int check_route_rtf_local(struct sk_buff *skb, struct net_device *dev,
+				 struct vxlan_dev *vxlan, union vxlan_addr *daddr,
+				 __be32 dst_port, __be32 vni, struct dst_entry *dst,
+				 u32 rt_flags)
+{
+#if IS_ENABLED(CONFIG_IPV6)
+	/* IPv6 rt-flags are checked against RTF_LOCAL, but the value of
+	 * RTF_LOCAL is equal to RTCF_LOCAL. So to keep code simple
+	 * we can use RTCF_LOCAL which works for ipv4 and ipv6 route entry.
+	 */
+	BUILD_BUG_ON(RTCF_LOCAL != RTF_LOCAL);
+#endif
+	/* Bypass encapsulation if the destination is local */
+	if (rt_flags & RTCF_LOCAL &&
+	    !(rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST))) {
+		struct vxlan_dev *dst_vxlan;
+
+		dst_release(dst);
+		dst_vxlan = vxlan_find_vni(vxlan->net, vni,
+					   daddr->sa.sa_family, dst_port,
+					   vxlan->flags);
+		if (!dst_vxlan) {
+			dev->stats.tx_errors++;
+			kfree_skb(skb);
+
+			return -ENOENT;
+		}
+		vxlan_encap_bypass(skb, vxlan, dst_vxlan);
+		return 1;
+	}
+
+	return 0;
+}
+
 static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 			   struct vxlan_rdst *rdst, bool did_rsc)
 {
@@ -2047,26 +2081,16 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		if (IS_ERR(rt))
 			goto tx_error;
 
-		/* Bypass encapsulation if the destination is local */
-		if (!info && rt->rt_flags & RTCF_LOCAL &&
-		    !(rt->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST))) {
-			struct vxlan_dev *dst_vxlan;
-
-			ip_rt_put(rt);
-			dst_vxlan = vxlan_find_vni(vxlan->net, vni,
-						   dst->sa.sa_family, dst_port,
-						   vxlan->flags);
-			if (!dst_vxlan)
-				goto tx_error;
-			vxlan_encap_bypass(skb, vxlan, dst_vxlan);
-			return;
-		}
-
-		if (!info)
+		if (!info) {
+			err = check_route_rtf_local(skb, dev, vxlan, dst,
+						    dst_port, vni, &rt->dst,
+						    rt->rt_flags);
+			if (err)
+				return;
 			udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM_TX);
-		else if (info->key.tun_flags & TUNNEL_DONT_FRAGMENT)
+		} else if (info->key.tun_flags & TUNNEL_DONT_FRAGMENT) {
 			df = htons(IP_DF);
-
+		}
 		tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
 		ttl = ttl ? : ip4_dst_hoplimit(&rt->dst);
 		err = vxlan_build_skb(skb, &rt->dst, sizeof(struct iphdr),
@@ -2081,7 +2105,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 	} else {
 		struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock);
 		struct dst_entry *ndst;
-		u32 rt6i_flags;
 
 		sk = sock6->sock->sk;
 
@@ -2093,24 +2116,16 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		if (IS_ERR(ndst))
 			goto tx_error;
 
-		/* Bypass encapsulation if the destination is local */
-		rt6i_flags = ((struct rt6_info *)ndst)->rt6i_flags;
-		if (!info && rt6i_flags & RTF_LOCAL &&
-		    !(rt6i_flags & (RTCF_BROADCAST | RTCF_MULTICAST))) {
-			struct vxlan_dev *dst_vxlan;
-
-			dst_release(ndst);
-			dst_vxlan = vxlan_find_vni(vxlan->net, vni,
-						   dst->sa.sa_family, dst_port,
-						   vxlan->flags);
-			if (!dst_vxlan)
-				goto tx_error;
-			vxlan_encap_bypass(skb, vxlan, dst_vxlan);
-			return;
-		}
+		if (!info) {
+			u32 rt6i_flags = ((struct rt6_info *)ndst)->rt6i_flags;
 
-		if (!info)
+			err = check_route_rtf_local(skb, dev, vxlan, dst,
+						    dst_port, vni, ndst,
+						    rt6i_flags);
+			if (err)
+				return;
 			udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM6_TX);
+		}
 
 		tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
 		ttl = ttl ? : ip6_dst_hoplimit(ndst);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next v2 6/7] vxlan: simplify vxlan xmit
  2016-11-05 18:45 [PATCH net-next v2 0/7] vxlan: General improvements Pravin B Shelar
                   ` (4 preceding siblings ...)
  2016-11-05 18:45 ` [PATCH net-next v2 5/7] vxlan: simplify RTF_LOCAL handling Pravin B Shelar
@ 2016-11-05 18:45 ` Pravin B Shelar
  2016-11-09 16:59   ` Jiri Benc
  2016-11-05 18:45 ` [PATCH net-next v2 7/7] vxlan: remove unsed vxlan_dev_dst_port() Pravin B Shelar
  6 siblings, 1 reply; 31+ messages in thread
From: Pravin B Shelar @ 2016-11-05 18:45 UTC (permalink / raw)
  To: netdev; +Cc: Pravin B Shelar

Existing vxlan xmit function handles two distinct cases.
1. vxlan net device
2. vxlan lwt device.
By seperating initilization these two cases the egress path
looks better.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
 drivers/net/vxlan.c | 79 ++++++++++++++++++++++++-----------------------------
 1 file changed, 35 insertions(+), 44 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 838be4c..db6e012 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1987,8 +1987,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 	struct dst_cache *dst_cache;
 	struct ip_tunnel_info *info;
 	struct vxlan_dev *vxlan = netdev_priv(dev);
-	struct sock *sk;
-	const struct iphdr *old_iph;
+	const struct iphdr *old_iph = ip_hdr(skb);
 	union vxlan_addr *dst;
 	union vxlan_addr remote_ip, local_ip;
 	union vxlan_addr *src;
@@ -1996,7 +1995,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 	struct vxlan_metadata *md = &_md;
 	__be16 src_port = 0, dst_port;
 	__be32 vni, label;
-	__be16 df = 0;
 	__u8 tos, ttl;
 	int err;
 	u32 flags = vxlan->flags;
@@ -2006,11 +2004,34 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 	info = skb_tunnel_info(skb);
 
 	if (rdst) {
+		dst = &rdst->remote_ip;
+		if (vxlan_addr_any(dst)) {
+			if (did_rsc) {
+				/* short-circuited back to local bridge */
+				vxlan_encap_bypass(skb, vxlan, vxlan);
+				return;
+			}
+			goto drop;
+		}
+
 		dst_port = rdst->remote_port ? rdst->remote_port : vxlan->cfg.dst_port;
 		vni = rdst->remote_vni;
-		dst = &rdst->remote_ip;
 		src = &vxlan->cfg.saddr;
 		dst_cache = &rdst->dst_cache;
+		md->gbp = skb->mark;
+		ttl = vxlan->cfg.ttl;
+		if (!ttl && vxlan_addr_multicast(dst))
+			ttl = 1;
+
+		tos = vxlan->cfg.tos;
+		if (tos == 1)
+			tos = ip_tunnel_get_dsfield(old_iph, skb);
+
+		if (dst->sa.sa_family == AF_INET)
+			udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM_TX);
+		else
+			udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM6_TX);
+		label = vxlan->cfg.label;
 	} else {
 		if (!info) {
 			WARN_ONCE(1, "%s: Missing encapsulation instructions\n",
@@ -2030,32 +2051,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		dst = &remote_ip;
 		src = &local_ip;
 		dst_cache = &info->dst_cache;
-	}
-
-	if (vxlan_addr_any(dst)) {
-		if (did_rsc) {
-			/* short-circuited back to local bridge */
-			vxlan_encap_bypass(skb, vxlan, vxlan);
-			return;
-		}
-		goto drop;
-	}
-
-	old_iph = ip_hdr(skb);
-
-	ttl = vxlan->cfg.ttl;
-	if (!ttl && vxlan_addr_multicast(dst))
-		ttl = 1;
-
-	tos = vxlan->cfg.tos;
-	if (tos == 1)
-		tos = ip_tunnel_get_dsfield(old_iph, skb);
-
-	label = vxlan->cfg.label;
-	src_port = udp_flow_src_port(dev_net(dev), skb, vxlan->cfg.port_min,
-				     vxlan->cfg.port_max, true);
-
-	if (info) {
 		ttl = info->key.ttl;
 		tos = info->key.tos;
 		label = info->key.label;
@@ -2063,15 +2058,15 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 
 		if (info->options_len)
 			md = ip_tunnel_info_opts(info);
-	} else {
-		md->gbp = skb->mark;
 	}
 
+	src_port = udp_flow_src_port(dev_net(dev), skb, vxlan->cfg.port_min,
+				     vxlan->cfg.port_max, true);
+
 	if (dst->sa.sa_family == AF_INET) {
 		struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock);
 		struct rtable *rt;
-
-		sk = sock4->sock->sk;
+		__be16 df = 0;
 
 		rt = vxlan_get_route(vxlan, dev, sock4, skb,
 				     rdst ? rdst->remote_ifindex : 0, tos,
@@ -2087,18 +2082,17 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 						    rt->rt_flags);
 			if (err)
 				return;
-			udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM_TX);
 		} else if (info->key.tun_flags & TUNNEL_DONT_FRAGMENT) {
 			df = htons(IP_DF);
 		}
-		tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
-		ttl = ttl ? : ip4_dst_hoplimit(&rt->dst);
 		err = vxlan_build_skb(skb, &rt->dst, sizeof(struct iphdr),
 				      vni, md, flags, udp_sum);
 		if (err < 0)
 			goto tx_error;
 
-		udp_tunnel_xmit_skb(rt, sk, skb, src->sin.sin_addr.s_addr,
+		tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
+		ttl = ttl ? : ip4_dst_hoplimit(&rt->dst);
+		udp_tunnel_xmit_skb(rt, sock4->sock->sk, skb, src->sin.sin_addr.s_addr,
 				    dst->sin.sin_addr.s_addr, tos, ttl, df,
 				    src_port, dst_port, xnet, !udp_sum);
 #if IS_ENABLED(CONFIG_IPV6)
@@ -2106,8 +2100,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		struct vxlan_sock *sock6 = rcu_dereference(vxlan->vn6_sock);
 		struct dst_entry *ndst;
 
-		sk = sock6->sock->sk;
-
 		ndst = vxlan6_get_route(vxlan, dev, sock6, skb,
 					rdst ? rdst->remote_ifindex : 0, tos,
 					label, &dst->sin6.sin6_addr,
@@ -2124,18 +2116,17 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 						    rt6i_flags);
 			if (err)
 				return;
-			udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM6_TX);
 		}
 
-		tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
-		ttl = ttl ? : ip6_dst_hoplimit(ndst);
 		skb_scrub_packet(skb, xnet);
 		err = vxlan_build_skb(skb, ndst, sizeof(struct ipv6hdr),
 				      vni, md, flags, udp_sum);
 		if (err < 0)
 			goto tx_error;
 
-		udp_tunnel6_xmit_skb(ndst, sk, skb, dev,
+		tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
+		ttl = ttl ? : ip6_dst_hoplimit(ndst);
+		udp_tunnel6_xmit_skb(ndst, sock6->sock->sk, skb, dev,
 				     &src->sin6.sin6_addr,
 				     &dst->sin6.sin6_addr, tos, ttl,
 				     label, src_port, dst_port, !udp_sum);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH net-next v2 7/7] vxlan: remove unsed vxlan_dev_dst_port()
  2016-11-05 18:45 [PATCH net-next v2 0/7] vxlan: General improvements Pravin B Shelar
                   ` (5 preceding siblings ...)
  2016-11-05 18:45 ` [PATCH net-next v2 6/7] vxlan: simplify vxlan xmit Pravin B Shelar
@ 2016-11-05 18:45 ` Pravin B Shelar
  6 siblings, 0 replies; 31+ messages in thread
From: Pravin B Shelar @ 2016-11-05 18:45 UTC (permalink / raw)
  To: netdev; +Cc: Pravin B Shelar

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
---
 include/net/vxlan.h | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/include/net/vxlan.h b/include/net/vxlan.h
index 308adc4..49a5920 100644
--- a/include/net/vxlan.h
+++ b/include/net/vxlan.h
@@ -281,16 +281,6 @@ struct vxlan_dev {
 struct net_device *vxlan_dev_create(struct net *net, const char *name,
 				    u8 name_assign_type, struct vxlan_config *conf);
 
-static inline __be16 vxlan_dev_dst_port(struct vxlan_dev *vxlan,
-					unsigned short family)
-{
-#if IS_ENABLED(CONFIG_IPV6)
-	if (family == AF_INET6)
-		return inet_sk(vxlan->vn6_sock->sock->sk)->inet_sport;
-#endif
-	return inet_sk(vxlan->vn4_sock->sock->sk)->inet_sport;
-}
-
 static inline netdev_features_t vxlan_features_check(struct sk_buff *skb,
 						     netdev_features_t features)
 {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 1/7] vxlan: avoid vlan processing in vxlan device.
  2016-11-05 18:45 ` [PATCH net-next v2 1/7] vxlan: avoid vlan processing in vxlan device Pravin B Shelar
@ 2016-11-06  9:57   ` Sergei Shtylyov
  2016-11-09 16:04   ` Jiri Benc
  1 sibling, 0 replies; 31+ messages in thread
From: Sergei Shtylyov @ 2016-11-06  9:57 UTC (permalink / raw)
  To: Pravin B Shelar, netdev

Hello.

On 11/5/2016 9:45 PM, Pravin B Shelar wrote:

> VxLan device does not have special handling for vlan taging on egress.

    Tagging.

> Therefore it does not make sense to expose vlan offloading feature.
>
> Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
[...]

MBR, Sergei

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 1/7] vxlan: avoid vlan processing in vxlan device.
  2016-11-05 18:45 ` [PATCH net-next v2 1/7] vxlan: avoid vlan processing in vxlan device Pravin B Shelar
  2016-11-06  9:57   ` Sergei Shtylyov
@ 2016-11-09 16:04   ` Jiri Benc
  1 sibling, 0 replies; 31+ messages in thread
From: Jiri Benc @ 2016-11-09 16:04 UTC (permalink / raw)
  To: Pravin B Shelar; +Cc: netdev

On Sat,  5 Nov 2016 11:45:51 -0700, Pravin B Shelar wrote:
> VxLan device does not have special handling for vlan taging on egress.
> Therefore it does not make sense to expose vlan offloading feature.

Makes sense. Though it would be good to have a more detailed
description, I already started writing a reply that the patch changes
VXLAN behavior when I realized it doesn't.

If you're respinning for another reasons, please make the description
a bit more elaborate. Not worth resending just for this, though, IMHO.

Acked-by: Jiri Benc <jbenc@redhat.com>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 2/7] vxlan: simplify exception handling
  2016-11-05 18:45 ` [PATCH net-next v2 2/7] vxlan: simplify exception handling Pravin B Shelar
@ 2016-11-09 16:10   ` Jiri Benc
  2016-11-10  3:33     ` Pravin Shelar
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Benc @ 2016-11-09 16:10 UTC (permalink / raw)
  To: Pravin B Shelar; +Cc: netdev

On Sat,  5 Nov 2016 11:45:52 -0700, Pravin B Shelar wrote:
> @@ -2058,7 +2059,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
>  		err = vxlan_build_skb(skb, &rt->dst, sizeof(struct iphdr),
>  				      vni, md, flags, udp_sum);
>  		if (err < 0)
> -			goto xmit_tx_error;
> +			goto tx_error;

Seems you're leaking rt here?

> @@ -2117,11 +2118,9 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
>  		skb_scrub_packet(skb, xnet);
>  		err = vxlan_build_skb(skb, ndst, sizeof(struct ipv6hdr),
>  				      vni, md, flags, udp_sum);
> -		if (err < 0) {
> -			dst_release(ndst);
> -			dev->stats.tx_errors++;
> -			return;
> -		}
> +		if (err < 0)
> +			goto tx_error;

And ndst here?

 Jiri

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 3/7] vxlan: avoid checking socket multiple times.
  2016-11-05 18:45 ` [PATCH net-next v2 3/7] vxlan: avoid checking socket multiple times Pravin B Shelar
@ 2016-11-09 16:16   ` Jiri Benc
  2016-11-09 16:34   ` Jiri Benc
  1 sibling, 0 replies; 31+ messages in thread
From: Jiri Benc @ 2016-11-09 16:16 UTC (permalink / raw)
  To: Pravin B Shelar; +Cc: netdev

On Sat,  5 Nov 2016 11:45:53 -0700, Pravin B Shelar wrote:
> @@ -944,7 +944,9 @@ static bool vxlan_group_used(struct vxlan_net *vn, struct vxlan_dev *dev)
>  {
>  	struct vxlan_dev *vxlan;
>  	struct vxlan_sock *sock4;
> -	struct vxlan_sock *sock6 = NULL;
> +#if IS_ENABLED(CONFIG_IPV6)
> +	struct vxlan_sock *sock6;
> +#endif
>  	unsigned short family = dev->default_dst.remote_ip.sa.sa_family;
>  
>  	sock4 = rtnl_dereference(dev->vn4_sock);

This hunk seems to be unrelated to the rest of the patch (and is
certainly unrelated to the description). Please post it as a standalone
patch.

The rest of the patch looks good.

Thanks,

 Jiri

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 3/7] vxlan: avoid checking socket multiple times.
  2016-11-05 18:45 ` [PATCH net-next v2 3/7] vxlan: avoid checking socket multiple times Pravin B Shelar
  2016-11-09 16:16   ` Jiri Benc
@ 2016-11-09 16:34   ` Jiri Benc
  2016-11-10  3:33     ` Pravin Shelar
  1 sibling, 1 reply; 31+ messages in thread
From: Jiri Benc @ 2016-11-09 16:34 UTC (permalink / raw)
  To: Pravin B Shelar; +Cc: netdev

On Sat,  5 Nov 2016 11:45:53 -0700, Pravin B Shelar wrote:
> @@ -2070,11 +2072,9 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
>  		struct dst_entry *ndst;
>  		u32 rt6i_flags;
>  
> -		if (!sock6)
> -			goto drop;
>  		sk = sock6->sock->sk;

I take back that the rest of the patch looks good. This will panic if
an IPv6 packet is routed (through encap route) to an IPv4-only
interface.

 Jiri

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 4/7] vxlan: improve vxlan route lookup checks.
  2016-11-05 18:45 ` [PATCH net-next v2 4/7] vxlan: improve vxlan route lookup checks Pravin B Shelar
@ 2016-11-09 16:41   ` Jiri Benc
  2016-11-10  3:34     ` Pravin Shelar
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Benc @ 2016-11-09 16:41 UTC (permalink / raw)
  To: Pravin B Shelar; +Cc: netdev

On Sat,  5 Nov 2016 11:45:54 -0700, Pravin B Shelar wrote:
> Move route sanity check to respective vxlan[4/6]_get_route functions.
> This allows us to perform all sanity checks before caching the dst so
> that we can avoid these checks on subsequent packets.
> This give move accurate metadata information for packet from
> fill_metadata_dst().

The description is misleading, it applies only to one vxlan lwt use case
(openvswitch). For other use cases, the patch has no effect.

I found the current handling of route lookup results irritating, too.
The reason I did not change this while doing vxlan cleanup some time
ago was that I assumed we should not increase dev stats from
vxlan_fill_metadata_dst. Isn't that so?

 Jiri

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 5/7] vxlan: simplify RTF_LOCAL handling.
  2016-11-05 18:45 ` [PATCH net-next v2 5/7] vxlan: simplify RTF_LOCAL handling Pravin B Shelar
@ 2016-11-09 16:53   ` Jiri Benc
  2016-11-10  3:34     ` Pravin Shelar
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Benc @ 2016-11-09 16:53 UTC (permalink / raw)
  To: Pravin B Shelar; +Cc: netdev

On Sat,  5 Nov 2016 11:45:55 -0700, Pravin B Shelar wrote:
> +static int check_route_rtf_local(struct sk_buff *skb, struct net_device *dev,
> +				 struct vxlan_dev *vxlan, union vxlan_addr *daddr,
> +				 __be32 dst_port, __be32 vni, struct dst_entry *dst,
> +				 u32 rt_flags)

It's not just checking, it's also bypassing encapsulation if the check
is successful. Would be good to use a name that suggests this effect,
e.g. encap_bypass_if_local (I know, not a nice name) or something.

Thanks,

 Jiri

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 6/7] vxlan: simplify vxlan xmit
  2016-11-05 18:45 ` [PATCH net-next v2 6/7] vxlan: simplify vxlan xmit Pravin B Shelar
@ 2016-11-09 16:59   ` Jiri Benc
  2016-11-10  3:35     ` Pravin Shelar
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Benc @ 2016-11-09 16:59 UTC (permalink / raw)
  To: Pravin B Shelar; +Cc: netdev

On Sat,  5 Nov 2016 11:45:56 -0700, Pravin B Shelar wrote:
> @@ -2006,11 +2004,34 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
>  	info = skb_tunnel_info(skb);
>  
>  	if (rdst) {
> +		dst = &rdst->remote_ip;
> +		if (vxlan_addr_any(dst)) {
> +			if (did_rsc) {
> +				/* short-circuited back to local bridge */
> +				vxlan_encap_bypass(skb, vxlan, vxlan);
> +				return;
> +			}
> +			goto drop;
> +		}
> +
>  		dst_port = rdst->remote_port ? rdst->remote_port : vxlan->cfg.dst_port;
>  		vni = rdst->remote_vni;
> -		dst = &rdst->remote_ip;
>  		src = &vxlan->cfg.saddr;
>  		dst_cache = &rdst->dst_cache;
> +		md->gbp = skb->mark;
> +		ttl = vxlan->cfg.ttl;
> +		if (!ttl && vxlan_addr_multicast(dst))
> +			ttl = 1;
> +
> +		tos = vxlan->cfg.tos;
> +		if (tos == 1)
> +			tos = ip_tunnel_get_dsfield(old_iph, skb);

Uninitialized old_iph.

Besides, you can't do this, having TOS, TTL, etc. specified is
perfectly legal for lwtunnel interfaces, too.

 Jiri

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 2/7] vxlan: simplify exception handling
  2016-11-09 16:10   ` Jiri Benc
@ 2016-11-10  3:33     ` Pravin Shelar
  2016-11-10  9:47       ` Jiri Benc
  0 siblings, 1 reply; 31+ messages in thread
From: Pravin Shelar @ 2016-11-10  3:33 UTC (permalink / raw)
  To: Jiri Benc; +Cc: Linux Kernel Network Developers

On Wed, Nov 9, 2016 at 8:10 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Sat,  5 Nov 2016 11:45:52 -0700, Pravin B Shelar wrote:
>> @@ -2058,7 +2059,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
>>               err = vxlan_build_skb(skb, &rt->dst, sizeof(struct iphdr),
>>                                     vni, md, flags, udp_sum);
>>               if (err < 0)
>> -                     goto xmit_tx_error;
>> +                     goto tx_error;
>
> Seems you're leaking rt here?
>
I have moved the dst error handling to vxlan_build_skb(), which is
releasing the dst entry.

>> @@ -2117,11 +2118,9 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
>>               skb_scrub_packet(skb, xnet);
>>               err = vxlan_build_skb(skb, ndst, sizeof(struct ipv6hdr),
>>                                     vni, md, flags, udp_sum);
>> -             if (err < 0) {
>> -                     dst_release(ndst);
>> -                     dev->stats.tx_errors++;
>> -                     return;
>> -             }
>> +             if (err < 0)
>> +                     goto tx_error;
>
> And ndst here?
>
same as above.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 3/7] vxlan: avoid checking socket multiple times.
  2016-11-09 16:34   ` Jiri Benc
@ 2016-11-10  3:33     ` Pravin Shelar
  0 siblings, 0 replies; 31+ messages in thread
From: Pravin Shelar @ 2016-11-10  3:33 UTC (permalink / raw)
  To: Jiri Benc; +Cc: Linux Kernel Network Developers

On Wed, Nov 9, 2016 at 8:34 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Sat,  5 Nov 2016 11:45:53 -0700, Pravin B Shelar wrote:
>> @@ -2070,11 +2072,9 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
>>               struct dst_entry *ndst;
>>               u32 rt6i_flags;
>>
>> -             if (!sock6)
>> -                     goto drop;
>>               sk = sock6->sock->sk;
>
> I take back that the rest of the patch looks good. This will panic if
> an IPv6 packet is routed (through encap route) to an IPv4-only
> interface.
>

Actually this is fixed in later patch. But I will fix this patch too.

Thanks for review.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 4/7] vxlan: improve vxlan route lookup checks.
  2016-11-09 16:41   ` Jiri Benc
@ 2016-11-10  3:34     ` Pravin Shelar
  2016-11-10  9:56       ` Jiri Benc
  0 siblings, 1 reply; 31+ messages in thread
From: Pravin Shelar @ 2016-11-10  3:34 UTC (permalink / raw)
  To: Jiri Benc; +Cc: Linux Kernel Network Developers

On Wed, Nov 9, 2016 at 8:41 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Sat,  5 Nov 2016 11:45:54 -0700, Pravin B Shelar wrote:
>> Move route sanity check to respective vxlan[4/6]_get_route functions.
>> This allows us to perform all sanity checks before caching the dst so
>> that we can avoid these checks on subsequent packets.
>> This give move accurate metadata information for packet from
>> fill_metadata_dst().
>
> The description is misleading, it applies only to one vxlan lwt use case
> (openvswitch). For other use cases, the patch has no effect.
>
Why it would not help in non-ovs vxlan egress path? It avoids checking
(if condition) for device loop.

> I found the current handling of route lookup results irritating, too.
> The reason I did not change this while doing vxlan cleanup some time
> ago was that I assumed we should not increase dev stats from
> vxlan_fill_metadata_dst. Isn't that so?
>

Thats right. I will fix it.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 5/7] vxlan: simplify RTF_LOCAL handling.
  2016-11-09 16:53   ` Jiri Benc
@ 2016-11-10  3:34     ` Pravin Shelar
  0 siblings, 0 replies; 31+ messages in thread
From: Pravin Shelar @ 2016-11-10  3:34 UTC (permalink / raw)
  To: Jiri Benc; +Cc: Linux Kernel Network Developers

On Wed, Nov 9, 2016 at 8:53 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Sat,  5 Nov 2016 11:45:55 -0700, Pravin B Shelar wrote:
>> +static int check_route_rtf_local(struct sk_buff *skb, struct net_device *dev,
>> +                              struct vxlan_dev *vxlan, union vxlan_addr *daddr,
>> +                              __be32 dst_port, __be32 vni, struct dst_entry *dst,
>> +                              u32 rt_flags)
>
> It's not just checking, it's also bypassing encapsulation if the check
> is successful. Would be good to use a name that suggests this effect,
> e.g. encap_bypass_if_local (I know, not a nice name) or something.
>

I am fine with this name. I will change the patch.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 6/7] vxlan: simplify vxlan xmit
  2016-11-09 16:59   ` Jiri Benc
@ 2016-11-10  3:35     ` Pravin Shelar
  2016-11-10 10:08       ` Jiri Benc
  0 siblings, 1 reply; 31+ messages in thread
From: Pravin Shelar @ 2016-11-10  3:35 UTC (permalink / raw)
  To: Jiri Benc; +Cc: Linux Kernel Network Developers

On Wed, Nov 9, 2016 at 8:59 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Sat,  5 Nov 2016 11:45:56 -0700, Pravin B Shelar wrote:
>> @@ -2006,11 +2004,34 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
>>       info = skb_tunnel_info(skb);
>>
>>       if (rdst) {
>> +             dst = &rdst->remote_ip;
>> +             if (vxlan_addr_any(dst)) {
>> +                     if (did_rsc) {
>> +                             /* short-circuited back to local bridge */
>> +                             vxlan_encap_bypass(skb, vxlan, vxlan);
>> +                             return;
>> +                     }
>> +                     goto drop;
>> +             }
>> +
>>               dst_port = rdst->remote_port ? rdst->remote_port : vxlan->cfg.dst_port;
>>               vni = rdst->remote_vni;
>> -             dst = &rdst->remote_ip;
>>               src = &vxlan->cfg.saddr;
>>               dst_cache = &rdst->dst_cache;
>> +             md->gbp = skb->mark;
>> +             ttl = vxlan->cfg.ttl;
>> +             if (!ttl && vxlan_addr_multicast(dst))
>> +                     ttl = 1;
>> +
>> +             tos = vxlan->cfg.tos;
>> +             if (tos == 1)
>> +                     tos = ip_tunnel_get_dsfield(old_iph, skb);
>
> Uninitialized old_iph.
>
It is initialized in begining of this function.

> Besides, you can't do this, having TOS, TTL, etc. specified is
> perfectly legal for lwtunnel interfaces, too.
>

TOS and TTL is initialized for LWT just else block. so I do not see
any changes compared to current implementation.

Can you elaborate on your concerns?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 2/7] vxlan: simplify exception handling
  2016-11-10  3:33     ` Pravin Shelar
@ 2016-11-10  9:47       ` Jiri Benc
  2016-11-10 18:10         ` Pravin Shelar
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Benc @ 2016-11-10  9:47 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Linux Kernel Network Developers

On Wed, 9 Nov 2016 19:33:24 -0800, Pravin Shelar wrote:
> I have moved the dst error handling to vxlan_build_skb(), which is
> releasing the dst entry.

Hmm, right, I missed that. But it's weird. The previous logic was
confusing (skb is freed in case of error but not otherwise), I'm with
you on this change. But having the same confusing logic with dst_entry
instead doesn't improve things.

Could we free both skb and dst_entry from the caller?

Thanks,

 Jiri

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 4/7] vxlan: improve vxlan route lookup checks.
  2016-11-10  3:34     ` Pravin Shelar
@ 2016-11-10  9:56       ` Jiri Benc
  2016-11-10 18:10         ` Pravin Shelar
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Benc @ 2016-11-10  9:56 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Linux Kernel Network Developers

On Wed, 9 Nov 2016 19:34:06 -0800, Pravin Shelar wrote:
> Why it would not help in non-ovs vxlan egress path? It avoids checking
> (if condition) for device loop.

I may be missing something but I count the same number of conditions
for each packet, they're just at a different place after the patch.

E.g. for IPv4, the "if (!sock4)" is moved from vxlan_xmit_one into
vxlan_get_route and the "rt" error handling stays logically the same
(three if conditions in the non-error path) but is moved into
vxlan_get_route. Similarly for IPv6.

 Jiri

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 6/7] vxlan: simplify vxlan xmit
  2016-11-10  3:35     ` Pravin Shelar
@ 2016-11-10 10:08       ` Jiri Benc
  0 siblings, 0 replies; 31+ messages in thread
From: Jiri Benc @ 2016-11-10 10:08 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Linux Kernel Network Developers

On Wed, 9 Nov 2016 19:35:00 -0800, Pravin Shelar wrote:
> On Wed, Nov 9, 2016 at 8:59 AM, Jiri Benc <jbenc@redhat.com> wrote:
> > Uninitialized old_iph.
> >
> It is initialized in begining of this function.

Right you are, sorry.

> TOS and TTL is initialized for LWT just else block. so I do not see
> any changes compared to current implementation.

Oh, right. We set them unconditionally from tunnel_info. I was
convinced we fallback to the interface config values when they're not
given and stopped reviewing the patch but we don't.

This is a nice cleanup, thanks!

Acked-by: Jiri Benc <jbenc@redhat.com>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 2/7] vxlan: simplify exception handling
  2016-11-10  9:47       ` Jiri Benc
@ 2016-11-10 18:10         ` Pravin Shelar
  2016-11-10 18:33           ` Jiri Benc
  0 siblings, 1 reply; 31+ messages in thread
From: Pravin Shelar @ 2016-11-10 18:10 UTC (permalink / raw)
  To: Jiri Benc; +Cc: Linux Kernel Network Developers

On Thu, Nov 10, 2016 at 1:47 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Wed, 9 Nov 2016 19:33:24 -0800, Pravin Shelar wrote:
>> I have moved the dst error handling to vxlan_build_skb(), which is
>> releasing the dst entry.
>
> Hmm, right, I missed that. But it's weird. The previous logic was
> confusing (skb is freed in case of error but not otherwise), I'm with
> you on this change. But having the same confusing logic with dst_entry
> instead doesn't improve things.
>
> Could we free both skb and dst_entry from the caller?
>

I wanted to do same, that is free dst and skb in caller function. But
that would need more changes due to discrepancy in IPv4 udp-tunnel and
IPv6 udp-tunnel api. IPv4 works on route entry and IPv6 needs dst
entry. so If caller frees dst-entry then I need additional variable to
keep track of dst entry which is what I am trying to avoid.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 4/7] vxlan: improve vxlan route lookup checks.
  2016-11-10  9:56       ` Jiri Benc
@ 2016-11-10 18:10         ` Pravin Shelar
  2016-11-10 18:37           ` Jiri Benc
  0 siblings, 1 reply; 31+ messages in thread
From: Pravin Shelar @ 2016-11-10 18:10 UTC (permalink / raw)
  To: Jiri Benc; +Cc: Linux Kernel Network Developers

On Thu, Nov 10, 2016 at 1:56 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Wed, 9 Nov 2016 19:34:06 -0800, Pravin Shelar wrote:
>> Why it would not help in non-ovs vxlan egress path? It avoids checking
>> (if condition) for device loop.
>
> I may be missing something but I count the same number of conditions
> for each packet, they're just at a different place after the patch.
>
I am specifically talking about cached routes. If the dst entry is
cached, this patch avoids checking for device loop.

> E.g. for IPv4, the "if (!sock4)" is moved from vxlan_xmit_one into
> vxlan_get_route and the "rt" error handling stays logically the same
> (three if conditions in the non-error path) but is moved into
> vxlan_get_route. Similarly for IPv6.
>
>  Jiri

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 2/7] vxlan: simplify exception handling
  2016-11-10 18:10         ` Pravin Shelar
@ 2016-11-10 18:33           ` Jiri Benc
  2016-11-10 19:21             ` Pravin Shelar
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Benc @ 2016-11-10 18:33 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Linux Kernel Network Developers

On Thu, 10 Nov 2016 10:10:09 -0800, Pravin Shelar wrote:
> I wanted to do same, that is free dst and skb in caller function. But
> that would need more changes due to discrepancy in IPv4 udp-tunnel and
> IPv6 udp-tunnel api. IPv4 works on route entry and IPv6 needs dst
> entry. so If caller frees dst-entry then I need additional variable to
> keep track of dst entry which is what I am trying to avoid.

Is additional variable really that bad? It's likely to be optimized by
the compiler and it will lead to less surprises. You obviously caught
me during review :-)

 Jiri

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 4/7] vxlan: improve vxlan route lookup checks.
  2016-11-10 18:10         ` Pravin Shelar
@ 2016-11-10 18:37           ` Jiri Benc
  0 siblings, 0 replies; 31+ messages in thread
From: Jiri Benc @ 2016-11-10 18:37 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Linux Kernel Network Developers

On Thu, 10 Nov 2016 10:10:25 -0800, Pravin Shelar wrote:
> I am specifically talking about cached routes. If the dst entry is
> cached, this patch avoids checking for device loop.

Okay, true, for cached routes we do less work with this patch.

This was more a side note anyway, the real comment was increasing of
the dev stats (which you already said you would fix).

Thanks!

 Jiri

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 2/7] vxlan: simplify exception handling
  2016-11-10 18:33           ` Jiri Benc
@ 2016-11-10 19:21             ` Pravin Shelar
  2016-11-11 11:14               ` Jiri Benc
  0 siblings, 1 reply; 31+ messages in thread
From: Pravin Shelar @ 2016-11-10 19:21 UTC (permalink / raw)
  To: Jiri Benc; +Cc: Linux Kernel Network Developers

On Thu, Nov 10, 2016 at 10:33 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Thu, 10 Nov 2016 10:10:09 -0800, Pravin Shelar wrote:
>> I wanted to do same, that is free dst and skb in caller function. But
>> that would need more changes due to discrepancy in IPv4 udp-tunnel and
>> IPv6 udp-tunnel api. IPv4 works on route entry and IPv6 needs dst
>> entry. so If caller frees dst-entry then I need additional variable to
>> keep track of dst entry which is what I am trying to avoid.
>
> Is additional variable really that bad? It's likely to be optimized by
> the compiler and it will lead to less surprises. You obviously caught
> me during review :-)
>
One additional variable is not bad but look at what has happened in
vxlan_xmit_one(). There are already more than 20 variables defined. It
is hard to read code in this case.
anyways I can add another variable to the function. I do not feel that
strongly about this.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 2/7] vxlan: simplify exception handling
  2016-11-10 19:21             ` Pravin Shelar
@ 2016-11-11 11:14               ` Jiri Benc
  2016-11-11 23:24                 ` Pravin Shelar
  0 siblings, 1 reply; 31+ messages in thread
From: Jiri Benc @ 2016-11-11 11:14 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Linux Kernel Network Developers

On Thu, 10 Nov 2016 11:21:19 -0800, Pravin Shelar wrote:
> One additional variable is not bad but look at what has happened in
> vxlan_xmit_one(). There are already more than 20 variables defined. It
> is hard to read code in this case.

I agree that the function is horrible.

What I was thinking about was separating the vxlan data and control
plane. The vxlan data plane would perform encapsulation and
decapsulation based on lwtunnel infrastructure and the rest of the
"classical" vxlan would be just one of the users of that. Basically
replacing vxlan_rdst by ip_tunnel_info, among other things.

That would make the vxlan code much much cleaner.

> anyways I can add another variable to the function. I do not feel that
> strongly about this.

Me neither, actually. I prefer another variable but I won't oppose the
patchset just based on that if you choose differently.

Thanks,

 Jiri

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH net-next v2 2/7] vxlan: simplify exception handling
  2016-11-11 11:14               ` Jiri Benc
@ 2016-11-11 23:24                 ` Pravin Shelar
  0 siblings, 0 replies; 31+ messages in thread
From: Pravin Shelar @ 2016-11-11 23:24 UTC (permalink / raw)
  To: Jiri Benc; +Cc: Linux Kernel Network Developers

On Fri, Nov 11, 2016 at 3:14 AM, Jiri Benc <jbenc@redhat.com> wrote:
> On Thu, 10 Nov 2016 11:21:19 -0800, Pravin Shelar wrote:
>> One additional variable is not bad but look at what has happened in
>> vxlan_xmit_one(). There are already more than 20 variables defined. It
>> is hard to read code in this case.
>
> I agree that the function is horrible.
>
> What I was thinking about was separating the vxlan data and control
> plane. The vxlan data plane would perform encapsulation and
> decapsulation based on lwtunnel infrastructure and the rest of the
> "classical" vxlan would be just one of the users of that. Basically
> replacing vxlan_rdst by ip_tunnel_info, among other things.
>
> That would make the vxlan code much much cleaner.
>
I have patch which does something similar for geneve. But it is tricky
to do it for vxlan.

>> anyways I can add another variable to the function. I do not feel that
>> strongly about this.
>
> Me neither, actually. I prefer another variable but I won't oppose the
> patchset just based on that if you choose differently.
>
I have updated patches already. I will post it soon.

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2016-11-11 23:24 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-05 18:45 [PATCH net-next v2 0/7] vxlan: General improvements Pravin B Shelar
2016-11-05 18:45 ` [PATCH net-next v2 1/7] vxlan: avoid vlan processing in vxlan device Pravin B Shelar
2016-11-06  9:57   ` Sergei Shtylyov
2016-11-09 16:04   ` Jiri Benc
2016-11-05 18:45 ` [PATCH net-next v2 2/7] vxlan: simplify exception handling Pravin B Shelar
2016-11-09 16:10   ` Jiri Benc
2016-11-10  3:33     ` Pravin Shelar
2016-11-10  9:47       ` Jiri Benc
2016-11-10 18:10         ` Pravin Shelar
2016-11-10 18:33           ` Jiri Benc
2016-11-10 19:21             ` Pravin Shelar
2016-11-11 11:14               ` Jiri Benc
2016-11-11 23:24                 ` Pravin Shelar
2016-11-05 18:45 ` [PATCH net-next v2 3/7] vxlan: avoid checking socket multiple times Pravin B Shelar
2016-11-09 16:16   ` Jiri Benc
2016-11-09 16:34   ` Jiri Benc
2016-11-10  3:33     ` Pravin Shelar
2016-11-05 18:45 ` [PATCH net-next v2 4/7] vxlan: improve vxlan route lookup checks Pravin B Shelar
2016-11-09 16:41   ` Jiri Benc
2016-11-10  3:34     ` Pravin Shelar
2016-11-10  9:56       ` Jiri Benc
2016-11-10 18:10         ` Pravin Shelar
2016-11-10 18:37           ` Jiri Benc
2016-11-05 18:45 ` [PATCH net-next v2 5/7] vxlan: simplify RTF_LOCAL handling Pravin B Shelar
2016-11-09 16:53   ` Jiri Benc
2016-11-10  3:34     ` Pravin Shelar
2016-11-05 18:45 ` [PATCH net-next v2 6/7] vxlan: simplify vxlan xmit Pravin B Shelar
2016-11-09 16:59   ` Jiri Benc
2016-11-10  3:35     ` Pravin Shelar
2016-11-10 10:08       ` Jiri Benc
2016-11-05 18:45 ` [PATCH net-next v2 7/7] vxlan: remove unsed vxlan_dev_dst_port() Pravin B Shelar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.