All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/2] vxlan: Don't use UDP socket for transmit
@ 2015-01-17 18:18 Tom Herbert
  2015-01-17 18:18 ` [PATCH net-next 1/2] udp: Do not require sock in udp_tunnel_xmit_skb Tom Herbert
  2015-01-17 18:18 ` [PATCH net-next 2/2] vxlan: Eliminate dependency on UDP socket in transmit path Tom Herbert
  0 siblings, 2 replies; 16+ messages in thread
From: Tom Herbert @ 2015-01-17 18:18 UTC (permalink / raw)
  To: davem, tgraf, netdev

UDP socket is not pertinent to transmit for UDP tunnels, checksum
enablement can be done with a socket. This patch set eliminates
reference to a socket in udp_tunnel_xmit functions and in VXLAN
trnasmit.

Also, make GBP, RCO, can CSUM6_RX flags visible to receive socket
and only match these for shareable socket.
Tom Herbert (2):
  udp: Do not require sock in udp_tunnel_xmit_skb
  vxlan: Eliminate dependency on UDP socket in transmit path

 drivers/net/vxlan.c           | 66 +++++++++++++++++++++----------------------
 include/net/udp_tunnel.h      | 16 +++++------
 include/net/vxlan.h           | 13 +++++----
 net/ipv4/udp_tunnel.c         | 12 ++++----
 net/ipv6/ip6_udp_tunnel.c     | 12 ++++----
 net/openvswitch/vport-vxlan.c |  6 ++--
 6 files changed, 62 insertions(+), 63 deletions(-)

-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net-next 1/2] udp: Do not require sock in udp_tunnel_xmit_skb
  2015-01-17 18:18 [PATCH net-next 0/2] vxlan: Don't use UDP socket for transmit Tom Herbert
@ 2015-01-17 18:18 ` Tom Herbert
  2015-01-17 23:45   ` Jesse Gross
  2015-01-18 22:43   ` Or Gerlitz
  2015-01-17 18:18 ` [PATCH net-next 2/2] vxlan: Eliminate dependency on UDP socket in transmit path Tom Herbert
  1 sibling, 2 replies; 16+ messages in thread
From: Tom Herbert @ 2015-01-17 18:18 UTC (permalink / raw)
  To: davem, tgraf, netdev

The UDP tunnel transmit functions udp_tunnel_xmit_skb and
udp_tunnel6_xmit_skb include a socket argument. The socket being
passed to the functions (from VXLAN) is a UDP created for receive
side. The only thing that the socket is used for in the transmit
functions is to get the setting for checksum (enabled or zero).
This patch removes the argument and and adds a nocheck argument
for checksum setting. This eliminates the unnecessary dependency
on a UDP socket for UDP tunnel transmit.

Signed-off-by: Tom Herbert <therbert@google.com>
---
 drivers/net/vxlan.c       | 10 ++++++----
 include/net/udp_tunnel.h  | 16 ++++++++--------
 net/ipv4/udp_tunnel.c     | 12 ++++++------
 net/ipv6/ip6_udp_tunnel.c | 12 ++++++------
 4 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 6b6b456..4fb4205 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1763,8 +1763,9 @@ static int vxlan6_xmit_skb(struct vxlan_sock *vs,
 
 	skb_set_inner_protocol(skb, htons(ETH_P_TEB));
 
-	udp_tunnel6_xmit_skb(vs->sock, dst, skb, dev, saddr, daddr, prio,
-			     ttl, src_port, dst_port);
+	udp_tunnel6_xmit_skb(dst, skb, dev, saddr, daddr, prio,
+			     ttl, src_port, dst_port,
+			     udp_get_no_check6_tx(vs->sock->sk));
 	return 0;
 err:
 	dst_release(dst);
@@ -1842,8 +1843,9 @@ int vxlan_xmit_skb(struct vxlan_sock *vs,
 
 	skb_set_inner_protocol(skb, htons(ETH_P_TEB));
 
-	return udp_tunnel_xmit_skb(vs->sock, rt, skb, src, dst, tos,
-				   ttl, df, src_port, dst_port, xnet);
+	return udp_tunnel_xmit_skb(rt, skb, src, dst, tos,
+				   ttl, df, src_port, dst_port, xnet,
+				   vs->sock->sk->sk_no_check_tx);
 }
 EXPORT_SYMBOL_GPL(vxlan_xmit_skb);
 
diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
index 2a50a70..1a20d33 100644
--- a/include/net/udp_tunnel.h
+++ b/include/net/udp_tunnel.h
@@ -77,17 +77,17 @@ void setup_udp_tunnel_sock(struct net *net, struct socket *sock,
 			   struct udp_tunnel_sock_cfg *sock_cfg);
 
 /* Transmit the skb using UDP encapsulation. */
-int udp_tunnel_xmit_skb(struct socket *sock, struct rtable *rt,
-			struct sk_buff *skb, __be32 src, __be32 dst,
-			__u8 tos, __u8 ttl, __be16 df, __be16 src_port,
-			__be16 dst_port, bool xnet);
+int udp_tunnel_xmit_skb(struct rtable *rt, struct sk_buff *skb,
+			__be32 src, __be32 dst, __u8 tos, __u8 ttl,
+			__be16 df, __be16 src_port, __be16 dst_port,
+			bool xnet, bool nocheck);
 
 #if IS_ENABLED(CONFIG_IPV6)
-int udp_tunnel6_xmit_skb(struct socket *sock, struct dst_entry *dst,
-			 struct sk_buff *skb, struct net_device *dev,
-			 struct in6_addr *saddr, struct in6_addr *daddr,
+int udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sk_buff *skb,
+			 struct net_device *dev, struct in6_addr *saddr,
+			 struct in6_addr *daddr,
 			 __u8 prio, __u8 ttl, __be16 src_port,
-			 __be16 dst_port);
+			 __be16 dst_port, bool nocheck);
 #endif
 
 void udp_tunnel_sock_release(struct socket *sock);
diff --git a/net/ipv4/udp_tunnel.c b/net/ipv4/udp_tunnel.c
index 9996e63..c83b354 100644
--- a/net/ipv4/udp_tunnel.c
+++ b/net/ipv4/udp_tunnel.c
@@ -75,10 +75,10 @@ void setup_udp_tunnel_sock(struct net *net, struct socket *sock,
 }
 EXPORT_SYMBOL_GPL(setup_udp_tunnel_sock);
 
-int udp_tunnel_xmit_skb(struct socket *sock, struct rtable *rt,
-			struct sk_buff *skb, __be32 src, __be32 dst,
-			__u8 tos, __u8 ttl, __be16 df, __be16 src_port,
-			__be16 dst_port, bool xnet)
+int udp_tunnel_xmit_skb(struct rtable *rt, struct sk_buff *skb,
+			__be32 src, __be32 dst, __u8 tos, __u8 ttl,
+			__be16 df, __be16 src_port, __be16 dst_port,
+			bool xnet, bool nocheck)
 {
 	struct udphdr *uh;
 
@@ -90,9 +90,9 @@ int udp_tunnel_xmit_skb(struct socket *sock, struct rtable *rt,
 	uh->source = src_port;
 	uh->len = htons(skb->len);
 
-	udp_set_csum(sock->sk->sk_no_check_tx, skb, src, dst, skb->len);
+	udp_set_csum(nocheck, skb, src, dst, skb->len);
 
-	return iptunnel_xmit(sock->sk, rt, skb, src, dst, IPPROTO_UDP,
+	return iptunnel_xmit(skb->sk, rt, skb, src, dst, IPPROTO_UDP,
 			     tos, ttl, df, xnet);
 }
 EXPORT_SYMBOL_GPL(udp_tunnel_xmit_skb);
diff --git a/net/ipv6/ip6_udp_tunnel.c b/net/ipv6/ip6_udp_tunnel.c
index 8db6c98..32d9b26 100644
--- a/net/ipv6/ip6_udp_tunnel.c
+++ b/net/ipv6/ip6_udp_tunnel.c
@@ -62,14 +62,14 @@ error:
 }
 EXPORT_SYMBOL_GPL(udp_sock_create6);
 
-int udp_tunnel6_xmit_skb(struct socket *sock, struct dst_entry *dst,
-			 struct sk_buff *skb, struct net_device *dev,
-			 struct in6_addr *saddr, struct in6_addr *daddr,
-			 __u8 prio, __u8 ttl, __be16 src_port, __be16 dst_port)
+int udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sk_buff *skb,
+			 struct net_device *dev, struct in6_addr *saddr,
+			 struct in6_addr *daddr,
+			 __u8 prio, __u8 ttl, __be16 src_port,
+			 __be16 dst_port, bool nocheck)
 {
 	struct udphdr *uh;
 	struct ipv6hdr *ip6h;
-	struct sock *sk = sock->sk;
 
 	__skb_push(skb, sizeof(*uh));
 	skb_reset_transport_header(skb);
@@ -85,7 +85,7 @@ int udp_tunnel6_xmit_skb(struct socket *sock, struct dst_entry *dst,
 			    | IPSKB_REROUTED);
 	skb_dst_set(skb, dst);
 
-	udp6_set_csum(udp_get_no_check6_tx(sk), skb, saddr, daddr, skb->len);
+	udp6_set_csum(nocheck, skb, saddr, daddr, skb->len);
 
 	__skb_push(skb, sizeof(*ip6h));
 	skb_reset_network_header(skb);
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net-next 2/2] vxlan: Eliminate dependency on UDP socket in transmit path
  2015-01-17 18:18 [PATCH net-next 0/2] vxlan: Don't use UDP socket for transmit Tom Herbert
  2015-01-17 18:18 ` [PATCH net-next 1/2] udp: Do not require sock in udp_tunnel_xmit_skb Tom Herbert
@ 2015-01-17 18:18 ` Tom Herbert
  2015-01-19  8:59   ` Thomas Graf
  1 sibling, 1 reply; 16+ messages in thread
From: Tom Herbert @ 2015-01-17 18:18 UTC (permalink / raw)
  To: davem, tgraf, netdev

In the vxlan transmit path there is no need to reference the socket
for a tunnel which is needed for the receive side. We do, however,
need the vxlan_dev flags. This patch eliminate references
to the socket in the transmit path, and changes VXLAN_F_UNSHAREABLE
to be VXLAN_F_RCV_FLAGS. This mask is used to store the flags
applicable to receive (GBP, CSUM6_RX, and REMCSUM_RX) in the
vxlan_sock flags.

Signed-off-by: Tom Herbert <therbert@google.com>
---
 drivers/net/vxlan.c           | 60 ++++++++++++++++++++-----------------------
 include/net/vxlan.h           | 13 ++++++----
 net/openvswitch/vport-vxlan.c |  6 ++---
 3 files changed, 38 insertions(+), 41 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 4fb4205..fb7805b 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -270,12 +270,13 @@ static struct vxlan_sock *vxlan_find_sock(struct net *net, sa_family_t family,
 					  __be16 port, u32 flags)
 {
 	struct vxlan_sock *vs;
-	u32 match_flags = flags & VXLAN_F_UNSHAREABLE;
+
+	flags &= VXLAN_F_RCV_FLAGS;
 
 	hlist_for_each_entry_rcu(vs, vs_head(net, port), hlist) {
 		if (inet_sk(vs->sock->sk)->inet_sport == port &&
 		    inet_sk(vs->sock->sk)->sk.sk_family == family &&
-		    (vs->flags & VXLAN_F_UNSHAREABLE) == match_flags)
+		    vs->flags == flags)
 			return vs;
 	}
 	return NULL;
@@ -1668,7 +1669,7 @@ static bool route_shortcircuit(struct net_device *dev, struct sk_buff *skb)
 	return false;
 }
 
-static void vxlan_build_gbp_hdr(struct vxlanhdr *vxh, struct vxlan_sock *vs,
+static void vxlan_build_gbp_hdr(struct vxlanhdr *vxh, u32 vxflags,
 				struct vxlan_metadata *md)
 {
 	struct vxlanhdr_gbp *gbp;
@@ -1686,21 +1687,20 @@ static void vxlan_build_gbp_hdr(struct vxlanhdr *vxh, struct vxlan_sock *vs,
 }
 
 #if IS_ENABLED(CONFIG_IPV6)
-static int vxlan6_xmit_skb(struct vxlan_sock *vs,
-			   struct dst_entry *dst, struct sk_buff *skb,
+static int vxlan6_xmit_skb(struct dst_entry *dst, struct sk_buff *skb,
 			   struct net_device *dev, struct in6_addr *saddr,
 			   struct in6_addr *daddr, __u8 prio, __u8 ttl,
 			   __be16 src_port, __be16 dst_port,
-			   struct vxlan_metadata *md, bool xnet)
+			   struct vxlan_metadata *md, bool xnet, u32 vxflags)
 {
 	struct vxlanhdr *vxh;
 	int min_headroom;
 	int err;
-	bool udp_sum = !udp_get_no_check6_tx(vs->sock->sk);
+	bool udp_sum = !(vxflags & VXLAN_F_UDP_ZERO_CSUM6_TX);
 	int type = udp_sum ? SKB_GSO_UDP_TUNNEL_CSUM : SKB_GSO_UDP_TUNNEL;
 	u16 hdrlen = sizeof(struct vxlanhdr);
 
-	if ((vs->flags & VXLAN_F_REMCSUM_TX) &&
+	if ((vxflags & VXLAN_F_REMCSUM_TX) &&
 	    skb->ip_summed == CHECKSUM_PARTIAL) {
 		int csum_start = skb_checksum_start_offset(skb);
 
@@ -1758,14 +1758,14 @@ static int vxlan6_xmit_skb(struct vxlan_sock *vs,
 		}
 	}
 
-	if (vs->flags & VXLAN_F_GBP)
-		vxlan_build_gbp_hdr(vxh, vs, md);
+	if (vxflags & VXLAN_F_GBP)
+		vxlan_build_gbp_hdr(vxh, vxflags, md);
 
 	skb_set_inner_protocol(skb, htons(ETH_P_TEB));
 
 	udp_tunnel6_xmit_skb(dst, skb, dev, saddr, daddr, prio,
 			     ttl, src_port, dst_port,
-			     udp_get_no_check6_tx(vs->sock->sk));
+			     !!(vxflags & VXLAN_F_UDP_ZERO_CSUM6_TX));
 	return 0;
 err:
 	dst_release(dst);
@@ -1773,20 +1773,19 @@ err:
 }
 #endif
 
-int vxlan_xmit_skb(struct vxlan_sock *vs,
-		   struct rtable *rt, struct sk_buff *skb,
+int vxlan_xmit_skb(struct rtable *rt, struct sk_buff *skb,
 		   __be32 src, __be32 dst, __u8 tos, __u8 ttl, __be16 df,
 		   __be16 src_port, __be16 dst_port,
-		   struct vxlan_metadata *md, bool xnet)
+		   struct vxlan_metadata *md, bool xnet, u32 vxflags)
 {
 	struct vxlanhdr *vxh;
 	int min_headroom;
 	int err;
-	bool udp_sum = !vs->sock->sk->sk_no_check_tx;
+	bool udp_sum = !!(vxflags & VXLAN_F_UDP_CSUM);
 	int type = udp_sum ? SKB_GSO_UDP_TUNNEL_CSUM : SKB_GSO_UDP_TUNNEL;
 	u16 hdrlen = sizeof(struct vxlanhdr);
 
-	if ((vs->flags & VXLAN_F_REMCSUM_TX) &&
+	if ((vxflags & VXLAN_F_REMCSUM_TX) &&
 	    skb->ip_summed == CHECKSUM_PARTIAL) {
 		int csum_start = skb_checksum_start_offset(skb);
 
@@ -1838,14 +1837,14 @@ int vxlan_xmit_skb(struct vxlan_sock *vs,
 		}
 	}
 
-	if (vs->flags & VXLAN_F_GBP)
-		vxlan_build_gbp_hdr(vxh, vs, md);
+	if (vxflags & VXLAN_F_GBP)
+		vxlan_build_gbp_hdr(vxh, vxflags, md);
 
 	skb_set_inner_protocol(skb, htons(ETH_P_TEB));
 
 	return udp_tunnel_xmit_skb(rt, skb, src, dst, tos,
 				   ttl, df, src_port, dst_port, xnet,
-				   vs->sock->sk->sk_no_check_tx);
+				   !(vxflags & VXLAN_F_UDP_CSUM));
 }
 EXPORT_SYMBOL_GPL(vxlan_xmit_skb);
 
@@ -1977,10 +1976,11 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		md.vni = htonl(vni << 8);
 		md.gbp = skb->mark;
 
-		err = vxlan_xmit_skb(vxlan->vn_sock, rt, skb,
-				     fl4.saddr, dst->sin.sin_addr.s_addr,
-				     tos, ttl, df, src_port, dst_port, &md,
-				     !net_eq(vxlan->net, dev_net(vxlan->dev)));
+		err = vxlan_xmit_skb(rt, skb, fl4.saddr,
+				     dst->sin.sin_addr.s_addr, tos, ttl, df,
+				     src_port, dst_port, &md,
+				     !net_eq(vxlan->net, dev_net(vxlan->dev)),
+				     vxlan->flags);
 		if (err < 0) {
 			/* skb is already freed. */
 			skb = NULL;
@@ -2036,10 +2036,10 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		md.vni = htonl(vni << 8);
 		md.gbp = skb->mark;
 
-		err = vxlan6_xmit_skb(vxlan->vn_sock, ndst, skb,
-				      dev, &fl6.saddr, &fl6.daddr, 0, ttl,
-				      src_port, dst_port, &md,
-				      !net_eq(vxlan->net, dev_net(vxlan->dev)));
+		err = vxlan6_xmit_skb(ndst, skb, dev, &fl6.saddr, &fl6.daddr,
+				      0, ttl, src_port, dst_port, &md,
+				      !net_eq(vxlan->net, dev_net(vxlan->dev)),
+				      vxlan->flags);
 #endif
 	}
 
@@ -2511,15 +2511,11 @@ static struct socket *vxlan_create_sock(struct net *net, bool ipv6,
 
 	if (ipv6) {
 		udp_conf.family = AF_INET6;
-		udp_conf.use_udp6_tx_checksums =
-		    !(flags & VXLAN_F_UDP_ZERO_CSUM6_TX);
 		udp_conf.use_udp6_rx_checksums =
 		    !(flags & VXLAN_F_UDP_ZERO_CSUM6_RX);
 	} else {
 		udp_conf.family = AF_INET;
 		udp_conf.local_ip.s_addr = INADDR_ANY;
-		udp_conf.use_udp_checksums =
-		    !!(flags & VXLAN_F_UDP_CSUM);
 	}
 
 	udp_conf.local_udp_port = port;
@@ -2563,7 +2559,7 @@ static struct vxlan_sock *vxlan_socket_create(struct net *net, __be16 port,
 	atomic_set(&vs->refcnt, 1);
 	vs->rcv = rcv;
 	vs->data = data;
-	vs->flags = flags;
+	vs->flags = (flags & VXLAN_F_RCV_FLAGS);
 
 	/* Initialize the vxlan udp offloads structure */
 	vs->udp_offloads.port = port;
diff --git a/include/net/vxlan.h b/include/net/vxlan.h
index 7be8c34..2927d62 100644
--- a/include/net/vxlan.h
+++ b/include/net/vxlan.h
@@ -129,8 +129,12 @@ struct vxlan_sock {
 #define VXLAN_F_REMCSUM_RX		0x400
 #define VXLAN_F_GBP			0x800
 
-/* These flags must match in order for a socket to be shareable */
-#define VXLAN_F_UNSHAREABLE		VXLAN_F_GBP
+/* Flags that are used in the receive patch. These flags must match in
+ * order for a socket to be shareable
+ */
+#define VXLAN_F_RCV_FLAGS		(VXLAN_F_GBP |			\
+					 VXLAN_F_UDP_ZERO_CSUM6_RX |	\
+					 VXLAN_F_REMCSUM_RX)
 
 struct vxlan_sock *vxlan_sock_add(struct net *net, __be16 port,
 				  vxlan_rcv_t *rcv, void *data,
@@ -138,11 +142,10 @@ struct vxlan_sock *vxlan_sock_add(struct net *net, __be16 port,
 
 void vxlan_sock_release(struct vxlan_sock *vs);
 
-int vxlan_xmit_skb(struct vxlan_sock *vs,
-		   struct rtable *rt, struct sk_buff *skb,
+int vxlan_xmit_skb(struct rtable *rt, struct sk_buff *skb,
 		   __be32 src, __be32 dst, __u8 tos, __u8 ttl, __be16 df,
 		   __be16 src_port, __be16 dst_port, struct vxlan_metadata *md,
-		   bool xnet);
+		   bool xnet, u32 vxflags);
 
 static inline netdev_features_t vxlan_features_check(struct sk_buff *skb,
 						     netdev_features_t features)
diff --git a/net/openvswitch/vport-vxlan.c b/net/openvswitch/vport-vxlan.c
index 8a2d54c..3cc983b 100644
--- a/net/openvswitch/vport-vxlan.c
+++ b/net/openvswitch/vport-vxlan.c
@@ -252,12 +252,10 @@ static int vxlan_tnl_send(struct vport *vport, struct sk_buff *skb)
 	md.vni = htonl(be64_to_cpu(tun_key->tun_id) << 8);
 	md.gbp = vxlan_ext_gbp(skb);
 
-	err = vxlan_xmit_skb(vxlan_port->vs, rt, skb,
-			     fl.saddr, tun_key->ipv4_dst,
+	err = vxlan_xmit_skb(rt, skb, fl.saddr, tun_key->ipv4_dst,
 			     tun_key->ipv4_tos, tun_key->ipv4_ttl, df,
 			     src_port, dst_port,
-			     &md,
-			     false);
+			     &md, false, vxlan_port->exts);
 	if (err < 0)
 		ip_rt_put(rt);
 	return err;
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/2] udp: Do not require sock in udp_tunnel_xmit_skb
  2015-01-17 18:18 ` [PATCH net-next 1/2] udp: Do not require sock in udp_tunnel_xmit_skb Tom Herbert
@ 2015-01-17 23:45   ` Jesse Gross
  2015-01-18 22:43   ` Or Gerlitz
  1 sibling, 0 replies; 16+ messages in thread
From: Jesse Gross @ 2015-01-17 23:45 UTC (permalink / raw)
  To: Tom Herbert; +Cc: David Miller, Thomas Graf, netdev

On Sat, Jan 17, 2015 at 10:18 AM, Tom Herbert <therbert@google.com> wrote:
> The UDP tunnel transmit functions udp_tunnel_xmit_skb and
> udp_tunnel6_xmit_skb include a socket argument. The socket being
> passed to the functions (from VXLAN) is a UDP created for receive
> side. The only thing that the socket is used for in the transmit
> functions is to get the setting for checksum (enabled or zero).
> This patch removes the argument and and adds a nocheck argument
> for checksum setting. This eliminates the unnecessary dependency
> on a UDP socket for UDP tunnel transmit.
>
> Signed-off-by: Tom Herbert <therbert@google.com>

I think you need to update Geneve as well:

net/ipv4/geneve.c:139:38: warning: incorrect type in argument 1
(different base types)
net/ipv4/geneve.c:139:38:    expected struct rtable *rt
net/ipv4/geneve.c:139:38:    got struct socket *sock
net/ipv4/geneve.c:139:46: warning: incorrect type in argument 2
(different base types)
net/ipv4/geneve.c:139:46:    expected struct sk_buff *skb
net/ipv4/geneve.c:139:46:    got struct rtable *rt
net/ipv4/geneve.c:139:50: warning: incorrect type in argument 3
(different base types)
net/ipv4/geneve.c:139:50:    expected restricted __be32 [usertype] src
net/ipv4/geneve.c:139:50:    got struct sk_buff *[assigned] skb
net/ipv4/geneve.c:139:60: warning: incorrect type in argument 5
(different base types)
net/ipv4/geneve.c:139:60:    expected unsigned char [unsigned] [usertype] tos
net/ipv4/geneve.c:139:60:    got restricted __be32 [usertype] dst
net/ipv4/geneve.c:140:41: warning: incorrect type in argument 7
(different base types)
net/ipv4/geneve.c:140:41:    expected restricted __be16 [usertype] df
net/ipv4/geneve.c:140:41:    got unsigned char [unsigned] [usertype] ttl
net/ipv4/geneve.c:140:60: warning: incorrect type in argument 10
(different base types)
net/ipv4/geneve.c:140:60:    expected bool [unsigned] [usertype] xnet
net/ipv4/geneve.c:140:60:    got restricted __be16 [usertype] dst_port

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/2] udp: Do not require sock in udp_tunnel_xmit_skb
  2015-01-17 18:18 ` [PATCH net-next 1/2] udp: Do not require sock in udp_tunnel_xmit_skb Tom Herbert
  2015-01-17 23:45   ` Jesse Gross
@ 2015-01-18 22:43   ` Or Gerlitz
  2015-01-20 17:36     ` Tom Herbert
  1 sibling, 1 reply; 16+ messages in thread
From: Or Gerlitz @ 2015-01-18 22:43 UTC (permalink / raw)
  To: Tom Herbert; +Cc: David Miller, Thomas Graf, Linux Netdev List

On Sat, Jan 17, 2015 at 8:18 PM, Tom Herbert <therbert@google.com> wrote:
> The UDP tunnel transmit functions udp_tunnel_xmit_skb and
> udp_tunnel6_xmit_skb include a socket argument. The socket being
> passed to the functions (from VXLAN) is a UDP created for receive
> side. The only thing that the socket is used for in the transmit
> functions is to get the setting for checksum (enabled or zero).

Tom, just to clarify - re the sockets usage in the transmit side,
somewhere bind or alike is done on them such that we have multiple
source UDP ports for given host VXLAN traffic. Here for example the
sender host is 192.168.31.17 and two ports are seen here 54206 and
50795.

Just wanted to make sure this series doesn't change that, since if
this is the case, we introduce here a regression w.r.t RSS hash
spreading from the outer UDP header data at the receiver side (which
is the right thing to do, per your LKS session...)

IP 192.168.31.18.45367 > 192.168.31.17.4789: UDP, length 62
IP 192.168.31.18.45515 > 192.168.31.17.4789: UDP, length 62
IP 192.168.31.17.54206 > 192.168.31.18.4789: UDP, length 26814
IP 192.168.31.18.45367 > 192.168.31.17.4789: UDP, length 62
IP 192.168.31.18.45515 > 192.168.31.17.4789: UDP, length 62
IP 192.168.31.17.50795 > 192.168.31.18.4789: UDP, length 25498
IP 192.168.31.17.50795 > 192.168.31.18.4789: UDP, length 64922
IP 192.168.31.17.50795 > 192.168.31.18.4789: UDP, length 64922
IP 192.168.31.18.45367 > 192.168.31.17.4789: UDP, length 62
IP 192.168.31.17.54206 > 192.168.31.18.4789: UDP, length 38170

> This patch removes the argument and and adds a nocheck argument
> for checksum setting. This eliminates the unnecessary dependency
> on a UDP socket for UDP tunnel transmit.
>
> Signed-off-by: Tom Herbert <therbert@google.com>
> ---
>  drivers/net/vxlan.c       | 10 ++++++----
>  include/net/udp_tunnel.h  | 16 ++++++++--------
>  net/ipv4/udp_tunnel.c     | 12 ++++++------
>  net/ipv6/ip6_udp_tunnel.c | 12 ++++++------
>  4 files changed, 26 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> index 6b6b456..4fb4205 100644
> --- a/drivers/net/vxlan.c
> +++ b/drivers/net/vxlan.c
> @@ -1763,8 +1763,9 @@ static int vxlan6_xmit_skb(struct vxlan_sock *vs,
>
>         skb_set_inner_protocol(skb, htons(ETH_P_TEB));
>
> -       udp_tunnel6_xmit_skb(vs->sock, dst, skb, dev, saddr, daddr, prio,
> -                            ttl, src_port, dst_port);
> +       udp_tunnel6_xmit_skb(dst, skb, dev, saddr, daddr, prio,
> +                            ttl, src_port, dst_port,
> +                            udp_get_no_check6_tx(vs->sock->sk));
>         return 0;
>  err:
>         dst_release(dst);
> @@ -1842,8 +1843,9 @@ int vxlan_xmit_skb(struct vxlan_sock *vs,
>
>         skb_set_inner_protocol(skb, htons(ETH_P_TEB));
>
> -       return udp_tunnel_xmit_skb(vs->sock, rt, skb, src, dst, tos,
> -                                  ttl, df, src_port, dst_port, xnet);
> +       return udp_tunnel_xmit_skb(rt, skb, src, dst, tos,
> +                                  ttl, df, src_port, dst_port, xnet,
> +                                  vs->sock->sk->sk_no_check_tx);
>  }
>  EXPORT_SYMBOL_GPL(vxlan_xmit_skb);
>
> diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
> index 2a50a70..1a20d33 100644
> --- a/include/net/udp_tunnel.h
> +++ b/include/net/udp_tunnel.h
> @@ -77,17 +77,17 @@ void setup_udp_tunnel_sock(struct net *net, struct socket *sock,
>                            struct udp_tunnel_sock_cfg *sock_cfg);
>
>  /* Transmit the skb using UDP encapsulation. */
> -int udp_tunnel_xmit_skb(struct socket *sock, struct rtable *rt,
> -                       struct sk_buff *skb, __be32 src, __be32 dst,
> -                       __u8 tos, __u8 ttl, __be16 df, __be16 src_port,
> -                       __be16 dst_port, bool xnet);
> +int udp_tunnel_xmit_skb(struct rtable *rt, struct sk_buff *skb,
> +                       __be32 src, __be32 dst, __u8 tos, __u8 ttl,
> +                       __be16 df, __be16 src_port, __be16 dst_port,
> +                       bool xnet, bool nocheck);
>
>  #if IS_ENABLED(CONFIG_IPV6)
> -int udp_tunnel6_xmit_skb(struct socket *sock, struct dst_entry *dst,
> -                        struct sk_buff *skb, struct net_device *dev,
> -                        struct in6_addr *saddr, struct in6_addr *daddr,
> +int udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sk_buff *skb,
> +                        struct net_device *dev, struct in6_addr *saddr,
> +                        struct in6_addr *daddr,
>                          __u8 prio, __u8 ttl, __be16 src_port,
> -                        __be16 dst_port);
> +                        __be16 dst_port, bool nocheck);
>  #endif
>
>  void udp_tunnel_sock_release(struct socket *sock);
> diff --git a/net/ipv4/udp_tunnel.c b/net/ipv4/udp_tunnel.c
> index 9996e63..c83b354 100644
> --- a/net/ipv4/udp_tunnel.c
> +++ b/net/ipv4/udp_tunnel.c
> @@ -75,10 +75,10 @@ void setup_udp_tunnel_sock(struct net *net, struct socket *sock,
>  }
>  EXPORT_SYMBOL_GPL(setup_udp_tunnel_sock);
>
> -int udp_tunnel_xmit_skb(struct socket *sock, struct rtable *rt,
> -                       struct sk_buff *skb, __be32 src, __be32 dst,
> -                       __u8 tos, __u8 ttl, __be16 df, __be16 src_port,
> -                       __be16 dst_port, bool xnet)
> +int udp_tunnel_xmit_skb(struct rtable *rt, struct sk_buff *skb,
> +                       __be32 src, __be32 dst, __u8 tos, __u8 ttl,
> +                       __be16 df, __be16 src_port, __be16 dst_port,
> +                       bool xnet, bool nocheck)
>  {
>         struct udphdr *uh;
>
> @@ -90,9 +90,9 @@ int udp_tunnel_xmit_skb(struct socket *sock, struct rtable *rt,
>         uh->source = src_port;
>         uh->len = htons(skb->len);
>
> -       udp_set_csum(sock->sk->sk_no_check_tx, skb, src, dst, skb->len);
> +       udp_set_csum(nocheck, skb, src, dst, skb->len);
>
> -       return iptunnel_xmit(sock->sk, rt, skb, src, dst, IPPROTO_UDP,
> +       return iptunnel_xmit(skb->sk, rt, skb, src, dst, IPPROTO_UDP,
>                              tos, ttl, df, xnet);
>  }
>  EXPORT_SYMBOL_GPL(udp_tunnel_xmit_skb);
> diff --git a/net/ipv6/ip6_udp_tunnel.c b/net/ipv6/ip6_udp_tunnel.c
> index 8db6c98..32d9b26 100644
> --- a/net/ipv6/ip6_udp_tunnel.c
> +++ b/net/ipv6/ip6_udp_tunnel.c
> @@ -62,14 +62,14 @@ error:
>  }
>  EXPORT_SYMBOL_GPL(udp_sock_create6);
>
> -int udp_tunnel6_xmit_skb(struct socket *sock, struct dst_entry *dst,
> -                        struct sk_buff *skb, struct net_device *dev,
> -                        struct in6_addr *saddr, struct in6_addr *daddr,
> -                        __u8 prio, __u8 ttl, __be16 src_port, __be16 dst_port)
> +int udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sk_buff *skb,
> +                        struct net_device *dev, struct in6_addr *saddr,
> +                        struct in6_addr *daddr,
> +                        __u8 prio, __u8 ttl, __be16 src_port,
> +                        __be16 dst_port, bool nocheck)
>  {
>         struct udphdr *uh;
>         struct ipv6hdr *ip6h;
> -       struct sock *sk = sock->sk;
>
>         __skb_push(skb, sizeof(*uh));
>         skb_reset_transport_header(skb);
> @@ -85,7 +85,7 @@ int udp_tunnel6_xmit_skb(struct socket *sock, struct dst_entry *dst,
>                             | IPSKB_REROUTED);
>         skb_dst_set(skb, dst);
>
> -       udp6_set_csum(udp_get_no_check6_tx(sk), skb, saddr, daddr, skb->len);
> +       udp6_set_csum(nocheck, skb, saddr, daddr, skb->len);
>
>         __skb_push(skb, sizeof(*ip6h));
>         skb_reset_network_header(skb);
> --
> 2.2.0.rc0.207.ga3a616c
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 2/2] vxlan: Eliminate dependency on UDP socket in transmit path
  2015-01-17 18:18 ` [PATCH net-next 2/2] vxlan: Eliminate dependency on UDP socket in transmit path Tom Herbert
@ 2015-01-19  8:59   ` Thomas Graf
  2015-01-20 17:29     ` Tom Herbert
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Graf @ 2015-01-19  8:59 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev

On 01/17/15 at 10:18am, Tom Herbert wrote:
> diff --git a/include/net/vxlan.h b/include/net/vxlan.h
> index 7be8c34..2927d62 100644
> --- a/include/net/vxlan.h
> +++ b/include/net/vxlan.h
> @@ -129,8 +129,12 @@ struct vxlan_sock {
>  #define VXLAN_F_REMCSUM_RX		0x400
>  #define VXLAN_F_GBP			0x800
>  
> -/* These flags must match in order for a socket to be shareable */
> -#define VXLAN_F_UNSHAREABLE		VXLAN_F_GBP
> +/* Flags that are used in the receive patch. These flags must match in
                                          ^^^^^



> + * order for a socket to be shareable
> + */
> +#define VXLAN_F_RCV_FLAGS		(VXLAN_F_GBP |			\
> +					 VXLAN_F_UDP_ZERO_CSUM6_RX |	\
> +					 VXLAN_F_REMCSUM_RX)

I'm fine with this. It is slightly odd that we will be transmitting
RCO and other extensions on UDP ports which cannot accept the same
frames. I assume you have specific use cases for this scenario.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 2/2] vxlan: Eliminate dependency on UDP socket in transmit path
  2015-01-19  8:59   ` Thomas Graf
@ 2015-01-20 17:29     ` Tom Herbert
  2015-01-20 18:13       ` Thomas Graf
  0 siblings, 1 reply; 16+ messages in thread
From: Tom Herbert @ 2015-01-20 17:29 UTC (permalink / raw)
  To: Thomas Graf; +Cc: David Miller, Linux Netdev List

On Mon, Jan 19, 2015 at 12:59 AM, Thomas Graf <tgraf@suug.ch> wrote:
> On 01/17/15 at 10:18am, Tom Herbert wrote:
>> diff --git a/include/net/vxlan.h b/include/net/vxlan.h
>> index 7be8c34..2927d62 100644
>> --- a/include/net/vxlan.h
>> +++ b/include/net/vxlan.h
>> @@ -129,8 +129,12 @@ struct vxlan_sock {
>>  #define VXLAN_F_REMCSUM_RX           0x400
>>  #define VXLAN_F_GBP                  0x800
>>
>> -/* These flags must match in order for a socket to be shareable */
>> -#define VXLAN_F_UNSHAREABLE          VXLAN_F_GBP
>> +/* Flags that are used in the receive patch. These flags must match in
>                                           ^^^^^
>
>
>
>> + * order for a socket to be shareable
>> + */
>> +#define VXLAN_F_RCV_FLAGS            (VXLAN_F_GBP |                  \
>> +                                      VXLAN_F_UDP_ZERO_CSUM6_RX |    \
>> +                                      VXLAN_F_REMCSUM_RX)
>
> I'm fine with this. It is slightly odd that we will be transmitting
> RCO and other extensions on UDP ports which cannot accept the same
> frames. I assume you have specific use cases for this scenario.

I didn't see any reason to preclude that, if it needs to be symmetric
in that case it can be forced at the configuration. Being able to
receive RCO but not have to send it to certain peers is important use
case. You may want to consider this also for GBP if there are cases
where we accept GBP from different peers, but only send it to certain
ones.

Tom

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/2] udp: Do not require sock in udp_tunnel_xmit_skb
  2015-01-18 22:43   ` Or Gerlitz
@ 2015-01-20 17:36     ` Tom Herbert
  2015-01-20 21:47       ` Or Gerlitz
  0 siblings, 1 reply; 16+ messages in thread
From: Tom Herbert @ 2015-01-20 17:36 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: David Miller, Thomas Graf, Linux Netdev List

On Sun, Jan 18, 2015 at 2:43 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Sat, Jan 17, 2015 at 8:18 PM, Tom Herbert <therbert@google.com> wrote:
>> The UDP tunnel transmit functions udp_tunnel_xmit_skb and
>> udp_tunnel6_xmit_skb include a socket argument. The socket being
>> passed to the functions (from VXLAN) is a UDP created for receive
>> side. The only thing that the socket is used for in the transmit
>> functions is to get the setting for checksum (enabled or zero).
>
> Tom, just to clarify - re the sockets usage in the transmit side,
> somewhere bind or alike is done on them such that we have multiple
> source UDP ports for given host VXLAN traffic. Here for example the
> sender host is 192.168.31.17 and two ports are seen here 54206 and
> 50795.
>
> Just wanted to make sure this series doesn't change that, since if
> this is the case, we introduce here a regression w.r.t RSS hash
> spreading from the outer UDP header data at the receiver side (which
> is the right thing to do, per your LKS session...)
>
Hi Or,

Using or not using a socket on transmit should have no bearing to the
receive side. RSS works based on the hash of the UDP 5-tuple which
should include a source port set to a value for the inner flow. Since
the UDP socket is unconnected it should have no bearing on RFS or XPS
either.

By the way, for the receiver we should never need to use SO_REUSEPORT
for these encapsulation sockets. If we're getting contention on a
single socket that should be resolved in the kernel (I believe Eric
was looking at eliminate lookup costs for these).

Tom

> IP 192.168.31.18.45367 > 192.168.31.17.4789: UDP, length 62
> IP 192.168.31.18.45515 > 192.168.31.17.4789: UDP, length 62
> IP 192.168.31.17.54206 > 192.168.31.18.4789: UDP, length 26814
> IP 192.168.31.18.45367 > 192.168.31.17.4789: UDP, length 62
> IP 192.168.31.18.45515 > 192.168.31.17.4789: UDP, length 62
> IP 192.168.31.17.50795 > 192.168.31.18.4789: UDP, length 25498
> IP 192.168.31.17.50795 > 192.168.31.18.4789: UDP, length 64922
> IP 192.168.31.17.50795 > 192.168.31.18.4789: UDP, length 64922
> IP 192.168.31.18.45367 > 192.168.31.17.4789: UDP, length 62
> IP 192.168.31.17.54206 > 192.168.31.18.4789: UDP, length 38170
>
>> This patch removes the argument and and adds a nocheck argument
>> for checksum setting. This eliminates the unnecessary dependency
>> on a UDP socket for UDP tunnel transmit.
>>
>> Signed-off-by: Tom Herbert <therbert@google.com>
>> ---
>>  drivers/net/vxlan.c       | 10 ++++++----
>>  include/net/udp_tunnel.h  | 16 ++++++++--------
>>  net/ipv4/udp_tunnel.c     | 12 ++++++------
>>  net/ipv6/ip6_udp_tunnel.c | 12 ++++++------
>>  4 files changed, 26 insertions(+), 24 deletions(-)
>>
>> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
>> index 6b6b456..4fb4205 100644
>> --- a/drivers/net/vxlan.c
>> +++ b/drivers/net/vxlan.c
>> @@ -1763,8 +1763,9 @@ static int vxlan6_xmit_skb(struct vxlan_sock *vs,
>>
>>         skb_set_inner_protocol(skb, htons(ETH_P_TEB));
>>
>> -       udp_tunnel6_xmit_skb(vs->sock, dst, skb, dev, saddr, daddr, prio,
>> -                            ttl, src_port, dst_port);
>> +       udp_tunnel6_xmit_skb(dst, skb, dev, saddr, daddr, prio,
>> +                            ttl, src_port, dst_port,
>> +                            udp_get_no_check6_tx(vs->sock->sk));
>>         return 0;
>>  err:
>>         dst_release(dst);
>> @@ -1842,8 +1843,9 @@ int vxlan_xmit_skb(struct vxlan_sock *vs,
>>
>>         skb_set_inner_protocol(skb, htons(ETH_P_TEB));
>>
>> -       return udp_tunnel_xmit_skb(vs->sock, rt, skb, src, dst, tos,
>> -                                  ttl, df, src_port, dst_port, xnet);
>> +       return udp_tunnel_xmit_skb(rt, skb, src, dst, tos,
>> +                                  ttl, df, src_port, dst_port, xnet,
>> +                                  vs->sock->sk->sk_no_check_tx);
>>  }
>>  EXPORT_SYMBOL_GPL(vxlan_xmit_skb);
>>
>> diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
>> index 2a50a70..1a20d33 100644
>> --- a/include/net/udp_tunnel.h
>> +++ b/include/net/udp_tunnel.h
>> @@ -77,17 +77,17 @@ void setup_udp_tunnel_sock(struct net *net, struct socket *sock,
>>                            struct udp_tunnel_sock_cfg *sock_cfg);
>>
>>  /* Transmit the skb using UDP encapsulation. */
>> -int udp_tunnel_xmit_skb(struct socket *sock, struct rtable *rt,
>> -                       struct sk_buff *skb, __be32 src, __be32 dst,
>> -                       __u8 tos, __u8 ttl, __be16 df, __be16 src_port,
>> -                       __be16 dst_port, bool xnet);
>> +int udp_tunnel_xmit_skb(struct rtable *rt, struct sk_buff *skb,
>> +                       __be32 src, __be32 dst, __u8 tos, __u8 ttl,
>> +                       __be16 df, __be16 src_port, __be16 dst_port,
>> +                       bool xnet, bool nocheck);
>>
>>  #if IS_ENABLED(CONFIG_IPV6)
>> -int udp_tunnel6_xmit_skb(struct socket *sock, struct dst_entry *dst,
>> -                        struct sk_buff *skb, struct net_device *dev,
>> -                        struct in6_addr *saddr, struct in6_addr *daddr,
>> +int udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sk_buff *skb,
>> +                        struct net_device *dev, struct in6_addr *saddr,
>> +                        struct in6_addr *daddr,
>>                          __u8 prio, __u8 ttl, __be16 src_port,
>> -                        __be16 dst_port);
>> +                        __be16 dst_port, bool nocheck);
>>  #endif
>>
>>  void udp_tunnel_sock_release(struct socket *sock);
>> diff --git a/net/ipv4/udp_tunnel.c b/net/ipv4/udp_tunnel.c
>> index 9996e63..c83b354 100644
>> --- a/net/ipv4/udp_tunnel.c
>> +++ b/net/ipv4/udp_tunnel.c
>> @@ -75,10 +75,10 @@ void setup_udp_tunnel_sock(struct net *net, struct socket *sock,
>>  }
>>  EXPORT_SYMBOL_GPL(setup_udp_tunnel_sock);
>>
>> -int udp_tunnel_xmit_skb(struct socket *sock, struct rtable *rt,
>> -                       struct sk_buff *skb, __be32 src, __be32 dst,
>> -                       __u8 tos, __u8 ttl, __be16 df, __be16 src_port,
>> -                       __be16 dst_port, bool xnet)
>> +int udp_tunnel_xmit_skb(struct rtable *rt, struct sk_buff *skb,
>> +                       __be32 src, __be32 dst, __u8 tos, __u8 ttl,
>> +                       __be16 df, __be16 src_port, __be16 dst_port,
>> +                       bool xnet, bool nocheck)
>>  {
>>         struct udphdr *uh;
>>
>> @@ -90,9 +90,9 @@ int udp_tunnel_xmit_skb(struct socket *sock, struct rtable *rt,
>>         uh->source = src_port;
>>         uh->len = htons(skb->len);
>>
>> -       udp_set_csum(sock->sk->sk_no_check_tx, skb, src, dst, skb->len);
>> +       udp_set_csum(nocheck, skb, src, dst, skb->len);
>>
>> -       return iptunnel_xmit(sock->sk, rt, skb, src, dst, IPPROTO_UDP,
>> +       return iptunnel_xmit(skb->sk, rt, skb, src, dst, IPPROTO_UDP,
>>                              tos, ttl, df, xnet);
>>  }
>>  EXPORT_SYMBOL_GPL(udp_tunnel_xmit_skb);
>> diff --git a/net/ipv6/ip6_udp_tunnel.c b/net/ipv6/ip6_udp_tunnel.c
>> index 8db6c98..32d9b26 100644
>> --- a/net/ipv6/ip6_udp_tunnel.c
>> +++ b/net/ipv6/ip6_udp_tunnel.c
>> @@ -62,14 +62,14 @@ error:
>>  }
>>  EXPORT_SYMBOL_GPL(udp_sock_create6);
>>
>> -int udp_tunnel6_xmit_skb(struct socket *sock, struct dst_entry *dst,
>> -                        struct sk_buff *skb, struct net_device *dev,
>> -                        struct in6_addr *saddr, struct in6_addr *daddr,
>> -                        __u8 prio, __u8 ttl, __be16 src_port, __be16 dst_port)
>> +int udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sk_buff *skb,
>> +                        struct net_device *dev, struct in6_addr *saddr,
>> +                        struct in6_addr *daddr,
>> +                        __u8 prio, __u8 ttl, __be16 src_port,
>> +                        __be16 dst_port, bool nocheck)
>>  {
>>         struct udphdr *uh;
>>         struct ipv6hdr *ip6h;
>> -       struct sock *sk = sock->sk;
>>
>>         __skb_push(skb, sizeof(*uh));
>>         skb_reset_transport_header(skb);
>> @@ -85,7 +85,7 @@ int udp_tunnel6_xmit_skb(struct socket *sock, struct dst_entry *dst,
>>                             | IPSKB_REROUTED);
>>         skb_dst_set(skb, dst);
>>
>> -       udp6_set_csum(udp_get_no_check6_tx(sk), skb, saddr, daddr, skb->len);
>> +       udp6_set_csum(nocheck, skb, saddr, daddr, skb->len);
>>
>>         __skb_push(skb, sizeof(*ip6h));
>>         skb_reset_network_header(skb);
>> --
>> 2.2.0.rc0.207.ga3a616c
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 2/2] vxlan: Eliminate dependency on UDP socket in transmit path
  2015-01-20 17:29     ` Tom Herbert
@ 2015-01-20 18:13       ` Thomas Graf
  2015-01-20 22:53         ` Tom Herbert
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Graf @ 2015-01-20 18:13 UTC (permalink / raw)
  To: Tom Herbert; +Cc: David Miller, Linux Netdev List

On 01/20/15 at 09:29am, Tom Herbert wrote:
> I didn't see any reason to preclude that, if it needs to be symmetric
> in that case it can be forced at the configuration. Being able to
> receive RCO but not have to send it to certain peers is important use
> case. You may want to consider this also for GBP if there are cases
> where we accept GBP from different peers, but only send it to certain
> ones.

I think asymmetric configurations are fine, in particular
receive-only. I was reluctant to the send-only scenario initially
as I would expect a VTEP sending RCO frames on UDP dport 8472 to
also always be able to accept RCO frames on that port. I can't
come up with any specific cases where this would lead to problems
though so I have no objections.

As for GBP, as processing of the policy group requires additional
iptables or OVS rules anyway, such behaviour would be implemented
in those rules by either ignoring the mark or dropping such frames.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/2] udp: Do not require sock in udp_tunnel_xmit_skb
  2015-01-20 17:36     ` Tom Herbert
@ 2015-01-20 21:47       ` Or Gerlitz
  2015-01-20 22:24         ` Tom Herbert
  0 siblings, 1 reply; 16+ messages in thread
From: Or Gerlitz @ 2015-01-20 21:47 UTC (permalink / raw)
  To: Tom Herbert; +Cc: David Miller, Thomas Graf, Linux Netdev List

On Tue, Jan 20, 2015 at 7:36 PM, Tom Herbert <therbert@google.com> wrote:
> On Sun, Jan 18, 2015 at 2:43 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>> On Sat, Jan 17, 2015 at 8:18 PM, Tom Herbert <therbert@google.com> wrote:
>>> The UDP tunnel transmit functions udp_tunnel_xmit_skb and
>>> udp_tunnel6_xmit_skb include a socket argument. The socket being
>>> passed to the functions (from VXLAN) is a UDP created for receive
>>> side. The only thing that the socket is used for in the transmit
>>> functions is to get the setting for checksum (enabled or zero).
>>
>> Tom, just to clarify - re the sockets usage in the transmit side,
>> somewhere bind or alike is done on them such that we have multiple
>> source UDP ports for given host VXLAN traffic. Here for example the
>> sender host is 192.168.31.17 and two ports are seen here 54206 and
>> 50795.
>>
>> Just wanted to make sure this series doesn't change that, since if
>> this is the case, we introduce here a regression w.r.t RSS hash
>> spreading from the outer UDP header data at the receiver side (which
>> is the right thing to do, per your LKS session...)
>>
> Hi Or,
>
> Using or not using a socket on transmit should have no bearing to the
> receive side. RSS works based on the hash of the UDP 5-tuple which
> should include a source port set to a value for the inner flow. Since
> the UDP socket is unconnected it should have no bearing on RFS or XPS
> either.

Hi Tom,

You say "which should include a source port set to a value for the
inner flow", well this series doesn't add such logic, nor I am fully
clear what piece exactly is responsible for the fact that I see
multiple source udp ports used from vxlan traffic flowing out of a
certain host. I just wanted to make sure that these patches don't
introduce a regression w.r.t to the **current** (not future) state of
things, can you confirm this?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/2] udp: Do not require sock in udp_tunnel_xmit_skb
  2015-01-20 21:47       ` Or Gerlitz
@ 2015-01-20 22:24         ` Tom Herbert
  2015-01-21 20:57           ` Or Gerlitz
  0 siblings, 1 reply; 16+ messages in thread
From: Tom Herbert @ 2015-01-20 22:24 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: David Miller, Thomas Graf, Linux Netdev List

On Tue, Jan 20, 2015 at 1:47 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Tue, Jan 20, 2015 at 7:36 PM, Tom Herbert <therbert@google.com> wrote:
>> On Sun, Jan 18, 2015 at 2:43 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>> On Sat, Jan 17, 2015 at 8:18 PM, Tom Herbert <therbert@google.com> wrote:
>>>> The UDP tunnel transmit functions udp_tunnel_xmit_skb and
>>>> udp_tunnel6_xmit_skb include a socket argument. The socket being
>>>> passed to the functions (from VXLAN) is a UDP created for receive
>>>> side. The only thing that the socket is used for in the transmit
>>>> functions is to get the setting for checksum (enabled or zero).
>>>
>>> Tom, just to clarify - re the sockets usage in the transmit side,
>>> somewhere bind or alike is done on them such that we have multiple
>>> source UDP ports for given host VXLAN traffic. Here for example the
>>> sender host is 192.168.31.17 and two ports are seen here 54206 and
>>> 50795.
>>>
>>> Just wanted to make sure this series doesn't change that, since if
>>> this is the case, we introduce here a regression w.r.t RSS hash
>>> spreading from the outer UDP header data at the receiver side (which
>>> is the right thing to do, per your LKS session...)
>>>
>> Hi Or,
>>
>> Using or not using a socket on transmit should have no bearing to the
>> receive side. RSS works based on the hash of the UDP 5-tuple which
>> should include a source port set to a value for the inner flow. Since
>> the UDP socket is unconnected it should have no bearing on RFS or XPS
>> either.
>
> Hi Tom,
>
> You say "which should include a source port set to a value for the
> inner flow", well this series doesn't add such logic, nor I am fully
> clear what piece exactly is responsible for the fact that I see
> multiple source udp ports used from vxlan traffic flowing out of a
> certain host. I just wanted to make sure that these patches don't
> introduce a regression w.r.t to the **current** (not future) state of
> things, can you confirm this?

vxlan_xmit_one calls udp_flow_src_port to get a source port value
based on the encapsulated flow.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 2/2] vxlan: Eliminate dependency on UDP socket in transmit path
  2015-01-20 18:13       ` Thomas Graf
@ 2015-01-20 22:53         ` Tom Herbert
  0 siblings, 0 replies; 16+ messages in thread
From: Tom Herbert @ 2015-01-20 22:53 UTC (permalink / raw)
  To: Thomas Graf; +Cc: David Miller, Linux Netdev List

On Tue, Jan 20, 2015 at 10:13 AM, Thomas Graf <tgraf@suug.ch> wrote:
> On 01/20/15 at 09:29am, Tom Herbert wrote:
>> I didn't see any reason to preclude that, if it needs to be symmetric
>> in that case it can be forced at the configuration. Being able to
>> receive RCO but not have to send it to certain peers is important use
>> case. You may want to consider this also for GBP if there are cases
>> where we accept GBP from different peers, but only send it to certain
>> ones.
>
> I think asymmetric configurations are fine, in particular
> receive-only. I was reluctant to the send-only scenario initially
> as I would expect a VTEP sending RCO frames on UDP dport 8472 to
> also always be able to accept RCO frames on that port. I can't
> come up with any specific cases where this would lead to problems
> though so I have no objections.
>
> As for GBP, as processing of the policy group requires additional
> iptables or OVS rules anyway, such behaviour would be implemented
> in those rules by either ignoring the mark or dropping such frames.

It's risky from a protocol perspective to assume that sending
something will be properly ignored. Just because we're willing to
receive something, doesn't mean we necessarily want to send it--
that's the robustness principle :-)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/2] udp: Do not require sock in udp_tunnel_xmit_skb
  2015-01-20 22:24         ` Tom Herbert
@ 2015-01-21 20:57           ` Or Gerlitz
  2015-01-21 22:24             ` Tom Herbert
  0 siblings, 1 reply; 16+ messages in thread
From: Or Gerlitz @ 2015-01-21 20:57 UTC (permalink / raw)
  To: Tom Herbert; +Cc: David Miller, Thomas Graf, Linux Netdev List

On Wed, Jan 21, 2015 at 12:24 AM, Tom Herbert <therbert@google.com> wrote:
> vxlan_xmit_one calls udp_flow_src_port to get a source port value
> based on the encapsulated flow.

OK, got it -- does a by-product of invoking udp_flow_src_port on the
skb is having an hash mark on the skb which is based on the inner
packet and can (is?) used for TX queue selection over MQ devices?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/2] udp: Do not require sock in udp_tunnel_xmit_skb
  2015-01-21 20:57           ` Or Gerlitz
@ 2015-01-21 22:24             ` Tom Herbert
  2015-01-21 22:37               ` Or Gerlitz
  0 siblings, 1 reply; 16+ messages in thread
From: Tom Herbert @ 2015-01-21 22:24 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: David Miller, Thomas Graf, Linux Netdev List

On Wed, Jan 21, 2015 at 12:57 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Wed, Jan 21, 2015 at 12:24 AM, Tom Herbert <therbert@google.com> wrote:
>> vxlan_xmit_one calls udp_flow_src_port to get a source port value
>> based on the encapsulated flow.
>
> OK, got it -- does a by-product of invoking udp_flow_src_port on the
> skb is having an hash mark on the skb which is based on the inner
> packet and can (is?) used for TX queue selection over MQ devices?

Yes, that should be fine. The hash can be used all the way up to TCP
and save in the TCP socket as the rxhash for a connection. This was
RFS, XPS, etc. will work for TCP over an encapsulation.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/2] udp: Do not require sock in udp_tunnel_xmit_skb
  2015-01-21 22:24             ` Tom Herbert
@ 2015-01-21 22:37               ` Or Gerlitz
  2015-01-21 22:55                 ` Tom Herbert
  0 siblings, 1 reply; 16+ messages in thread
From: Or Gerlitz @ 2015-01-21 22:37 UTC (permalink / raw)
  To: Tom Herbert; +Cc: David Miller, Thomas Graf, Linux Netdev List

On Thu, Jan 22, 2015 at 12:24 AM, Tom Herbert <therbert@google.com> wrote:
> On Wed, Jan 21, 2015 at 12:57 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>> On Wed, Jan 21, 2015 at 12:24 AM, Tom Herbert <therbert@google.com> wrote:
>>> vxlan_xmit_one calls udp_flow_src_port to get a source port value
>>> based on the encapsulated flow.
>>
>> OK, got it -- does a by-product of invoking udp_flow_src_port on the
>> skb is having an hash mark on the skb which is based on the inner
>> packet and can (is?) used for TX queue selection over MQ devices?
>
> Yes, that should be fine. The hash can be used all the way up to TCP
> and save in the TCP socket as the rxhash for a connection. This was
> RFS, XPS, etc. will work for TCP over an encapsulation.


so.. can be used or is already used today? Also we're talking on the
TX path, so I wasn't sure to follow on your mentioning of RFS...

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next 1/2] udp: Do not require sock in udp_tunnel_xmit_skb
  2015-01-21 22:37               ` Or Gerlitz
@ 2015-01-21 22:55                 ` Tom Herbert
  0 siblings, 0 replies; 16+ messages in thread
From: Tom Herbert @ 2015-01-21 22:55 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: David Miller, Thomas Graf, Linux Netdev List

On Wed, Jan 21, 2015 at 2:37 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Thu, Jan 22, 2015 at 12:24 AM, Tom Herbert <therbert@google.com> wrote:
>> On Wed, Jan 21, 2015 at 12:57 PM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
>>> On Wed, Jan 21, 2015 at 12:24 AM, Tom Herbert <therbert@google.com> wrote:
>>>> vxlan_xmit_one calls udp_flow_src_port to get a source port value
>>>> based on the encapsulated flow.
>>>
>>> OK, got it -- does a by-product of invoking udp_flow_src_port on the
>>> skb is having an hash mark on the skb which is based on the inner
>>> packet and can (is?) used for TX queue selection over MQ devices?
>>
>> Yes, that should be fine. The hash can be used all the way up to TCP
>> and save in the TCP socket as the rxhash for a connection. This was
>> RFS, XPS, etc. will work for TCP over an encapsulation.
>
>
> so.. can be used or is already used today? Also we're talking on the
> TX path, so I wasn't sure to follow on your mentioning of RFS...

Yes. AFAIK all encapsulation implementations are single queue and
otherwise transparent to the application layer for the hash and queue
selection.

Is there a particular feature you're having trouble with?

Thanks,
Tom

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2015-01-21 22:55 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-17 18:18 [PATCH net-next 0/2] vxlan: Don't use UDP socket for transmit Tom Herbert
2015-01-17 18:18 ` [PATCH net-next 1/2] udp: Do not require sock in udp_tunnel_xmit_skb Tom Herbert
2015-01-17 23:45   ` Jesse Gross
2015-01-18 22:43   ` Or Gerlitz
2015-01-20 17:36     ` Tom Herbert
2015-01-20 21:47       ` Or Gerlitz
2015-01-20 22:24         ` Tom Herbert
2015-01-21 20:57           ` Or Gerlitz
2015-01-21 22:24             ` Tom Herbert
2015-01-21 22:37               ` Or Gerlitz
2015-01-21 22:55                 ` Tom Herbert
2015-01-17 18:18 ` [PATCH net-next 2/2] vxlan: Eliminate dependency on UDP socket in transmit path Tom Herbert
2015-01-19  8:59   ` Thomas Graf
2015-01-20 17:29     ` Tom Herbert
2015-01-20 18:13       ` Thomas Graf
2015-01-20 22:53         ` Tom Herbert

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.