netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] add GENEVE netdev tunnel driver
@ 2015-04-02 19:17 John W. Linville
  2015-04-02 19:17 ` [RFC PATCH 1/5] geneve: remove MODULE_ALIAS_RTNL_LINK from net/ipv4/geneve.c John W. Linville
                   ` (5 more replies)
  0 siblings, 6 replies; 28+ messages in thread
From: John W. Linville @ 2015-04-02 19:17 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger,
	Alexander Duyck

This 5-patch kernel series adds a netdev implementation of a GENEVE
tunnel driver, and the single iproute2 patch enables creation and
such for those netdevs.  This makes use of the existing GENEVE
infrastructure already used by the OVS code.  The net/ipv4/geneve.c
file is renamed as net/ipv4/libgeneve.c as part of these changes.

The overall structure of the GENEVE netdev driver is strongly
influenced by the VXLAN netdev driver.  This is not surprising, as the
two drivers are intended to serve similar purposes.  As development
of the GENEVE driver continues, it is likely that those similarities
will grow stronger.

The current implementation is very simple, restricting itself to a
single GENEVE tunnel netdev per net namespace.  This is due only to
the simplicity of the implementation, and no such limit is inherent
to GENEVE in any way.  I am releasing this as an RFC patch series at
this point both as a means of reporting progress on this effort and
in hopes of collecting useful feedback.

Using the included iproute2 patch, a GENEVE tunnel is created thusly:

	ip link add dev gnv0 type geneve remote 192.168.22.1 vni 1234
	ip link set gnv0 up
	ip addr add 10.1.1.1/24 dev gnv0

After a corresponding tunnel interface is created at the link partner,
traffic should proceed as expected.

Please let me know if anyone has problems.  I look forward to your
feedback!

John

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC PATCH 1/5] geneve: remove MODULE_ALIAS_RTNL_LINK from net/ipv4/geneve.c
  2015-04-02 19:17 [RFC] add GENEVE netdev tunnel driver John W. Linville
@ 2015-04-02 19:17 ` John W. Linville
  2015-04-02 23:39   ` Stephen Hemminger
  2015-04-02 19:17 ` [RFC PATCH 2/5] geneve: move definition of geneve_hdr() to geneve.h John W. Linville
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 28+ messages in thread
From: John W. Linville @ 2015-04-02 19:17 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger,
	Alexander Duyck, John W. Linville

This file is essentially a library for implementing the geneve
encapsulation protocol.  The file does not register any rtnl_link_ops,
so the MODULE_ALIAS_RTNL_LINK macro is inappropriate here.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
---
 net/ipv4/geneve.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/ipv4/geneve.c b/net/ipv4/geneve.c
index 5a4828ba05ad..ba5283adbee8 100644
--- a/net/ipv4/geneve.c
+++ b/net/ipv4/geneve.c
@@ -450,4 +450,3 @@ module_exit(geneve_cleanup_module);
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Jesse Gross <jesse@nicira.com>");
 MODULE_DESCRIPTION("Driver for GENEVE encapsulated traffic");
-MODULE_ALIAS_RTNL_LINK("geneve");
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [RFC PATCH 2/5] geneve: move definition of geneve_hdr() to geneve.h
  2015-04-02 19:17 [RFC] add GENEVE netdev tunnel driver John W. Linville
  2015-04-02 19:17 ` [RFC PATCH 1/5] geneve: remove MODULE_ALIAS_RTNL_LINK from net/ipv4/geneve.c John W. Linville
@ 2015-04-02 19:17 ` John W. Linville
  2015-04-02 19:17 ` [RFC PATCH 3/5] Rename support library for geneve John W. Linville
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 28+ messages in thread
From: John W. Linville @ 2015-04-02 19:17 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger,
	Alexander Duyck, John W. Linville

This is a static inline with identical definitions in multiple places...

Signed-off-by: John W. Linville <linville@tuxdriver.com>
---
 include/net/geneve.h           | 5 +++++
 net/ipv4/geneve.c              | 5 -----
 net/openvswitch/vport-geneve.c | 5 -----
 3 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/include/net/geneve.h b/include/net/geneve.h
index 14fb8d3390b4..2a0543a1899d 100644
--- a/include/net/geneve.h
+++ b/include/net/geneve.h
@@ -62,6 +62,11 @@ struct genevehdr {
 	struct geneve_opt options[];
 };
 
+static inline struct genevehdr *geneve_hdr(const struct sk_buff *skb)
+{
+	return (struct genevehdr *)(udp_hdr(skb) + 1);
+}
+
 #ifdef CONFIG_INET
 struct geneve_sock;
 
diff --git a/net/ipv4/geneve.c b/net/ipv4/geneve.c
index ba5283adbee8..7990075d3fc8 100644
--- a/net/ipv4/geneve.c
+++ b/net/ipv4/geneve.c
@@ -60,11 +60,6 @@ struct geneve_net {
 
 static int geneve_net_id;
 
-static inline struct genevehdr *geneve_hdr(const struct sk_buff *skb)
-{
-	return (struct genevehdr *)(udp_hdr(skb) + 1);
-}
-
 static struct geneve_sock *geneve_find_sock(struct net *net,
 					    sa_family_t family, __be16 port)
 {
diff --git a/net/openvswitch/vport-geneve.c b/net/openvswitch/vport-geneve.c
index bf02fd5808c9..208c576bd1b6 100644
--- a/net/openvswitch/vport-geneve.c
+++ b/net/openvswitch/vport-geneve.c
@@ -46,11 +46,6 @@ static inline struct geneve_port *geneve_vport(const struct vport *vport)
 	return vport_priv(vport);
 }
 
-static inline struct genevehdr *geneve_hdr(const struct sk_buff *skb)
-{
-	return (struct genevehdr *)(udp_hdr(skb) + 1);
-}
-
 /* Convert 64 bit tunnel ID to 24 bit VNI. */
 static void tunnel_id_to_vni(__be64 tun_id, __u8 *vni)
 {
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [RFC PATCH 3/5] Rename support library for geneve
  2015-04-02 19:17 [RFC] add GENEVE netdev tunnel driver John W. Linville
  2015-04-02 19:17 ` [RFC PATCH 1/5] geneve: remove MODULE_ALIAS_RTNL_LINK from net/ipv4/geneve.c John W. Linville
  2015-04-02 19:17 ` [RFC PATCH 2/5] geneve: move definition of geneve_hdr() to geneve.h John W. Linville
@ 2015-04-02 19:17 ` John W. Linville
  2015-04-03  0:05   ` Cong Wang
  2015-04-02 19:17 ` [RFC PATCH 4/5] libgeneve: identify as driver library in modules description John W. Linville
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 28+ messages in thread
From: John W. Linville @ 2015-04-02 19:17 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger,
	Alexander Duyck, John W. Linville

net/ipv4/geneve.c -> net/ipv4/libgeneve.c

This name better reflects the purpose of the module.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
---
 net/ipv4/Kconfig                   | 4 ++--
 net/ipv4/Makefile                  | 2 +-
 net/ipv4/{geneve.c => libgeneve.c} | 0
 net/openvswitch/Kconfig            | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)
 rename net/ipv4/{geneve.c => libgeneve.c} (100%)

diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
index bd2901604842..0fbf9f0428cd 100644
--- a/net/ipv4/Kconfig
+++ b/net/ipv4/Kconfig
@@ -331,8 +331,8 @@ config NET_FOU_IP_TUNNELS
 	  When this option is enabled IP tunnels can be configured to use
 	  FOU or GUE encapsulation.
 
-config GENEVE
-	tristate "Generic Network Virtualization Encapsulation (Geneve)"
+config LIBGENEVE
+	tristate "Generic Network Virtualization Encapsulation library"
 	depends on INET
 	select NET_UDP_TUNNEL
 	---help---
diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile
index 518c04ed666e..b16e91ecb0c6 100644
--- a/net/ipv4/Makefile
+++ b/net/ipv4/Makefile
@@ -56,7 +56,7 @@ obj-$(CONFIG_TCP_CONG_YEAH) += tcp_yeah.o
 obj-$(CONFIG_TCP_CONG_ILLINOIS) += tcp_illinois.o
 obj-$(CONFIG_MEMCG_KMEM) += tcp_memcontrol.o
 obj-$(CONFIG_NETLABEL) += cipso_ipv4.o
-obj-$(CONFIG_GENEVE) += geneve.o
+obj-$(CONFIG_LIBGENEVE) += libgeneve.o
 
 obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \
 		      xfrm4_output.o xfrm4_protocol.o
diff --git a/net/ipv4/geneve.c b/net/ipv4/libgeneve.c
similarity index 100%
rename from net/ipv4/geneve.c
rename to net/ipv4/libgeneve.c
diff --git a/net/openvswitch/Kconfig b/net/openvswitch/Kconfig
index ed6b0f8dd1bb..8e20a4b8f6fa 100644
--- a/net/openvswitch/Kconfig
+++ b/net/openvswitch/Kconfig
@@ -59,7 +59,7 @@ config OPENVSWITCH_VXLAN
 config OPENVSWITCH_GENEVE
 	tristate "Open vSwitch Geneve tunneling support"
 	depends on OPENVSWITCH
-	depends on GENEVE
+	depends on LIBGENEVE
 	default OPENVSWITCH
 	---help---
 	  If you say Y here, then the Open vSwitch will be able create geneve vport.
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [RFC PATCH 4/5] libgeneve: identify as driver library in modules description
  2015-04-02 19:17 [RFC] add GENEVE netdev tunnel driver John W. Linville
                   ` (2 preceding siblings ...)
  2015-04-02 19:17 ` [RFC PATCH 3/5] Rename support library for geneve John W. Linville
@ 2015-04-02 19:17 ` John W. Linville
  2015-04-02 19:17 ` [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels John W. Linville
  2015-04-02 19:17 ` [RFC PATCH] iproute2: GENEVE support John W. Linville
  5 siblings, 0 replies; 28+ messages in thread
From: John W. Linville @ 2015-04-02 19:17 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger,
	Alexander Duyck, John W. Linville

Signed-off-by: John W. Linville <linville@tuxdriver.com>
---
 net/ipv4/libgeneve.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/libgeneve.c b/net/ipv4/libgeneve.c
index 7990075d3fc8..1083ebd84ab6 100644
--- a/net/ipv4/libgeneve.c
+++ b/net/ipv4/libgeneve.c
@@ -444,4 +444,4 @@ module_exit(geneve_cleanup_module);
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Jesse Gross <jesse@nicira.com>");
-MODULE_DESCRIPTION("Driver for GENEVE encapsulated traffic");
+MODULE_DESCRIPTION("Driver library for GENEVE encapsulated traffic");
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
  2015-04-02 19:17 [RFC] add GENEVE netdev tunnel driver John W. Linville
                   ` (3 preceding siblings ...)
  2015-04-02 19:17 ` [RFC PATCH 4/5] libgeneve: identify as driver library in modules description John W. Linville
@ 2015-04-02 19:17 ` John W. Linville
  2015-04-02 20:20   ` Jiri Pirko
                     ` (2 more replies)
  2015-04-02 19:17 ` [RFC PATCH] iproute2: GENEVE support John W. Linville
  5 siblings, 3 replies; 28+ messages in thread
From: John W. Linville @ 2015-04-02 19:17 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger,
	Alexander Duyck, John W. Linville

This is an initial implementation of a netdev driver for GENEVE
tunnels.  This implementation uses a fixed UDP port, and only supports
a single tunnel (and therefore only a single VNI) per net namespace.
Only IPv4 links are supported at this time.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
---
 drivers/net/Kconfig          |  14 ++
 drivers/net/Makefile         |   1 +
 drivers/net/geneve.c         | 451 +++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/if_link.h |   9 +
 4 files changed, 475 insertions(+)
 create mode 100644 drivers/net/geneve.c

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index df51d6025a90..c2519a4e0845 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -179,6 +179,20 @@ config VXLAN
 	  To compile this driver as a module, choose M here: the module
 	  will be called vxlan.
 
+config GENEVE
+       tristate "Generic Network Virtualization Encapsulation netdev"
+       depends on INET && LIBGENEVE
+       select NET_IP_TUNNEL
+       ---help---
+	  This allows one to create geneve virtual interfaces that provide
+	  Layer 2 Networks over Layer 3 Networks. GENEVE is often used
+	  to tunnel virtual network infrastructure in virtualized environments.
+	  For more information see:
+	    http://tools.ietf.org/html/draft-gross-geneve-02
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called geneve.
+
 config NETCONSOLE
 	tristate "Network console logging support"
 	---help---
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index e25fdd7d905e..c12cb22478a7 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_TUN) += tun.o
 obj-$(CONFIG_VETH) += veth.o
 obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
 obj-$(CONFIG_VXLAN) += vxlan.o
+obj-$(CONFIG_GENEVE) += geneve.o
 obj-$(CONFIG_NLMON) += nlmon.o
 
 #
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
new file mode 100644
index 000000000000..fe8895487fc2
--- /dev/null
+++ b/drivers/net/geneve.c
@@ -0,0 +1,451 @@
+/*
+ * GENEVE: Generic Network Virtualization Encapsulation
+ *
+ * Copyright (c) 2015 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <net/rtnetlink.h>
+#include <net/geneve.h>
+
+#define GENEVE_NETDEV_VER	"0.1"
+
+#define GENEVE_UDP_PORT		6081
+
+#define GENEVE_N_VID	(1u << 24)
+#define GENEVE_VID_MASK	(GENEVE_N_VID - 1)
+
+/* per-network namespace private data for this module */
+struct geneve_net {
+	struct list_head  geneve_list;
+};
+
+/* Pseudo network device */
+struct geneve_dev {
+	struct net	   *net;	/* netns for packet i/o */
+	struct net_device  *dev;	/* netdev for geneve tunnel */
+	struct geneve_sock *sock;	/* socket used for geneve tunnel */
+	u8 vni[3];			/* virtual network ID for tunnel */
+	struct sockaddr_in remote;	/* IPv4 address for link partner */
+	struct work_struct sock_work;	/* work item for binding socket */
+	struct list_head   next;	/* geneve's per namespace list */
+};
+
+static void geneve_sock_work(struct work_struct *work);
+
+static struct workqueue_struct *geneve_wq;
+
+static int geneve_net_id;
+
+/* geneve receive/decap routine */
+static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb)
+{
+	struct genevehdr *gnvh = geneve_hdr(skb);
+	struct geneve_dev *geneve;
+	struct pcpu_sw_netstats *stats;
+
+	geneve = gs->rcv_data;
+
+	/* Does the VNI match the device? */
+	if (memcmp(gnvh->vni, geneve->vni, sizeof(geneve->vni)))
+		goto drop;
+
+	skb_reset_mac_header(skb);
+	skb_scrub_packet(skb, !net_eq(geneve->net, dev_net(geneve->dev)));
+	skb->protocol = eth_type_trans(skb, geneve->dev);
+
+	/* force IP checksum recalculation */
+	skb->ip_summed = CHECKSUM_NONE;
+
+	/* Ignore packet loops (and multicast echo) */
+	if (ether_addr_equal(eth_hdr(skb)->h_source, geneve->dev->dev_addr))
+		goto drop;
+
+	skb_reset_network_header(skb);
+
+	stats = this_cpu_ptr(geneve->dev->tstats);
+	u64_stats_update_begin(&stats->syncp);
+	stats->rx_packets++;
+	stats->rx_bytes += skb->len;
+	u64_stats_update_end(&stats->syncp);
+
+	netif_rx(skb);
+
+	return;
+drop:
+	/* Consume bad packet */
+	kfree_skb(skb);
+}
+
+/* Scheduled at device creation to bind to a socket */
+static void geneve_sock_work(struct work_struct *work)
+{
+	struct geneve_dev *geneve = container_of(work, struct geneve_dev, sock_work);
+	struct net *net = geneve->net;
+	struct geneve_sock *gs;
+
+	gs = geneve_sock_add(net, htons(GENEVE_UDP_PORT), geneve_rx, geneve,
+	                     true, false);
+	if (!IS_ERR(gs))
+		geneve->sock = gs;
+
+	dev_put(geneve->dev);
+}
+
+/* Setup stats when device is created */
+static int geneve_init(struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+
+	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
+	if (!dev->tstats)
+		return -ENOMEM;
+
+	/* make new socket outside of RTNL */
+	dev_hold(dev);
+	queue_work(geneve_wq, &geneve->sock_work);
+
+	return 0;
+}
+
+static void geneve_uninit(struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+	struct geneve_sock *gs = geneve->sock;
+
+	if (gs)
+		geneve_sock_release(gs);
+	free_percpu(dev->tstats);
+}
+
+static int geneve_open(struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+	struct geneve_sock *gs = geneve->sock;
+
+	/* socket hasn't been created */
+	if (!gs)
+		return -ENOTCONN;
+
+	return 0;
+}
+
+static int geneve_stop(struct net_device *dev)
+{
+	return 0;
+}
+
+static netdev_tx_t geneve_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+	struct geneve_sock *gs = geneve->sock;
+	struct rtable *rt = NULL;
+	const struct iphdr *iip; /* interior IP header */
+	struct flowi4 fl4;
+	int err;
+	__be16 sport;
+	__u8 tos, ttl = 0;
+
+	iip = ip_hdr(skb);
+
+	skb_reset_mac_header(skb);
+
+	/* TODO: port min/max limits should be configurable */
+	sport = udp_flow_src_port(dev_net(dev), skb, 0, 0, true);
+
+	memset(&fl4, 0, sizeof(fl4));
+	fl4.daddr = geneve->remote.sin_addr.s_addr;
+	rt = ip_route_output_key(geneve->net, &fl4);
+	if (IS_ERR(rt)) {
+		netdev_dbg(dev, "no route to %pI4\n", &fl4.daddr);
+		dev->stats.tx_carrier_errors++;
+		goto tx_error;
+	}
+	if (rt->dst.dev == dev) { /* is this necessary? */
+		netdev_dbg(dev, "circular route to %pI4\n", &fl4.daddr);
+		dev->stats.collisions++;
+		goto rt_tx_error;
+	}
+
+	/* TODO: tos and ttl should be configurable */
+
+	tos = ip_tunnel_ecn_encap(0, iip, skb);
+
+	if (IN_MULTICAST(ntohl(fl4.daddr)))
+		ttl = 1;
+
+	ttl = ttl ? : ip4_dst_hoplimit(&rt->dst);
+
+	/* no need to handle local destination and encap bypass...yet... */
+
+	err = geneve_xmit_skb(gs, rt, skb, fl4.saddr, fl4.daddr,
+	                      tos, ttl, 0, sport, htons(GENEVE_UDP_PORT), 0,
+	                      geneve->vni, 0, NULL, false,
+	                      !net_eq(geneve->net, dev_net(geneve->dev)));
+	if (err < 0)
+		ip_rt_put(rt);
+
+	iptunnel_xmit_stats(err, &dev->stats, dev->tstats);
+
+	return NETDEV_TX_OK;
+
+rt_tx_error:
+	ip_rt_put(rt);
+tx_error:
+	dev->stats.tx_errors++;
+	dev_kfree_skb(skb);
+	return NETDEV_TX_OK;
+}
+
+static const struct net_device_ops geneve_netdev_ops = {
+	.ndo_init		= geneve_init,
+	.ndo_uninit		= geneve_uninit,
+	.ndo_open		= geneve_open,
+	.ndo_stop		= geneve_stop,
+	.ndo_start_xmit		= geneve_xmit,
+	.ndo_get_stats64	= ip_tunnel_get_stats64,
+	.ndo_change_mtu		= eth_change_mtu,
+	.ndo_validate_addr	= eth_validate_addr,
+	.ndo_set_mac_address	= eth_mac_addr,
+};
+
+/* Info for udev, that this is a virtual tunnel endpoint */
+static struct device_type geneve_type = {
+	.name = "geneve",
+};
+
+/* Initialize the device structure. */
+static void geneve_setup(struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+
+	ether_setup(dev);
+
+	dev->netdev_ops = &geneve_netdev_ops;
+	dev->destructor = free_netdev;
+	SET_NETDEV_DEVTYPE(dev, &geneve_type);
+
+	INIT_WORK(&geneve->sock_work, geneve_sock_work);
+
+	dev->tx_queue_len = 0;
+	dev->features = 0;
+
+	dev->vlan_features = dev->features;
+	dev->hw_features = 0;
+
+	geneve->dev = dev;
+}
+
+static const struct nla_policy geneve_policy[IFLA_GENEVE_MAX + 1] = {
+	[IFLA_GENEVE_ID]		= { .type = NLA_U32 },
+	[IFLA_GENEVE_REMOTE]		= { .len = FIELD_SIZEOF(struct iphdr, daddr) },
+};
+
+static int geneve_validate(struct nlattr *tb[], struct nlattr *data[])
+{
+	if (tb[IFLA_ADDRESS]) {
+		if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
+			return -EINVAL;
+
+		if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
+			return -EADDRNOTAVAIL;
+	}
+
+	if (!data)
+		return -EINVAL;
+
+	if (data[IFLA_GENEVE_ID]) {
+		__u32 vni =  nla_get_u32(data[IFLA_GENEVE_ID]);
+		if (vni >= GENEVE_VID_MASK)
+			return -ERANGE;
+	}
+
+	return 0;
+}
+
+static void geneve_get_drvinfo(struct net_device *dev,
+			       struct ethtool_drvinfo *drvinfo)
+{
+	strlcpy(drvinfo->version, GENEVE_NETDEV_VER, sizeof(drvinfo->version));
+	strlcpy(drvinfo->driver, "geneve", sizeof(drvinfo->driver));
+}
+
+static const struct ethtool_ops geneve_ethtool_ops = {
+	.get_drvinfo	= geneve_get_drvinfo,
+	.get_link	= ethtool_op_get_link,
+};
+
+static int geneve_newlink(struct net *net, struct net_device *dev,
+			 struct nlattr *tb[], struct nlattr *data[])
+{
+	struct geneve_net *gn = net_generic(net, geneve_net_id);
+	struct geneve_dev *geneve = netdev_priv(dev);
+	__u32 vni;
+	int err;
+
+	/* TODO: need to support multiple tunnels in a namespace */
+	if (!list_empty(&gn->geneve_list))
+		return -EBUSY;
+
+	if (!data[IFLA_GENEVE_ID])
+		return -EINVAL;
+
+	geneve->net = net;
+
+	vni = nla_get_u32(data[IFLA_GENEVE_ID]);
+	geneve->vni[0] = (vni & 0x00ff0000) >> 16;
+	geneve->vni[1] = (vni & 0x0000ff00) >> 8;
+	geneve->vni[2] =  vni & 0x000000ff;
+
+	if (data[IFLA_GENEVE_REMOTE])
+		geneve->remote.sin_addr.s_addr =
+			nla_get_be32(data[IFLA_GENEVE_REMOTE]);
+
+	dev->ethtool_ops = &geneve_ethtool_ops;
+
+	if (tb[IFLA_ADDRESS] == NULL)
+		eth_hw_addr_random(dev);
+
+	err = register_netdevice(dev);
+	if (err)
+		return err;
+
+	list_add(&geneve->next, &gn->geneve_list);
+
+	return 0;
+}
+
+static void geneve_dellink(struct net_device *dev, struct list_head *head)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+
+	list_del(&geneve->next);
+	unregister_netdevice_queue(dev, head);
+}
+
+static size_t geneve_get_size(const struct net_device *dev)
+{
+	return nla_total_size(sizeof(__u32)) +	/* IFLA_GENEVE_ID */
+		nla_total_size(sizeof(struct in_addr)) + /* IFLA_GENEVE_REMOTE */
+		0;
+}
+
+static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+	__u32 vni;
+
+	vni = (geneve->vni[0] << 16) | (geneve->vni[1] << 8) | geneve->vni[2];
+	if (nla_put_u32(skb, IFLA_GENEVE_ID, vni))
+		goto nla_put_failure;
+
+	if (nla_put_be32(skb, IFLA_GENEVE_REMOTE,
+			 geneve->remote.sin_addr.s_addr))
+		goto nla_put_failure;
+
+	return 0;
+
+nla_put_failure:
+	return -EMSGSIZE;
+}
+
+static struct rtnl_link_ops geneve_link_ops __read_mostly = {
+	.kind		= "geneve",
+	.maxtype	= IFLA_GENEVE_MAX,
+	.policy		= geneve_policy,
+	.priv_size	= sizeof(struct geneve_dev),
+	.setup		= geneve_setup,
+	.validate	= geneve_validate,
+	.newlink	= geneve_newlink,
+	.dellink	= geneve_dellink,
+	.get_size	= geneve_get_size,
+	.fill_info	= geneve_fill_info,
+};
+
+static __net_init int geneve_init_net(struct net *net)
+{
+	struct geneve_net *gn = net_generic(net, geneve_net_id);
+
+	INIT_LIST_HEAD(&gn->geneve_list);
+
+	return 0;
+}
+
+static void __net_exit geneve_exit_net(struct net *net)
+{
+	struct geneve_net *gn = net_generic(net, geneve_net_id);
+	struct geneve_dev *geneve, *next;
+	struct net_device *dev, *aux;
+	LIST_HEAD(list);
+
+	rtnl_lock();
+	for_each_netdev_safe(net, dev, aux)
+		if (dev->rtnl_link_ops == &geneve_link_ops)
+			unregister_netdevice_queue(dev, &list);
+
+	list_for_each_entry_safe(geneve, next, &gn->geneve_list, next) {
+		/* If geneve->dev is in the same netns, it was already added
+		 * to the list by the previous loop.
+		 */
+		if (!net_eq(dev_net(geneve->dev), net))
+			unregister_netdevice_queue(dev, &list);
+	}
+
+	unregister_netdevice_many(&list);
+	rtnl_unlock();
+}
+
+static struct pernet_operations geneve_net_ops = {
+	.init = geneve_init_net,
+	.exit = geneve_exit_net,
+	.id   = &geneve_net_id,
+	.size = sizeof(struct geneve_net),
+};
+
+static int __init geneve_init_module(void)
+{
+	int rc;
+
+	geneve_wq = alloc_workqueue("geneve", 0, 0);
+	if (!geneve_wq)
+		return -ENOMEM;
+
+	rc = register_pernet_subsys(&geneve_net_ops);
+	if (rc)
+		goto out1;
+
+	rc = rtnl_link_register(&geneve_link_ops);
+	if (rc)
+		goto out2;
+
+	return 0;
+out2:
+	unregister_pernet_subsys(&geneve_net_ops);
+out1:
+	destroy_workqueue(geneve_wq);
+	return rc;
+}
+late_initcall(geneve_init_module);
+
+static void __exit geneve_cleanup_module(void)
+{
+	rtnl_link_unregister(&geneve_link_ops);
+	destroy_workqueue(geneve_wq);
+	unregister_pernet_subsys(&geneve_net_ops);
+}
+module_exit(geneve_cleanup_module);
+
+MODULE_LICENSE("GPL");
+MODULE_VERSION(GENEVE_NETDEV_VER);
+MODULE_AUTHOR("John W. Linville <linville@tuxdriver.com>");
+MODULE_DESCRIPTION("Interface driver for GENEVE encapsulated traffic");
+MODULE_ALIAS_RTNL_LINK("geneve");
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 7ffb18df01ca..b0c93c361844 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -390,6 +390,15 @@ struct ifla_vxlan_port_range {
 	__be16	high;
 };
 
+/* GENEVE section */
+enum {
+	IFLA_GENEVE_UNSPEC,
+	IFLA_GENEVE_ID,
+	IFLA_GENEVE_REMOTE,
+	__IFLA_GENEVE_MAX
+};
+#define IFLA_GENEVE_MAX	(__IFLA_GENEVE_MAX - 1)
+
 /* Bonding section */
 
 enum {
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [RFC PATCH] iproute2: GENEVE support
  2015-04-02 19:17 [RFC] add GENEVE netdev tunnel driver John W. Linville
                   ` (4 preceding siblings ...)
  2015-04-02 19:17 ` [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels John W. Linville
@ 2015-04-02 19:17 ` John W. Linville
  5 siblings, 0 replies; 28+ messages in thread
From: John W. Linville @ 2015-04-02 19:17 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger,
	Alexander Duyck, John W. Linville

Signed-off-by: John W. Linville <linville@tuxdriver.com>
---
 include/linux/if_link.h |   9 ++++
 ip/Makefile             |   3 +-
 ip/iplink.c             |   2 +-
 ip/iplink_geneve.c      | 122 ++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 134 insertions(+), 2 deletions(-)
 create mode 100644 ip/iplink_geneve.c

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 3450c3fbdc65..7fb7c3fc6380 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -382,6 +382,15 @@ struct ifla_vxlan_port_range {
 	__be16	high;
 };
 
+/* GENEVE section */
+enum {
+	IFLA_GENEVE_UNSPEC,
+	IFLA_GENEVE_ID,
+	IFLA_GENEVE_REMOTE,
+	__IFLA_GENEVE_MAX
+};
+#define IFLA_GENEVE_MAX	(__IFLA_GENEVE_MAX - 1)
+
 /* Bonding section */
 
 enum {
diff --git a/ip/Makefile b/ip/Makefile
index 2c742f305fef..77653ecc5785 100644
--- a/ip/Makefile
+++ b/ip/Makefile
@@ -6,7 +6,8 @@ IPOBJ=ip.o ipaddress.o ipaddrlabel.o iproute.o iprule.o ipnetns.o \
     iplink_macvlan.o iplink_macvtap.o ipl2tp.o link_vti.o link_vti6.o \
     iplink_vxlan.o tcp_metrics.o iplink_ipoib.o ipnetconf.o link_ip6tnl.o \
     link_iptnl.o link_gre6.o iplink_bond.o iplink_bond_slave.o iplink_hsr.o \
-    iplink_bridge.o iplink_bridge_slave.o ipfou.o iplink_ipvlan.o
+    iplink_bridge.o iplink_bridge_slave.o ipfou.o iplink_ipvlan.o \
+    iplink_geneve.o
 
 RTMONOBJ=rtmon.o
 
diff --git a/ip/iplink.c b/ip/iplink.c
index 5893ee401cf9..da7cccef4c4a 100644
--- a/ip/iplink.c
+++ b/ip/iplink.c
@@ -92,7 +92,7 @@ void iplink_usage(void)
 		fprintf(stderr, "TYPE := { vlan | veth | vcan | dummy | ifb | macvlan | macvtap |\n");
 		fprintf(stderr, "          bridge | bond | ipoib | ip6tnl | ipip | sit | vxlan |\n");
 		fprintf(stderr, "          gre | gretap | ip6gre | ip6gretap | vti | nlmon |\n");
-		fprintf(stderr, "          bond_slave | ipvlan }\n");
+		fprintf(stderr, "          bond_slave | ipvlan | geneve }\n");
 	}
 	exit(-1);
 }
diff --git a/ip/iplink_geneve.c b/ip/iplink_geneve.c
new file mode 100644
index 000000000000..74703e1ee156
--- /dev/null
+++ b/ip/iplink_geneve.c
@@ -0,0 +1,122 @@
+/*
+ * iplink_geneve.c	GENEVE device support
+ *
+ *              This program is free software; you can redistribute it and/or
+ *              modify it under the terms of the GNU General Public License
+ *              as published by the Free Software Foundation; either version
+ *              2 of the License, or (at your option) any later version.
+ *
+ * Authors:     John W. Linville <linville@tuxdriver.com>
+ */
+
+#include <stdio.h>
+
+#include "utils.h"
+#include "ip_common.h"
+
+static void print_explain(FILE *f)
+{
+	fprintf(f, "Usage: ... geneve id VNI remote ADDR\n");
+	fprintf(f, "\n");
+	fprintf(f, "Where: VNI  := 0-16777215\n");
+	fprintf(f, "       ADDR := IP_ADDRESS\n");
+}
+
+static void explain(void)
+{
+	print_explain(stderr);
+}
+
+static int geneve_parse_opt(struct link_util *lu, int argc, char **argv,
+			  struct nlmsghdr *n)
+{
+	__u32 vni = 0;
+	int vni_set = 0;
+	__u32 daddr = 0;
+	struct in6_addr daddr6 = IN6ADDR_ANY_INIT;
+
+
+	while (argc > 0) {
+		if (!matches(*argv, "id") ||
+		    !matches(*argv, "vni")) {
+			NEXT_ARG();
+			if (get_u32(&vni, *argv, 0) ||
+			    vni >= 1u << 24)
+				invarg("invalid id", *argv);
+			vni_set = 1;
+		} else if (!matches(*argv, "remote")) {
+			NEXT_ARG();
+			if (!inet_get_addr(*argv, &daddr, &daddr6)) {
+				fprintf(stderr, "Invalid address \"%s\"\n", *argv);
+				return -1;
+			}
+			if (IN_MULTICAST(ntohl(daddr)))
+				invarg("invalid remote address", *argv);
+		} else if (matches(*argv, "help") == 0) {
+			explain();
+			return -1;
+		} else {
+			fprintf(stderr, "geneve: unknown command \"%s\"?\n", *argv);
+			explain();
+			return -1;
+		}
+		argc--, argv++;
+	}
+
+	if (!vni_set) {
+		fprintf(stderr, "geneve: missing virtual network identifier\n");
+		return -1;
+	}
+
+	if (!daddr) {
+		fprintf(stderr, "geneve: remove link partner not specified\n");
+		return -1;
+	}
+	if (memcmp(&daddr6, &in6addr_any, sizeof(daddr6)) != 0) {
+		fprintf(stderr, "geneve: remove link over IPv6 not supported\n");
+		return -1;
+	}
+
+	addattr32(n, 1024, IFLA_GENEVE_ID, vni);
+	if (daddr)
+		addattr_l(n, 1024, IFLA_GENEVE_REMOTE, &daddr, 4);
+
+	return 0;
+}
+
+static void geneve_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
+{
+	__u32 vni;
+	char s1[1024];
+
+	if (!tb)
+		return;
+
+	if (!tb[IFLA_GENEVE_ID] ||
+	    RTA_PAYLOAD(tb[IFLA_GENEVE_ID]) < sizeof(__u32))
+		return;
+
+	vni = rta_getattr_u32(tb[IFLA_GENEVE_ID]);
+	fprintf(f, "id %u ", vni);
+
+	if (tb[IFLA_GENEVE_REMOTE]) {
+		__be32 addr = rta_getattr_u32(tb[IFLA_GENEVE_REMOTE]);
+		if (addr)
+			fprintf(f, "remote %s ",
+				format_host(AF_INET, 4, &addr, s1, sizeof(s1)));
+	}
+}
+
+static void geneve_print_help(struct link_util *lu, int argc, char **argv,
+	FILE *f)
+{
+	print_explain(f);
+}
+
+struct link_util geneve_link_util = {
+	.id		= "geneve",
+	.maxattr	= IFLA_GENEVE_MAX,
+	.parse_opt	= geneve_parse_opt,
+	.print_opt	= geneve_print_opt,
+	.print_help	= geneve_print_help,
+};
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
  2015-04-02 19:17 ` [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels John W. Linville
@ 2015-04-02 20:20   ` Jiri Pirko
  2015-04-03 14:57     ` John W. Linville
  2015-04-03  5:55   ` Simon Horman
  2015-04-03 21:05   ` Jesse Gross
  2 siblings, 1 reply; 28+ messages in thread
From: Jiri Pirko @ 2015-04-02 20:20 UTC (permalink / raw)
  To: John W. Linville
  Cc: netdev, David S. Miller, Jesse Gross, Andy Zhou,
	Stephen Hemminger, Alexander Duyck

Thu, Apr 02, 2015 at 09:17:06PM CEST, linville@tuxdriver.com wrote:
>This is an initial implementation of a netdev driver for GENEVE
>tunnels.  This implementation uses a fixed UDP port, and only supports
>a single tunnel (and therefore only a single VNI) per net namespace.
>Only IPv4 links are supported at this time.


Thanks for doing this John!


>
>Signed-off-by: John W. Linville <linville@tuxdriver.com>
>---
> drivers/net/Kconfig          |  14 ++
> drivers/net/Makefile         |   1 +
> drivers/net/geneve.c         | 451 +++++++++++++++++++++++++++++++++++++++++++
> include/uapi/linux/if_link.h |   9 +
> 4 files changed, 475 insertions(+)
> create mode 100644 drivers/net/geneve.c
>

...

>+/* Initialize the device structure. */
>+static void geneve_setup(struct net_device *dev)
>+{
>+	struct geneve_dev *geneve = netdev_priv(dev);
>+
>+	ether_setup(dev);
>+
>+	dev->netdev_ops = &geneve_netdev_ops;
>+	dev->destructor = free_netdev;
>+	SET_NETDEV_DEVTYPE(dev, &geneve_type);
>+
>+	INIT_WORK(&geneve->sock_work, geneve_sock_work);

I would push work initialization into geneve_newlink. Seems odd to have
it here in setup.

>+
>+	dev->tx_queue_len = 0;
>+	dev->features = 0;
>+
>+	dev->vlan_features = dev->features;
>+	dev->hw_features = 0;
>+
>+	geneve->dev = dev;
>+}
>+
>+static const struct nla_policy geneve_policy[IFLA_GENEVE_MAX + 1] = {
>+	[IFLA_GENEVE_ID]		= { .type = NLA_U32 },
>+	[IFLA_GENEVE_REMOTE]		= { .len = FIELD_SIZEOF(struct iphdr, daddr) },
>+};
>+
>+static int geneve_validate(struct nlattr *tb[], struct nlattr *data[])
>+{
>+	if (tb[IFLA_ADDRESS]) {
>+		if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
>+			return -EINVAL;
>+
>+		if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
>+			return -EADDRNOTAVAIL;
>+	}
>+
>+	if (!data)
>+		return -EINVAL;
>+
>+	if (data[IFLA_GENEVE_ID]) {
>+		__u32 vni =  nla_get_u32(data[IFLA_GENEVE_ID]);

missing newline

>+		if (vni >= GENEVE_VID_MASK)
>+			return -ERANGE;
>+	}
>+
>+	return 0;
>+}
>+
>+static void geneve_get_drvinfo(struct net_device *dev,
>+			       struct ethtool_drvinfo *drvinfo)
>+{
>+	strlcpy(drvinfo->version, GENEVE_NETDEV_VER, sizeof(drvinfo->version));
>+	strlcpy(drvinfo->driver, "geneve", sizeof(drvinfo->driver));
>+}
>+
>+static const struct ethtool_ops geneve_ethtool_ops = {
>+	.get_drvinfo	= geneve_get_drvinfo,
>+	.get_link	= ethtool_op_get_link,
>+};
>+
>+static int geneve_newlink(struct net *net, struct net_device *dev,
>+			 struct nlattr *tb[], struct nlattr *data[])
>+{
>+	struct geneve_net *gn = net_generic(net, geneve_net_id);
>+	struct geneve_dev *geneve = netdev_priv(dev);
>+	__u32 vni;

why not "u32" ?

>+	int err;
>+
>+	/* TODO: need to support multiple tunnels in a namespace */
>+	if (!list_empty(&gn->geneve_list))
>+		return -EBUSY;

Interesting limitation :)

...

>+static void __net_exit geneve_exit_net(struct net *net)
>+{
>+	struct geneve_net *gn = net_generic(net, geneve_net_id);
>+	struct geneve_dev *geneve, *next;
>+	struct net_device *dev, *aux;
>+	LIST_HEAD(list);
>+
>+	rtnl_lock();
>+	for_each_netdev_safe(net, dev, aux)
>+		if (dev->rtnl_link_ops == &geneve_link_ops)
>+			unregister_netdevice_queue(dev, &list);
>+
>+	list_for_each_entry_safe(geneve, next, &gn->geneve_list, next) {
>+		/* If geneve->dev is in the same netns, it was already added
>+		 * to the list by the previous loop.
>+		 */
>+		if (!net_eq(dev_net(geneve->dev), net))
>+			unregister_netdevice_queue(dev, &list);
>+	}

I know this is c&p of vxlan, but I do not understand why the first loop
is there. The second loop will take care of all since all devs are
listed in ->geneve_list, right?

Also you do not need _safe variant since you traverse through
->geneve_list, which is not modified.

>+
>+	unregister_netdevice_many(&list);
>+	rtnl_unlock();
>+}

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 1/5] geneve: remove MODULE_ALIAS_RTNL_LINK from net/ipv4/geneve.c
  2015-04-02 19:17 ` [RFC PATCH 1/5] geneve: remove MODULE_ALIAS_RTNL_LINK from net/ipv4/geneve.c John W. Linville
@ 2015-04-02 23:39   ` Stephen Hemminger
  2015-04-03 12:17     ` Jiri Pirko
  0 siblings, 1 reply; 28+ messages in thread
From: Stephen Hemminger @ 2015-04-02 23:39 UTC (permalink / raw)
  To: John W. Linville
  Cc: netdev, David S. Miller, Jesse Gross, Andy Zhou, Alexander Duyck

On Thu,  2 Apr 2015 15:17:02 -0400
"John W. Linville" <linville@tuxdriver.com> wrote:

> This file is essentially a library for implementing the geneve
> encapsulation protocol.  The file does not register any rtnl_link_ops,
> so the MODULE_ALIAS_RTNL_LINK macro is inappropriate here.
> 
> Signed-off-by: John W. Linville <linville@tuxdriver.com>
> ---
>  net/ipv4/geneve.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/net/ipv4/geneve.c b/net/ipv4/geneve.c
> index 5a4828ba05ad..ba5283adbee8 100644
> --- a/net/ipv4/geneve.c
> +++ b/net/ipv4/geneve.c
> @@ -450,4 +450,3 @@ module_exit(geneve_cleanup_module);
>  MODULE_LICENSE("GPL");
>  MODULE_AUTHOR("Jesse Gross <jesse@nicira.com>");
>  MODULE_DESCRIPTION("Driver for GENEVE encapsulated traffic");
> -MODULE_ALIAS_RTNL_LINK("geneve");

ok but then how does it get autoloaded?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 3/5] Rename support library for geneve
  2015-04-02 19:17 ` [RFC PATCH 3/5] Rename support library for geneve John W. Linville
@ 2015-04-03  0:05   ` Cong Wang
  2015-04-03 14:40     ` John W. Linville
  0 siblings, 1 reply; 28+ messages in thread
From: Cong Wang @ 2015-04-03  0:05 UTC (permalink / raw)
  To: John W. Linville
  Cc: netdev, David S. Miller, Jesse Gross, Andy Zhou,
	Stephen Hemminger, Alexander Duyck

On Thu, Apr 2, 2015 at 12:17 PM, John W. Linville
<linville@tuxdriver.com> wrote:
> net/ipv4/geneve.c -> net/ipv4/libgeneve.c
>
> This name better reflects the purpose of the module.

There are several files under net/ipv4/ used as a library,
there is no reason to just highlight this single one. So
this change is not necessary at all.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
  2015-04-02 19:17 ` [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels John W. Linville
  2015-04-02 20:20   ` Jiri Pirko
@ 2015-04-03  5:55   ` Simon Horman
  2015-04-03 14:41     ` John W. Linville
  2015-04-03 21:05   ` Jesse Gross
  2 siblings, 1 reply; 28+ messages in thread
From: Simon Horman @ 2015-04-03  5:55 UTC (permalink / raw)
  To: John W. Linville
  Cc: netdev, David S. Miller, Jesse Gross, Andy Zhou,
	Stephen Hemminger, Alexander Duyck

Hi John,

On Thu, Apr 02, 2015 at 03:17:06PM -0400, John W. Linville wrote:
> This is an initial implementation of a netdev driver for GENEVE
> tunnels.  This implementation uses a fixed UDP port, and only supports
> a single tunnel (and therefore only a single VNI) per net namespace.
> Only IPv4 links are supported at this time.
> 
> Signed-off-by: John W. Linville <linville@tuxdriver.com>

Thanks for working on this. I'm very happy to see a Geneve driver hit netdev.

I have a question below.

[snip]

> +/* Scheduled at device creation to bind to a socket */
> +static void geneve_sock_work(struct work_struct *work)
> +{
> +	struct geneve_dev *geneve = container_of(work, struct geneve_dev, sock_work);
> +	struct net *net = geneve->net;
> +	struct geneve_sock *gs;
> +
> +	gs = geneve_sock_add(net, htons(GENEVE_UDP_PORT), geneve_rx, geneve,
> +	                     true, false);
> +	if (!IS_ERR(gs))
> +		geneve->sock = gs;
> +
> +	dev_put(geneve->dev);
> +}
> +
> +/* Setup stats when device is created */
> +static int geneve_init(struct net_device *dev)
> +{
> +	struct geneve_dev *geneve = netdev_priv(dev);
> +
> +	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
> +	if (!dev->tstats)
> +		return -ENOMEM;
> +
> +	/* make new socket outside of RTNL */
> +	dev_hold(dev);
> +	queue_work(geneve_wq, &geneve->sock_work);
> +
> +	return 0;
> +}
> +
> +static void geneve_uninit(struct net_device *dev)
> +{
> +	struct geneve_dev *geneve = netdev_priv(dev);
> +	struct geneve_sock *gs = geneve->sock;
> +
> +	if (gs)
> +		geneve_sock_release(gs);
> +	free_percpu(dev->tstats);
> +}

I am wondering if there a possibility that geneve_sock_work() could run
after the check for gs in geneve_uninit() thus leaking gs?

[snip]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 1/5] geneve: remove MODULE_ALIAS_RTNL_LINK from net/ipv4/geneve.c
  2015-04-02 23:39   ` Stephen Hemminger
@ 2015-04-03 12:17     ` Jiri Pirko
  2015-04-03 14:27       ` John W. Linville
  0 siblings, 1 reply; 28+ messages in thread
From: Jiri Pirko @ 2015-04-03 12:17 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: John W. Linville, netdev, David S. Miller, Jesse Gross,
	Andy Zhou, Alexander Duyck

Fri, Apr 03, 2015 at 01:39:19AM CEST, stephen@networkplumber.org wrote:
>On Thu,  2 Apr 2015 15:17:02 -0400
>"John W. Linville" <linville@tuxdriver.com> wrote:
>
>> This file is essentially a library for implementing the geneve
>> encapsulation protocol.  The file does not register any rtnl_link_ops,
>> so the MODULE_ALIAS_RTNL_LINK macro is inappropriate here.
>> 
>> Signed-off-by: John W. Linville <linville@tuxdriver.com>
>> ---
>>  net/ipv4/geneve.c | 1 -
>>  1 file changed, 1 deletion(-)
>> 
>> diff --git a/net/ipv4/geneve.c b/net/ipv4/geneve.c
>> index 5a4828ba05ad..ba5283adbee8 100644
>> --- a/net/ipv4/geneve.c
>> +++ b/net/ipv4/geneve.c
>> @@ -450,4 +450,3 @@ module_exit(geneve_cleanup_module);
>>  MODULE_LICENSE("GPL");
>>  MODULE_AUTHOR("Jesse Gross <jesse@nicira.com>");
>>  MODULE_DESCRIPTION("Driver for GENEVE encapsulated traffic");
>> -MODULE_ALIAS_RTNL_LINK("geneve");
>
>ok but then how does it get autoloaded?

There is no "struct rtnl_link_ops" defined for this. Therefore this does
not have any sense. User might use rtnl to load the module, but why?
That is clearly a bug which John is fixing.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 1/5] geneve: remove MODULE_ALIAS_RTNL_LINK from net/ipv4/geneve.c
  2015-04-03 12:17     ` Jiri Pirko
@ 2015-04-03 14:27       ` John W. Linville
  0 siblings, 0 replies; 28+ messages in thread
From: John W. Linville @ 2015-04-03 14:27 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Stephen Hemminger, netdev, David S. Miller, Jesse Gross,
	Andy Zhou, Alexander Duyck

On Fri, Apr 03, 2015 at 02:17:55PM +0200, Jiri Pirko wrote:
> Fri, Apr 03, 2015 at 01:39:19AM CEST, stephen@networkplumber.org wrote:
> >On Thu,  2 Apr 2015 15:17:02 -0400
> >"John W. Linville" <linville@tuxdriver.com> wrote:
> >
> >> This file is essentially a library for implementing the geneve
> >> encapsulation protocol.  The file does not register any rtnl_link_ops,
> >> so the MODULE_ALIAS_RTNL_LINK macro is inappropriate here.
> >> 
> >> Signed-off-by: John W. Linville <linville@tuxdriver.com>
> >> ---
> >>  net/ipv4/geneve.c | 1 -
> >>  1 file changed, 1 deletion(-)
> >> 
> >> diff --git a/net/ipv4/geneve.c b/net/ipv4/geneve.c
> >> index 5a4828ba05ad..ba5283adbee8 100644
> >> --- a/net/ipv4/geneve.c
> >> +++ b/net/ipv4/geneve.c
> >> @@ -450,4 +450,3 @@ module_exit(geneve_cleanup_module);
> >>  MODULE_LICENSE("GPL");
> >>  MODULE_AUTHOR("Jesse Gross <jesse@nicira.com>");
> >>  MODULE_DESCRIPTION("Driver for GENEVE encapsulated traffic");
> >> -MODULE_ALIAS_RTNL_LINK("geneve");
> >
> >ok but then how does it get autoloaded?
> 
> There is no "struct rtnl_link_ops" defined for this. Therefore this does
> not have any sense. User might use rtnl to load the module, but why?
> That is clearly a bug which John is fixing.

Yes, exactly.  This module gets loaded via the depmod/modprobe magic
that handles loading modules to resolve external references during
the loadtime linking.

John
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 3/5] Rename support library for geneve
  2015-04-03  0:05   ` Cong Wang
@ 2015-04-03 14:40     ` John W. Linville
  2015-04-03 15:54       ` Nicolas Dichtel
  0 siblings, 1 reply; 28+ messages in thread
From: John W. Linville @ 2015-04-03 14:40 UTC (permalink / raw)
  To: Cong Wang
  Cc: netdev, David S. Miller, Jesse Gross, Andy Zhou,
	Stephen Hemminger, Alexander Duyck

On Thu, Apr 02, 2015 at 05:05:26PM -0700, Cong Wang wrote:
> On Thu, Apr 2, 2015 at 12:17 PM, John W. Linville
> <linville@tuxdriver.com> wrote:
> > net/ipv4/geneve.c -> net/ipv4/libgeneve.c
> >
> > This name better reflects the purpose of the module.
> 
> There are several files under net/ipv4/ used as a library,
> there is no reason to just highlight this single one. So
> this change is not necessary at all.

If you really want to bikeshed this, than the necessity of the name
change depends upon your perspective...  The thing is that I want
the netdev module to be named "geneve.ko".  This mirrors the naming
convention used by the vxlan module, and it just makes sense to me.

Having two modules with the same name seems to confuse modprobe
and company such that module autoloading stops working correctly.
The details escape me, but feel free to apply this series without this
one patch and try it for yourself.  The net effect is that either this
name has to change, or I need a different name (geneve-netdev.ko?) for
the netdev module.  A third option might be to combine this module
with the netdev code, but I kinda like the structure of keeping this
shared code in its own module.

Do you object to the name itself?  Or to having a file with a name like
that under the net/ipv4 directory?  Would you find genevelib.c any
less objectionable than libgeneve.c?  Can you suggest a better place
to keep the shared geneve code?  Or a name change that you like better?

Honestly, this name change seems like the simplest and least awkward
choice to me.  But, I am open to suggestions...

John
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
  2015-04-03  5:55   ` Simon Horman
@ 2015-04-03 14:41     ` John W. Linville
  0 siblings, 0 replies; 28+ messages in thread
From: John W. Linville @ 2015-04-03 14:41 UTC (permalink / raw)
  To: Simon Horman
  Cc: netdev, David S. Miller, Jesse Gross, Andy Zhou,
	Stephen Hemminger, Alexander Duyck

On Fri, Apr 03, 2015 at 02:55:02PM +0900, Simon Horman wrote:
> Hi John,
> 
> On Thu, Apr 02, 2015 at 03:17:06PM -0400, John W. Linville wrote:
> > This is an initial implementation of a netdev driver for GENEVE
> > tunnels.  This implementation uses a fixed UDP port, and only supports
> > a single tunnel (and therefore only a single VNI) per net namespace.
> > Only IPv4 links are supported at this time.
> > 
> > Signed-off-by: John W. Linville <linville@tuxdriver.com>
> 
> Thanks for working on this. I'm very happy to see a Geneve driver hit netdev.
> 
> I have a question below.
> 
> [snip]
> 
> > +/* Scheduled at device creation to bind to a socket */
> > +static void geneve_sock_work(struct work_struct *work)
> > +{
> > +	struct geneve_dev *geneve = container_of(work, struct geneve_dev, sock_work);
> > +	struct net *net = geneve->net;
> > +	struct geneve_sock *gs;
> > +
> > +	gs = geneve_sock_add(net, htons(GENEVE_UDP_PORT), geneve_rx, geneve,
> > +	                     true, false);
> > +	if (!IS_ERR(gs))
> > +		geneve->sock = gs;
> > +
> > +	dev_put(geneve->dev);
> > +}
> > +
> > +/* Setup stats when device is created */
> > +static int geneve_init(struct net_device *dev)
> > +{
> > +	struct geneve_dev *geneve = netdev_priv(dev);
> > +
> > +	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
> > +	if (!dev->tstats)
> > +		return -ENOMEM;
> > +
> > +	/* make new socket outside of RTNL */
> > +	dev_hold(dev);
> > +	queue_work(geneve_wq, &geneve->sock_work);
> > +
> > +	return 0;
> > +}
> > +
> > +static void geneve_uninit(struct net_device *dev)
> > +{
> > +	struct geneve_dev *geneve = netdev_priv(dev);
> > +	struct geneve_sock *gs = geneve->sock;
> > +
> > +	if (gs)
> > +		geneve_sock_release(gs);
> > +	free_percpu(dev->tstats);
> > +}
> 
> I am wondering if there a possibility that geneve_sock_work() could run
> after the check for gs in geneve_uninit() thus leaking gs?
> 
> [snip]

Hey, good catch!  I should add some locking around that...

John
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
  2015-04-02 20:20   ` Jiri Pirko
@ 2015-04-03 14:57     ` John W. Linville
  2015-04-03 15:07       ` John W. Linville
  0 siblings, 1 reply; 28+ messages in thread
From: John W. Linville @ 2015-04-03 14:57 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, David S. Miller, Jesse Gross, Andy Zhou,
	Stephen Hemminger, Alexander Duyck

On Thu, Apr 02, 2015 at 10:20:02PM +0200, Jiri Pirko wrote:
> Thu, Apr 02, 2015 at 09:17:06PM CEST, linville@tuxdriver.com wrote:
> >This is an initial implementation of a netdev driver for GENEVE
> >tunnels.  This implementation uses a fixed UDP port, and only supports
> >a single tunnel (and therefore only a single VNI) per net namespace.
> >Only IPv4 links are supported at this time.
> 
> 
> Thanks for doing this John!
> 
> 
> >
> >Signed-off-by: John W. Linville <linville@tuxdriver.com>
> >---
> > drivers/net/Kconfig          |  14 ++
> > drivers/net/Makefile         |   1 +
> > drivers/net/geneve.c         | 451 +++++++++++++++++++++++++++++++++++++++++++
> > include/uapi/linux/if_link.h |   9 +
> > 4 files changed, 475 insertions(+)
> > create mode 100644 drivers/net/geneve.c
> >
> 
> ...
> 
> >+/* Initialize the device structure. */
> >+static void geneve_setup(struct net_device *dev)
> >+{
> >+	struct geneve_dev *geneve = netdev_priv(dev);
> >+
> >+	ether_setup(dev);
> >+
> >+	dev->netdev_ops = &geneve_netdev_ops;
> >+	dev->destructor = free_netdev;
> >+	SET_NETDEV_DEVTYPE(dev, &geneve_type);
> >+
> >+	INIT_WORK(&geneve->sock_work, geneve_sock_work);
> 
> I would push work initialization into geneve_newlink. Seems odd to have
> it here in setup.

Yes, that will probably work a lot better for multiple tunnels on a
host... :-)

> >+
> >+	dev->tx_queue_len = 0;
> >+	dev->features = 0;
> >+
> >+	dev->vlan_features = dev->features;
> >+	dev->hw_features = 0;
> >+
> >+	geneve->dev = dev;
> >+}
> >+
> >+static const struct nla_policy geneve_policy[IFLA_GENEVE_MAX + 1] = {
> >+	[IFLA_GENEVE_ID]		= { .type = NLA_U32 },
> >+	[IFLA_GENEVE_REMOTE]		= { .len = FIELD_SIZEOF(struct iphdr, daddr) },
> >+};
> >+
> >+static int geneve_validate(struct nlattr *tb[], struct nlattr *data[])
> >+{
> >+	if (tb[IFLA_ADDRESS]) {
> >+		if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
> >+			return -EINVAL;
> >+
> >+		if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
> >+			return -EADDRNOTAVAIL;
> >+	}
> >+
> >+	if (!data)
> >+		return -EINVAL;
> >+
> >+	if (data[IFLA_GENEVE_ID]) {
> >+		__u32 vni =  nla_get_u32(data[IFLA_GENEVE_ID]);
> 
> missing newline

Sure.

> >+		if (vni >= GENEVE_VID_MASK)
> >+			return -ERANGE;
> >+	}
> >+
> >+	return 0;
> >+}
> >+
> >+static void geneve_get_drvinfo(struct net_device *dev,
> >+			       struct ethtool_drvinfo *drvinfo)
> >+{
> >+	strlcpy(drvinfo->version, GENEVE_NETDEV_VER, sizeof(drvinfo->version));
> >+	strlcpy(drvinfo->driver, "geneve", sizeof(drvinfo->driver));
> >+}
> >+
> >+static const struct ethtool_ops geneve_ethtool_ops = {
> >+	.get_drvinfo	= geneve_get_drvinfo,
> >+	.get_link	= ethtool_op_get_link,
> >+};
> >+
> >+static int geneve_newlink(struct net *net, struct net_device *dev,
> >+			 struct nlattr *tb[], struct nlattr *data[])
> >+{
> >+	struct geneve_net *gn = net_generic(net, geneve_net_id);
> >+	struct geneve_dev *geneve = netdev_priv(dev);
> >+	__u32 vni;
> 
> why not "u32" ?

I think I copied that from vxlan.c.  In fact, I'm not really sure I
understand why both exist?
 
> >+	int err;
> >+
> >+	/* TODO: need to support multiple tunnels in a namespace */
> >+	if (!list_empty(&gn->geneve_list))
> >+		return -EBUSY;
> 
> Interesting limitation :)

That should disappear, of course. :-)

> ...
> 
> >+static void __net_exit geneve_exit_net(struct net *net)
> >+{
> >+	struct geneve_net *gn = net_generic(net, geneve_net_id);
> >+	struct geneve_dev *geneve, *next;
> >+	struct net_device *dev, *aux;
> >+	LIST_HEAD(list);
> >+
> >+	rtnl_lock();
> >+	for_each_netdev_safe(net, dev, aux)
> >+		if (dev->rtnl_link_ops == &geneve_link_ops)
> >+			unregister_netdevice_queue(dev, &list);
> >+
> >+	list_for_each_entry_safe(geneve, next, &gn->geneve_list, next) {
> >+		/* If geneve->dev is in the same netns, it was already added
> >+		 * to the list by the previous loop.
> >+		 */
> >+		if (!net_eq(dev_net(geneve->dev), net))
> >+			unregister_netdevice_queue(dev, &list);
> >+	}
> 
> I know this is c&p of vxlan, but I do not understand why the first loop
> is there. The second loop will take care of all since all devs are
> listed in ->geneve_list, right?
> 
> Also you do not need _safe variant since you traverse through
> ->geneve_list, which is not modified.

Yes, it is boilerplate from vxlan.  Maybe Stephen can explain it?

Maybe it relates to the the ordering of the unregister queue?
I'll try to figure it out...

> >+
> >+	unregister_netdevice_many(&list);
> >+	rtnl_unlock();
> >+}

Thanks for the review and suggestions!

John

-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
  2015-04-03 14:57     ` John W. Linville
@ 2015-04-03 15:07       ` John W. Linville
  2015-04-03 15:20         ` Jiri Pirko
  0 siblings, 1 reply; 28+ messages in thread
From: John W. Linville @ 2015-04-03 15:07 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, David S. Miller, Jesse Gross, Andy Zhou,
	Stephen Hemminger, Alexander Duyck

On Fri, Apr 03, 2015 at 10:57:12AM -0400, John W. Linville wrote:
> On Thu, Apr 02, 2015 at 10:20:02PM +0200, Jiri Pirko wrote:
> > Thu, Apr 02, 2015 at 09:17:06PM CEST, linville@tuxdriver.com wrote:
> > >This is an initial implementation of a netdev driver for GENEVE
> > >tunnels.  This implementation uses a fixed UDP port, and only supports
> > >a single tunnel (and therefore only a single VNI) per net namespace.
> > >Only IPv4 links are supported at this time.
> > 
> > 
> > Thanks for doing this John!
> > 
> > 
> > >
> > >Signed-off-by: John W. Linville <linville@tuxdriver.com>
> > >---
> > > drivers/net/Kconfig          |  14 ++
> > > drivers/net/Makefile         |   1 +
> > > drivers/net/geneve.c         | 451 +++++++++++++++++++++++++++++++++++++++++++
> > > include/uapi/linux/if_link.h |   9 +
> > > 4 files changed, 475 insertions(+)
> > > create mode 100644 drivers/net/geneve.c
> > >
> > 
> > ...
> > 
> > >+/* Initialize the device structure. */
> > >+static void geneve_setup(struct net_device *dev)
> > >+{
> > >+	struct geneve_dev *geneve = netdev_priv(dev);
> > >+
> > >+	ether_setup(dev);
> > >+
> > >+	dev->netdev_ops = &geneve_netdev_ops;
> > >+	dev->destructor = free_netdev;
> > >+	SET_NETDEV_DEVTYPE(dev, &geneve_type);
> > >+
> > >+	INIT_WORK(&geneve->sock_work, geneve_sock_work);
> > 
> > I would push work initialization into geneve_newlink. Seems odd to have
> > it here in setup.
> 
> Yes, that will probably work a lot better for multiple tunnels on a
> host... :-)

Ignore that comment -- I was thinking...something else...need coffee... ;-)

What makes newlink better for INIT_WORK than setup?

> 
> > >+
> > >+	dev->tx_queue_len = 0;
> > >+	dev->features = 0;
> > >+
> > >+	dev->vlan_features = dev->features;
> > >+	dev->hw_features = 0;
> > >+
> > >+	geneve->dev = dev;
> > >+}
> > >+
> > >+static const struct nla_policy geneve_policy[IFLA_GENEVE_MAX + 1] = {
> > >+	[IFLA_GENEVE_ID]		= { .type = NLA_U32 },
> > >+	[IFLA_GENEVE_REMOTE]		= { .len = FIELD_SIZEOF(struct iphdr, daddr) },
> > >+};
> > >+
> > >+static int geneve_validate(struct nlattr *tb[], struct nlattr *data[])
> > >+{
> > >+	if (tb[IFLA_ADDRESS]) {
> > >+		if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
> > >+			return -EINVAL;
> > >+
> > >+		if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
> > >+			return -EADDRNOTAVAIL;
> > >+	}
> > >+
> > >+	if (!data)
> > >+		return -EINVAL;
> > >+
> > >+	if (data[IFLA_GENEVE_ID]) {
> > >+		__u32 vni =  nla_get_u32(data[IFLA_GENEVE_ID]);
> > 
> > missing newline
> 
> Sure.
> 
> > >+		if (vni >= GENEVE_VID_MASK)
> > >+			return -ERANGE;
> > >+	}
> > >+
> > >+	return 0;
> > >+}
> > >+
> > >+static void geneve_get_drvinfo(struct net_device *dev,
> > >+			       struct ethtool_drvinfo *drvinfo)
> > >+{
> > >+	strlcpy(drvinfo->version, GENEVE_NETDEV_VER, sizeof(drvinfo->version));
> > >+	strlcpy(drvinfo->driver, "geneve", sizeof(drvinfo->driver));
> > >+}
> > >+
> > >+static const struct ethtool_ops geneve_ethtool_ops = {
> > >+	.get_drvinfo	= geneve_get_drvinfo,
> > >+	.get_link	= ethtool_op_get_link,
> > >+};
> > >+
> > >+static int geneve_newlink(struct net *net, struct net_device *dev,
> > >+			 struct nlattr *tb[], struct nlattr *data[])
> > >+{
> > >+	struct geneve_net *gn = net_generic(net, geneve_net_id);
> > >+	struct geneve_dev *geneve = netdev_priv(dev);
> > >+	__u32 vni;
> > 
> > why not "u32" ?
> 
> I think I copied that from vxlan.c.  In fact, I'm not really sure I
> understand why both exist?
>  
> > >+	int err;
> > >+
> > >+	/* TODO: need to support multiple tunnels in a namespace */
> > >+	if (!list_empty(&gn->geneve_list))
> > >+		return -EBUSY;
> > 
> > Interesting limitation :)
> 
> That should disappear, of course. :-)
> 
> > ...
> > 
> > >+static void __net_exit geneve_exit_net(struct net *net)
> > >+{
> > >+	struct geneve_net *gn = net_generic(net, geneve_net_id);
> > >+	struct geneve_dev *geneve, *next;
> > >+	struct net_device *dev, *aux;
> > >+	LIST_HEAD(list);
> > >+
> > >+	rtnl_lock();
> > >+	for_each_netdev_safe(net, dev, aux)
> > >+		if (dev->rtnl_link_ops == &geneve_link_ops)
> > >+			unregister_netdevice_queue(dev, &list);
> > >+
> > >+	list_for_each_entry_safe(geneve, next, &gn->geneve_list, next) {
> > >+		/* If geneve->dev is in the same netns, it was already added
> > >+		 * to the list by the previous loop.
> > >+		 */
> > >+		if (!net_eq(dev_net(geneve->dev), net))
> > >+			unregister_netdevice_queue(dev, &list);
> > >+	}
> > 
> > I know this is c&p of vxlan, but I do not understand why the first loop
> > is there. The second loop will take care of all since all devs are
> > listed in ->geneve_list, right?
> > 
> > Also you do not need _safe variant since you traverse through
> > ->geneve_list, which is not modified.
> 
> Yes, it is boilerplate from vxlan.  Maybe Stephen can explain it?
> 
> Maybe it relates to the the ordering of the unregister queue?
> I'll try to figure it out...
> 
> > >+
> > >+	unregister_netdevice_many(&list);
> > >+	rtnl_unlock();
> > >+}
> 
> Thanks for the review and suggestions!
> 
> John
> 
> -- 
> John W. Linville		Someday the world will need a hero, and you
> linville@tuxdriver.com			might be all we have.  Be ready.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
  2015-04-03 15:07       ` John W. Linville
@ 2015-04-03 15:20         ` Jiri Pirko
  2015-04-03 18:31           ` John W. Linville
  0 siblings, 1 reply; 28+ messages in thread
From: Jiri Pirko @ 2015-04-03 15:20 UTC (permalink / raw)
  To: John W. Linville
  Cc: netdev, David S. Miller, Jesse Gross, Andy Zhou,
	Stephen Hemminger, Alexander Duyck

Fri, Apr 03, 2015 at 05:07:44PM CEST, linville@tuxdriver.com wrote:
>On Fri, Apr 03, 2015 at 10:57:12AM -0400, John W. Linville wrote:
>> On Thu, Apr 02, 2015 at 10:20:02PM +0200, Jiri Pirko wrote:
>> > Thu, Apr 02, 2015 at 09:17:06PM CEST, linville@tuxdriver.com wrote:
>> > >This is an initial implementation of a netdev driver for GENEVE
>> > >tunnels.  This implementation uses a fixed UDP port, and only supports
>> > >a single tunnel (and therefore only a single VNI) per net namespace.
>> > >Only IPv4 links are supported at this time.
>> > 
>> > 
>> > Thanks for doing this John!
>> > 
>> > 
>> > >
>> > >Signed-off-by: John W. Linville <linville@tuxdriver.com>
>> > >---
>> > > drivers/net/Kconfig          |  14 ++
>> > > drivers/net/Makefile         |   1 +
>> > > drivers/net/geneve.c         | 451 +++++++++++++++++++++++++++++++++++++++++++
>> > > include/uapi/linux/if_link.h |   9 +
>> > > 4 files changed, 475 insertions(+)
>> > > create mode 100644 drivers/net/geneve.c
>> > >
>> > 
>> > ...
>> > 
>> > >+/* Initialize the device structure. */
>> > >+static void geneve_setup(struct net_device *dev)
>> > >+{
>> > >+	struct geneve_dev *geneve = netdev_priv(dev);
>> > >+
>> > >+	ether_setup(dev);
>> > >+
>> > >+	dev->netdev_ops = &geneve_netdev_ops;
>> > >+	dev->destructor = free_netdev;
>> > >+	SET_NETDEV_DEVTYPE(dev, &geneve_type);
>> > >+
>> > >+	INIT_WORK(&geneve->sock_work, geneve_sock_work);
>> > 
>> > I would push work initialization into geneve_newlink. Seems odd to have
>> > it here in setup.
>> 
>> Yes, that will probably work a lot better for multiple tunnels on a
>> host... :-)
>
>Ignore that comment -- I was thinking...something else...need coffee... ;-)
>
>What makes newlink better for INIT_WORK than setup?


Can be here for sure. I just thought that setup should setup netdev
according to type. This is not it. This initializes a part of priv
structure.


>
>> 
>> > >+
>> > >+	dev->tx_queue_len = 0;
>> > >+	dev->features = 0;
>> > >+
>> > >+	dev->vlan_features = dev->features;
>> > >+	dev->hw_features = 0;
>> > >+
>> > >+	geneve->dev = dev;
>> > >+}
>> > >+
>> > >+static const struct nla_policy geneve_policy[IFLA_GENEVE_MAX + 1] = {
>> > >+	[IFLA_GENEVE_ID]		= { .type = NLA_U32 },
>> > >+	[IFLA_GENEVE_REMOTE]		= { .len = FIELD_SIZEOF(struct iphdr, daddr) },
>> > >+};
>> > >+
>> > >+static int geneve_validate(struct nlattr *tb[], struct nlattr *data[])
>> > >+{
>> > >+	if (tb[IFLA_ADDRESS]) {
>> > >+		if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
>> > >+			return -EINVAL;
>> > >+
>> > >+		if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
>> > >+			return -EADDRNOTAVAIL;
>> > >+	}
>> > >+
>> > >+	if (!data)
>> > >+		return -EINVAL;
>> > >+
>> > >+	if (data[IFLA_GENEVE_ID]) {
>> > >+		__u32 vni =  nla_get_u32(data[IFLA_GENEVE_ID]);
>> > 
>> > missing newline
>> 
>> Sure.
>> 
>> > >+		if (vni >= GENEVE_VID_MASK)
>> > >+			return -ERANGE;
>> > >+	}
>> > >+
>> > >+	return 0;
>> > >+}
>> > >+
>> > >+static void geneve_get_drvinfo(struct net_device *dev,
>> > >+			       struct ethtool_drvinfo *drvinfo)
>> > >+{
>> > >+	strlcpy(drvinfo->version, GENEVE_NETDEV_VER, sizeof(drvinfo->version));
>> > >+	strlcpy(drvinfo->driver, "geneve", sizeof(drvinfo->driver));
>> > >+}
>> > >+
>> > >+static const struct ethtool_ops geneve_ethtool_ops = {
>> > >+	.get_drvinfo	= geneve_get_drvinfo,
>> > >+	.get_link	= ethtool_op_get_link,
>> > >+};
>> > >+
>> > >+static int geneve_newlink(struct net *net, struct net_device *dev,
>> > >+			 struct nlattr *tb[], struct nlattr *data[])
>> > >+{
>> > >+	struct geneve_net *gn = net_generic(net, geneve_net_id);
>> > >+	struct geneve_dev *geneve = netdev_priv(dev);
>> > >+	__u32 vni;
>> > 
>> > why not "u32" ?
>> 
>> I think I copied that from vxlan.c.  In fact, I'm not really sure I
>> understand why both exist?
>>  
>> > >+	int err;
>> > >+
>> > >+	/* TODO: need to support multiple tunnels in a namespace */
>> > >+	if (!list_empty(&gn->geneve_list))
>> > >+		return -EBUSY;
>> > 
>> > Interesting limitation :)
>> 
>> That should disappear, of course. :-)
>> 
>> > ...
>> > 
>> > >+static void __net_exit geneve_exit_net(struct net *net)
>> > >+{
>> > >+	struct geneve_net *gn = net_generic(net, geneve_net_id);
>> > >+	struct geneve_dev *geneve, *next;
>> > >+	struct net_device *dev, *aux;
>> > >+	LIST_HEAD(list);
>> > >+
>> > >+	rtnl_lock();
>> > >+	for_each_netdev_safe(net, dev, aux)
>> > >+		if (dev->rtnl_link_ops == &geneve_link_ops)
>> > >+			unregister_netdevice_queue(dev, &list);
>> > >+
>> > >+	list_for_each_entry_safe(geneve, next, &gn->geneve_list, next) {
>> > >+		/* If geneve->dev is in the same netns, it was already added
>> > >+		 * to the list by the previous loop.
>> > >+		 */
>> > >+		if (!net_eq(dev_net(geneve->dev), net))
>> > >+			unregister_netdevice_queue(dev, &list);
>> > >+	}
>> > 
>> > I know this is c&p of vxlan, but I do not understand why the first loop
>> > is there. The second loop will take care of all since all devs are
>> > listed in ->geneve_list, right?
>> > 
>> > Also you do not need _safe variant since you traverse through
>> > ->geneve_list, which is not modified.
>> 
>> Yes, it is boilerplate from vxlan.  Maybe Stephen can explain it?
>> 
>> Maybe it relates to the the ordering of the unregister queue?
>> I'll try to figure it out...
>> 
>> > >+
>> > >+	unregister_netdevice_many(&list);
>> > >+	rtnl_unlock();
>> > >+}
>> 
>> Thanks for the review and suggestions!
>> 
>> John
>> 
>> -- 
>> John W. Linville		Someday the world will need a hero, and you
>> linville@tuxdriver.com			might be all we have.  Be ready.
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
>
>-- 
>John W. Linville		Someday the world will need a hero, and you
>linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 3/5] Rename support library for geneve
  2015-04-03 14:40     ` John W. Linville
@ 2015-04-03 15:54       ` Nicolas Dichtel
  2015-04-03 18:25         ` John W. Linville
  0 siblings, 1 reply; 28+ messages in thread
From: Nicolas Dichtel @ 2015-04-03 15:54 UTC (permalink / raw)
  To: John W. Linville, Cong Wang
  Cc: netdev, David S. Miller, Jesse Gross, Andy Zhou,
	Stephen Hemminger, Alexander Duyck

Le 03/04/2015 16:40, John W. Linville a écrit :
[snip]
> Honestly, this name change seems like the simplest and least awkward
> choice to me.  But, I am open to suggestions...
There is already some files named '_core.c' under net/ipv4 and net/ipv6.
What about geneve_core.c ?

My two cents,
Nicolas

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 3/5] Rename support library for geneve
  2015-04-03 15:54       ` Nicolas Dichtel
@ 2015-04-03 18:25         ` John W. Linville
  0 siblings, 0 replies; 28+ messages in thread
From: John W. Linville @ 2015-04-03 18:25 UTC (permalink / raw)
  To: Nicolas Dichtel
  Cc: Cong Wang, netdev, David S. Miller, Jesse Gross, Andy Zhou,
	Stephen Hemminger, Alexander Duyck

On Fri, Apr 03, 2015 at 05:54:02PM +0200, Nicolas Dichtel wrote:
> Le 03/04/2015 16:40, John W. Linville a écrit :
> [snip]
> >Honestly, this name change seems like the simplest and least awkward
> >choice to me.  But, I am open to suggestions...
> There is already some files named '_core.c' under net/ipv4 and net/ipv6.
> What about geneve_core.c ?

Works for me...

-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
  2015-04-03 15:20         ` Jiri Pirko
@ 2015-04-03 18:31           ` John W. Linville
  0 siblings, 0 replies; 28+ messages in thread
From: John W. Linville @ 2015-04-03 18:31 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, David S. Miller, Jesse Gross, Andy Zhou,
	Stephen Hemminger, Alexander Duyck

On Fri, Apr 03, 2015 at 05:20:53PM +0200, Jiri Pirko wrote:
> Fri, Apr 03, 2015 at 05:07:44PM CEST, linville@tuxdriver.com wrote:
> >On Fri, Apr 03, 2015 at 10:57:12AM -0400, John W. Linville wrote:
> >> On Thu, Apr 02, 2015 at 10:20:02PM +0200, Jiri Pirko wrote:
> >> > Thu, Apr 02, 2015 at 09:17:06PM CEST, linville@tuxdriver.com wrote:
> >> > >This is an initial implementation of a netdev driver for GENEVE
> >> > >tunnels.  This implementation uses a fixed UDP port, and only supports
> >> > >a single tunnel (and therefore only a single VNI) per net namespace.
> >> > >Only IPv4 links are supported at this time.
> >> > 
> >> > 
> >> > Thanks for doing this John!
> >> > 
> >> > 
> >> > >
> >> > >Signed-off-by: John W. Linville <linville@tuxdriver.com>
> >> > >---
> >> > > drivers/net/Kconfig          |  14 ++
> >> > > drivers/net/Makefile         |   1 +
> >> > > drivers/net/geneve.c         | 451 +++++++++++++++++++++++++++++++++++++++++++
> >> > > include/uapi/linux/if_link.h |   9 +
> >> > > 4 files changed, 475 insertions(+)
> >> > > create mode 100644 drivers/net/geneve.c
> >> > >
> >> > 
> >> > ...
> >> > 
> >> > >+/* Initialize the device structure. */
> >> > >+static void geneve_setup(struct net_device *dev)
> >> > >+{
> >> > >+	struct geneve_dev *geneve = netdev_priv(dev);
> >> > >+
> >> > >+	ether_setup(dev);
> >> > >+
> >> > >+	dev->netdev_ops = &geneve_netdev_ops;
> >> > >+	dev->destructor = free_netdev;
> >> > >+	SET_NETDEV_DEVTYPE(dev, &geneve_type);
> >> > >+
> >> > >+	INIT_WORK(&geneve->sock_work, geneve_sock_work);
> >> > 
> >> > I would push work initialization into geneve_newlink. Seems odd to have
> >> > it here in setup.
> >> 
> >> Yes, that will probably work a lot better for multiple tunnels on a
> >> host... :-)
> >
> >Ignore that comment -- I was thinking...something else...need coffee... ;-)
> >
> >What makes newlink better for INIT_WORK than setup?
> 
> 
> Can be here for sure. I just thought that setup should setup netdev
> according to type. This is not it. This initializes a part of priv
> structure.

Ah -- I wasn't thinking about it that way.  I'll consider it. :-)

John
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
  2015-04-02 19:17 ` [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels John W. Linville
  2015-04-02 20:20   ` Jiri Pirko
  2015-04-03  5:55   ` Simon Horman
@ 2015-04-03 21:05   ` Jesse Gross
  2015-04-04  1:01     ` Francois Romieu
  2015-04-06 18:43     ` John W. Linville
  2 siblings, 2 replies; 28+ messages in thread
From: Jesse Gross @ 2015-04-03 21:05 UTC (permalink / raw)
  To: John W. Linville
  Cc: netdev, David S. Miller, Andy Zhou, Stephen Hemminger, Alexander Duyck

On Thu, Apr 2, 2015 at 12:17 PM, John W. Linville
<linville@tuxdriver.com> wrote:
> This is an initial implementation of a netdev driver for GENEVE
> tunnels.  This implementation uses a fixed UDP port, and only supports
> a single tunnel (and therefore only a single VNI) per net namespace.
> Only IPv4 links are supported at this time.
>
> Signed-off-by: John W. Linville <linville@tuxdriver.com>

Thanks from me as well for working on this!

There is the common IP tunnel device code that GRE, IPIP, etc. use -
pretty much every tunnel except for VXLAN. Does it make sense to use
that? VXLAN doesn't because it has a fair amount of specialized logic
and perhaps Geneve will end up going down that path as well, so
perhaps it doesn't make sense but I wanted to make sure that you were
aware of it.

I also wanted to mention that Geneve (the protocol, not the current
implementation) can encapsulate protocols other than Ethernet, similar
to GRE. I don't think this is necessary for a first implementation but
it's worth keeping in mind in case there is anything that we end up
designing in the interfaces that can keep this clean in the future.

> diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> new file mode 100644
> index 000000000000..fe8895487fc2
> --- /dev/null
> +++ b/drivers/net/geneve.c
> +/* geneve receive/decap routine */
> +static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb)
> +{
> +       struct genevehdr *gnvh = geneve_hdr(skb);
> +       struct geneve_dev *geneve;
> +       struct pcpu_sw_netstats *stats;
> +
> +       geneve = gs->rcv_data;
> +
> +       /* Does the VNI match the device? */
> +       if (memcmp(gnvh->vni, geneve->vni, sizeof(geneve->vni)))
> +               goto drop;

Since Geneve packets can carry options and this doesn't currently
support any, I think we need to at least check the 'C' bit in the
header and drop packets if it is set to ensure that we don't
accidentally ignore critical options.

> +       /* force IP checksum recalculation */
> +       skb->ip_summed = CHECKSUM_NONE;

I don't think that this should be necessary. There has been a fair
amount of work to ensure that checksums carry over across
encapsulations where possible so we shouldn't have to blow away the
state. We just need to do a skb_postpull_rcsum() after the call to
eth_type_trans().

We probably should do ECN decapsulate in here somewhere as well.

> +/* Initialize the device structure. */
> +static void geneve_setup(struct net_device *dev)
> +{
> +       struct geneve_dev *geneve = netdev_priv(dev);
> +
> +       ether_setup(dev);
> +
> +       dev->netdev_ops = &geneve_netdev_ops;
> +       dev->destructor = free_netdev;
> +       SET_NETDEV_DEVTYPE(dev, &geneve_type);
> +
> +       INIT_WORK(&geneve->sock_work, geneve_sock_work);
> +
> +       dev->tx_queue_len = 0;
> +       dev->features = 0;
> +
> +       dev->vlan_features = dev->features;
> +       dev->hw_features = 0;

It should be possible to enable most features without a problem.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
  2015-04-03 21:05   ` Jesse Gross
@ 2015-04-04  1:01     ` Francois Romieu
  2015-04-06 18:06       ` Jesse Gross
  2015-04-06 18:43     ` John W. Linville
  1 sibling, 1 reply; 28+ messages in thread
From: Francois Romieu @ 2015-04-04  1:01 UTC (permalink / raw)
  To: Jesse Gross
  Cc: John W. Linville, netdev, David S. Miller, Andy Zhou,
	Stephen Hemminger, Alexander Duyck

Jesse Gross <jesse@nicira.com> :
[...]
> > diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> > new file mode 100644
> > index 000000000000..fe8895487fc2
> > --- /dev/null
> > +++ b/drivers/net/geneve.c
> > +/* geneve receive/decap routine */
> > +static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb)
> > +{
> > +       struct genevehdr *gnvh = geneve_hdr(skb);
> > +       struct geneve_dev *geneve;
> > +       struct pcpu_sw_netstats *stats;
> > +
> > +       geneve = gs->rcv_data;
> > +
> > +       /* Does the VNI match the device? */
> > +       if (memcmp(gnvh->vni, geneve->vni, sizeof(geneve->vni)))
> > +               goto drop;
> 
> Since Geneve packets can carry options and this doesn't currently
> support any, I think we need to at least check the 'C' bit in the
> header and drop packets if it is set to ensure that we don't
> accidentally ignore critical options.

Speaking of it, it's imho a bit too easy to confuse GENEVE_CRIT_OPT_TYPE
with the relevant 'C' bit mask.

-- 
Ueimor

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
  2015-04-04  1:01     ` Francois Romieu
@ 2015-04-06 18:06       ` Jesse Gross
  2015-04-06 18:44         ` John W. Linville
  0 siblings, 1 reply; 28+ messages in thread
From: Jesse Gross @ 2015-04-06 18:06 UTC (permalink / raw)
  To: Francois Romieu
  Cc: John W. Linville, netdev, David S. Miller, Andy Zhou,
	Stephen Hemminger, Alexander Duyck

On Fri, Apr 3, 2015 at 6:01 PM, Francois Romieu <romieu@fr.zoreil.com> wrote:
> Jesse Gross <jesse@nicira.com> :
> [...]
>> > diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
>> > new file mode 100644
>> > index 000000000000..fe8895487fc2
>> > --- /dev/null
>> > +++ b/drivers/net/geneve.c
>> > +/* geneve receive/decap routine */
>> > +static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb)
>> > +{
>> > +       struct genevehdr *gnvh = geneve_hdr(skb);
>> > +       struct geneve_dev *geneve;
>> > +       struct pcpu_sw_netstats *stats;
>> > +
>> > +       geneve = gs->rcv_data;
>> > +
>> > +       /* Does the VNI match the device? */
>> > +       if (memcmp(gnvh->vni, geneve->vni, sizeof(geneve->vni)))
>> > +               goto drop;
>>
>> Since Geneve packets can carry options and this doesn't currently
>> support any, I think we need to at least check the 'C' bit in the
>> header and drop packets if it is set to ensure that we don't
>> accidentally ignore critical options.
>
> Speaking of it, it's imho a bit too easy to confuse GENEVE_CRIT_OPT_TYPE
> with the relevant 'C' bit mask.

Which 'C' bit mask? You mean the bitfield in the header? I guess but
I'm not sure what would make it clearer and since they are different
types it seems somewhat difficult to actually misuse them in practice.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
  2015-04-03 21:05   ` Jesse Gross
  2015-04-04  1:01     ` Francois Romieu
@ 2015-04-06 18:43     ` John W. Linville
  2015-04-06 22:52       ` Jesse Gross
  1 sibling, 1 reply; 28+ messages in thread
From: John W. Linville @ 2015-04-06 18:43 UTC (permalink / raw)
  To: Jesse Gross
  Cc: netdev, David S. Miller, Andy Zhou, Stephen Hemminger, Alexander Duyck

On Fri, Apr 03, 2015 at 02:05:20PM -0700, Jesse Gross wrote:
> On Thu, Apr 2, 2015 at 12:17 PM, John W. Linville
> <linville@tuxdriver.com> wrote:
> > This is an initial implementation of a netdev driver for GENEVE
> > tunnels.  This implementation uses a fixed UDP port, and only supports
> > a single tunnel (and therefore only a single VNI) per net namespace.
> > Only IPv4 links are supported at this time.
> >
> > Signed-off-by: John W. Linville <linville@tuxdriver.com>
> 
> Thanks from me as well for working on this!
> 
> There is the common IP tunnel device code that GRE, IPIP, etc. use -
> pretty much every tunnel except for VXLAN. Does it make sense to use
> that? VXLAN doesn't because it has a fair amount of specialized logic
> and perhaps Geneve will end up going down that path as well, so
> perhaps it doesn't make sense but I wanted to make sure that you were
> aware of it.

Thanks, I was only somewhat aware of it.  I'm inclined to think that
geneve will grow more options (like vxlan), and that sticking with
what I have make sense.

> I also wanted to mention that Geneve (the protocol, not the current
> implementation) can encapsulate protocols other than Ethernet, similar
> to GRE. I don't think this is necessary for a first implementation but
> it's worth keeping in mind in case there is anything that we end up
> designing in the interfaces that can keep this clean in the future.

Yeah, good point.  Do you think that would be specified at
tunnel setup?  What sort of flexibility will it require?

> > diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> > new file mode 100644
> > index 000000000000..fe8895487fc2
> > --- /dev/null
> > +++ b/drivers/net/geneve.c
> > +/* geneve receive/decap routine */
> > +static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb)
> > +{
> > +       struct genevehdr *gnvh = geneve_hdr(skb);
> > +       struct geneve_dev *geneve;
> > +       struct pcpu_sw_netstats *stats;
> > +
> > +       geneve = gs->rcv_data;
> > +
> > +       /* Does the VNI match the device? */
> > +       if (memcmp(gnvh->vni, geneve->vni, sizeof(geneve->vni)))
> > +               goto drop;
> 
> Since Geneve packets can carry options and this doesn't currently
> support any, I think we need to at least check the 'C' bit in the
> header and drop packets if it is set to ensure that we don't
> accidentally ignore critical options.

Yes, that is a good point.

> > +       /* force IP checksum recalculation */
> > +       skb->ip_summed = CHECKSUM_NONE;
> 
> I don't think that this should be necessary. There has been a fair
> amount of work to ensure that checksums carry over across
> encapsulations where possible so we shouldn't have to blow away the
> state. We just need to do a skb_postpull_rcsum() after the call to
> eth_type_trans().
> 
> We probably should do ECN decapsulate in here somewhere as well.

Sure.  This is all early development, and "make it work" was the main
concern. :-)

> > +/* Initialize the device structure. */
> > +static void geneve_setup(struct net_device *dev)
> > +{
> > +       struct geneve_dev *geneve = netdev_priv(dev);
> > +
> > +       ether_setup(dev);
> > +
> > +       dev->netdev_ops = &geneve_netdev_ops;
> > +       dev->destructor = free_netdev;
> > +       SET_NETDEV_DEVTYPE(dev, &geneve_type);
> > +
> > +       INIT_WORK(&geneve->sock_work, geneve_sock_work);
> > +
> > +       dev->tx_queue_len = 0;
> > +       dev->features = 0;
> > +
> > +       dev->vlan_features = dev->features;
> > +       dev->hw_features = 0;
> 
> It should be possible to enable most features without a problem.

Sure.  I'll look more into that as well.

Thanks,

John
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
  2015-04-06 18:06       ` Jesse Gross
@ 2015-04-06 18:44         ` John W. Linville
  2015-04-06 20:44           ` Francois Romieu
  0 siblings, 1 reply; 28+ messages in thread
From: John W. Linville @ 2015-04-06 18:44 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Francois Romieu, netdev, David S. Miller, Andy Zhou,
	Stephen Hemminger, Alexander Duyck

On Mon, Apr 06, 2015 at 11:06:02AM -0700, Jesse Gross wrote:
> On Fri, Apr 3, 2015 at 6:01 PM, Francois Romieu <romieu@fr.zoreil.com> wrote:
> > Jesse Gross <jesse@nicira.com> :
> > [...]
> >> > diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> >> > new file mode 100644
> >> > index 000000000000..fe8895487fc2
> >> > --- /dev/null
> >> > +++ b/drivers/net/geneve.c
> >> > +/* geneve receive/decap routine */
> >> > +static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb)
> >> > +{
> >> > +       struct genevehdr *gnvh = geneve_hdr(skb);
> >> > +       struct geneve_dev *geneve;
> >> > +       struct pcpu_sw_netstats *stats;
> >> > +
> >> > +       geneve = gs->rcv_data;
> >> > +
> >> > +       /* Does the VNI match the device? */
> >> > +       if (memcmp(gnvh->vni, geneve->vni, sizeof(geneve->vni)))
> >> > +               goto drop;
> >>
> >> Since Geneve packets can carry options and this doesn't currently
> >> support any, I think we need to at least check the 'C' bit in the
> >> header and drop packets if it is set to ensure that we don't
> >> accidentally ignore critical options.
> >
> > Speaking of it, it's imho a bit too easy to confuse GENEVE_CRIT_OPT_TYPE
> > with the relevant 'C' bit mask.
> 
> Which 'C' bit mask? You mean the bitfield in the header? I guess but
> I'm not sure what would make it clearer and since they are different
> types it seems somewhat difficult to actually misuse them in practice.

What would you suggest, Francois?  A GENEVE_CRIT_OPT_PRESENT() macro?

John
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
  2015-04-06 18:44         ` John W. Linville
@ 2015-04-06 20:44           ` Francois Romieu
  0 siblings, 0 replies; 28+ messages in thread
From: Francois Romieu @ 2015-04-06 20:44 UTC (permalink / raw)
  To: John W. Linville
  Cc: Jesse Gross, netdev, David S. Miller, Andy Zhou,
	Stephen Hemminger, Alexander Duyck

John W. Linville <linville@tuxdriver.com> :
[...]
> What would you suggest, Francois?  A GENEVE_CRIT_OPT_PRESENT() macro?

Either that or a typed 'bool geneve_crit_opt_present(struct geneve_opt *opt)'
to hide the content of struct geneve_opt and move any relevant bitmask in
the only .c file where it is actually used.

-- 
Ueimor

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
  2015-04-06 18:43     ` John W. Linville
@ 2015-04-06 22:52       ` Jesse Gross
  0 siblings, 0 replies; 28+ messages in thread
From: Jesse Gross @ 2015-04-06 22:52 UTC (permalink / raw)
  To: John W. Linville
  Cc: netdev, David S. Miller, Andy Zhou, Stephen Hemminger, Alexander Duyck

On Mon, Apr 6, 2015 at 11:43 AM, John W. Linville
<linville@tuxdriver.com> wrote:
> On Fri, Apr 03, 2015 at 02:05:20PM -0700, Jesse Gross wrote:
>> I also wanted to mention that Geneve (the protocol, not the current
>> implementation) can encapsulate protocols other than Ethernet, similar
>> to GRE. I don't think this is necessary for a first implementation but
>> it's worth keeping in mind in case there is anything that we end up
>> designing in the interfaces that can keep this clean in the future.
>
> Yeah, good point.  Do you think that would be specified at
> tunnel setup?  What sort of flexibility will it require?

Yes, I would expect that it is specified at tunnel setup time since it
will affect the type of netdevice that is created. I don't think that
it needs to be tremendously flexible as most networks will carry
either Ethernet or IP and use that as the next protocol (although
there could be other interesting use cases like specifying that the
payload is encrypted).

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2015-04-06 22:52 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-02 19:17 [RFC] add GENEVE netdev tunnel driver John W. Linville
2015-04-02 19:17 ` [RFC PATCH 1/5] geneve: remove MODULE_ALIAS_RTNL_LINK from net/ipv4/geneve.c John W. Linville
2015-04-02 23:39   ` Stephen Hemminger
2015-04-03 12:17     ` Jiri Pirko
2015-04-03 14:27       ` John W. Linville
2015-04-02 19:17 ` [RFC PATCH 2/5] geneve: move definition of geneve_hdr() to geneve.h John W. Linville
2015-04-02 19:17 ` [RFC PATCH 3/5] Rename support library for geneve John W. Linville
2015-04-03  0:05   ` Cong Wang
2015-04-03 14:40     ` John W. Linville
2015-04-03 15:54       ` Nicolas Dichtel
2015-04-03 18:25         ` John W. Linville
2015-04-02 19:17 ` [RFC PATCH 4/5] libgeneve: identify as driver library in modules description John W. Linville
2015-04-02 19:17 ` [RFC PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels John W. Linville
2015-04-02 20:20   ` Jiri Pirko
2015-04-03 14:57     ` John W. Linville
2015-04-03 15:07       ` John W. Linville
2015-04-03 15:20         ` Jiri Pirko
2015-04-03 18:31           ` John W. Linville
2015-04-03  5:55   ` Simon Horman
2015-04-03 14:41     ` John W. Linville
2015-04-03 21:05   ` Jesse Gross
2015-04-04  1:01     ` Francois Romieu
2015-04-06 18:06       ` Jesse Gross
2015-04-06 18:44         ` John W. Linville
2015-04-06 20:44           ` Francois Romieu
2015-04-06 18:43     ` John W. Linville
2015-04-06 22:52       ` Jesse Gross
2015-04-02 19:17 ` [RFC PATCH] iproute2: GENEVE support John W. Linville

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).