* [PATCH] add GENEVE netdev tunnel driver
From: John W. Linville @ 2015-05-08 17:20 UTC
To: netdev
Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger, Alexander Duyck

This 5-patch kernel series adds a netdev implementation of a GENEVE
tunnel driver, and the single iproute2 patch enables creation and such
for those netdevs.  This makes use of the existing GENEVE infrastructure
already used by the OVS code.  The net/ipv4/geneve.c file is renamed as
net/ipv4/geneve_core.c as part of these changes.

 drivers/net/Kconfig            |   14 +
 drivers/net/Makefile           |    1
 drivers/net/geneve.c           |  550 +++++++++++++++++++++++++++++++++++++++++
 include/net/geneve.h           |    5
 include/uapi/linux/if_link.h   |    9
 net/ipv4/Kconfig               |    4
 net/ipv4/Makefile              |    2
 net/ipv4/geneve.c              |    6
 net/ipv4/geneve_core.c         |    4
 net/openvswitch/Kconfig        |    2
 net/openvswitch/vport-geneve.c |    5
 11 files changed, 585 insertions(+), 17 deletions(-)

The overall structure of the GENEVE netdev driver is strongly influenced
by the VXLAN netdev driver.  This is not surprising, as the two drivers
are intended to serve similar purposes.  As development of the GENEVE
driver continues, it is likely that those similarities will grow
stronger.  This will include both simple configuration options (e.g. TOS
and TTL settings) and new control plane support.

The current implementation is very simple, restricting itself to point
to point links over IPv4.  This is due only to the simplicity of the
implementation, and no such limit is inherent to GENEVE in any way.
Support for IPv6 links and more sophisticated control plane options are
predictable enhancements.

Using the included iproute2 patch, a GENEVE tunnel is created thusly:

        ip link add dev gnv0 type geneve remote 192.168.22.1 vni 1234
        ip link set gnv0 up
        ip addr add 10.1.1.1/24 dev gnv0

After a corresponding tunnel interface is created at the link partner,
traffic should proceed as expected.

Please let me know if anyone has problems...thanks!

John
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

* [PATCH 1/5] geneve: remove MODULE_ALIAS_RTNL_LINK from net/ipv4/geneve.c
From: John W. Linville @ 2015-05-08 17:20 UTC
To: netdev
Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger, Alexander Duyck, John W. Linville

This file is essentially a library for implementing the geneve
encapsulation protocol.  The file does not register any rtnl_link_ops,
so the MODULE_ALIAS_RTNL_LINK macro is inappropriate here.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
---
 net/ipv4/geneve.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/ipv4/geneve.c b/net/ipv4/geneve.c
index 8986e63f3bda..8e6a7fe27a4c 100644
--- a/net/ipv4/geneve.c
+++ b/net/ipv4/geneve.c
@@ -450,4 +450,3 @@ module_exit(geneve_cleanup_module);
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Jesse Gross <jesse@nicira.com>");
 MODULE_DESCRIPTION("Driver for GENEVE encapsulated traffic");
-MODULE_ALIAS_RTNL_LINK("geneve");
-- 
2.1.0

* [PATCH 2/5] geneve: move definition of geneve_hdr() to geneve.h
From: John W. Linville @ 2015-05-08 17:20 UTC
To: netdev
Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger, Alexander Duyck, John W. Linville

This is a static inline with identical definitions in multiple places...

Signed-off-by: John W. Linville <linville@tuxdriver.com>
---
 include/net/geneve.h           | 5 +++++
 net/ipv4/geneve.c              | 5 -----
 net/openvswitch/vport-geneve.c | 5 -----
 3 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/include/net/geneve.h b/include/net/geneve.h
index 14fb8d3390b4..2a0543a1899d 100644
--- a/include/net/geneve.h
+++ b/include/net/geneve.h
@@ -62,6 +62,11 @@ struct genevehdr {
 	struct geneve_opt options[];
 };
 
+static inline struct genevehdr *geneve_hdr(const struct sk_buff *skb)
+{
+	return (struct genevehdr *)(udp_hdr(skb) + 1);
+}
+
 #ifdef CONFIG_INET
 struct geneve_sock;
 
diff --git a/net/ipv4/geneve.c b/net/ipv4/geneve.c
index 8e6a7fe27a4c..001843d41135 100644
--- a/net/ipv4/geneve.c
+++ b/net/ipv4/geneve.c
@@ -60,11 +60,6 @@ struct geneve_net {
 
 static int geneve_net_id;
 
-static inline struct genevehdr *geneve_hdr(const struct sk_buff *skb)
-{
-	return (struct genevehdr *)(udp_hdr(skb) + 1);
-}
-
 static struct geneve_sock *geneve_find_sock(struct net *net,
 					    sa_family_t family, __be16 port)
 {
diff --git a/net/openvswitch/vport-geneve.c b/net/openvswitch/vport-geneve.c
index bf02fd5808c9..208c576bd1b6 100644
--- a/net/openvswitch/vport-geneve.c
+++ b/net/openvswitch/vport-geneve.c
@@ -46,11 +46,6 @@ static inline struct geneve_port *geneve_vport(const struct vport *vport)
 	return vport_priv(vport);
 }
 
-static inline struct genevehdr *geneve_hdr(const struct sk_buff *skb)
-{
-	return (struct genevehdr *)(udp_hdr(skb) + 1);
-}
-
 /* Convert 64 bit tunnel ID to 24 bit VNI. */
 static void tunnel_id_to_vni(__be64 tun_id, __u8 *vni)
 {
-- 
2.1.0

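The helper consolidated by patch 2/5 relies on pointer arithmetic: adding 1 to a UDP header pointer advances by exactly one UDP header, which is where the GENEVE header begins. The following is a minimal user-space sketch of that idea, not kernel code. The `demo_` structs and functions are hypothetical stand-ins (the kernel's real `struct udphdr` and `struct genevehdr` are laid out differently in detail), assumed here only to make the offset visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the kernel's UDP header (8 bytes). */
struct demo_udphdr {
	uint16_t source;
	uint16_t dest;
	uint16_t len;
	uint16_t check;
};

/* Hypothetical stand-in for the fixed part of a GENEVE header (8 bytes). */
struct demo_genevehdr {
	uint8_t  ver_opt_len;
	uint8_t  flags;
	uint16_t proto_type;
	uint8_t  vni[3];
	uint8_t  rsvd2;
};

static_assert(sizeof(struct demo_udphdr) == 8, "UDP header is 8 bytes");

/* Same shape as geneve_hdr(): step over exactly one UDP header. */
static inline struct demo_genevehdr *demo_geneve_hdr(struct demo_udphdr *uh)
{
	return (struct demo_genevehdr *)(uh + 1);
}

/* Byte distance from the start of the UDP header to the GENEVE header. */
static inline size_t demo_geneve_offset(struct demo_udphdr *uh)
{
	return (size_t)((uint8_t *)demo_geneve_hdr(uh) - (uint8_t *)uh);
}
```

Given a buffer that starts with a UDP header, `demo_geneve_offset()` evaluates to `sizeof(struct demo_udphdr)`, which is the property the shared inline depends on.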
* [PATCH 3/5] geneve: Rename support library as geneve_core
From: John W. Linville @ 2015-05-08 17:20 UTC
To: netdev
Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger, Alexander Duyck, John W. Linville

net/ipv4/geneve.c -> net/ipv4/geneve_core.c

This name better reflects the purpose of the module.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
---
Also, it prevents name resolution issues with module loading for the
geneve netdev coming later in this series...

 net/ipv4/Kconfig                     | 4 ++--
 net/ipv4/Makefile                    | 2 +-
 net/ipv4/{geneve.c => geneve_core.c} | 0
 net/openvswitch/Kconfig              | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)
 rename net/ipv4/{geneve.c => geneve_core.c} (100%)

diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
index bd2901604842..d83071dccd74 100644
--- a/net/ipv4/Kconfig
+++ b/net/ipv4/Kconfig
@@ -331,8 +331,8 @@ config NET_FOU_IP_TUNNELS
 	  When this option is enabled IP tunnels can be configured to use
 	  FOU or GUE encapsulation.
 
-config GENEVE
-	tristate "Generic Network Virtualization Encapsulation (Geneve)"
+config GENEVE_CORE
+	tristate "Generic Network Virtualization Encapsulation library"
 	depends on INET
 	select NET_UDP_TUNNEL
 	---help---
diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile
index 518c04ed666e..b36236dd6014 100644
--- a/net/ipv4/Makefile
+++ b/net/ipv4/Makefile
@@ -56,7 +56,7 @@ obj-$(CONFIG_TCP_CONG_YEAH) += tcp_yeah.o
 obj-$(CONFIG_TCP_CONG_ILLINOIS) += tcp_illinois.o
 obj-$(CONFIG_MEMCG_KMEM) += tcp_memcontrol.o
 obj-$(CONFIG_NETLABEL) += cipso_ipv4.o
-obj-$(CONFIG_GENEVE) += geneve.o
+obj-$(CONFIG_GENEVE_CORE) += geneve_core.o
 
 obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \
 		      xfrm4_output.o xfrm4_protocol.o
diff --git a/net/ipv4/geneve.c b/net/ipv4/geneve_core.c
similarity index 100%
rename from net/ipv4/geneve.c
rename to net/ipv4/geneve_core.c
diff --git a/net/openvswitch/Kconfig b/net/openvswitch/Kconfig
index ed6b0f8dd1bb..15840401a2ce 100644
--- a/net/openvswitch/Kconfig
+++ b/net/openvswitch/Kconfig
@@ -59,7 +59,7 @@ config OPENVSWITCH_VXLAN
 config OPENVSWITCH_GENEVE
 	tristate "Open vSwitch Geneve tunneling support"
 	depends on OPENVSWITCH
-	depends on GENEVE
+	depends on GENEVE_CORE
 	default OPENVSWITCH
 	---help---
 	  If you say Y here, then the Open vSwitch will be able create geneve vport.
-- 
2.1.0

* [PATCH 4/5] geneve_core: identify as driver library in modules description
From: John W. Linville @ 2015-05-08 17:20 UTC
To: netdev
Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger, Alexander Duyck, John W. Linville

Signed-off-by: John W. Linville <linville@tuxdriver.com>
---
 net/ipv4/geneve_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/geneve_core.c b/net/ipv4/geneve_core.c
index 001843d41135..311a4ba6950a 100644
--- a/net/ipv4/geneve_core.c
+++ b/net/ipv4/geneve_core.c
@@ -430,7 +430,7 @@ static int __init geneve_init_module(void)
 	if (rc)
 		return rc;
 
-	pr_info("Geneve driver\n");
+	pr_info("Geneve core logic\n");
 
 	return 0;
 }
@@ -444,4 +444,4 @@ module_exit(geneve_cleanup_module);
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Jesse Gross <jesse@nicira.com>");
-MODULE_DESCRIPTION("Driver for GENEVE encapsulated traffic");
+MODULE_DESCRIPTION("Driver library for GENEVE encapsulated traffic");
-- 
2.1.0

* [PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
From: John W. Linville @ 2015-05-08 17:20 UTC
To: netdev
Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger, Alexander Duyck, John W. Linville

This is an initial implementation of a netdev driver for GENEVE
tunnels.  This implementation uses a fixed UDP port, and only supports
point-to-point links with specific partner endpoints.  Only IPv4
links are supported at this time.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
---
 drivers/net/Kconfig          |  14 ++
 drivers/net/Makefile         |   1 +
 drivers/net/geneve.c         | 550 +++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/if_link.h |   9 +
 4 files changed, 574 insertions(+)
 create mode 100644 drivers/net/geneve.c

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index df51d6025a90..019fceffc9e5 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -179,6 +179,20 @@ config VXLAN
 	  To compile this driver as a module, choose M here: the module
 	  will be called vxlan.
 
+config GENEVE
+	tristate "Generic Network Virtualization Encapsulation netdev"
+	depends on INET && GENEVE_CORE
+	select NET_IP_TUNNEL
+	---help---
+	  This allows one to create geneve virtual interfaces that provide
+	  Layer 2 Networks over Layer 3 Networks. GENEVE is often used
+	  to tunnel virtual network infrastructure in virtualized environments.
+	  For more information see:
+	    http://tools.ietf.org/html/draft-gross-geneve-02
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called geneve.
+
 config NETCONSOLE
 	tristate "Network console logging support"
 	---help---
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index e25fdd7d905e..c12cb22478a7 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_TUN) += tun.o
 obj-$(CONFIG_VETH) += veth.o
 obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
 obj-$(CONFIG_VXLAN) += vxlan.o
+obj-$(CONFIG_GENEVE) += geneve.o
 obj-$(CONFIG_NLMON) += nlmon.o
 
 #
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
new file mode 100644
index 000000000000..102030de1d45
--- /dev/null
+++ b/drivers/net/geneve.c
@@ -0,0 +1,550 @@
+/*
+ * GENEVE: Generic Network Virtualization Encapsulation
+ *
+ * Copyright (c) 2015 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/hash.h>
+#include <net/rtnetlink.h>
+#include <net/geneve.h>
+
+#define GENEVE_NETDEV_VER	"0.6"
+
+#define GENEVE_UDP_PORT		6081
+
+#define GENEVE_N_VID		(1u << 24)
+#define GENEVE_VID_MASK		(GENEVE_N_VID - 1)
+
+#define VNI_HASH_BITS		10
+#define VNI_HASH_SIZE		(1<<VNI_HASH_BITS)
+
+static bool log_ecn_error = true;
+module_param(log_ecn_error, bool, 0644);
+MODULE_PARM_DESC(log_ecn_error, "Log packets received with corrupted ECN");
+
+/* per-network namespace private data for this module */
+struct geneve_net {
+	struct list_head  geneve_list;
+	struct hlist_head vni_list[VNI_HASH_SIZE];
+	spinlock_t	  vni_lock;
+};
+
+/* Pseudo network device */
+struct geneve_dev {
+	struct hlist_node  hlist;	/* vni hash table */
+	struct net	   *net;	/* netns for packet i/o */
+	struct net_device  *dev;	/* netdev for geneve tunnel */
+	struct geneve_sock *sock;	/* socket used for geneve tunnel */
+	u8 vni[3];			/* virtual network ID for tunnel */
+	struct sockaddr_in remote;	/* IPv4 address for link partner */
+	struct work_struct sock_work;	/* work item for binding socket */
+	struct list_head   next;	/* geneve's per namespace list */
+};
+
+static void geneve_sock_work(struct work_struct *work);
+
+static struct workqueue_struct *geneve_wq;
+
+static int geneve_net_id;
+
+static inline __u32 geneve_net_vni_hash(u8 vni[3])
+{
+	__u32 vnid;
+
+	vnid = (vni[0] << 16) | (vni[1] << 8) | vni[2];
+	return hash_32(vnid, VNI_HASH_BITS);
+}
+
+static void geneve_net_vni_add(struct geneve_net *gn, __u32 hash,
+			       struct geneve_dev *geneve)
+{
+	spin_lock(&gn->vni_lock);
+	hlist_add_head_rcu(&geneve->hlist, &gn->vni_list[hash]);
+	spin_unlock(&gn->vni_lock);
+}
+
+static void geneve_net_vni_del(struct geneve_dev *geneve)
+{
+	struct geneve_net *gn = net_generic(geneve->net, geneve_net_id);
+
+	spin_lock(&gn->vni_lock);
+	if (!hlist_unhashed(&geneve->hlist))
+		hlist_del_rcu(&geneve->hlist);
+	spin_unlock(&gn->vni_lock);
+}
+
+/* geneve receive/decap routine */
+static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb)
+{
+	struct genevehdr *gnvh = geneve_hdr(skb);
+	struct geneve_dev *dummy, *geneve = NULL;
+	struct geneve_net *gn;
+	struct iphdr *iph = NULL;
+	struct pcpu_sw_netstats *stats;
+	struct hlist_head *vni_list_head;
+	int err = 0;
+	__u32 hash;
+
+	iph = ip_hdr(skb); /* Still outer IP header... */
+
+	gn = gs->rcv_data;
+
+	/* Find the device for this VNI */
+	hash = geneve_net_vni_hash(gnvh->vni);
+	vni_list_head = &gn->vni_list[hash];
+	hlist_for_each_entry_rcu(dummy, vni_list_head, hlist) {
+		if (!memcmp(gnvh->vni, dummy->vni, sizeof(dummy->vni)) &&
+		    iph->saddr == dummy->remote.sin_addr.s_addr)
+			geneve = dummy;
+	}
+	if (!geneve)
+		goto drop;
+
+	/* Drop packets w/ critical options,
+	 * since we don't support any...
+	 */
+	if (gnvh->critical)
+		goto drop;
+
+	skb_reset_mac_header(skb);
+	skb_scrub_packet(skb, !net_eq(geneve->net, dev_net(geneve->dev)));
+	skb->protocol = eth_type_trans(skb, geneve->dev);
+	skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
+
+	/* Ignore packet loops (and multicast echo) */
+	if (ether_addr_equal(eth_hdr(skb)->h_source, geneve->dev->dev_addr))
+		goto drop;
+
+	skb_reset_network_header(skb);
+
+	iph = ip_hdr(skb); /* Now inner IP header... */
+	err = IP_ECN_decapsulate(iph, skb);
+
+	if (unlikely(err)) {
+		if (log_ecn_error)
+			net_info_ratelimited("non-ECT from %pI4 with TOS=%#x\n",
+					     &iph->saddr, iph->tos);
+		if (err > 1) {
+			++geneve->dev->stats.rx_frame_errors;
+			++geneve->dev->stats.rx_errors;
+			goto drop;
+		}
+	}
+
+	stats = this_cpu_ptr(geneve->dev->tstats);
+	u64_stats_update_begin(&stats->syncp);
+	stats->rx_packets++;
+	stats->rx_bytes += skb->len;
+	u64_stats_update_end(&stats->syncp);
+
+	netif_rx(skb);
+
+	return;
+drop:
+	/* Consume bad packet */
+	kfree_skb(skb);
+}
+
+/* Scheduled at device creation to bind to a socket */
+static void geneve_sock_work(struct work_struct *work)
+{
+	struct geneve_dev *geneve = container_of(work, struct geneve_dev,
+						 sock_work);
+	struct net *net = geneve->net;
+	struct geneve_net *gn = net_generic(geneve->net, geneve_net_id);
+	struct geneve_sock *gs;
+
+	gs = geneve_sock_add(net, htons(GENEVE_UDP_PORT), geneve_rx, gn,
+			     false, false);
+	if (!IS_ERR(gs))
+		geneve->sock = gs;
+
+	dev_put(geneve->dev);
+}
+
+/* Setup stats when device is created */
+static int geneve_init(struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+
+	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
+	if (!dev->tstats)
+		return -ENOMEM;
+
+	/* make new socket outside of RTNL */
+	dev_hold(dev);
+	queue_work(geneve_wq, &geneve->sock_work);
+
+	return 0;
+}
+
+static void geneve_uninit(struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+	struct geneve_sock *gs = geneve->sock;
+
+	if (gs)
+		geneve_sock_release(gs);
+	free_percpu(dev->tstats);
+}
+
+static int geneve_open(struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+	struct geneve_sock *gs = geneve->sock;
+
+	/* socket hasn't been created */
+	if (!gs)
+		return -ENOTCONN;
+
+	return 0;
+}
+
+static int geneve_stop(struct net_device *dev)
+{
+	return 0;
+}
+
+static netdev_tx_t geneve_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+	struct geneve_sock *gs = geneve->sock;
+	struct rtable *rt = NULL;
+	const struct iphdr *iip; /* interior IP header */
+	struct flowi4 fl4;
+	int err;
+	__be16 sport;
+	__u8 tos, ttl = 0;
+
+	iip = ip_hdr(skb);
+
+	skb_reset_mac_header(skb);
+
+	/* TODO: port min/max limits should be configurable */
+	sport = udp_flow_src_port(dev_net(dev), skb, 0, 0, true);
+
+	memset(&fl4, 0, sizeof(fl4));
+	fl4.daddr = geneve->remote.sin_addr.s_addr;
+	rt = ip_route_output_key(geneve->net, &fl4);
+	if (IS_ERR(rt)) {
+		netdev_dbg(dev, "no route to %pI4\n", &fl4.daddr);
+		dev->stats.tx_carrier_errors++;
+		goto tx_error;
+	}
+	if (rt->dst.dev == dev) { /* is this necessary? */
+		netdev_dbg(dev, "circular route to %pI4\n", &fl4.daddr);
+		dev->stats.collisions++;
+		goto rt_tx_error;
+	}
+
+	/* TODO: tos and ttl should be configurable */
+
+	tos = ip_tunnel_ecn_encap(0, iip, skb);
+
+	if (IN_MULTICAST(ntohl(fl4.daddr)))
+		ttl = 1;
+
+	ttl = ttl ? : ip4_dst_hoplimit(&rt->dst);
+
+	/* no need to handle local destination and encap bypass...yet... */
+
+	err = geneve_xmit_skb(gs, rt, skb, fl4.saddr, fl4.daddr,
+			      tos, ttl, 0, sport, htons(GENEVE_UDP_PORT), 0,
+			      geneve->vni, 0, NULL, false,
+			      !net_eq(geneve->net, dev_net(geneve->dev)));
+	if (err < 0)
+		ip_rt_put(rt);
+
+	iptunnel_xmit_stats(err, &dev->stats, dev->tstats);
+
+	return NETDEV_TX_OK;
+
+rt_tx_error:
+	ip_rt_put(rt);
+tx_error:
+	dev->stats.tx_errors++;
+	dev_kfree_skb(skb);
+	return NETDEV_TX_OK;
+}
+
+static const struct net_device_ops geneve_netdev_ops = {
+	.ndo_init		= geneve_init,
+	.ndo_uninit		= geneve_uninit,
+	.ndo_open		= geneve_open,
+	.ndo_stop		= geneve_stop,
+	.ndo_start_xmit		= geneve_xmit,
+	.ndo_get_stats64	= ip_tunnel_get_stats64,
+	.ndo_change_mtu		= eth_change_mtu,
+	.ndo_validate_addr	= eth_validate_addr,
+	.ndo_set_mac_address	= eth_mac_addr,
+};
+
+static void geneve_get_drvinfo(struct net_device *dev,
+			       struct ethtool_drvinfo *drvinfo)
+{
+	strlcpy(drvinfo->version, GENEVE_NETDEV_VER, sizeof(drvinfo->version));
+	strlcpy(drvinfo->driver, "geneve", sizeof(drvinfo->driver));
+}
+
+static const struct ethtool_ops geneve_ethtool_ops = {
+	.get_drvinfo	= geneve_get_drvinfo,
+	.get_link	= ethtool_op_get_link,
+};
+
+/* Info for udev, that this is a virtual tunnel endpoint */
+static struct device_type geneve_type = {
+	.name = "geneve",
+};
+
+/* Initialize the device structure. */
+static void geneve_setup(struct net_device *dev)
+{
+	ether_setup(dev);
+
+	dev->netdev_ops = &geneve_netdev_ops;
+	dev->ethtool_ops = &geneve_ethtool_ops;
+	dev->destructor = free_netdev;
+
+	SET_NETDEV_DEVTYPE(dev, &geneve_type);
+
+	dev->tx_queue_len = 0;
+	dev->features    |= NETIF_F_LLTX;
+	dev->features    |= NETIF_F_SG | NETIF_F_HW_CSUM;
+	dev->features    |= NETIF_F_RXCSUM;
+	dev->features    |= NETIF_F_GSO_SOFTWARE;
+
+	dev->vlan_features = dev->features;
+	dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
+
+	dev->hw_features |= NETIF_F_SG | NETIF_F_HW_CSUM | NETIF_F_RXCSUM;
+	dev->hw_features |= NETIF_F_GSO_SOFTWARE;
+	dev->hw_features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
+
+	netif_keep_dst(dev);
+	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
+}
+
+static const struct nla_policy geneve_policy[IFLA_GENEVE_MAX + 1] = {
+	[IFLA_GENEVE_ID]		= { .type = NLA_U32 },
+	[IFLA_GENEVE_REMOTE]		= { .len = FIELD_SIZEOF(struct iphdr, daddr) },
+};
+
+static int geneve_validate(struct nlattr *tb[], struct nlattr *data[])
+{
+	if (tb[IFLA_ADDRESS]) {
+		if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
+			return -EINVAL;
+
+		if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
+			return -EADDRNOTAVAIL;
+	}
+
+	if (!data)
+		return -EINVAL;
+
+	if (data[IFLA_GENEVE_ID]) {
+		__u32 vni =  nla_get_u32(data[IFLA_GENEVE_ID]);
+
+		if (vni >= GENEVE_VID_MASK)
+			return -ERANGE;
+	}
+
+	return 0;
+}
+
+static int geneve_newlink(struct net *net, struct net_device *dev,
+			  struct nlattr *tb[], struct nlattr *data[])
+{
+	struct geneve_net *gn = net_generic(net, geneve_net_id);
+	struct geneve_dev *dummy, *geneve = netdev_priv(dev);
+	struct hlist_head *vni_list_head;
+	struct sockaddr_in remote;	/* IPv4 address for link partner */
+	__u32 vni, hash;
+	int err;
+
+	if (!data[IFLA_GENEVE_ID])
+		return -EINVAL;
+
+	geneve->net = net;
+	geneve->dev = dev;
+
+	INIT_WORK(&geneve->sock_work, geneve_sock_work);
+
+	vni = nla_get_u32(data[IFLA_GENEVE_ID]);
+	geneve->vni[0] = (vni & 0x00ff0000) >> 16;
+	geneve->vni[1] = (vni & 0x0000ff00) >> 8;
+	geneve->vni[2] =  vni & 0x000000ff;
+
+	if (data[IFLA_GENEVE_REMOTE])
+		geneve->remote.sin_addr.s_addr =
+			nla_get_be32(data[IFLA_GENEVE_REMOTE]);
+
+	remote = geneve->remote;
+	hash = geneve_net_vni_hash(geneve->vni);
+	vni_list_head = &gn->vni_list[hash];
+	hlist_for_each_entry_rcu(dummy, vni_list_head, hlist) {
+		if (!memcmp(geneve->vni, dummy->vni, sizeof(dummy->vni)) &&
+		    !memcmp(&remote, &dummy->remote, sizeof(dummy->remote)))
+			return -EBUSY;
+	}
+
+	if (tb[IFLA_ADDRESS] == NULL)
+		eth_hw_addr_random(dev);
+
+	err = register_netdevice(dev);
+	if (err)
+		return err;
+
+	list_add(&geneve->next, &gn->geneve_list);
+
+	geneve_net_vni_add(gn, hash, geneve);
+
+	return 0;
+}
+
+static void geneve_dellink(struct net_device *dev, struct list_head *head)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+
+	geneve_net_vni_del(geneve);
+
+	list_del(&geneve->next);
+	unregister_netdevice_queue(dev, head);
+}
+
+static size_t geneve_get_size(const struct net_device *dev)
+{
+	return nla_total_size(sizeof(__u32)) +	/* IFLA_GENEVE_ID */
+		nla_total_size(sizeof(struct in_addr)) + /* IFLA_GENEVE_REMOTE */
+		0;
+}
+
+static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+	__u32 vni;
+
+	vni = (geneve->vni[0] << 16) | (geneve->vni[1] << 8) | geneve->vni[2];
+	if (nla_put_u32(skb, IFLA_GENEVE_ID, vni))
+		goto nla_put_failure;
+
+	if (nla_put_be32(skb, IFLA_GENEVE_REMOTE,
+			 geneve->remote.sin_addr.s_addr))
+		goto nla_put_failure;
+
+	return 0;
+
+nla_put_failure:
+	return -EMSGSIZE;
+}
+
+static struct rtnl_link_ops geneve_link_ops __read_mostly = {
+	.kind		= "geneve",
+	.maxtype	= IFLA_GENEVE_MAX,
+	.policy		= geneve_policy,
+	.priv_size	= sizeof(struct geneve_dev),
+	.setup		= geneve_setup,
+	.validate	= geneve_validate,
+	.newlink	= geneve_newlink,
+	.dellink	= geneve_dellink,
+	.get_size	= geneve_get_size,
+	.fill_info	= geneve_fill_info,
+};
+
+static __net_init int geneve_init_net(struct net *net)
+{
+	struct geneve_net *gn = net_generic(net, geneve_net_id);
+	unsigned int h;
+
+	INIT_LIST_HEAD(&gn->geneve_list);
+	spin_lock_init(&gn->vni_lock);
+
+	for (h = 0; h < VNI_HASH_SIZE; ++h)
+		INIT_HLIST_HEAD(&gn->vni_list[h]);
+
+	return 0;
+}
+
+static void __net_exit geneve_exit_net(struct net *net)
+{
+	struct geneve_net *gn = net_generic(net, geneve_net_id);
+	struct geneve_dev *geneve, *next;
+	struct net_device *dev, *aux;
+	LIST_HEAD(list);
+
+	rtnl_lock();
+
+	/* gather any geneve devices that were moved into this ns */
+	for_each_netdev_safe(net, dev, aux)
+		if (dev->rtnl_link_ops == &geneve_link_ops)
+			unregister_netdevice_queue(dev, &list);
+
+	/* now gather any other geneve devices that were created in this ns */
+	list_for_each_entry_safe(geneve, next, &gn->geneve_list, next) {
+		/* If geneve->dev is in the same netns, it was already added
+		 * to the list by the previous loop.
+		 */
+		if (!net_eq(dev_net(geneve->dev), net))
+			unregister_netdevice_queue(geneve->dev, &list);
+	}
+
+	/* unregister the devices gathered above */
+	unregister_netdevice_many(&list);
+	rtnl_unlock();
+}
+
+static struct pernet_operations geneve_net_ops = {
+	.init = geneve_init_net,
+	.exit = geneve_exit_net,
+	.id   = &geneve_net_id,
+	.size = sizeof(struct geneve_net),
+};
+
+static int __init geneve_init_module(void)
+{
+	int rc;
+
+	geneve_wq = alloc_workqueue("geneve", 0, 0);
+	if (!geneve_wq)
+		return -ENOMEM;
+
+	rc = register_pernet_subsys(&geneve_net_ops);
+	if (rc)
+		goto out1;
+
+	rc = rtnl_link_register(&geneve_link_ops);
+	if (rc)
+		goto out2;
+
+	return 0;
+out2:
+	unregister_pernet_subsys(&geneve_net_ops);
+out1:
+	destroy_workqueue(geneve_wq);
+	return rc;
+}
+late_initcall(geneve_init_module);
+
+static void __exit geneve_cleanup_module(void)
+{
+	rtnl_link_unregister(&geneve_link_ops);
+	destroy_workqueue(geneve_wq);
+	unregister_pernet_subsys(&geneve_net_ops);
+}
+module_exit(geneve_cleanup_module);
+
+MODULE_LICENSE("GPL");
+MODULE_VERSION(GENEVE_NETDEV_VER);
+MODULE_AUTHOR("John W. Linville <linville@tuxdriver.com>");
+MODULE_DESCRIPTION("Interface driver for GENEVE encapsulated traffic");
+MODULE_ALIAS_RTNL_LINK("geneve");
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index d9cd19214b98..2ca17d1cff3f 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -390,6 +390,15 @@ struct ifla_vxlan_port_range {
 	__be16	high;
 };
 
+/* GENEVE section */
+enum {
+	IFLA_GENEVE_UNSPEC,
+	IFLA_GENEVE_ID,
+	IFLA_GENEVE_REMOTE,
+	__IFLA_GENEVE_MAX
+};
+#define IFLA_GENEVE_MAX	(__IFLA_GENEVE_MAX - 1)
+
 /* Bonding section */
 
 enum {
-- 
2.1.0

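The driver in patch 5/5 stores the 24-bit VNI as three bytes in network order and converts to and from a 32-bit integer in geneve_newlink(), geneve_fill_info() and geneve_net_vni_hash(). The following is a small user-space sketch of that conversion, not the kernel code itself; every `demo_`/`DEMO_` name is hypothetical. It also contrasts the range check as posted in geneve_validate() (`vni >= GENEVE_VID_MASK`), which rejects 0xffffff, with a check against the VNI count. The thread shown here does not say whether excluding 0xffffff is intended, so the second predicate is only an assumption about what a full-range check would look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_N_VID    (1u << 24)        /* number of distinct 24-bit VNIs */
#define DEMO_VID_MASK (DEMO_N_VID - 1)  /* 0x00ffffff */

static_assert(DEMO_VID_MASK == 0x00ffffffu, "mask covers 24 bits");

/* Split a 32-bit VNI into three bytes, most significant byte first. */
static inline void demo_vni_pack(uint32_t vni, uint8_t out[3])
{
	out[0] = (uint8_t)((vni & 0x00ff0000) >> 16);
	out[1] = (uint8_t)((vni & 0x0000ff00) >> 8);
	out[2] = (uint8_t)(vni & 0x000000ff);
}

/* Rebuild the 32-bit VNI from its three-byte form. */
static inline uint32_t demo_vni_unpack(const uint8_t in[3])
{
	return ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | in[2];
}

/* Mirrors the posted check: any vni >= the mask is out of range. */
static inline bool demo_vni_ok_as_posted(uint32_t vni)
{
	return vni < DEMO_VID_MASK;
}

/* Hypothetical alternative: accept every value that fits in 24 bits. */
static inline bool demo_vni_ok_full_range(uint32_t vni)
{
	return vni < DEMO_N_VID;
}
```

With the example from the cover letter, `vni 1234` packs to the bytes 0x00, 0x04, 0xd2 and unpacks back to 1234.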
* Re: [PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
From: Cong Wang @ 2015-05-08 20:55 UTC
To: John W. Linville
Cc: netdev, David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger, Alexander Duyck

On Fri, May 8, 2015 at 10:20 AM, John W. Linville
<linville@tuxdriver.com> wrote:
> +
> +/* Setup stats when device is created */
> +static int geneve_init(struct net_device *dev)
> +{
> +       struct geneve_dev *geneve = netdev_priv(dev);
> +
> +       dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
> +       if (!dev->tstats)
> +               return -ENOMEM;
> +
> +       /* make new socket outside of RTNL */
> +       dev_hold(dev);
> +       queue_work(geneve_wq, &geneve->sock_work);
> +

Any reason to create socket in this init() rather than in ndo_open()?

> +       return 0;
> +}
> +
> +static void geneve_uninit(struct net_device *dev)
> +{
> +       struct geneve_dev *geneve = netdev_priv(dev);
> +       struct geneve_sock *gs = geneve->sock;
> +
> +       if (gs)
> +               geneve_sock_release(gs);
> +       free_percpu(dev->tstats);
> +}

Ditto, ndo_stop().

> +
> +static int geneve_newlink(struct net *net, struct net_device *dev,
> +                         struct nlattr *tb[], struct nlattr *data[])
> +{
...
> +
> +       if (data[IFLA_GENEVE_REMOTE])
> +               geneve->remote.sin_addr.s_addr =
> +                       nla_get_be32(data[IFLA_GENEVE_REMOTE]);

nla_get_in_addr()

* Re: [PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
From: John W. Linville @ 2015-05-08 23:22 UTC
To: Cong Wang
Cc: netdev, David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger, Alexander Duyck

On Fri, May 08, 2015 at 01:55:15PM -0700, Cong Wang wrote:
> On Fri, May 8, 2015 at 10:20 AM, John W. Linville
> <linville@tuxdriver.com> wrote:
> > +
> > +/* Setup stats when device is created */
> > +static int geneve_init(struct net_device *dev)
> > +{
> > +       struct geneve_dev *geneve = netdev_priv(dev);
> > +
> > +       dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
> > +       if (!dev->tstats)
> > +               return -ENOMEM;
> > +
> > +       /* make new socket outside of RTNL */
> > +       dev_hold(dev);
> > +       queue_work(geneve_wq, &geneve->sock_work);
> > +
>
>
> Any reason to create socket in this init() rather than in ndo_open()?

The socket can be created asynchronously and ndo_open can fail if
the socket creation hasn't succeeded.

> > +       return 0;
> > +}
> > +
> > +static void geneve_uninit(struct net_device *dev)
> > +{
> > +       struct geneve_dev *geneve = netdev_priv(dev);
> > +       struct geneve_sock *gs = geneve->sock;
> > +
> > +       if (gs)
> > +               geneve_sock_release(gs);
> > +       free_percpu(dev->tstats);
> > +}
>
>
> Ditto, ndo_stop().

I really don't see the point of the ndo_open/ndo_stop inquiry.
The socket creation seems analogous to device initialization to me.

> > +
> > +static int geneve_newlink(struct net *net, struct net_device *dev,
> > +                         struct nlattr *tb[], struct nlattr *data[])
> > +{
> ...
> > +
> > +       if (data[IFLA_GENEVE_REMOTE])
> > +               geneve->remote.sin_addr.s_addr =
> > +                       nla_get_be32(data[IFLA_GENEVE_REMOTE]);
>
>
> nla_get_in_addr()

The implementation of that is (not surprisingly) exactly the same as
nla_get_be32.  I'll take it under advisement for a later patch, but I
don't really think a purely cosmetic change should interfere with
getting this merged.

John

* Re: [PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
From: David Miller @ 2015-05-10 23:48 UTC
To: linville
Cc: cwang, netdev, jesse, azhou, stephen, alexander.h.duyck

From: "John W. Linville" <linville@tuxdriver.com>
Date: Fri, 8 May 2015 19:22:36 -0400

> On Fri, May 08, 2015 at 01:55:15PM -0700, Cong Wang wrote:
>> On Fri, May 8, 2015 at 10:20 AM, John W. Linville
>> <linville@tuxdriver.com> wrote:
>> > +
>> > +/* Setup stats when device is created */
>> > +static int geneve_init(struct net_device *dev)
>> > +{
>> > +       struct geneve_dev *geneve = netdev_priv(dev);
>> > +
>> > +       dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
>> > +       if (!dev->tstats)
>> > +               return -ENOMEM;
>> > +
>> > +       /* make new socket outside of RTNL */
>> > +       dev_hold(dev);
>> > +       queue_work(geneve_wq, &geneve->sock_work);
>> > +
>>
>>
>> Any reason to create socket in this init() rather than in ndo_open()?
>
> The socket can be created asynchronously and ndo_open can fail if
> the socket creation hasn't succeeded.

In what manner is the socket creation asynchronous here?  It
synchronously returns success or failure as far as I can tell.

>> Ditto, ndo_stop().
>
> I really don't see the point of the ndo_open/ndo_stop inquiry.
> The socket creation seems analogous to device initialization to me.

It's about resource allocation.

Even in ethernet drivers, memory allocations such as those done for
RX and TX rings are done at ->ndo_open and released at ->ndo_stop()
time.

Therefore it's sort of reasonable to stretch that idea to how you
will handle sockets here in the geneve driver.

* Re: [PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
From: John W. Linville @ 2015-05-11 15:17 UTC
To: David Miller; +Cc: cwang, netdev, jesse, azhou, stephen, alexander.h.duyck

On Sun, May 10, 2015 at 07:48:30PM -0400, David Miller wrote:
> From: "John W. Linville" <linville@tuxdriver.com>
> Date: Fri, 8 May 2015 19:22:36 -0400
>
> > On Fri, May 08, 2015 at 01:55:15PM -0700, Cong Wang wrote:
> >> On Fri, May 8, 2015 at 10:20 AM, John W. Linville
> >> <linville@tuxdriver.com> wrote:
> >> > +
> >> > +/* Setup stats when device is created */
> >> > +static int geneve_init(struct net_device *dev)
> >> > +{
> >> > +	struct geneve_dev *geneve = netdev_priv(dev);
> >> > +
> >> > +	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
> >> > +	if (!dev->tstats)
> >> > +		return -ENOMEM;
> >> > +
> >> > +	/* make new socket outside of RTNL */
> >> > +	dev_hold(dev);
> >> > +	queue_work(geneve_wq, &geneve->sock_work);
> >> > +
> >>
> >>
> >> Any reason to create socket in this init() rather than in ndo_open()?
> >
> > The socket can be created asynchronously and ndo_open can fail if
> > the socket creation hasn't succeeded.
>
> In what manner is the socket creation asynchronous here?  It
> synchronously returns success or failure as far as I can tell.

Well, I misspoke -- I meant to indicate "outside of RTNL".  But,
I have been bitten again by copying (an older version of) vxlan a
bit too closely.  I don't think I need to worry about the RTNL stuff
at socket open time for this driver.

> >> Ditto, ndo_stop().
> >
> > I really don't see the point of the ndo_open/ndo_stop inquiry.
> > The socket creation seems analagous to device initialization to me.
>
> It's about resource allocation.
>
> Even in ethernet drivers, memory allocations such as those done for
> RX and TX rings are done at ->ndo_open and released at ->ndo_stop()
> time.
>
> Therefore it's sort of reasonable to stretch that idea to how you
> will handle sockets here in the geneve driver.

Sure, thanks for laying it out for me!  I'll rework this bit, and
address some of the other comments raised by Cong Wang and Jesse
Gross and spin a v2.

John
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.
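The resource-lifetime convention discussed in this exchange (acquire at
->ndo_open, release at ->ndo_stop) can be sketched in plain userspace C.
This is only an illustrative model, not kernel code: struct toy_dev,
toy_open() and toy_stop() are hypothetical names, and malloc() merely
stands in for the socket the driver would obtain when the interface is
brought up.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tunnel device; not a kernel structure. */
struct toy_dev {
	void *sock;	/* models the per-device socket; NULL while down */
};

/* Models ndo_open(): acquire the resource when the device comes up. */
static int toy_open(struct toy_dev *dev)
{
	dev->sock = malloc(64);	/* placeholder for socket creation */
	if (!dev->sock)
		return -ENOMEM;	/* bringing the device up fails cleanly */
	return 0;
}

/* Models ndo_stop(): release exactly what toy_open() acquired. */
static int toy_stop(struct toy_dev *dev)
{
	free(dev->sock);
	dev->sock = NULL;
	return 0;
}
```

In this model a device that is created but never brought up holds no
socket at all, which is the property being argued for above.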
* Re: [PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels
From: Jesse Gross @ 2015-05-08 23:19 UTC
To: John W. Linville
Cc: netdev, David S. Miller, Andy Zhou, Stephen Hemminger, Alexander Duyck

On Fri, May 8, 2015 at 10:20 AM, John W. Linville
<linville@tuxdriver.com> wrote:
> diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> new file mode 100644
> index 000000000000..102030de1d45
> --- /dev/null
> +++ b/drivers/net/geneve.c
> +/* geneve receive/decap routine */
> +static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb)
[...]
> +	/* Find the device for this VNI */
> +	hash = geneve_net_vni_hash(gnvh->vni);
> +	vni_list_head = &gn->vni_list[hash];
> +	hlist_for_each_entry_rcu(dummy, vni_list_head, hlist) {
> +		if (!memcmp(gnvh->vni, dummy->vni, sizeof(dummy->vni)) &&
> +		    iph->saddr == dummy->remote.sin_addr.s_addr)
> +			geneve = dummy;

I guess we might as well break out of the loop at this point rather
than keep searching.

> +static int geneve_newlink(struct net *net, struct net_device *dev,
> +			  struct nlattr *tb[], struct nlattr *data[])
> +{
[...]
> +	if (!data[IFLA_GENEVE_ID])
> +		return -EINVAL;

Should we enforce that IFLA_GENEVE_REMOTE is present? Otherwise, it's
not clear what we would do without it.

[...]
> +	list_add(&geneve->next, &gn->geneve_list);
> +
> +	geneve_net_vni_add(gn, hash, geneve);

The locking seems a bit inconsistent for these two pieces - they are
accessed in the same places but one has a special lock and the other
doesn't. I think the answer is that neither needs a lock because they
are both protected by RTNL but it made me pause.
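The early-exit lookup suggested in this review can be illustrated with a
small userspace model. This is a sketch under stated assumptions, not the
driver's code: struct toy_tunnel and toy_lookup() are hypothetical, and a
plain singly linked list stands in for one RCU-protected hash bucket.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace model of one hash bucket of tunnel devices. */
struct toy_tunnel {
	uint8_t vni[3];			/* 24-bit VNI, most significant byte first */
	uint32_t remote;		/* IPv4 address of the link partner */
	struct toy_tunnel *next;	/* next entry in the same bucket */
};

/* Return the first entry matching both VNI and remote, or NULL. */
static struct toy_tunnel *toy_lookup(struct toy_tunnel *bucket,
				     const uint8_t vni[3], uint32_t remote)
{
	struct toy_tunnel *t, *found = NULL;

	for (t = bucket; t; t = t->next) {
		if (!memcmp(vni, t->vni, sizeof(t->vni)) &&
		    remote == t->remote) {
			found = t;
			break;	/* stop at the first match */
		}
	}
	return found;
}
```

Because newlink refuses to register a second device with the same VNI
and remote, at most one entry can match, so continuing the walk after a
hit would only cost time.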
* [PATCH v2 5/5] geneve: add initial netdev driver for GENEVE tunnels
From: John W. Linville @ 2015-05-11 20:51 UTC
To: netdev
Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger, Alexander Duyck, John W. Linville

This is an initial implementation of a netdev driver for GENEVE
tunnels.  This implementation uses a fixed UDP port, and only supports
point-to-point links with specific partner endpoints.  Only IPv4
links are supported at this time.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
---
Changes in v2:
- removal of unneeded special lock for vni_list
- removal of geneve_net_vni_add/del (replaced by open code)
- break out of vni search loop in geneve_rx after match found
- no longer deferring socket open at ndo_init(), now doing it in ndo_open()
- check for non-multicast, non-zero remote link partner in newlink()
- remove now unused workqueue stuff

 drivers/net/Kconfig          |  14 ++
 drivers/net/Makefile         |   1 +
 drivers/net/geneve.c         | 503 +++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/if_link.h |   9 +
 4 files changed, 527 insertions(+)
 create mode 100644 drivers/net/geneve.c

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index df51d6025a90..019fceffc9e5 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -179,6 +179,20 @@ config VXLAN
 	  To compile this driver as a module, choose M here: the module
 	  will be called vxlan.
 
+config GENEVE
+	tristate "Generic Network Virtualization Encapsulation netdev"
+	depends on INET && GENEVE_CORE
+	select NET_IP_TUNNEL
+	---help---
+	  This allows one to create geneve virtual interfaces that provide
+	  Layer 2 Networks over Layer 3 Networks. GENEVE is often used
+	  to tunnel virtual network infrastructure in virtualized environments.
+	  For more information see:
+	    http://tools.ietf.org/html/draft-gross-geneve-02
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called geneve.
+
 config NETCONSOLE
 	tristate "Network console logging support"
 	---help---
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index e25fdd7d905e..c12cb22478a7 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_TUN) += tun.o
 obj-$(CONFIG_VETH) += veth.o
 obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
 obj-$(CONFIG_VXLAN) += vxlan.o
+obj-$(CONFIG_GENEVE) += geneve.o
 obj-$(CONFIG_NLMON) += nlmon.o
 
 #
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
new file mode 100644
index 000000000000..b7eafa4c1a67
--- /dev/null
+++ b/drivers/net/geneve.c
@@ -0,0 +1,503 @@
+/*
+ * GENEVE: Generic Network Virtualization Encapsulation
+ *
+ * Copyright (c) 2015 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/hash.h>
+#include <net/rtnetlink.h>
+#include <net/geneve.h>
+
+#define GENEVE_NETDEV_VER	"0.6"
+
+#define GENEVE_UDP_PORT		6081
+
+#define GENEVE_N_VID		(1u << 24)
+#define GENEVE_VID_MASK		(GENEVE_N_VID - 1)
+
+#define VNI_HASH_BITS		10
+#define VNI_HASH_SIZE		(1<<VNI_HASH_BITS)
+
+static bool log_ecn_error = true;
+module_param(log_ecn_error, bool, 0644);
+MODULE_PARM_DESC(log_ecn_error, "Log packets received with corrupted ECN");
+
+/* per-network namespace private data for this module */
+struct geneve_net {
+	struct list_head	geneve_list;
+	struct hlist_head	vni_list[VNI_HASH_SIZE];
+};
+
+/* Pseudo network device */
+struct geneve_dev {
+	struct hlist_node  hlist;	/* vni hash table */
+	struct net	   *net;	/* netns for packet i/o */
+	struct net_device  *dev;	/* netdev for geneve tunnel */
+	struct geneve_sock *sock;	/* socket used for geneve tunnel */
+	u8		   vni[3];	/* virtual network ID for tunnel */
+	struct sockaddr_in remote;	/* IPv4 address for link partner */
+	struct list_head   next;	/* geneve's per namespace list */
+};
+
+static int geneve_net_id;
+
+static inline __u32 geneve_net_vni_hash(u8 vni[3])
+{
+	__u32 vnid;
+
+	vnid = (vni[0] << 16) | (vni[1] << 8) | vni[2];
+	return hash_32(vnid, VNI_HASH_BITS);
+}
+
+/* geneve receive/decap routine */
+static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb)
+{
+	struct genevehdr *gnvh = geneve_hdr(skb);
+	struct geneve_dev *dummy, *geneve = NULL;
+	struct geneve_net *gn;
+	struct iphdr *iph = NULL;
+	struct pcpu_sw_netstats *stats;
+	struct hlist_head *vni_list_head;
+	int err = 0;
+	__u32 hash;
+
+	iph = ip_hdr(skb); /* Still outer IP header... */
+
+	gn = gs->rcv_data;
+
+	/* Find the device for this VNI */
+	hash = geneve_net_vni_hash(gnvh->vni);
+	vni_list_head = &gn->vni_list[hash];
+	hlist_for_each_entry_rcu(dummy, vni_list_head, hlist) {
+		if (!memcmp(gnvh->vni, dummy->vni, sizeof(dummy->vni)) &&
+		    iph->saddr == dummy->remote.sin_addr.s_addr) {
+			geneve = dummy;
+			break;
+		}
+	}
+	if (!geneve)
+		goto drop;
+
+	/* Drop packets w/ critical options,
+	 * since we don't support any...
+	 */
+	if (gnvh->critical)
+		goto drop;
+
+	skb_reset_mac_header(skb);
+	skb_scrub_packet(skb, !net_eq(geneve->net, dev_net(geneve->dev)));
+	skb->protocol = eth_type_trans(skb, geneve->dev);
+	skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
+
+	/* Ignore packet loops (and multicast echo) */
+	if (ether_addr_equal(eth_hdr(skb)->h_source, geneve->dev->dev_addr))
+		goto drop;
+
+	skb_reset_network_header(skb);
+
+	iph = ip_hdr(skb); /* Now inner IP header... */
+	err = IP_ECN_decapsulate(iph, skb);
+
+	if (unlikely(err)) {
+		if (log_ecn_error)
+			net_info_ratelimited("non-ECT from %pI4 with TOS=%#x\n",
+					     &iph->saddr, iph->tos);
+		if (err > 1) {
+			++geneve->dev->stats.rx_frame_errors;
+			++geneve->dev->stats.rx_errors;
+			goto drop;
+		}
+	}
+
+	stats = this_cpu_ptr(geneve->dev->tstats);
+	u64_stats_update_begin(&stats->syncp);
+	stats->rx_packets++;
+	stats->rx_bytes += skb->len;
+	u64_stats_update_end(&stats->syncp);
+
+	netif_rx(skb);
+
+	return;
+drop:
+	/* Consume bad packet */
+	kfree_skb(skb);
+}
+
+/* Setup stats when device is created */
+static int geneve_init(struct net_device *dev)
+{
+	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
+	if (!dev->tstats)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void geneve_uninit(struct net_device *dev)
+{
+	free_percpu(dev->tstats);
+}
+
+static int geneve_open(struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+	struct net *net = geneve->net;
+	struct geneve_net *gn = net_generic(geneve->net, geneve_net_id);
+	struct geneve_sock *gs;
+
+	gs = geneve_sock_add(net, htons(GENEVE_UDP_PORT), geneve_rx, gn,
+			     false, false);
+	if (IS_ERR(gs))
+		return PTR_ERR(gs);
+
+	geneve->sock = gs;
+
+	return 0;
+}
+
+static int geneve_stop(struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+	struct geneve_sock *gs = geneve->sock;
+
+	geneve_sock_release(gs);
+
+	return 0;
+}
+
+static netdev_tx_t geneve_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+	struct geneve_sock *gs = geneve->sock;
+	struct rtable *rt = NULL;
+	const struct iphdr *iip; /* interior IP header */
+	struct flowi4 fl4;
+	int err;
+	__be16 sport;
+	__u8 tos, ttl = 0;
+
+	iip = ip_hdr(skb);
+
+	skb_reset_mac_header(skb);
+
+	/* TODO: port min/max limits should be configurable */
+	sport = udp_flow_src_port(dev_net(dev), skb, 0, 0, true);
+
+	memset(&fl4, 0, sizeof(fl4));
+	fl4.daddr = geneve->remote.sin_addr.s_addr;
+	rt = ip_route_output_key(geneve->net, &fl4);
+	if (IS_ERR(rt)) {
+		netdev_dbg(dev, "no route to %pI4\n", &fl4.daddr);
+		dev->stats.tx_carrier_errors++;
+		goto tx_error;
+	}
+	if (rt->dst.dev == dev) { /* is this necessary? */
+		netdev_dbg(dev, "circular route to %pI4\n", &fl4.daddr);
+		dev->stats.collisions++;
+		goto rt_tx_error;
+	}
+
+	/* TODO: tos and ttl should be configurable */
+
+	tos = ip_tunnel_ecn_encap(0, iip, skb);
+
+	if (IN_MULTICAST(ntohl(fl4.daddr)))
+		ttl = 1;
+
+	ttl = ttl ? : ip4_dst_hoplimit(&rt->dst);
+
+	/* no need to handle local destination and encap bypass...yet... */
+
+	err = geneve_xmit_skb(gs, rt, skb, fl4.saddr, fl4.daddr,
+			      tos, ttl, 0, sport, htons(GENEVE_UDP_PORT), 0,
+			      geneve->vni, 0, NULL, false,
+			      !net_eq(geneve->net, dev_net(geneve->dev)));
+	if (err < 0)
+		ip_rt_put(rt);
+
+	iptunnel_xmit_stats(err, &dev->stats, dev->tstats);
+
+	return NETDEV_TX_OK;
+
+rt_tx_error:
+	ip_rt_put(rt);
+tx_error:
+	dev->stats.tx_errors++;
+	dev_kfree_skb(skb);
+	return NETDEV_TX_OK;
+}
+
+static const struct net_device_ops geneve_netdev_ops = {
+	.ndo_init		= geneve_init,
+	.ndo_uninit		= geneve_uninit,
+	.ndo_open		= geneve_open,
+	.ndo_stop		= geneve_stop,
+	.ndo_start_xmit		= geneve_xmit,
+	.ndo_get_stats64	= ip_tunnel_get_stats64,
+	.ndo_change_mtu		= eth_change_mtu,
+	.ndo_validate_addr	= eth_validate_addr,
+	.ndo_set_mac_address	= eth_mac_addr,
+};
+
+static void geneve_get_drvinfo(struct net_device *dev,
+			       struct ethtool_drvinfo *drvinfo)
+{
+	strlcpy(drvinfo->version, GENEVE_NETDEV_VER, sizeof(drvinfo->version));
+	strlcpy(drvinfo->driver, "geneve", sizeof(drvinfo->driver));
+}
+
+static const struct ethtool_ops geneve_ethtool_ops = {
+	.get_drvinfo	= geneve_get_drvinfo,
+	.get_link	= ethtool_op_get_link,
+};
+
+/* Info for udev, that this is a virtual tunnel endpoint */
+static struct device_type geneve_type = {
+	.name = "geneve",
+};
+
+/* Initialize the device structure. */
+static void geneve_setup(struct net_device *dev)
+{
+	ether_setup(dev);
+
+	dev->netdev_ops = &geneve_netdev_ops;
+	dev->ethtool_ops = &geneve_ethtool_ops;
+	dev->destructor = free_netdev;
+
+	SET_NETDEV_DEVTYPE(dev, &geneve_type);
+
+	dev->tx_queue_len = 0;
+	dev->features    |= NETIF_F_LLTX;
+	dev->features    |= NETIF_F_SG | NETIF_F_HW_CSUM;
+	dev->features    |= NETIF_F_RXCSUM;
+	dev->features    |= NETIF_F_GSO_SOFTWARE;
+
+	dev->vlan_features = dev->features;
+	dev->features    |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
+
+	dev->hw_features |= NETIF_F_SG | NETIF_F_HW_CSUM | NETIF_F_RXCSUM;
+	dev->hw_features |= NETIF_F_GSO_SOFTWARE;
+	dev->hw_features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
+
+	netif_keep_dst(dev);
+	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
+}
+
+static const struct nla_policy geneve_policy[IFLA_GENEVE_MAX + 1] = {
+	[IFLA_GENEVE_ID]		= { .type = NLA_U32 },
+	[IFLA_GENEVE_REMOTE]	= { .len = FIELD_SIZEOF(struct iphdr, daddr) },
+};
+
+static int geneve_validate(struct nlattr *tb[], struct nlattr *data[])
+{
+	if (tb[IFLA_ADDRESS]) {
+		if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
+			return -EINVAL;
+
+		if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
+			return -EADDRNOTAVAIL;
+	}
+
+	if (!data)
+		return -EINVAL;
+
+	if (data[IFLA_GENEVE_ID]) {
+		__u32 vni =  nla_get_u32(data[IFLA_GENEVE_ID]);
+
+		if (vni >= GENEVE_VID_MASK)
+			return -ERANGE;
+	}
+
+	return 0;
+}
+
+static int geneve_newlink(struct net *net, struct net_device *dev,
+			  struct nlattr *tb[], struct nlattr *data[])
+{
+	struct geneve_net *gn = net_generic(net, geneve_net_id);
+	struct geneve_dev *dummy, *geneve = netdev_priv(dev);
+	struct hlist_head *vni_list_head;
+	struct sockaddr_in remote;	/* IPv4 address for link partner */
+	__u32 vni, hash;
+	int err;
+
+	if (!data[IFLA_GENEVE_ID] || !data[IFLA_GENEVE_REMOTE])
+		return -EINVAL;
+
+	geneve->net = net;
+	geneve->dev = dev;
+
+	vni = nla_get_u32(data[IFLA_GENEVE_ID]);
+	geneve->vni[0] = (vni & 0x00ff0000) >> 16;
+	geneve->vni[1] = (vni & 0x0000ff00) >> 8;
+	geneve->vni[2] =  vni & 0x000000ff;
+
+	geneve->remote.sin_addr.s_addr =
+		nla_get_in_addr(data[IFLA_GENEVE_REMOTE]);
+	if (IN_MULTICAST(ntohl(geneve->remote.sin_addr.s_addr)))
+		return -EINVAL;
+
+	remote = geneve->remote;
+	hash = geneve_net_vni_hash(geneve->vni);
+	vni_list_head = &gn->vni_list[hash];
+	hlist_for_each_entry_rcu(dummy, vni_list_head, hlist) {
+		if (!memcmp(geneve->vni, dummy->vni, sizeof(dummy->vni)) &&
+		    !memcmp(&remote, &dummy->remote, sizeof(dummy->remote)))
+			return -EBUSY;
+	}
+
+	if (tb[IFLA_ADDRESS] == NULL)
+		eth_hw_addr_random(dev);
+
+	err = register_netdevice(dev);
+	if (err)
+		return err;
+
+	list_add(&geneve->next, &gn->geneve_list);
+
+	hlist_add_head_rcu(&geneve->hlist, &gn->vni_list[hash]);
+
+	return 0;
+}
+
+static void geneve_dellink(struct net_device *dev, struct list_head *head)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+
+	if (!hlist_unhashed(&geneve->hlist))
+		hlist_del_rcu(&geneve->hlist);
+
+	list_del(&geneve->next);
+	unregister_netdevice_queue(dev, head);
+}
+
+static size_t geneve_get_size(const struct net_device *dev)
+{
+	return nla_total_size(sizeof(__u32)) +	/* IFLA_GENEVE_ID */
+		nla_total_size(sizeof(struct in_addr)) + /* IFLA_GENEVE_REMOTE */
+		0;
+}
+
+static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev)
+{
+	struct geneve_dev *geneve = netdev_priv(dev);
+	__u32 vni;
+
+	vni = (geneve->vni[0] << 16) | (geneve->vni[1] << 8) | geneve->vni[2];
+	if (nla_put_u32(skb, IFLA_GENEVE_ID, vni))
+		goto nla_put_failure;
+
+	if (nla_put_in_addr(skb, IFLA_GENEVE_REMOTE,
+			    geneve->remote.sin_addr.s_addr))
+		goto nla_put_failure;
+
+	return 0;
+
+nla_put_failure:
+	return -EMSGSIZE;
+}
+
+static struct rtnl_link_ops geneve_link_ops __read_mostly = {
+	.kind		= "geneve",
+	.maxtype	= IFLA_GENEVE_MAX,
+	.policy		= geneve_policy,
+	.priv_size	= sizeof(struct geneve_dev),
+	.setup		= geneve_setup,
+	.validate	= geneve_validate,
+	.newlink	= geneve_newlink,
+	.dellink	= geneve_dellink,
+	.get_size	= geneve_get_size,
+	.fill_info	= geneve_fill_info,
+};
+
+static __net_init int geneve_init_net(struct net *net)
+{
+	struct geneve_net *gn = net_generic(net, geneve_net_id);
+	unsigned int h;
+
+	INIT_LIST_HEAD(&gn->geneve_list);
+
+	for (h = 0; h < VNI_HASH_SIZE; ++h)
+		INIT_HLIST_HEAD(&gn->vni_list[h]);
+
+	return 0;
+}
+
+static void __net_exit geneve_exit_net(struct net *net)
+{
+	struct geneve_net *gn = net_generic(net, geneve_net_id);
+	struct geneve_dev *geneve, *next;
+	struct net_device *dev, *aux;
+	LIST_HEAD(list);
+
+	rtnl_lock();
+
+	/* gather any geneve devices that were moved into this ns */
+	for_each_netdev_safe(net, dev, aux)
+		if (dev->rtnl_link_ops == &geneve_link_ops)
+			unregister_netdevice_queue(dev, &list);
+
+	/* now gather any other geneve devices that were created in this ns */
+	list_for_each_entry_safe(geneve, next, &gn->geneve_list, next) {
+		/* If geneve->dev is in the same netns, it was already added
+		 * to the list by the previous loop.
+		 */
+		if (!net_eq(dev_net(geneve->dev), net))
+			unregister_netdevice_queue(geneve->dev, &list);
+	}
+
+	/* unregister the devices gathered above */
+	unregister_netdevice_many(&list);
+	rtnl_unlock();
+}
+
+static struct pernet_operations geneve_net_ops = {
+	.init = geneve_init_net,
+	.exit = geneve_exit_net,
+	.id   = &geneve_net_id,
+	.size = sizeof(struct geneve_net),
+};
+
+static int __init geneve_init_module(void)
+{
+	int rc;
+
+	rc = register_pernet_subsys(&geneve_net_ops);
+	if (rc)
+		goto out1;
+
+	rc = rtnl_link_register(&geneve_link_ops);
+	if (rc)
+		goto out2;
+
+	return 0;
+out2:
+	unregister_pernet_subsys(&geneve_net_ops);
+out1:
+	return rc;
+}
+late_initcall(geneve_init_module);
+
+static void __exit geneve_cleanup_module(void)
+{
+	rtnl_link_unregister(&geneve_link_ops);
+	unregister_pernet_subsys(&geneve_net_ops);
+}
+module_exit(geneve_cleanup_module);
+
+MODULE_LICENSE("GPL");
+MODULE_VERSION(GENEVE_NETDEV_VER);
+MODULE_AUTHOR("John W. Linville <linville@tuxdriver.com>");
+MODULE_DESCRIPTION("Interface driver for GENEVE encapsulated traffic");
+MODULE_ALIAS_RTNL_LINK("geneve");
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index d9cd19214b98..2ca17d1cff3f 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -390,6 +390,15 @@ struct ifla_vxlan_port_range {
 	__be16	high;
 };
 
+/* GENEVE section */
+enum {
+	IFLA_GENEVE_UNSPEC,
+	IFLA_GENEVE_ID,
+	IFLA_GENEVE_REMOTE,
+	__IFLA_GENEVE_MAX
+};
+#define IFLA_GENEVE_MAX	(__IFLA_GENEVE_MAX - 1)
+
 /* Bonding section */
 
 enum {
-- 
2.1.0
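One detail of the v2 patch that may deserve a second look is the VNI
range check. geneve_validate() returns -ERANGE when
vni >= GENEVE_VID_MASK, and GENEVE_VID_MASK is (1u << 24) - 1, so the
largest 24-bit value is refused, whereas the iproute2 patch in this
thread documents the range as 0-16777215 and tests vni >= 1u << 24.
The thread does not say whether that difference is intended. The
userspace sketch below only contrasts the two comparisons and shows the
three-byte packing used by newlink and fill_info; every toy_* name is
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_N_VID	(1u << 24)		/* number of 24-bit VNIs */
#define TOY_VID_MASK	(TOY_N_VID - 1)		/* 0xffffff, largest 24-bit value */

/* Mirrors the comparison in geneve_validate(): also rejects 0xffffff. */
static bool toy_vni_ok_mask(uint32_t vni)
{
	return !(vni >= TOY_VID_MASK);
}

/* Mirrors the iproute2-side comparison: accepts 0 through 0xffffff. */
static bool toy_vni_ok_count(uint32_t vni)
{
	return !(vni >= TOY_N_VID);
}

/* Split a VNI into three bytes, most significant first. */
static void toy_vni_pack(uint32_t vni, uint8_t out[3])
{
	out[0] = (vni & 0x00ff0000) >> 16;
	out[1] = (vni & 0x0000ff00) >> 8;
	out[2] =  vni & 0x000000ff;
}

/* Rebuild the integer from its three-byte form. */
static uint32_t toy_vni_unpack(const uint8_t in[3])
{
	return ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | in[2];
}
```

With these helpers, VNI 1234 packs to the bytes 00 04 d2, and 16777215
passes toy_vni_ok_count() but not toy_vni_ok_mask().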
* Re: [PATCH v2 5/5] geneve: add initial netdev driver for GENEVE tunnels
From: David Miller @ 2015-05-13 3:06 UTC
To: linville; +Cc: netdev, jesse, azhou, stephen, alexander.h.duyck

From: "John W. Linville" <linville@tuxdriver.com>
Date: Mon, 11 May 2015 16:51:06 -0400

> This is an initial implementation of a netdev driver for GENEVE
> tunnels.  This implementation uses a fixed UDP port, and only supports
> point-to-point links with specific partner endpoints.  Only IPv4
> links are supported at this time.
>
> Signed-off-by: John W. Linville <linville@tuxdriver.com>
> ---
> Changes in v2:
> - removal of unneeded special lock for vni_list
> - removal of geneve_net_vni_add/del (replaced by open code)
> - break out of vni search loop in geneve_rx after match found
> - no longer deferring socket open at ndo_init(), now doing it in ndo_open()
> - check for non-multicast, non-zero remote link partner in newlink()
> - remove now unused workqueue stuff

John, could you please repost the full series when you make changes
based upon feedback?  That helps me a lot.

Thanks!
* Re: [PATCH v2 5/5] geneve: add initial netdev driver for GENEVE tunnels
From: John W. Linville @ 2015-05-13 16:53 UTC
To: David Miller; +Cc: netdev, jesse, azhou, stephen, alexander.h.duyck

On Tue, May 12, 2015 at 11:06:28PM -0400, David Miller wrote:
> From: "John W. Linville" <linville@tuxdriver.com>
> Date: Mon, 11 May 2015 16:51:06 -0400
>
> > This is an initial implementation of a netdev driver for GENEVE
> > tunnels.  This implementation uses a fixed UDP port, and only supports
> > point-to-point links with specific partner endpoints.  Only IPv4
> > links are supported at this time.
> >
> > Signed-off-by: John W. Linville <linville@tuxdriver.com>
> > ---
> > Changes in v2:
> > - removal of unneeded special lock for vni_list
> > - removal of geneve_net_vni_add/del (replaced by open code)
> > - break out of vni search loop in geneve_rx after match found
> > - no longer deferring socket open at ndo_init(), now doing it in ndo_open()
> > - check for non-multicast, non-zero remote link partner in newlink()
> > - remove now unused workqueue stuff
>
> John, could you please repost the full series when you make changes
> based upon feedback?  That helps me a lot.
>
> Thanks!

Sure, no problem... :-)

-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.
* [PATCH] iproute2: GENEVE support
From: John W. Linville @ 2015-05-08 17:27 UTC
To: netdev
Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger, Alexander Duyck, John W. Linville

Signed-off-by: John W. Linville <linville@tuxdriver.com>
---
This includes the include/linux/if_link.h bits, that will need to be
dropped after iproute2 does the 4.1 update for that file...

 include/linux/if_link.h |   9 ++++
 ip/Makefile             |   3 +-
 ip/iplink.c             |   2 +-
 ip/iplink_geneve.c      | 122 ++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 134 insertions(+), 2 deletions(-)
 create mode 100644 ip/iplink_geneve.c

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 3d0d61317733..86058638c4d9 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -388,6 +388,15 @@ struct ifla_vxlan_port_range {
 	__be16	high;
 };
 
+/* GENEVE section */
+enum {
+	IFLA_GENEVE_UNSPEC,
+	IFLA_GENEVE_ID,
+	IFLA_GENEVE_REMOTE,
+	__IFLA_GENEVE_MAX
+};
+#define IFLA_GENEVE_MAX	(__IFLA_GENEVE_MAX - 1)
+
 /* Bonding section */
 
 enum {
diff --git a/ip/Makefile b/ip/Makefile
index 2c742f305fef..77653ecc5785 100644
--- a/ip/Makefile
+++ b/ip/Makefile
@@ -6,7 +6,8 @@ IPOBJ=ip.o ipaddress.o ipaddrlabel.o iproute.o iprule.o ipnetns.o \
     iplink_macvlan.o iplink_macvtap.o ipl2tp.o link_vti.o link_vti6.o \
     iplink_vxlan.o tcp_metrics.o iplink_ipoib.o ipnetconf.o link_ip6tnl.o \
     link_iptnl.o link_gre6.o iplink_bond.o iplink_bond_slave.o iplink_hsr.o \
-    iplink_bridge.o iplink_bridge_slave.o ipfou.o iplink_ipvlan.o
+    iplink_bridge.o iplink_bridge_slave.o ipfou.o iplink_ipvlan.o \
+    iplink_geneve.o
 
 RTMONOBJ=rtmon.o
 
diff --git a/ip/iplink.c b/ip/iplink.c
index bb437b96239a..39c76e778020 100644
--- a/ip/iplink.c
+++ b/ip/iplink.c
@@ -93,7 +93,7 @@ void iplink_usage(void)
 		fprintf(stderr, "TYPE := { vlan | veth | vcan | dummy | ifb | macvlan | macvtap |\n");
 		fprintf(stderr, " bridge | bond | ipoib | ip6tnl | ipip | sit | vxlan |\n");
 		fprintf(stderr, " gre | gretap | ip6gre | ip6gretap | vti | nlmon |\n");
-		fprintf(stderr, " bond_slave | ipvlan }\n");
+		fprintf(stderr, " bond_slave | ipvlan | geneve }\n");
 	}
 	exit(-1);
 }
diff --git a/ip/iplink_geneve.c b/ip/iplink_geneve.c
new file mode 100644
index 000000000000..74703e1ee156
--- /dev/null
+++ b/ip/iplink_geneve.c
@@ -0,0 +1,122 @@
+/*
+ * iplink_geneve.c	GENEVE device support
+ *
+ *              This program is free software; you can redistribute it and/or
+ *              modify it under the terms of the GNU General Public License
+ *              as published by the Free Software Foundation; either version
+ *              2 of the License, or (at your option) any later version.
+ *
+ * Authors:     John W. Linville <linville@tuxdriver.com>
+ */
+
+#include <stdio.h>
+
+#include "utils.h"
+#include "ip_common.h"
+
+static void print_explain(FILE *f)
+{
+	fprintf(f, "Usage: ... geneve id VNI remote ADDR\n");
+	fprintf(f, "\n");
+	fprintf(f, "Where: VNI := 0-16777215\n");
+	fprintf(f, " ADDR := IP_ADDRESS\n");
+}
+
+static void explain(void)
+{
+	print_explain(stderr);
+}
+
+static int geneve_parse_opt(struct link_util *lu, int argc, char **argv,
+			  struct nlmsghdr *n)
+{
+	__u32 vni = 0;
+	int vni_set = 0;
+	__u32 daddr = 0;
+	struct in6_addr daddr6 = IN6ADDR_ANY_INIT;
+
+
+	while (argc > 0) {
+		if (!matches(*argv, "id") ||
+		    !matches(*argv, "vni")) {
+			NEXT_ARG();
+			if (get_u32(&vni, *argv, 0) ||
+			    vni >= 1u << 24)
+				invarg("invalid id", *argv);
+			vni_set = 1;
+		} else if (!matches(*argv, "remote")) {
+			NEXT_ARG();
+			if (!inet_get_addr(*argv, &daddr, &daddr6)) {
+				fprintf(stderr, "Invalid address \"%s\"\n", *argv);
+				return -1;
+			}
+			if (IN_MULTICAST(ntohl(daddr)))
+				invarg("invalid remote address", *argv);
+		} else if (matches(*argv, "help") == 0) {
+			explain();
+			return -1;
+		} else {
+			fprintf(stderr, "geneve: unknown command \"%s\"?\n", *argv);
+			explain();
+			return -1;
+		}
+		argc--, argv++;
+	}
+
+	if (!vni_set) {
+		fprintf(stderr, "geneve: missing virtual network identifier\n");
+		return -1;
+	}
+
+	if (!daddr) {
+		fprintf(stderr, "geneve: remove link partner not specified\n");
+		return -1;
+	}
+	if (memcmp(&daddr6, &in6addr_any, sizeof(daddr6)) != 0) {
+		fprintf(stderr, "geneve: remove link over IPv6 not supported\n");
+		return -1;
+	}
+
+	addattr32(n, 1024, IFLA_GENEVE_ID, vni);
+	if (daddr)
+		addattr_l(n, 1024, IFLA_GENEVE_REMOTE, &daddr, 4);
+
+	return 0;
+}
+
+static void geneve_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
+{
+	__u32 vni;
+	char s1[1024];
+
+	if (!tb)
+		return;
+
+	if (!tb[IFLA_GENEVE_ID] ||
+	    RTA_PAYLOAD(tb[IFLA_GENEVE_ID]) < sizeof(__u32))
+		return;
+
+	vni = rta_getattr_u32(tb[IFLA_GENEVE_ID]);
+	fprintf(f, "id %u ", vni);
+
+	if (tb[IFLA_GENEVE_REMOTE]) {
+		__be32 addr = rta_getattr_u32(tb[IFLA_GENEVE_REMOTE]);
+		if (addr)
+			fprintf(f, "remote %s ",
+				format_host(AF_INET, 4, &addr, s1, sizeof(s1)));
+	}
+}
+
+static void geneve_print_help(struct link_util *lu, int argc, char **argv,
+	FILE *f)
+{
+	print_explain(f);
+}
+
+struct link_util geneve_link_util = {
+	.id		= "geneve",
+	.maxattr	= IFLA_GENEVE_MAX,
+	.parse_opt	= geneve_parse_opt,
+	.print_opt	= geneve_print_opt,
+	.print_help	= geneve_print_help,
+};
-- 
2.1.0
* Re: [PATCH] iproute2: GENEVE support
From: Jesse Gross @ 2015-05-08 23:27 UTC
To: John W. Linville
Cc: netdev, David S. Miller, Andy Zhou, Stephen Hemminger, Alexander Duyck

On Fri, May 8, 2015 at 10:27 AM, John W. Linville
<linville@tuxdriver.com> wrote:
> diff --git a/ip/iplink_geneve.c b/ip/iplink_geneve.c
> new file mode 100644
> index 000000000000..74703e1ee156
> --- /dev/null
> +++ b/ip/iplink_geneve.c
> +static int geneve_parse_opt(struct link_util *lu, int argc, char **argv,
> +			  struct nlmsghdr *n)
> +{
[...]
> +		} else if (!matches(*argv, "remote")) {
> +			NEXT_ARG();
> +			if (!inet_get_addr(*argv, &daddr, &daddr6)) {
> +				fprintf(stderr, "Invalid address \"%s\"\n", *argv);
> +				return -1;
> +			}
> +			if (IN_MULTICAST(ntohl(daddr)))
> +				invarg("invalid remote address", *argv);

We should probably validate the no multicast check in the kernel as
well since it won't do the right thing anyways.

[...]
> +	if (!daddr) {
> +		fprintf(stderr, "geneve: remove link partner not specified\n");
> +		return -1;
> +	}
> +	if (memcmp(&daddr6, &in6addr_any, sizeof(daddr6)) != 0) {
> +		fprintf(stderr, "geneve: remove link over IPv6 not supported\n");
> +		return -1;
> +	}

Two typos in the above strings - "remove" instead of "remote".
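The remote-address validation discussed in this review (and listed in the
kernel v2 changelog as a check for a non-multicast, non-zero link
partner) can be sketched as a standalone userspace helper. This is an
illustration only: toy_remote_ok() is a hypothetical name, and it uses
the standard inet_pton() rather than iproute2's own address parser.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/*
 * Hypothetical helper: accept only a non-zero, non-multicast IPv4
 * address given in dotted-quad text form.
 */
static bool toy_remote_ok(const char *text)
{
	struct in_addr addr;

	if (inet_pton(AF_INET, text, &addr) != 1)
		return false;		/* not a valid IPv4 address */
	if (addr.s_addr == 0)
		return false;		/* 0.0.0.0: no link partner given */
	if (IN_MULTICAST(ntohl(addr.s_addr)))
		return false;		/* 224.0.0.0/4 is multicast */
	return true;
}
```

IN_MULTICAST() expects a host-order value, hence the ntohl() on the
network-order address, matching the usage in both patches.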
* [PATCH v2] iproute2: GENEVE support
From: John W. Linville @ 2015-05-11 18:47 UTC
To: netdev
Cc: David S. Miller, Jesse Gross, Andy Zhou, Stephen Hemminger, Alexander Duyck, John W. Linville

Signed-off-by: John W. Linville <linville@tuxdriver.com>
---
This includes the include/linux/if_link.h bits, that will need to be
dropped after iproute2 does the 4.1 update for that file...

v2 - spelling correction identified by Jesse Gross

 include/linux/if_link.h |   9 ++++
 ip/Makefile             |   3 +-
 ip/iplink.c             |   2 +-
 ip/iplink_geneve.c      | 122 ++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 134 insertions(+), 2 deletions(-)
 create mode 100644 ip/iplink_geneve.c

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 3d0d61317733..86058638c4d9 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -388,6 +388,15 @@ struct ifla_vxlan_port_range {
 	__be16	high;
 };
 
+/* GENEVE section */
+enum {
+	IFLA_GENEVE_UNSPEC,
+	IFLA_GENEVE_ID,
+	IFLA_GENEVE_REMOTE,
+	__IFLA_GENEVE_MAX
+};
+#define IFLA_GENEVE_MAX	(__IFLA_GENEVE_MAX - 1)
+
 /* Bonding section */
 
 enum {
diff --git a/ip/Makefile b/ip/Makefile
index 2c742f305fef..77653ecc5785 100644
--- a/ip/Makefile
+++ b/ip/Makefile
@@ -6,7 +6,8 @@ IPOBJ=ip.o ipaddress.o ipaddrlabel.o iproute.o iprule.o ipnetns.o \
     iplink_macvlan.o iplink_macvtap.o ipl2tp.o link_vti.o link_vti6.o \
     iplink_vxlan.o tcp_metrics.o iplink_ipoib.o ipnetconf.o link_ip6tnl.o \
     link_iptnl.o link_gre6.o iplink_bond.o iplink_bond_slave.o iplink_hsr.o \
-    iplink_bridge.o iplink_bridge_slave.o ipfou.o iplink_ipvlan.o
+    iplink_bridge.o iplink_bridge_slave.o ipfou.o iplink_ipvlan.o \
+    iplink_geneve.o
 
 RTMONOBJ=rtmon.o
 
diff --git a/ip/iplink.c b/ip/iplink.c
index bb437b96239a..39c76e778020 100644
--- a/ip/iplink.c
+++ b/ip/iplink.c
@@ -93,7 +93,7 @@ void iplink_usage(void)
 		fprintf(stderr, "TYPE := { vlan | veth | vcan | dummy | ifb | macvlan | macvtap |\n");
 		fprintf(stderr, "          bridge | bond | ipoib | ip6tnl | ipip | sit | vxlan |\n");
 		fprintf(stderr, "          gre | gretap | ip6gre | ip6gretap | vti | nlmon |\n");
-		fprintf(stderr, "          bond_slave | ipvlan }\n");
+		fprintf(stderr, "          bond_slave | ipvlan | geneve }\n");
 	}
 	exit(-1);
 }
diff --git a/ip/iplink_geneve.c b/ip/iplink_geneve.c
new file mode 100644
index 000000000000..8b4cf57bdec2
--- /dev/null
+++ b/ip/iplink_geneve.c
@@ -0,0 +1,122 @@
+/*
+ * iplink_geneve.c	GENEVE device support
+ *
+ *              This program is free software; you can redistribute it and/or
+ *              modify it under the terms of the GNU General Public License
+ *              as published by the Free Software Foundation; either version
+ *              2 of the License, or (at your option) any later version.
+ *
+ * Authors:     John W. Linville <linville@tuxdriver.com>
+ */
+
+#include <stdio.h>
+
+#include "utils.h"
+#include "ip_common.h"
+
+static void print_explain(FILE *f)
+{
+	fprintf(f, "Usage: ... geneve id VNI remote ADDR\n");
+	fprintf(f, "\n");
+	fprintf(f, "Where: VNI := 0-16777215\n");
+	fprintf(f, "       ADDR := IP_ADDRESS\n");
+}
+
+static void explain(void)
+{
+	print_explain(stderr);
+}
+
+static int geneve_parse_opt(struct link_util *lu, int argc, char **argv,
+			  struct nlmsghdr *n)
+{
+	__u32 vni = 0;
+	int vni_set = 0;
+	__u32 daddr = 0;
+	struct in6_addr daddr6 = IN6ADDR_ANY_INIT;
+
+
+	while (argc > 0) {
+		if (!matches(*argv, "id") ||
+		    !matches(*argv, "vni")) {
+			NEXT_ARG();
+			if (get_u32(&vni, *argv, 0) ||
+			    vni >= 1u << 24)
+				invarg("invalid id", *argv);
+			vni_set = 1;
+		} else if (!matches(*argv, "remote")) {
+			NEXT_ARG();
+			if (!inet_get_addr(*argv, &daddr, &daddr6)) {
+				fprintf(stderr, "Invalid address \"%s\"\n", *argv);
+				return -1;
+			}
+			if (IN_MULTICAST(ntohl(daddr)))
+				invarg("invalid remote address", *argv);
+		} else if (matches(*argv, "help") == 0) {
+			explain();
+			return -1;
+		} else {
+			fprintf(stderr, "geneve: unknown command \"%s\"?\n", *argv);
+			explain();
+			return -1;
+		}
+		argc--, argv++;
+	}
+
+	if (!vni_set) {
+		fprintf(stderr, "geneve: missing virtual network identifier\n");
+		return -1;
+	}
+
+	if (!daddr) {
+		fprintf(stderr, "geneve: remote link partner not specified\n");
+		return -1;
+	}
+	if (memcmp(&daddr6, &in6addr_any, sizeof(daddr6)) != 0) {
+		fprintf(stderr, "geneve: remote link over IPv6 not supported\n");
+		return -1;
+	}
+
+	addattr32(n, 1024, IFLA_GENEVE_ID, vni);
+	if (daddr)
+		addattr_l(n, 1024, IFLA_GENEVE_REMOTE, &daddr, 4);
+
+	return 0;
+}
+
+static void geneve_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
+{
+	__u32 vni;
+	char s1[1024];
+
+	if (!tb)
+		return;
+
+	if (!tb[IFLA_GENEVE_ID] ||
+	    RTA_PAYLOAD(tb[IFLA_GENEVE_ID]) < sizeof(__u32))
+		return;
+
+	vni = rta_getattr_u32(tb[IFLA_GENEVE_ID]);
+	fprintf(f, "id %u ", vni);
+
+	if (tb[IFLA_GENEVE_REMOTE]) {
+		__be32 addr = rta_getattr_u32(tb[IFLA_GENEVE_REMOTE]);
+		if (addr)
+			fprintf(f, "remote %s ",
+				format_host(AF_INET, 4, &addr, s1, sizeof(s1)));
+	}
+}
+
+static void geneve_print_help(struct link_util *lu, int argc, char **argv,
+	FILE *f)
+{
+	print_explain(f);
+}
+
+struct link_util geneve_link_util = {
+	.id		= "geneve",
+	.maxattr	= IFLA_GENEVE_MAX,
+	.parse_opt	= geneve_parse_opt,
+	.print_opt	= geneve_print_opt,
+	.print_help	= geneve_print_help,
+};
-- 
2.1.0
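As a side note on the `vni >= 1u << 24` test in the patch: the usage text gives the VNI range as 0-16777215, which is exactly the set of values below 2^24. The sketch below shows that range check in isolation. It rests on assumptions: `parse_vni()` and `VNI_LIMIT` are hypothetical names, and `strtoul()` stands in for iproute2's `get_u32()`, whose exact parsing rules may differ.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* One past the largest 24-bit value: 16777216 (hypothetical name) */
#define VNI_LIMIT (1u << 24)

/*
 * Hypothetical stand-in for get_u32() plus the patch's range check.
 * Returns 0 and stores the value in *vni on success, -1 otherwise.
 */
int parse_vni(const char *s, uint32_t *vni)
{
	char *end;
	unsigned long v;

	if (!s || !*s)
		return -1;

	errno = 0;
	v = strtoul(s, &end, 0);	/* base 0: accepts decimal, 0x.., 0.. */
	if (errno || end == s || *end != '\0')
		return -1;

	/* Same bound as the patch: reject anything that needs more than 24 bits */
	if (v >= VNI_LIMIT)
		return -1;

	*vni = (uint32_t)v;
	return 0;
}
```

Under these assumptions, "1234" and "16777215" are accepted, while "16777216", "abc" and the empty string are rejected.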
* Re: [PATCH] add GENEVE netdev tunnel driver
From: Stephen Hemminger @ 2015-05-08 19:32 UTC
To: John W. Linville
Cc: netdev, David S. Miller, Jesse Gross, Andy Zhou, Alexander Duyck

On Fri, 8 May 2015 13:20:52 -0400
"John W. Linville" <linville@tuxdriver.com> wrote:

> This 5-patch kernel series adds a netdev implementation of a GENEVE
> tunnel driver, and the single iproute2 patch enables creation and
> such for those netdevs. This makes use of the existing GENEVE
> infrastructure already used by the OVS code. The net/ipv4/geneve.c
> file is renamed as net/ipv4/geneve_core.c as part of these changes.
>
>  drivers/net/Kconfig            |   14 +
>  drivers/net/Makefile           |    1
>  drivers/net/geneve.c           |  550 +++++++++++++++++++++++++++++++++++++++++
>  include/net/geneve.h           |    5
>  include/uapi/linux/if_link.h   |    9
>  net/ipv4/Kconfig               |    4
>  net/ipv4/Makefile              |    2
>  net/ipv4/geneve.c              |    6
>  net/ipv4/geneve_core.c         |    4
>  net/openvswitch/Kconfig        |    2
>  net/openvswitch/vport-geneve.c |    5
>  11 files changed, 585 insertions(+), 17 deletions(-)
>
> The overall structure of the GENEVE netdev driver is strongly
> influenced by the VXLAN netdev driver. This is not surprising, as the
> two drivers are intended to serve similar purposes. As development of
> the GENEVE driver continues, it is likely that those similarities will
> grow stronger. This will include both simple configuration options
> (e.g. TOS and TTL settings) and new control plane support.

Looks good. Thanks.

Acked-by: Stephen Hemminger <stephen@networkplumber.org>
end of thread (~2015-05-13 17:00 UTC)

Thread overview: 18+ messages
2015-05-08 17:20 [PATCH] add GENEVE netdev tunnel driver John W. Linville
2015-05-08 17:20 ` [PATCH 1/5] geneve: remove MODULE_ALIAS_RTNL_LINK from net/ipv4/geneve.c John W. Linville
2015-05-08 17:20 ` [PATCH 2/5] geneve: move definition of geneve_hdr() to geneve.h John W. Linville
2015-05-08 17:20 ` [PATCH 3/5] geneve: Rename support library as geneve_core John W. Linville
2015-05-08 17:20 ` [PATCH 4/5] geneve_core: identify as driver library in modules description John W. Linville
2015-05-08 17:20 ` [PATCH 5/5] geneve: add initial netdev driver for GENEVE tunnels John W. Linville
2015-05-08 20:55 ` Cong Wang
2015-05-08 23:22 ` John W. Linville
2015-05-10 23:48 ` David Miller
2015-05-11 15:17 ` John W. Linville
2015-05-08 23:19 ` Jesse Gross
2015-05-11 20:51 ` [PATCH v2 " John W. Linville
2015-05-13 3:06 ` David Miller
2015-05-13 16:53 ` John W. Linville
2015-05-08 17:27 ` [PATCH] iproute2: GENEVE support John W. Linville
2015-05-08 23:27 ` Jesse Gross
2015-05-11 18:47 ` [PATCH v2] " John W. Linville
2015-05-08 19:32 ` [PATCH] add GENEVE netdev tunnel driver Stephen Hemminger