From: Jiri Slaby <jslaby@suse.cz>
To: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org,
	David Stevens <dlstevens@us.ibm.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jiri Slaby <jslaby@suse.cz>
Subject: [PATCH 3.12 48/72] vxlan: fix nonfunctional neigh_reduce()
Date: Fri, 18 Apr 2014 11:22:21 +0200
Message-ID: <d8be18c52dbc94989f6d74637b731af39cd3d902.1397812482.git.jslaby@suse.cz>
In-Reply-To: <3389f243c528afc7c7300c83b8f296290cd3656d.1397812482.git.jslaby@suse.cz>
In-Reply-To: <cover.1397812482.git.jslaby@suse.cz>

From: David Stevens <dlstevens@us.ibm.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

[ Upstream commit 4b29dba9c085a4fb79058fb1c45a2f6257ca3dfa ]

The VXLAN neigh_reduce() code is completely non-functional since
check-in. Specific errors:

1) The original code drops all packets with a multicast destination address,
	even though neighbor solicitations are sent to the solicited-node
	address, a multicast address. The code after this check was never run.
2) The neighbor table lookup used the IPv6 header destination, which is the
	solicited-node address, rather than the target address from the
	neighbor solicitation, so the lookup would always fail even if the
	code got this far. L3MISS notifications reported the same wrong
	address.
3) The code calls ndisc_send_na(), which does a send on the tunnel device.
	The context for neigh_reduce() is the transmit path, vxlan_xmit(),
	where the host or a bridge-attached neighbor is trying to transmit
	a neighbor solicitation. To respond to it, the tunnel endpoint needs
	to do a *receive* of the appropriate neighbor advertisement. Doing a
	send would only try to send the advertisement, encapsulated, to the
	remote destinations in the fdb -- hosts that definitely did not do the
	corresponding solicitation.
4) The code uses the tunnel endpoint IPv6 forwarding flag to determine the
	isrouter flag in the advertisement. This has nothing to do with whether
	or not the target is a router, and generally won't be set since the
	tunnel endpoint is bridging, not routing, traffic.

The patch below creates a proxy neighbor advertisement to respond to
neighbor solicitations as intended, providing proper IPv6 support for neighbor
reduction.
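
To illustrate points 1 and 2, here is a minimal user-space sketch (not part
of the patch) of how the solicited-node address relates to the solicitation
target. The type `struct in6` and the helpers `in6_is_multicast()` and
`in6_solicited_node()` are hypothetical names made up for this example; the
kernel uses `struct in6_addr` and its own helpers instead.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an IPv6 address, network byte order. */
struct in6 {
	uint8_t b[16];
};

/* Multicast addresses are those in ff00::/8. */
static int in6_is_multicast(const struct in6 *a)
{
	return a->b[0] == 0xff;
}

/*
 * Build the solicited-node multicast address for a target:
 * ff02::1:ffXX:XXXX, where XX:XXXX are the low 24 bits of the target.
 */
static struct in6 in6_solicited_node(const struct in6 *target)
{
	struct in6 sn;

	memset(&sn, 0, sizeof(sn));
	sn.b[0] = 0xff;
	sn.b[1] = 0x02;
	sn.b[11] = 0x01;
	sn.b[12] = 0xff;
	memcpy(&sn.b[13], &target->b[13], 3);
	return sn;
}
```

For a target of 2001:db8::1234:5678 this yields ff02::1:ff34:5678. The
result is always multicast, so a check on the header destination rejects
every normal solicitation, and a neighbor lookup keyed on it cannot match an
entry stored under the unicast target.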

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/net/vxlan.c | 127 ++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 113 insertions(+), 14 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 020fe03f37c0..6c0d1c103286 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1236,15 +1236,103 @@ out:
 }
 
 #if IS_ENABLED(CONFIG_IPV6)
+
+static struct sk_buff *vxlan_na_create(struct sk_buff *request,
+	struct neighbour *n, bool isrouter)
+{
+	struct net_device *dev = request->dev;
+	struct sk_buff *reply;
+	struct nd_msg *ns, *na;
+	struct ipv6hdr *pip6;
+	u8 *daddr;
+	int na_olen = 8; /* opt hdr + ETH_ALEN for target */
+	int ns_olen;
+	int i, len;
+
+	if (dev == NULL)
+		return NULL;
+
+	len = LL_RESERVED_SPACE(dev) + sizeof(struct ipv6hdr) +
+		sizeof(*na) + na_olen + dev->needed_tailroom;
+	reply = alloc_skb(len, GFP_ATOMIC);
+	if (reply == NULL)
+		return NULL;
+
+	reply->protocol = htons(ETH_P_IPV6);
+	reply->dev = dev;
+	skb_reserve(reply, LL_RESERVED_SPACE(request->dev));
+	skb_push(reply, sizeof(struct ethhdr));
+	skb_set_mac_header(reply, 0);
+
+	ns = (struct nd_msg *)skb_transport_header(request);
+
+	daddr = eth_hdr(request)->h_source;
+	ns_olen = request->len - skb_transport_offset(request) - sizeof(*ns);
+	for (i = 0; i < ns_olen-1; i += (ns->opt[i+1]<<3)) {
+		if (ns->opt[i] == ND_OPT_SOURCE_LL_ADDR) {
+			daddr = ns->opt + i + sizeof(struct nd_opt_hdr);
+			break;
+		}
+	}
+
+	/* Ethernet header */
+	memcpy(eth_hdr(reply)->h_dest, daddr, ETH_ALEN);
+	memcpy(eth_hdr(reply)->h_source, n->ha, ETH_ALEN);
+	eth_hdr(reply)->h_proto = htons(ETH_P_IPV6);
+	reply->protocol = htons(ETH_P_IPV6);
+
+	skb_pull(reply, sizeof(struct ethhdr));
+	skb_set_network_header(reply, 0);
+	skb_put(reply, sizeof(struct ipv6hdr));
+
+	/* IPv6 header */
+
+	pip6 = ipv6_hdr(reply);
+	memset(pip6, 0, sizeof(struct ipv6hdr));
+	pip6->version = 6;
+	pip6->priority = ipv6_hdr(request)->priority;
+	pip6->nexthdr = IPPROTO_ICMPV6;
+	pip6->hop_limit = 255;
+	pip6->daddr = ipv6_hdr(request)->saddr;
+	pip6->saddr = *(struct in6_addr *)n->primary_key;
+
+	skb_pull(reply, sizeof(struct ipv6hdr));
+	skb_set_transport_header(reply, 0);
+
+	na = (struct nd_msg *)skb_put(reply, sizeof(*na) + na_olen);
+
+	/* Neighbor Advertisement */
+	memset(na, 0, sizeof(*na)+na_olen);
+	na->icmph.icmp6_type = NDISC_NEIGHBOUR_ADVERTISEMENT;
+	na->icmph.icmp6_router = isrouter;
+	na->icmph.icmp6_override = 1;
+	na->icmph.icmp6_solicited = 1;
+	na->target = ns->target;
+	memcpy(&na->opt[2], n->ha, ETH_ALEN);
+	na->opt[0] = ND_OPT_TARGET_LL_ADDR;
+	na->opt[1] = na_olen >> 3;
+
+	na->icmph.icmp6_cksum = csum_ipv6_magic(&pip6->saddr,
+		&pip6->daddr, sizeof(*na)+na_olen, IPPROTO_ICMPV6,
+		csum_partial(na, sizeof(*na)+na_olen, 0));
+
+	pip6->payload_len = htons(sizeof(*na)+na_olen);
+
+	skb_push(reply, sizeof(struct ipv6hdr));
+
+	reply->ip_summed = CHECKSUM_UNNECESSARY;
+
+	return reply;
+}
+
 static int neigh_reduce(struct net_device *dev, struct sk_buff *skb)
 {
 	struct vxlan_dev *vxlan = netdev_priv(dev);
-	struct neighbour *n;
-	union vxlan_addr ipa;
+	struct nd_msg *msg;
 	const struct ipv6hdr *iphdr;
 	const struct in6_addr *saddr, *daddr;
-	struct nd_msg *msg;
-	struct inet6_dev *in6_dev = NULL;
+	struct neighbour *n;
+	struct inet6_dev *in6_dev;
 
 	in6_dev = __in6_dev_get(dev);
 	if (!in6_dev)
@@ -1257,19 +1345,20 @@ static int neigh_reduce(struct net_device *dev, struct sk_buff *skb)
 	saddr = &iphdr->saddr;
 	daddr = &iphdr->daddr;
 
-	if (ipv6_addr_loopback(daddr) ||
-	    ipv6_addr_is_multicast(daddr))
-		goto out;
-
 	msg = (struct nd_msg *)skb_transport_header(skb);
 	if (msg->icmph.icmp6_code != 0 ||
 	    msg->icmph.icmp6_type != NDISC_NEIGHBOUR_SOLICITATION)
 		goto out;
 
-	n = neigh_lookup(ipv6_stub->nd_tbl, daddr, dev);
+	if (ipv6_addr_loopback(daddr) ||
+	    ipv6_addr_is_multicast(&msg->target))
+		goto out;
+
+	n = neigh_lookup(ipv6_stub->nd_tbl, &msg->target, dev);
 
 	if (n) {
 		struct vxlan_fdb *f;
+		struct sk_buff *reply;
 
 		if (!(n->nud_state & NUD_CONNECTED)) {
 			neigh_release(n);
@@ -1283,13 +1372,23 @@ static int neigh_reduce(struct net_device *dev, struct sk_buff *skb)
 			goto out;
 		}
 
-		ipv6_stub->ndisc_send_na(dev, n, saddr, &msg->target,
-					 !!in6_dev->cnf.forwarding,
-					 true, false, false);
+		reply = vxlan_na_create(skb, n,
+					!!(f ? f->flags & NTF_ROUTER : 0));
+
 		neigh_release(n);
+
+		if (reply == NULL)
+			goto out;
+
+		if (netif_rx_ni(reply) == NET_RX_DROP)
+			dev->stats.rx_dropped++;
+
 	} else if (vxlan->flags & VXLAN_F_L3MISS) {
-		ipa.sin6.sin6_addr = *daddr;
-		ipa.sa.sa_family = AF_INET6;
+		union vxlan_addr ipa = {
+			.sin6.sin6_addr = msg->target,
+			.sa.sa_family = AF_INET6,
+		};
+
 		vxlan_ip_miss(dev, &ipa);
 	}
 
-- 
1.9.2
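
For readers following the option scan in vxlan_na_create() above: the loop
advances by `ns->opt[i+1] << 3`, so an option whose length field is zero
would leave the index unchanged. RFC 4861 says packets with a zero-length
option must be discarded. The following is a standalone user-space sketch,
not a proposed change to the patch, of a bounds-checked walk over the
options. The function name `nd_find_source_lladdr()` and the macro
`OPT_SOURCE_LL_ADDR` are assumptions made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Source Link-Layer Address option type (value 1 in RFC 4861). */
#define OPT_SOURCE_LL_ADDR 1

/*
 * Scan a buffer of ND options (type, length in units of 8 octets, data).
 * Return a pointer to the link-layer address in the first Source
 * Link-Layer Address option, or NULL if there is none or the options
 * are malformed.
 */
static const uint8_t *nd_find_source_lladdr(const uint8_t *opt, size_t len)
{
	size_t i = 0;

	while (i + 2 <= len) {
		size_t olen = (size_t)opt[i + 1] << 3;

		/* Reject zero-length and truncated options. */
		if (olen == 0 || olen > len - i)
			return NULL;
		/* An 8-octet option is a 2-byte header plus a 6-byte MAC. */
		if (opt[i] == OPT_SOURCE_LL_ADDR)
			return opt + i + 2;
		i += olen;
	}
	return NULL;
}
```

A caller would pass the bytes following the fixed part of the solicitation
and fall back to the Ethernet source address when NULL is returned, which
mirrors the default the patch uses for `daddr`.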

