* [PATCH net-next] iproute2: MPLS support
From: Eric W. Biederman @ 2015-03-13 18:50 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


This set of changes adds support for nexthops in different
address families, with the new netlink RTA_VIA option.

Support is added for routes that change the destination address (as MPLS
does) with the RTA_NEWDST attribute.

Support for MPLS addresses is added.  For multiple labels I had to
make up the syntax; I used label/label/label, as it fits in well with
addresses that don't have a space in them.
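
To illustrate the label/label/label syntax, here is a minimal sketch of a
parser for it.  It is not the code from lib/mpls_pton.c in patch 8; the
function name parse_label_stack and the choice to return entries in host
byte order are assumptions made for this example.  The bit layout (20-bit
label in the top bits, bottom-of-stack flag at bit 8) follows the MPLS label
stack entry format of RFC 3032.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define LABEL_SHIFT 12		/* label occupies the top 20 bits (RFC 3032) */
#define BOS_BIT     0x100u	/* bottom-of-stack flag */

/*
 * Hypothetical helper: parse "l1/l2/.../ln" into label stack entries in
 * host byte order.  Returns the number of labels, or -1 on malformed input.
 */
static int parse_label_stack(const char *text, uint32_t *stack, int max)
{
	int count = 0;

	for (;;) {
		char *end;
		unsigned long label;

		if (count == max)
			return -1;	/* too many labels */

		errno = 0;
		label = strtoul(text, &end, 0);
		if (end == text || errno || label >= (1ul << 20))
			return -1;	/* empty, overflowing, or out-of-range label */

		stack[count++] = (uint32_t)label << LABEL_SHIFT;

		if (*end == '\0') {
			stack[count - 1] |= BOS_BIT;	/* last label ends the stack */
			return count;
		}
		if (*end != '/')
			return -1;	/* only '/' may separate labels */
		text = end + 1;
	}
}
```

Calling parse_label_stack("100/200/300", stack, 8) fills three entries and
marks only the last one as bottom of stack.  A real implementation would
still need to convert each entry to network byte order before sending it.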

Support for these options is merged into David's net-next kernel
tree, and the meaning of the options is unlikely to change in any
significant way before this code merges upstream.

The documentation has been updated to report that the new options
are present and to report roughly what they do.  This includes
ip --help and the man pages.

Eric W. Biederman (8):
      iproute2: Add a source address length parameter to rt_addr_n2a
      iproute2: Make the addr argument of ll_addr_n2a const
      iproute2: Add support for printing AF_PACKET addresses
      iproute2: Add address family to/from string helper functions.
      iproute2: misc whitespace cleanup
      iproute2: Add support for RTA_VIA attributes
      iproute2: Add support for the RTA_NEWDST attribute.
      iproute2: Add basic mpls support to iproute

 Makefile                  |   3 ++
 include/linux/mpls.h      |  34 +++++++++++++++
 include/linux/rtnetlink.h |  10 +++++
 include/rt_names.h        |   2 +-
 include/utils.h           |  22 ++++++++--
 ip/ip.c                   |  20 +++------
 ip/iplink_bond.c          |   1 +
 ip/ipmonitor.c            |   3 ++
 ip/ipmroute.c             |   2 +
 ip/ipprefix.c             |   4 +-
 ip/iproute.c              | 108 ++++++++++++++++++++++++++++++++++++++++------
 ip/iprule.c               |  10 +++--
 ip/iptunnel.c             |   4 +-
 ip/ipxfrm.c               |  17 +++++---
 ip/link_ip6tnl.c          |   2 +
 ip/xfrm_monitor.c         |   8 ++--
 lib/ll_addr.c             |   2 +-
 lib/mpls_ntop.c           |  48 +++++++++++++++++++++
 lib/mpls_pton.c           |  58 +++++++++++++++++++++++++
 lib/utils.c               |  91 ++++++++++++++++++++++++++++++++++----
 man/man8/ip-route.8.in    |  23 +++++++---
 man/man8/ip.8             |   7 ++-
 22 files changed, 413 insertions(+), 66 deletions(-)


* [PATCH net-next 1/8] iproute2: Add a source address length parameter to rt_addr_n2a
From: Eric W. Biederman @ 2015-03-13 18:52 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


For some address families (like AF_PACKET) it is helpful to have the
length when printing the address.
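
As a rough sketch of why the length matters: the size of an IPv4 or IPv6
address follows from the family, but a link-layer address can be any size,
so the printer has to be told.  The function addr_n2a below is a hypothetical
stand-in written for this example, not iproute2's rt_addr_n2a.

```c
#include <arpa/inet.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical printer that takes an explicit address length. */
static const char *addr_n2a(int af, int len, const void *addr,
			    char *buf, int buflen)
{
	const unsigned char *bytes = addr;
	int i, used = 0;

	switch (af) {
	case AF_INET:
	case AF_INET6:
		/* The family alone fixes the size (4 or 16 bytes). */
		return inet_ntop(af, addr, buf, buflen);
	case AF_PACKET:
		/* Link-layer addresses vary in size, so len is required. */
		if (buflen < 1)
			return "???";
		buf[0] = '\0';
		for (i = 0; i < len && used < buflen - 1; i++) {
			int n = snprintf(buf + used, buflen - used, "%s%02x",
					 i ? ":" : "", bytes[i]);
			if (n < 0 || n >= buflen - used)
				break;	/* stop once the output would be truncated */
			used += n;
		}
		return buf;
	default:
		return "???";
	}
}
```

With a 6-byte Ethernet address the AF_PACKET branch prints six
colon-separated hex octets; with a longer address (InfiniBand, for example)
the same code prints as many octets as len says.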

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/utils.h   |  2 +-
 ip/iplink_bond.c  |  1 +
 ip/ipmroute.c     |  2 ++
 ip/ipprefix.c     |  4 +++-
 ip/iproute.c      | 11 +++++++----
 ip/iprule.c       | 10 ++++++----
 ip/iptunnel.c     |  2 +-
 ip/ipxfrm.c       | 17 +++++++++++------
 ip/link_ip6tnl.c  |  2 ++
 ip/xfrm_monitor.c |  8 +++++---
 lib/utils.c       |  4 ++--
 11 files changed, 41 insertions(+), 22 deletions(-)

diff --git a/include/utils.h b/include/utils.h
index fec9ef4f2f4b..e1dd3f69165f 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -103,7 +103,7 @@ extern __u8* hexstring_a2n(const char *str, __u8 *buf, int blen);
 
 extern const char *format_host(int af, int len, const void *addr,
 			       char *buf, int buflen);
-extern const char *rt_addr_n2a(int af, const void *addr,
+extern const char *rt_addr_n2a(int af, int len, const void *addr,
 			       char *buf, int buflen);
 
 void missarg(const char *) __attribute__((noreturn));
diff --git a/ip/iplink_bond.c b/ip/iplink_bond.c
index 3009ec912e23..a573f92b03a0 100644
--- a/ip/iplink_bond.c
+++ b/ip/iplink_bond.c
@@ -415,6 +415,7 @@ static void bond_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
 			if (iptb[i])
 				fprintf(f, "%s",
 					rt_addr_n2a(AF_INET,
+						    RTA_PAYLOAD(iptb[i]),
 						    RTA_DATA(iptb[i]),
 						    buf,
 						    INET_ADDRSTRLEN));
diff --git a/ip/ipmroute.c b/ip/ipmroute.c
index b4ed9f15fda5..13ac892512d0 100644
--- a/ip/ipmroute.c
+++ b/ip/ipmroute.c
@@ -116,6 +116,7 @@ int print_mroute(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 	if (tb[RTA_SRC])
 		len = snprintf(obuf, sizeof(obuf),
 			       "(%s, ", rt_addr_n2a(family,
+						    RTA_PAYLOAD(tb[RTA_SRC]),
 						    RTA_DATA(tb[RTA_SRC]),
 						    abuf, sizeof(abuf)));
 	else
@@ -123,6 +124,7 @@ int print_mroute(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 	if (tb[RTA_DST])
 		snprintf(obuf + len, sizeof(obuf) - len,
 			 "%s)", rt_addr_n2a(family,
+					    RTA_PAYLOAD(tb[RTA_DST]),
 					    RTA_DATA(tb[RTA_DST]),
 					    abuf, sizeof(abuf)));
 	else
diff --git a/ip/ipprefix.c b/ip/ipprefix.c
index 02c0efce6836..26b596151217 100644
--- a/ip/ipprefix.c
+++ b/ip/ipprefix.c
@@ -80,7 +80,9 @@ int print_prefix(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 		pfx = (struct in6_addr *)RTA_DATA(tb[PREFIX_ADDRESS]);
 
 		memset(abuf, '\0', sizeof(abuf));
-		fprintf(fp, "%s", rt_addr_n2a(family, pfx,
+		fprintf(fp, "%s", rt_addr_n2a(family,
+					      RTA_PAYLOAD(tb[PREFIX_ADDRESS]),
+					      pfx,
 					      abuf, sizeof(abuf)));
 	}
 	fprintf(fp, "/%u ", prefix->prefix_len);
diff --git a/ip/iproute.c b/ip/iproute.c
index 76d8e36ccc2b..201eb98a23ad 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -353,8 +353,9 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 	if (tb[RTA_DST]) {
 		if (r->rtm_dst_len != host_len) {
 			fprintf(fp, "%s/%u ", rt_addr_n2a(r->rtm_family,
-							 RTA_DATA(tb[RTA_DST]),
-							 abuf, sizeof(abuf)),
+						       RTA_PAYLOAD(tb[RTA_DST]),
+						       RTA_DATA(tb[RTA_DST]),
+						       abuf, sizeof(abuf)),
 				r->rtm_dst_len
 				);
 		} else {
@@ -372,8 +373,9 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 	if (tb[RTA_SRC]) {
 		if (r->rtm_src_len != host_len) {
 			fprintf(fp, "from %s/%u ", rt_addr_n2a(r->rtm_family,
-							 RTA_DATA(tb[RTA_SRC]),
-							 abuf, sizeof(abuf)),
+						       RTA_PAYLOAD(tb[RTA_SRC]),
+						       RTA_DATA(tb[RTA_SRC]),
+						       abuf, sizeof(abuf)),
 				r->rtm_src_len
 				);
 		} else {
@@ -415,6 +417,7 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 		 */
 		fprintf(fp, " src %s ",
 			rt_addr_n2a(r->rtm_family,
+				    RTA_PAYLOAD(tb[RTA_PREFSRC]),
 				    RTA_DATA(tb[RTA_PREFSRC]),
 				    abuf, sizeof(abuf)));
 	}
diff --git a/ip/iprule.c b/ip/iprule.c
index 366878e90f84..0b1ad698177c 100644
--- a/ip/iprule.c
+++ b/ip/iprule.c
@@ -89,8 +89,9 @@ int print_rule(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 	if (tb[FRA_SRC]) {
 		if (r->rtm_src_len != host_len) {
 			fprintf(fp, "from %s/%u ", rt_addr_n2a(r->rtm_family,
-							 RTA_DATA(tb[FRA_SRC]),
-							 abuf, sizeof(abuf)),
+						       RTA_PAYLOAD(tb[FRA_SRC]),
+						       RTA_DATA(tb[FRA_SRC]),
+						       abuf, sizeof(abuf)),
 				r->rtm_src_len
 				);
 		} else {
@@ -109,8 +110,9 @@ int print_rule(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 	if (tb[FRA_DST]) {
 		if (r->rtm_dst_len != host_len) {
 			fprintf(fp, "to %s/%u ", rt_addr_n2a(r->rtm_family,
-							 RTA_DATA(tb[FRA_DST]),
-							 abuf, sizeof(abuf)),
+						       RTA_PAYLOAD(tb[FRA_DST]),
+						       RTA_DATA(tb[FRA_DST]),
+						       abuf, sizeof(abuf)),
 				r->rtm_dst_len
 				);
 		} else {
diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index caf8a28e62e8..29188c450370 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -343,7 +343,7 @@ static void print_tunnel(struct ip_tunnel_parm *p)
 	       p->name,
 	       tnl_strproto(p->iph.protocol),
 	       p->iph.daddr ? format_host(AF_INET, 4, &p->iph.daddr, s1, sizeof(s1))  : "any",
-	       p->iph.saddr ? rt_addr_n2a(AF_INET, &p->iph.saddr, s2, sizeof(s2)) : "any");
+	       p->iph.saddr ? rt_addr_n2a(AF_INET, 4, &p->iph.saddr, s2, sizeof(s2)) : "any");
 
 	if (p->iph.protocol == IPPROTO_IPV6 && (p->i_flags & SIT_ISATAP)) {
 		struct ip_tunnel_prl prl[16];
diff --git a/ip/ipxfrm.c b/ip/ipxfrm.c
index 659fa6b64579..eacefd907f46 100644
--- a/ip/ipxfrm.c
+++ b/ip/ipxfrm.c
@@ -288,10 +288,10 @@ void xfrm_id_info_print(xfrm_address_t *saddr, struct xfrm_id *id,
 		fputs(title, fp);
 
 	memset(abuf, '\0', sizeof(abuf));
-	fprintf(fp, "src %s ", rt_addr_n2a(family,
+	fprintf(fp, "src %s ", rt_addr_n2a(family, sizeof(*saddr),
 					   saddr, abuf, sizeof(abuf)));
 	memset(abuf, '\0', sizeof(abuf));
-	fprintf(fp, "dst %s", rt_addr_n2a(family,
+	fprintf(fp, "dst %s", rt_addr_n2a(family, sizeof(id->daddr),
 					  &id->daddr, abuf, sizeof(abuf)));
 	fprintf(fp, "%s", _SL_);
 
@@ -455,11 +455,15 @@ void xfrm_selector_print(struct xfrm_selector *sel, __u16 family,
 		fputs(prefix, fp);
 
 	memset(abuf, '\0', sizeof(abuf));
-	fprintf(fp, "src %s/%u ", rt_addr_n2a(f, &sel->saddr, abuf, sizeof(abuf)),
+	fprintf(fp, "src %s/%u ",
+		rt_addr_n2a(f, sizeof(sel->saddr), &sel->saddr,
+			    abuf, sizeof(abuf)),
 		sel->prefixlen_s);
 
 	memset(abuf, '\0', sizeof(abuf));
-	fprintf(fp, "dst %s/%u ", rt_addr_n2a(f, &sel->daddr, abuf, sizeof(abuf)),
+	fprintf(fp, "dst %s/%u ",
+		rt_addr_n2a(f, sizeof(sel->daddr), &sel->daddr,
+			    abuf, sizeof(abuf)),
 		sel->prefixlen_d);
 
 	if (sel->proto)
@@ -754,7 +758,8 @@ void xfrm_xfrma_print(struct rtattr *tb[], __u16 family,
 
 		memset(abuf, '\0', sizeof(abuf));
 		fprintf(fp, "addr %s",
-			rt_addr_n2a(family, &e->encap_oa, abuf, sizeof(abuf)));
+			rt_addr_n2a(family, sizeof(e->encap_oa), &e->encap_oa,
+				    abuf, sizeof(abuf)));
 		fprintf(fp, "%s", _SL_);
 	}
 
@@ -782,7 +787,7 @@ void xfrm_xfrma_print(struct rtattr *tb[], __u16 family,
 
 		memset(abuf, '\0', sizeof(abuf));
 		fprintf(fp, "%s",
-			rt_addr_n2a(family, coa,
+			rt_addr_n2a(family, sizeof(*coa), coa,
 				    abuf, sizeof(abuf)));
 		fprintf(fp, "%s", _SL_);
 	}
diff --git a/ip/link_ip6tnl.c b/ip/link_ip6tnl.c
index 5ed3d5a23fb5..cf59a9338f57 100644
--- a/ip/link_ip6tnl.c
+++ b/ip/link_ip6tnl.c
@@ -285,6 +285,7 @@ static void ip6tunnel_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb
 	if (tb[IFLA_IPTUN_REMOTE]) {
 		fprintf(f, "remote %s ",
 			rt_addr_n2a(AF_INET6,
+				    RTA_PAYLOAD(tb[IFLA_IPTUN_REMOTE]),
 				    RTA_DATA(tb[IFLA_IPTUN_REMOTE]),
 				    s1, sizeof(s1)));
 	}
@@ -292,6 +293,7 @@ static void ip6tunnel_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb
 	if (tb[IFLA_IPTUN_LOCAL]) {
 		fprintf(f, "local %s ",
 			rt_addr_n2a(AF_INET6,
+				    RTA_PAYLOAD(tb[IFLA_IPTUN_LOCAL]),
 				    RTA_DATA(tb[IFLA_IPTUN_LOCAL]),
 				    s1, sizeof(s1)));
 	}
diff --git a/ip/xfrm_monitor.c b/ip/xfrm_monitor.c
index 50116a7b5433..b2b2d6e27a45 100644
--- a/ip/xfrm_monitor.c
+++ b/ip/xfrm_monitor.c
@@ -227,7 +227,8 @@ static void xfrm_usersa_print(const struct xfrm_usersa_id *sa_id, __u32 reqid, F
 
 	buf[0] = 0;
 	fprintf(fp, "dst %s ",
-		rt_addr_n2a(sa_id->family, &sa_id->daddr, buf, sizeof(buf)));
+		rt_addr_n2a(sa_id->family, sizeof(sa_id->daddr), &sa_id->daddr,
+			    buf, sizeof(buf)));
 
 	fprintf(fp, " reqid 0x%x", reqid);
 
@@ -246,7 +247,8 @@ static int xfrm_ae_print(const struct sockaddr_nl *who,
 	xfrm_ae_flags_print(id->flags, arg);
 	fprintf(fp,"\n\t");
 	memset(abuf, '\0', sizeof(abuf));
-	fprintf(fp, "src %s ", rt_addr_n2a(id->sa_id.family, &id->saddr,
+	fprintf(fp, "src %s ", rt_addr_n2a(id->sa_id.family,
+					   sizeof(id->saddr), &id->saddr,
 					   abuf, sizeof(abuf)));
 
 	xfrm_usersa_print(&id->sa_id, id->reqid, fp);
@@ -262,7 +264,7 @@ static void xfrm_print_addr(FILE *fp, int family, xfrm_address_t *a)
 	char buf[256];
 
 	buf[0] = 0;
-	fprintf(fp, "%s", rt_addr_n2a(family, a, buf, sizeof(buf)));
+	fprintf(fp, "%s", rt_addr_n2a(family, sizeof(*a), a, buf, sizeof(buf)));
 }
 
 static int xfrm_mapping_print(const struct sockaddr_nl *who,
diff --git a/lib/utils.c b/lib/utils.c
index e2b05bc0edcc..a97eae9d5148 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -624,7 +624,7 @@ int __get_user_hz(void)
 	return sysconf(_SC_CLK_TCK);
 }
 
-const char *rt_addr_n2a(int af, const void *addr, char *buf, int buflen)
+const char *rt_addr_n2a(int af, int len, const void *addr, char *buf, int buflen)
 {
 	switch (af) {
 	case AF_INET:
@@ -731,7 +731,7 @@ const char *format_host(int af, int len, const void *addr,
 			return n;
 	}
 #endif
-	return rt_addr_n2a(af, addr, buf, buflen);
+	return rt_addr_n2a(af, len, addr, buf, buflen);
 }
 
 
-- 
2.2.1


* [PATCH net-next 2/8] iproute2: Make the addr argument of ll_addr_n2a const
From: Eric W. Biederman @ 2015-03-13 18:52 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


This avoids build warnings when AF_PACKET support is added
to rt_addr_n2a.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/rt_names.h | 2 +-
 lib/ll_addr.c      | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/rt_names.h b/include/rt_names.h
index c0ea4f982904..921be0607b51 100644
--- a/include/rt_names.h
+++ b/include/rt_names.h
@@ -22,7 +22,7 @@ int inet_proto_a2n(const char *buf);
 
 
 const char * ll_type_n2a(int type, char *buf, int len);
-const char *ll_addr_n2a(unsigned char *addr, int alen,
+const char *ll_addr_n2a(const unsigned char *addr, int alen,
 			int type, char *buf, int blen);
 int ll_addr_a2n(char *lladdr, int len, const char *arg);
 
diff --git a/lib/ll_addr.c b/lib/ll_addr.c
index c12ab075c4a9..2ce9abfbb8c6 100644
--- a/lib/ll_addr.c
+++ b/lib/ll_addr.c
@@ -29,7 +29,7 @@
 #include "utils.h"
 
 
-const char *ll_addr_n2a(unsigned char *addr, int alen, int type, char *buf, int blen)
+const char *ll_addr_n2a(const unsigned char *addr, int alen, int type, char *buf, int blen)
 {
 	int i;
 	int l;
-- 
2.2.1


* [PATCH net-next 3/8] iproute2: Add support for printing AF_PACKET addresses
From: Eric W. Biederman @ 2015-03-13 18:54 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 lib/utils.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/lib/utils.c b/lib/utils.c
index a97eae9d5148..65d1632ddbc1 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -25,11 +25,12 @@
 #include <asm/types.h>
 #include <linux/pkt_sched.h>
 #include <linux/param.h>
+#include <linux/if_arp.h>
 #include <time.h>
 #include <sys/time.h>
 #include <errno.h>
 
-
+#include "rt_names.h"
 #include "utils.h"
 #include "namespace.h"
 
@@ -397,6 +398,18 @@ int get_addr_1(inet_prefix *addr, const char *name, int family)
 		return 0;
 	}
 
+	if (family == AF_PACKET) {
+		int len;
+		len = ll_addr_a2n((char *)&addr->data, sizeof(addr->data), name);
+		if (len < 0)
+			return -1;
+
+		addr->family = AF_PACKET;
+		addr->bytelen = len;
+		addr->bitlen = len * 8;
+		return 0;
+	}
+
 	if (strchr(name, ':')) {
 		addr->family = AF_INET6;
 		if (family != AF_UNSPEC && family != AF_INET6)
@@ -485,10 +498,6 @@ done:
 
 int get_addr(inet_prefix *dst, const char *arg, int family)
 {
-	if (family == AF_PACKET) {
-		fprintf(stderr, "Error: \"%s\" may be inet address, but it is not allowed in this context.\n", arg);
-		exit(1);
-	}
 	if (get_addr_1(dst, arg, family)) {
 		fprintf(stderr, "Error: an inet address is expected rather than \"%s\".\n", arg);
 		exit(1);
@@ -638,6 +647,8 @@ const char *rt_addr_n2a(int af, int len, const void *addr, char *buf, int buflen
 		memcpy(dna.a_addr, addr, 2);
 		return dnet_ntop(af, &dna, buf, buflen);
 	}
+	case AF_PACKET:
+		return ll_addr_n2a(addr, len, ARPHRD_VOID, buf, buflen);
 	default:
 		return "???";
 	}
-- 
2.2.1


* [PATCH net-next 4/8] iproute2: Add address family to/from string helper functions.
From: Eric W. Biederman @ 2015-03-13 18:55 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


Add the functions family_name and read_family to convert an address
family to a string and to convert a string to an address family.
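
For comparison, the same two mappings can be driven from a single table, so
that the name-to-family and family-to-name directions cannot drift apart.
This is only a sketch of an alternative, not what the patch does; the names
lookup_family and lookup_family_name are made up for the example, and the
table lists only a subset of the families the patch handles.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* One table serves both directions. */
static const struct {
	int family;
	const char *name;
} families[] = {
	{ AF_INET,   "inet"   },
	{ AF_INET6,  "inet6"  },
	{ AF_PACKET, "link"   },
	{ AF_BRIDGE, "bridge" },
};

#define NFAMILIES (sizeof(families) / sizeof(families[0]))

/* Returns AF_UNSPEC for an unknown name, like read_family in the patch. */
static int lookup_family(const char *name)
{
	size_t i;

	for (i = 0; i < NFAMILIES; i++)
		if (strcmp(families[i].name, name) == 0)
			return families[i].family;
	return AF_UNSPEC;
}

/* Returns "???" for an unknown family, like family_name in the patch. */
static const char *lookup_family_name(int family)
{
	size_t i;

	for (i = 0; i < NFAMILIES; i++)
		if (families[i].family == family)
			return families[i].name;
	return "???";
}
```

The AF_UNSPEC return value is what lets a caller treat "not a family name"
as "this argument must be the address itself".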

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/utils.h |  3 +++
 ip/ip.c         | 16 +++-------------
 lib/utils.c     | 35 +++++++++++++++++++++++++++++++++++
 3 files changed, 41 insertions(+), 13 deletions(-)

diff --git a/include/utils.h b/include/utils.h
index e1dd3f69165f..d96bd03dd816 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -106,6 +106,9 @@ extern const char *format_host(int af, int len, const void *addr,
 extern const char *rt_addr_n2a(int af, int len, const void *addr,
 			       char *buf, int buflen);
 
+extern int read_family(const char *name);
+extern const char *family_name(int family);
+
 void missarg(const char *) __attribute__((noreturn));
 void invarg(const char *, const char *) __attribute__((noreturn));
 void duparg(const char *, const char *) __attribute__((noreturn));
diff --git a/ip/ip.c b/ip/ip.c
index da16b15f8b55..85256d8ea0c1 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -190,21 +190,11 @@ int main(int argc, char **argv)
 			argv++;
 			if (argc <= 1)
 				usage();
-			if (strcmp(argv[1], "inet") == 0)
-				preferred_family = AF_INET;
-			else if (strcmp(argv[1], "inet6") == 0)
-				preferred_family = AF_INET6;
-			else if (strcmp(argv[1], "dnet") == 0)
-				preferred_family = AF_DECnet;
-			else if (strcmp(argv[1], "link") == 0)
-				preferred_family = AF_PACKET;
-			else if (strcmp(argv[1], "ipx") == 0)
-				preferred_family = AF_IPX;
-			else if (strcmp(argv[1], "bridge") == 0)
-				preferred_family = AF_BRIDGE;
-			else if (strcmp(argv[1], "help") == 0)
+			if (strcmp(argv[1], "help") == 0)
 				usage();
 			else
+				preferred_family = read_family(argv[1]);
+			if (preferred_family == AF_UNSPEC)
 				invarg("invalid protocol family", argv[1]);
 		} else if (strcmp(opt, "-4") == 0) {
 			preferred_family = AF_INET;
diff --git a/lib/utils.c b/lib/utils.c
index 65d1632ddbc1..b293407e550d 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -654,6 +654,41 @@ const char *rt_addr_n2a(int af, int len, const void *addr, char *buf, int buflen
 	}
 }
 
+int read_family(const char *name)
+{
+	int family = AF_UNSPEC;
+	if (strcmp(name, "inet") == 0)
+		family = AF_INET;
+	else if (strcmp(name, "inet6") == 0)
+		family = AF_INET6;
+	else if (strcmp(name, "dnet") == 0)
+		family = AF_DECnet;
+	else if (strcmp(name, "link") == 0)
+		family = AF_PACKET;
+	else if (strcmp(name, "ipx") == 0)
+		family = AF_IPX;
+	else if (strcmp(name, "bridge") == 0)
+		family = AF_BRIDGE;
+	return family;
+}
+
+const char *family_name(int family)
+{
+	if (family == AF_INET)
+		return "inet";
+	if (family == AF_INET6)
+		return "inet6";
+	if (family == AF_DECnet)
+		return "dnet";
+	if (family == AF_PACKET)
+		return "link";
+	if (family == AF_IPX)
+		return "ipx";
+	if (family == AF_BRIDGE)
+		return "bridge";
+	return "???";
+}
+
 #ifdef RESOLVE_HOSTNAMES
 struct namerec
 {
-- 
2.2.1


* [PATCH net-next 5/8] iproute2: misc whitespace cleanup
From: Eric W. Biederman @ 2015-03-13 18:56 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


This removes an extra space and causes things to line up.

---
 ip/iptunnel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index 29188c450370..be84b83ec673 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -342,7 +342,7 @@ static void print_tunnel(struct ip_tunnel_parm *p)
 	printf("%s: %s/ip  remote %s  local %s ",
 	       p->name,
 	       tnl_strproto(p->iph.protocol),
-	       p->iph.daddr ? format_host(AF_INET, 4, &p->iph.daddr, s1, sizeof(s1))  : "any",
+	       p->iph.daddr ? format_host(AF_INET, 4, &p->iph.daddr, s1, sizeof(s1)) : "any",
 	       p->iph.saddr ? rt_addr_n2a(AF_INET, 4, &p->iph.saddr, s2, sizeof(s2)) : "any");
 
 	if (p->iph.protocol == IPPROTO_IPV6 && (p->i_flags & SIT_ISATAP)) {
-- 
2.2.1


* [PATCH net-next 6/8] iproute2: Add support for RTA_VIA attributes
From: Eric W. Biederman @ 2015-03-13 18:57 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


Add support for the RTA_VIA attribute that specifies an address family
as well as an address for the next hop gateway.

To make this easy to pass, reorder inet_prefix so that its tail
is a proper RTA_VIA attribute.
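
The layout trick can be sketched as follows.  The structs below are local
mirrors of rtvia and the reordered inet_prefix, written with fixed-width
types for this example; build_via_payload is a hypothetical helper, not a
function in iproute2.  The point is that a 2-byte family followed directly by
the address bytes is exactly the attribute payload, so its length is
bytelen + 2.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the rtvia layout: family, then raw address bytes. */
struct via {
	uint16_t family;
	uint8_t  addr[];
};

/* Local mirror of the reordered inet_prefix. */
struct prefix {
	uint16_t flags;
	uint16_t bytelen;
	int16_t  bitlen;
	uint16_t family;	/* start of the RTA_VIA-shaped tail */
	uint32_t data[8];
};

/* The tail only works if no padding separates family from data. */
_Static_assert(offsetof(struct prefix, data) - offsetof(struct prefix, family)
	       == offsetof(struct via, addr),
	       "prefix tail must match the via layout");

/*
 * Copy the tail of a prefix into an RTA_VIA-style payload.  Returns the
 * payload length (2-byte family plus address), or 0 if it does not fit.
 */
static size_t build_via_payload(const struct prefix *p, void *out, size_t outlen)
{
	size_t len = sizeof(p->family) + p->bytelen;

	if (p->bytelen > sizeof(p->data) || len > outlen)
		return 0;
	memcpy(out, (const char *)p + offsetof(struct prefix, family), len);
	return len;
}
```

This is why the patch can pass &addr.family with a length of addr.bytelen+2
straight to the attribute helpers without building a temporary buffer.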

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/rtnetlink.h |  7 +++++
 include/utils.h           |  7 +++--
 ip/iproute.c              | 76 +++++++++++++++++++++++++++++++++++++++++------
 man/man8/ip-route.8.in    | 18 +++++++----
 4 files changed, 90 insertions(+), 18 deletions(-)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 3eb78105399b..03e4c8df8e60 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -303,6 +303,7 @@ enum rtattr_type_t {
 	RTA_TABLE,
 	RTA_MARK,
 	RTA_MFC_STATS,
+	RTA_VIA,
 	__RTA_MAX
 };
 
@@ -344,6 +345,12 @@ struct rtnexthop {
 #define RTNH_SPACE(len)	RTNH_ALIGN(RTNH_LENGTH(len))
 #define RTNH_DATA(rtnh)   ((struct rtattr*)(((char*)(rtnh)) + RTNH_LENGTH(0)))
 
+/* RTA_VIA */
+struct rtvia {
+	__kernel_sa_family_t	rtvia_family;
+	__u8			rtvia_addr[0];
+};
+
 /* RTM_CACHEINFO */
 
 struct rta_cacheinfo {
diff --git a/include/utils.h b/include/utils.h
index d96bd03dd816..2db09c7f790f 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -50,10 +50,11 @@ extern void incomplete_command(void) __attribute__((noreturn));
 
 typedef struct
 {
-	__u8 family;
-	__u8 bytelen;
+	__u16 flags;
+	__u16 bytelen;
 	__s16 bitlen;
-	__u32 flags;
+	/* These next two fields match rtvia */
+	__u16 family;
 	__u32 data[8];
 } inet_prefix;
 
diff --git a/ip/iproute.c b/ip/iproute.c
index 201eb98a23ad..6c40899dbf46 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -75,7 +75,8 @@ static void usage(void)
 	fprintf(stderr, "             [ table TABLE_ID ] [ proto RTPROTO ]\n");
 	fprintf(stderr, "             [ scope SCOPE ] [ metric METRIC ]\n");
 	fprintf(stderr, "INFO_SPEC := NH OPTIONS FLAGS [ nexthop NH ]...\n");
-	fprintf(stderr, "NH := [ via ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS\n");
+	fprintf(stderr, "NH := [ via [ FAMILY ] ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS\n");
+	fprintf(stderr, "FAMILY := [ inet | inet6 | ipx | dnet | bridge | link ]\n");
 	fprintf(stderr, "OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ]\n");
 	fprintf(stderr, "           [ rtt TIME ] [ rttvar TIME ] [ reordering NUMBER ]\n");
 	fprintf(stderr, "           [ window NUMBER] [ cwnd NUMBER ] [ initcwnd NUMBER ]\n");
@@ -185,8 +186,15 @@ static int filter_nlmsg(struct nlmsghdr *n, struct rtattr **tb, int host_len)
 	    (r->rtm_family != filter.msrc.family ||
 	     (filter.msrc.bitlen >= 0 && filter.msrc.bitlen < r->rtm_src_len)))
 		return 0;
-	if (filter.rvia.family && r->rtm_family != filter.rvia.family)
-		return 0;
+	if (filter.rvia.family) {
+		int family = r->rtm_family;
+		if (tb[RTA_VIA]) {
+			struct rtvia *via = RTA_DATA(tb[RTA_VIA]);
+			family = via->rtvia_family;
+		}
+		if (family != filter.rvia.family)
+			return 0;
+	}
 	if (filter.rprefsrc.family && r->rtm_family != filter.rprefsrc.family)
 		return 0;
 
@@ -205,6 +213,12 @@ static int filter_nlmsg(struct nlmsghdr *n, struct rtattr **tb, int host_len)
 		via.family = r->rtm_family;
 		if (tb[RTA_GATEWAY])
 			memcpy(&via.data, RTA_DATA(tb[RTA_GATEWAY]), host_len/8);
+		if (tb[RTA_VIA]) {
+			size_t len = RTA_PAYLOAD(tb[RTA_VIA]) - 2;
+			struct rtvia *rtvia = RTA_DATA(tb[RTA_VIA]);
+			via.family = rtvia->rtvia_family;
+			memcpy(&via.data, rtvia->rtvia_addr, len);
+		}
 	}
 	if (filter.rprefsrc.bitlen>0) {
 		memset(&prefsrc, 0, sizeof(prefsrc));
@@ -400,6 +414,14 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 				    RTA_DATA(tb[RTA_GATEWAY]),
 				    abuf, sizeof(abuf)));
 	}
+	if (tb[RTA_VIA]) {
+		size_t len = RTA_PAYLOAD(tb[RTA_VIA]) - 2;
+		struct rtvia *via = RTA_DATA(tb[RTA_VIA]);
+		fprintf(fp, "via %s %s ",
+			family_name(via->rtvia_family),
+			format_host(via->rtvia_family, len, via->rtvia_addr,
+				    abuf, sizeof(abuf)));
+	}
 	if (tb[RTA_OIF] && filter.oifmask != -1)
 		fprintf(fp, "dev %s ", ll_index_to_name(*(int*)RTA_DATA(tb[RTA_OIF])));
 
@@ -615,6 +637,14 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 							    RTA_DATA(tb[RTA_GATEWAY]),
 							    abuf, sizeof(abuf)));
 				}
+				if (tb[RTA_VIA]) {
+					size_t len = RTA_PAYLOAD(tb[RTA_VIA]) - 2;
+					struct rtvia *via = RTA_DATA(tb[RTA_VIA]);
+					fprintf(fp, "via %s %s ",
+						family_name(via->rtvia_family),
+						format_host(via->rtvia_family, len, via->rtvia_addr,
+							    abuf, sizeof(abuf)));
+				}
 				if (tb[RTA_FLOW]) {
 					__u32 to = rta_getattr_u32(tb[RTA_FLOW]);
 					__u32 from = to>>16;
@@ -662,12 +692,23 @@ static int parse_one_nh(struct rtmsg *r, struct rtattr *rta,
 	while (++argv, --argc > 0) {
 		if (strcmp(*argv, "via") == 0) {
 			inet_prefix addr;
+			int family;
 			NEXT_ARG();
-			get_addr(&addr, *argv, r->rtm_family);
+			family = read_family(*argv);
+			if (family == AF_UNSPEC)
+				family = r->rtm_family;
+			else
+				NEXT_ARG();
+			get_addr(&addr, *argv, family);
 			if (r->rtm_family == AF_UNSPEC)
 				r->rtm_family = addr.family;
-			rta_addattr_l(rta, 4096, RTA_GATEWAY, &addr.data, addr.bytelen);
-			rtnh->rtnh_len += sizeof(struct rtattr) + addr.bytelen;
+			if (addr.family == r->rtm_family) {
+				rta_addattr_l(rta, 4096, RTA_GATEWAY, &addr.data, addr.bytelen);
+				rtnh->rtnh_len += sizeof(struct rtattr) + addr.bytelen;
+			} else {
+				rta_addattr_l(rta, 4096, RTA_VIA, &addr.family, addr.bytelen+2);
+				rtnh->rtnh_len += sizeof(struct rtattr) + addr.bytelen+2;
+			}
 		} else if (strcmp(*argv, "dev") == 0) {
 			NEXT_ARG();
 			if ((rtnh->rtnh_ifindex = ll_name_to_index(*argv)) == 0) {
@@ -775,12 +816,21 @@ static int iproute_modify(int cmd, unsigned flags, int argc, char **argv)
 			addattr_l(&req.n, sizeof(req), RTA_PREFSRC, &addr.data, addr.bytelen);
 		} else if (strcmp(*argv, "via") == 0) {
 			inet_prefix addr;
+			int family;
 			gw_ok = 1;
 			NEXT_ARG();
-			get_addr(&addr, *argv, req.r.rtm_family);
+			family = read_family(*argv);
+			if (family == AF_UNSPEC)
+				family = req.r.rtm_family;
+			else
+				NEXT_ARG();
+			get_addr(&addr, *argv, family);
 			if (req.r.rtm_family == AF_UNSPEC)
 				req.r.rtm_family = addr.family;
-			addattr_l(&req.n, sizeof(req), RTA_GATEWAY, &addr.data, addr.bytelen);
+			if (addr.family == req.r.rtm_family)
+				addattr_l(&req.n, sizeof(req), RTA_GATEWAY, &addr.data, addr.bytelen);
+			else
+				addattr_l(&req.n, sizeof(req), RTA_VIA, &addr.family, addr.bytelen+2);
 		} else if (strcmp(*argv, "from") == 0) {
 			inet_prefix addr;
 			NEXT_ARG();
@@ -1265,8 +1315,14 @@ static int iproute_list_flush_or_save(int argc, char **argv, int action)
 			get_unsigned(&mark, *argv, 0);
 			filter.markmask = -1;
 		} else if (strcmp(*argv, "via") == 0) {
+			int family;
 			NEXT_ARG();
-			get_prefix(&filter.rvia, *argv, do_ipv6);
+			family = read_family(*argv);
+			if (family == AF_UNSPEC)
+				family = do_ipv6;
+			else
+				NEXT_ARG();
+			get_prefix(&filter.rvia, *argv, family);
 		} else if (strcmp(*argv, "src") == 0) {
 			NEXT_ARG();
 			get_prefix(&filter.rprefsrc, *argv, do_ipv6);
@@ -1568,6 +1624,8 @@ static int iproute_get(int argc, char **argv)
 			tb[RTA_OIF]->rta_type = 0;
 		if (tb[RTA_GATEWAY])
 			tb[RTA_GATEWAY]->rta_type = 0;
+		if (tb[RTA_VIA])
+			tb[RTA_VIA]->rta_type = 0;
 		if (!idev && tb[RTA_IIF])
 			tb[RTA_IIF]->rta_type = 0;
 		req.n.nlmsg_flags = NLM_F_REQUEST;
diff --git a/man/man8/ip-route.8.in b/man/man8/ip-route.8.in
index 2b1583d5a30c..906cfea0cd6b 100644
--- a/man/man8/ip-route.8.in
+++ b/man/man8/ip-route.8.in
@@ -81,13 +81,18 @@ replace " } "
 .ti -8
 .IR NH " := [ "
 .B  via
-.IR ADDRESS " ] [ "
+[
+.IR FAMILY " ] " ADDRESS " ] [ "
 .B  dev
 .IR STRING " ] [ "
 .B  weight
 .IR NUMBER " ] " NHFLAGS
 
 .ti -8
+.IR FAMILY " := [ "
+.BR inet " | " inet6 " | " ipx " | " dnet " | " bridge " | " link " ]"
+
+.ti -8
 .IR OPTIONS " := " FLAGS " [ "
 .B  mtu
 .IR NUMBER " ] [ "
@@ -333,9 +338,10 @@ table by default.
 the output device name.
 
 .TP
-.BI via " ADDRESS"
-the address of the nexthop router.  Actually, the sense of this field
-depends on the route type.  For normal
+.BI via " [ FAMILY ] ADDRESS"
+the address of the nexthop router, in the address family FAMILY.
+Actually, the sense of this field depends on the route type.  For
+normal
 .B unicast
 routes it is either the true next hop router or, if it is a direct
 route installed in BSD compatibility mode, it can be a local address
@@ -472,7 +478,7 @@ is a complex value with its own syntax similar to the top level
 argument lists:
 
 .in +8
-.BI via " ADDRESS"
+.BI via " [ FAMILY ] ADDRESS"
 - is the nexthop router.
 .sp
 
@@ -669,7 +675,7 @@ only list routes of this type.
 only list routes going via this device.
 
 .TP
-.BI via " PREFIX"
+.BI via " [ FAMILY ] PREFIX"
 only list routes going via the nexthop routers selected by
 .IR PREFIX "."
 
-- 
2.2.1


* [PATCH net-next 7/8] iproute2: Add support for the RTA_NEWDST attribute.
  2015-03-13 18:50 [PATCH net-next] iproute2: MPLS support Eric W. Biederman
                   ` (5 preceding siblings ...)
  2015-03-13 18:57 ` [PATCH net-next 6/8] iproute2: Add support for RTA_VIA attributes Eric W. Biederman
@ 2015-03-13 18:58 ` Eric W. Biederman
  2015-03-13 18:59 ` [PATCH net-next 8/8] iproute2: Add basic mpls support to iproute Eric W. Biederman
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 29+ messages in thread
From: Eric W. Biederman @ 2015-03-13 18:58 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


This attribute is like RTA_DST except it specifies the destination
address to place on a packet when it leaves the host.  For IP-based
protocols this is destination NAT and not a common part of forwarding.
For protocols like MPLS, label swapping is something that typically
happens on every hop.

There is likely to be an RTA_NEWSRC at some point, so RTA_NEWDST
is printed as "as to" and can be specified either as "as to"
or just "as".
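
As an illustration only, the optional "to" keyword can be handled as in
the sketch below.  The helper name sketch_parse_as and its argc/argv
calling convention are assumptions made for this example; the patch
itself does this inline with NEXT_ARG() in iproute_modify().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): given tokens starting at
 * "as", return the address token that follows "as" or "as to".
 * Returns NULL when the keyword or the address is missing.
 */
static const char *sketch_parse_as(int argc, const char *const *argv)
{
	int i = 0;

	assert(argv != NULL);
	if (argc < 2 || strcmp(argv[i], "as") != 0)
		return NULL;
	i++;
	/* "to" is optional; skip it when present */
	if (strcmp(argv[i], "to") == 0) {
		i++;
		if (i >= argc)
			return NULL;
	}
	return argv[i];
}
```

With this sketch, both { "as", "to", "200" } and { "as", "200" } yield
"200", while a bare "as" or "as to" with nothing after it is rejected.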

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/rtnetlink.h |  1 +
 ip/iproute.c              | 19 ++++++++++++++++++-
 man/man8/ip-route.8.in    |  5 +++++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 03e4c8df8e60..0d4100535bd7 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -304,6 +304,7 @@ enum rtattr_type_t {
 	RTA_MARK,
 	RTA_MFC_STATS,
 	RTA_VIA,
+	RTA_NEWDST,
 	__RTA_MAX
 };
 
diff --git a/ip/iproute.c b/ip/iproute.c
index 6c40899dbf46..b97aa4b598bf 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -77,7 +77,7 @@ static void usage(void)
 	fprintf(stderr, "INFO_SPEC := NH OPTIONS FLAGS [ nexthop NH ]...\n");
 	fprintf(stderr, "NH := [ via [ FAMILY ] ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS\n");
 	fprintf(stderr, "FAMILY := [ inet | inet6 | ipx | dnet | bridge | link ]");
-	fprintf(stderr, "OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ]\n");
+	fprintf(stderr, "OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ] [ as [ to ] ADDRESS ]\n");
 	fprintf(stderr, "           [ rtt TIME ] [ rttvar TIME ] [ reordering NUMBER ]\n");
 	fprintf(stderr, "           [ window NUMBER] [ cwnd NUMBER ] [ initcwnd NUMBER ]\n");
 	fprintf(stderr, "           [ ssthresh NUMBER ] [ realms REALM ] [ src ADDRESS ]\n");
@@ -402,6 +402,13 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 	} else if (r->rtm_src_len) {
 		fprintf(fp, "from 0/%u ", r->rtm_src_len);
 	}
+	if (tb[RTA_NEWDST]) {
+		fprintf(fp, "as to %s ", format_host(r->rtm_family,
+						  RTA_PAYLOAD(tb[RTA_NEWDST]),
+						  RTA_DATA(tb[RTA_NEWDST]),
+						  abuf, sizeof(abuf))
+			);
+	}
 	if (r->rtm_tos && filter.tosmask != -1) {
 		SPRINT_BUF(b1);
 		fprintf(fp, "tos %s ", rtnl_dsfield_n2a(r->rtm_tos, b1, sizeof(b1)));
@@ -814,6 +821,16 @@ static int iproute_modify(int cmd, unsigned flags, int argc, char **argv)
 			if (req.r.rtm_family == AF_UNSPEC)
 				req.r.rtm_family = addr.family;
 			addattr_l(&req.n, sizeof(req), RTA_PREFSRC, &addr.data, addr.bytelen);
+		} else if (strcmp(*argv, "as") == 0) {
+			inet_prefix addr;
+			NEXT_ARG();
+			if (strcmp(*argv, "to") == 0) {
+				NEXT_ARG();
+			}
+			get_addr(&addr, *argv, req.r.rtm_family);
+			if (req.r.rtm_family == AF_UNSPEC)
+				req.r.rtm_family = addr.family;
+			addattr_l(&req.n, sizeof(req), RTA_NEWDST, &addr.data, addr.bytelen);
 		} else if (strcmp(*argv, "via") == 0) {
 			inet_prefix addr;
 			int family;
diff --git a/man/man8/ip-route.8.in b/man/man8/ip-route.8.in
index 906cfea0cd6b..5112344971c0 100644
--- a/man/man8/ip-route.8.in
+++ b/man/man8/ip-route.8.in
@@ -98,6 +98,11 @@ replace " } "
 .IR NUMBER " ] [ "
 .B  advmss
 .IR NUMBER " ] [ "
+.B  as
+[
+.B to
+]
+.IR ADDRESS " ] [ "
 .B  rtt
 .IR TIME " ] [ "
 .B  rttvar
-- 
2.2.1


* [PATCH net-next 8/8] iproute2: Add basic mpls support to iproute
  2015-03-13 18:50 [PATCH net-next] iproute2: MPLS support Eric W. Biederman
                   ` (6 preceding siblings ...)
  2015-03-13 18:58 ` [PATCH net-next 7/8] iproute2: Add support for the RTA_NEWDST attribute Eric W. Biederman
@ 2015-03-13 18:59 ` Eric W. Biederman
       [not found] ` <c3ad7d77783046d38e5b23b5e1fe0f71@BRMWP-EXMB11.corp.brocade.com>
  2015-03-24 22:36 ` [PATCH net-next] iproute2: MPLS support Stephen Hemminger
  9 siblings, 0 replies; 29+ messages in thread
From: Eric W. Biederman @ 2015-03-13 18:59 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


- Pull in the uapi mpls.h
- Update rtnetlink.h to include the mpls rtnetlink notification multicast group.
- Define AF_MPLS in utils.h if it is not defined from elsewhere
  as is done with AF_DECnet

The address syntax for multiple mpls labels is a complete invention.
When I looked there seemed to be no widespread convention for talking
about an mpls label stack in text form.  Sometimes people did:
"{ Label1, Label2, Label3 }", sometimes people would do:
"[ label3, label2, label1 ]", and most of the time label
stacks were not explicitly shown at all.

The syntax I wound up using, so it would not have spaces and so it
would be visually distinct from other kinds of addresses, is:

label1/label2/label3 where label1 is the label at the top of the label
stack and label3 is the label at the bottom of the label stack.

When there is a single label this matches what seems to be the
convention with other tools: just print out the numeric value of the
mpls label.
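
The label1/label2/label3 syntax can be sketched as a small round trip
between text and a label stack.  This is a simplified illustration, not
the code in lib/mpls_pton.c or lib/mpls_ntop.c: the sketch_ names and
constants are made up for the example, and it assumes each entry is a
32-bit network-order word with the label in the top 20 bits and the
bottom-of-stack bit at bit 8.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_LABEL_SHIFT 12
#define SKETCH_S_BIT       (1u << 8)
#define SKETCH_MAX_LABELS  8

/* Parse "l1/l2/l3" into network-order entries; returns count or -1. */
static int sketch_parse_stack(const char *text, uint32_t *out, int max)
{
	int n;

	assert(text != NULL && out != NULL);
	for (n = 0; n < max; n++) {
		char *end;
		unsigned long label = strtoul(text, &end, 0);

		/* Reject empty fields and labels wider than 20 bits */
		if (end == text || label >= (1ul << 20))
			return -1;
		out[n] = htonl((uint32_t)label << SKETCH_LABEL_SHIFT);
		if (*end == '\0') {
			/* Last label: mark the bottom of the stack */
			out[n] |= htonl(SKETCH_S_BIT);
			return n + 1;
		}
		if (*end != '/')
			return -1;
		text = end + 1;
	}
	return -1; /* too many labels */
}

/* Format entries back to "l1/l2/l3"; returns 0 on success, -1 on error. */
static int sketch_format_stack(const uint32_t *stack, int max,
			       char *buf, size_t len)
{
	size_t used = 0;
	int n;

	for (n = 0; n < max; n++) {
		uint32_t entry = ntohl(stack[n]);
		int last = (entry & SKETCH_S_BIT) != 0;
		int w = snprintf(buf + used, len - used, last ? "%u" : "%u/",
				 (unsigned)(entry >> SKETCH_LABEL_SHIFT));

		/* Stop on output error or truncation */
		if (w < 0 || (size_t)w >= len - used)
			return -1;
		used += (size_t)w;
		if (last)
			return 0;
	}
	return -1; /* no bottom-of-stack bit found */
}
```

Unlike the formatter in this patch, the sketch checks the snprintf
return value against the remaining space before advancing, so a short
buffer yields an error instead of a truncated string.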

The netlink protocol for labels uses the on-the-wire format for a
label stack.  The ttl and traffic class are expected to be 0.  Using
the on-the-wire format is common and is what happens with other address
types.  BGP, when passing label stacks, also uses this technique, with
the exception that the ttl byte is not included, making each label in a
BGP label stack 3 bytes instead of 4.
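
To make the entry layout concrete, here is a rough sketch of packing
and unpacking one label stack entry using the field positions from the
mpls.h header added below (label at bit 12, TC at bit 9, S at bit 8,
TTL in the low byte).  The struct and function names are hypothetical,
and the 3-byte form simply drops the trailing ttl byte as described
above.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side view of one label stack entry */
struct sketch_lse {
	uint32_t label;	/* 20 bits */
	uint8_t tc;	/* 3 bits */
	uint8_t bos;	/* 1 bit, bottom of stack */
	uint8_t ttl;	/* 8 bits */
};

/* Pack the fields into a network-order 32-bit entry. */
static uint32_t sketch_lse_pack(struct sketch_lse f)
{
	assert(f.label < (1u << 20) && f.tc < 8 && f.bos < 2);
	return htonl((f.label << 12) | ((uint32_t)f.tc << 9) |
		     ((uint32_t)f.bos << 8) | f.ttl);
}

/* Split a network-order entry back into its fields. */
static struct sketch_lse sketch_lse_unpack(uint32_t wire)
{
	uint32_t e = ntohl(wire);
	struct sketch_lse f;

	f.label = e >> 12;
	f.tc = (e >> 9) & 0x7;
	f.bos = (e >> 8) & 0x1;
	f.ttl = e & 0xff;
	return f;
}

/* 3-byte form: the first three wire bytes, i.e. everything but the ttl. */
static void sketch_lse_to_3byte(uint32_t wire, uint8_t out[3])
{
	memcpy(out, &wire, 3);
}
```

Because the entry is kept in network byte order, the first three bytes
in memory are the label, traffic class and S bit regardless of host
endianness, which is why a plain memcpy is enough for the 3-byte form.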

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 Makefile                  |  3 +++
 include/linux/mpls.h      | 34 +++++++++++++++++++++++++++
 include/linux/rtnetlink.h |  2 ++
 include/utils.h           | 10 ++++++++
 ip/ip.c                   |  4 +++-
 ip/ipmonitor.c            |  3 +++
 ip/iproute.c              |  4 +++-
 lib/mpls_ntop.c           | 48 +++++++++++++++++++++++++++++++++++++++
 lib/mpls_pton.c           | 58 +++++++++++++++++++++++++++++++++++++++++++++++
 lib/utils.c               | 31 +++++++++++++++++++++++--
 man/man8/ip-route.8.in    |  2 +-
 man/man8/ip.8             |  7 +++++-
 12 files changed, 200 insertions(+), 6 deletions(-)
 create mode 100644 include/linux/mpls.h
 create mode 100644 lib/mpls_ntop.c
 create mode 100644 lib/mpls_pton.c

diff --git a/Makefile b/Makefile
index 9dbb29f3d0cd..ca6c2e141308 100644
--- a/Makefile
+++ b/Makefile
@@ -26,6 +26,9 @@ ADDLIB+=dnet_ntop.o dnet_pton.o
 #options for ipx
 ADDLIB+=ipx_ntop.o ipx_pton.o
 
+#options for mpls
+ADDLIB+=mpls_ntop.o mpls_pton.o
+
 CC = gcc
 HOSTCC = gcc
 DEFINES += -D_GNU_SOURCE
diff --git a/include/linux/mpls.h b/include/linux/mpls.h
new file mode 100644
index 000000000000..bc9abfe88c9a
--- /dev/null
+++ b/include/linux/mpls.h
@@ -0,0 +1,34 @@
+#ifndef _UAPI_MPLS_H
+#define _UAPI_MPLS_H
+
+#include <linux/types.h>
+#include <asm/byteorder.h>
+
+/* Reference: RFC 5462, RFC 3032
+ *
+ *  0                   1                   2                   3
+ *  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ * |                Label                  | TC  |S|       TTL     |
+ * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *
+ *	Label:  Label Value, 20 bits
+ *	TC:     Traffic Class field, 3 bits
+ *	S:      Bottom of Stack, 1 bit
+ *	TTL:    Time to Live, 8 bits
+ */
+
+struct mpls_label {
+	__be32 entry;
+};
+
+#define MPLS_LS_LABEL_MASK      0xFFFFF000
+#define MPLS_LS_LABEL_SHIFT     12
+#define MPLS_LS_TC_MASK         0x00000E00
+#define MPLS_LS_TC_SHIFT        9
+#define MPLS_LS_S_MASK          0x00000100
+#define MPLS_LS_S_SHIFT         8
+#define MPLS_LS_TTL_MASK        0x000000FF
+#define MPLS_LS_TTL_SHIFT       0
+
+#endif /* _UAPI_MPLS_H */
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 0d4100535bd7..2e0dc0f638ba 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -629,6 +629,8 @@ enum rtnetlink_groups {
 #define RTNLGRP_IPV6_NETCONF	RTNLGRP_IPV6_NETCONF
 	RTNLGRP_MDB,
 #define RTNLGRP_MDB		RTNLGRP_MDB
+	RTNLGRP_MPLS_ROUTE,
+#define RTNLGRP_MPLS_ROUTE	RTNLGRP_MPLS_ROUTE
 	__RTNLGRP_MAX
 };
 #define RTNLGRP_MAX	(__RTNLGRP_MAX - 1)
diff --git a/include/utils.h b/include/utils.h
index 2db09c7f790f..c8d8635c15e2 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -78,6 +78,13 @@ struct ipx_addr {
 	u_int8_t  ipx_node[IPX_NODE_LEN];
 };
 
+#ifndef AF_MPLS
+# define AF_MPLS 28
+#endif
+
+/* Maximum number of labels the mpls helpers support */
+#define MPLS_MAX_LABELS 8
+
 extern __u32 get_addr32(const char *name);
 extern int get_addr_1(inet_prefix *dst, const char *arg, int family);
 extern int get_prefix_1(inet_prefix *dst, char *arg, int family);
@@ -123,6 +130,9 @@ int dnet_pton(int af, const char *src, void *addr);
 const char *ipx_ntop(int af, const void *addr, char *str, size_t len);
 int ipx_pton(int af, const char *src, void *addr);
 
+const char *mpls_ntop(int af, const void *addr, char *str, size_t len);
+int mpls_pton(int af, const char *src, void *addr);
+
 extern int __iproute2_hz_internal;
 extern int __get_hz(void);
 
diff --git a/ip/ip.c b/ip/ip.c
index 85256d8ea0c1..f7f214b2f5ab 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -52,7 +52,7 @@ static void usage(void)
 "                   netns | l2tp | fou | tcp_metrics | token | netconf }\n"
 "       OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] | -r[esolve] |\n"
 "                    -h[uman-readable] | -iec |\n"
-"                    -f[amily] { inet | inet6 | ipx | dnet | bridge | link } |\n"
+"                    -f[amily] { inet | inet6 | ipx | dnet | mpls | bridge | link } |\n"
 "                    -4 | -6 | -I | -D | -B | -0 |\n"
 "                    -l[oops] { maximum-addr-flush-attempts } |\n"
 "                    -o[neline] | -t[imestamp] | -ts[hort] | -b[atch] [filename] |\n"
@@ -206,6 +206,8 @@ int main(int argc, char **argv)
 			preferred_family = AF_IPX;
 		} else if (strcmp(opt, "-D") == 0) {
 			preferred_family = AF_DECnet;
+		} else if (strcmp(opt, "-M") == 0) {
+			preferred_family = AF_MPLS;
 		} else if (strcmp(opt, "-B") == 0) {
 			preferred_family = AF_BRIDGE;
 		} else if (matches(opt, "-human") == 0 ||
diff --git a/ip/ipmonitor.c b/ip/ipmonitor.c
index 6b5e66534551..7833a2632927 100644
--- a/ip/ipmonitor.c
+++ b/ip/ipmonitor.c
@@ -158,6 +158,7 @@ int do_ipmonitor(int argc, char **argv)
 	groups |= nl_mgrp(RTNLGRP_IPV6_IFADDR);
 	groups |= nl_mgrp(RTNLGRP_IPV4_ROUTE);
 	groups |= nl_mgrp(RTNLGRP_IPV6_ROUTE);
+	groups |= nl_mgrp(RTNLGRP_MPLS_ROUTE);
 	groups |= nl_mgrp(RTNLGRP_IPV4_MROUTE);
 	groups |= nl_mgrp(RTNLGRP_IPV6_MROUTE);
 	groups |= nl_mgrp(RTNLGRP_IPV6_PREFIX);
@@ -235,6 +236,8 @@ int do_ipmonitor(int argc, char **argv)
 			groups |= nl_mgrp(RTNLGRP_IPV4_ROUTE);
 		if (!preferred_family || preferred_family == AF_INET6)
 			groups |= nl_mgrp(RTNLGRP_IPV6_ROUTE);
+		if (!preferred_family || preferred_family == AF_MPLS)
+			groups |= nl_mgrp(RTNLGRP_MPLS_ROUTE);
 	}
 	if (lmroute) {
 		if (!preferred_family || preferred_family == AF_INET)
diff --git a/ip/iproute.c b/ip/iproute.c
index b97aa4b598bf..07b2cca96225 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -76,7 +76,7 @@ static void usage(void)
 	fprintf(stderr, "             [ scope SCOPE ] [ metric METRIC ]\n");
 	fprintf(stderr, "INFO_SPEC := NH OPTIONS FLAGS [ nexthop NH ]...\n");
 	fprintf(stderr, "NH := [ via [ FAMILY ] ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS\n");
-	fprintf(stderr, "FAMILY := [ inet | inet6 | ipx | dnet | bridge | link ]");
+	fprintf(stderr, "FAMILY := [ inet | inet6 | ipx | dnet | mpls | bridge | link ]\n");
 	fprintf(stderr, "OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ] [ as [ to ] ADDRESS ]\n");
 	fprintf(stderr, "           [ rtt TIME ] [ rttvar TIME ] [ reordering NUMBER ]\n");
 	fprintf(stderr, "           [ window NUMBER] [ cwnd NUMBER ] [ initcwnd NUMBER ]\n");
@@ -292,6 +292,8 @@ static int calc_host_len(const struct rtmsg *r)
 		return 16;
 	else if (r->rtm_family == AF_IPX)
 		return 80;
+	else if (r->rtm_family == AF_MPLS)
+		return 20;
 	else
 		return -1;
 }
diff --git a/lib/mpls_ntop.c b/lib/mpls_ntop.c
new file mode 100644
index 000000000000..945d6d5e4535
--- /dev/null
+++ b/lib/mpls_ntop.c
@@ -0,0 +1,48 @@
+#include <errno.h>
+#include <string.h>
+#include <sys/types.h>
+#include <netinet/in.h>
+#include <linux/mpls.h>
+
+#include "utils.h"
+
+static const char *mpls_ntop1(const struct mpls_label *addr, char *buf, size_t buflen)
+{
+	size_t destlen = buflen;
+	char *dest = buf;
+	int count;
+
+	for (count = 0; count < MPLS_MAX_LABELS; count++) {
+		uint32_t entry = ntohl(addr[count].entry);
+		uint32_t label = (entry & MPLS_LS_LABEL_MASK) >> MPLS_LS_LABEL_SHIFT;
+		int len = snprintf(dest, destlen, "%u", label);
+
+		/* Is this the end? */
+		if (entry & MPLS_LS_S_MASK)
+			return buf;
+
+
+		dest += len;
+		destlen -= len;
+		if (destlen) {
+			*dest = '/';
+			dest++;
+			destlen--;
+		}
+	}
+	errno = E2BIG;
+	return NULL;
+}
+
+const char *mpls_ntop(int af, const void *addr, char *buf, size_t buflen)
+{
+	switch(af) {
+	case AF_MPLS:
+		errno = 0;
+		return mpls_ntop1((struct mpls_label *)addr, buf, buflen);
+	default:
+		errno = EAFNOSUPPORT;
+	}
+
+	return NULL;
+}
diff --git a/lib/mpls_pton.c b/lib/mpls_pton.c
new file mode 100644
index 000000000000..bd448cfcf14a
--- /dev/null
+++ b/lib/mpls_pton.c
@@ -0,0 +1,58 @@
+#include <errno.h>
+#include <string.h>
+#include <sys/types.h>
+#include <netinet/in.h>
+#include <linux/mpls.h>
+
+#include "utils.h"
+
+
+static int mpls_pton1(const char *name, struct mpls_label *addr)
+{
+	char *endp;
+	unsigned count;
+
+	for (count = 0; count < MPLS_MAX_LABELS; count++) {
+		unsigned long label;
+
+		label = strtoul(name, &endp, 0);
+		/* Fail when the label value is out of range */
+		if (label >= (1 << 20))
+			return 0;
+
+		if (endp == name) /* no digits */
+			return 0;
+
+		addr->entry = htonl(label << MPLS_LS_LABEL_SHIFT);
+		if (*endp == '\0') {
+			addr->entry |= htonl(1 << MPLS_LS_S_SHIFT);
+			return 1;
+		}
+
+		/* Bad character in the address */
+		if (*endp != '/')
+			return 0;
+
+		name = endp + 1;
+		addr += 1;
+	}
+	/* The address was too long */
+	return 0;
+}
+
+int mpls_pton(int af, const char *src, void *addr)
+{
+	int err;
+
+	switch(af) {
+	case AF_MPLS:
+		errno = 0;
+		err = mpls_pton1(src, (struct mpls_label *)addr);
+		break;
+	default:
+		errno = EAFNOSUPPORT;
+		err = -1;
+	}
+
+	return err;
+}
diff --git a/lib/utils.c b/lib/utils.c
index b293407e550d..4d19718b8763 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -26,6 +26,7 @@
 #include <linux/pkt_sched.h>
 #include <linux/param.h>
 #include <linux/if_arp.h>
+#include <linux/mpls.h>
 #include <time.h>
 #include <sys/time.h>
 #include <errno.h>
@@ -390,7 +391,7 @@ int get_addr_1(inet_prefix *addr, const char *name, int family)
 	if (strcmp(name, "default") == 0 ||
 	    strcmp(name, "all") == 0 ||
 	    strcmp(name, "any") == 0) {
-		if (family == AF_DECnet)
+		if ((family == AF_DECnet) || (family == AF_MPLS))
 			return -1;
 		addr->family = family;
 		addr->bytelen = (family == AF_INET6 ? 16 : 4);
@@ -432,6 +433,23 @@ int get_addr_1(inet_prefix *addr, const char *name, int family)
 		return 0;
 	}
 
+	if (family == AF_MPLS) {
+		int i;
+		addr->family = AF_MPLS;
+		if (mpls_pton(AF_MPLS, name, addr->data) <= 0)
+			return -1;
+		addr->bytelen = 4;
+		addr->bitlen = 20;
+		/* How many bytes do I need? */
+		for (i = 0; i < 8; i++) {
+			if (ntohl(addr->data[i]) & MPLS_LS_S_MASK) {
+				addr->bytelen = (i + 1)*4;
+				break;
+			}
+		}
+		return 0;
+	}
+
 	addr->family = AF_INET;
 	if (family != AF_UNSPEC && family != AF_INET)
 		return -1;
@@ -455,7 +473,7 @@ int get_prefix_1(inet_prefix *dst, char *arg, int family)
 	if (strcmp(arg, "default") == 0 ||
 	    strcmp(arg, "any") == 0 ||
 	    strcmp(arg, "all") == 0) {
-		if (family == AF_DECnet)
+		if ((family == AF_DECnet) || (family == AF_MPLS))
 			return -1;
 		dst->family = family;
 		dst->bytelen = 0;
@@ -476,6 +494,9 @@ int get_prefix_1(inet_prefix *dst, char *arg, int family)
 		case AF_DECnet:
 			dst->bitlen = 16;
 			break;
+		case AF_MPLS:
+			dst->bitlen = 20;
+			break;
 		default:
 		case AF_INET:
 			dst->bitlen = 32;
@@ -639,6 +660,8 @@ const char *rt_addr_n2a(int af, int len, const void *addr, char *buf, int buflen
 	case AF_INET:
 	case AF_INET6:
 		return inet_ntop(af, addr, buf, buflen);
+	case AF_MPLS:
+		return mpls_ntop(af, addr, buf, buflen);
 	case AF_IPX:
 		return ipx_ntop(af, addr, buf, buflen);
 	case AF_DECnet:
@@ -667,6 +690,8 @@ int read_family(const char *name)
 		family = AF_PACKET;
 	else if (strcmp(name, "ipx") == 0)
 		family = AF_IPX;
+	else if (strcmp(name, "mpls") == 0)
+		family = AF_MPLS;
 	else if (strcmp(name, "bridge") == 0)
 		family = AF_BRIDGE;
 	return family;
@@ -684,6 +709,8 @@ const char *family_name(int family)
 		return "link";
 	if (family == AF_IPX)
 		return "ipx";
+	if (family == AF_MPLS)
+		return "mpls";
 	if (family == AF_BRIDGE)
 		return "bridge";
 	return "???";
diff --git a/man/man8/ip-route.8.in b/man/man8/ip-route.8.in
index 5112344971c0..1163536d0e9c 100644
--- a/man/man8/ip-route.8.in
+++ b/man/man8/ip-route.8.in
@@ -90,7 +90,7 @@ replace " } "
 
 .ti -8
 .IR FAMILY " := [ "
-.BR inet " | " inet6 " | " ipx " | " dnet " | " bridge " | " link " ]"
+.BR inet " | " inet6 " | " ipx " | " dnet " | " mpls " | " bridge " | " link " ]"
 
 .ti -8
 .IR OPTIONS " := " FLAGS " [ "
diff --git a/man/man8/ip.8 b/man/man8/ip.8
index 016e8c660cd0..1755473ee32a 100644
--- a/man/man8/ip.8
+++ b/man/man8/ip.8
@@ -73,7 +73,7 @@ Zero (0) means loop until all addresses are removed.
 .TP
 .BR "\-f" , " \-family " <FAMILY>
 Specifies the protocol family to use. The protocol family identifier can be one of
-.BR "inet" , " inet6" , " bridge" , " ipx" , " dnet"
+.BR "inet" , " inet6" , " bridge" , " ipx" , " dnet" , " mpls"
 or
 .BR link .
 If this option is not present,
@@ -115,6 +115,11 @@ shortcut for
 .BR "\-family ipx" .
 
 .TP
+.B \-M
+shortcut for
+.BR "\-family mpls" .
+
+.TP
 .B \-0
 shortcut for
 .BR "\-family link" .
-- 
2.2.1


* Re: [PATCH net-next 1/8] iproute2: Add a source addres length parameter to rt_addr_n2a
       [not found] ` <c3ad7d77783046d38e5b23b5e1fe0f71@BRMWP-EXMB11.corp.brocade.com>
@ 2015-03-15 19:33   ` Stephen Hemminger
  2015-03-15 19:42     ` Eric W. Biederman
  0 siblings, 1 reply; 29+ messages in thread
From: Stephen Hemminger @ 2015-03-15 19:33 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev

On Fri, 13 Mar 2015 18:52:09 +0000
"Eric W. Biederman" <ebiederm@xmission.com> wrote:

> 
> For some address families (like AF_PACKET) it is helpful to have the
> length when printing the address.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>  

Recent changes to utils.h make this patch no longer apply cleanly.
Please redo thanks.


* Re: [PATCH net-next 1/8] iproute2: Add a source addres length parameter to rt_addr_n2a
  2015-03-15 19:33   ` [PATCH net-next 1/8] iproute2: Add a source addres length parameter to rt_addr_n2a Stephen Hemminger
@ 2015-03-15 19:42     ` Eric W. Biederman
  2015-03-15 19:47       ` [PATCH net-next 0/8] iproute2: MPLS support (now with af_bit_len) Eric W. Biederman
  0 siblings, 1 reply; 29+ messages in thread
From: Eric W. Biederman @ 2015-03-15 19:42 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

Stephen Hemminger <shemming@brocade.com> writes:

> On Fri, 13 Mar 2015 18:52:09 +0000
> "Eric W. Biederman" <ebiederm@xmission.com> wrote:
>
>> 
>> For some address families (like AF_PACKET) it is helpful to have the
>> length when printing the address.
>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>>  
>
> Recent changes to utils.h make this patch no longer apply cleanly.
> Please redo thanks.

Yeah the change to introduce af_bitlen does naturally cause some
conflicts.  I will fix things up and resend.

Patch application races are a pain.

Eric


* [PATCH net-next 0/8] iproute2: MPLS support (now with af_bit_len)
  2015-03-15 19:42     ` Eric W. Biederman
@ 2015-03-15 19:47       ` Eric W. Biederman
  2015-03-15 19:48         ` [PATCH net-next 1/8] iproute2: Add a source addres length parameter to rt_addr_n2a Eric W. Biederman
                           ` (7 more replies)
  0 siblings, 8 replies; 29+ messages in thread
From: Eric W. Biederman @ 2015-03-15 19:47 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


This set of changes adds support for nexthops in different
address families, with the new netlink RTA_VIA option.

Support is added for routes that change the destination address (as MPLS
does) with the RTA_NEWDST attribute.

Support for MPLS addresses is added.  For multiple labels I had to
make up the syntax; I used label/label/label, as it fits in well
with addresses that don't have a space in them.

Support for these options is merged into David's net-next kernel
tree, and the meaning of the options is unlikely to change in any
significant way before this code merges upstream.

The documentation has been updated to report that the new options
are present and to report roughly what they do.  This includes
ip --help and the man pages.

Eric W. Biederman (8):
      iproute2: Add a source addres length parameter to rt_addr_n2a
      iproute2: Make the addr argument of ll_addr_n2a const
      iproute2: Add support for printing AF_PACKET addresses
      iproute2: Add address family to/from string helper functions.
      iproute2: misc whitespace cleanup
      iproute2: Add support for  RTA_VIA attributes
      iproute2: Add support for the RTA_NEWDST attribute.
      iproute2: Add basic mpls support to iproute

 Makefile                  |   3 ++
 include/linux/mpls.h      |  34 +++++++++++++++
 include/linux/rtnetlink.h |  10 +++++
 include/rt_names.h        |   2 +-
 include/utils.h           |  22 ++++++++--
 ip/ip.c                   |  20 +++------
 ip/iplink_bond.c          |   1 +
 ip/ipmonitor.c            |   3 ++
 ip/ipmroute.c             |   2 +
 ip/ipprefix.c             |   4 +-
 ip/iproute.c              | 106 ++++++++++++++++++++++++++++++++++++++++------
 ip/iprule.c               |  10 +++--
 ip/iptunnel.c             |   4 +-
 ip/ipxfrm.c               |  17 +++++---
 ip/link_ip6tnl.c          |   2 +
 ip/xfrm_monitor.c         |   8 ++--
 lib/ll_addr.c             |   2 +-
 lib/mpls_ntop.c           |  48 +++++++++++++++++++++
 lib/mpls_pton.c           |  58 +++++++++++++++++++++++++
 lib/utils.c               |  90 +++++++++++++++++++++++++++++++++++----
 man/man8/ip-route.8.in    |  23 +++++++---
 man/man8/ip.8             |   7 ++-
 22 files changed, 410 insertions(+), 66 deletions(-)


* [PATCH net-next 1/8] iproute2: Add a source addres length parameter to rt_addr_n2a
  2015-03-15 19:47       ` [PATCH net-next 0/8] iproute2: MPLS support (now with af_bit_len) Eric W. Biederman
@ 2015-03-15 19:48         ` Eric W. Biederman
  2015-03-15 19:49         ` [PATCH net-next 2/8] iproute2: Make the addr argument of ll_addr_n2a const Eric W. Biederman
                           ` (6 subsequent siblings)
  7 siblings, 0 replies; 29+ messages in thread
From: Eric W. Biederman @ 2015-03-15 19:48 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


For some address families (like AF_PACKET) it is helpful to have the
length when printing the address.
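
A rough sketch of why the length matters: a link-layer address has no
fixed size implied by its family, so a formatter has to be told how
many bytes to print.  The helper below is hypothetical (it is not
ll_addr_n2a or rt_addr_n2a) and just prints the given number of bytes
as colon-separated hex.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format alen bytes of a link-layer address as
 * colon-separated hex.  Returns buf, or NULL if buf is too small.
 */
static const char *sketch_ll_n2a(const unsigned char *addr, int alen,
				 char *buf, size_t blen)
{
	size_t used = 0;
	int i;

	assert(addr != NULL && buf != NULL);
	if (blen == 0)
		return NULL;
	buf[0] = '\0';
	for (i = 0; i < alen; i++) {
		int w = snprintf(buf + used, blen - used,
				 i ? ":%02x" : "%02x", addr[i]);

		/* Stop on output error or truncation */
		if (w < 0 || (size_t)w >= blen - used)
			return NULL;
		used += (size_t)w;
	}
	return buf;
}
```

The same buffer formats as "00:11:22:33:44:55" with a length of 6 and
as "00:11" with a length of 2; without the length argument the
function could not tell the two cases apart.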

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/utils.h   |  2 +-
 ip/iplink_bond.c  |  1 +
 ip/ipmroute.c     |  2 ++
 ip/ipprefix.c     |  4 +++-
 ip/iproute.c      | 11 +++++++----
 ip/iprule.c       | 10 ++++++----
 ip/iptunnel.c     |  2 +-
 ip/ipxfrm.c       | 17 +++++++++++------
 ip/link_ip6tnl.c  |  2 ++
 ip/xfrm_monitor.c |  8 +++++---
 lib/utils.c       |  4 ++--
 11 files changed, 41 insertions(+), 22 deletions(-)

diff --git a/include/utils.h b/include/utils.h
index 9151c4f103e3..1b39e2c5cbfc 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -106,7 +106,7 @@ extern int af_byte_len(int af);
 
 extern const char *format_host(int af, int len, const void *addr,
 			       char *buf, int buflen);
-extern const char *rt_addr_n2a(int af, const void *addr,
+extern const char *rt_addr_n2a(int af, int len, const void *addr,
 			       char *buf, int buflen);
 
 void missarg(const char *) __attribute__((noreturn));
diff --git a/ip/iplink_bond.c b/ip/iplink_bond.c
index 3009ec912e23..a573f92b03a0 100644
--- a/ip/iplink_bond.c
+++ b/ip/iplink_bond.c
@@ -415,6 +415,7 @@ static void bond_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
 			if (iptb[i])
 				fprintf(f, "%s",
 					rt_addr_n2a(AF_INET,
+						    RTA_PAYLOAD(iptb[i]),
 						    RTA_DATA(iptb[i]),
 						    buf,
 						    INET_ADDRSTRLEN));
diff --git a/ip/ipmroute.c b/ip/ipmroute.c
index b4ed9f15fda5..13ac892512d0 100644
--- a/ip/ipmroute.c
+++ b/ip/ipmroute.c
@@ -116,6 +116,7 @@ int print_mroute(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 	if (tb[RTA_SRC])
 		len = snprintf(obuf, sizeof(obuf),
 			       "(%s, ", rt_addr_n2a(family,
+						    RTA_PAYLOAD(tb[RTA_SRC]),
 						    RTA_DATA(tb[RTA_SRC]),
 						    abuf, sizeof(abuf)));
 	else
@@ -123,6 +124,7 @@ int print_mroute(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 	if (tb[RTA_DST])
 		snprintf(obuf + len, sizeof(obuf) - len,
 			 "%s)", rt_addr_n2a(family,
+					    RTA_PAYLOAD(tb[RTA_DST]),
 					    RTA_DATA(tb[RTA_DST]),
 					    abuf, sizeof(abuf)));
 	else
diff --git a/ip/ipprefix.c b/ip/ipprefix.c
index 02c0efce6836..26b596151217 100644
--- a/ip/ipprefix.c
+++ b/ip/ipprefix.c
@@ -80,7 +80,9 @@ int print_prefix(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 		pfx = (struct in6_addr *)RTA_DATA(tb[PREFIX_ADDRESS]);
 
 		memset(abuf, '\0', sizeof(abuf));
-		fprintf(fp, "%s", rt_addr_n2a(family, pfx,
+		fprintf(fp, "%s", rt_addr_n2a(family,
+					      RTA_PAYLOAD(tb[PREFIX_ADDRESS]),
+					      pfx,
 					      abuf, sizeof(abuf)));
 	}
 	fprintf(fp, "/%u ", prefix->prefix_len);
diff --git a/ip/iproute.c b/ip/iproute.c
index b32025ff0da5..79d0760a34f6 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -339,8 +339,9 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 	if (tb[RTA_DST]) {
 		if (r->rtm_dst_len != host_len) {
 			fprintf(fp, "%s/%u ", rt_addr_n2a(r->rtm_family,
-							 RTA_DATA(tb[RTA_DST]),
-							 abuf, sizeof(abuf)),
+						       RTA_PAYLOAD(tb[RTA_DST]),
+						       RTA_DATA(tb[RTA_DST]),
+						       abuf, sizeof(abuf)),
 				r->rtm_dst_len
 				);
 		} else {
@@ -358,8 +359,9 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 	if (tb[RTA_SRC]) {
 		if (r->rtm_src_len != host_len) {
 			fprintf(fp, "from %s/%u ", rt_addr_n2a(r->rtm_family,
-							 RTA_DATA(tb[RTA_SRC]),
-							 abuf, sizeof(abuf)),
+						       RTA_PAYLOAD(tb[RTA_SRC]),
+						       RTA_DATA(tb[RTA_SRC]),
+						       abuf, sizeof(abuf)),
 				r->rtm_src_len
 				);
 		} else {
@@ -401,6 +403,7 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 		 */
 		fprintf(fp, " src %s ",
 			rt_addr_n2a(r->rtm_family,
+				    RTA_PAYLOAD(tb[RTA_PREFSRC]),
 				    RTA_DATA(tb[RTA_PREFSRC]),
 				    abuf, sizeof(abuf)));
 	}
diff --git a/ip/iprule.c b/ip/iprule.c
index 54ed7536e064..967969c0e60e 100644
--- a/ip/iprule.c
+++ b/ip/iprule.c
@@ -82,8 +82,9 @@ int print_rule(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 	if (tb[FRA_SRC]) {
 		if (r->rtm_src_len != host_len) {
 			fprintf(fp, "from %s/%u ", rt_addr_n2a(r->rtm_family,
-							 RTA_DATA(tb[FRA_SRC]),
-							 abuf, sizeof(abuf)),
+						       RTA_PAYLOAD(tb[FRA_SRC]),
+						       RTA_DATA(tb[FRA_SRC]),
+						       abuf, sizeof(abuf)),
 				r->rtm_src_len
 				);
 		} else {
@@ -102,8 +103,9 @@ int print_rule(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 	if (tb[FRA_DST]) {
 		if (r->rtm_dst_len != host_len) {
 			fprintf(fp, "to %s/%u ", rt_addr_n2a(r->rtm_family,
-							 RTA_DATA(tb[FRA_DST]),
-							 abuf, sizeof(abuf)),
+						       RTA_PAYLOAD(tb[FRA_DST]),
+						       RTA_DATA(tb[FRA_DST]),
+						       abuf, sizeof(abuf)),
 				r->rtm_dst_len
 				);
 		} else {
diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index caf8a28e62e8..29188c450370 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -343,7 +343,7 @@ static void print_tunnel(struct ip_tunnel_parm *p)
 	       p->name,
 	       tnl_strproto(p->iph.protocol),
 	       p->iph.daddr ? format_host(AF_INET, 4, &p->iph.daddr, s1, sizeof(s1))  : "any",
-	       p->iph.saddr ? rt_addr_n2a(AF_INET, &p->iph.saddr, s2, sizeof(s2)) : "any");
+	       p->iph.saddr ? rt_addr_n2a(AF_INET, 4, &p->iph.saddr, s2, sizeof(s2)) : "any");
 
 	if (p->iph.protocol == IPPROTO_IPV6 && (p->i_flags & SIT_ISATAP)) {
 		struct ip_tunnel_prl prl[16];
diff --git a/ip/ipxfrm.c b/ip/ipxfrm.c
index 659fa6b64579..eacefd907f46 100644
--- a/ip/ipxfrm.c
+++ b/ip/ipxfrm.c
@@ -288,10 +288,10 @@ void xfrm_id_info_print(xfrm_address_t *saddr, struct xfrm_id *id,
 		fputs(title, fp);
 
 	memset(abuf, '\0', sizeof(abuf));
-	fprintf(fp, "src %s ", rt_addr_n2a(family,
+	fprintf(fp, "src %s ", rt_addr_n2a(family, sizeof(*saddr),
 					   saddr, abuf, sizeof(abuf)));
 	memset(abuf, '\0', sizeof(abuf));
-	fprintf(fp, "dst %s", rt_addr_n2a(family,
+	fprintf(fp, "dst %s", rt_addr_n2a(family, sizeof(id->daddr),
 					  &id->daddr, abuf, sizeof(abuf)));
 	fprintf(fp, "%s", _SL_);
 
@@ -455,11 +455,15 @@ void xfrm_selector_print(struct xfrm_selector *sel, __u16 family,
 		fputs(prefix, fp);
 
 	memset(abuf, '\0', sizeof(abuf));
-	fprintf(fp, "src %s/%u ", rt_addr_n2a(f, &sel->saddr, abuf, sizeof(abuf)),
+	fprintf(fp, "src %s/%u ",
+		rt_addr_n2a(f, sizeof(sel->saddr), &sel->saddr,
+			    abuf, sizeof(abuf)),
 		sel->prefixlen_s);
 
 	memset(abuf, '\0', sizeof(abuf));
-	fprintf(fp, "dst %s/%u ", rt_addr_n2a(f, &sel->daddr, abuf, sizeof(abuf)),
+	fprintf(fp, "dst %s/%u ",
+		rt_addr_n2a(f, sizeof(sel->daddr), &sel->daddr,
+			    abuf, sizeof(abuf)),
 		sel->prefixlen_d);
 
 	if (sel->proto)
@@ -754,7 +758,8 @@ void xfrm_xfrma_print(struct rtattr *tb[], __u16 family,
 
 		memset(abuf, '\0', sizeof(abuf));
 		fprintf(fp, "addr %s",
-			rt_addr_n2a(family, &e->encap_oa, abuf, sizeof(abuf)));
+			rt_addr_n2a(family, sizeof(e->encap_oa), &e->encap_oa,
+				    abuf, sizeof(abuf)));
 		fprintf(fp, "%s", _SL_);
 	}
 
@@ -782,7 +787,7 @@ void xfrm_xfrma_print(struct rtattr *tb[], __u16 family,
 
 		memset(abuf, '\0', sizeof(abuf));
 		fprintf(fp, "%s",
-			rt_addr_n2a(family, coa,
+			rt_addr_n2a(family, sizeof(*coa), coa,
 				    abuf, sizeof(abuf)));
 		fprintf(fp, "%s", _SL_);
 	}
diff --git a/ip/link_ip6tnl.c b/ip/link_ip6tnl.c
index 5ed3d5a23fb5..cf59a9338f57 100644
--- a/ip/link_ip6tnl.c
+++ b/ip/link_ip6tnl.c
@@ -285,6 +285,7 @@ static void ip6tunnel_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb
 	if (tb[IFLA_IPTUN_REMOTE]) {
 		fprintf(f, "remote %s ",
 			rt_addr_n2a(AF_INET6,
+				    RTA_PAYLOAD(tb[IFLA_IPTUN_REMOTE]),
 				    RTA_DATA(tb[IFLA_IPTUN_REMOTE]),
 				    s1, sizeof(s1)));
 	}
@@ -292,6 +293,7 @@ static void ip6tunnel_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb
 	if (tb[IFLA_IPTUN_LOCAL]) {
 		fprintf(f, "local %s ",
 			rt_addr_n2a(AF_INET6,
+				    RTA_PAYLOAD(tb[IFLA_IPTUN_LOCAL]),
 				    RTA_DATA(tb[IFLA_IPTUN_LOCAL]),
 				    s1, sizeof(s1)));
 	}
diff --git a/ip/xfrm_monitor.c b/ip/xfrm_monitor.c
index 50116a7b5433..b2b2d6e27a45 100644
--- a/ip/xfrm_monitor.c
+++ b/ip/xfrm_monitor.c
@@ -227,7 +227,8 @@ static void xfrm_usersa_print(const struct xfrm_usersa_id *sa_id, __u32 reqid, F
 
 	buf[0] = 0;
 	fprintf(fp, "dst %s ",
-		rt_addr_n2a(sa_id->family, &sa_id->daddr, buf, sizeof(buf)));
+		rt_addr_n2a(sa_id->family, sizeof(sa_id->daddr), &sa_id->daddr,
+			    buf, sizeof(buf)));
 
 	fprintf(fp, " reqid 0x%x", reqid);
 
@@ -246,7 +247,8 @@ static int xfrm_ae_print(const struct sockaddr_nl *who,
 	xfrm_ae_flags_print(id->flags, arg);
 	fprintf(fp,"\n\t");
 	memset(abuf, '\0', sizeof(abuf));
-	fprintf(fp, "src %s ", rt_addr_n2a(id->sa_id.family, &id->saddr,
+	fprintf(fp, "src %s ", rt_addr_n2a(id->sa_id.family,
+					   sizeof(id->saddr), &id->saddr,
 					   abuf, sizeof(abuf)));
 
 	xfrm_usersa_print(&id->sa_id, id->reqid, fp);
@@ -262,7 +264,7 @@ static void xfrm_print_addr(FILE *fp, int family, xfrm_address_t *a)
 	char buf[256];
 
 	buf[0] = 0;
-	fprintf(fp, "%s", rt_addr_n2a(family, a, buf, sizeof(buf)));
+	fprintf(fp, "%s", rt_addr_n2a(family, sizeof(*a), a, buf, sizeof(buf)));
 }
 
 static int xfrm_mapping_print(const struct sockaddr_nl *who,
diff --git a/lib/utils.c b/lib/utils.c
index 9cda26810da1..4b4f20126822 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -636,7 +636,7 @@ int __get_user_hz(void)
 	return sysconf(_SC_CLK_TCK);
 }
 
-const char *rt_addr_n2a(int af, const void *addr, char *buf, int buflen)
+const char *rt_addr_n2a(int af, int len, const void *addr, char *buf, int buflen)
 {
 	switch (af) {
 	case AF_INET:
@@ -723,7 +723,7 @@ const char *format_host(int af, int len, const void *addr,
 			return n;
 	}
 #endif
-	return rt_addr_n2a(af, addr, buf, buflen);
+	return rt_addr_n2a(af, len, addr, buf, buflen);
 }
 
 
-- 
2.2.1


* [PATCH net-next 2/8] iproute2: Make the addr argument of ll_addr_n2a const
  2015-03-15 19:47       ` [PATCH net-next 0/8] iproute2: MPLS support (now with af_bit_len) Eric W. Biederman
  2015-03-15 19:48         ` [PATCH net-next 1/8] iproute2: Add a source addres length parameter to rt_addr_n2a Eric W. Biederman
@ 2015-03-15 19:49         ` Eric W. Biederman
  2015-03-15 19:49         ` [PATCH net-next 3/8] iproute2: Add support for printing AF_PACKET addresses Eric W. Biederman
                           ` (5 subsequent siblings)
  7 siblings, 0 replies; 29+ messages in thread
From: Eric W. Biederman @ 2015-03-15 19:49 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


This avoids build warnings when AF_PACKET support is added
to rt_addr_n2a.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/rt_names.h | 2 +-
 lib/ll_addr.c      | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/rt_names.h b/include/rt_names.h
index c0ea4f982904..921be0607b51 100644
--- a/include/rt_names.h
+++ b/include/rt_names.h
@@ -22,7 +22,7 @@ int inet_proto_a2n(const char *buf);
 
 
 const char * ll_type_n2a(int type, char *buf, int len);
-const char *ll_addr_n2a(unsigned char *addr, int alen,
+const char *ll_addr_n2a(const unsigned char *addr, int alen,
 			int type, char *buf, int blen);
 int ll_addr_a2n(char *lladdr, int len, const char *arg);
 
diff --git a/lib/ll_addr.c b/lib/ll_addr.c
index c12ab075c4a9..2ce9abfbb8c6 100644
--- a/lib/ll_addr.c
+++ b/lib/ll_addr.c
@@ -29,7 +29,7 @@
 #include "utils.h"
 
 
-const char *ll_addr_n2a(unsigned char *addr, int alen, int type, char *buf, int blen)
+const char *ll_addr_n2a(const unsigned char *addr, int alen, int type, char *buf, int blen)
 {
 	int i;
 	int l;
-- 
2.2.1


* [PATCH net-next 3/8] iproute2: Add support for printing AF_PACKET addresses
  2015-03-15 19:47       ` [PATCH net-next 0/8] iproute2: MPLS support (now with af_bit_len) Eric W. Biederman
  2015-03-15 19:48         ` [PATCH net-next 1/8] iproute2: Add a source addres length parameter to rt_addr_n2a Eric W. Biederman
  2015-03-15 19:49         ` [PATCH net-next 2/8] iproute2: Make the addr argument of ll_addr_n2a const Eric W. Biederman
@ 2015-03-15 19:49         ` Eric W. Biederman
  2015-03-15 19:50         ` [PATCH net-next 4/8] iproute2: Add address family to/from string helper functions Eric W. Biederman
                           ` (4 subsequent siblings)
  7 siblings, 0 replies; 29+ messages in thread
From: Eric W. Biederman @ 2015-03-15 19:49 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 lib/utils.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/lib/utils.c b/lib/utils.c
index 4b4f20126822..af98c42565a5 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -25,11 +25,12 @@
 #include <asm/types.h>
 #include <linux/pkt_sched.h>
 #include <linux/param.h>
+#include <linux/if_arp.h>
 #include <time.h>
 #include <sys/time.h>
 #include <errno.h>
 
-
+#include "rt_names.h"
 #include "utils.h"
 #include "namespace.h"
 
@@ -397,6 +398,18 @@ int get_addr_1(inet_prefix *addr, const char *name, int family)
 		return 0;
 	}
 
+	if (family == AF_PACKET) {
+		int len;
+		len = ll_addr_a2n((char *)&addr->data, sizeof(addr->data), name);
+		if (len < 0)
+			return -1;
+
+		addr->family = AF_PACKET;
+		addr->bytelen = len;
+		addr->bitlen = len * 8;
+		return 0;
+	}
+
 	if (strchr(name, ':')) {
 		addr->family = AF_INET6;
 		if (family != AF_UNSPEC && family != AF_INET6)
@@ -497,10 +510,6 @@ done:
 
 int get_addr(inet_prefix *dst, const char *arg, int family)
 {
-	if (family == AF_PACKET) {
-		fprintf(stderr, "Error: \"%s\" may be inet address, but it is not allowed in this context.\n", arg);
-		exit(1);
-	}
 	if (get_addr_1(dst, arg, family)) {
 		fprintf(stderr, "Error: an inet address is expected rather than \"%s\".\n", arg);
 		exit(1);
@@ -650,6 +659,8 @@ const char *rt_addr_n2a(int af, int len, const void *addr, char *buf, int buflen
 		memcpy(dna.a_addr, addr, 2);
 		return dnet_ntop(af, &dna, buf, buflen);
 	}
+	case AF_PACKET:
+		return ll_addr_n2a(addr, len, ARPHRD_VOID, buf, buflen);
 	default:
 		return "???";
 	}
-- 
2.2.1


* [PATCH net-next 4/8] iproute2: Add address family to/from string helper functions.
  2015-03-15 19:47       ` [PATCH net-next 0/8] iproute2: MPLS support (now with af_bit_len) Eric W. Biederman
                           ` (2 preceding siblings ...)
  2015-03-15 19:49         ` [PATCH net-next 3/8] iproute2: Add support for printing AF_PACKET addresses Eric W. Biederman
@ 2015-03-15 19:50         ` Eric W. Biederman
  2015-03-15 19:51         ` [PATCH net-next 5/8] iproute2: misc whitespace cleanup Eric W. Biederman
                           ` (3 subsequent siblings)
  7 siblings, 0 replies; 29+ messages in thread
From: Eric W. Biederman @ 2015-03-15 19:50 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


Add the functions family_name and read_family to convert an address
family to a string and to convert a string to an address family.
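
As an illustration only, the same two-way mapping can be sketched with
a lookup table.  The patch below uses if/else chains instead; the names
sketch_families, sketch_read_family and sketch_family_name are
hypothetical, and only three families are listed to keep the example
short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical table; the actual patch spells this out as if/else chains. */
struct sketch_family {
	int family;
	const char *name;
};

static const struct sketch_family sketch_families[] = {
	{ AF_INET,   "inet"  },
	{ AF_INET6,  "inet6" },
	{ AF_PACKET, "link"  },
};

#define SKETCH_NFAMILIES (sizeof(sketch_families) / sizeof(sketch_families[0]))

/* String to address family; AF_UNSPEC means "not a family name". */
static int sketch_read_family(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < SKETCH_NFAMILIES; i++)
		if (strcmp(name, sketch_families[i].name) == 0)
			return sketch_families[i].family;
	return AF_UNSPEC;
}

/* Address family to string; "???" for families not in the table. */
static const char *sketch_family_name(int family)
{
	size_t i;

	for (i = 0; i < SKETCH_NFAMILIES; i++)
		if (sketch_families[i].family == family)
			return sketch_families[i].name;
	return "???";
}
```

Returning AF_UNSPEC for an unknown name lets a caller decide whether
that is an error (as ip.c does for -family) or simply means the token
was not a family name at all.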

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/utils.h |  3 +++
 ip/ip.c         | 16 +++-------------
 lib/utils.c     | 35 +++++++++++++++++++++++++++++++++++
 3 files changed, 41 insertions(+), 13 deletions(-)

diff --git a/include/utils.h b/include/utils.h
index 1b39e2c5cbfc..99dde4c5a667 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -109,6 +109,9 @@ extern const char *format_host(int af, int len, const void *addr,
 extern const char *rt_addr_n2a(int af, int len, const void *addr,
 			       char *buf, int buflen);
 
+extern int read_family(const char *name);
+extern const char *family_name(int family);
+
 void missarg(const char *) __attribute__((noreturn));
 void invarg(const char *, const char *) __attribute__((noreturn));
 void duparg(const char *, const char *) __attribute__((noreturn));
diff --git a/ip/ip.c b/ip/ip.c
index da16b15f8b55..85256d8ea0c1 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -190,21 +190,11 @@ int main(int argc, char **argv)
 			argv++;
 			if (argc <= 1)
 				usage();
-			if (strcmp(argv[1], "inet") == 0)
-				preferred_family = AF_INET;
-			else if (strcmp(argv[1], "inet6") == 0)
-				preferred_family = AF_INET6;
-			else if (strcmp(argv[1], "dnet") == 0)
-				preferred_family = AF_DECnet;
-			else if (strcmp(argv[1], "link") == 0)
-				preferred_family = AF_PACKET;
-			else if (strcmp(argv[1], "ipx") == 0)
-				preferred_family = AF_IPX;
-			else if (strcmp(argv[1], "bridge") == 0)
-				preferred_family = AF_BRIDGE;
-			else if (strcmp(argv[1], "help") == 0)
+			if (strcmp(argv[1], "help") == 0)
 				usage();
 			else
+				preferred_family = read_family(argv[1]);
+			if (preferred_family == AF_UNSPEC)
 				invarg("invalid protocol family", argv[1]);
 		} else if (strcmp(opt, "-4") == 0) {
 			preferred_family = AF_INET;
diff --git a/lib/utils.c b/lib/utils.c
index af98c42565a5..e5c49896fa01 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -666,6 +666,41 @@ const char *rt_addr_n2a(int af, int len, const void *addr, char *buf, int buflen
 	}
 }
 
+int read_family(const char *name)
+{
+	int family = AF_UNSPEC;
+	if (strcmp(name, "inet") == 0)
+		family = AF_INET;
+	else if (strcmp(name, "inet6") == 0)
+		family = AF_INET6;
+	else if (strcmp(name, "dnet") == 0)
+		family = AF_DECnet;
+	else if (strcmp(name, "link") == 0)
+		family = AF_PACKET;
+	else if (strcmp(name, "ipx") == 0)
+		family = AF_IPX;
+	else if (strcmp(name, "bridge") == 0)
+		family = AF_BRIDGE;
+	return family;
+}
+
+const char *family_name(int family)
+{
+	if (family == AF_INET)
+		return "inet";
+	if (family == AF_INET6)
+		return "inet6";
+	if (family == AF_DECnet)
+		return "dnet";
+	if (family == AF_PACKET)
+		return "link";
+	if (family == AF_IPX)
+		return "ipx";
+	if (family == AF_BRIDGE)
+		return "bridge";
+	return "???";
+}
+
 #ifdef RESOLVE_HOSTNAMES
 struct namerec
 {
-- 
2.2.1


* [PATCH net-next 5/8] iproute2: misc whitespace cleanup
  2015-03-15 19:47       ` [PATCH net-next 0/8] iproute2: MPLS support (now with af_bit_len) Eric W. Biederman
                           ` (3 preceding siblings ...)
  2015-03-15 19:50         ` [PATCH net-next 4/8] iproute2: Add address family to/from string helper functions Eric W. Biederman
@ 2015-03-15 19:51         ` Eric W. Biederman
  2015-03-15 19:52         ` [PATCH net-next 6/8] iproute2: Add support for the RTA_VIA attribute Eric W. Biederman
                           ` (2 subsequent siblings)
  7 siblings, 0 replies; 29+ messages in thread
From: Eric W. Biederman @ 2015-03-15 19:51 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


---
 ip/iptunnel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ip/iptunnel.c b/ip/iptunnel.c
index 29188c450370..be84b83ec673 100644
--- a/ip/iptunnel.c
+++ b/ip/iptunnel.c
@@ -342,7 +342,7 @@ static void print_tunnel(struct ip_tunnel_parm *p)
 	printf("%s: %s/ip  remote %s  local %s ",
 	       p->name,
 	       tnl_strproto(p->iph.protocol),
-	       p->iph.daddr ? format_host(AF_INET, 4, &p->iph.daddr, s1, sizeof(s1))  : "any",
+	       p->iph.daddr ? format_host(AF_INET, 4, &p->iph.daddr, s1, sizeof(s1)) : "any",
 	       p->iph.saddr ? rt_addr_n2a(AF_INET, 4, &p->iph.saddr, s2, sizeof(s2)) : "any");
 
 	if (p->iph.protocol == IPPROTO_IPV6 && (p->i_flags & SIT_ISATAP)) {
-- 
2.2.1


* [PATCH net-next 6/8] iproute2: Add support for the RTA_VIA attribute
  2015-03-15 19:47       ` [PATCH net-next 0/8] iproute2: MPLS support (now with af_bit_len) Eric W. Biederman
                           ` (4 preceding siblings ...)
  2015-03-15 19:51         ` [PATCH net-next 5/8] iproute2: misc whitespace cleanup Eric W. Biederman
@ 2015-03-15 19:52         ` Eric W. Biederman
  2015-04-06 23:04           ` roopa
  2015-03-15 19:53         ` [PATCH net-next 7/8] iproute2: Add support for the RTA_NEWDST attribute Eric W. Biederman
  2015-03-15 19:53         ` [PATCH net-next 8/8] iproute2: Add basic mpls support to iproute Eric W. Biederman
  7 siblings, 1 reply; 29+ messages in thread
From: Eric W. Biederman @ 2015-03-15 19:52 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


Add support for the RTA_VIA attribute that specifies an address family
as well as an address for the next hop gateway.

To make this easy to pass, reorder inet_prefix so that its tail
is a proper RTA_VIA attribute.
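
A minimal sketch of why the reordering helps, using stand-in types
(sketch_prefix, sketch_via and sketch_via_payload are hypothetical
names that mirror the field order of inet_prefix and struct rtvia in
this patch): once the 16-bit family sits directly before the address
data, the bytes starting at the family field already have the shape of
an RTA_VIA payload, with a length of bytelen + 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct rtvia: a 16-bit family followed by the address. */
struct sketch_via {
	uint16_t family;
	uint8_t  addr[];
};

/* Stand-in for the reordered inet_prefix. */
struct sketch_prefix {
	uint16_t flags;
	uint16_t bytelen;
	int16_t  bitlen;
	uint16_t family;	/* family and data together match sketch_via */
	uint32_t data[8];
};

/* The gap between family and data must equal the gap in sketch_via. */
_Static_assert(offsetof(struct sketch_prefix, data) -
	       offsetof(struct sketch_prefix, family) ==
	       offsetof(struct sketch_via, addr),
	       "prefix tail does not match the via layout");

/* Copy the via-shaped tail into out; returns its length, or 0 on error. */
static size_t sketch_via_payload(const struct sketch_prefix *p,
				 void *out, size_t outlen)
{
	const unsigned char *tail;
	size_t len;

	assert(p != NULL && out != NULL);
	if (p->bytelen > sizeof(p->data))
		return 0;
	len = sizeof(p->family) + p->bytelen;
	if (len > outlen)
		return 0;
	tail = (const unsigned char *)p + offsetof(struct sketch_prefix, family);
	memcpy(out, tail, len);
	return len;
}
```

This corresponds to the addattr_l(..., RTA_VIA, &addr.family,
addr.bytelen+2) calls in the diff below, which pass the tail of the
prefix without building a separate rtvia buffer.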

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/rtnetlink.h |  7 +++++
 include/utils.h           |  7 +++--
 ip/iproute.c              | 76 +++++++++++++++++++++++++++++++++++++++++------
 man/man8/ip-route.8.in    | 18 +++++++----
 4 files changed, 90 insertions(+), 18 deletions(-)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 3eb78105399b..03e4c8df8e60 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -303,6 +303,7 @@ enum rtattr_type_t {
 	RTA_TABLE,
 	RTA_MARK,
 	RTA_MFC_STATS,
+	RTA_VIA,
 	__RTA_MAX
 };
 
@@ -344,6 +345,12 @@ struct rtnexthop {
 #define RTNH_SPACE(len)	RTNH_ALIGN(RTNH_LENGTH(len))
 #define RTNH_DATA(rtnh)   ((struct rtattr*)(((char*)(rtnh)) + RTNH_LENGTH(0)))
 
+/* RTA_VIA */
+struct rtvia {
+	__kernel_sa_family_t	rtvia_family;
+	__u8			rtvia_addr[0];
+};
+
 /* RTM_CACHEINFO */
 
 struct rta_cacheinfo {
diff --git a/include/utils.h b/include/utils.h
index 99dde4c5a667..6bbcc10756d3 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -50,10 +50,11 @@ extern void incomplete_command(void) __attribute__((noreturn));
 
 typedef struct
 {
-	__u8 family;
-	__u8 bytelen;
+	__u16 flags;
+	__u16 bytelen;
 	__s16 bitlen;
-	__u32 flags;
+	/* These next two fields match rtvia */
+	__u16 family;
 	__u32 data[8];
 } inet_prefix;
 
diff --git a/ip/iproute.c b/ip/iproute.c
index 79d0760a34f6..c6ee411fdd56 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -75,7 +75,8 @@ static void usage(void)
 	fprintf(stderr, "             [ table TABLE_ID ] [ proto RTPROTO ]\n");
 	fprintf(stderr, "             [ scope SCOPE ] [ metric METRIC ]\n");
 	fprintf(stderr, "INFO_SPEC := NH OPTIONS FLAGS [ nexthop NH ]...\n");
-	fprintf(stderr, "NH := [ via ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS\n");
+	fprintf(stderr, "NH := [ via [ FAMILY ] ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS\n");
+	fprintf(stderr, "FAMILY := [ inet | inet6 | ipx | dnet | bridge | link ]");
 	fprintf(stderr, "OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ]\n");
 	fprintf(stderr, "           [ rtt TIME ] [ rttvar TIME ] [ reordering NUMBER ]\n");
 	fprintf(stderr, "           [ window NUMBER] [ cwnd NUMBER ] [ initcwnd NUMBER ]\n");
@@ -185,8 +186,15 @@ static int filter_nlmsg(struct nlmsghdr *n, struct rtattr **tb, int host_len)
 	    (r->rtm_family != filter.msrc.family ||
 	     (filter.msrc.bitlen >= 0 && filter.msrc.bitlen < r->rtm_src_len)))
 		return 0;
-	if (filter.rvia.family && r->rtm_family != filter.rvia.family)
-		return 0;
+	if (filter.rvia.family) {
+		int family = r->rtm_family;
+		if (tb[RTA_VIA]) {
+			struct rtvia *via = RTA_DATA(tb[RTA_VIA]);
+			family = via->rtvia_family;
+		}
+		if (family != filter.rvia.family)
+			return 0;
+	}
 	if (filter.rprefsrc.family && r->rtm_family != filter.rprefsrc.family)
 		return 0;
 
@@ -205,6 +213,12 @@ static int filter_nlmsg(struct nlmsghdr *n, struct rtattr **tb, int host_len)
 		via.family = r->rtm_family;
 		if (tb[RTA_GATEWAY])
 			memcpy(&via.data, RTA_DATA(tb[RTA_GATEWAY]), host_len/8);
+		if (tb[RTA_VIA]) {
+			size_t len = RTA_PAYLOAD(tb[RTA_VIA]) - 2;
+			struct rtvia *rtvia = RTA_DATA(tb[RTA_VIA]);
+			via.family = rtvia->rtvia_family;
+			memcpy(&via.data, rtvia->rtvia_addr, len);
+		}
 	}
 	if (filter.rprefsrc.bitlen>0) {
 		memset(&prefsrc, 0, sizeof(prefsrc));
@@ -386,6 +400,14 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 				    RTA_DATA(tb[RTA_GATEWAY]),
 				    abuf, sizeof(abuf)));
 	}
+	if (tb[RTA_VIA]) {
+		size_t len = RTA_PAYLOAD(tb[RTA_VIA]) - 2;
+		struct rtvia *via = RTA_DATA(tb[RTA_VIA]);
+		fprintf(fp, "via %s %s ",
+			family_name(via->rtvia_family),
+			format_host(via->rtvia_family, len, via->rtvia_addr,
+				    abuf, sizeof(abuf)));
+	}
 	if (tb[RTA_OIF] && filter.oifmask != -1)
 		fprintf(fp, "dev %s ", ll_index_to_name(*(int*)RTA_DATA(tb[RTA_OIF])));
 
@@ -601,6 +623,14 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 							    RTA_DATA(tb[RTA_GATEWAY]),
 							    abuf, sizeof(abuf)));
 				}
+				if (tb[RTA_VIA]) {
+					size_t len = RTA_PAYLOAD(tb[RTA_VIA]) - 2;
+					struct rtvia *via = RTA_DATA(tb[RTA_VIA]);
+					fprintf(fp, "via %s %s ",
+						family_name(via->rtvia_family),
+						format_host(via->rtvia_family, len, via->rtvia_addr,
+							    abuf, sizeof(abuf)));
+				}
 				if (tb[RTA_FLOW]) {
 					__u32 to = rta_getattr_u32(tb[RTA_FLOW]);
 					__u32 from = to>>16;
@@ -648,12 +678,23 @@ static int parse_one_nh(struct rtmsg *r, struct rtattr *rta,
 	while (++argv, --argc > 0) {
 		if (strcmp(*argv, "via") == 0) {
 			inet_prefix addr;
+			int family;
 			NEXT_ARG();
-			get_addr(&addr, *argv, r->rtm_family);
+			family = read_family(*argv);
+			if (family == AF_UNSPEC)
+				family = r->rtm_family;
+			else
+				NEXT_ARG();
+			get_addr(&addr, *argv, family);
 			if (r->rtm_family == AF_UNSPEC)
 				r->rtm_family = addr.family;
-			rta_addattr_l(rta, 4096, RTA_GATEWAY, &addr.data, addr.bytelen);
-			rtnh->rtnh_len += sizeof(struct rtattr) + addr.bytelen;
+			if (addr.family == r->rtm_family) {
+				rta_addattr_l(rta, 4096, RTA_GATEWAY, &addr.data, addr.bytelen);
+				rtnh->rtnh_len += sizeof(struct rtattr) + addr.bytelen;
+			} else {
+				rta_addattr_l(rta, 4096, RTA_VIA, &addr.family, addr.bytelen+2);
+				rtnh->rtnh_len += sizeof(struct rtattr) + addr.bytelen+2;
+			}
 		} else if (strcmp(*argv, "dev") == 0) {
 			NEXT_ARG();
 			if ((rtnh->rtnh_ifindex = ll_name_to_index(*argv)) == 0) {
@@ -761,12 +802,21 @@ static int iproute_modify(int cmd, unsigned flags, int argc, char **argv)
 			addattr_l(&req.n, sizeof(req), RTA_PREFSRC, &addr.data, addr.bytelen);
 		} else if (strcmp(*argv, "via") == 0) {
 			inet_prefix addr;
+			int family;
 			gw_ok = 1;
 			NEXT_ARG();
-			get_addr(&addr, *argv, req.r.rtm_family);
+			family = read_family(*argv);
+			if (family == AF_UNSPEC)
+				family = req.r.rtm_family;
+			else
+				NEXT_ARG();
+			get_addr(&addr, *argv, family);
 			if (req.r.rtm_family == AF_UNSPEC)
 				req.r.rtm_family = addr.family;
-			addattr_l(&req.n, sizeof(req), RTA_GATEWAY, &addr.data, addr.bytelen);
+			if (addr.family == req.r.rtm_family)
+				addattr_l(&req.n, sizeof(req), RTA_GATEWAY, &addr.data, addr.bytelen);
+			else
+				addattr_l(&req.n, sizeof(req), RTA_VIA, &addr.family, addr.bytelen+2);
 		} else if (strcmp(*argv, "from") == 0) {
 			inet_prefix addr;
 			NEXT_ARG();
@@ -1251,8 +1301,14 @@ static int iproute_list_flush_or_save(int argc, char **argv, int action)
 			get_unsigned(&mark, *argv, 0);
 			filter.markmask = -1;
 		} else if (strcmp(*argv, "via") == 0) {
+			int family;
 			NEXT_ARG();
-			get_prefix(&filter.rvia, *argv, do_ipv6);
+			family = read_family(*argv);
+			if (family == AF_UNSPEC)
+				family = do_ipv6;
+			else
+				NEXT_ARG();
+			get_prefix(&filter.rvia, *argv, family);
 		} else if (strcmp(*argv, "src") == 0) {
 			NEXT_ARG();
 			get_prefix(&filter.rprefsrc, *argv, do_ipv6);
@@ -1554,6 +1610,8 @@ static int iproute_get(int argc, char **argv)
 			tb[RTA_OIF]->rta_type = 0;
 		if (tb[RTA_GATEWAY])
 			tb[RTA_GATEWAY]->rta_type = 0;
+		if (tb[RTA_VIA])
+			tb[RTA_VIA]->rta_type = 0;
 		if (!idev && tb[RTA_IIF])
 			tb[RTA_IIF]->rta_type = 0;
 		req.n.nlmsg_flags = NLM_F_REQUEST;
diff --git a/man/man8/ip-route.8.in b/man/man8/ip-route.8.in
index 2b1583d5a30c..906cfea0cd6b 100644
--- a/man/man8/ip-route.8.in
+++ b/man/man8/ip-route.8.in
@@ -81,13 +81,18 @@ replace " } "
 .ti -8
 .IR NH " := [ "
 .B  via
-.IR ADDRESS " ] [ "
+[
+.IR FAMILY " ] " ADDRESS " ] [ "
 .B  dev
 .IR STRING " ] [ "
 .B  weight
 .IR NUMBER " ] " NHFLAGS
 
 .ti -8
+.IR FAMILY " := [ "
+.BR inet " | " inet6 " | " ipx " | " dnet " | " bridge " | " link " ]"
+
+.ti -8
 .IR OPTIONS " := " FLAGS " [ "
 .B  mtu
 .IR NUMBER " ] [ "
@@ -333,9 +338,10 @@ table by default.
 the output device name.
 
 .TP
-.BI via " ADDRESS"
-the address of the nexthop router.  Actually, the sense of this field
-depends on the route type.  For normal
+.BI via " [ FAMILY ] ADDRESS"
+the address of the nexthop router, in the address family FAMILY.
+Actually, the sense of this field depends on the route type.  For
+normal
 .B unicast
 routes it is either the true next hop router or, if it is a direct
 route installed in BSD compatibility mode, it can be a local address
@@ -472,7 +478,7 @@ is a complex value with its own syntax similar to the top level
 argument lists:
 
 .in +8
-.BI via " ADDRESS"
+.BI via " [ FAMILY ] ADDRESS"
 - is the nexthop router.
 .sp
 
@@ -669,7 +675,7 @@ only list routes of this type.
 only list routes going via this device.
 
 .TP
-.BI via " PREFIX"
+.BI via " [ FAMILY ] PREFIX"
 only list routes going via the nexthop routers selected by
 .IR PREFIX "."
 
-- 
2.2.1


* [PATCH net-next 7/8] iproute2: Add support for the RTA_NEWDST attribute.
  2015-03-15 19:47       ` [PATCH net-next 0/8] iproute2: MPLS support (now with af_bit_len) Eric W. Biederman
                           ` (5 preceding siblings ...)
  2015-03-15 19:52         ` [PATCH net-next 6/8] iproute2: Add support for the RTA_VIA attribute Eric W. Biederman
@ 2015-03-15 19:53         ` Eric W. Biederman
  2015-03-15 19:53         ` [PATCH net-next 8/8] iproute2: Add basic mpls support to iproute Eric W. Biederman
  7 siblings, 0 replies; 29+ messages in thread
From: Eric W. Biederman @ 2015-03-15 19:53 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


This attribute is like RTA_DST except it specifies the destination
address to place on a packet when it leaves the host.  For IP-based
protocols this is destination NAT and not a common part of forwarding.
For protocols like MPLS, label swapping is something that typically
happens on every hop.

There is likely to be a RTA_NEWSRC at some point, so RTA_NEWDST
is printed as "as to" and can be specified either as "as to"
or just "as".
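
The optional "to" keyword can be sketched as follows.  This is a
standalone illustration, not the code in the patch (which uses the
NEXT_ARG() macro inside iproute_modify); sketch_parse_as is a
hypothetical helper that returns the ADDRESS token or NULL when it is
missing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * argv[0] is expected to be "as".  Accepts both "as ADDRESS" and
 * "as to ADDRESS" and returns the ADDRESS token, or NULL if the
 * keyword is wrong or the address is missing.
 */
static const char *sketch_parse_as(int argc, char **argv)
{
	int i = 0;

	assert(argv != NULL);
	if (argc < 1 || strcmp(argv[0], "as") != 0)
		return NULL;
	if (++i >= argc)
		return NULL;		/* nothing after "as" */
	if (strcmp(argv[i], "to") == 0 && ++i >= argc)
		return NULL;		/* nothing after "as to" */
	return argv[i];
}
```

Treating "to" as an optional filler word keeps the short form usable
today while leaving room for a future "as from" spelling should
RTA_NEWSRC ever appear.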

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/rtnetlink.h |  1 +
 ip/iproute.c              | 19 ++++++++++++++++++-
 man/man8/ip-route.8.in    |  5 +++++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 03e4c8df8e60..0d4100535bd7 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -304,6 +304,7 @@ enum rtattr_type_t {
 	RTA_MARK,
 	RTA_MFC_STATS,
 	RTA_VIA,
+	RTA_NEWDST,
 	__RTA_MAX
 };
 
diff --git a/ip/iproute.c b/ip/iproute.c
index c6ee411fdd56..2e691b8ef294 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -77,7 +77,7 @@ static void usage(void)
 	fprintf(stderr, "INFO_SPEC := NH OPTIONS FLAGS [ nexthop NH ]...\n");
 	fprintf(stderr, "NH := [ via [ FAMILY ] ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS\n");
 	fprintf(stderr, "FAMILY := [ inet | inet6 | ipx | dnet | bridge | link ]");
-	fprintf(stderr, "OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ]\n");
+	fprintf(stderr, "OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ] [ as [ to ] ADDRESS ]\n");
 	fprintf(stderr, "           [ rtt TIME ] [ rttvar TIME ] [ reordering NUMBER ]\n");
 	fprintf(stderr, "           [ window NUMBER] [ cwnd NUMBER ] [ initcwnd NUMBER ]\n");
 	fprintf(stderr, "           [ ssthresh NUMBER ] [ realms REALM ] [ src ADDRESS ]\n");
@@ -388,6 +388,13 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 	} else if (r->rtm_src_len) {
 		fprintf(fp, "from 0/%u ", r->rtm_src_len);
 	}
+	if (tb[RTA_NEWDST]) {
+		fprintf(fp, "as to %s ", format_host(r->rtm_family,
+						  RTA_PAYLOAD(tb[RTA_NEWDST]),
+						  RTA_DATA(tb[RTA_NEWDST]),
+						  abuf, sizeof(abuf))
+			);
+	}
 	if (r->rtm_tos && filter.tosmask != -1) {
 		SPRINT_BUF(b1);
 		fprintf(fp, "tos %s ", rtnl_dsfield_n2a(r->rtm_tos, b1, sizeof(b1)));
@@ -800,6 +807,16 @@ static int iproute_modify(int cmd, unsigned flags, int argc, char **argv)
 			if (req.r.rtm_family == AF_UNSPEC)
 				req.r.rtm_family = addr.family;
 			addattr_l(&req.n, sizeof(req), RTA_PREFSRC, &addr.data, addr.bytelen);
+		} else if (strcmp(*argv, "as") == 0) {
+			inet_prefix addr;
+			NEXT_ARG();
+			if (strcmp(*argv, "to") == 0) {
+				NEXT_ARG();
+			}
+			get_addr(&addr, *argv, req.r.rtm_family);
+			if (req.r.rtm_family == AF_UNSPEC)
+				req.r.rtm_family = addr.family;
+			addattr_l(&req.n, sizeof(req), RTA_NEWDST, &addr.data, addr.bytelen);
 		} else if (strcmp(*argv, "via") == 0) {
 			inet_prefix addr;
 			int family;
diff --git a/man/man8/ip-route.8.in b/man/man8/ip-route.8.in
index 906cfea0cd6b..5112344971c0 100644
--- a/man/man8/ip-route.8.in
+++ b/man/man8/ip-route.8.in
@@ -98,6 +98,11 @@ replace " } "
 .IR NUMBER " ] [ "
 .B  advmss
 .IR NUMBER " ] [ "
+.B  as
+[
+.B to
+]
+.IR ADDRESS " ]"
 .B  rtt
 .IR TIME " ] [ "
 .B  rttvar
-- 
2.2.1


* [PATCH net-next 8/8] iproute2: Add basic mpls support to iproute
  2015-03-15 19:47       ` [PATCH net-next 0/8] iproute2: MPLS support (now with af_bit_len) Eric W. Biederman
                           ` (6 preceding siblings ...)
  2015-03-15 19:53         ` [PATCH net-next 7/8] iproute2: Add support for the RTA_NEWDST attribute Eric W. Biederman
@ 2015-03-15 19:53         ` Eric W. Biederman
  7 siblings, 0 replies; 29+ messages in thread
From: Eric W. Biederman @ 2015-03-15 19:53 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev


- Pull in the uapi mpls.h
- Update rtnetlink.h to include the mpls rtnetlink notification multicast group.
- Define AF_MPLS in utils.h if it is not defined from elsewhere
  as is done with AF_DECnet

The address syntax for multiple mpls labels is a complete invention.
When I looked there seemed to be no widespread convention for talking
about an mpls label stack in text form.  Sometimes people did:
"{ Label1, Label2, Label3 }", sometimes people would do:
"[ label3, label2, label1 ]", and most of the time label
stacks were not explicitly shown at all.

The syntax I wound up using, so it would not have spaces and so it
would be visually distinct from other kinds of addresses, is:

label1/label2/label3, where label1 is the label at the top of the label
stack and label3 is the label at the bottom of the label stack.

When there is a single label this matches what seems to be the
convention with other tools: just print out the numeric value of the
mpls label.

The netlink protocol for labels uses the on-the-wire format for a
label stack.  The ttl and traffic class are expected to be 0.  Using
the on-the-wire format is common and is what happens with other address
types.  BGP also uses this technique when passing label stacks, with
the exception that the ttl byte is not included, making each label in a
BGP label stack 3 bytes instead of 4.
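
A rough sketch of the text syntax and the wire encoding described
above.  This is not the mpls_pton/mpls_ntop code added by this patch;
the sketch_* names are hypothetical, the shift and mask values follow
the mpls.h layout added below, and setting the bottom-of-stack bit on
the last label is an assumption of the sketch.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_LABEL_SHIFT 12		/* label is the top 20 bits */
#define SKETCH_LABEL_MAX   0xFFFFFUL
#define SKETCH_S_BIT       0x00000100u	/* bottom of stack */

/* Parse "l1/l2/l3" into network-order entries; returns count or -1. */
static int sketch_mpls_pton(const char *src, uint32_t *out, int max)
{
	int n = 0;

	assert(src != NULL && out != NULL);
	while (n < max) {
		char *end;
		unsigned long label;
		uint32_t entry;

		if (*src < '0' || *src > '9')
			return -1;	/* empty or non-numeric label */
		label = strtoul(src, &end, 10);
		if (label > SKETCH_LABEL_MAX)
			return -1;	/* does not fit in 20 bits */
		entry = (uint32_t)label << SKETCH_LABEL_SHIFT;	/* tc, ttl = 0 */
		if (*end == '\0') {
			out[n++] = htonl(entry | SKETCH_S_BIT);
			return n;
		}
		if (*end != '/')
			return -1;
		out[n++] = htonl(entry);
		src = end + 1;
	}
	return -1;			/* too many labels */
}

/* Format count entries as "l1/l2/l3"; returns 0 or -1 if buf is too small. */
static int sketch_mpls_ntop(const uint32_t *in, int count, char *buf, size_t len)
{
	size_t used = 0;

	if (len == 0)
		return -1;
	buf[0] = '\0';
	for (int i = 0; i < count; i++) {
		unsigned int label = ntohl(in[i]) >> SKETCH_LABEL_SHIFT;
		int w = snprintf(buf + used, len - used, "%s%u",
				 i ? "/" : "", label);

		if (w < 0 || (size_t)w >= len - used)
			return -1;
		used += (size_t)w;
	}
	return 0;
}
```

Parsing "100/200/300" with this sketch yields three 4-byte entries with
100 at the top of the stack, and formatting those entries gives the
same string back.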

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 Makefile                  |  3 +++
 include/linux/mpls.h      | 34 +++++++++++++++++++++++++++
 include/linux/rtnetlink.h |  2 ++
 include/utils.h           | 10 ++++++++
 ip/ip.c                   |  4 +++-
 ip/ipmonitor.c            |  3 +++
 ip/iproute.c              |  2 +-
 lib/mpls_ntop.c           | 48 +++++++++++++++++++++++++++++++++++++++
 lib/mpls_pton.c           | 58 +++++++++++++++++++++++++++++++++++++++++++++++
 lib/utils.c               | 30 ++++++++++++++++++++++--
 man/man8/ip-route.8.in    |  2 +-
 man/man8/ip.8             |  7 +++++-
 12 files changed, 197 insertions(+), 6 deletions(-)
 create mode 100644 include/linux/mpls.h
 create mode 100644 lib/mpls_ntop.c
 create mode 100644 lib/mpls_pton.c

diff --git a/Makefile b/Makefile
index 9dbb29f3d0cd..ca6c2e141308 100644
--- a/Makefile
+++ b/Makefile
@@ -26,6 +26,9 @@ ADDLIB+=dnet_ntop.o dnet_pton.o
 #options for ipx
 ADDLIB+=ipx_ntop.o ipx_pton.o
 
+#options for mpls
+ADDLIB+=mpls_ntop.o mpls_pton.o
+
 CC = gcc
 HOSTCC = gcc
 DEFINES += -D_GNU_SOURCE
diff --git a/include/linux/mpls.h b/include/linux/mpls.h
new file mode 100644
index 000000000000..bc9abfe88c9a
--- /dev/null
+++ b/include/linux/mpls.h
@@ -0,0 +1,34 @@
+#ifndef _UAPI_MPLS_H
+#define _UAPI_MPLS_H
+
+#include <linux/types.h>
+#include <asm/byteorder.h>
+
+/* Reference: RFC 5462, RFC 3032
+ *
+ *  0                   1                   2                   3
+ *  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ * |                Label                  | TC  |S|       TTL     |
+ * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *
+ *	Label:  Label Value, 20 bits
+ *	TC:     Traffic Class field, 3 bits
+ *	S:      Bottom of Stack, 1 bit
+ *	TTL:    Time to Live, 8 bits
+ */
+
+struct mpls_label {
+	__be32 entry;
+};
+
+#define MPLS_LS_LABEL_MASK      0xFFFFF000
+#define MPLS_LS_LABEL_SHIFT     12
+#define MPLS_LS_TC_MASK         0x00000E00
+#define MPLS_LS_TC_SHIFT        9
+#define MPLS_LS_S_MASK          0x00000100
+#define MPLS_LS_S_SHIFT         8
+#define MPLS_LS_TTL_MASK        0x000000FF
+#define MPLS_LS_TTL_SHIFT       0
+
+#endif /* _UAPI_MPLS_H */
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 0d4100535bd7..2e0dc0f638ba 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -629,6 +629,8 @@ enum rtnetlink_groups {
 #define RTNLGRP_IPV6_NETCONF	RTNLGRP_IPV6_NETCONF
 	RTNLGRP_MDB,
 #define RTNLGRP_MDB		RTNLGRP_MDB
+	RTNLGRP_MPLS_ROUTE,
+#define RTNLGRP_MPLS_ROUTE	RTNLGRP_MPLS_ROUTE
 	__RTNLGRP_MAX
 };
 #define RTNLGRP_MAX	(__RTNLGRP_MAX - 1)
diff --git a/include/utils.h b/include/utils.h
index 6bbcc10756d3..0efdb6e02caf 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -78,6 +78,13 @@ struct ipx_addr {
 	u_int8_t  ipx_node[IPX_NODE_LEN];
 };
 
+#ifndef AF_MPLS
+# define AF_MPLS 28
+#endif
+
+/* Maximum number of labels the mpls helpers support */
+#define MPLS_MAX_LABELS 8
+
 extern __u32 get_addr32(const char *name);
 extern int get_addr_1(inet_prefix *dst, const char *arg, int family);
 extern int get_prefix_1(inet_prefix *dst, char *arg, int family);
@@ -126,6 +133,9 @@ int dnet_pton(int af, const char *src, void *addr);
 const char *ipx_ntop(int af, const void *addr, char *str, size_t len);
 int ipx_pton(int af, const char *src, void *addr);
 
+const char *mpls_ntop(int af, const void *addr, char *str, size_t len);
+int mpls_pton(int af, const char *src, void *addr);
+
 extern int __iproute2_hz_internal;
 extern int __get_hz(void);
 
diff --git a/ip/ip.c b/ip/ip.c
index 85256d8ea0c1..f7f214b2f5ab 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -52,7 +52,7 @@ static void usage(void)
 "                   netns | l2tp | fou | tcp_metrics | token | netconf }\n"
 "       OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] | -r[esolve] |\n"
 "                    -h[uman-readable] | -iec |\n"
-"                    -f[amily] { inet | inet6 | ipx | dnet | bridge | link } |\n"
+"                    -f[amily] { inet | inet6 | ipx | dnet | mpls | bridge | link } |\n"
 "                    -4 | -6 | -I | -D | -B | -0 |\n"
 "                    -l[oops] { maximum-addr-flush-attempts } |\n"
 "                    -o[neline] | -t[imestamp] | -ts[hort] | -b[atch] [filename] |\n"
@@ -206,6 +206,8 @@ int main(int argc, char **argv)
 			preferred_family = AF_IPX;
 		} else if (strcmp(opt, "-D") == 0) {
 			preferred_family = AF_DECnet;
+		} else if (strcmp(opt, "-M") == 0) {
+			preferred_family = AF_MPLS;
 		} else if (strcmp(opt, "-B") == 0) {
 			preferred_family = AF_BRIDGE;
 		} else if (matches(opt, "-human") == 0 ||
diff --git a/ip/ipmonitor.c b/ip/ipmonitor.c
index 6b5e66534551..7833a2632927 100644
--- a/ip/ipmonitor.c
+++ b/ip/ipmonitor.c
@@ -158,6 +158,7 @@ int do_ipmonitor(int argc, char **argv)
 	groups |= nl_mgrp(RTNLGRP_IPV6_IFADDR);
 	groups |= nl_mgrp(RTNLGRP_IPV4_ROUTE);
 	groups |= nl_mgrp(RTNLGRP_IPV6_ROUTE);
+	groups |= nl_mgrp(RTNLGRP_MPLS_ROUTE);
 	groups |= nl_mgrp(RTNLGRP_IPV4_MROUTE);
 	groups |= nl_mgrp(RTNLGRP_IPV6_MROUTE);
 	groups |= nl_mgrp(RTNLGRP_IPV6_PREFIX);
@@ -235,6 +236,8 @@ int do_ipmonitor(int argc, char **argv)
 			groups |= nl_mgrp(RTNLGRP_IPV4_ROUTE);
 		if (!preferred_family || preferred_family == AF_INET6)
 			groups |= nl_mgrp(RTNLGRP_IPV6_ROUTE);
+		if (!preferred_family || preferred_family == AF_MPLS)
+			groups |= nl_mgrp(RTNLGRP_MPLS_ROUTE);
 	}
 	if (lmroute) {
 		if (!preferred_family || preferred_family == AF_INET)
diff --git a/ip/iproute.c b/ip/iproute.c
index 2e691b8ef294..d49ebf0580bd 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -76,7 +76,7 @@ static void usage(void)
 	fprintf(stderr, "             [ scope SCOPE ] [ metric METRIC ]\n");
 	fprintf(stderr, "INFO_SPEC := NH OPTIONS FLAGS [ nexthop NH ]...\n");
 	fprintf(stderr, "NH := [ via [ FAMILY ] ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS\n");
-	fprintf(stderr, "FAMILY := [ inet | inet6 | ipx | dnet | bridge | link ]");
+	fprintf(stderr, "FAMILY := [ inet | inet6 | ipx | dnet | mpls | bridge | link ]\n");
 	fprintf(stderr, "OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ] [ as [ to ] ADDRESS ]\n");
 	fprintf(stderr, "           [ rtt TIME ] [ rttvar TIME ] [ reordering NUMBER ]\n");
 	fprintf(stderr, "           [ window NUMBER] [ cwnd NUMBER ] [ initcwnd NUMBER ]\n");
diff --git a/lib/mpls_ntop.c b/lib/mpls_ntop.c
new file mode 100644
index 000000000000..945d6d5e4535
--- /dev/null
+++ b/lib/mpls_ntop.c
@@ -0,0 +1,48 @@
+#include <errno.h>
+#include <string.h>
+#include <sys/types.h>
+#include <netinet/in.h>
+#include <linux/mpls.h>
+
+#include "utils.h"
+
+static const char *mpls_ntop1(const struct mpls_label *addr, char *buf, size_t buflen)
+{
+	size_t destlen = buflen;
+	char *dest = buf;
+	int count;
+
+	for (count = 0; count < MPLS_MAX_LABELS; count++) {
+		uint32_t entry = ntohl(addr[count].entry);
+		uint32_t label = (entry & MPLS_LS_LABEL_MASK) >> MPLS_LS_LABEL_SHIFT;
+		int len = snprintf(dest, destlen, "%u", label);
+
+		/* Is this the end? */
+		if (entry & MPLS_LS_S_MASK)
+			return buf;
+
+
+		dest += len;
+		destlen -= len;
+		if (destlen) {
+			*dest = '/';
+			dest++;
+			destlen--;
+		}
+	}
+	errno = E2BIG;
+	return NULL;
+}
+
+const char *mpls_ntop(int af, const void *addr, char *buf, size_t buflen)
+{
+	switch(af) {
+	case AF_MPLS:
+		errno = 0;
+		return mpls_ntop1((struct mpls_label *)addr, buf, buflen);
+	default:
+		errno = EAFNOSUPPORT;
+	}
+
+	return NULL;
+}
diff --git a/lib/mpls_pton.c b/lib/mpls_pton.c
new file mode 100644
index 000000000000..bd448cfcf14a
--- /dev/null
+++ b/lib/mpls_pton.c
@@ -0,0 +1,58 @@
+#include <errno.h>
+#include <string.h>
+#include <sys/types.h>
+#include <netinet/in.h>
+#include <linux/mpls.h>
+
+#include "utils.h"
+
+
+static int mpls_pton1(const char *name, struct mpls_label *addr)
+{
+	char *endp;
+	unsigned count;
+
+	for (count = 0; count < MPLS_MAX_LABELS; count++) {
+		unsigned long label;
+
+		label = strtoul(name, &endp, 0);
+		/* Fail when the label value is out of range */
+		if (label >= (1 << 20))
+			return 0;
+
+		if (endp == name) /* no digits */
+			return 0;
+
+		addr->entry = htonl(label << MPLS_LS_LABEL_SHIFT);
+		if (*endp == '\0') {
+			addr->entry |= htonl(1 << MPLS_LS_S_SHIFT);
+			return 1;
+		}
+
+		/* Bad character in the address */
+		if (*endp != '/')
+			return 0;
+
+		name = endp + 1;
+		addr += 1;
+	}
+	/* The address was too long */
+	return 0;
+}
+
+int mpls_pton(int af, const char *src, void *addr)
+{
+	int err;
+
+	switch(af) {
+	case AF_MPLS:
+		errno = 0;
+		err = mpls_pton1(src, (struct mpls_label *)addr);
+		break;
+	default:
+		errno = EAFNOSUPPORT;
+		err = -1;
+	}
+
+	return err;
+}
diff --git a/lib/utils.c b/lib/utils.c
index e5c49896fa01..b9c33bcfb223 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -26,6 +26,7 @@
 #include <linux/pkt_sched.h>
 #include <linux/param.h>
 #include <linux/if_arp.h>
+#include <linux/mpls.h>
 #include <time.h>
 #include <sys/time.h>
 #include <errno.h>
@@ -390,7 +391,7 @@ int get_addr_1(inet_prefix *addr, const char *name, int family)
 	if (strcmp(name, "default") == 0 ||
 	    strcmp(name, "all") == 0 ||
 	    strcmp(name, "any") == 0) {
-		if (family == AF_DECnet)
+		if ((family == AF_DECnet) || (family == AF_MPLS))
 			return -1;
 		addr->family = family;
 		addr->bytelen = (family == AF_INET6 ? 16 : 4);
@@ -432,6 +433,23 @@ int get_addr_1(inet_prefix *addr, const char *name, int family)
 		return 0;
 	}
 
+	if (family == AF_MPLS) {
+		int i;
+		addr->family = AF_MPLS;
+		if (mpls_pton(AF_MPLS, name, addr->data) <= 0)
+			return -1;
+		addr->bytelen = 4;
+		addr->bitlen = 20;
+		/* How many bytes do I need? */
+		for (i = 0; i < MPLS_MAX_LABELS; i++) {
+			if (ntohl(addr->data[i]) & MPLS_LS_S_MASK) {
+				addr->bytelen = (i + 1)*4;
+				break;
+			}
+		}
+		return 0;
+	}
+
 	addr->family = AF_INET;
 	if (family != AF_UNSPEC && family != AF_INET)
 		return -1;
@@ -455,6 +473,8 @@ int af_bit_len(int af)
 		return 16;
 	case AF_IPX:
 		return 80;
+	case AF_MPLS:
+		return 20;
 	}
 
 	return 0;
@@ -476,7 +496,7 @@ int get_prefix_1(inet_prefix *dst, char *arg, int family)
 	if (strcmp(arg, "default") == 0 ||
 	    strcmp(arg, "any") == 0 ||
 	    strcmp(arg, "all") == 0) {
-		if (family == AF_DECnet)
+		if ((family == AF_DECnet) || (family == AF_MPLS))
 			return -1;
 		dst->family = family;
 		dst->bytelen = 0;
@@ -651,6 +671,8 @@ const char *rt_addr_n2a(int af, int len, const void *addr, char *buf, int buflen
 	case AF_INET:
 	case AF_INET6:
 		return inet_ntop(af, addr, buf, buflen);
+	case AF_MPLS:
+		return mpls_ntop(af, addr, buf, buflen);
 	case AF_IPX:
 		return ipx_ntop(af, addr, buf, buflen);
 	case AF_DECnet:
@@ -679,6 +701,8 @@ int read_family(const char *name)
 		family = AF_PACKET;
 	else if (strcmp(name, "ipx") == 0)
 		family = AF_IPX;
+	else if (strcmp(name, "mpls") == 0)
+		family = AF_MPLS;
 	else if (strcmp(name, "bridge") == 0)
 		family = AF_BRIDGE;
 	return family;
@@ -696,6 +720,8 @@ const char *family_name(int family)
 		return "link";
 	if (family == AF_IPX)
 		return "ipx";
+	if (family == AF_MPLS)
+		return "mpls";
 	if (family == AF_BRIDGE)
 		return "bridge";
 	return "???";
diff --git a/man/man8/ip-route.8.in b/man/man8/ip-route.8.in
index 5112344971c0..1163536d0e9c 100644
--- a/man/man8/ip-route.8.in
+++ b/man/man8/ip-route.8.in
@@ -90,7 +90,7 @@ replace " } "
 
 .ti -8
 .IR FAMILY " := [ "
-.BR inet " | " inet6 " | " ipx " | " dnet " | " bridge " | " link " ]"
+.BR inet " | " inet6 " | " ipx " | " dnet " | " mpls " | " bridge " | " link " ]"
 
 .ti -8
 .IR OPTIONS " := " FLAGS " [ "
diff --git a/man/man8/ip.8 b/man/man8/ip.8
index 016e8c660cd0..1755473ee32a 100644
--- a/man/man8/ip.8
+++ b/man/man8/ip.8
@@ -73,7 +73,7 @@ Zero (0) means loop until all addresses are removed.
 .TP
 .BR "\-f" , " \-family " <FAMILY>
 Specifies the protocol family to use. The protocol family identifier can be one of
-.BR "inet" , " inet6" , " bridge" , " ipx" , " dnet"
+.BR "inet" , " inet6" , " bridge" , " ipx" , " dnet" , " mpls"
 or
 .BR link .
 If this option is not present,
@@ -115,6 +115,11 @@ shortcut for
 .BR "\-family ipx" .
 
 .TP
+.B \-M
+shortcut for
+.BR "\-family mpls" .
+
+.TP
 .B \-0
 shortcut for
 .BR "\-family link" .
-- 
2.2.1
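
For readers following the label/label/label syntax, here is a minimal standalone sketch of how a label stack could be parsed and printed with the mask and shift layout from mpls.h above. It is not the code from this patch: the names stack_pton and stack_ntop and the local LS_* macros are illustrative assumptions, and unlike the patch it checks the snprintf result before advancing the output pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <arpa/inet.h>

/* Local copies of the label stack entry layout (assumed to match mpls.h). */
#define LS_LABEL_MASK  0xFFFFF000u
#define LS_LABEL_SHIFT 12
#define LS_S_MASK      0x00000100u
#define MAX_LABELS     8

/*
 * Parse "label/label/..." into network byte order entries.
 * Returns the number of labels parsed, or -1 on malformed input.
 */
static int stack_pton(const char *s, uint32_t *stack)
{
	int n;

	for (n = 0; n < MAX_LABELS; n++) {
		char *end;
		unsigned long label;
		uint32_t entry;

		errno = 0;
		label = strtoul(s, &end, 0);
		/* Reject empty fields, overflow, and values above 20 bits. */
		if (end == s || errno || label >= (1UL << 20))
			return -1;

		entry = (uint32_t)label << LS_LABEL_SHIFT;
		if (*end == '\0') {
			/* Last label: set the bottom-of-stack bit. */
			stack[n] = htonl(entry | LS_S_MASK);
			return n + 1;
		}
		if (*end != '/')
			return -1;
		stack[n] = htonl(entry);
		s = end + 1;
	}
	return -1;	/* more than MAX_LABELS labels */
}

/*
 * Print a label stack as "label/label/...".
 * Returns buf, or NULL with errno = E2BIG if it does not fit or
 * no bottom-of-stack bit is found within MAX_LABELS entries.
 */
static const char *stack_ntop(const uint32_t *stack, char *buf, size_t buflen)
{
	size_t off = 0;
	int n;

	assert(stack != NULL && buf != NULL);
	for (n = 0; n < MAX_LABELS; n++) {
		uint32_t entry = ntohl(stack[n]);
		uint32_t label = (entry & LS_LABEL_MASK) >> LS_LABEL_SHIFT;
		int bos = (entry & LS_S_MASK) != 0;
		int len = snprintf(buf + off, buflen - off, "%u%s",
				   label, bos ? "" : "/");

		/* Stop on error or truncation instead of advancing blindly. */
		if (len < 0 || (size_t)len >= buflen - off) {
			errno = E2BIG;
			return NULL;
		}
		off += (size_t)len;
		if (bos)
			return buf;
	}
	errno = E2BIG;
	return NULL;
}
```

With this sketch, parsing "100/200/300" yields three entries with the bottom-of-stack bit set only on the last one, and printing that stack gives the same string back.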

* Re: [PATCH net-next] iproute2: MPLS support
  2015-03-13 18:50 [PATCH net-next] iproute2: MPLS support Eric W. Biederman
                   ` (8 preceding siblings ...)
       [not found] ` <c3ad7d77783046d38e5b23b5e1fe0f71@BRMWP-EXMB11.corp.brocade.com>
@ 2015-03-24 22:36 ` Stephen Hemminger
  9 siblings, 0 replies; 29+ messages in thread
From: Stephen Hemminger @ 2015-03-24 22:36 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev

On Fri, 13 Mar 2015 13:50:11 -0500
ebiederm@xmission.com (Eric W. Biederman) wrote:

> This set of changes adds support for nexthops in different
> address families, with the new netlink RTA_VIA option.
> 
> Support is added for routes that change the destination address (as MPLS
> does) with the RTA_NEWDST attribute.
> 
> Support for MPLS addresses is added (for multiple labels I
> had to make up the syntax; I used label/label/label as it fits in
> well with addresses that don't have a space in them).
> 
> Support for these options is merged into David's net-next kernel
> tree, and the meaning of the options is unlikely to change in any
> significant way before this code merges upstream.
> 
> The documentation has been updated to report that the new options
> are present and to report roughly what they do.  This includes
> ip --help and the man pages.
> 
> Eric W. Biederman (8):
>       iproute2: Add a source addres length parameter to rt_addr_n2a
>       iproute2: Make the addr argument of ll_addr_n2a const
>       iproute2: Add support for printing AF_PACKET addresses
>       iproute2: Add address family to/from string helper functions.
>       iproute2: misc whitespace cleanup
>       iproute2: Add support for  RTA_VIA attributes
>       iproute2: Add support for the RTA_NEWDST attribute.
>       iproute2: Add basic mpls support to iproute

Applied to the net-next branch of iproute.
The only change was to use the version of mpls.h produced by the
kernel's make headers_install.

* Re: [PATCH net-next 6/8] iproute2: Add support for the RTA_VIA attribute
  2015-03-15 19:52         ` [PATCH net-next 6/8] iproute2: Add support for the RTA_VIA attribute Eric W. Biederman
@ 2015-04-06 23:04           ` roopa
  2015-04-06 23:27             ` Andy Gospodarek
  0 siblings, 1 reply; 29+ messages in thread
From: roopa @ 2015-04-06 23:04 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Stephen Hemminger, netdev, Vivek Venkatraman, Andy Gospodarek, rshearma

On 3/15/15, 12:52 PM, Eric W. Biederman wrote:
> Add support for the RTA_VIA attribute that specifies an address family
> as well as an address for the next hop gateway.
>
> To make this easy to pass, reorder inet_prefix so that its tail
> is a proper RTA_VIA attribute.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>   include/linux/rtnetlink.h |  7 +++++
>   include/utils.h           |  7 +++--
>   ip/iproute.c              | 76 +++++++++++++++++++++++++++++++++++++++++------
>   man/man8/ip-route.8.in    | 18 +++++++----
>   4 files changed, 90 insertions(+), 18 deletions(-)
>
> diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
> index 3eb78105399b..03e4c8df8e60 100644
> --- a/include/linux/rtnetlink.h
> +++ b/include/linux/rtnetlink.h
> @@ -303,6 +303,7 @@ enum rtattr_type_t {
>   	RTA_TABLE,
>   	RTA_MARK,
>   	RTA_MFC_STATS,
> +	RTA_VIA,

Eric, if it's not too late, what do you think about renaming the RTA_VIA
attribute to RTA_NEWGATEWAY (similar to your new RTA_NEWDST attribute,
which specifies a label dst)? RTA_VIA is fine too.
This is indeed a new way to specify a gateway (and can/will be used by
RFC 5549 in the future).

If there is interest in renaming it to RTA_NEWGATEWAY (or any other
name; I can't think of anything better right now),
I will be happy to submit a follow-on patch.

Thanks!

>   	__RTA_MAX
>   };
>   
> @@ -344,6 +345,12 @@ struct rtnexthop {
>   #define RTNH_SPACE(len)	RTNH_ALIGN(RTNH_LENGTH(len))
>   #define RTNH_DATA(rtnh)   ((struct rtattr*)(((char*)(rtnh)) + RTNH_LENGTH(0)))
>   
> +/* RTA_VIA */
> +struct rtvia {
> +	__kernel_sa_family_t	rtvia_family;
> +	__u8			rtvia_addr[0];
> +};
> +
>   /* RTM_CACHEINFO */
>   
>   struct rta_cacheinfo {
> diff --git a/include/utils.h b/include/utils.h
> index 99dde4c5a667..6bbcc10756d3 100644
> --- a/include/utils.h
> +++ b/include/utils.h
> @@ -50,10 +50,11 @@ extern void incomplete_command(void) __attribute__((noreturn));
>   
>   typedef struct
>   {
> -	__u8 family;
> -	__u8 bytelen;
> +	__u16 flags;
> +	__u16 bytelen;
>   	__s16 bitlen;
> -	__u32 flags;
> +	/* These next two fields match rtvia */
> +	__u16 family;
>   	__u32 data[8];
>   } inet_prefix;
>   
> diff --git a/ip/iproute.c b/ip/iproute.c
> index 79d0760a34f6..c6ee411fdd56 100644
> --- a/ip/iproute.c
> +++ b/ip/iproute.c
> @@ -75,7 +75,8 @@ static void usage(void)
>   	fprintf(stderr, "             [ table TABLE_ID ] [ proto RTPROTO ]\n");
>   	fprintf(stderr, "             [ scope SCOPE ] [ metric METRIC ]\n");
>   	fprintf(stderr, "INFO_SPEC := NH OPTIONS FLAGS [ nexthop NH ]...\n");
> -	fprintf(stderr, "NH := [ via ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS\n");
> +	fprintf(stderr, "NH := [ via [ FAMILY ] ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS\n");
> +	fprintf(stderr, "FAMILY := [ inet | inet6 | ipx | dnet | bridge | link ]");
>   	fprintf(stderr, "OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ]\n");
>   	fprintf(stderr, "           [ rtt TIME ] [ rttvar TIME ] [ reordering NUMBER ]\n");
>   	fprintf(stderr, "           [ window NUMBER] [ cwnd NUMBER ] [ initcwnd NUMBER ]\n");
> @@ -185,8 +186,15 @@ static int filter_nlmsg(struct nlmsghdr *n, struct rtattr **tb, int host_len)
>   	    (r->rtm_family != filter.msrc.family ||
>   	     (filter.msrc.bitlen >= 0 && filter.msrc.bitlen < r->rtm_src_len)))
>   		return 0;
> -	if (filter.rvia.family && r->rtm_family != filter.rvia.family)
> -		return 0;
> +	if (filter.rvia.family) {
> +		int family = r->rtm_family;
> +		if (tb[RTA_VIA]) {
> +			struct rtvia *via = RTA_DATA(tb[RTA_VIA]);
> +			family = via->rtvia_family;
> +		}
> +		if (family != filter.rvia.family)
> +			return 0;
> +	}
>   	if (filter.rprefsrc.family && r->rtm_family != filter.rprefsrc.family)
>   		return 0;
>   
> @@ -205,6 +213,12 @@ static int filter_nlmsg(struct nlmsghdr *n, struct rtattr **tb, int host_len)
>   		via.family = r->rtm_family;
>   		if (tb[RTA_GATEWAY])
>   			memcpy(&via.data, RTA_DATA(tb[RTA_GATEWAY]), host_len/8);
> +		if (tb[RTA_VIA]) {
> +			size_t len = RTA_PAYLOAD(tb[RTA_VIA]) - 2;
> +			struct rtvia *rtvia = RTA_DATA(tb[RTA_VIA]);
> +			via.family = rtvia->rtvia_family;
> +			memcpy(&via.data, rtvia->rtvia_addr, len);
> +		}
>   	}
>   	if (filter.rprefsrc.bitlen>0) {
>   		memset(&prefsrc, 0, sizeof(prefsrc));
> @@ -386,6 +400,14 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
>   				    RTA_DATA(tb[RTA_GATEWAY]),
>   				    abuf, sizeof(abuf)));
>   	}
> +	if (tb[RTA_VIA]) {
> +		size_t len = RTA_PAYLOAD(tb[RTA_VIA]) - 2;
> +		struct rtvia *via = RTA_DATA(tb[RTA_VIA]);
> +		fprintf(fp, "via %s %s ",
> +			family_name(via->rtvia_family),
> +			format_host(via->rtvia_family, len, via->rtvia_addr,
> +				    abuf, sizeof(abuf)));
> +	}
>   	if (tb[RTA_OIF] && filter.oifmask != -1)
>   		fprintf(fp, "dev %s ", ll_index_to_name(*(int*)RTA_DATA(tb[RTA_OIF])));
>   
> @@ -601,6 +623,14 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
>   							    RTA_DATA(tb[RTA_GATEWAY]),
>   							    abuf, sizeof(abuf)));
>   				}
> +				if (tb[RTA_VIA]) {
> +					size_t len = RTA_PAYLOAD(tb[RTA_VIA]) - 2;
> +					struct rtvia *via = RTA_DATA(tb[RTA_VIA]);
> +					fprintf(fp, "via %s %s ",
> +						family_name(via->rtvia_family),
> +						format_host(via->rtvia_family, len, via->rtvia_addr,
> +							    abuf, sizeof(abuf)));
> +				}
>   				if (tb[RTA_FLOW]) {
>   					__u32 to = rta_getattr_u32(tb[RTA_FLOW]);
>   					__u32 from = to>>16;
> @@ -648,12 +678,23 @@ static int parse_one_nh(struct rtmsg *r, struct rtattr *rta,
>   	while (++argv, --argc > 0) {
>   		if (strcmp(*argv, "via") == 0) {
>   			inet_prefix addr;
> +			int family;
>   			NEXT_ARG();
> -			get_addr(&addr, *argv, r->rtm_family);
> +			family = read_family(*argv);
> +			if (family == AF_UNSPEC)
> +				family = r->rtm_family;
> +			else
> +				NEXT_ARG();
> +			get_addr(&addr, *argv, family);
>   			if (r->rtm_family == AF_UNSPEC)
>   				r->rtm_family = addr.family;
> -			rta_addattr_l(rta, 4096, RTA_GATEWAY, &addr.data, addr.bytelen);
> -			rtnh->rtnh_len += sizeof(struct rtattr) + addr.bytelen;
> +			if (addr.family == r->rtm_family) {
> +				rta_addattr_l(rta, 4096, RTA_GATEWAY, &addr.data, addr.bytelen);
> +				rtnh->rtnh_len += sizeof(struct rtattr) + addr.bytelen;
> +			} else {
> +				rta_addattr_l(rta, 4096, RTA_VIA, &addr.family, addr.bytelen+2);
> +				rtnh->rtnh_len += sizeof(struct rtattr) + addr.bytelen+2;
> +			}
>   		} else if (strcmp(*argv, "dev") == 0) {
>   			NEXT_ARG();
>   			if ((rtnh->rtnh_ifindex = ll_name_to_index(*argv)) == 0) {
> @@ -761,12 +802,21 @@ static int iproute_modify(int cmd, unsigned flags, int argc, char **argv)
>   			addattr_l(&req.n, sizeof(req), RTA_PREFSRC, &addr.data, addr.bytelen);
>   		} else if (strcmp(*argv, "via") == 0) {
>   			inet_prefix addr;
> +			int family;
>   			gw_ok = 1;
>   			NEXT_ARG();
> -			get_addr(&addr, *argv, req.r.rtm_family);
> +			family = read_family(*argv);
> +			if (family == AF_UNSPEC)
> +				family = req.r.rtm_family;
> +			else
> +				NEXT_ARG();
> +			get_addr(&addr, *argv, family);
>   			if (req.r.rtm_family == AF_UNSPEC)
>   				req.r.rtm_family = addr.family;
> -			addattr_l(&req.n, sizeof(req), RTA_GATEWAY, &addr.data, addr.bytelen);
> +			if (addr.family == req.r.rtm_family)
> +				addattr_l(&req.n, sizeof(req), RTA_GATEWAY, &addr.data, addr.bytelen);
> +			else
> +				addattr_l(&req.n, sizeof(req), RTA_VIA, &addr.family, addr.bytelen+2);
>   		} else if (strcmp(*argv, "from") == 0) {
>   			inet_prefix addr;
>   			NEXT_ARG();
> @@ -1251,8 +1301,14 @@ static int iproute_list_flush_or_save(int argc, char **argv, int action)
>   			get_unsigned(&mark, *argv, 0);
>   			filter.markmask = -1;
>   		} else if (strcmp(*argv, "via") == 0) {
> +			int family;
>   			NEXT_ARG();
> -			get_prefix(&filter.rvia, *argv, do_ipv6);
> +			family = read_family(*argv);
> +			if (family == AF_UNSPEC)
> +				family = do_ipv6;
> +			else
> +				NEXT_ARG();
> +			get_prefix(&filter.rvia, *argv, family);
>   		} else if (strcmp(*argv, "src") == 0) {
>   			NEXT_ARG();
>   			get_prefix(&filter.rprefsrc, *argv, do_ipv6);
> @@ -1554,6 +1610,8 @@ static int iproute_get(int argc, char **argv)
>   			tb[RTA_OIF]->rta_type = 0;
>   		if (tb[RTA_GATEWAY])
>   			tb[RTA_GATEWAY]->rta_type = 0;
> +		if (tb[RTA_VIA])
> +			tb[RTA_VIA]->rta_type = 0;
>   		if (!idev && tb[RTA_IIF])
>   			tb[RTA_IIF]->rta_type = 0;
>   		req.n.nlmsg_flags = NLM_F_REQUEST;
> diff --git a/man/man8/ip-route.8.in b/man/man8/ip-route.8.in
> index 2b1583d5a30c..906cfea0cd6b 100644
> --- a/man/man8/ip-route.8.in
> +++ b/man/man8/ip-route.8.in
> @@ -81,13 +81,18 @@ replace " } "
>   .ti -8
>   .IR NH " := [ "
>   .B  via
> -.IR ADDRESS " ] [ "
> +[
> +.IR FAMILY " ] " ADDRESS " ] [ "
>   .B  dev
>   .IR STRING " ] [ "
>   .B  weight
>   .IR NUMBER " ] " NHFLAGS
>   
>   .ti -8
> +.IR FAMILY " := [ "
> +.BR inet " | " inet6 " | " ipx " | " dnet " | " bridge " | " link " ]"
> +
> +.ti -8
>   .IR OPTIONS " := " FLAGS " [ "
>   .B  mtu
>   .IR NUMBER " ] [ "
> @@ -333,9 +338,10 @@ table by default.
>   the output device name.
>   
>   .TP
> -.BI via " ADDRESS"
> -the address of the nexthop router.  Actually, the sense of this field
> -depends on the route type.  For normal
> +.BI via " [ FAMILY ] ADDRESS"
> +the address of the nexthop router, in the address family FAMILY.
> +Actually, the sense of this field depends on the route type.  For
> +normal
>   .B unicast
>   routes it is either the true next hop router or, if it is a direct
>   route installed in BSD compatibility mode, it can be a local address
> @@ -472,7 +478,7 @@ is a complex value with its own syntax similar to the top level
>   argument lists:
>   
>   .in +8
> -.BI via " ADDRESS"
> +.BI via " [ FAMILY ] ADDRESS"
>   - is the nexthop router.
>   .sp
>   
> @@ -669,7 +675,7 @@ only list routes of this type.
>   only list routes going via this device.
>   
>   .TP
> -.BI via " PREFIX"
> +.BI via " [ FAMILY ] PREFIX"
>   only list routes going via the nexthop routers selected by
>   .IR PREFIX "."
>   

* Re: [PATCH net-next 6/8] iproute2: Add support for the RTA_VIA attribute
  2015-04-06 23:04           ` roopa
@ 2015-04-06 23:27             ` Andy Gospodarek
  2015-04-07 14:55               ` roopa
  0 siblings, 1 reply; 29+ messages in thread
From: Andy Gospodarek @ 2015-04-06 23:27 UTC (permalink / raw)
  To: roopa
  Cc: Eric W. Biederman, Stephen Hemminger, netdev, Vivek Venkatraman,
	rshearma

On Mon, Apr 06, 2015 at 04:04:06PM -0700, roopa wrote:
> On 3/15/15, 12:52 PM, Eric W. Biederman wrote:
> >Add support for the RTA_VIA attribute that specifies an address family
> >as well as an address for the next hop gateway.
> >
> >To make this easy to pass, reorder inet_prefix so that its tail
> >is a proper RTA_VIA attribute.
> >
> >Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> >---
> >  include/linux/rtnetlink.h |  7 +++++
> >  include/utils.h           |  7 +++--
> >  ip/iproute.c              | 76 +++++++++++++++++++++++++++++++++++++++++------
> >  man/man8/ip-route.8.in    | 18 +++++++----
> >  4 files changed, 90 insertions(+), 18 deletions(-)
> >
> >diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
> >index 3eb78105399b..03e4c8df8e60 100644
> >--- a/include/linux/rtnetlink.h
> >+++ b/include/linux/rtnetlink.h
> >@@ -303,6 +303,7 @@ enum rtattr_type_t {
> >  	RTA_TABLE,
> >  	RTA_MARK,
> >  	RTA_MFC_STATS,
> >+	RTA_VIA,
> 
> Eric, if it's not too late, what do you think about renaming the RTA_VIA
> attribute to RTA_NEWGATEWAY (similar to your new RTA_NEWDST attribute,
> which specifies a label dst)? RTA_VIA is fine too.
> This is indeed a new way to specify a gateway (and can/will be used by RFC
> 5549 in the future).
> 
> If there is interest in renaming it to RTA_NEWGATEWAY (or any other name;
> I can't think of anything better right now),
> I will be happy to submit a follow-on patch.

FWIW, I actually do not mind the name RTA_VIA.  I was planning to
replace use of RTA_GATEWAY in iproute2 and just use RTA_VIA for all
nexthops regardless of the address family of the dest route or nexthop.
That would allow easy creation of the infrastructure needed to support
RFC5549 -- obviously while keeping backwards compatibility in the
kernel.

This was what my original set did (not submitted to netdev, but discussed
with others at netconf) and it was much cleaner code-wise (but not ideal
as I overloaded the use of RTA_GATEWAY and that was not pleasing to me
or others).
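
To make the two encodings under discussion concrete, here is a small standalone sketch of how the quoted patch picks between them: a bare address when the nexthop family matches the route family, and a 16-bit family followed by the address bytes otherwise. The names encode_nexthop, decode_via, and the nh_attr enum are hypothetical and stand in for the real RTA_GATEWAY/RTA_VIA netlink plumbing; the length check in decode_via is an addition that the quoted code does not have.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Stand-ins for the two netlink attribute types (hypothetical names). */
enum nh_attr { NH_GATEWAY, NH_VIA };

/*
 * Fill 'out' with the attribute payload for a nexthop address.
 * Same family as the route: bare address (RTA_GATEWAY style).
 * Different family: 16-bit family in host byte order, then the
 * address bytes (RTA_VIA style, mirroring struct rtvia).
 * Returns the payload length, or -1 if 'out' is too small.
 */
static int encode_nexthop(int route_family, int nh_family,
			  const void *addr, size_t addrlen,
			  uint8_t *out, size_t outlen, enum nh_attr *kind)
{
	uint16_t fam = (uint16_t)nh_family;

	assert(addr != NULL && out != NULL && kind != NULL);
	if (nh_family == route_family) {
		if (addrlen > outlen)
			return -1;
		memcpy(out, addr, addrlen);
		*kind = NH_GATEWAY;
		return (int)addrlen;
	}
	if (addrlen + sizeof(fam) > outlen)
		return -1;
	memcpy(out, &fam, sizeof(fam));
	memcpy(out + sizeof(fam), addr, addrlen);
	*kind = NH_VIA;
	return (int)(addrlen + sizeof(fam));
}

/*
 * Split an RTA_VIA style payload into family and address.
 * The address length is the payload length minus the 2-byte family,
 * as in the "RTA_PAYLOAD(...) - 2" expressions in the patch.
 * Returns 0 on success, -1 if the payload is shorter than the family.
 */
static int decode_via(const uint8_t *payload, size_t len,
		      int *family, const uint8_t **addr, size_t *addrlen)
{
	uint16_t fam;

	if (len < sizeof(fam))
		return -1;
	memcpy(&fam, payload, sizeof(fam));
	*family = fam;
	*addr = payload + sizeof(fam);
	*addrlen = len - sizeof(fam);
	return 0;
}
```

Using RTA_VIA for every nexthop, as described above, would correspond to always taking the second branch of encode_nexthop, at the cost of two extra bytes per same-family nexthop.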


> 
> Thanks!.
> 
> >  	__RTA_MAX
> >  };
> >@@ -344,6 +345,12 @@ struct rtnexthop {
> >  #define RTNH_SPACE(len)	RTNH_ALIGN(RTNH_LENGTH(len))
> >  #define RTNH_DATA(rtnh)   ((struct rtattr*)(((char*)(rtnh)) + RTNH_LENGTH(0)))
> >+/* RTA_VIA */
> >+struct rtvia {
> >+	__kernel_sa_family_t	rtvia_family;
> >+	__u8			rtvia_addr[0];
> >+};
> >+
> >  /* RTM_CACHEINFO */
> >  struct rta_cacheinfo {
> >diff --git a/include/utils.h b/include/utils.h
> >index 99dde4c5a667..6bbcc10756d3 100644
> >--- a/include/utils.h
> >+++ b/include/utils.h
> >@@ -50,10 +50,11 @@ extern void incomplete_command(void) __attribute__((noreturn));
> >  typedef struct
> >  {
> >-	__u8 family;
> >-	__u8 bytelen;
> >+	__u16 flags;
> >+	__u16 bytelen;
> >  	__s16 bitlen;
> >-	__u32 flags;
> >+	/* These next two fields match rtvia */
> >+	__u16 family;
> >  	__u32 data[8];
> >  } inet_prefix;
> >diff --git a/ip/iproute.c b/ip/iproute.c
> >index 79d0760a34f6..c6ee411fdd56 100644
> >--- a/ip/iproute.c
> >+++ b/ip/iproute.c
> >@@ -75,7 +75,8 @@ static void usage(void)
> >  	fprintf(stderr, "             [ table TABLE_ID ] [ proto RTPROTO ]\n");
> >  	fprintf(stderr, "             [ scope SCOPE ] [ metric METRIC ]\n");
> >  	fprintf(stderr, "INFO_SPEC := NH OPTIONS FLAGS [ nexthop NH ]...\n");
> >-	fprintf(stderr, "NH := [ via ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS\n");
> >+	fprintf(stderr, "NH := [ via [ FAMILY ] ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS\n");
> >+	fprintf(stderr, "FAMILY := [ inet | inet6 | ipx | dnet | bridge | link ]");
> >  	fprintf(stderr, "OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ]\n");
> >  	fprintf(stderr, "           [ rtt TIME ] [ rttvar TIME ] [ reordering NUMBER ]\n");
> >  	fprintf(stderr, "           [ window NUMBER] [ cwnd NUMBER ] [ initcwnd NUMBER ]\n");
> >@@ -185,8 +186,15 @@ static int filter_nlmsg(struct nlmsghdr *n, struct rtattr **tb, int host_len)
> >  	    (r->rtm_family != filter.msrc.family ||
> >  	     (filter.msrc.bitlen >= 0 && filter.msrc.bitlen < r->rtm_src_len)))
> >  		return 0;
> >-	if (filter.rvia.family && r->rtm_family != filter.rvia.family)
> >-		return 0;
> >+	if (filter.rvia.family) {
> >+		int family = r->rtm_family;
> >+		if (tb[RTA_VIA]) {
> >+			struct rtvia *via = RTA_DATA(tb[RTA_VIA]);
> >+			family = via->rtvia_family;
> >+		}
> >+		if (family != filter.rvia.family)
> >+			return 0;
> >+	}
> >  	if (filter.rprefsrc.family && r->rtm_family != filter.rprefsrc.family)
> >  		return 0;
> >@@ -205,6 +213,12 @@ static int filter_nlmsg(struct nlmsghdr *n, struct rtattr **tb, int host_len)
> >  		via.family = r->rtm_family;
> >  		if (tb[RTA_GATEWAY])
> >  			memcpy(&via.data, RTA_DATA(tb[RTA_GATEWAY]), host_len/8);
> >+		if (tb[RTA_VIA]) {
> >+			size_t len = RTA_PAYLOAD(tb[RTA_VIA]) - 2;
> >+			struct rtvia *rtvia = RTA_DATA(tb[RTA_VIA]);
> >+			via.family = rtvia->rtvia_family;
> >+			memcpy(&via.data, rtvia->rtvia_addr, len);
> >+		}
> >  	}
> >  	if (filter.rprefsrc.bitlen>0) {
> >  		memset(&prefsrc, 0, sizeof(prefsrc));
> >@@ -386,6 +400,14 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
> >  				    RTA_DATA(tb[RTA_GATEWAY]),
> >  				    abuf, sizeof(abuf)));
> >  	}
> >+	if (tb[RTA_VIA]) {
> >+		size_t len = RTA_PAYLOAD(tb[RTA_VIA]) - 2;
> >+		struct rtvia *via = RTA_DATA(tb[RTA_VIA]);
> >+		fprintf(fp, "via %s %s ",
> >+			family_name(via->rtvia_family),
> >+			format_host(via->rtvia_family, len, via->rtvia_addr,
> >+				    abuf, sizeof(abuf)));
> >+	}
> >  	if (tb[RTA_OIF] && filter.oifmask != -1)
> >  		fprintf(fp, "dev %s ", ll_index_to_name(*(int*)RTA_DATA(tb[RTA_OIF])));
> >@@ -601,6 +623,14 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
> >  							    RTA_DATA(tb[RTA_GATEWAY]),
> >  							    abuf, sizeof(abuf)));
> >  				}
> >+				if (tb[RTA_VIA]) {
> >+					size_t len = RTA_PAYLOAD(tb[RTA_VIA]) - 2;
> >+					struct rtvia *via = RTA_DATA(tb[RTA_VIA]);
> >+					fprintf(fp, "via %s %s ",
> >+						family_name(via->rtvia_family),
> >+						format_host(via->rtvia_family, len, via->rtvia_addr,
> >+							    abuf, sizeof(abuf)));
> >+				}
> >  				if (tb[RTA_FLOW]) {
> >  					__u32 to = rta_getattr_u32(tb[RTA_FLOW]);
> >  					__u32 from = to>>16;
> >@@ -648,12 +678,23 @@ static int parse_one_nh(struct rtmsg *r, struct rtattr *rta,
> >  	while (++argv, --argc > 0) {
> >  		if (strcmp(*argv, "via") == 0) {
> >  			inet_prefix addr;
> >+			int family;
> >  			NEXT_ARG();
> >-			get_addr(&addr, *argv, r->rtm_family);
> >+			family = read_family(*argv);
> >+			if (family == AF_UNSPEC)
> >+				family = r->rtm_family;
> >+			else
> >+				NEXT_ARG();
> >+			get_addr(&addr, *argv, family);
> >  			if (r->rtm_family == AF_UNSPEC)
> >  				r->rtm_family = addr.family;
> >-			rta_addattr_l(rta, 4096, RTA_GATEWAY, &addr.data, addr.bytelen);
> >-			rtnh->rtnh_len += sizeof(struct rtattr) + addr.bytelen;
> >+			if (addr.family == r->rtm_family) {
> >+				rta_addattr_l(rta, 4096, RTA_GATEWAY, &addr.data, addr.bytelen);
> >+				rtnh->rtnh_len += sizeof(struct rtattr) + addr.bytelen;
> >+			} else {
> >+				rta_addattr_l(rta, 4096, RTA_VIA, &addr.family, addr.bytelen+2);
> >+				rtnh->rtnh_len += sizeof(struct rtattr) + addr.bytelen+2;
> >+			}
> >  		} else if (strcmp(*argv, "dev") == 0) {
> >  			NEXT_ARG();
> >  			if ((rtnh->rtnh_ifindex = ll_name_to_index(*argv)) == 0) {
> >@@ -761,12 +802,21 @@ static int iproute_modify(int cmd, unsigned flags, int argc, char **argv)
> >  			addattr_l(&req.n, sizeof(req), RTA_PREFSRC, &addr.data, addr.bytelen);
> >  		} else if (strcmp(*argv, "via") == 0) {
> >  			inet_prefix addr;
> >+			int family;
> >  			gw_ok = 1;
> >  			NEXT_ARG();
> >-			get_addr(&addr, *argv, req.r.rtm_family);
> >+			family = read_family(*argv);
> >+			if (family == AF_UNSPEC)
> >+				family = req.r.rtm_family;
> >+			else
> >+				NEXT_ARG();
> >+			get_addr(&addr, *argv, family);
> >  			if (req.r.rtm_family == AF_UNSPEC)
> >  				req.r.rtm_family = addr.family;
> >-			addattr_l(&req.n, sizeof(req), RTA_GATEWAY, &addr.data, addr.bytelen);
> >+			if (addr.family == req.r.rtm_family)
> >+				addattr_l(&req.n, sizeof(req), RTA_GATEWAY, &addr.data, addr.bytelen);
> >+			else
> >+				addattr_l(&req.n, sizeof(req), RTA_VIA, &addr.family, addr.bytelen+2);
> >  		} else if (strcmp(*argv, "from") == 0) {
> >  			inet_prefix addr;
> >  			NEXT_ARG();
> >@@ -1251,8 +1301,14 @@ static int iproute_list_flush_or_save(int argc, char **argv, int action)
> >  			get_unsigned(&mark, *argv, 0);
> >  			filter.markmask = -1;
> >  		} else if (strcmp(*argv, "via") == 0) {
> >+			int family;
> >  			NEXT_ARG();
> >-			get_prefix(&filter.rvia, *argv, do_ipv6);
> >+			family = read_family(*argv);
> >+			if (family == AF_UNSPEC)
> >+				family = do_ipv6;
> >+			else
> >+				NEXT_ARG();
> >+			get_prefix(&filter.rvia, *argv, family);
> >  		} else if (strcmp(*argv, "src") == 0) {
> >  			NEXT_ARG();
> >  			get_prefix(&filter.rprefsrc, *argv, do_ipv6);
> >@@ -1554,6 +1610,8 @@ static int iproute_get(int argc, char **argv)
> >  			tb[RTA_OIF]->rta_type = 0;
> >  		if (tb[RTA_GATEWAY])
> >  			tb[RTA_GATEWAY]->rta_type = 0;
> >+		if (tb[RTA_VIA])
> >+			tb[RTA_VIA]->rta_type = 0;
> >  		if (!idev && tb[RTA_IIF])
> >  			tb[RTA_IIF]->rta_type = 0;
> >  		req.n.nlmsg_flags = NLM_F_REQUEST;
> >diff --git a/man/man8/ip-route.8.in b/man/man8/ip-route.8.in
> >index 2b1583d5a30c..906cfea0cd6b 100644
> >--- a/man/man8/ip-route.8.in
> >+++ b/man/man8/ip-route.8.in
> >@@ -81,13 +81,18 @@ replace " } "
> >  .ti -8
> >  .IR NH " := [ "
> >  .B  via
> >-.IR ADDRESS " ] [ "
> >+[
> >+.IR FAMILY " ] " ADDRESS " ] [ "
> >  .B  dev
> >  .IR STRING " ] [ "
> >  .B  weight
> >  .IR NUMBER " ] " NHFLAGS
> >  .ti -8
> >+.IR FAMILY " := [ "
> >+.BR inet " | " inet6 " | " ipx " | " dnet " | " bridge " | " link " ]"
> >+
> >+.ti -8
> >  .IR OPTIONS " := " FLAGS " [ "
> >  .B  mtu
> >  .IR NUMBER " ] [ "
> >@@ -333,9 +338,10 @@ table by default.
> >  the output device name.
> >  .TP
> >-.BI via " ADDRESS"
> >-the address of the nexthop router.  Actually, the sense of this field
> >-depends on the route type.  For normal
> >+.BI via " [ FAMILY ] ADDRESS"
> >+the address of the nexthop router, in the address family FAMILY.
> >+Actually, the sense of this field depends on the route type.  For
> >+normal
> >  .B unicast
> >  routes it is either the true next hop router or, if it is a direct
> >  route installed in BSD compatibility mode, it can be a local address
> >@@ -472,7 +478,7 @@ is a complex value with its own syntax similar to the top level
> >  argument lists:
> >  .in +8
> >-.BI via " ADDRESS"
> >+.BI via " [ FAMILY ] ADDRESS"
> >  - is the nexthop router.
> >  .sp
> >@@ -669,7 +675,7 @@ only list routes of this type.
> >  only list routes going via this device.
> >  .TP
> >-.BI via " PREFIX"
> >+.BI via " [ FAMILY ] PREFIX"
> >  only list routes going via the nexthop routers selected by
> >  .IR PREFIX "."
> 


* Re: [PATCH net-next 6/8] iproute2: Add support for the RTA_VIA attribute
  2015-04-06 23:27             ` Andy Gospodarek
@ 2015-04-07 14:55               ` roopa
  2015-04-07 16:09                 ` Eric W. Biederman
  0 siblings, 1 reply; 29+ messages in thread
From: roopa @ 2015-04-07 14:55 UTC (permalink / raw)
  To: Andy Gospodarek
  Cc: Eric W. Biederman, Stephen Hemminger, netdev, Vivek Venkatraman,
	rshearma

On 4/6/15, 4:27 PM, Andy Gospodarek wrote:
> On Mon, Apr 06, 2015 at 04:04:06PM -0700, roopa wrote:
>> On 3/15/15, 12:52 PM, Eric W. Biederman wrote:
>>> Add support for the RTA_VIA attribute that specifies an address family
>>> as well as an address for the next hop gateway.
>>>
>>> To make it easy to pass this reorder inet_prefix so that it's tail
>>> is a proper RTA_VIA attribute.
>>>
>>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>>> ---
>>>   include/linux/rtnetlink.h |  7 +++++
>>>   include/utils.h           |  7 +++--
>>>   ip/iproute.c              | 76 +++++++++++++++++++++++++++++++++++++++++------
>>>   man/man8/ip-route.8.in    | 18 +++++++----
>>>   4 files changed, 90 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
>>> index 3eb78105399b..03e4c8df8e60 100644
>>> --- a/include/linux/rtnetlink.h
>>> +++ b/include/linux/rtnetlink.h
>>> @@ -303,6 +303,7 @@ enum rtattr_type_t {
>>>   	RTA_TABLE,
>>>   	RTA_MARK,
>>>   	RTA_MFC_STATS,
>>> +	RTA_VIA,
>> eric, if its not too late, what do you think about renaming RTA_VIA
>> attribute to
>> RTA_NEWGATEWAY (similar to your new RTA_NEWDST attribute to specify a label
>> dst) ?. RTA_VIA is fine too.
>> This is indeed a new way to specify a gateway (and can/will be used by RFC
>> 5549 in the future).
>>
>> If there is interest in renaming it to RTA_NEWGATEWAY (or any other name,
>> cant think of anything better right now),
>> I will be happy to submit a follow-on patch.
> FWIW, I actually do not mind the name RTA_VIA.  I was planning to
> replace use of RTA_GATEWAY in iproute2 and just usa RTA_VIA for all
> nexthops regardless of the address family of the dest route or nexthop
> and would allow easy creation of the infrastructure needed to support
> RFC5549 -- obviously while keeping backwards compatibility in the
> kernel.
ok, good to know.
>
> This was what my orignal set did (not submitted to netdev, but discussed
> with others at netconf) and it was much cleaner code-wise (but not ideal
> as I overloaded the use of RTA_GATEWAY and that was not pleasing to me
> or others).
ok, yeah i remember you had RTA_GATEWAY6 or something like that.

Just to clarify, I was not suggesting overloading.
Eric introduced cleaner, more abstract counterparts of RTA_DST and RTA_GATEWAY.
One is called RTA_NEWDST, and I was wondering whether calling the other one
RTA_NEWGATEWAY would be less confusing (because the field in the rest of the
structures (ipv4/ipv6) where you will put the RTA_VIA information is still
called gw).

No worries, RTA_VIA can stay if more people prefer that.

Thanks!.


* Re: [PATCH net-next 6/8] iproute2: Add support for the RTA_VIA attribute
  2015-04-07 14:55               ` roopa
@ 2015-04-07 16:09                 ` Eric W. Biederman
  2015-04-07 16:58                   ` Vivek Venkatraman
  2015-04-07 18:15                   ` roopa
  0 siblings, 2 replies; 29+ messages in thread
From: Eric W. Biederman @ 2015-04-07 16:09 UTC (permalink / raw)
  To: roopa
  Cc: Andy Gospodarek, Stephen Hemminger, netdev, Vivek Venkatraman, rshearma

roopa <roopa@cumulusnetworks.com> writes:

> On 4/6/15, 4:27 PM, Andy Gospodarek wrote:
>> On Mon, Apr 06, 2015 at 04:04:06PM -0700, roopa wrote:
>>> On 3/15/15, 12:52 PM, Eric W. Biederman wrote:
>>>> Add support for the RTA_VIA attribute that specifies an address family
>>>> as well as an address for the next hop gateway.
>>>>
>>>> To make it easy to pass this reorder inet_prefix so that it's tail
>>>> is a proper RTA_VIA attribute.
>>>>
>>>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>>>> ---
>>>>   include/linux/rtnetlink.h |  7 +++++
>>>>   include/utils.h           |  7 +++--
>>>>   ip/iproute.c              | 76 +++++++++++++++++++++++++++++++++++++++++------
>>>>   man/man8/ip-route.8.in    | 18 +++++++----
>>>>   4 files changed, 90 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
>>>> index 3eb78105399b..03e4c8df8e60 100644
>>>> --- a/include/linux/rtnetlink.h
>>>> +++ b/include/linux/rtnetlink.h
>>>> @@ -303,6 +303,7 @@ enum rtattr_type_t {
>>>>   	RTA_TABLE,
>>>>   	RTA_MARK,
>>>>   	RTA_MFC_STATS,
>>>> +	RTA_VIA,
>>> eric, if its not too late, what do you think about renaming RTA_VIA
>>> attribute to
>>> RTA_NEWGATEWAY (similar to your new RTA_NEWDST attribute to specify a label
>>> dst) ?. RTA_VIA is fine too.
>>> This is indeed a new way to specify a gateway (and can/will be used by RFC
>>> 5549 in the future).
>>>
>>> If there is interest in renaming it to RTA_NEWGATEWAY (or any other name,
>>> cant think of anything better right now),
>>> I will be happy to submit a follow-on patch.
>> FWIW, I actually do not mind the name RTA_VIA.  I was planning to
>> replace use of RTA_GATEWAY in iproute2 and just usa RTA_VIA for all
>> nexthops regardless of the address family of the dest route or nexthop
>> and would allow easy creation of the infrastructure needed to support
>> RFC5549 -- obviously while keeping backwards compatibility in the
>> kernel.
> ok, good to know.

To answer the original question: the "new" in RTA_NEWDST does not mean a
new attribute; it means replacing the destination address with a new
destination address.  NAT, in other words, which is how mpls routing
works: each hop NATs the address before sending the packet on.

>> This was what my orignal set did (not submitted to netdev, but discussed
>> with others at netconf) and it was much cleaner code-wise (but not ideal
>> as I overloaded the use of RTA_GATEWAY and that was not pleasing to me
>> or others).
> ok, yeah i remember you had RTA_GATEWAY6 or something like that.
>
> just to clarify, i was not suggesting overloading.
> eric introduced cleaner abstracted attributes for RTA_DST and RTA_GATEWAY.
> One is called RTA_NEWDST and I was thinking if changing RTA_GATEWAY to
> RTA_NEWGATEWAY
> would be less confusing (because, the rest of the structures
> (ipv4/ipv6) where you will put the
> RTA_VIA information is still called gw).
>
> No worries, RTA_VIA can stay if more people prefer that.

As long as the number and the semantics don't change I don't much care.

However I think via is probably what we should have called the concept
and the field in the first place, and certainly there are corner cases
where the machine where we are going via is not actually a gateway but
the final destination, when you are talking about multiple protocols.

Regardless the name RTA_VIA is my best attempt in that direction.

All of my added support in iproute2 should work for RFC5549.  As well as
for mpls.
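
To make the family-plus-address idea concrete, here is a minimal sketch of
how a via payload could be filled in.  It assumes the payload is a 16-bit
address family followed by the raw address bytes; the struct name
sketch_via and the helper sketch_via_fill are hypothetical and are not the
iproute2 code from this patch.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical mirror of a "family, then raw address bytes" payload. */
struct sketch_via {
	unsigned short family;   /* AF_INET, AF_INET6, ... */
	unsigned char  addr[16]; /* large enough for an IPv6 address */
};

/*
 * Parse a textual address of the given family into *via.
 * Returns the number of payload bytes (family field plus address),
 * or -1 if the family is unsupported or the text does not parse.
 */
static int sketch_via_fill(struct sketch_via *via, int family, const char *text)
{
	size_t addr_len;

	assert(via != NULL && text != NULL);
	if (family == AF_INET)
		addr_len = 4;
	else if (family == AF_INET6)
		addr_len = 16;
	else
		return -1;

	memset(via, 0, sizeof(*via));
	via->family = (unsigned short)family;
	/* inet_pton() stores the address in network byte order. */
	if (inet_pton(family, text, via->addr) != 1)
		return -1;

	return (int)(offsetof(struct sketch_via, addr) + addr_len);
}
```

Because the family travels with the address, the nexthop family does not
have to match the family of the route itself, which is the property
RFC5549-style routes would rely on.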

Eric


* Re: [PATCH net-next 6/8] iproute2: Add support for the RTA_VIA attribute
  2015-04-07 16:09                 ` Eric W. Biederman
@ 2015-04-07 16:58                   ` Vivek Venkatraman
  2015-04-07 19:38                     ` Eric W. Biederman
  2015-04-07 18:15                   ` roopa
  1 sibling, 1 reply; 29+ messages in thread
From: Vivek Venkatraman @ 2015-04-07 16:58 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: roopa, Andy Gospodarek, Stephen Hemminger, netdev, Robert Shearman

On Tue, Apr 7, 2015 at 9:09 AM, Eric W. Biederman <ebiederm@xmission.com> wrote:
> roopa <roopa@cumulusnetworks.com> writes:
>
>> On 4/6/15, 4:27 PM, Andy Gospodarek wrote:
>>> On Mon, Apr 06, 2015 at 04:04:06PM -0700, roopa wrote:
>>>> On 3/15/15, 12:52 PM, Eric W. Biederman wrote:
>>>>> Add support for the RTA_VIA attribute that specifies an address family
>>>>> as well as an address for the next hop gateway.
>>>>>
>>>>> To make it easy to pass this reorder inet_prefix so that it's tail
>>>>> is a proper RTA_VIA attribute.
>>>>>
>>>>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>>>>> ---
>>>>>   include/linux/rtnetlink.h |  7 +++++
>>>>>   include/utils.h           |  7 +++--
>>>>>   ip/iproute.c              | 76 +++++++++++++++++++++++++++++++++++++++++------
>>>>>   man/man8/ip-route.8.in    | 18 +++++++----
>>>>>   4 files changed, 90 insertions(+), 18 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
>>>>> index 3eb78105399b..03e4c8df8e60 100644
>>>>> --- a/include/linux/rtnetlink.h
>>>>> +++ b/include/linux/rtnetlink.h
>>>>> @@ -303,6 +303,7 @@ enum rtattr_type_t {
>>>>>    RTA_TABLE,
>>>>>    RTA_MARK,
>>>>>    RTA_MFC_STATS,
>>>>> +  RTA_VIA,
>>>> eric, if its not too late, what do you think about renaming RTA_VIA
>>>> attribute to
>>>> RTA_NEWGATEWAY (similar to your new RTA_NEWDST attribute to specify a label
>>>> dst) ?. RTA_VIA is fine too.
>>>> This is indeed a new way to specify a gateway (and can/will be used by RFC
>>>> 5549 in the future).
>>>>
>>>> If there is interest in renaming it to RTA_NEWGATEWAY (or any other name,
>>>> cant think of anything better right now),
>>>> I will be happy to submit a follow-on patch.
>>> FWIW, I actually do not mind the name RTA_VIA.  I was planning to
>>> replace use of RTA_GATEWAY in iproute2 and just usa RTA_VIA for all
>>> nexthops regardless of the address family of the dest route or nexthop
>>> and would allow easy creation of the infrastructure needed to support
>>> RFC5549 -- obviously while keeping backwards compatibility in the
>>> kernel.
>> ok, good to know.
>
> To answer the original question.  The new in RTA_NEWDST is not new as in
> a new attribute it is new as in replace the destination address with a
> new destination address.  NAT in other words.  Which is how mpls routing
> works.  Each hop NATs the address before sending the packet on.
>

At the edge, when doing IPoMPLS, we'll be imposing a set of labels on
top of the packet rather than replacing, but the same semantics can be
applied because the destination address is now different and becomes a
label stack.

One thing to note is that the destination address replaced/imposed
could change based on the path selected, when there is ECMP. So, I
propose that the iproute2 syntax of "as [to]" be reconsidered for
MPLS; otherwise we'll end up with something like the following when
this is extended to set up IPoMPLS direct forwarding with ECMP:

ip route add 147.1.1.0/24 nexthop as to 400/2230 via inet 192.168.1.1
dev eth0 nexthop as to 600/2400 via inet 192.168.2.1 dev eth1

Instead, if we use the specifier "label", we'll get:

ip route add 147.1.1.0/24 nexthop via inet 192.168.1.1 dev eth0 label
400/2230 nexthop via inet 192.168.2.1 dev eth1 label 600/2400

The transit case (label swapping) would look like:

ip -f mpls route add 400 via inet 192.168.1.10 dev eth0 label 500

The syntax can then be better extended to specify a label operation
such as "pop" which would be needed when performing ultimate hop pop
(UHP) and then lookup/forward based on underlying label stack or IP
header.

A new application besides MPLS that needs to modify the destination
address would use its own keyword but encode using the RTA_NEWDST
attribute.
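
For reference, the label stacks written as 400/2230 in the examples above
could be parsed along the following lines.  This is only a sketch of the
label/label/label syntax, assuming decimal labels limited to 20 bits; the
names sketch_parse_labels, SKETCH_MAX_LABELS and SKETCH_LABEL_MAX are
hypothetical and not taken from the patches.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_LABELS 8
#define SKETCH_LABEL_MAX  0xFFFFFu /* MPLS labels are 20 bits wide */

/*
 * Parse "label/label/..." into out[], outermost label first.
 * Returns the number of labels, or -1 on malformed input,
 * an out-of-range label, or more than max labels.
 */
static int sketch_parse_labels(const char *text, uint32_t *out, int max)
{
	int count = 0;

	assert(text != NULL && out != NULL);
	for (;;) {
		char *end;
		unsigned long value;

		if (count >= max)
			return -1;
		/* Reject empty components, signs and leading whitespace. */
		if (*text < '0' || *text > '9')
			return -1;
		errno = 0;
		value = strtoul(text, &end, 10);
		if (errno != 0 || value > SKETCH_LABEL_MAX)
			return -1;
		out[count++] = (uint32_t)value;
		if (*end == '\0')
			return count;
		if (*end != '/')
			return -1;
		text = end + 1; /* continue after the separator */
	}
}
```

Keeping the whole stack in one space-free token is what lets it sit next
to "via" and "dev" without any extra quoting.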

>>> This was what my orignal set did (not submitted to netdev, but discussed
>>> with others at netconf) and it was much cleaner code-wise (but not ideal
>>> as I overloaded the use of RTA_GATEWAY and that was not pleasing to me
>>> or others).
>> ok, yeah i remember you had RTA_GATEWAY6 or something like that.
>>
>> just to clarify, i was not suggesting overloading.
>> eric introduced cleaner abstracted attributes for RTA_DST and RTA_GATEWAY.
>> One is called RTA_NEWDST and I was thinking if changing RTA_GATEWAY to
>> RTA_NEWGATEWAY
>> would be less confusing (because, the rest of the structures
>> (ipv4/ipv6) where you will put the
>> RTA_VIA information is still called gw).
>>
>> No worries, RTA_VIA can stay if more people prefer that.
>
> As long as the number and the semantics don't change I don't much care.
>
> However I think via is probably what we should have called the concept
> and the field in the first place, and certainly there are corner cases
> where the machine where we are going via is not actually a gateway but
> the final destination, when you are talking about multiple protocols.
>
> Regardless the name RTA_VIA is my best attempt in that direction.
>
> All of my added support in iproute2 should work for RFC5549.  As well as
> for mpls.
>
> Eric


* Re: [PATCH net-next 6/8] iproute2: Add support for the RTA_VIA attribute
  2015-04-07 16:09                 ` Eric W. Biederman
  2015-04-07 16:58                   ` Vivek Venkatraman
@ 2015-04-07 18:15                   ` roopa
  1 sibling, 0 replies; 29+ messages in thread
From: roopa @ 2015-04-07 18:15 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andy Gospodarek, Stephen Hemminger, netdev, Vivek Venkatraman, rshearma

On 4/7/15, 9:09 AM, Eric W. Biederman wrote:
> roopa <roopa@cumulusnetworks.com> writes:
>
>> On 4/6/15, 4:27 PM, Andy Gospodarek wrote:
>>> On Mon, Apr 06, 2015 at 04:04:06PM -0700, roopa wrote:
>>>> On 3/15/15, 12:52 PM, Eric W. Biederman wrote:
>>>>> Add support for the RTA_VIA attribute that specifies an address family
>>>>> as well as an address for the next hop gateway.
>>>>>
>>>>> To make it easy to pass this reorder inet_prefix so that it's tail
>>>>> is a proper RTA_VIA attribute.
>>>>>
>>>>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>>>>> ---
>>>>>    include/linux/rtnetlink.h |  7 +++++
>>>>>    include/utils.h           |  7 +++--
>>>>>    ip/iproute.c              | 76 +++++++++++++++++++++++++++++++++++++++++------
>>>>>    man/man8/ip-route.8.in    | 18 +++++++----
>>>>>    4 files changed, 90 insertions(+), 18 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
>>>>> index 3eb78105399b..03e4c8df8e60 100644
>>>>> --- a/include/linux/rtnetlink.h
>>>>> +++ b/include/linux/rtnetlink.h
>>>>> @@ -303,6 +303,7 @@ enum rtattr_type_t {
>>>>>    	RTA_TABLE,
>>>>>    	RTA_MARK,
>>>>>    	RTA_MFC_STATS,
>>>>> +	RTA_VIA,
>>>> eric, if its not too late, what do you think about renaming RTA_VIA
>>>> attribute to
>>>> RTA_NEWGATEWAY (similar to your new RTA_NEWDST attribute to specify a label
>>>> dst) ?. RTA_VIA is fine too.
>>>> This is indeed a new way to specify a gateway (and can/will be used by RFC
>>>> 5549 in the future).
>>>>
>>>> If there is interest in renaming it to RTA_NEWGATEWAY (or any other name,
>>>> cant think of anything better right now),
>>>> I will be happy to submit a follow-on patch.
>>> FWIW, I actually do not mind the name RTA_VIA.  I was planning to
>>> replace use of RTA_GATEWAY in iproute2 and just usa RTA_VIA for all
>>> nexthops regardless of the address family of the dest route or nexthop
>>> and would allow easy creation of the infrastructure needed to support
>>> RFC5549 -- obviously while keeping backwards compatibility in the
>>> kernel.
>> ok, good to know.
> To answer the original question.  The new in RTA_NEWDST is not new as in
> a new attribute it is new as in replace the destination address with a
> new destination address.  NAT in other words.  Which is how mpls routing
> works.  Each hop NATs the address before sending the packet on.
thanks for clarifying this.
>
>>> This was what my orignal set did (not submitted to netdev, but discussed
>>> with others at netconf) and it was much cleaner code-wise (but not ideal
>>> as I overloaded the use of RTA_GATEWAY and that was not pleasing to me
>>> or others).
>> ok, yeah i remember you had RTA_GATEWAY6 or something like that.
>>
>> just to clarify, i was not suggesting overloading.
>> eric introduced cleaner abstracted attributes for RTA_DST and RTA_GATEWAY.
>> One is called RTA_NEWDST and I was thinking if changing RTA_GATEWAY to
>> RTA_NEWGATEWAY
>> would be less confusing (because, the rest of the structures
>> (ipv4/ipv6) where you will put the
>> RTA_VIA information is still called gw).
>>
>> No worries, RTA_VIA can stay if more people prefer that.
> As long as the number and the semantics don't change I don't much care.
>
> However I think via is probably what we should have called the concept
> and the field in the first place, and certainly there are corner cases
> where the machine where we are going via is not actually a gateway but
> the final destination, when you are talking about multiple protocols.
agreed.

>
> Regardless the name RTA_VIA is my best attempt in that direction.
ack
>
> All of my added support in iproute2 should work for RFC5549.  As well as
> for mpls.
>
>
agreed.

and thanks.


* Re: [PATCH net-next 6/8] iproute2: Add support for the RTA_VIA attribute
  2015-04-07 16:58                   ` Vivek Venkatraman
@ 2015-04-07 19:38                     ` Eric W. Biederman
  2015-04-07 21:12                       ` Vivek Venkatraman
  0 siblings, 1 reply; 29+ messages in thread
From: Eric W. Biederman @ 2015-04-07 19:38 UTC (permalink / raw)
  To: Vivek Venkatraman
  Cc: roopa, Andy Gospodarek, Stephen Hemminger, netdev, Robert Shearman

Vivek Venkatraman <vivek@cumulusnetworks.com> writes:

> At the edge, when doing IPoMPLS, we'll be imposing a set of labels on
> top of the packet rather than replacing, but the same semantics can be
> applied because the destination address is now different and becomes a
> label stack.

Exactly how this will happen is an open question.  The hard part is that
we need something lightweight enough to scale to 1 million routes, aka a
full routing table.

Network devices consume much too much memory to contemplate having a
different network device for each of 1 million different routes.

The transform infrastructure (xfrm) that is used for ipsec looks
attractive for imposing tunnels but it is clumsy, and does not map well
to the kinds of tunnels IPoMPLS traffic needs.

Having something in the ipv4 and ipv6 fib entry, say a pointer or a 32bit
key that refers to a struct mpls_route to impose, looks like what we want
in the abstract.  What the userspace interface for that implementation
should be is something that I do not see clearly.  Ideally we build a
userspace interface that works not only for MPLS but also for other tunnel
types like IPIP, GRE, etc.  This would allow not only MPLS tunnels but other
tunnel types to be supported up to the full routing table size.

Perhaps a new attribute RTA_ENCAP that encodes a structure with
a tunnel type and enough information to encode the tunnel header.
I would have to make a survey of the existing tunnel types to see
if there is enough of a pattern that an option that works for multiple
protocols could actually be achieved.

Using a tunnel that is not a network device and as such does not need
to keep packet counters looks like it will scale much better than our
other options, even with the best memory usage simplifications I can
imagine for network devices.  Maintaining per cpu counters (which are
necessary for performance) requires a non-trivial amount of memory, and
as such network devices are much harder to scale.

> One thing to note is that the destination address replaced/imposed
> could change based on the path selected, when there is ECMP. So, I
> propose that the iproute2 syntax of "as [to]" be reconsidered for
> MPLS, otherwise we'll end up with something like the following when
> this is extended to setup IPoMPLS direct forwarding with ECMP:
>
> ip route add 147.1.1.0/24 nexthop as to 400/2230 via inet 192.168.1.1
> dev eth0 nexthop as to 600/2400 via inet 192.168.2.1 dev eth1

That does not work: the semantics of the RTA_NEWDST attribute require
the new address to be in the same address family as the old address.
So it is useful for NATing IPv4 or IPv6 with routes (if you are
so inclined) but it is not useful for imposing an mpls header.

> Instead, if we use the specifier "label", we'll get:
>
> ip route add 147.1.1.0/24 nexthop via inet 192.168.1.1 dev eth0 label
> 400/2230 nexthop via inet 192.168.2.1 dev eth1 label 600/2400
>
> The transit case (label swapping) would look like:
>
> ip -f mpls route add 400 via inet 192.168.1.10 dev eth0 label 500
>
> The syntax can then be better extended to specify a label operation
> such as "pop" which would be needed when performing ultimate hop pop
> (UHP) and then lookup/forward based on underlying label stack or IP
> header.

Pop is the case where the RTA_NEWDST attribute is empty (or
unspecified).

From an mpls perspective the RTA_DST label is always popped (if it
matches) and the RTA_NEWDST label stack is always pushed.
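
As an illustration only, that pop-then-push rule can be modelled on a plain
array of labels.  The sketch below is not kernel code; the type
sketch_stack, the helper sketch_forward and the depth limit are all
hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_DEPTH 8

struct sketch_stack {
	int      depth;
	uint32_t label[SKETCH_MAX_DEPTH]; /* label[0] is the top of the stack */
};

/*
 * Pop the top label if it equals match, then push the newdst labels
 * (newdst[0] ends up on top).  An empty newdst is a pure pop.
 * Returns 0 on success, -1 if the top label does not match or the
 * result would exceed SKETCH_MAX_DEPTH; the stack is unchanged on error.
 */
static int sketch_forward(struct sketch_stack *pkt, uint32_t match,
			  const uint32_t *newdst, int newdst_len)
{
	int rest;

	assert(pkt != NULL && newdst_len >= 0);
	assert(newdst != NULL || newdst_len == 0);
	if (pkt->depth < 1 || pkt->label[0] != match)
		return -1;
	rest = pkt->depth - 1;
	if (rest + newdst_len > SKETCH_MAX_DEPTH)
		return -1;

	/* Pop: move the remaining labels to where they sit after the push. */
	memmove(&pkt->label[newdst_len], &pkt->label[1],
		(size_t)rest * sizeof(pkt->label[0]));
	/* Push: copy the new labels on top. */
	if (newdst_len > 0)
		memcpy(&pkt->label[0], newdst,
		       (size_t)newdst_len * sizeof(pkt->label[0]));
	pkt->depth = rest + newdst_len;
	return 0;
}
```

With a one-label newdst this is a swap, with a longer newdst it is a swap
plus push, and with an empty newdst it is the pop case described above.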

> A new application besides MPLS that needs to modify the destination
> address would use its own keyword but encode using the RTA_NEWDST
> attribute.

Eric


* Re: [PATCH net-next 6/8] iproute2: Add support for the RTA_VIA attribute
  2015-04-07 19:38                     ` Eric W. Biederman
@ 2015-04-07 21:12                       ` Vivek Venkatraman
  0 siblings, 0 replies; 29+ messages in thread
From: Vivek Venkatraman @ 2015-04-07 21:12 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: roopa, Andy Gospodarek, Stephen Hemminger, netdev, Robert Shearman

On Tue, Apr 7, 2015 at 12:38 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> Vivek Venkatraman <vivek@cumulusnetworks.com> writes:
>
>> At the edge, when doing IPoMPLS, we'll be imposing a set of labels on
>> top of the packet rather than replacing, but the same semantics can be
>> applied because the destination address is now different and becomes a
>> label stack.
>
> Exactly how this will happen is an open question.  The hard part is we
> need something light weight enough that we can scale to 1 million
> routes, aka a full routing table.
>
> Network devices consume much too much memory to contemplate having a
> different network device for each of 1 million different routes.
>

Agreed, and this is the exact point I had raised about your initial
proposal to inject IP into an MPLS tunnel.

> The transform infrastructure (xfrm) that is used for ipsec looks
> attractive for imposing tunnels but it is clumsy, and does not map well
> to the kinds of tunnels IPoMPLS traffic needs.
>
> Having something in the ipv4 and ipv6 fib entry say a pointer or a 32bit
> key that refers to a struct mpls_route to impose looks like what we want
> int he abstract.  What the userspace interface for that implemenation is
> something that I do not see clearly.  Ideally we build a userspace
> interface that works not only for MPLS but also for other tunnel types
> like IPIP, GRE, etc.   This would allow not only MPLS tunnels but other
> tunnel types to be supported up to the full routing table size.
>
> Perhaps a new attribute RTA_ENCAP that encodes a structure with
> a tunnel type and enough information to encode the tunnel header.
> I would have to make a survey of the existing tunnel types to see
> if there is enough of a pattern an option that works for multiple
> protocols could actually be achieved.
>
> Using a tunnel that is not a network device and as such does not need
> to keep packet counters looks like it will scale much better than our
> other options, even with the best memory usage simplications I can
> imagine for network devices.  Maintenance of per cpu counters (which are
> necessary for performance) requires a non-trivial amount of memory and
> as such are much harder to scale.
>

Yes. I believe there are 2 use cases to consider:

a) When MPLS LSPs specify a labeled-path and are not a tunnel per se.
This would be the case when they are set up hop-by-hop to follow
routing, as would be the case using a protocol such as LDP or BGP. In
this case, the label stack is really just an encap and there is no
separate network device associated with each LSP (and certainly not
with any application/inner label such as a VPN label or a PWE label).

I believe this will be the common use-case in the data center and in
certain situations in provider networks.

b) When MPLS LSPs actually represent a tunnel interface. This would be
the case when they are traffic-engineered using a protocol such as
RSVP-TE. A network device would be associated with this tunnel and
specify the tunnel encapsulation (one or more labels) but the labels
imposed by the application would still come from the corresponding IP
or L2 constructs (e.g., fib entry).

This use-case is likely to be seen more in provider networks than in
the data center.

We have been looking into (a) and it is along the lines you mention
above (fib entry refers to mpls_route), but not fleshed out enough to
post and seek opinions. In terms of the user interface (iproute2
commands), it is along the lines of my examples, though I have clearly
overlooked the point you make below about RTA_NEWDST.

>> One thing to note is that the destination address replaced/imposed
>> could change based on the path selected, when there is ECMP. So, I
>> propose that the iproute2 syntax of "as [to]" be reconsidered for
>> MPLS, otherwise we'll end up with something like the following when
>> this is extended to setup IPoMPLS direct forwarding with ECMP:
>>
>> ip route add 147.1.1.0/24 nexthop as to 400/2230 via inet 192.168.1.1
>> dev eth0 nexthop as to 600/2400 via inet 192.168.2.1 dev eth1
>
> That does not work with the semantics of the RTA_NEWDST message require
> the new address to be in the same address family as the old address.
> So it is useful for NATing IPv4 or IPv6 with routes (if you are
> so inclined) but it is not useful for imposing an mpls header.
>

I had overlooked this. I think now that it would be handy to allow the
address family to be specified for the new address so it can be used
in both cases (edge and transit). A keyword like "label" could
automatically imply AF_MPLS for the new address and a new application
could follow suit.

>> Instead, if we use the specifier "label", we'll get:
>>
>> ip route add 147.1.1.0/24 nexthop via inet 192.168.1.1 dev eth0 label
>> 400/2230 nexthop via inet 192.168.2.1 dev eth1 label 600/2400
>>
>> The transit case (label swapping) would look like:
>>
>> ip -f mpls route add 400 via inet 192.168.1.10 dev eth0 label 500
>>
>> The syntax can then be better extended to specify a label operation
>> such as "pop" which would be needed when performing ultimate hop pop
>> (UHP) and then lookup/forward based on underlying label stack or IP
>> header.
>
> Pop is the case where where the RTA_NEWDST attribute is empty (or
> unspecified).
>
> From an mpls perspective the RTA_DST label is always popped (if it
> matches) and the RTA_NEWDST label stack is always pushed.
>

The idea was to pop and do a subsequent lookup rather than pop and
forward. This would be an alternative to forwarding to loopback device
in order to lookup on inner labels.

>> A new application besides MPLS that needs to modify the destination
>> address would use its own keyword but encode using the RTA_NEWDST
>> attribute.
>
> Eric


Thread overview: 29+ messages
2015-03-13 18:50 [PATCH net-next] iproute2: MPLS support Eric W. Biederman
2015-03-13 18:52 ` [PATCH net-next 1/8] iproute2: Add a source addres length parameter to rt_addr_n2a Eric W. Biederman
2015-03-13 18:52 ` [PATCH net-next 2/8] iproute2: Make the addr argument of ll_addr_n2a const Eric W. Biederman
2015-03-13 18:54 ` [PATCH net-next 3/8] iproute2: Add support for printing AF_PACKET addresses Eric W. Biederman
2015-03-13 18:55 ` [PATCH net-next 4/8] iproute2: Add address family to/from string helper functions Eric W. Biederman
2015-03-13 18:56 ` [PATCH net-next 5/8] iproute2: misc whitespace cleanup Eric W. Biederman
2015-03-13 18:57 ` [PATCH net-next 6/8] iproute2: Add support for RTA_VIA attributes Eric W. Biederman
2015-03-13 18:58 ` [PATCH net-next 7/8] iproute2: Add support for the RTA_NEWDST attribute Eric W. Biederman
2015-03-13 18:59 ` [PATCH net-next 8/8] iproute2: Add basic mpls support to iproute Eric W. Biederman
     [not found] ` <c3ad7d77783046d38e5b23b5e1fe0f71@BRMWP-EXMB11.corp.brocade.com>
2015-03-15 19:33   ` [PATCH net-next 1/8] iproute2: Add a source addres length parameter to rt_addr_n2a Stephen Hemminger
2015-03-15 19:42     ` Eric W. Biederman
2015-03-15 19:47       ` [PATCH net-next 0/8] iproute2: MPLS support (now with af_bit_len) Eric W. Biederman
2015-03-15 19:48         ` [PATCH net-next 1/8] iproute2: Add a source addres length parameter to rt_addr_n2a Eric W. Biederman
2015-03-15 19:49         ` [PATCH net-next 2/8] iproute2: Make the addr argument of ll_addr_n2a const Eric W. Biederman
2015-03-15 19:49         ` [PATCH net-next 3/8] iproute2: Add support for printing AF_PACKET addresses Eric W. Biederman
2015-03-15 19:50         ` [PATCH net-next 4/8] iproute2: Add address family to/from string helper functions Eric W. Biederman
2015-03-15 19:51         ` [PATCH net-next 5/8] iproute2: misc whitespace cleanup Eric W. Biederman
2015-03-15 19:52         ` [PATCH net-next 6/8] iproute2: Add support for the RTA_VIA attribute Eric W. Biederman
2015-04-06 23:04           ` roopa
2015-04-06 23:27             ` Andy Gospodarek
2015-04-07 14:55               ` roopa
2015-04-07 16:09                 ` Eric W. Biederman
2015-04-07 16:58                   ` Vivek Venkatraman
2015-04-07 19:38                     ` Eric W. Biederman
2015-04-07 21:12                       ` Vivek Venkatraman
2015-04-07 18:15                   ` roopa
2015-03-15 19:53         ` [PATCH net-next 7/8] iproute2: Add support for the RTA_NEWDST attribute Eric W. Biederman
2015-03-15 19:53         ` [PATCH net-next 8/8] iproute2: Add basic mpls support to iproute Eric W. Biederman
2015-03-24 22:36 ` [PATCH net-next] iproute2: MPLS support Stephen Hemminger
