netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables
@ 2017-01-06 15:30 Lorenzo Colitti
  2017-01-08  4:24 ` David Ahern
  0 siblings, 1 reply; 14+ messages in thread
From: Lorenzo Colitti @ 2017-01-06 15:30 UTC (permalink / raw)
  To: netdev
  Cc: zenczykowski, hannes, ek, hideaki.yoshifuji, davem, dsa, drosen,
	Lorenzo Colitti

Currently, IPv6 router discovery always puts routes into
RT6_TABLE_MAIN. This makes it difficult to maintain and switch
between multiple simultaneous network connections (e.g., wifi
and wired).

To work around this connection managers typically either move
autoconfiguration to userspace entirely (e.g., dhcpcd) or take
the routes they want and re-add them to the main table as static
routes with low metrics (e.g., NetworkManager). This puts the
burden on the connection manager to watch netlink or listen to
RAs to see if the routes have changed, delete the routes when
their lifetime expires, etc. This is complex and often not
implemented correctly.

This patch adds a per-interface sysctl to have the kernel put
autoconf routes into different tables. This allows each interface
to have its own routing table if desired.  Choosing the default
interface, or using different interfaces at the same time on a
per-socket or per-packet basis) can be done using policy routing
mechanisms that use as SO_BINDTODEVICE / IPV6_PKTINFO, mark-based
routing, or UID-based routing to select specific routing tables.

The sysctl behaves as follows:

- = 0: default. Put routes into RT6_TABLE_MAIN if the interface
       is not in a VRF, or into the VRF table if it is.
- > 0: manual. Put routes into the specified table.
- < 0: automatic. Add the absolute value of the sysctl to the
       device's ifindex, and use that table.

The automatic mode is most useful in conjunction with
net.ipv6.conf.default.accept_ra_rt_table. A connection manager
or distribution can set this to, say, -1000 on boot, and
thereafter know that routes received on every interface will
always be in that interface's routing table, and that the mapping
between interfaces and routing tables is deterministic. It also
ensures that if an interface is created and immediately receives
an RA, the route will go into the correct routing table without
needing any intervention from userspace.

The automatic mode (with conf.default.accept_ra_rt_table = -1000)
has been used in Android since 5.0.

Tested: compiles allnoconfig, allyesconfig, allmodconfig
Tested: passes existing Android kernel unit tests
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
---
 Documentation/networking/ip-sysctl.txt | 13 +++++++++++
 include/linux/ipv6.h                   |  1 +
 include/net/addrconf.h                 |  2 ++
 include/uapi/linux/ipv6.h              |  1 +
 net/ipv6/addrconf.c                    | 40 +++++++++++++++++++++++++++++++---
 net/ipv6/route.c                       | 11 +++++-----
 6 files changed, 59 insertions(+), 9 deletions(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 7dd65c9cf7..d1311d8f33 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -1471,6 +1471,19 @@ accept_ra_rt_info_max_plen - INTEGER
 	Functional default: 0 if accept_ra_rtr_pref is enabled.
 			    -1 if accept_ra_rtr_pref is disabled.
 
+accept_ra_rt_table - INTEGER
+	Which table to put routes created by Router Advertisements into.
+
+	= 0: Use the main table if the device is not in a VRF, and the
+	     VRF table if it is.
+	> 0: Use the specified table.
+	< 0: Add the absolute value to the receiving interface index,
+	     and use that table. For example, if set to -1000, an RA
+	     received on interface index 4 will create routes in
+	     table 1004.
+
+	Default: 0
+
 accept_ra_rtr_pref - BOOLEAN
 	Accept Router Preference in RA.
 
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 671d014e64..55d75074aa 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -69,6 +69,7 @@ struct ipv6_devconf {
 	__s32		seg6_require_hmac;
 #endif
 	__u32		enhanced_dad;
+	__s32		accept_ra_rt_table;
 
 	struct ctl_table_header *sysctl_header;
 };
diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index 8f998afc13..e1bd2bc027 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -242,6 +242,8 @@ static inline bool ipv6_is_mld(struct sk_buff *skb, int nexthdr, int offset)
 void addrconf_prefix_rcv(struct net_device *dev,
 			 u8 *opt, int len, bool sllao);
 
+u32 addrconf_rt_table(const struct net_device *dev, u32 default_table);
+
 /*
  *	anycast prototypes (anycast.c)
  */
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index eaf65dc82e..95c3553242 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -182,6 +182,7 @@ enum {
 	DEVCONF_SEG6_ENABLED,
 	DEVCONF_SEG6_REQUIRE_HMAC,
 	DEVCONF_ENHANCED_DAD,
+	DEVCONF_ACCEPT_RA_RT_TABLE,
 	DEVCONF_MAX
 };
 
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index c1e124bc8e..d4a6b877f8 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -243,6 +243,7 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
 	.seg6_require_hmac	= 0,
 #endif
 	.enhanced_dad           = 1,
+	.accept_ra_rt_table	= 0,
 };
 
 static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
@@ -294,6 +295,7 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
 	.seg6_require_hmac	= 0,
 #endif
 	.enhanced_dad           = 1,
+	.accept_ra_rt_table	= 0,
 };
 
 /* Check if a valid qdisc is available */
@@ -2210,6 +2212,30 @@ static void  ipv6_try_regen_rndid(struct inet6_dev *idev, struct in6_addr *tmpad
 		ipv6_regen_rndid(idev);
 }
 
+#ifdef CONFIG_IPV6_MULTIPLE_TABLES
+u32 addrconf_rt_table(const struct net_device *dev, u32 default_table)
+{
+	struct inet6_dev *idev = in6_dev_get(dev);
+	u32 table;
+	int sysctl = idev->cnf.accept_ra_rt_table;
+
+	if (sysctl == 0)
+		table = l3mdev_fib_table(dev) ? : default_table;
+	else if (sysctl > 0)
+		table = (u32)sysctl;
+	else
+		table = (unsigned int)dev->ifindex + (-sysctl);
+
+	in6_dev_put(idev);
+	return table;
+}
+#else
+u32 addrconf_rt_table(const struct net_device *dev, u32 default_table)
+{
+	return RT6_TABLE_DFLT;
+}
+#endif
+
 /*
  *	Add prefix route.
  */
@@ -2219,7 +2245,7 @@ addrconf_prefix_route(struct in6_addr *pfx, int plen, struct net_device *dev,
 		      unsigned long expires, u32 flags)
 {
 	struct fib6_config cfg = {
-		.fc_table = l3mdev_fib_table(dev) ? : RT6_TABLE_PREFIX,
+		.fc_table = addrconf_rt_table(dev, RT6_TABLE_PREFIX),
 		.fc_metric = IP6_RT_PRIO_ADDRCONF,
 		.fc_ifindex = dev->ifindex,
 		.fc_expires = expires,
@@ -2252,9 +2278,9 @@ static struct rt6_info *addrconf_get_prefix_route(const struct in6_addr *pfx,
 	struct fib6_node *fn;
 	struct rt6_info *rt = NULL;
 	struct fib6_table *table;
-	u32 tb_id = l3mdev_fib_table(dev) ? : RT6_TABLE_PREFIX;
 
-	table = fib6_get_table(dev_net(dev), tb_id);
+	table = fib6_get_table(dev_net(dev),
+			       addrconf_rt_table(dev, RT6_TABLE_PREFIX));
 	if (!table)
 		return NULL;
 
@@ -4975,6 +5001,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
 	array[DEVCONF_SEG6_REQUIRE_HMAC] = cnf->seg6_require_hmac;
 #endif
 	array[DEVCONF_ENHANCED_DAD] = cnf->enhanced_dad;
+	array[DEVCONF_ACCEPT_RA_RT_TABLE] = cnf->accept_ra_rt_table;
 }
 
 static inline size_t inet6_ifla6_size(void)
@@ -6090,6 +6117,13 @@ static const struct ctl_table addrconf_sysctl[] = {
 		.proc_handler   = proc_dointvec,
 	},
 	{
+		.procname	= "accept_ra_rt_table",
+		.data		= &ipv6_devconf.accept_ra_rt_table,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
+	{
 		/* sentinel */
 	}
 };
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 8417c41d8e..86469ec27f 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2345,13 +2345,12 @@ static struct rt6_info *rt6_get_route_info(struct net *net,
 					   const struct in6_addr *gwaddr,
 					   struct net_device *dev)
 {
-	u32 tb_id = l3mdev_fib_table(dev) ? : RT6_TABLE_INFO;
 	int ifindex = dev->ifindex;
 	struct fib6_node *fn;
 	struct rt6_info *rt = NULL;
 	struct fib6_table *table;
 
-	table = fib6_get_table(net, tb_id);
+	table = fib6_get_table(net, addrconf_rt_table(dev, RT6_TABLE_INFO));
 	if (!table)
 		return NULL;
 
@@ -2392,7 +2391,7 @@ static struct rt6_info *rt6_add_route_info(struct net *net,
 		.fc_nlinfo.nl_net = net,
 	};
 
-	cfg.fc_table = l3mdev_fib_table(dev) ? : RT6_TABLE_INFO,
+	cfg.fc_table = addrconf_rt_table(dev, RT6_TABLE_INFO);
 	cfg.fc_dst = *prefix;
 	cfg.fc_gateway = *gwaddr;
 
@@ -2408,11 +2407,11 @@ static struct rt6_info *rt6_add_route_info(struct net *net,
 
 struct rt6_info *rt6_get_dflt_router(const struct in6_addr *addr, struct net_device *dev)
 {
-	u32 tb_id = l3mdev_fib_table(dev) ? : RT6_TABLE_DFLT;
 	struct rt6_info *rt;
 	struct fib6_table *table;
 
-	table = fib6_get_table(dev_net(dev), tb_id);
+	table = fib6_get_table(dev_net(dev),
+			       addrconf_rt_table(dev, RT6_TABLE_DFLT));
 	if (!table)
 		return NULL;
 
@@ -2434,7 +2433,7 @@ struct rt6_info *rt6_add_dflt_router(const struct in6_addr *gwaddr,
 				     unsigned int pref)
 {
 	struct fib6_config cfg = {
-		.fc_table	= l3mdev_fib_table(dev) ? : RT6_TABLE_DFLT,
+		.fc_table	= addrconf_rt_table(dev, RT6_TABLE_DFLT),
 		.fc_metric	= IP6_RT_PRIO_USER,
 		.fc_ifindex	= dev->ifindex,
 		.fc_flags	= RTF_GATEWAY | RTF_ADDRCONF | RTF_DEFAULT |
-- 
2.11.0.390.gc69c2f50cf-goog

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables
  2017-01-06 15:30 [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables Lorenzo Colitti
@ 2017-01-08  4:24 ` David Ahern
  2017-01-09 21:53   ` Andrey Jr. Melnikov
  2017-01-10  2:01   ` Lorenzo Colitti
  0 siblings, 2 replies; 14+ messages in thread
From: David Ahern @ 2017-01-08  4:24 UTC (permalink / raw)
  To: Lorenzo Colitti, netdev
  Cc: zenczykowski, hannes, ek, hideaki.yoshifuji, davem, drosen

On 1/6/17 8:30 AM, Lorenzo Colitti wrote:
> This patch adds a per-interface sysctl to have the kernel put
> autoconf routes into different tables. This allows each interface
> to have its own routing table if desired.  Choosing the default
> interface, or using different interfaces at the same time on a
> per-socket or per-packet basis) can be done using policy routing
> mechanisms that use as SO_BINDTODEVICE / IPV6_PKTINFO, mark-based
> routing, or UID-based routing to select specific routing tables.

Why not use the VRF capability then? create a VRF and assign the interface to it. End result is the same -- separate tables and the need to use a bind-to-device API to hit those routes.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables
  2017-01-08  4:24 ` David Ahern
@ 2017-01-09 21:53   ` Andrey Jr. Melnikov
  2017-01-10  2:01   ` Lorenzo Colitti
  1 sibling, 0 replies; 14+ messages in thread
From: Andrey Jr. Melnikov @ 2017-01-09 21:53 UTC (permalink / raw)
  To: netdev

David Ahern <dsa@cumulusnetworks.com> wrote:
> On 1/6/17 8:30 AM, Lorenzo Colitti wrote:
> > This patch adds a per-interface sysctl to have the kernel put
> > autoconf routes into different tables. This allows each interface
> > to have its own routing table if desired.  Choosing the default
> > interface, or using different interfaces at the same time on a
> > per-socket or per-packet basis) can be done using policy routing
> > mechanisms that use as SO_BINDTODEVICE / IPV6_PKTINFO, mark-based
> > routing, or UID-based routing to select specific routing tables.

> Why not use the VRF capability then? create a VRF and assign the interface to it. 
> End result is the same -- separate tables and the need to use a bind-to-device API to hit those routes.

Show *really working* config with VRF & IPv6?

In my tests - kernel unable to accept DAD, fill logs with "ICMPv6: RA: ndisc_router_discovery failed to add default route"
and nothing work. VRF interface don't contains IPv6 address.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables
  2017-01-08  4:24 ` David Ahern
  2017-01-09 21:53   ` Andrey Jr. Melnikov
@ 2017-01-10  2:01   ` Lorenzo Colitti
  2017-01-10  2:08     ` David Ahern
  2017-01-12 15:17     ` David Ahern
  1 sibling, 2 replies; 14+ messages in thread
From: Lorenzo Colitti @ 2017-01-10  2:01 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, Maciej Żenczykowski, Hannes Frederic Sowa,
	Erik Kline, YOSHIFUJI Hideaki, David Miller, Daniel Rosenberg

On Sun, Jan 8, 2017 at 1:24 PM, David Ahern <dsa@cumulusnetworks.com> wrote:
> Why not use the VRF capability then? create a VRF and assign the interface to it. End result is the same -- separate tables and the need to use a bind-to-device API to hit those routes.

Requiring that VRFs for this creates additional complexity, because
each network now requires its own VRF. That means that the connection
manager must create the VRF before the interface comes up and receives
the RA.

In some cases this might not be possible. For example, consider a tun
interface that's created by a different process such as a VPN client.
In this case the connection manager doesn't know the interface name,
and the VPN client doesn't know to create the VRF, so if the tun
interface gets an RA after the tun is created but

As others have mentioned, IPv6 on VRFs in client mode is also not
necessarily well-supported at the moment, and I don't know how long it
would take for it to be (assuming it can be made to work properly in
client mode without breaking the primary use cases for VRFs).

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables
  2017-01-10  2:01   ` Lorenzo Colitti
@ 2017-01-10  2:08     ` David Ahern
  2017-01-10  2:29       ` Lorenzo Colitti
  2017-01-12 15:17     ` David Ahern
  1 sibling, 1 reply; 14+ messages in thread
From: David Ahern @ 2017-01-10  2:08 UTC (permalink / raw)
  To: Lorenzo Colitti
  Cc: netdev, Maciej Żenczykowski, Hannes Frederic Sowa,
	Erik Kline, YOSHIFUJI Hideaki, David Miller, Daniel Rosenberg

On 1/9/17 7:01 PM, Lorenzo Colitti wrote:
> As others have mentioned, IPv6 on VRFs in client mode is also not
> necessarily well-supported at the moment, and I don't know how long it
> would take for it to be (assuming it can be made to work properly in
> client mode without breaking the primary use cases for VRFs).

That's news to me. What about IPv6 and VRF is not working or well-supported?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables
  2017-01-10  2:08     ` David Ahern
@ 2017-01-10  2:29       ` Lorenzo Colitti
  2017-01-10  3:04         ` David Ahern
  0 siblings, 1 reply; 14+ messages in thread
From: Lorenzo Colitti @ 2017-01-10  2:29 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, Maciej Żenczykowski, Hannes Frederic Sowa,
	Erik Kline, YOSHIFUJI Hideaki, David Miller, Daniel Rosenberg,
	temnota.am

On Tue, Jan 10, 2017 at 11:08 AM, David Ahern <dsa@cumulusnetworks.com> wrote:
> That's news to me. What about IPv6 and VRF is not working or well-supported?

I have no firsthand experience of this myself, but if the problems
that Andrey reports above in this thread are real, then those would
indicate that the code is not well-supported. Being unable to accept
DAD is a pretty serious issue. Andrey, what version of the kernel did
you see this on?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables
  2017-01-10  2:29       ` Lorenzo Colitti
@ 2017-01-10  3:04         ` David Ahern
  2017-01-10  3:30           ` Lorenzo Colitti
  0 siblings, 1 reply; 14+ messages in thread
From: David Ahern @ 2017-01-10  3:04 UTC (permalink / raw)
  To: Lorenzo Colitti
  Cc: netdev, Maciej Żenczykowski, Hannes Frederic Sowa,
	Erik Kline, YOSHIFUJI Hideaki, David Miller, Daniel Rosenberg,
	temnota.am

On 1/9/17 7:29 PM, Lorenzo Colitti wrote:
> On Tue, Jan 10, 2017 at 11:08 AM, David Ahern <dsa@cumulusnetworks.com> wrote:
>> That's news to me. What about IPv6 and VRF is not working or well-supported?
> 
> I have no firsthand experience of this myself, but if the problems
> that Andrey reports above in this thread are real, then those would
> indicate that the code is not well-supported. Being unable to accept
> DAD is a pretty serious issue. Andrey, what version of the kernel did
> you see this on?
> 

Are you referencing an Android or google thread? This patch thread has not mentioned any IPv6 problems.

Just a day or so ago I did a dup address test on an interface in a vrf and it worked fine, but perhaps I missed something.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables
  2017-01-10  3:04         ` David Ahern
@ 2017-01-10  3:30           ` Lorenzo Colitti
  2017-01-10  3:39             ` David Ahern
  0 siblings, 1 reply; 14+ messages in thread
From: Lorenzo Colitti @ 2017-01-10  3:30 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, Maciej Żenczykowski, Hannes Frederic Sowa,
	Erik Kline, YOSHIFUJI Hideaki, David Miller, Daniel Rosenberg,
	temnota.am

On Tue, Jan 10, 2017 at 12:04 PM, David Ahern <dsa@cumulusnetworks.com> wrote:
> > I have no firsthand experience of this myself, but if the problems
> > that Andrey reports above in this thread are real, then those would
> > indicate that the code is not well-supported. Being unable to accept
> > DAD is a pretty serious issue. Andrey, what version of the kernel did
> > you see this on?
>
> Are you referencing an Android or google thread? This patch thread has not mentioned any IPv6 problems.

No, this thread. I see message-ID 8cddkd-etc.ln1@banana.localnet in
this thread. If you didn't get it, I also see it in the comments on
patchwork - https://patchwork.ozlabs.org/patch/711956/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables
  2017-01-10  3:30           ` Lorenzo Colitti
@ 2017-01-10  3:39             ` David Ahern
  2017-01-10 13:21               ` Andrey Jr. Melnikov
  0 siblings, 1 reply; 14+ messages in thread
From: David Ahern @ 2017-01-10  3:39 UTC (permalink / raw)
  To: Lorenzo Colitti
  Cc: netdev, Maciej Żenczykowski, Hannes Frederic Sowa,
	Erik Kline, YOSHIFUJI Hideaki, David Miller, Daniel Rosenberg,
	temnota.am

On 1/9/17 8:30 PM, Lorenzo Colitti wrote:
> On Tue, Jan 10, 2017 at 12:04 PM, David Ahern <dsa@cumulusnetworks.com> wrote:
>>> I have no firsthand experience of this myself, but if the problems
>>> that Andrey reports above in this thread are real, then those would
>>> indicate that the code is not well-supported. Being unable to accept
>>> DAD is a pretty serious issue. Andrey, what version of the kernel did
>>> you see this on?
>>
>> Are you referencing an Android or google thread? This patch thread has not mentioned any IPv6 problems.
> 
> No, this thread. I see message-ID 8cddkd-etc.ln1@banana.localnet in
> this thread. If you didn't get it, I also see it in the comments on
> patchwork - https://patchwork.ozlabs.org/patch/711956/
> 

Odd that I did not get that -- checked spam and trash.

Andrey is missing 830218c1add1da16519b71909e5cf21522b7d062 which tells me the comment is not based on 4.10 or net-next:

$ git describe 830218c1add1da16519b71909e5cf21522b7d062
v4.8-14744-g830218c1add1

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables
  2017-01-10  3:39             ` David Ahern
@ 2017-01-10 13:21               ` Andrey Jr. Melnikov
  2017-01-10 17:47                 ` Lorenzo Colitti
  0 siblings, 1 reply; 14+ messages in thread
From: Andrey Jr. Melnikov @ 2017-01-10 13:21 UTC (permalink / raw)
  To: netdev

David Ahern <dsa@cumulusnetworks.com> wrote:
> On 1/9/17 8:30 PM, Lorenzo Colitti wrote:
> > On Tue, Jan 10, 2017 at 12:04 PM, David Ahern <dsa@cumulusnetworks.com> wrote:
> >>> I have no firsthand experience of this myself, but if the problems
> >>> that Andrey reports above in this thread are real, then those would
> >>> indicate that the code is not well-supported. Being unable to accept
> >>> DAD is a pretty serious issue. Andrey, what version of the kernel did
> >>> you see this on?
> >>
> >> Are you referencing an Android or google thread? This patch thread has not mentioned any IPv6 problems.
> > 
> > No, this thread. I see message-ID 8cddkd-etc.ln1@banana.localnet in
> > this thread. If you didn't get it, I also see it in the comments on
> > patchwork - https://patchwork.ozlabs.org/patch/711956/
> > 

> Odd that I did not get that -- checked spam and trash.

> Andrey is missing 830218c1add1da16519b71909e5cf21522b7d062 which tells me the comment is not based on 4.10 or net-next:

> $ git describe 830218c1add1da16519b71909e5cf21522b7d062
> v4.8-14744-g830218c1add1

Good catch. I'm running 4.8 without this patch. Current 4.10-rc works. Sorry
for noise.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables
  2017-01-10 13:21               ` Andrey Jr. Melnikov
@ 2017-01-10 17:47                 ` Lorenzo Colitti
  2017-01-11 14:11                   ` David Miller
  0 siblings, 1 reply; 14+ messages in thread
From: Lorenzo Colitti @ 2017-01-10 17:47 UTC (permalink / raw)
  To: Andrey Jr. Melnikov; +Cc: netdev

On Tue, Jan 10, 2017 at 10:21 PM, Andrey Jr. Melnikov
<temnota.am@gmail.com> wrote:
>
> > >>> I have no firsthand experience of this myself, but if the problems
> > >>> that Andrey reports above in this thread are real, then those would
> > >>> indicate that the code is not well-supported. Being unable to accept
> > >>> DAD is a pretty serious issue. Andrey, what version of the kernel did
> > >>> you see this on?
>
> Good catch. I'm running 4.8 without this patch. Current 4.10-rc works. Sorry
> for noise.

Ack. As I said before, I haven't seen this myself. Shouldn't have made
assertions without firsthand evidence.

That said, I think this patch is useful even though autoconf on VRFs
works the same way. One reason is the example I provided above: it
works even for interfaces that don't exist yet, whereas a VRF has to
be created ahead of time, which means that the interface cannot
immediately come up and receive an RA or its configuration will be
incorrect.

I also think that from a configuration perspective it's not
necessarily useful to have one VRF for every interface, but that sort
of depends on your point of view. Perhaps it's fine on a client system
to have both vrf-wlan0 and wlan0, and vrf-eth0 and eth0. That might be
confusing to users but maybe users don't really care?

More in general I think that using a VRFs is buying into a bigger set
of assumptions/restrictions than this patch does. For example, if I'm
reading ipv6_dev_get_saddr correctly, once you put an interface in a
VRF you can't really use the weak host model any more, because the
stack won't pick a source address from outside the VRF if the route
lookup returned a route in the VRF. Turning on the functionality in
patch is a more minimal change that only affects autoconf.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables
  2017-01-10 17:47                 ` Lorenzo Colitti
@ 2017-01-11 14:11                   ` David Miller
  2017-01-11 16:46                     ` Lorenzo Colitti
  0 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2017-01-11 14:11 UTC (permalink / raw)
  To: lorenzo; +Cc: temnota.am, netdev

From: Lorenzo Colitti <lorenzo@google.com>
Date: Wed, 11 Jan 2017 02:47:55 +0900

> On Tue, Jan 10, 2017 at 10:21 PM, Andrey Jr. Melnikov
> <temnota.am@gmail.com> wrote:
>>
>> > >>> I have no firsthand experience of this myself, but if the problems
>> > >>> that Andrey reports above in this thread are real, then those would
>> > >>> indicate that the code is not well-supported. Being unable to accept
>> > >>> DAD is a pretty serious issue. Andrey, what version of the kernel did
>> > >>> you see this on?
>>
>> Good catch. I'm running 4.8 without this patch. Current 4.10-rc works. Sorry
>> for noise.
> 
> Ack. As I said before, I haven't seen this myself. Shouldn't have made
> assertions without firsthand evidence.
> 
> That said, I think this patch is useful even though autoconf on VRFs
> works the same way. One reason is the example I provided above: it
> works even for interfaces that don't exist yet, whereas a VRF has to
> be created ahead of time, which means that the interface cannot
> immediately come up and receive an RA or its configuration will be
> incorrect.
> 
> I also think that from a configuration perspective it's not
> necessarily useful to have one VRF for every interface, but that sort
> of depends on your point of view. Perhaps it's fine on a client system
> to have both vrf-wlan0 and wlan0, and vrf-eth0 and eth0. That might be
> confusing to users but maybe users don't really care?
> 
> More in general I think that using a VRFs is buying into a bigger set
> of assumptions/restrictions than this patch does. For example, if I'm
> reading ipv6_dev_get_saddr correctly, once you put an interface in a
> VRF you can't really use the weak host model any more, because the
> stack won't pick a source address from outside the VRF if the route
> lookup returned a route in the VRF. Turning on the functionality in
> patch is a more minimal change that only affects autoconf.

I understand what you're saying, but if you look at how apps can be
put into hierarchical control groups, and automatically bind to VRF's
based upon where they are in that cgroup hierarchy, it matches your
use case precisely.

Maybe the setup is not ideal, but architectually it does everything
you want.  And we strongly try to avoid adding multiple ways to do the
same thing.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables
  2017-01-11 14:11                   ` David Miller
@ 2017-01-11 16:46                     ` Lorenzo Colitti
  0 siblings, 0 replies; 14+ messages in thread
From: Lorenzo Colitti @ 2017-01-11 16:46 UTC (permalink / raw)
  To: David Miller; +Cc: Andrey Jr. Melnikov, netdev

On Wed, Jan 11, 2017 at 11:11 PM, David Miller <davem@davemloft.net> wrote:
> I understand what you're saying, but if you look at how apps can be
> put into hierarchical control groups, and automatically bind to VRF's
> based upon where they are in that cgroup hierarchy, it matches your
> use case precisely.

I think whether an app is in bound to a certain VRF or not is not
directly related to this patch. What this patch does is provide a way
to ensure that routes learned via autoconf go into a specific routing
table, so that policy routing rules can select them.

Without this patch, and without VRFs, the routes for all networks and
all interfaces all go into the same routing table (main). That doesn't
work well on a multinetwork device. As David A. points out, with VRFs
this can be done - since each VRF has its own routing table, the
routes are isolated and ip rules can be applied to determine which
ones are used. However, I'm not convinced that VRFs are a great
solution to this problem. A couple of problems I see here are:

1. When an interface is created on the fly, the system must guarantee
that a VRF for it exists, and the interface is put into it, before it
comes up and receives an RA. This is not insurmountable - for example,
you can set net.conf.default.disable_ipv6 to 1, and set it to 0 on the
interface once it's in a VRF. Not sure this is feasible on a
mainstream distribution, but it could be done on something like
Android that's more tightly integrated.

2. I'm not sure it's possible to use routing policy to select between
interfaces in the same VRF. For example, if you have a carrier that
provides the user with the same IP address and similar connectivity on
both cellular data and a carrier-operated wifi network, and you put
those two in the same VRF, I don't see a way to say via routing policy
"prefer wifi over cellular", because the routes for both are in the
same table and AIUI the only discriminator between the two - the oif -
has to be set to the VRF ifindex. I suppose it might be possible to
alter the metrics of routes that were previously created by autoconf
but that sounds desperately hacky.

I suppose both of these could also be resolved by ensuring that each
interface is its own VRF at creation time. But if every VRF only
contains exactly one interface, the VRF construct doesn't really seem
useful.

That said, we've maintained this patch out of tree for a few years now
and we can continue to do so for a while longer. When the VRF code
rolls into enough SoC kernels we can make an attempt to use it and see
what issues we find.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables
  2017-01-10  2:01   ` Lorenzo Colitti
  2017-01-10  2:08     ` David Ahern
@ 2017-01-12 15:17     ` David Ahern
  1 sibling, 0 replies; 14+ messages in thread
From: David Ahern @ 2017-01-12 15:17 UTC (permalink / raw)
  To: Lorenzo Colitti
  Cc: netdev, Maciej Żenczykowski, Hannes Frederic Sowa,
	Erik Kline, YOSHIFUJI Hideaki, David Miller, Daniel Rosenberg

On 1/9/17 7:01 PM, Lorenzo Colitti wrote:
> On Sun, Jan 8, 2017 at 1:24 PM, David Ahern <dsa@cumulusnetworks.com> wrote:
>> Why not use the VRF capability then? create a VRF and assign the interface to it. End result is the same -- separate tables and the need to use a bind-to-device API to hit those routes.
> 
> Requiring that VRFs for this creates additional complexity, because
> each network now requires its own VRF. That means that the connection
> manager must create the VRF before the interface comes up and receives
> the RA.
> 
> In some cases this might not be possible. For example, consider a tun
> interface that's created by a different process such as a VPN client.
> In this case the connection manager doesn't know the interface name,
> and the VPN client doesn't know to create the VRF, so if the tun
> interface gets an RA after the tun is created but

Have you looked at adding basic l3mdev capabilities to tun? in this case just l3mdev_fib_table needs be implemented. On interface create push down a table id and set the IFF_L3MDEV_MASTER flag.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-01-12 15:17 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-06 15:30 [PATCH net-next] net: ipv6: put autoconf routes into per-interface tables Lorenzo Colitti
2017-01-08  4:24 ` David Ahern
2017-01-09 21:53   ` Andrey Jr. Melnikov
2017-01-10  2:01   ` Lorenzo Colitti
2017-01-10  2:08     ` David Ahern
2017-01-10  2:29       ` Lorenzo Colitti
2017-01-10  3:04         ` David Ahern
2017-01-10  3:30           ` Lorenzo Colitti
2017-01-10  3:39             ` David Ahern
2017-01-10 13:21               ` Andrey Jr. Melnikov
2017-01-10 17:47                 ` Lorenzo Colitti
2017-01-11 14:11                   ` David Miller
2017-01-11 16:46                     ` Lorenzo Colitti
2017-01-12 15:17     ` David Ahern

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).