* [PATCH] net: sysctl for RA default route MTU
@ 2015-03-24 18:03 Roman Gushchin
  2015-03-24 19:27 ` David Miller
  0 siblings, 1 reply; 20+ messages in thread
From: Roman Gushchin @ 2015-03-24 18:03 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, Roman Gushchin

This patch introduces a new IPv6 sysctl: ra_default_route_mtu.
If it is set (> 0), it defines the per-route MTU for any new default route
received via RA.

This sysctl will help in the following configuration: we want to use
jumbo frames for internal networks and standard Ethernet frames for the
default route. A per-route MTU can only lower the per-link MTU, so the
link MTU should be set to ~9000 (statically or via RA).

Due to the dynamic nature of RA, setting the MTU for the default route
will require a userspace agent that will monitor changes of the default
route and (re)configure it. Not simple. The suggested sysctl solves this
problem.

Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
---
 Documentation/networking/ip-sysctl.txt | 5 +++++
 include/linux/ipv6.h                   | 1 +
 include/uapi/linux/ipv6.h              | 1 +
 net/ipv6/addrconf.c                    | 9 +++++++++
 net/ipv6/ndisc.c                       | 3 ++-
 net/ipv6/route.c                       | 8 ++++++++
 6 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 1b8c964..c013dda 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -1316,6 +1316,11 @@ accept_ra_mtu - BOOLEAN
 	Functional default: enabled if accept_ra is enabled.
 			    disabled if accept_ra is disabled.
 
+ra_default_route_mtu - INTEGER
+	Define MTU for any new default route received by RA.
+
+	Functional default: disabled (0).
+
 accept_redirects - BOOLEAN
 	Accept Redirects.
 
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 4d5169f..b310c9f 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -40,6 +40,7 @@ struct ipv6_devconf {
 	__s32		proxy_ndp;
 	__s32		accept_source_route;
 	__s32		accept_ra_from_local;
+	__s32		ra_default_route_mtu;
 #ifdef CONFIG_IPV6_OPTIMISTIC_DAD
 	__s32		optimistic_dad;
 	__s32		use_optimistic;
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index 437a6a4..4539c31 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -170,6 +170,7 @@ enum {
 	DEVCONF_ACCEPT_RA_FROM_LOCAL,
 	DEVCONF_USE_OPTIMISTIC,
 	DEVCONF_ACCEPT_RA_MTU,
+	DEVCONF_RA_DEFAULT_ROUTE_MTU,
 	DEVCONF_MAX
 };
 
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index b603002..9d5ec10 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -189,6 +189,7 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
 	.accept_ra_defrtr	= 1,
 	.accept_ra_from_local	= 0,
 	.accept_ra_pinfo	= 1,
+	.ra_default_route_mtu	= 0,
 #ifdef CONFIG_IPV6_ROUTER_PREF
 	.accept_ra_rtr_pref	= 1,
 	.rtr_probe_interval	= 60 * HZ,
@@ -240,6 +241,7 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
 	.accept_dad		= 1,
 	.suppress_frag_ndisc	= 1,
 	.accept_ra_mtu		= 1,
+	.ra_default_route_mtu	= 0,
 };
 
 /* Check if a valid qdisc is available */
@@ -4398,6 +4400,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
 	array[DEVCONF_SUPPRESS_FRAG_NDISC] = cnf->suppress_frag_ndisc;
 	array[DEVCONF_ACCEPT_RA_FROM_LOCAL] = cnf->accept_ra_from_local;
 	array[DEVCONF_ACCEPT_RA_MTU] = cnf->accept_ra_mtu;
+	array[DEVCONF_RA_DEFAULT_ROUTE_MTU] = cnf->ra_default_route_mtu;
 }
 
 static inline size_t inet6_ifla6_size(void)
@@ -5314,6 +5317,12 @@ static struct addrconf_sysctl_table
 			.mode		= 0644,
 			.proc_handler	= proc_dointvec,
 		},
+			.procname	= "ra_default_route_mtu",
+			.data		= &ipv6_devconf.ra_default_route_mtu,
+			.maxlen		= sizeof(int),
+			.mode		= 0644,
+			.proc_handler	= proc_dointvec,
+		},
 		{
 			/* sentinel */
 		}
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 471ed24..c70ab44 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1362,7 +1362,8 @@ skip_routeinfo:
 		} else if (in6_dev->cnf.mtu6 != mtu) {
 			in6_dev->cnf.mtu6 = mtu;
 
-			if (rt)
+			if (rt && (!in6_dev->cnf.ra_default_route_mtu ||
+				   mtu < in6_dev->cnf.ra_default_route_mtu))
 				dst_metric_set(&rt->dst, RTAX_MTU, mtu);
 
 			rt6_mtu_change(skb->dev, mtu);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 4688bd4..6394adf 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1714,6 +1714,14 @@ int ip6_route_add(struct fib6_config *cfg)
 
 	rt->rt6i_flags = cfg->fc_flags;
 
+	if ((cfg->fc_flags & (RTF_ADDRCONF | RTF_DEFAULT | RTF_GATEWAY)) ==
+	    (RTF_ADDRCONF | RTF_DEFAULT | RTF_GATEWAY)) {
+		u32 mtu = idev->cnf.ra_default_route_mtu;
+
+		if (mtu && mtu >= IPV6_MIN_MTU && mtu <= idev->cnf.mtu6)
+			dst_metric_set(&rt->dst, RTAX_MTU, mtu);
+	}
+
 install_route:
 	rt->dst.dev = dev;
 	rt->rt6i_idev = idev;
-- 
2.1.0
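
Note on the route.c hunk above: the rule it applies when an RA-learned
default route is created can be modelled in plain userspace C. This is
only a sketch under stated assumptions. The function name
ra_default_route_mtu_for() and the SKETCH_RTF_* bit values are
hypothetical stand-ins, not the kernel's definitions; only the 1280-byte
IPv6 minimum MTU and the shape of the checks come from the patch.

```c
#include <stdint.h>

/* IPv6 minimum link MTU; the kernel names this constant IPV6_MIN_MTU. */
#define SKETCH_IPV6_MIN_MTU 1280u

/*
 * Illustrative stand-ins for RTF_GATEWAY, RTF_DEFAULT and RTF_ADDRCONF.
 * The bit values are assumptions made for this sketch only.
 */
#define SKETCH_RTF_GATEWAY  0x1u
#define SKETCH_RTF_DEFAULT  0x2u
#define SKETCH_RTF_ADDRCONF 0x4u

/*
 * Return the per-route MTU the new check would install on a freshly
 * created route, or 0 when the route's MTU metric is left untouched.
 */
static uint32_t ra_default_route_mtu_for(uint32_t flags, uint32_t knob,
					 uint32_t link_mtu)
{
	const uint32_t mask = SKETCH_RTF_ADDRCONF | SKETCH_RTF_DEFAULT |
			      SKETCH_RTF_GATEWAY;

	/* Only default routes learned from an RA via a gateway qualify. */
	if ((flags & mask) != mask)
		return 0;

	/* 0 means the sysctl is disabled. */
	if (knob == 0)
		return 0;

	/* The value must be a legal IPv6 MTU and must not exceed the link MTU. */
	if (knob < SKETCH_IPV6_MIN_MTU || knob > link_mtu)
		return 0;

	return knob;
}
```

With a 9000-byte link MTU and the knob set to 1500, a qualifying route
gets 1500. A knob below 1280 or above the link MTU is silently ignored.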


* Re: [PATCH] net: sysctl for RA default route MTU
  2015-03-24 18:03 [PATCH] net: sysctl for RA default route MTU Roman Gushchin
@ 2015-03-24 19:27 ` David Miller
  2015-03-25  9:49   ` Roman Gushchin
  2015-03-25 11:07   ` [PATCH v2] " Roman Gushchin
  0 siblings, 2 replies; 20+ messages in thread
From: David Miller @ 2015-03-24 19:27 UTC (permalink / raw)
  To: klamm; +Cc: linux-kernel


netdev@vger.kernel.org is the correct place to submit networking patches,
thank you.

* [PATCH] net: sysctl for RA default route MTU
  2015-03-24 19:27 ` David Miller
@ 2015-03-25  9:49   ` Roman Gushchin
  2015-03-25 11:07   ` [PATCH v2] " Roman Gushchin
  1 sibling, 0 replies; 20+ messages in thread
From: Roman Gushchin @ 2015-03-25  9:49 UTC (permalink / raw)
  To: David S. Miller, netdev; +Cc: linux-kernel, Roman Gushchin

This patch introduces a new IPv6 sysctl: ra_default_route_mtu.
If it is set (> 0), it defines the per-route MTU for any new default route
received via RA.

This sysctl will help in the following configuration: we want to use
jumbo frames for internal networks and standard Ethernet frames for the
default route. A per-route MTU can only lower the per-link MTU, so the
link MTU should be set to ~9000 (statically or via RA).

Due to the dynamic nature of RA, setting the MTU for the default route
will require a userspace agent that will monitor changes of the default
route and (re)configure it. Not simple. The suggested sysctl solves this
problem.

Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
---
 Documentation/networking/ip-sysctl.txt | 5 +++++
 include/linux/ipv6.h                   | 1 +
 include/uapi/linux/ipv6.h              | 1 +
 net/ipv6/addrconf.c                    | 9 +++++++++
 net/ipv6/ndisc.c                       | 3 ++-
 net/ipv6/route.c                       | 8 ++++++++
 6 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 1b8c964..c013dda 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -1316,6 +1316,11 @@ accept_ra_mtu - BOOLEAN
 	Functional default: enabled if accept_ra is enabled.
 			    disabled if accept_ra is disabled.
 
+ra_default_route_mtu - INTEGER
+	Define MTU for any new default route received by RA.
+
+	Functional default: disabled (0).
+
 accept_redirects - BOOLEAN
 	Accept Redirects.
 
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 4d5169f..b310c9f 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -40,6 +40,7 @@ struct ipv6_devconf {
 	__s32		proxy_ndp;
 	__s32		accept_source_route;
 	__s32		accept_ra_from_local;
+	__s32		ra_default_route_mtu;
 #ifdef CONFIG_IPV6_OPTIMISTIC_DAD
 	__s32		optimistic_dad;
 	__s32		use_optimistic;
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index 437a6a4..4539c31 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -170,6 +170,7 @@ enum {
 	DEVCONF_ACCEPT_RA_FROM_LOCAL,
 	DEVCONF_USE_OPTIMISTIC,
 	DEVCONF_ACCEPT_RA_MTU,
+	DEVCONF_RA_DEFAULT_ROUTE_MTU,
 	DEVCONF_MAX
 };
 
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index b603002..9d5ec10 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -189,6 +189,7 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
 	.accept_ra_defrtr	= 1,
 	.accept_ra_from_local	= 0,
 	.accept_ra_pinfo	= 1,
+	.ra_default_route_mtu	= 0,
 #ifdef CONFIG_IPV6_ROUTER_PREF
 	.accept_ra_rtr_pref	= 1,
 	.rtr_probe_interval	= 60 * HZ,
@@ -240,6 +241,7 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
 	.accept_dad		= 1,
 	.suppress_frag_ndisc	= 1,
 	.accept_ra_mtu		= 1,
+	.ra_default_route_mtu	= 0,
 };
 
 /* Check if a valid qdisc is available */
@@ -4398,6 +4400,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
 	array[DEVCONF_SUPPRESS_FRAG_NDISC] = cnf->suppress_frag_ndisc;
 	array[DEVCONF_ACCEPT_RA_FROM_LOCAL] = cnf->accept_ra_from_local;
 	array[DEVCONF_ACCEPT_RA_MTU] = cnf->accept_ra_mtu;
+	array[DEVCONF_RA_DEFAULT_ROUTE_MTU] = cnf->ra_default_route_mtu;
 }
 
 static inline size_t inet6_ifla6_size(void)
@@ -5314,6 +5317,12 @@ static struct addrconf_sysctl_table
 			.mode		= 0644,
 			.proc_handler	= proc_dointvec,
 		},
+			.procname	= "ra_default_route_mtu",
+			.data		= &ipv6_devconf.ra_default_route_mtu,
+			.maxlen		= sizeof(int),
+			.mode		= 0644,
+			.proc_handler	= proc_dointvec,
+		},
 		{
 			/* sentinel */
 		}
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 471ed24..c70ab44 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1362,7 +1362,8 @@ skip_routeinfo:
 		} else if (in6_dev->cnf.mtu6 != mtu) {
 			in6_dev->cnf.mtu6 = mtu;
 
-			if (rt)
+			if (rt && (!in6_dev->cnf.ra_default_route_mtu ||
+				   mtu < in6_dev->cnf.ra_default_route_mtu))
 				dst_metric_set(&rt->dst, RTAX_MTU, mtu);
 
 			rt6_mtu_change(skb->dev, mtu);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 4688bd4..6394adf 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1714,6 +1714,14 @@ int ip6_route_add(struct fib6_config *cfg)
 
 	rt->rt6i_flags = cfg->fc_flags;
 
+	if ((cfg->fc_flags & (RTF_ADDRCONF | RTF_DEFAULT | RTF_GATEWAY)) ==
+	    (RTF_ADDRCONF | RTF_DEFAULT | RTF_GATEWAY)) {
+		u32 mtu = idev->cnf.ra_default_route_mtu;
+
+		if (mtu && mtu >= IPV6_MIN_MTU && mtu <= idev->cnf.mtu6)
+			dst_metric_set(&rt->dst, RTAX_MTU, mtu);
+	}
+
 install_route:
 	rt->dst.dev = dev;
 	rt->rt6i_idev = idev;
-- 
2.1.0


* [PATCH v2] net: sysctl for RA default route MTU
  2015-03-24 19:27 ` David Miller
  2015-03-25  9:49   ` Roman Gushchin
@ 2015-03-25 11:07   ` Roman Gushchin
  2015-03-25 12:34     ` Denis Kirjanov
  2015-03-25 15:52     ` Hannes Frederic Sowa
  1 sibling, 2 replies; 20+ messages in thread
From: Roman Gushchin @ 2015-03-25 11:07 UTC (permalink / raw)
  To: David S. Miller, netdev; +Cc: linux-kernel, Roman Gushchin

This patch introduces a new IPv6 sysctl: ra_default_route_mtu.
If it is set (> 0), it defines the per-route MTU for any new default route
received via RA.

This sysctl will help in the following configuration: we want to use
jumbo frames for internal networks and standard Ethernet frames for the
default route. A per-route MTU can only lower the per-link MTU, so the
link MTU should be set to ~9000 (statically or via RA).

Due to the dynamic nature of RA, setting the MTU for the default route
will require a userspace agent that will monitor changes of the default
route and (re)configure it. Not simple. The suggested sysctl solves this
problem.

Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>

---

Changes from v1: add forgotten brace.
---
 Documentation/networking/ip-sysctl.txt |  5 +++++
 include/linux/ipv6.h                   |  1 +
 include/uapi/linux/ipv6.h              |  1 +
 net/ipv6/addrconf.c                    | 10 ++++++++++
 net/ipv6/ndisc.c                       |  3 ++-
 net/ipv6/route.c                       |  8 ++++++++
 6 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 1b8c964..c013dda 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -1316,6 +1316,11 @@ accept_ra_mtu - BOOLEAN
 	Functional default: enabled if accept_ra is enabled.
 			    disabled if accept_ra is disabled.
 
+ra_default_route_mtu - INTEGER
+	Define MTU for any new default route received by RA.
+
+	Functional default: disabled (0).
+
 accept_redirects - BOOLEAN
 	Accept Redirects.
 
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 4d5169f..b310c9f 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -40,6 +40,7 @@ struct ipv6_devconf {
 	__s32		proxy_ndp;
 	__s32		accept_source_route;
 	__s32		accept_ra_from_local;
+	__s32		ra_default_route_mtu;
 #ifdef CONFIG_IPV6_OPTIMISTIC_DAD
 	__s32		optimistic_dad;
 	__s32		use_optimistic;
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index 437a6a4..4539c31 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -170,6 +170,7 @@ enum {
 	DEVCONF_ACCEPT_RA_FROM_LOCAL,
 	DEVCONF_USE_OPTIMISTIC,
 	DEVCONF_ACCEPT_RA_MTU,
+	DEVCONF_RA_DEFAULT_ROUTE_MTU,
 	DEVCONF_MAX
 };
 
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index b603002..322dd733 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -189,6 +189,7 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
 	.accept_ra_defrtr	= 1,
 	.accept_ra_from_local	= 0,
 	.accept_ra_pinfo	= 1,
+	.ra_default_route_mtu	= 0,
 #ifdef CONFIG_IPV6_ROUTER_PREF
 	.accept_ra_rtr_pref	= 1,
 	.rtr_probe_interval	= 60 * HZ,
@@ -240,6 +241,7 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
 	.accept_dad		= 1,
 	.suppress_frag_ndisc	= 1,
 	.accept_ra_mtu		= 1,
+	.ra_default_route_mtu	= 0,
 };
 
 /* Check if a valid qdisc is available */
@@ -4398,6 +4400,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
 	array[DEVCONF_SUPPRESS_FRAG_NDISC] = cnf->suppress_frag_ndisc;
 	array[DEVCONF_ACCEPT_RA_FROM_LOCAL] = cnf->accept_ra_from_local;
 	array[DEVCONF_ACCEPT_RA_MTU] = cnf->accept_ra_mtu;
+	array[DEVCONF_RA_DEFAULT_ROUTE_MTU] = cnf->ra_default_route_mtu;
 }
 
 static inline size_t inet6_ifla6_size(void)
@@ -5315,6 +5318,13 @@ static struct addrconf_sysctl_table
 			.proc_handler	= proc_dointvec,
 		},
 		{
+			.procname	= "ra_default_route_mtu",
+			.data		= &ipv6_devconf.ra_default_route_mtu,
+			.maxlen		= sizeof(int),
+			.mode		= 0644,
+			.proc_handler	= proc_dointvec,
+		},
+		{
 			/* sentinel */
 		}
 	},
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 471ed24..c70ab44 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1362,7 +1362,8 @@ skip_routeinfo:
 		} else if (in6_dev->cnf.mtu6 != mtu) {
 			in6_dev->cnf.mtu6 = mtu;
 
-			if (rt)
+			if (rt && (!in6_dev->cnf.ra_default_route_mtu ||
+				   mtu < in6_dev->cnf.ra_default_route_mtu))
 				dst_metric_set(&rt->dst, RTAX_MTU, mtu);
 
 			rt6_mtu_change(skb->dev, mtu);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 4688bd4..6394adf 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1714,6 +1714,14 @@ int ip6_route_add(struct fib6_config *cfg)
 
 	rt->rt6i_flags = cfg->fc_flags;
 
+	if ((cfg->fc_flags & (RTF_ADDRCONF | RTF_DEFAULT | RTF_GATEWAY)) ==
+	    (RTF_ADDRCONF | RTF_DEFAULT | RTF_GATEWAY)) {
+		u32 mtu = idev->cnf.ra_default_route_mtu;
+
+		if (mtu && mtu >= IPV6_MIN_MTU && mtu <= idev->cnf.mtu6)
+			dst_metric_set(&rt->dst, RTAX_MTU, mtu);
+	}
+
 install_route:
 	rt->dst.dev = dev;
 	rt->rt6i_idev = idev;
-- 
2.1.0


* Re: [PATCH v2] net: sysctl for RA default route MTU
  2015-03-25 11:07   ` [PATCH v2] " Roman Gushchin
@ 2015-03-25 12:34     ` Denis Kirjanov
  2015-03-25 15:52     ` Hannes Frederic Sowa
  1 sibling, 0 replies; 20+ messages in thread
From: Denis Kirjanov @ 2015-03-25 12:34 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: David S. Miller, netdev, linux-kernel

On 3/25/15, Roman Gushchin <klamm@yandex-team.ru> wrote:
> This patch introduces a new IPv6 sysctl: ra_default_route_mtu.
> If it is set (> 0), it defines the per-route MTU for any new default route
> received via RA.
>
> This sysctl will help in the following configuration: we want to use
> jumbo frames for internal networks and standard Ethernet frames for the
> default route. A per-route MTU can only lower the per-link MTU, so the
> link MTU should be set to ~9000 (statically or via RA).
>
> Due to the dynamic nature of RA, setting the MTU for the default route
> will require a userspace agent that will monitor changes of the default
> route and (re)configure it. Not simple. The suggested sysctl solves this
> problem.
>
> Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>

It's net-next material

> ---
>
> Changes from v1: add forgotten brace.
> ---
>  Documentation/networking/ip-sysctl.txt |  5 +++++
>  include/linux/ipv6.h                   |  1 +
>  include/uapi/linux/ipv6.h              |  1 +
>  net/ipv6/addrconf.c                    | 10 ++++++++++
>  net/ipv6/ndisc.c                       |  3 ++-
>  net/ipv6/route.c                       |  8 ++++++++
>  6 files changed, 27 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
> index 1b8c964..c013dda 100644
> --- a/Documentation/networking/ip-sysctl.txt
> +++ b/Documentation/networking/ip-sysctl.txt
> @@ -1316,6 +1316,11 @@ accept_ra_mtu - BOOLEAN
>  	Functional default: enabled if accept_ra is enabled.
>  			    disabled if accept_ra is disabled.
>
> +ra_default_route_mtu - INTEGER
> +	Define MTU for any new default route received by RA.
> +
> +	Functional default: disabled (0).
> +
>  accept_redirects - BOOLEAN
>  	Accept Redirects.
>
> diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
> index 4d5169f..b310c9f 100644
> --- a/include/linux/ipv6.h
> +++ b/include/linux/ipv6.h
> @@ -40,6 +40,7 @@ struct ipv6_devconf {
>  	__s32		proxy_ndp;
>  	__s32		accept_source_route;
>  	__s32		accept_ra_from_local;
> +	__s32		ra_default_route_mtu;
>  #ifdef CONFIG_IPV6_OPTIMISTIC_DAD
>  	__s32		optimistic_dad;
>  	__s32		use_optimistic;
> diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
> index 437a6a4..4539c31 100644
> --- a/include/uapi/linux/ipv6.h
> +++ b/include/uapi/linux/ipv6.h
> @@ -170,6 +170,7 @@ enum {
>  	DEVCONF_ACCEPT_RA_FROM_LOCAL,
>  	DEVCONF_USE_OPTIMISTIC,
>  	DEVCONF_ACCEPT_RA_MTU,
> +	DEVCONF_RA_DEFAULT_ROUTE_MTU,
>  	DEVCONF_MAX
>  };
>
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index b603002..322dd733 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -189,6 +189,7 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
>  	.accept_ra_defrtr	= 1,
>  	.accept_ra_from_local	= 0,
>  	.accept_ra_pinfo	= 1,
> +	.ra_default_route_mtu	= 0,
>  #ifdef CONFIG_IPV6_ROUTER_PREF
>  	.accept_ra_rtr_pref	= 1,
>  	.rtr_probe_interval	= 60 * HZ,
> @@ -240,6 +241,7 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
>  	.accept_dad		= 1,
>  	.suppress_frag_ndisc	= 1,
>  	.accept_ra_mtu		= 1,
> +	.ra_default_route_mtu	= 0,
>  };
>
>  /* Check if a valid qdisc is available */
> @@ -4398,6 +4400,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
>  	array[DEVCONF_SUPPRESS_FRAG_NDISC] = cnf->suppress_frag_ndisc;
>  	array[DEVCONF_ACCEPT_RA_FROM_LOCAL] = cnf->accept_ra_from_local;
>  	array[DEVCONF_ACCEPT_RA_MTU] = cnf->accept_ra_mtu;
> +	array[DEVCONF_RA_DEFAULT_ROUTE_MTU] = cnf->ra_default_route_mtu;
>  }
>
>  static inline size_t inet6_ifla6_size(void)
> @@ -5315,6 +5318,13 @@ static struct addrconf_sysctl_table
>  			.proc_handler	= proc_dointvec,
>  		},
>  		{
> +			.procname	= "ra_default_route_mtu",
> +			.data		= &ipv6_devconf.ra_default_route_mtu,
> +			.maxlen		= sizeof(int),
> +			.mode		= 0644,
> +			.proc_handler	= proc_dointvec,
> +		},
> +		{
>  			/* sentinel */
>  		}
>  	},
> diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
> index 471ed24..c70ab44 100644
> --- a/net/ipv6/ndisc.c
> +++ b/net/ipv6/ndisc.c
> @@ -1362,7 +1362,8 @@ skip_routeinfo:
>  		} else if (in6_dev->cnf.mtu6 != mtu) {
>  			in6_dev->cnf.mtu6 = mtu;
>
> -			if (rt)
> +			if (rt && (!in6_dev->cnf.ra_default_route_mtu ||
> +				   mtu < in6_dev->cnf.ra_default_route_mtu))
>  				dst_metric_set(&rt->dst, RTAX_MTU, mtu);
>
>  			rt6_mtu_change(skb->dev, mtu);
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 4688bd4..6394adf 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -1714,6 +1714,14 @@ int ip6_route_add(struct fib6_config *cfg)
>
>  	rt->rt6i_flags = cfg->fc_flags;
>
> +	if ((cfg->fc_flags & (RTF_ADDRCONF | RTF_DEFAULT | RTF_GATEWAY)) ==
> +	    (RTF_ADDRCONF | RTF_DEFAULT | RTF_GATEWAY)) {
> +		u32 mtu = idev->cnf.ra_default_route_mtu;
> +
> +		if (mtu && mtu >= IPV6_MIN_MTU && mtu <= idev->cnf.mtu6)
> +			dst_metric_set(&rt->dst, RTAX_MTU, mtu);
> +	}
> +
>  install_route:
>  	rt->dst.dev = dev;
>  	rt->rt6i_idev = idev;
> --
> 2.1.0
>

* Re: [PATCH v2] net: sysctl for RA default route MTU
  2015-03-25 11:07   ` [PATCH v2] " Roman Gushchin
  2015-03-25 12:34     ` Denis Kirjanov
@ 2015-03-25 15:52     ` Hannes Frederic Sowa
       [not found]       ` <39171427301187@webcorp02f.yandex-team.ru>
  1 sibling, 1 reply; 20+ messages in thread
From: Hannes Frederic Sowa @ 2015-03-25 15:52 UTC (permalink / raw)
  To: Roman Gushchin, David S. Miller, netdev; +Cc: linux-kernel

On Wed, Mar 25, 2015, at 12:07, Roman Gushchin wrote:
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -1714,6 +1714,14 @@ int ip6_route_add(struct fib6_config *cfg)
>  
>  	rt->rt6i_flags = cfg->fc_flags;
>  
> +       if ((cfg->fc_flags & (RTF_ADDRCONF | RTF_DEFAULT | RTF_GATEWAY)) ==
> +           (RTF_ADDRCONF | RTF_DEFAULT | RTF_GATEWAY)) {
> +               u32 mtu = idev->cnf.ra_default_route_mtu;
> +
> +               if (mtu && mtu >= IPV6_MIN_MTU && mtu <= idev->cnf.mtu6)
> +                       dst_metric_set(&rt->dst, RTAX_MTU, mtu);
> +       }
> +

Could you move this RA specific snippet over to ndisc.c?

Hmm

How do you use this option?

Do you use jumbo frames on the on-link network and announce, via route
options, all routes over which you also want to communicate with jumbo
frames?

I wonder if an offlink_mtu parameter would be more suitable?

Thanks,
Hannes

* Re: [PATCH v2] net: sysctl for RA default route MTU
       [not found]       ` <39171427301187@webcorp02f.yandex-team.ru>
@ 2015-03-25 18:14         ` Hannes Frederic Sowa
  2015-03-26 11:49           ` [PATCH v3] " Roman Gushchin
  0 siblings, 1 reply; 20+ messages in thread
From: Hannes Frederic Sowa @ 2015-03-25 18:14 UTC (permalink / raw)
  To: Roman Gushchin, David S. Miller, netdev; +Cc: linux-kernel

Hi,

On Wed, Mar 25, 2015, at 17:33, Roman Gushchin wrote:
> 25.03.2015, 18:52, "Hannes Frederic Sowa" <hannes@stressinduktion.org>:
> > On Wed, Mar 25, 2015, at 12:07, Roman Gushchin wrote:
> >>  --- a/net/ipv6/route.c
> >>  +++ b/net/ipv6/route.c
> >>  @@ -1714,6 +1714,14 @@ int ip6_route_add(struct fib6_config *cfg)
> >>
> >>           rt->rt6i_flags = cfg->fc_flags;
> >>
> >>  +       if ((cfg->fc_flags & (RTF_ADDRCONF | RTF_DEFAULT | RTF_GATEWAY)) ==
> >>  +           (RTF_ADDRCONF | RTF_DEFAULT | RTF_GATEWAY)) {
> >>  +               u32 mtu = idev->cnf.ra_default_route_mtu;
> >>  +
> >>  +               if (mtu && mtu >= IPV6_MIN_MTU && mtu <= idev->cnf.mtu6)
> >>  +                       dst_metric_set(&rt->dst, RTAX_MTU, mtu);
> >>  +       }
> >>  +
> >
> > Could you move this RA specific snippet over to ndisc.c?
> 
> Ok, no problem.

Thanks!

> >
> > Hmm
> >
> > How do you use this option?
> 
> We want to set and keep a normal (~1500) MTU on the default route for
> external connections without additional userspace effort, while the link
> MTU is 9000 to support jumbo frames on other routes.
> 
> > Do you use jumbo frames on the on-link network and announce, via route
> > options, all routes over which you also want to communicate with jumbo
> > frames?
> 
> Yes, exactly.
> 
> >
> > I wonder if an offlink_mtu parameter would be more suitable?
> 
> If I understand you correctly, the difference is which MTU the routes
> announced via RIO will have. Am I right?
> If so, it will not help in our case, because only the default route
> should have a "small" MTU, and there is no way to announce per-route MTUs
> for RIO routes.
> I thought about two separate knobs (ra_default_route_mtu and
> ra_rt_info_route_mtu, for example), but that seemed excessive to me.

Hmm. I take back my suggestion of an offlink_mtu parameter.

So the approach would be to basically leave your patch as-is, and if
another segment can be reached with jumbo frames, one could just let
the RA speaker add another route announcement, which should put a more
specific route into the tables with the jumbo MTU from the RA packet.
Only default routes will get the overridden MTU value from the new
knob. Am I correct? So your approach seems to be the most flexible
option.

Thanks and looking forward to the new patch,
Hannes
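
The arrangement summarised above can be sketched as a toy longest-prefix
lookup. Everything here is a simplification made for illustration: the
names toy_route, toy_mask() and toy_path_mtu() are hypothetical, 32-bit
keys stand in for 128-bit IPv6 prefixes, and a more specific route with
no MTU metric is assumed to inherit the link MTU. It is not the kernel's
FIB code.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy route entry; a 32-bit key stands in for an IPv6 prefix. */
struct toy_route {
	uint32_t prefix;	/* network bits, host bits zero */
	unsigned prefix_len;	/* 0 means the default route */
	uint32_t mtu;		/* per-route MTU, 0 = inherit the link MTU */
};

/* Build a netmask for a prefix length between 0 and 32. */
static uint32_t toy_mask(unsigned len)
{
	return len == 0 ? 0u : UINT32_MAX << (32u - len);
}

/*
 * Pick the longest matching prefix for dst and return the MTU that
 * would apply on that path. Returns 0 when no route matches.
 */
static uint32_t toy_path_mtu(const struct toy_route *tbl, size_t n,
			     uint32_t dst, uint32_t link_mtu)
{
	const struct toy_route *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if ((dst & toy_mask(tbl[i].prefix_len)) != tbl[i].prefix)
			continue;
		if (!best || tbl[i].prefix_len > best->prefix_len)
			best = &tbl[i];
	}

	if (!best)
		return 0;

	return best->mtu ? best->mtu : link_mtu;
}
```

With a default route capped at 1500 by the knob and one more specific
route without a metric, traffic to the specific prefix uses the 9000-byte
link MTU while everything else uses 1500.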

* [PATCH v3] net: sysctl for RA default route MTU
  2015-03-25 18:14         ` Hannes Frederic Sowa
@ 2015-03-26 11:49           ` Roman Gushchin
  2015-03-26 14:49             ` Hannes Frederic Sowa
  2015-03-29 19:42             ` David Miller
  0 siblings, 2 replies; 20+ messages in thread
From: Roman Gushchin @ 2015-03-26 11:49 UTC (permalink / raw)
  To: netdev, Hannes Frederic Sowa
  Cc: linux-kernel, David S. Miller, Roman Gushchin

This patch introduces a new IPv6 sysctl: ra_default_route_mtu.
If it is set (> 0), it defines the per-route MTU for any new default route
received via RA.

This sysctl will help in the following configuration: we want to use
jumbo frames for internal networks and standard Ethernet frames for the
default route. A per-route MTU can only lower the per-link MTU, so the
link MTU should be set to ~9000 (statically or via RA).

Due to the dynamic nature of RA, setting the MTU for the default route
will require a userspace agent that will monitor changes of the default
route and (re)configure it. Not simple. The suggested sysctl solves this
problem.

Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>

---

Changes from v1: add forgotten brace.
Changes from v2: move RA-specific code from route.c to ndisc.c
---
 Documentation/networking/ip-sysctl.txt |  5 +++++
 include/linux/ipv6.h                   |  1 +
 include/uapi/linux/ipv6.h              |  1 +
 net/ipv6/addrconf.c                    | 10 ++++++++++
 net/ipv6/ndisc.c                       |  8 +++++++-
 5 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 1b8c964..c013dda 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -1316,6 +1316,11 @@ accept_ra_mtu - BOOLEAN
 	Functional default: enabled if accept_ra is enabled.
 			    disabled if accept_ra is disabled.
 
+ra_default_route_mtu - INTEGER
+	Define MTU for any new default route received by RA.
+
+	Functional default: disabled (0).
+
 accept_redirects - BOOLEAN
 	Accept Redirects.
 
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 4d5169f..68b4e1e 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -53,6 +53,7 @@ struct ipv6_devconf {
 	__s32           ndisc_notify;
 	__s32		suppress_frag_ndisc;
 	__s32		accept_ra_mtu;
+	__s32		ra_default_route_mtu;
 	void		*sysctl;
 };
 
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index 437a6a4..4539c31 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -170,6 +170,7 @@ enum {
 	DEVCONF_ACCEPT_RA_FROM_LOCAL,
 	DEVCONF_USE_OPTIMISTIC,
 	DEVCONF_ACCEPT_RA_MTU,
+	DEVCONF_RA_DEFAULT_ROUTE_MTU,
 	DEVCONF_MAX
 };
 
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index b603002..cec352d 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -202,6 +202,7 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
 	.accept_dad		= 1,
 	.suppress_frag_ndisc	= 1,
 	.accept_ra_mtu		= 1,
+	.ra_default_route_mtu	= 0,
 };
 
 static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
@@ -240,6 +241,7 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
 	.accept_dad		= 1,
 	.suppress_frag_ndisc	= 1,
 	.accept_ra_mtu		= 1,
+	.ra_default_route_mtu	= 0,
 };
 
 /* Check if a valid qdisc is available */
@@ -4398,6 +4400,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
 	array[DEVCONF_SUPPRESS_FRAG_NDISC] = cnf->suppress_frag_ndisc;
 	array[DEVCONF_ACCEPT_RA_FROM_LOCAL] = cnf->accept_ra_from_local;
 	array[DEVCONF_ACCEPT_RA_MTU] = cnf->accept_ra_mtu;
+	array[DEVCONF_RA_DEFAULT_ROUTE_MTU] = cnf->ra_default_route_mtu;
 }
 
 static inline size_t inet6_ifla6_size(void)
@@ -5315,6 +5318,13 @@ static struct addrconf_sysctl_table
 			.proc_handler	= proc_dointvec,
 		},
 		{
+			.procname	= "ra_default_route_mtu",
+			.data		= &ipv6_devconf.ra_default_route_mtu,
+			.maxlen		= sizeof(int),
+			.mode		= 0644,
+			.proc_handler	= proc_dointvec,
+		},
+		{
 			/* sentinel */
 		}
 	},
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 471ed24..835b466 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1200,6 +1200,11 @@ static void ndisc_router_discovery(struct sk_buff *skb)
 				  "RA: %s failed to add default route\n",
 				  __func__);
 			return;
+		} else {
+			u32 mtu = in6_dev->cnf.ra_default_route_mtu;
+
+			if (mtu && mtu >= IPV6_MIN_MTU && mtu <= in6_dev->cnf.mtu6)
+				dst_metric_set(&rt->dst, RTAX_MTU, mtu);
 		}
 
 		neigh = dst_neigh_lookup(&rt->dst, &ipv6_hdr(skb)->saddr);
@@ -1362,7 +1367,8 @@ skip_routeinfo:
 		} else if (in6_dev->cnf.mtu6 != mtu) {
 			in6_dev->cnf.mtu6 = mtu;
 
-			if (rt)
+			if (rt && (!in6_dev->cnf.ra_default_route_mtu ||
+				   mtu < in6_dev->cnf.ra_default_route_mtu))
 				dst_metric_set(&rt->dst, RTAX_MTU, mtu);
 
 			rt6_mtu_change(skb->dev, mtu);
-- 
2.1.0
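
Note on the second ndisc.c hunk above: when an RA carries an MTU option
that changes the link MTU, the patch lets that value overwrite the
default route's MTU only if the knob is disabled or the advertised MTU is
smaller than the knob. A minimal userspace model of that condition
follows. The function name ra_mtu_updates_default_route() is a
hypothetical label for this sketch, not a kernel symbol.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether the MTU advertised in an RA should be written to the
 * default route's MTU metric.
 *
 * have_default_route: the RA produced (or refreshed) a default route
 * knob:               value of ra_default_route_mtu, 0 = disabled
 * ra_mtu:             link MTU carried in the RA's MTU option
 */
static bool ra_mtu_updates_default_route(bool have_default_route,
					 uint32_t knob, uint32_t ra_mtu)
{
	if (!have_default_route)
		return false;

	/* Knob off: keep the old behaviour and always follow the RA. */
	if (knob == 0)
		return true;

	/* Knob on: only follow the RA when it lowers the MTU below the knob. */
	return ra_mtu < knob;
}
```

So with the knob at 1500, an RA advertising 9000 leaves the default
route's metric alone, while an RA advertising 1400 still lowers it.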


* Re: [PATCH v3] net: sysctl for RA default route MTU
  2015-03-26 11:49           ` [PATCH v3] " Roman Gushchin
@ 2015-03-26 14:49             ` Hannes Frederic Sowa
  2015-03-29 19:42             ` David Miller
  1 sibling, 0 replies; 20+ messages in thread
From: Hannes Frederic Sowa @ 2015-03-26 14:49 UTC (permalink / raw)
  To: Roman Gushchin, netdev; +Cc: linux-kernel, David S. Miller

On Thu, Mar 26, 2015, at 12:49, Roman Gushchin wrote:
> This patch introduces a new IPv6 sysctl: ra_default_route_mtu.
> If it is set (> 0), it defines the per-route MTU for any new default route
> received via RA.
> 
> This sysctl will help in the following configuration: we want to use
> jumbo frames for internal networks and standard Ethernet frames for the
> default route. A per-route MTU can only lower the per-link MTU, so the
> link MTU should be set to ~9000 (statically or via RA).
> 
> Due to the dynamic nature of RA, setting the MTU for the default route
> will require a userspace agent that will monitor changes of the default
> route and (re)configure it. Not simple. The suggested sysctl solves this
> problem.
> 
> Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>

Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Thanks!

* Re: [PATCH v3] net: sysctl for RA default route MTU
  2015-03-26 11:49           ` [PATCH v3] " Roman Gushchin
  2015-03-26 14:49             ` Hannes Frederic Sowa
@ 2015-03-29 19:42             ` David Miller
  2015-03-30 12:30               ` Roman Gushchin
  1 sibling, 1 reply; 20+ messages in thread
From: David Miller @ 2015-03-29 19:42 UTC (permalink / raw)
  To: klamm; +Cc: netdev, hannes, linux-kernel

From: Roman Gushchin <klamm@yandex-team.ru>
Date: Thu, 26 Mar 2015 14:49:54 +0300

> This patch introduces new ipv6 sysctl: ra_default_route_mtu.
> If it's set (> 0), it defines per-route MTU for any new default route
> received by RA.
> 
> This sysctl will help in the following configuration: we want to use
> jumbo-frames for internal networks and default ethernet frames for
> default route. Per-route MTU can only lower per-link MTU, so link MTU
> should be set to ~9000 (statically or via RA).
> 
> Due to dynamic nature of RA, setting MTU for default route will require
> userspace agent, that will monitor changes of default route
> and (re)configure it. Not simple. The suggested sysctl solves this
> problem.
> 
> Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>

This does not apply cleanly to net-next, please respin.

* [PATCH v3] net: sysctl for RA default route MTU
  2015-03-29 19:42             ` David Miller
@ 2015-03-30 12:30               ` Roman Gushchin
  2015-03-31 20:05                 ` David Miller
  0 siblings, 1 reply; 20+ messages in thread
From: Roman Gushchin @ 2015-03-30 12:30 UTC (permalink / raw)
  To: David S. Miller, netdev
  Cc: linux-kernel, Hannes Frederic Sowa, Roman Gushchin

This patch introduces new ipv6 sysctl: ra_default_route_mtu.
If it's set (> 0), it defines per-route MTU for any new default route
received by RA.

This sysctl will help in the following configuration: we want to use
jumbo-frames for internal networks and default ethernet frames for
default route. Per-route MTU can only lower per-link MTU, so link MTU
should be set to ~9000 (statically or via RA).

Due to dynamic nature of RA, setting MTU for default route will require
userspace agent, that will monitor changes of default route
and (re)configure it. Not simple. The suggested sysctl solves this
problem.

Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

---

Changes from v1: add forgotten brace.
Changes from v2: move RA-specific code from route.c to ndisc.c
---
 Documentation/networking/ip-sysctl.txt |  5 +++++
 include/linux/ipv6.h                   |  1 +
 include/uapi/linux/ipv6.h              |  1 +
 net/ipv6/addrconf.c                    | 10 ++++++++++
 net/ipv6/ndisc.c                       |  8 +++++++-
 5 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 071fb18..cf86729 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -1349,6 +1349,11 @@ accept_ra_mtu - BOOLEAN
 	Functional default: enabled if accept_ra is enabled.
 			    disabled if accept_ra is disabled.
 
+ra_default_route_mtu - INTEGER
+	Define MTU for any new default route received by RA.
+
+	Functional default: disabled (0).
+
 accept_redirects - BOOLEAN
 	Accept Redirects.
 
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 82806c6..c7727b5 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -53,6 +53,7 @@ struct ipv6_devconf {
 	__s32           ndisc_notify;
 	__s32		suppress_frag_ndisc;
 	__s32		accept_ra_mtu;
+	__s32		ra_default_route_mtu;
 	struct ipv6_stable_secret {
 		bool initialized;
 		struct in6_addr secret;
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index 5efa54a..1d31d70 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -170,6 +170,7 @@ enum {
 	DEVCONF_ACCEPT_RA_FROM_LOCAL,
 	DEVCONF_USE_OPTIMISTIC,
 	DEVCONF_ACCEPT_RA_MTU,
+	DEVCONF_RA_DEFAULT_ROUTE_MTU,
 	DEVCONF_STABLE_SECRET,
 	DEVCONF_MAX
 };
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 2660263..15528f7 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -209,6 +209,7 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
 	.accept_dad		= 1,
 	.suppress_frag_ndisc	= 1,
 	.accept_ra_mtu		= 1,
+	.ra_default_route_mtu	= 0,
 	.stable_secret		= {
 		.initialized = false,
 	}
@@ -250,6 +251,7 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
 	.accept_dad		= 1,
 	.suppress_frag_ndisc	= 1,
 	.accept_ra_mtu		= 1,
+	.ra_default_route_mtu	= 0,
 	.stable_secret		= {
 		.initialized = false,
 	},
@@ -4583,6 +4585,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
 	array[DEVCONF_SUPPRESS_FRAG_NDISC] = cnf->suppress_frag_ndisc;
 	array[DEVCONF_ACCEPT_RA_FROM_LOCAL] = cnf->accept_ra_from_local;
 	array[DEVCONF_ACCEPT_RA_MTU] = cnf->accept_ra_mtu;
+	array[DEVCONF_RA_DEFAULT_ROUTE_MTU] = cnf->ra_default_route_mtu;
 	/* we omit DEVCONF_STABLE_SECRET for now */
 }
 
@@ -5576,6 +5579,13 @@ static struct addrconf_sysctl_table
 			.proc_handler	= proc_dointvec,
 		},
 		{
+			.procname	= "ra_default_route_mtu",
+			.data		= &ipv6_devconf.ra_default_route_mtu,
+			.maxlen		= sizeof(int),
+			.mode		= 0644,
+			.proc_handler	= proc_dointvec,
+		},
+		{
 			.procname	= "stable_secret",
 			.data		= &ipv6_devconf.stable_secret,
 			.maxlen		= IPV6_MAX_STRLEN,
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 247ad7c..2a3a564 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1208,6 +1208,11 @@ static void ndisc_router_discovery(struct sk_buff *skb)
 				  "RA: %s failed to add default route\n",
 				  __func__);
 			return;
+		} else {
+			u32 mtu = in6_dev->cnf.ra_default_route_mtu;
+
+			if (mtu && mtu >= IPV6_MIN_MTU && mtu <= in6_dev->cnf.mtu6)
+				dst_metric_set(&rt->dst, RTAX_MTU, mtu);
 		}
 
 		neigh = dst_neigh_lookup(&rt->dst, &ipv6_hdr(skb)->saddr);
@@ -1370,7 +1375,8 @@ skip_routeinfo:
 		} else if (in6_dev->cnf.mtu6 != mtu) {
 			in6_dev->cnf.mtu6 = mtu;
 
-			if (rt)
+			if (rt && (!in6_dev->cnf.ra_default_route_mtu ||
+				   mtu < in6_dev->cnf.ra_default_route_mtu))
 				dst_metric_set(&rt->dst, RTAX_MTU, mtu);
 
 			rt6_mtu_change(skb->dev, mtu);
-- 
2.1.0


* Re: [PATCH v3] net: sysctl for RA default route MTU
  2015-03-30 12:30               ` Roman Gushchin
@ 2015-03-31 20:05                 ` David Miller
  2015-03-31 20:35                   ` Hannes Frederic Sowa
  0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2015-03-31 20:05 UTC (permalink / raw)
  To: klamm; +Cc: netdev, linux-kernel, hannes

From: Roman Gushchin <klamm@yandex-team.ru>
Date: Mon, 30 Mar 2015 15:30:57 +0300

> This patch introduces new ipv6 sysctl: ra_default_route_mtu.
> If it's set (> 0), it defines per-route MTU for any new default route
> received by RA.
> 
> This sysctl will help in the following configuration: we want to use
> jumbo-frames for internal networks and default ethernet frames for
> default route. Per-route MTU can only lower per-link MTU, so link MTU
> should be set to ~9000 (statically or via RA).
> 
> Due to dynamic nature of RA, setting MTU for default route will require
> userspace agent, that will monitor changes of default route
> and (re)configure it. Not simple. The suggested sysctl solves this
> problem.
> 
> Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

I don't like this change at all.  The way I see things you already
have the mechanisms necessary to do this.

You obviously control the entity providing the default routes and
these RA messages, therefore you absolutely can configure it to
provide an appropriate MTU value in those RA messages.

Problem solved, and no kernel changes necessary.

I am warning you ahead of time that I will have a very low tolerance
for replies to this email containing stories explaining why this is
"difficult" to do.  The fact is that the mechanism is there and if you
have designed things at your site in a way such that the mechanism
designed for this has become less useful, that isn't my problem.

I'm not adding facilities that duplicate existing methods for
accomplishing this task.

* Re: [PATCH v3] net: sysctl for RA default route MTU
  2015-03-31 20:05                 ` David Miller
@ 2015-03-31 20:35                   ` Hannes Frederic Sowa
  2015-03-31 20:49                     ` David Miller
  0 siblings, 1 reply; 20+ messages in thread
From: Hannes Frederic Sowa @ 2015-03-31 20:35 UTC (permalink / raw)
  To: David Miller, klamm; +Cc: netdev, linux-kernel

On Tue, Mar 31, 2015, at 22:05, David Miller wrote:
> From: Roman Gushchin <klamm@yandex-team.ru>
> Date: Mon, 30 Mar 2015 15:30:57 +0300
> 
> > This patch introduces new ipv6 sysctl: ra_default_route_mtu.
> > If it's set (> 0), it defines per-route MTU for any new default route
> > received by RA.
> > 
> > This sysctl will help in the following configuration: we want to use
> > jumbo-frames for internal networks and default ethernet frames for
> > default route. Per-route MTU can only lower per-link MTU, so link MTU
> > should be set to ~9000 (statically or via RA).
> > 
> > Due to dynamic nature of RA, setting MTU for default route will require
> > userspace agent, that will monitor changes of default route
> > and (re)configure it. Not simple. The suggested sysctl solves this
> > problem.
> > 
> > Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
> > Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> 
> I don't like this change at all.  The way I see things you already
> have the mechanisms necessary to do this.

This is totally understandable, and the change does seem not to fit
because it alters incoming information, but let me quickly explain my
reasoning for the Ack:

Neighbour Discovery does not fit the way Linux handles MTUs. Only one
MTU option can be sent in a Router Advertisement, and we pick it up as
the IPv6 MTU value for the interface. An RA can provide further routing
information, but no MTU option can be specified on those route options,
so they adopt the link MTU. There is no differentiation between the
interface MTU and a per-route MTU.

One common setup is to have local jumbo frames to speed up e.g. NFS
traffic and use default routes with MTU 1500 to reach the outside world.
As I had no other idea how to solve this with the in-kernel autoconf
mechanism, I thought this change would be reasonable.

Obviously one can disable autoconf and set up routes by hand with
correct MTU values which should solve the problem - or use custom DHCPv6
options to do so.

> You obviously control the entity providing the default routes and
> these RA messages, therefore you absolutely can configure it to
> provide an appropriate MTU value in those RA messages.
> 
> Problem solved, and no kernel changes necessary.
> 
> I am warning you ahead of time that I will have a very low tolerance
> for replies to this email containing stories explaining why this is
> "difficult" to do.  The fact is that the mechanism is there and if you
> have designed things at your site in a way such that the mechanism
> designed for this has become less useful, that isn't my problem.
> 
> I'm not adding facilities that duplicate existing methods for
> accomplishing this task.

Could you quickly comment on what you had in mind? I guess it is about
handling RA in user space on the end hosts and overwriting MTU during
insertion of the routes?

Thanks,
Hannes

* Re: [PATCH v3] net: sysctl for RA default route MTU
  2015-03-31 20:35                   ` Hannes Frederic Sowa
@ 2015-03-31 20:49                     ` David Miller
  2015-04-01  9:58                       ` Roman Gushchin
  0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2015-03-31 20:49 UTC (permalink / raw)
  To: hannes; +Cc: klamm, netdev, linux-kernel

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Tue, 31 Mar 2015 22:35:48 +0200

> Could you quickly comment on what you had in mind? I guess it is about
> handling RA in user space on the end hosts and overwriting MTU during
> insertion of the routes?

Even after reading your email I have no idea why you can't just have
RA provide a 1500 byte MTU, everything else uses the device's 9000
MTU, problem solved?

* Re: [PATCH v3] net: sysctl for RA default route MTU
  2015-03-31 20:49                     ` David Miller
@ 2015-04-01  9:58                       ` Roman Gushchin
  2015-04-01 17:55                         ` David Miller
  0 siblings, 1 reply; 20+ messages in thread
From: Roman Gushchin @ 2015-04-01  9:58 UTC (permalink / raw)
  To: David Miller, hannes; +Cc: netdev, linux-kernel

31.03.2015, 23:49, "David Miller" <davem@davemloft.net>:
> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Date: Tue, 31 Mar 2015 22:35:48 +0200
>>  Could you quickly comment on what you had in mind? I guess it is about
>>  handling RA in user space on the end hosts and overwriting MTU during
>>  insertion of the routes?
>
> Even after reading your email I have no idea why you can't just have
> RA provide a 1500 byte MTU, everything else uses the device's 9000
> MTU, problem solved?

Because the MTU (provided by RA) is assigned to the device.

Thanks,
Roman

* Re: [PATCH v3] net: sysctl for RA default route MTU
  2015-04-01  9:58                       ` Roman Gushchin
@ 2015-04-01 17:55                         ` David Miller
  2015-04-01 19:27                           ` Hannes Frederic Sowa
  0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2015-04-01 17:55 UTC (permalink / raw)
  To: klamm; +Cc: hannes, netdev, linux-kernel

From: Roman Gushchin <klamm@yandex-team.ru>
Date: Wed, 01 Apr 2015 12:58:50 +0300

> 31.03.2015, 23:49, "David Miller" <davem@davemloft.net>:
>> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
>> Date: Tue, 31 Mar 2015 22:35:48 +0200
>>>  Could you quickly comment on what you had in mind? I guess it is about
>>>  handling RA in user space on the end hosts and overwriting MTU during
>>>  insertion of the routes?
>>
>> Even after reading your email I have no idea why you can't just have
>> RA provide a 1500 byte MTU, everything else uses the device's 9000
>> MTU, problem solved?
> 
> Because the MTU (provided by RA) is assigned to the device.

Ok, that severely limits the usefulness of this option I guess.

The next question I have is about the behavior of the new setting
in the presence of an RA MTU option.  It seems like the sysctl
doesn't override that RA MTU option, but rather just clamps it.

And then if it's in range, this controls only whether the default
route has its MTU adjusted.

That doesn't make any sense to me if we then go and do the
rt6_mtu_change() call unconditionally.  The route metric update
and the rt6_mtu_change() go hand in hand.
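
For readers following along, the two checks in the v3 patch that the
questions above refer to can be modelled outside the kernel. This is only
a sketch: the helper names are hypothetical and are not kernel functions,
and IPV6_MIN_MTU is redefined locally with the value 1280 (the IPv6
minimum link MTU).

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel macro; 1280 is the IPv6 minimum link MTU. */
#define IPV6_MIN_MTU 1280

/*
 * Hypothetical model of the check added where the RA default route is
 * created: the sysctl value is applied only if it is non-zero, at least
 * IPV6_MIN_MTU, and not larger than the current link MTU (cnf.mtu6).
 * Returns the MTU to set on the new route, or 0 to leave it untouched.
 */
static uint32_t new_default_route_mtu(uint32_t sysctl_mtu, uint32_t link_mtu)
{
	if (sysctl_mtu && sysctl_mtu >= IPV6_MIN_MTU && sysctl_mtu <= link_mtu)
		return sysctl_mtu;
	return 0;
}

/*
 * Hypothetical model of the guard placed in front of dst_metric_set() in
 * the RA MTU option path: the default route metric is rewritten only if
 * the sysctl is unset or the announced MTU is below the sysctl value.
 * The patch leaves the rt6_mtu_change() call after it unconditional.
 */
static bool ra_mtu_updates_default_route(uint32_t sysctl_mtu, uint32_t ra_mtu)
{
	return !sysctl_mtu || ra_mtu < sysctl_mtu;
}
```

With the sysctl at 1500 and an RA announcing 9000, the first helper
yields 1500 for a new default route and the second one reports that the
RA MTU option does not overwrite it.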

* Re: [PATCH v3] net: sysctl for RA default route MTU
  2015-04-01 17:55                         ` David Miller
@ 2015-04-01 19:27                           ` Hannes Frederic Sowa
  2015-04-02 18:08                             ` Roman Gushchin
  0 siblings, 1 reply; 20+ messages in thread
From: Hannes Frederic Sowa @ 2015-04-01 19:27 UTC (permalink / raw)
  To: David Miller, klamm; +Cc: netdev, linux-kernel



On Wed, Apr 1, 2015, at 19:55, David Miller wrote:
> From: Roman Gushchin <klamm@yandex-team.ru>
> Date: Wed, 01 Apr 2015 12:58:50 +0300
> 
> > 31.03.2015, 23:49, "David Miller" <davem@davemloft.net>:
> >> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> >> Date: Tue, 31 Mar 2015 22:35:48 +0200
> >>>  Could you quickly comment on what you had in mind? I guess it is about
> >>>  handling RA in user space on the end hosts and overwriting MTU during
> >>>  insertion of the routes?
> >>
> >> Even after reading your email I have no idea why you can't just have
> >> RA provide a 1500 byte MTU, everything else uses the device's 9000
> >> MTU, problem solved?
> > 
> > Because the MTU (provided by RA) is assigned to the device.
> 
> Ok, that severely limits the usefulness of this option I guess.
> 
> The next question I have is about the behavior of the new setting
> in the presence of an RA MTU option.  It seems like the sysctl
> doesn't override that RA MTU option, but rather just clamps it.
> 
> And then if it's in range, this controls only whether the default
> route has its MTU adjusted.
> 
> That doesn't make any sense to me if we then go and do the
> rt6_mtu_change() call unconditionally.  The route metric update
> and the rt6_mtu_change() go hand in hand.

Agreed but that gets interesting:

I guess during testing the cnf.mtu6 value was equal to the newly
announced mtu value, so the rt6_mtu_change call does not happen. We
update cnf.mtu6 so a second RA packet would actually bring the system
into the desired state but we have a moment where the default route
carries a too big MTU. That's not good.

Easiest solution is to reorder those calls but that also leaves us with
a time frame where we carry the incorrect MTU on the default route.
Otherwise we must conditionally filter out the default routes.

Roman, any ideas?

Thanks,
Hannes

* Re: [PATCH v3] net: sysctl for RA default route MTU
  2015-04-01 19:27                           ` Hannes Frederic Sowa
@ 2015-04-02 18:08                             ` Roman Gushchin
  2015-04-07 15:58                               ` Hannes Frederic Sowa
  0 siblings, 1 reply; 20+ messages in thread
From: Roman Gushchin @ 2015-04-02 18:08 UTC (permalink / raw)
  To: Hannes Frederic Sowa, David Miller; +Cc: netdev, linux-kernel

>>  The next question I have is about the behavior of the new setting
>>  in the presence of an RA MTU option.  It seems like the sysctl
>>  doesn't override that RA MTU option, but rather just clamps it.
>>
>>  And then if it's in range, this controls only whether the default
>>  route has its MTU adjusted.
>>
>>  That doesn't make any sense to me if we then go and do the
>>  rt6_mtu_change() call unconditionally.  The route metric update
>>  and the rt6_mtu_change() go hand in hand.
>
> Agreed but that gets interesting:
>
> I guess during testing the cnf.mtu6 value was equal to the newly
> announced mtu value, so the rt6_mtu_change call does not happen. We
> update cnf.mtu6 so a second RA packet would actually bring the system
> into the desired state but we have a moment where the default route
> carries a too big MTU. That's not good.

Agreed.

> Easiest solution is to reorder those calls but that also leaves us with
> a time frame where we carry the incorrect MTU on the default route.
> Otherwise we must conditionally filter out the default routes.
> Roman, any ideas?

I think such an approach will work in practice, but it does not look
very beautiful.

Maybe a better idea is to separate the per-route and per-device MTU,
so that updating the per-device MTU does not affect the per-route MTU.
The actual MTU can always be calculated as min(route_mtu, device_mtu),
and we would not need to update the MTU on each route when, for
instance, an RA MTU option is received.

Do you see any problems with such an approach?
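
As a rough illustration of the min(route_mtu, device_mtu) idea, here is a
stand-alone sketch. It is not kernel code: the function name is
hypothetical, and treating a route MTU of 0 as "no per-route MTU set" is
an assumption made for this example.

```c
#include <stdint.h>

/*
 * Hypothetical lookup-time computation: the per-route MTU is stored
 * independently and never rewritten when the device MTU changes.
 * Assumption for this sketch: route_mtu == 0 means "not set".
 */
static uint32_t effective_mtu(uint32_t route_mtu, uint32_t device_mtu)
{
	if (route_mtu == 0)
		return device_mtu;
	return route_mtu < device_mtu ? route_mtu : device_mtu;
}
```

In this model a default route pinned at 1500 keeps that value while the
device MTU moves between 9000 and anything above 1500, and it is capped
automatically if the device MTU drops below 1500.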

Thanks,
Roman

* Re: [PATCH v3] net: sysctl for RA default route MTU
  2015-04-02 18:08                             ` Roman Gushchin
@ 2015-04-07 15:58                               ` Hannes Frederic Sowa
  2015-04-08 19:03                                 ` Roman Gushchin
  0 siblings, 1 reply; 20+ messages in thread
From: Hannes Frederic Sowa @ 2015-04-07 15:58 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: David Miller, netdev, linux-kernel

On Do, 2015-04-02 at 21:08 +0300, Roman Gushchin wrote:
> >>  The next question I have is about the behavior of the new setting
> >>  in the presence of an RA MTU option.  It seems like the sysctl
> >>  doesn't override that RA MTU option, but rather just clamps it.
> >>
> >>  And then if it's in range, this controls only whether the default
> >>  route has its MTU adjusted.
> >>
> >>  That doesn't make any sense to me if we then go and do the
> >>  rt6_mtu_change() call unconditionally.  The route metric update
> >>  and the rt6_mtu_change() go hand in hand.
> >
> > Agreed but that gets interesting:
> >
> > I guess during testing the cnf.mtu6 value was equal to the newly
> > announced mtu value, so the rt6_mtu_change call does not happen. We
> > update cnf.mtu6 so a second RA packet would actually bring the system
> > into the desired state but we have a moment where the default route
> > carries a too big MTU. That's not good.
> 
> Agreed.
> 
> > Easiest solution is to reorder those calls but that also leaves us with
> > a time frame where we carry the incorrect MTU on the default route.
> > Otherwise we must conditionally filter out the default routes.
> > Roman, any ideas?
> 
> I think such an approach will work in practice, but it does not look
> very beautiful.
> 
> Maybe a better idea is to separate the per-route and per-device MTU,
> so that updating the per-device MTU does not affect the per-route MTU.
> The actual MTU can always be calculated as min(route_mtu, device_mtu),
> and we would not need to update the MTU on each route when, for
> instance, an RA MTU option is received.
> 
> Do you see any problems with such an approach?

If I understood you correctly, this actually seems to be quite an
intrusive change? :/ Can you show me some code for how to do this?

I would also dislike adding a filtering capability to the route MTU
updates. Currently I don't have a good idea, sorry.

Bye,
Hannes



* Re: [PATCH v3] net: sysctl for RA default route MTU
  2015-04-07 15:58                               ` Hannes Frederic Sowa
@ 2015-04-08 19:03                                 ` Roman Gushchin
  0 siblings, 0 replies; 20+ messages in thread
From: Roman Gushchin @ 2015-04-08 19:03 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: David Miller, netdev, linux-kernel

07.04.2015, 18:58, "Hannes Frederic Sowa" <hannes@stressinduktion.org>:
>  On Do, 2015-04-02 at 21:08 +0300, Roman Gushchin wrote:
>>>>    The next question I have is about the behavior of the new setting
>>>>    in the presence of an RA MTU option.  It seems like the sysctl
>>>>    doesn't override that RA MTU option, but rather just clamps it.
>>>>
>>>>    And then if it's in range, this controls only whether the default
>>>>    route has its MTU adjusted.
>>>>
>>>>    That doesn't make any sense to me if we then go and do the
>>>>    rt6_mtu_change() call unconditionally.  The route metric update
>>>>    and the rt6_mtu_change() go hand in hand.
>>>   Agreed but that gets interesting:
>>>
>>>   I guess during testing the cnf.mtu6 value was equal to the newly
>>>   announced mtu value, so the rt6_mtu_change call does not happen. We
>>>   update cnf.mtu6 so a second RA packet would actually bring the system
>>>   into the desired state but we have a moment where the default route
>>>   carries a too big MTU. That's not good.
>>   Agreed.
>>>   Easiest solution is to reorder those calls but that also leaves us with
>>>   a time frame where we carry the incorrect MTU on the default route.
>>>   Otherwise we must conditionally filter out the default routes.
>>>   Roman, any ideas?
>>   I think such an approach will work in practice, but it does not look
>>   very beautiful.
>>
>>   Maybe a better idea is to separate the per-route and per-device MTU,
>>   so that updating the per-device MTU does not affect the per-route MTU.
>>   The actual MTU can always be calculated as min(route_mtu, device_mtu),
>>   and we would not need to update the MTU on each route when, for
>>   instance, an RA MTU option is received.
>>
>>   Do you see any problems with such an approach?
>  If I understood you correctly, this actually seems to be quite an
>  intrusive change? :/ Can you show me some code for how to do this?

Too intrusive, really)

>  I would also dislike adding a filtering capability to the route MTU
>  updates. Currently I don't have a good idea, sorry.

Hmm, I thought a bit more about this issue, and it now seems to me that
there is no issue at all. If the RA MTU is larger than
ra_default_route_mtu, rt6_mtu_change() will not update the default
route, because dst_mtu(&rt->dst) != idev->cnf.mtu6:
	if (rt->dst.dev == arg->dev &&
	    !dst_metric_locked(&rt->dst, RTAX_MTU) &&
	    (dst_mtu(&rt->dst) >= arg->mtu ||
	     (dst_mtu(&rt->dst) < arg->mtu &&
	      dst_mtu(&rt->dst) == idev->cnf.mtu6))) {
		dst_metric_set(&rt->dst, RTAX_MTU, arg->mtu);
	}
So that case is fine.

Otherwise, if the RA MTU is lower than ra_default_route_mtu,
rt6_mtu_change() will lower the default route MTU, which is fine too.
There is a short period when a newly created default route has too
large an MTU, but that is not harmful, and it is exactly how things
work now when a new RA advertises a smaller MTU than the previous one.

Do I miss something?
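
The two cases can be checked with a small stand-alone model of the
condition quoted above. This is a sketch, not the kernel function: the
name and parameters are hypothetical, the device match is left out, and
idev_mtu6 stands for whatever idev->cnf.mtu6 holds at the moment the
check runs.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical pure-function model of the rt6_mtu_change() condition:
 * returns true if the route's RTAX_MTU metric would be set to new_mtu.
 *   route_mtu  - current dst_mtu() of the route
 *   idev_mtu6  - value of idev->cnf.mtu6 seen by the check
 *   new_mtu    - arg->mtu, the MTU being propagated
 *   locked     - dst_metric_locked(..., RTAX_MTU)
 */
static bool route_mtu_would_change(uint32_t route_mtu, uint32_t idev_mtu6,
				   uint32_t new_mtu, bool locked)
{
	return !locked &&
	       (route_mtu >= new_mtu ||
		(route_mtu < new_mtu && route_mtu == idev_mtu6));
}
```

For a default route held at 1500 with a link MTU of 9000, the model
returns false when an RA announces 9000 and true when an RA announces
1400, matching the two cases described above.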

Thanks!

Regards,
Roman

Thread overview: 20+ messages
2015-03-24 18:03 [PATCH] net: sysctl for RA default route MTU Roman Gushchin
2015-03-24 19:27 ` David Miller
2015-03-25  9:49   ` Roman Gushchin
2015-03-25 11:07   ` [PATCH v2] " Roman Gushchin
2015-03-25 12:34     ` Denis Kirjanov
2015-03-25 15:52     ` Hannes Frederic Sowa
     [not found]       ` <39171427301187@webcorp02f.yandex-team.ru>
2015-03-25 18:14         ` Hannes Frederic Sowa
2015-03-26 11:49           ` [PATCH v3] " Roman Gushchin
2015-03-26 14:49             ` Hannes Frederic Sowa
2015-03-29 19:42             ` David Miller
2015-03-30 12:30               ` Roman Gushchin
2015-03-31 20:05                 ` David Miller
2015-03-31 20:35                   ` Hannes Frederic Sowa
2015-03-31 20:49                     ` David Miller
2015-04-01  9:58                       ` Roman Gushchin
2015-04-01 17:55                         ` David Miller
2015-04-01 19:27                           ` Hannes Frederic Sowa
2015-04-02 18:08                             ` Roman Gushchin
2015-04-07 15:58                               ` Hannes Frederic Sowa
2015-04-08 19:03                                 ` Roman Gushchin
