* [PATCH net-next] ipv6: use DST_NOCOUNT in ip6_rt_pcpu_alloc()
@ 2020-05-08 14:34 Eric Dumazet
  2020-05-08 14:39 ` David Ahern
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Eric Dumazet @ 2020-05-08 14:34 UTC
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Eric Dumazet, Martin KaFai Lau,
	David Ahern, Wei Wang, Maciej Żenczykowski

We currently have to adjust the ipv6 route gc_thresh/max_size depending
on the number of cpus on a server, which makes very little sense.

If the kernel sets /proc/sys/net/ipv6/route/gc_thresh to 1024
and /proc/sys/net/ipv6/route/max_size to 4096, then we had better
not track the percpu dst that our implementation uses.

Only routes not added (directly or indirectly) by the admin
should be tracked and limited.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: David Ahern <dsahern@kernel.org>
Cc: Wei Wang <weiwan@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
---
 net/ipv6/route.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index a9072dba00f4fb0b61bce1fc0f44a3a81ba702fa..4292653af533bb641ae8571fffe45b39327d0380 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1377,7 +1377,7 @@ static struct rt6_info *ip6_rt_pcpu_alloc(const struct fib6_result *res)
 
 	rcu_read_lock();
 	dev = ip6_rt_get_dev_rcu(res);
-	pcpu_rt = ip6_dst_alloc(dev_net(dev), dev, flags);
+	pcpu_rt = ip6_dst_alloc(dev_net(dev), dev, flags | DST_NOCOUNT);
 	rcu_read_unlock();
 	if (!pcpu_rt) {
 		fib6_info_release(f6i);
-- 
2.26.2.645.ge9eca65c58-goog
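
Illustrative sketch, not part of the posted patch: the commit message argues that per-CPU dst copies should not be charged against gc_thresh/max_size. The toy user-space model below assumes a plain hard limit and one per-CPU copy per route. The names toy_dst_ops, toy_dst_alloc, toy_fill_pcpu and TOY_DST_NOCOUNT are hypothetical, and the real kernel accounting (garbage collection, per-CPU counters) is more involved than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag for this toy model; it plays the role of DST_NOCOUNT. */
#define TOY_DST_NOCOUNT 0x1u

/* Toy stand-in for the per-family dst accounting state. */
struct toy_dst_ops {
	int entries;	/* dst entries currently charged to the limit */
	int max_size;	/* hard cap, loosely analogous to route/max_size */
};

/*
 * Try to allocate one dst entry. Entries flagged TOY_DST_NOCOUNT are
 * neither counted nor limited; all others are refused once the cap
 * is reached. Returns true when the allocation is allowed.
 */
static bool toy_dst_alloc(struct toy_dst_ops *ops, unsigned int flags)
{
	assert(ops != NULL);

	if (flags & TOY_DST_NOCOUNT)
		return true;
	if (ops->entries >= ops->max_size)
		return false;
	ops->entries++;
	return true;
}

/*
 * Assumption of the model: every route gets one cached copy per CPU.
 * Returns how many of those allocations were refused.
 */
static int toy_fill_pcpu(struct toy_dst_ops *ops, int nr_routes,
			 int nr_cpus, unsigned int flags)
{
	int failed = 0;

	for (int r = 0; r < nr_routes; r++)
		for (int cpu = 0; cpu < nr_cpus; cpu++)
			if (!toy_dst_alloc(ops, flags))
				failed++;
	return failed;
}
```

Under these assumptions, with max_size set to 4096, 100 routes on 64 CPUs request 6400 counted entries and 2304 of them are refused, whereas passing TOY_DST_NOCOUNT leaves the counter at zero whatever the CPU count.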



* Re: [PATCH net-next] ipv6: use DST_NOCOUNT in ip6_rt_pcpu_alloc()
  2020-05-08 14:34 [PATCH net-next] ipv6: use DST_NOCOUNT in ip6_rt_pcpu_alloc() Eric Dumazet
@ 2020-05-08 14:39 ` David Ahern
  2020-05-08 14:43   ` Eric Dumazet
  2020-05-08 16:31 ` Wei Wang
  2020-05-09  5:34 ` Jakub Kicinski
  2 siblings, 1 reply; 7+ messages in thread
From: David Ahern @ 2020-05-08 14:39 UTC
  To: Eric Dumazet, David S . Miller
  Cc: netdev, Eric Dumazet, Martin KaFai Lau, David Ahern, Wei Wang,
	Maciej Żenczykowski

On 5/8/20 8:34 AM, Eric Dumazet wrote:
> We currently have to adjust the ipv6 route gc_thresh/max_size depending
> on the number of cpus on a server, which makes very little sense.
> 
> If the kernel sets /proc/sys/net/ipv6/route/gc_thresh to 1024
> and /proc/sys/net/ipv6/route/max_size to 4096, then we had better
> not track the percpu dst that our implementation uses.
> 
> Only routes not added (directly or indirectly) by the admin
> should be tracked and limited.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Martin KaFai Lau <kafai@fb.com>
> Cc: David Ahern <dsahern@kernel.org>
> Cc: Wei Wang <weiwan@google.com>
> Cc: Maciej Żenczykowski <maze@google.com>
> ---
>  net/ipv6/route.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index a9072dba00f4fb0b61bce1fc0f44a3a81ba702fa..4292653af533bb641ae8571fffe45b39327d0380 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -1377,7 +1377,7 @@ static struct rt6_info *ip6_rt_pcpu_alloc(const struct fib6_result *res)
>  
>  	rcu_read_lock();
>  	dev = ip6_rt_get_dev_rcu(res);
> -	pcpu_rt = ip6_dst_alloc(dev_net(dev), dev, flags);
> +	pcpu_rt = ip6_dst_alloc(dev_net(dev), dev, flags | DST_NOCOUNT);
>  	rcu_read_unlock();
>  	if (!pcpu_rt) {
>  		fib6_info_release(f6i);
> 

At this point in IPv6's evolution it seems like it can align more with
IPv4 and just get rid of the dst limits completely.
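
Illustrative sketch, not part of the thread: to see which limits are being discussed on a given host, the two sysctl files named in the commit message can be read directly. The helper names read_sysctl_long and print_ipv6_route_limits are hypothetical; only the two /proc paths come from the patch description, and the code assumes each file holds a single decimal integer.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read one decimal integer from a procfs-style file.
 * Returns 0 on success and -1 if the file cannot be opened or parsed.
 */
static int read_sysctl_long(const char *path, long *out)
{
	FILE *f;
	int parsed;

	assert(path != NULL && out != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;
	parsed = (fscanf(f, "%ld", out) == 1);
	fclose(f);
	return parsed ? 0 : -1;
}

/* Print the two IPv6 route limits mentioned in the commit message. */
static void print_ipv6_route_limits(void)
{
	static const char *const paths[] = {
		"/proc/sys/net/ipv6/route/gc_thresh",
		"/proc/sys/net/ipv6/route/max_size",
	};

	for (size_t i = 0; i < sizeof(paths) / sizeof(paths[0]); i++) {
		long value;

		if (read_sysctl_long(paths[i], &value) == 0)
			printf("%s = %ld\n", paths[i], value);
		else
			printf("%s: unavailable\n", paths[i]);
	}
}
```

Calling print_ipv6_route_limits() from a small main() prints one line per file, or "unavailable" on systems where the path does not exist.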


* Re: [PATCH net-next] ipv6: use DST_NOCOUNT in ip6_rt_pcpu_alloc()
  2020-05-08 14:39 ` David Ahern
@ 2020-05-08 14:43   ` Eric Dumazet
  2020-05-08 15:03     ` David Ahern
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2020-05-08 14:43 UTC
  To: David Ahern, Eric Dumazet, David S . Miller
  Cc: netdev, Eric Dumazet, Martin KaFai Lau, David Ahern, Wei Wang,
	Maciej Żenczykowski



On 5/8/20 7:39 AM, David Ahern wrote:
> On 5/8/20 8:34 AM, Eric Dumazet wrote:
>> We currently have to adjust the ipv6 route gc_thresh/max_size depending
>> on the number of cpus on a server, which makes very little sense.
>>
>> If the kernel sets /proc/sys/net/ipv6/route/gc_thresh to 1024
>> and /proc/sys/net/ipv6/route/max_size to 4096, then we had better
>> not track the percpu dst that our implementation uses.
>>
>> Only routes not added (directly or indirectly) by the admin
>> should be tracked and limited.
>>
>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>> Cc: Martin KaFai Lau <kafai@fb.com>
>> Cc: David Ahern <dsahern@kernel.org>
>> Cc: Wei Wang <weiwan@google.com>
>> Cc: Maciej Żenczykowski <maze@google.com>
>> ---
>>  net/ipv6/route.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>> index a9072dba00f4fb0b61bce1fc0f44a3a81ba702fa..4292653af533bb641ae8571fffe45b39327d0380 100644
>> --- a/net/ipv6/route.c
>> +++ b/net/ipv6/route.c
>> @@ -1377,7 +1377,7 @@ static struct rt6_info *ip6_rt_pcpu_alloc(const struct fib6_result *res)
>>  
>>  	rcu_read_lock();
>>  	dev = ip6_rt_get_dev_rcu(res);
>> -	pcpu_rt = ip6_dst_alloc(dev_net(dev), dev, flags);
>> +	pcpu_rt = ip6_dst_alloc(dev_net(dev), dev, flags | DST_NOCOUNT);
>>  	rcu_read_unlock();
>>  	if (!pcpu_rt) {
>>  		fib6_info_release(f6i);
>>
> 
> At this point in IPv6's evolution it seems like it can align more with
> IPv4 and just get rid of the dst limits completely.
> 

This patch can be backported without any pains ;)

Getting rid of limits, even for exceptions ?


* Re: [PATCH net-next] ipv6: use DST_NOCOUNT in ip6_rt_pcpu_alloc()
  2020-05-08 14:43   ` Eric Dumazet
@ 2020-05-08 15:03     ` David Ahern
  2020-05-08 15:13       ` Eric Dumazet
  0 siblings, 1 reply; 7+ messages in thread
From: David Ahern @ 2020-05-08 15:03 UTC
  To: Eric Dumazet, Eric Dumazet, David S . Miller
  Cc: netdev, Martin KaFai Lau, David Ahern, Wei Wang,
	Maciej Żenczykowski

On 5/8/20 8:43 AM, Eric Dumazet wrote:
> This patch can be backported without any pains ;)

sure, but you tagged it as net-next, not net.

> 
> Getting rid of limits, even for exceptions ?

Running through where dst entries are created in IPv6:
1. pcpu cache
2. uncached_list
3. exceptions like pmtu and redirect

All of those match IPv4 and as I recall IPv4 does not have any limits,
even on exceptions and redirect. If IPv4 does not have limits, why
should IPv6? And if the argument is uncontrolled memory consumption, is
there an expectation that IPv6 generates more exceptions?

My argument really just boils down to consistency between them. IPv4
does not use DST_NOCOUNT, so why put that burden on v6?


* Re: [PATCH net-next] ipv6: use DST_NOCOUNT in ip6_rt_pcpu_alloc()
  2020-05-08 15:03     ` David Ahern
@ 2020-05-08 15:13       ` Eric Dumazet
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2020-05-08 15:13 UTC
  To: David Ahern
  Cc: Eric Dumazet, David S . Miller, netdev, Martin KaFai Lau,
	David Ahern, Wei Wang, Maciej Żenczykowski

On Fri, May 8, 2020 at 8:04 AM David Ahern <dsahern@gmail.com> wrote:
>
> On 5/8/20 8:43 AM, Eric Dumazet wrote:
> > This patch can be backported without any pains ;)
>
> sure, but you tagged it as net-next, not net.

Because it is not a recent regression, and we are late in the rc cycle.

I tend to push patches to net-next, then ask later for stable
backports once we are sure no regression was added,
even for patches that look 'very safe' ;)



>
> >
> > Getting rid of limits, even for exceptions ?
>
> Running through where dst entries are created in IPv6:
> 1. pcpu cache
> 2. uncached_list
> 3. exceptions like pmtu and redirect
>
> All of those match IPv4 and as I recall IPv4 does not have any limits,
> even on exceptions and redirect. If IPv4 does not have limits, why
> should IPv6? And if the argument is uncontrolled memory consumption, is
> there an expectation that IPv6 generates more exceptions?
>
> My argument really just boils down to consistency between them. IPv4
> does not use DST_NOCOUNT, so why put that burden on v6?

That is something that needs further investigation.
Too many fires on my plate at the moment.

My patch stops the bleeding right now.

Thanks.


* Re: [PATCH net-next] ipv6: use DST_NOCOUNT in ip6_rt_pcpu_alloc()
  2020-05-08 14:34 [PATCH net-next] ipv6: use DST_NOCOUNT in ip6_rt_pcpu_alloc() Eric Dumazet
  2020-05-08 14:39 ` David Ahern
@ 2020-05-08 16:31 ` Wei Wang
  2020-05-09  5:34 ` Jakub Kicinski
  2 siblings, 0 replies; 7+ messages in thread
From: Wei Wang @ 2020-05-08 16:31 UTC
  To: Eric Dumazet
  Cc: David S . Miller, netdev, Eric Dumazet, Martin KaFai Lau,
	David Ahern, Maciej Żenczykowski

On Fri, May 8, 2020 at 7:34 AM Eric Dumazet <edumazet@google.com> wrote:
>
> We currently have to adjust the ipv6 route gc_thresh/max_size depending
> on the number of cpus on a server, which makes very little sense.
>
> If the kernel sets /proc/sys/net/ipv6/route/gc_thresh to 1024
> and /proc/sys/net/ipv6/route/max_size to 4096, then we had better
> not track the percpu dst that our implementation uses.
>
> Only routes not added (directly or indirectly) by the admin
> should be tracked and limited.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Martin KaFai Lau <kafai@fb.com>
> Cc: David Ahern <dsahern@kernel.org>
> Cc: Wei Wang <weiwan@google.com>
> Cc: Maciej Żenczykowski <maze@google.com>
> ---


Acked-by: Wei Wang <weiwan@google.com>

>
>  net/ipv6/route.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index a9072dba00f4fb0b61bce1fc0f44a3a81ba702fa..4292653af533bb641ae8571fffe45b39327d0380 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -1377,7 +1377,7 @@ static struct rt6_info *ip6_rt_pcpu_alloc(const struct fib6_result *res)
>
>         rcu_read_lock();
>         dev = ip6_rt_get_dev_rcu(res);
> -       pcpu_rt = ip6_dst_alloc(dev_net(dev), dev, flags);
> +       pcpu_rt = ip6_dst_alloc(dev_net(dev), dev, flags | DST_NOCOUNT);
>         rcu_read_unlock();
>         if (!pcpu_rt) {
>                 fib6_info_release(f6i);
> --
> 2.26.2.645.ge9eca65c58-goog
>


* Re: [PATCH net-next] ipv6: use DST_NOCOUNT in ip6_rt_pcpu_alloc()
  2020-05-08 14:34 [PATCH net-next] ipv6: use DST_NOCOUNT in ip6_rt_pcpu_alloc() Eric Dumazet
  2020-05-08 14:39 ` David Ahern
  2020-05-08 16:31 ` Wei Wang
@ 2020-05-09  5:34 ` Jakub Kicinski
  2 siblings, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2020-05-09  5:34 UTC
  To: Eric Dumazet
  Cc: David S . Miller, netdev, Eric Dumazet, Martin KaFai Lau,
	David Ahern, Wei Wang, Maciej Żenczykowski

On Fri,  8 May 2020 07:34:14 -0700 Eric Dumazet wrote:
> We currently have to adjust the ipv6 route gc_thresh/max_size depending
> on the number of cpus on a server, which makes very little sense.
> 
> If the kernel sets /proc/sys/net/ipv6/route/gc_thresh to 1024
> and /proc/sys/net/ipv6/route/max_size to 4096, then we had better
> not track the percpu dst that our implementation uses.
> 
> Only routes not added (directly or indirectly) by the admin
> should be tracked and limited.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thank you!

