* IPv6 routing table max_size badly dimensioned compared to IPv4
@ 2014-02-27 19:24 bert hubert
  2014-02-27 19:59 ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: bert hubert @ 2014-02-27 19:24 UTC
  To: netdev

Hi everybody,

Today, a PowerDNS (open source DNS, www.powerdns.com) deployment ran into
trouble with a large number of IPv6 users.  It appears a large telco 'flicked
the switch'.  We had around 8000 DNS queries/s over IPv6, and everything
slowed to a crawl: 100% CPU utilization, most of it in the kernel. The same
number of queries over IPv4 causes no problems.

Note, this system is not functioning as a router or anything. It is just
serving IPv6 DNS to a reasonable number of clients.

Thanks to diligent debugging and rapid help from friends over at SUSE, who
suggested setting net.ipv6.route.max_size to a higher than default value,
all problems were quickly resolved (thanks!).
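
A minimal sketch of checking this limit on a Linux host. It assumes the
sysctl is exposed at /proc/sys/net/ipv6/route/max_size (the usual /proc
mapping of the sysctl name); the helper names and the 1048576 target are
illustrative assumptions, not values taken from this deployment.

```python
from pathlib import Path

# Assumed /proc location backing the net.ipv6.route.max_size sysctl.
MAX_SIZE_PATH = Path("/proc/sys/net/ipv6/route/max_size")


def read_max_size(path=MAX_SIZE_PATH):
    """Return the current limit as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except OSError:
        # File missing (e.g. IPv6 disabled, non-Linux host) or unreadable.
        return None


def needs_raise(current, wanted):
    """True when the limit is known and lower than the wanted value."""
    return current is not None and current < wanted


if __name__ == "__main__":
    current = read_max_size()
    print("net.ipv6.route.max_size =", current)
    # 1048576 is an arbitrary illustrative target, not a recommendation.
    if needs_raise(current, 1048576):
        print("consider: sysctl -w net.ipv6.route.max_size=1048576")
```

The script only reads the value; changing it is left to sysctl so that the
choice of a new limit stays an explicit administrator decision.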

From a quick reading of ip6_dst_gc, it is obvious that exceeding the
max_size of the IPv6 routing table quickly becomes painful, causing non-stop
gc scans.
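
To make the cost of "non-stop gc scans" concrete, here is a toy Python
model. It is NOT the kernel's ip6_dst_gc algorithm: it simply assumes one
insert per tick, entries that expire a fixed number of ticks after
insertion, and a full scan on every insert that finds the table above its
limit. All names and parameters are hypothetical.

```python
def scan_work(inserts, max_size, lifetime):
    """Count entries visited by gc scans in a toy capped table.

    Toy model only: time advances one tick per insert, an entry expires
    `lifetime` ticks after insertion, and a full scan runs on every
    insert that leaves the table above `max_size`.
    """
    expiries = []  # expiry tick of each live entry, oldest first
    visited = 0
    for now in range(inserts):
        expiries.append(now + lifetime)
        if len(expiries) > max_size:
            # The scan walks every entry, expired or not.
            visited += len(expiries)
            # Only entries whose lifetime has passed can be freed.
            expiries = [t for t in expiries if t > now]
    return visited


# A limit far below rate * lifetime means a scan on almost every insert.
print(scan_work(20, 4, 10))    # 155 entries visited for 20 inserts
print(scan_work(20, 100, 10))  # 0: the limit is never exceeded
```

In this model, once the limit is smaller than the number of entries that
are legitimately still alive, every insert pays for a scan that frees
almost nothing.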

net.ipv6.route.max_size defaults to 4096. The equivalent setting for IPv4
defaults to 'millions' or is even dynamically sized in modern kernels.

Now I know distributions can set this sysctl at will, but it appears that
many of them don't. It does appear odd that we still assume at a kernel
level that IPv6 is 'rare', a thousand times more rare than IPv4.

If people think this is a good idea, I could try to lift some of the other
'autosizing' code out there to get the IPv6 max_size limit raised on
non-constrained hardware.

Please let me know!

-- 
PowerDNS Website: http://www.powerdns.com/
Contact us by phone on +31-15-7850372


* Re: IPv6 routing table max_size badly dimensioned compared to IPv4
  2014-02-27 19:24 IPv6 routing table max_size badly dimensioned compared to IPv4 bert hubert
@ 2014-02-27 19:59 ` Eric Dumazet
  2014-02-27 20:05   ` bert hubert
  2014-02-27 20:08   ` Hannes Frederic Sowa
  0 siblings, 2 replies; 10+ messages in thread
From: Eric Dumazet @ 2014-02-27 19:59 UTC
  To: bert hubert; +Cc: netdev

On Thu, 2014-02-27 at 20:24 +0100, bert hubert wrote:
> [...]

What kernel version do you use ?

I thought this was already solved.

Commit 957c665f37007de93ccbe45902a23143724170d0 is in linux 3.0
("ipv6: Don't put artificial limit on routing table size.")


* Re: IPv6 routing table max_size badly dimensioned compared to IPv4
  2014-02-27 19:59 ` Eric Dumazet
@ 2014-02-27 20:05   ` bert hubert
  2014-02-27 20:08   ` Hannes Frederic Sowa
  1 sibling, 0 replies; 10+ messages in thread
From: bert hubert @ 2014-02-27 20:05 UTC
  To: Eric Dumazet; +Cc: netdev

On Thu, Feb 27, 2014 at 11:59:04AM -0800, Eric Dumazet wrote:
> What kernel version do you use ?

Thanks for the fast reply!

It calls itself "Linux node28 3.0.101-0.15-default #1 SMP Wed Jan 22
15:49:03 UTC 2014 (5c01f4e) x86_64 x86_64 x86_64 GNU/Linux"

Which is from "SLES11 SP3". I don't know how to interpret '3.0.101' compared
to 957c665f37007de93ccbe45902a23143724170d0, though.

From what I see in git HEAD, however, the number 4096 is still actually
used for things.

	Bert

> 
> I thought this was already solved.
> 
> Commit 957c665f37007de93ccbe45902a23143724170d0 is in linux 3.0
> ("ipv6: Don't put artificial limit on routing table size.")


* Re: IPv6 routing table max_size badly dimensioned compared to IPv4
  2014-02-27 19:59 ` Eric Dumazet
  2014-02-27 20:05   ` bert hubert
@ 2014-02-27 20:08   ` Hannes Frederic Sowa
  2014-02-27 20:23     ` Eric Dumazet
  1 sibling, 1 reply; 10+ messages in thread
From: Hannes Frederic Sowa @ 2014-02-27 20:08 UTC
  To: Eric Dumazet; +Cc: bert hubert, netdev

On Thu, Feb 27, 2014 at 11:59:04AM -0800, Eric Dumazet wrote:
> On Thu, 2014-02-27 at 20:24 +0100, bert hubert wrote:
> > [...]
> 
> What kernel version do you use ?
> 
> I thought this was already solved.
> 
> Commit 957c665f37007de93ccbe45902a23143724170d0 is in linux 3.0
> ("ipv6: Don't put artificial limit on routing table size.")

The problem with DNS is that just for the DNS/UDP ping-pong we clone a
rt6_info and reinsert it back into the fib.

DST_NOCOUNT logic only ensures we can still insert new routes from user space
while the maximum number of routes is reached in the routing table. In case
the ipv6 fib is under heavy load it does not help with performance.

We might think about raising this limit a bit. The number of IPv6 routing
entries in the global routing table has already passed 4096. Maybe some
heuristic which scales this value according to available memory?
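
One possible shape for such a heuristic, sketched in Python rather than
kernel C. The per-entry cost and the share of memory granted to routes are
made-up assumptions chosen only to show the scaling; nothing here reflects
an actual kernel patch.

```python
DEFAULT_MAX_SIZE = 4096        # the default discussed in this thread
ASSUMED_BYTES_PER_ENTRY = 512  # hypothetical memory cost of one route
ASSUMED_MEMORY_DIVISOR = 100   # hypothetical: spend at most 1% of RAM


def scaled_max_size(total_memory_bytes):
    """Scale the limit with memory, never dropping below the default."""
    budget = total_memory_bytes // ASSUMED_MEMORY_DIVISOR
    return max(DEFAULT_MAX_SIZE, budget // ASSUMED_BYTES_PER_ENTRY)


print(scaled_max_size(64 * 2**20))  # small box: stays at 4096
print(scaled_max_size(2**30))       # 1 GiB: 20971 under these assumptions
```

Keeping the existing default as a floor means small systems behave exactly
as before, while larger machines get a limit proportional to their memory.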

Greetings,

  Hannes


* Re: IPv6 routing table max_size badly dimensioned compared to IPv4
  2014-02-27 20:08   ` Hannes Frederic Sowa
@ 2014-02-27 20:23     ` Eric Dumazet
  2014-02-27 20:50       ` Hannes Frederic Sowa
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2014-02-27 20:23 UTC
  To: Hannes Frederic Sowa; +Cc: bert hubert, netdev

On Thu, 2014-02-27 at 21:08 +0100, Hannes Frederic Sowa wrote:
> On Thu, Feb 27, 2014 at 11:59:04AM -0800, Eric Dumazet wrote:
> > On Thu, 2014-02-27 at 20:24 +0100, bert hubert wrote:
> > > [...]
> > 
> > What kernel version do you use ?
> > 
> > I thought this was already solved.
> > 
> > Commit 957c665f37007de93ccbe45902a23143724170d0 is in linux 3.0
> > ("ipv6: Don't put artificial limit on routing table size.")
> 
> The problem with DNS is that just for the DNS/UDP ping-pong we clone a
> rt6_info and reinsert it back into the fib.
> 
> DST_NOCOUNT logic only ensures we can still insert new routes from user space
> while the maximum number of routes is reached in the routing table. In case
> the ipv6 fib is under heavy load it does not help with performance.
> 
> We might think about raising this limit a bit. Number of IPv6 routing entries
> in global routing table already passed 4096. Maybe some heuristic which scales
> this value according to available memory?


Well, if we attach one dst per packet, it seems we already have a limit
on the number of packets on the host (qdisc limits + packets on TX ring
buffers).

So DST_NOCOUNT should be used in this case. I thought David's patch
was already doing this.


* Re: IPv6 routing table max_size badly dimensioned compared to IPv4
  2014-02-27 20:23     ` Eric Dumazet
@ 2014-02-27 20:50       ` Hannes Frederic Sowa
  2014-02-27 21:02         ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Hannes Frederic Sowa @ 2014-02-27 20:50 UTC
  To: Eric Dumazet; +Cc: bert hubert, netdev

On Thu, Feb 27, 2014 at 12:23:03PM -0800, Eric Dumazet wrote:
> > > What kernel version do you use ?
> > > 
> > > I thought this was already solved.
> > > 
> > > Commit 957c665f37007de93ccbe45902a23143724170d0 is in linux 3.0
> > > ("ipv6: Don't put artificial limit on routing table size.")
> > 
> > The problem with DNS is that just for the DNS/UDP ping-pong we clone a
> > rt6_info and reinsert it back into the fib.
> > 
> > DST_NOCOUNT logic only ensures we can still insert new routes from user space
> > while the maximum number of routes is reached in the routing table. In case
> > the ipv6 fib is under heavy load it does not help with performance.
> > 
> > We might think about raising this limit a bit. Number of IPv6 routing entries
> > in global routing table already passed 4096. Maybe some heuristic which scales
> > this value according to available memory?
> 
> 
> Well, if we attach one dst per packet, it seems we already have a limit
> of number of packets on the host (qdisc limits + packets on TX ring
> buffers)
> 
> So DST_NOCOUNT should be used in this case. I thought David's patch
> was already doing this.

We store those routes back into the routing table, so we must have a way to
count them and trigger gc at some point.

Greetings,

  Hannes


* Re: IPv6 routing table max_size badly dimensioned compared to IPv4
  2014-02-27 20:50       ` Hannes Frederic Sowa
@ 2014-02-27 21:02         ` Eric Dumazet
  2014-02-27 22:47           ` David Miller
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2014-02-27 21:02 UTC
  To: Hannes Frederic Sowa; +Cc: bert hubert, netdev

On Thu, 2014-02-27 at 21:50 +0100, Hannes Frederic Sowa wrote:

> We store those routes back into the routing table, so we must have a way to
> count them and trigger gc at some point.

Right, and the current implementation will not scale.

If we need to perform 10000 inserts per second, and the gc timeout is 60
seconds, the tree contains 600,000 entries, and gc takes forever...
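
The figure above is plain rate-times-lifetime arithmetic. A short Python
sketch, assuming every inserted entry stays in the tree for the full gc
timeout (the function name is hypothetical):

```python
def steady_state_entries(inserts_per_second, gc_timeout_seconds):
    """Entries alive at once if each lingers for the full gc timeout."""
    return inserts_per_second * gc_timeout_seconds


entries = steady_state_entries(10_000, 60)
print(entries)          # 600000
print(entries // 4096)  # 146: how many times the 4096 default fits in that
```

Under that assumption the tree holds roughly 146 times more entries than
the 4096 limit discussed earlier in the thread.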


* Re: IPv6 routing table max_size badly dimensioned compared to IPv4
  2014-02-27 21:02         ` Eric Dumazet
@ 2014-02-27 22:47           ` David Miller
  2014-02-28 11:07             ` bert hubert
  0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2014-02-27 22:47 UTC
  To: eric.dumazet; +Cc: hannes, bert.hubert, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 27 Feb 2014 13:02:00 -0800

> On Thu, 2014-02-27 at 21:50 +0100, Hannes Frederic Sowa wrote:
> 
>> We store those routes back into the routing table, so we must have a way to
>> count them and trigger gc at some point.
> 
> Right, and current implementation will not scale.
> 
> If we need to perform 10000 inserts per second, and gc timeout is 60
> seconds, tree contains 600.000 entries, gc takes forever...

The only long term solution is to align ipv6 to be more like ipv4.

What's interesting is that if you look at the code, the original
author clearly intended to let callers use routes from the tree
as-is without cloning.


* Re: IPv6 routing table max_size badly dimensioned compared to IPv4
  2014-02-27 22:47           ` David Miller
@ 2014-02-28 11:07             ` bert hubert
  2014-04-04 16:03               ` Lukas Tribus
  0 siblings, 1 reply; 10+ messages in thread
From: bert hubert @ 2014-02-28 11:07 UTC
  To: David Miller; +Cc: eric.dumazet, hannes, netdev

On Thu, Feb 27, 2014 at 05:47:55PM -0500, David Miller wrote:
> > If we need to perform 10000 inserts per second, and gc timeout is 60
> > seconds, tree contains 600.000 entries, gc takes forever...
> 
> The only long term solution is to align ipv6 to be more like ipv4.

Yes please. There appears to be a lingering assumption that IPv6 is small
scale.

T-Mobile USA is now providing IPv6-only service via DNS64/NAT64, creating
(dozens of) millions of IPv6-only clients. (More on this 'hack' at
http://blog.powerdns.com/2013/05/17/ripe-66-powerdns-and-dns64nat64/ )

So aligning IPv6 scalability with IPv4 scalability would be grand, thanks!

	Bert


* RE: IPv6 routing table max_size badly dimensioned compared to IPv4
  2014-02-28 11:07             ` bert hubert
@ 2014-04-04 16:03               ` Lukas Tribus
  0 siblings, 0 replies; 10+ messages in thread
From: Lukas Tribus @ 2014-04-04 16:03 UTC
  To: bert hubert, David Miller; +Cc: eric.dumazet, hannes, netdev

> On Thu, Feb 27, 2014 at 05:47:55PM -0500, David Miller wrote:
>>> If we need to perform 10000 inserts per second, and gc timeout is 60
>>> seconds, tree contains 600.000 entries, gc takes forever...
>>
>> The only long term solution is to align ipv6 to be more like ipv4.
>
> Yes please. There appears to be a lingering assumption IPv6 is small scale.
>
> T-Mobile USA is now providing IPv6 only service via DNS64/NAT64, creating
> (dozens of) millions of IPv6 only clients. (More on this 'hack' on
> http://blog.powerdns.com/2013/05/17/ripe-66-powerdns-and-dns64nat64/ )
>
> So aligning IPv6 scalability with IPv4 scalability would be grand, thanks!

FYI, this was also mentioned by Paul Saab from Facebook in a
presentation [1] during last month's V6 World Congress [2].

It looks like more and more people are starting to hit this.



Best regards,

Lukas


[1] https://www.facebook.com/groups/2234775539/10152303014725540/
[2] http://www.uppersideconferences.com/v6world2014/v6world2014introduction.html
