From: David Ahern <email@example.com>
To: Alarig Le Lay <firstname.lastname@example.org>, email@example.com, firstname.lastname@example.org, Vincent Bernat <email@example.com>
Subject: Re: IPv6 regression introduced by commit 3b6761d18bc11f2af2a6fc494e9026d39593f22c
Date: Sat, 7 Mar 2020 17:52:10 -0700
Message-ID: <firstname.lastname@example.org>
In-Reply-To: <email@example.com>

On 3/5/20 1:17 AM, Alarig Le Lay wrote:
> Hi,
>
> On the bird users ML, we discussed a bug we’re facing when having a
> full table: from time to time all the IPv6 traffic is dropped (and all
> neighbors are invalidated), after a while it comes back again, then wait
> a few minutes and it’s dropped again, and so on.

Kernel version? Are you monitoring neighbor states with 'ip monitor' or
something else?

>
> Basil Fillan determined that it comes from the commit
> 3b6761d18bc11f2af2a6fc494e9026d39593f22c.
> ...
> We've also experienced this after upgrading a few routers to Debian Buster.
> With a kernel bisect we found that a bug was introduced in the following
> commit:
>
> 3b6761d18bc11f2af2a6fc494e9026d39593f22c
>
> This bug was still present in master as of a few weeks ago.
>
> It appears entries are added to the IPv6 route cache which aren't visible from
> "ip -6 route show cache", but are causing the route cache garbage collection
> system to trigger extremely often (every packet?) once it exceeds the value of
> net.ipv6.route.max_size. Our original symptom was extreme forwarding jitter
> caused within the ip6_dst_gc function (identified by some spelunking with
> systemtap & perf) worsening as the size of the cache increased. This was due
> to our max_size sysctl inadvertently being set to 1 million. Reducing this
> value to the default 4096 broke IPv6 forwarding entirely on our test system
> under affected kernels. Our documentation had this sysctl marked as the
> maximum number of IPv6 routes, so it looks like the use changed at some point.
>
> We've rolled our routers back to kernel 4.9 (with the sysctl set to 4096) for
> now, which fixed our immediate issue.
>
> You can reproduce this by adding more than 4096 (default value of the sysctl)
> routes to the kernel and running "ip route get" for each of them. Once the
> route cache is filled, the error "RTNETLINK answers: Network is unreachable"
> will be received for each subsequent "ip route get" incantation, and v6
> connectivity will be interrupted.
>

The above does not reproduce for me on 5.6 or 4.19, and I would have
been really surprised if it had, so I have to question the git bisect
result.

There is no limit on fib entries, and the number of FIB entries has no
impact on the sysctl in question, net.ipv6.route.max_size. That sysctl
limits the number of dst_entry instances. When the threshold is exceeded
(and the gc_thresh for ipv6 defaults to 1024), each new alloc attempts
to free one via gc. There are many legitimate reasons why 4k entries
may have been created: mtu exceptions, redirects, per-cpu caching,
vrfs, ...

In 4.9 FIB entries are created as an rt6_info, which is a v6 wrapper
around dst_entry. That changed in 4.15 or 4.16 (I forget which now), and
the commit you reference above is part of the refactoring to make IPv6
more like IPv4, with a different, smaller data structure for fib entries.
A lot of other changes have also gone into IPv6 between 4.9 and top of
tree, and at this point the whole gc thing can probably go away for v6
like it was removed for ipv4.

Try the 5.4 LTS and see if you still hit a problem.
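To see where a machine stands relative to the two limits discussed above, a
minimal Python sketch follows. It is an illustration under stated
assumptions, not part of the original message: the function names are
hypothetical, and the seven-field hexadecimal layout of /proc/net/rt6_stats
(with the dst entry count in the sixth field) is assumed from the kernel
source and may differ between kernel versions. The sysctls are read from
/proc/sys/net/ipv6/route/.

```python
from pathlib import Path

# ASSUMPTION: field order of /proc/net/rt6_stats (all values hexadecimal).
RT6_STATS_FIELDS = (
    "fib_nodes", "fib_route_nodes", "fib_rt_alloc", "fib_rt_entries",
    "fib_rt_cache", "dst_entries", "fib_discarded_routes",
)


def parse_rt6_stats(line):
    """Parse one rt6_stats line into a dict of integers."""
    values = [int(token, 16) for token in line.split()]
    if len(values) != len(RT6_STATS_FIELDS):
        raise ValueError("unexpected rt6_stats layout: %r" % line)
    return dict(zip(RT6_STATS_FIELDS, values))


def gc_pressure(dst_entries, gc_thresh, max_size):
    """Compare the dst entry count with the two sysctl limits."""
    if dst_entries > max_size:
        return "over max_size"
    if dst_entries > gc_thresh:
        return "over gc_thresh"
    return "below gc_thresh"


def read_sysctl(name):
    """Read an integer sysctl from /proc/sys/net/ipv6/route/."""
    return int(Path("/proc/sys/net/ipv6/route", name).read_text().split()[0])


def main():
    try:
        stats = parse_rt6_stats(Path("/proc/net/rt6_stats").read_text())
        gc_thresh = read_sysctl("gc_thresh")
        max_size = read_sysctl("max_size")
    except (OSError, ValueError) as exc:
        print("cannot read IPv6 route statistics:", exc)
        return
    print("FIB route entries:", stats["fib_rt_entries"])
    print("dst entries:", stats["dst_entries"],
          "(gc_thresh=%d, max_size=%d)" % (gc_thresh, max_size))
    print("state:", gc_pressure(stats["dst_entries"], gc_thresh, max_size))


if __name__ == "__main__":
    main()
```

Printing the FIB route count next to the dst entry count makes the
distinction drawn above visible: a full table raises the first number
without, by itself, moving the second toward max_size.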