From: Baptiste Jonglez <>
To: David Ahern <>
Cc: Alarig Le Lay <>,,,
	Vincent Bernat <>, Oliver <>
Subject: Re: IPv6 regression introduced by commit 3b6761d18bc11f2af2a6fc494e9026d39593f22c
Date: Sun, 27 Sep 2020 17:35:52 +0200
Message-ID: <20200927153552.GA471334@fedic>
In-Reply-To: <>


We are seeing the same issue, more information below.

On 07-03-20, David Ahern wrote:
> On 3/5/20 1:17 AM, Alarig Le Lay wrote:
> > Hi,
> > 
> > On the bird users ML, we discussed a bug we’re facing when having a
> > full table: from time to time all the IPv6 traffic is dropped (and all
> > neighbors are invalidated), after a while it comes back again, then wait
> > a few minutes and it’s dropped again, and so on.
> Kernel version?

We are seeing the issue with 4.19 (debian stable) and 5.4 (debian
stable backports from a few months ago).  Others reported still seeing
the issue with 5.7:

Interestingly, the issue manifests itself in several different ways:

1) failing IPv6 neighbours, which is what Alarig reported.  We are
   seeing this on a full-view BGP router with a rather low amount of
   IPv6 traffic (around 10-20 Mbps)

2) high jitter when forwarding IPv6 traffic: this was in the original
   report from Basil and also here:

3) system lockup: the system becomes unresponsive, with messages like:

     watchdog: BUG: soft lockup - CPU#X stuck for XXs!

   and messages about transmit timeouts from the NIC driver.

   This happened to us on a router that has a BGP full view and
   handles around 50-100 Mbps of IPv6 traffic, which probably means
   lots of route lookups.  It happened with both 4.19 and 5.4.  On the
   other hand, kernel 4.9 runs fine on that exact same router (we are
   running debian buster with the old kernel from debian stretch).

When we can't use an older kernel, our current workaround is the
following sysctl config:

    net.ipv6.route.gc_thresh = 100000
    net.ipv6.route.max_size = 400000

From my understanding, this works because it basically disables the gc
in most cases.
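
To confirm that a router is actually running with the workaround, the
two values can be read back from /proc/sys.  This is only a minimal
sketch: the helper names (sysctl_path, read_sysctl, find_mismatches)
are made up for illustration, and it assumes the usual mapping of a
sysctl name to a path under /proc/sys by replacing dots with slashes.

```python
from pathlib import Path

# Values from the workaround above.
WANTED = {
    "net.ipv6.route.gc_thresh": 100000,
    "net.ipv6.route.max_size": 400000,
}


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under /proc/sys (assumed layout)."""
    return Path(root) / name.replace(".", "/")


def read_sysctl(name, root="/proc/sys"):
    """Return the first integer stored in the sysctl file."""
    return int(sysctl_path(name, root).read_text().split()[0])


def find_mismatches(wanted=WANTED, root="/proc/sys"):
    """Return {name: (expected, actual)} for every value that differs."""
    mismatches = {}
    for name, expected in wanted.items():
        actual = read_sysctl(name, root)
        if actual != expected:
            mismatches[name] = (expected, actual)
    return mismatches
```

Calling find_mismatches() on the router returns an empty dict when both
values are in effect.  The root parameter exists only so the helpers
can be pointed at a test directory instead of the live /proc/sys.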

However, the "fib_rt_alloc" field from /proc/net/rt6_stats (6th field)
is steadily increasing: after 2 days of uptime it's at 67k.  At some
point it will hit the gc threshold, we'll see what happens.
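
To watch that counter over time, something like the following could be
run from cron.  It is a rough sketch under a few assumptions: that
/proc/net/rt6_stats is a single line of whitespace-separated
hexadecimal numbers, that the field of interest is the 6th one as
described above, and that the threshold is the gc_thresh value from the
workaround.  The function names are hypothetical.

```python
def parse_rt6_stats(text):
    """Parse the stats line into integers (fields assumed to be hexadecimal)."""
    return [int(field, 16) for field in text.split()]


def sixth_field(text):
    """Return the 6th field, the one referred to in the paragraph above."""
    fields = parse_rt6_stats(text)
    if len(fields) < 6:
        raise ValueError("expected at least 6 fields, got %d" % len(fields))
    return fields[5]


def headroom(value, gc_thresh=100000):
    """How far the counter still is from the configured gc threshold."""
    return gc_thresh - value


def read_rt6_stats(path="/proc/net/rt6_stats"):
    """Read the raw stats line from the running kernel."""
    with open(path) as f:
        return f.read()
```

On the router, headroom(sixth_field(read_rt6_stats())) gives the
remaining margin; a value close to zero or negative means the counter
has reached the threshold.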

I am also trying to reproduce the issue locally.


