From: Jonathan Maxwell <jmaxwell37@gmail.com>
To: Andrea Mayer <andrea.mayer@uniroma2.it>
Cc: Paolo Abeni <pabeni@redhat.com>,
	davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	yoshfuji@linux-ipv6.org, dsahern@kernel.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Stefano Salsano <stefano.salsano@uniroma2.it>,
	Paolo Lungaroni <paolo.lungaroni@uniroma2.it>,
	Ahmed Abdelsalam <ahabdels.dev@gmail.com>
Subject: Re: [net-next] ipv6: fix routing cache overflow for raw sockets
Date: Sat, 24 Dec 2022 18:38:01 +1100
Message-ID: <CAGHK07APOwLvhs73WKkQfZuEy2FoKEWJusSyejKVcth4D47g=w@mail.gmail.com>
In-Reply-To: <20221223212835.eb9d03f3f7db22360e34341d@uniroma2.it>

On Sat, Dec 24, 2022 at 7:28 AM Andrea Mayer <andrea.mayer@uniroma2.it> wrote:
>
> Hi Jon,
> please see below, thanks.
>
> On Wed, 21 Dec 2022 08:48:11 +1100
> Jonathan Maxwell <jmaxwell37@gmail.com> wrote:
>
> > On Tue, Dec 20, 2022 at 11:35 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > >
> > > On Mon, 2022-12-19 at 10:48 +1100, Jon Maxwell wrote:
> > > > Sending IPv6 packets in a loop via a raw socket triggers an issue where a
> > > > route is cloned by ip6_rt_cache_alloc() for each packet sent. This quickly
> > > > consumes the IPv6 max_size threshold, which defaults to 4096, resulting in
> > > > these warnings:
> > > >
> > > > [1]   99.187805] dst_alloc: 7728 callbacks suppressed
> > > > [2] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
> > > > .
> > > > .
> > > > [300] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
> > >
> > > If I read correctly, the maximum number of dst that the raw socket can
> > > use this way is limited by the number of packets it allows via the
> > > sndbuf limit, right?
> > >
> >
> > Yes, but in my test sndbuf limit is never hit so it clones a route for
> > every packet.
> >
> > e.g:
> >
> > output from C program sending 5000000 packets via a raw socket.
> >
> > ip raw: total num pkts 5000000
> >
> > # bpftrace -e 'kprobe:dst_alloc {@count[comm] = count()}'
> > Attaching 1 probe...
> >
> > @count[a.out]: 5000009
> >
> > > Are other FLOWI_FLAG_KNOWN_NH users affected, too? e.g. nf_dup_ipv6,
> > > ipvs, seg6?
> > >
> >
> > Any call to ip6_pol_route() where res.nh->fib_nh_gw_family is 0 can do it.
> > But we have only seen this for raw sockets so far.
> >
>
> In the SRv6 subsystem, the seg6_lookup_nexthop() is used by some
> cross-connecting behaviors such as End.X and End.DX6 to forward traffic to a
> specified nexthop. SRv6 End.X/DX6 can specify an IPv6 DA (i.e., a nexthop)
> different from the one carried by the IPv6 header. For this purpose,
> seg6_lookup_nexthop() sets the FLOWI_FLAG_KNOWN_NH.
>
Hi Andrea,

Thanks for pointing out that datapath. The more generic approach we are
taking, which brings IPv6 closer to IPv4 in this regard, should fix all
instances of this.
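
For anyone who wants to reproduce the raw socket case, here is a minimal
sketch of a send loop. It is not the C program from my test. It assumes a
header-included raw socket (IPPROTO_RAW), an on-link destination reached via
a route with no gateway, and root privileges; the function names and the
choice of "No Next Header" (59) are illustrative only.

```python
import socket
import struct
import sys


def build_ipv6_header(src, dst, payload_len=0, next_header=59, hop_limit=64):
    """Build a bare 40-byte IPv6 header (traffic class and flow label 0)."""
    ver_tc_flow = 6 << 28  # version 6 in the top nibble
    fixed = struct.pack("!IHBB", ver_tc_flow, payload_len, next_header, hop_limit)
    return (fixed
            + socket.inet_pton(socket.AF_INET6, src)
            + socket.inet_pton(socket.AF_INET6, dst))


def send_loop(src, dst, count):
    """Send `count` header-only packets; return (sent, failed) counters."""
    sent = failed = 0
    packet = build_ipv6_header(src, dst)
    # IPPROTO_RAW means the caller supplies the IPv6 header (assumption:
    # this is the hdrincl path that hits the route cloning).
    with socket.socket(socket.AF_INET6, socket.SOCK_RAW, socket.IPPROTO_RAW) as s:
        for _ in range(count):
            try:
                s.sendto(packet, (dst, 0))  # port must be 0 for this socket
                sent += 1
            except OSError:
                failed += 1
    return sent, failed


# Usage (as root): python3 repro.py <src-addr> <dst-addr> <count>
if __name__ == "__main__" and len(sys.argv) == 4:
    print(send_loop(sys.argv[1], sys.argv[2], int(sys.argv[3])))
```

Running the bpftrace one-liner on kprobe:dst_alloc alongside it should show
whether a dst is allocated per packet on a given kernel.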

> > > > [1]   99.187805] dst_alloc: 7728 callbacks suppressed
> > > > [2] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
> > > > .
> > > > .
> > > > [300] Route cache is full: consider increasing sysctl net.ipv6.route.max_size.
>
> I can reproduce the same warning messages reported by you, by instantiating an
> End.X behavior whose nexthop is handled by a route for which there is no "via".
> In this configuration, ip6_pol_route() (called by seg6_lookup_nexthop())
> triggers ip6_rt_cache_alloc() because i) FLOWI_FLAG_KNOWN_NH is set and
> ii) res.nh->fib_nh_gw_family is 0 (as already pointed out).
>

Nice. When I get back after the holiday break I'll submit the next patch. It
would be great if you could test it and let me know how it works in your
End.X setup. I'll keep you posted.
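
For reference when testing, the trigger condition discussed above can be
modelled in userspace as below. This is only a sketch, not kernel code: it
assumes that no clone is ever freed (worst case, no GC), the helper names are
made up, and the FLOWI_FLAG_KNOWN_NH value is illustrative.

```python
from dataclasses import dataclass

AF_UNSPEC = 0               # fib_nh_gw_family of a route with no "via"
FLOWI_FLAG_KNOWN_NH = 0x02  # illustrative value, check include/net/flow.h


@dataclass
class NextHop:
    fib_nh_gw_family: int


def needs_uncached_clone(flowi6_flags, nh):
    """Mirror the two conditions: KNOWN_NH set and no gateway on the nexthop."""
    return bool(flowi6_flags & FLOWI_FLAG_KNOWN_NH) and nh.fib_nh_gw_family == AF_UNSPEC


def simulate(packets, flowi6_flags, nh, max_size=4096):
    """Return (clones_allocated, cache_full_hits) for a burst of lookups."""
    allocated = full = 0
    for _ in range(packets):
        if not needs_uncached_clone(flowi6_flags, nh):
            continue  # lookup is served without a per-packet clone
        if allocated >= max_size:
            full += 1  # would log "Route cache is full"
        else:
            allocated += 1
    return allocated, full


print(simulate(5000, FLOWI_FLAG_KNOWN_NH, NextHop(AF_UNSPEC)))
```

With the default max_size of 4096 the model runs out of entries after 4096
lookups, which is the behaviour both the raw socket and the End.X cases show.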

Regards

Jon

> > Regards
> >
> > Jon
>
> Ciao,
> Andrea
