* [PATCH net-next 00/19] net: Enable nexthop objects with IPv4 and IPv6 routes
@ 2019-06-05 23:15 David Ahern
From: David Ahern @ 2019-06-05 23:15 UTC
  To: davem, netdev; +Cc: idosch, kafai, weiwan, sbrivio, David Ahern

From: David Ahern <dsahern@gmail.com>

This is the final set of the initial nexthop object work. When I
started this idea almost 2 years ago, it took 18 seconds to inject
700k+ IPv4 routes with 1 hop and about 28 seconds for 4-path routes. Some
of that time was due to inefficiencies in 'ip', but most of it was
kernel side with excessive synchronize_rcu calls in ipv4, and redundant
processing validating a nexthop spec (device, gateway, encap). Worse,
the time increased dramatically as the number of legs in the routes
increased; for example, taking over 72 seconds for 16-path routes.

After this set, with increased dirty memory limits (fib_sync_mem sysctl),
an improved ip, and nexthop objects, a full internet fib (743,799 routes
based on a pull in January 2019) can be pushed to the kernel in 4.3
seconds. Even better, the time to insert is "almost" constant with an
increasing number of paths. The 'almost constant' time is due to
expanding the nexthop definitions when generating notifications. A
follow-on patch will be sent adding a sysctl that allows an admin to
avoid the nexthop expansion and truly get constant route insert time
regardless of the number of paths in a route! (Useful once all programs
used for a deployment that care about routes understand nexthop objects.)

To be clear, 'ip' is used for benchmarking for no other reason than
that 'ip -batch' is trivial to use for the tests. FRR, for example, better
manages nexthops and route changes and the way those are pushed to the
kernel, and thus will have less userspace processing time than 'ip -batch'.

Patches 1-10 iterate over each fib6_nh in a nexthop and invoke a processing
function per fib6_nh. Prior to nexthop objects, a fib6_info referenced
a single fib6_nh. Multipath routes were added as separate fib6_info for
each leg of the route and linked as siblings:

    f6i -> sibling -> sibling ... -> sibling
     |                                   |
     +--------- multipath route ---------+

With nexthop objects a single fib6_info references an external
nexthop which may have a series of fib6_nh:

     f6i ---> nexthop ---> fib6_nh
                           ...
                           fib6_nh

making IPv6 routes similar to IPv4. The side effect is that a single
fib6_info now indirectly references a series of fib6_nh so the code
needs to walk each entry and call the local, per-fib6_nh processing
function.
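
The walk described above can be pictured with a small userspace sketch.
This is not the kernel code: the toy_* structs and functions below are
hypothetical stand-ins invented for illustration, and the stop-on-nonzero
convention is an assumption of the sketch. It only shows the shape of the
pattern, where a walker visits every leg of a nexthop and hands each one
to a per-leg callback:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; NOT the kernel definitions. */
struct toy_fib6_nh {
	int ifindex;		/* egress device index of this leg */
	unsigned int mtu;	/* per-leg MTU, unused by the walker */
};

struct toy_nexthop {
	size_t num_nh;			/* 1 for a single path, >1 for multipath */
	struct toy_fib6_nh *nhs;	/* array of legs */
};

typedef int (*toy_nh_cb)(struct toy_fib6_nh *nh, void *arg);

/*
 * Call cb for each leg of the nexthop. Stop at the first nonzero
 * return value and hand it back to the caller; return 0 if every
 * leg was visited.
 */
int toy_for_each_fib6_nh(struct toy_nexthop *nh, toy_nh_cb cb, void *arg)
{
	size_t i;
	int rc;

	assert(nh != NULL && cb != NULL);

	for (i = 0; i < nh->num_nh; i++) {
		rc = cb(&nh->nhs[i], arg);
		if (rc)
			return rc;
	}
	return 0;
}

/* Example per-leg function: does this leg egress via *arg? */
int toy_nh_uses_dev(struct toy_fib6_nh *nh, void *arg)
{
	const int *ifindex = arg;

	return nh->ifindex == *ifindex;
}

/* Returns 1 if any leg of the nexthop uses ifindex, else 0. */
int toy_nexthop_uses_dev(struct toy_nexthop *nh, int ifindex)
{
	return toy_for_each_fib6_nh(nh, toy_nh_uses_dev, &ifindex);
}
```

In this sketch the local, per-leg logic stays in a small function and the
walker supplies the iteration, so a check written for one fib6_nh does not
have to be rewritten for the multi-leg case.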

Patches 11 and 13 wire up use of nexthops with fib entries for IPv4
and IPv6. With these commits you can actually use nexthops with routes.

Patch 12 is an optimization for IPv4 when using nexthops in the most
predominant use case (no metrics).

Patch 14 handles replace of a nexthop config.
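
As a rough mental model of why replace matters once routes reference
nexthop objects, consider the userspace sketch below. All toy_* names are
hypothetical, and updating the object in place is an assumption of the
sketch rather than a description of the patch. Several routes point at
one shared object, so changing that object's config is visible through
every route without touching the routes themselves:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for illustration only; NOT the kernel definitions. */
struct toy_nh_cfg {
	int ifindex;		/* egress device index */
	char gateway[46];	/* textual gateway address */
};

struct toy_nh {
	unsigned int id;	/* nexthop id that routes refer to */
	struct toy_nh_cfg cfg;
};

struct toy_route {
	const char *prefix;
	const struct toy_nh *nh;	/* shared, not owned by the route */
};

/* Overwrite the config of an existing nexthop; its id is unchanged. */
void toy_nh_replace(struct toy_nh *nh, const struct toy_nh_cfg *new_cfg)
{
	assert(nh != NULL && new_cfg != NULL);
	nh->cfg = *new_cfg;
}

/* Returns 1 if the route currently forwards via gateway gw, else 0. */
int toy_route_via(const struct toy_route *rt, const char *gw)
{
	return strcmp(rt->nh->cfg.gateway, gw) == 0;
}
```

With two routes pointing at the same toy_nh, a single toy_nh_replace()
call changes the gateway seen through both of them.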

Patches 15-18 update the pmtu and redirect tests to use both old and
new routing.

Patch 19 adds a new test for the nexthop infrastructure where a single
nexthop is used by multiple prefixes to communicate with remote hosts.
This is on top of the functional tests already committed.

David Ahern (19):
  nexthops: Add ipv6 helper to walk all fib6_nh in a nexthop struct
  ipv6: Handle all fib6_nh in a nexthop in fib6_drop_pcpu_from
  ipv6: Handle all fib6_nh in a nexthop in rt6_device_match
  ipv6: Handle all fib6_nh in a nexthop in __find_rr_leaf
  ipv6: Handle all fib6_nh in a nexthop in rt6_nlmsg_size
  ipv6: Handle all fib6_nh in a nexthop in fib6_info_uses_dev
  ipv6: Handle all fib6_nh in a nexthop in exception handling
  ipv6: Handle all fib6_nh in a nexthop in __ip6_route_redirect
  ipv6: Handle all fib6_nh in a nexthop in rt6_do_redirect
  ipv6: Handle all fib6_nh in a nexthop in mtu updates
  ipv4: Allow routes to use nexthop objects
  ipv4: Optimization for fib_info lookup with nexthops
  ipv6: Allow routes to use nexthop objects
  nexthops: add support for replace
  selftests: pmtu: Move running of test into a new function
  selftests: pmtu: Move route installs to a new function
  selftests: pmtu: Add support for routing via nexthop objects
  selftests: icmp_redirect: Add support for routing via nexthop objects
  selftests: Add test with multiple prefixes using single nexthop

 include/net/ip6_fib.h                              |   1 +
 include/net/ip_fib.h                               |   1 +
 include/net/nexthop.h                              |   4 +
 net/ipv4/fib_frontend.c                            |  19 +
 net/ipv4/fib_semantics.c                           |  86 +++-
 net/ipv4/nexthop.c                                 | 275 ++++++++++++-
 net/ipv6/ip6_fib.c                                 |  31 +-
 net/ipv6/route.c                                   | 456 +++++++++++++++++++--
 .../selftests/net/fib_nexthop_multiprefix.sh       | 290 +++++++++++++
 tools/testing/selftests/net/icmp_redirect.sh       |  49 +++
 tools/testing/selftests/net/pmtu.sh                | 237 ++++++++---
 11 files changed, 1324 insertions(+), 125 deletions(-)
 create mode 100755 tools/testing/selftests/net/fib_nexthop_multiprefix.sh

-- 
2.11.0


Thread overview: 26+ messages
2019-06-05 23:15 [PATCH net-next 00/19] net: Enable nexthop objects with IPv4 and IPv6 routes David Ahern
2019-06-05 23:15 ` [PATCH net-next 01/19] nexthops: Add ipv6 helper to walk all fib6_nh in a nexthop struct David Ahern
2019-06-06 21:52   ` David Miller
2019-06-06 22:01     ` David Ahern
2019-06-05 23:15 ` [PATCH net-next 02/19] ipv6: Handle all fib6_nh in a nexthop in fib6_drop_pcpu_from David Ahern
2019-06-05 23:15 ` [PATCH net-next 03/19] ipv6: Handle all fib6_nh in a nexthop in rt6_device_match David Ahern
2019-06-05 23:15 ` [PATCH net-next 04/19] ipv6: Handle all fib6_nh in a nexthop in __find_rr_leaf David Ahern
2019-06-05 23:15 ` [PATCH net-next 05/19] ipv6: Handle all fib6_nh in a nexthop in rt6_nlmsg_size David Ahern
2019-06-05 23:15 ` [PATCH net-next 06/19] ipv6: Handle all fib6_nh in a nexthop in fib6_info_uses_dev David Ahern
2019-06-05 23:15 ` [PATCH net-next 07/19] ipv6: Handle all fib6_nh in a nexthop in exception handling David Ahern
2019-06-05 23:15 ` [PATCH net-next 08/19] ipv6: Handle all fib6_nh in a nexthop in __ip6_route_redirect David Ahern
2019-06-05 23:15 ` [PATCH net-next 09/19] ipv6: Handle all fib6_nh in a nexthop in rt6_do_redirect David Ahern
2019-06-05 23:15 ` [PATCH net-next 10/19] ipv6: Handle all fib6_nh in a nexthop in mtu updates David Ahern
2019-06-05 23:15 ` [PATCH net-next 11/19] ipv4: Allow routes to use nexthop objects David Ahern
2019-06-05 23:15 ` [PATCH net-next 12/19] ipv4: Optimization for fib_info lookup with nexthops David Ahern
2019-06-05 23:15 ` [PATCH net-next 13/19] ipv6: Allow routes to use nexthop objects David Ahern
2019-06-05 23:15 ` [PATCH net-next 14/19] nexthops: add support for replace David Ahern
2019-06-06 21:52   ` David Miller
2019-06-05 23:15 ` [PATCH net-next 15/19] selftests: pmtu: Move running of test into a new function David Ahern
2019-06-06  5:51   ` Stefano Brivio
2019-06-05 23:15 ` [PATCH net-next 16/19] selftests: pmtu: Move route installs to " David Ahern
2019-06-06  5:51   ` Stefano Brivio
2019-06-05 23:15 ` [PATCH net-next 17/19] selftests: pmtu: Add support for routing via nexthop objects David Ahern
2019-06-06  5:51   ` Stefano Brivio
2019-06-05 23:15 ` [PATCH net-next 18/19] selftests: icmp_redirect: " David Ahern
2019-06-05 23:15 ` [PATCH net-next 19/19] selftests: Add test with multiple prefixes using single nexthop David Ahern
