netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH net-next 00/15] Simplify IPv4 route offload API
@ 2019-10-02  8:40 Ido Schimmel
  2019-10-02  8:40 ` [RFC PATCH net-next 01/15] ipv4: Add temporary events to the FIB notification chain Ido Schimmel
                   ` (15 more replies)
  0 siblings, 16 replies; 39+ messages in thread
From: Ido Schimmel @ 2019-10-02  8:40 UTC (permalink / raw)
  To: netdev; +Cc: davem, dsahern, jiri, jakub.kicinski, saeedm, mlxsw, Ido Schimmel

From: Ido Schimmel <idosch@mellanox.com>

Today, whenever an IPv4 route is added or deleted a notification is sent
in the FIB notification chain and it is up to offload drivers to decide
if the route should be programmed to the hardware or not. This is not an
easy task as in hardware routes are keyed by {prefix, prefix length,
table id}, whereas the kernel can store multiple such routes that only
differ in metric / TOS / nexthop info.

This series makes sure that only routes that are actually used in the
data path are notified to offload drivers. This greatly simplifies the
work these drivers need to do, as they are now only concerned with
programming the hardware and do not need to replicate the IPv4 route
insertion logic and store multiple identical routes.

The route that is notified is the first FIB alias in the FIB node with
the given {prefix, prefix length, table ID}. In case the route is
deleted and there is another route with the same key, a replace
notification is emitted. Otherwise, a delete notification is emitted.

The above means that in the case of multiple routes with the same key,
but different TOS, only the route with the highest TOS is notified.
While the kernel can route a packet based on its TOS, this is not
supported by any hardware devices I'm familiar with. Moreover, this is
not supported by IPv6 nor by BIRD/FRR from what I could see. Offload
drivers should therefore use the presence of a non-zero TOS as an
indication to trap packets matching the route and let the kernel route
them instead. mlxsw has been doing it for the past two years.

The series also adds an "in hardware" indication to routes, in addition
to the offload indication we already have on nexthops today. Besides
being long overdue, the reason this is done in this series is that it
makes it possible to easily test the new FIB notification API over
netdevsim.

To ensure there is no degradation in route insertion rates, I used
Vincent Bernat's script [1][2] from [3] to inject 500,000 routes from an
MRT dump from a router with a full view. On a system with Intel(R)
Xeon(R) CPU D-1527 @ 2.20GHz I measured 8.184 seconds, averaged over 10
runs and saw no degradation compared to net-next from today.

Patchset overview:
Patches #1-#7 introduce the new FIB notifications
Patches #8-#9 convert listeners to make use of the new notifications
Patches #10-#14 add "in hardware" indication for IPv4 routes, including
a dummy FIB offload implementation in netdevsim
Patch #15 adds a selftest for the new FIB notifications API over
netdevsim

The series is based on Jiri's "devlink: allow devlink instances to
change network namespace" series [4]. The patches can be found here [5]
and patched iproute2 with the "in hardware" indication can be found here
[6].

IPv6 is next on my TODO list.

[1] https://github.com/vincentbernat/network-lab/blob/master/common/helpers/lab-routes-ipvX/insert-from-bgp
[2] https://gist.github.com/idosch/2eb96efe50eb5234d205e964f0814859
[3] https://vincent.bernat.ch/en/blog/2017-ipv4-route-lookup-linux
[4] https://patchwork.ozlabs.org/cover/1162295/
[5] https://github.com/idosch/linux/tree/fib-notifier
[6] https://github.com/idosch/iproute2/tree/fib-notifier

Ido Schimmel (15):
  ipv4: Add temporary events to the FIB notification chain
  ipv4: Notify route after insertion to the routing table
  ipv4: Notify route if replacing currently offloaded one
  ipv4: Notify newly added route if should be offloaded
  ipv4: Handle route deletion notification
  ipv4: Handle route deletion notification during flush
  ipv4: Only Replay routes of interest to new listeners
  mlxsw: spectrum_router: Start using new IPv4 route notifications
  ipv4: Remove old route notifications and convert listeners
  ipv4: Replace route in list before notifying
  ipv4: Encapsulate function arguments in a struct
  ipv4: Add "in hardware" indication to routes
  mlxsw: spectrum_router: Mark routes as "in hardware"
  netdevsim: fib: Mark routes as "in hardware"
  selftests: netdevsim: Add test for route offload API

 .../net/ethernet/mellanox/mlx5/core/lag_mp.c  |   4 -
 .../ethernet/mellanox/mlxsw/spectrum_router.c | 152 ++-----
 drivers/net/ethernet/rocker/rocker_main.c     |   4 +-
 drivers/net/netdevsim/fib.c                   | 263 ++++++++++-
 include/net/ip_fib.h                          |   5 +
 include/uapi/linux/rtnetlink.h                |   1 +
 net/ipv4/fib_lookup.h                         |  18 +-
 net/ipv4/fib_semantics.c                      |  30 +-
 net/ipv4/fib_trie.c                           | 223 ++++++++--
 net/ipv4/route.c                              |  12 +-
 .../drivers/net/netdevsim/fib_notifier.sh     | 411 ++++++++++++++++++
 11 files changed, 938 insertions(+), 185 deletions(-)
 create mode 100755 tools/testing/selftests/drivers/net/netdevsim/fib_notifier.sh

-- 
2.21.0


^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2019-10-04 23:20 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-02  8:40 [RFC PATCH net-next 00/15] Simplify IPv4 route offload API Ido Schimmel
2019-10-02  8:40 ` [RFC PATCH net-next 01/15] ipv4: Add temporary events to the FIB notification chain Ido Schimmel
2019-10-02  8:40 ` [RFC PATCH net-next 02/15] ipv4: Notify route after insertion to the routing table Ido Schimmel
2019-10-03  1:34   ` David Ahern
2019-10-03  5:16     ` Ido Schimmel
2019-10-02  8:40 ` [RFC PATCH net-next 03/15] ipv4: Notify route if replacing currently offloaded one Ido Schimmel
2019-10-02  8:40 ` [RFC PATCH net-next 04/15] ipv4: Notify newly added route if should be offloaded Ido Schimmel
2019-10-02  8:40 ` [RFC PATCH net-next 05/15] ipv4: Handle route deletion notification Ido Schimmel
2019-10-02  8:40 ` [RFC PATCH net-next 06/15] ipv4: Handle route deletion notification during flush Ido Schimmel
2019-10-02  8:40 ` [RFC PATCH net-next 07/15] ipv4: Only Replay routes of interest to new listeners Ido Schimmel
2019-10-02 17:44   ` Jiri Pirko
2019-10-03 13:04     ` Ido Schimmel
2019-10-02  8:40 ` [RFC PATCH net-next 08/15] mlxsw: spectrum_router: Start using new IPv4 route notifications Ido Schimmel
2019-10-02 17:52   ` Jiri Pirko
2019-10-02 18:01     ` Jiri Pirko
2019-10-03 15:10       ` Ido Schimmel
2019-10-02  8:40 ` [RFC PATCH net-next 09/15] ipv4: Remove old route notifications and convert listeners Ido Schimmel
2019-10-02  8:40 ` [RFC PATCH net-next 10/15] ipv4: Replace route in list before notifying Ido Schimmel
2019-10-02  8:40 ` [RFC PATCH net-next 11/15] ipv4: Encapsulate function arguments in a struct Ido Schimmel
2019-10-02  8:41 ` [RFC PATCH net-next 12/15] ipv4: Add "in hardware" indication to routes Ido Schimmel
2019-10-02 15:58   ` Roopa Prabhu
2019-10-02 18:21     ` Jiri Pirko
2019-10-03  2:34       ` David Ahern
2019-10-03  5:37         ` Ido Schimmel
2019-10-04  1:55           ` David Ahern
2019-10-04 14:43             ` Ido Schimmel
2019-10-04 16:38               ` David Ahern
2019-10-04 17:43                 ` Roopa Prabhu
2019-10-04 23:20                   ` David Ahern
2019-10-03  5:40         ` Jiri Pirko
2019-10-03 12:59     ` Ido Schimmel
2019-10-04  4:25       ` Roopa Prabhu
2019-10-02  8:41 ` [RFC PATCH net-next 13/15] mlxsw: spectrum_router: Mark routes as "in hardware" Ido Schimmel
2019-10-02 18:27   ` Jiri Pirko
2019-10-03 15:16     ` Ido Schimmel
2019-10-02  8:41 ` [RFC PATCH net-next 14/15] netdevsim: fib: " Ido Schimmel
2019-10-02  8:41 ` [RFC PATCH net-next 15/15] selftests: netdevsim: Add test for route offload API Ido Schimmel
2019-10-02 18:17 ` [RFC PATCH net-next 00/15] Simplify IPv4 " Jiri Pirko
2019-10-03  5:18   ` Ido Schimmel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).