From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
To: David Ahern <dsahern@kernel.org>, Hangbin Liu <liuhangbin@gmail.com>
Cc: Thomas Haller <thaller@redhat.com>,
	Benjamin Poirier <bpoirier@nvidia.com>,
	Stephen Hemminger <stephen@networkplumber.org>,
	Ido Schimmel <idosch@idosch.org>,
	netdev@vger.kernel.org, "David S . Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Subject: Re: [PATCH net-next] ipv4/fib: send RTM_DELROUTE notify when flush fib
Date: Thu, 14 Sep 2023 17:43:19 +0200
Message-ID: <b83e24a4-6de3-0df2-d902-f2cc3cdbaf41@6wind.com>
In-Reply-To: <a4003473-6809-db97-3d06-cec8e08c6ed6@6wind.com>

On 13/09/2023 at 16:53, Nicolas Dichtel wrote:
> On 13/09/2023 at 16:43, David Ahern wrote:
>> On 9/13/23 8:11 AM, Nicolas Dichtel wrote:
>>> The compat_mode was introduced for daemons that don't support the nexthop
>>> framework. There must be a notification (RTM_DELROUTE) when a route is deleted
>>> due to a carrier down event. Right now, the backward compat is broken.
>>
>> The compat_mode is for daemons that do not understand the nexthop id
>> attribute, and need the legacy set of attributes for the route - i.e,
> Yes, that's my point.
> On my system, one daemon understands and configures nexthop id and another one
> doesn't understand nexthop id. This last daemon removes routes when an interface
> is put down but not when the carrier is lost.
> The kernel doc [1] says:
> 	Further, updates or deletes of a nexthop configuration generate route
> 	notifications for each fib entry using the nexthop.
> So, my understanding is that an RTM_DELROUTE msg should be sent when a nexthop is
> removed due to a carrier loss event.
> 
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/ip-sysctl.rst#n2116

I dug a bit more into these (missing) notifications. I will try to describe
what happens in the cases where there is no notification:

When an interface is set down:
 - the single (!multipath) routes associated with this interface should be
   removed;
 - for multipath routes:
   + if all nh use this interface: the routes are deleted;
   + if only some nh use this interface:
     ~ if all other nh already point to a down interface: the routes are deleted;
     ~ if at least one nh points to an up interface:
       o the nh is *temporarily* disabled if it's a plain nexthop;
       o the nh is *permanently* removed if it's a nexthop object;
When the interface is set up later, disabled nh are restored (i.e. only plain
nexthops of multipath routes).
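
The 'interface down' rules above can be sketched as the bookkeeping a userspace
routing cache would have to do on its own when no RTM_DELROUTE arrives. This is
only an illustration in Python: the Route and Nexthop classes and the function
names are hypothetical, not kernel or iproute2 APIs, and the model allows the
nexthop kind to be set per nexthop purely for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Nexthop:
    dev: str                 # outgoing interface
    is_object: bool = False  # True: nexthop object, False: plain nexthop
    disabled: bool = False   # only plain nexthops get temporarily disabled


@dataclass
class Route:
    prefix: str
    nexthops: list = field(default_factory=list)


def on_link_down(cache, dev, up_devs):
    """Return the cache after 'dev' is set down; up_devs are interfaces still up."""
    kept = []
    for route in cache:
        if all(nh.dev != dev for nh in route.nexthops):
            kept.append(route)          # route does not use this interface
            continue
        others = [nh for nh in route.nexthops if nh.dev != dev]
        if not any(nh.dev in up_devs for nh in others):
            continue                    # single route, or no usable nh left: deleted
        survivors = []
        for nh in route.nexthops:
            if nh.dev != dev:
                survivors.append(nh)
            elif not nh.is_object:
                nh.disabled = True      # plain nexthop: temporarily disabled
                survivors.append(nh)
            # a nexthop object on 'dev' is dropped here and never restored
        route.nexthops = survivors
        kept.append(route)
    return kept


def on_link_up(cache, dev):
    """Restore plain nexthops that were disabled when 'dev' went down."""
    for route in cache:
        for nh in route.nexthops:
            if nh.dev == dev and not nh.is_object:
                nh.disabled = False
```

Note that the daemon needs to know the state of every other interface
(up_devs) just to decide whether a multipath route still exists.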

When an interface loses its carrier:
 - for routes using plain nexthop: nothing happens;
 - for routes using nexthop objects:
   + for single routes: they are deleted;
   + for multipath routes, the nh is permanently removed if it's a nexthop
     object (i.e. the route is deleted if there is no other nexthop in the group);
When an interface recovers its carrier, there is nothing to do.
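
The carrier-loss case can be sketched the same way. Again this is a hedged
illustration in Python with hypothetical names (Route, on_carrier_loss); it
only mirrors the rules listed above and does not reflect any kernel code path.

```python
from dataclasses import dataclass


@dataclass
class Route:
    prefix: str
    nh_devs: list          # interface of each nexthop
    uses_nh_objects: bool  # True if the route references nexthop objects


def on_carrier_loss(cache, dev):
    """Return the cache after 'dev' loses its carrier."""
    kept = []
    for route in cache:
        if not route.uses_nh_objects or dev not in route.nh_devs:
            kept.append(route)   # plain nexthops or unrelated route: nothing happens
            continue
        remaining = [d for d in route.nh_devs if d != dev]
        if remaining:            # the group still has a nexthop: route survives
            kept.append(Route(route.prefix, remaining, True))
        # otherwise the route is gone, and the cache has to notice by itself
    return kept
```

Nothing has to be undone when the carrier comes back, since the removed
nexthop objects are not restored.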

When the last ipv4 address of an interface is removed:
 - for routes using nexthop objects: nothing happens;
 - for routes using plain nexthop: the same rules as 'interface down' apply.
When an ipv4 address is added again on the interface, disabled nh are restored
(i.e. only plain nexthops of multipath routes).
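
The three event lists above can be condensed into a small lookup table that
says, per event and nexthop kind, whether routes change without a
notification. This is a sketch of one reading of the lists; the event names
and the must_resync() helper are hypothetical labels chosen for this example.

```python
# (event, nexthop kind) -> do routes change without any notification?
SILENT_ROUTE_CHANGE = {
    ("link_down", "plain"): True,
    ("link_down", "object"): True,
    ("carrier_loss", "plain"): False,
    ("carrier_loss", "object"): True,
    ("last_addr_removed", "plain"): True,
    ("last_addr_removed", "object"): False,
}


def must_resync(event, nh_kind):
    """Tell a cache whether it has to re-check its routes after 'event'."""
    return SILENT_ROUTE_CHANGE[(event, nh_kind)]
```

A daemon that only knows plain nexthops cannot even tell which row applies to
a given route, which is part of the problem described in the conclusions.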

I bet I am missing some cases.

Conclusions:
 - legacy applications (that are not aware of nexthop objects) cannot maintain a
   routing cache (even with compat_mode enabled);
 - fixing only the legacy applications (aka compat_mode) seems too
   complex;
 - even if an application is aware of nexthop objects, the rules to maintain a
   cache are far from obvious.

I don't understand why there is so much reluctance to send a notification
when a route is deleted. This would fix all cases.
I understand that the goal was to save netlink traffic, but in this case, the
daemons that want to maintain a routing cache have to walk their whole cache to
mark/remove routes. For big routing tables, this costs a lot of CPU, so I wonder
if it's really a gain for the system. Such systems probably run more than one of
these daemons, so even more CPU is spent on these operations.
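
The cost argument can be illustrated with a toy cache keyed by (table, prefix).
This is a rough Python sketch with hypothetical function names; for brevity the
scan treats every route using the interface as deleted, which is simpler than
the real rules listed earlier.

```python
def handle_delroute(cache, table, prefix):
    """With a per-route notification: one dictionary operation per deleted route."""
    cache.pop((table, prefix), None)


def scan_on_link_event(cache, dev):
    """Without notifications: every cached route has to be examined.

    Returns the number of entries visited.
    """
    visited = 0
    stale = []
    for key, nh_devs in cache.items():
        visited += 1
        if dev in nh_devs:
            stale.append(key)
    for key in stale:
        del cache[key]
    return visited
```

With the scan, the work grows with the size of the cache rather than with the
number of routes actually deleted, and each daemon that keeps a cache repeats
it.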

As Thomas said, this discussion has been coming up for more than a decade. And
with nexthop object support, it's even more complex. Something obviously needs
to be done.

At least, I would have expected an RTM_DELNEXTHOP msg for each deleted nexthop.
But this wouldn't solve the routing cache sync for legacy applications.


Regards,
Nicolas
