netdev.vger.kernel.org archive mirror
* RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1
@ 2022-11-24  9:20 Jonas Gorski
  2022-11-24 12:41 ` Ido Schimmel
  2022-11-25  8:36 ` RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1 #forregzbot Thorsten Leemhuis
  0 siblings, 2 replies; 12+ messages in thread
From: Jonas Gorski @ 2022-11-24  9:20 UTC (permalink / raw)
  To: Network Development; +Cc: David Ahern

Hello,

when an IPv4 route gets removed because its nexthop was deleted, the
kernel does not send an RTM_DELROUTE netlink notification anymore in
6.1. A bisect led me to 61b91eb33a69 ("ipv4: Handle attempt to delete
multipath route when fib_info contains an nh reference"), and
reverting it makes it work again.

It can be reproduced by running the following commands while listening
to netlink (e.g. via ip monitor):

ip a a 172.16.1.1/24 dev veth1
ip nexthop add id 100 via 172.16.1.2 dev veth1
ip route add 172.16.101.0/24 nhid 100
ip nexthop del id 100

where the nexthop del triggers an RTM_DELNEXTHOP message but no
RTM_DELROUTE, even though the route is gone afterwards.

Doing the same thing with IPv6 still works as expected:

ip a a 2001:db8:91::1/64 dev veth1
ip nexthop add id 100 via 2001:db8:91::2 dev veth1
ip route add 2001:db8:101::/64 nhid 100
ip nexthop del id 100

Here the kernel will send out both the RTM_DELNEXTHOP and the
RTM_DELROUTE netlink messages.

Unfortunately my net-foo is not good enough to propose a fix.

Best regards
Jonas


* Re: RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1
  2022-11-24  9:20 RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1 Jonas Gorski
@ 2022-11-24 12:41 ` Ido Schimmel
  2022-11-24 14:15   ` Jonas Gorski
  2022-11-25  3:53   ` David Ahern
  2022-11-25  8:36 ` RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1 #forregzbot Thorsten Leemhuis
  1 sibling, 2 replies; 12+ messages in thread
From: Ido Schimmel @ 2022-11-24 12:41 UTC (permalink / raw)
  To: Jonas Gorski; +Cc: Network Development, David Ahern

On Thu, Nov 24, 2022 at 10:20:00AM +0100, Jonas Gorski wrote:
> Hello,
> 
> when an IPv4 route gets removed because its nexthop was deleted, the
> kernel does not send a RTM_DELROUTE netlink notifications anymore in
> 6.1. A bisect lead me to 61b91eb33a69 ("ipv4: Handle attempt to delete
> multipath route when fib_info contains an nh reference"), and
> reverting it makes it work again.
> 
> It can be reproduced by doing the following and listening to netlink
> (e.g. via ip monitor)
> 
> ip a a 172.16.1.1/24 dev veth1
> ip nexthop add id 100 via 172.16.1.2 dev veth1
> ip route add 172.16.101.0/24 nhid 100
> ip nexthop del id 100
> 
> where the nexthop del will trigger a RTM_DELNEXTHOP message, but no
> RTM_DELROUTE, but the route is gone afterwards anyways.

I tried the reproducer and I get the same notifications in ip monitor
regardless of whether 61b91eb33a69 is reverted or not.

Looking at the code and thinking about it, I don't think we ever
generated RTM_DELROUTE notifications when IPv4 routes were flushed (to
avoid a notification storm).

Are you running an upstream kernel?

Thanks

> 
> Doing the same thing with IPv6 still works as expected
> 
> ip a a 2001:db8:91::1/64 dev veth1
> ip nexthop add id 100 via 2001:db8:91::2 dev veth1
> ip route add 2001:db8:101::/64 nhid 100
> ip nexthop del id 100
> 
> Here the kernel will send out both the RTM_DELNEXTHOP and the
> RTM_DELROUTE netlink messages.
> 
> Unfortunately my net-foo is not good enough to propose a fix.
> 
> Best regards
> Jonas


* Re: RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1
  2022-11-24 12:41 ` Ido Schimmel
@ 2022-11-24 14:15   ` Jonas Gorski
  2022-11-24 14:40     ` Jonas Gorski
  2022-11-25  3:53   ` David Ahern
  1 sibling, 1 reply; 12+ messages in thread
From: Jonas Gorski @ 2022-11-24 14:15 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: Network Development, David Ahern

Hi Ido,

On Thu, 24 Nov 2022 at 13:41, Ido Schimmel <idosch@idosch.org> wrote:
>
> On Thu, Nov 24, 2022 at 10:20:00AM +0100, Jonas Gorski wrote:
> > Hello,
> >
> > when an IPv4 route gets removed because its nexthop was deleted, the
> > kernel does not send a RTM_DELROUTE netlink notifications anymore in
> > 6.1. A bisect lead me to 61b91eb33a69 ("ipv4: Handle attempt to delete
> > multipath route when fib_info contains an nh reference"), and
> > reverting it makes it work again.
> >
> > It can be reproduced by doing the following and listening to netlink
> > (e.g. via ip monitor)
> >
> > ip a a 172.16.1.1/24 dev veth1
> > ip nexthop add id 100 via 172.16.1.2 dev veth1
> > ip route add 172.16.101.0/24 nhid 100
> > ip nexthop del id 100
> >
> > where the nexthop del will trigger a RTM_DELNEXTHOP message, but no
> > RTM_DELROUTE, but the route is gone afterwards anyways.
>
> I tried the reproducer and I get the same notifications in ip monitor
> regardless of whether 61b91eb33a69 is reverted or not.
>
> Looking at the code and thinking about it, I don't think we ever
> generated RTM_DELROUTE notifications when IPv4 routes were flushed (to
> avoid a notification storm).
>
> Are you running an upstream kernel?

Okay, after having a second look, you are right. I got myself
confused by IPv6 generating RTM_DELROUTE notifications, but that is
beside the point.

The point where it fails is that FRR tries to delete its route(s) and
fails to do so with this commit applied (=> RTM_DELROUTE goes
missing), and then does the RTM_DELNEXTHOP.

So while the kernel indeed generates no RTM_DELROUTE when it flushes
the route itself, one was generated before, when FRR was still able to
delete its routes successfully.

Not sure if this already qualifies as breaking userspace though, but
it's definitely something that used to work with 6.0 and before, and
does not work anymore now.

The error in FRR log is:

[YXPF5-B2CE0] netlink_route_multipath_msg_encode: RTM_DELROUTE 10.0.1.0/24 vrf 0(254)
[HYEHE-CQZ9G] nl_batch_send: netlink-dp (NS 0), batch size=44, msg cnt=1
[XS99C-X3KS5] netlink-dp (NS 0): error: No such process type=RTM_DELROUTE(25), seq=22, pid=2419702167

with the revert it succeeds.

I'll see if I can get a better idea of the actual netlink message sent.

Regards
Jonas


* Re: RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1
  2022-11-24 14:15   ` Jonas Gorski
@ 2022-11-24 14:40     ` Jonas Gorski
  2022-11-24 14:50       ` Ido Schimmel
  0 siblings, 1 reply; 12+ messages in thread
From: Jonas Gorski @ 2022-11-24 14:40 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: Network Development, David Ahern

On Thu, 24 Nov 2022 at 15:15, Jonas Gorski <jonas.gorski@gmail.com> wrote:
>
> Hi Ido,
>
> On Thu, 24 Nov 2022 at 13:41, Ido Schimmel <idosch@idosch.org> wrote:
> >
> > On Thu, Nov 24, 2022 at 10:20:00AM +0100, Jonas Gorski wrote:
> > > Hello,
> > >
> > > when an IPv4 route gets removed because its nexthop was deleted, the
> > > kernel does not send a RTM_DELROUTE netlink notifications anymore in
> > > 6.1. A bisect lead me to 61b91eb33a69 ("ipv4: Handle attempt to delete
> > > multipath route when fib_info contains an nh reference"), and
> > > reverting it makes it work again.
> > >
> >
> > Are you running an upstream kernel?
>
> Okay, after having a second look, you are right, and I got myself
> confused by IPv6 generating RTM_DELROUTE notifications, but which is
> besides the point.
>
> The point where it fails is that FRR tries to delete its route(s), and
> fails to do so with this commit applied (=> RTM_DELROUTE goes
> missing), then does the RTM_DELNEXTHOP.
>
> So while there is indeed no RTM_DELROUTE generated in response to the
> kernel, it was generated when FRR was successfully deleting its routes
> before.
>
> Not sure if this already qualifies as breaking userspace though, but
> it's definitely something that used to work with 6.0 and before, and
> does not work anymore now.
>
> The error in FRR log is:
>
> [YXPF5-B2CE0] netlink_route_multipath_msg_encode: RTM_DELROUTE
> 10.0.1.0/24 vrf 0(254)
> [HYEHE-CQZ9G] nl_batch_send: netlink-dp (NS 0), batch size=44, msg cnt=1
> [XS99C-X3KS5] netlink-dp (NS 0): error: No such process
> type=RTM_DELROUTE(25), seq=22, pid=2419702167
>
> with the revert it succeeds.
>
> I'll see if I can get a better idea of the actual netlink message sent.

Okay, found the knob:

nlmsghdr [len=44 type=(25) DELROUTE flags=(0x0401) {REQUEST,(ATOMIC|CREATE)} seq=22 pid=2185212923]
  rtmsg [family=(2) AF_INET dstlen=24 srclen=0 tos=0 table=254 protocol=(186) UNKNOWN scope=(0) UNIVERSE type=(0) UNSPEC flags=0x0000 {}]
    rta [len=8 (payload=4) type=(1) DST]
      10.0.1.0
    rta [len=8 (payload=4) type=(6) PRIORITY]
      20
netlink-dp (NS 0): error: No such process type=RTM_DELROUTE(25), seq=22, pid=2185212923

The route was created via

nlmsghdr [len=52 type=(24) NEWROUTE flags=(0x0501) {REQUEST,DUMP,(ROOT|REPLACE|CAPPED),(ATOMIC|CREATE)} seq=18 pid=2185212923]
  rtmsg [family=(2) AF_INET dstlen=24 srclen=0 tos=0 table=254 protocol=(186) UNKNOWN scope=(0) UNIVERSE type=(1) UNICAST flags=0x0000 {}]
    rta [len=8 (payload=4) type=(1) DST]
      10.0.1.0
    rta [len=8 (payload=4) type=(6) PRIORITY]
      20
    rta [len=8 (payload=4) type=(30) NH_ID]
      18

and for completeness, the nexthop is created via:

nlmsghdr [len=48 type=(104) NEWNEXTHOP flags=(0x0501) {REQUEST,DUMP,(ROOT|REPLACE|CAPPED),(ATOMIC|CREATE)} seq=17 pid=2185212923]
  nhm [family=(2) AF_INET scope=(0) UNIVERSE protocol=(11) ZEBRA flags=0x00000000 {}]
    rta [len=8 (payload=4) type=(1) ID]
      18
    rta [len=8 (payload=4) type=(6) GATEWAY]
      10.0.0.1
    rta [len=8 (payload=4) type=(5) OIF]
      62


Regards
Jonas


* Re: RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1
  2022-11-24 14:40     ` Jonas Gorski
@ 2022-11-24 14:50       ` Ido Schimmel
  2022-11-24 15:20         ` Jonas Gorski
  0 siblings, 1 reply; 12+ messages in thread
From: Ido Schimmel @ 2022-11-24 14:50 UTC (permalink / raw)
  To: Jonas Gorski; +Cc: Network Development, David Ahern

On Thu, Nov 24, 2022 at 03:40:19PM +0100, Jonas Gorski wrote:
> On Thu, 24 Nov 2022 at 15:15, Jonas Gorski <jonas.gorski@gmail.com> wrote:
> >
> > Hi Ido,
> >
> > On Thu, 24 Nov 2022 at 13:41, Ido Schimmel <idosch@idosch.org> wrote:
> > >
> > > On Thu, Nov 24, 2022 at 10:20:00AM +0100, Jonas Gorski wrote:
> > > > Hello,
> > > >
> > > > when an IPv4 route gets removed because its nexthop was deleted, the
> > > > kernel does not send a RTM_DELROUTE netlink notifications anymore in
> > > > 6.1. A bisect lead me to 61b91eb33a69 ("ipv4: Handle attempt to delete
> > > > multipath route when fib_info contains an nh reference"), and
> > > > reverting it makes it work again.
> > > >
> > >
> > > Are you running an upstream kernel?
> >
> > Okay, after having a second look, you are right, and I got myself
> > confused by IPv6 generating RTM_DELROUTE notifications, but which is
> > besides the point.
> >
> > The point where it fails is that FRR tries to delete its route(s), and
> > fails to do so with this commit applied (=> RTM_DELROUTE goes
> > missing), then does the RTM_DELNEXTHOP.
> >
> > So while there is indeed no RTM_DELROUTE generated in response to the
> > kernel, it was generated when FRR was successfully deleting its routes
> > before.
> >
> > Not sure if this already qualifies as breaking userspace though, but
> > it's definitely something that used to work with 6.0 and before, and
> > does not work anymore now.
> >
> > The error in FRR log is:
> >
> > [YXPF5-B2CE0] netlink_route_multipath_msg_encode: RTM_DELROUTE
> > 10.0.1.0/24 vrf 0(254)
> > [HYEHE-CQZ9G] nl_batch_send: netlink-dp (NS 0), batch size=44, msg cnt=1
> > [XS99C-X3KS5] netlink-dp (NS 0): error: No such process
> > type=RTM_DELROUTE(25), seq=22, pid=2419702167
> >
> > with the revert it succeeds.
> >
> > I'll see if I can get a better idea of the actual netlink message sent.
> 
> Okay, found the knob:
> 
> nlmsghdr [len=44 type=(25) DELROUTE flags=(0x0401)
> {REQUEST,(ATOMIC|CREATE)} seq=22 pid=2185212923]
>   rtmsg [family=(2) AF_INET dstlen=24 srclen=0 tos=0 table=254
> protocol=(186) UNKNOWN scope=(0) UNIVERSE type=(0) UNSPEC flags=0x0000
> {}]
>     rta [len=8 (payload=4) type=(1) DST]
>       10.0.1.0
>     rta [len=8 (payload=4) type=(6) PRIORITY]
>       20

The route is deleted with only prefix information (NH_ID not specified).
Matches this comment and the code:
https://github.com/FRRouting/frr/blob/master/zebra/rt_netlink.c#L2091

> netlink-dp (NS 0): error: No such process type=RTM_DELROUTE(25),
> seq=22, pid=2185212923
> 
> The route was created via
> 
> nlmsghdr [len=52 type=(24) NEWROUTE flags=(0x0501)
> {REQUEST,DUMP,(ROOT|REPLACE|CAPPED),(ATOMIC|CREATE)} seq=18
> pid=2185212923]
>  rtmsg [family=(2) AF_INET dstlen=24 srclen=0 tos=0 table=254
> protocol=(186) UNKNOWN scope=(0) UNIVERSE type=(1) UNICAST
> flags=0x0000 {}]
>     rta [len=8 (payload=4) type=(1) DST]
>       10.0.1.0
>     rta [len=8 (payload=4) type=(6) PRIORITY]
>        20
>      rta [len=8 (payload=4) type=(30) NH_ID]
>      18

Here the nexthop ID is obviously present.

Let me try to fix it and add a test for this flow.

Thanks for all the details!

> 
> and for completion the nexthop is created via:
> 
> nlmsghdr [len=48 type=(104) NEWNEXTHOP flags=(0x0501)
> {REQUEST,DUMP,(ROOT|REPLACE|CAPPED),(ATOMIC|CREATE)} seq=17
> pid=2185212923]
>    nhm [family=(2) AF_INET scope=(0) UNIVERSE protocol=(11) ZEBRA
> flags=0x00000000 {}]
>     rta [len=8 (payload=4) type=(1) ID]
>       18
>     rta [len=8 (payload=4) type=(6) GATEWAY]
>       10.0.0.1
>     rta [len=8 (payload=4) type=(5) OIF]
>       62
> 
> 
> Regards
> Jonas


* Re: RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1
  2022-11-24 14:50       ` Ido Schimmel
@ 2022-11-24 15:20         ` Jonas Gorski
  2022-11-24 16:04           ` Ido Schimmel
  0 siblings, 1 reply; 12+ messages in thread
From: Jonas Gorski @ 2022-11-24 15:20 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: Network Development, David Ahern

On Thu, 24 Nov 2022 at 15:50, Ido Schimmel <idosch@idosch.org> wrote:
>
> On Thu, Nov 24, 2022 at 03:40:19PM +0100, Jonas Gorski wrote:
> > On Thu, 24 Nov 2022 at 15:15, Jonas Gorski <jonas.gorski@gmail.com> wrote:
> > >
> > > Hi Ido,
> > >
> > > On Thu, 24 Nov 2022 at 13:41, Ido Schimmel <idosch@idosch.org> wrote:
> > > >
> > > > On Thu, Nov 24, 2022 at 10:20:00AM +0100, Jonas Gorski wrote:
> > > > > Hello,
> > > > >
> > > > > when an IPv4 route gets removed because its nexthop was deleted, the
> > > > > kernel does not send a RTM_DELROUTE netlink notifications anymore in
> > > > > 6.1. A bisect lead me to 61b91eb33a69 ("ipv4: Handle attempt to delete
> > > > > multipath route when fib_info contains an nh reference"), and
> > > > > reverting it makes it work again.
> > > > >
> > > >
> > > > Are you running an upstream kernel?
> > >
> > > Okay, after having a second look, you are right, and I got myself
> > > confused by IPv6 generating RTM_DELROUTE notifications, but which is
> > > besides the point.
> > >
> > > The point where it fails is that FRR tries to delete its route(s), and
> > > fails to do so with this commit applied (=> RTM_DELROUTE goes
> > > missing), then does the RTM_DELNEXTHOP.
> > >
> > > So while there is indeed no RTM_DELROUTE generated in response to the
> > > kernel, it was generated when FRR was successfully deleting its routes
> > > before.
> > >
> > > Not sure if this already qualifies as breaking userspace though, but
> > > it's definitely something that used to work with 6.0 and before, and
> > > does not work anymore now.
> > >
> > > The error in FRR log is:
> > >
> > > [YXPF5-B2CE0] netlink_route_multipath_msg_encode: RTM_DELROUTE
> > > 10.0.1.0/24 vrf 0(254)
> > > [HYEHE-CQZ9G] nl_batch_send: netlink-dp (NS 0), batch size=44, msg cnt=1
> > > [XS99C-X3KS5] netlink-dp (NS 0): error: No such process
> > > type=RTM_DELROUTE(25), seq=22, pid=2419702167
> > >
> > > with the revert it succeeds.
> > >
> > > I'll see if I can get a better idea of the actual netlink message sent.
> >
> > Okay, found the knob:
> >
> > nlmsghdr [len=44 type=(25) DELROUTE flags=(0x0401)
> > {REQUEST,(ATOMIC|CREATE)} seq=22 pid=2185212923]
> >   rtmsg [family=(2) AF_INET dstlen=24 srclen=0 tos=0 table=254
> > protocol=(186) UNKNOWN scope=(0) UNIVERSE type=(0) UNSPEC flags=0x0000
> > {}]
> >     rta [len=8 (payload=4) type=(1) DST]
> >       10.0.1.0
> >     rta [len=8 (payload=4) type=(6) PRIORITY]
> >       20
>
> The route is deleted with only prefix information (NH_ID not specified).
> Matches this comment and the code:
> https://github.com/FRRouting/frr/blob/master/zebra/rt_netlink.c#L2091
>
> > netlink-dp (NS 0): error: No such process type=RTM_DELROUTE(25),
> > seq=22, pid=2185212923
> >
> > The route was created via
> >
> > nlmsghdr [len=52 type=(24) NEWROUTE flags=(0x0501)
> > {REQUEST,DUMP,(ROOT|REPLACE|CAPPED),(ATOMIC|CREATE)} seq=18
> > pid=2185212923]
> >  rtmsg [family=(2) AF_INET dstlen=24 srclen=0 tos=0 table=254
> > protocol=(186) UNKNOWN scope=(0) UNIVERSE type=(1) UNICAST
> > flags=0x0000 {}]
> >     rta [len=8 (payload=4) type=(1) DST]
> >       10.0.1.0
> >     rta [len=8 (payload=4) type=(6) PRIORITY]
> >        20
> >      rta [len=8 (payload=4) type=(30) NH_ID]
> >      18
>
> Here the nexthop ID is obviously present.
>
> Let me try to fix it and add a test for this flow.
>
> Thanks for all the details!

You are welcome, and thanks for the quick response!

We have an integration test using FRR that got broken by this, so I
can also easily test anything you throw at me (assuming CET working
hours).

Regards
Jonas


* Re: RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1
  2022-11-24 15:20         ` Jonas Gorski
@ 2022-11-24 16:04           ` Ido Schimmel
  2022-11-24 16:58             ` Jonas Gorski
  0 siblings, 1 reply; 12+ messages in thread
From: Ido Schimmel @ 2022-11-24 16:04 UTC (permalink / raw)
  To: Jonas Gorski; +Cc: Network Development, David Ahern

On Thu, Nov 24, 2022 at 04:20:49PM +0100, Jonas Gorski wrote:
> We have an integration test using FRR that got broken by this, so I
> can also easily test anything you throw at me (assuming CET working
> hours).

Please test the following fix [1]. Tested manually using [2]. With the
fix or 61b91eb33a69 reverted the route is successfully deleted. Without
the fix I get:

RTNETLINK answers: No such process
198.51.100.0/24 nhid 1 via 192.0.2.2 dev dummy10

If the fix is OK, I will submit it along with a selftest to make
sure it does not regress in the future.

Thanks

[1]
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index f721c308248b..19a662003eef 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -888,9 +888,11 @@ int fib_nh_match(struct net *net, struct fib_config *cfg, struct fib_info *fi,
                return 1;
        }
 
-       /* cannot match on nexthop object attributes */
-       if (fi->nh)
-               return 1;
+       if (fi->nh) {
+               if (cfg->fc_oif || cfg->fc_gw_family || cfg->fc_mp)
+                       return 1;
+               return 0;
+       }
 
        if (cfg->fc_oif || cfg->fc_gw_family) {
                struct fib_nh *nh;

[2]
#!/bin/bash

ip link del dev dummy10 &> /dev/null

ip link add name dummy10 up type dummy
ip address add 192.0.2.1/24 dev dummy10
ip nexthop add id 1 via 192.0.2.2 dev dummy10
ip route add 198.51.100.0/24 nhid 1
ip route del 198.51.100.0/24
ip route show 198.51.100.0/24


* Re: RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1
  2022-11-24 16:04           ` Ido Schimmel
@ 2022-11-24 16:58             ` Jonas Gorski
  0 siblings, 0 replies; 12+ messages in thread
From: Jonas Gorski @ 2022-11-24 16:58 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: Network Development, David Ahern

On Thu, 24 Nov 2022 at 17:04, Ido Schimmel <idosch@idosch.org> wrote:
>
> On Thu, Nov 24, 2022 at 04:20:49PM +0100, Jonas Gorski wrote:
> > We have an integration test using FRR that got broken by this, so I
> > can also easily test anything you throw at me (assuming CET working
> > hours).
>
> Please test the following fix [1]. Tested manually using [2]. With the
> fix or 61b91eb33a69 reverted the route is successfully deleted. Without
> the fix I get:
>
> RTNETLINK answers: No such process
> 198.51.100.0/24 nhid 1 via 192.0.2.2 dev dummy10
>
> If the fix is OK, I will submit it along with a selftest to make
> sure it does not regress in the future.
>
> Thanks
>
> [1]
> diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
> index f721c308248b..19a662003eef 100644
> --- a/net/ipv4/fib_semantics.c
> +++ b/net/ipv4/fib_semantics.c
> @@ -888,9 +888,11 @@ int fib_nh_match(struct net *net, struct fib_config *cfg, struct fib_info *fi,
>                 return 1;
>         }
>
> -       /* cannot match on nexthop object attributes */
> -       if (fi->nh)
> -               return 1;
> +       if (fi->nh) {
> +               if (cfg->fc_oif || cfg->fc_gw_family || cfg->fc_mp)
> +                       return 1;
> +               return 0;
> +       }
>
>         if (cfg->fc_oif || cfg->fc_gw_family) {
>                 struct fib_nh *nh;

I can confirm this fixes the issue. Reading the code, this is
basically like it was before the commit with an additional return 1
for (fi->nh && cfg->fc_mp).

Thanks for the quick fix! Here, have a

Tested-by: Jonas Gorski <jonas.gorski@gmail.com>

Regards
Jonas


* Re: RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1
  2022-11-24 12:41 ` Ido Schimmel
  2022-11-24 14:15   ` Jonas Gorski
@ 2022-11-25  3:53   ` David Ahern
  1 sibling, 0 replies; 12+ messages in thread
From: David Ahern @ 2022-11-25  3:53 UTC (permalink / raw)
  To: Ido Schimmel, Jonas Gorski; +Cc: Network Development

On 11/24/22 5:41 AM, Ido Schimmel wrote:
> Looking at the code and thinking about it, I don't think we ever
> generated RTM_DELROUTE notifications when IPv4 routes were flushed (to
> avoid a notification storm).

exactly.


* Re: RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1 #forregzbot
  2022-11-24  9:20 RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1 Jonas Gorski
  2022-11-24 12:41 ` Ido Schimmel
@ 2022-11-25  8:36 ` Thorsten Leemhuis
  2022-11-27 12:08   ` Thorsten Leemhuis
  1 sibling, 1 reply; 12+ messages in thread
From: Thorsten Leemhuis @ 2022-11-25  8:36 UTC (permalink / raw)
  To: Network Development; +Cc: regressions

[Note: this mail is primarily sent for documentation purposes and/or for
regzbot, my Linux kernel regression tracking bot. That's why I removed
most or all folks from the list of recipients, but left any that looked
like a mailing list. These mails usually contain '#forregzbot' in the
subject, to make them easy to spot and filter out.]

[TLDR: I'm adding this regression report to the list of tracked
regressions; all text from me you find below is based on a few template
paragraphs you might have encountered already in similar form.]

Hi, this is your Linux kernel regression tracker.

On 24.11.22 10:20, Jonas Gorski wrote:
> Hello,
> 
> when an IPv4 route gets removed because its nexthop was deleted, the
> kernel does not send a RTM_DELROUTE netlink notifications anymore in
> 6.1. A bisect lead me to 61b91eb33a69 ("ipv4: Handle attempt to delete
> multipath route when fib_info contains an nh reference"), and
> reverting it makes it work again.

Thanks for the report. To be sure the issue below doesn't fall through
the cracks unnoticed, I'm adding it to regzbot, my Linux kernel
regression tracking bot:

#regzbot ^introduced 61b91eb33a69
#regzbot title net: RTM_DELROUTE not sent anymore when deleting (last)
nexthop of routes
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply -- ideally with also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replies to), as explained in
the Linux kernel's documentation; the webpage above explains why this is
important for tracked regressions.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.


* Re: RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1 #forregzbot
  2022-11-25  8:36 ` RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1 #forregzbot Thorsten Leemhuis
@ 2022-11-27 12:08   ` Thorsten Leemhuis
  2022-11-29  8:50     ` Thorsten Leemhuis
  0 siblings, 1 reply; 12+ messages in thread
From: Thorsten Leemhuis @ 2022-11-27 12:08 UTC (permalink / raw)
  To: Network Development; +Cc: regressions

On 25.11.22 09:36, Thorsten Leemhuis wrote:
> On 24.11.22 10:20, Jonas Gorski wrote:
>> when an IPv4 route gets removed because its nexthop was deleted, the
>> kernel does not send a RTM_DELROUTE netlink notifications anymore in
>> 6.1. A bisect lead me to 61b91eb33a69 ("ipv4: Handle attempt to delete
>> multipath route when fib_info contains an nh reference"), and
>> reverting it makes it work again.
> 
> Thanks for the report. To be sure below issue doesn't fall through the
> cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
> tracking bot:
> 
> #regzbot ^introduced 61b91eb33a69
> #regzbot title net: RTM_DELROUTE not sent anymore when deleting (last)
> nexthop of routes
> #regzbot ignore-activity

#regzbot monitor:
https://lore.kernel.org/all/20221124210932.2470010-1-idosch@nvidia.com/


* Re: RTM_DELROUTE not sent anymore when deleting (last) nexthop of routes in 6.1 #forregzbot
  2022-11-27 12:08   ` Thorsten Leemhuis
@ 2022-11-29  8:50     ` Thorsten Leemhuis
  0 siblings, 0 replies; 12+ messages in thread
From: Thorsten Leemhuis @ 2022-11-29  8:50 UTC (permalink / raw)
  To: Network Development; +Cc: regressions

On 27.11.22 13:08, Thorsten Leemhuis wrote:
> On 25.11.22 09:36, Thorsten Leemhuis wrote:
>> On 24.11.22 10:20, Jonas Gorski wrote:
>>> when an IPv4 route gets removed because its nexthop was deleted, the
>>> kernel does not send a RTM_DELROUTE netlink notifications anymore in
>>> 6.1. A bisect lead me to 61b91eb33a69 ("ipv4: Handle attempt to delete
>>> multipath route when fib_info contains an nh reference"), and
>>> reverting it makes it work again.
>>
>> Thanks for the report. To be sure below issue doesn't fall through the
>> cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
>> tracking bot:
>>
>> #regzbot ^introduced 61b91eb33a69
>> #regzbot title net: RTM_DELROUTE not sent anymore when deleting (last)
>> nexthop of routes
>> #regzbot ignore-activity
> 
> #regzbot monitor:
> https://lore.kernel.org/all/20221124210932.2470010-1-idosch@nvidia.com/

#regzbot fixed-by: d5082d386eee

