From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EB21BC433E0 for ; Sun, 28 Feb 2021 14:30:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B3B5D64E85 for ; Sun, 28 Feb 2021 14:30:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230224AbhB1O2h (ORCPT ); Sun, 28 Feb 2021 09:28:37 -0500 Received: from out4-smtp.messagingengine.com ([66.111.4.28]:42867 "EHLO out4-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230140AbhB1O2g (ORCPT ); Sun, 28 Feb 2021 09:28:36 -0500 Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailout.nyi.internal (Postfix) with ESMTP id DE4815C00F7; Sun, 28 Feb 2021 09:27:29 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute3.internal (MEProxy); Sun, 28 Feb 2021 09:27:29 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=Fny1XAGwTP14kN8off7ygbjw9+bZTff2fGYjxTQ9nYE=; b=pSjrqdnl s9FGTjwPAgK8U1KDoQG7xRrXey6wDL9u+3tfpDe4qo3EZFjdmx6BGTJRDfDWCyGR +DRr4apAO8ImbJRuTb0UytjXCUUtFZhg2Mt96/Aon6a6adA2KaqEzS6MXvxr9Fur I9Px3wAnTGO7Rd3Z9srlM26ajQ+Ogb64X7wD95Om2QUaIsA1aThQ8yzsD4rYI3YE 9euKwgqKoE6G1P2T3RAZkah/Gz3Gsapt4Id4+6PijiiG0L8RMhYN2kFKM+EKh/cm LUqmxAeA6v5RJYKP2hm0ojzYo8HjoCBzAXiESmMfsVbQ8AooKq9OT2drK7OuSSpN orYxmjzUoN19tA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrleeigdeiiecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhephffvufffkffojghfggfgsedtkeertd ertddtnecuhfhrohhmpefkughoucfutghhihhmmhgvlhcuoehiughoshgthhesihguohhs tghhrdhorhhgqeenucggtffrrghtthgvrhhnpeduteeiveffffevleekleejffekhfekhe fgtdfftefhledvjefggfehgfevjeekhfenucfkphepkeegrddvvdelrdduheefrdeggeen ucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehiughosh gthhesihguohhstghhrdhorhhg X-ME-Proxy: Received: from shredder.lan (igld-84-229-153-44.inter.net.il [84.229.153.44]) by mail.messagingengine.com (Postfix) with ESMTPA id B263A1080057; Sun, 28 Feb 2021 09:27:27 -0500 (EST) From: Ido Schimmel To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, dsahern@gmail.com, roopa@nvidia.com, sharpd@nvidia.com, mlxsw@nvidia.com, Ido Schimmel Subject: [RFC PATCH net 1/2] nexthop: Do not flush blackhole nexthops when loopback goes down Date: Sun, 28 Feb 2021 16:26:12 +0200 Message-Id: <20210228142613.1642938-2-idosch@idosch.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210228142613.1642938-1-idosch@idosch.org> References: <20210228142613.1642938-1-idosch@idosch.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Ido Schimmel As far as user space is concerned, blackhole nexthops do not have a nexthop device and therefore should not be affected by the administrative or carrier state of any netdev. However, when the loopback netdev goes down all the blackhole nexthops are flushed. This happens because internally the kernel associates blackhole nexthops with the loopback netdev. This behavior is both confusing to those not familiar with kernel internals and also diverges from the legacy API where blackhole IPv4 routes are not flushed when the loopback netdev goes down: # ip route add blackhole 198.51.100.0/24 # ip link set dev lo down # ip route show 198.51.100.0/24 blackhole 198.51.100.0/24 Blackhole IPv6 routes are flushed, but at least user space knows that they are associated with the loopback netdev: # ip -6 route show 2001:db8:1::/64 blackhole 2001:db8:1::/64 dev lo metric 1024 pref medium Fix this by only flushing blackhole nexthops when the loopback netdev is unregistered. Fixes: ab84be7e54fc ("net: Initial nexthop code") Signed-off-by: Ido Schimmel Reported-by: Donald Sharp --- net/ipv4/nexthop.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c index f1c6cbdb9e43..743777bce179 100644 --- a/net/ipv4/nexthop.c +++ b/net/ipv4/nexthop.c @@ -1399,7 +1399,7 @@ static int insert_nexthop(struct net *net, struct nexthop *new_nh, /* rtnl */ /* remove all nexthops tied to a device being deleted */ -static void nexthop_flush_dev(struct net_device *dev) +static void nexthop_flush_dev(struct net_device *dev, unsigned long event) { unsigned int hash = nh_dev_hashfn(dev->ifindex); struct net *net = dev_net(dev); @@ -1411,6 +1411,10 @@ static void nexthop_flush_dev(struct net_device *dev) if (nhi->fib_nhc.nhc_dev != dev) continue; + if (nhi->reject_nh && + (event == NETDEV_DOWN || event == NETDEV_CHANGE)) + continue; + remove_nexthop(net, nhi->nh_parent, NULL); } } @@ -2189,11 +2193,11 @@ static int nh_netdev_event(struct notifier_block *this, switch (event) { case NETDEV_DOWN: case NETDEV_UNREGISTER: - nexthop_flush_dev(dev); + nexthop_flush_dev(dev, event); break; case NETDEV_CHANGE: if (!(dev_get_flags(dev) & (IFF_RUNNING | IFF_LOWER_UP))) - nexthop_flush_dev(dev); + nexthop_flush_dev(dev, event); break; case NETDEV_CHANGEMTU: info_ext = ptr; -- 2.29.2