Date: Thu, 17 Oct 2019 17:01:20 -0700 (PDT)
Message-Id: <20191017.170120.984298608358144040.davem@davemloft.net>
To: weiwan@google.com
Cc: netdev@vger.kernel.org, idosch@idosch.org, jesse@mbuki-mvuki.org, kafai@fb.com, dsahern@gmail.com
Subject: Re: [PATCH net] ipv4: fix race condition between route lookup and invalidation
From: David Miller
In-Reply-To: <20191016190315.151095-1-weiwan@google.com>
References: <20191016190315.151095-1-weiwan@google.com>
X-Mailing-List: netdev@vger.kernel.org

From: Wei Wang
Date: Wed, 16 Oct 2019 12:03:15 -0700

> Jesse and Ido reported the following race condition:
> - Received packet A is forwarded and cached dst entry is
> taken from the nexthop ('nhc->nhc_rth_input'). Calls skb_dst_set()
>
> - Given Jesse has busy routers ("ingesting full BGP routing tables
> from multiple ISPs"), route is added / deleted and rt_cache_flush() is
> called
>
> - Received packet B tries to use the same cached dst entry
> from t0, but rt_cache_valid() is no longer true and it is replaced in
> rt_cache_route() by the newer one. This calls dst_dev_put() on the
> original dst entry which assigns the blackhole netdev to 'dst->dev'
>
> - dst_input(skb) is called on packet A and it is dropped due
> to 'dst->dev' being the blackhole netdev
>
> There are 2 issues in the v4 routing code:
> 1. A per-netns counter is used to do the validation of the route. That
> means whenever a route is changed in the netns, users of all routes in
> the netns need to redo the lookup. v6 has an implementation of only
> updating fn_sernum for routes that are affected.
> 2. When rt_cache_valid() returns false, rt_cache_route() is called to
> throw away the current cache, and create a new one. This seems
> unnecessary because as long as this route does not change, the route
> cache does not need to be recreated.
>
> To fully solve the above 2 issues, it probably needs quite some code
> changes and requires careful testing, and is not suitable for the net
> branch.
>
> So this patch only tries to add the deleted cached rt into the uncached
> list, so that users can still use it to receive packets until
> it's done.
>
> Fixes: 95c47f9cf5e0 ("ipv4: call dst_dev_put() properly")
> Signed-off-by: Wei Wang
> Reported-by: Ido Schimmel
> Reported-by: Jesse Hathaway
> Tested-by: Jesse Hathaway
> Acked-by: Martin KaFai Lau

Applied and queued up for -stable.
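
For anyone reading this thread from the archive, the race in the quoted
report can be replayed with a small user-space model. This is only a
sketch under stated assumptions, not the kernel code and not the patch
itself: every toy_* name below is hypothetical, the per-netns counter is
modelled as a plain integer, and the "uncached list" is reduced to a flag.
The keep_usable switch stands in for the behaviour the patch description
aims for (the replaced entry stays usable by whoever already holds it),
as opposed to pointing its device at the blackhole netdev.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; these are NOT the kernel's structures. */
struct toy_dev { const char *name; };

static struct toy_dev toy_eth0 = { "eth0" };
static struct toy_dev toy_blackhole = { "blackhole" };

struct toy_dst {
	struct toy_dev *dev;
	unsigned int genid;      /* generation the entry was created in */
	bool on_uncached_list;   /* simplified "uncached list" membership */
};

struct toy_netns { unsigned int genid; };      /* per-netns counter */
struct toy_nexthop { struct toy_dst *cached; };

/* Loosely models rt_cache_valid(): valid only while generations match. */
static bool toy_cache_valid(const struct toy_netns *ns,
			    const struct toy_dst *d)
{
	return d != NULL && d->genid == ns->genid;
}

/* Loosely models rt_cache_flush(): one route change invalidates all. */
static void toy_cache_flush(struct toy_netns *ns)
{
	ns->genid++;
}

/* Loosely models the cache replacement done in rt_cache_route(). */
static void toy_cache_route(struct toy_netns *ns, struct toy_nexthop *nh,
			    struct toy_dst *fresh, bool keep_usable)
{
	struct toy_dst *old = nh->cached;

	fresh->dev = &toy_eth0;
	fresh->genid = ns->genid;
	fresh->on_uncached_list = false;
	nh->cached = fresh;

	if (old == NULL)
		return;
	if (keep_usable)
		old->on_uncached_list = true;  /* holders can keep using it */
	else
		old->dev = &toy_blackhole;     /* holders will now drop */
}

/* Loosely models dst_input(): true means delivered, false means dropped. */
static bool toy_dst_input(const struct toy_dst *d)
{
	assert(d != NULL);
	return d->dev != &toy_blackhole;
}

/* Replays the four steps of the report; returns whether packet A survives. */
static bool toy_replay_race(bool keep_usable)
{
	struct toy_netns ns = { 0 };
	struct toy_nexthop nh = { NULL };
	struct toy_dst first, second;
	struct toy_dst *pkt_a_dst;

	toy_cache_route(&ns, &nh, &first, keep_usable);
	pkt_a_dst = nh.cached;                   /* packet A takes cached dst */
	toy_cache_flush(&ns);                    /* route added / deleted */
	if (!toy_cache_valid(&ns, nh.cached))    /* packet B finds it stale */
		toy_cache_route(&ns, &nh, &second, keep_usable);
	return toy_dst_input(pkt_a_dst);         /* packet A reaches input */
}
```

Calling toy_replay_race(false) follows the reported sequence and packet A
is dropped; toy_replay_race(true) keeps the old entry usable and packet A
is delivered. The model deliberately ignores refcounting and RCU, which
the real code depends on.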