All of lore.kernel.org
 help / color / mirror / Atom feed
From: Martin Lau <kafai@fb.com>
To: Wei Wang <weiwan@google.com>
Cc: David Ahern <dsahern@gmail.com>, David Ahern <dsahern@kernel.org>,
	"David S . Miller" <davem@davemloft.net>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	"idosch@mellanox.com" <idosch@mellanox.com>,
	"saeedm@mellanox.com" <saeedm@mellanox.com>
Subject: Re: [PATCH v2 net-next 4/7] ipv6: Plumb support for nexthop object in a fib6_info
Date: Wed, 5 Jun 2019 00:39:05 +0000	[thread overview]
Message-ID: <20190605003903.zxxrebpzq2rfzy52@kafai-mbp.dhcp.thefacebook.com> (raw)
In-Reply-To: <CAEA6p_AAP10bXOQPfOqY253H7BQYgksP_iDXDi-RKguLcKh0SA@mail.gmail.com>

On Tue, Jun 04, 2019 at 02:36:27PM -0700, Wei Wang wrote:
> On Tue, Jun 4, 2019 at 2:13 PM David Ahern <dsahern@gmail.com> wrote:
> >
> > On 6/4/19 3:06 PM, Martin Lau wrote:
> > > On Tue, Jun 04, 2019 at 02:17:28PM -0600, David Ahern wrote:
> > >> On 6/3/19 11:29 PM, Martin Lau wrote:
> > >>> On Mon, Jun 03, 2019 at 07:36:06PM -0600, David Ahern wrote:
> > >>>> On 6/3/19 6:58 PM, Martin Lau wrote:
> > >>>>> I have concern on calling ip6_create_rt_rcu() in general which seems
> > >>>>> to trace back to this commit
> > >>>>> dec9b0e295f6 ("net/ipv6: Add rt6_info create function for ip6_pol_route_lookup")
> > >>>>>
> > >>>>> This rt is not tracked in pcpu_rt, rt6_uncached_list or exception bucket.
> > >>>>> In particular, how to react to NETDEV_UNREGISTER/DOWN like
> > >>>>> the rt6_uncached_list_flush_dev() does and calls dev_put()?
> > >>>>>
> > >>>>> The existing callers seem to do dst_release() immediately without
> > >>>>> caching it, but still concerning.
> > >>>>
> > >>>> those are the callers that don't care about the dst_entry, but are
> > >>>> forced to deal with it. Removing the tie between fib lookups an
> > >>>> dst_entry is again the right solution.
> > >>> Great to know that there will be a solution.  It would be great
> > >>> if there is patch (or repo) to show how that may look like on
> > >>> those rt6_lookup() callers.
> > >>
> > >> Not 'will be', 'there is' a solution now. Someone just needs to do the
> > >> conversions and devise the tests for the impacted users.
> > > I don't think everyone will convert to the new nexthop solution
> > > immediately.
> > >
> > > How about ensuring the existing usage stays solid first?
> >
> > Use of nexthop objects has nothing to do with separating fib lookups
> > from dst_entries, but with the addition of fib6_result it Just Works.
> >
> > Wei converted ipv6 to use exception caches instead of adding them to the
> > FIB.
> >
> > I converted ipv6 to use separate data structures for fib entries, added
> > direct fib6 lookup functions and added fib6_result. See the
> > net/core/filter.c.
> >
> > The stage is set for converting users.
> >
> > For example, ip6_nh_lookup_table does not care about the dst entry, only
> > the fib entry. This converts it:
> >
> > static int ip6_nh_lookup_table(struct net *net, struct fib6_config *cfg,
> >                                const struct in6_addr *gw_addr, u32 tbid,
> >                                int flags, struct fib6_result *res)
> > {
> >         struct flowi6 fl6 = {
> >                 .flowi6_oif = cfg->fc_ifindex,
> >                 .daddr = *gw_addr,
> >                 .saddr = cfg->fc_prefsrc,
> >         };
> >         struct fib6_table *table;
> >         struct rt6_info *rt;
> >
> >         table = fib6_get_table(net, tbid);
> >         if (!table)
> >                 return -EINVAL;
> >
> >         if (!ipv6_addr_any(&cfg->fc_prefsrc))
> >                 flags |= RT6_LOOKUP_F_HAS_SADDR;
> >
> >         flags |= RT6_LOOKUP_F_IGNORE_LINKSTATE;
> >
> >         fib6_table_lookup(net, table, cfg->fc_ifindex, fl6, res, flags);
> >         if (res.f6i == net->ipv6.fib6_null_entry)
> >                 return -ENETUNREACH;
> >
> >         fib6_select_path(net, &res, fl6, oif, false, NULL, flags);
> >
> >         return 0;
> > }
> 
> I do agree with Martin that ip6_create_rt_rcu() seems to be dangerous
> as the dst cache created by this func does not get tracked anywhere
> and it is up to the user to not cache it for too long.
IMO, ip6_create_rt_rcu(), which returns untracked rt, was a mistake
and removing it has been overdue.  Tracking down the unregister dev
bug is not easy.

> But I think David, what you are suggesting is:
> instead of trying to convert ip6_create_rt_rcu() to use the pcpu_dst
> logic, completely get rid of the calling to ip6_create_rt_rcu(), and
> directly return f6i in those cases to the caller. Is that correct?
I am fine with either of these two ways to remove ip6_create_rt_rcu().
Further depending on ip6_create_rt_rcu() in this patch even in
ip6_pol_route_lookup() is arguably neither of these two ways...

  parent reply	other threads:[~2019-06-05  0:39 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-03  4:08 [PATCH v2 net-next 0/7] net: add struct nexthop to fib{6}_info David Ahern
2019-06-03  4:08 ` [PATCH v2 net-next 1/7] ipv4: Use accessors for fib_info nexthop data David Ahern
2019-06-03  4:08 ` [PATCH v2 net-next 2/7] ipv4: Prepare for fib6_nh from a nexthop object David Ahern
2019-06-03  4:08 ` [PATCH v2 net-next 3/7] ipv4: Plumb support for nexthop object in a fib_info David Ahern
2019-06-03  4:08 ` [PATCH v2 net-next 4/7] ipv6: Plumb support for nexthop object in a fib6_info David Ahern
2019-06-03 18:09   ` Wei Wang
2019-06-03 18:37     ` David Ahern
2019-06-03 20:42     ` David Ahern
2019-06-03 21:58       ` Wei Wang
2019-06-03 22:35         ` David Ahern
2019-06-03 23:05           ` Wei Wang
2019-06-03 23:18             ` David Ahern
2019-06-03 23:30               ` Wei Wang
2019-06-04  0:58               ` Martin Lau
2019-06-04  1:36                 ` David Ahern
2019-06-04  5:29                   ` Martin Lau
2019-06-04 20:17                     ` David Ahern
2019-06-04 21:06                       ` Martin Lau
2019-06-04 21:13                         ` David Ahern
2019-06-04 21:36                           ` Wei Wang
2019-06-04 23:30                             ` David Ahern
2019-06-05  0:39                             ` Martin Lau [this message]
2019-06-05  2:05                               ` David Ahern
2019-06-05  6:01                                 ` Martin Lau
2019-06-04 21:53                           ` Martin Lau
2019-06-03  4:08 ` [PATCH v2 net-next 5/7] mlxsw: Fail attempts to use routes with nexthop objects David Ahern
2019-06-03  4:08 ` [PATCH v2 net-next 6/7] mlx5: " David Ahern
2019-06-03  4:08 ` [PATCH v2 net-next 7/7] rocker: " David Ahern

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190605003903.zxxrebpzq2rfzy52@kafai-mbp.dhcp.thefacebook.com \
    --to=kafai@fb.com \
    --cc=davem@davemloft.net \
    --cc=dsahern@gmail.com \
    --cc=dsahern@kernel.org \
    --cc=idosch@mellanox.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@mellanox.com \
    --cc=weiwan@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.