netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stefano Brivio <sbrivio@redhat.com>
To: David Ahern <dsahern@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	Jianlin Shi <jishi@redhat.com>, Wei Wang <weiwan@google.com>,
	Martin KaFai Lau <kafai@fb.com>,
	Eric Dumazet <edumazet@google.com>,
	Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>,
	netdev@vger.kernel.org
Subject: Re: [PATCH net-next v6 04/11] ipv4: Dump route exceptions if requested
Date: Thu, 20 Jun 2019 21:01:13 +0200	[thread overview]
Message-ID: <20190620210113.6aa2c022@redhat.com> (raw)
In-Reply-To: <777387d8-fa15-388e-875a-02aa5df977dd@gmail.com>

On Thu, 20 Jun 2019 07:31:32 -0600
David Ahern <dsahern@gmail.com> wrote:

> On 6/19/19 5:59 PM, Stefano Brivio wrote:
> > diff --git a/include/net/route.h b/include/net/route.h
> > index 065b47754f05..e7f65388a6d4 100644
> > --- a/include/net/route.h
> > +++ b/include/net/route.h
> > @@ -44,6 +44,7 @@
> >  #define RT_CONN_FLAGS_TOS(sk,tos)   (RT_TOS(tos) | sock_flag(sk, SOCK_LOCALROUTE))
> >  
> >  struct fib_nh;
> > +struct fib_alias;
> >  struct fib_info;
> >  struct uncached_list;
> >  struct rtable {  
> 
> we should not expose fib_alias to route.c.

I was also not enthusiastic about that, but...:

> > @@ -230,6 +231,9 @@ void fib_modify_prefix_metric(struct in_ifaddr *ifa, u32 new_metric);
> >  void rt_add_uncached_list(struct rtable *rt);
> >  void rt_del_uncached_list(struct rtable *rt);
> >  
> > +int fnhe_dump_buckets(struct fib_alias *fa, int nhsel, struct sk_buff *skb,
> > +		      struct netlink_callback *cb, int *fa_index, int fa_start);
> > +
> >  static inline void ip_rt_put(struct rtable *rt)
> >  {
> >  	/* dst_release() accepts a NULL parameter.
> > diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
> > index 94e5d83db4db..03f51e5192e5 100644
> > --- a/net/ipv4/fib_trie.c
> > +++ b/net/ipv4/fib_trie.c
> > @@ -2078,28 +2078,51 @@ void fib_free_table(struct fib_table *tb)
> >  	call_rcu(&tb->rcu, __trie_free_rcu);
> >  }
> >  
> > +static int fib_dump_fnhe_from_leaf(struct fib_alias *fa, struct sk_buff *skb,
> > +				   struct netlink_callback *cb,
> > +				   int *fa_index, int fa_start)
> > +{
> > +	struct fib_info *fi = fa->fa_info;
> > +	int nhsel;
> > +
> > +	if (!fi || fi->fib_flags & RTNH_F_DEAD)
> > +		return 0;
> > +
> > +	for (nhsel = 0; nhsel < fib_info_num_path(fi); nhsel++) {
> > +		int err;
> > +
> > +		err = fnhe_dump_buckets(fa, nhsel, skb, cb, fa_index, fa_start);
> > +		if (err)
> > +			return err;
> > +	}
> > +
> > +	return 0;
> > +}  
> 
> fib_info would be the better argument to pass in to the fnhe dump, and

...we need to pass the table ID to rt_fill_info(). Sure, I can pass
that explicitly, but doing so kind of tells me I'm not passing the
right argument, with sufficient information. What do you think?

> I think the loop over where the bucket is should be in route.c as well.
> So how about fib_info_dump_fnhe() as the helper exposed from route.c,
> and it does the loop over nexthops and calls fnhe_dump_buckets.

Yes, I could do that conveniently if I'm passing a fib_info there. I'm
stlll undecided if it's worth it, I guess I don't really have a
preference.

> As for the loop, you could fill an skb without finishing a bucket inside
> of a nexthop so you need top track which nexthop is current as well.

I think this is not a problem, and also checked that selftests trigger
this. Buckets are transparent to the counter for partial dumps (s_fa),
they are just an arbitrary grouping from that perspective, just like
items on the chain for the same bucket.

Take this example, s_i values in [], s_fa values in ():

  node (fa) #1 [1]
    nexthop #1
    bucket #1 -> #0 in chain (1)
    bucket #2 -> #0 in chain (2) -> #1 in chain (3) -> #2 in chain (4)
    bucket #3 -> #0 in chain (5) -> #1 in chain (6)

    nexthop #2
    bucket #1 -> #0 in chain (7) -> #1 in chain (8)
    bucket #2 -> #0 in chain (9)
  --
  node (fa) #2 [2]
    nexthop #1
    bucket #1 -> #0 in chain (1) -> #1 in chain (2)
    bucket #2 -> #0 in chain (3)


If I stop at (3), (4), (7) for "node #1", or at (2) for "node #2", it
doesn't really matter, because nexthops and buckets are always
traversed in the same way (walking flattens all that).

For IPv4, I could even drop the in-tree/in-node distinction (s_i/s_fa).
But accounting becomes terribly inconvenient though, and it would be
inconsistent with what I needed to do for IPv6 (skip/skip_in_node): we
have 'sernum' there, which is used to mark what node we need to restart
from in case of changes. Within a node, however, I can't make any
assumptions like that, so if the fib6 tree changes, I'll restart from
the beginning of the node (see discussion with Martin on v1).

My idea would be to keep it like it is at the moment, and later make it
as "accurate" as it is on IPv6, introducing something like 'sernum'. If
we start with this, it will be more convenient to do that later.

-- 
Stefano

  reply	other threads:[~2019-06-20 19:01 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-19 23:59 [PATCH net-next v6 00/11] Fix listing (IPv4, IPv6) and flushing (IPv6) of cached route exceptions Stefano Brivio
2019-06-19 23:59 ` [PATCH net-next v6 01/11] fib_frontend, ip6_fib: Select routes or exceptions dump from RTM_F_CLONED Stefano Brivio
2019-06-19 23:59 ` [PATCH net-next v6 02/11] ipv4/fib_frontend: Allow RTM_F_CLONED flag to be used for filtering Stefano Brivio
2019-06-19 23:59 ` [PATCH net-next v6 03/11] ipv4/route: Allow NULL flowinfo in rt_fill_info() Stefano Brivio
2019-06-20 13:15   ` David Ahern
2019-06-20 19:00     ` Stefano Brivio
2019-06-19 23:59 ` [PATCH net-next v6 04/11] ipv4: Dump route exceptions if requested Stefano Brivio
2019-06-20 13:31   ` David Ahern
2019-06-20 19:01     ` Stefano Brivio [this message]
2019-06-20 19:21       ` David Ahern
2019-06-19 23:59 ` [PATCH net-next v6 05/11] Revert "net/ipv6: Bail early if user only wants cloned entries" Stefano Brivio
2019-06-19 23:59 ` [PATCH net-next v6 06/11] ipv6/route: Don't match on fc_nh_id if not set in ip6_route_del() Stefano Brivio
2019-06-20 14:16   ` David Ahern
2019-06-20 19:01     ` Stefano Brivio
2019-06-19 23:59 ` [PATCH net-next v6 07/11] ipv6/route: Change return code of rt6_dump_route() for partial node dumps Stefano Brivio
2019-06-20 14:17   ` David Ahern
2019-06-19 23:59 ` [PATCH net-next v6 08/11] ipv6: Dump route exceptions if requested Stefano Brivio
2019-06-20 14:24   ` David Ahern
2019-06-20 19:02     ` Stefano Brivio
2019-06-20 19:22       ` David Ahern
2019-06-19 23:59 ` [PATCH net-next v6 09/11] ip6_fib: Don't discard nodes with valid routing information in fib6_locate_1() Stefano Brivio
2019-06-19 23:59 ` [PATCH net-next v6 10/11] selftests: pmtu: Introduce list_flush_ipv4_exception test case Stefano Brivio
2019-06-19 23:59 ` [PATCH net-next v6 11/11] selftests: pmtu: Make list_flush_ipv6_exception test more demanding Stefano Brivio

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190620210113.6aa2c022@redhat.com \
    --to=sbrivio@redhat.com \
    --cc=davem@davemloft.net \
    --cc=dsahern@gmail.com \
    --cc=edumazet@google.com \
    --cc=jishi@redhat.com \
    --cc=kafai@fb.com \
    --cc=matti.vaittinen@fi.rohmeurope.com \
    --cc=netdev@vger.kernel.org \
    --cc=weiwan@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).