All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Ahern <dsahern@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: netdev@vger.kernel.org, borkmann@iogearbox.net, ast@kernel.org,
	davem@davemloft.net, shm@cumulusnetworks.com,
	roopa@cumulusnetworks.com, brouer@redhat.com, toke@toke.dk,
	john.fastabend@gmail.com
Subject: Re: [RFC v2 bpf-next 8/9] bpf: Provide helper to do lookups in kernel FIB table
Date: Sun, 29 Apr 2018 19:13:35 -0600	[thread overview]
Message-ID: <4729b693-20d7-dd9e-c48b-be8386ce9bed@gmail.com> (raw)
In-Reply-To: <20180429233640.jklasxafvap2q7ig@ast-mbp>

On 4/29/18 5:36 PM, Alexei Starovoitov wrote:
>> +	if (flags & BPF_FIB_LOOKUP_DIRECT) {
>> +		u32 tbid = l3mdev_fib_table_rcu(dev) ? : RT_TABLE_MAIN;
>> +		struct fib_table *tb;
>> +
>> +		tb = fib_get_table(net, tbid);
>> +		if (unlikely(!tb))
>> +			return 0;
>> +
>> +		err = fib_table_lookup(tb, &fl4, &res, FIB_LOOKUP_NOREF);
>> +	} else {
>> +		fl4.flowi4_mark = 0;
>> +		fl4.flowi4_secid = 0;
>> +		fl4.flowi4_tun_key.tun_id = 0;
>> +		fl4.flowi4_uid = sock_net_uid(net, NULL);
>> +
>> +		err = fib_lookup(net, &fl4, &res, FIB_LOOKUP_NOREF);
>> +	}
>> +
>> +	if (err || res.type != RTN_UNICAST)
>> +		return 0;
> 
> should this be an error returned to the user instead of zero?
> Seems useful to indicate.

res.type != UNICAST is not an error; it means other delivery type (e.g.,
local).

err < 0 means unreachable, prohibit, blackhole, etc. Arguably the error
could be returned to the xdp program, but it is more complicated than
that. Blackhole is a common default route or policy, but RTN_BLACKHOLE
== -EINVAL which is also the error code if the user passes invalid
arguments to the program.

> 
>> +
>> +	if (res.fi->fib_nhs > 1)
>> +		fib_select_path(net, &res, &fl4, NULL);
>> +
>> +	nh = &res.fi->fib_nh[res.nh_sel];
>> +
>> +	/* do not handle lwt encaps right now */
>> +	if (nh->nh_lwtstate)
>> +		return 0;
> 
> adn return enotsupp here?

see below

> 
>> +
>> +	dev = nh->nh_dev;
>> +	if (unlikely(!dev))
>> +		return 0;
> 
> enodev ?

see below

> 
>> +
>> +	if (nh->nh_gw)
>> +		params->ipv4_dst = nh->nh_gw;
>> +
>> +	params->rt_metric = res.fi->fib_priority;
>> +
>> +	/* xdp and cls_bpf programs are run in RCU-bh so
>> +	 * rcu_read_lock_bh is not needed here
>> +	 */
>> +	neigh = __ipv4_neigh_lookup_noref(dev, (__force u32)params->ipv4_dst);
>> +	if (neigh)
>> +		return bpf_fib_set_fwd_params(params, neigh, dev);
>> +
>> +	return 0;
> 
> Even this return 0 doesn't quite fit to what doc says:
> "0 if packet needs to continue  up the stack for further processing"
> What stack suppose to do ?

First packet on a route the nexthop may not be resolved. Without punting
to the stack it never has an impetus to resolve that neighbor.


> It will hit the same condition and packet will be dropped, right?

no. It can resolve the neighbor so follow up packets can be forwarded in
the fast path.

> Isn't it better to report all errors back to bpf prog and let
> the program make decision instead of 'return 0' almost everywhere?

The idea here is to fast pass packets that fit a supported profile and
are to be forwarded. Everything else should continue up the stack as it
has wider capabilities. The helper and XDP programs should make no
assumptions on what the broader kernel and userspace might be monitoring
or want to do with packets that can not be forwarded in the fast path.
This is very similar to hardware forwarding when it punts packets to the
CPU for control plane assistance.

  reply	other threads:[~2018-04-30  1:13 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-29 18:07 [RFC v2 bpf-next 0/9] bpf: Add helper to do FIB lookups David Ahern
2018-04-29 18:07 ` [RFC v2 bpf-next 1/9] net/ipv6: Rename fib6_lookup to fib6_node_lookup David Ahern
2018-04-29 18:07 ` [RFC v2 bpf-next 2/9] net/ipv6: Rename rt6_multipath_select David Ahern
2018-04-29 18:07 ` [RFC v2 bpf-next 3/9] net/ipv6: Extract table lookup from ip6_pol_route David Ahern
2018-04-29 18:07 ` [RFC v2 bpf-next 4/9] net/ipv6: Refactor fib6_rule_action David Ahern
2018-04-29 18:07 ` [RFC v2 bpf-next 5/9] net/ipv6: Add fib6_lookup David Ahern
2018-05-01 18:15   ` Vincent Bernat
2018-05-01 18:25     ` David Ahern
2018-04-29 18:07 ` [RFC v2 bpf-next 6/9] net/ipv6: Update fib6 tracepoint to take fib6_info David Ahern
2018-04-29 18:07 ` [RFC v2 bpf-next 7/9] net/ipv6: Add fib lookup stubs for use in bpf helper David Ahern
2018-04-29 18:07 ` [RFC v2 bpf-next 8/9] bpf: Provide helper to do lookups in kernel FIB table David Ahern
2018-04-29 23:36   ` Alexei Starovoitov
2018-04-30  1:13     ` David Ahern [this message]
2018-05-15  3:46       ` David Ahern
2018-05-02 11:27   ` Jesper Dangaard Brouer
2018-05-02 15:37     ` David Ahern
2018-05-02 17:00       ` David Miller
2018-04-29 18:07 ` [RFC v2 bpf-next 9/9] samples/bpf: Add examples of ipv4 and ipv6 forwarding in XDP David Ahern
2018-05-02 11:13   ` Jesper Dangaard Brouer
2018-05-02 15:40     ` David Ahern
2018-05-01 14:20 ` [RFC v2 bpf-next 0/9] bpf: Add helper to do FIB lookups David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4729b693-20d7-dd9e-c48b-be8386ce9bed@gmail.com \
    --to=dsahern@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=borkmann@iogearbox.net \
    --cc=brouer@redhat.com \
    --cc=davem@davemloft.net \
    --cc=john.fastabend@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=roopa@cumulusnetworks.com \
    --cc=shm@cumulusnetworks.com \
    --cc=toke@toke.dk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.