From mboxrd@z Thu Jan  1 00:00:00 1970
From: David Ahern
Subject: Re: [RFC v2 bpf-next 8/9] bpf: Provide helper to do lookups in kernel FIB table
Date: Mon, 14 May 2018 21:46:11 -0600
Message-ID: <1e211d16-81b6-1eb9-32cc-a9137b6ced4d@gmail.com>
References: <20180429180752.15428-1-dsahern@gmail.com>
 <20180429180752.15428-9-dsahern@gmail.com>
 <20180429233640.jklasxafvap2q7ig@ast-mbp>
 <4729b693-20d7-dd9e-c48b-be8386ce9bed@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Cc: netdev@vger.kernel.org, borkmann@iogearbox.net, ast@kernel.org,
 davem@davemloft.net, shm@cumulusnetworks.com, roopa@cumulusnetworks.com,
 brouer@redhat.com, toke@toke.dk, john.fastabend@gmail.com
To: Alexei Starovoitov
Return-path:
Received: from mail-pf0-f196.google.com ([209.85.192.196]:36721 "EHLO
 mail-pf0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1752060AbeEODqP (ORCPT );
 Mon, 14 May 2018 23:46:15 -0400
Received: by mail-pf0-f196.google.com with SMTP id w129-v6so7013572pfd.3
 for ; Mon, 14 May 2018 20:46:15 -0700 (PDT)
In-Reply-To: <4729b693-20d7-dd9e-c48b-be8386ce9bed@gmail.com>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID:

On 4/29/18 7:13 PM, David Ahern wrote:
>
> The idea here is to fast pass packets that fit a supported profile and
> are to be forwarded. Everything else should continue up the stack as it
> has wider capabilities. The helper and XDP programs should make no
> assumptions on what the broader kernel and userspace might be monitoring
> or want to do with packets that can not be forwarded in the fast path.
> This is very similar to hardware forwarding when it punts packets to the
> CPU for control plane assistance.
>

I have been thinking about this some more, and about how to return more
information to the bpf program about the FIB lookup. The bpf_fib_lookup
struct is 64 bytes. It cannot be expanded without hurting performance.
I could do another union on an input parameter and return flags
indicating why the returned index is 0. Something like this:

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 360a1168c353..75591522444c 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2314,6 +2314,12 @@ struct bpf_raw_tracepoint_args {
 #define BPF_FIB_LOOKUP_DIRECT  BIT(0)
 #define BPF_FIB_LOOKUP_OUTPUT  BIT(1)

+#define BPF_FIB_LKUP_RET_NO_FWD       BIT(0)  /* pkt is not fwded */
+#define BPF_FIB_LKUP_RET_UNSUPP_LWT   BIT(1)  /* fwd requires unsupp encap */
+#define BPF_FIB_LKUP_RET_NO_NHDEV     BIT(2)  /* nh device does not exist */
+#define BPF_FIB_LKUP_RET_NO_NEIGH     BIT(3)  /* no neigh entry for nh */
+#define BPF_FIB_LKUP_RET_FRAG_NEEDED  BIT(4)  /* pkt too big to fwd */
+
 struct bpf_fib_lookup {
 	/* input */
 	__u8	family;   /* network family, AF_INET, AF_INET6, AF_MPLS */
@@ -2325,7 +2331,11 @@ struct bpf_fib_lookup {

 	/* total length of packet from network header - used for MTU check */
 	__u16	tot_len;
-	__u32	ifindex;  /* L3 device index for lookup */
+
+	union {
+		__u32	ifindex;   /* in: L3 device index for lookup */
+		__u32	ret_flags; /* out: BPF_FIB_LKUP_RET_* flags */
+	};

 	union {
 		/* inputs to lookup */

Similarly for the fib result, it could be returned with a union on, say,
family:

	union {
		__u8	family;   /* in: network family, AF_INET, AF_INET6, AF_MPLS */
		__u8	rt_type;  /* out: FIB lookup route type */
	};

Then if the fib result is -EINVAL/-EHOSTUNREACH/-EACCES, rt_type is set
to RTN_BLACKHOLE/RTN_UNREACHABLE/RTN_PROHIBIT, allowing the XDP program
to make an informed decision on dropping the packet.

To avoid performance hits on the forwarding path, these return values
would *only* be set if the ifindex returned is 0.
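
To make the intent concrete, here is a rough userspace sketch of how an XDP
program might consume the proposed outputs. It is illustration only and not
part of the patch: struct fib_lookup_view, the fwd_verdict values (standing
in for XDP_REDIRECT/XDP_DROP/XDP_PASS), classify_lookup() and
count_punt_reasons() are all hypothetical names. The flag values are copied
from the proposal above, and the route type values mirror RTN_* from
<linux/rtnetlink.h>, redefined locally so the snippet builds standalone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BIT(n) (1U << (n))

/* Proposed return flags, copied from the diff above. */
#define BPF_FIB_LKUP_RET_NO_FWD       BIT(0)
#define BPF_FIB_LKUP_RET_UNSUPP_LWT   BIT(1)
#define BPF_FIB_LKUP_RET_NO_NHDEV     BIT(2)
#define BPF_FIB_LKUP_RET_NO_NEIGH     BIT(3)
#define BPF_FIB_LKUP_RET_FRAG_NEEDED  BIT(4)
#define FIB_LKUP_RET_NBITS            5

/* Local copies of RTN_BLACKHOLE/RTN_UNREACHABLE/RTN_PROHIBIT values. */
enum {
	SKETCH_RTN_BLACKHOLE   = 6,
	SKETCH_RTN_UNREACHABLE = 7,
	SKETCH_RTN_PROHIBIT    = 8,
};

/* Hypothetical stand-ins for XDP_REDIRECT / XDP_DROP / XDP_PASS. */
enum fwd_verdict { VERDICT_REDIRECT, VERDICT_DROP, VERDICT_PASS };

/* Hypothetical view of what the program sees after the lookup. */
struct fib_lookup_view {
	uint32_t ifindex;   /* index returned by the helper, 0 = not forwarded */
	uint32_t ret_flags; /* only meaningful when ifindex == 0 */
	uint8_t  rt_type;   /* only meaningful when ifindex == 0 */
};

/* Fast path first; consult the extra outputs only when ifindex is 0. */
static enum fwd_verdict classify_lookup(const struct fib_lookup_view *res)
{
	assert(res != NULL);

	if (res->ifindex != 0)
		return VERDICT_REDIRECT;

	switch (res->rt_type) {
	case SKETCH_RTN_BLACKHOLE:
	case SKETCH_RTN_UNREACHABLE:
	case SKETCH_RTN_PROHIBIT:
		/* The FIB says this packet is to be dropped. */
		return VERDICT_DROP;
	default:
		/* No neigh, frag needed, unsupported encap, ...: punt to the stack. */
		return VERDICT_PASS;
	}
}

/* Bump one counter per reason bit set, e.g. for per-reason statistics. */
static void count_punt_reasons(uint32_t ret_flags,
			       uint64_t counters[FIB_LKUP_RET_NBITS])
{
	for (unsigned int bit = 0; bit < FIB_LKUP_RET_NBITS; bit++)
		if (ret_flags & BIT(bit))
			counters[bit]++;
}
```

In this sketch only the three route types named above lead to a drop; every
other reason is passed up the stack, in line with the "punt to the CPU"
model from the quoted text.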