From mboxrd@z Thu Jan 1 00:00:00 1970 From: Martin Lau Subject: Re: [PATCH bpf-next] bpf: Extend the sk_lookup() helper to XDP hookpoint. Date: Fri, 19 Oct 2018 05:06:29 +0000 Message-ID: <20181019050620.yomnfxdide3v7k6v@kafai-mbp.dhcp.thefacebook.com> References: <20181018115029.77ff8e71@ubun> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Cc: "daniel@iogearbox.net" , Nitin Hande , netdev , "ast@kernel.org" , Jesper Brouer , john fastabend To: Joe Stringer Return-path: Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:45846 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726942AbeJSNLh (ORCPT ); Fri, 19 Oct 2018 09:11:37 -0400 In-Reply-To: Content-Language: en-US Content-ID: <386A8EF95BACA14699E0C291924B64AE@namprd15.prod.outlook.com> Sender: netdev-owner@vger.kernel.org List-ID: On Thu, Oct 18, 2018 at 04:52:40PM -0700, Joe Stringer wrote: > On Thu, 18 Oct 2018 at 14:20, Daniel Borkmann wrot= e: > > > > On 10/18/2018 11:06 PM, Joe Stringer wrote: > > > On Thu, 18 Oct 2018 at 11:54, Nitin Hande wro= te: > > [...] > > >> Open Issue > > >> * The underlying code relies on presence of an skb to find out the > > >> right sk for the case of REUSEPORT socket option. Since there is > > >> no skb available at XDP hookpoint, the helper function will return > > >> the first available sk based off the 5 tuple hash. If the desire > > >> is to return a particular sk matching reuseport_cb function, please > > >> suggest way to tackle it, which can be addressed in a future commit. > > > > >> Signed-off-by: Nitin Hande > > > > > > Thanks Nitin, LGTM overall. > > > > > > The REUSEPORT thing suggests that the usage of this helper from XDP > > > layer may lead to a different socket being selected vs. the equivalen= t > > > call at TC hook, or other places where the selection may occur. This > > > could be a bit counter-intuitive. > > > > > > One thought I had to work around this was to introduce a flag, > > > something like BPF_F_FIND_REUSEPORT_SK_BY_HASH. This flag would > > > effectively communicate in the API that the bpf_sk_lookup_xxx() > > > functions will only select a REUSEPORT socket based on the hash and > > > not by, for example BPF_PROG_TYPE_SK_REUSEPORT programs. The absence > > > of the flag would support finding REUSEPORT sockets by other > > > mechanisms (which would be allowed for now from TC hooks but would be > > > disallowed from XDP, since there's no specific plan to support this). > > > > Hmm, given skb is NULL here the only way to lookup the socket in such > > scenario is based on hash, that is, inet_ehashfn() / inet6_ehashfn(), > > perhaps alternative is to pass this hash in from XDP itself to the > > helper so it could be custom selector. Do you have a specific use case > > on this for XDP (just curious)? >=20 > I don't have a use case for SO_REUSEPORT introspection from XDP, so > I'm primarily thinking from the perspective of making the behaviour > clear in the API in a way that leaves open the possibility for a > reasonable implementation in future. From that perspective, my main > concern is that it may surprise some BPF writers that the same > "bpf_sk_lookup_tcp()" call (with identical parameters) may have > different behaviour at TC vs. XDP layers, as the BPF selection of > sockets is respected at TC but not at XDP. >=20 > FWIW we're already out of parameters for the actual call, so if we > wanted to allow passing a hash in, we'd need to either dedicate half > the 'flags' field for this configurable hash, or consider adding the > new hash parameter to 'struct bpf_sock_tuple'. >=20 > +Martin for any thoughts on SO_REUSEPORT and XDP here. The XDP/TC prog has read access to the sk fields through 'struct bpf_sock'? A quick thought... Considering all sk in the same reuse->socks[] share many things (e.g. family,type,protocol,ip,port..etc are the same), I wonder returning which particular sk from reuse->socks[] will matter too much since most of the fields from 'struct bpf_sock' will be the same. Some of fields in 'struct bpf_sock' could be different though, like priority? Hence, another possibility is to limit the accessible fields for the XDP prog. Only allow accessing the fields that must be the same among the sk in the same reuse->socks[].