All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
	Daniel Borkmann <borkmann@iogearbox.net>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	maze@google.com, lmb@cloudflare.com, shaun@tigera.io,
	Lorenzo Bianconi <lorenzo@kernel.org>,
	marek@cloudflare.com, John Fastabend <john.fastabend@gmail.com>,
	Jakub Kicinski <kuba@kernel.org>,
	eyal.birger@gmail.com, colrack@gmail.com, brouer@redhat.com
Subject: Re: [PATCH bpf-next V12 2/7] bpf: fix bpf_fib_lookup helper MTU check for SKB ctx
Date: Mon, 25 Jan 2021 16:58:34 +0100	[thread overview]
Message-ID: <20210125165834.558f7fe1@carbon> (raw)
In-Reply-To: <06f94963-f16b-3339-abf2-6529b474a2f6@iogearbox.net>

On Sat, 23 Jan 2021 02:47:28 +0100
Daniel Borkmann <daniel@iogearbox.net> wrote:

> On 1/18/21 5:54 PM, Jesper Dangaard Brouer wrote:
> > BPF end-user on Cilium slack-channel (Carlo Carraro) wants to use
> > bpf_fib_lookup for doing MTU-check, but *prior* to extending packet size,
> > by adjusting fib_params 'tot_len' with the packet length plus the expected
> > encap size. (Just like the bpf_check_mtu helper supports). He discovered
> > that for SKB ctx the param->tot_len was not used, instead skb->len was used
> > (via MTU check in is_skb_forwardable() that checks against netdev MTU).
> > 
> > Fix this by using fib_params 'tot_len' for MTU check. If not provided (e.g.
> > zero) then keep existing TC behaviour intact. Notice that 'tot_len' for MTU
> > check is done like XDP code-path, which checks against FIB-dst MTU.
> > 
> > V10:
> > - Use same method as XDP for 'tot_len' MTU check
> > 
> > Fixes: 4c79579b44b1 ("bpf: Change bpf_fib_lookup to return lookup status")
> > Reported-by: Carlo Carraro <colrack@gmail.com>
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> > ---
> >   net/core/filter.c |   13 ++++++++++---
> >   1 file changed, 10 insertions(+), 3 deletions(-)
> > 
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 5beadd659091..d5e6f395cf64 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -5569,6 +5569,7 @@ BPF_CALL_4(bpf_skb_fib_lookup, struct sk_buff *, skb,
> >   {
> >   	struct net *net = dev_net(skb->dev);
> >   	int rc = -EAFNOSUPPORT;
> > +	bool check_mtu = false;
> >   
> >   	if (plen < sizeof(*params))
> >   		return -EINVAL;
> > @@ -5576,22 +5577,28 @@ BPF_CALL_4(bpf_skb_fib_lookup, struct sk_buff *, skb,
> >   	if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT))
> >   		return -EINVAL;
> >   
> > +	if (params->tot_len)
> > +		check_mtu = true;
> > +
> >   	switch (params->family) {
> >   #if IS_ENABLED(CONFIG_INET)
> >   	case AF_INET:
> > -		rc = bpf_ipv4_fib_lookup(net, params, flags, false);
> > +		rc = bpf_ipv4_fib_lookup(net, params, flags, check_mtu);
> >   		break;
> >   #endif
> >   #if IS_ENABLED(CONFIG_IPV6)
> >   	case AF_INET6:
> > -		rc = bpf_ipv6_fib_lookup(net, params, flags, false);
> > +		rc = bpf_ipv6_fib_lookup(net, params, flags, check_mtu);
> >   		break;
> >   #endif
> >   	}
> >   
> > -	if (!rc) {
> > +	if (rc == BPF_FIB_LKUP_RET_SUCCESS && !check_mtu) {
> >   		struct net_device *dev;
> >   
> > +		/* When tot_len isn't provided by user,
> > +		 * check skb against net_device MTU
> > +		 */
> >   		dev = dev_get_by_index_rcu(net, params->ifindex);
> >   		if (!is_skb_forwardable(dev, skb))
> >   			rc = BPF_FIB_LKUP_RET_FRAG_NEEDED;  
> 
> Btw, looking at some of the old feedback, looks like [0] got missed somehow. Would
> be nice if we could simplify this rather ugly bit with refetching dev for tc.
> 
>    [0] https://lore.kernel.org/bpf/f959017b-5d3c-5cdb-a016-c467a3c9a2fc@iogearbox.net/
>        https://lore.kernel.org/bpf/f8ff26f0-b1b6-6dd1-738d-4c592a8efdb0@gmail.com/

I have tried to incorporate the ideas from [0].  If you notice this
code path is only called in-case params->tot_len is zero (!check_mtu).
Then if params->tot_len does contain something, then the check_mtu code
path of bpf_ipv4_fib_lookup() and bpf_ipv6_fib_lookup().

I agree, that it is ugly refetching dev.  I'll look at moving the
lookup dev_get_by_index_rcu() call one level-up, as both
bpf_ipv4_fib_lookup() and bpf_ipv6_fib_lookup() does this call. (I
cannot move the check into the functions due skb is not avail in XDP
case)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


  reply	other threads:[~2021-01-26  5:38 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-18 16:54 [PATCH bpf-next V12 0/7] bpf: New approach for BPF MTU handling Jesper Dangaard Brouer
2021-01-18 16:54 ` [PATCH bpf-next V12 1/7] bpf: Remove MTU check in __bpf_skb_max_len Jesper Dangaard Brouer
2021-01-18 16:54 ` [PATCH bpf-next V12 2/7] bpf: fix bpf_fib_lookup helper MTU check for SKB ctx Jesper Dangaard Brouer
2021-01-23  1:47   ` Daniel Borkmann
2021-01-25 15:58     ` Jesper Dangaard Brouer [this message]
2021-01-18 16:54 ` [PATCH bpf-next V12 3/7] bpf: bpf_fib_lookup return MTU value as output when looked up Jesper Dangaard Brouer
2021-01-18 16:54 ` [PATCH bpf-next V12 4/7] bpf: add BPF-helper for MTU checking Jesper Dangaard Brouer
2021-01-23  1:35   ` Daniel Borkmann
2021-01-25  8:41     ` Jesper Dangaard Brouer
2021-01-25 22:27       ` Daniel Borkmann
2021-01-26  9:13         ` Jesper Dangaard Brouer
2021-01-18 16:54 ` [PATCH bpf-next V12 5/7] bpf: drop MTU check when doing TC-BPF redirect to ingress Jesper Dangaard Brouer
2021-01-18 16:54 ` [PATCH bpf-next V12 6/7] selftests/bpf: use bpf_check_mtu in selftest test_cls_redirect Jesper Dangaard Brouer
2021-01-18 16:54 ` [PATCH bpf-next V12 7/7] bpf/selftests: tests using bpf_check_mtu BPF-helper Jesper Dangaard Brouer
2021-01-23  1:49   ` Daniel Borkmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210125165834.558f7fe1@carbon \
    --to=brouer@redhat.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=borkmann@iogearbox.net \
    --cc=bpf@vger.kernel.org \
    --cc=colrack@gmail.com \
    --cc=daniel@iogearbox.net \
    --cc=eyal.birger@gmail.com \
    --cc=john.fastabend@gmail.com \
    --cc=kuba@kernel.org \
    --cc=lmb@cloudflare.com \
    --cc=lorenzo@kernel.org \
    --cc=marek@cloudflare.com \
    --cc=maze@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=shaun@tigera.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.