linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: "Maciej Żenczykowski" <zenczykowski@gmail.com>
Cc: "Maciej Żenczykowski" <maze@google.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Linux Network Development Mailing List" <netdev@vger.kernel.org>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"BPF Mailing List" <bpf@vger.kernel.org>,
	"David S . Miller" <davem@davemloft.net>
Subject: Re: [PATCH v2] net: bpf: permit redirect from L3 to L2 devices at near max mtu
Date: Wed, 6 May 2020 16:55:17 -0700	[thread overview]
Message-ID: <20200506165517.140d39ac@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com> (raw)
In-Reply-To: <20200506233259.112545-1-zenczykowski@gmail.com>

On Wed,  6 May 2020 16:32:59 -0700 Maciej Żenczykowski wrote:
> From: Maciej Żenczykowski <maze@google.com>
> 
> __bpf_skb_max_len(skb) is used from:
>   bpf_skb_adjust_room
>   __bpf_skb_change_tail
>   __bpf_skb_change_head
> 
> but in the case of forwarding we're likely calling these functions
> during receive processing on ingress and bpf_redirect()'ing at
> a later point in time to egress on another interface, thus these
> mtu checks are for the wrong device (input instead of output).
> 
> This is particularly problematic if we're receiving on an L3 1500 mtu
> cellular interface, trying to add an L2 header and forwarding to
> an L3 mtu 1500 mtu wifi/ethernet device (which is thus L2 1514).
> 
> The mtu check prevents us from adding the 14 byte ethernet header prior
> to forwarding the packet.
> 
> After the packet has already been redirected, we'd need to add
> an additional 2nd ebpf program on the target device's egress tc hook,
> but then we'd also see non-redirected traffic and have no easy
> way to tell apart normal egress with ethernet header packets
> from forwarded ethernet headerless packets.
> 
> Credits to Alexei Starovoitov for the suggestion on how to 'fix' this.
> 
> This probably should be further fixed up *somehow*, just
> not at all clear how.  This does at least make things work.
> 
> Cc: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Maciej Żenczykowski <maze@google.com>
> ---
>  net/core/filter.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 7d6ceaa54d21..811aba77e24b 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3159,8 +3159,20 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
>  
>  static u32 __bpf_skb_max_len(const struct sk_buff *skb)
>  {
> -	return skb->dev ? skb->dev->mtu + skb->dev->hard_header_len :
> -			  SKB_MAX_ALLOC;
> +	if (skb->dev) {
> +		unsigned short header_len = skb->dev->hard_header_len;
> +
> +		/* HACK: Treat 0 as ETH_HLEN to allow redirect while
> +		 * adding ethernet header from an L3 (tun, rawip, cellular)
> +		 * to an L2 device (tap, ethernet, wifi)
> +		 */
> +		if (!header_len)
> +			header_len = ETH_HLEN;
> +
> +		return skb->dev->mtu + header_len;
> +	} else {
> +		return SKB_MAX_ALLOC;
> +	}
>  }
>  
>  BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,

I thought we have established that checking device MTU (m*T*u) 
at ingress makes a very limited amount of sense, no?

Shooting from the hip here, but won't something like:

    if (!skb->dev || skb->tc_at_ingress)
        return SKB_MAX_ALLOC;
    return skb->dev->mtu + skb->dev->hard_header_len;

Solve your problem?

  reply	other threads:[~2020-05-06 23:55 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-20 23:14 [PATCH] [RFC] net: bpf: make __bpf_skb_max_len(skb) an skb-independent constant Maciej Żenczykowski
2020-04-20 23:26 ` Maciej Żenczykowski
2020-04-21 17:27 ` Jakub Kicinski
2020-04-21 20:36   ` Maciej Żenczykowski
2020-04-28 17:53     ` Alexei Starovoitov
2020-05-06 23:32 ` [PATCH v2] net: bpf: permit redirect from L3 to L2 devices at near max mtu Maciej Żenczykowski
2020-05-06 23:55   ` Jakub Kicinski [this message]
2020-05-07  0:47     ` Maciej Żenczykowski
2020-05-07  2:32       ` Maciej Żenczykowski
2020-05-07  2:36         ` [PATCH v3] net: bpf: permit redirect from ingress L3 to egress " Maciej Żenczykowski
2020-05-07 15:54           ` Daniel Borkmann
2020-05-07 16:46             ` Maciej Żenczykowski
2020-05-07 21:05               ` Daniel Borkmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200506165517.140d39ac@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com \
    --to=kuba@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maze@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=zenczykowski@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).