All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dongseok Yi" <dseok.yi@samsung.com>
To: "'Willem de Bruijn'" <willemdebruijn.kernel@gmail.com>
Cc: "'Yunsheng Lin'" <linyunsheng@huawei.com>,
	"'Daniel Borkmann'" <daniel@iogearbox.net>,
	"'bpf'" <bpf@vger.kernel.org>,
	"'Alexei Starovoitov'" <ast@kernel.org>,
	"'Andrii Nakryiko'" <andrii@kernel.org>,
	"'Martin KaFai Lau'" <kafai@fb.com>,
	"'Song Liu'" <songliubraving@fb.com>,
	"'Yonghong Song'" <yhs@fb.com>,
	"'John Fastabend'" <john.fastabend@gmail.com>,
	"'KP Singh'" <kpsingh@kernel.org>,
	"'David S. Miller'" <davem@davemloft.net>,
	"'Jakub Kicinski'" <kuba@kernel.org>,
	"'Network Development'" <netdev@vger.kernel.org>,
	"'linux-kernel'" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH bpf] bpf: check for data_len before upgrading mss when 6 to 4
Date: Mon, 10 May 2021 11:22:36 +0900	[thread overview]
Message-ID: <00c901d74543$57fa3620$07eea260$@samsung.com> (raw)
In-Reply-To: <CAF=yD-L9pxAFoT+c1Xk5YS42ZaJ+YLVQVnV+fvtqn-gLxq9ENg@mail.gmail.com>

On Fri, May 07, 2021 at 09:50:03AM -0400, Willem de Bruijn wrote:
> On Fri, May 7, 2021 at 4:25 AM Dongseok Yi <dseok.yi@samsung.com> wrote:
> >
> > On Thu, May 06, 2021 at 09:53:45PM -0400, Willem de Bruijn wrote:
> > > On Thu, May 6, 2021 at 9:45 PM Yunsheng Lin <linyunsheng@huawei.com> wrote:
> > > >
> > > > On 2021/5/7 9:25, Willem de Bruijn wrote:
> > > > >>>> head_skb's data_len is the sum of skb_gro_len for each skb of the frags.
> > > > >>>> data_len could be 8 if server sent a small size packet and it is GROed
> > > > >>>> to head_skb.
> > > > >>>>
> > > > >>>> Please let me know if I am missing something.
> > > > >>>
> > > > >>> This is my understanding of the data path. This is a forwarding path
> > > > >>> for TCP traffic.
> > > > >>>
> > > > >>> GRO is enabled and will coalesce multiple segments into a single large
> > > > >>> packet. In bad cases, the coalesced packet payload is > MSS, but < MSS
> > > > >>> + 20.
> > > > >>>
> > > > >>> Somewhere between GRO and GSO you have a BPF program that converts the
> > > > >>> IPv6 address to IPv4.
> > > > >>
> > > > >> Your understanding is right. The data path is GRO -> BPF 6 to 4 ->
> > > > >> GSO.
> > > > >>
> > > > >>>
> > > > >>> There is no concept of head_skb at the time of this BPF program. It is
> > > > >>> a single SKB, with an skb linear part and multiple data items in the
> > > > >>> frags (no frag_list).
> > > > >>
> > > > >> Sorry for the confusion. head_skb what I mentioned was a skb linear
> > > > >> part. I'm considering a single SKB with frags too.
> > > > >>
> > > > >>>
> > > > >>> When entering the GSO stack, this single skb now has a payload length
> > > > >>> < MSS. So it would just make a valid TCP packet on its own?
> > > > >>>
> > > > >>> skb_gro_len is only relevant inside the GRO stack. It internally casts
> > > > >>> the skb->cb[] to NAPI_GRO_CB. This field is a scratch area that may be
> > > > >>> reused for other purposes later by other layers of the datapath. It is
> > > > >>> not safe to read this inside bpf_skb_proto_6_to_4.
> > > > >>
> > > > >> The condition what I made uses skb->data_len not skb_gro_len. Does
> > > > >> skb->data_len have a different meaning on each layer? As I know,
> > > > >> data_len indicates the amount of frags or frag_list. skb->data_len
> > > > >> should be > 20 in the sample case because the payload size of the skb
> > > > >> linear part is the same with mss.
> > > > >
> > > > > Ah, got it.
> > > > >
> > > > > data_len is the length of the skb minus the length in the skb linear
> > > > > section (as seen in skb_headlen).
> > > > >
> > > > > So this gso skb consists of two segments, the first one entirely
> > > > > linear, the payload of the second is in skb_shinfo(skb)->frags[0].
> > > > >
> > > > > It is not guaranteed that gso skbs built from two individual skbs end
> > > > > up looking like that. Only protocol headers in the linear segment and
> > > > > the payload of both in frags is common.
> > > > >
> > > > >> We can modify netif_needs_gso as another option to hit
> > > > >> skb_needs_linearize in validate_xmit_skb. But I think we should compare
> > > > >> skb->gso_size and skb->data_len too to check if mss exceed a payload
> > > > >> size.
> > > > >
> > > > > The rest of the stack does not build such gso packets with payload len
> > > > > < mss, so we should not have to add workarounds in the gso hot path
> > > > > for this.
> > > > >
> > > > > Also no need to linearize this skb. I think that if the bpf program
> > > > > would just clear the gso type, the packet would be sent correctly.
> > > > > Unless I'm missing something.
> > > >
> > > > Does the checksum/len field in ip and tcp/udp header need adjusting
> > > > before clearing gso type as the packet has became bigger?
> > >
> > > gro takes care of this. see for instance inet_gro_complete for updates
> > > to the ip header.
> >
> > I think clearing the gso type will get an error at tcp4_gso_segment
> > because netif_needs_gso returns true in validate_xmit_skb.
> 
> Oh right. Whether a packet is gso is defined by gso_size being
> non-zero, not by gso_type.
> 
> > >
> > > > Also, instead of testing skb->data_len, may test the skb->len?
> > > >
> > > > skb->len - (mac header + ip/ipv6 header + udp/tcp header) > mss + len_diff
> > >
> > > Yes. Essentially doing the same calculation as the gso code that is
> > > causing the packet to be dropped.
> >
> > BPF program is usually out of control. Can we take a general approach?
> > The below 2 cases has no issue when mss upgrading.
> > 1) skb->data_len > mss + 20
> > 2) skb->data_len < mss && skb->data_len > 20
> > The corner case is when
> > 3) skb->data_len > mss && skb->data_len < mss + 20
> 
> Again, you cannot use skb->data_len alone to make inferences about the
> size of the second packet.

This approach is oriented a general way that does not make inferences
about the size of the second packet.

We can obviously increase the mss size when
1) skb->data_len > mss + 20
The issue will be fixed even if we consider the #1 condition.

But there is a precondition that mss < skb payload. If skb->data_len <
mss then skb_headlen(skb) contains the size of mss. So, we can check
the #2 condition too.
2) skb->data_len < mss && skb->data_len > 20

> 
> >
> > But to cover #3 case, we should check the condition Yunsheng Lin said.
> > What if we do mss upgrading for both #1 and #2 cases only?
> >
> > +               unsigned short off_len = skb->data_len > shinfo->gso_size ?
> > +                       shinfo->gso_size : 0;
> > [...]
> >                 /* Due to IPv4 header, MSS can be upgraded. */
> > -               skb_increase_gso_size(shinfo, len_diff);
> > +               if (skb->data_len - off_len > len_diff)
> > +                       skb_increase_gso_size(shinfo, len_diff);
> 
> That generates TCP packets with different MSS within the same stream.
> 
> My suggestion remains to just not change MSS at all. But this has to
> be a new flag to avoid changing established behavior.

I don't understand why the mss size should be kept in GSO step. Will
there be any issue with different mss?

In general, upgrading mss make sense when 6 to 4. The new flag would be
set by user to not change mss. What happened if user does not set the
flag? I still think we should fix the issue with a general approach. Or
can we remove the skb_increase_gso_size line?


  reply	other threads:[~2021-05-10  2:22 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20210429102143epcas2p4c8747c09a9de28f003c20389c050394a@epcas2p4.samsung.com>
2021-04-29 10:08 ` [PATCH bpf] bpf: check for data_len before upgrading mss when 6 to 4 Dongseok Yi
2021-05-05 20:55   ` Daniel Borkmann
2021-05-06  0:45     ` Dongseok Yi
2021-05-06  1:45       ` Willem de Bruijn
2021-05-06  2:27         ` Dongseok Yi
2021-05-06 18:21           ` Willem de Bruijn
2021-05-07  0:53             ` Dongseok Yi
2021-05-07  1:25               ` Willem de Bruijn
2021-05-07  1:45                 ` Yunsheng Lin
2021-05-07  1:53                   ` Willem de Bruijn
2021-05-07  8:25                     ` Dongseok Yi
2021-05-07  9:11                       ` Yunsheng Lin
2021-05-07 10:36                         ` Dongseok Yi
2021-05-07 13:50                       ` Willem de Bruijn
2021-05-10  2:22                         ` Dongseok Yi [this message]
2021-05-10 13:19                           ` Willem de Bruijn
2021-05-10 13:46                             ` Willem de Bruijn
2021-05-11  1:11                               ` Dongseok Yi
2021-05-11 17:38                                 ` Willem de Bruijn
2021-05-12  0:45                                   ` Dongseok Yi
     [not found]   ` <CGME20210511065056epcas2p1788505019deb274f5c57650a2f5d7ef0@epcas2p1.samsung.com>
2021-05-11  6:36     ` [PATCH bpf v2] bpf: check BPF_F_ADJ_ROOM_FIXED_GSO when upgrading mss in " Dongseok Yi
2021-05-11 17:42       ` Willem de Bruijn
2021-05-12  6:56         ` Dongseok Yi
     [not found]       ` <CGME20210512074058epcas2p35536c27bdfafaa6431e164c142007f96@epcas2p3.samsung.com>
2021-05-12  7:27         ` [PATCH bpf-next v3] bpf: check for BPF_F_ADJ_ROOM_FIXED_GSO when bpf_skb_change_proto Dongseok Yi
2021-05-12 14:13           ` Willem de Bruijn
2021-05-18 20:10           ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='00c901d74543$57fa3620$07eea260$@samsung.com' \
    --to=dseok.yi@samsung.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linyunsheng@huawei.com \
    --cc=netdev@vger.kernel.org \
    --cc=songliubraving@fb.com \
    --cc=willemdebruijn.kernel@gmail.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.