bpf.vger.kernel.org archive mirror
From: Stanislav Fomichev <sdf@fomichev.me>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	Stanislav Fomichev <sdf@google.com>,
	Network Development <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>, David Miller <davem@davemloft.net>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Simon Horman <simon.horman@netronome.com>,
	Willem de Bruijn <willemb@google.com>,
	Petar Penkov <peterpenkov96@gmail.com>
Subject: Re: [RFC bpf-next v3 6/8] flow_dissector: handle no-skb use case
Date: Tue, 26 Mar 2019 19:44:21 -0700
Message-ID: <20190327024421.GE7431@mini-arch.hsd1.ca.comcast.net>
In-Reply-To: <20190327014121.p45cblrgqgdyiu6z@ast-mbp>

On 03/26, Alexei Starovoitov wrote:
> On Tue, Mar 26, 2019 at 11:54:56AM -0700, Stanislav Fomichev wrote:
> > On 03/26, Alexei Starovoitov wrote:
> > > On Tue, Mar 26, 2019 at 11:17:19AM -0700, Stanislav Fomichev wrote:
> > > > On 03/26, Alexei Starovoitov wrote:
> > > > > On Tue, Mar 26, 2019 at 10:52 AM Willem de Bruijn
> > > > > <willemdebruijn.kernel@gmail.com> wrote:
> > > > > > The BPF flow dissector should work the same. It is fine to pass the
> > > > > > data including ethernet header, but parsing can start at nhoff with
> > > > > > proto explicitly passed.
> > > > > >
> > > > > > We should not assume Ethernet link layer.
> > > > > 
> > > > > then skb-less dissector has to be different program type
> > > > > because semantics are different.
> > > > The semantics are the same as for c-based __skb_flow_dissect.
> > > > We just need to pass nhoff and proto that has been passed to
> > > > __skb_flow_dissect to the bpf program. In case of with-skb,
> > > > take this initial data from skb, like __skb_flow_dissect does (and don't
> > > > ask BPF program to do it essentially):
> > > > 
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/tree/net/core/flow_dissector.c#n763
> > > > 
> > > > I was thinking of passing proto as flow_keys->n_proto and we already
> > > > pass flow_keys->nhoff, so no need to do anything for it. With that,
> > > > BPF program doesn't need to look into skb and can parse optional vlan
> > > > and L3+ headers. The same way __skb_flow_dissect does that.
> > > 
> > > makes sense. then I'd also prefer for proto to be in flow_keys to
> > > highlight this difference.
> > Maybe rename existing flow_keys->n_proto to flow_keys->proto?
> > That would match __skb_flow_dissect and remove ambiguity with both proto
> > and n_proto in flow_keys.
> 
> disabling useless fields in ctx is one thing, since probability of breaking users
> is low, but renaming n_proto is imo too much.
> 
> > > maybe add vlan_proto/present/tci there as well?
> > > At least on the kernel side ctx rewriter will be the same for w/ & w/o skb cases.
> > Why do you think we need them? My understanding was that when
> > skb_vlan_tag_present(skb) (or skb->vlan_present) returns true, that means
> > that vlan info has been already parsed out of the packet and stored in
> > the vlan_tci/vlan_proto (where vlan_proto is 8021Q/8021AD); skb data
> > points to proper L3 header.
> > 
> > If that's correct, BPF flow dissector should not care about that. For
> > example, look at how C-based flow dissector does that:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/tree/net/core/flow_dissector.c#n944
> > 
> > If skb_vlan_tag_present(skb) returns true, we set proto to skb->protocol
> > and move on.
> > 
> > But, we would need vlan_proto/present/tci in the flow_keys in the future.
> > We don't currently return parsed vlan data from the BPF flow dissector.
> > But it feels like it's getting into bpf-next territory :-)
> 
> Whether ctx->data points to L2 or L3 is uapi regardless of whether
> progs/bpf_flow.c is relying on that or not.
> So far I think you're saying that in all three cases:
> no-skb, skb before rfs, skb after rfs ctx->data points to L2, right?
> This has to be preserved.
It points to L3 (or to the vlan header). This will be preserved; I have
no intention of changing that.

Just to make sure we are on the same page, here is what
__skb_flow_dissect (and the BPF prog) sees at nhoff.

NO-VLAN is always the same for both with-skb/no-skb:
+----+----+-----+--+
|DMAC|SMAC|PROTO|L3|
+----+----+-----+--+
                 ^
                 +-- nhoff
                     proto = PROTO

VLAN no-skb (eth_get_headlen):
+----+----+----+---+-----+--+
|DMAC|SMAC|TPID|TCI|PROTO|L3|
+----+----+----+---+-----+--+
                ^
                +-- nhoff
                    proto = TPID

VLAN with-skb, RFS (pre __netif_receive_skb_core):
+----+----+----+---+-----+--+
|DMAC|SMAC|TPID|TCI|PROTO|L3|
+----+----+----+---+-----+--+
                ^
                +-- nhoff
                    proto = TPID

VLAN with-skb, post RFS (post __netif_receive_skb_core / skb_vlan_untag):
+----+----+----+---+-----+--+
|DMAC|SMAC|TPID|TCI|PROTO|L3|
+----+----+----+---+-----+--+
                          ^
                          +-- nhoff
                              proto = PROTO

In the last case, the networking stack:
 * sets skb->vlan_present to true
 * sets skb->vlan_proto to TPID
 * sets skb->vlan_tci to TCI
 * sets skb->protocol to PROTO
 * pulls the vlan header, so skb->data points to the L3 header
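
To make the four diagrams concrete, here is a rough userspace sketch of
the (nhoff, proto) pair the dissector would be handed before and after
the vlan header is pulled. It is not kernel code: struct dissect_input
and both function names are made up for illustration, and offsets are
measured from the start of the Ethernet frame only to match the
diagrams above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN     14      /* DMAC(6) + SMAC(6) + EtherType/TPID(2) */
#define VLAN_HLEN    4       /* TCI(2) + encapsulated EtherType(2) */
#define ETH_P_8021Q  0x8100
#define ETH_P_8021AD 0x88A8

/* Hypothetical container for what the dissector receives on entry. */
struct dissect_input {
	size_t   nhoff;  /* offset from frame start, as in the diagrams */
	uint16_t proto;  /* host byte order, to keep the sketch simple */
};

static uint16_t load_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* no-skb (eth_get_headlen) and with-skb before the vlan header is pulled:
 * nhoff sits right after the first EtherType/TPID field. */
static struct dissect_input input_before_untag(const uint8_t *frame)
{
	assert(frame != NULL);
	struct dissect_input in = {
		.nhoff = ETH_HLEN,
		.proto = load_be16(frame + 12),
	};
	return in;
}

/* with-skb after the vlan header is pulled: a single tag has been
 * consumed, so nhoff points at L3 and proto is the inner EtherType. */
static struct dissect_input input_after_untag(const uint8_t *frame)
{
	struct dissect_input in = input_before_untag(frame);

	if (in.proto == ETH_P_8021Q || in.proto == ETH_P_8021AD) {
		in.proto = load_be16(frame + in.nhoff + 2);
		in.nhoff += VLAN_HLEN;
	}
	return in;
}
```

For an untagged frame both functions return the same pair, which is the
NO-VLAN diagram; for a tagged frame they differ by exactly one vlan
header.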

> Only now after reading bpf_flow.c for Nth time I realized what semantics
> you gave to skb->vlan* and skb->protocol fields. All of them have
> to be kept as-is.
Don't read too much into the current bpf_flow.c; I don't think it really
works with vlans in all cases :-/

It always looks back, assuming the post-RFS situation. That needs to be
changed by dropping the "if (!skb->vlan_present)" check and looking only
at the input 'proto' (and optionally parsing the vlan hdr if proto ==
802.1q/ad, which we already, sort of, do).
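
Roughly, the "just look at the input proto" approach could look like the
plain C sketch below. This is not the actual bpf_flow.c change:
skip_vlan_headers(), is_vlan_proto() and the MAX_VLAN_TAGS limit are
names and choices I made up for illustration, and proto is kept in host
byte order for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VLAN_HLEN     4       /* TCI(2) + encapsulated EtherType(2) */
#define ETH_P_8021Q   0x8100
#define ETH_P_8021AD  0x88A8
#define MAX_VLAN_TAGS 2       /* assumption: allow at most two stacked tags */

static int is_vlan_proto(uint16_t proto)
{
	return proto == ETH_P_8021Q || proto == ETH_P_8021AD;
}

/* Advance *nhoff past any vlan headers, driven only by the input proto.
 * Nothing from the skb (vlan_present, vlan_proto, vlan_tci) is consulted.
 * Returns 0 with *nhoff at L3 and *proto set to the L3 EtherType,
 * or -1 if the data is too short or carries too many tags. */
static int skip_vlan_headers(const uint8_t *data, size_t len,
			     size_t *nhoff, uint16_t *proto)
{
	assert(data && nhoff && proto);

	for (int i = 0; i < MAX_VLAN_TAGS && is_vlan_proto(*proto); i++) {
		if (*nhoff > len || len - *nhoff < VLAN_HLEN)
			return -1;
		/* The encapsulated EtherType follows the 2-byte TCI. */
		*proto = (uint16_t)((data[*nhoff + 2] << 8) | data[*nhoff + 3]);
		*nhoff += VLAN_HLEN;
	}
	return is_vlan_proto(*proto) ? -1 : 0;
}
```

With this shape the same routine covers every case in the diagrams: when
proto is already the L3 EtherType (no vlan, or vlan already pulled) the
loop body never runs, and when proto is TPID the tag is parsed from the
packet data at nhoff.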

I'm going to add a small test case for BPF_PROG_TEST_RUN.

> For no-skb cases all of them should be available with the same logic
> and it has to be documented, since it's different from other bpf progs
> that access these fields.
I'd rather drop vlan_{present,proto,tci} from the BPF flow dissector
altogether. It should not care what's in the skb and should rely only on
the input 'proto' to optionally parse the vlan header.

+1 on documenting all of that


Thread overview: 31+ messages
2019-03-22 19:58 [RFC bpf-next v3 0/8] net: flow_dissector: trigger BPF hook when called from eth_get_headlen Stanislav Fomichev
2019-03-22 19:58 ` [RFC bpf-next v3 1/8] flow_dissector: allow access only to a subset of __sk_buff fields Stanislav Fomichev
2019-03-22 19:58 ` [RFC bpf-next v3 2/8] flow_dissector: switch kernel context to struct bpf_flow_dissector Stanislav Fomichev
2019-03-22 19:58 ` [RFC bpf-next v3 3/8] flow_dissector: fix clamping of BPF flow_keys for non-zero nhoff Stanislav Fomichev
2019-03-22 19:58 ` [RFC bpf-next v3 4/8] bpf: when doing BPF_PROG_TEST_RUN for flow dissector use no-skb mode Stanislav Fomichev
2019-03-22 19:59 ` [RFC bpf-next v3 5/8] net: plumb network namespace into __skb_flow_dissect Stanislav Fomichev
2019-03-22 19:59 ` [RFC bpf-next v3 6/8] flow_dissector: handle no-skb use case Stanislav Fomichev
2019-03-23  1:00   ` Alexei Starovoitov
2019-03-23  1:19     ` Stanislav Fomichev
2019-03-23  1:41       ` Alexei Starovoitov
2019-03-23 16:05         ` Stanislav Fomichev
2019-03-26  0:35           ` Alexei Starovoitov
2019-03-26 16:45             ` Stanislav Fomichev
2019-03-26 17:48               ` Alexei Starovoitov
2019-03-26 17:51                 ` Willem de Bruijn
2019-03-26 18:08                   ` Alexei Starovoitov
2019-03-26 18:17                     ` Stanislav Fomichev
2019-03-26 18:30                       ` Alexei Starovoitov
2019-03-26 18:54                         ` Stanislav Fomichev
2019-03-27  1:41                           ` Alexei Starovoitov
2019-03-27  2:44                             ` Stanislav Fomichev [this message]
2019-03-27 17:55                               ` Alexei Starovoitov
2019-03-27 19:58                                 ` Stanislav Fomichev
2019-03-28  1:26                                   ` Alexei Starovoitov
2019-03-28  3:14                                     ` Willem de Bruijn
2019-03-28  3:32                                       ` Alexei Starovoitov
2019-03-28  4:17                                         ` Stanislav Fomichev
2019-03-28 12:58                                           ` Willem de Bruijn
2019-04-01 16:30                                             ` Stanislav Fomichev
2019-03-22 19:59 ` [RFC bpf-next v3 7/8] net: pass net argument to the eth_get_headlen Stanislav Fomichev
2019-03-22 19:59 ` [RFC bpf-next v3 8/8] selftests/bpf: add flow dissector bpf_skb_load_bytes helper test Stanislav Fomichev
