From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Petar Penkov <peterpenkov96@gmail.com>,
	Network Development <netdev@vger.kernel.org>,
	David Miller <davem@davemloft.net>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	simon.horman@netronome.com, ecree@solarflare.com,
	songliubraving@fb.com, Tom Herbert <tom@herbertland.com>,
	Petar Penkov <ppenkov@google.com>,
	Willem de Bruijn <willemb@google.com>
Subject: Re: [bpf-next, v2 1/3] flow_dissector: implements flow dissector BPF hook
Date: Wed, 12 Sep 2018 14:43:37 -0400	[thread overview]
Message-ID: <CAF=yD-LOvyj9t1O7MpqfXE7eYKpFZn96Gx144vmkFgppZe2UbQ@mail.gmail.com> (raw)
In-Reply-To: <20180912034632.bkxpktzvze242rvr@ast-mbp>

On Tue, Sep 11, 2018 at 11:47 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Sep 07, 2018 at 05:11:08PM -0700, Petar Penkov wrote:
> > From: Petar Penkov <ppenkov@google.com>
> >
> > Adds a hook for programs of type BPF_PROG_TYPE_FLOW_DISSECTOR and
> > attach type BPF_FLOW_DISSECTOR that is executed in the flow dissector
> > path. The BPF program is per-network namespace.
> >
> > Signed-off-by: Petar Penkov <ppenkov@google.com>
> > Signed-off-by: Willem de Bruijn <willemb@google.com>
> > ---
> >  include/linux/bpf.h            |   1 +
> >  include/linux/bpf_types.h      |   1 +
> >  include/linux/skbuff.h         |   7 ++
> >  include/net/net_namespace.h    |   3 +
> >  include/net/sch_generic.h      |  12 ++-
> >  include/uapi/linux/bpf.h       |  25 ++++++
> >  kernel/bpf/syscall.c           |   8 ++
> >  kernel/bpf/verifier.c          |  32 ++++++++
> >  net/core/filter.c              |  67 ++++++++++++++++
> >  net/core/flow_dissector.c      | 136 +++++++++++++++++++++++++++++++++
> >  tools/bpf/bpftool/prog.c       |   1 +
> >  tools/include/uapi/linux/bpf.h |  25 ++++++
> >  tools/lib/bpf/libbpf.c         |   2 +
>
> please split the update to tools/include/uapi/linux/bpf.h into a separate patch 2.
> We often have conflicts in there, so it's best to keep it separate.
> Also please split the tools/lib and tools/bpf changes into patch 3.

Will do in v3.

> >  struct qdisc_skb_cb {
> > -     unsigned int            pkt_len;
> > -     u16                     slave_dev_queue_mapping;
> > -     u16                     tc_classid;
> > +     union {
> > +             struct {
> > +                     unsigned int            pkt_len;
> > +                     u16                     slave_dev_queue_mapping;
> > +                     u16                     tc_classid;
> > +             };
> > +             struct bpf_flow_keys *flow_keys;
> > +     };
>
> is this magic really necessary? flow_dissector runs very early in recv path.
> There is no qdisc or conflicts with tcp/ip use of cb.
> I think the whole cb block can be used.

The flow dissector also runs in the context of TC, from flower.
But that is not a reason to use this struct.

We need both (a) data shared with the BPF application and between
applications using tail-calls, to pass along the parse offset (nhoff),
and (b) data not accessible by the program, to store the flow_keys
pointer.

qdisc_skb_cb already has this split, exposing only the 20B .data
field to the application. Flow dissection currently reuses the existing
bpf_convert_ctx_access logic for this field.

We could create a separate flowdissect_skb_cb struct with the
same split setup, but a second constraint is that relevant internal
BPF interfaces already expect qdisc_skb_cb, notably
bpf_skb_data_end. So the union was easier.

There is another way to pass nhoff besides cb[] (see below). But
I don't immediately see another place to store the flow_keys ptr.

At least, that holds as long as we pass the skb as context. A larger
change would be to introduce a separate ctx struct, similar to
sk_reuseport_(kern|md).

> > @@ -2333,6 +2335,7 @@ struct __sk_buff {
> >       /* ... here. */
> >
> >       __u32 data_meta;
> > +     __u32 flow_keys;
>
> please use
> struct bpf_flow_keys *flow_keys;
> instead.
>
> See what we did in 'struct sk_msg_md' and in 'struct sk_reuseport_md'.
> There is no need to hide pointers in u32.
>

Will do in v3.

> > @@ -658,6 +754,46 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
> >                                             FLOW_DISSECTOR_KEY_BASIC,
> >                                             target_container);
> >
> > +     rcu_read_lock();
> > +     attached = skb ? rcu_dereference(dev_net(skb->dev)->flow_dissector_prog)
> > +                    : NULL;
> > +     if (attached) {
> > +             /* Note that even though the const qualifier is discarded
> > +              * throughout the execution of the BPF program, all changes
> > +              * (to the control block) are reverted after the BPF program
> > +              * returns. Therefore, __skb_flow_dissect does not alter the skb.
> > +              */
> > +             struct bpf_flow_keys flow_keys = {};
> > +             struct qdisc_skb_cb cb_saved;
> > +             struct qdisc_skb_cb *cb;
> > +             u16 *pseudo_cb;
> > +             u32 result;
> > +
> > +             cb = qdisc_skb_cb(skb);
> > +             pseudo_cb = (u16 *)bpf_skb_cb((struct sk_buff *)skb);
> > +
> > +             /* Save Control Block */
> > +             memcpy(&cb_saved, cb, sizeof(cb_saved));
> > +             memset(cb, 0, sizeof(cb_saved));
> > +
> > +             /* Pass parameters to the BPF program */
> > +             cb->flow_keys = &flow_keys;
> > +             *pseudo_cb = nhoff;
>
> I don't understand this bit.
> What is this pseudo_cb and why nhoff goes in there?
> Some odd way to pass it into the prog?

Yes. nhoff passes the offset to the program to start parsing from.
Applications also pass this during tail calls.

Alternatively we can just add a new field to struct bpf_flow_keys.

Thread overview: 8+ messages
2018-09-08  0:11 [bpf-next, v2 0/3] Introduce eBPF flow dissector Petar Penkov
2018-09-08  0:11 ` [bpf-next, v2 1/3] flow_dissector: implements flow dissector BPF hook Petar Penkov
2018-09-12  3:46   ` Alexei Starovoitov
2018-09-12 18:43     ` Willem de Bruijn [this message]
2018-09-12 22:25       ` Alexei Starovoitov
2018-09-12 22:53         ` Willem de Bruijn
2018-09-08  0:11 ` [bpf-next, v2 2/3] flow_dissector: implements eBPF parser Petar Penkov
2018-09-08  0:11 ` [bpf-next, v2 3/3] selftests/bpf: test bpf flow dissection Petar Penkov
