All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: Johannes Berg <johannes@sipsolutions.net>,
	Brenden Blanco <bblanco@plumgrid.com>,
	davem@davemloft.net, netdev@vger.kernel.org, tom@herbertland.com,
	ogerlitz@mellanox.com, john.fastabend@gmail.com,
	brouer@redhat.com
Subject: Re: [RFC PATCH 4/5] mlx4: add support for fast rx drop bpf program
Date: Mon, 4 Apr 2016 18:17:59 -0700	[thread overview]
Message-ID: <20160405011757.GA80209@ast-mbp.thefacebook.com> (raw)
In-Reply-To: <5702D645.1090401@iogearbox.net>

On Mon, Apr 04, 2016 at 11:01:57PM +0200, Daniel Borkmann wrote:
> On 04/04/2016 08:46 PM, Alexei Starovoitov wrote:
> >On Mon, Apr 04, 2016 at 11:57:52AM +0200, Daniel Borkmann wrote:
> >>On 04/04/2016 09:35 AM, Johannes Berg wrote:
> >>>On Sat, 2016-04-02 at 23:38 -0700, Brenden Blanco wrote:
> >>>>
> >>>>Having a common check makes sense. The tricky thing is that the type can
> >>>>only be checked after taking the reference, and I wanted to keep the
> >>>>scope of the prog brief in the case of errors. I would have to move the
> >>>>bpf_prog_get logic into dev_change_bpf_fd and pass a bpf_prog * into the
> >>>>ndo instead. Would that API look fine to you?
> >>>
> >>>I can't really comment, I wasn't planning on using the API right now :)
> >>>
> >>>However, what else is there that the driver could possibly do with the
> >>>FD, other than getting the bpf_prog?
> >>>
> >>>>A possible extension of this is just to keep the bpf_prog * in the
> >>>>netdev itself and expose a feature flag from the driver rather than
> >>>>an ndo. But that would mean another 8 bytes in the netdev.
> >>>
> >>>That also misses the signal to the driver when the program is
> >>>set/removed, so I don't think that works. I'd argue it's not really
> >>>desirable anyway though since I wouldn't expect a majority of drivers
> >>>to start supporting this.
> >>
> >>I think ndo is probably fine for this purpose, see also my other mail. I
> >>think currently, the only really driver specific code would be to store
> >>the prog pointer somewhere and to pass needed meta data to populate the
> >>fake skb.
> >
> >yes. I think ndo is better and having bpf_prog in the driver priv
> >part is likely better as well, since driver may decide to put it into
> >their ring struct for faster fetch or layout prog pointer next to other
> >priv fields for better cache.
> >Having prog in 'struct net_device' may look very sensible right now,
> >since there is not much code around it, but later it may be causing
> >some performance headachces. I think it's better to have complete
> >freedom in the drivers and later move code to generic part.
> >Same applies to your other comment about moving mlx4_bpf_set() and
> >mlx4_call_bpf() into generic. It's better for them to be driver
> >specific in the moment. Right now we have only mlx4 anyway.
> 
> Sure, right now it's only mlx4, but we need to make sure that once this gets
> adapted/extended by others, that we won't end up with programs that can only
> be run by specific drivers e.g., due to meta data only available for this kind
> of driver but not others supporting XDP. So, some form of generic part will
> be needed in any case, also makes it easier for testing changes.

yes. if packet metadata becomes different for different drivers it will
be a major pain to write portable programs and we should strive to avoid that.
Right now it's only 'len' which obviously available everywhere and any new
field need to be argued for.
Same will apply to helper functions. In this rfc it's just sk_filter_func_proto,
which is a good set for filtering packets, but load/store bytes + csum helpers
need to be added along with packet redirect to be useful for eth2eth traffic.
Since there is no skb and there is no legacy of LD_ABS with negative offsets,
we can have direct load/store bytes, so csum operations may become instructions
as well and will be faster too. Yes. It would mean that tc+cls_bpf have to look
different in bpf assembler, but since they're written in C we can abstract
them at user space C level and can have the same program compiled as cls_bpf
using cls_bpf helpers and as bpf_phys_dev using direct load/store instructions.
Like bpf_skb_load_bytes() may become inlined memcpy with extra len check
for bpf_phys_dev prog type. All options are open.
The only thing we cannot compromise on is max performance.
If performance suffers due to generality/legacy_code/compatiblity_with_X_or_Y,
we should pick performance.

  reply	other threads:[~2016-04-05  1:18 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-02  1:21 [RFC PATCH 0/5] Add driver bpf hook for early packet drop Brenden Blanco
2016-04-02  1:21 ` [RFC PATCH 1/5] bpf: add PHYS_DEV prog type for early driver filter Brenden Blanco
2016-04-02 16:39   ` Tom Herbert
2016-04-03  7:02     ` Brenden Blanco
2016-04-04 22:07       ` Thomas Graf
2016-04-05  8:19         ` Jesper Dangaard Brouer
2016-04-04  8:49   ` Daniel Borkmann
2016-04-04 13:07     ` Jesper Dangaard Brouer
2016-04-04 13:36       ` Daniel Borkmann
2016-04-04 14:09         ` Tom Herbert
2016-04-04 15:12           ` Jesper Dangaard Brouer
2016-04-04 15:29             ` Brenden Blanco
2016-04-04 16:07               ` John Fastabend
2016-04-04 16:17                 ` Brenden Blanco
2016-04-04 20:00                   ` Alexei Starovoitov
2016-04-04 22:04                     ` Thomas Graf
2016-04-05  2:25                       ` Alexei Starovoitov
2016-04-05  8:11                         ` Jesper Dangaard Brouer
2016-04-05  9:29                     ` Jesper Dangaard Brouer
2016-04-05 22:06                       ` Alexei Starovoitov
2016-04-04 14:33       ` Eric Dumazet
2016-04-04 15:18         ` Edward Cree
2016-04-02  1:21 ` [RFC PATCH 2/5] net: add ndo to set bpf prog in adapter rx Brenden Blanco
2016-04-02  1:21 ` [RFC PATCH 3/5] rtnl: add option for setting link bpf prog Brenden Blanco
2016-04-02  1:21 ` [RFC PATCH 4/5] mlx4: add support for fast rx drop bpf program Brenden Blanco
2016-04-02  2:08   ` Eric Dumazet
2016-04-02  2:47     ` Alexei Starovoitov
2016-04-04 14:57       ` Jesper Dangaard Brouer
2016-04-04 15:22         ` Eric Dumazet
2016-04-04 18:50           ` Alexei Starovoitov
2016-04-05 14:15             ` Or Gerlitz
2016-04-06  4:05               ` Brenden Blanco
2016-04-03  6:15     ` Brenden Blanco
2016-04-05  2:20       ` Brenden Blanco
2016-04-05  2:44         ` Eric Dumazet
2016-04-05 18:59         ` Eran Ben Elisha
2016-04-02  8:23   ` Jesper Dangaard Brouer
2016-04-03  6:11     ` Brenden Blanco
2016-04-04 18:27       ` Alexei Starovoitov
2016-04-05  6:04         ` Jesper Dangaard Brouer
2016-04-02 18:40   ` Johannes Berg
2016-04-03  6:38     ` Brenden Blanco
2016-04-04  7:35       ` Johannes Berg
2016-04-04  9:57         ` Daniel Borkmann
2016-04-04 18:46           ` Alexei Starovoitov
2016-04-04 21:01             ` Daniel Borkmann
2016-04-05  1:17               ` Alexei Starovoitov [this message]
2016-04-04  8:33   ` Jesper Dangaard Brouer
2016-04-04  9:22   ` Daniel Borkmann
2016-04-02  1:21 ` [RFC PATCH 5/5] Add sample for adding simple drop program to link Brenden Blanco
2016-04-06 19:48   ` Jesper Dangaard Brouer
2016-04-06 20:01     ` Jesper Dangaard Brouer
2016-04-06 23:11       ` Alexei Starovoitov
2016-04-06 20:03     ` Daniel Borkmann
2016-04-02 16:47 ` [RFC PATCH 0/5] Add driver bpf hook for early packet drop Tom Herbert
2016-04-03  5:41   ` Brenden Blanco
2016-04-04  7:48     ` Jesper Dangaard Brouer
2016-04-04 18:10       ` Alexei Starovoitov
2016-04-02 18:41 ` Johannes Berg
2016-04-02 22:57   ` Tom Herbert
2016-04-03  2:28     ` Lorenzo Colitti
2016-04-04  7:37       ` Johannes Berg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160405011757.GA80209@ast-mbp.thefacebook.com \
    --to=alexei.starovoitov@gmail.com \
    --cc=bblanco@plumgrid.com \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=johannes@sipsolutions.net \
    --cc=john.fastabend@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=ogerlitz@mellanox.com \
    --cc=tom@herbertland.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.