bpf.vger.kernel.org archive mirror
From: Jamal Hadi Salim <jhs@mojatatu.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: "Tom Herbert" <tom@sipanda.io>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Singhai, Anjali" <anjali.singhai@intel.com>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Linux Kernel Network Developers" <netdev@vger.kernel.org>,
	"Chatterjee, Deb" <deb.chatterjee@intel.com>,
	"Limaye, Namrata" <namrata.limaye@intel.com>,
	"Marcelo Ricardo Leitner" <mleitner@redhat.com>,
	"Shirshyad, Mahesh" <Mahesh.Shirshyad@amd.com>,
	"Jain, Vipin" <Vipin.Jain@amd.com>,
	"Osinski, Tomasz" <tomasz.osinski@intel.com>,
	"Jiri Pirko" <jiri@resnulli.us>,
	"Cong Wang" <xiyou.wangcong@gmail.com>,
	"David S . Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Vlad Buslov" <vladbu@nvidia.com>,
	"Simon Horman" <horms@kernel.org>,
	"Khalid Manaa" <khalidm@nvidia.com>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Victor Nogueira" <victor@mojatatu.com>,
	"Tammela, Pedro" <pctammela@mojatatu.com>,
	"Daly, Dan" <dan.daly@intel.com>,
	"Andy Fingerhut" <andy.fingerhut@gmail.com>,
	"Sommers, Chris" <chris.sommers@keysight.com>,
	"Matty Kadosh" <mattyk@nvidia.com>, bpf <bpf@vger.kernel.org>
Subject: Re: Hardware Offload discussion WAS(Re: [PATCH net-next v12 00/15] Introducing P4TC (series 1)
Date: Sun, 3 Mar 2024 12:00:10 -0500
Message-ID: <CAM0EoMncuPvUsRwE+Ajojgg-8JD+1oJ7j2Rw+7oN60MjjAHV-g@mail.gmail.com>
In-Reply-To: <20240302192747.371684fb@kernel.org>

On Sat, Mar 2, 2024 at 10:27 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sat, 2 Mar 2024 09:36:53 -0500 Jamal Hadi Salim wrote:
> > 2) Your point on:  "integrate later", or at least "fill in the gaps"
> > This part i am probably going to mumble on. I am going to consider
> > more than just doing ACLs/MAT via flower/u32 for the sake of
> > discussion.
> > True, "fill the gaps" has been our model so far. It requires kernel
> > changes, user space code changes etc justifiably so because most of
> > the time such datapaths are subject to standardization via IETF, IEEE,
> > etc and new extensions come in on a regular basis.  And sometimes we
> > do add features that one or two users or a single vendor has need for
> > at the cost of kernel and user/control extension. Given our work
> > process, any features added this way take a long time to make it to
> > the end user.
>
> What I had in mind was more of a DDP model. The device loads its binary
> blob FW in whatever way it does, then it tells the kernel its parser
> graph, and tables. The kernel exposes those tables to user space.
> All dynamic, no need to change the kernel for each new protocol.
>
> But that's different in two ways:
>  1. the device tells kernel the tables, no "dynamic reprogramming"
>  2. you don't need the SW side, the only use of the API is to interact
>     with the device
>
> User can still do BPF kfuncs to look up in the tables (like in FIB),
> but call them from cls_bpf.
>

This is not far off from what is envisioned in the discussions today.
The main issue is who loads the binary. We went from devlink to the
filter doing the loading; DDP uses ethtool. We still need to tie a PCI
device/tc block to the "program" so that skip_sw works. That means a
device capable of handling multiple programs can have multiple blobs
loaded. A "program" is mapped to a tc filter, and MAT control works the
same way as it does today (netlink/tc ndo).

A program in P4 has a name and an ID, and people have been suggesting a
sha1 identity as well (or a signature of some kind generated by the
compiler). So the upward propagation could be tied to discovering this
3-tuple from the driver. The control plane then targets a program by
that tuple via netlink (as we do currently).
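
To make the 3-tuple idea concrete, here is a rough sketch in Python
(user-space pseudo-model only, not kernel code and not the P4TC netlink
API). The names ProgramIdentity, identity_for_blob and find_program are
made up for illustration, and hashing the blob with sha1 is an
assumption; the identity could just as well be a signature emitted by
the compiler.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ProgramIdentity:
    """Hypothetical (name, ID, sha1) tuple identifying a loaded P4 program."""
    name: str
    prog_id: int
    sha1: str


def identity_for_blob(name: str, prog_id: int, blob: bytes) -> ProgramIdentity:
    # Assumption: the sha1 is computed over the binary blob itself.
    return ProgramIdentity(name, prog_id, hashlib.sha1(blob).hexdigest())


def find_program(discovered: Iterable[ProgramIdentity],
                 target: ProgramIdentity) -> Optional[ProgramIdentity]:
    """Return the discovered program matching all three fields, else None."""
    for ident in discovered:
        if ident == target:
            return ident
    return None
```

The control side would only send table updates when find_program()
returns a match, i.e. when name, ID and hash all agree with what the
driver reported.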

I do note, using DDP as the sample space, that currently whatever gets
loaded is "trusted", and you really need human knowledge of the NIC's
parsing + MAT to send the control. With P4 all of that is
visible/programmable by the end user (i am not a proponent of vendors
"shipping" things or calling them for support) - so it should be
sufficient to just discover what is in the binary and send the correct
control messages down.

> I think in P4 terms that may be something more akin to only providing
> the runtime API? I seem to recall they had some distinction...

There are several solutions out there (e.g. TDI, P4Runtime) - our API
is netlink and those could be written on top of netlink; there's no
controversy there.
So the starting point is defining the datapath using P4, then using the
vendor backend to generate the binary blob and whatever constraints are
needed, and generating the eBPF datapath for the s/w equivalent.

> > At the cost of this sounding controversial, i am going
> > to call things like fdb, fib, etc which have fixed datapaths in the
> > kernel "legacy". These "legacy" datapaths almost all the time have
>
> The cynic in me sometimes thinks that the biggest problem with "legacy"
> protocols is that it's hard to make money on them :)

That's a big motivation without a doubt, but there are also people
who want to experiment with things. One of the craziest examples we
have is someone who created a P4 program for an "in-network
calculator", essentially a calculator in the datapath. You send it two
operands and an operator using custom headers; it does the math and
responds with the result in a new header. By itself this program is a
toy, but it demonstrates that if one wanted to, they could have
something custom in the hardware and/or kernel datapath.
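
For a feel of what that calculator does, here is a small Python model
of the request/response exchange. The header layout is invented for
illustration (1-byte operator, two 16-bit unsigned operands, 64-bit
signed result, all in network byte order); the actual P4 program
defines its own headers, and REQUEST, RESPONSE and handle_request are
hypothetical names.

```python
import struct

# Hypothetical request header: operator byte followed by two u16 operands.
REQUEST = struct.Struct("!cHH")
# Hypothetical response header: one signed 64-bit result.
RESPONSE = struct.Struct("!q")

OPS = {
    b"+": lambda a, b: a + b,
    b"-": lambda a, b: a - b,
    b"*": lambda a, b: a * b,
}


def handle_request(packet: bytes) -> bytes:
    """Parse the custom request header and build the response header."""
    op, a, b = REQUEST.unpack_from(packet)
    if op not in OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    return RESPONSE.pack(OPS[op](a, b))
```

A sender would build the request with REQUEST.pack(b"+", 2, 3) and
decode the reply with RESPONSE.unpack().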

cheers,
jamal
