From: Jakub Kicinski <kuba@kernel.org>
To: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Tom Herbert" <tom@sipanda.io>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Singhai, Anjali" <anjali.singhai@intel.com>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Linux Kernel Network Developers" <netdev@vger.kernel.org>,
	"Chatterjee, Deb" <deb.chatterjee@intel.com>,
	"Limaye, Namrata" <namrata.limaye@intel.com>,
	mleitner@redhat.com, Mahesh.Shirshyad@amd.com,
	Vipin.Jain@amd.com, "Osinski, Tomasz" <tomasz.osinski@intel.com>,
	"Jiri Pirko" <jiri@resnulli.us>,
	"Cong Wang" <xiyou.wangcong@gmail.com>,
	"David S . Miller" <davem@davemloft.net>,
	edumazet@google.com, "Vlad Buslov" <vladbu@nvidia.com>,
	horms@kernel.org, khalidm@nvidia.com,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Victor Nogueira" <victor@mojatatu.com>,
	"Tammela, Pedro" <pctammela@mojatatu.com>,
	"Daly, Dan" <dan.daly@intel.com>,
	andy.fingerhut@gmail.com,
	"Sommers, Chris" <chris.sommers@keysight.com>,
	mattyk@nvidia.com, bpf@vger.kernel.org
Subject: Re: [PATCH net-next v12 00/15] Introducing P4TC (series 1)
Date: Fri, 1 Mar 2024 17:32:14 -0800
Message-ID: <20240301173214.3d95e22b@kernel.org>
In-Reply-To: <CAM0EoM=-hzSNxOegHqhAQD7qoAR2CS3Dyh-chRB+H7C7TQzmow@mail.gmail.com>

On Fri, 1 Mar 2024 12:39:56 -0500 Jamal Hadi Salim wrote:
> On Fri, Mar 1, 2024 at 12:00 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > > Pardon my ignorance, but doesn't P4 want to be compiled to a backend
> > > target? How does going through TC make this seamless?  
> >
> > +1
> 
> I should clarify what i meant by "seamless". It means the same control
> API is used for s/w or h/w. This is a feature of tc, and is not being
> introduced by P4TC. P4 control only deals with Match-action tables -
> just as TC does.

Right, and the compiled P4 pipeline is tacked onto that API.
Loading that presumably implies a pipeline reset. There's
no precedent for loading things into TC resulting in a device
datapath reset.

> > My intuition is that for offload the device would be programmed at
> > start-of-day / probe. By loading the compiled P4 from /lib/firmware.
> > Then the _device_ tells the kernel what tables and parser graph it's
> > got.
> 
> BTW: I just want to say that these patches are about s/w - not
> offload. Someone asked about offload so as in normal discussions we
> steered in that direction. The hardware piece will require additional
> patchsets which still require discussions. I hope we dont steer off
> too much, otherwise i can start a new thread just to discuss current
> view of the h/w.
> 
> Its not the device telling the kernel what it has. Its the other way around.

Yes, I'm describing how I'd have designed it :) If it was the same
as what you've already implemented - why would I be typing it into
an email.. ? :)

> From the P4 program you generate the s/w (the ebpf code and other
> auxillary stuff) and h/w pieces using a compiler.
> You compile ebpf, etc, then load.

That part is fine.

> The current point of discussion is the hw binary is to be "activated"
> through the same tc filter that does the s/w. So one could say:
> 
> tc filter add block 22 ingress protocol all prio 1 p4 pname simple_l3 \
>    prog type hw filename "simple_l3.o" ... \
>    action bpf obj $PARSER.o section p4tc/parser \
>    action bpf obj $PROGNAME.o section p4tc/main
> 
> And that would through tc driver callbacks signal to the driver to
> find the binary possibly via  /lib/firmware
> Some of the original discussion was to use devlink for loading the
> binary - but that went nowhere.

Back to the device reset: unless the load has no impact on in-flight
traffic, the loading doesn't belong in TC, IMO. Plus you're going to
run into what IIRC was Jiri's complaint: that you're loading arbitrary
binary blobs, opaque to the kernel.

> Once you have this in place then netlink with tc skip_sw/hw. This is
> what i meant by "seamless"
> 
> > Plus, if we're talking about offloads, aren't we getting back into
> > the same controversies we had when merging OvS (not that I was around).
> > The "standalone stack to the side" problem. Some of the tables in the
> > pipeline may be for routing, not ACLs. Should they be fed from the
> > routing stack? How is that integration going to work? The parsing
> > graph feels a bit like global device configuration, not a piece of
> > functionality that should sit under sub-sub-system in the corner.  
> 
> The current (maybe i should say initial) thought is the P4 program
> does not touch the existing kernel infra such as fdb etc.

It's an off-to-the-side thing, ignoring the fact that *all* networking
devices already have parsers which would benefit from being accurately
described.

> Of course we can model the kernel datapath using P4 but you wont be
> using "ip route add..." or "bridge fdb...".
> In the future, P4 extern could be used to model existing infra and we
> should be able to use the same tooling. That is a discussion that
> comes on/off (i think it did in the last meeting).

Maybe, IDK. I thought prevailing wisdom, at least for offloads,
is to offload the existing networking stack, and fill in the gaps.
Not build a completely new implementation from scratch, and "integrate
later". Or at least "fill in the gaps" is how I like to think.

I can't quite fit together in my head how this is okay when OvS
was not allowed to add their offload API. Nor what's supposed to
be part of TC and what isn't, when you only expect to have one
filter here and create a whole new object universe inside TC.

But those are just my opinions. The way things work we may wake up one
day and find out that Dave has applied this :)
