From: John Fastabend <john.fastabend@gmail.com>
To: "Alexander Lobakin" <alexandr.lobakin@intel.com>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Saeed Mahameed" <saeed@kernel.org>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	"Raczynski, Piotr" <piotr.raczynski@intel.com>,
	"Zhang, Jessica" <jessica.zhang@intel.com>,
	"Kubiak, Marcin" <marcin.kubiak@intel.com>,
	"Joseph, Jithu" <jithu.joseph@intel.com>,
	"kurt@linutronix.de" <kurt@linutronix.de>,
	"Maloor, Kishen" <kishen.maloor@intel.com>,
	"Gomes, Vinicius" <vinicius.gomes@intel.com>,
	"Brandeburg, Jesse" <jesse.brandeburg@intel.com>,
	"Swiatkowski, Michal" <michal.swiatkowski@intel.com>,
	"Plantykow, Marta A" <marta.a.plantykow@intel.com>,
	"Ong, Boon Leong" <boon.leong.ong@intel.com>,
	"Desouza, Ederson" <ederson.desouza@intel.com>,
	"Song, Yoong Siang" <yoong.siang.song@intel.com>,
	"Czapnik, Lukasz" <lukasz.czapnik@intel.com>,
	bpf@vger.kernel.org
Subject: Re: AF_XDP metadata/hints
Date: Tue, 25 May 2021 21:51:22 -0700
Message-ID: <60add3cad4ef0_3b75f2086@john-XPS-13-9370.notmuch>
In-Reply-To: <20210525142027.1432-1-alexandr.lobakin@intel.com>

Alexander Lobakin wrote:
> From: Toke Høiland-Jørgensen <toke@redhat.com>
> Date: Sun, 23 May 2021 13:54:47 +0200
> 
> > Saeed Mahameed <saeed@kernel.org> writes:
> > 
> > > On Fri, 2021-05-21 at 15:31 +0200, Jesper Dangaard Brouer wrote:
> > >> On Fri, 21 May 2021 10:53:40 +0000
> > >> "Lobakin, Alexandr" <alexandr.lobakin@intel.com> wrote:
> > >>
> > >> > I've opened two discussions at https://github.com/alobakin/linux,
> > >> > feel free to join them and/or create new ones to share your thoughts
> > >> > and concerns.
> > >>
> > >> Thanks Alexandr for keeping the thread/subject alive.
> > >>
> > >> I guess this is a new GitHub feature, "Discussions".  I've never used
> > >> that in a project before, let's see how this goes.  The usual approach
> > >> is discussions over email on netdev (Cc. netdev@vger.kernel.org).
> > >
> > > I agree we need full visibility and transparency, i actually recommend:
> > > bpf@vger.kernel.org
> > 
> > +1, please keep this on the list :)
> 
> Sure, let's keep it the classic way.
> I removed the netdev ML from the CCs and added bpf there.
> 
> Regarding the comments from GitHub discussions:
> 
> alobakin:
> 
> > Since 5.11, it's now possible to obtain a BTF not only for vmlinux,
> > but also for modules.
> > This will eliminate a need for manually composing and registering a
> > BTF inside the driver code, which is 100+ locs for ice for example.
> > 
> > That's obviously not the most straightforward and trivial way, but
> > could help a lot.
> 
> saeedtx:
> 
> > the point of registering BTF directly from the driver is to allow

There is no particular reason the BTF has to come from the driver; it
could also be generated in userspace or elsewhere. The driver is
handy because at least the driver should always have correct BTF, so
you avoid versioning to some extent.

> > "Flex metadata" meaning that meta data format can be constructed on
> > the fly according to user demand.

How is flex metadata configured? I believe this is going to need
some user tooling and a hard reset (ucode load?) in the driver to
transition the hardware state.

My original vision was to use P4 (or whatever language) to build
your necessary microcode/firmware/blob, then compile that for your
specific NIC hardware backend. That process should give you
two objects: the BTF and the blob to throw at the hardware.
Letting the driver expose the BTF over /sys/fs/btf/driver.btf
makes a lot of sense as well, but is not strictly necessary
as long as you have some way to get the BTF.
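
Whichever way the BTF is obtained (from the driver, from the P4
toolchain, or from a file), the consumer can at least sanity-check the
image before trusting it. A minimal sketch follows, assuming a
native-endian image. The header layout mirrors struct btf_header from
include/uapi/linux/btf.h; it is redefined locally under the hypothetical
name btf_hdr so the example builds without kernel headers, and
btf_blob_check() is likewise a made-up helper, not an existing API.
Current kernels expose vmlinux and module BTF under /sys/kernel/btf/;
the /sys/fs/btf/ path above is treated here as a proposal only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BTF_MAGIC 0xeB9F /* magic value used by the kernel's BTF format */

/* Local copy of the kernel's struct btf_header layout (24 bytes). */
struct btf_hdr {
	uint16_t magic;
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;
	uint32_t type_off; /* offset of type section, relative to end of header */
	uint32_t type_len;
	uint32_t str_off;  /* offset of string section, relative to end of header */
	uint32_t str_len;
};

/*
 * Hypothetical helper: returns 0 if buf looks like a well-formed BTF
 * image, or a negative value describing the first check that failed.
 */
int btf_blob_check(const void *buf, size_t len, struct btf_hdr *out)
{
	struct btf_hdr h;
	size_t body;

	if (!buf || len < sizeof(h))
		return -1; /* too short to hold a header */
	memcpy(&h, buf, sizeof(h));
	if (h.magic != BTF_MAGIC)
		return -2; /* wrong magic (or foreign endianness) */
	if (h.hdr_len < sizeof(h) || h.hdr_len > len)
		return -3; /* header length is inconsistent */

	body = len - h.hdr_len;
	if (h.type_off > body || h.type_len > body - h.type_off)
		return -4; /* type section runs past the image */
	if (h.str_off > body || h.str_len > body - h.str_off)
		return -4; /* string section runs past the image */

	if (out)
		*out = h;
	return 0;
}
```

The same check applies regardless of where the bytes came from, which
is the point: the BTF is just an object that travels alongside the
hardware blob.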

Anyway, from a design standpoint, IMO hardware configuration should be
done independently of any BPF/BTF operations.

> > BTF for modules is constructed only at compilation time and
> > registered only on module load. so there is no way to implement flex
> > metadata with vmlinux BTF. we still need a dynamic registration API
> > for current and future HW where the HW will provide the BTF
> > dynamically.

+1. Can we expose it in /sys/fs/btf/? That seems like the reasonable
thing to me.

> > 
> > I am sure we can find multiple ways to reduce the 100+ LOC, but the
> > goal is to have the dynamic btf_register/unregister API
> 
> We initially planned to register just one (or several) predefined
> BTF(s) per module/netdevice that would provide a full list of
> supported fields. The flexibility of metadata then is in that BPF
> core calls for netdevice's ndo_bpf() on BPF program setup and
> provides a metadata layout requested by that BPF prog to the driver,

I don't think this is the right direction. The driver should be
telling us what's supported, or we should "just" know because we
configured it. Overloading ndo_bpf() with the config step
seems unnecessarily complex. CO-RE is going to happen way before
we even get to ndo_bpf(), so trying to decide the layout this
late is likely not going to work. How would you even know what
to do with a load op?
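
To make the ordering problem concrete, here is a toy model of what a
CO-RE field relocation does. It is not libbpf code: struct meta_field,
struct field_reloc, struct toy_insn and resolve_relocs() are all
hypothetical names invented for this sketch, and the field names in the
usage note are made up as well. The loader looks up each field the
program references in the target layout (which would come from the
driver's BTF) and patches the load offset, all before the program is
loaded, let alone attached through ndo_bpf().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: one metadata field as the target BTF would describe it. */
struct meta_field {
	const char *name;
	unsigned int offset; /* byte offset inside the metadata area */
	unsigned int size;
};

/* Hypothetical relocation record: "instruction insn_idx reads field name". */
struct field_reloc {
	const char *name;
	size_t insn_idx;
};

/* Toy instruction: only the load offset matters for this sketch. */
struct toy_insn {
	int off;
};

/*
 * Patch every relocated load with the offset found in the target layout.
 * Returns 0 on success, -1 if a field is missing from the layout or a
 * relocation points outside the program. Relocations processed before
 * the failing one stay applied; a real loader would reject the program.
 */
int resolve_relocs(struct toy_insn *insns, size_t n_insns,
		   const struct field_reloc *relocs, size_t n_relocs,
		   const struct meta_field *layout, size_t n_fields)
{
	for (size_t i = 0; i < n_relocs; i++) {
		const struct meta_field *found = NULL;

		for (size_t j = 0; j < n_fields; j++) {
			if (strcmp(layout[j].name, relocs[i].name) == 0) {
				found = &layout[j];
				break;
			}
		}
		if (!found || relocs[i].insn_idx >= n_insns)
			return -1;
		insns[relocs[i].insn_idx].off = (int)found->offset;
	}
	return 0;
}
```

With a layout of { "rx_hash" at offset 0, "rx_csum" at offset 8 }, a
program that reads rx_csum gets its load patched to offset 8 at this
step. If the layout were only decided later, in ndo_bpf(), there would
be nothing to resolve against here.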

> so it could configure its hotpath (current NICs) or a hardware
> (future NICs) to build metadata accordingly.
> Driver can declare several BTFs (e.g. a "generic" one with things
> like hashes and csums and a custom one) and it would work either
> through dynamic registering or through /sys approach.

IMO the driver needs to expose one single BTF image describing what
CO-RE ops need to be done on an object.

Separate the config of hardware from the BPF infrastructure; these
are two separate things.

> 
> This is all discussable anyways, we're happy to hear different
> opinions and thoughts to collectively choose the optimal way.
> 
> > -Toke
> 
> Thanks,
> Al
