All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
	John Fastabend <john.fastabend@gmail.com>
Cc: "Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Björn Töpel" <bjorn.topel@intel.com>,
	daniel@iogearbox.net, ast@kernel.org, bpf@vger.kernel.org,
	netdev@vger.kernel.org, andrii@kernel.org,
	magnus.karlsson@intel.com, ciara.loftus@intel.com
Subject: Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
Date: Tue, 16 Feb 2021 13:17:25 -0800	[thread overview]
Message-ID: <602c3665b1f61_76b702083a@john-XPS-13-9370.notmuch> (raw)
In-Reply-To: <20210216205021.GC17126@ranger.igk.intel.com>

Maciej Fijalkowski wrote:
> On Tue, Feb 16, 2021 at 11:15:41AM -0800, John Fastabend wrote:
> > Toke Høiland-Jørgensen wrote:
> > > Björn Töpel <bjorn.topel@intel.com> writes:
> > > 
> > > > On 2021-02-15 21:49, John Fastabend wrote:
> > > >> Maciej Fijalkowski wrote:
> > > >>> Currently, if there are multiple xdpsock instances running on a single
> > > >>> interface and in case one of the instances is terminated, the rest of
> > > >>> them are left in an inoperable state due to the fact of unloaded XDP
> > > >>> prog from interface.
> > > >>>
> > > >>> To address that, step away from setting bpf prog in favour of bpf_link.
> > > >>> This means that refcounting of BPF resources will be done automatically
> > > >>> by bpf_link itself.
> > > >>>
> > > >>> When setting up BPF resources during xsk socket creation, check whether
> > > >>> bpf_link for a given ifindex already exists via set of calls to
> > > >>> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
> > > >>> and comparing the ifindexes from bpf_link and xsk socket.
> > > >>>
> > > >>> If there's no bpf_link yet, create one for a given XDP prog and unload
> > > >>> explicitly existing prog if XDP_FLAGS_UPDATE_IF_NOEXIST is not set.
> > > >>>
> > > >>> If bpf_link is already at a given ifindex and underlying program is not
> > > >>> AF-XDP one, bail out or update the bpf_link's prog given the presence of
> > > >>> XDP_FLAGS_UPDATE_IF_NOEXIST.
> > > >>>
> > > >>> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

[...]

> > 
> > I'm not a real user of AF_XDP (yet.), but here is how I would expect it
> > to work based on how the sockmap pieces work, which are somewhat similar
> > given they also deal with sockets.
> 
> Don't want to be picky, but I suppose sockmap won't play with ndo_bpf()
> and that's what was bothering us.
> 
> > 
> > Program 
> > (1) load and pin an XDP BPF program
> >     - obj = bpf_object__open(prog);
> >     - bpf_object__load_xattr(&attr);
> >     - bpf_program__pin()
> > (2) pin the map, find map_xsk using any of the map APIs
> >     - bpf_map__pin(map_xsk, path_to_pin)
> > (3) attach to XDP
> >     - link = bpf_program__attach_xdp()
> >     - bpf_link__pin()
> > 
> > At this point you have a BPF program loaded, a xsk map, and a link all
> > pinned and ready. And we can add socks using the process found in
> > `enter_xsks_into_map` in the sample. This can be the same program that
> > loaded/pinned the XDP program or some other program it doesn't really
> > matter.
> > 
> >  - create xsk fd
> >       . xsk_umem__create()
> >       . xsk_socket__create
> >  - open map @ pinned path
> >  - bpf_map_update_elem(xsks_map, &key, &fd, 0);
> > 
> > Then it looks like we don't have any conflicts? The XDP program is pinned
> > and exists in its normal scope. The xsk elements can be added/deleted
> > as normal.
> 
> The only difference from what you wrote up is the resource pinning, when
> compared to what we currently have + the set I am proposing.
> 
> So, if you're saying it looks like we don't have any conflicts and I am
> saying that the flow is what we have, then...? :)
> 
> You would have to ask Magnus what was behind the decision to avoid API
> from tools/lib/bpf/libbpf.c but rather directly call underlying functions
> from tools/lib/bpf/bpf.c. Nevertheless, it doesn't really make a big
> difference to me.
> 
> > If the XDP program is removed and the map referencing (using
> > normal ref rules) reaches zero its also deleted. Above is more or less
> > the same flow we use for any BPF program so looks good to me.
> 
> We have to be a bit more specific around the "XDP program is removed".
> Is it removed from the network interface? That's the thing that we want to
> avoid when there are other xsk sockets active on a given interface.

What I'm suggesting here is to use the normal rules for when an
XDP program is removed from the network interface. I don't think we
should do anything extra to keep the XDP program attached/loaded
just because it has a xsk map which may or may not have some
entries in it. If the user wants this behavior they should pin
the bpf_link pointer associated with the xdp program. This is the
same as any other program/map I have in my BPF system.

> 
> With bpf_link, XDP prog is removed only when bpf_link's refcount reaches
> zero, via link->ops->release() callback that is invoked from
> bpf_link_put().
> 
> And the refcount is updated with each process that attaches/detaches from
> the bpf_link on interface.
> 
> > 
> > The trouble seems to pop up when using the higher abstraction APIs
> > xsk_setup_xdp_prog and friends I guess? I just see above as already
> > fairly easy to use and we have good helpers to create the sockets it looks
> > like. Maybe I missed some design considerations. IMO higher level
> > abstractions should go in new libxdp and above should stay in libbpf.
> 
> xsk_setup_xdp_prog doesn't really feel like higher level abstraction, as I
> mentioned, to me it has one layer of abstraction peeled off.

Except it seems to have caused some issues. I don't think I gain
much from the API personally vs just doing above steps. But, I
appreciate you are just trying to fix it here so patches are
a good idea with v2 improvements. And I expect whenever
libbpf/libxdp split the above "high level" APIs will land in
libxdp.

Thanks,
John

> 
> > 
> > /rant off ;)
> > 
> > Thanks,
> > John



  reply	other threads:[~2021-02-16 21:18 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-15 15:46 [PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk Maciej Fijalkowski
2021-02-15 15:46 ` [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link Maciej Fijalkowski
2021-02-15 17:07   ` Toke Høiland-Jørgensen
2021-02-15 17:38     ` Björn Töpel
2021-02-15 19:35       ` Toke Høiland-Jørgensen
2021-02-16  2:01         ` Maciej Fijalkowski
2021-02-16  9:15           ` Björn Töpel
2021-02-16 10:27           ` Toke Høiland-Jørgensen
2021-02-16 20:15             ` Maciej Fijalkowski
2021-02-15 20:22       ` John Fastabend
2021-02-15 21:38         ` Toke Høiland-Jørgensen
2021-02-16  0:18           ` John Fastabend
2021-02-16  2:23             ` Maciej Fijalkowski
2021-02-16  9:23               ` Björn Töpel
2021-02-16 10:36             ` Toke Høiland-Jørgensen
2021-02-23  1:15               ` Andrii Nakryiko
2021-02-17  2:23           ` Dan Siemon
2021-02-17  7:16             ` Magnus Karlsson
2021-02-17  7:36               ` Magnus Karlsson
2021-02-16  2:10         ` Maciej Fijalkowski
2021-02-15 20:49   ` John Fastabend
2021-02-16  2:38     ` Maciej Fijalkowski
2021-02-16 18:19       ` John Fastabend
2021-02-16 20:10         ` Maciej Fijalkowski
2021-02-16  9:20     ` Björn Töpel
2021-02-16 10:39       ` Toke Høiland-Jørgensen
2021-02-16 19:15         ` John Fastabend
2021-02-16 20:50           ` Maciej Fijalkowski
2021-02-16 21:17             ` John Fastabend [this message]
2021-02-15 15:46 ` [PATCH bpf-next 2/3] libbpf: clear map_info before each bpf_obj_get_info_by_fd Maciej Fijalkowski
2021-02-15 20:33   ` John Fastabend
2021-02-16  2:42     ` Maciej Fijalkowski
2021-02-15 15:46 ` [PATCH bpf-next 3/3] samples: bpf: do not unload prog within xdpsock Maciej Fijalkowski
2021-02-15 20:24   ` John Fastabend
2021-02-16  9:22     ` Björn Töpel
2021-02-16 14:15       ` Maciej Fijalkowski
2021-02-15 16:07 ` [PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk Björn Töpel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=602c3665b1f61_76b702083a@john-XPS-13-9370.notmuch \
    --to=john.fastabend@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bjorn.topel@intel.com \
    --cc=bpf@vger.kernel.org \
    --cc=ciara.loftus@intel.com \
    --cc=daniel@iogearbox.net \
    --cc=maciej.fijalkowski@intel.com \
    --cc=magnus.karlsson@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=toke@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.