bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>, Andrii Nakryiko <andriin@fb.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Lorenz Bauer <lmb@cloudflare.com>, Andrey Ignatov <rdna@fb.com>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next 1/4] xdp: Support specifying expected existing program when attaching XDP
Date: Sun, 29 Mar 2020 12:39:21 +0200	[thread overview]
Message-ID: <87369rla1y.fsf@toke.dk> (raw)
In-Reply-To: <20200328233546.7ayswtraepw3ia2x@ast-mbp>

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Sat, Mar 28, 2020 at 08:34:12PM +0100, Toke Høiland-Jørgensen wrote:
>> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>> 
>> > On Sat, Mar 28, 2020 at 02:43:18AM +0100, Toke Høiland-Jørgensen wrote:
>> >> 
>> >> No, I was certainly not planning to use that to teach libxdp to just
>> >> nuke any bpf_link it finds attached to an interface. Quite the contrary,
>> >> the point of this series is to allow libxdp to *avoid* replacing
>> >> something on the interface that it didn't put there itself.
>> >
>> > Exactly! "that it didn't put there itself".
>> > How are you going to do that?
>> > I really hope you thought it through and came up with magic.
>> > Because I tried and couldn't figure out how to do that with IFLA_XDP*
>> > Please walk me step by step how do you think it's possible.
>> 
>> I'm inspecting the BPF program itself to make sure it's compatible.
>> Specifically, I'm embedding a piece of metadata into the program BTF,
>> using Andrii's encoding trick that we also use for defining maps. So
>> xdp-dispatcher.c contains this[0]:
>> 
>> __uint(dispatcher_version, XDP_DISPATCHER_VERSION) SEC(XDP_METADATA_SECTION);
>> 
>> and libxdp will refuse to touch any program that it finds loaded on an
>> iface which doesn't have this, or which has a version number that is
>> higher than what the library understands.
>
> so libxdp will do:
> ifindex -> id of currently attached prog -> fd -> prog_info -> btf -> read map
> -> find "dispatcher_version"
> and then it will do replace_fd with new version of the dispatcher ?
> I see how this approach helps the second set of races (from fd into "dispatcher_version")
> when another libxdp is doing the same.
> But there is still a race in query->id->fd. Much smaller though.

You mean the program can disappear before the ID can be turned into an
fd? Yeah, I guess that can happen, but that can just be treated as a
failure that triggers the retry logic.

> In that sense replace_fd is a better behaved prog replacement than
> just calling bpf_set_link_xdp_fd() without XDP_FLAGS_UPDATE_IF_NOEXIST.
> But not much. The libxdp doesn't own the attachment.
> If replace_fd fails what libxdp is going to do?
> Try the whole thing from the beginning?
> ifindex -> id2 -> fd2 ...

Yes, this is predicated on a "retry on failure" logic.

> Say it succeeded.
> But the libxdp1 that won the first race has no clue that libxdp2
> retried and there is a different dispatcher prog there.
> So you'll add netlink notifiers for libxdp to watch ?

No, the idea is that the dispatchers are compatible. So app1 installs
dispatcher1 with sequence (prog1), then app2 installs dispatcher2 with
sequence (prog1,prog2) - or (prog2,prog1) depending on ordering.

> That would mean that some user space process has to be always running
> while typical firewall doesn't need any user space. The firewall.rpm can 
> install its prog with all firewall rules, permanently link it to
> the interface and exit.
> But let's continue. So single libxdp daemon is now waiting for notifications
> or both libxdp1 and libxdp2 that are part of two firewalls that are
> being 'yum installed' are waiting for notifications?
> How fight between libxdp1 and libxdp2 to install what they want going
> to be resolved?
> If their versions are the same I think they will settle quickly
> since both libraries will see dispatcher prog with expected version number, right?
> What if versions are different? Older libxdp or newer libxdp suppose to give up?
> If libxdp2 is newer it will still be able to use older dispatcher prog
> that was installed by libxdp1, but it would need to disable all new
> user facing library features?

It will depend on what changes between versions, I guess. But yeah, I
don't think we can completely rule out that a "compatibility mode" may
be necessary at some point. This is orthogonal to how the programs are
being attached, though.

> I guess all that is acceptable behavior to some libxdp users.

I believe so.

>> > I'm saying that without bpf_link for xdp libxdp has no ability to
>> > identify an attachment that is theirs.
>> 
>> Ah, so *that* was what you meant with "unique attachment". It never
>> occurred to me that answering this question ("is it my program?") was to
>> be a feature of bpf_link; I always assumed that would be a property of
>> the bpf_prog itself.
>> 
>> Any reason what I'm describing above wouldn't work for you?
>
> I don't see how this is even apples to apples comparison.
> Racy query via id with sort-of "atomic" replacement and no ownership
> vs guaranteed attachment with exact ownership and no races.

No, I guess in your "management daemon" case the kernel-enforced
exclusivity does come in handy. And as I said, I can live with there
being two APIs as long as there's a reasonable way to override the
bpf_link "lock" :)

>> > I see two ways out of this stalemate:
>> > 1. assume that replace_fd extension landed and develop libxdp further
>> >    into fully fledged library. May be not a complete library, but at least
>> >    for few more weeks. If then you still think replace_fd is enough
>> >    I'll land it.
>> > 2. I can land replace_fd now, but please don't be surprised that
>> >    I will revert it several weeks from now when it's clear that
>> >    it's not enough.
>> >  
>> > Which one do you prefer?
>> 
>> I prefer 2. Reverting if it does turn out that I'm wrong is fine. Heck,
>> in that case I'll even send the revert myself :)
>
> Ok. Applied.

Great, thanks!

-Toke


  reply	other threads:[~2020-03-29 10:39 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-19 13:13 [PATCH bpf-next 0/4] XDP: Support atomic replacement of XDP interface attachments Toke Høiland-Jørgensen
2020-03-19 13:13 ` [PATCH bpf-next 1/4] xdp: Support specifying expected existing program when attaching XDP Toke Høiland-Jørgensen
2020-03-19 22:52   ` Jakub Kicinski
2020-03-20  8:48     ` Toke Høiland-Jørgensen
2020-03-20 17:35       ` Jakub Kicinski
2020-03-20 18:17         ` Toke Høiland-Jørgensen
2020-03-20 18:35           ` Jakub Kicinski
2020-03-20 18:30         ` John Fastabend
2020-03-20 20:24           ` Andrii Nakryiko
2020-03-23 11:24             ` Toke Høiland-Jørgensen
2020-03-23 16:54               ` Jakub Kicinski
2020-03-23 18:14               ` Andrii Nakryiko
2020-03-23 19:23                 ` Toke Høiland-Jørgensen
2020-03-24  1:01                   ` David Ahern
2020-03-24  4:53                     ` Andrii Nakryiko
2020-03-24 20:55                       ` David Ahern
2020-03-24 22:56                         ` Andrii Nakryiko
2020-03-24  5:00                   ` Andrii Nakryiko
2020-03-24 10:57                     ` Toke Høiland-Jørgensen
2020-03-24 18:53                       ` Jakub Kicinski
2020-03-24 22:30                         ` Andrii Nakryiko
2020-03-25  1:25                           ` Jakub Kicinski
2020-03-24 19:22                       ` John Fastabend
2020-03-25  1:36                         ` Alexei Starovoitov
2020-03-25  2:15                           ` Jakub Kicinski
2020-03-25 18:06                             ` Alexei Starovoitov
2020-03-25 18:20                               ` Jakub Kicinski
2020-03-25 19:14                                 ` Alexei Starovoitov
2020-03-25 10:42                           ` Toke Høiland-Jørgensen
2020-03-25 18:11                             ` Alexei Starovoitov
2020-03-25 10:30                         ` Toke Høiland-Jørgensen
2020-03-25 17:56                           ` Alexei Starovoitov
2020-03-24 22:25                       ` Andrii Nakryiko
2020-03-25  9:38                         ` Toke Høiland-Jørgensen
2020-03-25 17:55                           ` Alexei Starovoitov
2020-03-26  0:16                           ` Andrii Nakryiko
2020-03-26  5:13                             ` Jakub Kicinski
2020-03-26 18:09                               ` Andrii Nakryiko
2020-03-26 19:40                               ` Alexei Starovoitov
2020-03-26 20:05                                 ` Edward Cree
2020-03-27 11:09                                   ` Lorenz Bauer
2020-03-27 23:11                                   ` Alexei Starovoitov
2020-03-26 10:04                             ` Lorenz Bauer
2020-03-26 17:47                               ` Jakub Kicinski
2020-03-26 19:45                                 ` Alexei Starovoitov
2020-03-26 18:18                               ` Andrii Nakryiko
2020-03-26 19:53                               ` Alexei Starovoitov
2020-03-27 11:11                                 ` Toke Høiland-Jørgensen
2020-04-02 20:21                                   ` bpf: ability to attach freplace to multiple parents Alexei Starovoitov
2020-04-02 21:23                                     ` Toke Høiland-Jørgensen
2020-04-02 21:54                                       ` Alexei Starovoitov
2020-04-03  8:38                                         ` Toke Høiland-Jørgensen
2020-04-07  1:44                                           ` Alexei Starovoitov
2020-04-07  9:20                                             ` Toke Høiland-Jørgensen
2020-05-12  8:34                                         ` Toke Høiland-Jørgensen
2020-05-12  9:53                                           ` Alan Maguire
2020-05-12 13:02                                             ` Toke Høiland-Jørgensen
2020-05-12 23:18                                             ` Alexei Starovoitov
2020-05-12 23:06                                           ` Alexei Starovoitov
2020-05-13 10:25                                             ` Toke Høiland-Jørgensen
2020-04-02 21:24                                     ` Andrey Ignatov
2020-04-02 22:01                                       ` Alexei Starovoitov
2020-03-26 12:35                             ` [PATCH bpf-next 1/4] xdp: Support specifying expected existing program when attaching XDP Toke Høiland-Jørgensen
2020-03-26 19:06                               ` Andrii Nakryiko
2020-03-27 11:06                                 ` Lorenz Bauer
2020-03-27 16:12                                   ` David Ahern
2020-03-27 20:10                                     ` Andrii Nakryiko
2020-03-27 23:02                                     ` Alexei Starovoitov
2020-03-30 15:25                                       ` Edward Cree
2020-03-31  3:43                                         ` Alexei Starovoitov
2020-03-31 22:05                                           ` Edward Cree
2020-03-31 22:16                                             ` Alexei Starovoitov
2020-03-27 19:42                                   ` Andrii Nakryiko
2020-03-27 19:45                                   ` Andrii Nakryiko
2020-03-27 23:09                                   ` Alexei Starovoitov
2020-03-27 11:46                                 ` Toke Høiland-Jørgensen
2020-03-27 20:07                                   ` Andrii Nakryiko
2020-03-27 22:16                                     ` Toke Høiland-Jørgensen
2020-03-27 22:54                                       ` Andrii Nakryiko
2020-03-28  1:09                                         ` Toke Høiland-Jørgensen
2020-03-28  1:44                                           ` Andrii Nakryiko
2020-03-28 19:43                                             ` Toke Høiland-Jørgensen
2020-03-26 19:58                               ` Alexei Starovoitov
2020-03-27 12:06                                 ` Toke Høiland-Jørgensen
2020-03-27 23:00                                   ` Alexei Starovoitov
2020-03-28  1:43                                     ` Toke Høiland-Jørgensen
2020-03-28  2:26                                       ` Alexei Starovoitov
2020-03-28 19:34                                         ` Toke Høiland-Jørgensen
2020-03-28 23:35                                           ` Alexei Starovoitov
2020-03-29 10:39                                             ` Toke Høiland-Jørgensen [this message]
2020-03-29 19:26                                               ` Alexei Starovoitov
2020-03-30 10:19                                                 ` Toke Høiland-Jørgensen
2020-03-29 20:23                                           ` Andrii Nakryiko
2020-03-30 13:53                                             ` Toke Høiland-Jørgensen
2020-03-30 20:17                                               ` Andrii Nakryiko
2020-03-31 10:13                                                 ` Toke Høiland-Jørgensen
2020-03-31 13:48                                                   ` Daniel Borkmann
2020-03-31 15:00                                                     ` Toke Høiland-Jørgensen
2020-03-31 20:19                                                       ` Andrii Nakryiko
2020-03-31 20:15                                                     ` Andrii Nakryiko
2020-03-30 15:41                                             ` Edward Cree
2020-03-30 19:13                                               ` Jakub Kicinski
2020-03-31  4:01                                               ` Alexei Starovoitov
2020-03-31 11:34                                                 ` Toke Høiland-Jørgensen
2020-03-31 18:52                                                   ` Alexei Starovoitov
2020-03-20 20:30       ` Daniel Borkmann
2020-03-20 20:40         ` Daniel Borkmann
2020-03-20 21:30           ` Jakub Kicinski
2020-03-20 21:55             ` Daniel Borkmann
2020-03-20 23:35               ` Jakub Kicinski
2020-03-20 20:39       ` Andrii Nakryiko
2020-03-23 11:25         ` Toke Høiland-Jørgensen
2020-03-23 18:07           ` Andrii Nakryiko
2020-03-23 23:54           ` Andrey Ignatov
2020-03-24 10:16             ` Toke Høiland-Jørgensen
2020-03-20  2:13   ` Yonghong Song
2020-03-20  8:48     ` Toke Høiland-Jørgensen
2020-03-19 13:13 ` [PATCH bpf-next 2/4] tools: Add EXPECTED_FD-related definitions in if_link.h Toke Høiland-Jørgensen
2020-03-19 13:13 ` [PATCH bpf-next 3/4] libbpf: Add function to set link XDP fd while specifying old fd Toke Høiland-Jørgensen
2020-03-19 13:13 ` [PATCH bpf-next 4/4] selftests/bpf: Add tests for attaching XDP programs Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87369rla1y.fsf@toke.dk \
    --to=toke@redhat.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andriin@fb.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=kuba@kernel.org \
    --cc=lmb@cloudflare.com \
    --cc=netdev@vger.kernel.org \
    --cc=rdna@fb.com \
    --cc=songliubraving@fb.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).