From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Andrii Nakryiko <andriin@fb.com>, bpf <bpf@vger.kernel.org>,
	Networking <netdev@vger.kernel.org>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next 0/3] Introduce pinnable bpf_link kernel abstraction
Date: Tue, 10 Mar 2020 13:22:34 +0100
Message-ID: <87eeu0qu0l.fsf@toke.dk>
In-Reply-To: <20200309115043.17b2d6ef@kicinski-fedora-PC1C0HJN>

Jakub Kicinski <kuba@kernel.org> writes:

> On Mon, 09 Mar 2020 12:41:14 +0100 Toke Høiland-Jørgensen wrote:
>> > You said that like the library doesn't arbitrate access and manage
>> > resources.. It does exactly the same work the daemon would do.  
>> 
>> Sure, the logic is in the library, but the state (which programs are
>> loaded) and synchronisation primitives (atomic replace of attached
>> program) are provided by the kernel. 
>
> I see your point of view. The state in the kernel which the library has
> to read out every time is what I was thinking of as deserialization.

Ohh, right. I think of the BTF-embedded data as 'configuration data',
which in my mind is different from 'state'. Hence my confusion about
what you meant by state :)

> The library has to take some lock, and then read the state from the
> kernel, and then construct its internal state based on that. I think
> you have some cleverness there to stuff everything in BTF so far, but
> I'd expect if the library grows that may become cumbersome and
> wasteful (it's pinned memory after all).
>
> Parsing the packet once could be an example of something that could be
> managed by the library to avoid wasted cycles. Then programs would have
> to describe their requirements, and library may need to do rewrites of
> the bytecode.

Hmm, I've been trying to make libxdp fairly minimal in scope. It seems
like you are assuming that we'll end up with lots of additional
functionality? Do you have anything in particular in mind, or are you
talking in general terms here?

> I guess everything can be stuffed into BTF, but I'm not 100% sure
> kernel is supposed to be a database either.

I actually started out with the BTF approach because I wanted something
that could be part of the program bytecode (instead of, say, an external
config file that had to be carried along with the .o file). That it
survives a round-trip into the kernel turned out to be a nice bonus :)

I do agree with you in general terms, though: There's probably a limit
to how much stuff we can stick into this. The obvious better-suited
storage mechanism for more data is a BPF map, isn't it? I'm not sure
there's any point in moving to that before we have actual use cases for
richer state/metadata, though?

> Note that the atomic replace may not be sufficient for safe operation, as
> reading the state from the kernel is also not atomic.

Yeah, there's a potential for read-modify-write races. However, assuming
that the dispatcher program itself is not modified after initial setup
(i.e., we build a new one every time), I think this can be solved with a
"cmpxchg" operation: userspace includes the fd of the program it thinks
it is replacing, and the kernel refuses the operation if that doesn't
match what is currently attached. Do you disagree that this would be
sufficient?
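
To make the "cmpxchg" idea concrete, here is a minimal userspace sketch of
the semantics only. It is not kernel code and not an existing API: the
names (struct hook, hook_replace_prog, hook_current_prog) are hypothetical,
plain integer ids stand in for program fds, and the -EEXIST return value is
an arbitrary choice for illustration.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for an attachment point; not a kernel structure. */
struct hook {
	atomic_int attached_id;	/* id of the attached program, 0 if none */
};

/* Read back the id of whatever is attached right now. */
static int hook_current_prog(struct hook *h)
{
	return atomic_load(&h->attached_id);
}

/*
 * Replace the attached program only if it is still the one the caller
 * believes it is replacing. Returns 0 on success, or -EEXIST (an assumed
 * error code) if another updater got there first.
 */
static int hook_replace_prog(struct hook *h, int expected_id, int new_id)
{
	int expected = expected_id;

	if (atomic_compare_exchange_strong(&h->attached_id, &expected, new_id))
		return 0;
	return -EEXIST;
}
```

With this, a library instance that read id 1, built a new dispatcher with
id 2 and calls hook_replace_prog(&h, 1, 2) succeeds only if id 1 is still
attached. A second instance that also started from id 1 gets the error
back and has to re-read the state and retry, instead of silently
overwriting the first update.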

-Toke


