From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: "John Fastabend" <john.fastabend@gmail.com>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Andrii Nakryiko" <andrii.nakryiko@gmail.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Martin KaFai Lau" <kafai@fb.com>,
	"Song Liu" <songliubraving@fb.com>, "Yonghong Song" <yhs@fb.com>,
	"Andrii Nakryiko" <andriin@fb.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Jesper Dangaard Brouer" <brouer@redhat.com>,
	"Lorenz Bauer" <lmb@cloudflare.com>,
	"Andrey Ignatov" <rdna@fb.com>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next 1/4] xdp: Support specifying expected existing program when attaching XDP
Date: Wed, 25 Mar 2020 12:14:54 -0700
Message-ID: <20200325191454.ub5x3kayowsc75vg@ast-mbp>
In-Reply-To: <20200325112005.205d985a@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>

On Wed, Mar 25, 2020 at 11:20:05AM -0700, Jakub Kicinski wrote:
> On Wed, 25 Mar 2020 11:06:38 -0700 Alexei Starovoitov wrote:
> > On Tue, Mar 24, 2020 at 07:15:54PM -0700, Jakub Kicinski wrote:
> > > It is the way to configure XDP today, so it's only natural to
> > > scrutinize the attempts to replace it.   
> > 
> > No one is replacing it.
> 
> You're blocking extensions to the existing API, that means that part 
> of the API is frozen and is being replaced.

Two things are wrong in the above statement:
1. Extensions are not frozen in general.
2. The API is not being replaced. Ownership is lacking and needs to be added.
   It's a new concept, not a replacement.

> > > Also I personally don't think you'd see this much push back trying to
> > > add bpf_link-based stuff to cls_bpf, that's an add-on. XDP is
> > > integrated very fundamentally with the networking stack at this point.
> > >   
> > > > Details are important and every case is different. So imo:
> > > > converting ethtool to netlink - great stuff.
> > > > converting netdev irq/queue management to netlink - great stuff too.
> > > > adding more netlink api for xdp - really bad idea.  
> > > 
> > > Why is it a bad idea?  
> > 
> > I explained in three other emails. tldr: lack of ownership.
> 
> Those came later, I think, thanks.
> 
> Fine, maybe one day someone will find the extension you're proposing
> useful. To me that's not a justification to freeze the existing API
> (you said "adding more netlink api for xdp - really bad idea").
> 
> Besides, if you look at Toke's libxdp work (which exists), what's the
> ownership of the attached program? Whichever application touched it
> last?
> 
> The whole auto-detachment thing may work nicely in cls_bpf and
> sub-programs attached to the root XDP program, but it's a bit hard 
> to imagine how its useful for the singleton root XDP program.

bpf_link introduces two new things: 1. ownership 2. auto-detach
They are both useful. Looks like the use case for 2 is obvious, but
1 can exist without being FD based.

> 
> > > There are plenty things which will only be available over netlink.
> > > Configuring the interface so installing the XDP program is possible
> > > (disabling features, configuring queues etc.). Chances are user gets
> > > the ifindex of the interface to attach to over netlink in the first
> > > place. The queue configuration (which you agree belongs in netlink)
> > > will definitely get more complex to allow REDIRECTs to work more
> > > smoothly. AF_XDP needs all sort of netlink stuff.  
> > 
> > sure. that has nothing to do with ownership of attachment.
> 
> AFAICT the allure to John is the uniform API, and no need for netlink.
> I was explaining how that's a bad goal to have.

You clearly misunderstood. Neither John nor I were saying that there is
no need for netlink.

> 
> > > Netlink gives us the notification mechanism which is how we solve
> > > coordination across daemons (something that BPF subsystem is only 
> > > now trying to solve).  
> > 
> > I don't care about notifications on attachment and no one is trying to
> > solve that as far as I can see. It's not a problem to solve in the first place.
> 
> Well, it's the existing solution to the "ownership" problem.
> I think most people simply didn't know about it.

Toke's set introduces the same thing to XDP as
commit 7dd68b3279f1 ("bpf: Support replacing cgroup-bpf program in MULTI mode")
did for cgroup-bpf.
Both are trying to address the same issue, and neither actually does.
That cgroup-bpf commit looked like a great solution just three months ago.
Now it's clear it's not fixing the underlying issue.
Same thing with Toke's fix. It feels good now, but it is going to be useless
without introducing ownership.

Why is that cgroup-bpf commit not fixing it?
Take a look at that commit. The first paragraph is
"
The common use-case in production is to have multiple cgroup-bpf
programs per attach type that cover multiple use-cases. Such programs
are attached with BPF_F_ALLOW_MULTI and can be maintained by different
people.
"
Then the description goes on to explain how one service wants to replace its prog.
In this case it sort of works because it's a single C++ service with multiple
progs that do different things. There is a 'centralized daemon' (kinda) that
can try to orchestrate. It breaks when there are two C++ services.
That replace_bpf_fd is trying to be a link identifier, but the kernel lacks
such an identifier.
I think it would be simpler to understand the ownership if bpf_link had
its own IDR for every link. Every attachment (link) would be an object with its
own id. We could then iterate over all attachments with GET_NEXT_ID, for example.
But that's nice to have. Not strictly necessary.
The ownership of the attachment needs to be permanent. It needs to belong
to a task and other tasks should not be able to break that attachment.
That cgroup-bpf commit addresses part of the issue by "inventing" an identifier
for the attachment (in the form of the prog_fd that is supposed to be there in that
attachment), but it does not address the owner part of the attachment.
Only the task(s) that own that attachment should be able to modify the attachment.
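
To make the difference concrete, here is a toy user-space model of the two
schemes. It is only a sketch: Hook, OwnedHook and OwnedLink are hypothetical
names made up for illustration, not kernel or libbpf APIs, and strings stand
in for programs. The first hook accepts a replacement from anyone who can name
the currently attached program; the second only lets the holder of the link
object change the attachment.

```python
# Toy user-space model; none of these classes are kernel or libbpf APIs.
class Hook:
    """A single attach point, replaced by compare-and-swap on the program."""

    def __init__(self):
        self.prog = None

    def replace(self, expected_prog, new_prog):
        # Succeeds for ANY caller that names the currently attached program.
        if self.prog != expected_prog:
            return False
        self.prog = new_prog
        return True


class OwnedLink:
    """An attachment object; only a holder of the object can update it."""

    def __init__(self, prog):
        self.prog = prog

    def update(self, new_prog):
        self.prog = new_prog


class OwnedHook:
    """An attach point that refuses new attachments while a link owns it."""

    def __init__(self):
        self._link = None

    def attach(self, prog):
        if self._link is not None:
            raise PermissionError("hook is owned by an existing link")
        self._link = OwnedLink(prog)
        return self._link

    @property
    def prog(self):
        return self._link.prog if self._link else None


# Expected-program check: service B looks up what is attached and replaces it.
hook = Hook()
hook.replace(None, "service_a_prog")
current = hook.prog                      # B queries the attached program
b_won = hook.replace(current, "service_b_prog")

# Ownership: B holds no reference to A's link, so it cannot change the hook.
owned = OwnedHook()
link_a = owned.attach("service_a_prog")
try:
    owned.attach("service_b_prog")
    b_blocked = False
except PermissionError:
    b_blocked = True
link_a.update("service_a_prog_v2")       # only the owner can update
```

In the first case the "identifier" is something any task can read back and
present, so it does not protect service A from service B.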

One can imagine how an attachment ID could be implemented entirely with netlink.
Is it a good idea? Not really, because there is no mechanism to transfer the ownership.
Having an FD that points to a kernel object that represents the ownership makes it
easy for user space to pass the ownership (by passing an FD).
The auto-detach part comes for free with FD-based bpf_link, but that's not the main feature.
Maybe we will add a flag to disable auto-detach too.
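
For readers unfamiliar with FD passing, the generic mechanism is SCM_RIGHTS
over a Unix domain socket. Below is a minimal sketch, assuming Python 3.9+ on
a Unix system. A pipe stands in for the kernel object (it is not a bpf_link),
and both ends of the socket pair live in one process only to keep the example
short; in practice the receiver would be a different process.

```python
import os
import socket

# A pipe read end stands in for a link FD (assumption for illustration only).
read_fd, write_fd = os.pipe()

# A Unix socket pair stands in for the channel between two cooperating tasks.
sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# Pass the descriptor as SCM_RIGHTS ancillary data alongside a small payload.
socket.send_fds(sender, [b"link"], [read_fd])
msg, fds, _flags, _addr = socket.recv_fds(receiver, 16, 1)
received_fd = fds[0]

# The sender drops its copy; the object stays alive through the receiver's FD.
os.close(read_fd)
os.write(write_fd, b"still alive")
data = os.read(received_fd, 32)

os.close(received_fd)
os.close(write_fd)
sender.close()
receiver.close()
```

The point is that the kernel object lives as long as some task holds an FD to
it, so handing over the FD hands over control of the object.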
