bpf.vger.kernel.org archive mirror
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: Song Liu <song@kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>, bpf <bpf@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH bpf-next 00/13] bpf: Introduce BPF namespace
Date: Wed, 5 Apr 2023 20:05:50 -0700
Message-ID: <CAADnVQ+1VEBHTM5Rm-gx8-bg=tfv=4x+aONhF0bAmBFZG3W8Qg@mail.gmail.com>
In-Reply-To: <CALOAHbCEsucGRB+n5hTnPm-HssmB91HD4PFVRhdO=CZnJXfR6A@mail.gmail.com>

On Wed, Apr 5, 2023 at 7:55 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> It seems that I didn't describe the issue clearly.
> The container doesn't have CAP_SYS_ADMIN, but CAP_SYS_ADMIN is
> required to run bpftool, so bpftool running in the container
> can't get the IDs of bpf objects or convert IDs to FDs.
> Is there something that I missed?

Nothing. This is by design. bpftool needs sudo. That's all.
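
To make the capability gap concrete: the container described above has
CAP_BPF but not CAP_SYS_ADMIN, and the ID-walking path quoted below is
gated on CAP_SYS_ADMIN. A minimal sketch, assuming the usual "CapEff:"
line in /proc/<pid>/status, of how a process could check which of the two
it holds. The helper names parse_cap_eff() and has_cap() are hypothetical;
the bit numbers are those of CAP_SYS_ADMIN and CAP_BPF in
<linux/capability.h>.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CAP_SYS_ADMIN_BIT 21 /* CAP_SYS_ADMIN in <linux/capability.h> */
#define CAP_BPF_BIT       39 /* CAP_BPF in <linux/capability.h> */

/*
 * Hypothetical helper: extract the effective capability mask from the
 * text of /proc/<pid>/status. Returns 0 on success, -1 if no parsable
 * "CapEff:" line is present.
 */
int parse_cap_eff(const char *status_text, uint64_t *caps)
{
    const char *p = strstr(status_text, "CapEff:");
    unsigned long long value;

    if (!p || sscanf(p, "CapEff: %llx", &value) != 1)
        return -1;
    *caps = (uint64_t)value;
    return 0;
}

/* Hypothetical helper: return 1 if capability number `bit` is set. */
int has_cap(uint64_t caps, int bit)
{
    return (int)((caps >> bit) & 1);
}
```

A task with only CAP_BPF set passes has_cap(caps, CAP_BPF_BIT) but fails
has_cap(caps, CAP_SYS_ADMIN_BIT), which is the situation in which the
capable(CAP_SYS_ADMIN) check returns -EPERM.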


>
> > > --- a/kernel/bpf/syscall.c
> > > +++ b/kernel/bpf/syscall.c
> > > @@ -3705,9 +3705,6 @@ static int bpf_obj_get_next_id(const union bpf_attr *attr,
> > >         if (CHECK_ATTR(BPF_OBJ_GET_NEXT_ID) || next_id >= INT_MAX)
> > >                 return -EINVAL;
> > >
> > > -       if (!capable(CAP_SYS_ADMIN))
> > > -               return -EPERM;
> > > -
> > >         next_id++;
> > >         spin_lock_bh(lock);
> > >         if (!idr_get_next(idr, &next_id))
> > >
> > > Because the container doesn't have CAP_SYS_ADMIN enabled, while they
> > > only have CAP_BPF and other required CAPs.
> > >
> > > Another possible solution is that we run an agent in the host, and the
> > > user in the container who wants to get the bpf objects info in his
> > > container should send a request to this agent via unix domain socket.
> > > That is what we are doing now in our production environment.  That
> > > said, each container has to run a client to get the bpf object fd.
> >
> > No such hacks are necessary. People who debug bpf setups with bpftool
> > can always sudo.
> >
> > > There are some downsides:
> > > - It can't handle pinned bpf programs.
> > >   For pinned programs, the user can get them from the pinned files
> > >   directly, so he can use bpftool in that case, only with some
> > >   complaints.
> > > - The user may attach the bpf prog and then remove the pinned
> > >   file without detaching it.
> > >   That has happened, and this error case can't be handled.
> > > - There may be other corner cases that it doesn't fit.
> > >
> > > There's a solution to improve it, but we also need to change the
> > > kernel. That is, we can use the otherwise unused space in btf->name.
> > >
> > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > > index b7e5a55..59d73a3 100644
> > > --- a/kernel/bpf/btf.c
> > > +++ b/kernel/bpf/btf.c
> > > @@ -5542,6 +5542,8 @@ static struct btf *btf_parse(bpfptr_t btf_data, u32 btf_data_size,
> > >                 err = -ENOMEM;
> > >                 goto errout;
> > >         }
> > > +       snprintf(btf->name, sizeof(btf->name), "%s-%d-%llu", current->comm,
> > > +                        current->pid, cgroup_id(task_cgroup(current, cpu_cgrp_id)));
> >
> > Unnecessary.
> > comm, pid, cgroup can be printed by bpftool without changing the kernel.
>
> Some questions:
> - What if the process exits after attaching the bpf prog and the prog
> is not auto-detachable?
>   For example, the reuseport bpf prog is not auto-detachable. After
> pinning the reuseport bpf prog, a task can attach it through the pinned
> bpf file, but if the task forgets to detach it and the pinned file is
> removed, then it seems there's no way to figure out which task or
> cgroup this prog belongs to...

You're saying that there is a bpf prog in the kernel without
corresponding user space? Meaning no user space process has an FD
that points to this prog, or an FD to a map that this prog is using?
In such a case this is truly a kernel bpf prog. It doesn't belong to a cgroup.

> - Could you please explain in detail how to get comm, pid, or cgroup
> from a pinned bpffs file?

A pinned bpf prog with no user space process holding an FD to it?
It's not part of any cgroup. Nothing to print.
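
Whether any process still holds an FD to a given prog can be checked from
user space: for a BPF program FD, /proc/<pid>/fdinfo/<fd> carries a
"prog_id:" line. A minimal sketch, assuming that fdinfo layout; the helper
names fdinfo_prog_id() and fd_prog_id() are hypothetical, and this is not
presented as bpftool's own implementation.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan the text of a /proc/<pid>/fdinfo/<fd> file
 * and return the BPF program ID it reports, or -1 if no line starts
 * with "prog_id:" (the FD does not refer to a BPF program).
 */
long fdinfo_prog_id(const char *fdinfo_text)
{
    const char *line = fdinfo_text;

    while (line && *line) {
        unsigned int id;

        /* fdinfo lines have the form "key:\tvalue\n". */
        if (sscanf(line, "prog_id: %u", &id) == 1)
            return (long)id;

        line = strchr(line, '\n');
        if (line)
            line++; /* step past the newline to the next line */
    }
    return -1;
}

/*
 * Hypothetical helper: read /proc/<pid>/fdinfo/<fd> and return the prog
 * ID behind that FD, or -1 if the file is unreadable or is not a prog FD.
 */
long fd_prog_id(int pid, int fd)
{
    char path[64];
    char buf[4096];
    size_t n;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%d/fdinfo/%d", pid, fd);
    f = fopen(path, "r");
    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return fdinfo_prog_id(buf);
}
```

Walking every /proc/<pid>/fd entry and comparing fd_prog_id() against the
prog ID of interest gives the set of holders. If that set is empty, there
is no comm, pid, or cgroup to report for the prog.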
