* [RFC PATCH bpf-next] bpf: Allow get bpf object with CAP_BPF
@ 2022-11-29 16:16 Yafang Shao
  2022-11-30  0:44 ` Hao Luo
  0 siblings, 1 reply; 9+ messages in thread
From: Yafang Shao @ 2022-11-29 16:16 UTC (permalink / raw)
  To: ast, daniel, andrii, kafai, songliubraving, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa
  Cc: bpf, Yafang Shao

In a containerized environment, if a container is not privileged but
has CAP_BPF, it is not easy to debug the bpf objects created in this
container, let alone use bpftool, because these bpf objects are
invisible unless they are pinned in bpffs. Currently we have to
interact with the process that created these bpf objects to get the
information. It might be better to control access to each object the
same way we control access to files in bpffs, but for now I think we
should allow access to these objects with CAP_BPF.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 kernel/bpf/syscall.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 35972afb6850..9cd6b41e2d2b 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3660,7 +3660,7 @@ static int bpf_obj_get_next_id(const union bpf_attr *attr,
 	if (CHECK_ATTR(BPF_OBJ_GET_NEXT_ID) || next_id >= INT_MAX)
 		return -EINVAL;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!bpf_capable())
 		return -EPERM;
 
 	next_id++;
@@ -3741,7 +3741,7 @@ static int bpf_prog_get_fd_by_id(const union bpf_attr *attr)
 	if (CHECK_ATTR(BPF_PROG_GET_FD_BY_ID))
 		return -EINVAL;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!bpf_capable())
 		return -EPERM;
 
 	prog = bpf_prog_by_id(id);
@@ -3768,7 +3768,7 @@ static int bpf_map_get_fd_by_id(const union bpf_attr *attr)
 	    attr->open_flags & ~BPF_OBJ_FLAG_MASK)
 		return -EINVAL;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!bpf_capable())
 		return -EPERM;
 
 	f_flags = bpf_get_file_flag(attr->open_flags);
@@ -4345,7 +4345,7 @@ static int bpf_btf_get_fd_by_id(const union bpf_attr *attr)
 	if (CHECK_ATTR(BPF_BTF_GET_FD_BY_ID))
 		return -EINVAL;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!bpf_capable())
 		return -EPERM;
 
 	return btf_get_fd_by_id(attr->btf_id);
@@ -4769,7 +4769,7 @@ static int bpf_link_get_fd_by_id(const union bpf_attr *attr)
 	if (CHECK_ATTR(BPF_LINK_GET_FD_BY_ID))
 		return -EINVAL;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!bpf_capable())
 		return -EPERM;
 
 	link = bpf_link_by_id(id);
-- 
2.30.1 (Apple Git-130)



* Re: [RFC PATCH bpf-next] bpf: Allow get bpf object with CAP_BPF
  2022-11-29 16:16 [RFC PATCH bpf-next] bpf: Allow get bpf object with CAP_BPF Yafang Shao
@ 2022-11-30  0:44 ` Hao Luo
  2022-11-30 11:58   ` Yafang Shao
  0 siblings, 1 reply; 9+ messages in thread
From: Hao Luo @ 2022-11-30  0:44 UTC (permalink / raw)
  To: Yafang Shao
  Cc: ast, daniel, andrii, kafai, songliubraving, yhs, john.fastabend,
	kpsingh, sdf, jolsa, bpf

On Tue, Nov 29, 2022 at 8:16 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> In a containerized environment, if a container is not
> privileged but with CAP_BPF, it is not easy to debug bpf created in this
> container, let alone using bpftool. Because these bpf objects are
> invisible if they are not pinned in bpffs. Currently we have to
> interact with the process which creates these bpf objects to get the
> information. It may be better if we can control the access to each
> object the same way as we control the file in bpffs, but now I think we
> should allow the accessibility of these objects with CAP_BPF.
>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> ---
>  kernel/bpf/syscall.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>

As far as I can tell, requiring CAP_SYS_ADMIN on iterating IDs and
converting IDs to FDs is intended and is an important design in BPF's
security model [1]. So this change does not look good.
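
For readers following along, the two operations in question (iterating
IDs, then converting an ID to an FD) can be sketched from user space
roughly as below. This is a minimal sketch and not part of the patch:
the helper names sys_bpf() and count_bpf_progs() are made up for
illustration, and it assumes a Linux system with the <linux/bpf.h> UAPI
header installed. Without the required capability, the first call is
expected to fail with EPERM.

```c
#include <errno.h>
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical thin wrapper; glibc does not export a bpf() function. */
static long sys_bpf(int cmd, union bpf_attr *attr)
{
	return syscall(__NR_bpf, cmd, attr, sizeof(*attr));
}

/*
 * Hypothetical helper: walk all loaded program IDs and try to turn
 * each ID into an FD. Returns the number of programs opened, or
 * -errno on the first failure other than running out of IDs.
 */
static int count_bpf_progs(void)
{
	union bpf_attr attr;
	__u32 id = 0;
	int count = 0;

	for (;;) {
		long fd;

		/* Step 1: ask for the next ID after 'id'. */
		memset(&attr, 0, sizeof(attr));
		attr.start_id = id;
		if (sys_bpf(BPF_PROG_GET_NEXT_ID, &attr) < 0)
			return errno == ENOENT ? count : -errno;
		id = attr.next_id;

		/* Step 2: convert that ID into a file descriptor. */
		memset(&attr, 0, sizeof(attr));
		attr.prog_id = id;
		fd = sys_bpf(BPF_PROG_GET_FD_BY_ID, &attr);
		if (fd < 0) {
			if (errno == ENOENT)	/* program went away meanwhile */
				continue;
			return -errno;
		}
		close((int)fd);
		count++;
	}
}
```

A caller would treat a negative return value as -errno; for example,
-EPERM means the process is not allowed to iterate IDs at all.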

From the commit message, I'm not clear how BPF is debugged in
containers in your use case. Maybe the debugging process should be
required to have CAP_SYS_ADMIN?

[1] https://lore.kernel.org/bpf/20200513230355.7858-1-alexei.starovoitov@gmail.com/


* Re: [RFC PATCH bpf-next] bpf: Allow get bpf object with CAP_BPF
  2022-11-30  0:44 ` Hao Luo
@ 2022-11-30 11:58   ` Yafang Shao
  2022-11-30 18:06     ` Song Liu
  0 siblings, 1 reply; 9+ messages in thread
From: Yafang Shao @ 2022-11-30 11:58 UTC (permalink / raw)
  To: Hao Luo
  Cc: ast, daniel, andrii, kafai, songliubraving, yhs, john.fastabend,
	kpsingh, sdf, jolsa, bpf

On Wed, Nov 30, 2022 at 8:44 AM Hao Luo <haoluo@google.com> wrote:
>
> On Tue, Nov 29, 2022 at 8:16 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > In a containerized environment, if a container is not
> > privileged but with CAP_BPF, it is not easy to debug bpf created in this
> > container, let alone using bpftool. Because these bpf objects are
> > invisible if they are not pinned in bpffs. Currently we have to
> > interact with the process which creates these bpf objects to get the
> > information. It may be better if we can control the access to each
> > object the same way as we control the file in bpffs, but now I think we
> > should allow the accessibility of these objects with CAP_BPF.
> >
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > ---
> >  kernel/bpf/syscall.c | 10 +++++-----
> >  1 file changed, 5 insertions(+), 5 deletions(-)
> >
>
> As far as I can tell, requiring CAP_SYS_ADMIN on iterating IDs and
> converting IDs to FDs is intended and is an important design in BPF's
> security model [1]. So this change does not look good.
>

I understand that restricting the ID->FD transition to CAP_SYS_ADMIN
is for security.
But it also prevents users from converting the IDs of their own bpf
objects to FDs, which is a problem.

> From the commit message, I'm not clear how BPF is debugged in
> containers in your use case. Maybe the debugging process should be
> required to have CAP_SYS_ADMIN?
>

Some container users run bpf programs in their containers, and
sometimes they want to inspect the bpf objects they created with
bpftool, or read/write the bpf maps with their own tools. But if
the bpf objects are not pinned, the only way to get these bpf objects
is via SCM_RIGHTS.
There should be a general way for users to get the FDs of their own
objects when CAP_BPF is enabled.
With CAP_SYS_ADMIN, the container user can do almost anything, which
is very dangerous.
With CAP_BPF, by contrast, the risk can be kept within BPF.
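
To make the SCM_RIGHTS route concrete, here is a minimal sketch of
passing one file descriptor over a Unix domain socket. It is only an
illustration: send_fd() and recv_fd() are hypothetical helper names,
and any descriptor (a bpf map FD, a pipe, a regular file) is handled
the same way. The owning process has to cooperate and call send_fd(),
which is exactly the inconvenience described above.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send 'fd' over the connected Unix socket 'sock'. */
static int send_fd(int sock, int fd)
{
	char byte = 0;	/* one byte of real data travels with the fd */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* forces correct alignment */
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd from 'sock', or -1 on failure. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	/* The kernel installed a new descriptor in this process. */
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The received descriptor is a new number in the receiving process that
refers to the same underlying object as the sender's descriptor.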

I think we should improve this situation by allowing users to
convert the IDs of their own bpf objects to FDs.
There are some possible solutions:
1. Introduce a BPF_ID namespace.
    Use a namespace to isolate the bpf object IDs instead of
    preventing users from reading all IDs.
2. Introduce a global sysctl knob to allow users to do the ID->FD transition,
    for example by adding a new value to unprivileged_bpf_disabled:
    -0 Unprivileged calls to ``bpf()`` are enabled
    +0 Unprivileged calls to ``bpf()`` are enabled except the calls
    +  which explicitly require ``CAP_BPF`` or ``CAP_SYS_ADMIN``
     1 Unprivileged calls to ``bpf()`` are disabled without recovery
     2 Unprivileged calls to ``bpf()`` are disabled
    +3 All unprivileged calls to ``bpf()`` are enabled
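
As a small illustration of how a tool might inspect this knob today,
the sketch below reads the sysctl from procfs and maps the value to
the meanings listed above. The function names are hypothetical, the
path /proc/sys/kernel/unprivileged_bpf_disabled assumes procfs is
mounted in the usual place, and the value 3 exists only as a proposal
in this mail, so the sketch reports it as unknown.

```c
#include <stdio.h>

/* Hypothetical helper: describe a value of unprivileged_bpf_disabled. */
static const char *describe_unpriv_bpf(int value)
{
	switch (value) {
	case 0:
		return "unprivileged bpf() enabled";
	case 1:
		return "unprivileged bpf() disabled without recovery";
	case 2:
		return "unprivileged bpf() disabled";
	default:
		/* 3 is only proposed in this thread, not an existing value. */
		return "unknown";
	}
}

/* Hypothetical helper: read the current value, or -1 if unavailable. */
static int read_unpriv_bpf_disabled(void)
{
	FILE *f = fopen("/proc/sys/kernel/unprivileged_bpf_disabled", "r");
	int value = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}
```

A caller could print describe_unpriv_bpf(read_unpriv_bpf_disabled())
to see which of the modes above is currently active.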

WDYT ?

-- 
Regards
Yafang


* Re: [RFC PATCH bpf-next] bpf: Allow get bpf object with CAP_BPF
  2022-11-30 11:58   ` Yafang Shao
@ 2022-11-30 18:06     ` Song Liu
  2022-12-01  0:37       ` Hao Luo
  2022-12-01 14:34       ` Yafang Shao
  0 siblings, 2 replies; 9+ messages in thread
From: Song Liu @ 2022-11-30 18:06 UTC (permalink / raw)
  To: Yafang Shao
  Cc: Hao Luo, ast, daniel, andrii, kafai, songliubraving, yhs,
	john.fastabend, kpsingh, sdf, jolsa, bpf

On Wed, Nov 30, 2022 at 3:59 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
[...]
> I understand that allowing ID->FD transition for CAP_SYS_ADMIN only is
> for security.
> But it also prevents the user from transiting its own bpf object ID,
> that is a problem.
>
> > From the commit message, I'm not clear how BPF is debugged in
> > containers in your use case. Maybe the debugging process should be
> > required to have CAP_SYS_ADMIN?
> >
>
> Some container users will run bpf programs in their container,
> sometimes they want to check the bpf objects created by themselves  by
> using bpftool or read/write the bpf maps with their own tools. But if
> the bpf objects are not pinned, the only way to get these bpf objects
> is via SCM_RIGHTS.
> There should be a general way to get the FD of their own objects when
> CAP_BPF is enabled.
> With CAP_SYS_ADMIN, the container user can do almost anything, which
> is very dangerous.
> While with CAP_BPF, the risk can be kept within BPF.
>
> I think we should improve this situation by allowing the user to
> transit its own bpf object IDs.
> There are some possible solutions,
> 1. introduce BPF_ID namespace
>     Let's use namespace to isolate the bpf object ID instead of
> preventing them from reading all IDs.
> 2. introduce a global sysctl knob to allow users to do the ID->FD transition
>     for example, introduce a new value into unprivileged_bpf_disabled.
>     -0 Unprivileged calls to ``bpf()`` are enabled
>    +0 Unprivileged calls to ``bpf()`` are enabled except the calls
>    +  which explicitly requires ``CAP_BPF`` or ``CAP_SYS_ADMIN``
>     1 Unprivileged calls to ``bpf()`` are disabled without recovery
>     2 Unprivileged calls to ``bpf()`` are disabled
>   +3 All unprivileged calls to ``bpf()`` are enabled
>
> WDYT ?

Personally, I think some namespace might be the solution we need.
But adding a namespace is a lot of work, so we need to make sure to
do it correctly.

This might be a good topic to discuss in the BPF office hour.

Thanks,
Song


* Re: [RFC PATCH bpf-next] bpf: Allow get bpf object with CAP_BPF
  2022-11-30 18:06     ` Song Liu
@ 2022-12-01  0:37       ` Hao Luo
  2022-12-01 14:46         ` Yafang Shao
  2022-12-01 14:34       ` Yafang Shao
  1 sibling, 1 reply; 9+ messages in thread
From: Hao Luo @ 2022-12-01  0:37 UTC (permalink / raw)
  To: Song Liu
  Cc: Yafang Shao, ast, daniel, andrii, kafai, songliubraving, yhs,
	john.fastabend, kpsingh, sdf, jolsa, bpf

On Wed, Nov 30, 2022 at 10:07 AM Song Liu <song@kernel.org> wrote:
>
> On Wed, Nov 30, 2022 at 3:59 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> [...]
> > I understand that allowing ID->FD transition for CAP_SYS_ADMIN only is
> > for security.
> > But it also prevents the user from transiting its own bpf object ID,
> > that is a problem.
> >
> > > From the commit message, I'm not clear how BPF is debugged in
> > > containers in your use case. Maybe the debugging process should be
> > > required to have CAP_SYS_ADMIN?
> > >
> >
> > Some container users will run bpf programs in their container,
> > sometimes they want to check the bpf objects created by themselves  by
> > using bpftool or read/write the bpf maps with their own tools. But if
> > the bpf objects are not pinned, the only way to get these bpf objects
> > is via SCM_RIGHTS.
> > There should be a general way to get the FD of their own objects when
> > CAP_BPF is enabled.
> > With CAP_SYS_ADMIN, the container user can do almost anything, which
> > is very dangerous.
> > While with CAP_BPF, the risk can be kept within BPF.
> >
> > I think we should improve this situation by allowing the user to
> > transit its own bpf object IDs.
> > There are some possible solutions,
> > 1. introduce BPF_ID namespace
> >     Let's use namespace to isolate the bpf object ID instead of
> > preventing them from reading all IDs.
> > 2. introduce a global sysctl knob to allow users to do the ID->FD transition
> >     for example, introduce a new value into unprivileged_bpf_disabled.
> >     -0 Unprivileged calls to ``bpf()`` are enabled
> >    +0 Unprivileged calls to ``bpf()`` are enabled except the calls
> >    +  which explicitly requires ``CAP_BPF`` or ``CAP_SYS_ADMIN``
> >     1 Unprivileged calls to ``bpf()`` are disabled without recovery
> >     2 Unprivileged calls to ``bpf()`` are disabled
> >   +3 All unprivileged calls to ``bpf()`` are enabled
> >
> > WDYT ?
>
> Personally, I think some namespace might be the solution we need.
> But adding a namespace is a lot of work, so we need to make sure to
> do it correctly.
>
> This might be a good topic to discuss in the BPF office hour.
>

I think a namespace is preferable. A discussion in the BPF office
hour sounds good.

Here are my thoughts:

1. What does the BPF_ID namespace look like? Will it be like the PID
namespace, remapping IDs in each namespace? Or will it just restrict
the object IDs visible to the users?

2. What's wrong with passing FD? Is it really necessary to introduce a
namespace for this purpose?

3. IIRC, Song proposed introducing a namespace for BPF isolation, not
just isolating IDs [1]. How does it relate to the BPF_ID namespace?

[1] https://lore.kernel.org/all/CAPhsuW6c17p3XkzSxxo7YBW9LHjqerOqQvt7C1+S--8C9omeng@mail.gmail.com/


* Re: [RFC PATCH bpf-next] bpf: Allow get bpf object with CAP_BPF
  2022-11-30 18:06     ` Song Liu
  2022-12-01  0:37       ` Hao Luo
@ 2022-12-01 14:34       ` Yafang Shao
  1 sibling, 0 replies; 9+ messages in thread
From: Yafang Shao @ 2022-12-01 14:34 UTC (permalink / raw)
  To: Song Liu
  Cc: Hao Luo, ast, daniel, andrii, kafai, songliubraving, yhs,
	john.fastabend, kpsingh, sdf, jolsa, bpf

On Thu, Dec 1, 2022 at 2:07 AM Song Liu <song@kernel.org> wrote:
>
> On Wed, Nov 30, 2022 at 3:59 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> [...]
> > I understand that allowing ID->FD transition for CAP_SYS_ADMIN only is
> > for security.
> > But it also prevents the user from transiting its own bpf object ID,
> > that is a problem.
> >
> > > From the commit message, I'm not clear how BPF is debugged in
> > > containers in your use case. Maybe the debugging process should be
> > > required to have CAP_SYS_ADMIN?
> > >
> >
> > Some container users will run bpf programs in their container,
> > sometimes they want to check the bpf objects created by themselves  by
> > using bpftool or read/write the bpf maps with their own tools. But if
> > the bpf objects are not pinned, the only way to get these bpf objects
> > is via SCM_RIGHTS.
> > There should be a general way to get the FD of their own objects when
> > CAP_BPF is enabled.
> > With CAP_SYS_ADMIN, the container user can do almost anything, which
> > is very dangerous.
> > While with CAP_BPF, the risk can be kept within BPF.
> >
> > I think we should improve this situation by allowing the user to
> > transit its own bpf object IDs.
> > There are some possible solutions,
> > 1. introduce BPF_ID namespace
> >     Let's use namespace to isolate the bpf object ID instead of
> > preventing them from reading all IDs.
> > 2. introduce a global sysctl knob to allow users to do the ID->FD transition
> >     for example, introduce a new value into unprivileged_bpf_disabled.
> >     -0 Unprivileged calls to ``bpf()`` are enabled
> >    +0 Unprivileged calls to ``bpf()`` are enabled except the calls
> >    +  which explicitly requires ``CAP_BPF`` or ``CAP_SYS_ADMIN``
> >     1 Unprivileged calls to ``bpf()`` are disabled without recovery
> >     2 Unprivileged calls to ``bpf()`` are disabled
> >   +3 All unprivileged calls to ``bpf()`` are enabled
> >
> > WDYT ?
>
> Personally, I think some namespace might be the solution we need.
> But adding a namespace is a lot of work, so we need to make sure to
> do it correctly.
>

Right, lots of code to write. I will think about it carefully.

> This might be a good topic to discuss in the BPF office hour.
>

Once I figure out a workable solution, I will post a proposal.

--
Regards
Yafang


* Re: [RFC PATCH bpf-next] bpf: Allow get bpf object with CAP_BPF
  2022-12-01  0:37       ` Hao Luo
@ 2022-12-01 14:46         ` Yafang Shao
  2022-12-02  5:36           ` Hao Luo
  0 siblings, 1 reply; 9+ messages in thread
From: Yafang Shao @ 2022-12-01 14:46 UTC (permalink / raw)
  To: Hao Luo
  Cc: Song Liu, ast, daniel, andrii, kafai, songliubraving, yhs,
	john.fastabend, kpsingh, sdf, jolsa, bpf

On Thu, Dec 1, 2022 at 8:38 AM Hao Luo <haoluo@google.com> wrote:
>
> On Wed, Nov 30, 2022 at 10:07 AM Song Liu <song@kernel.org> wrote:
> >
> > On Wed, Nov 30, 2022 at 3:59 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > [...]
> > > I understand that allowing ID->FD transition for CAP_SYS_ADMIN only is
> > > for security.
> > > But it also prevents the user from transiting its own bpf object ID,
> > > that is a problem.
> > >
> > > > From the commit message, I'm not clear how BPF is debugged in
> > > > containers in your use case. Maybe the debugging process should be
> > > > required to have CAP_SYS_ADMIN?
> > > >
> > >
> > > Some container users will run bpf programs in their container,
> > > sometimes they want to check the bpf objects created by themselves  by
> > > using bpftool or read/write the bpf maps with their own tools. But if
> > > the bpf objects are not pinned, the only way to get these bpf objects
> > > is via SCM_RIGHTS.
> > > There should be a general way to get the FD of their own objects when
> > > CAP_BPF is enabled.
> > > With CAP_SYS_ADMIN, the container user can do almost anything, which
> > > is very dangerous.
> > > While with CAP_BPF, the risk can be kept within BPF.
> > >
> > > I think we should improve this situation by allowing the user to
> > > transit its own bpf object IDs.
> > > There are some possible solutions,
> > > 1. introduce BPF_ID namespace
> > >     Let's use namespace to isolate the bpf object ID instead of
> > > preventing them from reading all IDs.
> > > 2. introduce a global sysctl knob to allow users to do the ID->FD transition
> > >     for example, introduce a new value into unprivileged_bpf_disabled.
> > >     -0 Unprivileged calls to ``bpf()`` are enabled
> > >    +0 Unprivileged calls to ``bpf()`` are enabled except the calls
> > >    +  which explicitly requires ``CAP_BPF`` or ``CAP_SYS_ADMIN``
> > >     1 Unprivileged calls to ``bpf()`` are disabled without recovery
> > >     2 Unprivileged calls to ``bpf()`` are disabled
> > >   +3 All unprivileged calls to ``bpf()`` are enabled
> > >
> > > WDYT ?
> >
> > Personally, I think some namespace might be the solution we need.
> > But adding a namespace is a lot of work, so we need to make sure to
> > do it correctly.
> >
> > This might be a good topic to discuss in the BPF office hour.
> >
>
> I think namespace is more preferable. A discussion in the BPF office
> hour sounds good.
>
> Following are my thoughts:
>

Thanks for your thoughts.

> 1. What does the BPF_ID namespace look like? Will it be like the PID
> namespace, remapping IDs in each namespace? or just restricting the
> object IDs visible to the users?
>

I prefer the former. It would look like the PID namespace, which also
uses idr_alloc().

> 2. What's wrong with passing FD? Is it really necessary to introduce a
> namespace for this purpose?
>

Passing FDs is not flexible, and generic tools like bpftool can't work
that way. In the long run, I think the CAP_SYS_ADMIN restriction should
be replaced by better isolation mechanisms, so introducing a namespace
to replace it won't be a bad idea.

> 3. IIRC, Song proposed introducing a namespace for BPF isolation, not
> just isolating IDs [1]. How does it relate to the BPF_ID namespace?
>
> [1] https://lore.kernel.org/all/CAPhsuW6c17p3XkzSxxo7YBW9LHjqerOqQvt7C1+S--8C9omeng@mail.gmail.com/

I have looked through the slides of this proposal, but failed to
figure out how Song would design the BPF namespace. Maybe Song can give
us a better explanation.
As I understand it, the goal of Song's proposal would have to be
achieved by combining many namespaces and other isolation mechanisms.
For example, with the help of the PID namespace, we can make sure that
only the tasks in this container can be traced by the bpf programs
running in it.

-- 
Regards
Yafang


* Re: [RFC PATCH bpf-next] bpf: Allow get bpf object with CAP_BPF
  2022-12-01 14:46         ` Yafang Shao
@ 2022-12-02  5:36           ` Hao Luo
  2022-12-02  7:36             ` Song Liu
  0 siblings, 1 reply; 9+ messages in thread
From: Hao Luo @ 2022-12-02  5:36 UTC (permalink / raw)
  To: Yafang Shao
  Cc: Song Liu, ast, daniel, andrii, kafai, songliubraving, yhs,
	john.fastabend, kpsingh, sdf, jolsa, bpf

On Thu, Dec 1, 2022 at 6:47 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Thu, Dec 1, 2022 at 8:38 AM Hao Luo <haoluo@google.com> wrote:
<...>
> > 3. IIRC, Song proposed introducing a namespace for BPF isolation, not
> > just isolating IDs [1]. How does it relate to the BPF_ID namespace?
> >
> > [1] https://lore.kernel.org/all/CAPhsuW6c17p3XkzSxxo7YBW9LHjqerOqQvt7C1+S--8C9omeng@mail.gmail.com/
>
> I have looked through the slides of this proposal, but failed to
> figure out how Song will design the BPF namespace. Maybe Song can give
> us a better explanation.
> Per my understanding, the goal of Song's proposal should be combined
> by many namespaces and other isolation mechanisms.  For example, with
> the help of PID namespace, we can make sure only the tasks in this
> container can be traced by the bpf programs running in it.
>

Among the 5 items in [1], it looks like the third item "Limit which
BPF programs are accessible to non-root users" is what you proposed
here. The other items are more about isolation, I think. So, the
question is, if we have a BPF_ID namespace, would that be sufficient
for debugging in containers? If yes, at least it's something useful.
We can start from the BPF_ID namespace, bring it up for discussion,
and gather other requirements gradually.


* Re: [RFC PATCH bpf-next] bpf: Allow get bpf object with CAP_BPF
  2022-12-02  5:36           ` Hao Luo
@ 2022-12-02  7:36             ` Song Liu
  0 siblings, 0 replies; 9+ messages in thread
From: Song Liu @ 2022-12-02  7:36 UTC (permalink / raw)
  To: Hao Luo
  Cc: Yafang Shao, ast, daniel, andrii, kafai, songliubraving, yhs,
	john.fastabend, kpsingh, sdf, jolsa, bpf, Mathieu Desnoyers

On Thu, Dec 1, 2022 at 9:36 PM Hao Luo <haoluo@google.com> wrote:
>
> On Thu, Dec 1, 2022 at 6:47 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > On Thu, Dec 1, 2022 at 8:38 AM Hao Luo <haoluo@google.com> wrote:
> <...>
> > > 3. IIRC, Song proposed introducing a namespace for BPF isolation, not
> > > just isolating IDs [1]. How does it relate to the BPF_ID namespace?
> > >
> > > [1] https://lore.kernel.org/all/CAPhsuW6c17p3XkzSxxo7YBW9LHjqerOqQvt7C1+S--8C9omeng@mail.gmail.com/
> >
> > I have looked through the slides of this proposal, but failed to
> > figure out how Song will design the BPF namespace. Maybe Song can give
> > us a better explanation.
> > Per my understanding, the goal of Song's proposal should be combined
> > by many namespaces and other isolation mechanisms.  For example, with
> > the help of PID namespace, we can make sure only the tasks in this
> > container can be traced by the bpf programs running in it.

The proposal didn't really go anywhere. LOL.

As Yafang said, it requires multiple mechanisms to work together, and thus
is very complicated. OTOH, we are not sure whether BPF tracing is still
useful when it is really safe. Specifically, probe_read is not safe, but really
useful.

A related idea is the tracer namespace, presented by Mathieu Desnoyers
at LPC 2022. [2]

[2] https://lpc.events/event/16/contributions/1237/

> >
>
> Among the 5 items in [1], it looks like the third item "Limit which
> BPF programs are accessible to non-root users" is what you proposed
> here. The other items are more about isolation, I think. So, the
> question is, if we have a BPF_ID namespace, would that be sufficient
> for debugging in containers? If yes, at least it's something useful.
> We can start from the BPF_ID namespace, bring it for discussion, and
> gather other requirements gradually.

The BPF_ID namespace is a better-defined idea. I am not sure whether
we want the complexity of a namespace, though.

Thanks,
Song

