All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kui-Feng Lee <kuifeng@fb.com>
To: "memxor@gmail.com" <memxor@gmail.com>
Cc: Yonghong Song <yhs@fb.com>,
	"brauner@kernel.org" <brauner@kernel.org>,
	"daniel@iogearbox.net" <daniel@iogearbox.net>,
	"olsajiri@gmail.com" <olsajiri@gmail.com>,
	"ast@kernel.org" <ast@kernel.org>,
	Kernel Team <Kernel-team@fb.com>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"andrii@kernel.org" <andrii@kernel.org>
Subject: Re: [PATCH bpf-next 1/3] bpf: Parameterize task iterators.
Date: Sat, 30 Jul 2022 02:46:33 +0000	[thread overview]
Message-ID: <73a50faa6d9f36b1da75adb717887698cfe7d02a.camel@fb.com> (raw)
In-Reply-To: <CAP01T74HRHapKDAfj104KNGnzCgNQSu_M5-KfEvGBNzLWNfd+Q@mail.gmail.com>

On Thu, 2022-07-28 at 19:00 +0200, Kumar Kartikeya Dwivedi wrote:
> On Thu, 28 Jul 2022 at 18:40, Kui-Feng Lee <kuifeng@fb.com> wrote:
> > 
> > On Thu, 2022-07-28 at 18:22 +0200, Kumar Kartikeya Dwivedi wrote:
> > > On Thu, 28 Jul 2022 at 17:16, Kui-Feng Lee <kuifeng@fb.com>
> > > wrote:
> > > > 
> > > > On Thu, 2022-07-28 at 10:47 +0200, Kumar Kartikeya Dwivedi
> > > > wrote:
> > > > > On Thu, 28 Jul 2022 at 07:25, Kui-Feng Lee <kuifeng@fb.com>
> > > > > wrote:
> > > > > > 
> > > > > > On Wed, 2022-07-27 at 10:19 +0200, Kumar Kartikeya Dwivedi
> > > > > > wrote:
> > > > > > > On Wed, 27 Jul 2022 at 09:01, Kui-Feng Lee
> > > > > > > <kuifeng@fb.com>
> > > > > > > wrote:
> > > > > > > > 
> > > > > > > > On Tue, 2022-07-26 at 14:13 +0200, Jiri Olsa wrote:
> > > > > > > > > On Mon, Jul 25, 2022 at 10:17:11PM -0700, Kui-Feng
> > > > > > > > > Lee
> > > > > > > > > wrote:
> > > > > > > > > > Allow creating an iterator that loops through
> > > > > > > > > > resources
> > > > > > > > > > of
> > > > > > > > > > one
> > > > > > > > > > task/thread.
> > > > > > > > > > 
> > > > > > > > > > People could only create iterators to loop through
> > > > > > > > > > all
> > > > > > > > > > resources of
> > > > > > > > > > files, vma, and tasks in the system, even though
> > > > > > > > > > they
> > > > > > > > > > were
> > > > > > > > > > interested
> > > > > > > > > > in only the resources of a specific task or
> > > > > > > > > > process.
> > > > > > > > > > Passing
> > > > > > > > > > the
> > > > > > > > > > additional parameters, people can now create an
> > > > > > > > > > iterator to
> > > > > > > > > > go
> > > > > > > > > > through all resources or only the resources of a
> > > > > > > > > > task.
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
> > > > > > > > > > ---
> > > > > > > > > >   include/linux/bpf.h            |  4 ++
> > > > > > > > > >   include/uapi/linux/bpf.h       | 23 ++++++++++
> > > > > > > > > >   kernel/bpf/task_iter.c         | 81
> > > > > > > > > > +++++++++++++++++++++++++-
> > > > > > > > > > ----
> > > > > > > > > > ----
> > > > > > > > > >   tools/include/uapi/linux/bpf.h | 23 ++++++++++
> > > > > > > > > >   4 files changed, 109 insertions(+), 22
> > > > > > > > > > deletions(-)
> > > > > > > > > > 
> > > > > > > > > > diff --git a/include/linux/bpf.h
> > > > > > > > > > b/include/linux/bpf.h
> > > > > > > > > > index 11950029284f..c8d164404e20 100644
> > > > > > > > > > --- a/include/linux/bpf.h
> > > > > > > > > > +++ b/include/linux/bpf.h
> > > > > > > > > > @@ -1718,6 +1718,10 @@ int bpf_obj_get_user(const
> > > > > > > > > > char
> > > > > > > > > > __user
> > > > > > > > > > *pathname, int flags);
> > > > > > > > > > 
> > > > > > > > > >   struct bpf_iter_aux_info {
> > > > > > > > > >          struct bpf_map *map;
> > > > > > > > > > +       struct {
> > > > > > > > > > +               __u32   tid;
> > > > > > > > > 
> > > > > > > > > should be just u32 ?
> > > > > > > > 
> > > > > > > > Or, should change the following 'type' to __u8?
> > > > > > > 
> > > > > > > Would it be better to use a pidfd instead of a tid here?
> > > > > > > Unset
> > > > > > > pidfd
> > > > > > > would mean going over all tasks, and any fd > 0 implies
> > > > > > > attaching
> > > > > > > to
> > > > > > > a
> > > > > > > specific task (as is the convention in BPF land). Most of
> > > > > > > the
> > > > > > > new
> > > > > > > UAPIs working on processes are using pidfds (to work with
> > > > > > > a
> > > > > > > stable
> > > > > > > handle instead of a reusable ID).
> > > > > > > The iterator taking an fd also gives an opportunity to
> > > > > > > BPF
> > > > > > > LSMs
> > > > > > > to
> > > > > > > attach permissions/policies to it (once we have a file
> > > > > > > local
> > > > > > > storage
> > > > > > > map) e.g. whether creating a task iterator for that
> > > > > > > specific
> > > > > > > pidfd
> > > > > > > instance (backed by the struct file) would be allowed or
> > > > > > > not.
> > > > > > > You are using getpid in the selftest and keeping track of
> > > > > > > last_tgid
> > > > > > > in
> > > > > > > the iterator, so I guess you don't even need to extend
> > > > > > > pidfd_open
> > > > > > > to
> > > > > > > work on thread IDs right now for your use case (and
> > > > > > > fdtable
> > > > > > > and
> > > > > > > mm
> > > > > > > are
> > > > > > > shared for POSIX threads anyway, so for those two it
> > > > > > > won't
> > > > > > > make a
> > > > > > > difference).
> > > > > > > 
> > > > > > > What is your opinion?
> > > > > > 
> > > > > > Do you mean removed both tid and type, and replace them
> > > > > > with a
> > > > > > pidfd?
> > > > > > We can do that in uapi, struct bpf_link_info.  But, the
> > > > > > interal
> > > > > > types,
> > > > > > ex. bpf_iter_aux_info, still need to use tid or struct file
> > > > > > to
> > > > > > avoid
> > > > > > getting file from the per-process fdtable.  Is that what
> > > > > > you
> > > > > > mean?
> > > > > > 
> > > > > 
> > > > > Yes, just for the UAPI, it is similar to taking map_fd for
> > > > > map
> > > > > iter.
> > > > > In bpf_link_info we should report just the tid, just like map
> > > > > iter
> > > > > reports map_id.
> > > > 
> > > > It sounds good to me.
> > > > 
> > > > One thing I need a clarification. You mentioned that a fd > 0
> > > > implies
> > > > attaching to a specific task, however fd can be 0. So, it
> > > > should be
> > > > fd
> > > > > = 0. So, it forces the user to initialize the value of pidfd
> > > > > to -
> > > > > 1.
> > > > So, for convenience, we still need a field like 'type' to make
> > > > it
> > > > easy
> > > > to create iterators without a filter.
> > > > 
> > > 
> > > Right, but in lots of BPF UAPI fields, fd 0 means fd is unset, so
> > > it
> > > is fine to rely on that assumption. For e.g. even for map_fd,
> > > bpf_map_elem iterator considers fd 0 to be unset. Then you don't
> > > need
> > > the type field.
> > 
> > I just realize that pidfd may be meaningless for the bpf_link_info
> > returned by bpf_obj_get_info_by_fd() since the origin fd might be
> > closed already.  So, I will always set it a value of 0.
> > 
> 
> For bpf_link_info, we should only be returning the tid of the task it
> is attached to, you cannot report the pidfd in bpf_link_info
> correctly (as you already realised). By default this would be 0,
> which is also an invalid tid, but when pidfd is set it will be the
> tid of the task it is attached to, so it works well.


We have a lot of dicussions around using tid or pidfd?
Kumar also mentioned about removing 'type'.
However, I have a feel that we need to keep 'type' in struct
bpf_link_info.  I cam imagine that we may like to create iterators of
tasks in a cgroup or other paramters in futhure.  'type' will help us
to tell the types of a parameter.



  parent reply	other threads:[~2022-07-30  2:46 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-26  5:17 [PATCH bpf-next 0/3] Parameterize task iterators Kui-Feng Lee
2022-07-26  5:17 ` [PATCH bpf-next 1/3] bpf: " Kui-Feng Lee
2022-07-26 12:13   ` Jiri Olsa
2022-07-26 13:19     ` Jiri Olsa
2022-07-27  6:39       ` Kui-Feng Lee
2022-07-27  6:56     ` Kui-Feng Lee
2022-07-27  8:19       ` Kumar Kartikeya Dwivedi
2022-07-28  5:25         ` Kui-Feng Lee
2022-07-28  8:47           ` Kumar Kartikeya Dwivedi
2022-07-28 15:16             ` Kui-Feng Lee
2022-07-28 16:22               ` Kumar Kartikeya Dwivedi
2022-07-28 16:40                 ` Kui-Feng Lee
2022-07-28 17:08                   ` Yonghong Song
2022-07-28 17:52                     ` Kumar Kartikeya Dwivedi
2022-07-28 19:11                       ` Hao Luo
2022-08-02 14:25                         ` Kumar Kartikeya Dwivedi
2022-07-28 22:54                       ` Yonghong Song
2022-07-29  9:10                         ` Christian Brauner
     [not found]                   ` <CAP01T74HRHapKDAfj104KNGnzCgNQSu_M5-KfEvGBNzLWNfd+Q@mail.gmail.com>
2022-07-30  2:46                     ` Kui-Feng Lee [this message]
2022-07-28 18:01                 ` Hao Luo
2022-07-26  5:17 ` [PATCH bpf-next 2/3] bpf: Handle bpf_link_info for the parameterized task BPF iterators Kui-Feng Lee
2022-07-26  5:17 ` [PATCH bpf-next 3/3] selftests/bpf: Test " Kui-Feng Lee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=73a50faa6d9f36b1da75adb717887698cfe7d02a.camel@fb.com \
    --to=kuifeng@fb.com \
    --cc=Kernel-team@fb.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brauner@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=memxor@gmail.com \
    --cc=olsajiri@gmail.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.