netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yonghong Song <yhs@fb.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Networking <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next 1/2] bpf: change uapi for bpf iterator map elements
Date: Sun, 2 Aug 2020 19:23:36 -0700	[thread overview]
Message-ID: <bb01225b-d4a4-c76b-5e1f-3dc37135f637@fb.com> (raw)
In-Reply-To: <CAEf4BzbaRXHpZ5b_6rojnk2dQxLFCOEwtGjNExdg5FEWadF+9g@mail.gmail.com>



On 8/2/20 6:25 PM, Andrii Nakryiko wrote:
> On Sat, Aug 1, 2020 at 9:22 PM Yonghong Song <yhs@fb.com> wrote:
>>
>> Commit a5cbe05a6673 ("bpf: Implement bpf iterator for
>> map elements") added bpf iterator support for
>> map elements. The map element bpf iterator requires
>> info to identify a particular map. In the above
>> commit, the attr->link_create.target_fd is used
>> to carry map_fd and an enum bpf_iter_link_info
>> is added to uapi to specify the target_fd actually
>> representing a map_fd:
>>      enum bpf_iter_link_info {
>>          BPF_ITER_LINK_UNSPEC = 0,
>>          BPF_ITER_LINK_MAP_FD = 1,
>>
>>          MAX_BPF_ITER_LINK_INFO,
>>      };
>>
>> This is an extensible approach as we can grow
>> enumerator for pid, cgroup_id, etc. and we can
>> unionize target_fd for pid, cgroup_id, etc.
>> But in the future, there are chances that
>> more complex customization may happen, e.g.,
>> for tasks, it could be filtered based on
>> both cgroup_id and user_id.
>>
>> This patch changed the uapi to have fields
>>          __aligned_u64   iter_info;
>>          __u32           iter_info_len;
>> for additional iter_info for link_create.
>> The iter_info is defined as
>>          union bpf_iter_link_info {
>>                  struct {
>>                          __u32   map_fd;
>>                  } map;
>>          };
>>
>> So future extension for additional customization
>> will be easier. The bpf_iter_link_info will be
>> passed to target callback to validate and generic
>> bpf_iter framework does not need to deal it any
>> more.
>>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>>   include/linux/bpf.h            | 10 ++++---
>>   include/uapi/linux/bpf.h       | 15 +++++-----
>>   kernel/bpf/bpf_iter.c          | 52 +++++++++++++++-------------------
>>   kernel/bpf/map_iter.c          | 37 ++++++++++++++++++------
>>   kernel/bpf/syscall.c           |  2 +-
>>   net/core/bpf_sk_storage.c      | 37 ++++++++++++++++++------
>>   tools/include/uapi/linux/bpf.h | 15 +++++-----
>>   7 files changed, 104 insertions(+), 64 deletions(-)
>>
> 
> [...]
> 
>>   int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
>>   {
>> +       union bpf_iter_link_info __user *ulinfo;
>>          struct bpf_link_primer link_primer;
>>          struct bpf_iter_target_info *tinfo;
>> -       struct bpf_iter_aux_info aux = {};
>> +       union bpf_iter_link_info linfo;
>>          struct bpf_iter_link *link;
>> -       u32 prog_btf_id, target_fd;
>> +       u32 prog_btf_id, linfo_len;
>>          bool existed = false;
>> -       struct bpf_map *map;
>>          int err;
>>
>> +       memset(&linfo, 0, sizeof(union bpf_iter_link_info));
>> +
>> +       ulinfo = u64_to_user_ptr(attr->link_create.iter_info);
>> +       linfo_len = attr->link_create.iter_info_len;
>> +       if (ulinfo && linfo_len) {
> 
> We probably want to be more strict here: if either pointer or len is
> non-zero, both should be present and valid. Otherwise we can have
> garbage in iter_info, as long as iter_info_len is zero.

yes, it is possible iter_info_len = 0 and iter_info is not null and
if this happens, iter_info will not be examined.

in kernel, we have places this is handled similarly. For example,
for cgroup bpf_prog query.

kernel/bpf/cgroup.c, function __cgroup_bpf_query

   __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids);
   ...
   if (attr->query.prog_cnt == 0 || !prog_ids || !cnt)
     return 0;

In the above case, it is possible prog_cnt = 0 and prog_ids != NULL,
or prog_ids == NULL and prog_cnt != 0, and we won't return error
to user space.

Not 100% sure whether we have convention here or not.

> 
>> +               err = bpf_check_uarg_tail_zero(ulinfo, sizeof(linfo),
>> +                                              linfo_len);
>> +               if (err)
>> +                       return err;
>> +               linfo_len = min_t(u32, linfo_len, sizeof(linfo));
>> +               if (copy_from_user(&linfo, ulinfo, linfo_len))
>> +                       return -EFAULT;
>> +       }
>> +
>>          prog_btf_id = prog->aux->attach_btf_id;
>>          mutex_lock(&targets_mutex);
>>          list_for_each_entry(tinfo, &targets, list) {
>> @@ -411,13 +425,6 @@ int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
>>          if (!existed)
>>                  return -ENOENT;
>>
>> -       /* Make sure user supplied flags are target expected. */
>> -       target_fd = attr->link_create.target_fd;
>> -       if (attr->link_create.flags != tinfo->reg_info->req_linfo)
>> -               return -EINVAL;
>> -       if (!attr->link_create.flags && target_fd)
>> -               return -EINVAL;
>> -
> 
> Please still ensure that no flags are specified.

Make sense. I also need to ensure target_fd is 0 since it is not used 
any more.

> 
> 
>>          link = kzalloc(sizeof(*link), GFP_USER | __GFP_NOWARN);
>>          if (!link)
>>                  return -ENOMEM;
>> @@ -431,28 +438,15 @@ int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
>>                  return err;
>>          }
>>
> 
> [...]
> 
>> -static int bpf_iter_check_map(struct bpf_prog *prog,
>> -                             struct bpf_iter_aux_info *aux)
>> +static int bpf_iter_attach_map(struct bpf_prog *prog,
>> +                              union bpf_iter_link_info *linfo,
>> +                              struct bpf_iter_aux_info *aux)
>>   {
>> -       struct bpf_map *map = aux->map;
>> +       struct bpf_map *map;
>> +       int err = -EINVAL;
>>
>> -       if (map->map_type != BPF_MAP_TYPE_SK_STORAGE)
>> +       if (!linfo->map.map_fd)
>>                  return -EINVAL;
> 
> This could be -EBADF?

Good suggestion. Will do.

> 
>>
>> -       if (prog->aux->max_rdonly_access > map->value_size)
>> -               return -EACCES;
>> +       map = bpf_map_get_with_uref(linfo->map.map_fd);
>> +       if (IS_ERR(map))
>> +               return PTR_ERR(map);
>> +
>> +       if (map->map_type != BPF_MAP_TYPE_SK_STORAGE)
>> +               goto put_map;
>> +
>> +       if (prog->aux->max_rdonly_access > map->value_size) {
>> +               err = -EACCES;
>> +               goto put_map;
>> +       }
> 
> [...]
> 

  reply	other threads:[~2020-08-03  2:23 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-02  4:21 [PATCH bpf-next 0/2] bpf: change uapi for bpf iterator map elements Yonghong Song
2020-08-02  4:21 ` [PATCH bpf-next 1/2] " Yonghong Song
2020-08-03  1:25   ` Andrii Nakryiko
2020-08-03  2:23     ` Yonghong Song [this message]
2020-08-03  5:11       ` Andrii Nakryiko
2020-08-03  6:21         ` Yonghong Song
2020-08-02  4:21 ` [PATCH bpf-next 2/2] libbpf: support new uapi for map element bpf iterator Yonghong Song
2020-08-03  1:35   ` Andrii Nakryiko
2020-08-03  2:30     ` Yonghong Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bb01225b-d4a4-c76b-5e1f-3dc37135f637@fb.com \
    --to=yhs@fb.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).