bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yonghong Song <yhs@fb.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next v3 04/11] bpf: add bpf_for_each_map_elem() helper
Date: Thu, 25 Feb 2021 19:22:00 -0800	[thread overview]
Message-ID: <e02a20ad-4d1f-dfee-2dcc-a32d57cd182c@fb.com> (raw)
In-Reply-To: <c042d2c7-a15f-c9b3-be7e-c729c1cf7184@fb.com>



On 2/25/21 6:16 PM, Yonghong Song wrote:
> 
> 
> On 2/25/21 2:41 PM, Andrii Nakryiko wrote:
>> On Thu, Feb 25, 2021 at 1:35 AM Yonghong Song <yhs@fb.com> wrote:
>>>
>>> The bpf_for_each_map_elem() helper is introduced which
>>> iterates all map elements with a callback function. The
>>> helper signature looks like
>>>    long bpf_for_each_map_elem(map, callback_fn, callback_ctx, flags)
>>> and for each map element, the callback_fn will be called. For example,
>>> like hashmap, the callback signature may look like
>>>    long callback_fn(map, key, val, callback_ctx)
>>>
>>> There are two known use cases for this. One is from upstream ([1]) where
>>> a for_each_map_elem helper may help implement a timeout mechanism
>>> in a more generic way. Another is from our internal discussion
>>> for a firewall use case where a map contains all the rules. The packet
>>> data can be compared to all these rules to decide allow or deny
>>> the packet.
>>>
>>> For array maps, users can already use a bounded loop to traverse
>>> elements. Using this helper can avoid using bounded loop. For other
>>> type of maps (e.g., hash maps) where bounded loop is hard or
>>> impossible to use, this helper provides a convenient way to
>>> operate on all elements.
>>>
>>> For callback_fn, besides map and map element, a callback_ctx,
>>> allocated on caller stack, is also passed to the callback
>>> function. This callback_ctx argument can provide additional
>>> input and allow to write to caller stack for output.
>>>
>>> If the callback_fn returns 0, the helper will iterate through next
>>> element if available. If the callback_fn returns 1, the helper
>>> will stop iterating and returns to the bpf program. Other return
>>> values are not used for now.
>>>
>>> Currently, this helper is only available with jit. It is possible
>>> to make it work with interpreter with so effort but I leave it
>>> as the future work.
>>>
>>> [1]: 
>>> https://lore.kernel.org/bpf/20210122205415.113822-1-xiyou.wangcong@gmail.com/ 
>>>
>>>
>>> Signed-off-by: Yonghong Song <yhs@fb.com>
>>> ---
>>
>> It looks good from the perspective of implementation (though I
>> admittedly lost track of all the insn[0|1].imm transformations). But
>> see some suggestions below (I hope you can incorporate them).
>>
>> Overall, though:
>>
>> Acked-by: Andrii Nakryiko <andrii@kernel.org>
>>
>>>   include/linux/bpf.h            |  13 +++
>>>   include/linux/bpf_verifier.h   |   3 +
>>>   include/uapi/linux/bpf.h       |  29 ++++-
>>>   kernel/bpf/bpf_iter.c          |  16 +++
>>>   kernel/bpf/helpers.c           |   2 +
>>>   kernel/bpf/verifier.c          | 208 ++++++++++++++++++++++++++++++---
>>>   kernel/trace/bpf_trace.c       |   2 +
>>>   tools/include/uapi/linux/bpf.h |  29 ++++-
>>>   8 files changed, 287 insertions(+), 15 deletions(-)
>>>
[...]
>>
>>>   static int prepare_func_exit(struct bpf_verifier_env *env, int 
>>> *insn_idx)
>>>   {
>>>          struct bpf_verifier_state *state = env->cur_state;
>>> @@ -5400,8 +5487,22 @@ static int prepare_func_exit(struct 
>>> bpf_verifier_env *env, int *insn_idx)
>>>
>>>          state->curframe--;
>>>          caller = state->frame[state->curframe];
>>> -       /* return to the caller whatever r0 had in the callee */
>>> -       caller->regs[BPF_REG_0] = *r0;
>>> +       if (callee->in_callback_fn) {
>>> +               /* enforce R0 return value range [0, 1]. */
>>> +               struct tnum range = tnum_range(0, 1);
>>> +
>>> +               if (r0->type != SCALAR_VALUE) {
>>> +                       verbose(env, "R0 not a scalar value\n");
>>> +                       return -EACCES;
>>> +               }
>>> +               if (!tnum_in(range, r0->var_off)) {
>>
>> if (!tnum_in(tnum_range(0, 1), r0->var_off)) should work as well,
>> unless you find it less readable (I don't but no strong feeling here)
> 
> Will give a try.

Will keep it as is since range is used below for error reporting.

> 
>>
>>
>>> +                       verbose_invalid_scalar(env, r0, &range, 
>>> "callback return", "R0");
>>> +                       return -EINVAL;
>>> +               }
>>> +       } else {
>>> +               /* return to the caller whatever r0 had in the callee */
>>> +               caller->regs[BPF_REG_0] = *r0;
>>> +       }
>>>
>>>          /* Transfer references to the caller */
>>>          err = transfer_reference_state(caller, callee);
[...]

  reply	other threads:[~2021-02-26  3:23 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-25  7:33 [PATCH bpf-next v3 00/11] bpf: add bpf_for_each_map_elem() helper Yonghong Song
2021-02-25  7:33 ` [PATCH bpf-next v3 01/11] bpf: factor out visit_func_call_insn() in check_cfg() Yonghong Song
2021-02-25 21:54   ` Andrii Nakryiko
2021-02-25 22:01     ` Alexei Starovoitov
2021-02-25  7:33 ` [PATCH bpf-next v3 02/11] bpf: factor out verbose_invalid_scalar() Yonghong Song
2021-02-25 21:56   ` Andrii Nakryiko
2021-02-25  7:33 ` [PATCH bpf-next v3 03/11] bpf: refactor check_func_call() to allow callback function Yonghong Song
2021-02-25 22:05   ` Andrii Nakryiko
2021-02-25 22:31     ` Andrii Nakryiko
2021-02-26  0:08       ` Yonghong Song
2021-02-26  1:18         ` Andrii Nakryiko
2021-02-26  0:05     ` Yonghong Song
2021-02-25  7:33 ` [PATCH bpf-next v3 04/11] bpf: add bpf_for_each_map_elem() helper Yonghong Song
2021-02-25 22:41   ` Andrii Nakryiko
2021-02-26  2:16     ` Yonghong Song
2021-02-26  3:22       ` Yonghong Song [this message]
2021-02-26  2:27   ` Cong Wang
2021-02-26  3:27     ` Yonghong Song
2021-02-25  7:33 ` [PATCH bpf-next v3 05/11] bpf: add hashtab support for " Yonghong Song
2021-02-25 22:44   ` Andrii Nakryiko
2021-02-25  7:33 ` [PATCH bpf-next v3 06/11] bpf: add arraymap " Yonghong Song
2021-02-25 22:48   ` Andrii Nakryiko
2021-02-25  7:33 ` [PATCH bpf-next v3 07/11] libbpf: move function is_ldimm64() earlier in libbpf.c Yonghong Song
2021-02-25  7:33 ` [PATCH bpf-next v3 08/11] libbpf: support subprog address relocation Yonghong Song
2021-02-25 23:04   ` Andrii Nakryiko
2021-02-25  7:33 ` [PATCH bpf-next v3 09/11] bpftool: print subprog address properly Yonghong Song
2021-02-25 23:04   ` Andrii Nakryiko
2021-02-25  7:33 ` [PATCH bpf-next v3 10/11] selftests/bpf: add hashmap test for bpf_for_each_map_elem() helper Yonghong Song
2021-02-25 23:25   ` Andrii Nakryiko
2021-02-26  3:24     ` Yonghong Song
2021-02-25  7:33 ` [PATCH bpf-next v3 11/11] selftests/bpf: add arraymap " Yonghong Song
2021-02-25 23:26   ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e02a20ad-4d1f-dfee-2dcc-a32d57cd182c@fb.com \
    --to=yhs@fb.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).