All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mauricio Vasquez <mauricio.vasquez@polito.it>
To: Daniel Borkmann <daniel@iogearbox.net>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>, netdev@vger.kernel.org
Subject: Re: [PATCH bpf-next 1/3] bpf: add bpf queue map
Date: Thu, 9 Aug 2018 09:51:49 -0500	[thread overview]
Message-ID: <867e95e1-cb45-54c8-ce66-d3e49161d5e6@polito.it> (raw)
In-Reply-To: <ca4532da-de68-6308-e969-2061b037c571@iogearbox.net>



On 08/09/2018 04:02 AM, Daniel Borkmann wrote:
> On 08/09/2018 06:48 AM, Alexei Starovoitov wrote:
>> On Wed, Aug 08, 2018 at 10:08:47PM -0500, Mauricio Vasquez wrote:
>>>> And how about adding three new helpers: push/pop/peek as well?
>>>> Reusing lookup/update is neat, but does lookup == pop
>>>> or does lookup == peek ?
>>>> I suspect it will be confusing.
>>>> Three new helpers cost nothing, but will make bpf progs easier to read.
>>> I agree. I have one doubt here, update/lookup/delete is implemented in all
>>> map types, if the operation is not supported it returns -EINVAL.
>>> For push/pop/peek, should we implement it in all maps or is it better to
>>> check the map type before invoking map->ops->push/pop/seek?
>>> (Maybe checking if map->ops->xxx is NULL will also work)
>> Since push/pop/peak are only for this queue/stack I thought we won't
>> be adding 'ops' for them and just call the helpers from progs.
>> But now I'm having second thoughts, since 3 new commands for syscall
>> feels like overkill.
>> At the same time I still don't feel that lookup == pop is the right alias.
>> Also what peak is going to alias to ?
>> May be let's go back to your original idea with a tweak:
>> push == update
>> peak == lookup
>> pop = lookup + delete
>> In other words in case of stack the bpf_map_lookup_elem will return
>> the pointer to value of top element in the stack and
>> bpf_map_delete_elem will delete that top element.
>> Then in user space we can introduce push/pop always_inline functions like:
>> void bpf_push(struct bpf_map *map, void *value)
>> {
>>    bpf_map_update_elem(map, NULL/*key*/, value);
>> }
>>
>> void *bpf_pop(struct bpf_map *map)
>> {
>>    void * val = bpf_map_lookup_elem(map, NULL/*key*/);
>>    bpf_map_delete_elem(map, NULL/*key*/);
>>    return val;
>> }
>>
>> Thoughts?
> I actually think that having new push/peak/pop BPF map helpers would
> be fine, as well as having them sit in map->ops. Initially I'd probably
> leave out the syscall ops counterparts so they can be used only from
> program.

I also think that having the new helpers is good. I don't have a strong 
opinion about saving them in map->ops or not, in any case it has to be 
verified that push/pop/peek are only invoked in stack/queue maps. I 
think we could force in on the verifier, would it be ok?

> Potentially array and per-cpu array could implement them even by having
> an internal pointer to the current slot which we move on push/pop. This
> would potentially also avoid all the RCU callbacks to free the elements,
> elem allocations, list locking, and improve cache locality compared to
> the current implementation.

I could change the implementation, a linked list is used when 
BPF_F_NO_PREALLOC is passed, otherwise a plain array with internal 
indexes is used.
I have a question about RCU and a possible array-based implementation. 
It should be guarantee that a pointer from a pop() operation is valid 
for the whole program duration.
If an eBPF program A calls pop() it will get a pointer to the head in 
the array, and the head index will move, it is possible that a second 
program B (running on a different CPU) pushes a new element overwriting 
the element pointed by A.
I think we need to use a more sophisticated mechanism than just an array 
and two indexes. Am I missing something?

> Agree that existing ops are not the right alias, but deferring to user
> space as inline function also doesn't really seem like a good fit, imho,
> so I'd prefer rather to have something native. (Aside from that, the
> above inline bpf_pop() would also race between CPUs.)

I think we should have push/pop/peek syscalls as well, having a 
bpf_pop() that is race prone would create problems. Users expected maps 
operations to be safe, so having one that is not will confuse them.
> Thanks,
> Daniel
>

Thanks,
Mauricio.

  reply	other threads:[~2018-08-09 17:17 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-06 13:58 [PATCH bpf-next 0/3] Implement bpf map queue Mauricio Vasquez B
2018-08-06 13:58 ` [PATCH bpf-next 1/3] bpf: add bpf queue map Mauricio Vasquez B
2018-08-07 13:40   ` Daniel Borkmann
2018-08-09  2:50     ` Mauricio Vasquez
2018-08-07 13:52   ` Daniel Borkmann
2018-08-09  2:55     ` Mauricio Vasquez
2018-08-07 14:42   ` Alexei Starovoitov
2018-08-09  3:08     ` Mauricio Vasquez
2018-08-09  4:48       ` Alexei Starovoitov
2018-08-09  9:02         ` Daniel Borkmann
2018-08-09 14:51           ` Mauricio Vasquez [this message]
2018-08-09 16:23             ` Alexei Starovoitov
2018-08-09 23:41               ` Mauricio Vasquez
2018-08-10  3:09                 ` Alexei Starovoitov
2018-08-06 13:58 ` [PATCH bpf-next 2/3] selftests/bpf: add test cases for BPF_MAP_TYPE_QUEUE Mauricio Vasquez B
2018-08-07 13:42   ` Daniel Borkmann
2018-08-06 13:58 ` [PATCH bpf-next 3/3] bpf: add sample " Mauricio Vasquez B
2018-08-07 13:44   ` Daniel Borkmann
2018-08-09  2:52     ` Mauricio Vasquez
2018-08-07 20:31   ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=867e95e1-cb45-54c8-ce66-d3e49161d5e6@polito.it \
    --to=mauricio.vasquez@polito.it \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.