From: Yonghong Song <yhs@fb.com>
To: Denis Salopek <denis.salopek@sartura.hr>, <bpf@vger.kernel.org>
Cc: <juraj.vijtiuk@sartura.hr>, <luka.oreskovic@sartura.hr>,
<luka.perkov@sartura.hr>, <daniel@iogearbox.net>,
<andrii.nakryiko@gmail.com>
Subject: Re: [PATCH v4 bpf-next] bpf: add lookup_and_delete_elem support to hashtab
Date: Sun, 4 Apr 2021 22:42:49 -0700 [thread overview]
Message-ID: <9325297a-66b8-eb26-22cb-aa4a740e8df6@fb.com> (raw)
In-Reply-To: <ce69af50-3667-c52d-1f1a-b924bbe0fc58@fb.com>
On 4/4/21 9:47 PM, Yonghong Song wrote:
>
>
> On 3/29/21 5:57 AM, Denis Salopek wrote:
>> Extend the existing bpf_map_lookup_and_delete_elem() functionality to
>> hashtab maps, in addition to stacks and queues.
>> Add bpf_map_lookup_and_delete_elem_flags() libbpf API in order to use
>> the BPF_F_LOCK flag.
>> Create a new hashtab bpf_map_ops function that does lookup and deletion
>> of the element under the same bucket lock and add the created map_ops to
>> bpf.h.
>> Add the appropriate test cases to 'maps' and 'lru_map' selftests
>> accompanied with new test_progs.
>>
>> Cc: Juraj Vijtiuk <juraj.vijtiuk@sartura.hr>
>> Cc: Luka Oreskovic <luka.oreskovic@sartura.hr>
>> Cc: Luka Perkov <luka.perkov@sartura.hr>
>> Signed-off-by: Denis Salopek <denis.salopek@sartura.hr>
>> ---
>> v2: Add functionality for LRU/per-CPU, add test_progs tests.
>> v3: Add bpf_map_lookup_and_delete_elem_flags() and enable BPF_F_LOCK
>> flag, change CHECKs to ASSERT_OKs, initialize variables to 0.
>> v4: Fix the return value for unsupported map types.
>> ---
>> include/linux/bpf.h | 2 +
>> kernel/bpf/hashtab.c | 97 ++++++
>> kernel/bpf/syscall.c | 27 +-
>> tools/lib/bpf/bpf.c | 13 +
>> tools/lib/bpf/bpf.h | 2 +
>> tools/lib/bpf/libbpf.map | 1 +
>> .../bpf/prog_tests/lookup_and_delete.c | 279 ++++++++++++++++++
>> .../bpf/progs/test_lookup_and_delete.c | 26 ++
>> tools/testing/selftests/bpf/test_lru_map.c | 8 +
>> tools/testing/selftests/bpf/test_maps.c | 19 +-
>> 10 files changed, 469 insertions(+), 5 deletions(-)
>> create mode 100644
>> tools/testing/selftests/bpf/prog_tests/lookup_and_delete.c
>> create mode 100644
>> tools/testing/selftests/bpf/progs/test_lookup_and_delete.c
>
> Since another revision is needed, could you break the patch to several
> commits which will make it easy to review?
> Patch 1: kernel + uapi header:
> include/linux/bpf.h
> kernel/bpf/hashtab.c
> kernel/bpf/syscall.c
> include/uapi/linux/bpf.h and tools/include/uapi/linux/bpf.h (see below)
> Patch 2: libbpf change
> tools/lib/bpf/bpf.{c,h}, libbpf.map
> Patch 3: selftests/bpf change
> tools/testing/selftests/bpf/...
>
>>
>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>> index 9fdd839b418c..8af4bd7c7229 100644
>> --- a/include/linux/bpf.h
>> +++ b/include/linux/bpf.h
>> @@ -68,6 +68,8 @@ struct bpf_map_ops {
>> void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key);
>> int (*map_lookup_batch)(struct bpf_map *map, const union
>> bpf_attr *attr,
>> union bpf_attr __user *uattr);
>> + int (*map_lookup_and_delete_elem)(struct bpf_map *map, void *key,
>> + void *value, u64 flags);
>> int (*map_lookup_and_delete_batch)(struct bpf_map *map,
>> const union bpf_attr *attr,
>> union bpf_attr __user *uattr);
>> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>> index d7ebb12ffffc..0d2085ce9a38 100644
>> --- a/kernel/bpf/hashtab.c
>> +++ b/kernel/bpf/hashtab.c
>> @@ -1401,6 +1401,99 @@ static void htab_map_seq_show_elem(struct
>> bpf_map *map, void *key,
>> rcu_read_unlock();
>> }
>> +static int __htab_map_lookup_and_delete_elem(struct bpf_map *map,
>> void *key,
>> + void *value, bool is_lru_map,
>> + bool is_percpu, u64 flags)
>> +{
>> + struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
>> + u32 hash, key_size, value_size;
>> + struct hlist_nulls_head *head;
>> + int cpu, off = 0, ret;
>> + struct htab_elem *l;
>> + unsigned long bflags;
>> + void __percpu *pptr;
>> + struct bucket *b;
>> +
>> + if ((flags & ~BPF_F_LOCK) ||
>> + ((flags & BPF_F_LOCK) && !map_value_has_spin_lock(map)))
>> + return -EINVAL;
>> +
>> + key_size = map->key_size;
>> + value_size = round_up(map->value_size, 8);
>
> This value_size is actually a round_up size, so maybe
> rename it as "roundup_size". Also, I see it is only
Sorry, I actually mean "roundup_value_size" here.
> used in is_percpu case. It would be good if it can be
> moved inside is_percpu branch.
>
>> +
>> + hash = htab_map_hash(key, key_size, htab->hashrnd);
>> + b = __select_bucket(htab, hash);
>> + head = &b->head;
>> +
>> + ret = htab_lock_bucket(htab, b, hash, &bflags);
>> + if (ret)
>> + return ret;
>> +
>> + l = lookup_elem_raw(head, hash, key, key_size);
>> + if (l) {
>> + if (is_percpu) {
>> + pptr = htab_elem_get_ptr(l, key_size);
>> + for_each_possible_cpu(cpu) {
>> + bpf_long_memcpy(value + off,
>> + per_cpu_ptr(pptr, cpu),
>> + value_size);
>> + off += value_size;
>> + }
>> + } else {
>> + if (flags & BPF_F_LOCK)
>> + copy_map_value_locked(map, value, l->key +
>> + round_up(key_size, 8),
>> + true);
>> + else
>> + copy_map_value(map, value, l->key +
>> + round_up(key_size, 8));
>> + check_and_init_map_lock(map, value);
>> + }
>> +
>> + hlist_nulls_del_rcu(&l->hash_node);
>> + if (!is_lru_map)
>> + free_htab_elem(htab, l);
>> + } else
>> + ret = -ENOENT;
>> +
>> + htab_unlock_bucket(htab, b, hash, bflags);
>> +
>> + if (is_lru_map && l)
>> + bpf_lru_push_free(&htab->lru, &l->lru_node);
>> +
>> + return ret;
>> +}
>> +
[...]
next prev parent reply other threads:[~2021-04-05 5:43 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-29 12:57 [PATCH v4 bpf-next] bpf: add lookup_and_delete_elem support to hashtab Denis Salopek
2021-04-05 4:47 ` Yonghong Song
2021-04-05 5:42 ` Yonghong Song [this message]
2021-04-05 21:08 ` Andrii Nakryiko
2021-04-12 6:47 ` Denis Salopek
2021-04-12 14:19 ` Yonghong Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9325297a-66b8-eb26-22cb-aa4a740e8df6@fb.com \
--to=yhs@fb.com \
--cc=andrii.nakryiko@gmail.com \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=denis.salopek@sartura.hr \
--cc=juraj.vijtiuk@sartura.hr \
--cc=luka.oreskovic@sartura.hr \
--cc=luka.perkov@sartura.hr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).