From: Joanne Koong <joannekoong@fb.com>
To: <bpf@vger.kernel.org>
Cc: <andrii@kernel.org>, <ast@kernel.org>, <daniel@iogearbox.net>,
	<kafai@fb.com>, <Kernel-team@fb.com>,
	Joanne Koong <joannekoong@fb.com>
Subject: [PATCH v6 bpf-next 0/5] Implement bloom filter map
Date: Wed, 27 Oct 2021 16:44:59 -0700
Message-ID: <20211027234504.30744-1-joannekoong@fb.com>

This patchset adds a new kind of bpf map: the bloom filter map.
Bloom filters are a space-efficient probabilistic data structure
used to quickly test whether an element exists in a set.
For a brief overview about how bloom filters work,
may be helpful.

One example use-case is an application leveraging a bloom filter
map to determine whether a computationally expensive hashmap
lookup can be avoided. If the element is not found in the bloom
filter map, it is definitely not in the set, so the hashmap lookup
can be skipped.
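
To make the use-case above concrete, here is a minimal userspace sketch
of a bloom filter in plain C. It is not the code from this patchset: the
names (struct bloom, bloom_add, bloom_maybe_contains), the fixed sizes,
and the seeded FNV-1a hash are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOOM_BITS   1024u	/* illustrative size; must be a power of two */
#define BLOOM_HASHES 5u		/* illustrative number of hash functions */

struct bloom {
	uint8_t bits[BLOOM_BITS / 8];
};

static void bloom_init(struct bloom *b)
{
	assert(b);
	memset(b->bits, 0, sizeof(b->bits));
}

/* Seeded FNV-1a: a stand-in hash for this sketch, not the kernel's hash. */
static uint32_t bloom_hash(const void *data, size_t len, uint32_t seed)
{
	const uint8_t *p = data;
	uint32_t h = 2166136261u ^ (seed * 0x9e3779b1u);

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h & (BLOOM_BITS - 1);	/* reduce to a bit index */
}

/* Set one bit per hash function. */
static void bloom_add(struct bloom *b, const void *data, size_t len)
{
	assert(b && data);
	for (uint32_t i = 0; i < BLOOM_HASHES; i++) {
		uint32_t bit = bloom_hash(data, len, i);

		b->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
	}
}

/*
 * Returns false if the element is definitely absent, true if it is
 * possibly present (false positives can occur, false negatives cannot).
 */
static bool bloom_maybe_contains(const struct bloom *b, const void *data,
				 size_t len)
{
	assert(b && data);
	for (uint32_t i = 0; i < BLOOM_HASHES; i++) {
		uint32_t bit = bloom_hash(data, len, i);

		if (!(b->bits[bit / 8] & (1u << (bit % 8))))
			return false;
	}
	return true;
}
```

A caller would test bloom_maybe_contains() first and perform the
expensive hashmap lookup only when it returns true.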

This patchset includes benchmarks for testing the performance of
the bloom filter for different entry sizes and different numbers of
hash functions used, as well as comparisons for hashmap lookups
with vs. without the bloom filter.

A high level overview of this patchset is as follows:
1/5 - kernel changes for adding bloom filter map
2/5 - libbpf changes for adding map_extra flags
3/5 - tests for the bloom filter map
4/5 - benchmarks for bloom filter lookup/update throughput and false positive
      rate
5/5 - benchmarks for how hashmap lookups perform with vs. without the bloom
      filter

v5 -> v6:
* in 1/5: remove "inline" from the hash function, add check in syscall to
fail out in cases where map_extra is not 0 for non-bloom-filter maps,
fix alignment matching issues, move "map_extra flags" comments to inside
the bpf_attr struct, add bpf_map_info map_extra changes here, add map_extra
assignment in bpf_map_get_info_by_fd, change hash value_size to u32 instead of
a u64
* in 2/5: remove bpf_map_info map_extra changes, remove TODO comment about
extending BTF arrays to cover u64s, cast to unsigned long long for %llx when
printing out map_extra flags
* in 3/5: use __type(value, ...) instead of __uint(value_size, ...) for values
and keys
* in 4/5: fix wrong bounds for the index when iterating through random values,
update commit message to include update+lookup benchmark results for 8-byte
and 64-byte value sizes, remove explicit global bool initialization to false
for hashmap_use_bloom and count_false_hits variables

v4 -> v5:
* Change the "bitset map with bloom filter capabilities" to a bloom filter map
with max_entries signifying the number of unique entries expected in the bloom
filter, remove bitset tests
* Reduce verbiage by changing "bloom_filter" to "bloom", and renaming progs to
more concise names.
* in 2/5: remove "map_extra" from struct definitions that are frozen, create a
"bpf_create_map_params" struct to propagate map_extra to the kernel at map
creation time, change map_extra to __u64
* in 4/5: check pthread condition variable in a loop when generating initial
map data, remove "err" checks where not pragmatic, generate random values
for the hashmap in the setup() instead of in the bpf program, add check_args()
for checking that there aren't more requested entries than possible unique
entries for the specified value size
* in 5/5: Update commit message with updated benchmark data
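
Since max_entries now signifies the number of unique entries expected,
the bitmap size has to be derived from it. The sketch below shows one
common way to do that, based on the textbook relation m = n * k / ln 2
between bits (m), entries (n), and hash functions (k). It is an
assumption for illustration, not a description of what the patch does;
the function names, the 7/5 approximation of 1/ln 2, and the rounding
to a power of two are all hypothetical choices.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two (returns 1 for v <= 1). */
static uint64_t roundup_pow2(uint64_t v)
{
	uint64_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Hypothetical sizing helper: m = n * k / ln 2, with 1/ln 2 (about 1.44)
 * approximated by the integer ratio 7/5, then rounded up to a power of
 * two so a bit index can be taken with a mask instead of a modulo.
 */
static uint64_t bloom_nr_bits(uint32_t max_entries, uint32_t nr_hash_funcs)
{
	assert(max_entries > 0);
	assert(nr_hash_funcs > 0 && nr_hash_funcs <= 15);

	uint64_t bits = (uint64_t)max_entries * nr_hash_funcs * 7 / 5;

	return roundup_pow2(bits);
}
```

For example, 1000 expected entries with 5 hash functions gives 7000 bits
before rounding and 8192 bits after.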

v3 -> v4:
* Generalize the bloom filter map to be a bitset map with bloom filter
capabilities
* Add map_extra flags; pass in nr_hash_funcs through lower 4 bits of map_extra
for the bitset map
* Add tests for the bitset map (non-bloom filter) functionality
* In the benchmarks, stats are computed only as monotonic increases, and place
stats in a struct instead of as a percpu_array bpf map
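
As an illustration of passing nr_hash_funcs through the lower 4 bits of
map_extra, the packing and unpacking could look roughly like the sketch
below. The macro and function names are hypothetical, and falling back
to 5 hash functions when the field is zero is an assumption taken from
the v1 -> v2 notes; the actual definitions live in the patches.

```c
#include <assert.h>
#include <stdint.h>

#define NR_HASH_FUNCS_MASK    0xFULL	/* lower 4 bits; hypothetical name */
#define DEFAULT_NR_HASH_FUNCS 5u	/* assumed default, per v1 -> v2 notes */

/* Pack the number of hash functions into a map_extra value. */
static uint64_t map_extra_encode(uint32_t nr_hash_funcs)
{
	assert(nr_hash_funcs <= NR_HASH_FUNCS_MASK);
	return (uint64_t)nr_hash_funcs & NR_HASH_FUNCS_MASK;
}

/* Extract the number of hash functions; 0 means "use the default". */
static uint32_t map_extra_nr_hash_funcs(uint64_t map_extra)
{
	uint32_t n = (uint32_t)(map_extra & NR_HASH_FUNCS_MASK);

	return n ? n : DEFAULT_NR_HASH_FUNCS;
}
```

Four bits allow values from 1 to 15, with 0 reserved in this sketch to
select the default.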

v2 -> v3:
* Add libbpf changes for supporting nr_hash_funcs, instead of passing the
number of hash functions through map_flags.
* Separate the hashing logic in kernel/bpf/bloom_filter.c into a helper

v1 -> v2:
* Remove libbpf changes, and pass the number of hash functions through
map_flags instead.
* Default to using 5 hash functions if no number of hash functions
is specified.
* Use set_bit instead of spinlocks in the bloom filter bitmap. This
improved the speed significantly. For example, using 5 hash functions
with 100k entries, there was roughly a 35% speed increase.
* Use jhash2 (instead of jhash) for u32-aligned value sizes. This
increased the speed by roughly 5 to 15%. When using jhash2 on value
sizes non-u32 aligned (truncating any remainder bits), there was not
a noticeable difference.
* Add test for using the bloom filter as an inner map.
* Reran the benchmarks, updated the commit messages to correspond to
the new results.
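
The switch from spinlocks to set_bit relies on the fact that setting a
single bit can be done as one atomic read-modify-write, so concurrent
updaters need no lock. A rough userspace analogue using C11 atomics is
sketched below; it is not the kernel's set_bit(), and the function names
and the relaxed memory ordering are assumptions for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define BITS_PER_WORD 64u

/* Atomically set one bit; safe to call from several threads at once. */
static void bitmap_set_bit(_Atomic uint64_t *bitmap, uint32_t nr_bits,
			   uint32_t bit)
{
	assert(bit < nr_bits);
	atomic_fetch_or_explicit(&bitmap[bit / BITS_PER_WORD],
				 (uint64_t)1 << (bit % BITS_PER_WORD),
				 memory_order_relaxed);
}

/* Read one bit without taking a lock. */
static bool bitmap_test_bit(_Atomic uint64_t *bitmap, uint32_t nr_bits,
			    uint32_t bit)
{
	assert(bit < nr_bits);

	uint64_t word = atomic_load_explicit(&bitmap[bit / BITS_PER_WORD],
					     memory_order_relaxed);

	return (word >> (bit % BITS_PER_WORD)) & 1u;
}
```

Because bits in a bloom filter are only ever set and never cleared,
two updaters racing on the same word cannot lose each other's bits.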

Joanne Koong (5):
  bpf: Add bloom filter map implementation
  libbpf: Add "map_extra" as a per-map-type extra flag
  selftests/bpf: Add bloom filter map test cases
  bpf/benchs: Add benchmark tests for bloom filter throughput + false
  bpf/benchs: Add benchmarks for comparing hashmap lookups w/ vs. w/out
    bloom filter

 include/linux/bpf.h                           |   1 +
 include/linux/bpf_types.h                     |   1 +
 include/uapi/linux/bpf.h                      |   9 +
 kernel/bpf/Makefile                           |   2 +-
 kernel/bpf/bloom_filter.c                     | 195 +++++++
 kernel/bpf/syscall.c                          |  24 +-
 kernel/bpf/verifier.c                         |  19 +-
 tools/include/uapi/linux/bpf.h                |   9 +
 tools/lib/bpf/bpf.c                           |  27 +-
 tools/lib/bpf/bpf_gen_internal.h              |   2 +-
 tools/lib/bpf/gen_loader.c                    |   3 +-
 tools/lib/bpf/libbpf.c                        |  38 +-
 tools/lib/bpf/libbpf.h                        |   3 +
 tools/lib/bpf/libbpf.map                      |   2 +
 tools/lib/bpf/libbpf_internal.h               |  25 +-
 tools/testing/selftests/bpf/Makefile          |   6 +-
 tools/testing/selftests/bpf/bench.c           |  60 ++-
 tools/testing/selftests/bpf/bench.h           |   3 +
 .../bpf/benchs/bench_bloom_filter_map.c       | 477 ++++++++++++++++++
 .../bpf/benchs/run_bench_bloom_filter_map.sh  |  45 ++
 .../bpf/benchs/run_bench_ringbufs.sh          |  30 +-
 .../selftests/bpf/benchs/run_common.sh        |  60 +++
 .../bpf/prog_tests/bloom_filter_map.c         | 204 ++++++++
 .../selftests/bpf/progs/bloom_filter_bench.c  | 153 ++++++
 .../selftests/bpf/progs/bloom_filter_map.c    |  82 +++
 25 files changed, 1429 insertions(+), 51 deletions(-)
 create mode 100644 kernel/bpf/bloom_filter.c
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c
 create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_bloom_filter_map.sh
 create mode 100644 tools/testing/selftests/bpf/benchs/run_common.sh
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bloom_filter_map.c
 create mode 100644 tools/testing/selftests/bpf/progs/bloom_filter_bench.c
 create mode 100644 tools/testing/selftests/bpf/progs/bloom_filter_map.c


Thread overview: 21+ messages
2021-10-27 23:44 Joanne Koong [this message]
2021-10-27 23:45 ` [PATCH v6 bpf-next 1/5] bpf: Add bloom filter map implementation Joanne Koong
2021-10-28 18:15   ` Andrii Nakryiko
2021-10-29  0:15     ` Joanne Koong
2021-10-29  0:44       ` Andrii Nakryiko
2021-10-28 20:35   ` Alexei Starovoitov
2021-10-28 21:14   ` Martin KaFai Lau
2021-10-29  3:17     ` Joanne Koong
2021-10-29  4:49       ` Martin KaFai Lau
     [not found]         ` <6d930e97-424d-393d-4731-ac8eda9e5156@fb.com>
2021-10-29  6:40           ` Martin KaFai Lau
2021-10-27 23:45 ` [PATCH v6 bpf-next 2/5] libbpf: Add "map_extra" as a per-map-type extra flag Joanne Koong
2021-10-28 18:14   ` Andrii Nakryiko
2021-10-27 23:45 ` [PATCH v6 bpf-next 3/5] selftests/bpf: Add bloom filter map test cases Joanne Koong
2021-10-28 18:16   ` Andrii Nakryiko
2021-10-27 23:45 ` [PATCH v6 bpf-next 4/5] bpf/benchs: Add benchmark tests for bloom filter throughput + false positive Joanne Koong
2021-10-28 18:26   ` Andrii Nakryiko
2021-10-27 23:45 ` [PATCH v6 bpf-next 5/5] bpf/benchs: Add benchmarks for comparing hashmap lookups w/ vs. w/out bloom filter Joanne Koong
2021-10-28 22:10 ` [PATCH v6 bpf-next 0/5] Implement bloom filter map Martin KaFai Lau
2021-10-28 23:05   ` Alexei Starovoitov
2021-10-29  0:23     ` Joanne Koong
2021-10-29  0:30       ` Alexei Starovoitov
