[PATCH bpf-next v4 0/5] Implement bitset maps, with bloom filter

* [PATCH bpf-next v4 0/5] Implement bitset maps, with bloom filter
@ 2021-10-06 22:20 Joanne Koong
  2021-10-06 22:20 ` [PATCH bpf-next v4 1/5] bpf: Add bitset map with bloom filter capabilities Joanne Koong
                   ` (4 more replies)
  0 siblings, 5 replies; 34+ messages in thread
From: Joanne Koong @ 2021-10-06 22:20 UTC (permalink / raw)
  To: bpf; +Cc: Kernel-team, Joanne Koong

This patchset adds a new kind of bpf map: the bitset map.

A bitset is an array data structure that compactly stores bits. It is a
non-associative data type and is often utilized to denote whether an element
exists in a set. A bitset is effective at exploiting bit-level parallelism in
hardware to perform operations quickly. For more information, please see
https://en.wikipedia.org/wiki/Bit_array

When a special flag is set, the bitset can be utilized as a bloom filter.
A bloom filter is a space-efficient probabilistic data structure used to
quickly test whether whether an element exists in a set. In a bloom filter,
false positives are possible whereas false negatives should never be. For a
more thorough overview about how bloom filters work,
https://en.wikipedia.org/wiki/Bloom_filter may be helpful.

One example use-case is an application leveraging a bloom filter map to
determine whether a computationally expensive hashmap lookup can be avoided. If
the element was not found in the bloom filter map, the hashmap lookup can be
skipped.

A high level overview of this patchset is as follows:
1/5 - kernel changes for the bitset map, with bloom filter capabilities
2/5 - libbpf changes for adding map_extra flags
3/5 - tests for the bitset map and for bloom filter capabilities
4/5 - benchmarks for bloom filter lookup/update throughput and false positive
rate
5/5 - benchmarks for how hashmap lookups perform with vs. without the bloom
filter

v3 -> v4:
* Generalize the bloom filter map to be a bitset map with bloom filter
capabilities
* Add map_extra flags; pass in nr_hash_funcs through lower 4 bits of map_extra
for the bitset map
* Add tests for the bitset map (non-bloom filter) functionality
* In the benchmarks, stats are computed only as monotonic increases. Placed
stats in a struct instead of as a percpu_array bpf map

v2 -> v3:
* Add libbpf changes for supporting nr_hash_funcs, instead of passing the
number of hash functions through map_flags.
* Separate the hashing logic in kernel/bpf/bloom_filter.c into a helper
function

v1 -> v2:
* Remove libbpf changes, and pass the number of hash functions through
map_flags instead.
* Default to using 5 hash functions if no number of hash functions
is specified.
* Use set_bit instead of spinlocks in the bloom filter bitmap. This
improved the speed significantly. For example, using 5 hash functions
with 100k entries, there was roughly a 35% speed increase.
* Use jhash2 (instead of jhash) for u32-aligned value sizes. This
increased the speed by roughly 5 to 15%. When using jhash2 on value
sizes non-u32 aligned (truncating any remainder bits), there was not
a noticeable difference.
* Add test for using the bloom filter as an inner map.
* Reran the benchmarks, updated the commit messages to correspond to
the new results.

Joanne Koong (5):
  bpf: Add bitset map with bloom filter capabilities
  libbpf: Add "map_extra" as a per-map-type extra flag
  selftests/bpf: Add bitset map test cases
  bpf/benchs: Add benchmark tests for bloom filter throughput + false
    positive
  bpf/benchs: Add benchmarks for comparing hashmap lookups w/ vs. w/out
    bloom filter

 include/linux/bpf.h                           |   2 +
 include/linux/bpf_types.h                     |   1 +
 include/uapi/linux/bpf.h                      |  10 +
 kernel/bpf/Makefile                           |   2 +-
 kernel/bpf/bitset.c                           | 256 ++++++++++
 kernel/bpf/syscall.c                          |  25 +-
 kernel/bpf/verifier.c                         |  10 +-
 tools/include/uapi/linux/bpf.h                |  10 +
 tools/lib/bpf/bpf.c                           |   1 +
 tools/lib/bpf/bpf.h                           |   1 +
 tools/lib/bpf/bpf_helpers.h                   |   1 +
 tools/lib/bpf/libbpf.c                        |  25 +-
 tools/lib/bpf/libbpf.h                        |   4 +
 tools/lib/bpf/libbpf.map                      |   2 +
 tools/lib/bpf/libbpf_internal.h               |   4 +-
 tools/testing/selftests/bpf/Makefile          |   6 +-
 tools/testing/selftests/bpf/bench.c           |  59 ++-
 tools/testing/selftests/bpf/bench.h           |   3 +
 .../bpf/benchs/bench_bloom_filter_map.c       | 448 ++++++++++++++++++
 .../bpf/benchs/run_bench_bloom_filter_map.sh  |  45 ++
 .../bpf/benchs/run_bench_ringbufs.sh          |  30 +-
 .../selftests/bpf/benchs/run_common.sh        |  60 +++
 tools/testing/selftests/bpf/bpf_util.h        |  11 +
 .../selftests/bpf/prog_tests/bitset_map.c     | 279 +++++++++++
 .../testing/selftests/bpf/progs/bitset_map.c  | 115 +++++
 .../selftests/bpf/progs/bloom_filter_bench.c  | 146 ++++++
 26 files changed, 1513 insertions(+), 43 deletions(-)
 create mode 100644 kernel/bpf/bitset.c
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c
 create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_bloom_filter_map.sh
 create mode 100644 tools/testing/selftests/bpf/benchs/run_common.sh
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bitset_map.c
 create mode 100644 tools/testing/selftests/bpf/progs/bitset_map.c
 create mode 100644 tools/testing/selftests/bpf/progs/bloom_filter_bench.c

-- 
2.30.2

^ permalink raw reply	[flat|nested] 34+ messages in thread