Re: [PATCH v5 bpf-next 4/5] bpf/benchs: Add benchmark tests for bloom filter throughput + false positive

From: Andrii Nakryiko <andriin@fb.com>
To: Joanne Koong <joannekoong@fb.com>, <bpf@vger.kernel.org>
Cc: <Kernel-team@fb.com>, <andrii@kernel.org>
Subject: Re: [PATCH v5 bpf-next 4/5] bpf/benchs: Add benchmark tests for bloom filter throughput + false positive
Date: Tue, 26 Oct 2021 20:49:36 -0700
Message-ID: <4c92fd62-ab88-3dac-ce90-5e00b1c62dd7@fb.com>
In-Reply-To: <20211022220249.2040337-5-joannekoong@fb.com>

On 10/22/21 3:02 PM, Joanne Koong wrote:
> This patch adds benchmark tests for the throughput (for lookups + updates)
> and the false positive rate of bloom filter lookups, as well as some
> minor refactoring of the bash script for running the benchmarks.
>
> These benchmarks show that as the number of hash functions increases,
> the throughput and the false positive rate of the bloom filter decreases.
> From the benchmark data, the approximate average false-positive rates for
> 8-byte values are roughly as follows:
>
> 1 hash function = ~30%
> 2 hash functions = ~15%
> 3 hash functions = ~5%
> 4 hash functions = ~2.5%
> 5 hash functions = ~1%
> 6 hash functions = ~0.5%
> 7 hash functions = ~0.35%
> 8 hash functions = ~0.15%
> 9 hash functions = ~0.1%
> 10 hash functions = ~0%

Can you please post the update/lookup benchmark results for reference?
Maybe pick 8-byte and, say, 64-byte value sizes? It's good to have them
in the history, because not everyone is going to run the benchmarks to
see for themselves.

> Signed-off-by: Joanne Koong <joannekoong@fb.com>
> ---
>   tools/testing/selftests/bpf/Makefile          |   6 +-
>   tools/testing/selftests/bpf/bench.c           |  37 ++
>   tools/testing/selftests/bpf/bench.h           |   3 +
>   .../bpf/benchs/bench_bloom_filter_map.c       | 420 ++++++++++++++++++
>   .../bpf/benchs/run_bench_bloom_filter_map.sh  |  28 ++
>   .../bpf/benchs/run_bench_ringbufs.sh          |  30 +-
>   .../selftests/bpf/benchs/run_common.sh        |  48 ++
>   .../selftests/bpf/progs/bloom_filter_bench.c  | 153 +++++++
>   8 files changed, 695 insertions(+), 30 deletions(-)
>   create mode 100644 tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c
>   create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_bloom_filter_map.sh
>   create mode 100644 tools/testing/selftests/bpf/benchs/run_common.sh
>   create mode 100644 tools/testing/selftests/bpf/progs/bloom_filter_bench.c
>

[...]

> +SEC("fentry/__x64_sys_getpgid")
> +int bloom_hashmap_lookup(void *ctx)
> +{
> +	__u64 *result;
> +	int i, err;
> +
> +	__u32 index = bpf_get_prandom_u32();
> +
> +	for (i = 0; i < 1024; i++, index += value_size) {
> +		if (index >= nr_rand_bytes)

This if seems wrong. If you allow index to go all the way up to
(2500000 - 1), then you'll be reading value_size - 1 bytes past the end
of rand_vals. The verifier doesn't complain because there is quite a lot
of space for percpu_stats after that, but it's a bug. Just drop the if
condition and always do the index masking and it should be ok.
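
To make the off-by-value_size concrete, here is a rough userspace sketch
(not the BPF program itself) comparing the two clamping strategies. The
sizes come from this thread (2500000 bytes in rand_vals, a 21-bit mask);
the helper names clamp_conditional, clamp_always and read_in_bounds are
made up for illustration and do not exist in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes taken from the thread; macro names are illustrative only. */
#define NR_RAND_BYTES 2500000u
#define INDEX_MASK ((1u << 21) - 1)

/* Behaviour in the patch: mask only once index reaches nr_rand_bytes. */
static uint32_t clamp_conditional(uint32_t index)
{
	if (index >= NR_RAND_BYTES)
		index &= INDEX_MASK;
	return index;
}

/* Suggested behaviour: always mask, so index never exceeds 2^21 - 1. */
static uint32_t clamp_always(uint32_t index)
{
	return index & INDEX_MASK;
}

/* Returns 1 if reading value_size bytes at index stays inside rand_vals. */
static int read_in_bounds(uint32_t index, uint32_t value_size)
{
	assert(value_size <= NR_RAND_BYTES);
	return index <= NR_RAND_BYTES - value_size;
}
```

With unconditional masking the largest possible index is 2^21 - 1 =
2097151, so a read stays inside the 2500000-byte array for any
value_size up to 402849 bytes. The conditional version leaves indexes in
the range (2500000 - value_size, 2500000) untouched, which is where the
overread happens.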

> +			index = index & ((1ULL << 21) - 1);
> +
> +		if (hashmap_use_bloom) {
> +			err = bpf_map_peek_elem(&bloom_map,
> +						rand_vals + index);
> +			if (err) {
> +				if (err != -ENOENT) {
> +					error |= 2;
> +					return 0;
> +				}
> +				log_result(hit_key);
> +				continue;
> +			}
> +		}
> +
> +		result = bpf_map_lookup_elem(&hashmap,
> +					     rand_vals + index);
> +		if (result) {
> +			log_result(hit_key);
> +		} else {
> +			if (hashmap_use_bloom && count_false_hits)
> +				log_result(false_hit_key);
> +			log_result(drop_key);
> +		}
> +	}
> +
> +	return 0;
> +}