* [PATCH RFC bpf-next 0/4] bpf_jhash_mem() and BPF Bloom filter implementation
From: Andrii Nakryiko @ 2021-09-22 20:32 UTC
  To: bpf, ast, daniel; +Cc: kafai, joannekoong, Andrii Nakryiko

This is a quick experiment with adding a hashing helper and building a Bloom
filter in pure BPF code, with no extra kernel functionality beyond the generic
hashing helper.

This is based on top of Joanne's series [0].

Patch 1 adds the bpf_jhash_mem() helper. Patch 2 fixes the existing benchmark
to report valid and consistent results and reduces the overhead that the stats
counting itself causes. Patch 3 contains the BPF-side implementation of the
Bloom filter, based on global variables. Patch 4 completes the integration on
the user-space side. I split patches 3 and 4 so that the user-space plumbing
doesn't distract from the BPF-side changes.
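
The BPF-side core is small enough to show up front. Condensed from patch 3
(which also has the matching bloom_contains() lookup), the filter is just a
global array probed once per hash function with the new helper; bloom_mask,
bloom_hash_cnt, and bloom_seed are filled in by user space before skeleton
load (patch 4):

    static __u64 bloom_arr[256 * 1024];  /* bit array, lives in .bss */
    const volatile __u32 bloom_mask;     /* word-index mask; array size is a power of two */
    const volatile __u32 bloom_hash_cnt; /* number of hash functions */
    const volatile __u32 bloom_seed;

    static void bloom_add(const void *data, __u32 sz)
    {
            __u32 i, h;

            for (i = 0; i < bloom_hash_cnt; i++) {
                    h = bpf_jhash_mem(data, sz, bloom_seed + i);
                    __sync_fetch_and_or(&bloom_arr[(h / 64) & bloom_mask], 1ULL << (h % 64));
            }
    }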

Note that "Hashmap without bloom filter vs. hashmap with bloom filter"
benchmark is still spending lots of time in just generating random values on
BPF side, and it would be nice to optimize that and make it more reproducible
to compare different implementations. But I ran out of steam for doing that,
especially that I'm not sure this changes anything.

The same benchmark also exercises only short 8-byte keys, which is a valid use
case but probably not the only one, so it would be nice to have that extended
as well.


For reference, here are the full benchmark results comparing the in-kernel
Bloom filter and the custom BPF-implemented one. I shortened the set of tested
combinations to reduce the time needed to collect results.

The results for "Hashmap without bloom filter vs. hashmap with bloom filter"
are quite confusing, though. I spent a bit of time to try to find
discrepancies. I confirmed that both implementations function correctly and
match 100% of time in terms of positives/negatives. Pure Bloom filter reading
benchmarks show a pretty small gap in performance with custom BPF
implementation losing. The "Hashmap without bloom filter vs. hashmap with bloom
filter" benchmark shows much bigger difference, though, which I wasn't able to
completely explain, to be entirely honest.

It's probably worth spending some more time investigating this, but I ran out
of self-allotted time for it.


Bloom filter map
================
        # threads: 1, # hashes: 1
10,000 entries -
        Total operations:    50.854 ± 0.134M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  31.36%
        [CUSTOM] Total operations:  49.391 ± 0.123M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  31.37%
100,000 entries -
        Total operations:    50.969 ± 0.049M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  24.04%
        [CUSTOM] Total operations:  49.135 ± 1.579M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  24.04%
1,000,000 entries -
        Total operations:    48.474 ± 1.619M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  27.50%
        [CUSTOM] Total operations:  46.088 ± 0.776M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  27.50%

        # threads: 1, # hashes: 3
10,000 entries -
        Total operations:    25.136 ± 0.011M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  4.71%
        [CUSTOM] Total operations:  24.115 ± 0.014M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  4.77%
100,000 entries -
        Total operations:    25.045 ± 0.120M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  7.67%
        [CUSTOM] Total operations:  23.028 ± 0.042M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  7.65%
1,000,000 entries -
        Total operations:    18.712 ± 0.406M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  2.64%
        [CUSTOM] Total operations:  18.100 ± 0.422M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  2.64%

        # threads: 1, # hashes: 5
10,000 entries -
        Total operations:    16.672 ± 0.011M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.32%
        [CUSTOM] Total operations:  15.318 ± 0.014M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.32%
100,000 entries -
        Total operations:    16.540 ± 0.121M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.78%
        [CUSTOM] Total operations:  15.189 ± 0.045M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.78%
1,000,000 entries -
        Total operations:    11.781 ± 0.192M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  1.79%
        [CUSTOM] Total operations:  11.651 ± 0.012M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  1.79%

        # threads: 1, # hashes: 10
10,000 entries -
        Total operations:    9.038 ± 0.052M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.00%
        [CUSTOM] Total operations:  8.620 ± 0.008M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.00%
100,000 entries -
        Total operations:    8.076 ± 0.027M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.01%
        [CUSTOM] Total operations:  7.261 ± 0.007M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.01%
1,000,000 entries -
        Total operations:    6.263 ± 0.041M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.03%
        [CUSTOM] Total operations:  6.173 ± 0.013M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.03%

        # threads: 4, # hashes: 1
10,000 entries -
        Total operations:    203.453 ± 0.161M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  31.28%
        [CUSTOM] Total operations:  197.959 ± 0.051M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  31.34%
100,000 entries -
        Total operations:    203.887 ± 0.155M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  24.07%
        [CUSTOM] Total operations:  197.476 ± 1.796M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  24.09%
1,000,000 entries -
        Total operations:    189.259 ± 0.473M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  27.50%
        [CUSTOM] Total operations:  185.157 ± 0.346M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  27.48%

        # threads: 4, # hashes: 3
10,000 entries -
        Total operations:    100.394 ± 0.062M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  4.76%
        [CUSTOM] Total operations:  93.896 ± 0.104M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  4.75%
100,000 entries -
        Total operations:    100.382 ± 0.155M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  7.62%
        [CUSTOM] Total operations:  93.460 ± 0.145M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  7.65%
1,000,000 entries -
        Total operations:    71.424 ± 0.710M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  2.65%
        [CUSTOM] Total operations:  72.546 ± 0.228M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  2.65%

        # threads: 4, # hashes: 5
10,000 entries -
        Total operations:    66.652 ± 0.116M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.32%
        [CUSTOM] Total operations:  60.790 ± 0.454M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.33%
100,000 entries -
        Total operations:    66.401 ± 0.090M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.78%
        [CUSTOM] Total operations:  61.066 ± 0.069M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.77%
1,000,000 entries -
        Total operations:    48.299 ± 0.369M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  1.79%
        [CUSTOM] Total operations:  47.401 ± 0.347M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  1.79%

        # threads: 4, # hashes: 10
10,000 entries -
        Total operations:    36.273 ± 0.030M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.00%
        [CUSTOM] Total operations:  34.464 ± 0.073M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.00%
100,000 entries -
        Total operations:    32.525 ± 0.047M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.01%
        [CUSTOM] Total operations:  29.516 ± 0.110M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.01%
1,000,000 entries -
        Total operations:    25.515 ± 0.405M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.03%
        [CUSTOM] Total operations:  24.566 ± 0.189M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.03%

        # threads: 8, # hashes: 1
10,000 entries -
        Total operations:    406.129 ± 2.262M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  31.45%
        [CUSTOM] Total operations:  384.758 ± 0.379M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  31.24%
100,000 entries -
        Total operations:    407.817 ± 0.793M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  24.05%
        [CUSTOM] Total operations:  394.745 ± 1.302M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  24.05%
1,000,000 entries -
        Total operations:    383.159 ± 0.289M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  27.49%
        [CUSTOM] Total operations:  371.173 ± 0.454M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  27.49%

        # threads: 8, # hashes: 3
10,000 entries -
        Total operations:    201.079 ± 0.183M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  4.69%
        [CUSTOM] Total operations:  187.658 ± 0.544M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  4.74%
100,000 entries -
        Total operations:    199.826 ± 0.972M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  7.63%
        [CUSTOM] Total operations:  185.415 ± 0.358M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  7.62%
1,000,000 entries -
        Total operations:    148.589 ± 1.320M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  2.65%
        [CUSTOM] Total operations:  142.591 ± 4.825M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  2.64%

        # threads: 8, # hashes: 5
10,000 entries -
        Total operations:    133.306 ± 0.468M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.32%
        [CUSTOM] Total operations:  127.377 ± 0.271M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.32%
100,000 entries -
        Total operations:    132.915 ± 0.364M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.78%
        [CUSTOM] Total operations:  123.722 ± 3.169M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.77%
1,000,000 entries -
        Total operations:    94.803 ± 3.240M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  1.79%
        [CUSTOM] Total operations:  95.670 ± 2.624M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  1.79%

        # threads: 8, # hashes: 10
10,000 entries -
        Total operations:    72.517 ± 0.083M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.00%
        [CUSTOM] Total operations:  65.803 ± 0.069M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.00%
100,000 entries -
        Total operations:    65.533 ± 0.148M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.01%
        [CUSTOM] Total operations:  59.908 ± 2.841M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.01%
1,000,000 entries -
        Total operations:    50.937 ± 0.105M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.03%
        [CUSTOM] Total operations:  47.526 ± 1.044M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.03%


Hashmap without bloom filter vs. hashmap with bloom filter (throughput, 8 threads)
==================================================================================
        # hashes: 1
10,000 entries -
        Hashmap without bloom filter:  123.919 ± 0.458M/s
        Hashmap with bloom filter:  124.588 ± 0.450M/s
        [CUSTOM] Hashmap with bloom filter:  106.838 ± 0.114M/s
100,000 entries -
        Hashmap without bloom filter:  93.708 ± 0.715M/s
        Hashmap with bloom filter:  131.686 ± 1.272M/s
        [CUSTOM] Hashmap with bloom filter:  105.649 ± 0.278M/s
1,000,000 entries -
        Hashmap without bloom filter:  40.040 ± 0.677M/s
        Hashmap with bloom filter:  67.250 ± 0.506M/s
        [CUSTOM] Hashmap with bloom filter:  58.356 ± 0.541M/s

        # hashes: 3
10,000 entries -
        Hashmap without bloom filter:  123.882 ± 0.077M/s
        Hashmap with bloom filter:  152.423 ± 0.061M/s
        [CUSTOM] Hashmap with bloom filter:  126.021 ± 0.115M/s
100,000 entries -
        Hashmap without bloom filter:  94.291 ± 0.986M/s
        Hashmap with bloom filter:  127.944 ± 0.825M/s
        [CUSTOM] Hashmap with bloom filter:  106.943 ± 0.827M/s
1,000,000 entries -
        Hashmap without bloom filter:  40.183 ± 0.382M/s
        Hashmap with bloom filter:  125.224 ± 0.266M/s
        [CUSTOM] Hashmap with bloom filter:  89.717 ± 0.158M/s

        # hashes: 5
10,000 entries -
        Hashmap without bloom filter:  120.510 ± 0.351M/s
        Hashmap with bloom filter:  170.138 ± 0.088M/s
        [CUSTOM] Hashmap with bloom filter:  136.006 ± 0.324M/s
100,000 entries -
        Hashmap without bloom filter:  94.774 ± 0.191M/s
        Hashmap with bloom filter:  145.559 ± 0.687M/s
        [CUSTOM] Hashmap with bloom filter:  117.073 ± 0.382M/s
1,000,000 entries -
        Hashmap without bloom filter:  40.004 ± 0.492M/s
        Hashmap with bloom filter:  96.916 ± 0.148M/s
        [CUSTOM] Hashmap with bloom filter:  78.485 ± 0.289M/s

        # hashes: 10
10,000 entries -
        Hashmap without bloom filter:  124.034 ± 0.245M/s
        Hashmap with bloom filter:  169.757 ± 0.336M/s
        [CUSTOM] Hashmap with bloom filter:  134.634 ± 0.276M/s
100,000 entries -
        Hashmap without bloom filter:  94.872 ± 0.195M/s
        Hashmap with bloom filter:  141.107 ± 0.780M/s
        [CUSTOM] Hashmap with bloom filter:  109.279 ± 0.330M/s
1,000,000 entries -
        Hashmap without bloom filter:  40.215 ± 0.396M/s
        Hashmap with bloom filter:  98.084 ± 0.267M/s
        [CUSTOM] Hashmap with bloom filter:  78.579 ± 0.046M/s


  [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=550585&state=*

Andrii Nakryiko (4):
  bpf: add bpf_jhash_mem BPF helper
  selftests/bpf: fix and optimize bloom filter bench
  selftests/bpf: implement custom Bloom filter purely in BPF
  selftests/bpf: integrate custom Bloom filter impl into benchs

 include/uapi/linux/bpf.h                      |   8 +
 kernel/bpf/helpers.c                          |  23 +++
 tools/include/uapi/linux/bpf.h                |   8 +
 tools/testing/selftests/bpf/Makefile          |   2 +-
 tools/testing/selftests/bpf/bench.c           |   6 +
 .../bpf/benchs/bench_bloom_filter_map.c       | 153 +++++++++++++++++-
 .../bpf/benchs/run_bench_bloom_filter_map.sh  |  22 +--
 .../selftests/bpf/benchs/run_common.sh        |   2 +-
 .../selftests/bpf/progs/bloom_filter_map.c    | 125 ++++++++++++--
 9 files changed, 317 insertions(+), 32 deletions(-)

-- 
2.30.2


* [PATCH RFC bpf-next 1/4] bpf: add bpf_jhash_mem BPF helper
From: Andrii Nakryiko @ 2021-09-22 20:32 UTC
  To: bpf, ast, daniel; +Cc: kafai, joannekoong, Andrii Nakryiko

Add a BPF helper that calculates the jhash of an arbitrary (initialized)
memory slice.
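
For illustration only, a hypothetical caller: the helper accepts any
initialized memory (ARG_PTR_TO_MEM) and an arbitrary seed, so a program can
derive several independent hashes by varying the seed (nr_hashes and seed
below are made-up names):

    /* hypothetical sketch: hash an 8-byte key nr_hashes times */
    __u64 key = 42;
    __u32 i, h;

    for (i = 0; i < nr_hashes; i++) {
            h = bpf_jhash_mem(&key, sizeof(key), seed + i);
            /* use h, e.g. as a bit index into a Bloom filter */
    }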

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/uapi/linux/bpf.h       |  8 ++++++++
 kernel/bpf/helpers.c           | 23 +++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h |  8 ++++++++
 3 files changed, 39 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 2e3048488feb..7b17e46b0be2 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4913,6 +4913,13 @@ union bpf_attr {
  *	Return
  *		The number of bytes written to the buffer, or a negative error
  *		in case of failure.
+ *
+ * u64 bpf_jhash_mem(const void *data, u32 sz, u32 seed)
+ *	Description
+ *		Calculate jhash of a given memory slice.
+ *
+ *	Return
+ *		Calculated hash value.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5093,6 +5100,7 @@ union bpf_attr {
 	FN(task_pt_regs),		\
 	FN(get_branch_snapshot),	\
 	FN(trace_vprintk),		\
+	FN(jhash_mem),			\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 2c604ff8c7fb..00fb1b69595c 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -15,9 +15,30 @@
 #include <linux/pid_namespace.h>
 #include <linux/proc_ns.h>
 #include <linux/security.h>
+#include <linux/jhash.h>
 
 #include "../../lib/kstrtox.h"
 
+BPF_CALL_3(bpf_jhash_mem, const void *, data, u32, sz, u32, seed)
+{
+	if (unlikely((uintptr_t)data % 4 || sz % 4))
+		return jhash(data, sz, seed);
+
+	/* optimized 4-byte version */
+	return jhash2(data, sz / 4, seed);
+}
+
+
+static const struct bpf_func_proto bpf_jhash_mem_proto = {
+	.func		= bpf_jhash_mem,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_MEM,
+	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
+	.arg3_type	= ARG_ANYTHING,
+};
+
+
 /* If kernel subsystem is allowing eBPF programs to call this function,
  * inside its own verifier_ops->get_func_proto() callback it should return
  * bpf_map_lookup_elem_proto, so that verifier can properly check the arguments
@@ -1341,6 +1362,8 @@ const struct bpf_func_proto *
 bpf_base_func_proto(enum bpf_func_id func_id)
 {
 	switch (func_id) {
+	case BPF_FUNC_jhash_mem:
+		return &bpf_jhash_mem_proto;
 	case BPF_FUNC_map_lookup_elem:
 		return &bpf_map_lookup_elem_proto;
 	case BPF_FUNC_map_update_elem:
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 2e3048488feb..7b17e46b0be2 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4913,6 +4913,13 @@ union bpf_attr {
  *	Return
  *		The number of bytes written to the buffer, or a negative error
  *		in case of failure.
+ *
+ * u64 bpf_jhash_mem(const void *data, u32 sz, u32 seed)
+ *	Description
+ *		Calculate jhash of a given memory slice.
+ *
+ *	Return
+ *		Calculated hash value.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5093,6 +5100,7 @@ union bpf_attr {
 	FN(task_pt_regs),		\
 	FN(get_branch_snapshot),	\
 	FN(trace_vprintk),		\
+	FN(jhash_mem),			\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.30.2


* [PATCH RFC bpf-next 2/4] selftests/bpf: fix and optimize bloom filter bench
From: Andrii Nakryiko @ 2021-09-22 20:32 UTC
  To: bpf, ast, daniel; +Cc: kafai, joannekoong, Andrii Nakryiko

Fix inconsistent and highly variable measurement logic by using
always-increasing counters and reporting per-interval deltas. Also reduce the
amount of accounting overhead by incrementing stats once after the entire
loop, use rodata for the bloom filter flag, and build bench files in
optimized mode.
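
The fix follows the usual monotonic-counter pattern; a sketch of what the
reworked measure() does (full version in the diff below):

    /* BPF-side counters only ever grow; report per-interval deltas
     * instead of zeroing shared state between measurements.
     */
    static long last_hits;
    long total_hits = 0;
    int i;

    for (i = 0; i < nr_cpus; i++)
            total_hits += percpu_stats[i].stats[hit_key]; /* read from the skeleton's .bss */

    res->hits = total_hits - last_hits;
    last_hits = total_hits;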

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/testing/selftests/bpf/Makefile          |  2 +-
 .../bpf/benchs/bench_bloom_filter_map.c       | 32 +++++++++++++-
 .../selftests/bpf/progs/bloom_filter_map.c    | 44 ++++++++++++++-----
 3 files changed, 63 insertions(+), 15 deletions(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 5dbaf7f512fd..f36f49e4d0f2 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -515,7 +515,7 @@ $(OUTPUT)/test_cpp: test_cpp.cpp $(OUTPUT)/test_core_extern.skel.h $(BPFOBJ)
 # Benchmark runner
 $(OUTPUT)/bench_%.o: benchs/bench_%.c bench.h $(BPFOBJ)
 	$(call msg,CC,,$@)
-	$(Q)$(CC) $(CFLAGS) -c $(filter %.c,$^) $(LDLIBS) -o $@
+	$(Q)$(CC) $(CFLAGS) -O2 -c $(filter %.c,$^) $(LDLIBS) -o $@
 $(OUTPUT)/bench_rename.o: $(OUTPUT)/test_overhead.skel.h
 $(OUTPUT)/bench_trigger.o: $(OUTPUT)/trigger_bench.skel.h
 $(OUTPUT)/bench_ringbufs.o: $(OUTPUT)/ringbuf_bench.skel.h \
diff --git a/tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c b/tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c
index 7adf80be7292..4f53cd9fb099 100644
--- a/tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c
+++ b/tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c
@@ -248,7 +248,7 @@ static void hashmap_no_bloom_filter_setup(void)
 
 	ctx.skel = setup_skeleton();
 
-	ctx.skel->data->hashmap_use_bloom_filter = false;
+	ctx.skel->rodata->hashmap_use_bloom_filter = false;
 
 	populate_maps();
 
@@ -259,16 +259,26 @@ static void hashmap_no_bloom_filter_setup(void)
 	}
 }
 
+struct stat { __u32 stats[3]; };
+
 static void measure(struct bench_res *res)
 {
+	static long last_hits, last_drops, last_false_hits;
 	long total_hits = 0, total_drops = 0, total_false_hits = 0;
 	unsigned int nr_cpus = bpf_num_possible_cpus();
+	/*
 	BPF_DECLARE_PERCPU(__u64, zeroed_values);
 	BPF_DECLARE_PERCPU(__u64, false_hits);
 	BPF_DECLARE_PERCPU(__u64, drops);
 	BPF_DECLARE_PERCPU(__u64, hits);
 	int err, i, percpu_array_fd;
 	__u32 key;
+	*/
+	int i;
+	int hit_key, drop_key, false_hit_key;
+	hit_key = ctx.skel->rodata->hit_key;
+	drop_key = ctx.skel->rodata->drop_key;
+	false_hit_key = ctx.skel->rodata->false_hit_key;
 
 	if (ctx.skel->bss->error != 0) {
 		fprintf(stderr, "error (%d) when searching the bloom filter\n",
@@ -276,6 +286,7 @@ static void measure(struct bench_res *res)
 		exit(1);
 	}
 
+	/*
 	key = ctx.skel->rodata->hit_key;
 	percpu_array_fd = bpf_map__fd(ctx.skel->maps.percpu_array);
 	err = bpf_map_lookup_elem(percpu_array_fd, &key, hits);
@@ -300,20 +311,36 @@ static void measure(struct bench_res *res)
 			-errno);
 		exit(1);
 	}
+	*/
 
 	for (i = 0; i < nr_cpus; i++) {
+		struct stat *s = (void *)&ctx.skel->bss->percpu_stats[i];
+
+		total_hits += s->stats[hit_key];
+		total_drops += s->stats[drop_key];
+		total_false_hits += s->stats[false_hit_key];
+		/*
 		total_hits += bpf_percpu(hits, i);
 		total_drops += bpf_percpu(drops, i);
 		total_false_hits += bpf_percpu(false_hits, i);
+		*/
 	}
 
+	res->hits = total_hits - last_hits;
+	res->drops = total_drops - last_drops;
+	res->false_hits = total_false_hits - last_false_hits;
+
+	last_hits = total_hits;
+	last_drops = total_drops;
+	last_false_hits = total_false_hits;
+
+	/*
 	res->hits = total_hits;
 	res->drops = total_drops;
 	res->false_hits = total_false_hits;
 
 	memset(zeroed_values, 0, sizeof(zeroed_values));
 
-	/* zero out the percpu array */
 	key = ctx.skel->rodata->hit_key;
 	err = bpf_map_update_elem(percpu_array_fd, &key, zeroed_values, BPF_ANY);
 	if (err) {
@@ -332,6 +359,7 @@ static void measure(struct bench_res *res)
 		fprintf(stderr, "zeroing the percpu array failed: %d\n", -errno);
 		exit(1);
 	}
+	*/
 }
 
 static void *consumer(void *input)
diff --git a/tools/testing/selftests/bpf/progs/bloom_filter_map.c b/tools/testing/selftests/bpf/progs/bloom_filter_map.c
index 05f9706a5ba6..3ae2f9bb5968 100644
--- a/tools/testing/selftests/bpf/progs/bloom_filter_map.c
+++ b/tools/testing/selftests/bpf/progs/bloom_filter_map.c
@@ -23,6 +23,7 @@ struct map_bloom_filter_type {
 	__uint(value_size, sizeof(__u64));
 	__uint(max_entries, 1000);
 	__uint(nr_hash_funcs, 3);
+//	__uint(map_flags, BPF_F_ZERO_SEED);
 } map_bloom_filter SEC(".maps");
 
 struct {
@@ -37,13 +38,19 @@ struct callback_ctx {
 	struct map_bloom_filter_type *map;
 };
 
+struct {
+	__u32 stats[3];
+} __attribute__((aligned(256))) percpu_stats[256];
+
 /* Tracks the number of hits, drops, and false hits */
+/*
 struct {
 	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
 	__uint(max_entries, 3);
 	__type(key, __u32);
 	__type(value, __u64);
 } percpu_array SEC(".maps");
+*/
 
 struct {
 	__uint(type, BPF_MAP_TYPE_HASH);
@@ -56,17 +63,23 @@ const __u32 hit_key  = 0;
 const __u32 drop_key  = 1;
 const __u32 false_hit_key = 2;
 
-bool hashmap_use_bloom_filter = true;
+const volatile bool hashmap_use_bloom_filter = true;
 
 int error = 0;
 
-static __always_inline void log_result(__u32 key)
+static __always_inline void log_result(__u32 key, __u32 val)
 {
+	__u32 cpu = bpf_get_smp_processor_id();
+
+	percpu_stats[cpu & 255].stats[key] += val;
+
+	/*
 	__u64 *count;
 
 	count = bpf_map_lookup_elem(&percpu_array, &key);
 	if (count)
-		*count += 1;
+		*count += val;
+	*/
 }
 
 static __u64
@@ -81,7 +94,7 @@ check_elem(struct bpf_map *map, __u32 *key, __u64 *val,
 		return 1; /* stop the iteration */
 	}
 
-	log_result(hit_key);
+	log_result(hit_key, 1);
 
 	return 0;
 }
@@ -121,37 +134,44 @@ int prog_bloom_filter_hashmap_lookup(void *ctx)
 {
 	__u64 *result;
 	int i, err;
-
 	union {
 		__u64 data64;
 		__u32 data32[2];
 	} val;
+	int hits = 0, drops = 0, false_hits = 0;
+	//bool custom_hit = false, noncustom_hit = false;
 
 	for (i = 0; i < 512; i++) {
-		val.data32[0] = bpf_get_prandom_u32();
-		val.data32[1] = bpf_get_prandom_u32();
+		val.data32[0] = /*i; */ bpf_get_prandom_u32();
+		val.data32[1] = /*i + 1;*/ bpf_get_prandom_u32();
 
-		if (hashmap_use_bloom_filter) {
+		if (hashmap_use_bloom_filter)
+		{
 			err = bpf_map_peek_elem(&map_bloom_filter, &val);
 			if (err) {
 				if (err != -ENOENT) {
 					error |= 3;
 					return 0;
 				}
-				log_result(drop_key);
+				hits++;
+				//noncustom_hit = true;
+				//__sync_fetch_and_add(&bloom_noncustom_hit, 1);
 				continue;
 			}
 		}
 
 		result = bpf_map_lookup_elem(&hashmap, &val);
 		if (result) {
-			log_result(hit_key);
+			hits++;
 		} else {
 			if (hashmap_use_bloom_filter)
-				log_result(false_hit_key);
-			log_result(drop_key);
+				false_hits++;
+			drops++;
 		}
 	}
+	log_result(hit_key, hits);
+	log_result(drop_key, drops);
+	log_result(false_hit_key, false_hits);
 
 	return 0;
 }
-- 
2.30.2


* [PATCH RFC bpf-next 3/4] selftests/bpf: implement custom Bloom filter purely in BPF
From: Andrii Nakryiko @ 2021-09-22 20:32 UTC
  To: bpf, ast, daniel; +Cc: kafai, joannekoong, Andrii Nakryiko

And integrate it into the existing benchmarks (on the BPF side). This is
posted separately from the user-space benchmark changes to emphasize how
little code is necessary on the BPF side to make this work.
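
One step worth spelling out in the code below: each 32-bit hash is split into
a 64-bit-word index and a bit index, so adds stay atomic per word
(__sync_fetch_and_or) and a lookup touches one word per hash function. Since
the array length is a power of two, masking with bloom_mask stands in for a
modulo:

    v = bloom_arr[(h / 64) & bloom_mask]; /* pick the 64-bit word */
    bit_set = (v >> (h % 64)) & 1;        /* pick the bit within it */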

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../selftests/bpf/progs/bloom_filter_map.c    | 81 ++++++++++++++++++-
 1 file changed, 80 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/progs/bloom_filter_map.c b/tools/testing/selftests/bpf/progs/bloom_filter_map.c
index 3ae2f9bb5968..ee7bbde1af45 100644
--- a/tools/testing/selftests/bpf/progs/bloom_filter_map.c
+++ b/tools/testing/selftests/bpf/progs/bloom_filter_map.c
@@ -64,9 +64,43 @@ const __u32 drop_key  = 1;
 const __u32 false_hit_key = 2;
 
 const volatile bool hashmap_use_bloom_filter = true;
+const volatile bool hashmap_use_custom_bloom_filter = false;
+
+
+__u64 bloom_val = 0;
+static __u64 bloom_arr[256 * 1024]; /* enough for 1mln max_elems w/ 10 hash funcs */
+const volatile __u32 bloom_mask;
+const volatile __u32 bloom_hash_cnt;
+const volatile __u32 bloom_seed;
 
 int error = 0;
 
+static void bloom_add(const void *data, __u32 sz)
+{
+	__u32 i = 0, h;
+
+	for (i = 0; i < bloom_hash_cnt; i++) {
+		h = bpf_jhash_mem(data, sz, bloom_seed + i);
+		__sync_fetch_and_or(&bloom_arr[(h / 64) & bloom_mask], 1ULL << (h % 64));
+	}
+}
+
+bool bloom_contains(__u64 val)
+{
+	__u32 i = 0, h;
+	__u32 seed = bloom_seed, msk = bloom_mask;
+	__u64 v, *arr = bloom_arr;
+
+	for (i = bloom_hash_cnt; i > 0; i--) {
+		h = bpf_jhash_mem(&val, sizeof(val), seed);
+		v = arr[(h / 64) & msk];
+		if (!((v >> (h % 64)) & 1))
+			return false;
+		seed++;
+	}
+	return true;
+}
+
 static __always_inline void log_result(__u32 key, __u32 val)
 {
 	__u32 cpu = bpf_get_smp_processor_id();
@@ -99,6 +133,20 @@ check_elem(struct bpf_map *map, __u32 *key, __u64 *val,
 	return 0;
 }
 
+static __u64
+check_elem_custom(struct bpf_map *map, __u32 *key, __u64 *val,
+	   struct callback_ctx *data)
+{
+	if (!bloom_contains(*val)) {
+		error |= 1;
+		return 1; /* stop the iteration */
+	}
+
+	log_result(hit_key, 1);
+
+	return 0;
+}
+
 SEC("fentry/__x64_sys_getpgid")
 int prog_bloom_filter(void *ctx)
 {
@@ -110,6 +158,26 @@ int prog_bloom_filter(void *ctx)
 	return 0;
 }
 
+SEC("fentry/__x64_sys_getpgid")
+int prog_custom_bloom_filter(void *ctx)
+{
+	bpf_for_each_map_elem(&map_random_data, check_elem_custom, NULL, 0);
+
+	return 0;
+}
+
+__u32 bloom_err = 0;
+__u32 bloom_custom_hit = 0;
+__u32 bloom_noncustom_hit = 0;
+
+SEC("fentry/__x64_sys_getpgid")
+int prog_custom_bloom_filter_add(void *ctx)
+{
+	bloom_add(&bloom_val, sizeof(bloom_val));
+
+	return 0;
+}
+
 SEC("fentry/__x64_sys_getpgid")
 int prog_bloom_filter_inner_map(void *ctx)
 {
@@ -145,6 +213,15 @@ int prog_bloom_filter_hashmap_lookup(void *ctx)
 		val.data32[0] = /*i; */ bpf_get_prandom_u32();
 		val.data32[1] = /*i + 1;*/ bpf_get_prandom_u32();
 
+		if (hashmap_use_custom_bloom_filter)
+		{
+			if (!bloom_contains(val.data64)) {
+				hits++;
+				//custom_hit = true;
+				//__sync_fetch_and_add(&bloom_custom_hit, 1);
+				continue;
+			}
+		} 
 		if (hashmap_use_bloom_filter)
 		{
 			err = bpf_map_peek_elem(&map_bloom_filter, &val);
@@ -160,11 +237,13 @@ int prog_bloom_filter_hashmap_lookup(void *ctx)
 			}
 		}
 
+		//bloom_err += (custom_hit != noncustom_hit);
+
 		result = bpf_map_lookup_elem(&hashmap, &val);
 		if (result) {
 			hits++;
 		} else {
-			if (hashmap_use_bloom_filter)
+			if (hashmap_use_custom_bloom_filter || hashmap_use_bloom_filter)
 				false_hits++;
 			drops++;
 		}
-- 
2.30.2


* [PATCH RFC bpf-next 4/4] selftests/bpf: integrate custom Bloom filter impl into benchs
From: Andrii Nakryiko @ 2021-09-22 20:32 UTC
  To: bpf, ast, daniel; +Cc: kafai, joannekoong, Andrii Nakryiko

Add the user-space integration parts: new benchmark definitions, skeleton
setup (including Bloom filter sizing), and updated runner scripts.
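
The one non-obvious piece is sizing the filter before skeleton load: the 7/5
factor below gives roughly 1.4 bits per expected element per hash function,
rounded up to the next power of two so the BPF side can mask instead of
taking a modulo (this mirrors setup_skeleton() in the diff):

    bloom_sz = (long)args.nr_hash_funcs * args.nr_entries / 5 * 7;
    for (i = 64; i < bloom_sz; i *= 2)
            ;                              /* round up to a power of two */
    skel->rodata->bloom_mask = i / 64 - 1; /* word-index mask */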

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/testing/selftests/bpf/bench.c           |   6 +
 .../bpf/benchs/bench_bloom_filter_map.c       | 121 +++++++++++++++++-
 .../bpf/benchs/run_bench_bloom_filter_map.sh  |  22 ++--
 .../selftests/bpf/benchs/run_common.sh        |   2 +-
 4 files changed, 135 insertions(+), 16 deletions(-)

diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c
index 7da1589a9fe0..ab03b259b76f 100644
--- a/tools/testing/selftests/bpf/bench.c
+++ b/tools/testing/selftests/bpf/bench.c
@@ -363,9 +363,12 @@ extern const struct bench bench_rb_custom;
 extern const struct bench bench_pb_libbpf;
 extern const struct bench bench_pb_custom;
 extern const struct bench bench_bloom_filter_map;
+extern const struct bench bench_custom_bloom_filter_map;
 extern const struct bench bench_bloom_filter_false_positive;
+extern const struct bench bench_custom_bloom_filter_false_positive;
 extern const struct bench bench_hashmap_without_bloom_filter;
 extern const struct bench bench_hashmap_with_bloom_filter;
+extern const struct bench bench_hashmap_with_custom_bloom_filter;
 
 static const struct bench *benchs[] = {
 	&bench_count_global,
@@ -388,9 +391,12 @@ static const struct bench *benchs[] = {
 	&bench_pb_libbpf,
 	&bench_pb_custom,
 	&bench_bloom_filter_map,
+	&bench_custom_bloom_filter_map,
 	&bench_bloom_filter_false_positive,
+	&bench_custom_bloom_filter_false_positive,
 	&bench_hashmap_without_bloom_filter,
 	&bench_hashmap_with_bloom_filter,
+	&bench_hashmap_with_custom_bloom_filter,
 };
 
 static void setup_benchmark()
diff --git a/tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c b/tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c
index 4f53cd9fb099..c0ccfbaacef7 100644
--- a/tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c
+++ b/tools/testing/selftests/bpf/benchs/bench_bloom_filter_map.c
@@ -95,11 +95,18 @@ static void *map_prepare_thread(void *arg)
 {
 	int err, random_data_fd, bloom_filter_fd, hashmap_fd;
 	__u64 i, val;
+	struct bpf_link *link;
 
 	bloom_filter_fd = bpf_map__fd(ctx.skel->maps.map_bloom_filter);
 	random_data_fd = bpf_map__fd(ctx.skel->maps.map_random_data);
 	hashmap_fd = bpf_map__fd(ctx.skel->maps.hashmap);
 
+	link = bpf_program__attach(ctx.skel->progs.prog_custom_bloom_filter_add);
+	if (libbpf_get_error(link)) {
+		fprintf(stderr, "failed to attach program!\n");
+		exit(1);
+	}
+
 	while (true) {
 		i = __atomic_add_fetch(&ctx.next_map_idx, 1, __ATOMIC_RELAXED);
 		if (i > args.nr_entries)
@@ -135,8 +142,13 @@ static void *map_prepare_thread(void *arg)
 			fprintf(stderr, "failed to add elem to bloom_filter: %d\n", -errno);
 			break;
 		}
+
+		ctx.skel->bss->bloom_val = val;
+		trigger_bpf_program();
 	}
 
+	bpf_link__destroy(link);
+
 	pthread_mutex_lock(&ctx.map_done_mtx);
 	pthread_cond_signal(&ctx.map_done);
 	pthread_mutex_unlock(&ctx.map_done_mtx);
@@ -146,7 +158,7 @@ static void *map_prepare_thread(void *arg)
 
 static void populate_maps(void)
 {
-	unsigned int nr_cpus = bpf_num_possible_cpus();
+	unsigned int nr_cpus = 1; // bpf_num_possible_cpus();
 	pthread_t map_thread;
 	int i, err;
 
@@ -167,10 +179,10 @@ static void populate_maps(void)
 		exit(1);
 }
 
-static struct bloom_filter_map *setup_skeleton(void)
+static struct bloom_filter_map *setup_skeleton()
 {
 	struct bloom_filter_map *skel;
-	int err;
+	int err, i, bloom_sz;
 
 	setup_libbpf();
 
@@ -204,10 +216,17 @@ static struct bloom_filter_map *setup_skeleton(void)
 		exit(1);
 	}
 
-	if (bloom_filter_map__load(skel)) {
-		fprintf(stderr, "failed to load skeleton\n");
-		exit(1);
+	skel->rodata->bloom_hash_cnt = args.nr_hash_funcs;
+	skel->rodata->bloom_seed = /* 0; */ rand();
+
+	bloom_sz = (long)args.nr_hash_funcs * args.nr_entries / 5 * 7;
+	for (i = 64; i < bloom_sz; i *= 2) {
 	}
+	bloom_sz = i / 64;
+	skel->rodata->bloom_mask = bloom_sz - 1;
+
+	//printf("SET BLOOM SZ TO %d NR_ENTRIES %d HASH CNT %d \n", bloom_sz, args.nr_entries, args.nr_hash_funcs);
+
 
 	return skel;
 }
@@ -218,6 +237,11 @@ static void bloom_filter_map_setup(void)
 
 	ctx.skel = setup_skeleton();
 
+	if (bloom_filter_map__load(ctx.skel)) {
+		fprintf(stderr, "failed to load skeleton\n");
+		exit(1);
+	}
+
 	populate_maps();
 
 	link = bpf_program__attach(ctx.skel->progs.prog_bloom_filter);
@@ -227,12 +251,59 @@ static void bloom_filter_map_setup(void)
 	}
 }
 
+static void custom_bloom_filter_map_setup(void)
+{
+	struct bpf_link *link;
+
+	ctx.skel = setup_skeleton();
+
+	if (bloom_filter_map__load(ctx.skel)) {
+		fprintf(stderr, "failed to load skeleton\n");
+		exit(1);
+	}
+
+	populate_maps();
+
+	link = bpf_program__attach(ctx.skel->progs.prog_custom_bloom_filter);
+	if (!link) {
+		fprintf(stderr, "failed to attach program!\n");
+		exit(1);
+	}
+}
+
 static void hashmap_lookup_setup(void)
 {
 	struct bpf_link *link;
 
 	ctx.skel = setup_skeleton();
 
+	if (bloom_filter_map__load(ctx.skel)) {
+		fprintf(stderr, "failed to load skeleton\n");
+		exit(1);
+	}
+
+	populate_maps();
+
+	link = bpf_program__attach(ctx.skel->progs.prog_bloom_filter_hashmap_lookup);
+	if (!link) {
+		fprintf(stderr, "failed to attach program!\n");
+		exit(1);
+	}
+}
+static void hashmap_lookup_custom_setup(void)
+{
+	struct bpf_link *link;
+
+	ctx.skel = setup_skeleton();
+
+	ctx.skel->rodata->hashmap_use_bloom_filter = false;
+	ctx.skel->rodata->hashmap_use_custom_bloom_filter = true;
+
+	if (bloom_filter_map__load(ctx.skel)) {
+		fprintf(stderr, "failed to load skeleton\n");
+		exit(1);
+	}
+
 	populate_maps();
 
 	link = bpf_program__attach(ctx.skel->progs.prog_bloom_filter_hashmap_lookup);
@@ -250,6 +321,11 @@ static void hashmap_no_bloom_filter_setup(void)
 
 	ctx.skel->rodata->hashmap_use_bloom_filter = false;
 
+	if (bloom_filter_map__load(ctx.skel)) {
+		fprintf(stderr, "failed to load skeleton\n");
+		exit(1);
+	}
+
 	populate_maps();
 
 	link = bpf_program__attach(ctx.skel->progs.prog_bloom_filter_hashmap_lookup);
@@ -378,6 +454,17 @@ const struct bench bench_bloom_filter_map = {
 	.report_final = hits_drops_report_final,
 };
 
+const struct bench bench_custom_bloom_filter_map = {
+	.name = "custom-bloom-filter-map",
+	.validate = validate,
+	.setup = custom_bloom_filter_map_setup,
+	.producer_thread = producer,
+	.consumer_thread = consumer,
+	.measure = measure,
+	.report_progress = hits_drops_report_progress,
+	.report_final = hits_drops_report_final,
+};
+
 const struct bench bench_bloom_filter_false_positive = {
 	.name = "bloom-filter-false-positive",
 	.validate = validate,
@@ -389,6 +476,17 @@ const struct bench bench_bloom_filter_false_positive = {
 	.report_final = false_hits_report_final,
 };
 
+const struct bench bench_custom_bloom_filter_false_positive = {
+	.name = "custom-bloom-filter-false-positive",
+	.validate = validate,
+	.setup = hashmap_lookup_custom_setup,
+	.producer_thread = producer,
+	.consumer_thread = consumer,
+	.measure = measure,
+	.report_progress = false_hits_report_progress,
+	.report_final = false_hits_report_final,
+};
+
 const struct bench bench_hashmap_without_bloom_filter = {
 	.name = "hashmap-without-bloom-filter",
 	.validate = validate,
@@ -410,3 +508,14 @@ const struct bench bench_hashmap_with_bloom_filter = {
 	.report_progress = hits_drops_report_progress,
 	.report_final = hits_drops_report_final,
 };
+
+const struct bench bench_hashmap_with_custom_bloom_filter = {
+	.name = "hashmap-with-custom-bloom-filter",
+	.validate = validate,
+	.setup = hashmap_lookup_custom_setup,
+	.producer_thread = producer,
+	.consumer_thread = consumer,
+	.measure = measure,
+	.report_progress = hits_drops_report_progress,
+	.report_final = hits_drops_report_final,
+};
diff --git a/tools/testing/selftests/bpf/benchs/run_bench_bloom_filter_map.sh b/tools/testing/selftests/bpf/benchs/run_bench_bloom_filter_map.sh
index 239c040b7aaa..ea2c2fff3f40 100755
--- a/tools/testing/selftests/bpf/benchs/run_bench_bloom_filter_map.sh
+++ b/tools/testing/selftests/bpf/benchs/run_bench_bloom_filter_map.sh
@@ -7,9 +7,9 @@ set -eufo pipefail
 
 header "Bloom filter map"
 for t in 1 4 8; do
-for h in {1..10}; do
+for h in 1 3 5 10; do
 subtitle "# threads: $t, # hashes: $h"
-	for e in 10000 50000 75000 100000 250000 500000 750000 1000000 2500000 5000000; do
+	for e in 10000 100000 1000000; do
 		printf "%'d entries -\n" $e
 		printf "\t"
 		summarize "Total operations: " \
@@ -17,20 +17,21 @@ subtitle "# threads: $t, # hashes: $h"
 		printf "\t"
 		summarize_percentage "False positive rate: " \
 			"$($RUN_BENCH -p $t --nr_hash_funcs $h --nr_entries $e bloom-filter-false-positive)"
+		printf "\t"
+		summarize "[CUSTOM] Total operations: " \
+			"$($RUN_BENCH -p $t --nr_hash_funcs $h --nr_entries $e custom-bloom-filter-map)"
+		printf "\t"
+		summarize_percentage "[CUSTOM] False positive rate: " \
+			"$($RUN_BENCH -p $t --nr_hash_funcs $h --nr_entries $e custom-bloom-filter-false-positive)"
 	done
 	printf "\n"
 done
 done
 
-header "Bloom filter map, multi-producer contention"
-for t in 1 2 3 4 8 12 16 20 24 28 32 36 40 44 48 52; do
-	summarize "$t threads - " "$($RUN_BENCH -p $t bloom-filter-map)"
-done
-
 header "Hashmap without bloom filter vs. hashmap with bloom filter (throughput, 8 threads)"
-for h in {1..10}; do
+for h in 1 3 5 10; do
 subtitle "# hashes: $h"
-	for e in 10000 50000 75000 100000 250000 500000 750000 1000000 2500000 5000000; do
+	for e in 10000 100000 1000000; do
 		printf "%'d entries -\n" $e
 		printf "\t"
 		summarize_total "Hashmap without bloom filter: " \
@@ -38,6 +39,9 @@ subtitle "# hashes: $h"
 		printf "\t"
 		summarize_total "Hashmap with bloom filter: " \
 			"$($RUN_BENCH --nr_hash_funcs $h --nr_entries $e -p 8 hashmap-with-bloom-filter)"
+		printf "\t"
+		summarize_total "[CUSTOM] Hashmap with bloom filter: " \
+			"$($RUN_BENCH --nr_hash_funcs $h --nr_entries $e -p 8 hashmap-with-custom-bloom-filter)"
 	done
 	printf "\n"
 done
diff --git a/tools/testing/selftests/bpf/benchs/run_common.sh b/tools/testing/selftests/bpf/benchs/run_common.sh
index 9a16be78b180..961d25374150 100644
--- a/tools/testing/selftests/bpf/benchs/run_common.sh
+++ b/tools/testing/selftests/bpf/benchs/run_common.sh
@@ -1,7 +1,7 @@
 #!/bin/bash
 # SPDX-License-Identifier: GPL-2.0
 
-RUN_BENCH="sudo ./bench -w3 -d10 -a"
+RUN_BENCH="sudo ./bench -w1 -d5 -a"
 
 function header()
 {
-- 
2.30.2

