All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii@kernel.org>
To: <bpf@vger.kernel.org>, <ast@kernel.org>, <daniel@iogearbox.net>
Cc: <kafai@fb.com>, <joannekoong@fb.com>,
	Andrii Nakryiko <andrii@kernel.org>
Subject: [PATCH RFC bpf-next 0/4] bpf_jhash_mem() and BPF Bloom filter implementation
Date: Wed, 22 Sep 2021 13:32:20 -0700	[thread overview]
Message-ID: <20210922203224.912809-1-andrii@kernel.org> (raw)

This is a quick experiment on adding a hashing helper and buliding Bloom
filter with pure BPF code with no extra kernel functionality (beyond generic
hashing helper).

This is based on top of Joanne's series [0].

Patch 1 adds bpf_jhash_mem() helper. Patch 2 fixes existing benchmark to
report valid and consistent benchmark results and reduce amount of overhead
that stats counting itself causes. Patch 3 contains BPF-side implementation of
Bloom filter based on global variables. Patch 4 completes the integration on
user-space side. I split patch 3 and 4 to not distract from BPF-side changes.

Note that "Hashmap without bloom filter vs. hashmap with bloom filter"
benchmark is still spending lots of time in just generating random values on
BPF side, and it would be nice to optimize that and make it more reproducible
to compare different implementations. But I ran out of steam for doing that,
especially that I'm not sure this changes anything.

The same benchmark also checks only short 8-byte keys, which is a valid use
case, but not the only probably one, so would be nice to have that extended as
well.


For reference, here are full benchmark results comparing in-kernel Bloom filter
and custom BPF-implemented Bloom filter. I shortened the set of different
combinations tested to reduce amount of time to wait for results.

The results for "Hashmap without bloom filter vs. hashmap with bloom filter"
are quite confusing, though. I spent a bit of time to try to find
discrepancies. I confirmed that both implementations function correctly and
match 100% of time in terms of positives/negatives. Pure Bloom filter reading
benchmarks show a pretty small gap in performance with custom BPF
implementation losing. The "Hashmap without bloom filter vs. hashmap with bloom
filter" benchmark shows much bigger difference, though, which I wasn't able to
completely explain, to be entirely honest.

It's probably worth spending some more time investigating this, but I ran out
of self-alloted time for this.


Bloom filter map
================
        # threads: 1, # hashes: 1
10,000 entries -
        Total operations:    50.854 ± 0.134M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  31.36%
        [CUSTOM] Total operations:  49.391 ± 0.123M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  31.37%
100,000 entries -
        Total operations:    50.969 ± 0.049M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  24.04%
        [CUSTOM] Total operations:  49.135 ± 1.579M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  24.04%
1,000,000 entries -
        Total operations:    48.474 ± 1.619M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  27.50%
        [CUSTOM] Total operations:  46.088 ± 0.776M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  27.50%

        # threads: 1, # hashes: 3
10,000 entries -
        Total operations:    25.136 ± 0.011M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  4.71%
        [CUSTOM] Total operations:  24.115 ± 0.014M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  4.77%
100,000 entries -
        Total operations:    25.045 ± 0.120M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  7.67%
        [CUSTOM] Total operations:  23.028 ± 0.042M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  7.65%
1,000,000 entries -
        Total operations:    18.712 ± 0.406M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  2.64%
        [CUSTOM] Total operations:  18.100 ± 0.422M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  2.64%

        # threads: 1, # hashes: 5
10,000 entries -
        Total operations:    16.672 ± 0.011M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.32%
        [CUSTOM] Total operations:  15.318 ± 0.014M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.32%
100,000 entries -
        Total operations:    16.540 ± 0.121M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.78%
        [CUSTOM] Total operations:  15.189 ± 0.045M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.78%
1,000,000 entries -
        Total operations:    11.781 ± 0.192M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  1.79%
        [CUSTOM] Total operations:  11.651 ± 0.012M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  1.79%

        # threads: 1, # hashes: 10
10,000 entries -
        Total operations:    9.038 ± 0.052M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.00%
        [CUSTOM] Total operations:  8.620 ± 0.008M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.00%
100,000 entries -
        Total operations:    8.076 ± 0.027M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.01%
        [CUSTOM] Total operations:  7.261 ± 0.007M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.01%
1,000,000 entries -
        Total operations:    6.263 ± 0.041M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.03%
        [CUSTOM] Total operations:  6.173 ± 0.013M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.03%

        # threads: 4, # hashes: 1
10,000 entries -
        Total operations:    203.453 ± 0.161M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  31.28%
        [CUSTOM] Total operations:  197.959 ± 0.051M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  31.34%
100,000 entries -
        Total operations:    203.887 ± 0.155M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  24.07%
        [CUSTOM] Total operations:  197.476 ± 1.796M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  24.09%
1,000,000 entries -
        Total operations:    189.259 ± 0.473M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  27.50%
        [CUSTOM] Total operations:  185.157 ± 0.346M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  27.48%

        # threads: 4, # hashes: 3
10,000 entries -
        Total operations:    100.394 ± 0.062M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  4.76%
        [CUSTOM] Total operations:  93.896 ± 0.104M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  4.75%
100,000 entries -
        Total operations:    100.382 ± 0.155M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  7.62%
        [CUSTOM] Total operations:  93.460 ± 0.145M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  7.65%
1,000,000 entries -
        Total operations:    71.424 ± 0.710M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  2.65%
        [CUSTOM] Total operations:  72.546 ± 0.228M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  2.65%

        # threads: 4, # hashes: 5
10,000 entries -
        Total operations:    66.652 ± 0.116M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.32%
        [CUSTOM] Total operations:  60.790 ± 0.454M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.33%
100,000 entries -
        Total operations:    66.401 ± 0.090M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.78%
        [CUSTOM] Total operations:  61.066 ± 0.069M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.77%
1,000,000 entries -
        Total operations:    48.299 ± 0.369M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  1.79%
        [CUSTOM] Total operations:  47.401 ± 0.347M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  1.79%

        # threads: 4, # hashes: 10
10,000 entries -
        Total operations:    36.273 ± 0.030M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.00%
        [CUSTOM] Total operations:  34.464 ± 0.073M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.00%
100,000 entries -
        Total operations:    32.525 ± 0.047M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.01%
        [CUSTOM] Total operations:  29.516 ± 0.110M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.01%
1,000,000 entries -
        Total operations:    25.515 ± 0.405M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.03%
        [CUSTOM] Total operations:  24.566 ± 0.189M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.03%

        # threads: 8, # hashes: 1
10,000 entries -
        Total operations:    406.129 ± 2.262M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  31.45%
        [CUSTOM] Total operations:  384.758 ± 0.379M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  31.24%
100,000 entries -
        Total operations:    407.817 ± 0.793M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  24.05%
        [CUSTOM] Total operations:  394.745 ± 1.302M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  24.05%
1,000,000 entries -
        Total operations:    383.159 ± 0.289M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  27.49%
        [CUSTOM] Total operations:  371.173 ± 0.454M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  27.49%

        # threads: 8, # hashes: 3
10,000 entries -
        Total operations:    201.079 ± 0.183M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  4.69%
        [CUSTOM] Total operations:  187.658 ± 0.544M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  4.74%
100,000 entries -
        Total operations:    199.826 ± 0.972M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  7.63%
        [CUSTOM] Total operations:  185.415 ± 0.358M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  7.62%
1,000,000 entries -
        Total operations:    148.589 ± 1.320M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  2.65%
        [CUSTOM] Total operations:  142.591 ± 4.825M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  2.64%

        # threads: 8, # hashes: 5
10,000 entries -
        Total operations:    133.306 ± 0.468M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.32%
        [CUSTOM] Total operations:  127.377 ± 0.271M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.32%
100,000 entries -
        Total operations:    132.915 ± 0.364M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.78%
        [CUSTOM] Total operations:  123.722 ± 3.169M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.77%
1,000,000 entries -
        Total operations:    94.803 ± 3.240M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  1.79%
        [CUSTOM] Total operations:  95.670 ± 2.624M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  1.79%

        # threads: 8, # hashes: 10
10,000 entries -
        Total operations:    72.517 ± 0.083M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.00%
        [CUSTOM] Total operations:  65.803 ± 0.069M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.00%
100,000 entries -
        Total operations:    65.533 ± 0.148M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.01%
        [CUSTOM] Total operations:  59.908 ± 2.841M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.01%
1,000,000 entries -
        Total operations:    50.937 ± 0.105M/s (drops 0.000 ± 0.000M/s)
        False positive rate:  0.03%
        [CUSTOM] Total operations:  47.526 ± 1.044M/s (drops 0.000 ± 0.000M/s)
        [CUSTOM] False positive rate:  0.03%


Hashmap without bloom filter vs. hashmap with bloom filter (throughput, 8 threads)
==================================================================================
        # hashes: 1
10,000 entries -
        Hashmap without bloom filter:  123.919 ± 0.458M/s
        Hashmap with bloom filter:  124.588 ± 0.450M/s
        [CUSTOM] Hashmap with bloom filter:  106.838 ± 0.114M/s
100,000 entries -
        Hashmap without bloom filter:  93.708 ± 0.715M/s
        Hashmap with bloom filter:  131.686 ± 1.272M/s
        [CUSTOM] Hashmap with bloom filter:  105.649 ± 0.278M/s
1,000,000 entries -
        Hashmap without bloom filter:  40.040 ± 0.677M/s
        Hashmap with bloom filter:  67.250 ± 0.506M/s
        [CUSTOM] Hashmap with bloom filter:  58.356 ± 0.541M/s

        # hashes: 3
10,000 entries -
        Hashmap without bloom filter:  123.882 ± 0.077M/s
        Hashmap with bloom filter:  152.423 ± 0.061M/s
        [CUSTOM] Hashmap with bloom filter:  126.021 ± 0.115M/s
100,000 entries -
        Hashmap without bloom filter:  94.291 ± 0.986M/s
        Hashmap with bloom filter:  127.944 ± 0.825M/s
        [CUSTOM] Hashmap with bloom filter:  106.943 ± 0.827M/s
1,000,000 entries -
        Hashmap without bloom filter:  40.183 ± 0.382M/s
        Hashmap with bloom filter:  125.224 ± 0.266M/s
        [CUSTOM] Hashmap with bloom filter:  89.717 ± 0.158M/s

        # hashes: 5
10,000 entries -
        Hashmap without bloom filter:  120.510 ± 0.351M/s
        Hashmap with bloom filter:  170.138 ± 0.088M/s
        [CUSTOM] Hashmap with bloom filter:  136.006 ± 0.324M/s
100,000 entries -
        Hashmap without bloom filter:  94.774 ± 0.191M/s
        Hashmap with bloom filter:  145.559 ± 0.687M/s
        [CUSTOM] Hashmap with bloom filter:  117.073 ± 0.382M/s
1,000,000 entries -
        Hashmap without bloom filter:  40.004 ± 0.492M/s
        Hashmap with bloom filter:  96.916 ± 0.148M/s
        [CUSTOM] Hashmap with bloom filter:  78.485 ± 0.289M/s

        # hashes: 10
10,000 entries -
        Hashmap without bloom filter:  124.034 ± 0.245M/s
        Hashmap with bloom filter:  169.757 ± 0.336M/s
        [CUSTOM] Hashmap with bloom filter:  134.634 ± 0.276M/s
100,000 entries -
        Hashmap without bloom filter:  94.872 ± 0.195M/s
        Hashmap with bloom filter:  141.107 ± 0.780M/s
        [CUSTOM] Hashmap with bloom filter:  109.279 ± 0.330M/s
1,000,000 entries -
        Hashmap without bloom filter:  40.215 ± 0.396M/s
        Hashmap with bloom filter:  98.084 ± 0.267M/s
        [CUSTOM] Hashmap with bloom filter:  78.579 ± 0.046M/s


  [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=550585&state=*

Andrii Nakryiko (4):
  bpf: add bpf_jhash_mem BPF helper
  selftests/bpf: fix and optimize bloom filter bench
  selftests/bpf: implement custom Bloom filter purely in BPF
  selftests/bpf: integrate custom Bloom filter impl into benchs

 include/uapi/linux/bpf.h                      |   8 +
 kernel/bpf/helpers.c                          |  23 +++
 tools/include/uapi/linux/bpf.h                |   8 +
 tools/testing/selftests/bpf/Makefile          |   2 +-
 tools/testing/selftests/bpf/bench.c           |   6 +
 .../bpf/benchs/bench_bloom_filter_map.c       | 153 +++++++++++++++++-
 .../bpf/benchs/run_bench_bloom_filter_map.sh  |  22 +--
 .../selftests/bpf/benchs/run_common.sh        |   2 +-
 .../selftests/bpf/progs/bloom_filter_map.c    | 125 ++++++++++++--
 9 files changed, 317 insertions(+), 32 deletions(-)

-- 
2.30.2


             reply	other threads:[~2021-09-22 20:32 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-22 20:32 Andrii Nakryiko [this message]
2021-09-22 20:32 ` [PATCH RFC bpf-next 1/4] bpf: add bpf_jhash_mem BPF helper Andrii Nakryiko
2021-09-22 20:32 ` [PATCH RFC bpf-next 2/4] selftests/bpf: fix and optimize bloom filter bench Andrii Nakryiko
2021-09-22 20:32 ` [PATCH RFC bpf-next 3/4] selftests/bpf: implement custom Bloom filter purely in BPF Andrii Nakryiko
2021-09-22 20:32 ` [PATCH RFC bpf-next 4/4] selftests/bpf: integrate custom Bloom filter impl into benchs Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210922203224.912809-1-andrii@kernel.org \
    --to=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=joannekoong@fb.com \
    --cc=kafai@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.