All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Joanne Koong <joannekoong@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next 1/5] bpf: Add bloom filter map implementation
Date: Fri, 03 Sep 2021 10:22:39 -0700	[thread overview]
Message-ID: <613259dfb6973_1c226208c1@john-XPS-13-9370.notmuch> (raw)
In-Reply-To: <0c1bb5a6-4ef5-77b4-cd10-aea0060d5349@fb.com>

Joanne Koong wrote:
> On 9/1/21 10:11 PM, John Fastabend wrote:
> 
> > Andrii Nakryiko wrote:
> >> On Tue, Aug 31, 2021 at 3:51 PM Joanne Koong <joannekoong@fb.com> wrote:
> >>> Bloom filters are a space-efficient probabilistic data structure
> >>> used to quickly test whether an element exists in a set.
> >>> In a bloom filter, false positives are possible whereas false
> >>> negatives are not.
> >>>
> >>> This patch adds a bloom filter map for bpf programs.
> >>> The bloom filter map supports peek (determining whether an element
> >>> is present in the map) and push (adding an element to the map)
> >>> operations.These operations are exposed to userspace applications
> >>> through the already existing syscalls in the following way:
> >>>
> >>> BPF_MAP_LOOKUP_ELEM -> peek
> >>> BPF_MAP_UPDATE_ELEM -> push
> >>>
> >>> The bloom filter map does not have keys, only values. In light of
> >>> this, the bloom filter map's API matches that of queue stack maps:
> >>> user applications use BPF_MAP_LOOKUP_ELEM/BPF_MAP_UPDATE_ELEM
> >>> which correspond internally to bpf_map_peek_elem/bpf_map_push_elem,
> >>> and bpf programs must use the bpf_map_peek_elem and bpf_map_push_elem
> >>> APIs to query or add an element to the bloom filter map. When the
> >>> bloom filter map is created, it must be created with a key_size of 0.
> >>>
> >>> For updates, the user will pass in the element to add to the map
> >>> as the value, wih a NULL key. For lookups, the user will pass in the
> >>> element to query in the map as the value. In the verifier layer, this
> >>> requires us to modify the argument type of a bloom filter's
> >>> BPF_FUNC_map_peek_elem call to ARG_PTR_TO_MAP_VALUE; as well, in
> >>> the syscall layer, we need to copy over the user value so that in
> >>> bpf_map_peek_elem, we know which specific value to query.
> >>>
> >>> The maximum number of entries in the bloom filter is not enforced; if
> >>> the user wishes to insert more entries into the bloom filter than they
> >>> specified as the max entries size of the bloom filter, that is permitted
> >>> but the performance of their bloom filter will have a higher false
> >>> positive rate.
> >>>
> >>> The number of hashes to use for the bloom filter is configurable from
> >>> userspace. The benchmarks later in this patchset can help compare the
> >>> performances of different number of hashes on different entry
> >>> sizes. In general, using more hashes decreases the speed of a lookup,
> >>> but increases the false positive rate of an element being detected in the
> >>> bloom filter.
> >>>
> >>> Signed-off-by: Joanne Koong <joannekoong@fb.com>

[...]

> >>> +static struct bpf_map *bloom_filter_map_alloc(union bpf_attr *attr)
> >>> +{
> >>> +       int numa_node = bpf_map_attr_numa_node(attr);
> >>> +       u32 nr_bits, bit_array_bytes, bit_array_mask;
> >>> +       struct bpf_bloom_filter *bloom_filter;
> >>> +
> >>> +       if (!bpf_capable())
> >>> +               return ERR_PTR(-EPERM);
> >>> +
> >>> +       if (attr->key_size != 0 || attr->value_size == 0 || attr->max_entries == 0 ||
> >>> +           attr->nr_hashes == 0 || attr->map_flags & ~BLOOM_FILTER_CREATE_FLAG_MASK ||
> >>> +           !bpf_map_flags_access_ok(attr->map_flags))
> >>> +               return ERR_PTR(-EINVAL);
> >>> +
> >>> +       /* For the bloom filter, the optimal bit array size that minimizes the
> >>> +        * false positive probability is n * k / ln(2) where n is the number of
> >>> +        * expected entries in the bloom filter and k is the number of hash
> >>> +        * functions. We use 7 / 5 to approximate 1 / ln(2).
> >>> +        *
> >>> +        * We round this up to the nearest power of two to enable more efficient
> >>> +        * hashing using bitmasks. The bitmask will be the bit array size - 1.
> >>> +        *
> >>> +        * If this overflows a u32, the bit array size will have 2^32 (4
> >>> +        * GB) bits.
> > Would it be better to return E2BIG or EINVAL here? Speculating a bit, but if I was
> > a user I might want to know that the number of bits I pushed down is not the actual
> > number?
> 
> I think if we return E2BIG or EINVAL here, this will fail to create the 
> bloom filter map
> if the max_entries exceeds some limit (~3 billion, according to my math) 
> whereas
> automatically setting the bit array size to 2^32 if the max_entries is
> extraordinarily large will still allow the user to create and use a 
> bloom filter (albeit
> one with a higher false positive rate).

It doesn't matter much to me, but I think if a user request 3+billion max entries
its ok to return E2BIG and then they can use a lower limit and know the
false positive rate is going to go up. 

> 
> > Another thought, would it be simpler to let user do this calculation and just let
> > max_elements be number of bits they want? Then we could have examples with the
> > above comment. Just a thought...
> 
> I like Martin's idea of keeping the max_entries meaning consistent 
> across all map types.
> I think that makes the interface clearer for users.

I'm convinced as well, lets keep it consistent. Thanks.

[...]

> >> Also, I wonder if ditching spinlock in favor of atomic bit set
> >> operation would improve performance in typical scenarios. Seems like
> >> set_bit() is an atomic operation, so it should be easy to test. Do you
> >> mind running benchmarks with spinlock and with set_bit()?
> > With the jhash pulled out of lock, I think it might be noticable. Curious
> > to see.
> Awesome, I will test this out and report back!

It looks like the benchmark tests were done with value size of __u64 should
we do larger entry? I guess (you tell me?) if this is used from XDP for
DDOS you would use a flow tuple and with IPv6 this could be
{dstIp, srcIp, sport, dport, proto} with roughly 44B.

> >>> +       for (i = 0; i < bloom_filter->map.nr_hashes; i++) {
> >>> +               hash = jhash(value, map->value_size, bloom_filter->hash_seed + i) &
> >>> +                       bloom_filter->bit_array_mask;
> >>> +               bitmap_set(bloom_filter->bit_array, hash, 1);
> >>> +       }
> >>> +
> >>> +       spin_unlock_irqrestore(&bloom_filter->spinlock, spinlock_flags);
> >>> +
> >>> +       return 0;
> >>> +}
> >>> +
> >> [...]
> >

  parent reply	other threads:[~2021-09-03 17:22 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-31 22:50 [PATCH bpf-next 0/5] Implement bloom filter map Joanne Koong
2021-08-31 22:50 ` [PATCH bpf-next 1/5] bpf: Add bloom filter map implementation Joanne Koong
2021-09-01  2:55   ` Alexei Starovoitov
2021-09-02 19:11     ` Joanne Koong
2021-09-02  1:44   ` Andrii Nakryiko
2021-09-02  5:11     ` John Fastabend
2021-09-02  6:16       ` Martin KaFai Lau
2021-09-02 22:07       ` Joanne Koong
2021-09-03  0:56         ` Martin KaFai Lau
2021-09-03  7:13           ` Joanne Koong
2021-09-03 17:19             ` Andrii Nakryiko
2021-09-03 17:22         ` John Fastabend [this message]
2021-09-08 19:10           ` Joanne Koong
2021-09-02  3:16   ` John Fastabend
2021-09-02  3:28     ` Andrii Nakryiko
2021-09-02  4:40       ` John Fastabend
2021-08-31 22:50 ` [PATCH bpf-next 2/5] libbpf: Allow the number of hashes in bloom filter maps to be configurable Joanne Koong
2021-09-02  3:30   ` Andrii Nakryiko
2021-08-31 22:50 ` [PATCH bpf-next 3/5] selftests/bpf: Add bloom filter map test cases Joanne Koong
2021-09-01  2:55   ` Alexei Starovoitov
2021-08-31 22:50 ` [PATCH bpf-next 4/5] bpf/benchs: Add benchmark test for bloom filter maps Joanne Koong
2021-09-02  3:35   ` Andrii Nakryiko
2021-08-31 22:50 ` [PATCH bpf-next 5/5] bpf/benchs: Add benchmarks for comparing hashmap lookups with vs. without bloom filter Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=613259dfb6973_1c226208c1@john-XPS-13-9370.notmuch \
    --to=john.fastabend@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=joannekoong@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.