bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Joanne Koong <joannekoong@fb.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH bpf-next 1/5] bpf: Add bloom filter map implementation
Date: Tue, 31 Aug 2021 19:55:07 -0700	[thread overview]
Message-ID: <20210901025507.3hx4wpx3kmtjipad@ast-mbp.dhcp.thefacebook.com> (raw)
In-Reply-To: <20210831225005.2762202-2-joannekoong@fb.com>

On Tue, Aug 31, 2021 at 03:50:01PM -0700, Joanne Koong wrote:
> +static int bloom_filter_map_peek_elem(struct bpf_map *map, void *value)
> +{
> +	struct bpf_bloom_filter *bloom_filter =
> +		container_of(map, struct bpf_bloom_filter, map);
> +	u32 i, hash;
> +
> +	for (i = 0; i < bloom_filter->map.nr_hashes; i++) {
> +		hash = jhash(value, map->value_size, bloom_filter->hash_seed + i) &
> +			bloom_filter->bit_array_mask;
> +		if (!test_bit(hash, bloom_filter->bit_array))
> +			return -ENOENT;
> +	}

I'm curious what bloom filter theory says about n-hashes > 1
concurrent access with updates in terms of false negative?
Two concurrent updates race is protected by spin_lock,
but what about peek and update?
The update might set one bit, but not the other.
That shouldn't trigger false negative lookup, right?

Is bloom filter supported as inner map?
Hash and lru maps are often used as inner maps.
The lookups from them would be pre-filtered by bloom filter
map that would have to be (in some cases) inner map.
I suspect one bloom filter for all inner maps might be
reasonable workaround in some cases too.
The delete is not supported in bloom filter, of course.
Would be good to mention it in the commit log.
Since there is no delete the users would likely need
to replace the whole bloom filter. So map-in-map would
become necessary.
Do you think 'clear-all' operation might be useful for bloom filter?
It feels that if map-in-map is supported then clear-all is probably
not that useful, since atomic replacement and delete of the map
would work better. 'clear-all' will have issues with
lookup, since it cannot be done in parallel.
Would be good to document all these ideas and restrictions.

Could you collect 'perf annotate' data for the above performance
critical loop?
I wonder whether using jhash2 and forcing u32 value size could speed it up.
Probably not, but would be good to check, since restricting value_size
later would be problematic due to backward compatibility.

The recommended nr_hashes=3 was computed with value_size=8, right?
I wonder whether nr_hashes would be different for value_size=16 and =4
which are ipv6/ipv4 addresses and value_size = 40
an approximation of networking n-tuple.

> +static struct bpf_map *bloom_filter_map_alloc(union bpf_attr *attr)
> +{
> +	int numa_node = bpf_map_attr_numa_node(attr);
> +	u32 nr_bits, bit_array_bytes, bit_array_mask;
> +	struct bpf_bloom_filter *bloom_filter;
> +
> +	if (!bpf_capable())
> +		return ERR_PTR(-EPERM);
> +
> +	if (attr->key_size != 0 || attr->value_size == 0 || attr->max_entries == 0 ||
> +	    attr->nr_hashes == 0 || attr->map_flags & ~BLOOM_FILTER_CREATE_FLAG_MASK ||

Would it make sense to default to nr_hashes=3 if zero is passed?
This way the libbpf changes for nr_hashes will become 'optional'.
Most users wouldn't have to specify it explicitly.

Overall looks great!
Performance numbers are impressive.

  reply	other threads:[~2021-09-01  2:55 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-31 22:50 [PATCH bpf-next 0/5] Implement bloom filter map Joanne Koong
2021-08-31 22:50 ` [PATCH bpf-next 1/5] bpf: Add bloom filter map implementation Joanne Koong
2021-09-01  2:55   ` Alexei Starovoitov [this message]
2021-09-02 19:11     ` Joanne Koong
2021-09-02  1:44   ` Andrii Nakryiko
2021-09-02  5:11     ` John Fastabend
2021-09-02  6:16       ` Martin KaFai Lau
2021-09-02 22:07       ` Joanne Koong
2021-09-03  0:56         ` Martin KaFai Lau
2021-09-03  7:13           ` Joanne Koong
2021-09-03 17:19             ` Andrii Nakryiko
2021-09-03 17:22         ` John Fastabend
2021-09-08 19:10           ` Joanne Koong
2021-09-02  3:16   ` John Fastabend
2021-09-02  3:28     ` Andrii Nakryiko
2021-09-02  4:40       ` John Fastabend
2021-08-31 22:50 ` [PATCH bpf-next 2/5] libbpf: Allow the number of hashes in bloom filter maps to be configurable Joanne Koong
2021-09-02  3:30   ` Andrii Nakryiko
2021-08-31 22:50 ` [PATCH bpf-next 3/5] selftests/bpf: Add bloom filter map test cases Joanne Koong
2021-09-01  2:55   ` Alexei Starovoitov
2021-08-31 22:50 ` [PATCH bpf-next 4/5] bpf/benchs: Add benchmark test for bloom filter maps Joanne Koong
2021-09-02  3:35   ` Andrii Nakryiko
2021-08-31 22:50 ` [PATCH bpf-next 5/5] bpf/benchs: Add benchmarks for comparing hashmap lookups with vs. without bloom filter Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210901025507.3hx4wpx3kmtjipad@ast-mbp.dhcp.thefacebook.com \
    --to=alexei.starovoitov@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=joannekoong@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).