bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joanne Koong <joannekoong@fb.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: <bpf@vger.kernel.org>, <Kernel-team@fb.com>
Subject: Re: [PATCH v2 bpf-next 1/4] bpf: Add bloom filter map implementation
Date: Mon, 20 Sep 2021 14:03:16 -0700	[thread overview]
Message-ID: <31392b5e-c470-5cbf-62ee-121b212ee175@fb.com> (raw)
In-Reply-To: <20210917170130.njmm3dm65ftd76vo@ast-mbp>

On 9/17/21 10:01 AM, Alexei Starovoitov wrote:

> On Mon, Sep 13, 2021 at 09:04:30PM -0700, Joanne Koong wrote:
>> +
>> +/* For bloom filter maps, the next 4 bits represent how many hashes to use.
>> + * The maximum number of hash functions supported is 15. If this is not set,
>> + * the default number of hash functions used will be 5.
>> + */
>> +	BPF_F_BLOOM_FILTER_HASH_BIT_1 = (1U << 13),
>> +	BPF_F_BLOOM_FILTER_HASH_BIT_2 = (1U << 14),
>> +	BPF_F_BLOOM_FILTER_HASH_BIT_3 = (1U << 15),
>> +	BPF_F_BLOOM_FILTER_HASH_BIT_4 = (1U << 16),
> The bit selection is unintuitive.
> Since key_size has to be zero may be used that instead to indicate the number of hash
> functions in the rare case when 5 is not good enough?
> Or use inner_map_fd since there is no possibility of having an inner map in bloomfilter.
> It could be a union:
>      __u32   max_entries;    /* max number of entries in a map */
>      __u32   map_flags;      /* BPF_MAP_CREATE related
>                               * flags defined above.
>                               */
>      union {
>         __u32  inner_map_fd;   /* fd pointing to the inner map */
>         __u32  nr_hash_funcs;  /* or number of hash functions */
>      };
>      __u32   numa_node;      /* numa node */
I really like the idea of union-ing inner_map_fd with the number of hash 
functions (my worry with
using key_size is that it might be a confusing / non-intuitive API quirk 
for users), but I think this
would later require us to add some bloom filter specific APIs to libbpf 
(such as bpf_map__set_nr_hashes).

To make the bit selection more intuitive, Andrii suggested defining some 
helper like

BPF_F_BLOOM_NR_HASH_OFF = 13

where the user could then do something like

struct {
     __uint(type, BPF_MAP_TYPE_BLOOM_FILTER),
     ...
     __uint(map_flags, 5 << BPF_F_BLOOM_NR_HASH_OFF),
};

to set the number of hash functions.

Would this approach address your concerns about the unintuitiveness of 
the bit selection?

>> +struct bpf_bloom_filter {
>> +	struct bpf_map map;
>> +	u32 bit_array_mask;
>> +	u32 hash_seed;
>> +	/* If the size of the values in the bloom filter is u32 aligned,
>> +	 * then it is more performant to use jhash2 as the underlying hash
>> +	 * function, else we use jhash. This tracks the number of u32s
>> +	 * in an u32-aligned value size. If the value size is not u32 aligned,
>> +	 * this will be 0.
>> +	 */
>> +	u32 aligned_u32_count;
> what is the performance difference?

Using results from the hashmap benchmark tests, using jhash2 instead of 
jhash for 4-byte
aligned value sizes improved the performance by roughly 5% to 15%. For 
non-4-byte aligned
value sizes, there wasn't a noticeable difference between using jhash2 
(and truncating the
remainder bits) vs. using jhash.

> May be we enforce 4-byte sized value for simplicity?
Sounds great! And if in the future this becomes too restrictive, we 
could always loosen this
as well

  parent reply	other threads:[~2021-09-20 21:05 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-14  4:04 [PATCH v2 bpf-next 0/4] Implement bloom filter map Joanne Koong
2021-09-14  4:04 ` [PATCH v2 bpf-next 1/4] bpf: Add bloom filter map implementation Joanne Koong
2021-09-17 17:01   ` Alexei Starovoitov
2021-09-20 20:58     ` Andrii Nakryiko
2021-09-20 22:52       ` Joanne Koong
2021-09-20 23:21         ` Andrii Nakryiko
2021-09-20 21:03     ` Joanne Koong [this message]
2021-09-17 21:48   ` Andrii Nakryiko
2021-09-14  4:04 ` [PATCH v2 bpf-next 2/4] selftests/bpf: Add bloom filter map test cases Joanne Koong
2021-09-14  4:04 ` [PATCH v2 bpf-next 3/4] bpf/benchs: Add benchmark test for bloom filter maps Joanne Koong
2021-09-14  4:04 ` [PATCH v2 bpf-next 4/4] bpf/benchs: Add benchmarks for comparing hashmap lookups with vs. without bloom filter Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=31392b5e-c470-5cbf-62ee-121b212ee175@fb.com \
    --to=joannekoong@fb.com \
    --cc=Kernel-team@fb.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=bpf@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).