All of lore.kernel.org
 help / color / mirror / Atom feed
From: Martin KaFai Lau <kafai@fb.com>
To: John Fastabend <john.fastabend@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Joanne Koong <joannekoong@fb.com>, bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next 1/5] bpf: Add bloom filter map implementation
Date: Wed, 1 Sep 2021 23:16:43 -0700	[thread overview]
Message-ID: <20210902061643.h2o67abkwn26d66a@kafai-mbp.dhcp.thefacebook.com> (raw)
In-Reply-To: <61305cf822fa_439b208a5@john-XPS-13-9370.notmuch>

On Wed, Sep 01, 2021 at 10:11:20PM -0700, John Fastabend wrote:
[ ... ]

> > > +static struct bpf_map *bloom_filter_map_alloc(union bpf_attr *attr)
> > > +{
> > > +       int numa_node = bpf_map_attr_numa_node(attr);
> > > +       u32 nr_bits, bit_array_bytes, bit_array_mask;
> > > +       struct bpf_bloom_filter *bloom_filter;
> > > +
> > > +       if (!bpf_capable())
> > > +               return ERR_PTR(-EPERM);
> > > +
> > > +       if (attr->key_size != 0 || attr->value_size == 0 || attr->max_entries == 0 ||
> > > +           attr->nr_hashes == 0 || attr->map_flags & ~BLOOM_FILTER_CREATE_FLAG_MASK ||
> > > +           !bpf_map_flags_access_ok(attr->map_flags))
> > > +               return ERR_PTR(-EINVAL);
> > > +
> > > +       /* For the bloom filter, the optimal bit array size that minimizes the
> > > +        * false positive probability is n * k / ln(2) where n is the number of
> > > +        * expected entries in the bloom filter and k is the number of hash
> > > +        * functions. We use 7 / 5 to approximate 1 / ln(2).
> > > +        *
> > > +        * We round this up to the nearest power of two to enable more efficient
> > > +        * hashing using bitmasks. The bitmask will be the bit array size - 1.
> > > +        *
> > > +        * If this overflows a u32, the bit array size will have 2^32 (4
> > > +        * GB) bits.
> 
> Would it be better to return E2BIG or EINVAL here? Speculating a bit, but if I was
> a user I might want to know that the number of bits I pushed down is not the actual
> number?
> 
> Another thought, would it be simpler to let user do this calculation and just let
> max_elements be number of bits they want? Then we could have examples with the
> above comment. Just a thought...
Instead of having user second guessing on what max_entries means
for a particular map, I think it is better to keep max_entries
meaning as consistent as possible and let the kernel figure out
the correct nr_bits to use.

[ ... ]

> > > +static int bloom_filter_map_push_elem(struct bpf_map *map, void *value,
> > > +                                     u64 flags)
> > > +{
> > > +       struct bpf_bloom_filter *bloom_filter =
> > > +               container_of(map, struct bpf_bloom_filter, map);
> > > +       unsigned long spinlock_flags;
> > > +       u32 i, hash;
> > > +
> > > +       if (flags != BPF_ANY)
> > > +               return -EINVAL;
> > > +
> > > +       spin_lock_irqsave(&bloom_filter->spinlock, spinlock_flags);
> > > +
> > 
> > If value_size is pretty big, hashing might take a noticeable amount of
> > CPU, during which we'll be keeping spinlock. With what I said above
Good catch on big value_size.

> > about sane number of hashes, if we bound it to small reasonable number
> > (e.g., 16), we can have a local 16-element array with hashes
> > calculated before we take lock. That way spinlock will be held only
> > for few bit flips.
> 
> +1. Anyways we are inside a RCU section here and the map shouldn't
> disapper without a grace period so we can READ_ONCE the seed right?
> Or are we thinking about sleepable programs here?
> 
> > Also, I wonder if ditching spinlock in favor of atomic bit set
> > operation would improve performance in typical scenarios. Seems like
> > set_bit() is an atomic operation, so it should be easy to test. Do you
> > mind running benchmarks with spinlock and with set_bit()?
The atomic set_bit() is a good idea.  Then no need to have a 16-element array
and keep thing simple.
It is in general useful to optimize the update/push path (e.g. I would like to
have the bloom-filter bench populating millions entries faster).   Our current
usecase is to have the userspace populates the map (e.g. a lot of suspicious
IP that we have already learned) at the very beginning and then very sparse
update after that.  The bpf prog will mostly only lookup/peek which I think
is a better optimization and benchmark target.

> 
> With the jhash pulled out of lock, I think it might be noticable. Curious
> to see.
> 
> >
> > > +       for (i = 0; i < bloom_filter->map.nr_hashes; i++) {
> > > +               hash = jhash(value, map->value_size, bloom_filter->hash_seed + i) &
> > > +                       bloom_filter->bit_array_mask;
> > > +               bitmap_set(bloom_filter->bit_array, hash, 1);
> > > +       }
> > > +
> > > +       spin_unlock_irqrestore(&bloom_filter->spinlock, spinlock_flags);
> > > +
> > > +       return 0;
> > > +}
> > > +
> > 
> > [...]
> 
> 

  reply	other threads:[~2021-09-02  6:16 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-31 22:50 [PATCH bpf-next 0/5] Implement bloom filter map Joanne Koong
2021-08-31 22:50 ` [PATCH bpf-next 1/5] bpf: Add bloom filter map implementation Joanne Koong
2021-09-01  2:55   ` Alexei Starovoitov
2021-09-02 19:11     ` Joanne Koong
2021-09-02  1:44   ` Andrii Nakryiko
2021-09-02  5:11     ` John Fastabend
2021-09-02  6:16       ` Martin KaFai Lau [this message]
2021-09-02 22:07       ` Joanne Koong
2021-09-03  0:56         ` Martin KaFai Lau
2021-09-03  7:13           ` Joanne Koong
2021-09-03 17:19             ` Andrii Nakryiko
2021-09-03 17:22         ` John Fastabend
2021-09-08 19:10           ` Joanne Koong
2021-09-02  3:16   ` John Fastabend
2021-09-02  3:28     ` Andrii Nakryiko
2021-09-02  4:40       ` John Fastabend
2021-08-31 22:50 ` [PATCH bpf-next 2/5] libbpf: Allow the number of hashes in bloom filter maps to be configurable Joanne Koong
2021-09-02  3:30   ` Andrii Nakryiko
2021-08-31 22:50 ` [PATCH bpf-next 3/5] selftests/bpf: Add bloom filter map test cases Joanne Koong
2021-09-01  2:55   ` Alexei Starovoitov
2021-08-31 22:50 ` [PATCH bpf-next 4/5] bpf/benchs: Add benchmark test for bloom filter maps Joanne Koong
2021-09-02  3:35   ` Andrii Nakryiko
2021-08-31 22:50 ` [PATCH bpf-next 5/5] bpf/benchs: Add benchmarks for comparing hashmap lookups with vs. without bloom filter Joanne Koong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210902061643.h2o67abkwn26d66a@kafai-mbp.dhcp.thefacebook.com \
    --to=kafai@fb.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=joannekoong@fb.com \
    --cc=john.fastabend@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.