Re: [PATCH bpf-next 1/5] bpf: Add bloom filter map implementation

From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Joanne Koong <joannekoong@fb.com>
Cc: Martin KaFai Lau <kafai@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next 1/5] bpf: Add bloom filter map implementation
Date: Fri, 3 Sep 2021 10:19:55 -0700	[thread overview]
Message-ID: <CAEf4BzZ1PeuF1Uy2R=c9zmU+Zs=iP8_g5P=xZg+b_5qLbm41iQ@mail.gmail.com> (raw)
In-Reply-To: <0beca6da-7444-fdf3-8dc4-c9126b7779de@fb.com>

On Fri, Sep 3, 2021 at 12:13 AM Joanne Koong <joannekoong@fb.com> wrote:
>
> On 9/2/21 5:56 PM, Martin KaFai Lau wrote:
>
> > On Thu, Sep 02, 2021 at 03:07:56PM -0700, Joanne Koong wrote:
> > [ ... ]
> >>>> But one high-level point I wanted to discuss was that bloom filter
> >>>> logic is actually simple enough to be implementable by pure BPF
> >>>> program logic. The only problematic part is generic hashing of a piece
> >>>> of memory. Regardless of implementing bloom filter as kernel-provided
> >>>> BPF map or implementing it with custom BPF program logic, having BPF
> >>>> helper for hashing a piece of memory seems extremely useful and very
> >>>> generic. I can't recall if we ever discussed adding such helpers, but
> >>>> maybe we should?
> >>> Aha started typing the same thing :)
> >>>
> >>> Adding generic hash helper has been on my todo list and close to the top
> >>> now. The use case is hashing data from skb payloads and such from kprobe
> >>> and sockmap side. I'm happy to work on it as soon as possible if no one
> >>> else picks it up.
> >>>
> After thinking through this some more, I'm curious to hear your thoughts,
> Andrii and John, on how the bitmap would be allocated. From what I
> understand, we do not currently support dynamic memory allocations
> in bpf programs. Assuming the optimal number of bits the user wants
> to use for their bitmap follows something like
> num_entries * num_hash_functions / ln(2), I think the bitmap would
> have to be dynamically allocated in the bpf program since it'd be too
> large to store on the stack, unless there's some other way I'm not seeing?

You can either use BPF_MAP_TYPE_ARRAY and size it at runtime. Or one
can use compile-time fixed-sized array in BPF program:

u64 bits[HOWEVER_MANY_U64S_WE_NEED];

/* then in BPF program itself */

h = hash(...);
bits[h / 64] |= (1 << (h % 64));

As an example. The latter case avoid map lookups completely, except
you'd need to prove to the verifier that you are not going out of
bounds for bits, which is simple to do if HOWEVER_MANY_U64S_WE_NEED is
power-of-2. Then you can do:

h = hash(...);
bits[(h / 64) & (HOWEVER_MANY_U64S_WE_NEED - 1)] |= (1 << (h % 64));

> >>>> It would be a really interesting experiment to implement the same
> >>>> logic in pure BPF logic and run it as another benchmark, along the
> >>>> Bloom filter map. BPF has both spinlock and atomic operation, so we
> >>>> can compare and contrast. We only miss hashing BPF helper.
> >>> The one issue I've found writing a hash logic is its a bit tricky
> >>> to get the verifier to consume it. Especially when the hash is nested
> >>> inside a for loop and sometimes a couple for loops so you end up with
> >>> things like,
> >>>
> >>>   for (i = 0; i < someTlvs; i++) {
> >>>    for (j = 0; j < someKeys; i++) {
> >>>      ...
> >>>      bpf_hash(someValue)
> >>>      ...
> >>>   }
> >>>
> >>> I've find small seemingly unrelated changes cause the complexity limit
> >>> to explode. Usually we can work around it with code to get pruning

btw, global BPF functions (sub-programs) should limit this complexity
explosion, even if you implement your own hashing function purely in
BPF.

> >>> points and such, but its a bit ugly. Perhaps this means we need
> >>> to dive into details of why the complexity explodes, but I've not
> >>> got to it yet. The todo list is long.
> Out of curiosity, why would this helper have trouble in the verifier?
>  From a quick glance, it seems like the implementation for it would
> be pretty similar to how bpf_get_prandom_u32() is implemented
> (except where the arguments for the hash helper would take in a
> void* data (ARG_PTR_TO_MEM), the size of the data buffer, and
> the seed)? I'm a bit new to bpf, so there's a good chance I might be
> completely overlooking something here :)

Curious as well. I imagine we'd define new helper with this signature:

u64 bpf_hash_mem(void *data, u64 sz, enum bpf_hash_func hash_fn, u64 flags);

Where enum bpf_hash_func { JHASH, MURMUR, CRC32, etc }, whatever is
available in the kernel (or will be added later).

John, would this still cause problems for the verifier?

>
> >>>> Being able to do this in pure BPF code has a bunch of advantages.
> >>>> Depending on specific application, users can decide to:
> >>>>     - speed up the operation by ditching spinlock or atomic operation,
> >>>> if the logic allows to lose some bit updates;
> >>>>     - decide on optimal size, which might not be a power of 2, depending
> >>>> on memory vs CPU trade of in any particular case;
> >>>>     - it's also possible to implement a more general Counting Bloom
> >>>> filter, all without modifying the kernel.
> >>> Also it means no call and if you build it on top of an array
> >>> map of size 1 its just a load. Could this be a performance win for
> >>> example a Bloom filter in XDP for DDOS? Maybe. Not sure if the program
> >>> would be complex enough a call might be in the noise. I don't know.
> >>>
> >>>> We could go further, and start implementing other simple data
> >>>> structures relying on hashing, like HyperLogLog. And all with no
> >>>> kernel modifications. Map-in-map is no issue as well, because there is
> >>>> a choice of using either fixed global data arrays for maximum
> >>>> performance, or using BPF_MAP_TYPE_ARRAY maps that can go inside
> >>>> map-in-map.
> >>> We've been doing most of our array maps as single entry arrays
> >>> at this point and just calculating offsets directly in BPF. Same
> >>> for some simple hashing algorithms.
> >>>
> >>>> Basically, regardless of having this map in the kernel or not, let's
> >>>> have a "universal" hashing function as a BPF helper as well.
> >>>> Thoughts?
> >>> I like it, but not the bloom filter expert here.
> >> Ooh, I like your idea of comparing the performance of the bloom filter with
> >> a kernel-provided BPF map vs. custom BPF program logic using a
> >> hash helper, especially if a BPF hash helper is something useful that
> >> we want to add to the codebase in and of itself!
> > I think a hash helper will be useful in general but could it be a
> > separate experiment to try using it to implement some bpf maps (probably
> > a mix of an easy one and a little harder one) ?
>
> I agree, I think the hash helper implementation should be its own separate
> patchset orthogonal to this one.
>

Sure, I don't feel strongly against having Bloom filter as BPF map.