Re: [RFC PATCH bpf-next v2 0/3] bpf: support string key in htab

From: Hou Tao <houtao1@huawei.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Martin KaFai Lau <kafai@fb.com>, Yonghong Song <yhs@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Song Liu <songliubraving@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Network Development <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>, Joanne Koong <joannekoong@fb.com>
Subject: Re: [RFC PATCH bpf-next v2 0/3] bpf: support string key in htab
Date: Wed, 9 Mar 2022 19:47:27 +0800	[thread overview]
Message-ID: <daea3845-ab03-12df-ec3b-a645dcee5aaf@huawei.com> (raw)
In-Reply-To: <CAADnVQJUJp3YBcpESwR3Q1U6GS1mBM=Vp-qYuQX7eZOaoLjdUA@mail.gmail.com>

Hi,

On 2/27/2022 11:08 AM, Alexei Starovoitov wrote:
> On Sat, Feb 26, 2022 at 4:16 AM Hou Tao <houtao1@huawei.com> wrote:
>>
>> For now, our case is a write-once case, so only lookup is considered.
>> When data set is bigger than 128KB, hash table has better lookup performance as
>> show below:
>>
>> | lookup all elem (ms) | 4K  | 16K  | 64K  | 128K  | 256K  | 512K  | 1M     | 2M     | 4M      | 7M      |
>> | -------------------- | --- | ---- | ---- | ----- | ----- | ----- | ------ | ------ | ------- | ------- |
>> | hash                 | 3.3 | 12.7 | 47   | 90.6  | 185.9 | 382.3 | 788.5  | 1622.4 | 3296    | 6248.7  |
>> | tries                | 2   | 10   | 45.9 | 111.6 | 274.6 | 688.9 | 1747.2 | 4394.5 | 11229.8 | 27148.8 |
>> | tst                  | 3.8 | 16.4 | 61.3 | 139.1 | 313.9 | 707.3 | 1641.3 | 3856.1 | 9002.3  | 19793.8 |
> 
> Yeah. It's hard to beat hash lookup when it's hitting a good case of O(1),
> but what are the chances that it stays this way?
> Are you saying you can size up the table and populate to good % just once?
>
Yes. for our case the hash table is populated only once. During these test the
hash table is populated firstly by inserting all strings into the table and then
do the lookup. The strings for all tests come from the same string set.

> If so it's probably better to replace all strings with something
> like a good hash
A strong one like sha1sum and using the string as hash-table value just
as proposed in previous email ?

> 7M elements is not a lot. A hash producing 8 or 16 bytes will have close
> to zero false positives.
> And in case of "populate the table once" the hash seed can be
> precomputed and adjusted, so you can guarantee zero collisions
> for 7M strings. While lookup part can still have 0.001% chance
> of a false positive there could be a way to deal with it after lookup.
>
I can try the above method. But the lookup procedure will be slowed done by
calculating a good hash and the memory usage will not reduced.

>> Ternary search tree always has better memory usage:
>>
>> | memory usage (MB) | 4K  | 16K | 64K  | 128K | 256K | 512K | 1M   | 2M    | 4M    | 7M     |
>> | ----------------- | --- | --- | ---- | ---- | ---- | ---- | ---- | ----- | ----- | ------ |
>> | hash              | 2.2 | 8.9 | 35.5 | 71   | 142  | 284  | 568  | 1136  | 2272  | 4302.5 |
>> | tries             | 2.1 | 8.5 | 34   | 68   | 136  | 272  | 544  | 1088  | 2176  | 4106.9 |
>> | tst               | 0.5 | 1.6 | 5.6  | 10.6 | 20.3 | 38.6 | 73.1 | 138.6 | 264.6 | 479.5  |
>>
> 
> Ternary search tree looks amazing.
> Since you have a prototype can you wrap it into a new type of bpf map
> and post the patches?
Will do.

> I wonder what data structures look like to achieve such memory efficiency.
The lower memory usage partially is due to the string set for test is full file
paths and these paths share the same prefix. And ternary search tree reduces the
memory usage by sharing the common prefix.
> .
>