bpf.vger.kernel.org archive mirror
* [RFC PATCH bpf-next v2 0/3] bpf: support string key in htab
@ 2022-02-14 11:13 Hou Tao
From: Hou Tao @ 2022-02-14 11:13 UTC
  To: Alexei Starovoitov
  Cc: Martin KaFai Lau, Yonghong Song, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, John Fastabend, netdev, bpf, houtao1

Hi,

To use a string as a hash-table key, key_size must be the storage size
of the longest string. If string lengths differ widely, the hash
distribution will be sub-optimal because of the unused zero bytes in
shorter strings, and lookups will be inefficient because of the
unnecessary memcmp() over those bytes.

It is also possible that the unused part of a string key returned from a
bpf helper (e.g. bpf_d_path) is not zeroed. If such a buffer is used
directly as the lookup key, the lookup will fail with -ENOENT (as
reported in [1]).

This patchset tries to address the inefficiency by adding support for
string keys. v1 had an extensibility problem because the string key and
its optimization were only available for string-only keys. To make it
extensible, v2 introduces bpf_str_key_stor and bpf_str_key_desc and enforces
the layout of the hash key struct through BTF as follows:

	>the start of hash key
	...
	[struct bpf_str_key_desc m;]
	...
	[struct bpf_str_key_desc n;]
	...
	struct bpf_str_key_stor z;
	unsigned char raw[N];
	>the end of hash key

So for a string-only key, the hash key struct will be:

	struct key {
		struct bpf_str_key_stor comm;
		unsigned char raw[128];
	};

If there are other fields in the hash key, the struct will be:

	struct key {
		int pid;
		struct bpf_str_key_stor comm;
		unsigned char raw[128];
	};

If there are multiple strings in the hash key, the struct will be:

	struct key {
		int pid;
		struct bpf_str_key_desc path;
		struct bpf_str_key_desc comm;
		unsigned char raw[128 + 128];
	};

See patches #1 and #3 for more details on how these keys are manipulated
and used. Patch #2 adds a simple test to demonstrate how the string key
solves the reported problem ([1]) caused by the unused part of the hash key.

There is about a 180% and 170% improvement in the benchmark on x86-64 and
arm64 respectively when key_size is 252, and about 280% and 270% when
key_size is greater than 512.

The performance improvement was also tested by using all files under the
Linux kernel sources as the string key input. There are about 74k files and
the maximum string length is 101. When key_size is 108, lookup performance
improves by about 71% and 39% on x86-64 and arm64 respectively, and when
key_size is 252, the win increases to 150% and 94%.

The patchset is still at an early stage of development, so comments and
suggestions are welcome.

Regards,
Tao

Change Log
v2:
  * make the string key extensible to hash keys that are not string-only

v1: https://lore.kernel.org/bpf/20211219052245.791605-1-houtao1@huawei.com/

[1]: https://lore.kernel.org/bpf/20211120051839.28212-2-yunbo.xufeng@linux.alibaba.com

Hou Tao (3):
  bpf: add support for string in hash table key
  selftests/bpf: add a simple test for htab str key
  selftests/bpf: add benchmark for string-key hash table

 include/linux/btf.h                           |   3 +
 include/uapi/linux/bpf.h                      |  19 +
 kernel/bpf/btf.c                              |  39 ++
 kernel/bpf/hashtab.c                          | 162 ++++++-
 tools/include/uapi/linux/bpf.h                |  19 +
 tools/testing/selftests/bpf/Makefile          |   4 +-
 tools/testing/selftests/bpf/bench.c           |  14 +
 .../selftests/bpf/benchs/bench_str_htab.c     | 449 ++++++++++++++++++
 .../testing/selftests/bpf/benchs/run_htab.sh  |  11 +
 .../selftests/bpf/prog_tests/str_key.c        |  71 +++
 .../selftests/bpf/progs/str_htab_bench.c      | 224 +++++++++
 tools/testing/selftests/bpf/progs/str_key.c   |  75 +++
 12 files changed, 1064 insertions(+), 26 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_str_htab.c
 create mode 100755 tools/testing/selftests/bpf/benchs/run_htab.sh
 create mode 100644 tools/testing/selftests/bpf/prog_tests/str_key.c
 create mode 100644 tools/testing/selftests/bpf/progs/str_htab_bench.c
 create mode 100644 tools/testing/selftests/bpf/progs/str_key.c

-- 
2.25.4



Thread overview: 9+ messages
2022-02-14 11:13 [RFC PATCH bpf-next v2 0/3] bpf: support string key in htab Hou Tao
2022-02-14 11:13 ` [RFC PATCH bpf-next v2 1/3] bpf: add support for string in hash table key Hou Tao
2022-02-14 11:13 ` [RFC PATCH bpf-next v2 2/3] selftests/bpf: add a simple test for htab str key Hou Tao
2022-02-14 11:13 ` [RFC PATCH bpf-next v2 3/3] selftests/bpf: add benchmark for string-key hash table Hou Tao
2022-02-17  3:50 ` [RFC PATCH bpf-next v2 0/3] bpf: support string key in htab Alexei Starovoitov
2022-02-18 13:53   ` Hou Tao
2022-02-19 18:44     ` Alexei Starovoitov
     [not found]       ` <ecc04a70-0b57-62ef-ab52-e7169845d789@huawei.com>
2022-02-27  3:08         ` Alexei Starovoitov
2022-03-09 11:47           ` Hou Tao
