From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Hou Tao <houtao1@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Yonghong Song <yhs@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next 4/5] selftests/bpf: add benchmark for bpf_strncmp() helper
Date: Mon, 6 Dec 2021 19:01:28 -0800
Message-ID: <CAEf4BzZLsV_MoUz4VwspzVUbJaXVn0YVsKvf=bL-WPspbw6WGA@mail.gmail.com>
In-Reply-To: <20211130142215.1237217-5-houtao1@huawei.com>

On Tue, Nov 30, 2021 at 6:07 AM Hou Tao <houtao1@huawei.com> wrote:
>
> Add benchmark to compare the performance between home-made strncmp()
> in bpf program and bpf_strncmp() helper. In summary, the performance
> win of bpf_strncmp() under x86-64 is greater than 18% when the compared
> string length is greater than 64, and is 179% when the length is 4095.
> Under arm64 the performance win is even bigger: 33% when the length
> is greater than 64 and 600% when the length is 4095.
>
> The details are as follows:
>
> no-helper-X: use home-made strncmp() to compare X-sized string
> helper-Y: use bpf_strncmp() to compare Y-sized string
>
> Under x86-64:
>
> no-helper-1          3.504 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-1             3.347 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-8          3.357 ± 0.001M/s (drops 0.000 ± 0.000M/s)
> helper-8             3.307 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-32         3.064 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-32            3.253 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-64         2.563 ± 0.001M/s (drops 0.000 ± 0.000M/s)
> helper-64            3.040 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-128        1.975 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-128           2.641 ± 0.000M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-512        0.759 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-512           1.574 ± 0.000M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-2048       0.329 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-2048          0.602 ± 0.000M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-4095       0.117 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-4095          0.327 ± 0.000M/s (drops 0.000 ± 0.000M/s)
>
> Under arm64:
>
> no-helper-1          2.806 ± 0.004M/s (drops 0.000 ± 0.000M/s)
> helper-1             2.819 ± 0.002M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-8          2.797 ± 0.109M/s (drops 0.000 ± 0.000M/s)
> helper-8             2.786 ± 0.025M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-32         2.399 ± 0.011M/s (drops 0.000 ± 0.000M/s)
> helper-32            2.703 ± 0.002M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-64         2.020 ± 0.015M/s (drops 0.000 ± 0.000M/s)
> helper-64            2.702 ± 0.073M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-128        1.604 ± 0.001M/s (drops 0.000 ± 0.000M/s)
> helper-128           2.516 ± 0.002M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-512        0.699 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-512           2.106 ± 0.003M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-2048       0.215 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-2048          1.223 ± 0.003M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-4095       0.112 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-4095          0.796 ± 0.000M/s (drops 0.000 ± 0.000M/s)
>
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> ---
>  tools/testing/selftests/bpf/Makefile          |   4 +-
>  tools/testing/selftests/bpf/bench.c           |   6 +
>  .../selftests/bpf/benchs/bench_strncmp.c      | 150 ++++++++++++++++++
>  .../selftests/bpf/benchs/run_bench_strncmp.sh |  12 ++
>  .../selftests/bpf/progs/strncmp_bench.c       |  50 ++++++
>  5 files changed, 221 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/bpf/benchs/bench_strncmp.c
>  create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_strncmp.sh
>  create mode 100644 tools/testing/selftests/bpf/progs/strncmp_bench.c
>
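
For reference, the summary percentages can be re-derived from the quoted tables as helper / no-helper - 1. A minimal sketch in C follows; `gain_pct` is a hypothetical name used only for illustration and is not part of the patch:

```c
/*
 * Relative throughput gain of the helper over the home-made loop,
 * in percent. Both arguments are throughputs in M/s, taken from the
 * benchmark tables quoted in the commit message.
 * gain_pct() is an illustrative name, not something the patch defines.
 */
static double gain_pct(double no_helper, double helper)
{
	return (helper / no_helper - 1.0) * 100.0;
}
```

Feeding in the quoted x86-64 rows, gain_pct(2.563, 3.040) gives about 18.6% at length 64 and gain_pct(0.117, 0.327) gives about 179.5% at length 4095. The arm64 rows give about 33.8% and 610.7% for the same two lengths.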

[...]

> diff --git a/tools/testing/selftests/bpf/progs/strncmp_bench.c b/tools/testing/selftests/bpf/progs/strncmp_bench.c
> new file mode 100644
> index 000000000000..18373a7df76e
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/strncmp_bench.c
> @@ -0,0 +1,50 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (C) 2021. Huawei Technologies Co., Ltd */
> +#include <linux/types.h>
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +
> +#define STRNCMP_STR_SZ 4096
> +
> +/* Will be updated by benchmark before program loading */
> +const volatile unsigned int cmp_str_len = 1;
> +const char target[STRNCMP_STR_SZ];
> +
> +long hits = 0;
> +char str[STRNCMP_STR_SZ];
> +
> +char _license[] SEC("license") = "GPL";
> +
> +static __always_inline int local_strncmp(const char *s1, unsigned int sz,
> +                                        const char *s2)
> +{
> +       int ret = 0;
> +       unsigned int i;
> +
> +       for (i = 0; i < sz; i++) {
> +               /* E.g. 0xff > 0x31 */
> +               ret = (unsigned char)s1[i] - (unsigned char)s2[i];

I'm not sure whether this subtraction is performed in unsigned form
(in which case the result could never be negative) and only then
converted to int. Why not cast to int instead of unsigned char to be sure?
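
One way to check this outside of BPF is a small userspace sketch, shown below. It assumes a platform where int can represent every unsigned char value, in which case the C integer promotions convert both operands to int before the subtraction. `local_strncmp_int` is a hypothetical variant that widens explicitly; it is not part of the patch:

```c
#include <assert.h>
#include <limits.h>

/*
 * Assumption: int can represent every unsigned char value, so both
 * operands are promoted to (signed) int before the subtraction.
 */
static_assert(UCHAR_MAX <= INT_MAX, "unsigned char must promote to int");

/* Userspace copy of local_strncmp() from the patch, for illustration. */
static int local_strncmp(const char *s1, unsigned int sz, const char *s2)
{
	int ret = 0;
	unsigned int i;

	for (i = 0; i < sz; i++) {
		/* Operands are promoted to int, so ret can be negative */
		ret = (unsigned char)s1[i] - (unsigned char)s2[i];
		if (ret || !s1[i])
			break;
	}

	return ret;
}

/* Hypothetical variant that makes the widening to int explicit. */
static int local_strncmp_int(const char *s1, unsigned int sz, const char *s2)
{
	int ret = 0;
	unsigned int i;

	for (i = 0; i < sz; i++) {
		/* Go through unsigned char first so 0xff stays 255, not -1 */
		ret = (int)(unsigned char)s1[i] - (int)(unsigned char)s2[i];
		if (ret || !s1[i])
			break;
	}

	return ret;
}
```

Under that assumption both variants return the same value, for example a negative result when comparing "1" (0x31) against "\xff". The explicit form mainly documents the intent. Note that casting a plain char straight to int would turn 0xff into -1 where char is signed, so the intermediate unsigned char cast is kept in the sketch.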

> +               if (ret || !s1[i])
> +                       break;
> +       }
> +
> +       return ret;
> +}
> +
> +SEC("tp/syscalls/sys_enter_getpgid")
> +int strncmp_no_helper(void *ctx)
> +{
> +       if (local_strncmp(str, cmp_str_len + 1, target) < 0)
> +               __sync_add_and_fetch(&hits, 1);
> +       return 0;
> +}
> +
> +SEC("tp/syscalls/sys_enter_getpgid")
> +int strncmp_helper(void *ctx)
> +{
> +       if (bpf_strncmp(str, cmp_str_len + 1, target) < 0)
> +               __sync_add_and_fetch(&hits, 1);
> +       return 0;
> +}
> +
> --
> 2.29.2
>
