linux-security-module.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kees Cook <keescook@chromium.org>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: "zhujianwei (C)" <zhujianwei7@huawei.com>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"linux-security-module@vger.kernel.org"
	<linux-security-module@vger.kernel.org>,
	Hehuazhen <hehuazhen@huawei.com>,
	"Lennart Poettering" <lennart@poettering.net>,
	"Christian Ehrhardt" <christian.ehrhardt@canonical.com>,
	"Zbigniew Jędrzejewski-Szmek" <zbyszek@in.waw.pl>,
	daniel@iogearbox.net, netdev@vger.kernel.org
Subject: Re: new seccomp mode aims to improve performance
Date: Mon, 1 Jun 2020 11:16:14 -0700	[thread overview]
Message-ID: <202006011106.8766849C2@keescook> (raw)
In-Reply-To: <20200531171915.wsxvdjeetmhpsdv2@ast-mbp.dhcp.thefacebook.com>

On Sun, May 31, 2020 at 10:19:15AM -0700, Alexei Starovoitov wrote:
> Thank you for crafting a benchmark.
> The only thing that it's not doing a fair comparison.
> The problem with that patch [1] that is using:
> 
> static noinline u32 __seccomp_benchmark(struct bpf_prog *prog,
>                                         const struct seccomp_data *sd)
> {
>         return SECCOMP_RET_ALLOW;
> }
> 
> as a benchmarking function.
> The 'noinline' keyword tells the compiler to keep the body of the function, but
> the compiler is still doing full control and data flow analysis though this
> function and it is smart enough to optimize its usage in seccomp_run_filters()
> and in __seccomp_filter() because all functions are in a single .c file.
> Lots of code gets optimized away when 'f->benchmark' is on.
> 
> To make it into fair comparison I've added the following patch
> on top of your [1].
> 
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 2fdbf5ad8372..86204422e096 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -244,7 +244,7 @@ static int seccomp_check_filter(struct sock_filter *filter, unsigned int flen)
>         return 0;
>  }
> 
> -static noinline u32 __seccomp_benchmark(struct bpf_prog *prog,
> +__weak noinline u32 __seccomp_benchmark(struct bpf_prog *prog,
>                                         const struct seccomp_data *sd)
> 
> Please take a look at 'make kernel/seccomp.s' before and after to see the difference
> __weak keyword makes.

Ah yeah, thanks. That does bring it up to the same overhead. Nice!

> And here is what seccomp_benchmark now reports:
> 
> Benchmarking 33554432 samples...
> 22.618269641 - 15.030812794 = 7587456847
> getpid native: 226 ns
> 30.792042986 - 22.619048831 = 8172994155
> getpid RET_ALLOW 1 filter: 243 ns
> 39.451435038 - 30.792836778 = 8658598260
> getpid RET_ALLOW 2 filters: 258 ns
> 47.616011529 - 39.452190830 = 8163820699
> getpid BPF-less allow: 243 ns
> Estimated total seccomp overhead for 1 filter: 17 ns
> Estimated total seccomp overhead for 2 filters: 32 ns
> Estimated seccomp per-filter overhead: 15 ns
> Estimated seccomp entry overhead: 2 ns
> Estimated BPF overhead per filter: 0 ns
> 
> [...]
> 
> > So, with the layered nature of seccomp filters there's a reasonable gain
> > to be seen for a O(1) bitmap lookup to skip running even a single filter,
> > even for the fastest BPF mode.
> 
> This is not true.
> The O(1) bitmap implemented as kernel C code will have exactly the same speed
> as O(1) bitmap implemented as eBPF program.

Yes, that'd be true if it was the first (and only) filter. What I'm
trying to provide is a mechanism to speed up the syscalls for all
attached filters (i.e. create a seccomp fast-path). The reality of
seccomp usage is that it's very layered: systemd sets some (or many!),
then container runtime sets some, then the process itself might set
some.

-- 
Kees Cook

  reply	other threads:[~2020-06-01 18:36 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-29 12:48 new seccomp mode aims to improve performance zhujianwei (C)
2020-05-29 15:43 ` Alexei Starovoitov
2020-05-29 16:09   ` Kees Cook
2020-05-29 17:31     ` Alexei Starovoitov
2020-05-29 19:27     ` Kees Cook
2020-05-31 17:19       ` Alexei Starovoitov
2020-06-01 18:16         ` Kees Cook [this message]
2020-06-01  2:08       ` 答复: " zhujianwei (C)
2020-06-01  3:30         ` Alexei Starovoitov
2020-06-02  2:42           ` 答复: " zhujianwei (C)
2020-06-02  3:24             ` Alexei Starovoitov
2020-06-02 11:13               ` 答复: " zhujianwei (C)
2020-06-02 11:34               ` zhujianwei (C)
2020-06-02 18:32                 ` Kees Cook
2020-06-03  4:51                   ` 答复: " zhujianwei (C)
2020-06-01 10:11       ` Lennart Poettering
2020-06-01 12:32         ` Paul Moore
2020-06-02 12:53           ` Lennart Poettering
2020-06-02 15:03             ` Paul Moore
2020-06-02 18:39               ` Kees Cook
2020-06-01 18:21         ` Kees Cook
2020-06-02 12:44           ` Lennart Poettering
2020-06-02 18:37             ` Kees Cook
2020-06-16  6:00             ` Kees Cook

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=202006011106.8766849C2@keescook \
    --to=keescook@chromium.org \
    --cc=alexei.starovoitov@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=christian.ehrhardt@canonical.com \
    --cc=daniel@iogearbox.net \
    --cc=hehuazhen@huawei.com \
    --cc=lennart@poettering.net \
    --cc=linux-security-module@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=zbyszek@in.waw.pl \
    --cc=zhujianwei7@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).