From: Palmer Dabbelt <palmerdabbelt@google.com>
To: Bjorn Topel <bjorn.topel@gmail.com>
Cc: daniel@iogearbox.net, ast@kernel.org, zlim.lnx@gmail.com,
	catalin.marinas@arm.com, will@kernel.org, kafai@fb.com,
	songliubraving@fb.com, yhs@fb.com, andriin@fb.com, shuah@kernel.org,
	Palmer Dabbelt <palmerdabbelt@google.com>, netdev@vger.kernel.org,
	bpf@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	clang-built-linux@googlegroups.com, kernel-team@android.com
Subject: arm64: bpf: Elide some moves to a0 after calls
Date: Mon, 27 Jan 2020 18:11:41 -0800
Message-ID: <20200128021145.36774-1-palmerdabbelt@google.com>

There are four patches here, but only one of them actually does anything.
The first patch fixes a BPF selftests build failure on my machine and has
already been sent to the list separately.  The next three are staged so
that the refactorings, which avoid changing any functionality, are pulled
out from the change they exist to enable: two cleanups and then the idea.

Maybe this is an odd thing to say in a cover letter, but I'm not actually
sure this patch set is a good idea.  The issue of extra moves after calls
came up as I was reviewing some unrelated performance optimizations to the
RISC-V BPF JIT.  I figured I'd take a whack at performing the optimization
in the context of the arm64 port just to get a breath of fresh air, and
I'm not convinced I like the results.

That said, I think I would accept something like this for the RISC-V port
because we're already doing a multi-pass optimization for shrinking
function addresses so it's not as much extra complexity over there.  If we
do that we should probably start pulling some of this code into the shared
BPF compiler, but we're also opening the doors to more complicated BPF JIT
optimizations.  Given that the BPF JIT appears to have been designed
explicitly to be simple/fast rather than to perform complex optimizations,
I'm not sure this is a sane way to move forward.

I figured I'd send the patch set out as more of a question than anything
else.  Specifically:

* How should I go about measuring the performance of these sorts of
  optimizations?  I'd like to balance the time it takes to run the JIT
  with the time spent executing the program, but I don't have any feel for
  what real BPF programs look like, nor do I have a benchmark suite to
  run.  Is there something out there this should be benchmarked against?
  (I'd also like to know so I can run those benchmarks on the RISC-V
  port.)
* Is this the sort of thing that makes sense in a BPF JIT?  I guess I've
  just realized I turned "review this patch" into a way bigger rabbit hole
  than I really want to go down...

I worked on top of 5.4 for these, but trivially different versions of the
patches applied on Linus' master a few days ago when I tried.  LMK if
those aren't sane places to start from over here; I'm new to both arm64
and BPF so I might be a bit lost.

[PATCH 1/4] selftests/bpf: Elide a check for LLVM versions that can't
[PATCH 2/4] arm64: bpf: Convert bpf2a64 to a function
[PATCH 3/4] arm64: bpf: Split the read and write halves of dst
[PATCH 4/4] arm64: bpf: Elide some moves to a0 after calls
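
One way to frame the first question is as a break-even count: how many
times a program must run before the per-run saving repays the extra time
spent in the JIT.  The sketch below is only an illustration of that
arithmetic.  The function name, its parameters, and the numbers in the
usage note are hypothetical and are not measurements from this thread.

```python
import math


def break_even_runs(extra_jit_ns: float, saved_ns_per_run: float) -> float:
    """Return how many executions are needed to repay an optimization.

    extra_jit_ns:     hypothetical one-time cost the optimization adds to JIT time
    saved_ns_per_run: hypothetical execution time saved on every run
    """
    if extra_jit_ns < 0 or saved_ns_per_run < 0:
        raise ValueError("both inputs must be non-negative")
    if saved_ns_per_run == 0:
        # No per-run saving: the extra JIT time is never recovered.
        return math.inf
    # Round up, since a partial run does not repay anything.
    return float(math.ceil(extra_jit_ns / saved_ns_per_run))
```

For example, with made-up inputs, break_even_runs(5000, 2) gives 2500.0:
an extra 5000 ns of JIT time is repaid after 2500 runs if each run is 2 ns
faster.  Both inputs would still have to come from real measurements, which
is what the question above asks for.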
Thread overview (times in UTC):
2020-01-28  2:11 Palmer Dabbelt [this message]
2020-01-28  2:11 ` [PATCH 1/4] selftests/bpf: Elide a check for LLVM versions that can't compile it Palmer Dabbelt
2020-02-11 18:20 ` Nick Desaulniers
2020-01-28  2:11 ` [PATCH 2/4] arm64: bpf: Convert bpf2a64 to a function Palmer Dabbelt
2020-01-28  2:11 ` [PATCH 3/4] arm64: bpf: Split the read and write halves of dst Palmer Dabbelt
2020-01-28  2:11 ` [PATCH 4/4] arm64: bpf: Elide some moves to a0 after calls Palmer Dabbelt
2020-02-04 19:13 ` Björn Töpel
2020-02-11  0:15 ` Alexei Starovoitov
2020-02-04 19:30 ` Björn Töpel
2020-02-04 20:33 ` John Fastabend
2020-02-18 19:28 ` Palmer Dabbelt