* [PATCH bpf-next 1/2] bpf, x86: Small optimization in comparing against imm0
From: Daniel Borkmann @ 2019-10-02 23:45 UTC
  To: ast; +Cc: bpf, netdev, Daniel Borkmann

Replace 'cmp reg, 0' with 'test reg, reg' for comparisons against
zero. Saves 1 byte of instruction encoding per occurrence. The flag
results of test 'reg, reg' are identical to 'cmp reg, 0' in all
cases except for AF which we don't use/care about. In terms of
macro-fusibility in combination with a subsequent conditional jump
instruction, both have the same properties for the jumps used in
the JIT translation. For example, same JITed Cilium program can
shrink a bit from e.g. 12,455 to 12,317 bytes as tests with 0 are
used quite frequently.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 arch/x86/net/bpf_jit_comp.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 991549a1c5f3..3ad2ba1ad855 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -909,6 +909,16 @@ xadd:			if (is_imm8(insn->off))
 		case BPF_JMP32 | BPF_JSLT | BPF_K:
 		case BPF_JMP32 | BPF_JSGE | BPF_K:
 		case BPF_JMP32 | BPF_JSLE | BPF_K:
+			/* test dst_reg, dst_reg to save one extra byte */
+			if (imm32 == 0) {
+				if (BPF_CLASS(insn->code) == BPF_JMP)
+					EMIT1(add_2mod(0x48, dst_reg, dst_reg));
+				else if (is_ereg(dst_reg))
+					EMIT1(add_2mod(0x40, dst_reg, dst_reg));
+				EMIT2(0x85, add_2reg(0xC0, dst_reg, dst_reg));
+				goto emit_cond_jmp;
+			}
+
 			/* cmp dst_reg, imm8/32 */
 			if (BPF_CLASS(insn->code) == BPF_JMP)
 				EMIT1(add_1mod(0x48, dst_reg));
-- 
2.17.1
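The one-byte saving described in the commit message can be checked with a small standalone encoder. This is only a sketch, not the JIT's code: `encode_cmp_zero` and `encode_test_self` are hypothetical helpers, and they take raw x86-64 register numbers (0 = rax, 7 = rdi, 8..15 = r8..r15) rather than BPF register numbers, so the kernel's register mapping and its `EMIT*`/`add_*mod` helpers are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode "cmp reg, 0" in its imm8 form
 * (opcode 0x83 /7 ib). Returns the number of bytes written.
 */
static size_t encode_cmp_zero(uint8_t *out, unsigned reg, int is64)
{
	size_t n = 0;
	uint8_t rex;

	assert(reg < 16);
	/* REX prefix: W selects 64-bit operands, B extends the rm field. */
	rex = 0x40 | (is64 ? 0x08 : 0) | (reg >> 3);
	if (rex != 0x40)
		out[n++] = rex;
	out[n++] = 0x83;               /* group 1, r/m with imm8 */
	out[n++] = 0xF8 | (reg & 7);   /* ModRM: mod=11, /7 (cmp), rm=reg */
	out[n++] = 0x00;               /* the immediate zero */
	return n;
}

/*
 * Hypothetical helper: encode "test reg, reg" (opcode 0x85 /r).
 * The same register fills both the reg and rm fields, so there is
 * no immediate byte to emit.
 */
static size_t encode_test_self(uint8_t *out, unsigned reg, int is64)
{
	size_t n = 0;
	unsigned ext;
	uint8_t rex;

	assert(reg < 16);
	ext = reg >> 3;
	/* REX prefix: W for 64-bit, R and B both extend the same register. */
	rex = 0x40 | (is64 ? 0x08 : 0) | (ext << 2) | ext;
	if (rex != 0x40)
		out[n++] = rex;
	out[n++] = 0x85;
	out[n++] = 0xC0 | ((reg & 7) << 3) | (reg & 7); /* mod=11, reg, rm */
	return n;
}
```

For rdi with 64-bit operands the first helper writes `48 83 ff 00` and the second `48 85 ff`; with 32-bit operands the REX prefix disappears from both, which leaves the two-byte `test %edi,%edi` form.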
* [PATCH bpf-next 2/2] bpf: Add loop test case with 32 bit reg comparison against 0
From: Daniel Borkmann @ 2019-10-02 23:45 UTC
  To: ast; +Cc: bpf, netdev, Daniel Borkmann

Add a loop test with 32 bit register against 0 immediate:

  # ./test_verifier 631
  #631/p taken loop with back jump to 1st insn, 2 OK

Disassembly:

  [...]
  1b:	test   %edi,%edi
  1d:	jne    0x0000000000000014
  [...]

Pretty much similar to prior "taken loop with back jump to 1st
insn" test case just as jmp32 variant.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 tools/testing/selftests/bpf/verifier/loops1.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/tools/testing/selftests/bpf/verifier/loops1.c b/tools/testing/selftests/bpf/verifier/loops1.c
index 1fc4e61e9f9f..1af37187dc12 100644
--- a/tools/testing/selftests/bpf/verifier/loops1.c
+++ b/tools/testing/selftests/bpf/verifier/loops1.c
@@ -187,3 +187,20 @@
 	.prog_type = BPF_PROG_TYPE_XDP,
 	.retval = 55,
 },
+{
+	"taken loop with back jump to 1st insn, 2",
+	.insns = {
+	BPF_MOV64_IMM(BPF_REG_1, 10),
+	BPF_MOV64_IMM(BPF_REG_2, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_1),
+	BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 1),
+	BPF_JMP32_IMM(BPF_JNE, BPF_REG_1, 0, -3),
+	BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
+	BPF_EXIT_INSN(),
+	},
+	.result = ACCEPT,
+	.prog_type = BPF_PROG_TYPE_XDP,
+	.retval = 55,
+},
-- 
2.17.1
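To see where `.retval = 55` comes from, the called subprogram in the test can be modelled in plain C. This is a rough sketch under stated assumptions: `loop_model` is a hypothetical name, the start value is passed in as a parameter (the test hard-codes 10), and the BPF call/exit plumbing is reduced to an ordinary function return.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the subprogram body in the test:
 *   r2 += r1; r1 -= 1; if (w1 != 0) goto back; r0 = r2; exit
 * 'start' stands in for the value loaded into r1 (10 in the test).
 */
static uint64_t loop_model(uint64_t start)
{
	uint64_t r1 = start;
	uint64_t r2 = 0;

	/* Keep the sketch's run time small; the real test uses 10. */
	assert(start >= 1 && start <= 1000);

	do {
		r2 += r1;                    /* BPF_ALU64_REG(BPF_ADD, r2, r1) */
		r1 -= 1;                     /* BPF_ALU64_IMM(BPF_SUB, r1, 1)  */
	} while ((uint32_t)r1 != 0);         /* JMP32 JNE: low 32 bits only    */

	return r2;                           /* r0 = r2 */
}
```

With a start value of 10 the loop adds 10 + 9 + ... + 1, which gives the 55 the test expects. The cast to `uint32_t` mirrors the jmp32 comparison, which is the case the JIT now emits as `test %edi,%edi`.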
* Re: [PATCH bpf-next 2/2] bpf: Add loop test case with 32 bit reg comparison against 0
From: Song Liu @ 2019-10-03 20:56 UTC
  To: Daniel Borkmann; +Cc: Alexei Starovoitov, bpf, Networking

On Wed, Oct 2, 2019 at 5:30 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Add a loop test with 32 bit register against 0 immediate:
>
>   # ./test_verifier 631
>   #631/p taken loop with back jump to 1st insn, 2 OK
>
> Disassembly:
>
>   [...]
>   1b:	test   %edi,%edi
>   1d:	jne    0x0000000000000014
>   [...]
>
> Pretty much similar to prior "taken loop with back jump to 1st
> insn" test case just as jmp32 variant.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Acked-by: Song Liu <songliubraving@fb.com>
* Re: [PATCH bpf-next 1/2] bpf, x86: Small optimization in comparing against imm0
From: Song Liu @ 2019-10-03 20:52 UTC
  To: Daniel Borkmann; +Cc: Alexei Starovoitov, bpf, Networking

On Wed, Oct 2, 2019 at 5:30 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Replace 'cmp reg, 0' with 'test reg, reg' for comparisons against
> zero. Saves 1 byte of instruction encoding per occurrence. The flag
> results of test 'reg, reg' are identical to 'cmp reg, 0' in all
> cases except for AF which we don't use/care about. In terms of
> macro-fusibility in combination with a subsequent conditional jump
> instruction, both have the same properties for the jumps used in
> the JIT translation. For example, same JITed Cilium program can
> shrink a bit from e.g. 12,455 to 12,317 bytes as tests with 0 are
> used quite frequently.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Acked-by: Song Liu <songliubraving@fb.com>
* Re: [PATCH bpf-next 1/2] bpf, x86: Small optimization in comparing against imm0
From: John Fastabend @ 2019-10-03 21:08 UTC
  To: Song Liu, Daniel Borkmann; +Cc: Alexei Starovoitov, bpf, Networking

Song Liu wrote:
> On Wed, Oct 2, 2019 at 5:30 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >
> > Replace 'cmp reg, 0' with 'test reg, reg' for comparisons against
> > zero. Saves 1 byte of instruction encoding per occurrence. The flag
> > results of test 'reg, reg' are identical to 'cmp reg, 0' in all
> > cases except for AF which we don't use/care about. In terms of
> > macro-fusibility in combination with a subsequent conditional jump
> > instruction, both have the same properties for the jumps used in
> > the JIT translation. For example, same JITed Cilium program can
> > shrink a bit from e.g. 12,455 to 12,317 bytes as tests with 0 are
> > used quite frequently.
> >
> > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>
> Acked-by: Song Liu <songliubraving@fb.com>

Bonus points for causing me to spend the morning remembering the
differences between cmp, and, or, and test.

Also wonder if at some point we should clean up the jit a bit and
add some defines/helpers for all the open coded opcodes and such.

Acked-by: John Fastabend <john.fastabend@gmail.com>
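The flag equivalence claimed in the commit message can be modelled in a few lines of C. This is a simplified sketch, not hardware-accurate emulation: `struct x86_flags`, `flags_cmp_zero`, `flags_test_self` and `flags_equal` are hypothetical names, only 32-bit and 64-bit operand widths are handled, and AF is left out of the model because TEST leaves it undefined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag record; AF is intentionally not modelled. */
struct x86_flags {
	bool zf, sf, cf, of, pf;
};

/* PF is set when the low byte of the result has an even number of 1 bits. */
static bool even_parity_low8(uint64_t v)
{
	unsigned ones = 0;

	for (int i = 0; i < 8; i++)
		ones += (v >> i) & 1;
	return (ones & 1) == 0;
}

/* cmp reg, 0: flags of the discarded subtraction reg - 0. */
static struct x86_flags flags_cmp_zero(uint64_t reg, unsigned width)
{
	assert(width == 32 || width == 64);
	uint64_t mask = width == 64 ? UINT64_MAX : UINT32_MAX;
	uint64_t res = (reg - 0) & mask;
	struct x86_flags f = {
		.zf = res == 0,
		.sf = (res >> (width - 1)) & 1,
		.cf = false,	/* subtracting 0 never borrows */
		.of = false,	/* subtracting 0 never overflows */
		.pf = even_parity_low8(res),
	};
	return f;
}

/* test reg, reg: flags of the discarded bitwise AND reg & reg. */
static struct x86_flags flags_test_self(uint64_t reg, unsigned width)
{
	assert(width == 32 || width == 64);
	uint64_t mask = width == 64 ? UINT64_MAX : UINT32_MAX;
	uint64_t res = (reg & reg) & mask;
	struct x86_flags f = {
		.zf = res == 0,
		.sf = (res >> (width - 1)) & 1,
		.cf = false,	/* TEST always clears CF */
		.of = false,	/* TEST always clears OF */
		.pf = even_parity_low8(res),
	};
	return f;
}

static bool flags_equal(struct x86_flags a, struct x86_flags b)
{
	return a.zf == b.zf && a.sf == b.sf && a.cf == b.cf &&
	       a.of == b.of && a.pf == b.pf;
}
```

Both results reduce to the register value itself, since `reg - 0` and `reg & reg` are both `reg`, so every flag derived from the result matches. CF and OF match because neither operation can set them in this special case.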
* Re: [PATCH bpf-next 1/2] bpf, x86: Small optimization in comparing against imm0
From: Alexei Starovoitov @ 2019-10-04 19:36 UTC
  To: John Fastabend
  Cc: Song Liu, Daniel Borkmann, Alexei Starovoitov, bpf, Networking

On Thu, Oct 3, 2019 at 2:08 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Song Liu wrote:
> > On Wed, Oct 2, 2019 at 5:30 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> > >
> > > Replace 'cmp reg, 0' with 'test reg, reg' for comparisons against
> > > zero. Saves 1 byte of instruction encoding per occurrence. The flag
> > > results of test 'reg, reg' are identical to 'cmp reg, 0' in all
> > > cases except for AF which we don't use/care about. In terms of
> > > macro-fusibility in combination with a subsequent conditional jump
> > > instruction, both have the same properties for the jumps used in
> > > the JIT translation. For example, same JITed Cilium program can
> > > shrink a bit from e.g. 12,455 to 12,317 bytes as tests with 0 are
> > > used quite frequently.
> > >
> > > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> >
> > Acked-by: Song Liu <songliubraving@fb.com>
>
> Bonus points for causing me to spend the morning remembering the
> differences between cmp, and, or, and test.
>
> Also wonder if at some point we should clean up the jit a bit and
> add some defines/helpers for all the open coded opcodes and such.
>
> Acked-by: John Fastabend <john.fastabend@gmail.com>

Applied both. Thanks
Thread overview: 6+ messages
2019-10-02 23:45 [PATCH bpf-next 1/2] bpf, x86: Small optimization in comparing against imm0 Daniel Borkmann
2019-10-02 23:45 ` [PATCH bpf-next 2/2] bpf: Add loop test case with 32 bit reg comparison against 0 Daniel Borkmann
2019-10-03 20:56   ` Song Liu
2019-10-03 20:52 ` [PATCH bpf-next 1/2] bpf, x86: Small optimization in comparing against imm0 Song Liu
2019-10-03 21:08   ` John Fastabend
2019-10-04 19:36     ` Alexei Starovoitov