bpf.vger.kernel.org archive mirror
* [PATCH bpf-next 1/2] bpf, x86: Small optimization in comparing against imm0
@ 2019-10-02 23:45 Daniel Borkmann
  2019-10-02 23:45 ` [PATCH bpf-next 2/2] bpf: Add loop test case with 32 bit reg comparison against 0 Daniel Borkmann
  2019-10-03 20:52 ` [PATCH bpf-next 1/2] bpf, x86: Small optimization in comparing against imm0 Song Liu
  0 siblings, 2 replies; 6+ messages in thread
From: Daniel Borkmann @ 2019-10-02 23:45 UTC (permalink / raw)
  To: ast; +Cc: bpf, netdev, Daniel Borkmann

Replace 'cmp reg, 0' with 'test reg, reg' for comparisons against
zero. Saves 1 byte of instruction encoding per occurrence. The flag
results of 'test reg, reg' are identical to 'cmp reg, 0' in all
cases except for AF, which we don't use/care about. In terms of
macro-fusibility in combination with a subsequent conditional jump
instruction, both have the same properties for the jumps used in
the JIT translation. For example, the same JITed Cilium program can
shrink a bit from e.g. 12,455 to 12,317 bytes as tests with 0 are
used quite frequently.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 arch/x86/net/bpf_jit_comp.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 991549a1c5f3..3ad2ba1ad855 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -909,6 +909,16 @@ xadd:			if (is_imm8(insn->off))
 		case BPF_JMP32 | BPF_JSLT | BPF_K:
 		case BPF_JMP32 | BPF_JSGE | BPF_K:
 		case BPF_JMP32 | BPF_JSLE | BPF_K:
+			/* test dst_reg, dst_reg to save one extra byte */
+			if (imm32 == 0) {
+				if (BPF_CLASS(insn->code) == BPF_JMP)
+					EMIT1(add_2mod(0x48, dst_reg, dst_reg));
+				else if (is_ereg(dst_reg))
+					EMIT1(add_2mod(0x40, dst_reg, dst_reg));
+				EMIT2(0x85, add_2reg(0xC0, dst_reg, dst_reg));
+				goto emit_cond_jmp;
+			}
+
 			/* cmp dst_reg, imm8/32 */
 			if (BPF_CLASS(insn->code) == BPF_JMP)
 				EMIT1(add_1mod(0x48, dst_reg));
-- 
2.17.1
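
The one-byte saving described in the commit message can be seen by
writing out both encodings. The following is a minimal standalone
sketch, not the kernel's JIT code: emit_cmp_zero() and emit_test_self()
are hypothetical helpers, and they take raw x86 hardware register
numbers (0..15, e.g. 7 for rdi/edi) rather than BPF register numbers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only. 'reg' is an x86 hardware
 * register number (0..15); 'is64' selects the 64 bit operand size.
 * Each function writes the instruction into 'buf' and returns its length.
 */

/* cmp reg, 0 using the imm8 form: [REX] 83 /7 ib */
static size_t emit_cmp_zero(uint8_t *buf, unsigned int reg, int is64)
{
	size_t n = 0;
	/* REX.W for 64 bit operands, REX.B when reg is r8..r15 */
	uint8_t rex = (is64 ? 0x48 : 0x40) | (reg >= 8 ? 0x01 : 0x00);

	if (is64 || reg >= 8)
		buf[n++] = rex;
	buf[n++] = 0x83;
	buf[n++] = 0xF8 | (reg & 7);	/* ModRM: mod=11, /7, rm=reg */
	buf[n++] = 0x00;		/* the immediate byte that test avoids */
	return n;
}

/* test reg, reg: [REX] 85 /r */
static size_t emit_test_self(uint8_t *buf, unsigned int reg, int is64)
{
	size_t n = 0;
	/* reg appears in both ModRM fields, so r8..r15 need REX.R and REX.B */
	uint8_t rex = (is64 ? 0x48 : 0x40) | (reg >= 8 ? 0x05 : 0x00);

	if (is64 || reg >= 8)
		buf[n++] = rex;
	buf[n++] = 0x85;
	buf[n++] = 0xC0 | ((reg & 7) << 3) | (reg & 7);	/* ModRM: mod=11 */
	return n;
}
```

For edi in 32 bit mode, emit_test_self() produces the two bytes 85 ff,
while emit_cmp_zero() produces the three bytes 83 ff 00. In every
combination of register and operand size the test form is exactly one
byte shorter, since only the trailing immediate is dropped.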



* [PATCH bpf-next 2/2] bpf: Add loop test case with 32 bit reg comparison against 0
  2019-10-02 23:45 [PATCH bpf-next 1/2] bpf, x86: Small optimization in comparing against imm0 Daniel Borkmann
@ 2019-10-02 23:45 ` Daniel Borkmann
  2019-10-03 20:56   ` Song Liu
  2019-10-03 20:52 ` [PATCH bpf-next 1/2] bpf, x86: Small optimization in comparing against imm0 Song Liu
  1 sibling, 1 reply; 6+ messages in thread
From: Daniel Borkmann @ 2019-10-02 23:45 UTC (permalink / raw)
  To: ast; +Cc: bpf, netdev, Daniel Borkmann

Add a loop test comparing a 32 bit register against a 0 immediate:

  # ./test_verifier 631
  #631/p taken loop with back jump to 1st insn, 2 OK

Disassembly:

  [...]
  1b:	test   %edi,%edi
  1d:	jne    0x0000000000000014
  [...]

Very similar to the prior "taken loop with back jump to 1st
insn" test case, just as a jmp32 variant.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 tools/testing/selftests/bpf/verifier/loops1.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/tools/testing/selftests/bpf/verifier/loops1.c b/tools/testing/selftests/bpf/verifier/loops1.c
index 1fc4e61e9f9f..1af37187dc12 100644
--- a/tools/testing/selftests/bpf/verifier/loops1.c
+++ b/tools/testing/selftests/bpf/verifier/loops1.c
@@ -187,3 +187,20 @@
 	.prog_type = BPF_PROG_TYPE_XDP,
 	.retval = 55,
 },
+{
+	"taken loop with back jump to 1st insn, 2",
+	.insns = {
+	BPF_MOV64_IMM(BPF_REG_1, 10),
+	BPF_MOV64_IMM(BPF_REG_2, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_1),
+	BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 1),
+	BPF_JMP32_IMM(BPF_JNE, BPF_REG_1, 0, -3),
+	BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
+	BPF_EXIT_INSN(),
+	},
+	.result = ACCEPT,
+	.prog_type = BPF_PROG_TYPE_XDP,
+	.retval = 55,
+},
-- 
2.17.1
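
For readers unfamiliar with the instruction macros, the test program
can be read as the following plain C. This is only a sketch under the
assumption that the BPF_RAW_INSN line is a BPF-to-BPF call (src_reg 1)
into the instruction following the first BPF_EXIT_INSN(); subprog()
and run_test_prog() are hypothetical names used for illustration.

```c
#include <stdint.h>

/* Body of the called subprogram: r1 and r2 are passed through unchanged. */
static uint64_t subprog(uint64_t r1, uint64_t r2)
{
	do {
		r2 += r1;	/* BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_1) */
		r1 -= 1;	/* BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 1) */
		/* BPF_JMP32_IMM(BPF_JNE, BPF_REG_1, 0, -3): only the low
		 * 32 bits of r1 are compared against 0. */
	} while ((uint32_t)r1 != 0);

	return r2;		/* BPF_MOV64_REG(BPF_REG_0, BPF_REG_2) */
}

/* Main program: set up r1 = 10, r2 = 0, call, then exit with r0. */
static uint64_t run_test_prog(void)
{
	return subprog(10, 0);
}
```

The loop adds 10 + 9 + ... + 1, which matches the expected .retval
of 55. The 32 bit compare against 0 at the loop tail is the
instruction that the JIT now emits as 'test %edi,%edi'.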



* Re: [PATCH bpf-next 1/2] bpf, x86: Small optimization in comparing against imm0
  2019-10-02 23:45 [PATCH bpf-next 1/2] bpf, x86: Small optimization in comparing against imm0 Daniel Borkmann
  2019-10-02 23:45 ` [PATCH bpf-next 2/2] bpf: Add loop test case with 32 bit reg comparison against 0 Daniel Borkmann
@ 2019-10-03 20:52 ` Song Liu
  2019-10-03 21:08   ` John Fastabend
  1 sibling, 1 reply; 6+ messages in thread
From: Song Liu @ 2019-10-03 20:52 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Alexei Starovoitov, bpf, Networking

On Wed, Oct 2, 2019 at 5:30 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Replace 'cmp reg, 0' with 'test reg, reg' for comparisons against
> zero. Saves 1 byte of instruction encoding per occurrence. The flag
> results of 'test reg, reg' are identical to 'cmp reg, 0' in all
> cases except for AF, which we don't use/care about. In terms of
> macro-fusibility in combination with a subsequent conditional jump
> instruction, both have the same properties for the jumps used in
> the JIT translation. For example, the same JITed Cilium program can
> shrink a bit from e.g. 12,455 to 12,317 bytes as tests with 0 are
> used quite frequently.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Acked-by: Song Liu <songliubraving@fb.com>


* Re: [PATCH bpf-next 2/2] bpf: Add loop test case with 32 bit reg comparison against 0
  2019-10-02 23:45 ` [PATCH bpf-next 2/2] bpf: Add loop test case with 32 bit reg comparison against 0 Daniel Borkmann
@ 2019-10-03 20:56   ` Song Liu
  0 siblings, 0 replies; 6+ messages in thread
From: Song Liu @ 2019-10-03 20:56 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Alexei Starovoitov, bpf, Networking

On Wed, Oct 2, 2019 at 5:30 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Add a loop test comparing a 32 bit register against a 0 immediate:
>
>   # ./test_verifier 631
>   #631/p taken loop with back jump to 1st insn, 2 OK
>
> Disassembly:
>
>   [...]
>   1b:   test   %edi,%edi
>   1d:   jne    0x0000000000000014
>   [...]
>
> Very similar to the prior "taken loop with back jump to 1st
> insn" test case, just as a jmp32 variant.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Acked-by: Song Liu <songliubraving@fb.com>


* Re: [PATCH bpf-next 1/2] bpf, x86: Small optimization in comparing against imm0
  2019-10-03 20:52 ` [PATCH bpf-next 1/2] bpf, x86: Small optimization in comparing against imm0 Song Liu
@ 2019-10-03 21:08   ` John Fastabend
  2019-10-04 19:36     ` Alexei Starovoitov
  0 siblings, 1 reply; 6+ messages in thread
From: John Fastabend @ 2019-10-03 21:08 UTC (permalink / raw)
  To: Song Liu, Daniel Borkmann; +Cc: Alexei Starovoitov, bpf, Networking

Song Liu wrote:
> On Wed, Oct 2, 2019 at 5:30 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >
> > Replace 'cmp reg, 0' with 'test reg, reg' for comparisons against
> > zero. Saves 1 byte of instruction encoding per occurrence. The flag
> > results of 'test reg, reg' are identical to 'cmp reg, 0' in all
> > cases except for AF, which we don't use/care about. In terms of
> > macro-fusibility in combination with a subsequent conditional jump
> > instruction, both have the same properties for the jumps used in
> > the JIT translation. For example, the same JITed Cilium program can
> > shrink a bit from e.g. 12,455 to 12,317 bytes as tests with 0 are
> > used quite frequently.
> >
> > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> 
> Acked-by: Song Liu <songliubraving@fb.com>

Bonus points for causing me to spend the morning remembering the
differences between cmp, and, or, and test.

Also wonder if at some point we should clean up the jit a bit and
add some defines/helpers for all the open coded opcodes and such.

Acked-by: John Fastabend <john.fastabend@gmail.com>
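
The flag equivalence claimed in the commit message can be checked with
a small model of how cmp and test set the status flags. This is a
simplified sketch for 32 bit operands, not taken from the kernel or
from any emulator: struct flags, flags_cmp() and flags_test() are
hypothetical names, and AF is deliberately left out because it is the
one flag on which the two instructions may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* The flags that conditional jumps consume; AF is intentionally omitted. */
struct flags {
	bool zf, sf, pf, cf, of;
};

/* PF is set when the low byte of the result has an even number of 1 bits. */
static bool even_parity_low_byte(uint32_t v)
{
	v &= 0xFF;
	v ^= v >> 4;
	v ^= v >> 2;
	v ^= v >> 1;
	return (v & 1) == 0;
}

/* cmp dst, src: flags of (dst - src), result discarded. */
static struct flags flags_cmp(uint32_t dst, uint32_t src)
{
	uint32_t res = dst - src;
	struct flags f;

	f.zf = res == 0;
	f.sf = (res >> 31) != 0;
	f.pf = even_parity_low_byte(res);
	f.cf = dst < src;	/* unsigned borrow */
	/* signed overflow: operands differ in sign and result flips dst's sign */
	f.of = (((dst ^ src) & (dst ^ res)) >> 31) != 0;
	return f;
}

/* test a, b: flags of (a & b), result discarded; CF and OF are cleared. */
static struct flags flags_test(uint32_t a, uint32_t b)
{
	uint32_t res = a & b;
	struct flags f;

	f.zf = res == 0;
	f.sf = (res >> 31) != 0;
	f.pf = even_parity_low_byte(res);
	f.cf = false;
	f.of = false;
	return f;
}

static bool same_flags(struct flags a, struct flags b)
{
	return a.zf == b.zf && a.sf == b.sf && a.pf == b.pf &&
	       a.cf == b.cf && a.of == b.of;
}
```

With src fixed to 0, the subtraction leaves dst unchanged and can
neither borrow nor overflow, so flags_cmp(x, 0) and flags_test(x, x)
agree for every x. A nonzero immediate breaks this: for instance
flags_cmp(0, 1) sets CF, which test never does.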


* Re: [PATCH bpf-next 1/2] bpf, x86: Small optimization in comparing against imm0
  2019-10-03 21:08   ` John Fastabend
@ 2019-10-04 19:36     ` Alexei Starovoitov
  0 siblings, 0 replies; 6+ messages in thread
From: Alexei Starovoitov @ 2019-10-04 19:36 UTC (permalink / raw)
  To: John Fastabend
  Cc: Song Liu, Daniel Borkmann, Alexei Starovoitov, bpf, Networking

On Thu, Oct 3, 2019 at 2:08 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Song Liu wrote:
> > On Wed, Oct 2, 2019 at 5:30 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> > >
> > > Replace 'cmp reg, 0' with 'test reg, reg' for comparisons against
> > > zero. Saves 1 byte of instruction encoding per occurrence. The flag
> > > results of 'test reg, reg' are identical to 'cmp reg, 0' in all
> > > cases except for AF, which we don't use/care about. In terms of
> > > macro-fusibility in combination with a subsequent conditional jump
> > > instruction, both have the same properties for the jumps used in
> > > the JIT translation. For example, the same JITed Cilium program can
> > > shrink a bit from e.g. 12,455 to 12,317 bytes as tests with 0 are
> > > used quite frequently.
> > >
> > > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> >
> > Acked-by: Song Liu <songliubraving@fb.com>
>
> Bonus points for causing me to spend the morning remembering the
> differences between cmp, and, or, and test.
>
> Also wonder if at some point we should clean up the jit a bit and
> add some defines/helpers for all the open coded opcodes and such.
>
> Acked-by: John Fastabend <john.fastabend@gmail.com>

Applied both. Thanks

