* [PATCH bpf 0/2] bpf, arm: Small JIT optimizations
@ 2020-05-01  2:02 Luke Nelson
  2020-05-01  2:02 ` [PATCH bpf-next 1/2] bpf, arm: Optimize ALU64 ARSH X using orrpl conditional instruction Luke Nelson
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Luke Nelson @ 2020-05-01  2:02 UTC
  To: bpf
  Cc: Luke Nelson, Shubham Bansal, Russell King, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	Andrii Nakryiko, John Fastabend, KP Singh, netdev,
	linux-arm-kernel, linux-kernel

As Daniel suggested to us, we ran our formal verification tool, Serval,
over the arm JIT. The bugs we found have been patched and applied to the
bpf tree [1, 2]. This patch series introduces two small optimizations
that simplify the JIT and use fewer instructions.

[1] https://lore.kernel.org/bpf/20200408181229.10909-1-luke.r.nels@gmail.com/
[2] https://lore.kernel.org/bpf/20200409221752.28448-1-luke.r.nels@gmail.com/

Luke Nelson (2):
  bpf, arm: Optimize emit_a32_arsh_r64 using conditional instruction
  bpf, arm: Optimize ALU ARSH K using asr immediate instruction

 arch/arm/net/bpf_jit_32.c | 14 +++++++++-----
 arch/arm/net/bpf_jit_32.h |  2 ++
 2 files changed, 11 insertions(+), 5 deletions(-)

-- 
2.17.1


* [PATCH bpf-next 1/2] bpf, arm: Optimize ALU64 ARSH X using orrpl conditional instruction
From: Luke Nelson @ 2020-05-01  2:02 UTC
  To: bpf
  Cc: Luke Nelson, Xi Wang, Shubham Bansal, Russell King,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, Andrii Nakryiko, John Fastabend, KP Singh, netdev,
	linux-arm-kernel, linux-kernel

This patch optimizes the code generated by emit_a32_arsh_r64, which
handles the BPF_ALU64 BPF_ARSH BPF_X instruction.

The original code uses a conditional B (bmi) to skip over an unconditional
ORR when the shift amount is less than 32. The optimization saves one
instruction by removing the B instruction and making the ORR itself
conditional, with the inverted condition (orrpl).
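
For readers less familiar with A32 predication: data-processing
instructions carry a 4-bit condition in bits 31:28, and MI (0x4) and PL
(0x5) differ only in the lowest bit, so a branch on MI over an instruction
can be replaced by predicating that instruction on PL. The following is
only an illustrative sketch; with_cond() and invert_cond() are hypothetical
helpers and are not functions from bpf_jit_32.c.

```c
#include <stdint.h>

/* A32 condition codes used in this patch, plus "always". */
enum { COND_MI = 0x4, COND_PL = 0x5, COND_AL = 0xe };

/* Hypothetical helper: place a condition code into bits 31:28. */
static uint32_t with_cond(uint32_t cond, uint32_t inst)
{
	return (inst & 0x0fffffffu) | (cond << 28);
}

/*
 * Hypothetical helper: A32 conditions come in complementary pairs that
 * differ in bit 0 (EQ/NE, MI/PL, ...). Not meaningful for COND_AL.
 */
static uint32_t invert_cond(uint32_t cond)
{
	return cond ^ 1u;
}
```

Under these assumptions, invert_cond(COND_MI) is COND_PL, which matches the
change from a bmi branch to an orrpl instruction.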

Example of the code generated for BPF_ALU64_REG(BPF_ARSH, BPF_REG_0,
BPF_REG_1), before optimization:

  34:  rsb    ip, r2, #32
  38:  subs   r9, r2, #32
  3c:  lsr    lr, r0, r2
  40:  orr    lr, lr, r1, lsl ip
  44:  bmi    0x4c
  48:  orr    lr, lr, r1, asr r9
  4c:  asr    ip, r1, r2
  50:  mov    r0, lr
  54:  mov    r1, ip

and after optimization:

  34:  rsb    ip, r2, #32
  38:  subs   r9, r2, #32
  3c:  lsr    lr, r0, r2
  40:  orr    lr, lr, r1, lsl ip
  44:  orrpl  lr, lr, r1, asr r9
  48:  asr    ip, r1, r2
  4c:  mov    r0, lr
  50:  mov    r1, ip

Tested on QEMU using lib/test_bpf and test_verifier.
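
As a reading aid, the optimized sequence can be modeled in portable C and
compared against a plain 64-bit arithmetic shift. This is a sketch under
stated assumptions, not the JIT's code: the shift amount is assumed to be
in 0..63, register-controlled shifts are modeled as using only the low byte
of the shift register, and all function names below are hypothetical.

```c
#include <stdint.h>

/* A32 register-controlled shifts use only the low byte of the shift register. */
static uint32_t arm_lsl(uint32_t v, uint32_t s)
{
	s &= 0xff;
	return s < 32 ? v << s : 0;
}

static uint32_t arm_lsr(uint32_t v, uint32_t s)
{
	s &= 0xff;
	return s < 32 ? v >> s : 0;
}

static uint32_t arm_asr(uint32_t v, uint32_t s)
{
	uint32_t sign = (v & 0x80000000u) ? 0xffffffffu : 0;

	s &= 0xff;
	if (s == 0)
		return v;
	if (s >= 32)
		return sign;
	return (v >> s) | (sign << (32 - s));
}

/* Hypothetical model of the optimized sequence; rt is the shift amount (0..63). */
static uint64_t model_arsh64(uint64_t dst, uint32_t rt)
{
	uint32_t lo = (uint32_t)dst;
	uint32_t hi = (uint32_t)(dst >> 32);
	uint32_t ip = 32u - rt;			/* rsb   ip, rt, #32 */
	uint32_t t = rt - 32u;			/* subs  t, rt, #32 */
	int n_clear = !(t & 0x80000000u);	/* PL holds when N is clear */
	uint32_t lr = arm_lsr(lo, rt);		/* lsr   lr, lo, rt */

	lr |= arm_lsl(hi, ip);			/* orr   lr, lr, hi, lsl ip */
	if (n_clear)
		lr |= arm_asr(hi, t);		/* orrpl lr, lr, hi, asr t */
	ip = arm_asr(hi, rt);			/* asr   ip, hi, rt */
	return ((uint64_t)ip << 32) | lr;
}

/* Reference: 64-bit arithmetic shift right without relying on signed >>. */
static uint64_t ref_arsh64(uint64_t v, uint32_t s)
{
	uint64_t sign = (v >> 63) ? ~(uint64_t)0 : 0;

	return s ? (v >> s) | (sign << (64 - s)) : v;
}

/* Count shift amounts in 0..63 where the model and the reference disagree. */
static int count_mismatches(uint64_t v)
{
	int bad = 0;

	for (uint32_t s = 0; s < 64; s++)
		bad += model_arsh64(v, s) != ref_arsh64(v, s);
	return bad;
}
```

The predicated ORR matters for shift amounts below 32: there rt - 32 wraps
to a large value whose low byte is at least 32, so an unpredicated
"asr" operand would contribute sign bits to the low word.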

Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
---
 arch/arm/net/bpf_jit_32.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index bf85d6db4931..48b89211ee5c 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -860,8 +860,8 @@ static inline void emit_a32_arsh_r64(const s8 dst[], const s8 src[],
 	emit(ARM_SUBS_I(tmp2[0], rt, 32), ctx);
 	emit(ARM_MOV_SR(ARM_LR, rd[1], SRTYPE_LSR, rt), ctx);
 	emit(ARM_ORR_SR(ARM_LR, ARM_LR, rd[0], SRTYPE_ASL, ARM_IP), ctx);
-	_emit(ARM_COND_MI, ARM_B(0), ctx);
-	emit(ARM_ORR_SR(ARM_LR, ARM_LR, rd[0], SRTYPE_ASR, tmp2[0]), ctx);
+	_emit(ARM_COND_PL,
+	      ARM_ORR_SR(ARM_LR, ARM_LR, rd[0], SRTYPE_ASR, tmp2[0]), ctx);
 	emit(ARM_MOV_SR(ARM_IP, rd[0], SRTYPE_ASR, rt), ctx);
 
 	arm_bpf_put_reg32(dst_lo, ARM_LR, ctx);
-- 
2.17.1


* [PATCH bpf-next 2/2] bpf, arm: Optimize ALU ARSH K using asr immediate instruction
From: Luke Nelson @ 2020-05-01  2:02 UTC
  To: bpf
  Cc: Luke Nelson, Xi Wang, Shubham Bansal, Russell King,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, Andrii Nakryiko, John Fastabend, KP Singh, netdev,
	linux-arm-kernel, linux-kernel

This patch adds an optimization that uses the asr immediate instruction
for BPF_ALU BPF_ARSH BPF_K, rather than loading the immediate into
a temporary register. This is similar to the existing code for handling
BPF_ALU BPF_{LSH,RSH} BPF_K. The optimization saves two instructions
and makes ARSH consistent with LSH and RSH.

Example of the code generated for BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 5)
before the optimization:

  2c:  mov    r8, #5
  30:  mov    r9, #0
  34:  asr    r0, r0, r8

and after optimization:

  2c:  asr    r0, r0, #5

Tested on QEMU using lib/test_bpf and test_verifier.
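
To illustrate how the new ARM_INST_ASR_I base opcode turns into an
instruction word, the sketch below ORs in the destination register, the
source register and a 5-bit shift amount. It is an assumption-laden
illustration: encode_asr_imm() and COND_AL are hypothetical names, the
field positions follow the A32 "MOV with shifted register operand"
encoding, and the JIT's own macros may compose the word differently.

```c
#include <stdint.h>

#define ARM_INST_ASR_I	0x01a00040u	/* value added to bpf_jit_32.h by this patch */
#define COND_AL		0xeu		/* A32 "always" condition (hypothetical name) */

/* Hypothetical helper: encode "asr rd, rm, #imm" for imm in 1..31. */
static uint32_t encode_asr_imm(uint32_t rd, uint32_t rm, uint32_t imm)
{
	return (COND_AL << 28) | ARM_INST_ASR_I |
	       (rd << 12) | ((imm & 0x1fu) << 7) | rm;
}
```

With these assumptions, encode_asr_imm(0, 0, 5) yields 0xe1a002c0, the word
for "asr r0, r0, #5". Note that in this encoding a shift field of 0 denotes
ASR #32 rather than a shift by zero, which is presumably why the shared
code path only emits the shift when imm is nonzero.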

Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
---
 arch/arm/net/bpf_jit_32.c | 10 +++++++---
 arch/arm/net/bpf_jit_32.h |  3 +++
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 48b89211ee5c..0207b6ea6e8a 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -795,6 +795,9 @@ static inline void emit_a32_alu_i(const s8 dst, const u32 val,
 	case BPF_RSH:
 		emit(ARM_LSR_I(rd, rd, val), ctx);
 		break;
+	case BPF_ARSH:
+		emit(ARM_ASR_I(rd, rd, val), ctx);
+		break;
 	case BPF_NEG:
 		emit(ARM_RSB_I(rd, rd, val), ctx);
 		break;
@@ -1408,7 +1411,6 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 	case BPF_ALU | BPF_MUL | BPF_X:
 	case BPF_ALU | BPF_LSH | BPF_X:
 	case BPF_ALU | BPF_RSH | BPF_X:
-	case BPF_ALU | BPF_ARSH | BPF_K:
 	case BPF_ALU | BPF_ARSH | BPF_X:
 	case BPF_ALU64 | BPF_ADD | BPF_K:
 	case BPF_ALU64 | BPF_ADD | BPF_X:
@@ -1465,10 +1467,12 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 	case BPF_ALU64 | BPF_MOD | BPF_K:
 	case BPF_ALU64 | BPF_MOD | BPF_X:
 		goto notyet;
-	/* dst = dst >> imm */
 	/* dst = dst << imm */
-	case BPF_ALU | BPF_RSH | BPF_K:
+	/* dst = dst >> imm */
+	/* dst = dst >> imm (signed) */
 	case BPF_ALU | BPF_LSH | BPF_K:
+	case BPF_ALU | BPF_RSH | BPF_K:
+	case BPF_ALU | BPF_ARSH | BPF_K:
 		if (unlikely(imm > 31))
 			return -EINVAL;
 		if (imm)
diff --git a/arch/arm/net/bpf_jit_32.h b/arch/arm/net/bpf_jit_32.h
index fb67cbc589e0..e0b593a1498d 100644
--- a/arch/arm/net/bpf_jit_32.h
+++ b/arch/arm/net/bpf_jit_32.h
@@ -94,6 +94,9 @@
 #define ARM_INST_LSR_I		0x01a00020
 #define ARM_INST_LSR_R		0x01a00030
 
+#define ARM_INST_ASR_I		0x01a00040
+#define ARM_INST_ASR_R		0x01a00050
+
 #define ARM_INST_MOV_R		0x01a00000
 #define ARM_INST_MOVS_R		0x01b00000
 #define ARM_INST_MOV_I		0x03a00000
-- 
2.17.1


* Re: [PATCH bpf 0/2] bpf, arm: Small JIT optimizations
From: Daniel Borkmann @ 2020-05-04 16:05 UTC
  To: Luke Nelson, bpf
  Cc: Luke Nelson, Shubham Bansal, Russell King, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
	John Fastabend, KP Singh, netdev, linux-arm-kernel, linux-kernel

On 5/1/20 4:02 AM, Luke Nelson wrote:
> As Daniel suggested to us, we ran our formal verification tool, Serval,
> over the arm JIT. The bugs we found have been patched and applied to the
> bpf tree [1, 2]. This patch series introduces two small optimizations
> that simplify the JIT and use fewer instructions.
> 
> [1] https://lore.kernel.org/bpf/20200408181229.10909-1-luke.r.nels@gmail.com/
> [2] https://lore.kernel.org/bpf/20200409221752.28448-1-luke.r.nels@gmail.com/
> 
> Luke Nelson (2):
>    bpf, arm: Optimize emit_a32_arsh_r64 using conditional instruction
>    bpf, arm: Optimize ALU ARSH K using asr immediate instruction
> 
>   arch/arm/net/bpf_jit_32.c | 14 +++++++++-----
>   arch/arm/net/bpf_jit_32.h |  2 ++
>   2 files changed, 11 insertions(+), 5 deletions(-)
> 

Applied, thanks!
