* [PATCH bpf-next v3 0/4] bpf, arm64: support more atomic ops
@ 2022-01-29 22:04 Hou Tao
  2022-01-29 22:04 ` [PATCH bpf-next v3 1/4] arm64: move AARCH64_BREAK_FAULT into insn-def.h Hou Tao
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Hou Tao @ 2022-01-29 22:04 UTC (permalink / raw)
  To: Alexei Starovoitov, Mark Rutland
  Cc: Martin KaFai Lau, Yonghong Song, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, David S . Miller, John Fastabend,
	netdev, bpf, houtao1, Zi Shen Lim, Catalin Marinas, Will Deacon,
	Julien Thierry, Ard Biesheuvel, linux-arm-kernel

Hi,

Atomics support in bpf was added by the "Atomics for eBPF" patch
series [1], but it only implemented the new operations for x86-64, so
this patchset adds support for arm64.

Patches #1 and #2 are arm64 related. Patch #1 moves the commonly used
macro AARCH64_BREAK_FAULT into insn-def.h so that it can be used by
insn.h. Patch #2 adds the necessary encoder helpers for atomic operations.

Patch #3 implements atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor}
and atomic[64]_{xchg|cmpxchg} for the arm64 bpf JIT. Patch #4 changes the
type of the test program from fentry/ to raw_tp/ for the atomics test.

For both the cpus_have_cap(ARM64_HAS_LSE_ATOMICS) case and the
no-LSE-atomics case, "./test_verifier", "./test_progs -t atomic" and
"insmod ./test_bpf.ko" were exercised and passed.

Comments are always welcome.

Regards,
Tao

[1]: https://lore.kernel.org/bpf/20210114181751.768687-2-jackmanb@google.com/

Change Log:
v3:
 * split arm64 insn related code into a separate patch (from Mark)
 * update enum name in aarch64_insn_mem_atomic_op (from Mark)
 * consider all cases for aarch64_insn_mem_order_type and
   aarch64_insn_mb_type (from Mark)
 * exercise and pass "insmod ./test_bpf.ko" test (suggested by Daniel)
 * remove aarch64_insn_gen_store_release_ex() and extend
   aarch64_insn_ldst_type instead
 * compile aarch64_insn_gen_atomic_ld_op(), aarch64_insn_gen_cas() and
   emit_lse_atomic() out when CONFIG_ARM64_LSE_ATOMICS is disabled.

v2: https://lore.kernel.org/bpf/20220127075322.675323-1-houtao1@huawei.com/
  * patch #1: use two separate ASSERT_OK() calls instead of ASSERT_TRUE()
  * add Acked-by tag for both patches

v1: https://lore.kernel.org/bpf/20220121135632.136976-1-houtao1@huawei.com/

Hou Tao (4):
  arm64: move AARCH64_BREAK_FAULT into insn-def.h
  arm64: insn: add encoders for atomic operations
  bpf, arm64: support more atomic operations
  selftests/bpf: use raw_tp program for atomic test

 arch/arm64/include/asm/debug-monitors.h       |  12 -
 arch/arm64/include/asm/insn-def.h             |  14 ++
 arch/arm64/include/asm/insn.h                 |  80 ++++++-
 arch/arm64/lib/insn.c                         | 185 +++++++++++++--
 arch/arm64/net/bpf_jit.h                      |  44 +++-
 arch/arm64/net/bpf_jit_comp.c                 | 223 ++++++++++++++----
 .../selftests/bpf/prog_tests/atomics.c        |  91 ++-----
 tools/testing/selftests/bpf/progs/atomics.c   |  28 +--
 8 files changed, 517 insertions(+), 160 deletions(-)

-- 
2.27.0


* [PATCH bpf-next v3 1/4] arm64: move AARCH64_BREAK_FAULT into insn-def.h
  2022-01-29 22:04 [PATCH bpf-next v3 0/4] bpf, arm64: support more atomic ops Hou Tao
@ 2022-01-29 22:04 ` Hou Tao
  2022-01-29 22:04 ` [PATCH bpf-next v3 2/4] arm64: insn: add encoders for atomic operations Hou Tao
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Hou Tao @ 2022-01-29 22:04 UTC (permalink / raw)
  To: Alexei Starovoitov, Mark Rutland
  Cc: Martin KaFai Lau, Yonghong Song, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, David S . Miller, John Fastabend,
	netdev, bpf, houtao1, Zi Shen Lim, Catalin Marinas, Will Deacon,
	Julien Thierry, Ard Biesheuvel, linux-arm-kernel

If CONFIG_ARM64_LSE_ATOMICS is off, the encoders for LSE-related
instructions can return AARCH64_BREAK_FAULT directly from stubs in insn.h.
To make AARCH64_BREAK_FAULT accessible in insn.h we cannot include
debug-monitors.h there, because debug-monitors.h already depends on
insn.h, so just move AARCH64_BREAK_FAULT into insn-def.h.

It will be used by the following patch to eliminate unnecessary LSE-related
encoders when CONFIG_ARM64_LSE_ATOMICS is off.
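
For illustration, once AARCH64_BREAK_FAULT lives in insn-def.h, insn.h can
carry stubs along the following lines when LSE atomics are compiled out
(only a sketch here; the real stubs are introduced by the next patch):

  #ifndef CONFIG_ARM64_LSE_ATOMICS
  static inline
  u32 aarch64_insn_gen_cas(enum aarch64_insn_register result,
                           enum aarch64_insn_register address,
                           enum aarch64_insn_register value,
                           enum aarch64_insn_size_type size,
                           enum aarch64_insn_mem_order_type order)
  {
          /* LSE atomics compiled out: fault if this encoding is ever emitted */
          return AARCH64_BREAK_FAULT;
  }
  #endif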

Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 arch/arm64/include/asm/debug-monitors.h | 12 ------------
 arch/arm64/include/asm/insn-def.h       | 14 ++++++++++++++
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/debug-monitors.h b/arch/arm64/include/asm/debug-monitors.h
index 657c921fd784..00c291067e57 100644
--- a/arch/arm64/include/asm/debug-monitors.h
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -34,18 +34,6 @@
  */
 #define BREAK_INSTR_SIZE		AARCH64_INSN_SIZE
 
-/*
- * BRK instruction encoding
- * The #imm16 value should be placed at bits[20:5] within BRK ins
- */
-#define AARCH64_BREAK_MON	0xd4200000
-
-/*
- * BRK instruction for provoking a fault on purpose
- * Unlike kgdb, #imm16 value with unallocated handler is used for faulting.
- */
-#define AARCH64_BREAK_FAULT	(AARCH64_BREAK_MON | (FAULT_BRK_IMM << 5))
-
 #define AARCH64_BREAK_KGDB_DYN_DBG	\
 	(AARCH64_BREAK_MON | (KGDB_DYN_DBG_BRK_IMM << 5))
 
diff --git a/arch/arm64/include/asm/insn-def.h b/arch/arm64/include/asm/insn-def.h
index 2c075f615c6a..1a7d0d483698 100644
--- a/arch/arm64/include/asm/insn-def.h
+++ b/arch/arm64/include/asm/insn-def.h
@@ -3,7 +3,21 @@
 #ifndef __ASM_INSN_DEF_H
 #define __ASM_INSN_DEF_H
 
+#include <asm/brk-imm.h>
+
 /* A64 instructions are always 32 bits. */
 #define	AARCH64_INSN_SIZE		4
 
+/*
+ * BRK instruction encoding
+ * The #imm16 value should be placed at bits[20:5] within BRK ins
+ */
+#define AARCH64_BREAK_MON	0xd4200000
+
+/*
+ * BRK instruction for provoking a fault on purpose
+ * Unlike kgdb, #imm16 value with unallocated handler is used for faulting.
+ */
+#define AARCH64_BREAK_FAULT	(AARCH64_BREAK_MON | (FAULT_BRK_IMM << 5))
+
 #endif /* __ASM_INSN_DEF_H */
-- 
2.27.0


* [PATCH bpf-next v3 2/4] arm64: insn: add encoders for atomic operations
  2022-01-29 22:04 [PATCH bpf-next v3 0/4] bpf, arm64: support more atomic ops Hou Tao
  2022-01-29 22:04 ` [PATCH bpf-next v3 1/4] arm64: move AARCH64_BREAK_FAULT into insn-def.h Hou Tao
@ 2022-01-29 22:04 ` Hou Tao
  2022-02-11 14:39   ` Daniel Borkmann
  2022-01-29 22:04 ` [PATCH bpf-next v3 3/4] bpf, arm64: support more " Hou Tao
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Hou Tao @ 2022-01-29 22:04 UTC (permalink / raw)
  To: Alexei Starovoitov, Mark Rutland
  Cc: Martin KaFai Lau, Yonghong Song, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, David S . Miller, John Fastabend,
	netdev, bpf, houtao1, Zi Shen Lim, Catalin Marinas, Will Deacon,
	Julien Thierry, Ard Biesheuvel, linux-arm-kernel

This is a preparation patch for eBPF atomics support on arm64. eBPF needs
support for atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor} and
atomic[64]_{xchg|cmpxchg}. The ordering semantics of eBPF atomics are the
same as those of the corresponding implementations in the Linux kernel.

Add three helpers to support the LDCLR/LDEOR/LDSET/SWP, CAS and DMB
instructions. STADD/STCLR/STEOR/STSET are simply encoded as aliases for
LDADD/LDCLR/LDEOR/LDSET with XZR as the destination register, so no extra
helper is added. atomic_fetch_add() and the other atomic ops need support
for the STLXR instruction, so extend enum aarch64_insn_ldst_type to cover it.
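
For example, a plain STADD can be obtained from the new helper by passing
XZR as the result register (a sketch only; the register choices here are
arbitrary examples):

  /* Encode "stadd w1, [x0]": LDADD with XZR as Rt is the STADD alias. */
  u32 insn = aarch64_insn_gen_atomic_ld_op(AARCH64_INSN_REG_ZR, /* Rt: XZR */
                                           AARCH64_INSN_REG_0,  /* Rn: x0  */
                                           AARCH64_INSN_REG_1,  /* Rs: w1  */
                                           AARCH64_INSN_SIZE_32,
                                           AARCH64_INSN_MEM_ATOMIC_ADD,
                                           AARCH64_INSN_MEM_ORDER_NONE);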

The LDADD/LDCLR/LDEOR/LDSET/SWP and CAS instructions are only available
when LSE atomics are enabled, so just return AARCH64_BREAK_FAULT directly
from these newly-added helpers if CONFIG_ARM64_LSE_ATOMICS is disabled.

Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 arch/arm64/include/asm/insn.h |  80 +++++++++++++--
 arch/arm64/lib/insn.c         | 185 +++++++++++++++++++++++++++++++---
 2 files changed, 244 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 6b776c8667b2..0b6b31307e68 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -205,7 +205,9 @@ enum aarch64_insn_ldst_type {
 	AARCH64_INSN_LDST_LOAD_PAIR_POST_INDEX,
 	AARCH64_INSN_LDST_STORE_PAIR_POST_INDEX,
 	AARCH64_INSN_LDST_LOAD_EX,
+	AARCH64_INSN_LDST_LOAD_ACQ_EX,
 	AARCH64_INSN_LDST_STORE_EX,
+	AARCH64_INSN_LDST_STORE_REL_EX,
 };
 
 enum aarch64_insn_adsb_type {
@@ -280,6 +282,36 @@ enum aarch64_insn_adr_type {
 	AARCH64_INSN_ADR_TYPE_ADR,
 };
 
+enum aarch64_insn_mem_atomic_op {
+	AARCH64_INSN_MEM_ATOMIC_ADD,
+	AARCH64_INSN_MEM_ATOMIC_CLR,
+	AARCH64_INSN_MEM_ATOMIC_EOR,
+	AARCH64_INSN_MEM_ATOMIC_SET,
+	AARCH64_INSN_MEM_ATOMIC_SWP,
+};
+
+enum aarch64_insn_mem_order_type {
+	AARCH64_INSN_MEM_ORDER_NONE,
+	AARCH64_INSN_MEM_ORDER_ACQ,
+	AARCH64_INSN_MEM_ORDER_REL,
+	AARCH64_INSN_MEM_ORDER_ACQREL,
+};
+
+enum aarch64_insn_mb_type {
+	AARCH64_INSN_MB_SY,
+	AARCH64_INSN_MB_ST,
+	AARCH64_INSN_MB_LD,
+	AARCH64_INSN_MB_ISH,
+	AARCH64_INSN_MB_ISHST,
+	AARCH64_INSN_MB_ISHLD,
+	AARCH64_INSN_MB_NSH,
+	AARCH64_INSN_MB_NSHST,
+	AARCH64_INSN_MB_NSHLD,
+	AARCH64_INSN_MB_OSH,
+	AARCH64_INSN_MB_OSHST,
+	AARCH64_INSN_MB_OSHLD,
+};
+
 #define	__AARCH64_INSN_FUNCS(abbr, mask, val)				\
 static __always_inline bool aarch64_insn_is_##abbr(u32 code)		\
 {									\
@@ -303,6 +335,11 @@ __AARCH64_INSN_FUNCS(store_post,	0x3FE00C00, 0x38000400)
 __AARCH64_INSN_FUNCS(load_post,	0x3FE00C00, 0x38400400)
 __AARCH64_INSN_FUNCS(str_reg,	0x3FE0EC00, 0x38206800)
 __AARCH64_INSN_FUNCS(ldadd,	0x3F20FC00, 0x38200000)
+__AARCH64_INSN_FUNCS(ldclr,	0x3F20FC00, 0x38201000)
+__AARCH64_INSN_FUNCS(ldeor,	0x3F20FC00, 0x38202000)
+__AARCH64_INSN_FUNCS(ldset,	0x3F20FC00, 0x38203000)
+__AARCH64_INSN_FUNCS(swp,	0x3F20FC00, 0x38208000)
+__AARCH64_INSN_FUNCS(cas,	0x3FA07C00, 0x08A07C00)
 __AARCH64_INSN_FUNCS(ldr_reg,	0x3FE0EC00, 0x38606800)
 __AARCH64_INSN_FUNCS(ldr_lit,	0xBF000000, 0x18000000)
 __AARCH64_INSN_FUNCS(ldrsw_lit,	0xFF000000, 0x98000000)
@@ -474,13 +511,6 @@ u32 aarch64_insn_gen_load_store_ex(enum aarch64_insn_register reg,
 				   enum aarch64_insn_register state,
 				   enum aarch64_insn_size_type size,
 				   enum aarch64_insn_ldst_type type);
-u32 aarch64_insn_gen_ldadd(enum aarch64_insn_register result,
-			   enum aarch64_insn_register address,
-			   enum aarch64_insn_register value,
-			   enum aarch64_insn_size_type size);
-u32 aarch64_insn_gen_stadd(enum aarch64_insn_register address,
-			   enum aarch64_insn_register value,
-			   enum aarch64_insn_size_type size);
 u32 aarch64_insn_gen_add_sub_imm(enum aarch64_insn_register dst,
 				 enum aarch64_insn_register src,
 				 int imm, enum aarch64_insn_variant variant,
@@ -541,6 +571,42 @@ u32 aarch64_insn_gen_prefetch(enum aarch64_insn_register base,
 			      enum aarch64_insn_prfm_type type,
 			      enum aarch64_insn_prfm_target target,
 			      enum aarch64_insn_prfm_policy policy);
+#ifdef CONFIG_ARM64_LSE_ATOMICS
+u32 aarch64_insn_gen_atomic_ld_op(enum aarch64_insn_register result,
+				  enum aarch64_insn_register address,
+				  enum aarch64_insn_register value,
+				  enum aarch64_insn_size_type size,
+				  enum aarch64_insn_mem_atomic_op op,
+				  enum aarch64_insn_mem_order_type order);
+u32 aarch64_insn_gen_cas(enum aarch64_insn_register result,
+			 enum aarch64_insn_register address,
+			 enum aarch64_insn_register value,
+			 enum aarch64_insn_size_type size,
+			 enum aarch64_insn_mem_order_type order);
+#else
+static inline
+u32 aarch64_insn_gen_atomic_ld_op(enum aarch64_insn_register result,
+				  enum aarch64_insn_register address,
+				  enum aarch64_insn_register value,
+				  enum aarch64_insn_size_type size,
+				  enum aarch64_insn_mem_atomic_op op,
+				  enum aarch64_insn_mem_order_type order)
+{
+	return AARCH64_BREAK_FAULT;
+}
+
+static inline
+u32 aarch64_insn_gen_cas(enum aarch64_insn_register result,
+			 enum aarch64_insn_register address,
+			 enum aarch64_insn_register value,
+			 enum aarch64_insn_size_type size,
+			 enum aarch64_insn_mem_order_type order)
+{
+	return AARCH64_BREAK_FAULT;
+}
+#endif
+u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type);
+
 s32 aarch64_get_branch_offset(u32 insn);
 u32 aarch64_set_branch_offset(u32 insn, s32 offset);
 
diff --git a/arch/arm64/lib/insn.c b/arch/arm64/lib/insn.c
index fccfe363e567..bd119fde8504 100644
--- a/arch/arm64/lib/insn.c
+++ b/arch/arm64/lib/insn.c
@@ -578,10 +578,16 @@ u32 aarch64_insn_gen_load_store_ex(enum aarch64_insn_register reg,
 
 	switch (type) {
 	case AARCH64_INSN_LDST_LOAD_EX:
+	case AARCH64_INSN_LDST_LOAD_ACQ_EX:
 		insn = aarch64_insn_get_load_ex_value();
+		if (type == AARCH64_INSN_LDST_LOAD_ACQ_EX)
+			insn |= BIT(15);
 		break;
 	case AARCH64_INSN_LDST_STORE_EX:
+	case AARCH64_INSN_LDST_STORE_REL_EX:
 		insn = aarch64_insn_get_store_ex_value();
+		if (type == AARCH64_INSN_LDST_STORE_REL_EX)
+			insn |= BIT(15);
 		break;
 	default:
 		pr_err("%s: unknown load/store exclusive encoding %d\n", __func__, type);
@@ -603,12 +609,65 @@ u32 aarch64_insn_gen_load_store_ex(enum aarch64_insn_register reg,
 					    state);
 }
 
-u32 aarch64_insn_gen_ldadd(enum aarch64_insn_register result,
-			   enum aarch64_insn_register address,
-			   enum aarch64_insn_register value,
-			   enum aarch64_insn_size_type size)
+#ifdef CONFIG_ARM64_LSE_ATOMICS
+static u32 aarch64_insn_encode_ldst_order(enum aarch64_insn_mem_order_type type,
+					  u32 insn)
 {
-	u32 insn = aarch64_insn_get_ldadd_value();
+	u32 order;
+
+	switch (type) {
+	case AARCH64_INSN_MEM_ORDER_NONE:
+		order = 0;
+		break;
+	case AARCH64_INSN_MEM_ORDER_ACQ:
+		order = 2;
+		break;
+	case AARCH64_INSN_MEM_ORDER_REL:
+		order = 1;
+		break;
+	case AARCH64_INSN_MEM_ORDER_ACQREL:
+		order = 3;
+		break;
+	default:
+		pr_err("%s: unknown mem order %d\n", __func__, type);
+		return AARCH64_BREAK_FAULT;
+	}
+
+	insn &= ~GENMASK(23, 22);
+	insn |= order << 22;
+
+	return insn;
+}
+
+u32 aarch64_insn_gen_atomic_ld_op(enum aarch64_insn_register result,
+				  enum aarch64_insn_register address,
+				  enum aarch64_insn_register value,
+				  enum aarch64_insn_size_type size,
+				  enum aarch64_insn_mem_atomic_op op,
+				  enum aarch64_insn_mem_order_type order)
+{
+	u32 insn;
+
+	switch (op) {
+	case AARCH64_INSN_MEM_ATOMIC_ADD:
+		insn = aarch64_insn_get_ldadd_value();
+		break;
+	case AARCH64_INSN_MEM_ATOMIC_CLR:
+		insn = aarch64_insn_get_ldclr_value();
+		break;
+	case AARCH64_INSN_MEM_ATOMIC_EOR:
+		insn = aarch64_insn_get_ldeor_value();
+		break;
+	case AARCH64_INSN_MEM_ATOMIC_SET:
+		insn = aarch64_insn_get_ldset_value();
+		break;
+	case AARCH64_INSN_MEM_ATOMIC_SWP:
+		insn = aarch64_insn_get_swp_value();
+		break;
+	default:
+		pr_err("%s: unimplemented mem atomic op %d\n", __func__, op);
+		return AARCH64_BREAK_FAULT;
+	}
 
 	switch (size) {
 	case AARCH64_INSN_SIZE_32:
@@ -621,6 +680,8 @@ u32 aarch64_insn_gen_ldadd(enum aarch64_insn_register result,
 
 	insn = aarch64_insn_encode_ldst_size(size, insn);
 
+	insn = aarch64_insn_encode_ldst_order(order, insn);
+
 	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RT, insn,
 					    result);
 
@@ -631,17 +692,68 @@ u32 aarch64_insn_gen_ldadd(enum aarch64_insn_register result,
 					    value);
 }
 
-u32 aarch64_insn_gen_stadd(enum aarch64_insn_register address,
-			   enum aarch64_insn_register value,
-			   enum aarch64_insn_size_type size)
+static u32 aarch64_insn_encode_cas_order(enum aarch64_insn_mem_order_type type,
+					 u32 insn)
 {
-	/*
-	 * STADD is simply encoded as an alias for LDADD with XZR as
-	 * the destination register.
-	 */
-	return aarch64_insn_gen_ldadd(AARCH64_INSN_REG_ZR, address,
-				      value, size);
+	u32 order;
+
+	switch (type) {
+	case AARCH64_INSN_MEM_ORDER_NONE:
+		order = 0;
+		break;
+	case AARCH64_INSN_MEM_ORDER_ACQ:
+		order = BIT(22);
+		break;
+	case AARCH64_INSN_MEM_ORDER_REL:
+		order = BIT(15);
+		break;
+	case AARCH64_INSN_MEM_ORDER_ACQREL:
+		order = BIT(15) | BIT(22);
+		break;
+	default:
+		pr_err("%s: unknown mem order %d\n", __func__, type);
+		return AARCH64_BREAK_FAULT;
+	}
+
+	insn &= ~(BIT(15) | BIT(22));
+	insn |= order;
+
+	return insn;
+}
+
+u32 aarch64_insn_gen_cas(enum aarch64_insn_register result,
+			 enum aarch64_insn_register address,
+			 enum aarch64_insn_register value,
+			 enum aarch64_insn_size_type size,
+			 enum aarch64_insn_mem_order_type order)
+{
+	u32 insn;
+
+	switch (size) {
+	case AARCH64_INSN_SIZE_32:
+	case AARCH64_INSN_SIZE_64:
+		break;
+	default:
+		pr_err("%s: unimplemented size encoding %d\n", __func__, size);
+		return AARCH64_BREAK_FAULT;
+	}
+
+	insn = aarch64_insn_get_cas_value();
+
+	insn = aarch64_insn_encode_ldst_size(size, insn);
+
+	insn = aarch64_insn_encode_cas_order(order, insn);
+
+	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RT, insn,
+					    result);
+
+	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn,
+					    address);
+
+	return aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RS, insn,
+					    value);
 }
+#endif
 
 static u32 aarch64_insn_encode_prfm_imm(enum aarch64_insn_prfm_type type,
 					enum aarch64_insn_prfm_target target,
@@ -1456,3 +1568,48 @@ u32 aarch64_insn_gen_extr(enum aarch64_insn_variant variant,
 	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, Rn);
 	return aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RM, insn, Rm);
 }
+
+u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type)
+{
+	u32 opt;
+	u32 insn;
+
+	switch (type) {
+	case AARCH64_INSN_MB_SY:
+		opt = 0xf;
+		break;
+	case AARCH64_INSN_MB_ST:
+		opt = 0xe;
+		break;
+	case AARCH64_INSN_MB_LD:
+		opt = 0xd;
+		break;
+	case AARCH64_INSN_MB_ISH:
+		opt = 0xb;
+		break;
+	case AARCH64_INSN_MB_ISHST:
+		opt = 0xa;
+		break;
+	case AARCH64_INSN_MB_ISHLD:
+		opt = 0x9;
+		break;
+	case AARCH64_INSN_MB_NSH:
+		opt = 0x7;
+		break;
+	case AARCH64_INSN_MB_NSHST:
+		opt = 0x6;
+		break;
+	case AARCH64_INSN_MB_NSHLD:
+		opt = 0x5;
+		break;
+	default:
+		pr_err("%s: unknown dmb type %d\n", __func__, type);
+		return AARCH64_BREAK_FAULT;
+	}
+
+	insn = aarch64_insn_get_dmb_value();
+	insn &= ~GENMASK(11, 8);
+	insn |= (opt << 8);
+
+	return insn;
+}
-- 
2.27.0


* [PATCH bpf-next v3 3/4] bpf, arm64: support more atomic operations
  2022-01-29 22:04 [PATCH bpf-next v3 0/4] bpf, arm64: support more atomic ops Hou Tao
  2022-01-29 22:04 ` [PATCH bpf-next v3 1/4] arm64: move AARCH64_BREAK_FAULT into insn-def.h Hou Tao
  2022-01-29 22:04 ` [PATCH bpf-next v3 2/4] arm64: insn: add encoders for atomic operations Hou Tao
@ 2022-01-29 22:04 ` Hou Tao
  2022-01-29 22:04 ` [PATCH bpf-next v3 4/4] selftests/bpf: use raw_tp program for atomic test Hou Tao
  2022-02-15  4:25 ` [PATCH bpf-next v3 0/4] bpf, arm64: support more atomic ops Hou Tao
  4 siblings, 0 replies; 11+ messages in thread
From: Hou Tao @ 2022-01-29 22:04 UTC (permalink / raw)
  To: Alexei Starovoitov, Mark Rutland
  Cc: Martin KaFai Lau, Yonghong Song, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, David S . Miller, John Fastabend,
	netdev, bpf, houtao1, Zi Shen Lim, Catalin Marinas, Will Deacon,
	Julien Thierry, Ard Biesheuvel, linux-arm-kernel

The "Atomics for eBPF" patch series added support for atomic[64]_fetch_add,
atomic[64]_[fetch_]{and,or,xor} and atomic[64]_{xchg|cmpxchg}, but it only
implemented them for x86-64, so support these atomic operations for arm64
as well.

The implementation is basically a mechanical translation of the code
snippets in atomic_ll_sc.h, atomic_lse.h and cmpxchg.h under
arch/arm64/include/asm.

When LSE atomics are unavailable, an extra temporary register is needed
for (BPF_ADD | BPF_FETCH) to save the value of the src register; instead
of adding a TMP_REG_4, just reuse BPF_REG_AX. Also make emit_lse_atomic()
an empty inline function when CONFIG_ARM64_LSE_ATOMICS is disabled.
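
As a condensed sketch (not the literal JIT code), the sequence that
emit_ll_sc_atomic() below produces for BPF_ADD | BPF_FETCH looks like
this, where ax is the scratch register mapped from BPF_REG_AX:

  emit(A64_MOV(isdw, ax, src), ctx);           /* save the operand in ax        */
  emit(A64_LDXR(isdw, src, reg), ctx);         /* exclusive load of old value   */
  emit(A64_ADD(isdw, tmp2, src, ax), ctx);     /* tmp2 = old value + operand    */
  emit(A64_STLXR(isdw, tmp2, reg, tmp3), ctx); /* store-release, status in tmp3 */
  emit(A64_CBNZ(0, tmp3, -3), ctx);            /* retry if the store failed     */
  emit(A64_DMB_ISH, ctx);                      /* full barrier after the loop   */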

For both the cpus_have_cap(ARM64_HAS_LSE_ATOMICS) case and the
no-LSE-atomics case, the following three tests were exercised and passed:
"./test_verifier", "./test_progs -t atomic" and "insmod ./test_bpf.ko".

Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 arch/arm64/net/bpf_jit.h      |  44 ++++++-
 arch/arm64/net/bpf_jit_comp.c | 223 +++++++++++++++++++++++++++-------
 2 files changed, 223 insertions(+), 44 deletions(-)

diff --git a/arch/arm64/net/bpf_jit.h b/arch/arm64/net/bpf_jit.h
index cc0cf0f5c7c3..dd59b5ad8fe4 100644
--- a/arch/arm64/net/bpf_jit.h
+++ b/arch/arm64/net/bpf_jit.h
@@ -88,10 +88,42 @@
 /* [Rn] = Rt; (atomic) Rs = [state] */
 #define A64_STXR(sf, Rt, Rn, Rs) \
 	A64_LSX(sf, Rt, Rn, Rs, STORE_EX)
+/* [Rn] = Rt (store release); (atomic) Rs = [state] */
+#define A64_STLXR(sf, Rt, Rn, Rs) \
+	aarch64_insn_gen_load_store_ex(Rt, Rn, Rs, A64_SIZE(sf), \
+				       AARCH64_INSN_LDST_STORE_REL_EX)
+
+/*
+ * LSE atomics
+ *
+ * ST{ADD,CLR,SET,EOR} is simply encoded as an alias for
+ * LD{ADD,CLR,SET,EOR} with XZR as the destination register.
+ */
+#define A64_ST_OP(sf, Rn, Rs, op) \
+	aarch64_insn_gen_atomic_ld_op(A64_ZR, Rn, Rs, \
+		A64_SIZE(sf), AARCH64_INSN_MEM_ATOMIC_##op, \
+		AARCH64_INSN_MEM_ORDER_NONE)
+/* [Rn] <op>= Rs */
+#define A64_STADD(sf, Rn, Rs) A64_ST_OP(sf, Rn, Rs, ADD)
+#define A64_STCLR(sf, Rn, Rs) A64_ST_OP(sf, Rn, Rs, CLR)
+#define A64_STEOR(sf, Rn, Rs) A64_ST_OP(sf, Rn, Rs, EOR)
+#define A64_STSET(sf, Rn, Rs) A64_ST_OP(sf, Rn, Rs, SET)
 
-/* LSE atomics */
-#define A64_STADD(sf, Rn, Rs) \
-	aarch64_insn_gen_stadd(Rn, Rs, A64_SIZE(sf))
+#define A64_LD_OP_AL(sf, Rt, Rn, Rs, op) \
+	aarch64_insn_gen_atomic_ld_op(Rt, Rn, Rs, \
+		A64_SIZE(sf), AARCH64_INSN_MEM_ATOMIC_##op, \
+		AARCH64_INSN_MEM_ORDER_ACQREL)
+/* Rt = [Rn] (load acquire); [Rn] <op>= Rs (store release) */
+#define A64_LDADDAL(sf, Rt, Rn, Rs) A64_LD_OP_AL(sf, Rt, Rn, Rs, ADD)
+#define A64_LDCLRAL(sf, Rt, Rn, Rs) A64_LD_OP_AL(sf, Rt, Rn, Rs, CLR)
+#define A64_LDEORAL(sf, Rt, Rn, Rs) A64_LD_OP_AL(sf, Rt, Rn, Rs, EOR)
+#define A64_LDSETAL(sf, Rt, Rn, Rs) A64_LD_OP_AL(sf, Rt, Rn, Rs, SET)
+/* Rt = [Rn] (load acquire); [Rn] = Rs (store release) */
+#define A64_SWPAL(sf, Rt, Rn, Rs) A64_LD_OP_AL(sf, Rt, Rn, Rs, SWP)
+/* Rs = CAS(Rn, Rs, Rt) (load acquire & store release) */
+#define A64_CASAL(sf, Rt, Rn, Rs) \
+	aarch64_insn_gen_cas(Rt, Rn, Rs, A64_SIZE(sf), \
+		AARCH64_INSN_MEM_ORDER_ACQREL)
 
 /* Add/subtract (immediate) */
 #define A64_ADDSUB_IMM(sf, Rd, Rn, imm12, type) \
@@ -196,6 +228,9 @@
 #define A64_ANDS(sf, Rd, Rn, Rm) A64_LOGIC_SREG(sf, Rd, Rn, Rm, AND_SETFLAGS)
 /* Rn & Rm; set condition flags */
 #define A64_TST(sf, Rn, Rm) A64_ANDS(sf, A64_ZR, Rn, Rm)
+/* Rd = ~Rm (alias of ORN with A64_ZR as Rn) */
+#define A64_MVN(sf, Rd, Rm)  \
+	A64_LOGIC_SREG(sf, Rd, A64_ZR, Rm, ORN)
 
 /* Logical (immediate) */
 #define A64_LOGIC_IMM(sf, Rd, Rn, imm, type) ({ \
@@ -219,4 +254,7 @@
 #define A64_BTI_J  A64_HINT(AARCH64_INSN_HINT_BTIJ)
 #define A64_BTI_JC A64_HINT(AARCH64_INSN_HINT_BTIJC)
 
+/* DMB */
+#define A64_DMB_ISH aarch64_insn_gen_dmb(AARCH64_INSN_MB_ISH)
+
 #endif /* _BPF_JIT_H */
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 74f9a9b6a053..2375ed3e4c8a 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -27,6 +27,17 @@
 #define TCALL_CNT (MAX_BPF_JIT_REG + 2)
 #define TMP_REG_3 (MAX_BPF_JIT_REG + 3)
 
+#define check_imm(bits, imm) do {				\
+	if ((((imm) > 0) && ((imm) >> (bits))) ||		\
+	    (((imm) < 0) && (~(imm) >> (bits)))) {		\
+		pr_info("[%2d] imm=%d(0x%x) out of range\n",	\
+			i, imm, imm);				\
+		return -EINVAL;					\
+	}							\
+} while (0)
+#define check_imm19(imm) check_imm(19, imm)
+#define check_imm26(imm) check_imm(26, imm)
+
 /* Map BPF registers to A64 registers */
 static const int bpf2a64[] = {
 	/* return value from in-kernel function, and exit value from eBPF */
@@ -329,6 +340,170 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx)
 #undef jmp_offset
 }
 
+#ifdef CONFIG_ARM64_LSE_ATOMICS
+static int emit_lse_atomic(const struct bpf_insn *insn, struct jit_ctx *ctx)
+{
+	const u8 code = insn->code;
+	const u8 dst = bpf2a64[insn->dst_reg];
+	const u8 src = bpf2a64[insn->src_reg];
+	const u8 tmp = bpf2a64[TMP_REG_1];
+	const u8 tmp2 = bpf2a64[TMP_REG_2];
+	const bool isdw = BPF_SIZE(code) == BPF_DW;
+	const s16 off = insn->off;
+	u8 reg;
+
+	if (!off) {
+		reg = dst;
+	} else {
+		emit_a64_mov_i(1, tmp, off, ctx);
+		emit(A64_ADD(1, tmp, tmp, dst), ctx);
+		reg = tmp;
+	}
+
+	switch (insn->imm) {
+	/* lock *(u32/u64 *)(dst_reg + off) <op>= src_reg */
+	case BPF_ADD:
+		emit(A64_STADD(isdw, reg, src), ctx);
+		break;
+	case BPF_AND:
+		emit(A64_MVN(isdw, tmp2, src), ctx);
+		emit(A64_STCLR(isdw, reg, tmp2), ctx);
+		break;
+	case BPF_OR:
+		emit(A64_STSET(isdw, reg, src), ctx);
+		break;
+	case BPF_XOR:
+		emit(A64_STEOR(isdw, reg, src), ctx);
+		break;
+	/* src_reg = atomic_fetch_<op>(dst_reg + off, src_reg) */
+	case BPF_ADD | BPF_FETCH:
+		emit(A64_LDADDAL(isdw, src, reg, src), ctx);
+		break;
+	case BPF_AND | BPF_FETCH:
+		emit(A64_MVN(isdw, tmp2, src), ctx);
+		emit(A64_LDCLRAL(isdw, src, reg, tmp2), ctx);
+		break;
+	case BPF_OR | BPF_FETCH:
+		emit(A64_LDSETAL(isdw, src, reg, src), ctx);
+		break;
+	case BPF_XOR | BPF_FETCH:
+		emit(A64_LDEORAL(isdw, src, reg, src), ctx);
+		break;
+	/* src_reg = atomic_xchg(dst_reg + off, src_reg); */
+	case BPF_XCHG:
+		emit(A64_SWPAL(isdw, src, reg, src), ctx);
+		break;
+	/* r0 = atomic_cmpxchg(dst_reg + off, r0, src_reg); */
+	case BPF_CMPXCHG:
+		emit(A64_CASAL(isdw, src, reg, bpf2a64[BPF_REG_0]), ctx);
+		break;
+	default:
+		pr_err_once("unknown atomic op code %02x\n", insn->imm);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+#else
+static inline int emit_lse_atomic(const struct bpf_insn *insn, struct jit_ctx *ctx)
+{
+	return -EINVAL;
+}
+#endif
+
+static int emit_ll_sc_atomic(const struct bpf_insn *insn, struct jit_ctx *ctx)
+{
+	const u8 code = insn->code;
+	const u8 dst = bpf2a64[insn->dst_reg];
+	const u8 src = bpf2a64[insn->src_reg];
+	const u8 tmp = bpf2a64[TMP_REG_1];
+	const u8 tmp2 = bpf2a64[TMP_REG_2];
+	const u8 tmp3 = bpf2a64[TMP_REG_3];
+	const int i = insn - ctx->prog->insnsi;
+	const s32 imm = insn->imm;
+	const s16 off = insn->off;
+	const bool isdw = BPF_SIZE(code) == BPF_DW;
+	u8 reg;
+	s32 jmp_offset;
+
+	if (!off) {
+		reg = dst;
+	} else {
+		emit_a64_mov_i(1, tmp, off, ctx);
+		emit(A64_ADD(1, tmp, tmp, dst), ctx);
+		reg = tmp;
+	}
+
+	if (imm == BPF_ADD || imm == BPF_AND ||
+	    imm == BPF_OR || imm == BPF_XOR) {
+		/* lock *(u32/u64 *)(dst_reg + off) <op>= src_reg */
+		emit(A64_LDXR(isdw, tmp2, reg), ctx);
+		if (imm == BPF_ADD)
+			emit(A64_ADD(isdw, tmp2, tmp2, src), ctx);
+		else if (imm == BPF_AND)
+			emit(A64_AND(isdw, tmp2, tmp2, src), ctx);
+		else if (imm == BPF_OR)
+			emit(A64_ORR(isdw, tmp2, tmp2, src), ctx);
+		else
+			emit(A64_EOR(isdw, tmp2, tmp2, src), ctx);
+		emit(A64_STXR(isdw, tmp2, reg, tmp3), ctx);
+		jmp_offset = -3;
+		check_imm19(jmp_offset);
+		emit(A64_CBNZ(0, tmp3, jmp_offset), ctx);
+	} else if (imm == (BPF_ADD | BPF_FETCH) ||
+		   imm == (BPF_AND | BPF_FETCH) ||
+		   imm == (BPF_OR | BPF_FETCH) ||
+		   imm == (BPF_XOR | BPF_FETCH)) {
+		/* src_reg = atomic_fetch_<op>(dst_reg + off, src_reg) */
+		const u8 ax = bpf2a64[BPF_REG_AX];
+
+		emit(A64_MOV(isdw, ax, src), ctx);
+		emit(A64_LDXR(isdw, src, reg), ctx);
+		if (imm == (BPF_ADD | BPF_FETCH))
+			emit(A64_ADD(isdw, tmp2, src, ax), ctx);
+		else if (imm == (BPF_AND | BPF_FETCH))
+			emit(A64_AND(isdw, tmp2, src, ax), ctx);
+		else if (imm == (BPF_OR | BPF_FETCH))
+			emit(A64_ORR(isdw, tmp2, src, ax), ctx);
+		else
+			emit(A64_EOR(isdw, tmp2, src, ax), ctx);
+		emit(A64_STLXR(isdw, tmp2, reg, tmp3), ctx);
+		jmp_offset = -3;
+		check_imm19(jmp_offset);
+		emit(A64_CBNZ(0, tmp3, jmp_offset), ctx);
+		emit(A64_DMB_ISH, ctx);
+	} else if (imm == BPF_XCHG) {
+		/* src_reg = atomic_xchg(dst_reg + off, src_reg); */
+		emit(A64_MOV(isdw, tmp2, src), ctx);
+		emit(A64_LDXR(isdw, src, reg), ctx);
+		emit(A64_STLXR(isdw, tmp2, reg, tmp3), ctx);
+		jmp_offset = -2;
+		check_imm19(jmp_offset);
+		emit(A64_CBNZ(0, tmp3, jmp_offset), ctx);
+		emit(A64_DMB_ISH, ctx);
+	} else if (imm == BPF_CMPXCHG) {
+		/* r0 = atomic_cmpxchg(dst_reg + off, r0, src_reg); */
+		const u8 r0 = bpf2a64[BPF_REG_0];
+
+		emit(A64_MOV(isdw, tmp2, r0), ctx);
+		emit(A64_LDXR(isdw, r0, reg), ctx);
+		emit(A64_EOR(isdw, tmp3, r0, tmp2), ctx);
+		jmp_offset = 4;
+		check_imm19(jmp_offset);
+		emit(A64_CBNZ(isdw, tmp3, jmp_offset), ctx);
+		emit(A64_STLXR(isdw, src, reg, tmp3), ctx);
+		jmp_offset = -4;
+		check_imm19(jmp_offset);
+		emit(A64_CBNZ(0, tmp3, jmp_offset), ctx);
+		emit(A64_DMB_ISH, ctx);
+	} else {
+		pr_err_once("unknown atomic op code %02x\n", imm);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static void build_epilogue(struct jit_ctx *ctx)
 {
 	const u8 r0 = bpf2a64[BPF_REG_0];
@@ -434,29 +609,16 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
 	const u8 src = bpf2a64[insn->src_reg];
 	const u8 tmp = bpf2a64[TMP_REG_1];
 	const u8 tmp2 = bpf2a64[TMP_REG_2];
-	const u8 tmp3 = bpf2a64[TMP_REG_3];
 	const s16 off = insn->off;
 	const s32 imm = insn->imm;
 	const int i = insn - ctx->prog->insnsi;
 	const bool is64 = BPF_CLASS(code) == BPF_ALU64 ||
 			  BPF_CLASS(code) == BPF_JMP;
-	const bool isdw = BPF_SIZE(code) == BPF_DW;
-	u8 jmp_cond, reg;
+	u8 jmp_cond;
 	s32 jmp_offset;
 	u32 a64_insn;
 	int ret;
 
-#define check_imm(bits, imm) do {				\
-	if ((((imm) > 0) && ((imm) >> (bits))) ||		\
-	    (((imm) < 0) && (~(imm) >> (bits)))) {		\
-		pr_info("[%2d] imm=%d(0x%x) out of range\n",	\
-			i, imm, imm);				\
-		return -EINVAL;					\
-	}							\
-} while (0)
-#define check_imm19(imm) check_imm(19, imm)
-#define check_imm26(imm) check_imm(26, imm)
-
 	switch (code) {
 	/* dst = src */
 	case BPF_ALU | BPF_MOV | BPF_X:
@@ -891,33 +1053,12 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
 
 	case BPF_STX | BPF_ATOMIC | BPF_W:
 	case BPF_STX | BPF_ATOMIC | BPF_DW:
-		if (insn->imm != BPF_ADD) {
-			pr_err_once("unknown atomic op code %02x\n", insn->imm);
-			return -EINVAL;
-		}
-
-		/* STX XADD: lock *(u32 *)(dst + off) += src
-		 * and
-		 * STX XADD: lock *(u64 *)(dst + off) += src
-		 */
-
-		if (!off) {
-			reg = dst;
-		} else {
-			emit_a64_mov_i(1, tmp, off, ctx);
-			emit(A64_ADD(1, tmp, tmp, dst), ctx);
-			reg = tmp;
-		}
-		if (cpus_have_cap(ARM64_HAS_LSE_ATOMICS)) {
-			emit(A64_STADD(isdw, reg, src), ctx);
-		} else {
-			emit(A64_LDXR(isdw, tmp2, reg), ctx);
-			emit(A64_ADD(isdw, tmp2, tmp2, src), ctx);
-			emit(A64_STXR(isdw, tmp2, reg, tmp3), ctx);
-			jmp_offset = -3;
-			check_imm19(jmp_offset);
-			emit(A64_CBNZ(0, tmp3, jmp_offset), ctx);
-		}
+		if (cpus_have_cap(ARM64_HAS_LSE_ATOMICS))
+			ret = emit_lse_atomic(insn, ctx);
+		else
+			ret = emit_ll_sc_atomic(insn, ctx);
+		if (ret)
+			return ret;
 		break;
 
 	default:
-- 
2.27.0


* [PATCH bpf-next v3 4/4] selftests/bpf: use raw_tp program for atomic test
  2022-01-29 22:04 [PATCH bpf-next v3 0/4] bpf, arm64: support more atomic ops Hou Tao
                   ` (2 preceding siblings ...)
  2022-01-29 22:04 ` [PATCH bpf-next v3 3/4] bpf, arm64: support more " Hou Tao
@ 2022-01-29 22:04 ` Hou Tao
  2022-02-15  4:25 ` [PATCH bpf-next v3 0/4] bpf, arm64: support more atomic ops Hou Tao
  4 siblings, 0 replies; 11+ messages in thread
From: Hou Tao @ 2022-01-29 22:04 UTC (permalink / raw)
  To: Alexei Starovoitov, Mark Rutland
  Cc: Martin KaFai Lau, Yonghong Song, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, David S . Miller, John Fastabend,
	netdev, bpf, houtao1, Zi Shen Lim, Catalin Marinas, Will Deacon,
	Julien Thierry, Ard Biesheuvel, linux-arm-kernel

Currently the atomics tests attach an fentry program and run it through
bpf_prog_test_run_opts(), but attaching an fentry program depends on the
BPF trampoline, which is only available on x86-64. Considering that many
archs have atomics support, use a raw_tp program instead.
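
With raw_tp, each test program can be run directly through
BPF_PROG_TEST_RUN without an attach step, roughly as below (a sketch
based on the updated prog_tests/atomics.c in this patch):

  /* No need to attach the raw_tp program, just run it directly. */
  LIBBPF_OPTS(bpf_test_run_opts, topts);
  int prog_fd = skel->progs.add.prog_fd;
  int err = bpf_prog_test_run_opts(prog_fd, &topts);
  if (!ASSERT_OK(err, "test_run_opts err"))
          return;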

Signed-off-by: Hou Tao <houtao1@huawei.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../selftests/bpf/prog_tests/atomics.c        | 91 +++++--------------
 tools/testing/selftests/bpf/progs/atomics.c   | 28 +++---
 2 files changed, 36 insertions(+), 83 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/atomics.c b/tools/testing/selftests/bpf/prog_tests/atomics.c
index ab62aba10e2b..13e101f370a1 100644
--- a/tools/testing/selftests/bpf/prog_tests/atomics.c
+++ b/tools/testing/selftests/bpf/prog_tests/atomics.c
@@ -7,19 +7,15 @@
 static void test_add(struct atomics_lskel *skel)
 {
 	int err, prog_fd;
-	int link_fd;
 	LIBBPF_OPTS(bpf_test_run_opts, topts);
 
-	link_fd = atomics_lskel__add__attach(skel);
-	if (!ASSERT_GT(link_fd, 0, "attach(add)"))
-		return;
-
+	/* No need to attach it, just run it directly */
 	prog_fd = skel->progs.add.prog_fd;
 	err = bpf_prog_test_run_opts(prog_fd, &topts);
 	if (!ASSERT_OK(err, "test_run_opts err"))
-		goto cleanup;
+		return;
 	if (!ASSERT_OK(topts.retval, "test_run_opts retval"))
-		goto cleanup;
+		return;
 
 	ASSERT_EQ(skel->data->add64_value, 3, "add64_value");
 	ASSERT_EQ(skel->bss->add64_result, 1, "add64_result");
@@ -31,27 +27,20 @@ static void test_add(struct atomics_lskel *skel)
 	ASSERT_EQ(skel->bss->add_stack_result, 1, "add_stack_result");
 
 	ASSERT_EQ(skel->data->add_noreturn_value, 3, "add_noreturn_value");
-
-cleanup:
-	close(link_fd);
 }
 
 static void test_sub(struct atomics_lskel *skel)
 {
 	int err, prog_fd;
-	int link_fd;
 	LIBBPF_OPTS(bpf_test_run_opts, topts);
 
-	link_fd = atomics_lskel__sub__attach(skel);
-	if (!ASSERT_GT(link_fd, 0, "attach(sub)"))
-		return;
-
+	/* No need to attach it, just run it directly */
 	prog_fd = skel->progs.sub.prog_fd;
 	err = bpf_prog_test_run_opts(prog_fd, &topts);
 	if (!ASSERT_OK(err, "test_run_opts err"))
-		goto cleanup;
+		return;
 	if (!ASSERT_OK(topts.retval, "test_run_opts retval"))
-		goto cleanup;
+		return;
 
 	ASSERT_EQ(skel->data->sub64_value, -1, "sub64_value");
 	ASSERT_EQ(skel->bss->sub64_result, 1, "sub64_result");
@@ -63,27 +52,20 @@ static void test_sub(struct atomics_lskel *skel)
 	ASSERT_EQ(skel->bss->sub_stack_result, 1, "sub_stack_result");
 
 	ASSERT_EQ(skel->data->sub_noreturn_value, -1, "sub_noreturn_value");
-
-cleanup:
-	close(link_fd);
 }
 
 static void test_and(struct atomics_lskel *skel)
 {
 	int err, prog_fd;
-	int link_fd;
 	LIBBPF_OPTS(bpf_test_run_opts, topts);
 
-	link_fd = atomics_lskel__and__attach(skel);
-	if (!ASSERT_GT(link_fd, 0, "attach(and)"))
-		return;
-
+	/* No need to attach it, just run it directly */
 	prog_fd = skel->progs.and.prog_fd;
 	err = bpf_prog_test_run_opts(prog_fd, &topts);
 	if (!ASSERT_OK(err, "test_run_opts err"))
-		goto cleanup;
+		return;
 	if (!ASSERT_OK(topts.retval, "test_run_opts retval"))
-		goto cleanup;
+		return;
 
 	ASSERT_EQ(skel->data->and64_value, 0x010ull << 32, "and64_value");
 	ASSERT_EQ(skel->bss->and64_result, 0x110ull << 32, "and64_result");
@@ -92,26 +74,20 @@ static void test_and(struct atomics_lskel *skel)
 	ASSERT_EQ(skel->bss->and32_result, 0x110, "and32_result");
 
 	ASSERT_EQ(skel->data->and_noreturn_value, 0x010ull << 32, "and_noreturn_value");
-cleanup:
-	close(link_fd);
 }
 
 static void test_or(struct atomics_lskel *skel)
 {
 	int err, prog_fd;
-	int link_fd;
 	LIBBPF_OPTS(bpf_test_run_opts, topts);
 
-	link_fd = atomics_lskel__or__attach(skel);
-	if (!ASSERT_GT(link_fd, 0, "attach(or)"))
-		return;
-
+	/* No need to attach it, just run it directly */
 	prog_fd = skel->progs.or.prog_fd;
 	err = bpf_prog_test_run_opts(prog_fd, &topts);
 	if (!ASSERT_OK(err, "test_run_opts err"))
-		goto cleanup;
+		return;
 	if (!ASSERT_OK(topts.retval, "test_run_opts retval"))
-		goto cleanup;
+		return;
 
 	ASSERT_EQ(skel->data->or64_value, 0x111ull << 32, "or64_value");
 	ASSERT_EQ(skel->bss->or64_result, 0x110ull << 32, "or64_result");
@@ -120,26 +96,20 @@ static void test_or(struct atomics_lskel *skel)
 	ASSERT_EQ(skel->bss->or32_result, 0x110, "or32_result");
 
 	ASSERT_EQ(skel->data->or_noreturn_value, 0x111ull << 32, "or_noreturn_value");
-cleanup:
-	close(link_fd);
 }
 
 static void test_xor(struct atomics_lskel *skel)
 {
 	int err, prog_fd;
-	int link_fd;
 	LIBBPF_OPTS(bpf_test_run_opts, topts);
 
-	link_fd = atomics_lskel__xor__attach(skel);
-	if (!ASSERT_GT(link_fd, 0, "attach(xor)"))
-		return;
-
+	/* No need to attach it, just run it directly */
 	prog_fd = skel->progs.xor.prog_fd;
 	err = bpf_prog_test_run_opts(prog_fd, &topts);
 	if (!ASSERT_OK(err, "test_run_opts err"))
-		goto cleanup;
+		return;
 	if (!ASSERT_OK(topts.retval, "test_run_opts retval"))
-		goto cleanup;
+		return;
 
 	ASSERT_EQ(skel->data->xor64_value, 0x101ull << 32, "xor64_value");
 	ASSERT_EQ(skel->bss->xor64_result, 0x110ull << 32, "xor64_result");
@@ -148,26 +118,20 @@ static void test_xor(struct atomics_lskel *skel)
 	ASSERT_EQ(skel->bss->xor32_result, 0x110, "xor32_result");
 
 	ASSERT_EQ(skel->data->xor_noreturn_value, 0x101ull << 32, "xor_nxoreturn_value");
-cleanup:
-	close(link_fd);
 }
 
 static void test_cmpxchg(struct atomics_lskel *skel)
 {
 	int err, prog_fd;
-	int link_fd;
 	LIBBPF_OPTS(bpf_test_run_opts, topts);
 
-	link_fd = atomics_lskel__cmpxchg__attach(skel);
-	if (!ASSERT_GT(link_fd, 0, "attach(cmpxchg)"))
-		return;
-
+	/* No need to attach it, just run it directly */
 	prog_fd = skel->progs.cmpxchg.prog_fd;
 	err = bpf_prog_test_run_opts(prog_fd, &topts);
 	if (!ASSERT_OK(err, "test_run_opts err"))
-		goto cleanup;
+		return;
 	if (!ASSERT_OK(topts.retval, "test_run_opts retval"))
-		goto cleanup;
+		return;
 
 	ASSERT_EQ(skel->data->cmpxchg64_value, 2, "cmpxchg64_value");
 	ASSERT_EQ(skel->bss->cmpxchg64_result_fail, 1, "cmpxchg_result_fail");
@@ -176,45 +140,34 @@ static void test_cmpxchg(struct atomics_lskel *skel)
 	ASSERT_EQ(skel->data->cmpxchg32_value, 2, "lcmpxchg32_value");
 	ASSERT_EQ(skel->bss->cmpxchg32_result_fail, 1, "cmpxchg_result_fail");
 	ASSERT_EQ(skel->bss->cmpxchg32_result_succeed, 1, "cmpxchg_result_succeed");
-
-cleanup:
-	close(link_fd);
 }
 
 static void test_xchg(struct atomics_lskel *skel)
 {
 	int err, prog_fd;
-	int link_fd;
 	LIBBPF_OPTS(bpf_test_run_opts, topts);
 
-	link_fd = atomics_lskel__xchg__attach(skel);
-	if (!ASSERT_GT(link_fd, 0, "attach(xchg)"))
-		return;
-
+	/* No need to attach it, just run it directly */
 	prog_fd = skel->progs.xchg.prog_fd;
 	err = bpf_prog_test_run_opts(prog_fd, &topts);
 	if (!ASSERT_OK(err, "test_run_opts err"))
-		goto cleanup;
+		return;
 	if (!ASSERT_OK(topts.retval, "test_run_opts retval"))
-		goto cleanup;
+		return;
 
 	ASSERT_EQ(skel->data->xchg64_value, 2, "xchg64_value");
 	ASSERT_EQ(skel->bss->xchg64_result, 1, "xchg64_result");
 
 	ASSERT_EQ(skel->data->xchg32_value, 2, "xchg32_value");
 	ASSERT_EQ(skel->bss->xchg32_result, 1, "xchg32_result");
-
-cleanup:
-	close(link_fd);
 }
 
 void test_atomics(void)
 {
 	struct atomics_lskel *skel;
-	__u32 duration = 0;
 
 	skel = atomics_lskel__open_and_load();
-	if (CHECK(!skel, "skel_load", "atomics skeleton failed\n"))
+	if (!ASSERT_OK_PTR(skel, "atomics skeleton load"))
 		return;
 
 	if (skel->data->skip_tests) {
diff --git a/tools/testing/selftests/bpf/progs/atomics.c b/tools/testing/selftests/bpf/progs/atomics.c
index 16e57313204a..f89c7f0cc53b 100644
--- a/tools/testing/selftests/bpf/progs/atomics.c
+++ b/tools/testing/selftests/bpf/progs/atomics.c
@@ -20,8 +20,8 @@ __u64 add_stack_value_copy = 0;
 __u64 add_stack_result = 0;
 __u64 add_noreturn_value = 1;
 
-SEC("fentry/bpf_fentry_test1")
-int BPF_PROG(add, int a)
+SEC("raw_tp/sys_enter")
+int add(const void *ctx)
 {
 	if (pid != (bpf_get_current_pid_tgid() >> 32))
 		return 0;
@@ -46,8 +46,8 @@ __s64 sub_stack_value_copy = 0;
 __s64 sub_stack_result = 0;
 __s64 sub_noreturn_value = 1;
 
-SEC("fentry/bpf_fentry_test1")
-int BPF_PROG(sub, int a)
+SEC("raw_tp/sys_enter")
+int sub(const void *ctx)
 {
 	if (pid != (bpf_get_current_pid_tgid() >> 32))
 		return 0;
@@ -70,8 +70,8 @@ __u32 and32_value = 0x110;
 __u32 and32_result = 0;
 __u64 and_noreturn_value = (0x110ull << 32);
 
-SEC("fentry/bpf_fentry_test1")
-int BPF_PROG(and, int a)
+SEC("raw_tp/sys_enter")
+int and(const void *ctx)
 {
 	if (pid != (bpf_get_current_pid_tgid() >> 32))
 		return 0;
@@ -91,8 +91,8 @@ __u32 or32_value = 0x110;
 __u32 or32_result = 0;
 __u64 or_noreturn_value = (0x110ull << 32);
 
-SEC("fentry/bpf_fentry_test1")
-int BPF_PROG(or, int a)
+SEC("raw_tp/sys_enter")
+int or(const void *ctx)
 {
 	if (pid != (bpf_get_current_pid_tgid() >> 32))
 		return 0;
@@ -111,8 +111,8 @@ __u32 xor32_value = 0x110;
 __u32 xor32_result = 0;
 __u64 xor_noreturn_value = (0x110ull << 32);
 
-SEC("fentry/bpf_fentry_test1")
-int BPF_PROG(xor, int a)
+SEC("raw_tp/sys_enter")
+int xor(const void *ctx)
 {
 	if (pid != (bpf_get_current_pid_tgid() >> 32))
 		return 0;
@@ -132,8 +132,8 @@ __u32 cmpxchg32_value = 1;
 __u32 cmpxchg32_result_fail = 0;
 __u32 cmpxchg32_result_succeed = 0;
 
-SEC("fentry/bpf_fentry_test1")
-int BPF_PROG(cmpxchg, int a)
+SEC("raw_tp/sys_enter")
+int cmpxchg(const void *ctx)
 {
 	if (pid != (bpf_get_current_pid_tgid() >> 32))
 		return 0;
@@ -153,8 +153,8 @@ __u64 xchg64_result = 0;
 __u32 xchg32_value = 1;
 __u32 xchg32_result = 0;
 
-SEC("fentry/bpf_fentry_test1")
-int BPF_PROG(xchg, int a)
+SEC("raw_tp/sys_enter")
+int xchg(const void *ctx)
 {
 	if (pid != (bpf_get_current_pid_tgid() >> 32))
 		return 0;
-- 
2.27.0


* Re: [PATCH bpf-next v3 2/4] arm64: insn: add encoders for atomic operations
  2022-01-29 22:04 ` [PATCH bpf-next v3 2/4] arm64: insn: add encoders for atomic operations Hou Tao
@ 2022-02-11 14:39   ` Daniel Borkmann
  2022-02-15 17:42     ` Will Deacon
  2022-02-16 17:16     ` Will Deacon
  0 siblings, 2 replies; 11+ messages in thread
From: Daniel Borkmann @ 2022-02-11 14:39 UTC (permalink / raw)
  To: Hou Tao, Alexei Starovoitov, Mark Rutland
  Cc: Martin KaFai Lau, Yonghong Song, Andrii Nakryiko, Song Liu,
	David S . Miller, John Fastabend, netdev, bpf, Zi Shen Lim,
	Catalin Marinas, Will Deacon, Julien Thierry, Ard Biesheuvel,
	linux-arm-kernel

On 1/29/22 11:04 PM, Hou Tao wrote:
> It is a preparation patch for eBPF atomic supports under arm64. eBPF
> needs support atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor} and
> atomic[64]_{xchg|cmpxchg}. The ordering semantics of eBPF atomics are
> the same with the implementations in linux kernel.
> 
> Add three helpers to support LDCLR/LDEOR/LDSET/SWP, CAS and DMB
> instructions. STADD/STCLR/STEOR/STSET are simply encoded as aliases for
> LDADD/LDCLR/LDEOR/LDSET with XZR as the destination register, so no extra
> helper is added. atomic_fetch_add() and other atomic ops needs support for
> STLXR instruction, so extend enum aarch64_insn_ldst_type to do that.
> 
> LDADD/LDEOR/LDSET/SWP and CAS instructions are only available when LSE
> atomics is enabled, so just return AARCH64_BREAK_FAULT directly in
> these newly-added helpers if CONFIG_ARM64_LSE_ATOMICS is disabled.
> 
> Signed-off-by: Hou Tao <houtao1@huawei.com>

Hey Mark / Ard / Will / Catalin or others, could we get an Ack on patch 1 & 2
at min if it looks good to you?

Thanks a lot,
Daniel

* Re: [PATCH bpf-next v3 0/4] bpf, arm64: support more atomic ops
  2022-01-29 22:04 [PATCH bpf-next v3 0/4] bpf, arm64: support more atomic ops Hou Tao
                   ` (3 preceding siblings ...)
  2022-01-29 22:04 ` [PATCH bpf-next v3 4/4] selftests/bpf: use raw_tp program for atomic test Hou Tao
@ 2022-02-15  4:25 ` Hou Tao
  4 siblings, 0 replies; 11+ messages in thread
From: Hou Tao @ 2022-02-15  4:25 UTC (permalink / raw)
  To: Alexei Starovoitov, Mark Rutland
  Cc: Martin KaFai Lau, Yonghong Song, Daniel Borkmann,
	Andrii Nakryiko, Song Liu, David S . Miller, John Fastabend,
	netdev, bpf, Zi Shen Lim, Catalin Marinas, Will Deacon,
	Julien Thierry, Ard Biesheuvel, linux-arm-kernel

ping ?

On 1/30/2022 6:04 AM, Hou Tao wrote:
> Hi,
>
> Atomics support in bpf has already been done by "Atomics for eBPF"
> patch series [1], but it only adds support for x86, and this patchset
> adds support for arm64.
>
> Patch #1 & patch #2 are arm64 related. Patch #1 moves the common used
> macro AARCH64_BREAK_FAULT into insn-def.h for insn.h. Patch #2 adds
> necessary encoder helpers for atomic operations.
>
> Patch #3 implements atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor}
> and atomic[64]_{xchg|cmpxchg} for arm64 bpf. Patch #4 changes the type of
> test program from fentry/ to raw_tp/ for atomics test.
>
> For cpus_have_cap(ARM64_HAS_LSE_ATOMICS) case and no-LSE-ATOMICS case,
> ./test_verifier, "./test_progs -t atomic", and "insmod ./test_bpf.ko"
> are exercised and passed correspondingly.
>
> Comments are always welcome.
>
> Regards,
> Tao
>
> [1]: https://lore.kernel.org/bpf/20210114181751.768687-2-jackmanb@google.com/
>
> Change Log:
> v3:
>  * split arm64 insn related code into a separated patch (from Mark)
>  * update enum name in aarch64_insn_mem_atomic_op (from Mark)
>  * consider all cases for aarch64_insn_mem_order_type and
>    aarch64_insn_mb_type (from Mark)
>  * exercise and pass "insmod ./test_bpf.ko" test (suggested by Daniel)
>  * remove aarch64_insn_gen_store_release_ex() and extend
>    aarch64_insn_ldst_type instead
>  * compile aarch64_insn_gen_atomic_ld_op(), aarch64_insn_gen_cas() and
>    emit_lse_atomic() out when CONFIG_ARM64_LSE_ATOMICS is disabled.
>
> v2: https://lore.kernel.org/bpf/20220127075322.675323-1-houtao1@huawei.com/
>   * patch #1: use two separated ASSERT_OK() instead of ASSERT_TRUE()
>   * add Acked-by tag for both patches
>
> v1: https://lore.kernel.org/bpf/20220121135632.136976-1-houtao1@huawei.com/
>
> Hou Tao (4):
>   arm64: move AARCH64_BREAK_FAULT into insn-def.h
>   arm64: insn: add encoders for atomic operations
>   bpf, arm64: support more atomic operations
>   selftests/bpf: use raw_tp program for atomic test
>
>  arch/arm64/include/asm/debug-monitors.h       |  12 -
>  arch/arm64/include/asm/insn-def.h             |  14 ++
>  arch/arm64/include/asm/insn.h                 |  80 ++++++-
>  arch/arm64/lib/insn.c                         | 185 +++++++++++++--
>  arch/arm64/net/bpf_jit.h                      |  44 +++-
>  arch/arm64/net/bpf_jit_comp.c                 | 223 ++++++++++++++----
>  .../selftests/bpf/prog_tests/atomics.c        |  91 ++-----
>  tools/testing/selftests/bpf/progs/atomics.c   |  28 +--
>  8 files changed, 517 insertions(+), 160 deletions(-)
>


* Re: [PATCH bpf-next v3 2/4] arm64: insn: add encoders for atomic operations
  2022-02-11 14:39   ` Daniel Borkmann
@ 2022-02-15 17:42     ` Will Deacon
  2022-02-16  0:21       ` Daniel Borkmann
  2022-02-16 17:16     ` Will Deacon
  1 sibling, 1 reply; 11+ messages in thread
From: Will Deacon @ 2022-02-15 17:42 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Hou Tao, Alexei Starovoitov, Mark Rutland, Martin KaFai Lau,
	Yonghong Song, Andrii Nakryiko, Song Liu, David S . Miller,
	John Fastabend, netdev, bpf, Zi Shen Lim, Catalin Marinas,
	Julien Thierry, Ard Biesheuvel, linux-arm-kernel

Hi Daniel,

On Fri, Feb 11, 2022 at 03:39:48PM +0100, Daniel Borkmann wrote:
> On 1/29/22 11:04 PM, Hou Tao wrote:
> > It is a preparation patch for eBPF atomic supports under arm64. eBPF
> > needs support atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor} and
> > atomic[64]_{xchg|cmpxchg}. The ordering semantics of eBPF atomics are
> > the same with the implementations in linux kernel.
> > 
> > Add three helpers to support LDCLR/LDEOR/LDSET/SWP, CAS and DMB
> > instructions. STADD/STCLR/STEOR/STSET are simply encoded as aliases for
> > LDADD/LDCLR/LDEOR/LDSET with XZR as the destination register, so no extra
> > helper is added. atomic_fetch_add() and other atomic ops needs support for
> > STLXR instruction, so extend enum aarch64_insn_ldst_type to do that.
> > 
> > LDADD/LDEOR/LDSET/SWP and CAS instructions are only available when LSE
> > atomics is enabled, so just return AARCH64_BREAK_FAULT directly in
> > these newly-added helpers if CONFIG_ARM64_LSE_ATOMICS is disabled.
> > 
> > Signed-off-by: Hou Tao <houtao1@huawei.com>
> 
> Hey Mark / Ard / Will / Catalin or others, could we get an Ack on patch 1 & 2
> at min if it looks good to you?

Sorry for the delay, for some reason this series has all ended up in my
spam! I'll take a look this week. If it looks good, do you mind if I queue
those two patches in arm64 on a stable branch for you to pull as well? We've
got a few other (non-BPF) changes pending to the instruction decoder, and
I'd like to avoid conflicts if we can.

Cheers,

Will

* Re: [PATCH bpf-next v3 2/4] arm64: insn: add encoders for atomic operations
  2022-02-15 17:42     ` Will Deacon
@ 2022-02-16  0:21       ` Daniel Borkmann
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Borkmann @ 2022-02-16  0:21 UTC (permalink / raw)
  To: Will Deacon
  Cc: Hou Tao, Alexei Starovoitov, Mark Rutland, Martin KaFai Lau,
	Yonghong Song, Andrii Nakryiko, Song Liu, David S . Miller,
	John Fastabend, netdev, bpf, Zi Shen Lim, Catalin Marinas,
	Julien Thierry, Ard Biesheuvel, linux-arm-kernel

Hi Will,

On 2/15/22 6:42 PM, Will Deacon wrote:
> On Fri, Feb 11, 2022 at 03:39:48PM +0100, Daniel Borkmann wrote:
>> On 1/29/22 11:04 PM, Hou Tao wrote:
>>> It is a preparation patch for eBPF atomic supports under arm64. eBPF
>>> needs support atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor} and
>>> atomic[64]_{xchg|cmpxchg}. The ordering semantics of eBPF atomics are
>>> the same with the implementations in linux kernel.
>>>
>>> Add three helpers to support LDCLR/LDEOR/LDSET/SWP, CAS and DMB
>>> instructions. STADD/STCLR/STEOR/STSET are simply encoded as aliases for
>>> LDADD/LDCLR/LDEOR/LDSET with XZR as the destination register, so no extra
>>> helper is added. atomic_fetch_add() and other atomic ops needs support for
>>> STLXR instruction, so extend enum aarch64_insn_ldst_type to do that.
>>>
>>> LDADD/LDEOR/LDSET/SWP and CAS instructions are only available when LSE
>>> atomics is enabled, so just return AARCH64_BREAK_FAULT directly in
>>> these newly-added helpers if CONFIG_ARM64_LSE_ATOMICS is disabled.
>>>
>>> Signed-off-by: Hou Tao <houtao1@huawei.com>
>>
>> Hey Mark / Ard / Will / Catalin or others, could we get an Ack on patch 1 & 2
>> at min if it looks good to you?
> 
> Sorry for the delay, for some reason this series has all ended up in my
> spam! I'll take a look this week. If it looks good, do you mind if I queue
> those two patches in arm64 on a stable branch for you to pull as well? We've
> got a few other (non-BPF) changes pending to the instruction decoder, and
> I'd like to avoid conflicts if we can.

Yes, that should be totally fine.

Thanks,
Daniel

* Re: [PATCH bpf-next v3 2/4] arm64: insn: add encoders for atomic operations
  2022-02-11 14:39   ` Daniel Borkmann
  2022-02-15 17:42     ` Will Deacon
@ 2022-02-16 17:16     ` Will Deacon
  2022-02-17  1:55       ` Hou Tao
  1 sibling, 1 reply; 11+ messages in thread
From: Will Deacon @ 2022-02-16 17:16 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Hou Tao, Alexei Starovoitov, Mark Rutland, Martin KaFai Lau,
	Yonghong Song, Andrii Nakryiko, Song Liu, David S . Miller,
	John Fastabend, netdev, bpf, Zi Shen Lim, Catalin Marinas,
	Julien Thierry, Ard Biesheuvel, linux-arm-kernel

On Fri, Feb 11, 2022 at 03:39:48PM +0100, Daniel Borkmann wrote:
> On 1/29/22 11:04 PM, Hou Tao wrote:
> > It is a preparation patch for eBPF atomic supports under arm64. eBPF
> > needs support atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor} and
> > atomic[64]_{xchg|cmpxchg}. The ordering semantics of eBPF atomics are
> > the same with the implementations in linux kernel.
> > 
> > Add three helpers to support LDCLR/LDEOR/LDSET/SWP, CAS and DMB
> > instructions. STADD/STCLR/STEOR/STSET are simply encoded as aliases for
> > LDADD/LDCLR/LDEOR/LDSET with XZR as the destination register, so no extra
> > helper is added. atomic_fetch_add() and other atomic ops needs support for
> > STLXR instruction, so extend enum aarch64_insn_ldst_type to do that.
> > 
> > LDADD/LDEOR/LDSET/SWP and CAS instructions are only available when LSE
> > atomics is enabled, so just return AARCH64_BREAK_FAULT directly in
> > these newly-added helpers if CONFIG_ARM64_LSE_ATOMICS is disabled.
> > 
> > Signed-off-by: Hou Tao <houtao1@huawei.com>
> 
> Hey Mark / Ard / Will / Catalin or others, could we get an Ack on patch 1 & 2
> at min if it looks good to you?

I checked the instruction encodings in patches 1 and 2 and they all look
fine to me. However, after applying those two locally I get a build failure:

  | In file included from arch/arm64/net/bpf_jit_comp.c:23:
  | arch/arm64/net/bpf_jit_comp.c: In function ‘build_insn’:
  | arch/arm64/net/bpf_jit.h:94:2: error: implicit declaration of function ‘aarch64_insn_gen_stadd’; did you mean ‘aarch64_insn_gen_adr’? [-Werror=implicit-function-declaration]
  |    94 |  aarch64_insn_gen_stadd(Rn, Rs, A64_SIZE(sf))
  |       |  ^~~~~~~~~~~~~~~~~~~~~~
  | arch/arm64/net/bpf_jit_comp.c:912:9: note: in expansion of macro ‘A64_STADD’
  |   912 |    emit(A64_STADD(isdw, reg, src), ctx);
  |       |         ^~~~~~~~~
  | cc1: some warnings being treated as errors

Will

* Re: [PATCH bpf-next v3 2/4] arm64: insn: add encoders for atomic operations
  2022-02-16 17:16     ` Will Deacon
@ 2022-02-17  1:55       ` Hou Tao
  0 siblings, 0 replies; 11+ messages in thread
From: Hou Tao @ 2022-02-17  1:55 UTC (permalink / raw)
  To: Will Deacon, Daniel Borkmann
  Cc: Alexei Starovoitov, Mark Rutland, Martin KaFai Lau,
	Yonghong Song, Andrii Nakryiko, Song Liu, David S . Miller,
	John Fastabend, netdev, bpf, Zi Shen Lim, Catalin Marinas,
	Julien Thierry, Ard Biesheuvel, linux-arm-kernel

Hi,

On 2/17/2022 1:16 AM, Will Deacon wrote:
> On Fri, Feb 11, 2022 at 03:39:48PM +0100, Daniel Borkmann wrote:
>> On 1/29/22 11:04 PM, Hou Tao wrote:
>>> It is a preparation patch for eBPF atomic supports under arm64. eBPF
>>> needs support atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor} and
>>> atomic[64]_{xchg|cmpxchg}. The ordering semantics of eBPF atomics are
>>> the same with the implementations in linux kernel.
>>>
>>> Add three helpers to support LDCLR/LDEOR/LDSET/SWP, CAS and DMB
>>> instructions. STADD/STCLR/STEOR/STSET are simply encoded as aliases for
>>> LDADD/LDCLR/LDEOR/LDSET with XZR as the destination register, so no extra
>>> helper is added. atomic_fetch_add() and other atomic ops needs support for
>>> STLXR instruction, so extend enum aarch64_insn_ldst_type to do that.
>>>
>>> LDADD/LDEOR/LDSET/SWP and CAS instructions are only available when LSE
>>> atomics is enabled, so just return AARCH64_BREAK_FAULT directly in
>>> these newly-added helpers if CONFIG_ARM64_LSE_ATOMICS is disabled.
>>>
>>> Signed-off-by: Hou Tao <houtao1@huawei.com>
>> Hey Mark / Ard / Will / Catalin or others, could we get an Ack on patch 1 & 2
>> at min if it looks good to you?
> I checked the instruction encodings in patches 1 and 2 and they all look
> fine to me. However, after applying those two locally I get a build failure:
>
>   | In file included from arch/arm64/net/bpf_jit_comp.c:23:
>   | arch/arm64/net/bpf_jit_comp.c: In function ‘build_insn’:
>   | arch/arm64/net/bpf_jit.h:94:2: error: implicit declaration of function ‘aarch64_insn_gen_stadd’; did you mean ‘aarch64_insn_gen_adr’? [-Werror=implicit-function-declaration]
>   |    94 |  aarch64_insn_gen_stadd(Rn, Rs, A64_SIZE(sf))
>   |       |  ^~~~~~~~~~~~~~~~~~~~~~
>   | arch/arm64/net/bpf_jit_comp.c:912:9: note: in expansion of macro ‘A64_STADD’
>   |   912 |    emit(A64_STADD(isdw, reg, src), ctx);
>   |       |         ^~~~~~~~~
>   | cc1: some warnings being treated as errors
Thanks for your review. The build failure is my fault: I updated
A64_STADD() in patch 3 instead of patch 2 after replacing
aarch64_insn_gen_stadd() with aarch64_insn_gen_atomic_ld_op(), and will
fix it in v4. If you want to test the encoders, I suggest applying
patches 1-3.

Regards,
Tao
> Will
> .

