* [PATCH bpf-next 00/24] Support bpf trampoline for s390x
@ 2023-01-25 21:37 Ilya Leoshkevich
  2023-01-25 21:37 ` [PATCH bpf-next 01/24] selftests/bpf: Fix liburandom_read.so linker error Ilya Leoshkevich
                   ` (24 more replies)
  0 siblings, 25 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:37 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

Hi,

This series implements bpf_arch_text_poke(), bpf trampolines, kfunc
calls, and mixing subprogs with tail calls on s390x, and fixes a number
of selftests.

The following failures still remain:

#52      core_read_macros:FAIL
Uses BPF_PROBE_READ(); shouldn't it be BPF_PROBE_READ_KERNEL()?

#82      get_stack_raw_tp:FAIL
get_stack_print_output:FAIL:user_stack corrupted user stack
Known issue:
We cannot reliably unwind userspace on s390x without DWARF.

#101     ksyms_module:FAIL
address of kernel function bpf_testmod_test_mod_kfunc is out of range
Known issue:
Kernel and modules are too far away from each other on s390x.

#167     sk_assign:FAIL
Uses legacy map definitions in the 'maps' section.

#190     stacktrace_build_id:FAIL
Known issue:
We cannot reliably unwind userspace on s390x without DWARF.

#211     test_bpffs:FAIL
iterators.bpf.c is broken on s390x: it uses BPF_CORE_READ(); shouldn't
it be BPF_CORE_READ_KERNEL()?

#218     test_profiler:FAIL
A lot of BPF_PROBE_READ() usages.

#281     xdp_metadata:FAIL
See patch 24.

#284     xdp_synproxy:FAIL
Verifier error:
; value = bpf_tcp_raw_gen_syncookie_ipv4(hdr->ipv4, hdr->tcp,
281: (79) r1 = *(u64 *)(r10 -80)      ; R1_w=pkt(off=14,r=74,imm=0) R10=fp0
282: (bf) r2 = r8                     ; R2_w=pkt(id=5,off=14,r=74,umax=60,var_off=(0x0; 0x3c)) R8=pkt(id=5,off=14,r=74,umax=60,var_off=(0x0; 0x3c))
283: (79) r3 = *(u64 *)(r10 -104)     ; R3_w=scalar(umax=60,var_off=(0x0; 0x3c)) R10=fp0
284: (85) call bpf_tcp_raw_gen_syncookie_ipv4#204
invalid access to packet, off=14 size=0, R2(id=5,off=14,r=74)
R2 offset is outside of the packet

None of these seem to be due to the new changes.

Best regards,
Ilya

Ilya Leoshkevich (24):
  selftests/bpf: Fix liburandom_read.so linker error
  selftests/bpf: Fix symlink creation error
  selftests/bpf: Fix fexit_stress on s390x
  selftests/bpf: Fix trampoline_count on s390x
  selftests/bpf: Fix kfree_skb on s390x
  selftests/bpf: Set errno when urand_spawn() fails
  selftests/bpf: Fix decap_sanity_ns cleanup
  selftests/bpf: Fix verify_pkcs7_sig on s390x
  selftests/bpf: Fix xdp_do_redirect on s390x
  selftests/bpf: Fix cgrp_local_storage on s390x
  selftests/bpf: Check stack_mprotect() return value
  selftests/bpf: Increase SIZEOF_BPF_LOCAL_STORAGE_ELEM on s390x
  selftests/bpf: Add a sign-extension test for kfuncs
  selftests/bpf: Fix test_lsm on s390x
  selftests/bpf: Fix test_xdp_adjust_tail_grow2 on s390x
  selftests/bpf: Fix vmlinux test on s390x
  libbpf: Read usdt arg spec with bpf_probe_read_kernel()
  s390/bpf: Fix a typo in a comment
  s390/bpf: Add expoline to tail calls
  s390/bpf: Implement bpf_arch_text_poke()
  bpf: btf: Add BTF_FMODEL_SIGNED_ARG flag
  s390/bpf: Implement arch_prepare_bpf_trampoline()
  s390/bpf: Implement bpf_jit_supports_subprog_tailcalls()
  s390/bpf: Implement bpf_jit_supports_kfunc_call()

 arch/s390/net/bpf_jit_comp.c                  | 708 +++++++++++++++++-
 include/linux/bpf.h                           |   8 +
 include/linux/btf.h                           |  15 +-
 kernel/bpf/btf.c                              |  16 +-
 net/bpf/test_run.c                            |   9 +
 tools/lib/bpf/usdt.bpf.h                      |  33 +-
 tools/testing/selftests/bpf/Makefile          |   7 +-
 tools/testing/selftests/bpf/netcnt_common.h   |   6 +-
 .../selftests/bpf/prog_tests/bpf_cookie.c     |   6 +-
 .../bpf/prog_tests/cgrp_local_storage.c       |   2 +-
 .../selftests/bpf/prog_tests/decap_sanity.c   |   2 +-
 .../selftests/bpf/prog_tests/fexit_stress.c   |   6 +-
 .../selftests/bpf/prog_tests/kfree_skb.c      |   2 +-
 .../selftests/bpf/prog_tests/kfunc_call.c     |   1 +
 .../selftests/bpf/prog_tests/test_lsm.c       |   3 +-
 .../bpf/prog_tests/trampoline_count.c         |   4 +
 tools/testing/selftests/bpf/prog_tests/usdt.c |   1 +
 .../bpf/prog_tests/verify_pkcs7_sig.c         |   9 +
 .../bpf/prog_tests/xdp_adjust_tail.c          |   7 +-
 .../bpf/prog_tests/xdp_do_redirect.c          |   4 +
 .../selftests/bpf/progs/kfunc_call_test.c     |  18 +
 tools/testing/selftests/bpf/progs/lsm.c       |   7 +-
 .../bpf/progs/test_verify_pkcs7_sig.c         |  12 +-
 .../selftests/bpf/progs/test_vmlinux.c        |   4 +-
 .../bpf/progs/test_xdp_adjust_tail_grow.c     |   8 +-
 25 files changed, 816 insertions(+), 82 deletions(-)

-- 
2.39.1


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH bpf-next 01/24] selftests/bpf: Fix liburandom_read.so linker error
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
@ 2023-01-25 21:37 ` Ilya Leoshkevich
  2023-01-26  1:07   ` Andrii Nakryiko
  2023-01-25 21:37 ` [PATCH bpf-next 02/24] selftests/bpf: Fix symlink creation error Ilya Leoshkevich
                   ` (23 subsequent siblings)
  24 siblings, 1 reply; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:37 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

When building with O=, the following linker error occurs:

    clang: error: no such file or directory: 'liburandom_read.so'

Fix by adding $(OUTPUT) to the linker search path.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index c9b5ed59e1ed..43098eb15d31 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -189,9 +189,9 @@ $(OUTPUT)/liburandom_read.so: urandom_read_lib1.c urandom_read_lib2.c
 $(OUTPUT)/urandom_read: urandom_read.c urandom_read_aux.c $(OUTPUT)/liburandom_read.so
 	$(call msg,BINARY,,$@)
 	$(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^) \
-		     liburandom_read.so $(filter-out -static,$(LDLIBS))	     \
+		     $(filter-out -static,$(LDLIBS))	                       \
 		     -fuse-ld=$(LLD) -Wl,-znoseparate-code -Wl,--build-id=sha1 \
-		     -Wl,-rpath=. -o $@
+		     -Wl,-rpath=. -o $@ -lurandom_read -L$(OUTPUT)
 
 $(OUTPUT)/sign-file: ../../../../scripts/sign-file.c
 	$(call msg,SIGN-FILE,,$@)
-- 
2.39.1



* [PATCH bpf-next 02/24] selftests/bpf: Fix symlink creation error
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
  2023-01-25 21:37 ` [PATCH bpf-next 01/24] selftests/bpf: Fix liburandom_read.so linker error Ilya Leoshkevich
@ 2023-01-25 21:37 ` Ilya Leoshkevich
  2023-01-25 21:37 ` [PATCH bpf-next 03/24] selftests/bpf: Fix fexit_stress on s390x Ilya Leoshkevich
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:37 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

When building with O=, the following error occurs:

    ln: failed to create symbolic link 'no_alu32/bpftool': No such file or directory

Adjust the code to account for $(OUTPUT).

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 43098eb15d31..b6c31817755e 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -519,7 +519,8 @@ $(OUTPUT)/$(TRUNNER_BINARY): $(TRUNNER_TEST_OBJS)			\
 	$$(call msg,BINARY,,$$@)
 	$(Q)$$(CC) $$(CFLAGS) $$(filter %.a %.o,$$^) $$(LDLIBS) -o $$@
 	$(Q)$(RESOLVE_BTFIDS) --btf $(TRUNNER_OUTPUT)/btf_data.bpf.o $$@
-	$(Q)ln -sf $(if $2,..,.)/tools/build/bpftool/bootstrap/bpftool $(if $2,$2/)bpftool
+	$(Q)ln -sf $(if $2,..,.)/tools/build/bpftool/bootstrap/bpftool \
+		   $(OUTPUT)/$(if $2,$2/)bpftool
 
 endef
 
-- 
2.39.1



* [PATCH bpf-next 03/24] selftests/bpf: Fix fexit_stress on s390x
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
  2023-01-25 21:37 ` [PATCH bpf-next 01/24] selftests/bpf: Fix liburandom_read.so linker error Ilya Leoshkevich
  2023-01-25 21:37 ` [PATCH bpf-next 02/24] selftests/bpf: Fix symlink creation error Ilya Leoshkevich
@ 2023-01-25 21:37 ` Ilya Leoshkevich
  2023-01-25 21:37 ` [PATCH bpf-next 04/24] selftests/bpf: Fix trampoline_count " Ilya Leoshkevich
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:37 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

BPF_MAX_TRAMP_LINKS is smaller on s390x than on x86.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/prog_tests/fexit_stress.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_stress.c b/tools/testing/selftests/bpf/prog_tests/fexit_stress.c
index 5a7e6011f6bf..329950f572cc 100644
--- a/tools/testing/selftests/bpf/prog_tests/fexit_stress.c
+++ b/tools/testing/selftests/bpf/prog_tests/fexit_stress.c
@@ -2,8 +2,12 @@
 /* Copyright (c) 2019 Facebook */
 #include <test_progs.h>
 
-/* that's kernel internal BPF_MAX_TRAMP_PROGS define */
+/* that's kernel internal BPF_MAX_TRAMP_LINKS define */
+#if defined(__s390x__)
+#define CNT 27
+#else
 #define CNT 38
+#endif
 
 void serial_test_fexit_stress(void)
 {
-- 
2.39.1



* [PATCH bpf-next 04/24] selftests/bpf: Fix trampoline_count on s390x
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (2 preceding siblings ...)
  2023-01-25 21:37 ` [PATCH bpf-next 03/24] selftests/bpf: Fix fexit_stress on s390x Ilya Leoshkevich
@ 2023-01-25 21:37 ` Ilya Leoshkevich
  2023-01-25 21:37 ` [PATCH bpf-next 05/24] selftests/bpf: Fix kfree_skb " Ilya Leoshkevich
                   ` (20 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:37 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

The kernel-internal BPF_MAX_TRAMP_LINKS limit, which the test's
MAX_TRAMP_PROGS macro must match, is smaller on s390x than on x86.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/prog_tests/trampoline_count.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/trampoline_count.c b/tools/testing/selftests/bpf/prog_tests/trampoline_count.c
index 564b75bc087f..3d2e25492f40 100644
--- a/tools/testing/selftests/bpf/prog_tests/trampoline_count.c
+++ b/tools/testing/selftests/bpf/prog_tests/trampoline_count.c
@@ -2,7 +2,11 @@
 #define _GNU_SOURCE
 #include <test_progs.h>
 
+#if defined(__s390x__)
+#define MAX_TRAMP_PROGS 27
+#else
 #define MAX_TRAMP_PROGS 38
+#endif
 
 struct inst {
 	struct bpf_object *obj;
-- 
2.39.1



* [PATCH bpf-next 05/24] selftests/bpf: Fix kfree_skb on s390x
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (3 preceding siblings ...)
  2023-01-25 21:37 ` [PATCH bpf-next 04/24] selftests/bpf: Fix trampoline_count " Ilya Leoshkevich
@ 2023-01-25 21:37 ` Ilya Leoshkevich
  2023-01-25 21:37 ` [PATCH bpf-next 06/24] selftests/bpf: Set errno when urand_spawn() fails Ilya Leoshkevich
                   ` (19 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:37 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

h_proto is stored in network (big-endian) byte order; use htons() so
that the comparison works on both little- and big-endian machines.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/prog_tests/kfree_skb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/kfree_skb.c b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c
index 73579370bfbd..c07991544a78 100644
--- a/tools/testing/selftests/bpf/prog_tests/kfree_skb.c
+++ b/tools/testing/selftests/bpf/prog_tests/kfree_skb.c
@@ -36,7 +36,7 @@ static void on_sample(void *ctx, int cpu, void *data, __u32 size)
 		  "cb32_0 %x != %x\n",
 		  meta->cb32_0, cb.cb32[0]))
 		return;
-	if (CHECK(pkt_v6->eth.h_proto != 0xdd86, "check_eth",
+	if (CHECK(pkt_v6->eth.h_proto != htons(ETH_P_IPV6), "check_eth",
 		  "h_proto %x\n", pkt_v6->eth.h_proto))
 		return;
 	if (CHECK(pkt_v6->iph.nexthdr != 6, "check_ip",
-- 
2.39.1



* [PATCH bpf-next 06/24] selftests/bpf: Set errno when urand_spawn() fails
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (4 preceding siblings ...)
  2023-01-25 21:37 ` [PATCH bpf-next 05/24] selftests/bpf: Fix kfree_skb " Ilya Leoshkevich
@ 2023-01-25 21:37 ` Ilya Leoshkevich
  2023-01-25 21:38 ` [PATCH bpf-next 07/24] selftests/bpf: Fix decap_sanity_ns cleanup Ilya Leoshkevich
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:37 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

The result of urand_spawn() is checked with ASSERT_OK_PTR, which
treats NULL as success if errno == 0. Set errno when returning NULL so
that the failure is detected.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/prog_tests/usdt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/usdt.c b/tools/testing/selftests/bpf/prog_tests/usdt.c
index 9ad9da0f215e..56ed1eb9b527 100644
--- a/tools/testing/selftests/bpf/prog_tests/usdt.c
+++ b/tools/testing/selftests/bpf/prog_tests/usdt.c
@@ -314,6 +314,7 @@ static FILE *urand_spawn(int *pid)
 
 	if (fscanf(f, "%d", pid) != 1) {
 		pclose(f);
+		errno = EINVAL;
 		return NULL;
 	}
 
-- 
2.39.1



* [PATCH bpf-next 07/24] selftests/bpf: Fix decap_sanity_ns cleanup
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (5 preceding siblings ...)
  2023-01-25 21:37 ` [PATCH bpf-next 06/24] selftests/bpf: Set errno when urand_spawn() fails Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-25 21:38 ` [PATCH bpf-next 08/24] selftests/bpf: Fix verify_pkcs7_sig on s390x Ilya Leoshkevich
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

decap_sanity prints the following on the 1st run:

    decap_sanity: sh: 1: Syntax error: Bad fd number

and the following on the 2nd run:

    Cannot create namespace file "/run/netns/decap_sanity_ns": File exists

The problem is that the cleanup command has a typo and does nothing.
Fix the typo.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/prog_tests/decap_sanity.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/decap_sanity.c b/tools/testing/selftests/bpf/prog_tests/decap_sanity.c
index 0b2f73b88c53..2853883b7cbb 100644
--- a/tools/testing/selftests/bpf/prog_tests/decap_sanity.c
+++ b/tools/testing/selftests/bpf/prog_tests/decap_sanity.c
@@ -80,6 +80,6 @@ void test_decap_sanity(void)
 		bpf_tc_hook_destroy(&qdisc_hook);
 		close_netns(nstoken);
 	}
-	system("ip netns del " NS_TEST " >& /dev/null");
+	system("ip netns del " NS_TEST " &> /dev/null");
 	decap_sanity__destroy(skel);
 }
-- 
2.39.1



* [PATCH bpf-next 08/24] selftests/bpf: Fix verify_pkcs7_sig on s390x
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (6 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 07/24] selftests/bpf: Fix decap_sanity_ns cleanup Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-26  1:06   ` Andrii Nakryiko
  2023-01-25 21:38 ` [PATCH bpf-next 09/24] selftests/bpf: Fix xdp_do_redirect " Ilya Leoshkevich
                   ` (16 subsequent siblings)
  24 siblings, 1 reply; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

Use bpf_probe_read_kernel() instead of bpf_probe_read(), which is not
defined on all architectures.

While at it, improve the error handling: do not hide the verifier log,
and check the return values of bpf_probe_read_kernel() and
bpf_copy_from_user().

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 .../selftests/bpf/prog_tests/verify_pkcs7_sig.c      |  9 +++++++++
 .../selftests/bpf/progs/test_verify_pkcs7_sig.c      | 12 ++++++++----
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c b/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
index 579d6ee83ce0..75c256f79f85 100644
--- a/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
+++ b/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
@@ -56,11 +56,17 @@ struct data {
 	__u32 sig_len;
 };
 
+static char libbpf_log[8192];
 static bool kfunc_not_supported;
 
 static int libbpf_print_cb(enum libbpf_print_level level, const char *fmt,
 			   va_list args)
 {
+	size_t log_len = strlen(libbpf_log);
+
+	vsnprintf(libbpf_log + log_len, sizeof(libbpf_log) - log_len,
+		  fmt, args);
+
 	if (strcmp(fmt, "libbpf: extern (func ksym) '%s': not found in kernel or module BTFs\n"))
 		return 0;
 
@@ -277,6 +283,7 @@ void test_verify_pkcs7_sig(void)
 	if (!ASSERT_OK_PTR(skel, "test_verify_pkcs7_sig__open"))
 		goto close_prog;
 
+	libbpf_log[0] = 0;
 	old_print_cb = libbpf_set_print(libbpf_print_cb);
 	ret = test_verify_pkcs7_sig__load(skel);
 	libbpf_set_print(old_print_cb);
@@ -289,6 +296,8 @@ void test_verify_pkcs7_sig(void)
 		goto close_prog;
 	}
 
+	printf("%s", libbpf_log);
+
 	if (!ASSERT_OK(ret, "test_verify_pkcs7_sig__load"))
 		goto close_prog;
 
diff --git a/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c b/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
index ce419304ff1f..7748cc23de8a 100644
--- a/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
+++ b/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
@@ -59,10 +59,14 @@ int BPF_PROG(bpf, int cmd, union bpf_attr *attr, unsigned int size)
 	if (!data_val)
 		return 0;
 
-	bpf_probe_read(&value, sizeof(value), &attr->value);
-
-	bpf_copy_from_user(data_val, sizeof(struct data),
-			   (void *)(unsigned long)value);
+	ret = bpf_probe_read_kernel(&value, sizeof(value), &attr->value);
+	if (ret)
+		return ret;
+
+	ret = bpf_copy_from_user(data_val, sizeof(struct data),
+				 (void *)(unsigned long)value);
+	if (ret)
+		return ret;
 
 	if (data_val->data_len > sizeof(data_val->data))
 		return -EINVAL;
-- 
2.39.1



* [PATCH bpf-next 09/24] selftests/bpf: Fix xdp_do_redirect on s390x
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (7 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 08/24] selftests/bpf: Fix verify_pkcs7_sig on s390x Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-25 21:38 ` [PATCH bpf-next 10/24] selftests/bpf: Fix cgrp_local_storage " Ilya Leoshkevich
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

The s390x cache line size is 256 bytes, so skb_shared_info must be
aligned on a much larger boundary than on x86. This makes the maximum
permissible packet size smaller.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
index a50971c6cf4a..ac70e871d62f 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
@@ -65,7 +65,11 @@ static int attach_tc_prog(struct bpf_tc_hook *hook, int fd)
 /* The maximum permissible size is: PAGE_SIZE - sizeof(struct xdp_page_head) -
  * sizeof(struct skb_shared_info) - XDP_PACKET_HEADROOM = 3368 bytes
  */
+#if defined(__s390x__)
+#define MAX_PKT_SIZE 3176
+#else
 #define MAX_PKT_SIZE 3368
+#endif
 static void test_max_pkt_size(int fd)
 {
 	char data[MAX_PKT_SIZE + 1] = {};
-- 
2.39.1



* [PATCH bpf-next 10/24] selftests/bpf: Fix cgrp_local_storage on s390x
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (8 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 09/24] selftests/bpf: Fix xdp_do_redirect " Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-25 21:38 ` [PATCH bpf-next 11/24] selftests/bpf: Check stack_mprotect() return value Ilya Leoshkevich
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

Sync the definition of socket_cookie between the eBPF program and the
test. Currently the test works by accident: on little-endian machines,
reading a u64 as a u32 happens to yield the low 32 bits.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/prog_tests/cgrp_local_storage.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/cgrp_local_storage.c b/tools/testing/selftests/bpf/prog_tests/cgrp_local_storage.c
index 33a2776737e7..2cc759956e3b 100644
--- a/tools/testing/selftests/bpf/prog_tests/cgrp_local_storage.c
+++ b/tools/testing/selftests/bpf/prog_tests/cgrp_local_storage.c
@@ -16,7 +16,7 @@
 
 struct socket_cookie {
 	__u64 cookie_key;
-	__u32 cookie_value;
+	__u64 cookie_value;
 };
 
 static void test_tp_btf(int cgroup_fd)
-- 
2.39.1



* [PATCH bpf-next 11/24] selftests/bpf: Check stack_mprotect() return value
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (9 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 10/24] selftests/bpf: Fix cgrp_local_storage " Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-25 21:38 ` [PATCH bpf-next 12/24] selftests/bpf: Increase SIZEOF_BPF_LOCAL_STORAGE_ELEM on s390x Ilya Leoshkevich
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

If stack_mprotect() succeeds, errno is not changed. This can produce
misleading error messages that show a stale errno.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/prog_tests/bpf_cookie.c | 6 ++++--
 tools/testing/selftests/bpf/prog_tests/test_lsm.c   | 3 ++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
index 2be2d61954bc..26b2d1bffdfd 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
@@ -472,6 +472,7 @@ static void lsm_subtest(struct test_bpf_cookie *skel)
 	int prog_fd;
 	int lsm_fd = -1;
 	LIBBPF_OPTS(bpf_link_create_opts, link_opts);
+	int err;
 
 	skel->bss->lsm_res = 0;
 
@@ -482,8 +483,9 @@ static void lsm_subtest(struct test_bpf_cookie *skel)
 	if (!ASSERT_GE(lsm_fd, 0, "lsm.link_create"))
 		goto cleanup;
 
-	stack_mprotect();
-	if (!ASSERT_EQ(errno, EPERM, "stack_mprotect"))
+	err = stack_mprotect();
+	if (!ASSERT_EQ(err, -1, "stack_mprotect") ||
+	    !ASSERT_EQ(errno, EPERM, "stack_mprotect"))
 		goto cleanup;
 
 	usleep(1);
diff --git a/tools/testing/selftests/bpf/prog_tests/test_lsm.c b/tools/testing/selftests/bpf/prog_tests/test_lsm.c
index 244c01125126..16175d579bc7 100644
--- a/tools/testing/selftests/bpf/prog_tests/test_lsm.c
+++ b/tools/testing/selftests/bpf/prog_tests/test_lsm.c
@@ -75,7 +75,8 @@ static int test_lsm(struct lsm *skel)
 	skel->bss->monitored_pid = getpid();
 
 	err = stack_mprotect();
-	if (!ASSERT_EQ(errno, EPERM, "stack_mprotect"))
+	if (!ASSERT_EQ(err, -1, "stack_mprotect") ||
+	    !ASSERT_EQ(errno, EPERM, "stack_mprotect"))
 		return err;
 
 	ASSERT_EQ(skel->bss->mprotect_count, 1, "mprotect_count");
-- 
2.39.1



* [PATCH bpf-next 12/24] selftests/bpf: Increase SIZEOF_BPF_LOCAL_STORAGE_ELEM on s390x
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (10 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 11/24] selftests/bpf: Check stack_mprotect() return value Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-25 21:38 ` [PATCH bpf-next 13/24] selftests/bpf: Add a sign-extension test for kfuncs Ilya Leoshkevich
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

sizeof(struct bpf_local_storage_elem) is 512 on s390x:

    struct bpf_local_storage_elem {
            struct hlist_node          map_node;             /*     0    16 */
            struct hlist_node          snode;                /*    16    16 */
            struct bpf_local_storage * local_storage;        /*    32     8 */
            struct callback_head       rcu __attribute__((__aligned__(8))); /*    40    16 */

            /* XXX 200 bytes hole, try to pack */

            /* --- cacheline 1 boundary (256 bytes) --- */
            struct bpf_local_storage_data sdata __attribute__((__aligned__(256))); /*   256     8 */

            /* size: 512, cachelines: 2, members: 5 */
            /* sum members: 64, holes: 1, sum holes: 200 */
            /* padding: 248 */
            /* forced alignments: 2, forced holes: 1, sum forced holes: 200 */
    } __attribute__((__aligned__(256)));

As the existing comment suggests, use a larger number in order to be
future-proof.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/netcnt_common.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/netcnt_common.h b/tools/testing/selftests/bpf/netcnt_common.h
index 0ab1c88041cd..2d4a58e4e39c 100644
--- a/tools/testing/selftests/bpf/netcnt_common.h
+++ b/tools/testing/selftests/bpf/netcnt_common.h
@@ -8,11 +8,11 @@
 
 /* sizeof(struct bpf_local_storage_elem):
  *
- * It really is about 128 bytes on x86_64, but allocate more to account for
- * possible layout changes, different architectures, etc.
+ * It is about 128 bytes on x86_64 and 512 bytes on s390x, but allocate more to
+ * account for possible layout changes, different architectures, etc.
  * The kernel will wrap up to PAGE_SIZE internally anyway.
  */
-#define SIZEOF_BPF_LOCAL_STORAGE_ELEM		256
+#define SIZEOF_BPF_LOCAL_STORAGE_ELEM		768
 
 /* Try to estimate kernel's BPF_LOCAL_STORAGE_MAX_VALUE_SIZE: */
 #define BPF_LOCAL_STORAGE_MAX_VALUE_SIZE	(0xFFFF - \
-- 
2.39.1



* [PATCH bpf-next 13/24] selftests/bpf: Add a sign-extension test for kfuncs
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (11 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 12/24] selftests/bpf: Increase SIZEOF_BPF_LOCAL_STORAGE_ELEM on s390x Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-25 21:38 ` [PATCH bpf-next 14/24] selftests/bpf: Fix test_lsm on s390x Ilya Leoshkevich
                   ` (11 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

The s390x ABI requires the caller to zero- or sign-extend the
arguments. eBPF already deals with zero-extension (by definition of its
ABI), but not with sign-extension.

Add a test to cover that potentially problematic area.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 net/bpf/test_run.c                             |  9 +++++++++
 .../selftests/bpf/prog_tests/kfunc_call.c      |  1 +
 .../selftests/bpf/progs/kfunc_call_test.c      | 18 ++++++++++++++++++
 3 files changed, 28 insertions(+)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 8da0d73b368e..7dbefa4fd2eb 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -550,6 +550,14 @@ struct sock * noinline bpf_kfunc_call_test3(struct sock *sk)
 	return sk;
 }
 
+long noinline bpf_kfunc_call_test4(signed char a, short b, int c, long d)
+{
+	/* Provoke the compiler to assume that the caller has sign-extended a,
+	 * b and c on platforms where this is required (e.g. s390x).
+	 */
+	return (long)a + (long)b + (long)c + d;
+}
+
 struct prog_test_member1 {
 	int a;
 };
@@ -746,6 +754,7 @@ BTF_SET8_START(test_sk_check_kfunc_ids)
 BTF_ID_FLAGS(func, bpf_kfunc_call_test1)
 BTF_ID_FLAGS(func, bpf_kfunc_call_test2)
 BTF_ID_FLAGS(func, bpf_kfunc_call_test3)
+BTF_ID_FLAGS(func, bpf_kfunc_call_test4)
 BTF_ID_FLAGS(func, bpf_kfunc_call_test_acquire, KF_ACQUIRE | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_kfunc_call_memb_acquire, KF_ACQUIRE | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_kfunc_call_test_release, KF_RELEASE)
diff --git a/tools/testing/selftests/bpf/prog_tests/kfunc_call.c b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
index 5af1ee8f0e6e..bb4cd82a788a 100644
--- a/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
+++ b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
@@ -72,6 +72,7 @@ static struct kfunc_test_params kfunc_tests[] = {
 	/* success cases */
 	TC_TEST(kfunc_call_test1, 12),
 	TC_TEST(kfunc_call_test2, 3),
+	TC_TEST(kfunc_call_test4, -1234),
 	TC_TEST(kfunc_call_test_ref_btf_id, 0),
 	TC_TEST(kfunc_call_test_get_mem, 42),
 	SYSCALL_TEST(kfunc_syscall_test, 0),
diff --git a/tools/testing/selftests/bpf/progs/kfunc_call_test.c b/tools/testing/selftests/bpf/progs/kfunc_call_test.c
index f636e50be259..d91c58d06d38 100644
--- a/tools/testing/selftests/bpf/progs/kfunc_call_test.c
+++ b/tools/testing/selftests/bpf/progs/kfunc_call_test.c
@@ -3,6 +3,7 @@
 #include <vmlinux.h>
 #include <bpf/bpf_helpers.h>
 
+extern long bpf_kfunc_call_test4(signed char a, short b, int c, long d) __ksym;
 extern int bpf_kfunc_call_test2(struct sock *sk, __u32 a, __u32 b) __ksym;
 extern __u64 bpf_kfunc_call_test1(struct sock *sk, __u32 a, __u64 b,
 				  __u32 c, __u64 d) __ksym;
@@ -17,6 +18,23 @@ extern void bpf_kfunc_call_test_mem_len_fail2(__u64 *mem, int len) __ksym;
 extern int *bpf_kfunc_call_test_get_rdwr_mem(struct prog_test_ref_kfunc *p, const int rdwr_buf_size) __ksym;
 extern int *bpf_kfunc_call_test_get_rdonly_mem(struct prog_test_ref_kfunc *p, const int rdonly_buf_size) __ksym;
 
+SEC("tc")
+int kfunc_call_test4(struct __sk_buff *skb)
+{
+	struct bpf_sock *sk = skb->sk;
+	long tmp;
+
+	if (!sk)
+		return -1;
+
+	sk = bpf_sk_fullsock(sk);
+	if (!sk)
+		return -1;
+
+	tmp = bpf_kfunc_call_test4(-3, -30, -200, -1000);
+	return (tmp >> 32) + tmp;
+}
+
 SEC("tc")
 int kfunc_call_test2(struct __sk_buff *skb)
 {
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH bpf-next 14/24] selftests/bpf: Fix test_lsm on s390x
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (12 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 13/24] selftests/bpf: Add a sign-extension test for kfuncs Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-25 21:38 ` [PATCH bpf-next 15/24] selftests/bpf: Fix test_xdp_adjust_tail_grow2 " Ilya Leoshkevich
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

Use syscall macros to access the setdomainname() arguments; currently
the code uses gprs[2] instead of orig_gpr2 for the first argument.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/progs/lsm.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/lsm.c b/tools/testing/selftests/bpf/progs/lsm.c
index d8d8af623bc2..dc93887ed34c 100644
--- a/tools/testing/selftests/bpf/progs/lsm.c
+++ b/tools/testing/selftests/bpf/progs/lsm.c
@@ -6,9 +6,10 @@
 
 #include "bpf_misc.h"
 #include "vmlinux.h"
+#include <bpf/bpf_core_read.h>
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
-#include  <errno.h>
+#include <errno.h>
 
 struct {
 	__uint(type, BPF_MAP_TYPE_ARRAY);
@@ -164,8 +165,8 @@ int copy_test = 0;
 SEC("fentry.s/" SYS_PREFIX "sys_setdomainname")
 int BPF_PROG(test_sys_setdomainname, struct pt_regs *regs)
 {
-	void *ptr = (void *)PT_REGS_PARM1(regs);
-	int len = PT_REGS_PARM2(regs);
+	void *ptr = (void *)PT_REGS_PARM1_SYSCALL(regs);
+	int len = PT_REGS_PARM2_SYSCALL(regs);
 	int buf = 0;
 	long ret;
 
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH bpf-next 15/24] selftests/bpf: Fix test_xdp_adjust_tail_grow2 on s390x
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (13 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 14/24] selftests/bpf: Fix test_lsm on s390x Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-25 21:38 ` [PATCH bpf-next 16/24] selftests/bpf: Fix vmlinux test " Ilya Leoshkevich
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

The s390x cache line size is 256 bytes, so skb_shared_info must be
aligned on a much larger boundary than on x86.
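The 320 vs. 512 values come from the kernel's SKB_DATA_ALIGN(), which rounds sizeof(struct skb_shared_info) up to SMP_CACHE_BYTES (64 on x86, 256 on s390x). A user-space sketch of that arithmetic, assuming sizeof(struct skb_shared_info) == 320 here, which is consistent with both tailroom values:

```c
#include <assert.h>
#include <stddef.h>

/* SKB_DATA_ALIGN(x): round x up to the cache line size.
 * smp_cache_bytes is assumed to be a power of two, as it always is. */
static size_t skb_data_align(size_t x, size_t smp_cache_bytes)
{
	return (x + smp_cache_bytes - 1) & ~(smp_cache_bytes - 1);
}
```

This is why the test cannot share one hardcoded tailroom constant across architectures.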

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/prog_tests/xdp_adjust_tail.c  | 7 ++++++-
 .../selftests/bpf/progs/test_xdp_adjust_tail_grow.c       | 8 +++++++-
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_adjust_tail.c b/tools/testing/selftests/bpf/prog_tests/xdp_adjust_tail.c
index 39973ea1ce43..f09505f8b038 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_adjust_tail.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_adjust_tail.c
@@ -76,10 +76,15 @@ static void test_xdp_adjust_tail_grow2(void)
 {
 	const char *file = "./test_xdp_adjust_tail_grow.bpf.o";
 	char buf[4096]; /* avoid segfault: large buf to hold grow results */
-	int tailroom = 320; /* SKB_DATA_ALIGN(sizeof(struct skb_shared_info))*/;
 	struct bpf_object *obj;
 	int err, cnt, i;
 	int max_grow, prog_fd;
+	/* SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) */
+#if defined(__s390x__)
+	int tailroom = 512;
+#else
+	int tailroom = 320;
+#endif
 
 	LIBBPF_OPTS(bpf_test_run_opts, tattr,
 		.repeat		= 1,
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_grow.c b/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_grow.c
index 53b64c999450..297c260fc364 100644
--- a/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_grow.c
+++ b/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_grow.c
@@ -9,6 +9,12 @@ int _xdp_adjust_tail_grow(struct xdp_md *xdp)
 	void *data = (void *)(long)xdp->data;
 	int data_len = bpf_xdp_get_buff_len(xdp);
 	int offset = 0;
+	/* SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) */
+#if defined(__TARGET_ARCH_s390)
+	int tailroom = 512;
+#else
+	int tailroom = 320;
+#endif
 
 	/* Data length determine test case */
 
@@ -20,7 +26,7 @@ int _xdp_adjust_tail_grow(struct xdp_md *xdp)
 		offset = 128;
 	} else if (data_len == 128) {
 		/* Max tail grow 3520 */
-		offset = 4096 - 256 - 320 - data_len;
+		offset = 4096 - 256 - tailroom - data_len;
 	} else if (data_len == 9000) {
 		offset = 10;
 	} else if (data_len == 9001) {
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH bpf-next 16/24] selftests/bpf: Fix vmlinux test on s390x
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (14 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 15/24] selftests/bpf: Fix test_xdp_adjust_tail_grow2 " Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-25 21:38 ` [PATCH bpf-next 17/24] libbpf: Read usdt arg spec with bpf_probe_read_kernel() Ilya Leoshkevich
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

Use a syscall macro to access nanosleep()'s first argument; currently
the code uses gprs[2] instead of orig_gpr2.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/testing/selftests/bpf/progs/test_vmlinux.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/test_vmlinux.c b/tools/testing/selftests/bpf/progs/test_vmlinux.c
index e9dfa0313d1b..4b8e37f7fd06 100644
--- a/tools/testing/selftests/bpf/progs/test_vmlinux.c
+++ b/tools/testing/selftests/bpf/progs/test_vmlinux.c
@@ -42,7 +42,7 @@ int BPF_PROG(handle__raw_tp, struct pt_regs *regs, long id)
 	if (id != __NR_nanosleep)
 		return 0;
 
-	ts = (void *)PT_REGS_PARM1_CORE(regs);
+	ts = (void *)PT_REGS_PARM1_CORE_SYSCALL(regs);
 	if (bpf_probe_read_user(&tv_nsec, sizeof(ts->tv_nsec), &ts->tv_nsec) ||
 	    tv_nsec != MY_TV_NSEC)
 		return 0;
@@ -60,7 +60,7 @@ int BPF_PROG(handle__tp_btf, struct pt_regs *regs, long id)
 	if (id != __NR_nanosleep)
 		return 0;
 
-	ts = (void *)PT_REGS_PARM1_CORE(regs);
+	ts = (void *)PT_REGS_PARM1_CORE_SYSCALL(regs);
 	if (bpf_probe_read_user(&tv_nsec, sizeof(ts->tv_nsec), &ts->tv_nsec) ||
 	    tv_nsec != MY_TV_NSEC)
 		return 0;
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH bpf-next 17/24] libbpf: Read usdt arg spec with bpf_probe_read_kernel()
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (15 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 16/24] selftests/bpf: Fix vmlinux test " Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-26  0:26   ` Andrii Nakryiko
  2023-01-25 21:38 ` [PATCH bpf-next 18/24] s390/bpf: Fix a typo in a comment Ilya Leoshkevich
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

Loading programs that use bpf_usdt_arg() on s390x fails with:

    ; switch (arg_spec->arg_type) {
    139: (61) r1 = *(u32 *)(r2 +8)
    R2 unbounded memory access, make sure to bounds check any such access

The bounds checks seem to be in place already in the C code, and it
might even be possible to add extra bogus checks to placate LLVM or the
verifier. Take a simpler approach and just use a helper.
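The shape of the fix: instead of carrying a verifier-visible pointer into the array, copy the element into a local variable by value. A stand-alone sketch of that pattern in plain C, where memcpy() stands in for bpf_probe_read_kernel() and the struct layout is illustrative rather than the real one:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MAX_ARG_CNT 12

struct arg_spec { int arg_type; long val_off; };
struct spec { int arg_cnt; struct arg_spec args[MAX_ARG_CNT]; };

/* Copy args[arg_num] by value after a bounds check, rather than keeping
 * a pointer into the array, which the verifier flagged on s390x as an
 * unbounded memory access. */
static int read_arg_spec(const struct spec *spec, unsigned long arg_num,
			 struct arg_spec *out)
{
	if (arg_num >= MAX_ARG_CNT || arg_num >= (unsigned long)spec->arg_cnt)
		return -ENOENT;
	memcpy(out, &spec->args[arg_num], sizeof(*out));
	return 0;
}
```

After the copy, all subsequent accesses go through the stack variable, so the verifier no longer has to track a variable-offset map pointer.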

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/lib/bpf/usdt.bpf.h | 33 ++++++++++++++++++---------------
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/tools/lib/bpf/usdt.bpf.h b/tools/lib/bpf/usdt.bpf.h
index fdfd235e52c4..ddfa2521ab67 100644
--- a/tools/lib/bpf/usdt.bpf.h
+++ b/tools/lib/bpf/usdt.bpf.h
@@ -116,7 +116,7 @@ __weak __hidden
 int bpf_usdt_arg(struct pt_regs *ctx, __u64 arg_num, long *res)
 {
 	struct __bpf_usdt_spec *spec;
-	struct __bpf_usdt_arg_spec *arg_spec;
+	struct __bpf_usdt_arg_spec arg_spec;
 	unsigned long val;
 	int err, spec_id;
 
@@ -133,21 +133,24 @@ int bpf_usdt_arg(struct pt_regs *ctx, __u64 arg_num, long *res)
 	if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt)
 		return -ENOENT;
 
-	arg_spec = &spec->args[arg_num];
-	switch (arg_spec->arg_type) {
+	err = bpf_probe_read_kernel(&arg_spec, sizeof(arg_spec), &spec->args[arg_num]);
+	if (err)
+		return err;
+
+	switch (arg_spec.arg_type) {
 	case BPF_USDT_ARG_CONST:
 		/* Arg is just a constant ("-4@$-9" in USDT arg spec).
-		 * value is recorded in arg_spec->val_off directly.
+		 * value is recorded in arg_spec.val_off directly.
 		 */
-		val = arg_spec->val_off;
+		val = arg_spec.val_off;
 		break;
 	case BPF_USDT_ARG_REG:
 		/* Arg is in a register (e.g, "8@%rax" in USDT arg spec),
 		 * so we read the contents of that register directly from
 		 * struct pt_regs. To keep things simple user-space parts
-		 * record offsetof(struct pt_regs, <regname>) in arg_spec->reg_off.
+		 * record offsetof(struct pt_regs, <regname>) in arg_spec.reg_off.
 		 */
-		err = bpf_probe_read_kernel(&val, sizeof(val), (void *)ctx + arg_spec->reg_off);
+		err = bpf_probe_read_kernel(&val, sizeof(val), (void *)ctx + arg_spec.reg_off);
 		if (err)
 			return err;
 		break;
@@ -155,18 +158,18 @@ int bpf_usdt_arg(struct pt_regs *ctx, __u64 arg_num, long *res)
 		/* Arg is in memory addressed by register, plus some offset
 		 * (e.g., "-4@-1204(%rbp)" in USDT arg spec). Register is
 		 * identified like with BPF_USDT_ARG_REG case, and the offset
-		 * is in arg_spec->val_off. We first fetch register contents
+		 * is in arg_spec.val_off. We first fetch register contents
 		 * from pt_regs, then do another user-space probe read to
 		 * fetch argument value itself.
 		 */
-		err = bpf_probe_read_kernel(&val, sizeof(val), (void *)ctx + arg_spec->reg_off);
+		err = bpf_probe_read_kernel(&val, sizeof(val), (void *)ctx + arg_spec.reg_off);
 		if (err)
 			return err;
-		err = bpf_probe_read_user(&val, sizeof(val), (void *)val + arg_spec->val_off);
+		err = bpf_probe_read_user(&val, sizeof(val), (void *)val + arg_spec.val_off);
 		if (err)
 			return err;
 #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
-		val >>= arg_spec->arg_bitshift;
+		val >>= arg_spec.arg_bitshift;
 #endif
 		break;
 	default:
@@ -177,11 +180,11 @@ int bpf_usdt_arg(struct pt_regs *ctx, __u64 arg_num, long *res)
 	 * necessary upper arg_bitshift bits, with sign extension if argument
 	 * is signed
 	 */
-	val <<= arg_spec->arg_bitshift;
-	if (arg_spec->arg_signed)
-		val = ((long)val) >> arg_spec->arg_bitshift;
+	val <<= arg_spec.arg_bitshift;
+	if (arg_spec.arg_signed)
+		val = ((long)val) >> arg_spec.arg_bitshift;
 	else
-		val = val >> arg_spec->arg_bitshift;
+		val = val >> arg_spec.arg_bitshift;
 	*res = val;
 	return 0;
 }
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH bpf-next 18/24] s390/bpf: Fix a typo in a comment
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (16 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 17/24] libbpf: Read usdt arg spec with bpf_probe_read_kernel() Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-25 21:38 ` [PATCH bpf-next 19/24] s390/bpf: Add expoline to tail calls Ilya Leoshkevich
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

"decription" should be "description".

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 arch/s390/net/bpf_jit_comp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index af35052d06ed..eb1a78c0e6a8 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -510,7 +510,7 @@ static void bpf_skip(struct bpf_jit *jit, int size)
  * Emit function prologue
  *
  * Save registers and create stack frame if necessary.
- * See stack frame layout desription in "bpf_jit.h"!
+ * See stack frame layout description in "bpf_jit.h"!
  */
 static void bpf_jit_prologue(struct bpf_jit *jit, u32 stack_depth)
 {
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH bpf-next 19/24] s390/bpf: Add expoline to tail calls
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (17 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 18/24] s390/bpf: Fix a typo in a comment Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-25 21:38 ` [PATCH bpf-next 20/24] s390/bpf: Implement bpf_arch_text_poke() Ilya Leoshkevich
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

All the indirect jumps in the eBPF JIT already use expolines, except
for the tail call one.

Fixes: de5cb6eb514e ("s390: use expoline thunks in the BPF JIT")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 arch/s390/net/bpf_jit_comp.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index eb1a78c0e6a8..8400a06c926e 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -1393,8 +1393,16 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,
 		/* lg %r1,bpf_func(%r1) */
 		EMIT6_DISP_LH(0xe3000000, 0x0004, REG_1, REG_1, REG_0,
 			      offsetof(struct bpf_prog, bpf_func));
-		/* bc 0xf,tail_call_start(%r1) */
-		_EMIT4(0x47f01000 + jit->tail_call_start);
+		if (nospec_uses_trampoline()) {
+			jit->seen |= SEEN_FUNC;
+			/* aghi %r1,tail_call_start */
+			EMIT4_IMM(0xa70b0000, REG_1, jit->tail_call_start);
+			/* brcl 0xf,__s390_indirect_jump_r1 */
+			EMIT6_PCREL_RILC(0xc0040000, 0xf, jit->r1_thunk_ip);
+		} else {
+			/* bc 0xf,tail_call_start(%r1) */
+			_EMIT4(0x47f01000 + jit->tail_call_start);
+		}
 		/* out: */
 		if (jit->prg_buf) {
 			*(u16 *)(jit->prg_buf + patch_1_clrj + 2) =
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH bpf-next 20/24] s390/bpf: Implement bpf_arch_text_poke()
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (18 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 19/24] s390/bpf: Add expoline to tail calls Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-25 21:38 ` [PATCH bpf-next 21/24] bpf: btf: Add BTF_FMODEL_SIGNED_ARG flag Ilya Leoshkevich
                   ` (4 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

bpf_arch_text_poke() is used to hotpatch eBPF programs and trampolines.
s390x has a very strict hotpatching restriction: the only thing that
is allowed to be hotpatched is the mask of a conditional branch.

Take the same approach as commit de5012b41e5c ("s390/ftrace: implement
hotpatching"): create a conditional jump to a "plt", which loads the
target address from memory and jumps to it; then first patch this
address, and then the mask.

Trampolines (introduced in the next patch) respect the ftrace calling
convention: the return address is in %r0, and %r1 is clobbered. With
that in mind, bpf_arch_text_poke() does not differentiate between jumps
and calls.

However, there is a simple optimization for jumps (for the epilogue_ip
case): if a jump already points to the destination, then there is no
"plt" and we can just flip the mask.

For simplicity, the "plt" template is defined in assembly, and its size
is used to define C arrays. There doesn't seem to be a way to convey
this size to C as a constant, so it's hardcoded and double-checked at
runtime.
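A plain-C model of the patching order described above (not the JIT code; the field and function names here are illustrative): the PLT's target slot is written first and the branch mask second, so a concurrent CPU observes either the old state or a branch into a fully initialized PLT:

```c
#include <assert.h>
#include <stdint.h>

struct plt_model {
	uint64_t ret;		/* bpf_plt_ret: address to return to */
	uint64_t target;	/* bpf_plt_target: address to jump to */
};

struct branch_model {
	uint8_t mask;		/* brcl mask: 0 = nop, 0xf = always */
	struct plt_model *plt;
};

static void poke_model(struct branch_model *br, uint64_t new_target)
{
	/* Step 1: adjust the call address in the PLT. */
	br->plt->target = new_target;
	/* Step 2: flip the mask, activating (or nop-ing) the branch. */
	br->mask = new_target ? 0xf : 0;
}
```

The mask flip is the only write that lands in the instruction stream itself, which is what keeps the scheme within the s390x hotpatching rules.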

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 arch/s390/net/bpf_jit_comp.c | 97 ++++++++++++++++++++++++++++++++++++
 1 file changed, 97 insertions(+)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index 8400a06c926e..c72eb3fc1f98 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -30,6 +30,7 @@
 #include <asm/facility.h>
 #include <asm/nospec-branch.h>
 #include <asm/set_memory.h>
+#include <asm/text-patching.h>
 #include "bpf_jit.h"
 
 struct bpf_jit {
@@ -50,6 +51,8 @@ struct bpf_jit {
 	int r14_thunk_ip;	/* Address of expoline thunk for 'br %r14' */
 	int tail_call_start;	/* Tail call start offset */
 	int excnt;		/* Number of exception table entries */
+	int prologue_plt_ret;	/* Return address for prologue hotpatch PLT */
+	int prologue_plt;	/* Start of prologue hotpatch PLT */
 };
 
 #define SEEN_MEM	BIT(0)		/* use mem[] for temporary storage */
@@ -506,6 +509,36 @@ static void bpf_skip(struct bpf_jit *jit, int size)
 	}
 }
 
+/*
+ * PLT for hotpatchable calls. The calling convention is the same as for the
+ * ftrace hotpatch trampolines: %r0 is return address, %r1 is clobbered.
+ */
+extern const char bpf_plt[];
+extern const char bpf_plt_ret[];
+extern const char bpf_plt_target[];
+extern const char bpf_plt_end[];
+#define BPF_PLT_SIZE 32
+asm(
+	".pushsection .rodata\n"
+	"	.align 8\n"
+	"bpf_plt:\n"
+	"	lgrl %r0,bpf_plt_ret\n"
+	"	lgrl %r1,bpf_plt_target\n"
+	"	br %r1\n"
+	"	.align 8\n"
+	"bpf_plt_ret: .quad 0\n"
+	"bpf_plt_target: .quad 0\n"
+	"bpf_plt_end:\n"
+	"	.popsection\n"
+);
+
+static void bpf_jit_plt(void *plt, void *ret, void *target)
+{
+	memcpy(plt, bpf_plt, BPF_PLT_SIZE);
+	*(void **)((char *)plt + (bpf_plt_ret - bpf_plt)) = ret;
+	*(void **)((char *)plt + (bpf_plt_target - bpf_plt)) = target;
+}
+
 /*
  * Emit function prologue
  *
@@ -514,6 +547,11 @@ static void bpf_skip(struct bpf_jit *jit, int size)
  */
 static void bpf_jit_prologue(struct bpf_jit *jit, u32 stack_depth)
 {
+	/* No-op for hotpatching */
+	/* brcl 0,prologue_plt */
+	EMIT6_PCREL_RILC(0xc0040000, 0, jit->prologue_plt);
+	jit->prologue_plt_ret = jit->prg;
+
 	if (jit->seen & SEEN_TAIL_CALL) {
 		/* xc STK_OFF_TCCNT(4,%r15),STK_OFF_TCCNT(%r15) */
 		_EMIT6(0xd703f000 | STK_OFF_TCCNT, 0xf000 | STK_OFF_TCCNT);
@@ -589,6 +627,13 @@ static void bpf_jit_epilogue(struct bpf_jit *jit, u32 stack_depth)
 		/* br %r1 */
 		_EMIT2(0x07f1);
 	}
+
+	jit->prg = ALIGN(jit->prg, 8);
+	jit->prologue_plt = jit->prg;
+	if (jit->prg_buf)
+		bpf_jit_plt(jit->prg_buf + jit->prg,
+			    jit->prg_buf + jit->prologue_plt_ret, NULL);
+	jit->prg += BPF_PLT_SIZE;
 }
 
 static int get_probe_mem_regno(const u8 *insn)
@@ -1776,6 +1821,9 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 	struct bpf_jit jit;
 	int pass;
 
+	if (WARN_ON_ONCE(bpf_plt_end - bpf_plt != BPF_PLT_SIZE))
+		return orig_fp;
+
 	if (!fp->jit_requested)
 		return orig_fp;
 
@@ -1867,3 +1915,52 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 					   tmp : orig_fp);
 	return fp;
 }
+
+int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
+		       void *old_addr, void *new_addr)
+{
+	struct {
+		u16 opc;
+		s32 disp;
+	} __packed insn;
+	char expected_plt[BPF_PLT_SIZE];
+	char current_plt[BPF_PLT_SIZE];
+	char *plt;
+	int err;
+
+	/* Verify the branch to be patched. */
+	err = copy_from_kernel_nofault(&insn, ip, sizeof(insn));
+	if (err < 0)
+		return err;
+	if (insn.opc != (0xc004 | (old_addr ? 0xf0 : 0)))
+		return -EINVAL;
+
+	if (t == BPF_MOD_JUMP &&
+	    insn.disp == ((char *)new_addr - (char *)ip) >> 1) {
+		/*
+		 * The branch already points to the destination,
+		 * there is no PLT.
+		 */
+	} else {
+		/* Verify the PLT. */
+		plt = (char *)ip + (insn.disp << 1);
+		err = copy_from_kernel_nofault(current_plt, plt, BPF_PLT_SIZE);
+		if (err < 0)
+			return err;
+		bpf_jit_plt(expected_plt, (char *)ip + 6, old_addr);
+		if (memcmp(current_plt, expected_plt, BPF_PLT_SIZE))
+			return -EINVAL;
+		/* Adjust the call address. */
+		s390_kernel_write(plt + (bpf_plt_target - bpf_plt),
+				  &new_addr, sizeof(void *));
+	}
+
+	/* Adjust the mask of the branch. */
+	insn.opc = 0xc004 | (new_addr ? 0xf0 : 0);
+	s390_kernel_write((char *)ip + 1, (char *)&insn.opc + 1, 1);
+
+	/* Make the new code visible to the other CPUs. */
+	text_poke_sync_lock();
+
+	return 0;
+}
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH bpf-next 21/24] bpf: btf: Add BTF_FMODEL_SIGNED_ARG flag
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (19 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 20/24] s390/bpf: Implement bpf_arch_text_poke() Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-25 21:38 ` [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline() Ilya Leoshkevich
                   ` (3 subsequent siblings)
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

s390x eBPF JIT needs to know whether a function return value is signed
and which function arguments are signed, in order to generate code
compliant with the s390x ABI.
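The signedness information comes straight from BTF: a BTF_KIND_INT type is followed by a u32 whose encoding bits carry BTF_INT_SIGNED. A minimal user-space sketch of the check the patch adds; the macro values match include/uapi/linux/btf.h:

```c
#include <assert.h>
#include <stdint.h>

/* From include/uapi/linux/btf.h */
#define BTF_INT_ENCODING(val)	(((val) & 0x0f000000) >> 24)
#define BTF_INT_SIGNED		(1 << 0)

/* int_data is the u32 that follows struct btf_type for BTF_KIND_INT. */
static int btf_int_is_signed(uint32_t int_data)
{
	return (BTF_INT_ENCODING(int_data) & BTF_INT_SIGNED) != 0;
}
```

btf_distill_func_proto() can then record this per argument and for the return value, which is what the new BTF_FMODEL_SIGNED_ARG flag carries to the JIT.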

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 include/linux/bpf.h |  4 ++++
 include/linux/btf.h | 15 ++++++++++-----
 kernel/bpf/btf.c    | 16 +++++++++++++++-
 3 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 14a0264fac57..cf89504c8dda 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -899,8 +899,12 @@ enum bpf_cgroup_storage_type {
 /* The argument is a structure. */
 #define BTF_FMODEL_STRUCT_ARG		BIT(0)
 
+/* The argument is signed. */
+#define BTF_FMODEL_SIGNED_ARG		BIT(1)
+
 struct btf_func_model {
 	u8 ret_size;
+	u8 ret_flags;
 	u8 nr_args;
 	u8 arg_size[MAX_BPF_FUNC_ARGS];
 	u8 arg_flags[MAX_BPF_FUNC_ARGS];
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 5f628f323442..e9b90d9c3569 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -236,6 +236,16 @@ static inline bool btf_type_is_small_int(const struct btf_type *t)
 	return btf_type_is_int(t) && t->size <= sizeof(u64);
 }
 
+static inline u8 btf_int_encoding(const struct btf_type *t)
+{
+	return BTF_INT_ENCODING(*(u32 *)(t + 1));
+}
+
+static inline bool btf_type_is_signed_int(const struct btf_type *t)
+{
+	return btf_type_is_int(t) && (btf_int_encoding(t) & BTF_INT_SIGNED);
+}
+
 static inline bool btf_type_is_enum(const struct btf_type *t)
 {
 	return BTF_INFO_KIND(t->info) == BTF_KIND_ENUM;
@@ -306,11 +316,6 @@ static inline u8 btf_int_offset(const struct btf_type *t)
 	return BTF_INT_OFFSET(*(u32 *)(t + 1));
 }
 
-static inline u8 btf_int_encoding(const struct btf_type *t)
-{
-	return BTF_INT_ENCODING(*(u32 *)(t + 1));
-}
-
 static inline bool btf_type_is_scalar(const struct btf_type *t)
 {
 	return btf_type_is_int(t) || btf_type_is_enum(t);
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 47b8cb96f2c2..1622a3b15d6f 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6453,6 +6453,18 @@ static int __get_type_size(struct btf *btf, u32 btf_id,
 	return -EINVAL;
 }
 
+static u8 __get_type_fmodel_flags(const struct btf_type *t)
+{
+	u8 flags = 0;
+
+	if (__btf_type_is_struct(t))
+		flags |= BTF_FMODEL_STRUCT_ARG;
+	if (btf_type_is_signed_int(t))
+		flags |= BTF_FMODEL_SIGNED_ARG;
+
+	return flags;
+}
+
 int btf_distill_func_proto(struct bpf_verifier_log *log,
 			   struct btf *btf,
 			   const struct btf_type *func,
@@ -6473,6 +6485,7 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
 			m->arg_flags[i] = 0;
 		}
 		m->ret_size = 8;
+		m->ret_flags = 0;
 		m->nr_args = MAX_BPF_FUNC_REG_ARGS;
 		return 0;
 	}
@@ -6492,6 +6505,7 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
 		return -EINVAL;
 	}
 	m->ret_size = ret;
+	m->ret_flags = __get_type_fmodel_flags(t);
 
 	for (i = 0; i < nargs; i++) {
 		if (i == nargs - 1 && args[i].type == 0) {
@@ -6516,7 +6530,7 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
 			return -EINVAL;
 		}
 		m->arg_size[i] = ret;
-		m->arg_flags[i] = __btf_type_is_struct(t) ? BTF_FMODEL_STRUCT_ARG : 0;
+		m->arg_flags[i] = __get_type_fmodel_flags(t);
 	}
 	m->nr_args = nargs;
 	return 0;
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline()
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (20 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 21/24] bpf: btf: Add BTF_FMODEL_SIGNED_ARG flag Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-26  1:15   ` Andrii Nakryiko
                     ` (2 more replies)
  2023-01-25 21:38 ` [PATCH bpf-next 23/24] s390/bpf: Implement bpf_jit_supports_subprog_tailcalls() Ilya Leoshkevich
                   ` (2 subsequent siblings)
  24 siblings, 3 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

arch_prepare_bpf_trampoline() is used for direct attachment of eBPF
programs to various places, bypassing kprobes. It's responsible for
calling a number of eBPF programs before, instead of, and/or after
whatever they are attached to.

Add a s390x implementation, paying attention to the following:

- Reuse the existing JIT infrastructure, where possible.
- Like the existing JIT, prefer making multiple passes instead of
  backpatching. Currently 2 passes is enough. If literal pool is
  introduced, this needs to be raised to 3. However, at the moment
  adding literal pool only makes the code larger. If branch
  shortening is introduced, the number of passes needs to be
  increased even further.
- Support both regular and ftrace calling conventions, depending on
  the trampoline flags.
- Use expolines for indirect calls.
- Handle the mismatch between the eBPF and the s390x ABIs.
- Sign-extend fmod_ret return values.

invoke_bpf_prog() produces about 120 bytes; it might be possible to
slightly optimize this, but reaching 50 bytes, like on x86_64, looks
unrealistic: just loading cookie, __bpf_prog_enter, bpf_func, insnsi
and __bpf_prog_exit as literals already takes at least 5 * 12 = 60
bytes, and we can't use relative addressing for most of them.
Therefore, lower BPF_MAX_TRAMP_LINKS on s390x.
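The ABI-mismatch handling largely boils down to widening sub-64-bit values: the trampoline emits lgbr/lghr/lgfr depending on the argument size (see sign_extend() in the diff below the Signed-off-by). A C model of what those instructions compute; this is a sketch of the instruction semantics, while the real code emits machine code rather than running C:

```c
#include <assert.h>

/* Models the effect of the emitted instructions:
 *   size 1 -> lgbr (sign-extend byte),  size 2 -> lghr (halfword),
 *   size 4 -> lgfr (word),              size 8 -> no-op.
 */
static long sign_extend_model(unsigned long reg, int size)
{
	switch (size) {
	case 1: return (signed char)reg;
	case 2: return (short)reg;
	case 4: return (int)reg;
	default: return (long)reg;
	}
}
```

This widening is applied both to signed arguments before calling func_addr and to fmod_ret return values, per the commit message above.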

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 arch/s390/net/bpf_jit_comp.c | 535 +++++++++++++++++++++++++++++++++--
 include/linux/bpf.h          |   4 +
 2 files changed, 517 insertions(+), 22 deletions(-)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index c72eb3fc1f98..ea8203bd4112 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -71,6 +71,10 @@ struct bpf_jit {
 #define REG_0		REG_W0			/* Register 0 */
 #define REG_1		REG_W1			/* Register 1 */
 #define REG_2		BPF_REG_1		/* Register 2 */
+#define REG_3		BPF_REG_2		/* Register 3 */
+#define REG_4		BPF_REG_3		/* Register 4 */
+#define REG_7		BPF_REG_6		/* Register 7 */
+#define REG_8		BPF_REG_7		/* Register 8 */
 #define REG_14		BPF_REG_0		/* Register 14 */
 
 /*
@@ -595,6 +599,43 @@ static void bpf_jit_prologue(struct bpf_jit *jit, u32 stack_depth)
 	}
 }
 
+/*
+ * Emit an expoline for a jump that follows
+ */
+static void emit_expoline(struct bpf_jit *jit)
+{
+	/* exrl %r0,.+10 */
+	EMIT6_PCREL_RIL(0xc6000000, jit->prg + 10);
+	/* j . */
+	EMIT4_PCREL(0xa7f40000, 0);
+}
+
+/*
+ * Emit __s390_indirect_jump_r1 thunk if necessary
+ */
+static void emit_r1_thunk(struct bpf_jit *jit)
+{
+	if (nospec_uses_trampoline()) {
+		jit->r1_thunk_ip = jit->prg;
+		emit_expoline(jit);
+		/* br %r1 */
+		_EMIT2(0x07f1);
+	}
+}
+
+/*
+ * Call r1 either directly or via __s390_indirect_jump_r1 thunk
+ */
+static void call_r1(struct bpf_jit *jit)
+{
+	if (nospec_uses_trampoline())
+		/* brasl %r14,__s390_indirect_jump_r1 */
+		EMIT6_PCREL_RILB(0xc0050000, REG_14, jit->r1_thunk_ip);
+	else
+		/* basr %r14,%r1 */
+		EMIT2(0x0d00, REG_14, REG_1);
+}
+
 /*
  * Function epilogue
  */
@@ -608,25 +649,13 @@ static void bpf_jit_epilogue(struct bpf_jit *jit, u32 stack_depth)
 	if (nospec_uses_trampoline()) {
 		jit->r14_thunk_ip = jit->prg;
 		/* Generate __s390_indirect_jump_r14 thunk */
-		/* exrl %r0,.+10 */
-		EMIT6_PCREL_RIL(0xc6000000, jit->prg + 10);
-		/* j . */
-		EMIT4_PCREL(0xa7f40000, 0);
+		emit_expoline(jit);
 	}
 	/* br %r14 */
 	_EMIT2(0x07fe);
 
-	if ((nospec_uses_trampoline()) &&
-	    (is_first_pass(jit) || (jit->seen & SEEN_FUNC))) {
-		jit->r1_thunk_ip = jit->prg;
-		/* Generate __s390_indirect_jump_r1 thunk */
-		/* exrl %r0,.+10 */
-		EMIT6_PCREL_RIL(0xc6000000, jit->prg + 10);
-		/* j . */
-		EMIT4_PCREL(0xa7f40000, 0);
-		/* br %r1 */
-		_EMIT2(0x07f1);
-	}
+	if (is_first_pass(jit) || (jit->seen & SEEN_FUNC))
+		emit_r1_thunk(jit);
 
 	jit->prg = ALIGN(jit->prg, 8);
 	jit->prologue_plt = jit->prg;
@@ -707,6 +736,34 @@ static int bpf_jit_probe_mem(struct bpf_jit *jit, struct bpf_prog *fp,
 	return 0;
 }
 
+/*
+ * Sign-extend the register if necessary
+ */
+static int sign_extend(struct bpf_jit *jit, int r, u8 size, u8 flags)
+{
+	if (!(flags & BTF_FMODEL_SIGNED_ARG))
+		return 0;
+
+	switch (size) {
+	case 1:
+		/* lgbr %r,%r */
+		EMIT4(0xb9060000, r, r);
+		return 0;
+	case 2:
+		/* lghr %r,%r */
+		EMIT4(0xb9070000, r, r);
+		return 0;
+	case 4:
+		/* lgfr %r,%r */
+		EMIT4(0xb9140000, r, r);
+		return 0;
+	case 8:
+		return 0;
+	default:
+		return -1;
+	}
+}
+
 /*
  * Compile one eBPF instruction into s390x code
  *
@@ -1355,13 +1412,8 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,
 		jit->seen |= SEEN_FUNC;
 		/* lgrl %w1,func */
 		EMIT6_PCREL_RILB(0xc4080000, REG_W1, _EMIT_CONST_U64(func));
-		if (nospec_uses_trampoline()) {
-			/* brasl %r14,__s390_indirect_jump_r1 */
-			EMIT6_PCREL_RILB(0xc0050000, REG_14, jit->r1_thunk_ip);
-		} else {
-			/* basr %r14,%w1 */
-			EMIT2(0x0d00, REG_14, REG_W1);
-		}
+		/* %r1() */
+		call_r1(jit);
 		/* lgr %b0,%r2: load return value into %b0 */
 		EMIT4(0xb9040000, BPF_REG_0, REG_2);
 		break;
@@ -1964,3 +2016,442 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 
 	return 0;
 }
+
+struct bpf_tramp_jit {
+	struct bpf_jit common;
+	int orig_stack_args_off;/* Offset of arguments placed on stack by the
+				 * func_addr's original caller
+				 */
+	int stack_size;		/* Trampoline stack size */
+	int stack_args_off;	/* Offset of stack arguments for calling
+				 * func_addr, has to be at the top
+				 */
+	int reg_args_off;	/* Offset of register arguments for calling
+				 * func_addr
+				 */
+	int ip_off;		/* For bpf_get_func_ip(), has to be at
+				 * (ctx - 16)
+				 */
+	int arg_cnt_off;	/* For bpf_get_func_arg_cnt(), has to be at
+				 * (ctx - 8)
+				 */
+	int bpf_args_off;	/* Offset of BPF_PROG context, which consists
+				 * of BPF arguments followed by return value
+				 */
+	int retval_off;		/* Offset of return value (see above) */
+	int r7_r8_off;		/* Offset of saved %r7 and %r8, which are used
+				 * for __bpf_prog_enter() return value and
+				 * func_addr respectively
+				 */
+	int r14_off;		/* Offset of saved %r14 */
+	int run_ctx_off;	/* Offset of struct bpf_tramp_run_ctx */
+	int do_fexit;		/* do_fexit: label */
+};
+
+static void load_imm64(struct bpf_jit *jit, int dst_reg, u64 val)
+{
+	/* llihf %dst_reg,val_hi */
+	EMIT6_IMM(0xc00e0000, dst_reg, (val >> 32));
+	/* oilf %dst_reg,val_lo */
+	EMIT6_IMM(0xc00d0000, dst_reg, val);
+}
+
+static void invoke_bpf_prog(struct bpf_tramp_jit *tjit,
+			    const struct btf_func_model *m,
+			    struct bpf_tramp_link *tlink, bool save_ret)
+{
+	struct bpf_jit *jit = &tjit->common;
+	int cookie_off = tjit->run_ctx_off +
+			 offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
+	struct bpf_prog *p = tlink->link.prog;
+	int patch;
+
+	/*
+	 * run_ctx.cookie = tlink->cookie;
+	 */
+
+	/* %r0 = tlink->cookie */
+	load_imm64(jit, REG_W0, tlink->cookie);
+	/* stg %r0,cookie_off(%r15) */
+	EMIT6_DISP_LH(0xe3000000, 0x0024, REG_W0, REG_0, REG_15, cookie_off);
+
+	/*
+	 * if ((start = __bpf_prog_enter(p, &run_ctx)) == 0)
+	 *         goto skip;
+	 */
+
+	/* %r1 = __bpf_prog_enter */
+	load_imm64(jit, REG_1, (u64)bpf_trampoline_enter(p));
+	/* %r2 = p */
+	load_imm64(jit, REG_2, (u64)p);
+	/* la %r3,run_ctx_off(%r15) */
+	EMIT4_DISP(0x41000000, REG_3, REG_15, tjit->run_ctx_off);
+	/* %r1() */
+	call_r1(jit);
+	/* ltgr %r7,%r2 */
+	EMIT4(0xb9020000, REG_7, REG_2);
+	/* brcl 8,skip */
+	patch = jit->prg;
+	EMIT6_PCREL_RILC(0xc0040000, 8, 0);
+
+	/*
+	 * retval = bpf_func(args, p->insnsi);
+	 */
+
+	/* %r1 = p->bpf_func */
+	load_imm64(jit, REG_1, (u64)p->bpf_func);
+	/* la %r2,bpf_args_off(%r15) */
+	EMIT4_DISP(0x41000000, REG_2, REG_15, tjit->bpf_args_off);
+	/* %r3 = p->insnsi */
+	if (!p->jited)
+		load_imm64(jit, REG_3, (u64)p->insnsi);
+	/* %r1() */
+	call_r1(jit);
+	/* stg %r2,retval_off(%r15) */
+	if (save_ret) {
+		sign_extend(jit, REG_2, m->ret_size, m->ret_flags);
+		EMIT6_DISP_LH(0xe3000000, 0x0024, REG_2, REG_0, REG_15,
+			      tjit->retval_off);
+	}
+
+	/* skip: */
+	if (jit->prg_buf)
+		*(u32 *)&jit->prg_buf[patch + 2] = (jit->prg - patch) >> 1;
+
+	/*
+	 * __bpf_prog_exit(p, start, &run_ctx);
+	 */
+
+	/* %r1 = __bpf_prog_exit */
+	load_imm64(jit, REG_1, (u64)bpf_trampoline_exit(p));
+	/* %r2 = p */
+	load_imm64(jit, REG_2, (u64)p);
+	/* lgr %r3,%r7 */
+	EMIT4(0xb9040000, REG_3, REG_7);
+	/* la %r4,run_ctx_off(%r15) */
+	EMIT4_DISP(0x41000000, REG_4, REG_15, tjit->run_ctx_off);
+	/* %r1() */
+	call_r1(jit);
+}
+
+static int alloc_stack(struct bpf_tramp_jit *tjit, size_t size)
+{
+	int stack_offset = tjit->stack_size;
+
+	tjit->stack_size += size;
+	return stack_offset;
+}
+
+/* ABI uses %r2 - %r6 for parameter passing. */
+#define MAX_NR_REG_ARGS 5
+
+/* The "L" field of the "mvc" instruction is 8 bits. */
+#define MAX_MVC_SIZE 256
+#define MAX_NR_STACK_ARGS (MAX_MVC_SIZE / sizeof(u64))
+
+/* -mfentry generates a 6-byte nop on s390x. */
+#define S390X_PATCH_SIZE 6
+
+int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
+				  struct bpf_tramp_jit *tjit,
+				  const struct btf_func_model *m,
+				  u32 flags, struct bpf_tramp_links *tlinks,
+				  void *func_addr)
+{
+	struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
+	struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
+	struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
+	int nr_bpf_args, nr_reg_args, nr_stack_args;
+	struct bpf_jit *jit = &tjit->common;
+	int arg, bpf_arg_off;
+	int i, j;
+
+	/* Support as many stack arguments as "mvc" instruction can handle. */
+	nr_reg_args = min_t(int, m->nr_args, MAX_NR_REG_ARGS);
+	nr_stack_args = m->nr_args - nr_reg_args;
+	if (nr_stack_args > MAX_NR_STACK_ARGS)
+		return -ENOTSUPP;
+
+	/* Return to %r14, since func_addr and %r0 are not available. */
+	if (!func_addr && !(flags & BPF_TRAMP_F_ORIG_STACK))
+		flags |= BPF_TRAMP_F_SKIP_FRAME;
+
+	/*
+	 * Compute how many arguments we need to pass to BPF programs.
+	 * BPF ABI mirrors that of x86_64: arguments that are 16 bytes or
+	 * smaller are packed into 1 or 2 registers; larger arguments are
+	 * passed via pointers.
+	 * In s390x ABI, arguments that are 8 bytes or smaller are packed into
+	 * a register; larger arguments are passed via pointers.
+	 * We need to deal with this difference.
+	 */
+	nr_bpf_args = 0;
+	for (i = 0; i < m->nr_args; i++) {
+		if (m->arg_size[i] <= 8)
+			nr_bpf_args += 1;
+		else if (m->arg_size[i] <= 16)
+			nr_bpf_args += 2;
+		else
+			return -ENOTSUPP;
+	}
+
+	/*
+	 * Calculate the stack layout.
+	 */
+
+	/* Reserve STACK_FRAME_OVERHEAD bytes for the callees. */
+	tjit->stack_size = STACK_FRAME_OVERHEAD;
+	tjit->stack_args_off = alloc_stack(tjit, nr_stack_args * sizeof(u64));
+	tjit->reg_args_off = alloc_stack(tjit, nr_reg_args * sizeof(u64));
+	tjit->ip_off = alloc_stack(tjit, sizeof(u64));
+	tjit->arg_cnt_off = alloc_stack(tjit, sizeof(u64));
+	tjit->bpf_args_off = alloc_stack(tjit, nr_bpf_args * sizeof(u64));
+	tjit->retval_off = alloc_stack(tjit, sizeof(u64));
+	tjit->r7_r8_off = alloc_stack(tjit, 2 * sizeof(u64));
+	tjit->r14_off = alloc_stack(tjit, sizeof(u64));
+	tjit->run_ctx_off = alloc_stack(tjit,
+					sizeof(struct bpf_tramp_run_ctx));
+	/* The caller has already reserved STACK_FRAME_OVERHEAD bytes. */
+	tjit->stack_size -= STACK_FRAME_OVERHEAD;
+	tjit->orig_stack_args_off = tjit->stack_size + STACK_FRAME_OVERHEAD;
+
+	/* aghi %r15,-stack_size */
+	EMIT4_IMM(0xa70b0000, REG_15, -tjit->stack_size);
+	/* stmg %r2,%rN,reg_args_off(%r15) */
+	if (nr_reg_args)
+		EMIT6_DISP_LH(0xeb000000, 0x0024, REG_2,
+			      REG_2 + (nr_reg_args - 1), REG_15,
+			      tjit->reg_args_off);
+	for (i = 0, j = 0; i < m->nr_args; i++) {
+		if (i < MAX_NR_REG_ARGS)
+			arg = REG_2 + i;
+		else
+			arg = tjit->orig_stack_args_off +
+			      (i - MAX_NR_REG_ARGS) * sizeof(u64);
+		bpf_arg_off = tjit->bpf_args_off + j * sizeof(u64);
+		if (m->arg_size[i] <= 8) {
+			if (i < MAX_NR_REG_ARGS)
+				/* stg %arg,bpf_arg_off(%r15) */
+				EMIT6_DISP_LH(0xe3000000, 0x0024, arg,
+					      REG_0, REG_15, bpf_arg_off);
+			else
+				/* mvc bpf_arg_off(8,%r15),arg(%r15) */
+				_EMIT6(0xd207f000 | bpf_arg_off,
+				       0xf000 | arg);
+			j += 1;
+		} else {
+			if (i < MAX_NR_REG_ARGS) {
+				/* mvc bpf_arg_off(16,%r15),0(%arg) */
+				_EMIT6(0xd20ff000 | bpf_arg_off,
+				       reg2hex[arg] << 12);
+			} else {
+				/* lg %r1,arg(%r15) */
+				EMIT6_DISP_LH(0xe3000000, 0x0004, REG_1, REG_0,
+					      REG_15, arg);
+				/* mvc bpf_arg_off(16,%r15),0(%r1) */
+				_EMIT6(0xd20ff000 | bpf_arg_off, 0x1000);
+			}
+			j += 2;
+		}
+	}
+	/* stmg %r7,%r8,r7_r8_off(%r15) */
+	EMIT6_DISP_LH(0xeb000000, 0x0024, REG_7, REG_8, REG_15,
+		      tjit->r7_r8_off);
+	/* stg %r14,r14_off(%r15) */
+	EMIT6_DISP_LH(0xe3000000, 0x0024, REG_14, REG_0, REG_15, tjit->r14_off);
+
+	if (flags & BPF_TRAMP_F_ORIG_STACK) {
+		/*
+		 * The ftrace trampoline puts the return address (which is the
+		 * address of the original function + S390X_PATCH_SIZE) into
+		 * %r0; see ftrace_shared_hotpatch_trampoline_br and
+		 * ftrace_init_nop() for details.
+		 */
+
+		/* lgr %r8,%r0 */
+		EMIT4(0xb9040000, REG_8, REG_0);
+	} else {
+		/* %r8 = func_addr + S390X_PATCH_SIZE */
+		load_imm64(jit, REG_8, (u64)func_addr + S390X_PATCH_SIZE);
+	}
+
+	/*
+	 * ip = func_addr;
+	 * arg_cnt = m->nr_args;
+	 */
+
+	if (flags & BPF_TRAMP_F_IP_ARG) {
+		/* %r0 = func_addr */
+		load_imm64(jit, REG_0, (u64)func_addr);
+		/* stg %r0,ip_off(%r15) */
+		EMIT6_DISP_LH(0xe3000000, 0x0024, REG_0, REG_0, REG_15,
+			      tjit->ip_off);
+	}
+	/* lghi %r0,nr_bpf_args */
+	EMIT4_IMM(0xa7090000, REG_0, nr_bpf_args);
+	/* stg %r0,arg_cnt_off(%r15) */
+	EMIT6_DISP_LH(0xe3000000, 0x0024, REG_0, REG_0, REG_15,
+		      tjit->arg_cnt_off);
+
+	if (flags & BPF_TRAMP_F_CALL_ORIG) {
+		/*
+		 * __bpf_tramp_enter(im);
+		 */
+
+		/* %r1 = __bpf_tramp_enter */
+		load_imm64(jit, REG_1, (u64)__bpf_tramp_enter);
+		/* %r2 = im */
+		load_imm64(jit, REG_2, (u64)im);
+		/* %r1() */
+		call_r1(jit);
+	}
+
+	for (i = 0; i < fentry->nr_links; i++)
+		invoke_bpf_prog(tjit, m, fentry->links[i],
+				flags & BPF_TRAMP_F_RET_FENTRY_RET);
+
+	if (fmod_ret->nr_links) {
+		/*
+		 * retval = 0;
+		 */
+
+		/* xc retval_off(8,%r15),retval_off(%r15) */
+		_EMIT6(0xd707f000 | tjit->retval_off,
+		       0xf000 | tjit->retval_off);
+
+		for (i = 0; i < fmod_ret->nr_links; i++) {
+			invoke_bpf_prog(tjit, m, fmod_ret->links[i], true);
+
+			/*
+			 * if (retval)
+			 *         goto do_fexit;
+			 */
+
+			/* ltg %r0,retval_off(%r15) */
+			EMIT6_DISP_LH(0xe3000000, 0x0002, REG_0, REG_0, REG_15,
+				      tjit->retval_off);
+			/* brcl 7,do_fexit */
+			EMIT6_PCREL_RILC(0xc0040000, 7, tjit->do_fexit);
+		}
+	}
+
+	if (flags & BPF_TRAMP_F_CALL_ORIG) {
+		/*
+		 * retval = func_addr(args);
+		 */
+
+		/* lmg %r2,%rN,reg_args_off(%r15) */
+		if (nr_reg_args)
+			EMIT6_DISP_LH(0xeb000000, 0x0004, REG_2,
+				      REG_2 + (nr_reg_args - 1), REG_15,
+				      tjit->reg_args_off);
+		/* mvc stack_args_off(N,%r15),orig_stack_args_off(%r15) */
+		if (nr_stack_args)
+			_EMIT6(0xd200f000 |
+				       (nr_stack_args * sizeof(u64) - 1) << 16 |
+				       tjit->stack_args_off,
+			       0xf000 | tjit->orig_stack_args_off);
+		/* lgr %r1,%r8 */
+		EMIT4(0xb9040000, REG_1, REG_8);
+		/* %r1() */
+		call_r1(jit);
+		/* stg %r2,retval_off(%r15) */
+		EMIT6_DISP_LH(0xe3000000, 0x0024, REG_2, REG_0, REG_15,
+			      tjit->retval_off);
+
+		im->ip_after_call = jit->prg_buf + jit->prg;
+
+		/*
+		 * The following nop will be patched by bpf_tramp_image_put().
+		 */
+
+		/* brcl 0,im->ip_epilogue */
+		EMIT6_PCREL_RILC(0xc0040000, 0, (u64)im->ip_epilogue);
+	}
+
+	/* do_fexit: */
+	tjit->do_fexit = jit->prg;
+	for (i = 0; i < fexit->nr_links; i++)
+		invoke_bpf_prog(tjit, m, fexit->links[i], false);
+
+	if (flags & BPF_TRAMP_F_CALL_ORIG) {
+		im->ip_epilogue = jit->prg_buf + jit->prg;
+
+		/*
+		 * __bpf_tramp_exit(im);
+		 */
+
+		/* %r1 = __bpf_tramp_exit */
+		load_imm64(jit, REG_1, (u64)__bpf_tramp_exit);
+		/* %r2 = im */
+		load_imm64(jit, REG_2, (u64)im);
+		/* %r1() */
+		call_r1(jit);
+	}
+
+	/* lmg %r2,%rN,reg_args_off(%r15) */
+	if ((flags & BPF_TRAMP_F_RESTORE_REGS) && nr_reg_args)
+		EMIT6_DISP_LH(0xeb000000, 0x0004, REG_2,
+			      REG_2 + (nr_reg_args - 1), REG_15,
+			      tjit->reg_args_off);
+	/* lgr %r1,%r8 */
+	if (!(flags & BPF_TRAMP_F_SKIP_FRAME))
+		EMIT4(0xb9040000, REG_1, REG_8);
+	/* lmg %r7,%r8,r7_r8_off(%r15) */
+	EMIT6_DISP_LH(0xeb000000, 0x0004, REG_7, REG_8, REG_15,
+		      tjit->r7_r8_off);
+	/* lg %r14,r14_off(%r15) */
+	EMIT6_DISP_LH(0xe3000000, 0x0004, REG_14, REG_0, REG_15, tjit->r14_off);
+	/* lg %r2,retval_off(%r15) */
+	if (flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET))
+		EMIT6_DISP_LH(0xe3000000, 0x0004, REG_2, REG_0, REG_15,
+			      tjit->retval_off);
+	/* aghi %r15,stack_size */
+	EMIT4_IMM(0xa70b0000, REG_15, tjit->stack_size);
+	/* Emit an expoline for the following indirect jump. */
+	if (nospec_uses_trampoline())
+		emit_expoline(jit);
+	if (flags & BPF_TRAMP_F_SKIP_FRAME)
+		/* br %r14 */
+		_EMIT2(0x07fe);
+	else
+		/* br %r1 */
+		_EMIT2(0x07f1);
+
+	emit_r1_thunk(jit);
+
+	return 0;
+}
+
+int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image,
+				void *image_end, const struct btf_func_model *m,
+				u32 flags, struct bpf_tramp_links *tlinks,
+				void *func_addr)
+{
+	struct bpf_tramp_jit tjit;
+	int ret;
+	int i;
+
+	for (i = 0; i < 2; i++) {
+		if (i == 0) {
+			/* Compute offsets, check whether the code fits. */
+			memset(&tjit, 0, sizeof(tjit));
+		} else {
+			/* Generate the code. */
+			tjit.common.prg = 0;
+			tjit.common.prg_buf = image;
+		}
+		ret = __arch_prepare_bpf_trampoline(im, &tjit, m, flags,
+						    tlinks, func_addr);
+		if (ret < 0)
+			return ret;
+		if (tjit.common.prg > (char *)image_end - (char *)image)
+			/*
+			 * Use the same error code as for exceeding
+			 * BPF_MAX_TRAMP_LINKS.
+			 */
+			return -E2BIG;
+	}
+
+	return ret;
+}
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index cf89504c8dda..52ff43bbf996 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -943,7 +943,11 @@ struct btf_func_model {
 /* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
  * bytes on x86.
  */
+#if defined(__s390x__)
+#define BPF_MAX_TRAMP_LINKS 27
+#else
 #define BPF_MAX_TRAMP_LINKS 38
+#endif
 
 struct bpf_tramp_links {
 	struct bpf_tramp_link *links[BPF_MAX_TRAMP_LINKS];
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH bpf-next 23/24] s390/bpf: Implement bpf_jit_supports_subprog_tailcalls()
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (21 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline() Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-25 21:38 ` [PATCH bpf-next 24/24] s390/bpf: Implement bpf_jit_supports_kfunc_call() Ilya Leoshkevich
  2023-01-26  0:45 ` [PATCH bpf-next 00/24] Support bpf trampoline for s390x Andrii Nakryiko
  24 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

Allow mixing subprogs and tail calls by passing the current tail
call count to subprogs.
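
The invariant this preserves can be modeled in plain C (a sketch, not
the JIT code; MAX_TAIL_CALL_CNT matches the kernel's current limit of
33): the main program clears the counter on entry, and each subprog
inherits the caller's current value instead of resetting it, so a mixed
subprog/tail-call chain stays bounded globally.

```c
#include <assert.h>

#define MAX_TAIL_CALL_CNT 33

/* One counter per program invocation, modeled as an explicit
 * parameter: subprogs receive the caller's current value rather than
 * starting a fresh one. */
static int do_tail_call(int *tail_call_cnt)
{
	if (*tail_call_cnt >= MAX_TAIL_CALL_CNT)
		return -1;	/* tail call refused, fall through */
	(*tail_call_cnt)++;
	return 0;		/* tail call taken */
}

static int subprog(int *tail_call_cnt)
{
	/* The subprog uses the counter it was handed. */
	return do_tail_call(tail_call_cnt);
}

static int run_main_prog(void)
{
	int tail_call_cnt = 0;	/* cleared on entry to the main program */
	int taken = 0;

	/* Even when tail calls are issued from subprogs, the shared
	 * counter caps the total chain length. */
	while (subprog(&tail_call_cnt) == 0)
		taken++;
	return taken;
}
```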

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 arch/s390/net/bpf_jit_comp.c | 37 ++++++++++++++++++++++++++----------
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index ea8203bd4112..f4cdecb32629 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -58,7 +58,6 @@ struct bpf_jit {
 #define SEEN_MEM	BIT(0)		/* use mem[] for temporary storage */
 #define SEEN_LITERAL	BIT(1)		/* code uses literals */
 #define SEEN_FUNC	BIT(2)		/* calls C functions */
-#define SEEN_TAIL_CALL	BIT(3)		/* code uses tail calls */
 #define SEEN_STACK	(SEEN_FUNC | SEEN_MEM)
 
 /*
@@ -549,20 +548,23 @@ static void bpf_jit_plt(void *plt, void *ret, void *target)
  * Save registers and create stack frame if necessary.
  * See stack frame layout description in "bpf_jit.h"!
  */
-static void bpf_jit_prologue(struct bpf_jit *jit, u32 stack_depth)
+static void bpf_jit_prologue(struct bpf_jit *jit, struct bpf_prog *fp,
+			     u32 stack_depth)
 {
 	/* No-op for hotpatching */
 	/* brcl 0,prologue_plt */
 	EMIT6_PCREL_RILC(0xc0040000, 0, jit->prologue_plt);
 	jit->prologue_plt_ret = jit->prg;
 
-	if (jit->seen & SEEN_TAIL_CALL) {
+	if (fp->aux->func_idx == 0) {
+		/* Initialize the tail call counter in the main program. */
 		/* xc STK_OFF_TCCNT(4,%r15),STK_OFF_TCCNT(%r15) */
 		_EMIT6(0xd703f000 | STK_OFF_TCCNT, 0xf000 | STK_OFF_TCCNT);
 	} else {
 		/*
-		 * There are no tail calls. Insert nops in order to have
-		 * tail_call_start at a predictable offset.
+		 * Skip the tail call counter initialization in subprograms.
+		 * Insert nops in order to have tail_call_start at a
+		 * predictable offset.
 		 */
 		bpf_skip(jit, 6);
 	}
@@ -1410,6 +1412,19 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,
 
 		REG_SET_SEEN(BPF_REG_5);
 		jit->seen |= SEEN_FUNC;
+		/*
+		 * Copy the tail call counter to where the callee expects it.
+		 *
+		 * Note 1: The callee can increment the tail call counter, but
+		 * we do not load it back, since the x86 JIT does not do this
+		 * either.
+		 *
+		 * Note 2: We assume that the verifier does not let us call the
+		 * main program, which clears the tail call counter on entry.
+		 */
+		/* mvc STK_OFF_TCCNT(4,%r15),N(%r15) */
+		_EMIT6(0xd203f000 | STK_OFF_TCCNT,
+		       0xf000 | (STK_OFF_TCCNT + STK_OFF + stack_depth));
 		/* lgrl %w1,func */
 		EMIT6_PCREL_RILB(0xc4080000, REG_W1, _EMIT_CONST_U64(func));
 		/* %r1() */
@@ -1426,10 +1441,7 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,
 		 *  B1: pointer to ctx
 		 *  B2: pointer to bpf_array
 		 *  B3: index in bpf_array
-		 */
-		jit->seen |= SEEN_TAIL_CALL;
-
-		/*
+		 *
 		 * if (index >= array->map.max_entries)
 		 *         goto out;
 		 */
@@ -1793,7 +1805,7 @@ static int bpf_jit_prog(struct bpf_jit *jit, struct bpf_prog *fp,
 	jit->prg = 0;
 	jit->excnt = 0;
 
-	bpf_jit_prologue(jit, stack_depth);
+	bpf_jit_prologue(jit, fp, stack_depth);
 	if (bpf_set_addr(jit, 0) < 0)
 		return -1;
 	for (i = 0; i < fp->len; i += insn_count) {
@@ -2455,3 +2467,8 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image,
 
 	return ret;
 }
+
+bool bpf_jit_supports_subprog_tailcalls(void)
+{
+	return true;
+}
-- 
2.39.1


* [PATCH bpf-next 24/24] s390/bpf: Implement bpf_jit_supports_kfunc_call()
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (22 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 23/24] s390/bpf: Implement bpf_jit_supports_subprog_tailcalls() Ilya Leoshkevich
@ 2023-01-25 21:38 ` Ilya Leoshkevich
  2023-01-26  1:28   ` Alexei Starovoitov
  2023-01-26  0:45 ` [PATCH bpf-next 00/24] Support bpf trampoline for s390x Andrii Nakryiko
  24 siblings, 1 reply; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-25 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

Implement calling kernel functions from eBPF. In general, the eBPF ABI
is fairly close to that of s390x, with one important difference: on
s390x callers should sign-extend signed arguments. Handle that by using
information returned by bpf_jit_find_kfunc_model().

At the moment bpf_jit_find_kfunc_model() does not seem to play nicely
with XDP metadata functions: add_kfunc_call() adds an "abstract" bpf_*()
version to kfunc_btf_tab, but then fixup_kfunc_call() puts the concrete
version into insn->imm, which bpf_jit_find_kfunc_model() cannot find.
But this seems to be a problem in the common (arch-independent) code.
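
The ABI difference can be illustrated in plain C (a sketch of what the
JIT's sign_extend() helper achieves with lgbr/lghr/lgfr; not the kernel
code itself): an eBPF caller may leave the upper register bits of a
narrow signed argument unextended, while an s390x callee compiled from
C expects the full sign-extended 64-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a narrow signed argument to 64 bits with sign extension,
 * mirroring what lgbr (8-bit), lghr (16-bit) and lgfr (32-bit) do on
 * s390x before the kfunc call. */
static int64_t sign_extend64(uint64_t reg, int size)
{
	switch (size) {
	case 1: return (int8_t)reg;	/* lgbr %r,%r */
	case 2: return (int16_t)reg;	/* lghr %r,%r */
	case 4: return (int32_t)reg;	/* lgfr %r,%r */
	case 8: return (int64_t)reg;	/* already full width */
	default: return 0;		/* unsupported size */
	}
}
```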

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 arch/s390/net/bpf_jit_comp.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index f4cdecb32629..74b38e817bd8 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -1401,9 +1401,10 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,
 	 */
 	case BPF_JMP | BPF_CALL:
 	{
-		u64 func;
+		const struct btf_func_model *m;
 		bool func_addr_fixed;
-		int ret;
+		int j, ret;
+		u64 func;
 
 		ret = bpf_jit_get_func_addr(fp, insn, extra_pass,
 					    &func, &func_addr_fixed);
@@ -1425,6 +1426,21 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,
 		/* mvc STK_OFF_TCCNT(4,%r15),N(%r15) */
 		_EMIT6(0xd203f000 | STK_OFF_TCCNT,
 		       0xf000 | (STK_OFF_TCCNT + STK_OFF + stack_depth));
+
+		/* Sign-extend the kfunc arguments. */
+		if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
+			m = bpf_jit_find_kfunc_model(fp, insn);
+			if (!m)
+				return -1;
+
+			for (j = 0; j < m->nr_args; j++) {
+				if (sign_extend(jit, BPF_REG_1 + j,
+						m->arg_size[j],
+						m->arg_flags[j]))
+					return -1;
+			}
+		}
+
 		/* lgrl %w1,func */
 		EMIT6_PCREL_RILB(0xc4080000, REG_W1, _EMIT_CONST_U64(func));
 		/* %r1() */
@@ -1980,6 +1996,11 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 	return fp;
 }
 
+bool bpf_jit_supports_kfunc_call(void)
+{
+	return true;
+}
+
 int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 		       void *old_addr, void *new_addr)
 {
-- 
2.39.1


* Re: [PATCH bpf-next 17/24] libbpf: Read usdt arg spec with bpf_probe_read_kernel()
  2023-01-25 21:38 ` [PATCH bpf-next 17/24] libbpf: Read usdt arg spec with bpf_probe_read_kernel() Ilya Leoshkevich
@ 2023-01-26  0:26   ` Andrii Nakryiko
  2023-01-26 11:41     ` Ilya Leoshkevich
  0 siblings, 1 reply; 48+ messages in thread
From: Andrii Nakryiko @ 2023-01-26  0:26 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> Loading programs that use bpf_usdt_arg() on s390x fails with:
>
>     ; switch (arg_spec->arg_type) {
>     139: (61) r1 = *(u32 *)(r2 +8)
>     R2 unbounded memory access, make sure to bounds check any such access

Can you show a bit longer log? We shouldn't just use
bpf_probe_read_kernel() for this. I suspect strategically placed
barrier_var() calls will solve this. This is usually an issue with the
compiler reordering operations and doing the actual check after it has
already speculatively adjusted the pointer (which is technically safe
and OK if we never dereference that pointer, but the verifier doesn't
recognize such a pattern).

>
> The bound checks seem to be already in place in the C code, and maybe
> it's even possible to add extra bogus checks to placate LLVM or the
> verifier. Take a simpler approach and just use a helper.
>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---
>  tools/lib/bpf/usdt.bpf.h | 33 ++++++++++++++++++---------------
>  1 file changed, 18 insertions(+), 15 deletions(-)
>
> diff --git a/tools/lib/bpf/usdt.bpf.h b/tools/lib/bpf/usdt.bpf.h
> index fdfd235e52c4..ddfa2521ab67 100644
> --- a/tools/lib/bpf/usdt.bpf.h
> +++ b/tools/lib/bpf/usdt.bpf.h
> @@ -116,7 +116,7 @@ __weak __hidden
>  int bpf_usdt_arg(struct pt_regs *ctx, __u64 arg_num, long *res)
>  {
>         struct __bpf_usdt_spec *spec;
> -       struct __bpf_usdt_arg_spec *arg_spec;
> +       struct __bpf_usdt_arg_spec arg_spec;
>         unsigned long val;
>         int err, spec_id;
>
> @@ -133,21 +133,24 @@ int bpf_usdt_arg(struct pt_regs *ctx, __u64 arg_num, long *res)
>         if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt)
>                 return -ENOENT;
>
> -       arg_spec = &spec->args[arg_num];
> -       switch (arg_spec->arg_type) {
> +       err = bpf_probe_read_kernel(&arg_spec, sizeof(arg_spec), &spec->args[arg_num]);
> +       if (err)
> +               return err;
> +
> +       switch (arg_spec.arg_type) {
>         case BPF_USDT_ARG_CONST:
>                 /* Arg is just a constant ("-4@$-9" in USDT arg spec).
> -                * value is recorded in arg_spec->val_off directly.
> +                * value is recorded in arg_spec.val_off directly.
>                  */
> -               val = arg_spec->val_off;
> +               val = arg_spec.val_off;
>                 break;
>         case BPF_USDT_ARG_REG:
>                 /* Arg is in a register (e.g, "8@%rax" in USDT arg spec),
>                  * so we read the contents of that register directly from
>                  * struct pt_regs. To keep things simple user-space parts
> -                * record offsetof(struct pt_regs, <regname>) in arg_spec->reg_off.
> +                * record offsetof(struct pt_regs, <regname>) in arg_spec.reg_off.
>                  */
> -               err = bpf_probe_read_kernel(&val, sizeof(val), (void *)ctx + arg_spec->reg_off);
> +               err = bpf_probe_read_kernel(&val, sizeof(val), (void *)ctx + arg_spec.reg_off);
>                 if (err)
>                         return err;
>                 break;
> @@ -155,18 +158,18 @@ int bpf_usdt_arg(struct pt_regs *ctx, __u64 arg_num, long *res)
>                 /* Arg is in memory addressed by register, plus some offset
>                  * (e.g., "-4@-1204(%rbp)" in USDT arg spec). Register is
>                  * identified like with BPF_USDT_ARG_REG case, and the offset
> -                * is in arg_spec->val_off. We first fetch register contents
> +                * is in arg_spec.val_off. We first fetch register contents
>                  * from pt_regs, then do another user-space probe read to
>                  * fetch argument value itself.
>                  */
> -               err = bpf_probe_read_kernel(&val, sizeof(val), (void *)ctx + arg_spec->reg_off);
> +               err = bpf_probe_read_kernel(&val, sizeof(val), (void *)ctx + arg_spec.reg_off);
>                 if (err)
>                         return err;
> -               err = bpf_probe_read_user(&val, sizeof(val), (void *)val + arg_spec->val_off);
> +               err = bpf_probe_read_user(&val, sizeof(val), (void *)val + arg_spec.val_off);
>                 if (err)
>                         return err;
>  #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
> -               val >>= arg_spec->arg_bitshift;
> +               val >>= arg_spec.arg_bitshift;
>  #endif
>                 break;
>         default:
> @@ -177,11 +180,11 @@ int bpf_usdt_arg(struct pt_regs *ctx, __u64 arg_num, long *res)
>          * necessary upper arg_bitshift bits, with sign extension if argument
>          * is signed
>          */
> -       val <<= arg_spec->arg_bitshift;
> -       if (arg_spec->arg_signed)
> -               val = ((long)val) >> arg_spec->arg_bitshift;
> +       val <<= arg_spec.arg_bitshift;
> +       if (arg_spec.arg_signed)
> +               val = ((long)val) >> arg_spec.arg_bitshift;
>         else
> -               val = val >> arg_spec->arg_bitshift;
> +               val = val >> arg_spec.arg_bitshift;
>         *res = val;
>         return 0;
>  }
> --
> 2.39.1
>

* Re: [PATCH bpf-next 00/24] Support bpf trampoline for s390x
  2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
                   ` (23 preceding siblings ...)
  2023-01-25 21:38 ` [PATCH bpf-next 24/24] s390/bpf: Implement bpf_jit_supports_kfunc_call() Ilya Leoshkevich
@ 2023-01-26  0:45 ` Andrii Nakryiko
  2023-01-27 16:51   ` Ilya Leoshkevich
  24 siblings, 1 reply; 48+ messages in thread
From: Andrii Nakryiko @ 2023-01-26  0:45 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> Hi,
>
> This series implements poke, trampoline, kfunc, mixing subprogs and
> tailcalls, and fixes a number of tests on s390x.
>
> The following failures still remain:
>
> #52      core_read_macros:FAIL
> Uses BPF_PROBE_READ(), shouldn't there be BPF_PROBE_READ_KERNEL()?

BPF_PROBE_READ() and BPF_CORE_READ() both use
bpf_probe_read_kernel() internally, as that is the most common use
case. We have separate BPF_PROBE_READ_USER() and BPF_CORE_READ_USER()
for when bpf_probe_read_user() has to be used.

>
> #82      get_stack_raw_tp:FAIL
> get_stack_print_output:FAIL:user_stack corrupted user stack
> Known issue:
> We cannot reliably unwind userspace on s390x without DWARF.

Is that in principle, or do frame pointers (or some equivalent) need to
be configured for this to work?

Asking also in the context of [0], where s390x was excluded. If there
is actually a way to enable frame pointer-based stack unwinding on
s390x, would be nice to hear from an expert.

  [0] https://pagure.io/fesco/issue/2923

>
> #101     ksyms_module:FAIL
> address of kernel function bpf_testmod_test_mod_kfunc is out of range
> Known issue:
> Kernel and modules are too far away from each other on s390x.
>
> #167     sk_assign:FAIL
> Uses legacy map definitions in 'maps' section.

Hm.. assuming a new enough iproute2, the new-style .maps definition
should be supported, right? Let's convert the map definition?

>
> #190     stacktrace_build_id:FAIL
> Known issue:
> We cannot reliably unwind userspace on s390x without DWARF.
>
> #211     test_bpffs:FAIL
> iterators.bpf.c is broken on s390x, it uses BPF_CORE_READ(), shouldn't
> there be BPF_CORE_READ_KERNEL()?

BPF_CORE_READ() is that, so must be something else

>
> #218     test_profiler:FAIL
> A lot of BPF_PROBE_READ() usages.

ditto, something else

>
> #281     xdp_metadata:FAIL
> See patch 24.
>
> #284     xdp_synproxy:FAIL
> Verifier error:
> ; value = bpf_tcp_raw_gen_syncookie_ipv4(hdr->ipv4, hdr->tcp,
> 281: (79) r1 = *(u64 *)(r10 -80)      ; R1_w=pkt(off=14,r=74,imm=0) R10=fp0
> 282: (bf) r2 = r8                     ; R2_w=pkt(id=5,off=14,r=74,umax=60,var_off=(0x0; 0x3c)) R8=pkt(id=5,off=14,r=74,umax=60,var_off=(0x0; 0x3c))
> 283: (79) r3 = *(u64 *)(r10 -104)     ; R3_w=scalar(umax=60,var_off=(0x0; 0x3c)) R10=fp0
> 284: (85) call bpf_tcp_raw_gen_syncookie_ipv4#204
> invalid access to packet, off=14 size=0, R2(id=5,off=14,r=74)
> R2 offset is outside of the packet

The third arg to bpf_tcp_raw_gen_syncookie_ipv4() is defined as
ARG_CONST_SIZE, so it is required to be strictly positive, which doesn't
seem to be "proven" to the verifier. Please provide a bigger log, it
might be another instance of needing to sprinkle barrier_var() around.

And maybe think about using ARG_CONST_SIZE_OR_ZERO instead of ARG_CONST_SIZE.
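
The distinction can be modeled in plain C (illustrative only; the
actual enforcement happens inside the kernel verifier, and the names
below are made up):

```c
#include <assert.h>

/* ARG_CONST_SIZE requires a size the verifier can prove strictly
 * positive and within bounds; ARG_CONST_SIZE_OR_ZERO additionally
 * admits zero, letting programs pass a clamped-but-possibly-zero
 * length without an extra branch. */
static int size_arg_ok(unsigned long size, unsigned long max,
		       int or_zero)
{
	if (size == 0)
		return or_zero;		/* 1 = accepted, 0 = rejected */
	return size <= max;
}
```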

>
> None of these seem to be due to the new changes.
>
> Best regards,
> Ilya
>
> Ilya Leoshkevich (24):
>   selftests/bpf: Fix liburandom_read.so linker error
>   selftests/bpf: Fix symlink creation error
>   selftests/bpf: Fix fexit_stress on s390x
>   selftests/bpf: Fix trampoline_count on s390x
>   selftests/bpf: Fix kfree_skb on s390x
>   selftests/bpf: Set errno when urand_spawn() fails
>   selftests/bpf: Fix decap_sanity_ns cleanup
>   selftests/bpf: Fix verify_pkcs7_sig on s390x
>   selftests/bpf: Fix xdp_do_redirect on s390x
>   selftests/bpf: Fix cgrp_local_storage on s390x
>   selftests/bpf: Check stack_mprotect() return value
>   selftests/bpf: Increase SIZEOF_BPF_LOCAL_STORAGE_ELEM on s390x
>   selftests/bpf: Add a sign-extension test for kfuncs
>   selftests/bpf: Fix test_lsm on s390x
>   selftests/bpf: Fix test_xdp_adjust_tail_grow2 on s390x
>   selftests/bpf: Fix vmlinux test on s390x
>   libbpf: Read usdt arg spec with bpf_probe_read_kernel()
>   s390/bpf: Fix a typo in a comment
>   s390/bpf: Add expoline to tail calls
>   s390/bpf: Implement bpf_arch_text_poke()
>   bpf: btf: Add BTF_FMODEL_SIGNED_ARG flag
>   s390/bpf: Implement arch_prepare_bpf_trampoline()
>   s390/bpf: Implement bpf_jit_supports_subprog_tailcalls()
>   s390/bpf: Implement bpf_jit_supports_kfunc_call()
>
>  arch/s390/net/bpf_jit_comp.c                  | 708 +++++++++++++++++-
>  include/linux/bpf.h                           |   8 +
>  include/linux/btf.h                           |  15 +-
>  kernel/bpf/btf.c                              |  16 +-
>  net/bpf/test_run.c                            |   9 +
>  tools/lib/bpf/usdt.bpf.h                      |  33 +-
>  tools/testing/selftests/bpf/Makefile          |   7 +-
>  tools/testing/selftests/bpf/netcnt_common.h   |   6 +-
>  .../selftests/bpf/prog_tests/bpf_cookie.c     |   6 +-
>  .../bpf/prog_tests/cgrp_local_storage.c       |   2 +-
>  .../selftests/bpf/prog_tests/decap_sanity.c   |   2 +-
>  .../selftests/bpf/prog_tests/fexit_stress.c   |   6 +-
>  .../selftests/bpf/prog_tests/kfree_skb.c      |   2 +-
>  .../selftests/bpf/prog_tests/kfunc_call.c     |   1 +
>  .../selftests/bpf/prog_tests/test_lsm.c       |   3 +-
>  .../bpf/prog_tests/trampoline_count.c         |   4 +
>  tools/testing/selftests/bpf/prog_tests/usdt.c |   1 +
>  .../bpf/prog_tests/verify_pkcs7_sig.c         |   9 +
>  .../bpf/prog_tests/xdp_adjust_tail.c          |   7 +-
>  .../bpf/prog_tests/xdp_do_redirect.c          |   4 +
>  .../selftests/bpf/progs/kfunc_call_test.c     |  18 +
>  tools/testing/selftests/bpf/progs/lsm.c       |   7 +-
>  .../bpf/progs/test_verify_pkcs7_sig.c         |  12 +-
>  .../selftests/bpf/progs/test_vmlinux.c        |   4 +-
>  .../bpf/progs/test_xdp_adjust_tail_grow.c     |   8 +-
>  25 files changed, 816 insertions(+), 82 deletions(-)
>
> --
> 2.39.1
>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH bpf-next 08/24] selftests/bpf: Fix verify_pkcs7_sig on s390x
  2023-01-25 21:38 ` [PATCH bpf-next 08/24] selftests/bpf: Fix verify_pkcs7_sig on s390x Ilya Leoshkevich
@ 2023-01-26  1:06   ` Andrii Nakryiko
  2023-01-27 12:36     ` Ilya Leoshkevich
  0 siblings, 1 reply; 48+ messages in thread
From: Andrii Nakryiko @ 2023-01-26  1:06 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> Use bpf_probe_read_kernel() instead of bpf_probe_read(), which is not
> defined on all architectures.
>
> While at it, improve the error handling: do not hide the verifier log,
> and check the return values of bpf_probe_read_kernel() and
> bpf_copy_from_user().
>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---
>  .../selftests/bpf/prog_tests/verify_pkcs7_sig.c      |  9 +++++++++
>  .../selftests/bpf/progs/test_verify_pkcs7_sig.c      | 12 ++++++++----
>  2 files changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c b/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
> index 579d6ee83ce0..75c256f79f85 100644
> --- a/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
> +++ b/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
> @@ -56,11 +56,17 @@ struct data {
>         __u32 sig_len;
>  };
>
> +static char libbpf_log[8192];
>  static bool kfunc_not_supported;
>
>  static int libbpf_print_cb(enum libbpf_print_level level, const char *fmt,
>                            va_list args)
>  {
> +       size_t log_len = strlen(libbpf_log);
> +
> +       vsnprintf(libbpf_log + log_len, sizeof(libbpf_log) - log_len,
> +                 fmt, args);

it seems like the test is written to assume that the load might fail and
we'll get error messages, so I'm not sure it's that useful to print out
these errors. But at the very least we should filter out DEBUG- and
INFO-level messages, and pass through WARN only.

Also, there is no point in having a separate log buffer, just printf
directly. test_progs will take care of collecting the overall log and
will ignore it if the test succeeds, or emit it if the test fails


> +
>         if (strcmp(fmt, "libbpf: extern (func ksym) '%s': not found in kernel or module BTFs\n"))
>                 return 0;
>
> @@ -277,6 +283,7 @@ void test_verify_pkcs7_sig(void)
>         if (!ASSERT_OK_PTR(skel, "test_verify_pkcs7_sig__open"))
>                 goto close_prog;
>
> +       libbpf_log[0] = 0;
>         old_print_cb = libbpf_set_print(libbpf_print_cb);
>         ret = test_verify_pkcs7_sig__load(skel);
>         libbpf_set_print(old_print_cb);
> @@ -289,6 +296,8 @@ void test_verify_pkcs7_sig(void)
>                 goto close_prog;
>         }
>
> +       printf("%s", libbpf_log);
> +
>         if (!ASSERT_OK(ret, "test_verify_pkcs7_sig__load"))
>                 goto close_prog;
>
> diff --git a/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c b/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
> index ce419304ff1f..7748cc23de8a 100644
> --- a/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
> +++ b/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
> @@ -59,10 +59,14 @@ int BPF_PROG(bpf, int cmd, union bpf_attr *attr, unsigned int size)
>         if (!data_val)
>                 return 0;
>
> -       bpf_probe_read(&value, sizeof(value), &attr->value);
> -
> -       bpf_copy_from_user(data_val, sizeof(struct data),
> -                          (void *)(unsigned long)value);
> +       ret = bpf_probe_read_kernel(&value, sizeof(value), &attr->value);
> +       if (ret)
> +               return ret;
> +
> +       ret = bpf_copy_from_user(data_val, sizeof(struct data),
> +                                (void *)(unsigned long)value);
> +       if (ret)
> +               return ret;

this part looks good, we shouldn't use bpf_probe_read.

You'll have to update progs/profiler.inc.h as well, btw, which still
uses bpf_probe_read() and bpf_probe_read_str().
>
>         if (data_val->data_len > sizeof(data_val->data))
>                 return -EINVAL;
> --
> 2.39.1
>


* Re: [PATCH bpf-next 01/24] selftests/bpf: Fix liburandom_read.so linker error
  2023-01-25 21:37 ` [PATCH bpf-next 01/24] selftests/bpf: Fix liburandom_read.so linker error Ilya Leoshkevich
@ 2023-01-26  1:07   ` Andrii Nakryiko
  2023-01-26 13:30     ` Ilya Leoshkevich
  0 siblings, 1 reply; 48+ messages in thread
From: Andrii Nakryiko @ 2023-01-26  1:07 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> When building with O=, the following linker error occurs:
>
>     clang: error: no such file or directory: 'liburandom_read.so'
>
> Fix by adding $(OUTPUT) to the linker search path.
>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---
>  tools/testing/selftests/bpf/Makefile | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index c9b5ed59e1ed..43098eb15d31 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -189,9 +189,9 @@ $(OUTPUT)/liburandom_read.so: urandom_read_lib1.c urandom_read_lib2.c
>  $(OUTPUT)/urandom_read: urandom_read.c urandom_read_aux.c $(OUTPUT)/liburandom_read.so
>         $(call msg,BINARY,,$@)
>         $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^) \
> -                    liburandom_read.so $(filter-out -static,$(LDLIBS))      \
> +                    $(filter-out -static,$(LDLIBS))                           \
>                      -fuse-ld=$(LLD) -Wl,-znoseparate-code -Wl,--build-id=sha1 \
> -                    -Wl,-rpath=. -o $@
> +                    -Wl,-rpath=. -o $@ -lurandom_read -L$(OUTPUT)

why moving to the end? it's nice in verbose logs when the last thing
is the resulting file ($@), so if possible, let's move it back?


>
>  $(OUTPUT)/sign-file: ../../../../scripts/sign-file.c
>         $(call msg,SIGN-FILE,,$@)
> --
> 2.39.1
>


* Re: [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline()
  2023-01-25 21:38 ` [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline() Ilya Leoshkevich
@ 2023-01-26  1:15   ` Andrii Nakryiko
  2023-01-26 14:30     ` Ilya Leoshkevich
  2023-02-15  4:07   ` kernel test robot
  2023-02-15 17:03   ` kernel test robot
  2 siblings, 1 reply; 48+ messages in thread
From: Andrii Nakryiko @ 2023-01-26  1:15 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> arch_prepare_bpf_trampoline() is used for direct attachment of eBPF
> programs to various places, bypassing kprobes. It's responsible for
> calling a number of eBPF programs before, instead and/or after
> whatever they are attached to.
>
> Add a s390x implementation, paying attention to the following:
>
> - Reuse the existing JIT infrastructure, where possible.
> - Like the existing JIT, prefer making multiple passes instead of
>   backpatching. Currently 2 passes is enough. If literal pool is
>   introduced, this needs to be raised to 3. However, at the moment
>   adding literal pool only makes the code larger. If branch
>   shortening is introduced, the number of passes needs to be
>   increased even further.
> - Support both regular and ftrace calling conventions, depending on
>   the trampoline flags.
> - Use expolines for indirect calls.
> - Handle the mismatch between the eBPF and the s390x ABIs.
> - Sign-extend fmod_ret return values.
>
> invoke_bpf_prog() produces about 120 bytes; it might be possible to
> slightly optimize this, but reaching 50 bytes, like on x86_64, looks
> unrealistic: just loading cookie, __bpf_prog_enter, bpf_func, insnsi
> and __bpf_prog_exit as literals already takes at least 5 * 12 = 60
> bytes, and we can't use relative addressing for most of them.
> Therefore, lower BPF_MAX_TRAMP_LINKS on s390x.
>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---
>  arch/s390/net/bpf_jit_comp.c | 535 +++++++++++++++++++++++++++++++++--
>  include/linux/bpf.h          |   4 +
>  2 files changed, 517 insertions(+), 22 deletions(-)
>

[...]

> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index cf89504c8dda..52ff43bbf996 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -943,7 +943,11 @@ struct btf_func_model {
>  /* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
>   * bytes on x86.
>   */
> +#if defined(__s390x__)
> +#define BPF_MAX_TRAMP_LINKS 27
> +#else
>  #define BPF_MAX_TRAMP_LINKS 38
> +#endif

if we turn this into enum definition, then on selftests side we can
just discover this from vmlinux BTF, instead of hard-coding
arch-specific constants. Thoughts?
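Discovery from vmlinux BTF could look roughly like the following sketch, using libbpf's BTF API (error handling trimmed; it assumes the kernel build really does expose BPF_MAX_TRAMP_LINKS as an enumerator, which is exactly what is being proposed here):

```c
#include <string.h>
#include <bpf/btf.h>

/* Sketch: find the value of an enumerator by name in vmlinux BTF,
 * instead of hard-coding per-arch constants in selftests. */
static int get_bpf_max_tramp_links(void)
{
	struct btf *btf = btf__load_vmlinux_btf();
	int nr_types, i, j, val = -1;

	if (!btf)
		return -1;
	nr_types = btf__type_cnt(btf);
	for (i = 1; i < nr_types; i++) {
		const struct btf_type *t = btf__type_by_id(btf, i);
		const struct btf_enum *e;

		if (!btf_is_enum(t))
			continue;
		e = btf_enum(t);
		for (j = 0; j < btf_vlen(t); j++) {
			const char *name;

			name = btf__name_by_offset(btf, e[j].name_off);
			if (name && !strcmp(name, "BPF_MAX_TRAMP_LINKS")) {
				val = e[j].val;
				goto out;
			}
		}
	}
out:
	btf__free(btf);
	return val;
}
```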

>
>  struct bpf_tramp_links {
>         struct bpf_tramp_link *links[BPF_MAX_TRAMP_LINKS];
> --
> 2.39.1
>


* Re: [PATCH bpf-next 24/24] s390/bpf: Implement bpf_jit_supports_kfunc_call()
  2023-01-25 21:38 ` [PATCH bpf-next 24/24] s390/bpf: Implement bpf_jit_supports_kfunc_call() Ilya Leoshkevich
@ 2023-01-26  1:28   ` Alexei Starovoitov
  2023-01-27 11:36     ` Ilya Leoshkevich
  0 siblings, 1 reply; 48+ messages in thread
From: Alexei Starovoitov @ 2023-01-26  1:28 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Wed, Jan 25, 2023 at 10:38:17PM +0100, Ilya Leoshkevich wrote:
> +
> +		/* Sign-extend the kfunc arguments. */
> +		if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
> +			m = bpf_jit_find_kfunc_model(fp, insn);
> +			if (!m)
> +				return -1;
> +
> +			for (j = 0; j < m->nr_args; j++) {
> +				if (sign_extend(jit, BPF_REG_1 + j,
> +						m->arg_size[j],
> +						m->arg_flags[j]))
> +					return -1;
> +			}
> +		}

Is this because s390 doesn't have subregisters?
Could you give an example where it's necessary?
I'm guessing a bpf prog compiled with alu32 and operates on signed int
that is passed into a kfunc that expects 'int' in 64-bit reg?

>  
> +bool bpf_jit_supports_kfunc_call(void)
> +{
> +	return true;
> +}

Timely :) Thanks for working on it.


* Re: [PATCH bpf-next 17/24] libbpf: Read usdt arg spec with bpf_probe_read_kernel()
  2023-01-26  0:26   ` Andrii Nakryiko
@ 2023-01-26 11:41     ` Ilya Leoshkevich
  2023-01-26 19:03       ` Andrii Nakryiko
  0 siblings, 1 reply; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-26 11:41 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Wed, 2023-01-25 at 16:26 -0800, Andrii Nakryiko wrote:
> On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > 
> > Loading programs that use bpf_usdt_arg() on s390x fails with:
> > 
> >     ; switch (arg_spec->arg_type) {
> >     139: (61) r1 = *(u32 *)(r2 +8)
> >     R2 unbounded memory access, make sure to bounds check any such
> > access
> 
> can you show a bit longer log? we shouldn't just  use
> bpf_probe_read_kernel for this. I suspect strategically placed
> barrier_var() calls will solve this. This is usually an issue with
> compiler reordering operations and doing actual check after it
> already
> speculatively adjusted pointer (which is technically safe and ok if
> we
> never deref that pointer, but verifier doesn't recognize such
> pattern)

The full log is here:

https://gist.github.com/iii-i/b6149ee99b37078ec920ab1d3bb45134

The relevant part seems to be:

; if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt)
128: (79) r1 = *(u64 *)(r10 -24)      ; frame1:
R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
129: (25) if r1 > 0xb goto pc+83      ; frame1:
R1_w=scalar(umax=11,var_off=(0x0; 0xf))
; if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt)
130: (69) r1 = *(u16 *)(r8 +200)      ; frame1:
R1_w=scalar(umax=65535,var_off=(0x0; 0xffff))
R8_w=map_value(off=0,ks=4,vs=208,imm=0)
131: (67) r1 <<= 48                   ; frame1:
R1_w=scalar(smax=9223090561878065152,umax=18446462598732840960,var_off=
(0x0; 0xffff000000000000),s32_min=0,s32_max=0,u32_max=0)
132: (c7) r1 s>>= 48                  ; frame1: R1_w=scalar(smin=-
32768,smax=32767)
; if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt)
133: (79) r2 = *(u64 *)(r10 -24)      ; frame1:
R2=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
134: (bd) if r1 <= r2 goto pc+78      ; frame1: R1=scalar(smin=-
32768,smax=32767) R2=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
; arg_spec = &spec->args[arg_num];
135: (79) r1 = *(u64 *)(r10 -24)      ; frame1:
R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
136: (67) r1 <<= 4                    ; frame1:
R1_w=scalar(umax=68719476720,var_off=(0x0;
0xffffffff0),s32_max=2147483632,u32_max=-16)
137: (bf) r2 = r8                     ; frame1:
R2_w=map_value(off=0,ks=4,vs=208,imm=0)
R8=map_value(off=0,ks=4,vs=208,imm=0)
138: (0f) r2 += r1                    ; frame1:
R1_w=scalar(umax=68719476720,var_off=(0x0;
0xffffffff0),s32_max=2147483632,u32_max=-16)
R2_w=map_value(off=0,ks=4,vs=208,umax=68719476720,var_off=(0x0;
0xffffffff0),s32_max=2147483632,u32_max=-16)
; switch (arg_spec->arg_type) {
139: (61) r1 = *(u32 *)(r2 +8)

#128-#129 make sure that *(u64 *)(r10 -24) <= 11, but when #133
loads it again, this constraint is not there. I guess we need to
force flushing r1 to stack? The following helps:

--- a/tools/lib/bpf/usdt.bpf.h
+++ b/tools/lib/bpf/usdt.bpf.h
@@ -130,7 +130,10 @@ int bpf_usdt_arg(struct pt_regs *ctx, __u64
arg_num, long *res)
        if (!spec)
                return -ESRCH;
 
-       if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec-
>arg_cnt)
+       if (arg_num >= BPF_USDT_MAX_ARG_CNT)
+               return -ENOENT;
+       barrier_var(arg_num);
+       if (arg_num >= spec->arg_cnt)
                return -ENOENT;
 
        arg_spec = &spec->args[arg_num];

I can use this in v2 if it looks good.



Btw, I looked at the barrier_var() definition:

#define barrier_var(var) asm volatile("" : "=r"(var) : "0"(var))

and I'm curious why it's not defined like this:

#define barrier_var(var) asm volatile("" : "+r"(var))

which is a bit simpler?
> 


* Re: [PATCH bpf-next 01/24] selftests/bpf: Fix liburandom_read.so linker error
  2023-01-26  1:07   ` Andrii Nakryiko
@ 2023-01-26 13:30     ` Ilya Leoshkevich
  0 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-26 13:30 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Wed, 2023-01-25 at 17:07 -0800, Andrii Nakryiko wrote:
> On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > 
> > When building with O=, the following linker error occurs:
> > 
> >     clang: error: no such file or directory: 'liburandom_read.so'
> > 
> > Fix by adding $(OUTPUT) to the linker search path.
> > 
> > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > ---
> >  tools/testing/selftests/bpf/Makefile | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/bpf/Makefile
> > b/tools/testing/selftests/bpf/Makefile
> > index c9b5ed59e1ed..43098eb15d31 100644
> > --- a/tools/testing/selftests/bpf/Makefile
> > +++ b/tools/testing/selftests/bpf/Makefile
> > @@ -189,9 +189,9 @@ $(OUTPUT)/liburandom_read.so:
> > urandom_read_lib1.c urandom_read_lib2.c
> >  $(OUTPUT)/urandom_read: urandom_read.c urandom_read_aux.c
> > $(OUTPUT)/liburandom_read.so
> >         $(call msg,BINARY,,$@)
> >         $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS))
> > $(filter %.c,$^) \
> > -                    liburandom_read.so $(filter-out -
> > static,$(LDLIBS))      \
> > +                    $(filter-out -
> > static,$(LDLIBS))                           \
> >                      -fuse-ld=$(LLD) -Wl,-znoseparate-code -Wl,--
> > build-id=sha1 \
> > -                    -Wl,-rpath=. -o $@
> > +                    -Wl,-rpath=. -o $@ -lurandom_read -L$(OUTPUT)
> 
> why moving to the end? it's nice in verbose logs when the last thing
> is the resulting file ($@), so if possible, let's move it back?

You're right, I'm just used to having the libraries at the end, but
here we already have $(LDLIBS) in the middle. Will do in v2.

> 
> > 
> >  $(OUTPUT)/sign-file: ../../../../scripts/sign-file.c
> >         $(call msg,SIGN-FILE,,$@)
> > --
> > 2.39.1
> > 



* Re: [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline()
  2023-01-26  1:15   ` Andrii Nakryiko
@ 2023-01-26 14:30     ` Ilya Leoshkevich
  2023-01-26 19:06       ` Andrii Nakryiko
  0 siblings, 1 reply; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-26 14:30 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Wed, 2023-01-25 at 17:15 -0800, Andrii Nakryiko wrote:
> On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > 
> > arch_prepare_bpf_trampoline() is used for direct attachment of eBPF
> > programs to various places, bypassing kprobes. It's responsible for
> > calling a number of eBPF programs before, instead and/or after
> > whatever they are attached to.
> > 
> > Add a s390x implementation, paying attention to the following:
> > 
> > - Reuse the existing JIT infrastructure, where possible.
> > - Like the existing JIT, prefer making multiple passes instead of
> >   backpatching. Currently 2 passes is enough. If literal pool is
> >   introduced, this needs to be raised to 3. However, at the moment
> >   adding literal pool only makes the code larger. If branch
> >   shortening is introduced, the number of passes needs to be
> >   increased even further.
> > - Support both regular and ftrace calling conventions, depending on
> >   the trampoline flags.
> > - Use expolines for indirect calls.
> > - Handle the mismatch between the eBPF and the s390x ABIs.
> > - Sign-extend fmod_ret return values.
> > 
> > invoke_bpf_prog() produces about 120 bytes; it might be possible to
> > slightly optimize this, but reaching 50 bytes, like on x86_64,
> > looks
> > unrealistic: just loading cookie, __bpf_prog_enter, bpf_func,
> > insnsi
> > and __bpf_prog_exit as literals already takes at least 5 * 12 = 60
> > bytes, and we can't use relative addressing for most of them.
> > Therefore, lower BPF_MAX_TRAMP_LINKS on s390x.
> > 
> > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > ---
> >  arch/s390/net/bpf_jit_comp.c | 535
> > +++++++++++++++++++++++++++++++++--
> >  include/linux/bpf.h          |   4 +
> >  2 files changed, 517 insertions(+), 22 deletions(-)
> > 
> 
> [...]
> 
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index cf89504c8dda..52ff43bbf996 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -943,7 +943,11 @@ struct btf_func_model {
> >  /* Each call __bpf_prog_enter + call bpf_func + call
> > __bpf_prog_exit is ~50
> >   * bytes on x86.
> >   */
> > +#if defined(__s390x__)
> > +#define BPF_MAX_TRAMP_LINKS 27
> > +#else
> >  #define BPF_MAX_TRAMP_LINKS 38
> > +#endif
> 
> if we turn this into enum definition, then on selftests side we can
> just discover this from vmlinux BTF, instead of hard-coding
> arch-specific constants. Thoughts?

This seems to work. I can replace 3/24 and 4/24 with that in v2.
Some random notes:

- It doesn't seem to be possible to #include "vmlinux.h" into tests,
  so one has to go through the btf__load_vmlinux_btf() dance and
  allocate the fd arrays dynamically.

- One has to give this enum an otherwise unnecessary name, so that
  it's easy to find. This doesn't seem like a big deal though:

enum bpf_max_tramp_links {
#if defined(__s390x__)
	BPF_MAX_TRAMP_LINKS = 27,
#else
	BPF_MAX_TRAMP_LINKS = 38,
#endif
};

- An alternative might be to expose this via /proc, since the users
  might be interested in it too.

> > 
> >  struct bpf_tramp_links {
> >         struct bpf_tramp_link *links[BPF_MAX_TRAMP_LINKS];
> > --
> > 2.39.1
> > 



* Re: [PATCH bpf-next 17/24] libbpf: Read usdt arg spec with bpf_probe_read_kernel()
  2023-01-26 11:41     ` Ilya Leoshkevich
@ 2023-01-26 19:03       ` Andrii Nakryiko
  2023-01-27 11:01         ` Ilya Leoshkevich
  0 siblings, 1 reply; 48+ messages in thread
From: Andrii Nakryiko @ 2023-01-26 19:03 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Thu, Jan 26, 2023 at 3:41 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> On Wed, 2023-01-25 at 16:26 -0800, Andrii Nakryiko wrote:
> > On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@linux.ibm.com>
> > wrote:
> > >
> > > Loading programs that use bpf_usdt_arg() on s390x fails with:
> > >
> > >     ; switch (arg_spec->arg_type) {
> > >     139: (61) r1 = *(u32 *)(r2 +8)
> > >     R2 unbounded memory access, make sure to bounds check any such
> > > access
> >
> > can you show a bit longer log? we shouldn't just  use
> > bpf_probe_read_kernel for this. I suspect strategically placed
> > barrier_var() calls will solve this. This is usually an issue with
> > compiler reordering operations and doing actual check after it
> > already
> > speculatively adjusted pointer (which is technically safe and ok if
> > we
> > never deref that pointer, but verifier doesn't recognize such
> > pattern)
>
> The full log is here:
>
> https://gist.github.com/iii-i/b6149ee99b37078ec920ab1d3bb45134
>
> The relevant part seems to be:
>
> ; if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt)
> 128: (79) r1 = *(u64 *)(r10 -24)      ; frame1:
> R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
> 129: (25) if r1 > 0xb goto pc+83      ; frame1:
> R1_w=scalar(umax=11,var_off=(0x0; 0xf))
> ; if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt)
> 130: (69) r1 = *(u16 *)(r8 +200)      ; frame1:
> R1_w=scalar(umax=65535,var_off=(0x0; 0xffff))
> R8_w=map_value(off=0,ks=4,vs=208,imm=0)
> 131: (67) r1 <<= 48                   ; frame1:
> R1_w=scalar(smax=9223090561878065152,umax=18446462598732840960,var_off=
> (0x0; 0xffff000000000000),s32_min=0,s32_max=0,u32_max=0)
> 132: (c7) r1 s>>= 48                  ; frame1: R1_w=scalar(smin=-
> 32768,smax=32767)
> ; if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt)
> 133: (79) r2 = *(u64 *)(r10 -24)      ; frame1:
> R2=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
> 134: (bd) if r1 <= r2 goto pc+78      ; frame1: R1=scalar(smin=-
> 32768,smax=32767) R2=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
> ; arg_spec = &spec->args[arg_num];
> 135: (79) r1 = *(u64 *)(r10 -24)      ; frame1:
> R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
> 136: (67) r1 <<= 4                    ; frame1:
> R1_w=scalar(umax=68719476720,var_off=(0x0;
> 0xffffffff0),s32_max=2147483632,u32_max=-16)
> 137: (bf) r2 = r8                     ; frame1:
> R2_w=map_value(off=0,ks=4,vs=208,imm=0)
> R8=map_value(off=0,ks=4,vs=208,imm=0)
> 138: (0f) r2 += r1                    ; frame1:
> R1_w=scalar(umax=68719476720,var_off=(0x0;
> 0xffffffff0),s32_max=2147483632,u32_max=-16)
> R2_w=map_value(off=0,ks=4,vs=208,umax=68719476720,var_off=(0x0;
> 0xffffffff0),s32_max=2147483632,u32_max=-16)
> ; switch (arg_spec->arg_type) {
> 139: (61) r1 = *(u32 *)(r2 +8)
>
> #128-#129 make sure that *(u64 *)(r10 -24) <= 11, but when #133
> loads it again, this constraint is not there. I guess we need to
> force flushing r1 to stack? The following helps:
>
> --- a/tools/lib/bpf/usdt.bpf.h
> +++ b/tools/lib/bpf/usdt.bpf.h
> @@ -130,7 +130,10 @@ int bpf_usdt_arg(struct pt_regs *ctx, __u64
> arg_num, long *res)
>         if (!spec)
>                 return -ESRCH;
>
> -       if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec-
> >arg_cnt)
> +       if (arg_num >= BPF_USDT_MAX_ARG_CNT)
> +               return -ENOENT;
> +       barrier_var(arg_num);
> +       if (arg_num >= spec->arg_cnt)
>                 return -ENOENT;
>
>         arg_spec = &spec->args[arg_num];
>
> I can use this in v2 if it looks good.

arg_num >= spec->arg_cnt is the "real" check; arg_num >=
BPF_USDT_MAX_ARG_CNT is more to satisfy the verifier (we know that
spec->arg_cnt won't be >= BPF_USDT_MAX_ARG_CNT). Let's swap the two
checks in order and keep BPF_USDT_MAX_ARG_CNT close to the
spec->args[arg_num] use? And if barrier_var() is necessary, then so be it.

>
>
>
> Btw, I looked at the barrier_var() definition:
>
> #define barrier_var(var) asm volatile("" : "=r"(var) : "0"(var))
>
> and I'm curious why it's not defined like this:
>
> #define barrier_var(var) asm volatile("" : "+r"(var))
>
> which is a bit simpler?
> >

no reason, just unfamiliarity with embedded asm back then; we can
update it if they are equivalent


* Re: [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline()
  2023-01-26 14:30     ` Ilya Leoshkevich
@ 2023-01-26 19:06       ` Andrii Nakryiko
  2023-01-27 11:15         ` Ilya Leoshkevich
  0 siblings, 1 reply; 48+ messages in thread
From: Andrii Nakryiko @ 2023-01-26 19:06 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Thu, Jan 26, 2023 at 6:30 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> On Wed, 2023-01-25 at 17:15 -0800, Andrii Nakryiko wrote:
> > On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@linux.ibm.com>
> > wrote:
> > >
> > > arch_prepare_bpf_trampoline() is used for direct attachment of eBPF
> > > programs to various places, bypassing kprobes. It's responsible for
> > > calling a number of eBPF programs before, instead and/or after
> > > whatever they are attached to.
> > >
> > > Add a s390x implementation, paying attention to the following:
> > >
> > > - Reuse the existing JIT infrastructure, where possible.
> > > - Like the existing JIT, prefer making multiple passes instead of
> > >   backpatching. Currently 2 passes is enough. If literal pool is
> > >   introduced, this needs to be raised to 3. However, at the moment
> > >   adding literal pool only makes the code larger. If branch
> > >   shortening is introduced, the number of passes needs to be
> > >   increased even further.
> > > - Support both regular and ftrace calling conventions, depending on
> > >   the trampoline flags.
> > > - Use expolines for indirect calls.
> > > - Handle the mismatch between the eBPF and the s390x ABIs.
> > > - Sign-extend fmod_ret return values.
> > >
> > > invoke_bpf_prog() produces about 120 bytes; it might be possible to
> > > slightly optimize this, but reaching 50 bytes, like on x86_64,
> > > looks
> > > unrealistic: just loading cookie, __bpf_prog_enter, bpf_func,
> > > insnsi
> > > and __bpf_prog_exit as literals already takes at least 5 * 12 = 60
> > > bytes, and we can't use relative addressing for most of them.
> > > Therefore, lower BPF_MAX_TRAMP_LINKS on s390x.
> > >
> > > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > > ---
> > >  arch/s390/net/bpf_jit_comp.c | 535
> > > +++++++++++++++++++++++++++++++++--
> > >  include/linux/bpf.h          |   4 +
> > >  2 files changed, 517 insertions(+), 22 deletions(-)
> > >
> >
> > [...]
> >
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index cf89504c8dda..52ff43bbf996 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -943,7 +943,11 @@ struct btf_func_model {
> > >  /* Each call __bpf_prog_enter + call bpf_func + call
> > > __bpf_prog_exit is ~50
> > >   * bytes on x86.
> > >   */
> > > +#if defined(__s390x__)
> > > +#define BPF_MAX_TRAMP_LINKS 27
> > > +#else
> > >  #define BPF_MAX_TRAMP_LINKS 38
> > > +#endif
> >
> > if we turn this into enum definition, then on selftests side we can
> > just discover this from vmlinux BTF, instead of hard-coding
> > arch-specific constants. Thoughts?
>
> This seems to work. I can replace 3/24 and 4/24 with that in v2.
> Some random notes:
>
> - It doesn't seem to be possible to #include "vmlinux.h" into tests,
>   so one has to go through the btf__load_vmlinux_btf() dance and
>   allocate the fd arrays dynamically.

yes, you can't include vmlinux.h into user-space code, of course. And
yes it's true about needing to use btf__load_vmlinux_btf().

But I didn't get what you are saying about fd arrays, tbh. Can you
please elaborate?

>
> - One has to give this enum an otherwise unnecessary name, so that
>   it's easy to find. This doesn't seem like a big deal though:
>
> enum bpf_max_tramp_links {

not really, you can keep it as an anonymous enum. We do that in
include/uapi/linux/bpf.h for a lot of constants

> #if defined(__s390x__)
>         BPF_MAX_TRAMP_LINKS = 27,
> #else
>         BPF_MAX_TRAMP_LINKS = 38,
> #endif
> };
>
> - An alternative might be to expose this via /proc, since the users
>   might be interested in it too.

I'd say let's not, there is no need, having it in BTF is more than
enough for testing purposes

>
> > >
> > >  struct bpf_tramp_links {
> > >         struct bpf_tramp_link *links[BPF_MAX_TRAMP_LINKS];
> > > --
> > > 2.39.1
> > >
>


* Re: [PATCH bpf-next 17/24] libbpf: Read usdt arg spec with bpf_probe_read_kernel()
  2023-01-26 19:03       ` Andrii Nakryiko
@ 2023-01-27 11:01         ` Ilya Leoshkevich
  0 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-27 11:01 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Thu, 2023-01-26 at 11:03 -0800, Andrii Nakryiko wrote:
> On Thu, Jan 26, 2023 at 3:41 AM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > 
> > On Wed, 2023-01-25 at 16:26 -0800, Andrii Nakryiko wrote:
> > > On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich
> > > <iii@linux.ibm.com>
> > > wrote:
> > > > 
> > > > Loading programs that use bpf_usdt_arg() on s390x fails with:
> > > > 
> > > >     ; switch (arg_spec->arg_type) {
> > > >     139: (61) r1 = *(u32 *)(r2 +8)
> > > >     R2 unbounded memory access, make sure to bounds check any
> > > > such
> > > > access
> > > 
> > > can you show a bit longer log? we shouldn't just use
> > > bpf_probe_read_kernel for this. I suspect strategically placed
> > > barrier_var() calls will solve this. This is usually an issue
> > > with
> > > compiler reordering operations and doing actual check after it
> > > already
> > > speculatively adjusted pointer (which is technically safe and ok
> > > if
> > > we
> > > never deref that pointer, but verifier doesn't recognize such
> > > pattern)
> > 
> > The full log is here:
> > 
> > https://gist.github.com/iii-i/b6149ee99b37078ec920ab1d3bb45134

[...]

> > --- a/tools/lib/bpf/usdt.bpf.h
> > +++ b/tools/lib/bpf/usdt.bpf.h
> > @@ -130,7 +130,10 @@ int bpf_usdt_arg(struct pt_regs *ctx, __u64 arg_num, long *res)
> >         if (!spec)
> >                 return -ESRCH;
> > 
> > -       if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt)
> > +       if (arg_num >= BPF_USDT_MAX_ARG_CNT)
> > +               return -ENOENT;
> > +       barrier_var(arg_num);
> > +       if (arg_num >= spec->arg_cnt)
> >                 return -ENOENT;
> > 
> >         arg_spec = &spec->args[arg_num];
> > 
> > I can use this in v2 if it looks good.
> 
> arg_num -> spec->arg_cnt is "real" check, arg_num >=
> BPF_USDT_MAX_ARG_CNT is more to satisfy verifier (we know that
> spec->arg_cnt won't be >= BPF_USDT_MAX_ARG_CNT). Let's swap two
> checks
> in order and keep BPF_USDT_MAX_ARG_CNT close to spec->args[arg_num]
> use? And if barrier_var() is necessary, then so be it.

Unfortunately just swapping did not help, so let's use the barrier.

> > Btw, I looked at the barrier_var() definition:
> > 
> > #define barrier_var(var) asm volatile("" : "=r"(var) : "0"(var))
> > 
> > and I'm curious why it's not defined like this:
> > 
> > #define barrier_var(var) asm volatile("" : "+r"(var))
> > 
> > which is a bit simpler?
> > > 
> 
> no reason, just unfamiliarity with embedded asm back then, we can
> update it if they are equivalent

Thanks! I will add a simplification in v2 then.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline()
  2023-01-26 19:06       ` Andrii Nakryiko
@ 2023-01-27 11:15         ` Ilya Leoshkevich
  2023-01-27 17:30           ` Andrii Nakryiko
  0 siblings, 1 reply; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-27 11:15 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Thu, 2023-01-26 at 11:06 -0800, Andrii Nakryiko wrote:
> On Thu, Jan 26, 2023 at 6:30 AM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > 
> > On Wed, 2023-01-25 at 17:15 -0800, Andrii Nakryiko wrote:
> > > On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich
> > > <iii@linux.ibm.com>
> > > wrote:
> > > > 
> > > > arch_prepare_bpf_trampoline() is used for direct attachment of
> > > > eBPF
> > > > programs to various places, bypassing kprobes. It's responsible
> > > > for
> > > > calling a number of eBPF programs before, instead and/or after
> > > > whatever they are attached to.
> > > > 
> > > > Add a s390x implementation, paying attention to the following:
> > > > 
> > > > - Reuse the existing JIT infrastructure, where possible.
> > > > - Like the existing JIT, prefer making multiple passes instead
> > > > of
> > > >   backpatching. Currently 2 passes is enough. If literal pool
> > > > is
> > > >   introduced, this needs to be raised to 3. However, at the
> > > > moment
> > > >   adding literal pool only makes the code larger. If branch
> > > >   shortening is introduced, the number of passes needs to be
> > > >   increased even further.
> > > > - Support both regular and ftrace calling conventions,
> > > > depending on
> > > >   the trampoline flags.
> > > > - Use expolines for indirect calls.
> > > > - Handle the mismatch between the eBPF and the s390x ABIs.
> > > > - Sign-extend fmod_ret return values.
> > > > 
> > > > invoke_bpf_prog() produces about 120 bytes; it might be
> > > > possible to
> > > > slightly optimize this, but reaching 50 bytes, like on x86_64,
> > > > looks
> > > > unrealistic: just loading cookie, __bpf_prog_enter, bpf_func,
> > > > insnsi
> > > > and __bpf_prog_exit as literals already takes at least 5 * 12 =
> > > > 60
> > > > bytes, and we can't use relative addressing for most of them.
> > > > Therefore, lower BPF_MAX_TRAMP_LINKS on s390x.
> > > > 
> > > > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > > > ---
> > > >  arch/s390/net/bpf_jit_comp.c | 535
> > > > +++++++++++++++++++++++++++++++++--
> > > >  include/linux/bpf.h          |   4 +
> > > >  2 files changed, 517 insertions(+), 22 deletions(-)
> > > > 
> > > 
> > > [...]
> > > 
> > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > index cf89504c8dda..52ff43bbf996 100644
> > > > --- a/include/linux/bpf.h
> > > > +++ b/include/linux/bpf.h
> > > > @@ -943,7 +943,11 @@ struct btf_func_model {
> > > >  /* Each call __bpf_prog_enter + call bpf_func + call
> > > > __bpf_prog_exit is ~50
> > > >   * bytes on x86.
> > > >   */
> > > > +#if defined(__s390x__)
> > > > +#define BPF_MAX_TRAMP_LINKS 27
> > > > +#else
> > > >  #define BPF_MAX_TRAMP_LINKS 38
> > > > +#endif
> > > 
> > > if we turn this into enum definition, then on selftests side we
> > > can
> > > just discover this from vmlinux BTF, instead of hard-coding
> > > arch-specific constants. Thoughts?
> > 
> > This seems to work. I can replace 3/24 and 4/24 with that in v2.
> > Some random notes:
> > 
> > - It doesn't seem to be possible to #include "vmlinux.h" into tests,
> >   so one has to go through the btf__load_vmlinux_btf() dance and
> >   allocate the fd arrays dynamically.
> 
> yes, you can't include vmlinux.h into user-space code, of course. And
> yes it's true about needing to use btf__load_vmlinux_btf().
> 
> But I didn't get what you are saying about fd arrays, tbh. Can you
> please elaborate?

That's a really minor thing; fexit_fd and link_fd in fexit_stress
now need to be allocated dynamically.

> > - One has to give this enum an otherwise unnecessary name, so that
> >   it's easy to find. This doesn't seem like a big deal though:
> > 
> > enum bpf_max_tramp_links {
> 
> not really, you can keep it anonymous enum. We do that in
> include/uapi/linux/bpf.h for a lot of constants

How would you find it then? My current code is:

int get_bpf_max_tramp_links_from(struct btf *btf)
{
        const struct btf_enum *e;
        const struct btf_type *t;
        const char *name;
        int id;

        id = btf__find_by_name_kind(btf, "bpf_max_tramp_links",
                                    BTF_KIND_ENUM);
        if (!ASSERT_GT(id, 0, "bpf_max_tramp_links id"))
                return -1;
        t = btf__type_by_id(btf, id);
        if (!ASSERT_OK_PTR(t, "bpf_max_tramp_links type"))
                return -1;
        if (!ASSERT_EQ(btf_vlen(t), 1, "bpf_max_tramp_links vlen"))
                return -1;
        e = btf_enum(t);
        if (!ASSERT_OK_PTR(e, "bpf_max_tramp_links[0]"))
                return -1;
        name = btf__name_by_offset(btf, e->name_off);
        if (!ASSERT_OK_PTR(name, "bpf_max_tramp_links[0].name_off") ||
            !ASSERT_STREQ(name, "BPF_MAX_TRAMP_LINKS", "BPF_MAX_TRAMP_LINKS"))
                return -1;

        return e->val;
}

Is there a way to bypass looking up the enum, and go straight for the
named member?

> > #if defined(__s390x__)
> >         BPF_MAX_TRAMP_LINKS = 27,
> > #else
> >         BPF_MAX_TRAMP_LINKS = 38,
> > #endif
> > };
> > 
> > - An alternative might be to expose this via /proc, since the users
> >   might be interested in it too.
> 
> I'd say let's not, there is no need, having it in BTF is more than
> enough for testing purposes

Fair enough.
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH bpf-next 24/24] s390/bpf: Implement bpf_jit_supports_kfunc_call()
  2023-01-26  1:28   ` Alexei Starovoitov
@ 2023-01-27 11:36     ` Ilya Leoshkevich
  2023-01-27 16:04       ` Alexei Starovoitov
  0 siblings, 1 reply; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-27 11:36 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Wed, 2023-01-25 at 17:28 -0800, Alexei Starovoitov wrote:
> On Wed, Jan 25, 2023 at 10:38:17PM +0100, Ilya Leoshkevich wrote:
> > +
> > +               /* Sign-extend the kfunc arguments. */
> > +               if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
> > +                       m = bpf_jit_find_kfunc_model(fp, insn);
> > +                       if (!m)
> > +                               return -1;
> > +
> > +                       for (j = 0; j < m->nr_args; j++) {
> > +                               if (sign_extend(jit, BPF_REG_1 + j,
> > +                                               m->arg_size[j],
> > +                                               m->arg_flags[j]))
> > +                                       return -1;
> > +                       }
> > +               }
> 
> Is this because s390 doesn't have subregisters?
> Could you give an example where it's necessary?
> I'm guessing a bpf prog compiled with alu32 and operates on signed
> int
> that is passed into a kfunc that expects 'int' in 64-bit reg?

Precisely. The test added in 13/24 fails without this:

verify_success:PASS:skel 0 nsec
verify_success:PASS:bpf_object__find_program_by_name 0 nsec
verify_success:PASS:kfunc_call_test4 0 nsec
verify_success:FAIL:retval unexpected retval: actual 4294966065 != expected -1234
#94/10   kfunc_call/kfunc_call_test4:FAIL

Looking at the assembly:

; long noinline bpf_kfunc_call_test4(signed char a, short b, int c, long d)
0000000000936a78 <bpf_kfunc_call_test4>:
  936a78:       c0 04 00 00 00 00       jgnop   936a78 <bpf_kfunc_call_test4>
; 	return (long)a + (long)b + (long)c + d;
  936a7e:       b9 08 00 45             agr     %r4,%r5
  936a82:       b9 08 00 43             agr     %r4,%r3
  936a86:       b9 08 00 24             agr     %r2,%r4
  936a8a:       c0 f4 00 1e 3b 27       jg      cfe0d8 <__s390_indirect_jump_r14>

As per the s390x ABI, bpf_kfunc_call_test4() has the right to assume
that a, b and c are sign-extended by the caller, which results in using
64-bit additions (agr) without any additional conversions.

On the JITed code side (without this hunk) we have:

; tmp = bpf_kfunc_call_test4(-3, -30, -200, -1000);
;        5:       b4 10 00 00 ff ff ff fd w1 = -3
   0x3ff7fdcdad4:       llilf   %r2,0xfffffffd
;        6:       b4 20 00 00 ff ff ff e2 w2 = -30
   0x3ff7fdcdada:       llilf   %r3,0xffffffe2
;        7:       b4 30 00 00 ff ff ff 38 w3 = -200
   0x3ff7fdcdae0:       llilf   %r4,0xffffff38
;       8:       b7 40 00 00 ff ff fc 18 r4 = -1000
   0x3ff7fdcdae6:       lgfi    %r5,-1000
   0x3ff7fdcdaec:       mvc     64(4,%r15),160(%r15)
   0x3ff7fdcdaf2:       lgrl    %r1,bpf_kfunc_call_test4@GOT
   0x3ff7fdcdaf8:       brasl   %r14,__s390_indirect_jump_r1

The first 3 llilfs are 32-bit loads that need to be sign-extended
to 64 bits.

> > +bool bpf_jit_supports_kfunc_call(void)
> > +{
> > +       return true;
> > +}
> 
> Timely :) Thanks for working it.


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH bpf-next 08/24] selftests/bpf: Fix verify_pkcs7_sig on s390x
  2023-01-26  1:06   ` Andrii Nakryiko
@ 2023-01-27 12:36     ` Ilya Leoshkevich
  2023-01-27 17:26       ` Andrii Nakryiko
  0 siblings, 1 reply; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-27 12:36 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Wed, 2023-01-25 at 17:06 -0800, Andrii Nakryiko wrote:
> On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > 
> > Use bpf_probe_read_kernel() instead of bpf_probe_read(), which is
> > not
> > defined on all architectures.
> > 
> > While at it, improve the error handling: do not hide the verifier
> > log,
> > and check the return values of bpf_probe_read_kernel() and
> > bpf_copy_from_user().
> > 
> > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > ---
> >  .../selftests/bpf/prog_tests/verify_pkcs7_sig.c      |  9
> > +++++++++
> >  .../selftests/bpf/progs/test_verify_pkcs7_sig.c      | 12
> > ++++++++----
> >  2 files changed, 17 insertions(+), 4 deletions(-)
> > 
> > diff --git
> > a/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
> > b/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
> > index 579d6ee83ce0..75c256f79f85 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
> > @@ -56,11 +56,17 @@ struct data {
> >         __u32 sig_len;
> >  };
> > 
> > +static char libbpf_log[8192];
> >  static bool kfunc_not_supported;
> > 
> >  static int libbpf_print_cb(enum libbpf_print_level level, const
> > char *fmt,
> >                            va_list args)
> >  {
> > +       size_t log_len = strlen(libbpf_log);
> > +
> > +       vsnprintf(libbpf_log + log_len, sizeof(libbpf_log) -
> > log_len,
> > +                 fmt, args);
> 
> it seems like test is written to assume that load might fail and
> we'll
> get error messages, so not sure it's that useful to print out these
> errors. But at the very least we should filter out DEBUG and INFO
> level messages, and pass through WARN only.
> 
> Also, there is no point in having a separate log buffer, just printf
> directly. test_progs will take care to collect overall log and ignore
> it if test succeeds, or emit it if test fails

Thanks, I completely overlooked the fact that the test framework
already hides the output in case of success. With that in mind I can do
just this:

--- a/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
+++ b/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
@@ -61,6 +61,9 @@ static bool kfunc_not_supported;
 static int libbpf_print_cb(enum libbpf_print_level level, const char *fmt,
                           va_list args)
 {
+       if (level == LIBBPF_WARN)
+               vprintf(fmt, args);
+
        if (strcmp(fmt, "libbpf: extern (func ksym) '%s': not found in kernel or module BTFs\n"))
                return 0;
 
If the load fails due to missing kfuncs, we'll skip the test - I think
in this case the output won't be printed either, so we should be fine.

> > +
> >         if (strcmp(fmt, "libbpf: extern (func ksym) '%s': not found in kernel or module BTFs\n"))
> >                 return 0;
> > 
> > @@ -277,6 +283,7 @@ void test_verify_pkcs7_sig(void)
> >         if (!ASSERT_OK_PTR(skel, "test_verify_pkcs7_sig__open"))
> >                 goto close_prog;
> > 
> > +       libbpf_log[0] = 0;
> >         old_print_cb = libbpf_set_print(libbpf_print_cb);
> >         ret = test_verify_pkcs7_sig__load(skel);
> >         libbpf_set_print(old_print_cb);
> > @@ -289,6 +296,8 @@ void test_verify_pkcs7_sig(void)
> >                 goto close_prog;
> >         }
> > 
> > +       printf("%s", libbpf_log);
> > +
> >         if (!ASSERT_OK(ret, "test_verify_pkcs7_sig__load"))
> >                 goto close_prog;
> > 
> > diff --git
> > a/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
> > b/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
> > index ce419304ff1f..7748cc23de8a 100644
> > --- a/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
> > +++ b/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
> > @@ -59,10 +59,14 @@ int BPF_PROG(bpf, int cmd, union bpf_attr
> > *attr, unsigned int size)
> >         if (!data_val)
> >                 return 0;
> > 
> > -       bpf_probe_read(&value, sizeof(value), &attr->value);
> > -
> > -       bpf_copy_from_user(data_val, sizeof(struct data),
> > -                          (void *)(unsigned long)value);
> > +       ret = bpf_probe_read_kernel(&value, sizeof(value), &attr->value);
> > +       if (ret)
> > +               return ret;
> > +
> > +       ret = bpf_copy_from_user(data_val, sizeof(struct data),
> > +                                (void *)(unsigned long)value);
> > +       if (ret)
> > +               return ret;
> 
> this part looks good, we shouldn't use bpf_probe_read.
> 
> You'll have to update progs/profiler.inc.h as well, btw, which still
> uses bpf_probe_read() and bpf_probe_read_str.

I remember trying this, but there were still failures due to, as I
thought back then, usage of BPF_CORE_READ() and the lack of
BPF_CORE_READ_KERNEL(). But this seems to be a generic issue. Let me
try again and post my findings as a reply to 0/24.

> >         if (data_val->data_len > sizeof(data_val->data))
> >                 return -EINVAL;
> > --
> > 2.39.1
> > 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH bpf-next 24/24] s390/bpf: Implement bpf_jit_supports_kfunc_call()
  2023-01-27 11:36     ` Ilya Leoshkevich
@ 2023-01-27 16:04       ` Alexei Starovoitov
  0 siblings, 0 replies; 48+ messages in thread
From: Alexei Starovoitov @ 2023-01-27 16:04 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Fri, Jan 27, 2023 at 3:36 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> On Wed, 2023-01-25 at 17:28 -0800, Alexei Starovoitov wrote:
> > On Wed, Jan 25, 2023 at 10:38:17PM +0100, Ilya Leoshkevich wrote:
> > > +
> > > +               /* Sign-extend the kfunc arguments. */
> > > +               if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
> > > +                       m = bpf_jit_find_kfunc_model(fp, insn);
> > > +                       if (!m)
> > > +                               return -1;
> > > +
> > > +                       for (j = 0; j < m->nr_args; j++) {
> > > +                               if (sign_extend(jit, BPF_REG_1 + j,
> > > +                                               m->arg_size[j],
> > > +                                               m->arg_flags[j]))
> > > +                                       return -1;
> > > +                       }
> > > +               }
> >
> > Is this because s390 doesn't have subregisters?
> > Could you give an example where it's necessary?
> > I'm guessing a bpf prog compiled with alu32 and operates on signed
> > int
> > that is passed into a kfunc that expects 'int' in 64-bit reg?
>
> Precisely. The test added in 13/24 fails without this:
>
> verify_success:PASS:skel 0 nsec
> verify_success:PASS:bpf_object__find_program_by_name 0 nsec
> verify_success:PASS:kfunc_call_test4 0 nsec
> verify_success:FAIL:retval unexpected retval: actual 4294966065 != expected -1234
> #94/10   kfunc_call/kfunc_call_test4:FAIL
>
> Looking at the assembly:
>
> ; long noinline bpf_kfunc_call_test4(signed char a, short b, int c, long d)
> 0000000000936a78 <bpf_kfunc_call_test4>:
>   936a78:       c0 04 00 00 00 00       jgnop   936a78 <bpf_kfunc_call_test4>
> ;       return (long)a + (long)b + (long)c + d;
>   936a7e:       b9 08 00 45             agr     %r4,%r5
>   936a82:       b9 08 00 43             agr     %r4,%r3
>   936a86:       b9 08 00 24             agr     %r2,%r4
>   936a8a:       c0 f4 00 1e 3b 27       jg      cfe0d8 <__s390_indirect_jump_r14>
>
> As per the s390x ABI, bpf_kfunc_call_test4() has the right to assume
> that a, b and c are sign-extended by the caller, which results in using
> 64-bit additions (agr) without any additional conversions.
>
> On the JITed code side (without this hunk) we have:
>
> ; tmp = bpf_kfunc_call_test4(-3, -30, -200, -1000);
> ;        5:       b4 10 00 00 ff ff ff fd w1 = -3
>    0x3ff7fdcdad4:       llilf   %r2,0xfffffffd
> ;        6:       b4 20 00 00 ff ff ff e2 w2 = -30
>    0x3ff7fdcdada:       llilf   %r3,0xffffffe2
> ;        7:       b4 30 00 00 ff ff ff 38 w3 = -200
>    0x3ff7fdcdae0:       llilf   %r4,0xffffff38
> ;       8:       b7 40 00 00 ff ff fc 18 r4 = -1000
>    0x3ff7fdcdae6:       lgfi    %r5,-1000
>    0x3ff7fdcdaec:       mvc     64(4,%r15),160(%r15)
>    0x3ff7fdcdaf2:       lgrl    %r1,bpf_kfunc_call_test4@GOT
>    0x3ff7fdcdaf8:       brasl   %r14,__s390_indirect_jump_r1
>
> The first 3 llilfs are 32-bit loads that need to be sign-extended
> to 64 bits.

All makes sense. Please add this explanation to the commit log.
When a 2nd arch appears that needs similar sign extension in
the caller we can add this logic to the verifier.

In parallel we're working on new sign extension instructions.
Doing sign extension with shifts in the verifier won't be efficient,
so we need to wait for them.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH bpf-next 00/24] Support bpf trampoline for s390x
  2023-01-26  0:45 ` [PATCH bpf-next 00/24] Support bpf trampoline for s390x Andrii Nakryiko
@ 2023-01-27 16:51   ` Ilya Leoshkevich
  2023-01-27 17:24     ` Andrii Nakryiko
  0 siblings, 1 reply; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-27 16:51 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Wed, 2023-01-25 at 16:45 -0800, Andrii Nakryiko wrote:
> On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > 
> > Hi,
> > 
> > This series implements poke, trampoline, kfunc, mixing subprogs and
> > tailcalls, and fixes a number of tests on s390x.
> > 
> > The following failures still remain:
> > 
> > #52      core_read_macros:FAIL
> > Uses BPF_PROBE_READ(), shouldn't there be BPF_PROBE_READ_KERNEL()?
> 
> BPF_PROBE_READ(), similarly to BPF_CORE_READ() both use
> bpf_probe_read_kernel() internally, as it's most common use case. We
> have separate BPF_PROBE_READ_USER() and BPF_CORE_READ_USER() for when
> bpf_probe_read_user() has to be used.

At least purely from the code perspective, BPF_PROBE_READ() seems to
delegate to bpf_probe_read(). The following therefore helps with this
test:

--- a/tools/lib/bpf/bpf_core_read.h
+++ b/tools/lib/bpf/bpf_core_read.h
@@ -364,7 +364,7 @@ enum bpf_enum_value_kind {
 
 /* Non-CO-RE variant of BPF_CORE_READ_INTO() */
 #define BPF_PROBE_READ_INTO(dst, src, a, ...) ({                      \
-       ___core_read(bpf_probe_read, bpf_probe_read,                   \
+       ___core_read(bpf_probe_read_kernel, bpf_probe_read_kernel,     \
                     dst, (src), a, ##__VA_ARGS__)                     \
 })
 
@@ -400,7 +400,7 @@ enum bpf_enum_value_kind {
 
 /* Non-CO-RE variant of BPF_CORE_READ_STR_INTO() */
 #define BPF_PROBE_READ_STR_INTO(dst, src, a, ...) ({                  \
-       ___core_read(bpf_probe_read_str, bpf_probe_read,               \
+       ___core_read(bpf_probe_read_kernel_str, bpf_probe_read_kernel, \
                     dst, (src), a, ##__VA_ARGS__)                     \
 })

but I'm not sure if there are backward compatibility concerns, or if
libbpf is supposed to rewrite this when
!ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE.

> > 
> > #82      get_stack_raw_tp:FAIL
> > get_stack_print_output:FAIL:user_stack corrupted user stack
> > Known issue:
> > We cannot reliably unwind userspace on s390x without DWARF.
> 
> like in principle, or frame pointers (or some equivalent) needs to be
> configured for this to work?
> 
> Asking also in the context of [0], where s390x was excluded. If there
> is actually a way to enable frame pointer-based stack unwinding on
> s390x, would be nice to hear from an expert.
> 
>   [0] https://pagure.io/fesco/issue/2923

For DWARFless unwinding we have -mbackchain (not to be confused with
-fno-omit-frame-pointer, which we also have, but which just hurts
performance without providing tangible benefits).
-mbackchain has a few problems though:

- It's not atomic. Here is a typical prologue with -mbackchain:

        1: stmg    %r11,%r15,88(%r15)  # save non-volatile registers
        2: lgr     %r14,%r15           # %r14 = sp
        3: lay     %r15,-160(%r15)     # sp -= 160
        4: stg     %r14,0(%r15)        # *(void **)sp = %r14

  The invariant here is that *(void **)%r15 is always a pointer to the
  next frame. This means that if we unwind from (4), we are totally
  broken. This does not happen if we unwind from any other instruction,
  but still.

- Unwinding from (1)-(3) is not particularly good either. PSW points to
  the callee, but R15 points to the caller's frame.

- Unwinding leaf functions is like the previous problem, but worse:
  they often do not establish a stack frame at all, so PSW and R15 are
  out of sync for the entire duration of the call.

Therefore .eh_frame-based unwinding is preferred, since it covers all
these corner cases and is therefore reliable. From what I understand,
adding .eh_frame unwinding to the kernel is not desirable. In an
internal discussion we had another idea though: would it be possible to
have an eBPF program that does .eh_frame parsing and unwinding? I
understand that it can be technically challenging at the moment, but
the end result would not be exploitable by crafting malicious
.eh_frame sections, won't loop endlessly, will have good performance,
etc.

> > #101     ksyms_module:FAIL
> > address of kernel function bpf_testmod_test_mod_kfunc is out of
> > range
> > Known issue:
> > Kernel and modules are too far away from each other on s390x.
> > 
> > #167     sk_assign:FAIL
> > Uses legacy map definitions in 'maps' section.
> 
> Hm.. assuming new enough iproute2, new-style .maps definition should
> be supported, right? Let's convert map definition?

Yep, that worked. Will include in v2.

> > #190     stacktrace_build_id:FAIL
> > Known issue:
> > We cannot reliably unwind userspace on s390x without DWARF.
> > 
> > #211     test_bpffs:FAIL
> > iterators.bpf.c is broken on s390x, it uses BPF_CORE_READ(),
> > shouldn't
> > there be BPF_CORE_READ_KERNEL()?
> 
> BPF_CORE_READ() is that, so must be something else
> 
> > 
> > #218     test_profiler:FAIL
> > A lot of BPF_PROBE_READ() usages.
> 
> ditto, something else
> 
> > 
> > #281     xdp_metadata:FAIL
> > See patch 24.
> > 
> > #284     xdp_synproxy:FAIL
> > Verifier error:
> > ; value = bpf_tcp_raw_gen_syncookie_ipv4(hdr->ipv4, hdr->tcp,
> > 281: (79) r1 = *(u64 *)(r10 -80)      ; R1_w=pkt(off=14,r=74,imm=0)
> > R10=fp0
> > 282: (bf) r2 = r8                     ;
> > R2_w=pkt(id=5,off=14,r=74,umax=60,var_off=(0x0; 0x3c))
> > R8=pkt(id=5,off=14,r=74,umax=60,var_off=(0x0; 0x3c))
> > 283: (79) r3 = *(u64 *)(r10 -104)     ;
> > R3_w=scalar(umax=60,var_off=(0x0; 0x3c)) R10=fp0
> > 284: (85) call bpf_tcp_raw_gen_syncookie_ipv4#204
> > invalid access to packet, off=14 size=0, R2(id=5,off=14,r=74)
> > R2 offset is outside of the packet
> 
> third arg to bpf_tcp_raw_gen_syncookie_ipv4() is defined as
> ARG_CONST_SIZE, so is required to be strictly positive, which doesn't
> seem to be "proven" to the verifier. Please provide a bigger log, it might
> be another instance of needing to sprinkle barrier_var() around.
> 
> And maybe thinking about using ARG_CONST_SIZE_OR_ZERO instead of
> ARG_CONST_SIZE.

Here is the full log:

https://gist.github.com/iii-i/8e20100c33ab6f0dffb5e6e51d1330e8

Apparently we do indeed lose a constraint established by
if (hdr->tcp_len < sizeof(*hdr->tcp)). But the naive

--- a/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c
+++ b/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c
@@ -401,6 +401,7 @@ static __always_inline int tcp_dissect(void *data, void *data_end,
        hdr->tcp_len = hdr->tcp->doff * 4;
        if (hdr->tcp_len < sizeof(*hdr->tcp))
                return XDP_DROP;
+       barrier_var(hdr->tcp_len);
 
        return XDP_TX;
 }
@@ -791,6 +792,7 @@ static __always_inline int syncookie_part2(void *ctx, void *data, void *data_end
        hdr->tcp_len = hdr->tcp->doff * 4;
        if (hdr->tcp_len < sizeof(*hdr->tcp))
                return XDP_ABORTED;
+       barrier_var(hdr->tcp_len);
 
        return hdr->tcp->syn ? syncookie_handle_syn(hdr, ctx, data, data_end, xdp) :
                               syncookie_handle_ack(hdr);

does not help.

> 
> > 
> > None of these seem to be due to the new changes.
> > 
> > Best regards,
> > Ilya
> > 
> > Ilya Leoshkevich (24):
> >   selftests/bpf: Fix liburandom_read.so linker error
> >   selftests/bpf: Fix symlink creation error
> >   selftests/bpf: Fix fexit_stress on s390x
> >   selftests/bpf: Fix trampoline_count on s390x
> >   selftests/bpf: Fix kfree_skb on s390x
> >   selftests/bpf: Set errno when urand_spawn() fails
> >   selftests/bpf: Fix decap_sanity_ns cleanup
> >   selftests/bpf: Fix verify_pkcs7_sig on s390x
> >   selftests/bpf: Fix xdp_do_redirect on s390x
> >   selftests/bpf: Fix cgrp_local_storage on s390x
> >   selftests/bpf: Check stack_mprotect() return value
> >   selftests/bpf: Increase SIZEOF_BPF_LOCAL_STORAGE_ELEM on s390x
> >   selftests/bpf: Add a sign-extension test for kfuncs
> >   selftests/bpf: Fix test_lsm on s390x
> >   selftests/bpf: Fix test_xdp_adjust_tail_grow2 on s390x
> >   selftests/bpf: Fix vmlinux test on s390x
> >   libbpf: Read usdt arg spec with bpf_probe_read_kernel()
> >   s390/bpf: Fix a typo in a comment
> >   s390/bpf: Add expoline to tail calls
> >   s390/bpf: Implement bpf_arch_text_poke()
> >   bpf: btf: Add BTF_FMODEL_SIGNED_ARG flag
> >   s390/bpf: Implement arch_prepare_bpf_trampoline()
> >   s390/bpf: Implement bpf_jit_supports_subprog_tailcalls()
> >   s390/bpf: Implement bpf_jit_supports_kfunc_call()
> > 
> >  arch/s390/net/bpf_jit_comp.c                  | 708
> > +++++++++++++++++-
> >  include/linux/bpf.h                           |   8 +
> >  include/linux/btf.h                           |  15 +-
> >  kernel/bpf/btf.c                              |  16 +-
> >  net/bpf/test_run.c                            |   9 +
> >  tools/lib/bpf/usdt.bpf.h                      |  33 +-
> >  tools/testing/selftests/bpf/Makefile          |   7 +-
> >  tools/testing/selftests/bpf/netcnt_common.h   |   6 +-
> >  .../selftests/bpf/prog_tests/bpf_cookie.c     |   6 +-
> >  .../bpf/prog_tests/cgrp_local_storage.c       |   2 +-
> >  .../selftests/bpf/prog_tests/decap_sanity.c   |   2 +-
> >  .../selftests/bpf/prog_tests/fexit_stress.c   |   6 +-
> >  .../selftests/bpf/prog_tests/kfree_skb.c      |   2 +-
> >  .../selftests/bpf/prog_tests/kfunc_call.c     |   1 +
> >  .../selftests/bpf/prog_tests/test_lsm.c       |   3 +-
> >  .../bpf/prog_tests/trampoline_count.c         |   4 +
> >  tools/testing/selftests/bpf/prog_tests/usdt.c |   1 +
> >  .../bpf/prog_tests/verify_pkcs7_sig.c         |   9 +
> >  .../bpf/prog_tests/xdp_adjust_tail.c          |   7 +-
> >  .../bpf/prog_tests/xdp_do_redirect.c          |   4 +
> >  .../selftests/bpf/progs/kfunc_call_test.c     |  18 +
> >  tools/testing/selftests/bpf/progs/lsm.c       |   7 +-
> >  .../bpf/progs/test_verify_pkcs7_sig.c         |  12 +-
> >  .../selftests/bpf/progs/test_vmlinux.c        |   4 +-
> >  .../bpf/progs/test_xdp_adjust_tail_grow.c     |   8 +-
> >  25 files changed, 816 insertions(+), 82 deletions(-)
> > 
> > --
> > 2.39.1
> > 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH bpf-next 00/24] Support bpf trampoline for s390x
  2023-01-27 16:51   ` Ilya Leoshkevich
@ 2023-01-27 17:24     ` Andrii Nakryiko
  2023-01-27 22:50       ` Ilya Leoshkevich
  0 siblings, 1 reply; 48+ messages in thread
From: Andrii Nakryiko @ 2023-01-27 17:24 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Fri, Jan 27, 2023 at 8:51 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> On Wed, 2023-01-25 at 16:45 -0800, Andrii Nakryiko wrote:
> > On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@linux.ibm.com>
> > wrote:
> > >
> > > Hi,
> > >
> > > This series implements poke, trampoline, kfunc, mixing subprogs and
> > > tailcalls, and fixes a number of tests on s390x.
> > >
> > > The following failures still remain:
> > >
> > > #52      core_read_macros:FAIL
> > > Uses BPF_PROBE_READ(), shouldn't there be BPF_PROBE_READ_KERNEL()?
> >
> > BPF_PROBE_READ(), similarly to BPF_CORE_READ() both use
> > bpf_probe_read_kernel() internally, as it's most common use case. We
> > have separate BPF_PROBE_READ_USER() and BPF_CORE_READ_USER() for when
> > bpf_probe_read_user() has to be used.
>
> At least purely from the code perspective, BPF_PROBE_READ() seems to
> delegate to bpf_probe_read(). The following therefore helps with this
> test:
>
> --- a/tools/lib/bpf/bpf_core_read.h
> +++ b/tools/lib/bpf/bpf_core_read.h
> @@ -364,7 +364,7 @@ enum bpf_enum_value_kind {
>
>  /* Non-CO-RE variant of BPF_CORE_READ_INTO() */
>  #define BPF_PROBE_READ_INTO(dst, src, a, ...) ({                    \
> -       ___core_read(bpf_probe_read, bpf_probe_read,                 \
> +       ___core_read(bpf_probe_read_kernel, bpf_probe_read_kernel,   \
>                      dst, (src), a, ##__VA_ARGS__)                   \
>  })
>
> @@ -400,7 +400,7 @@ enum bpf_enum_value_kind {
>
>  /* Non-CO-RE variant of BPF_CORE_READ_STR_INTO() */
>  #define BPF_PROBE_READ_STR_INTO(dst, src, a, ...) ({                \
> -       ___core_read(bpf_probe_read_str, bpf_probe_read,             \
> +       ___core_read(bpf_probe_read_kernel_str, bpf_probe_read_kernel, \
>                      dst, (src), a, ##__VA_ARGS__)                   \
>  })
>
> but I'm not sure if there are backward compatibility concerns, or if
> libbpf is supposed to rewrite this when
> !ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE.

Oh, this is just a bug, an omission from when we converted
BPF_CORE_READ to bpf_probe_read_kernel. If you look at the
bpf_core_read definition, it uses bpf_probe_read_kernel, which is also
the intent for BPF_PROBE_READ. Let's fix this.

And there are no backwards-compat concerns, because libbpf will
automatically convert calls to bpf_probe_read_{kernel,user}[_str] to
bpf_probe_read[_str] if the host kernel doesn't yet support the
kernel- or user-specific variants.

>
> > >
> > > #82      get_stack_raw_tp:FAIL
> > > get_stack_print_output:FAIL:user_stack corrupted user stack
> > > Known issue:
> > > We cannot reliably unwind userspace on s390x without DWARF.
> >
> > like in principle, or frame pointers (or some equivalent) needs to be
> > configured for this to work?
> >
> > Asking also in the context of [0], where s390x was excluded. If there
> > is actually a way to enable frame pointer-based stack unwinding on
> > s390x, would be nice to hear from an expert.
> >
> >   [0] https://pagure.io/fesco/issue/2923
>
> For DWARFless unwinding we have -mbackchain (not to be confused with
> -fno-omit-frame-pointer, which we also have, but which just hurts
> performance without providing tangible benefits).
> -mbackchain has a few problems though:
>
> - It's not atomic. Here is a typical prologue with -mbackchain:
>
>         1: stmg    %r11,%r15,88(%r15)  # save non-volatile registers
>         2: lgr     %r14,%r15           # %r14 = sp
>         3: lay     %r15,-160(%r15)     # sp -= 160
>         4: stg     %r14,0(%r15)        # *(void **)sp = %r14
>
>   The invariant here is that *(void **)%r15 is always a pointer to the
>   next frame. This means that if we unwind from (4), we are totally
>   broken. This does not happen if we unwind from any other instruction,
>   but still.
>
> - Unwinding from (1)-(3) is not particularly good either. PSW points to
>   the callee, but R15 points to the caller's frame.
>
> - Unwinding leaf functions is like the previous problem, but worse:
>   they often do not establish a stack frame at all, so PSW and R15 are
>   out of sync for the entire duration of the call.
>
> Therefore .eh_frame-based unwinding is preferred, since it covers all
> these corner cases and is therefore reliable. From what I understand,
> adding .eh_frame unwinding to the kernel is not desirable. In an
> internal discussion we had another idea though: would it be possible to
> have an eBPF program that does .eh_frame parsing and unwinding? I
> understand that it can be technically challenging at the moment, but
> the end result would not be exploitable by crafting malicious
> .eh_frame sections, won't loop endlessly, will have good performance,
> etc.

Thanks for the details. This was all discussed at length in the Fedora
-fno-omit-frame-pointer discussion I linked above, so there is no real
need to go over it again. .eh_frame-based unwinding on the BPF side is
possible, but only for processes that you knew about (and preprocessed)
before you started the profiling session. Pre-processing is a memory-
and CPU-intensive operation on busy systems, and it will miss any
processes started during profiling. So as a general approach for
system-wide profiling it leaves a lot to be desired.

Should we enable -mbackchain in our CI for s390x to be able to capture
stack traces (even if on some instructions they might be incomplete or
outright broken)?

>
> > > #101     ksyms_module:FAIL
> > > address of kernel function bpf_testmod_test_mod_kfunc is out of
> > > range
> > > Known issue:
> > > Kernel and modules are too far away from each other on s390x.
> > >
> > > #167     sk_assign:FAIL
> > > Uses legacy map definitions in 'maps' section.
> >
> > Hm.. assuming new enough iproute2, new-style .maps definition should
> > be supported, right? Let's convert map definition?
>
> Yep, that worked. Will include in v2.

Nice.

>
> > > #190     stacktrace_build_id:FAIL
> > > Known issue:
> > > We cannot reliably unwind userspace on s390x without DWARF.
> > >
> > > #211     test_bpffs:FAIL
> > > iterators.bpf.c is broken on s390x, it uses BPF_CORE_READ(),
> > > shouldn't
> > > there be BPF_CORE_READ_KERNEL()?
> >
> > BPF_CORE_READ() is that, so must be something else
> >
> > >
> > > #218     test_profiler:FAIL
> > > A lot of BPF_PROBE_READ() usages.
> >
> > ditto, something else
> >
> > >
> > > #281     xdp_metadata:FAIL
> > > See patch 24.
> > >
> > > #284     xdp_synproxy:FAIL
> > > Verifier error:
> > > ; value = bpf_tcp_raw_gen_syncookie_ipv4(hdr->ipv4, hdr->tcp,
> > > 281: (79) r1 = *(u64 *)(r10 -80)      ; R1_w=pkt(off=14,r=74,imm=0) R10=fp0
> > > 282: (bf) r2 = r8                     ; R2_w=pkt(id=5,off=14,r=74,umax=60,var_off=(0x0; 0x3c)) R8=pkt(id=5,off=14,r=74,umax=60,var_off=(0x0; 0x3c))
> > > 283: (79) r3 = *(u64 *)(r10 -104)     ; R3_w=scalar(umax=60,var_off=(0x0; 0x3c)) R10=fp0
> > > 284: (85) call bpf_tcp_raw_gen_syncookie_ipv4#204
> > > invalid access to packet, off=14 size=0, R2(id=5,off=14,r=74)
> > > R2 offset is outside of the packet
> >
> > The third arg to bpf_tcp_raw_gen_syncookie_ipv4() is defined as
> > ARG_CONST_SIZE, so it is required to be strictly positive, which
> > doesn't seem to be "proven" to the verifier. Please provide a bigger
> > log; it might be another instance of needing to sprinkle
> > barrier_var() around.
> >
> > And maybe think about using ARG_CONST_SIZE_OR_ZERO instead of
> > ARG_CONST_SIZE.
>
> Here is the full log:
>
> https://gist.github.com/iii-i/8e20100c33ab6f0dffb5e6e51d1330e8
>
> Apparently we do indeed lose a constraint established by
> if (hdr->tcp_len < sizeof(*hdr->tcp)). But the naive

The test is too big and unfamiliar for me to figure this out. And the
problem is not the upper bound, but the lower bound: hdr->tcp_len is
not proven to be strictly greater than zero, which is what the
verifier complains about. Not sure how it works on other arches right
now.


But I see that bpf_csum_diff defines its size arguments as
ARG_CONST_SIZE_OR_ZERO while bpf_tcp_raw_gen_syncookie_ipv4 has
ARG_CONST_SIZE. I have generally found ARG_CONST_SIZE way too
problematic in practice; I'd say we should change it to
ARG_CONST_SIZE_OR_ZERO.

>
> --- a/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c
> +++ b/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c
> @@ -401,6 +401,7 @@ static __always_inline int tcp_dissect(void *data, void *data_end,
>         hdr->tcp_len = hdr->tcp->doff * 4;
>         if (hdr->tcp_len < sizeof(*hdr->tcp))
>                 return XDP_DROP;
> +       barrier_var(hdr->tcp_len);
>
>         return XDP_TX;
>  }
> @@ -791,6 +792,7 @@ static __always_inline int syncookie_part2(void *ctx, void *data, void *data_end
>         hdr->tcp_len = hdr->tcp->doff * 4;
>         if (hdr->tcp_len < sizeof(*hdr->tcp))
>                 return XDP_ABORTED;
> +       barrier_var(hdr->tcp_len);
>
>         return hdr->tcp->syn ? syncookie_handle_syn(hdr, ctx, data, data_end, xdp) :
>                                syncookie_handle_ack(hdr);
>
> does not help.
>
> >
> > >
> > > None of these seem to be due to the new changes.
> > >
> > > Best regards,
> > > Ilya
> > >
> > > Ilya Leoshkevich (24):
> > >   selftests/bpf: Fix liburandom_read.so linker error
> > >   selftests/bpf: Fix symlink creation error
> > >   selftests/bpf: Fix fexit_stress on s390x
> > >   selftests/bpf: Fix trampoline_count on s390x
> > >   selftests/bpf: Fix kfree_skb on s390x
> > >   selftests/bpf: Set errno when urand_spawn() fails
> > >   selftests/bpf: Fix decap_sanity_ns cleanup
> > >   selftests/bpf: Fix verify_pkcs7_sig on s390x
> > >   selftests/bpf: Fix xdp_do_redirect on s390x
> > >   selftests/bpf: Fix cgrp_local_storage on s390x
> > >   selftests/bpf: Check stack_mprotect() return value
> > >   selftests/bpf: Increase SIZEOF_BPF_LOCAL_STORAGE_ELEM on s390x
> > >   selftests/bpf: Add a sign-extension test for kfuncs
> > >   selftests/bpf: Fix test_lsm on s390x
> > >   selftests/bpf: Fix test_xdp_adjust_tail_grow2 on s390x
> > >   selftests/bpf: Fix vmlinux test on s390x
> > >   libbpf: Read usdt arg spec with bpf_probe_read_kernel()
> > >   s390/bpf: Fix a typo in a comment
> > >   s390/bpf: Add expoline to tail calls
> > >   s390/bpf: Implement bpf_arch_text_poke()
> > >   bpf: btf: Add BTF_FMODEL_SIGNED_ARG flag
> > >   s390/bpf: Implement arch_prepare_bpf_trampoline()
> > >   s390/bpf: Implement bpf_jit_supports_subprog_tailcalls()
> > >   s390/bpf: Implement bpf_jit_supports_kfunc_call()
> > >
> > >  arch/s390/net/bpf_jit_comp.c                  | 708 +++++++++++++++++-
> > >  include/linux/bpf.h                           |   8 +
> > >  include/linux/btf.h                           |  15 +-
> > >  kernel/bpf/btf.c                              |  16 +-
> > >  net/bpf/test_run.c                            |   9 +
> > >  tools/lib/bpf/usdt.bpf.h                      |  33 +-
> > >  tools/testing/selftests/bpf/Makefile          |   7 +-
> > >  tools/testing/selftests/bpf/netcnt_common.h   |   6 +-
> > >  .../selftests/bpf/prog_tests/bpf_cookie.c     |   6 +-
> > >  .../bpf/prog_tests/cgrp_local_storage.c       |   2 +-
> > >  .../selftests/bpf/prog_tests/decap_sanity.c   |   2 +-
> > >  .../selftests/bpf/prog_tests/fexit_stress.c   |   6 +-
> > >  .../selftests/bpf/prog_tests/kfree_skb.c      |   2 +-
> > >  .../selftests/bpf/prog_tests/kfunc_call.c     |   1 +
> > >  .../selftests/bpf/prog_tests/test_lsm.c       |   3 +-
> > >  .../bpf/prog_tests/trampoline_count.c         |   4 +
> > >  tools/testing/selftests/bpf/prog_tests/usdt.c |   1 +
> > >  .../bpf/prog_tests/verify_pkcs7_sig.c         |   9 +
> > >  .../bpf/prog_tests/xdp_adjust_tail.c          |   7 +-
> > >  .../bpf/prog_tests/xdp_do_redirect.c          |   4 +
> > >  .../selftests/bpf/progs/kfunc_call_test.c     |  18 +
> > >  tools/testing/selftests/bpf/progs/lsm.c       |   7 +-
> > >  .../bpf/progs/test_verify_pkcs7_sig.c         |  12 +-
> > >  .../selftests/bpf/progs/test_vmlinux.c        |   4 +-
> > >  .../bpf/progs/test_xdp_adjust_tail_grow.c     |   8 +-
> > >  25 files changed, 816 insertions(+), 82 deletions(-)
> > >
> > > --
> > > 2.39.1
> > >
>


* Re: [PATCH bpf-next 08/24] selftests/bpf: Fix verify_pkcs7_sig on s390x
  2023-01-27 12:36     ` Ilya Leoshkevich
@ 2023-01-27 17:26       ` Andrii Nakryiko
  0 siblings, 0 replies; 48+ messages in thread
From: Andrii Nakryiko @ 2023-01-27 17:26 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Fri, Jan 27, 2023 at 4:36 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> On Wed, 2023-01-25 at 17:06 -0800, Andrii Nakryiko wrote:
> > On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@linux.ibm.com>
> > wrote:
> > >
> > > Use bpf_probe_read_kernel() instead of bpf_probe_read(), which is
> > > not
> > > defined on all architectures.
> > >
> > > While at it, improve the error handling: do not hide the verifier
> > > log,
> > > and check the return values of bpf_probe_read_kernel() and
> > > bpf_copy_from_user().
> > >
> > > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > > ---
> > >  .../selftests/bpf/prog_tests/verify_pkcs7_sig.c      |  9 +++++++++
> > >  .../selftests/bpf/progs/test_verify_pkcs7_sig.c      | 12 ++++++++----
> > >  2 files changed, 17 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c b/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
> > > index 579d6ee83ce0..75c256f79f85 100644
> > > --- a/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
> > > +++ b/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
> > > @@ -56,11 +56,17 @@ struct data {
> > >         __u32 sig_len;
> > >  };
> > >
> > > +static char libbpf_log[8192];
> > >  static bool kfunc_not_supported;
> > >
> > >  static int libbpf_print_cb(enum libbpf_print_level level, const char *fmt,
> > >                            va_list args)
> > >  {
> > > +       size_t log_len = strlen(libbpf_log);
> > > +
> > > +       vsnprintf(libbpf_log + log_len, sizeof(libbpf_log) - log_len,
> > > +                 fmt, args);
> >
> > it seems like test is written to assume that load might fail and
> > we'll
> > get error messages, so not sure it's that useful to print out these
> > errors. But at the very least we should filter out DEBUG and INFO
> > level messages, and pass through WARN only.
> >
> > Also, there is no point in having a separate log buffer, just printf
> > directly. test_progs will take care to collect overall log and ignore
> > it if test succeeds, or emit it if test fails
>
> Thanks, I completely overlooked the fact that the test framework
> already hides the output in case of success. With that in mind I can do
> just this:
>
> --- a/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
> +++ b/tools/testing/selftests/bpf/prog_tests/verify_pkcs7_sig.c
> @@ -61,6 +61,9 @@ static bool kfunc_not_supported;
>  static int libbpf_print_cb(enum libbpf_print_level level, const char *fmt,
>                            va_list args)
>  {
> +       if (level == LIBBPF_WARN)
> +               vprintf(fmt, args);
> +
>         if (strcmp(fmt, "libbpf: extern (func ksym) '%s': not found in kernel or module BTFs\n"))
>                 return 0;
>
> If the load fails due to missing kfuncs, we'll skip the test - I think
> in this case the output won't be printed either, so we should be fine.

sgtm

>
> > > +
> > >         if (strcmp(fmt, "libbpf: extern (func ksym) '%s': not found in kernel or module BTFs\n"))
> > >                 return 0;
> > >
> > > @@ -277,6 +283,7 @@ void test_verify_pkcs7_sig(void)
> > >         if (!ASSERT_OK_PTR(skel, "test_verify_pkcs7_sig__open"))
> > >                 goto close_prog;
> > >
> > > +       libbpf_log[0] = 0;
> > >         old_print_cb = libbpf_set_print(libbpf_print_cb);
> > >         ret = test_verify_pkcs7_sig__load(skel);
> > >         libbpf_set_print(old_print_cb);
> > > @@ -289,6 +296,8 @@ void test_verify_pkcs7_sig(void)
> > >                 goto close_prog;
> > >         }
> > >
> > > +       printf("%s", libbpf_log);
> > > +
> > >         if (!ASSERT_OK(ret, "test_verify_pkcs7_sig__load"))
> > >                 goto close_prog;
> > >
> > > diff --git
> > > a/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
> > > b/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
> > > index ce419304ff1f..7748cc23de8a 100644
> > > --- a/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
> > > +++ b/tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c
> > > @@ -59,10 +59,14 @@ int BPF_PROG(bpf, int cmd, union bpf_attr *attr, unsigned int size)
> > >         if (!data_val)
> > >                 return 0;
> > >
> > > -       bpf_probe_read(&value, sizeof(value), &attr->value);
> > > -
> > > -       bpf_copy_from_user(data_val, sizeof(struct data),
> > > -                          (void *)(unsigned long)value);
> > > +       ret = bpf_probe_read_kernel(&value, sizeof(value), &attr->value);
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       ret = bpf_copy_from_user(data_val, sizeof(struct data),
> > > +                                (void *)(unsigned long)value);
> > > +       if (ret)
> > > +               return ret;
> >
> > this part looks good, we shouldn't use bpf_probe_read.
> >
> > You'll have to update progs/profiler.inc.h as well, btw, which still
> > uses bpf_probe_read() and bpf_probe_read_str().
>
> I remember trying this, but there were still failures due to, as I
> thought back then, usage of BPF_CORE_READ() and the lack of
> BPF_CORE_READ_KERNEL(). But this seems to be a generic issue. Let me
> try again and post my findings as a reply to 0/24.

Yep, replied there as well: BPF_PROBE_READ still using
bpf_probe_read() is an omission and we should fix that.

>
> > >         if (data_val->data_len > sizeof(data_val->data))
> > >                 return -EINVAL;
> > > --
> > > 2.39.1
> > >
>


* Re: [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline()
  2023-01-27 11:15         ` Ilya Leoshkevich
@ 2023-01-27 17:30           ` Andrii Nakryiko
  0 siblings, 0 replies; 48+ messages in thread
From: Andrii Nakryiko @ 2023-01-27 17:30 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Fri, Jan 27, 2023 at 3:15 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> On Thu, 2023-01-26 at 11:06 -0800, Andrii Nakryiko wrote:
> > On Thu, Jan 26, 2023 at 6:30 AM Ilya Leoshkevich <iii@linux.ibm.com>
> > wrote:
> > >
> > > On Wed, 2023-01-25 at 17:15 -0800, Andrii Nakryiko wrote:
> > > > On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich
> > > > <iii@linux.ibm.com>
> > > > wrote:
> > > > >
> > > > > arch_prepare_bpf_trampoline() is used for direct attachment of
> > > > > eBPF
> > > > > programs to various places, bypassing kprobes. It's responsible
> > > > > for
> > > > > calling a number of eBPF programs before, instead and/or after
> > > > > whatever they are attached to.
> > > > >
> > > > > Add a s390x implementation, paying attention to the following:
> > > > >
> > > > > - Reuse the existing JIT infrastructure, where possible.
> > > > > - Like the existing JIT, prefer making multiple passes instead
> > > > > of
> > > > >   backpatching. Currently 2 passes is enough. If literal pool
> > > > > is
> > > > >   introduced, this needs to be raised to 3. However, at the
> > > > > moment
> > > > >   adding literal pool only makes the code larger. If branch
> > > > >   shortening is introduced, the number of passes needs to be
> > > > >   increased even further.
> > > > > - Support both regular and ftrace calling conventions,
> > > > > depending on
> > > > >   the trampoline flags.
> > > > > - Use expolines for indirect calls.
> > > > > - Handle the mismatch between the eBPF and the s390x ABIs.
> > > > > - Sign-extend fmod_ret return values.
> > > > >
> > > > > invoke_bpf_prog() produces about 120 bytes; it might be
> > > > > possible to
> > > > > slightly optimize this, but reaching 50 bytes, like on x86_64,
> > > > > looks
> > > > > unrealistic: just loading cookie, __bpf_prog_enter, bpf_func,
> > > > > insnsi
> > > > > and __bpf_prog_exit as literals already takes at least 5 * 12 =
> > > > > 60
> > > > > bytes, and we can't use relative addressing for most of them.
> > > > > Therefore, lower BPF_MAX_TRAMP_LINKS on s390x.
> > > > >
> > > > > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > > > > ---
> > > > >  arch/s390/net/bpf_jit_comp.c | 535 +++++++++++++++++++++++++++--
> > > > >  include/linux/bpf.h          |   4 +
> > > > >  2 files changed, 517 insertions(+), 22 deletions(-)
> > > > >
> > > >
> > > > [...]
> > > >
> > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > > index cf89504c8dda..52ff43bbf996 100644
> > > > > --- a/include/linux/bpf.h
> > > > > +++ b/include/linux/bpf.h
> > > > > @@ -943,7 +943,11 @@ struct btf_func_model {
> > > > >  /* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
> > > > >   * bytes on x86.
> > > > >   */
> > > > > +#if defined(__s390x__)
> > > > > +#define BPF_MAX_TRAMP_LINKS 27
> > > > > +#else
> > > > >  #define BPF_MAX_TRAMP_LINKS 38
> > > > > +#endif
> > > >
> > > > if we turn this into enum definition, then on selftests side we
> > > > can
> > > > just discover this from vmlinux BTF, instead of hard-coding
> > > > arch-specific constants. Thoughts?
> > >
> > > This seems to work. I can replace 3/24 and 4/24 with that in v2.
> > > Some random notes:
> > >
> > > - It doesn't seem to be possible to #include "vmlinux.h" into tests,
> > >   so one has to go through the btf__load_vmlinux_btf() dance and
> > >   allocate the fd arrays dynamically.
> >
> > yes, you can't include vmlinux.h into user-space code, of course. And
> > yes it's true about needing to use btf__load_vmlinux_btf().
> >
> > But I didn't get what you are saying about fd arrays, tbh. Can you
> > please elaborate?
>
> That's a really minor thing; fexit_fd and link_fd in fexit_stress
> now need to be allocated dynamically.
>
> > > - One has to give this enum an otherwise unnecessary name, so that
> > >   it's easy to find. This doesn't seem like a big deal though:
> > >
> > > enum bpf_max_tramp_links {
> >
> > not really, you can keep it anonymous enum. We do that in
> > include/uapi/linux/bpf.h for a lot of constants
>
> How would you find it then? My current code is:
>
> int get_bpf_max_tramp_links_from(struct btf *btf)
> {
>         const struct btf_enum *e;
>         const struct btf_type *t;
>         const char *name;
>         int id;
>
>         id = btf__find_by_name_kind(btf, "bpf_max_tramp_links",
> BTF_KIND_ENUM);
>         if (!ASSERT_GT(id, 0, "bpf_max_tramp_links id"))
>                 return -1;
>         t = btf__type_by_id(btf, id);
>         if (!ASSERT_OK_PTR(t, "bpf_max_tramp_links type"))
>                 return -1;
>         if (!ASSERT_EQ(btf_vlen(t), 1, "bpf_max_tramp_links vlen"))
>                 return -1;
>         e = btf_enum(t);
>         if (!ASSERT_OK_PTR(e, "bpf_max_tramp_links[0]"))
>                 return -1;
>         name = btf__name_by_offset(btf, e->name_off);
>         if (!ASSERT_OK_PTR(name, "bpf_max_tramp_links[0].name_off") ||
>             !ASSERT_STREQ(name, "BPF_MAX_TRAMP_LINKS",
> "BPF_MAX_TRAMP_LINKS"))
>                 return -1;
>
>         return e->val;
> }
>
> Is there a way to bypass looking up the enum, and go straight for the
> named member?


don't use btf__find_by_name_kind, just iterate all types and look at
all anonymous enums and its values, roughly

for (i = 1; i < btf__type_cnt(btf); i++) {
    const struct btf_type *t = btf__type_by_id(btf, i);
    if (!btf_is_enum(t) || t->name_off)
        continue;
    for (j = 0; j < btf_vlen(t); j++) {
        if (strcmp(btf__str_by_offset(btf, btf_enum(t)[j].name_off),
"BPF_MAX_TRAMP_LINKS") != 0)
            continue;
        /* found it */
    }
}

but cleaner :)


>
> > > #if defined(__s390x__)
> > >         BPF_MAX_TRAMP_LINKS = 27,
> > > #else
> > >         BPF_MAX_TRAMP_LINKS = 38,
> > > #endif
> > > };
> > >
> > > - An alternative might be to expose this via /proc, since the users
> > >   might be interested in it too.
> >
> > I'd say let's not, there is no need, having it in BTF is more than
> > enough for testing purposes
>
> Fair enough.
> >


* Re: [PATCH bpf-next 00/24] Support bpf trampoline for s390x
  2023-01-27 17:24     ` Andrii Nakryiko
@ 2023-01-27 22:50       ` Ilya Leoshkevich
  0 siblings, 0 replies; 48+ messages in thread
From: Ilya Leoshkevich @ 2023-01-27 22:50 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, bpf,
	Heiko Carstens, Vasily Gorbik

On Fri, 2023-01-27 at 09:24 -0800, Andrii Nakryiko wrote:
> On Fri, Jan 27, 2023 at 8:51 AM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > 
> > On Wed, 2023-01-25 at 16:45 -0800, Andrii Nakryiko wrote:
> > > On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich
> > > <iii@linux.ibm.com>
> > > wrote:
> > > > 
> > > > Hi,
> > > > 
> > > > This series implements poke, trampoline, kfunc, mixing subprogs
> > > > and
> > > > tailcalls, and fixes a number of tests on s390x.
> > > > 
> > > > The following failures still remain:
> > > > 
> > > > #52      core_read_macros:FAIL
> > > > Uses BPF_PROBE_READ(), shouldn't there be
> > > > BPF_PROBE_READ_KERNEL()?
> > > 
> > > BPF_PROBE_READ(), similarly to BPF_CORE_READ() both use
> > > bpf_probe_read_kernel() internally, as it's most common use case.
> > > We
> > > have separate BPF_PROBE_READ_USER() and BPF_CORE_READ_USER() for
> > > when
> > > bpf_probe_read_user() has to be used.
> > 
> > At least purely from the code perspective, BPF_PROBE_READ() seems
> > to
> > delegate to bpf_probe_read(). The following therefore helps with
> > this
> > test:
> > 
> > --- a/tools/lib/bpf/bpf_core_read.h
> > +++ b/tools/lib/bpf/bpf_core_read.h
> > @@ -364,7 +364,7 @@ enum bpf_enum_value_kind {
> > 
> >  /* Non-CO-RE variant of BPF_CORE_READ_INTO() */
> >  #define BPF_PROBE_READ_INTO(dst, src, a, ...) ({                    \
> > -       ___core_read(bpf_probe_read, bpf_probe_read,                 \
> > +       ___core_read(bpf_probe_read_kernel, bpf_probe_read_kernel,   \
> >                      dst, (src), a, ##__VA_ARGS__)                   \
> >  })
> >
> > @@ -400,7 +400,7 @@ enum bpf_enum_value_kind {
> >
> >  /* Non-CO-RE variant of BPF_CORE_READ_STR_INTO() */
> >  #define BPF_PROBE_READ_STR_INTO(dst, src, a, ...) ({                \
> > -       ___core_read(bpf_probe_read_str, bpf_probe_read,             \
> > +       ___core_read(bpf_probe_read_kernel_str, bpf_probe_read_kernel, \
> >                      dst, (src), a, ##__VA_ARGS__)                   \
> >  })
> > 
> > but I'm not sure if there are backward compatibility concerns, or
> > if
> > libbpf is supposed to rewrite this when
> > !ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE.
> 
> Oh, this is just a bug, an omission from when we converted
> BPF_CORE_READ to bpf_probe_read_kernel. If you look at the
> bpf_core_read definition, it uses bpf_probe_read_kernel, which is also
> the intent for BPF_PROBE_READ. Let's fix this.
> 
> And there are no backwards-compat concerns, because libbpf will
> automatically convert calls to bpf_probe_read_{kernel,user}[_str] to
> bpf_probe_read[_str] if the host kernel doesn't yet support the
> kernel- or user-specific variants.

Thanks for confirming! I will include the fix in v2.

> > > > #82      get_stack_raw_tp:FAIL
> > > > get_stack_print_output:FAIL:user_stack corrupted user stack
> > > > Known issue:
> > > > We cannot reliably unwind userspace on s390x without DWARF.
> > > 
> > > like in principle, or frame pointers (or some equivalent) needs
> > > to be
> > > configured for this to work?
> > > 
> > > Asking also in the context of [0], where s390x was excluded. If
> > > there
> > > is actually a way to enable frame pointer-based stack unwinding
> > > on
> > > s390x, would be nice to hear from an expert.
> > > 
> > >   [0] https://pagure.io/fesco/issue/2923
> > 
> > For DWARFless unwinding we have -mbackchain (not to be confused
> > with
> > -fno-omit-frame-pointer, which we also have, but which just hurts
> > performance without providing tangible benefits).
> > -mbackchain has a few problems though:
> > 
> > - It's not atomic. Here is a typical prologue with -mbackchain:
> > 
> >         1: stmg    %r11,%r15,88(%r15)  # save non-volatile registers
> >         2: lgr     %r14,%r15           # %r14 = sp
> >         3: lay     %r15,-160(%r15)     # sp -= 160
> >         4: stg     %r14,0(%r15)        # *(void **)sp = %r14
> >
> >   The invariant here is that *(void **)%r15 is always a pointer to the
> >   next frame. This means that if we unwind from (4), we are totally
> >   broken. This does not happen if we unwind from any other instruction,
> >   but still.
> >
> > - Unwinding from (1)-(3) is not particularly good either. PSW points to
> >   the callee, but R15 points to the caller's frame.
> >
> > - Unwinding leaf functions is like the previous problem, but worse:
> >   they often do not establish a stack frame at all, so PSW and R15 are
> >   out of sync for the entire duration of the call.
> >
> > Therefore .eh_frame-based unwinding is preferred, since it covers all
> > these corner cases and is therefore reliable. From what I understand,
> > adding .eh_frame unwinding to the kernel is not desirable. In an
> > internal discussion we had another idea though: would it be possible to
> > have an eBPF program that does .eh_frame parsing and unwinding? I
> > understand that it can be technically challenging at the moment, but
> > the end result would not be exploitable by crafting malicious
> > .eh_frame sections, won't loop endlessly, will have good performance,
> > etc.
> 
> Thanks for the details. This was all discussed at length in the Fedora
> -fno-omit-frame-pointer discussion I linked above, so there is no real
> need to go over it again. .eh_frame-based unwinding on the BPF side is
> possible, but only for processes that you knew about (and preprocessed)
> before you started the profiling session. Pre-processing is a memory-
> and CPU-intensive operation on busy systems, and it will miss any
> processes started during profiling. So as a general approach for
> system-wide profiling it leaves a lot to be desired.

Thanks for the explanation, I'll read the thread and come back if I
get some new ideas not listed here.

> Should we enable -mbackchain in our CI for s390x to be able to
> capture
> stack traces (even if on some instructions they might be incomplete
> or
> outright broken)?

Let's do it, I don't have anything against this.

[...]
> > > 
> 

> > Here is the full log:
> > 
> > https://gist.github.com/iii-i/8e20100c33ab6f0dffb5e6e51d1330e8
> > 
> > Apparently we do indeed lose a constraint established by
> > if (hdr->tcp_len < sizeof(*hdr->tcp)). But the naive
> 
> The test is too big and unfamiliar for me to figure this out. And the
> problem is not the upper bound, but the lower bound: hdr->tcp_len is
> not proven to be strictly greater than zero, which is what the
> verifier complains about. Not sure how it works on other arches right now.
> 
> 
> But I see that bpf_csum_diff defines its size arguments as
> ARG_CONST_SIZE_OR_ZERO while bpf_tcp_raw_gen_syncookie_ipv4 has
> ARG_CONST_SIZE. I have generally found ARG_CONST_SIZE way too problematic
> in practice; I'd say we should change it to ARG_CONST_SIZE_OR_ZERO.

Yes, this helps, and doesn't seem to introduce issues, since the 
minimum length is enforced inside this function anyway. I will include
the change for bpf_tcp_raw_gen_syncookie_ipv{4,6} in v2; I guess some
other functions may benefit from this as well, but this is outside the
scope of this series.

[...]


* Re: [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline()
  2023-01-25 21:38 ` [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline() Ilya Leoshkevich
  2023-01-26  1:15   ` Andrii Nakryiko
@ 2023-02-15  4:07   ` kernel test robot
  2023-02-15 17:03   ` kernel test robot
  2 siblings, 0 replies; 48+ messages in thread
From: kernel test robot @ 2023-02-15  4:07 UTC (permalink / raw)
  To: Ilya Leoshkevich; +Cc: oe-kbuild-all

Hi Ilya,

I love your patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Ilya-Leoshkevich/selftests-bpf-Fix-liburandom_read-so-linker-error/20230128-150321
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20230125213817.1424447-23-iii%40linux.ibm.com
patch subject: [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline()
config: s390-defconfig (https://download.01.org/0day-ci/archive/20230215/202302151241.GpoGkTCm-lkp@intel.com/config)
compiler: s390-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/6257ca590791a94b97f1b4f72851dc110290a52b
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Ilya-Leoshkevich/selftests-bpf-Fix-liburandom_read-so-linker-error/20230128-150321
        git checkout 6257ca590791a94b97f1b4f72851dc110290a52b
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=s390 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash arch/s390/net/

If you fix the issue, kindly add the following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202302151241.GpoGkTCm-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> arch/s390/net/bpf_jit_comp.c:2155:5: warning: no previous prototype for '__arch_prepare_bpf_trampoline' [-Wmissing-prototypes]
    2155 | int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
         |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~


vim +/__arch_prepare_bpf_trampoline +2155 arch/s390/net/bpf_jit_comp.c

  2154	
> 2155	int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
  2156					  struct bpf_tramp_jit *tjit,
  2157					  const struct btf_func_model *m,
  2158					  u32 flags, struct bpf_tramp_links *tlinks,
  2159					  void *func_addr)
  2160	{
  2161		struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
  2162		struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
  2163		struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
  2164		int nr_bpf_args, nr_reg_args, nr_stack_args;
  2165		struct bpf_jit *jit = &tjit->common;
  2166		int arg, bpf_arg_off;
  2167		int i, j;
  2168	
  2169		/* Support as many stack arguments as "mvc" instruction can handle. */
  2170		nr_reg_args = min_t(int, m->nr_args, MAX_NR_REG_ARGS);
  2171		nr_stack_args = m->nr_args - nr_reg_args;
  2172		if (nr_stack_args > MAX_NR_STACK_ARGS)
  2173			return -ENOTSUPP;
  2174	
  2175		/* Return to %r14, since func_addr and %r0 are not available. */
  2176		if (!func_addr && !(flags & BPF_TRAMP_F_ORIG_STACK))
  2177			flags |= BPF_TRAMP_F_SKIP_FRAME;
  2178	
  2179		/*
  2180		 * Compute how many arguments we need to pass to BPF programs.
  2181		 * BPF ABI mirrors that of x86_64: arguments that are 16 bytes or
  2182		 * smaller are packed into 1 or 2 registers; larger arguments are
  2183		 * passed via pointers.
  2184		 * In s390x ABI, arguments that are 8 bytes or smaller are packed into
  2185		 * a register; larger arguments are passed via pointers.
  2186		 * We need to deal with this difference.
  2187		 */
  2188		nr_bpf_args = 0;
  2189		for (i = 0; i < m->nr_args; i++) {
  2190			if (m->arg_size[i] <= 8)
  2191				nr_bpf_args += 1;
  2192			else if (m->arg_size[i] <= 16)
  2193				nr_bpf_args += 2;
  2194			else
  2195				return -ENOTSUPP;
  2196		}
  2197	
  2198		/*
  2199		 * Calculate the stack layout.
  2200		 */
  2201	
  2202		/* Reserve STACK_FRAME_OVERHEAD bytes for the callees. */
  2203		tjit->stack_size = STACK_FRAME_OVERHEAD;
  2204		tjit->stack_args_off = alloc_stack(tjit, nr_stack_args * sizeof(u64));
  2205		tjit->reg_args_off = alloc_stack(tjit, nr_reg_args * sizeof(u64));
  2206		tjit->ip_off = alloc_stack(tjit, sizeof(u64));
  2207		tjit->arg_cnt_off = alloc_stack(tjit, sizeof(u64));
  2208		tjit->bpf_args_off = alloc_stack(tjit, nr_bpf_args * sizeof(u64));
  2209		tjit->retval_off = alloc_stack(tjit, sizeof(u64));
  2210		tjit->r7_r8_off = alloc_stack(tjit, 2 * sizeof(u64));
  2211		tjit->r14_off = alloc_stack(tjit, sizeof(u64));
  2212		tjit->run_ctx_off = alloc_stack(tjit,
  2213						sizeof(struct bpf_tramp_run_ctx));
  2214		/* The caller has already reserved STACK_FRAME_OVERHEAD bytes. */
  2215		tjit->stack_size -= STACK_FRAME_OVERHEAD;
  2216		tjit->orig_stack_args_off = tjit->stack_size + STACK_FRAME_OVERHEAD;
  2217	
  2218		/* aghi %r15,-stack_size */
  2219		EMIT4_IMM(0xa70b0000, REG_15, -tjit->stack_size);
  2220		/* stmg %r2,%rN,fwd_reg_args_off(%r15) */
  2221		if (nr_reg_args)
  2222			EMIT6_DISP_LH(0xeb000000, 0x0024, REG_2,
  2223				      REG_2 + (nr_reg_args - 1), REG_15,
  2224				      tjit->reg_args_off);
  2225		for (i = 0, j = 0; i < m->nr_args; i++) {
  2226			if (i < MAX_NR_REG_ARGS)
  2227				arg = REG_2 + i;
  2228			else
  2229				arg = tjit->orig_stack_args_off +
  2230				      (i - MAX_NR_REG_ARGS) * sizeof(u64);
  2231			bpf_arg_off = tjit->bpf_args_off + j * sizeof(u64);
  2232			if (m->arg_size[i] <= 8) {
  2233				if (i < MAX_NR_REG_ARGS)
  2234					/* stg %arg,bpf_arg_off(%r15) */
  2235					EMIT6_DISP_LH(0xe3000000, 0x0024, arg,
  2236						      REG_0, REG_15, bpf_arg_off);
  2237				else
  2238					/* mvc bpf_arg_off(8,%r15),arg(%r15) */
  2239					_EMIT6(0xd207f000 | bpf_arg_off,
  2240					       0xf000 | arg);
  2241				j += 1;
  2242			} else {
  2243				if (i < MAX_NR_REG_ARGS) {
  2244					/* mvc bpf_arg_off(16,%r15),0(%arg) */
  2245					_EMIT6(0xd20ff000 | bpf_arg_off,
  2246					       reg2hex[arg] << 12);
  2247				} else {
  2248					/* lg %r1,arg(%r15) */
  2249					EMIT6_DISP_LH(0xe3000000, 0x0004, REG_1, REG_0,
  2250						      REG_15, arg);
  2251					/* mvc bpf_arg_off(16,%r15),0(%r1) */
  2252					_EMIT6(0xd20ff000 | bpf_arg_off, 0x1000);
  2253				}
  2254				j += 2;
  2255			}
  2256		}
  2257		/* stmg %r7,%r8,r7_r8_off(%r15) */
  2258		EMIT6_DISP_LH(0xeb000000, 0x0024, REG_7, REG_8, REG_15,
  2259			      tjit->r7_r8_off);
  2260		/* stg %r14,r14_off(%r15) */
  2261		EMIT6_DISP_LH(0xe3000000, 0x0024, REG_14, REG_0, REG_15, tjit->r14_off);
  2262	
  2263		if (flags & BPF_TRAMP_F_ORIG_STACK) {
  2264			/*
  2265			 * The ftrace trampoline puts the return address (which is the
  2266			 * address of the original function + S390X_PATCH_SIZE) into
  2267			 * %r0; see ftrace_shared_hotpatch_trampoline_br and
  2268			 * ftrace_init_nop() for details.
  2269			 */
  2270	
  2271			/* lgr %r8,%r0 */
  2272			EMIT4(0xb9040000, REG_8, REG_0);
  2273		} else {
  2274			/* %r8 = func_addr + S390X_PATCH_SIZE */
  2275			load_imm64(jit, REG_8, (u64)func_addr + S390X_PATCH_SIZE);
  2276		}
  2277	
  2278		/*
  2279		 * ip = func_addr;
  2280		 * arg_cnt = m->nr_args;
  2281		 */
  2282	
  2283		if (flags & BPF_TRAMP_F_IP_ARG) {
  2284			/* %r0 = func_addr */
  2285			load_imm64(jit, REG_0, (u64)func_addr);
  2286			/* stg %r0,ip_off(%r15) */
  2287			EMIT6_DISP_LH(0xe3000000, 0x0024, REG_0, REG_0, REG_15,
  2288				      tjit->ip_off);
  2289		}
  2290		/* lghi %r0,nr_bpf_args */
  2291		EMIT4_IMM(0xa7090000, REG_0, nr_bpf_args);
  2292		/* stg %r0,arg_cnt_off(%r15) */
  2293		EMIT6_DISP_LH(0xe3000000, 0x0024, REG_0, REG_0, REG_15,
  2294			      tjit->arg_cnt_off);
  2295	
  2296		if (flags & BPF_TRAMP_F_CALL_ORIG) {
  2297			/*
  2298			 * __bpf_tramp_enter(im);
  2299			 */
  2300	
  2301			/* %r1 = __bpf_tramp_enter */
  2302			load_imm64(jit, REG_1, (u64)__bpf_tramp_enter);
  2303			/* %r2 = im */
  2304			load_imm64(jit, REG_2, (u64)im);
  2305			/* %r1() */
  2306			call_r1(jit);
  2307		}
  2308	
  2309		for (i = 0; i < fentry->nr_links; i++)
  2310			invoke_bpf_prog(tjit, m, fentry->links[i],
  2311					flags & BPF_TRAMP_F_RET_FENTRY_RET);
  2312	
  2313		if (fmod_ret->nr_links) {
  2314			/*
  2315			 * retval = 0;
  2316			 */
  2317	
  2318			/* xc retval_off(8,%r15),retval_off(%r15) */
  2319			_EMIT6(0xd707f000 | tjit->retval_off,
  2320			       0xf000 | tjit->retval_off);
  2321	
  2322			for (i = 0; i < fmod_ret->nr_links; i++) {
  2323				invoke_bpf_prog(tjit, m, fmod_ret->links[i], true);
  2324	
  2325				/*
  2326				 * if (retval)
  2327				 *         goto do_fexit;
  2328				 */
  2329	
  2330				/* ltg %r0,retval_off(%r15) */
  2331				EMIT6_DISP_LH(0xe3000000, 0x0002, REG_0, REG_0, REG_15,
  2332					      tjit->retval_off);
  2333				/* brcl 7,do_fexit */
  2334				EMIT6_PCREL_RILC(0xc0040000, 7, tjit->do_fexit);
  2335			}
  2336		}
  2337	
  2338		if (flags & BPF_TRAMP_F_CALL_ORIG) {
  2339			/*
  2340			 * retval = func_addr(args);
  2341			 */
  2342	
  2343			/* lmg %r2,%rN,reg_args_off(%r15) */
  2344			if (nr_reg_args)
  2345				EMIT6_DISP_LH(0xeb000000, 0x0004, REG_2,
  2346					      REG_2 + (nr_reg_args - 1), REG_15,
  2347					      tjit->reg_args_off);
  2348			/* mvc stack_args_off(N,%r15),orig_stack_args_off(%r15) */
  2349			if (nr_stack_args)
  2350				_EMIT6(0xd200f000 |
  2351					       (nr_stack_args * sizeof(u64) - 1) << 16 |
  2352					       tjit->stack_args_off,
  2353				       0xf000 | tjit->orig_stack_args_off);
  2354			/* lgr %r1,%r8 */
  2355			EMIT4(0xb9040000, REG_1, REG_8);
  2356			/* %r1() */
  2357			call_r1(jit);
  2358			/* stg %r2,retval_off(%r15) */
  2359			EMIT6_DISP_LH(0xe3000000, 0x0024, REG_2, REG_0, REG_15,
  2360				      tjit->retval_off);
  2361	
  2362			im->ip_after_call = jit->prg_buf + jit->prg;
  2363	
  2364			/*
  2365			 * The following nop will be patched by bpf_tramp_image_put().
  2366			 */
  2367	
  2368			/* brcl 0,im->ip_epilogue */
  2369			EMIT6_PCREL_RILC(0xc0040000, 0, (u64)im->ip_epilogue);
  2370		}
  2371	
  2372		/* do_fexit: */
  2373		tjit->do_fexit = jit->prg;
  2374		for (i = 0; i < fexit->nr_links; i++)
  2375			invoke_bpf_prog(tjit, m, fexit->links[i], false);
  2376	
  2377		if (flags & BPF_TRAMP_F_CALL_ORIG) {
  2378			im->ip_epilogue = jit->prg_buf + jit->prg;
  2379	
  2380			/*
  2381			 * __bpf_tramp_exit(im);
  2382			 */
  2383	
  2384			/* %r1 = __bpf_tramp_exit */
  2385			load_imm64(jit, REG_1, (u64)__bpf_tramp_exit);
  2386			/* %r2 = im */
  2387			load_imm64(jit, REG_2, (u64)im);
  2388			/* %r1() */
  2389			call_r1(jit);
  2390		}
  2391	
  2392		/* lmg %r2,%rN,reg_args_off(%r15) */
  2393		if ((flags & BPF_TRAMP_F_RESTORE_REGS) && nr_reg_args)
  2394			EMIT6_DISP_LH(0xeb000000, 0x0004, REG_2,
  2395				      REG_2 + (nr_reg_args - 1), REG_15,
  2396				      tjit->reg_args_off);
  2397		/* lgr %r1,%r8 */
  2398		if (!(flags & BPF_TRAMP_F_SKIP_FRAME))
  2399			EMIT4(0xb9040000, REG_1, REG_8);
  2400		/* lmg %r7,%r8,r7_r8_off(%r15) */
  2401		EMIT6_DISP_LH(0xeb000000, 0x0004, REG_7, REG_8, REG_15,
  2402			      tjit->r7_r8_off);
  2403		/* lg %r14,r14_off(%r15) */
  2404		EMIT6_DISP_LH(0xe3000000, 0x0004, REG_14, REG_0, REG_15, tjit->r14_off);
  2405		/* lg %r2,retval_off(%r15) */
  2406		if (flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET))
  2407			EMIT6_DISP_LH(0xe3000000, 0x0004, REG_2, REG_0, REG_15,
  2408				      tjit->retval_off);
  2409		/* aghi %r15,stack_size */
  2410		EMIT4_IMM(0xa70b0000, REG_15, tjit->stack_size);
  2411		/* Emit an expoline for the following indirect jump. */
  2412		if (nospec_uses_trampoline())
  2413			emit_expoline(jit);
  2414		if (flags & BPF_TRAMP_F_SKIP_FRAME)
  2415			/* br %r14 */
  2416			_EMIT2(0x07fe);
  2417		else
  2418			/* br %r1 */
  2419			_EMIT2(0x07f1);
  2420	
  2421		emit_r1_thunk(jit);
  2422	
  2423		return 0;
  2424	}
  2425	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline()
  2023-01-25 21:38 ` [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline() Ilya Leoshkevich
  2023-01-26  1:15   ` Andrii Nakryiko
  2023-02-15  4:07   ` kernel test robot
@ 2023-02-15 17:03   ` kernel test robot
  2 siblings, 0 replies; 48+ messages in thread
From: kernel test robot @ 2023-02-15 17:03 UTC (permalink / raw)
  To: Ilya Leoshkevich; +Cc: llvm, oe-kbuild-all

Hi Ilya,

I love your patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Ilya-Leoshkevich/selftests-bpf-Fix-liburandom_read-so-linker-error/20230128-150321
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20230125213817.1424447-23-iii%40linux.ibm.com
patch subject: [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline()
config: s390-randconfig-r025-20230212 (https://download.01.org/0day-ci/archive/20230216/202302160158.P4gvGOoV-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project db89896bbbd2251fff457699635acbbedeead27f)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install s390 cross compiling tool for clang build
        # apt-get install binutils-s390x-linux-gnu
        # https://github.com/intel-lab-lkp/linux/commit/6257ca590791a94b97f1b4f72851dc110290a52b
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Ilya-Leoshkevich/selftests-bpf-Fix-liburandom_read-so-linker-error/20230128-150321
        git checkout 6257ca590791a94b97f1b4f72851dc110290a52b
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=s390 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash arch/s390/net/ drivers/md/bcache/ drivers/pci/

If you fix the issue, kindly add the following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202302160158.P4gvGOoV-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from arch/s390/net/bpf_jit_comp.c:21:
   In file included from include/linux/netdevice.h:38:
   In file included from include/net/net_namespace.h:43:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
                                                             ^
   include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
   #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
                                                        ^
   In file included from arch/s390/net/bpf_jit_comp.c:21:
   In file included from include/linux/netdevice.h:38:
   In file included from include/net/net_namespace.h:43:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
   #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
                                                        ^
   In file included from arch/s390/net/bpf_jit_comp.c:21:
   In file included from include/linux/netdevice.h:38:
   In file included from include/net/net_namespace.h:43:
   In file included from include/linux/skbuff.h:31:
   In file included from include/linux/dma-mapping.h:10:
   In file included from include/linux/scatterlist.h:9:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:692:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsb(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:700:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsw(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:708:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsl(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:717:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesb(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:726:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesw(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:735:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesl(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
>> arch/s390/net/bpf_jit_comp.c:2155:5: warning: no previous prototype for function '__arch_prepare_bpf_trampoline' [-Wmissing-prototypes]
   int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
       ^
   arch/s390/net/bpf_jit_comp.c:2155:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
   ^
   static 
   13 warnings generated.


vim +/__arch_prepare_bpf_trampoline +2155 arch/s390/net/bpf_jit_comp.c

  2154	
> 2155	int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
  2156					  struct bpf_tramp_jit *tjit,
  2157					  const struct btf_func_model *m,
  2158					  u32 flags, struct bpf_tramp_links *tlinks,
  2159					  void *func_addr)
  2160	{
  2161		struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
  2162		struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
  2163		struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
  2164		int nr_bpf_args, nr_reg_args, nr_stack_args;
  2165		struct bpf_jit *jit = &tjit->common;
  2166		int arg, bpf_arg_off;
  2167		int i, j;
  2168	
  2169		/* Support as many stack arguments as "mvc" instruction can handle. */
  2170		nr_reg_args = min_t(int, m->nr_args, MAX_NR_REG_ARGS);
  2171		nr_stack_args = m->nr_args - nr_reg_args;
  2172		if (nr_stack_args > MAX_NR_STACK_ARGS)
  2173			return -ENOTSUPP;
  2174	
  2175		/* Return to %r14, since func_addr and %r0 are not available. */
  2176		if (!func_addr && !(flags & BPF_TRAMP_F_ORIG_STACK))
  2177			flags |= BPF_TRAMP_F_SKIP_FRAME;
  2178	
  2179		/*
  2180		 * Compute how many arguments we need to pass to BPF programs.
  2181		 * BPF ABI mirrors that of x86_64: arguments that are 16 bytes or
  2182		 * smaller are packed into 1 or 2 registers; larger arguments are
  2183		 * passed via pointers.
  2184		 * In s390x ABI, arguments that are 8 bytes or smaller are packed into
  2185		 * a register; larger arguments are passed via pointers.
  2186		 * We need to deal with this difference.
  2187		 */
  2188		nr_bpf_args = 0;
  2189		for (i = 0; i < m->nr_args; i++) {
  2190			if (m->arg_size[i] <= 8)
  2191				nr_bpf_args += 1;
  2192			else if (m->arg_size[i] <= 16)
  2193				nr_bpf_args += 2;
  2194			else
  2195				return -ENOTSUPP;
  2196		}
  2197	
  2198		/*
  2199		 * Calculate the stack layout.
  2200		 */
  2201	
  2202		/* Reserve STACK_FRAME_OVERHEAD bytes for the callees. */
  2203		tjit->stack_size = STACK_FRAME_OVERHEAD;
  2204		tjit->stack_args_off = alloc_stack(tjit, nr_stack_args * sizeof(u64));
  2205		tjit->reg_args_off = alloc_stack(tjit, nr_reg_args * sizeof(u64));
  2206		tjit->ip_off = alloc_stack(tjit, sizeof(u64));
  2207		tjit->arg_cnt_off = alloc_stack(tjit, sizeof(u64));
  2208		tjit->bpf_args_off = alloc_stack(tjit, nr_bpf_args * sizeof(u64));
  2209		tjit->retval_off = alloc_stack(tjit, sizeof(u64));
  2210		tjit->r7_r8_off = alloc_stack(tjit, 2 * sizeof(u64));
  2211		tjit->r14_off = alloc_stack(tjit, sizeof(u64));
  2212		tjit->run_ctx_off = alloc_stack(tjit,
  2213						sizeof(struct bpf_tramp_run_ctx));
  2214		/* The caller has already reserved STACK_FRAME_OVERHEAD bytes. */
  2215		tjit->stack_size -= STACK_FRAME_OVERHEAD;
  2216		tjit->orig_stack_args_off = tjit->stack_size + STACK_FRAME_OVERHEAD;
  2217	
  2218		/* aghi %r15,-stack_size */
  2219		EMIT4_IMM(0xa70b0000, REG_15, -tjit->stack_size);
  2220		/* stmg %r2,%rN,fwd_reg_args_off(%r15) */
  2221		if (nr_reg_args)
  2222			EMIT6_DISP_LH(0xeb000000, 0x0024, REG_2,
  2223				      REG_2 + (nr_reg_args - 1), REG_15,
  2224				      tjit->reg_args_off);
  2225		for (i = 0, j = 0; i < m->nr_args; i++) {
  2226			if (i < MAX_NR_REG_ARGS)
  2227				arg = REG_2 + i;
  2228			else
  2229				arg = tjit->orig_stack_args_off +
  2230				      (i - MAX_NR_REG_ARGS) * sizeof(u64);
  2231			bpf_arg_off = tjit->bpf_args_off + j * sizeof(u64);
  2232			if (m->arg_size[i] <= 8) {
  2233				if (i < MAX_NR_REG_ARGS)
  2234					/* stg %arg,bpf_arg_off(%r15) */
  2235					EMIT6_DISP_LH(0xe3000000, 0x0024, arg,
  2236						      REG_0, REG_15, bpf_arg_off);
  2237				else
  2238					/* mvc bpf_arg_off(8,%r15),arg(%r15) */
  2239					_EMIT6(0xd207f000 | bpf_arg_off,
  2240					       0xf000 | arg);
  2241				j += 1;
  2242			} else {
  2243				if (i < MAX_NR_REG_ARGS) {
  2244					/* mvc bpf_arg_off(16,%r15),0(%arg) */
  2245					_EMIT6(0xd20ff000 | bpf_arg_off,
  2246					       reg2hex[arg] << 12);
  2247				} else {
  2248					/* lg %r1,arg(%r15) */
  2249					EMIT6_DISP_LH(0xe3000000, 0x0004, REG_1, REG_0,
  2250						      REG_15, arg);
  2251					/* mvc bpf_arg_off(16,%r15),0(%r1) */
  2252					_EMIT6(0xd20ff000 | bpf_arg_off, 0x1000);
  2253				}
  2254				j += 2;
  2255			}
  2256		}
  2257		/* stmg %r7,%r8,r7_r8_off(%r15) */
  2258		EMIT6_DISP_LH(0xeb000000, 0x0024, REG_7, REG_8, REG_15,
  2259			      tjit->r7_r8_off);
  2260		/* stg %r14,r14_off(%r15) */
  2261		EMIT6_DISP_LH(0xe3000000, 0x0024, REG_14, REG_0, REG_15, tjit->r14_off);
  2262	
  2263		if (flags & BPF_TRAMP_F_ORIG_STACK) {
  2264			/*
  2265			 * The ftrace trampoline puts the return address (which is the
  2266			 * address of the original function + S390X_PATCH_SIZE) into
  2267			 * %r0; see ftrace_shared_hotpatch_trampoline_br and
  2268			 * ftrace_init_nop() for details.
  2269			 */
  2270	
  2271			/* lgr %r8,%r0 */
  2272			EMIT4(0xb9040000, REG_8, REG_0);
  2273		} else {
  2274			/* %r8 = func_addr + S390X_PATCH_SIZE */
  2275			load_imm64(jit, REG_8, (u64)func_addr + S390X_PATCH_SIZE);
  2276		}
  2277	
  2278		/*
  2279		 * ip = func_addr;
  2280		 * arg_cnt = m->nr_args;
  2281		 */
  2282	
  2283		if (flags & BPF_TRAMP_F_IP_ARG) {
  2284			/* %r0 = func_addr */
  2285			load_imm64(jit, REG_0, (u64)func_addr);
  2286			/* stg %r0,ip_off(%r15) */
  2287			EMIT6_DISP_LH(0xe3000000, 0x0024, REG_0, REG_0, REG_15,
  2288				      tjit->ip_off);
  2289		}
  2290		/* lghi %r0,nr_bpf_args */
  2291		EMIT4_IMM(0xa7090000, REG_0, nr_bpf_args);
  2292		/* stg %r0,arg_cnt_off(%r15) */
  2293		EMIT6_DISP_LH(0xe3000000, 0x0024, REG_0, REG_0, REG_15,
  2294			      tjit->arg_cnt_off);
  2295	
  2296		if (flags & BPF_TRAMP_F_CALL_ORIG) {
  2297			/*
  2298			 * __bpf_tramp_enter(im);
  2299			 */
  2300	
  2301			/* %r1 = __bpf_tramp_enter */
  2302			load_imm64(jit, REG_1, (u64)__bpf_tramp_enter);
  2303			/* %r2 = im */
  2304			load_imm64(jit, REG_2, (u64)im);
  2305			/* %r1() */
  2306			call_r1(jit);
  2307		}
  2308	
  2309		for (i = 0; i < fentry->nr_links; i++)
  2310			invoke_bpf_prog(tjit, m, fentry->links[i],
  2311					flags & BPF_TRAMP_F_RET_FENTRY_RET);
  2312	
  2313		if (fmod_ret->nr_links) {
  2314			/*
  2315			 * retval = 0;
  2316			 */
  2317	
  2318			/* xc retval_off(8,%r15),retval_off(%r15) */
  2319			_EMIT6(0xd707f000 | tjit->retval_off,
  2320			       0xf000 | tjit->retval_off);
  2321	
  2322			for (i = 0; i < fmod_ret->nr_links; i++) {
  2323				invoke_bpf_prog(tjit, m, fmod_ret->links[i], true);
  2324	
  2325				/*
  2326				 * if (retval)
  2327				 *         goto do_fexit;
  2328				 */
  2329	
  2330				/* ltg %r0,retval_off(%r15) */
  2331				EMIT6_DISP_LH(0xe3000000, 0x0002, REG_0, REG_0, REG_15,
  2332					      tjit->retval_off);
  2333				/* brcl 7,do_fexit */
  2334				EMIT6_PCREL_RILC(0xc0040000, 7, tjit->do_fexit);
  2335			}
  2336		}
  2337	
  2338		if (flags & BPF_TRAMP_F_CALL_ORIG) {
  2339			/*
  2340			 * retval = func_addr(args);
  2341			 */
  2342	
  2343			/* lmg %r2,%rN,reg_args_off(%r15) */
  2344			if (nr_reg_args)
  2345				EMIT6_DISP_LH(0xeb000000, 0x0004, REG_2,
  2346					      REG_2 + (nr_reg_args - 1), REG_15,
  2347					      tjit->reg_args_off);
  2348			/* mvc stack_args_off(N,%r15),orig_stack_args_off(%r15) */
  2349			if (nr_stack_args)
  2350				_EMIT6(0xd200f000 |
  2351					       (nr_stack_args * sizeof(u64) - 1) << 16 |
  2352					       tjit->stack_args_off,
  2353				       0xf000 | tjit->orig_stack_args_off);
  2354			/* lgr %r1,%r8 */
  2355			EMIT4(0xb9040000, REG_1, REG_8);
  2356			/* %r1() */
  2357			call_r1(jit);
  2358			/* stg %r2,retval_off(%r15) */
  2359			EMIT6_DISP_LH(0xe3000000, 0x0024, REG_2, REG_0, REG_15,
  2360				      tjit->retval_off);
  2361	
  2362			im->ip_after_call = jit->prg_buf + jit->prg;
  2363	
  2364			/*
  2365			 * The following nop will be patched by bpf_tramp_image_put().
  2366			 */
  2367	
  2368			/* brcl 0,im->ip_epilogue */
  2369			EMIT6_PCREL_RILC(0xc0040000, 0, (u64)im->ip_epilogue);
  2370		}
  2371	
  2372		/* do_fexit: */
  2373		tjit->do_fexit = jit->prg;
  2374		for (i = 0; i < fexit->nr_links; i++)
  2375			invoke_bpf_prog(tjit, m, fexit->links[i], false);
  2376	
  2377		if (flags & BPF_TRAMP_F_CALL_ORIG) {
  2378			im->ip_epilogue = jit->prg_buf + jit->prg;
  2379	
  2380			/*
  2381			 * __bpf_tramp_exit(im);
  2382			 */
  2383	
  2384			/* %r1 = __bpf_tramp_exit */
  2385			load_imm64(jit, REG_1, (u64)__bpf_tramp_exit);
  2386			/* %r2 = im */
  2387			load_imm64(jit, REG_2, (u64)im);
  2388			/* %r1() */
  2389			call_r1(jit);
  2390		}
  2391	
  2392		/* lmg %r2,%rN,reg_args_off(%r15) */
  2393		if ((flags & BPF_TRAMP_F_RESTORE_REGS) && nr_reg_args)
  2394			EMIT6_DISP_LH(0xeb000000, 0x0004, REG_2,
  2395				      REG_2 + (nr_reg_args - 1), REG_15,
  2396				      tjit->reg_args_off);
  2397		/* lgr %r1,%r8 */
  2398		if (!(flags & BPF_TRAMP_F_SKIP_FRAME))
  2399			EMIT4(0xb9040000, REG_1, REG_8);
  2400		/* lmg %r7,%r8,r7_r8_off(%r15) */
  2401		EMIT6_DISP_LH(0xeb000000, 0x0004, REG_7, REG_8, REG_15,
  2402			      tjit->r7_r8_off);
  2403		/* lg %r14,r14_off(%r15) */
  2404		EMIT6_DISP_LH(0xe3000000, 0x0004, REG_14, REG_0, REG_15, tjit->r14_off);
  2405		/* lg %r2,retval_off(%r15) */
  2406		if (flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET))
  2407			EMIT6_DISP_LH(0xe3000000, 0x0004, REG_2, REG_0, REG_15,
  2408				      tjit->retval_off);
  2409		/* aghi %r15,stack_size */
  2410		EMIT4_IMM(0xa70b0000, REG_15, tjit->stack_size);
  2411		/* Emit an expoline for the following indirect jump. */
  2412		if (nospec_uses_trampoline())
  2413			emit_expoline(jit);
  2414		if (flags & BPF_TRAMP_F_SKIP_FRAME)
  2415			/* br %r14 */
  2416			_EMIT2(0x07fe);
  2417		else
  2418			/* br %r1 */
  2419			_EMIT2(0x07f1);
  2420	
  2421		emit_r1_thunk(jit);
  2422	
  2423		return 0;
  2424	}
  2425	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2023-02-15 17:05 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
2023-01-25 21:37 ` [PATCH bpf-next 01/24] selftests/bpf: Fix liburandom_read.so linker error Ilya Leoshkevich
2023-01-26  1:07   ` Andrii Nakryiko
2023-01-26 13:30     ` Ilya Leoshkevich
2023-01-25 21:37 ` [PATCH bpf-next 02/24] selftests/bpf: Fix symlink creation error Ilya Leoshkevich
2023-01-25 21:37 ` [PATCH bpf-next 03/24] selftests/bpf: Fix fexit_stress on s390x Ilya Leoshkevich
2023-01-25 21:37 ` [PATCH bpf-next 04/24] selftests/bpf: Fix trampoline_count " Ilya Leoshkevich
2023-01-25 21:37 ` [PATCH bpf-next 05/24] selftests/bpf: Fix kfree_skb " Ilya Leoshkevich
2023-01-25 21:37 ` [PATCH bpf-next 06/24] selftests/bpf: Set errno when urand_spawn() fails Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 07/24] selftests/bpf: Fix decap_sanity_ns cleanup Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 08/24] selftests/bpf: Fix verify_pkcs7_sig on s390x Ilya Leoshkevich
2023-01-26  1:06   ` Andrii Nakryiko
2023-01-27 12:36     ` Ilya Leoshkevich
2023-01-27 17:26       ` Andrii Nakryiko
2023-01-25 21:38 ` [PATCH bpf-next 09/24] selftests/bpf: Fix xdp_do_redirect " Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 10/24] selftests/bpf: Fix cgrp_local_storage " Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 11/24] selftests/bpf: Check stack_mprotect() return value Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 12/24] selftests/bpf: Increase SIZEOF_BPF_LOCAL_STORAGE_ELEM on s390x Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 13/24] selftests/bpf: Add a sign-extension test for kfuncs Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 14/24] selftests/bpf: Fix test_lsm on s390x Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 15/24] selftests/bpf: Fix test_xdp_adjust_tail_grow2 " Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 16/24] selftests/bpf: Fix vmlinux test " Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 17/24] libbpf: Read usdt arg spec with bpf_probe_read_kernel() Ilya Leoshkevich
2023-01-26  0:26   ` Andrii Nakryiko
2023-01-26 11:41     ` Ilya Leoshkevich
2023-01-26 19:03       ` Andrii Nakryiko
2023-01-27 11:01         ` Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 18/24] s390/bpf: Fix a typo in a comment Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 19/24] s390/bpf: Add expoline to tail calls Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 20/24] s390/bpf: Implement bpf_arch_text_poke() Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 21/24] bpf: btf: Add BTF_FMODEL_SIGNED_ARG flag Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline() Ilya Leoshkevich
2023-01-26  1:15   ` Andrii Nakryiko
2023-01-26 14:30     ` Ilya Leoshkevich
2023-01-26 19:06       ` Andrii Nakryiko
2023-01-27 11:15         ` Ilya Leoshkevich
2023-01-27 17:30           ` Andrii Nakryiko
2023-02-15  4:07   ` kernel test robot
2023-02-15 17:03   ` kernel test robot
2023-01-25 21:38 ` [PATCH bpf-next 23/24] s390/bpf: Implement bpf_jit_supports_subprog_tailcalls() Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 24/24] s390/bpf: Implement bpf_jit_supports_kfunc_call() Ilya Leoshkevich
2023-01-26  1:28   ` Alexei Starovoitov
2023-01-27 11:36     ` Ilya Leoshkevich
2023-01-27 16:04       ` Alexei Starovoitov
2023-01-26  0:45 ` [PATCH bpf-next 00/24] Support bpf trampoline for s390x Andrii Nakryiko
2023-01-27 16:51   ` Ilya Leoshkevich
2023-01-27 17:24     ` Andrii Nakryiko
2023-01-27 22:50       ` Ilya Leoshkevich
