BPF Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH bpf-next 0/3] samples/bpf: A couple s390 fixes
@ 2020-07-28 12:00 Ilya Leoshkevich
  2020-07-28 12:00 ` [PATCH bpf-next 1/3] samples/bpf: Fix building out of srctree Ilya Leoshkevich
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Ilya Leoshkevich @ 2020-07-28 12:00 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

This series contains small fixes that make samples work on s390:

Patch 1: Fix building M=samples/bpf with O=.
Patch 2: Use more portable __sys_connect instead of
         SYSCALL(sys_connect).
Patch 3: Use bpf_probe_read_kernel instead of bpf_probe_read in
         libbpf.

Ilya Leoshkevich (3):
  samples/bpf: Fix building out of srctree
  samples/bpf: Fix test_map_in_map on s390
  libbpf: Use bpf_probe_read_kernel

 samples/bpf/Makefile               |  1 +
 samples/bpf/test_map_in_map_kern.c |  7 ++--
 tools/lib/bpf/bpf_core_read.h      | 51 ++++++++++++++++--------------
 tools/lib/bpf/bpf_tracing.h        | 15 ++++++---
 4 files changed, 41 insertions(+), 33 deletions(-)

-- 
2.25.4


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH bpf-next 1/3] samples/bpf: Fix building out of srctree
  2020-07-28 12:00 [PATCH bpf-next 0/3] samples/bpf: A couple s390 fixes Ilya Leoshkevich
@ 2020-07-28 12:00 ` Ilya Leoshkevich
  2020-07-28 20:48   ` Song Liu
  2020-07-28 12:00 ` [PATCH bpf-next 2/3] samples/bpf: Fix test_map_in_map on s390 Ilya Leoshkevich
  2020-07-28 12:00 ` [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel Ilya Leoshkevich
  2 siblings, 1 reply; 20+ messages in thread
From: Ilya Leoshkevich @ 2020-07-28 12:00 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

Building BPF samples out of srctree fails, because the output directory
for progs shared with selftests (CGROUP_HELPERS, TRACE_HELPERS) is
missing and the compiler cannot create output files.

Fix by creating the output directory in Makefile.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 samples/bpf/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index f87ee02073ba..81ba0beca0a3 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -301,6 +301,7 @@ $(obj)/hbm_edt_kern.o: $(src)/hbm.h $(src)/hbm_kern.h
 # But, there is no easy way to fix it, so just exclude it since it is
 # useless for BPF samples.
 $(obj)/%.o: $(src)/%.c
+	$(Q)mkdir -p $(@D)
 	@echo "  CLANG-bpf " $@
 	$(Q)$(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(BPF_EXTRA_CFLAGS) \
 		-I$(obj) -I$(srctree)/tools/testing/selftests/bpf/ \
-- 
2.25.4


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH bpf-next 2/3] samples/bpf: Fix test_map_in_map on s390
  2020-07-28 12:00 [PATCH bpf-next 0/3] samples/bpf: A couple s390 fixes Ilya Leoshkevich
  2020-07-28 12:00 ` [PATCH bpf-next 1/3] samples/bpf: Fix building out of srctree Ilya Leoshkevich
@ 2020-07-28 12:00 ` Ilya Leoshkevich
  2020-07-28 20:59   ` Song Liu
  2020-07-28 12:00 ` [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel Ilya Leoshkevich
  2 siblings, 1 reply; 20+ messages in thread
From: Ilya Leoshkevich @ 2020-07-28 12:00 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

s390 uses socketcall multiplexer instead of individual socket syscalls.
Therefore, "kprobe/" SYSCALL(sys_connect) does not trigger and
test_map_in_map fails. Fix by using "kprobe/__sys_connect" instead.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 samples/bpf/test_map_in_map_kern.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/samples/bpf/test_map_in_map_kern.c b/samples/bpf/test_map_in_map_kern.c
index 8def45c5b697..b0200c8eac09 100644
--- a/samples/bpf/test_map_in_map_kern.c
+++ b/samples/bpf/test_map_in_map_kern.c
@@ -103,10 +103,9 @@ static __always_inline int do_inline_hash_lookup(void *inner_map, u32 port)
 	return result ? *result : -ENOENT;
 }
 
-SEC("kprobe/" SYSCALL(sys_connect))
+SEC("kprobe/__sys_connect")
 int trace_sys_connect(struct pt_regs *ctx)
 {
-	struct pt_regs *real_regs = (struct pt_regs *)PT_REGS_PARM1_CORE(ctx);
 	struct sockaddr_in6 *in6;
 	u16 test_case, port, dst6[8];
 	int addrlen, ret, inline_ret, ret_key = 0;
@@ -114,8 +113,8 @@ int trace_sys_connect(struct pt_regs *ctx)
 	void *outer_map, *inner_map;
 	bool inline_hash = false;
 
-	in6 = (struct sockaddr_in6 *)PT_REGS_PARM2_CORE(real_regs);
-	addrlen = (int)PT_REGS_PARM3_CORE(real_regs);
+	in6 = (struct sockaddr_in6 *)PT_REGS_PARM2_CORE(ctx);
+	addrlen = (int)PT_REGS_PARM3_CORE(ctx);
 
 	if (addrlen != sizeof(*in6))
 		return 0;
-- 
2.25.4


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel
  2020-07-28 12:00 [PATCH bpf-next 0/3] samples/bpf: A couple s390 fixes Ilya Leoshkevich
  2020-07-28 12:00 ` [PATCH bpf-next 1/3] samples/bpf: Fix building out of srctree Ilya Leoshkevich
  2020-07-28 12:00 ` [PATCH bpf-next 2/3] samples/bpf: Fix test_map_in_map on s390 Ilya Leoshkevich
@ 2020-07-28 12:00 ` Ilya Leoshkevich
  2020-07-28 19:11   ` Andrii Nakryiko
  2 siblings, 1 reply; 20+ messages in thread
From: Ilya Leoshkevich @ 2020-07-28 12:00 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: bpf, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich

Yet another adaptation to commit 0ebeea8ca8a4 ("bpf: Restrict
bpf_probe_read{, str}() only to archs where they work") that makes more
samples compile on s390.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tools/lib/bpf/bpf_core_read.h | 51 ++++++++++++++++++-----------------
 tools/lib/bpf/bpf_tracing.h   | 15 +++++++----
 2 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/tools/lib/bpf/bpf_core_read.h b/tools/lib/bpf/bpf_core_read.h
index eae5cccff761..b108381fbec4 100644
--- a/tools/lib/bpf/bpf_core_read.h
+++ b/tools/lib/bpf/bpf_core_read.h
@@ -23,28 +23,29 @@ enum bpf_field_info_kind {
 	__builtin_preserve_field_info((src)->field, BPF_FIELD_##info)
 
 #if __BYTE_ORDER == __LITTLE_ENDIAN
-#define __CORE_BITFIELD_PROBE_READ(dst, src, fld)			      \
-	bpf_probe_read((void *)dst,					      \
-		       __CORE_RELO(src, fld, BYTE_SIZE),		      \
-		       (const void *)src + __CORE_RELO(src, fld, BYTE_OFFSET))
+#define __CORE_BITFIELD_PROBE_READ(dst, src, fld)                              \
+	bpf_probe_read_kernel((void *)dst, __CORE_RELO(src, fld, BYTE_SIZE),   \
+			      (const void *)src +                              \
+				      __CORE_RELO(src, fld, BYTE_OFFSET))
 #else
 /* semantics of LSHIFT_64 assumes loading values into low-ordered bytes, so
  * for big-endian we need to adjust destination pointer accordingly, based on
  * field byte size
  */
-#define __CORE_BITFIELD_PROBE_READ(dst, src, fld)			      \
-	bpf_probe_read((void *)dst + (8 - __CORE_RELO(src, fld, BYTE_SIZE)),  \
-		       __CORE_RELO(src, fld, BYTE_SIZE),		      \
-		       (const void *)src + __CORE_RELO(src, fld, BYTE_OFFSET))
+#define __CORE_BITFIELD_PROBE_READ(dst, src, fld)                              \
+	bpf_probe_read_kernel(                                                 \
+		(void *)dst + (8 - __CORE_RELO(src, fld, BYTE_SIZE)),          \
+		__CORE_RELO(src, fld, BYTE_SIZE),                              \
+		(const void *)src + __CORE_RELO(src, fld, BYTE_OFFSET))
 #endif
 
 /*
  * Extract bitfield, identified by s->field, and return its value as u64.
  * All this is done in relocatable manner, so bitfield changes such as
  * signedness, bit size, offset changes, this will be handled automatically.
- * This version of macro is using bpf_probe_read() to read underlying integer
- * storage. Macro functions as an expression and its return type is
- * bpf_probe_read()'s return value: 0, on success, <0 on error.
+ * This version of macro is using bpf_probe_read_kernel() to read underlying
+ * integer storage. Macro functions as an expression and its return type is
+ * bpf_probe_read_kernel()'s return value: 0, on success, <0 on error.
  */
 #define BPF_CORE_READ_BITFIELD_PROBED(s, field) ({			      \
 	unsigned long long val = 0;					      \
@@ -99,9 +100,9 @@ enum bpf_field_info_kind {
 	__builtin_preserve_field_info(field, BPF_FIELD_BYTE_SIZE)
 
 /*
- * bpf_core_read() abstracts away bpf_probe_read() call and captures offset
- * relocation for source address using __builtin_preserve_access_index()
- * built-in, provided by Clang.
+ * bpf_core_read() abstracts away bpf_probe_read_kernel() call and captures
+ * offset relocation for source address using
+ * __builtin_preserve_access_index() built-in, provided by Clang.
  *
  * __builtin_preserve_access_index() takes as an argument an expression of
  * taking an address of a field within struct/union. It makes compiler emit
@@ -114,18 +115,18 @@ enum bpf_field_info_kind {
  * actual field offset, based on target kernel BTF type that matches original
  * (local) BTF, used to record relocation.
  */
-#define bpf_core_read(dst, sz, src)					    \
-	bpf_probe_read(dst, sz,						    \
-		       (const void *)__builtin_preserve_access_index(src))
+#define bpf_core_read(dst, sz, src)                                            \
+	bpf_probe_read_kernel(                                                 \
+		dst, sz, (const void *)__builtin_preserve_access_index(src))
 
 /*
- * bpf_core_read_str() is a thin wrapper around bpf_probe_read_str()
+ * bpf_core_read_str() is a thin wrapper around bpf_probe_read_kernel_str()
  * additionally emitting BPF CO-RE field relocation for specified source
  * argument.
  */
-#define bpf_core_read_str(dst, sz, src)					    \
-	bpf_probe_read_str(dst, sz,					    \
-			   (const void *)__builtin_preserve_access_index(src))
+#define bpf_core_read_str(dst, sz, src)                                        \
+	bpf_probe_read_kernel_str(                                             \
+		dst, sz, (const void *)__builtin_preserve_access_index(src))
 
 #define ___concat(a, b) a ## b
 #define ___apply(fn, n) ___concat(fn, n)
@@ -239,15 +240,17 @@ enum bpf_field_info_kind {
  *	int x = BPF_CORE_READ(s, a.b.c, d.e, f, g);
  *
  * BPF_CORE_READ will decompose above statement into 4 bpf_core_read (BPF
- * CO-RE relocatable bpf_probe_read() wrapper) calls, logically equivalent to:
+ * CO-RE relocatable bpf_probe_read_kernel() wrapper) calls, logically
+ * equivalent to:
  * 1. const void *__t = s->a.b.c;
  * 2. __t = __t->d.e;
  * 3. __t = __t->f;
  * 4. return __t->g;
  *
  * Equivalence is logical, because there is a heavy type casting/preservation
- * involved, as well as all the reads are happening through bpf_probe_read()
- * calls using __builtin_preserve_access_index() to emit CO-RE relocations.
+ * involved, as well as all the reads are happening through
+ * bpf_probe_read_kernel() calls using __builtin_preserve_access_index() to
+ * emit CO-RE relocations.
  *
  * N.B. Only up to 9 "field accessors" are supported, which should be more
  * than enough for any practical purpose.
diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index 58eceb884df3..4eb3be4130f0 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -288,11 +288,16 @@ struct pt_regs;
 #define BPF_KPROBE_READ_RET_IP(ip, ctx)		({ (ip) = PT_REGS_RET(ctx); })
 #define BPF_KRETPROBE_READ_RET_IP		BPF_KPROBE_READ_RET_IP
 #else
-#define BPF_KPROBE_READ_RET_IP(ip, ctx)					    \
-	({ bpf_probe_read(&(ip), sizeof(ip), (void *)PT_REGS_RET(ctx)); })
-#define BPF_KRETPROBE_READ_RET_IP(ip, ctx)				    \
-	({ bpf_probe_read(&(ip), sizeof(ip),				    \
-			  (void *)(PT_REGS_FP(ctx) + sizeof(ip))); })
+#define BPF_KPROBE_READ_RET_IP(ip, ctx)                                        \
+	({                                                                     \
+		bpf_probe_read_kernel(&(ip), sizeof(ip),                       \
+				      (void *)PT_REGS_RET(ctx));               \
+	})
+#define BPF_KRETPROBE_READ_RET_IP(ip, ctx)                                     \
+	({                                                                     \
+		bpf_probe_read_kernel(&(ip), sizeof(ip),                       \
+				      (void *)(PT_REGS_FP(ctx) + sizeof(ip))); \
+	})
 #endif
 
 #define ___bpf_concat(a, b) a ## b
-- 
2.25.4


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel
  2020-07-28 12:00 ` [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel Ilya Leoshkevich
@ 2020-07-28 19:11   ` Andrii Nakryiko
  2020-07-28 21:16     ` Daniel Borkmann
  0 siblings, 1 reply; 20+ messages in thread
From: Andrii Nakryiko @ 2020-07-28 19:11 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, Heiko Carstens, Vasily Gorbik

On Tue, Jul 28, 2020 at 5:15 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> Yet another adaptation to commit 0ebeea8ca8a4 ("bpf: Restrict
> bpf_probe_read{, str}() only to archs where they work") that makes more
> samples compile on s390.
>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---

Sorry, we can't do this yet. This will break on older kernels that
don't yet have bpf_probe_read_kernel() implemented. Met and Yonghong
are working on extending a set of CO-RE relocations, that would allow
to do bpf_probe_read_kernel() detection on BPF side, transparently for
an application, and will pick either bpf_probe_read() or
bpf_probe_read_kernel(). It should be ready soon (this or next week,
most probably), though it will have dependency on the latest Clang.
But for now, please don't change this.

>  tools/lib/bpf/bpf_core_read.h | 51 ++++++++++++++++++-----------------
>  tools/lib/bpf/bpf_tracing.h   | 15 +++++++----
>  2 files changed, 37 insertions(+), 29 deletions(-)
>

[...]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 1/3] samples/bpf: Fix building out of srctree
  2020-07-28 12:00 ` [PATCH bpf-next 1/3] samples/bpf: Fix building out of srctree Ilya Leoshkevich
@ 2020-07-28 20:48   ` Song Liu
  2020-07-28 21:12     ` Ilya Leoshkevich
  0 siblings, 1 reply; 20+ messages in thread
From: Song Liu @ 2020-07-28 20:48 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, Heiko Carstens, Vasily Gorbik

On Tue, Jul 28, 2020 at 5:15 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> Building BPF samples out of srctree fails, because the output directory
> for progs shared with selftests (CGROUP_HELPERS, TRACE_HELPERS) is
> missing and the compiler cannot create output files.
>
> Fix by creating the output directory in Makefile.

What is the make command line we use here? I am trying:

   make M=samples/bpf O=./xxx

w/o this patch, make created ./xxx/samples/bpf.

Did I miss something?

Thanks,
Song

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 2/3] samples/bpf: Fix test_map_in_map on s390
  2020-07-28 12:00 ` [PATCH bpf-next 2/3] samples/bpf: Fix test_map_in_map on s390 Ilya Leoshkevich
@ 2020-07-28 20:59   ` Song Liu
  2020-07-28 22:05     ` Ilya Leoshkevich
  0 siblings, 1 reply; 20+ messages in thread
From: Song Liu @ 2020-07-28 20:59 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, Heiko Carstens, Vasily Gorbik

On Tue, Jul 28, 2020 at 5:14 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> s390 uses socketcall multiplexer instead of individual socket syscalls.
> Therefore, "kprobe/" SYSCALL(sys_connect) does not trigger and
> test_map_in_map fails. Fix by using "kprobe/__sys_connect" instead.

samples/bpf is in semi-deprecated state. I tried for quite some time, but still
cannot build it all successfully. So I apologize for bounding the
question to you...

From the code, we do the SYSCALL() trick to change the exact name for
different architecture. Would this change break the same file for x86?

Thanks,
Song

>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---
>  samples/bpf/test_map_in_map_kern.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/samples/bpf/test_map_in_map_kern.c b/samples/bpf/test_map_in_map_kern.c
> index 8def45c5b697..b0200c8eac09 100644
> --- a/samples/bpf/test_map_in_map_kern.c
> +++ b/samples/bpf/test_map_in_map_kern.c
> @@ -103,10 +103,9 @@ static __always_inline int do_inline_hash_lookup(void *inner_map, u32 port)
>         return result ? *result : -ENOENT;
>  }
>
> -SEC("kprobe/" SYSCALL(sys_connect))
> +SEC("kprobe/__sys_connect")
>  int trace_sys_connect(struct pt_regs *ctx)
>  {
> -       struct pt_regs *real_regs = (struct pt_regs *)PT_REGS_PARM1_CORE(ctx);
>         struct sockaddr_in6 *in6;
>         u16 test_case, port, dst6[8];
>         int addrlen, ret, inline_ret, ret_key = 0;
> @@ -114,8 +113,8 @@ int trace_sys_connect(struct pt_regs *ctx)
>         void *outer_map, *inner_map;
>         bool inline_hash = false;
>
> -       in6 = (struct sockaddr_in6 *)PT_REGS_PARM2_CORE(real_regs);
> -       addrlen = (int)PT_REGS_PARM3_CORE(real_regs);
> +       in6 = (struct sockaddr_in6 *)PT_REGS_PARM2_CORE(ctx);
> +       addrlen = (int)PT_REGS_PARM3_CORE(ctx);
>
>         if (addrlen != sizeof(*in6))
>                 return 0;
> --
> 2.25.4
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 1/3] samples/bpf: Fix building out of srctree
  2020-07-28 20:48   ` Song Liu
@ 2020-07-28 21:12     ` Ilya Leoshkevich
  2020-07-28 21:37       ` Ilya Leoshkevich
  0 siblings, 1 reply; 20+ messages in thread
From: Ilya Leoshkevich @ 2020-07-28 21:12 UTC (permalink / raw)
  To: Song Liu
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, Heiko Carstens, Vasily Gorbik

On Tue, 2020-07-28 at 13:48 -0700, Song Liu wrote:
> On Tue, Jul 28, 2020 at 5:15 AM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > Building BPF samples out of srctree fails, because the output
> > directory
> > for progs shared with selftests (CGROUP_HELPERS, TRACE_HELPERS) is
> > missing and the compiler cannot create output files.
> > 
> > Fix by creating the output directory in Makefile.
> 
> What is the make command line we use here? I am trying:
> 
>    make M=samples/bpf O=./xxx
> 
> w/o this patch, make created ./xxx/samples/bpf.
> 
> Did I miss something?

I'm using

make O=$HOME/linux-build 'CC=ccache gcc' M=samples/bpf -j12

My make version is GNU Make 4.2.1.

Best regards,
Ilya


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel
  2020-07-28 19:11   ` Andrii Nakryiko
@ 2020-07-28 21:16     ` Daniel Borkmann
  2020-07-29  4:06       ` Andrii Nakryiko
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Borkmann @ 2020-07-28 21:16 UTC (permalink / raw)
  To: Andrii Nakryiko, Ilya Leoshkevich
  Cc: Alexei Starovoitov, bpf, Heiko Carstens, Vasily Gorbik

On 7/28/20 9:11 PM, Andrii Nakryiko wrote:
> On Tue, Jul 28, 2020 at 5:15 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>>
>> Yet another adaptation to commit 0ebeea8ca8a4 ("bpf: Restrict
>> bpf_probe_read{, str}() only to archs where they work") that makes more
>> samples compile on s390.
>>
>> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> 
> Sorry, we can't do this yet. This will break on older kernels that
> don't yet have bpf_probe_read_kernel() implemented. Met and Yonghong
> are working on extending a set of CO-RE relocations, that would allow
> to do bpf_probe_read_kernel() detection on BPF side, transparently for
> an application, and will pick either bpf_probe_read() or
> bpf_probe_read_kernel(). It should be ready soon (this or next week,
> most probably), though it will have dependency on the latest Clang.
> But for now, please don't change this.

Could you elaborate what this means wrt dependency on latest clang? Given clang
releases have a rather long cadence, what about existing users with current clang
releases?

>>   tools/lib/bpf/bpf_core_read.h | 51 ++++++++++++++++++-----------------
>>   tools/lib/bpf/bpf_tracing.h   | 15 +++++++----
>>   2 files changed, 37 insertions(+), 29 deletions(-)
>>
> 
> [...]
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 1/3] samples/bpf: Fix building out of srctree
  2020-07-28 21:12     ` Ilya Leoshkevich
@ 2020-07-28 21:37       ` Ilya Leoshkevich
  0 siblings, 0 replies; 20+ messages in thread
From: Ilya Leoshkevich @ 2020-07-28 21:37 UTC (permalink / raw)
  To: Song Liu
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, Heiko Carstens, Vasily Gorbik

On Tue, 2020-07-28 at 23:12 +0200, Ilya Leoshkevich wrote:
> On Tue, 2020-07-28 at 13:48 -0700, Song Liu wrote:
> > On Tue, Jul 28, 2020 at 5:15 AM Ilya Leoshkevich <iii@linux.ibm.com
> > >
> > wrote:
> > > Building BPF samples out of srctree fails, because the output
> > > directory
> > > for progs shared with selftests (CGROUP_HELPERS, TRACE_HELPERS)
> > > is
> > > missing and the compiler cannot create output files.
> > > 
> > > Fix by creating the output directory in Makefile.
> > 
> > What is the make command line we use here? I am trying:
> > 
> >    make M=samples/bpf O=./xxx
> > 
> > w/o this patch, make created ./xxx/samples/bpf.
> > 
> > Did I miss something?
> 
> I'm using
> 
> make O=$HOME/linux-build 'CC=ccache gcc' M=samples/bpf -j12
> 
> My make version is GNU Make 4.2.1.

Hmm, I wanted to elaborate a little bit more on why this fix should
work, and realized that it works only by accident: cgroup_helpers
and trace_helpers are userspace code, and my fix modifies the recipe
for building bpf programs.

Please disregard this patch - I will have to come up with a better fix.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 2/3] samples/bpf: Fix test_map_in_map on s390
  2020-07-28 20:59   ` Song Liu
@ 2020-07-28 22:05     ` Ilya Leoshkevich
  0 siblings, 0 replies; 20+ messages in thread
From: Ilya Leoshkevich @ 2020-07-28 22:05 UTC (permalink / raw)
  To: Song Liu
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, Heiko Carstens, Vasily Gorbik

On Tue, 2020-07-28 at 13:59 -0700, Song Liu wrote:
> On Tue, Jul 28, 2020 at 5:14 AM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > s390 uses socketcall multiplexer instead of individual socket
> > syscalls.
> > Therefore, "kprobe/" SYSCALL(sys_connect) does not trigger and
> > test_map_in_map fails. Fix by using "kprobe/__sys_connect" instead.
> 
> samples/bpf is in semi-deprecated state. I tried for quite some time,
> but still
> cannot build it all successfully. So I apologize for bounding the
> question to you...
> 
> From the code, we do the SYSCALL() trick to change the exact name for
> different architecture. Would this change break the same file for
> x86?

No, it shouldn't - __sys_connect exists on all architectures and gets
control from both regular socket syscalls and socketcall multiplexer.
I tested it on x86 and it worked for me.
It's also already used by
tools/testing/selftests/bpf/progs/test_probe_user.c

Best regards,
Ilya


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel
  2020-07-28 21:16     ` Daniel Borkmann
@ 2020-07-29  4:06       ` Andrii Nakryiko
  2020-07-29 21:01         ` Daniel Borkmann
  0 siblings, 1 reply; 20+ messages in thread
From: Andrii Nakryiko @ 2020-07-29  4:06 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Ilya Leoshkevich, Alexei Starovoitov, bpf, Heiko Carstens, Vasily Gorbik

On Tue, Jul 28, 2020 at 2:16 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 7/28/20 9:11 PM, Andrii Nakryiko wrote:
> > On Tue, Jul 28, 2020 at 5:15 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
> >>
> >> Yet another adaptation to commit 0ebeea8ca8a4 ("bpf: Restrict
> >> bpf_probe_read{, str}() only to archs where they work") that makes more
> >> samples compile on s390.
> >>
> >> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> >
> > Sorry, we can't do this yet. This will break on older kernels that
> > don't yet have bpf_probe_read_kernel() implemented. Met and Yonghong
> > are working on extending a set of CO-RE relocations, that would allow
> > to do bpf_probe_read_kernel() detection on BPF side, transparently for
> > an application, and will pick either bpf_probe_read() or
> > bpf_probe_read_kernel(). It should be ready soon (this or next week,
> > most probably), though it will have dependency on the latest Clang.
> > But for now, please don't change this.
>
> Could you elaborate what this means wrt dependency on latest clang? Given clang
> releases have a rather long cadence, what about existing users with current clang
> releases?

So the overall idea is to use something like this to do kernel reads:

static __always_inline int bpf_probe_read_universal(void *dst, u32 sz,
const void *src)
{
    if (bpf_core_type_exists(btf_bpf_probe_read_kernel))
        return bpf_probe_read_kernel(dst, sz, src);
    else
        return bpf_probe_read(dst, sz, src);
}

And then use bpf_probe_read_universal() in BPF_CORE_READ and family.

This approach relies on few things:

1. each BPF helper has a corresponding btf_<helper-name> type defined for it
2. bpf_core_type_exists(some_type) returns 0 or 1, depending if
specified type is found in kernel BTF (so needs kernel BTF, of
course). This is the part me and Yonghong are working on at the
moment.
3. verifier's dead code elimination, which will leave only
bpf_probe_read() or bpf_probe_read_kernel() calls and will remove the
other one. So on older kernels, there will never be unsupoorted call
to bpf_probe_read_kernel().


The new type existence relocation requires the latest Clang. So the
way to deal with older Clangs would be to just fallback to
bpf_probe_read, if we detect that Clang is too old and can't emit
necessary relocation.

If that's not an acceptable plan, then one can "parameterize"
BPF_CORE_READ macro family by re-defining bpf_core_read() macro. Right
now it's defined as:

#define bpf_core_read(dst, sz, src) \
    bpf_probe_read(dst, sz, (const void *)__builtin_preserve_access_index(src))

Re-defining it in terms of bpf_probe_read_kernel is trivial, but I
can't do it for BPF_CORE_READ, because it will break all the users of
bpf_core_read.h that run on older kernels.


>
> >>   tools/lib/bpf/bpf_core_read.h | 51 ++++++++++++++++++-----------------
> >>   tools/lib/bpf/bpf_tracing.h   | 15 +++++++----
> >>   2 files changed, 37 insertions(+), 29 deletions(-)
> >>
> >
> > [...]
> >
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel
  2020-07-29  4:06       ` Andrii Nakryiko
@ 2020-07-29 21:01         ` Daniel Borkmann
  2020-07-29 21:36           ` Andrii Nakryiko
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Borkmann @ 2020-07-29 21:01 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Ilya Leoshkevich, Alexei Starovoitov, bpf, Heiko Carstens, Vasily Gorbik

On 7/29/20 6:06 AM, Andrii Nakryiko wrote:
> On Tue, Jul 28, 2020 at 2:16 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 7/28/20 9:11 PM, Andrii Nakryiko wrote:
>>> On Tue, Jul 28, 2020 at 5:15 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>>>>
>>>> Yet another adaptation to commit 0ebeea8ca8a4 ("bpf: Restrict
>>>> bpf_probe_read{, str}() only to archs where they work") that makes more
>>>> samples compile on s390.
>>>>
>>>> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
>>>
>>> Sorry, we can't do this yet. This will break on older kernels that
>>> don't yet have bpf_probe_read_kernel() implemented. Met and Yonghong
>>> are working on extending a set of CO-RE relocations, that would allow
>>> to do bpf_probe_read_kernel() detection on BPF side, transparently for
>>> an application, and will pick either bpf_probe_read() or
>>> bpf_probe_read_kernel(). It should be ready soon (this or next week,
>>> most probably), though it will have dependency on the latest Clang.
>>> But for now, please don't change this.
>>
>> Could you elaborate what this means wrt dependency on latest clang? Given clang
>> releases have a rather long cadence, what about existing users with current clang
>> releases?
> 
> So the overall idea is to use something like this to do kernel reads:
> 
> static __always_inline int bpf_probe_read_universal(void *dst, u32 sz,
> const void *src)
> {
>      if (bpf_core_type_exists(btf_bpf_probe_read_kernel))
>          return bpf_probe_read_kernel(dst, sz, src);
>      else
>          return bpf_probe_read(dst, sz, src);
> }
> 
> And then use bpf_probe_read_universal() in BPF_CORE_READ and family.
> 
> This approach relies on few things:
> 
> 1. each BPF helper has a corresponding btf_<helper-name> type defined for it
> 2. bpf_core_type_exists(some_type) returns 0 or 1, depending if
> specified type is found in kernel BTF (so needs kernel BTF, of
> course). This is the part me and Yonghong are working on at the
> moment.
> 3. verifier's dead code elimination, which will leave only
> bpf_probe_read() or bpf_probe_read_kernel() calls and will remove the
> other one. So on older kernels, there will never be unsupoorted call
> to bpf_probe_read_kernel().
> 
> The new type existence relocation requires the latest Clang. So the
> way to deal with older Clangs would be to just fallback to
> bpf_probe_read, if we detect that Clang is too old and can't emit
> necessary relocation.

Okay, seems reasonable overall. One question though: couldn't libbpf transparently
fix up the selection of bpf_probe_read() vs bpf_probe_read_kernel()? E.g. it would
probe the kernel whether bpf_probe_read_kernel() is available and if it is then it
would rewrite the raw call number from the instruction from bpf_probe_read() into
the one for bpf_probe_read_kernel()? I guess the question then becomes whether the
original use for bpf_probe_read() was related to CO-RE. But I think this could also
be overcome by adding a fake helper signature in libbpf with a unreasonable high
number that is dedicated to probing mem via CO-RE and then libbpf picks the right
underlying helper call number for the insn. That avoids fiddling with macros and
need for new clang version, no (unless I'm missing something)?

> If that's not an acceptable plan, then one can "parameterize"
> BPF_CORE_READ macro family by re-defining bpf_core_read() macro. Right
> now it's defined as:
> 
> #define bpf_core_read(dst, sz, src) \
>      bpf_probe_read(dst, sz, (const void *)__builtin_preserve_access_index(src))
> 
> Re-defining it in terms of bpf_probe_read_kernel is trivial, but I
> can't do it for BPF_CORE_READ, because it will break all the users of
> bpf_core_read.h that run on older kernels.
> 
> 
>>
>>>>    tools/lib/bpf/bpf_core_read.h | 51 ++++++++++++++++++-----------------
>>>>    tools/lib/bpf/bpf_tracing.h   | 15 +++++++----
>>>>    2 files changed, 37 insertions(+), 29 deletions(-)
>>>>
>>>
>>> [...]
>>>
>>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel
  2020-07-29 21:01         ` Daniel Borkmann
@ 2020-07-29 21:36           ` Andrii Nakryiko
  2020-07-29 21:54             ` Daniel Borkmann
  0 siblings, 1 reply; 20+ messages in thread
From: Andrii Nakryiko @ 2020-07-29 21:36 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Ilya Leoshkevich, Alexei Starovoitov, bpf, Heiko Carstens, Vasily Gorbik

On Wed, Jul 29, 2020 at 2:01 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 7/29/20 6:06 AM, Andrii Nakryiko wrote:
> > On Tue, Jul 28, 2020 at 2:16 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >> On 7/28/20 9:11 PM, Andrii Nakryiko wrote:
> >>> On Tue, Jul 28, 2020 at 5:15 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
> >>>>
> >>>> Yet another adaptation to commit 0ebeea8ca8a4 ("bpf: Restrict
> >>>> bpf_probe_read{, str}() only to archs where they work") that makes more
> >>>> samples compile on s390.
> >>>>
> >>>> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> >>>
> >>> Sorry, we can't do this yet. This will break on older kernels that
> >>> don't yet have bpf_probe_read_kernel() implemented. Met and Yonghong
> >>> are working on extending a set of CO-RE relocations, that would allow
> >>> to do bpf_probe_read_kernel() detection on BPF side, transparently for
> >>> an application, and will pick either bpf_probe_read() or
> >>> bpf_probe_read_kernel(). It should be ready soon (this or next week,
> >>> most probably), though it will have dependency on the latest Clang.
> >>> But for now, please don't change this.
> >>
> >> Could you elaborate what this means wrt dependency on latest clang? Given clang
> >> releases have a rather long cadence, what about existing users with current clang
> >> releases?
> >
> > So the overall idea is to use something like this to do kernel reads:
> >
> > static __always_inline int bpf_probe_read_universal(void *dst, u32 sz,
> > const void *src)
> > {
> >      if (bpf_core_type_exists(btf_bpf_probe_read_kernel))
> >          return bpf_probe_read_kernel(dst, sz, src);
> >      else
> >          return bpf_probe_read(dst, sz, src);
> > }
> >
> > And then use bpf_probe_read_universal() in BPF_CORE_READ and family.
> >
> > This approach relies on few things:
> >
> > 1. each BPF helper has a corresponding btf_<helper-name> type defined for it
> > 2. bpf_core_type_exists(some_type) returns 0 or 1, depending if
> > specified type is found in kernel BTF (so needs kernel BTF, of
> > course). This is the part me and Yonghong are working on at the
> > moment.
> > 3. verifier's dead code elimination, which will leave only
> > bpf_probe_read() or bpf_probe_read_kernel() calls and will remove the
> > other one. So on older kernels, there will never be unsupoorted call
> > to bpf_probe_read_kernel().
> >
> > The new type existence relocation requires the latest Clang. So the
> > way to deal with older Clangs would be to just fallback to
> > bpf_probe_read, if we detect that Clang is too old and can't emit
> > necessary relocation.
>
> Okay, seems reasonable overall. One question though: couldn't libbpf transparently
> fix up the selection of bpf_probe_read() vs bpf_probe_read_kernel()? E.g. it would
> probe the kernel whether bpf_probe_read_kernel() is available and if it is then it
> would rewrite the raw call number from the instruction from bpf_probe_read() into
> the one for bpf_probe_read_kernel()? I guess the question then becomes whether the
> original use for bpf_probe_read() was related to CO-RE. But I think this could also
> be overcome by adding a fake helper signature in libbpf with a unreasonable high
> number that is dedicated to probing mem via CO-RE and then libbpf picks the right
> underlying helper call number for the insn. That avoids fiddling with macros and
> need for new clang version, no (unless I'm missing something)?

Libbpf could do it, but I'm a bit worried that unconditionally
changing bpf_probe_read() into bpf_probe_read_kernel() is going to be
wrong in some cases. If that wasn't the case, why wouldn't we just
re-purpose bpf_probe_read() into bpf_probe_read_kernel() in kernel
itself, right?

But fear not about old Clang support. The bpf_core_type_exists() will
use a new built-in, and I'll be able to detect its presence with
__has_builtin(X) check in Clang. So it will be completely transparent
to users in the end.

>
> > If that's not an acceptable plan, then one can "parameterize"
> > BPF_CORE_READ macro family by re-defining bpf_core_read() macro. Right
> > now it's defined as:
> >
> > #define bpf_core_read(dst, sz, src) \
> >      bpf_probe_read(dst, sz, (const void *)__builtin_preserve_access_index(src))
> >
> > Re-defining it in terms of bpf_probe_read_kernel is trivial, but I
> > can't do it for BPF_CORE_READ, because it will break all the users of
> > bpf_core_read.h that run on older kernels.
> >
> >
> >>
> >>>>    tools/lib/bpf/bpf_core_read.h | 51 ++++++++++++++++++-----------------
> >>>>    tools/lib/bpf/bpf_tracing.h   | 15 +++++++----
> >>>>    2 files changed, 37 insertions(+), 29 deletions(-)
> >>>>
> >>>
> >>> [...]
> >>>
> >>
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel
  2020-07-29 21:36           ` Andrii Nakryiko
@ 2020-07-29 21:54             ` Daniel Borkmann
  2020-07-29 22:05               ` Andrii Nakryiko
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Borkmann @ 2020-07-29 21:54 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Ilya Leoshkevich, Alexei Starovoitov, bpf, Heiko Carstens, Vasily Gorbik

On 7/29/20 11:36 PM, Andrii Nakryiko wrote:
> On Wed, Jul 29, 2020 at 2:01 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 7/29/20 6:06 AM, Andrii Nakryiko wrote:
>>> On Tue, Jul 28, 2020 at 2:16 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>> On 7/28/20 9:11 PM, Andrii Nakryiko wrote:
>>>>> On Tue, Jul 28, 2020 at 5:15 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>>>>>>
>>>>>> Yet another adaptation to commit 0ebeea8ca8a4 ("bpf: Restrict
>>>>>> bpf_probe_read{, str}() only to archs where they work") that makes more
>>>>>> samples compile on s390.
>>>>>>
>>>>>> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
>>>>>
>>>>> Sorry, we can't do this yet. This will break on older kernels that
>>>>> don't yet have bpf_probe_read_kernel() implemented. Met and Yonghong
>>>>> are working on extending a set of CO-RE relocations, that would allow
>>>>> to do bpf_probe_read_kernel() detection on BPF side, transparently for
>>>>> an application, and will pick either bpf_probe_read() or
>>>>> bpf_probe_read_kernel(). It should be ready soon (this or next week,
>>>>> most probably), though it will have dependency on the latest Clang.
>>>>> But for now, please don't change this.
>>>>
>>>> Could you elaborate what this means wrt dependency on latest clang? Given clang
>>>> releases have a rather long cadence, what about existing users with current clang
>>>> releases?
>>>
>>> So the overall idea is to use something like this to do kernel reads:
>>>
>>> static __always_inline int bpf_probe_read_universal(void *dst, u32 sz,
>>> const void *src)
>>> {
>>>       if (bpf_core_type_exists(btf_bpf_probe_read_kernel))
>>>           return bpf_probe_read_kernel(dst, sz, src);
>>>       else
>>>           return bpf_probe_read(dst, sz, src);
>>> }
>>>
>>> And then use bpf_probe_read_universal() in BPF_CORE_READ and family.
>>>
>>> This approach relies on few things:
>>>
>>> 1. each BPF helper has a corresponding btf_<helper-name> type defined for it
>>> 2. bpf_core_type_exists(some_type) returns 0 or 1, depending if
>>> specified type is found in kernel BTF (so needs kernel BTF, of
>>> course). This is the part me and Yonghong are working on at the
>>> moment.
>>> 3. verifier's dead code elimination, which will leave only
>>> bpf_probe_read() or bpf_probe_read_kernel() calls and will remove the
>>> other one. So on older kernels, there will never be unsupoorted call
>>> to bpf_probe_read_kernel().
>>>
>>> The new type existence relocation requires the latest Clang. So the
>>> way to deal with older Clangs would be to just fallback to
>>> bpf_probe_read, if we detect that Clang is too old and can't emit
>>> necessary relocation.
>>
>> Okay, seems reasonable overall. One question though: couldn't libbpf transparently
>> fix up the selection of bpf_probe_read() vs bpf_probe_read_kernel()? E.g. it would
>> probe the kernel whether bpf_probe_read_kernel() is available and if it is then it
>> would rewrite the raw call number from the instruction from bpf_probe_read() into
>> the one for bpf_probe_read_kernel()? I guess the question then becomes whether the
>> original use for bpf_probe_read() was related to CO-RE. But I think this could also
>> be overcome by adding a fake helper signature in libbpf with a unreasonable high
>> number that is dedicated to probing mem via CO-RE and then libbpf picks the right
>> underlying helper call number for the insn. That avoids fiddling with macros and
>> need for new clang version, no (unless I'm missing something)?
> 
> Libbpf could do it, but I'm a bit worried that unconditionally
> changing bpf_probe_read() into bpf_probe_read_kernel() is going to be
> wrong in some cases. If that wasn't the case, why wouldn't we just
> re-purpose bpf_probe_read() into bpf_probe_read_kernel() in kernel
> itself, right?

Yes, that is correct, but I mentioned above that this new 'fake' helper call number
that libbpf would be fixing up would only be used for bpf_probe_read{,str}() inside
bpf_core_read.h.

Small example, bpf_core_read.h would be changed to (just an extract):

diff --git a/tools/lib/bpf/bpf_core_read.h b/tools/lib/bpf/bpf_core_read.h
index eae5cccff761..4bddb2ddf3f0 100644
--- a/tools/lib/bpf/bpf_core_read.h
+++ b/tools/lib/bpf/bpf_core_read.h
@@ -115,7 +115,7 @@ enum bpf_field_info_kind {
   * (local) BTF, used to record relocation.
   */
  #define bpf_core_read(dst, sz, src)                                        \
-       bpf_probe_read(dst, sz,                                             \
+       bpf_probe_read_selector(dst, sz,                                                    \
                        (const void *)__builtin_preserve_access_index(src))

  /*
@@ -124,7 +124,7 @@ enum bpf_field_info_kind {
   * argument.
   */
  #define bpf_core_read_str(dst, sz, src)                                            \
-       bpf_probe_read_str(dst, sz,                                         \
+       bpf_probe_read_str_selector(dst, sz,                                        \
                            (const void *)__builtin_preserve_access_index(src))

  #define ___concat(a, b) a ## b

And bpf_probe_read_{,str_}selector would be defined as e.g. ...

static long (*bpf_probe_read_selector)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) -1;
static long (*bpf_probe_read_str_selector)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) -2;

... where libbpf would do the fix up to either 4 or 45 for insn->imm. But it's still
confined to usage in bpf_core_read.h when the CO-RE macros are used.

> But fear not about old Clang support. The bpf_core_type_exists() will
> use a new built-in, and I'll be able to detect its presence with
> __has_builtin(X) check in Clang. So it will be completely transparent
> to users in the end.

Ok.

>>> If that's not an acceptable plan, then one can "parameterize"
>>> BPF_CORE_READ macro family by re-defining bpf_core_read() macro. Right
>>> now it's defined as:
>>>
>>> #define bpf_core_read(dst, sz, src) \
>>>       bpf_probe_read(dst, sz, (const void *)__builtin_preserve_access_index(src))
>>>
>>> Re-defining it in terms of bpf_probe_read_kernel is trivial, but I
>>> can't do it for BPF_CORE_READ, because it will break all the users of
>>> bpf_core_read.h that run on older kernels.
>>>
>>>
>>>>
>>>>>>     tools/lib/bpf/bpf_core_read.h | 51 ++++++++++++++++++-----------------
>>>>>>     tools/lib/bpf/bpf_tracing.h   | 15 +++++++----
>>>>>>     2 files changed, 37 insertions(+), 29 deletions(-)
>>>>>>
>>>>>
>>>>> [...]
>>>>>
>>>>
>>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel
  2020-07-29 21:54             ` Daniel Borkmann
@ 2020-07-29 22:05               ` Andrii Nakryiko
  2020-07-29 22:12                 ` Daniel Borkmann
  0 siblings, 1 reply; 20+ messages in thread
From: Andrii Nakryiko @ 2020-07-29 22:05 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Ilya Leoshkevich, Alexei Starovoitov, bpf, Heiko Carstens, Vasily Gorbik

On Wed, Jul 29, 2020 at 2:54 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 7/29/20 11:36 PM, Andrii Nakryiko wrote:
> > On Wed, Jul 29, 2020 at 2:01 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >> On 7/29/20 6:06 AM, Andrii Nakryiko wrote:
> >>> On Tue, Jul 28, 2020 at 2:16 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>> On 7/28/20 9:11 PM, Andrii Nakryiko wrote:
> >>>>> On Tue, Jul 28, 2020 at 5:15 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
> >>>>>>
> >>>>>> Yet another adaptation to commit 0ebeea8ca8a4 ("bpf: Restrict
> >>>>>> bpf_probe_read{, str}() only to archs where they work") that makes more
> >>>>>> samples compile on s390.
> >>>>>>
> >>>>>> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> >>>>>
> >>>>> Sorry, we can't do this yet. This will break on older kernels that
> >>>>> don't yet have bpf_probe_read_kernel() implemented. Met and Yonghong
> >>>>> are working on extending a set of CO-RE relocations, that would allow
> >>>>> to do bpf_probe_read_kernel() detection on BPF side, transparently for
> >>>>> an application, and will pick either bpf_probe_read() or
> >>>>> bpf_probe_read_kernel(). It should be ready soon (this or next week,
> >>>>> most probably), though it will have dependency on the latest Clang.
> >>>>> But for now, please don't change this.
> >>>>
> >>>> Could you elaborate what this means wrt dependency on latest clang? Given clang
> >>>> releases have a rather long cadence, what about existing users with current clang
> >>>> releases?
> >>>
> >>> So the overall idea is to use something like this to do kernel reads:
> >>>
> >>> static __always_inline int bpf_probe_read_universal(void *dst, u32 sz,
> >>> const void *src)
> >>> {
> >>>       if (bpf_core_type_exists(btf_bpf_probe_read_kernel))
> >>>           return bpf_probe_read_kernel(dst, sz, src);
> >>>       else
> >>>           return bpf_probe_read(dst, sz, src);
> >>> }
> >>>
> >>> And then use bpf_probe_read_universal() in BPF_CORE_READ and family.
> >>>
> >>> This approach relies on few things:
> >>>
> >>> 1. each BPF helper has a corresponding btf_<helper-name> type defined for it
> >>> 2. bpf_core_type_exists(some_type) returns 0 or 1, depending if
> >>> specified type is found in kernel BTF (so needs kernel BTF, of
> >>> course). This is the part me and Yonghong are working on at the
> >>> moment.
> >>> 3. verifier's dead code elimination, which will leave only
> >>> bpf_probe_read() or bpf_probe_read_kernel() calls and will remove the
> >>> other one. So on older kernels, there will never be unsupoorted call
> >>> to bpf_probe_read_kernel().
> >>>
> >>> The new type existence relocation requires the latest Clang. So the
> >>> way to deal with older Clangs would be to just fallback to
> >>> bpf_probe_read, if we detect that Clang is too old and can't emit
> >>> necessary relocation.
> >>
> >> Okay, seems reasonable overall. One question though: couldn't libbpf transparently
> >> fix up the selection of bpf_probe_read() vs bpf_probe_read_kernel()? E.g. it would
> >> probe the kernel whether bpf_probe_read_kernel() is available and if it is then it
> >> would rewrite the raw call number from the instruction from bpf_probe_read() into
> >> the one for bpf_probe_read_kernel()? I guess the question then becomes whether the
> >> original use for bpf_probe_read() was related to CO-RE. But I think this could also
> >> be overcome by adding a fake helper signature in libbpf with a unreasonable high
> >> number that is dedicated to probing mem via CO-RE and then libbpf picks the right
> >> underlying helper call number for the insn. That avoids fiddling with macros and
> >> need for new clang version, no (unless I'm missing something)?
> >
> > Libbpf could do it, but I'm a bit worried that unconditionally
> > changing bpf_probe_read() into bpf_probe_read_kernel() is going to be
> > wrong in some cases. If that wasn't the case, why wouldn't we just
> > re-purpose bpf_probe_read() into bpf_probe_read_kernel() in kernel
> > itself, right?
>
> Yes, that is correct, but I mentioned above that this new 'fake' helper call number
> that libbpf would be fixing up would only be used for bpf_probe_read{,str}() inside
> bpf_core_read.h.
>
> Small example, bpf_core_read.h would be changed to (just an extract):
>
> diff --git a/tools/lib/bpf/bpf_core_read.h b/tools/lib/bpf/bpf_core_read.h
> index eae5cccff761..4bddb2ddf3f0 100644
> --- a/tools/lib/bpf/bpf_core_read.h
> +++ b/tools/lib/bpf/bpf_core_read.h
> @@ -115,7 +115,7 @@ enum bpf_field_info_kind {
>    * (local) BTF, used to record relocation.
>    */
>   #define bpf_core_read(dst, sz, src)                                        \
> -       bpf_probe_read(dst, sz,                                             \
> +       bpf_probe_read_selector(dst, sz,                                                    \
>                         (const void *)__builtin_preserve_access_index(src))
>
>   /*
> @@ -124,7 +124,7 @@ enum bpf_field_info_kind {
>    * argument.
>    */
>   #define bpf_core_read_str(dst, sz, src)                                            \
> -       bpf_probe_read_str(dst, sz,                                         \
> +       bpf_probe_read_str_selector(dst, sz,                                        \
>                             (const void *)__builtin_preserve_access_index(src))
>
>   #define ___concat(a, b) a ## b
>
> And bpf_probe_read_{,str_}selector would be defined as e.g. ...
>
> static long (*bpf_probe_read_selector)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) -1;
> static long (*bpf_probe_read_str_selector)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) -2;
>
> ... where libbpf would do the fix up to either 4 or 45 for insn->imm. But it's still
> confined to usage in bpf_core_read.h when the CO-RE macros are used.

Ah, I see. Yeah, I suppose that would work as well. Do you prefer me
to go this way?

>
> > But fear not about old Clang support. The bpf_core_type_exists() will
> > use a new built-in, and I'll be able to detect its presence with
> > __has_builtin(X) check in Clang. So it will be completely transparent
> > to users in the end.
>
> Ok.
>
> >>> If that's not an acceptable plan, then one can "parameterize"
> >>> BPF_CORE_READ macro family by re-defining bpf_core_read() macro. Right
> >>> now it's defined as:
> >>>
> >>> #define bpf_core_read(dst, sz, src) \
> >>>       bpf_probe_read(dst, sz, (const void *)__builtin_preserve_access_index(src))
> >>>
> >>> Re-defining it in terms of bpf_probe_read_kernel is trivial, but I
> >>> can't do it for BPF_CORE_READ, because it will break all the users of
> >>> bpf_core_read.h that run on older kernels.
> >>>
> >>>
> >>>>
> >>>>>>     tools/lib/bpf/bpf_core_read.h | 51 ++++++++++++++++++-----------------
> >>>>>>     tools/lib/bpf/bpf_tracing.h   | 15 +++++++----
> >>>>>>     2 files changed, 37 insertions(+), 29 deletions(-)
> >>>>>>
> >>>>>
> >>>>> [...]
> >>>>>
> >>>>
> >>
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel
  2020-07-29 22:05               ` Andrii Nakryiko
@ 2020-07-29 22:12                 ` Daniel Borkmann
  2020-07-29 22:17                   ` Andrii Nakryiko
  2020-07-31 17:41                   ` Andrii Nakryiko
  0 siblings, 2 replies; 20+ messages in thread
From: Daniel Borkmann @ 2020-07-29 22:12 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Ilya Leoshkevich, Alexei Starovoitov, bpf, Heiko Carstens, Vasily Gorbik

On 7/30/20 12:05 AM, Andrii Nakryiko wrote:
> On Wed, Jul 29, 2020 at 2:54 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 7/29/20 11:36 PM, Andrii Nakryiko wrote:
>>> On Wed, Jul 29, 2020 at 2:01 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>> On 7/29/20 6:06 AM, Andrii Nakryiko wrote:
>>>>> On Tue, Jul 28, 2020 at 2:16 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>>>> On 7/28/20 9:11 PM, Andrii Nakryiko wrote:
>>>>>>> On Tue, Jul 28, 2020 at 5:15 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>>>>>>>>
>>>>>>>> Yet another adaptation to commit 0ebeea8ca8a4 ("bpf: Restrict
>>>>>>>> bpf_probe_read{, str}() only to archs where they work") that makes more
>>>>>>>> samples compile on s390.
>>>>>>>>
>>>>>>>> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
>>>>>>>
>>>>>>> Sorry, we can't do this yet. This will break on older kernels that
>>>>>>> don't yet have bpf_probe_read_kernel() implemented. Met and Yonghong
>>>>>>> are working on extending a set of CO-RE relocations, that would allow
>>>>>>> to do bpf_probe_read_kernel() detection on BPF side, transparently for
>>>>>>> an application, and will pick either bpf_probe_read() or
>>>>>>> bpf_probe_read_kernel(). It should be ready soon (this or next week,
>>>>>>> most probably), though it will have dependency on the latest Clang.
>>>>>>> But for now, please don't change this.
>>>>>>
>>>>>> Could you elaborate what this means wrt dependency on latest clang? Given clang
>>>>>> releases have a rather long cadence, what about existing users with current clang
>>>>>> releases?
>>>>>
>>>>> So the overall idea is to use something like this to do kernel reads:
>>>>>
>>>>> static __always_inline int bpf_probe_read_universal(void *dst, u32 sz,
>>>>> const void *src)
>>>>> {
>>>>>        if (bpf_core_type_exists(btf_bpf_probe_read_kernel))
>>>>>            return bpf_probe_read_kernel(dst, sz, src);
>>>>>        else
>>>>>            return bpf_probe_read(dst, sz, src);
>>>>> }
>>>>>
>>>>> And then use bpf_probe_read_universal() in BPF_CORE_READ and family.
>>>>>
>>>>> This approach relies on few things:
>>>>>
>>>>> 1. each BPF helper has a corresponding btf_<helper-name> type defined for it
>>>>> 2. bpf_core_type_exists(some_type) returns 0 or 1, depending if
>>>>> specified type is found in kernel BTF (so needs kernel BTF, of
>>>>> course). This is the part me and Yonghong are working on at the
>>>>> moment.
>>>>> 3. verifier's dead code elimination, which will leave only
>>>>> bpf_probe_read() or bpf_probe_read_kernel() calls and will remove the
>>>>> other one. So on older kernels, there will never be unsupoorted call
>>>>> to bpf_probe_read_kernel().
>>>>>
>>>>> The new type existence relocation requires the latest Clang. So the
>>>>> way to deal with older Clangs would be to just fallback to
>>>>> bpf_probe_read, if we detect that Clang is too old and can't emit
>>>>> necessary relocation.
>>>>
>>>> Okay, seems reasonable overall. One question though: couldn't libbpf transparently
>>>> fix up the selection of bpf_probe_read() vs bpf_probe_read_kernel()? E.g. it would
>>>> probe the kernel whether bpf_probe_read_kernel() is available and if it is then it
>>>> would rewrite the raw call number from the instruction from bpf_probe_read() into
>>>> the one for bpf_probe_read_kernel()? I guess the question then becomes whether the
>>>> original use for bpf_probe_read() was related to CO-RE. But I think this could also
>>>> be overcome by adding a fake helper signature in libbpf with a unreasonable high
>>>> number that is dedicated to probing mem via CO-RE and then libbpf picks the right
>>>> underlying helper call number for the insn. That avoids fiddling with macros and
>>>> need for new clang version, no (unless I'm missing something)?
>>>
>>> Libbpf could do it, but I'm a bit worried that unconditionally
>>> changing bpf_probe_read() into bpf_probe_read_kernel() is going to be
>>> wrong in some cases. If that wasn't the case, why wouldn't we just
>>> re-purpose bpf_probe_read() into bpf_probe_read_kernel() in kernel
>>> itself, right?
>>
>> Yes, that is correct, but I mentioned above that this new 'fake' helper call number
>> that libbpf would be fixing up would only be used for bpf_probe_read{,str}() inside
>> bpf_core_read.h.
>>
>> Small example, bpf_core_read.h would be changed to (just an extract):
>>
>> diff --git a/tools/lib/bpf/bpf_core_read.h b/tools/lib/bpf/bpf_core_read.h
>> index eae5cccff761..4bddb2ddf3f0 100644
>> --- a/tools/lib/bpf/bpf_core_read.h
>> +++ b/tools/lib/bpf/bpf_core_read.h
>> @@ -115,7 +115,7 @@ enum bpf_field_info_kind {
>>     * (local) BTF, used to record relocation.
>>     */
>>    #define bpf_core_read(dst, sz, src)                                        \
>> -       bpf_probe_read(dst, sz,                                             \
>> +       bpf_probe_read_selector(dst, sz,                                                    \
>>                          (const void *)__builtin_preserve_access_index(src))
>>
>>    /*
>> @@ -124,7 +124,7 @@ enum bpf_field_info_kind {
>>     * argument.
>>     */
>>    #define bpf_core_read_str(dst, sz, src)                                            \
>> -       bpf_probe_read_str(dst, sz,                                         \
>> +       bpf_probe_read_str_selector(dst, sz,                                        \
>>                              (const void *)__builtin_preserve_access_index(src))
>>
>>    #define ___concat(a, b) a ## b
>>
>> And bpf_probe_read_{,str_}selector would be defined as e.g. ...
>>
>> static long (*bpf_probe_read_selector)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) -1;
>> static long (*bpf_probe_read_str_selector)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) -2;
>>
>> ... where libbpf would do the fix up to either 4 or 45 for insn->imm. But it's still
>> confined to usage in bpf_core_read.h when the CO-RE macros are used.
> 
> Ah, I see. Yeah, I suppose that would work as well. Do you prefer me
> to go this way?

I would suggest we should try this path given this can be used with any clang version
that has the BPF backend, not just latest upstream git.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel
  2020-07-29 22:12                 ` Daniel Borkmann
@ 2020-07-29 22:17                   ` Andrii Nakryiko
  2020-07-31 17:41                   ` Andrii Nakryiko
  1 sibling, 0 replies; 20+ messages in thread
From: Andrii Nakryiko @ 2020-07-29 22:17 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Ilya Leoshkevich, Alexei Starovoitov, bpf, Heiko Carstens, Vasily Gorbik

On Wed, Jul 29, 2020 at 3:12 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 7/30/20 12:05 AM, Andrii Nakryiko wrote:
> > On Wed, Jul 29, 2020 at 2:54 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >> On 7/29/20 11:36 PM, Andrii Nakryiko wrote:
> >>> On Wed, Jul 29, 2020 at 2:01 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>> On 7/29/20 6:06 AM, Andrii Nakryiko wrote:
> >>>>> On Tue, Jul 28, 2020 at 2:16 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>>>> On 7/28/20 9:11 PM, Andrii Nakryiko wrote:
> >>>>>>> On Tue, Jul 28, 2020 at 5:15 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
> >>>>>>>>
> >>>>>>>> Yet another adaptation to commit 0ebeea8ca8a4 ("bpf: Restrict
> >>>>>>>> bpf_probe_read{, str}() only to archs where they work") that makes more
> >>>>>>>> samples compile on s390.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> >>>>>>>
> >>>>>>> Sorry, we can't do this yet. This will break on older kernels that
> >>>>>>> don't yet have bpf_probe_read_kernel() implemented. Met and Yonghong
> >>>>>>> are working on extending a set of CO-RE relocations, that would allow
> >>>>>>> to do bpf_probe_read_kernel() detection on BPF side, transparently for
> >>>>>>> an application, and will pick either bpf_probe_read() or
> >>>>>>> bpf_probe_read_kernel(). It should be ready soon (this or next week,
> >>>>>>> most probably), though it will have dependency on the latest Clang.
> >>>>>>> But for now, please don't change this.
> >>>>>>
> >>>>>> Could you elaborate what this means wrt dependency on latest clang? Given clang
> >>>>>> releases have a rather long cadence, what about existing users with current clang
> >>>>>> releases?
> >>>>>
> >>>>> So the overall idea is to use something like this to do kernel reads:
> >>>>>
> >>>>> static __always_inline int bpf_probe_read_universal(void *dst, u32 sz,
> >>>>> const void *src)
> >>>>> {
> >>>>>        if (bpf_core_type_exists(btf_bpf_probe_read_kernel))
> >>>>>            return bpf_probe_read_kernel(dst, sz, src);
> >>>>>        else
> >>>>>            return bpf_probe_read(dst, sz, src);
> >>>>> }
> >>>>>
> >>>>> And then use bpf_probe_read_universal() in BPF_CORE_READ and family.
> >>>>>
> >>>>> This approach relies on few things:
> >>>>>
> >>>>> 1. each BPF helper has a corresponding btf_<helper-name> type defined for it
> >>>>> 2. bpf_core_type_exists(some_type) returns 0 or 1, depending if
> >>>>> specified type is found in kernel BTF (so needs kernel BTF, of
> >>>>> course). This is the part me and Yonghong are working on at the
> >>>>> moment.
> >>>>> 3. verifier's dead code elimination, which will leave only
> >>>>> bpf_probe_read() or bpf_probe_read_kernel() calls and will remove the
> >>>>> other one. So on older kernels, there will never be unsupoorted call
> >>>>> to bpf_probe_read_kernel().
> >>>>>
> >>>>> The new type existence relocation requires the latest Clang. So the
> >>>>> way to deal with older Clangs would be to just fallback to
> >>>>> bpf_probe_read, if we detect that Clang is too old and can't emit
> >>>>> necessary relocation.
> >>>>
> >>>> Okay, seems reasonable overall. One question though: couldn't libbpf transparently
> >>>> fix up the selection of bpf_probe_read() vs bpf_probe_read_kernel()? E.g. it would
> >>>> probe the kernel whether bpf_probe_read_kernel() is available and if it is then it
> >>>> would rewrite the raw call number from the instruction from bpf_probe_read() into
> >>>> the one for bpf_probe_read_kernel()? I guess the question then becomes whether the
> >>>> original use for bpf_probe_read() was related to CO-RE. But I think this could also
> >>>> be overcome by adding a fake helper signature in libbpf with a unreasonable high
> >>>> number that is dedicated to probing mem via CO-RE and then libbpf picks the right
> >>>> underlying helper call number for the insn. That avoids fiddling with macros and
> >>>> need for new clang version, no (unless I'm missing something)?
> >>>
> >>> Libbpf could do it, but I'm a bit worried that unconditionally
> >>> changing bpf_probe_read() into bpf_probe_read_kernel() is going to be
> >>> wrong in some cases. If that wasn't the case, why wouldn't we just
> >>> re-purpose bpf_probe_read() into bpf_probe_read_kernel() in kernel
> >>> itself, right?
> >>
> >> Yes, that is correct, but I mentioned above that this new 'fake' helper call number
> >> that libbpf would be fixing up would only be used for bpf_probe_read{,str}() inside
> >> bpf_core_read.h.
> >>
> >> Small example, bpf_core_read.h would be changed to (just an extract):
> >>
> >> diff --git a/tools/lib/bpf/bpf_core_read.h b/tools/lib/bpf/bpf_core_read.h
> >> index eae5cccff761..4bddb2ddf3f0 100644
> >> --- a/tools/lib/bpf/bpf_core_read.h
> >> +++ b/tools/lib/bpf/bpf_core_read.h
> >> @@ -115,7 +115,7 @@ enum bpf_field_info_kind {
> >>     * (local) BTF, used to record relocation.
> >>     */
> >>    #define bpf_core_read(dst, sz, src)                                        \
> >> -       bpf_probe_read(dst, sz,                                             \
> >> +       bpf_probe_read_selector(dst, sz,                                                    \
> >>                          (const void *)__builtin_preserve_access_index(src))
> >>
> >>    /*
> >> @@ -124,7 +124,7 @@ enum bpf_field_info_kind {
> >>     * argument.
> >>     */
> >>    #define bpf_core_read_str(dst, sz, src)                                            \
> >> -       bpf_probe_read_str(dst, sz,                                         \
> >> +       bpf_probe_read_str_selector(dst, sz,                                        \
> >>                              (const void *)__builtin_preserve_access_index(src))
> >>
> >>    #define ___concat(a, b) a ## b
> >>
> >> And bpf_probe_read_{,str_}selector would be defined as e.g. ...
> >>
> >> static long (*bpf_probe_read_selector)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) -1;
> >> static long (*bpf_probe_read_str_selector)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) -2;
> >>
> >> ... where libbpf would do the fix up to either 4 or 45 for insn->imm. But it's still
> >> confined to usage in bpf_core_read.h when the CO-RE macros are used.
> >
> > Ah, I see. Yeah, I suppose that would work as well. Do you prefer me
> > to go this way?
>
> I would suggest we should try this path given this can be used with any clang version
> that has the BPF backend, not just latest upstream git.

Sure, sounds good.

>
> Thanks,
> Daniel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel
  2020-07-29 22:12                 ` Daniel Borkmann
  2020-07-29 22:17                   ` Andrii Nakryiko
@ 2020-07-31 17:41                   ` Andrii Nakryiko
  2020-07-31 20:34                     ` Daniel Borkmann
  1 sibling, 1 reply; 20+ messages in thread
From: Andrii Nakryiko @ 2020-07-31 17:41 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Ilya Leoshkevich, Alexei Starovoitov, bpf, Heiko Carstens, Vasily Gorbik

On Wed, Jul 29, 2020 at 3:12 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 7/30/20 12:05 AM, Andrii Nakryiko wrote:
> > On Wed, Jul 29, 2020 at 2:54 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >> On 7/29/20 11:36 PM, Andrii Nakryiko wrote:
> >>> On Wed, Jul 29, 2020 at 2:01 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>> On 7/29/20 6:06 AM, Andrii Nakryiko wrote:
> >>>>> On Tue, Jul 28, 2020 at 2:16 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>>>> On 7/28/20 9:11 PM, Andrii Nakryiko wrote:
> >>>>>>> On Tue, Jul 28, 2020 at 5:15 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
> >>>>>>>>
> >>>>>>>> Yet another adaptation to commit 0ebeea8ca8a4 ("bpf: Restrict
> >>>>>>>> bpf_probe_read{, str}() only to archs where they work") that makes more
> >>>>>>>> samples compile on s390.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> >>>>>>>
> >>>>>>> Sorry, we can't do this yet. This will break on older kernels that
> >>>>>>> don't yet have bpf_probe_read_kernel() implemented. Met and Yonghong
> >>>>>>> are working on extending a set of CO-RE relocations, that would allow
> >>>>>>> to do bpf_probe_read_kernel() detection on BPF side, transparently for
> >>>>>>> an application, and will pick either bpf_probe_read() or
> >>>>>>> bpf_probe_read_kernel(). It should be ready soon (this or next week,
> >>>>>>> most probably), though it will have dependency on the latest Clang.
> >>>>>>> But for now, please don't change this.
> >>>>>>
> >>>>>> Could you elaborate what this means wrt dependency on latest clang? Given clang
> >>>>>> releases have a rather long cadence, what about existing users with current clang
> >>>>>> releases?
> >>>>>
> >>>>> So the overall idea is to use something like this to do kernel reads:
> >>>>>
> >>>>> static __always_inline int bpf_probe_read_universal(void *dst, u32 sz,
> >>>>> const void *src)
> >>>>> {
> >>>>>        if (bpf_core_type_exists(btf_bpf_probe_read_kernel))
> >>>>>            return bpf_probe_read_kernel(dst, sz, src);
> >>>>>        else
> >>>>>            return bpf_probe_read(dst, sz, src);
> >>>>> }
> >>>>>
> >>>>> And then use bpf_probe_read_universal() in BPF_CORE_READ and family.
> >>>>>
> >>>>> This approach relies on few things:
> >>>>>
> >>>>> 1. each BPF helper has a corresponding btf_<helper-name> type defined for it
> >>>>> 2. bpf_core_type_exists(some_type) returns 0 or 1, depending if
> >>>>> specified type is found in kernel BTF (so needs kernel BTF, of
> >>>>> course). This is the part me and Yonghong are working on at the
> >>>>> moment.
> >>>>> 3. verifier's dead code elimination, which will leave only
> >>>>> bpf_probe_read() or bpf_probe_read_kernel() calls and will remove the
> >>>>> other one. So on older kernels, there will never be unsupoorted call
> >>>>> to bpf_probe_read_kernel().
> >>>>>
> >>>>> The new type existence relocation requires the latest Clang. So the
> >>>>> way to deal with older Clangs would be to just fallback to
> >>>>> bpf_probe_read, if we detect that Clang is too old and can't emit
> >>>>> necessary relocation.
> >>>>
> >>>> Okay, seems reasonable overall. One question though: couldn't libbpf transparently
> >>>> fix up the selection of bpf_probe_read() vs bpf_probe_read_kernel()? E.g. it would
> >>>> probe the kernel whether bpf_probe_read_kernel() is available and if it is then it
> >>>> would rewrite the raw call number from the instruction from bpf_probe_read() into
> >>>> the one for bpf_probe_read_kernel()? I guess the question then becomes whether the
> >>>> original use for bpf_probe_read() was related to CO-RE. But I think this could also
> >>>> be overcome by adding a fake helper signature in libbpf with a unreasonable high
> >>>> number that is dedicated to probing mem via CO-RE and then libbpf picks the right
> >>>> underlying helper call number for the insn. That avoids fiddling with macros and
> >>>> need for new clang version, no (unless I'm missing something)?
> >>>
> >>> Libbpf could do it, but I'm a bit worried that unconditionally
> >>> changing bpf_probe_read() into bpf_probe_read_kernel() is going to be
> >>> wrong in some cases. If that wasn't the case, why wouldn't we just
> >>> re-purpose bpf_probe_read() into bpf_probe_read_kernel() in kernel
> >>> itself, right?
> >>
> >> Yes, that is correct, but I mentioned above that this new 'fake' helper call number
> >> that libbpf would be fixing up would only be used for bpf_probe_read{,str}() inside
> >> bpf_core_read.h.
> >>
> >> Small example, bpf_core_read.h would be changed to (just an extract):
> >>
> >> diff --git a/tools/lib/bpf/bpf_core_read.h b/tools/lib/bpf/bpf_core_read.h
> >> index eae5cccff761..4bddb2ddf3f0 100644
> >> --- a/tools/lib/bpf/bpf_core_read.h
> >> +++ b/tools/lib/bpf/bpf_core_read.h
> >> @@ -115,7 +115,7 @@ enum bpf_field_info_kind {
> >>     * (local) BTF, used to record relocation.
> >>     */
> >>    #define bpf_core_read(dst, sz, src)                                        \
> >> -       bpf_probe_read(dst, sz,                                             \
> >> +       bpf_probe_read_selector(dst, sz,                                                    \
> >>                          (const void *)__builtin_preserve_access_index(src))
> >>
> >>    /*
> >> @@ -124,7 +124,7 @@ enum bpf_field_info_kind {
> >>     * argument.
> >>     */
> >>    #define bpf_core_read_str(dst, sz, src)                                            \
> >> -       bpf_probe_read_str(dst, sz,                                         \
> >> +       bpf_probe_read_str_selector(dst, sz,                                        \
> >>                              (const void *)__builtin_preserve_access_index(src))
> >>
> >>    #define ___concat(a, b) a ## b
> >>
> >> And bpf_probe_read_{,str_}selector would be defined as e.g. ...
> >>
> >> static long (*bpf_probe_read_selector)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) -1;
> >> static long (*bpf_probe_read_str_selector)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) -2;
> >>
> >> ... where libbpf would do the fix up to either 4 or 45 for insn->imm. But it's still
> >> confined to usage in bpf_core_read.h when the CO-RE macros are used.
> >
> > Ah, I see. Yeah, I suppose that would work as well. Do you prefer me
> > to go this way?
>
> I would suggest we should try this path given this can be used with any clang version
> that has the BPF backend, not just latest upstream git.

I have an even better solution, I think. Convert everything to
bpf_probe_read_kernel() or bpf_probe_read_user() unconditionally, but
let libbpf switch those two to bpf_probe_read() if _kernel()/_user()
variants are not yet in the kernel. That should handle both CO-RE
helpers and just pretty much any use case that was converted.


>
> Thanks,
> Daniel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel
  2020-07-31 17:41                   ` Andrii Nakryiko
@ 2020-07-31 20:34                     ` Daniel Borkmann
  0 siblings, 0 replies; 20+ messages in thread
From: Daniel Borkmann @ 2020-07-31 20:34 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Ilya Leoshkevich, Alexei Starovoitov, bpf, Heiko Carstens, Vasily Gorbik

On 7/31/20 7:41 PM, Andrii Nakryiko wrote:
> On Wed, Jul 29, 2020 at 3:12 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 7/30/20 12:05 AM, Andrii Nakryiko wrote:
>>> On Wed, Jul 29, 2020 at 2:54 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>> On 7/29/20 11:36 PM, Andrii Nakryiko wrote:
>>>>> On Wed, Jul 29, 2020 at 2:01 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>>>> On 7/29/20 6:06 AM, Andrii Nakryiko wrote:
>>>>>>> On Tue, Jul 28, 2020 at 2:16 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>>>>>> On 7/28/20 9:11 PM, Andrii Nakryiko wrote:
>>>>>>>>> On Tue, Jul 28, 2020 at 5:15 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Yet another adaptation to commit 0ebeea8ca8a4 ("bpf: Restrict
>>>>>>>>>> bpf_probe_read{, str}() only to archs where they work") that makes more
>>>>>>>>>> samples compile on s390.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
>>>>>>>>>
>>>>>>>>> Sorry, we can't do this yet. This will break on older kernels that
>>>>>>>>> don't yet have bpf_probe_read_kernel() implemented. Met and Yonghong
>>>>>>>>> are working on extending a set of CO-RE relocations, that would allow
>>>>>>>>> to do bpf_probe_read_kernel() detection on BPF side, transparently for
>>>>>>>>> an application, and will pick either bpf_probe_read() or
>>>>>>>>> bpf_probe_read_kernel(). It should be ready soon (this or next week,
>>>>>>>>> most probably), though it will have dependency on the latest Clang.
>>>>>>>>> But for now, please don't change this.
>>>>>>>>
>>>>>>>> Could you elaborate what this means wrt dependency on latest clang? Given clang
>>>>>>>> releases have a rather long cadence, what about existing users with current clang
>>>>>>>> releases?
>>>>>>>
>>>>>>> So the overall idea is to use something like this to do kernel reads:
>>>>>>>
>>>>>>> static __always_inline int bpf_probe_read_universal(void *dst, u32 sz,
>>>>>>> const void *src)
>>>>>>> {
>>>>>>>         if (bpf_core_type_exists(btf_bpf_probe_read_kernel))
>>>>>>>             return bpf_probe_read_kernel(dst, sz, src);
>>>>>>>         else
>>>>>>>             return bpf_probe_read(dst, sz, src);
>>>>>>> }
>>>>>>>
>>>>>>> And then use bpf_probe_read_universal() in BPF_CORE_READ and family.
>>>>>>>
>>>>>>> This approach relies on few things:
>>>>>>>
>>>>>>> 1. each BPF helper has a corresponding btf_<helper-name> type defined for it
>>>>>>> 2. bpf_core_type_exists(some_type) returns 0 or 1, depending if
>>>>>>> specified type is found in kernel BTF (so needs kernel BTF, of
>>>>>>> course). This is the part me and Yonghong are working on at the
>>>>>>> moment.
>>>>>>> 3. verifier's dead code elimination, which will leave only
>>>>>>> bpf_probe_read() or bpf_probe_read_kernel() calls and will remove the
>>>>>>> other one. So on older kernels, there will never be unsupoorted call
>>>>>>> to bpf_probe_read_kernel().
>>>>>>>
>>>>>>> The new type existence relocation requires the latest Clang. So the
>>>>>>> way to deal with older Clangs would be to just fallback to
>>>>>>> bpf_probe_read, if we detect that Clang is too old and can't emit
>>>>>>> necessary relocation.
>>>>>>
>>>>>> Okay, seems reasonable overall. One question though: couldn't libbpf transparently
>>>>>> fix up the selection of bpf_probe_read() vs bpf_probe_read_kernel()? E.g. it would
>>>>>> probe the kernel whether bpf_probe_read_kernel() is available and if it is then it
>>>>>> would rewrite the raw call number from the instruction from bpf_probe_read() into
>>>>>> the one for bpf_probe_read_kernel()? I guess the question then becomes whether the
>>>>>> original use for bpf_probe_read() was related to CO-RE. But I think this could also
>>>>>> be overcome by adding a fake helper signature in libbpf with a unreasonable high
>>>>>> number that is dedicated to probing mem via CO-RE and then libbpf picks the right
>>>>>> underlying helper call number for the insn. That avoids fiddling with macros and
>>>>>> need for new clang version, no (unless I'm missing something)?
>>>>>
>>>>> Libbpf could do it, but I'm a bit worried that unconditionally
>>>>> changing bpf_probe_read() into bpf_probe_read_kernel() is going to be
>>>>> wrong in some cases. If that wasn't the case, why wouldn't we just
>>>>> re-purpose bpf_probe_read() into bpf_probe_read_kernel() in kernel
>>>>> itself, right?
>>>>
>>>> Yes, that is correct, but I mentioned above that this new 'fake' helper call number
>>>> that libbpf would be fixing up would only be used for bpf_probe_read{,str}() inside
>>>> bpf_core_read.h.
>>>>
>>>> Small example, bpf_core_read.h would be changed to (just an extract):
>>>>
>>>> diff --git a/tools/lib/bpf/bpf_core_read.h b/tools/lib/bpf/bpf_core_read.h
>>>> index eae5cccff761..4bddb2ddf3f0 100644
>>>> --- a/tools/lib/bpf/bpf_core_read.h
>>>> +++ b/tools/lib/bpf/bpf_core_read.h
>>>> @@ -115,7 +115,7 @@ enum bpf_field_info_kind {
>>>>      * (local) BTF, used to record relocation.
>>>>      */
>>>>     #define bpf_core_read(dst, sz, src)                                        \
>>>> -       bpf_probe_read(dst, sz,                                             \
>>>> +       bpf_probe_read_selector(dst, sz,                                                    \
>>>>                           (const void *)__builtin_preserve_access_index(src))
>>>>
>>>>     /*
>>>> @@ -124,7 +124,7 @@ enum bpf_field_info_kind {
>>>>      * argument.
>>>>      */
>>>>     #define bpf_core_read_str(dst, sz, src)                                            \
>>>> -       bpf_probe_read_str(dst, sz,                                         \
>>>> +       bpf_probe_read_str_selector(dst, sz,                                        \
>>>>                               (const void *)__builtin_preserve_access_index(src))
>>>>
>>>>     #define ___concat(a, b) a ## b
>>>>
>>>> And bpf_probe_read_{,str_}selector would be defined as e.g. ...
>>>>
>>>> static long (*bpf_probe_read_selector)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) -1;
>>>> static long (*bpf_probe_read_str_selector)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) -2;
>>>>
>>>> ... where libbpf would do the fix up to either 4 or 45 for insn->imm. But it's still
>>>> confined to usage in bpf_core_read.h when the CO-RE macros are used.
>>>
>>> Ah, I see. Yeah, I suppose that would work as well. Do you prefer me
>>> to go this way?
>>
>> I would suggest we should try this path given this can be used with any clang version
>> that has the BPF backend, not just latest upstream git.
> 
> I have an even better solution, I think. Convert everything to
> bpf_probe_read_kernel() or bpf_probe_read_user() unconditionally, but
> let libbpf switch those two to bpf_probe_read() if _kernel()/_user()
> variants are not yet in the kernel. That should handle both CO-RE
> helpers and just pretty much any use case that was converted.

Yes, agree, that is an even cleaner solution and avoids to 'pollute' the
helper ID space with such remapping. The user intent with bpf_probe_read_kernel()
or bpf_probe_read_user() is rather clear so bpf_probe_read() can be a fallback
for this direction. Lets go with that.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, back to index

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-28 12:00 [PATCH bpf-next 0/3] samples/bpf: A couple s390 fixes Ilya Leoshkevich
2020-07-28 12:00 ` [PATCH bpf-next 1/3] samples/bpf: Fix building out of srctree Ilya Leoshkevich
2020-07-28 20:48   ` Song Liu
2020-07-28 21:12     ` Ilya Leoshkevich
2020-07-28 21:37       ` Ilya Leoshkevich
2020-07-28 12:00 ` [PATCH bpf-next 2/3] samples/bpf: Fix test_map_in_map on s390 Ilya Leoshkevich
2020-07-28 20:59   ` Song Liu
2020-07-28 22:05     ` Ilya Leoshkevich
2020-07-28 12:00 ` [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel Ilya Leoshkevich
2020-07-28 19:11   ` Andrii Nakryiko
2020-07-28 21:16     ` Daniel Borkmann
2020-07-29  4:06       ` Andrii Nakryiko
2020-07-29 21:01         ` Daniel Borkmann
2020-07-29 21:36           ` Andrii Nakryiko
2020-07-29 21:54             ` Daniel Borkmann
2020-07-29 22:05               ` Andrii Nakryiko
2020-07-29 22:12                 ` Daniel Borkmann
2020-07-29 22:17                   ` Andrii Nakryiko
2020-07-31 17:41                   ` Andrii Nakryiko
2020-07-31 20:34                     ` Daniel Borkmann

BPF Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/bpf/0 bpf/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 bpf bpf/ https://lore.kernel.org/bpf \
		bpf@vger.kernel.org
	public-inbox-index bpf

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.bpf


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git