* [PATCH bpf-next] libbpf: generate more efficient BPF_CORE_READ code
@ 2019-10-11  2:38 Andrii Nakryiko
  2019-10-11 21:26 ` Daniel Borkmann
  0 siblings, 1 reply; 2+ messages in thread
From: Andrii Nakryiko @ 2019-10-11  2:38 UTC
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

The existing BPF_CORE_READ() macro generates slightly suboptimal code.
If there are intermediate pointers to be read, the initial source
pointer is assigned to a temporary variable, and that temporary
variable is then used uniformly as the "source" pointer for all
intermediate pointer reads. Schematically (ignoring all the type casts),
BPF_CORE_READ(src, a, b, c) is expanded into:
({
	const void *__t = src;
	bpf_probe_read(&__t, sizeof(__t), &__t->a);
	bpf_probe_read(&__t, sizeof(__t), &__t->b);

	typeof(src->a->b->c) __r;
	bpf_probe_read(&__r, sizeof(__r), &__t->c);
})

This initial `__t = src` makes the calls more uniform, but sometimes
causes slightly less optimal register usage when compiled with Clang.
This can cascade into, e.g., more register spills.

This patch fixes the issue by generating a more optimal sequence:
({
	const void *__t;
	bpf_probe_read(&__t, sizeof(__t), &src->a); /* <-- src here */
	bpf_probe_read(&__t, sizeof(__t), &__t->b);

	typeof(src->a->b->c) __r;
	bpf_probe_read(&__r, sizeof(__r), &__t->c);
})
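
For readers who want to try the two shapes outside the kernel, here is a
minimal user-space sketch. It is not libbpf code: mock_probe_read() is a
hypothetical memcpy-based stand-in for bpf_probe_read(), the structs
top, level_a and level_b are made up for illustration, and the locals t
and r play the roles of __t and __r. It only shows that both sequences
read the same value; it says nothing about the register allocation that
Clang produces for real BPF programs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for bpf_probe_read(): a memcpy that always
 * succeeds. The real helper runs in the kernel and can fail. */
static int mock_probe_read(void *dst, size_t size, const void *src)
{
	memcpy(dst, src, size);
	return 0;
}

/* Made-up pointer chain src->a->b->c, for illustration only. */
struct level_b { int c; };
struct level_a { struct level_b *b; };
struct top { struct level_a *a; };

/* Old shape: copy the source pointer into t, then read through t. */
static int read_c_old(const struct top *src)
{
	const void *t = src;
	int r;

	mock_probe_read(&t, sizeof(t), &((const struct top *)t)->a);
	mock_probe_read(&t, sizeof(t), &((const struct level_a *)t)->b);
	mock_probe_read(&r, sizeof(r), &((const struct level_b *)t)->c);
	return r;
}

/* New shape: the first read takes its address from src directly, so t
 * never needs to be initialized with the source pointer. */
static int read_c_new(const struct top *src)
{
	const void *t;
	int r;

	mock_probe_read(&t, sizeof(t), &src->a);
	mock_probe_read(&t, sizeof(t), &((const struct level_a *)t)->b);
	mock_probe_read(&r, sizeof(r), &((const struct level_b *)t)->c);
	return r;
}

/* Both shapes must return the same value for any valid chain. */
static void check_equivalent(const struct top *src)
{
	assert(read_c_old(src) == read_c_new(src));
}
```

Calling check_equivalent() on a populated chain (a struct top whose a
and b pointers are both valid) exercises both paths; the only difference
between them is where the first read takes its source address from.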

Fixes: 7db3822ab991 ("libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/lib/bpf/bpf_core_read.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/bpf_core_read.h b/tools/lib/bpf/bpf_core_read.h
index ae877e3ffb51..4daf04c25493 100644
--- a/tools/lib/bpf/bpf_core_read.h
+++ b/tools/lib/bpf/bpf_core_read.h
@@ -88,11 +88,11 @@
 	read_fn((void *)(dst), sizeof(*(dst)), &((src_type)(src))->accessor)
 
 /* "recursively" read a sequence of inner pointers using local __t var */
+#define ___rd_first(src, a) ___read(bpf_core_read, &__t, ___type(src), src, a);
 #define ___rd_last(...)							    \
 	___read(bpf_core_read, &__t,					    \
 		___type(___nolast(__VA_ARGS__)), __t, ___last(__VA_ARGS__));
-#define ___rd_p0(src) const void *__t = src;
-#define ___rd_p1(...) ___rd_p0(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
+#define ___rd_p1(...) const void *__t; ___rd_first(__VA_ARGS__)
 #define ___rd_p2(...) ___rd_p1(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
 #define ___rd_p3(...) ___rd_p2(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
 #define ___rd_p4(...) ___rd_p3(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
-- 
2.17.1



* Re: [PATCH bpf-next] libbpf: generate more efficient BPF_CORE_READ code
  2019-10-11  2:38 [PATCH bpf-next] libbpf: generate more efficient BPF_CORE_READ code Andrii Nakryiko
@ 2019-10-11 21:26 ` Daniel Borkmann
  0 siblings, 0 replies; 2+ messages in thread
From: Daniel Borkmann @ 2019-10-11 21:26 UTC
  To: Andrii Nakryiko; +Cc: bpf, netdev, ast, andrii.nakryiko, kernel-team

On Thu, Oct 10, 2019 at 07:38:47PM -0700, Andrii Nakryiko wrote:
> The existing BPF_CORE_READ() macro generates slightly suboptimal code.
> If there are intermediate pointers to be read, the initial source
> pointer is assigned to a temporary variable, and that temporary
> variable is then used uniformly as the "source" pointer for all
> intermediate pointer reads. Schematically (ignoring all the type casts),
> BPF_CORE_READ(src, a, b, c) is expanded into:
> ({
> 	const void *__t = src;
> 	bpf_probe_read(&__t, sizeof(__t), &__t->a);
> 	bpf_probe_read(&__t, sizeof(__t), &__t->b);
> 
> 	typeof(src->a->b->c) __r;
> 	bpf_probe_read(&__r, sizeof(__r), &__t->c);
> })
> 
> This initial `__t = src` makes the calls more uniform, but sometimes
> causes slightly less optimal register usage when compiled with Clang.
> This can cascade into, e.g., more register spills.
> 
> This patch fixes the issue by generating a more optimal sequence:
> ({
> 	const void *__t;
> 	bpf_probe_read(&__t, sizeof(__t), &src->a); /* <-- src here */
> 	bpf_probe_read(&__t, sizeof(__t), &__t->b);
> 
> 	typeof(src->a->b->c) __r;
> 	bpf_probe_read(&__r, sizeof(__r), &__t->c);
> })
> 
> Fixes: 7db3822ab991 ("libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers")
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

Applied, thanks!
