All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next 0/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs
@ 2022-09-06 13:36 Quentin Monnet
  2022-09-06 13:36 ` [PATCH bpf-next 1/7] bpftool: Define _GNU_SOURCE only once Quentin Monnet
                   ` (7 more replies)
  0 siblings, 8 replies; 24+ messages in thread
From: Quentin Monnet @ 2022-09-06 13:36 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman, Quentin Monnet

To disassemble instructions for JIT-ed programs, bpftool has relied on the
libbfd library. This has been problematic in the past: libbfd's interface
is not meant to be stable and has changed several times, hence the
detection of the two related features from the Makefile
(disassembler-four-args and disassembler-init-styled). When it comes to
shipping bpftool, this has also caused issues with several distribution
maintainers unwilling to support the feature (for example, Debian's page
for binutils-dev, libbfd's package, says: "Note that building Debian
packages which depend on the shared libbfd is Not Allowed.").

This patchset adds support for LLVM as the primary library for
disassembling instructions for JIT-ed programs.

We keep libbfd as a fallback. One reason for this is that currently it
works well, we have all we need in terms of features detection in the
Makefile, so it provides a fallback for disassembling JIT-ed programs if
libbfd is installed but LLVM is not. The other reason is that libbfd
supports nfp instruction for Netronome's SmartNICs and can be used to
disassemble offloaded programs, something that LLVM cannot do (Niklas
confirmed that the feature is still in use). However, if libbfd's interface
breaks again in the future, we might reconsider keeping support for it.

Quentin Monnet (7):
  bpftool: Define _GNU_SOURCE only once
  bpftool: Remove asserts from JIT disassembler
  bpftool: Split FEATURE_TESTS/FEATURE_DISPLAY definitions in Makefile
  bpftool: Group libbfd defs in Makefile, only pass them if we use
    libbfd
  bpftool: Refactor disassembler for JIT-ed programs
  bpftool: Add LLVM as default library for disassembling JIT-ed programs
  bpftool: Add llvm feature to "bpftool version"

 .../bpftool/Documentation/common_options.rst  |   8 +-
 tools/bpf/bpftool/Makefile                    |  65 +++--
 tools/bpf/bpftool/common.c                    |   2 +
 tools/bpf/bpftool/iter.c                      |   2 +
 tools/bpf/bpftool/jit_disasm.c                | 244 ++++++++++++++----
 tools/bpf/bpftool/main.c                      |  10 +
 tools/bpf/bpftool/main.h                      |  29 ++-
 tools/bpf/bpftool/map.c                       |   1 -
 tools/bpf/bpftool/net.c                       |   2 +
 tools/bpf/bpftool/perf.c                      |   2 +
 tools/bpf/bpftool/prog.c                      |  17 +-
 tools/bpf/bpftool/xlated_dumper.c             |   2 +
 12 files changed, 291 insertions(+), 93 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 1/7] bpftool: Define _GNU_SOURCE only once
  2022-09-06 13:36 [PATCH bpf-next 0/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs Quentin Monnet
@ 2022-09-06 13:36 ` Quentin Monnet
  2022-09-06 23:14   ` Song Liu
  2022-09-06 13:36 ` [PATCH bpf-next 2/7] bpftool: Remove asserts from JIT disassembler Quentin Monnet
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Quentin Monnet @ 2022-09-06 13:36 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman, Quentin Monnet

_GNU_SOURCE is defined in several source files for bpftool, but only one
of them takes the precaution of checking whether the value is already
defined. Add #ifndef for other occurrences too.

This is in preparation for the support of disassembling JIT-ed programs
with LLVM, with $(llvm-config --cflags) passing -D_GNU_SOURCE as a
compilation argument.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
---
 tools/bpf/bpftool/common.c        | 2 ++
 tools/bpf/bpftool/iter.c          | 2 ++
 tools/bpf/bpftool/jit_disasm.c    | 2 ++
 tools/bpf/bpftool/net.c           | 2 ++
 tools/bpf/bpftool/perf.c          | 2 ++
 tools/bpf/bpftool/prog.c          | 2 ++
 tools/bpf/bpftool/xlated_dumper.c | 2 ++
 7 files changed, 14 insertions(+)

diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
index 8727765add88..4c2e909a2d67 100644
--- a/tools/bpf/bpftool/common.c
+++ b/tools/bpf/bpftool/common.c
@@ -1,7 +1,9 @@
 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
 /* Copyright (C) 2017-2018 Netronome Systems, Inc. */
 
+#ifndef _GNU_SOURCE
 #define _GNU_SOURCE
+#endif
 #include <ctype.h>
 #include <errno.h>
 #include <fcntl.h>
diff --git a/tools/bpf/bpftool/iter.c b/tools/bpf/bpftool/iter.c
index f88fdc820d23..a3e6b167153d 100644
--- a/tools/bpf/bpftool/iter.c
+++ b/tools/bpf/bpftool/iter.c
@@ -1,7 +1,9 @@
 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
 // Copyright (C) 2020 Facebook
 
+#ifndef _GNU_SOURCE
 #define _GNU_SOURCE
+#endif
 #include <unistd.h>
 #include <linux/err.h>
 #include <bpf/libbpf.h>
diff --git a/tools/bpf/bpftool/jit_disasm.c b/tools/bpf/bpftool/jit_disasm.c
index aaf99a0168c9..71cb258ab0ee 100644
--- a/tools/bpf/bpftool/jit_disasm.c
+++ b/tools/bpf/bpftool/jit_disasm.c
@@ -11,7 +11,9 @@
  * Licensed under the GNU General Public License, version 2.0 (GPLv2)
  */
 
+#ifndef _GNU_SOURCE
 #define _GNU_SOURCE
+#endif
 #include <stdio.h>
 #include <stdarg.h>
 #include <stdint.h>
diff --git a/tools/bpf/bpftool/net.c b/tools/bpf/bpftool/net.c
index 526a332c48e6..c40e44c938ae 100644
--- a/tools/bpf/bpftool/net.c
+++ b/tools/bpf/bpftool/net.c
@@ -1,7 +1,9 @@
 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
 // Copyright (C) 2018 Facebook
 
+#ifndef _GNU_SOURCE
 #define _GNU_SOURCE
+#endif
 #include <errno.h>
 #include <fcntl.h>
 #include <stdlib.h>
diff --git a/tools/bpf/bpftool/perf.c b/tools/bpf/bpftool/perf.c
index 226ec2c39052..91743445e4c7 100644
--- a/tools/bpf/bpftool/perf.c
+++ b/tools/bpf/bpftool/perf.c
@@ -2,7 +2,9 @@
 // Copyright (C) 2018 Facebook
 // Author: Yonghong Song <yhs@fb.com>
 
+#ifndef _GNU_SOURCE
 #define _GNU_SOURCE
+#endif
 #include <ctype.h>
 #include <errno.h>
 #include <fcntl.h>
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index c81362a001ba..a31ae9f0c307 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -1,7 +1,9 @@
 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
 /* Copyright (C) 2017-2018 Netronome Systems, Inc. */
 
+#ifndef _GNU_SOURCE
 #define _GNU_SOURCE
+#endif
 #include <errno.h>
 #include <fcntl.h>
 #include <signal.h>
diff --git a/tools/bpf/bpftool/xlated_dumper.c b/tools/bpf/bpftool/xlated_dumper.c
index 2d9cd6a7b3c8..6fe3134ae45d 100644
--- a/tools/bpf/bpftool/xlated_dumper.c
+++ b/tools/bpf/bpftool/xlated_dumper.c
@@ -1,7 +1,9 @@
 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
 /* Copyright (C) 2018 Netronome Systems, Inc. */
 
+#ifndef _GNU_SOURCE
 #define _GNU_SOURCE
+#endif
 #include <stdarg.h>
 #include <stdio.h>
 #include <stdlib.h>
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 2/7] bpftool: Remove asserts from JIT disassembler
  2022-09-06 13:36 [PATCH bpf-next 0/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs Quentin Monnet
  2022-09-06 13:36 ` [PATCH bpf-next 1/7] bpftool: Define _GNU_SOURCE only once Quentin Monnet
@ 2022-09-06 13:36 ` Quentin Monnet
  2022-09-06 23:16   ` Song Liu
  2022-09-06 13:36 ` [PATCH bpf-next 3/7] bpftool: Split FEATURE_TESTS/FEATURE_DISPLAY definitions in Makefile Quentin Monnet
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Quentin Monnet @ 2022-09-06 13:36 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman, Quentin Monnet

The JIT disassembler in bpftool is the only components (with the JSON
writer) using asserts to check the return values of functions. But it
does not do so in a consistent way, and diasm_print_insn() returns no
value, although sometimes the operation failed.

Remove the asserts, and instead check the return values, print messages
on errors, and propagate the error to the caller from prog.c.

Remove the inclusion of assert.h from jit_disasm.c, and also from map.c
where it is unused.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
---
 tools/bpf/bpftool/jit_disasm.c | 51 +++++++++++++++++++++++-----------
 tools/bpf/bpftool/main.h       | 25 +++++++++--------
 tools/bpf/bpftool/map.c        |  1 -
 tools/bpf/bpftool/prog.c       | 15 ++++++----
 4 files changed, 57 insertions(+), 35 deletions(-)

diff --git a/tools/bpf/bpftool/jit_disasm.c b/tools/bpf/bpftool/jit_disasm.c
index 71cb258ab0ee..723a9b799a0c 100644
--- a/tools/bpf/bpftool/jit_disasm.c
+++ b/tools/bpf/bpftool/jit_disasm.c
@@ -18,7 +18,6 @@
 #include <stdarg.h>
 #include <stdint.h>
 #include <stdlib.h>
-#include <assert.h>
 #include <unistd.h>
 #include <string.h>
 #include <bfd.h>
@@ -31,14 +30,18 @@
 #include "json_writer.h"
 #include "main.h"
 
-static void get_exec_path(char *tpath, size_t size)
+static int get_exec_path(char *tpath, size_t size)
 {
 	const char *path = "/proc/self/exe";
 	ssize_t len;
 
 	len = readlink(path, tpath, size - 1);
-	assert(len > 0);
+	if (len <= 0)
+		return -1;
+
 	tpath[len] = 0;
+
+	return 0;
 }
 
 static int oper_count;
@@ -99,30 +102,39 @@ static int fprintf_json_styled(void *out,
 	return r;
 }
 
-void disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
-		       const char *arch, const char *disassembler_options,
-		       const struct btf *btf,
-		       const struct bpf_prog_linfo *prog_linfo,
-		       __u64 func_ksym, unsigned int func_idx,
-		       bool linum)
+int disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
+		      const char *arch, const char *disassembler_options,
+		      const struct btf *btf,
+		      const struct bpf_prog_linfo *prog_linfo,
+		      __u64 func_ksym, unsigned int func_idx,
+		      bool linum)
 {
 	const struct bpf_line_info *linfo = NULL;
 	disassembler_ftype disassemble;
+	int count, i, pc = 0, err = -1;
 	struct disassemble_info info;
 	unsigned int nr_skip = 0;
-	int count, i, pc = 0;
 	char tpath[PATH_MAX];
 	bfd *bfdf;
 
 	if (!len)
-		return;
+		return -1;
 
 	memset(tpath, 0, sizeof(tpath));
-	get_exec_path(tpath, sizeof(tpath));
+	if (get_exec_path(tpath, sizeof(tpath))) {
+		p_err("failed to create disasembler (get_exec_path)");
+		return -1;
+	}
 
 	bfdf = bfd_openr(tpath, NULL);
-	assert(bfdf);
-	assert(bfd_check_format(bfdf, bfd_object));
+	if (!bfdf) {
+		p_err("failed to create disassembler (bfd_openr)");
+		return -1;
+	}
+	if (!bfd_check_format(bfdf, bfd_object)) {
+		p_err("failed to create disassembler (bfd_check_format)");
+		goto exit_close;
+	}
 
 	if (json_output)
 		init_disassemble_info_compat(&info, stdout,
@@ -141,7 +153,7 @@ void disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
 			bfdf->arch_info = inf;
 		} else {
 			p_err("No libbfd support for %s", arch);
-			return;
+			goto exit_close;
 		}
 	}
 
@@ -162,7 +174,10 @@ void disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
 #else
 	disassemble = disassembler(bfdf);
 #endif
-	assert(disassemble);
+	if (!disassemble) {
+		p_err("failed to create disassembler");
+		goto exit_close;
+	}
 
 	if (json_output)
 		jsonw_start_array(json_wtr);
@@ -226,7 +241,11 @@ void disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
 	if (json_output)
 		jsonw_end_array(json_wtr);
 
+	err = 0;
+
+exit_close:
 	bfd_close(bfdf);
+	return err;
 }
 
 int disasm_init(void)
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index 5e5060c2ac04..c9e171082cf6 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -173,22 +173,23 @@ int map_parse_fd_and_info(int *argc, char ***argv, void *info, __u32 *info_len);
 
 struct bpf_prog_linfo;
 #ifdef HAVE_LIBBFD_SUPPORT
-void disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
-		       const char *arch, const char *disassembler_options,
-		       const struct btf *btf,
-		       const struct bpf_prog_linfo *prog_linfo,
-		       __u64 func_ksym, unsigned int func_idx,
-		       bool linum);
+int disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
+		      const char *arch, const char *disassembler_options,
+		      const struct btf *btf,
+		      const struct bpf_prog_linfo *prog_linfo,
+		      __u64 func_ksym, unsigned int func_idx,
+		      bool linum);
 int disasm_init(void);
 #else
 static inline
-void disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
-		       const char *arch, const char *disassembler_options,
-		       const struct btf *btf,
-		       const struct bpf_prog_linfo *prog_linfo,
-		       __u64 func_ksym, unsigned int func_idx,
-		       bool linum)
+int disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
+		      const char *arch, const char *disassembler_options,
+		      const struct btf *btf,
+		      const struct bpf_prog_linfo *prog_linfo,
+		      __u64 func_ksym, unsigned int func_idx,
+		      bool linum)
 {
+	return 0;
 }
 static inline int disasm_init(void)
 {
diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
index 38b6bc9c26c3..a0e573589811 100644
--- a/tools/bpf/bpftool/map.c
+++ b/tools/bpf/bpftool/map.c
@@ -1,7 +1,6 @@
 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
 /* Copyright (C) 2017-2018 Netronome Systems, Inc. */
 
-#include <assert.h>
 #include <errno.h>
 #include <fcntl.h>
 #include <linux/err.h>
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index a31ae9f0c307..345dca656a34 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -822,10 +822,11 @@ prog_dump(struct bpf_prog_info *info, enum dump_mode mode,
 					printf("%s:\n", sym_name);
 				}
 
-				disasm_print_insn(img, lens[i], opcodes,
-						  name, disasm_opt, btf,
-						  prog_linfo, ksyms[i], i,
-						  linum);
+				if (disasm_print_insn(img, lens[i], opcodes,
+						      name, disasm_opt, btf,
+						      prog_linfo, ksyms[i], i,
+						      linum))
+					goto exit_free;
 
 				img += lens[i];
 
@@ -838,8 +839,10 @@ prog_dump(struct bpf_prog_info *info, enum dump_mode mode,
 			if (json_output)
 				jsonw_end_array(json_wtr);
 		} else {
-			disasm_print_insn(buf, member_len, opcodes, name,
-					  disasm_opt, btf, NULL, 0, 0, false);
+			if (disasm_print_insn(buf, member_len, opcodes, name,
+					      disasm_opt, btf, NULL, 0, 0,
+					      false))
+				goto exit_free;
 		}
 	} else if (visual) {
 		if (json_output)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 3/7] bpftool: Split FEATURE_TESTS/FEATURE_DISPLAY definitions in Makefile
  2022-09-06 13:36 [PATCH bpf-next 0/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs Quentin Monnet
  2022-09-06 13:36 ` [PATCH bpf-next 1/7] bpftool: Define _GNU_SOURCE only once Quentin Monnet
  2022-09-06 13:36 ` [PATCH bpf-next 2/7] bpftool: Remove asserts from JIT disassembler Quentin Monnet
@ 2022-09-06 13:36 ` Quentin Monnet
  2022-09-06 23:18   ` Song Liu
  2022-09-06 13:36 ` [PATCH bpf-next 4/7] bpftool: Group libbfd defs in Makefile, only pass them if we use libbfd Quentin Monnet
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Quentin Monnet @ 2022-09-06 13:36 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman, Quentin Monnet,
	Andres Freund

Make FEATURE_TESTS and FEATURE_DISPLAY easier to read and less likely to
be subject to conflicts on updates by having one feature per line.

Suggested-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
---
 tools/bpf/bpftool/Makefile | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index 04d733e98bff..8b5bfd8256c5 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -93,9 +93,16 @@ INSTALL ?= install
 RM ?= rm -f
 
 FEATURE_USER = .bpftool
-FEATURE_TESTS = libbfd disassembler-four-args disassembler-init-styled libcap \
-	clang-bpf-co-re
-FEATURE_DISPLAY = libbfd libcap clang-bpf-co-re
+
+FEATURE_TESTS := clang-bpf-co-re
+FEATURE_TESTS += libcap
+FEATURE_TESTS += libbfd
+FEATURE_TESTS += disassembler-four-args
+FEATURE_TESTS += disassembler-init-styled
+
+FEATURE_DISPLAY := clang-bpf-co-re
+FEATURE_DISPLAY += libcap
+FEATURE_DISPLAY += libbfd
 
 check_feat := 1
 NON_CHECK_FEAT_TARGETS := clean uninstall doc doc-clean doc-install doc-uninstall
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 4/7] bpftool: Group libbfd defs in Makefile, only pass them if we use libbfd
  2022-09-06 13:36 [PATCH bpf-next 0/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs Quentin Monnet
                   ` (2 preceding siblings ...)
  2022-09-06 13:36 ` [PATCH bpf-next 3/7] bpftool: Split FEATURE_TESTS/FEATURE_DISPLAY definitions in Makefile Quentin Monnet
@ 2022-09-06 13:36 ` Quentin Monnet
  2022-09-06 23:31   ` Song Liu
  2022-09-06 13:36 ` [PATCH bpf-next 5/7] bpftool: Refactor disassembler for JIT-ed programs Quentin Monnet
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Quentin Monnet @ 2022-09-06 13:36 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman, Quentin Monnet

Bpftool uses libbfd for disassembling JIT-ed programs. But the feature
is optional, and the tool can be compiled without libbfd support. The
Makefile sets the relevant variables accordingly. It also sets variables
related to libbfd's interface, given that it has changed over time.

Group all those libbfd-related definitions so that it's easier to
understand what we are testing for, and only use variables related to
libbfd's interface if we need libbfd in the first place.

In addition to make the Makefile clearer, grouping the definitions
related to disassembling JIT-ed programs will help support alternatives
to libbfd.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
---
 tools/bpf/bpftool/Makefile | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index 8b5bfd8256c5..8060c7013d4f 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -120,13 +120,6 @@ include $(FEATURES_DUMP)
 endif
 endif
 
-ifeq ($(feature-disassembler-four-args), 1)
-CFLAGS += -DDISASM_FOUR_ARGS_SIGNATURE
-endif
-ifeq ($(feature-disassembler-init-styled), 1)
-    CFLAGS += -DDISASM_INIT_STYLED
-endif
-
 LIBS = $(LIBBPF) -lelf -lz
 LIBS_BOOTSTRAP = $(LIBBPF_BOOTSTRAP) -lelf -lz
 ifeq ($(feature-libcap), 1)
@@ -138,9 +131,7 @@ include $(wildcard $(OUTPUT)*.d)
 
 all: $(OUTPUT)bpftool
 
-BFD_SRCS = jit_disasm.c
-
-SRCS = $(filter-out $(BFD_SRCS),$(wildcard *.c))
+SRCS := $(wildcard *.c)
 
 ifeq ($(feature-libbfd),1)
   LIBS += -lbfd -ldl -lopcodes
@@ -150,9 +141,21 @@ else ifeq ($(feature-libbfd-liberty-z),1)
   LIBS += -lbfd -ldl -lopcodes -liberty -lz
 endif
 
+# If one of the above feature combinations is set, we support libbfd
 ifneq ($(filter -lbfd,$(LIBS)),)
-CFLAGS += -DHAVE_LIBBFD_SUPPORT
-SRCS += $(BFD_SRCS)
+  CFLAGS += -DHAVE_LIBBFD_SUPPORT
+
+  # Libbfd interface changed over time, figure out what we need
+  ifeq ($(feature-disassembler-four-args), 1)
+    CFLAGS += -DDISASM_FOUR_ARGS_SIGNATURE
+  endif
+  ifeq ($(feature-disassembler-init-styled), 1)
+    CFLAGS += -DDISASM_INIT_STYLED
+  endif
+endif
+ifeq ($(filter -DHAVE_LIBBFD_SUPPORT,$(CFLAGS)),)
+  # No support for JIT disassembly
+  SRCS := $(filter-out jit_disasm.c,$(SRCS))
 endif
 
 HOST_CFLAGS = $(subst -I$(LIBBPF_INCLUDE),-I$(LIBBPF_BOOTSTRAP_INCLUDE),\
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 5/7] bpftool: Refactor disassembler for JIT-ed programs
  2022-09-06 13:36 [PATCH bpf-next 0/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs Quentin Monnet
                   ` (3 preceding siblings ...)
  2022-09-06 13:36 ` [PATCH bpf-next 4/7] bpftool: Group libbfd defs in Makefile, only pass them if we use libbfd Quentin Monnet
@ 2022-09-06 13:36 ` Quentin Monnet
  2022-09-06 23:55   ` Song Liu
  2022-09-06 13:36 ` [PATCH bpf-next 6/7] bpftool: Add LLVM as default library for disassembling " Quentin Monnet
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Quentin Monnet @ 2022-09-06 13:36 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman, Quentin Monnet

Refactor disasm_print_insn() to extract the code specific to libbfd and
move it to dedicated functions. There is no functional change. This is
in preparation for supporting an alternative library for disassembling
the instructions.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
---
 tools/bpf/bpftool/jit_disasm.c | 133 ++++++++++++++++++++++-----------
 1 file changed, 88 insertions(+), 45 deletions(-)

diff --git a/tools/bpf/bpftool/jit_disasm.c b/tools/bpf/bpftool/jit_disasm.c
index 723a9b799a0c..e31ad3950fd6 100644
--- a/tools/bpf/bpftool/jit_disasm.c
+++ b/tools/bpf/bpftool/jit_disasm.c
@@ -30,6 +30,14 @@
 #include "json_writer.h"
 #include "main.h"
 
+static int oper_count;
+
+typedef struct {
+	struct disassemble_info *info;
+	disassembler_ftype disassemble;
+	bfd *bfdf;
+} disasm_ctx_t;
+
 static int get_exec_path(char *tpath, size_t size)
 {
 	const char *path = "/proc/self/exe";
@@ -44,7 +52,6 @@ static int get_exec_path(char *tpath, size_t size)
 	return 0;
 }
 
-static int oper_count;
 static int printf_json(void *out, const char *fmt, va_list ap)
 {
 	char *s;
@@ -102,46 +109,44 @@ static int fprintf_json_styled(void *out,
 	return r;
 }
 
-int disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
-		      const char *arch, const char *disassembler_options,
-		      const struct btf *btf,
-		      const struct bpf_prog_linfo *prog_linfo,
-		      __u64 func_ksym, unsigned int func_idx,
-		      bool linum)
+static int init_context(disasm_ctx_t *ctx, const char *arch,
+			const char *disassembler_options,
+			unsigned char *image, ssize_t len)
 {
-	const struct bpf_line_info *linfo = NULL;
-	disassembler_ftype disassemble;
-	int count, i, pc = 0, err = -1;
-	struct disassemble_info info;
-	unsigned int nr_skip = 0;
+	struct disassemble_info *info;
 	char tpath[PATH_MAX];
 	bfd *bfdf;
 
-	if (!len)
-		return -1;
-
 	memset(tpath, 0, sizeof(tpath));
 	if (get_exec_path(tpath, sizeof(tpath))) {
 		p_err("failed to create disasembler (get_exec_path)");
 		return -1;
 	}
 
-	bfdf = bfd_openr(tpath, NULL);
-	if (!bfdf) {
+	ctx->bfdf = bfd_openr(tpath, NULL);
+	if (!ctx->bfdf) {
 		p_err("failed to create disassembler (bfd_openr)");
 		return -1;
 	}
-	if (!bfd_check_format(bfdf, bfd_object)) {
+	if (!bfd_check_format(ctx->bfdf, bfd_object)) {
 		p_err("failed to create disassembler (bfd_check_format)");
-		goto exit_close;
+		goto err_close;
 	}
+	bfdf = ctx->bfdf;
+
+	ctx->info = malloc(sizeof(struct disassemble_info));
+	if (!ctx->info) {
+		p_err("mem alloc failed");
+		goto err_close;
+	}
+	info = ctx->info;
 
 	if (json_output)
-		init_disassemble_info_compat(&info, stdout,
+		init_disassemble_info_compat(info, stdout,
 					     (fprintf_ftype) fprintf_json,
 					     fprintf_json_styled);
 	else
-		init_disassemble_info_compat(&info, stdout,
+		init_disassemble_info_compat(info, stdout,
 					     (fprintf_ftype) fprintf,
 					     fprintf_styled);
 
@@ -153,31 +158,76 @@ int disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
 			bfdf->arch_info = inf;
 		} else {
 			p_err("No libbfd support for %s", arch);
-			goto exit_close;
+			goto err_free;
 		}
 	}
 
-	info.arch = bfd_get_arch(bfdf);
-	info.mach = bfd_get_mach(bfdf);
+	info->arch = bfd_get_arch(bfdf);
+	info->mach = bfd_get_mach(bfdf);
 	if (disassembler_options)
-		info.disassembler_options = disassembler_options;
-	info.buffer = image;
-	info.buffer_length = len;
+		info->disassembler_options = disassembler_options;
+	info->buffer = image;
+	info->buffer_length = len;
 
-	disassemble_init_for_target(&info);
+	disassemble_init_for_target(info);
 
 #ifdef DISASM_FOUR_ARGS_SIGNATURE
-	disassemble = disassembler(info.arch,
-				   bfd_big_endian(bfdf),
-				   info.mach,
-				   bfdf);
+	ctx->disassemble = disassembler(info->arch,
+					bfd_big_endian(bfdf),
+					info->mach,
+					bfdf);
 #else
-	disassemble = disassembler(bfdf);
+	ctx->disassemble = disassembler(bfdf);
 #endif
-	if (!disassemble) {
+	if (!ctx->disassemble) {
 		p_err("failed to create disassembler");
-		goto exit_close;
+		goto err_free;
 	}
+	return 0;
+
+err_free:
+	free(info);
+err_close:
+	bfd_close(ctx->bfdf);
+	return -1;
+}
+
+static void destroy_context(disasm_ctx_t *ctx)
+{
+	free(ctx->info);
+	bfd_close(ctx->bfdf);
+}
+
+static int
+disassemble_insn(disasm_ctx_t *ctx, __maybe_unused unsigned char *image,
+		 __maybe_unused ssize_t len, int pc)
+{
+	return ctx->disassemble(pc, ctx->info);
+}
+
+int disasm_init(void)
+{
+	bfd_init();
+	return 0;
+}
+
+int disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
+		      const char *arch, const char *disassembler_options,
+		      const struct btf *btf,
+		      const struct bpf_prog_linfo *prog_linfo,
+		      __u64 func_ksym, unsigned int func_idx,
+		      bool linum)
+{
+	const struct bpf_line_info *linfo = NULL;
+	unsigned int nr_skip = 0;
+	int count, i, pc = 0;
+	disasm_ctx_t ctx;
+
+	if (!len)
+		return -1;
+
+	if (init_context(&ctx, arch, disassembler_options, image, len))
+		return -1;
 
 	if (json_output)
 		jsonw_start_array(json_wtr);
@@ -205,7 +255,8 @@ int disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
 			printf("%4x:\t", pc);
 		}
 
-		count = disassemble(pc, &info);
+		count = disassemble_insn(&ctx, image, len, pc);
+
 		if (json_output) {
 			/* Operand array, was started in fprintf_json. Before
 			 * that, make sure we have a _null_ value if no operand
@@ -241,15 +292,7 @@ int disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
 	if (json_output)
 		jsonw_end_array(json_wtr);
 
-	err = 0;
+	destroy_context(&ctx);
 
-exit_close:
-	bfd_close(bfdf);
-	return err;
-}
-
-int disasm_init(void)
-{
-	bfd_init();
 	return 0;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 6/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs
  2022-09-06 13:36 [PATCH bpf-next 0/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs Quentin Monnet
                   ` (4 preceding siblings ...)
  2022-09-06 13:36 ` [PATCH bpf-next 5/7] bpftool: Refactor disassembler for JIT-ed programs Quentin Monnet
@ 2022-09-06 13:36 ` Quentin Monnet
  2022-09-06 23:46   ` Alexei Starovoitov
  2022-09-07  0:06   ` Song Liu
  2022-09-06 13:36 ` [PATCH bpf-next 7/7] bpftool: Add llvm feature to "bpftool version" Quentin Monnet
  2022-09-10 19:41 ` [PATCH bpf-next 0/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs Niklas Söderlund
  7 siblings, 2 replies; 24+ messages in thread
From: Quentin Monnet @ 2022-09-06 13:36 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman, Quentin Monnet

To disassemble instructions for JIT-ed programs, bpftool has relied on
the libbfd library. This has been problematic in the past: libbfd's
interface is not meant to be stable and has changed several times. For
building bpftool, we have to detect how the libbfd version on the system
behaves, which is why we have to handle features disassembler-four-args
and disassembler-init-styled in the Makefile. When it comes to shipping
bpftool, this has also caused issues with several distribution
maintainers unwilling to support the feature (see for example Debian's
page for binutils-dev, which ships libbfd: "Note that building Debian
packages which depend on the shared libbfd is Not Allowed." [0]).

For these reasons, we add support for LLVM as an alternative to libbfd
for disassembling instructions of JIT-ed programs. Thanks to the
preparation work in the previous commits, it's easy to add the library
by passing the relevant compilation options in the Makefile, and by
adding the functions for setting up the LLVM disassembler in file
jit_disasm.c.

Naturally, the display of disassembled instructions comes with a few
minor differences. Here is a sample output with libbfd (already
supported before this patch):

    # bpftool prog dump jited id 56
    bpf_prog_6deef7357e7b4530:
       0:   nopl   0x0(%rax,%rax,1)
       5:   xchg   %ax,%ax
       7:   push   %rbp
       8:   mov    %rsp,%rbp
       b:   push   %rbx
       c:   push   %r13
       e:   push   %r14
      10:   mov    %rdi,%rbx
      13:   movzwq 0xb0(%rbx),%r13
      1b:   xor    %r14d,%r14d
      1e:   or     $0x2,%r14d
      22:   mov    $0x1,%eax
      27:   cmp    $0x2,%r14
      2b:   jne    0x000000000000002f
      2d:   xor    %eax,%eax
      2f:   pop    %r14
      31:   pop    %r13
      33:   pop    %rbx
      34:   leave
      35:   ret
      36:   int3

LLVM supports several variants that we could set when initialising the
disassembler, for example with:

    LLVMSetDisasmOptions(*ctx,
                         LLVMDisassembler_Option_AsmPrinterVariant);

but the default printer is kept for now. Here is the output with LLVM:

    # bpftool prog dump jited id 56
    bpf_prog_6deef7357e7b4530:
       0:   nopl    (%rax,%rax)
       5:   nop
       7:   pushq   %rbp
       8:   movq    %rsp, %rbp
       b:   pushq   %rbx
       c:   pushq   %r13
       e:   pushq   %r14
      10:   movq    %rdi, %rbx
      13:   movzwq  176(%rbx), %r13
      1b:   xorl    %r14d, %r14d
      1e:   orl     $2, %r14d
      22:   movl    $1, %eax
      27:   cmpq    $2, %r14
      2b:   jne     2
      2d:   xorl    %eax, %eax
      2f:   popq    %r14
      31:   popq    %r13
      33:   popq    %rbx
      34:   leave
      35:   retq
      36:   int3

The LLVM disassembler comes as the default choice, with libbfd as a
fall-back.

Of course, we could replace libbfd entirely and avoid supporting two
different libraries. One reason for keeping libbfd is that, right now,
it works well, we have all we need in terms of features detection in the
Makefile, so it provides a fallback for disassembling JIT-ed programs if
libbfd is installed but LLVM is not. The other motivation is that libbfd
supports nfp instruction for Netronome's SmartNICs and can be used to
disassemble offloaded programs, something that LLVM cannot do. If
libbfd's interface breaks again in the future, we might reconsider
keeping support for it.

[0] https://packages.debian.org/buster/binutils-dev

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
---
 tools/bpf/bpftool/Makefile     | 45 ++++++++++------
 tools/bpf/bpftool/jit_disasm.c | 94 ++++++++++++++++++++++++++++++++--
 tools/bpf/bpftool/main.h       |  4 +-
 3 files changed, 121 insertions(+), 22 deletions(-)

diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index 8060c7013d4f..d96584e2dae7 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -95,12 +95,14 @@ RM ?= rm -f
 FEATURE_USER = .bpftool
 
 FEATURE_TESTS := clang-bpf-co-re
+FEATURE_TESTS += llvm
 FEATURE_TESTS += libcap
 FEATURE_TESTS += libbfd
 FEATURE_TESTS += disassembler-four-args
 FEATURE_TESTS += disassembler-init-styled
 
 FEATURE_DISPLAY := clang-bpf-co-re
+FEATURE_DISPLAY += llvm
 FEATURE_DISPLAY += libcap
 FEATURE_DISPLAY += libbfd
 
@@ -133,27 +135,36 @@ all: $(OUTPUT)bpftool
 
 SRCS := $(wildcard *.c)
 
-ifeq ($(feature-libbfd),1)
-  LIBS += -lbfd -ldl -lopcodes
-else ifeq ($(feature-libbfd-liberty),1)
-  LIBS += -lbfd -ldl -lopcodes -liberty
-else ifeq ($(feature-libbfd-liberty-z),1)
-  LIBS += -lbfd -ldl -lopcodes -liberty -lz
-endif
+ifeq ($(feature-llvm),1)
+  # If LLVM is available, use it for JIT disassembly
+  CFLAGS  += -DHAVE_LLVM_SUPPORT
+  CFLAGS  += $(shell llvm-config --cflags --libs)
+  LIBS    += $(shell llvm-config --libs)
+  LDFLAGS += $(shell llvm-config --ldflags)
+else
+  # Fall back on libbfd
+  ifeq ($(feature-libbfd),1)
+    LIBS += -lbfd -ldl -lopcodes
+  else ifeq ($(feature-libbfd-liberty),1)
+    LIBS += -lbfd -ldl -lopcodes -liberty
+  else ifeq ($(feature-libbfd-liberty-z),1)
+    LIBS += -lbfd -ldl -lopcodes -liberty -lz
+  endif
 
-# If one of the above feature combinations is set, we support libbfd
-ifneq ($(filter -lbfd,$(LIBS)),)
-  CFLAGS += -DHAVE_LIBBFD_SUPPORT
+  # If one of the above feature combinations is set, we support libbfd
+  ifneq ($(filter -lbfd,$(LIBS)),)
+    CFLAGS += -DHAVE_LIBBFD_SUPPORT
 
-  # Libbfd interface changed over time, figure out what we need
-  ifeq ($(feature-disassembler-four-args), 1)
-    CFLAGS += -DDISASM_FOUR_ARGS_SIGNATURE
-  endif
-  ifeq ($(feature-disassembler-init-styled), 1)
-    CFLAGS += -DDISASM_INIT_STYLED
+    # Libbfd interface changed over time, figure out what we need
+    ifeq ($(feature-disassembler-four-args), 1)
+      CFLAGS += -DDISASM_FOUR_ARGS_SIGNATURE
+    endif
+    ifeq ($(feature-disassembler-init-styled), 1)
+      CFLAGS += -DDISASM_INIT_STYLED
+    endif
   endif
 endif
-ifeq ($(filter -DHAVE_LIBBFD_SUPPORT,$(CFLAGS)),)
+ifeq ($(filter -DHAVE_LLVM_SUPPORT -DHAVE_LIBBFD_SUPPORT,$(CFLAGS)),)
   # No support for JIT disassembly
   SRCS := $(filter-out jit_disasm.c,$(SRCS))
 endif
diff --git a/tools/bpf/bpftool/jit_disasm.c b/tools/bpf/bpftool/jit_disasm.c
index e31ad3950fd6..8128b0cf2ad3 100644
--- a/tools/bpf/bpftool/jit_disasm.c
+++ b/tools/bpf/bpftool/jit_disasm.c
@@ -20,18 +20,105 @@
 #include <stdlib.h>
 #include <unistd.h>
 #include <string.h>
-#include <bfd.h>
-#include <dis-asm.h>
 #include <sys/stat.h>
 #include <limits.h>
 #include <bpf/libbpf.h>
+
+#ifdef HAVE_LLVM_SUPPORT
+#include <llvm-c/Core.h>
+#include <llvm-c/Disassembler.h>
+#include <llvm-c/Target.h>
+#include <llvm-c/TargetMachine.h>
+#endif
+
+#ifdef HAVE_LIBBFD_SUPPORT
+#include <bfd.h>
+#include <dis-asm.h>
 #include <tools/dis-asm-compat.h>
+#endif
 
 #include "json_writer.h"
 #include "main.h"
 
 static int oper_count;
 
+#ifdef HAVE_LLVM_SUPPORT
+#define DISASM_SPACER
+
+typedef LLVMDisasmContextRef disasm_ctx_t;
+
+static int printf_json(char *s)
+{
+	s = strtok(s, " \t");
+	jsonw_string_field(json_wtr, "operation", s);
+
+	jsonw_name(json_wtr, "operands");
+	jsonw_start_array(json_wtr);
+	oper_count = 1;
+
+	while ((s = strtok(NULL, " \t,()")) != 0) {
+		jsonw_string(json_wtr, s);
+		oper_count++;
+	}
+	return 0;
+}
+
+static int
+init_context(disasm_ctx_t *ctx, const char *arch,
+	     __maybe_unused const char *disassembler_options,
+	     __maybe_unused unsigned char *image, __maybe_unused ssize_t len)
+{
+	char *triple;
+
+	if (arch) {
+		p_err("Architecture %s not supported", arch);
+		return -1;
+	}
+
+	triple = LLVMGetDefaultTargetTriple();
+	*ctx = LLVMCreateDisasm(triple, NULL, 0, NULL, NULL);
+	LLVMDisposeMessage(triple);
+
+	if (!*ctx) {
+		p_err("Failed to create disassembler");
+		return -1;
+	}
+
+	return 0;
+}
+
+static void destroy_context(disasm_ctx_t *ctx)
+{
+	LLVMDisposeMessage(*ctx);
+}
+
+static int
+disassemble_insn(disasm_ctx_t *ctx, unsigned char *image, ssize_t len, int pc)
+{
+	char buf[256];
+	int count;
+
+	count = LLVMDisasmInstruction(*ctx, image + pc, len - pc, pc,
+				      buf, sizeof(buf));
+	if (json_output)
+		printf_json(buf);
+	else
+		printf("%s", buf);
+
+	return count;
+}
+
+int disasm_init(void)
+{
+	LLVMInitializeNativeTarget();
+	LLVMInitializeNativeDisassembler();
+	return 0;
+}
+#endif /* HAVE_LLVM_SUPPORT */
+
+#ifdef HAVE_LIBBFD_SUPPORT
+#define DISASM_SPACER "\t"
+
 typedef struct {
 	struct disassemble_info *info;
 	disassembler_ftype disassemble;
@@ -210,6 +297,7 @@ int disasm_init(void)
 	bfd_init();
 	return 0;
 }
+#endif /* HAVE_LIBBPFD_SUPPORT */
 
 int disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
 		      const char *arch, const char *disassembler_options,
@@ -252,7 +340,7 @@ int disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
 			if (linfo)
 				btf_dump_linfo_plain(btf, linfo, "; ",
 						     linum);
-			printf("%4x:\t", pc);
+			printf("%4x:" DISASM_SPACER, pc);
 		}
 
 		count = disassemble_insn(&ctx, image, len, pc);
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index c9e171082cf6..9a149c67aa5d 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -172,7 +172,7 @@ int map_parse_fds(int *argc, char ***argv, int **fds);
 int map_parse_fd_and_info(int *argc, char ***argv, void *info, __u32 *info_len);
 
 struct bpf_prog_linfo;
-#ifdef HAVE_LIBBFD_SUPPORT
+#if defined(HAVE_LLVM_SUPPORT) || defined(HAVE_LIBBFD_SUPPORT)
 int disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
 		      const char *arch, const char *disassembler_options,
 		      const struct btf *btf,
@@ -193,7 +193,7 @@ int disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
 }
 static inline int disasm_init(void)
 {
-	p_err("No libbfd support");
+	p_err("No JIT disassembly support");
 	return -1;
 }
 #endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 7/7] bpftool: Add llvm feature to "bpftool version"
  2022-09-06 13:36 [PATCH bpf-next 0/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs Quentin Monnet
                   ` (5 preceding siblings ...)
  2022-09-06 13:36 ` [PATCH bpf-next 6/7] bpftool: Add LLVM as default library for disassembling " Quentin Monnet
@ 2022-09-06 13:36 ` Quentin Monnet
  2022-09-10 19:41 ` [PATCH bpf-next 0/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs Niklas Söderlund
  7 siblings, 0 replies; 24+ messages in thread
From: Quentin Monnet @ 2022-09-06 13:36 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman, Quentin Monnet

Similarly to "libbfd", add a "llvm" feature to the output of command
"bpftool version" to indicate that LLVM is used for disassembling JIT-ed
programs. This feature is mutually exclusive with "libbfd".

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
---
 tools/bpf/bpftool/Documentation/common_options.rst |  8 ++++----
 tools/bpf/bpftool/main.c                           | 10 ++++++++++
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/tools/bpf/bpftool/Documentation/common_options.rst b/tools/bpf/bpftool/Documentation/common_options.rst
index 4107a586b68b..05350a1aadf9 100644
--- a/tools/bpf/bpftool/Documentation/common_options.rst
+++ b/tools/bpf/bpftool/Documentation/common_options.rst
@@ -7,10 +7,10 @@
 	  Print bpftool's version number (similar to **bpftool version**), the
 	  number of the libbpf version in use, and optional features that were
 	  included when bpftool was compiled. Optional features include linking
-	  against libbfd to provide the disassembler for JIT-ted programs
-	  (**bpftool prog dump jited**) and usage of BPF skeletons (some
-	  features like **bpftool prog profile** or showing pids associated to
-	  BPF objects may rely on it).
+	  against LLVM or libbfd to provide the disassembler for JIT-ted
+	  programs (**bpftool prog dump jited**) and usage of BPF skeletons
+	  (some features like **bpftool prog profile** or showing pids
+	  associated to BPF objects may rely on it).
 
 -j, --json
 	  Generate JSON output. For commands that cannot produce JSON, this
diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
index ccd7457f92bf..7e06ca2c5d42 100644
--- a/tools/bpf/bpftool/main.c
+++ b/tools/bpf/bpftool/main.c
@@ -89,6 +89,11 @@ static int do_version(int argc, char **argv)
 #else
 	const bool has_libbfd = false;
 #endif
+#ifdef HAVE_LLVM_SUPPORT
+	const bool has_llvm = true;
+#else
+	const bool has_llvm = false;
+#endif
 #ifdef BPFTOOL_WITHOUT_SKELETONS
 	const bool has_skeletons = false;
 #else
@@ -112,6 +117,7 @@ static int do_version(int argc, char **argv)
 		jsonw_name(json_wtr, "features");
 		jsonw_start_object(json_wtr);	/* features */
 		jsonw_bool_field(json_wtr, "libbfd", has_libbfd);
+		jsonw_bool_field(json_wtr, "llvm", has_llvm);
 		jsonw_bool_field(json_wtr, "libbpf_strict", !legacy_libbpf);
 		jsonw_bool_field(json_wtr, "skeletons", has_skeletons);
 		jsonw_end_object(json_wtr);	/* features */
@@ -132,6 +138,10 @@ static int do_version(int argc, char **argv)
 			printf(" libbfd");
 			nb_features++;
 		}
+		if (has_llvm) {
+			printf(" llvm");
+			nb_features++;
+		}
 		if (!legacy_libbpf) {
 			printf("%s libbpf_strict", nb_features++ ? "," : "");
 			nb_features++;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 1/7] bpftool: Define _GNU_SOURCE only once
  2022-09-06 13:36 ` [PATCH bpf-next 1/7] bpftool: Define _GNU_SOURCE only once Quentin Monnet
@ 2022-09-06 23:14   ` Song Liu
  0 siblings, 0 replies; 24+ messages in thread
From: Song Liu @ 2022-09-06 23:14 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman

On Tue, Sep 6, 2022 at 6:44 AM Quentin Monnet <quentin@isovalent.com> wrote:
>
> _GNU_SOURCE is defined in several source files for bpftool, but only one
> of them takes the precaution of checking whether the value is already
> defined. Add #ifndef for other occurrences too.
>
> This is in preparation for the support of disassembling JIT-ed programs
> with LLVM, with $(llvm-config --cflags) passing -D_GNU_SOURCE as a
> compilation argument.
>
> Signed-off-by: Quentin Monnet <quentin@isovalent.com>

Acked-by: Song Liu <song@kernel.org>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 2/7] bpftool: Remove asserts from JIT disassembler
  2022-09-06 13:36 ` [PATCH bpf-next 2/7] bpftool: Remove asserts from JIT disassembler Quentin Monnet
@ 2022-09-06 23:16   ` Song Liu
  0 siblings, 0 replies; 24+ messages in thread
From: Song Liu @ 2022-09-06 23:16 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman

On Tue, Sep 6, 2022 at 6:44 AM Quentin Monnet <quentin@isovalent.com> wrote:
>
> The JIT disassembler in bpftool is the only components (with the JSON
> writer) using asserts to check the return values of functions. But it
> does not do so in a consistent way, and diasm_print_insn() returns no
> value, although sometimes the operation failed.
>
> Remove the asserts, and instead check the return values, print messages
> on errors, and propagate the error to the caller from prog.c.
>
> Remove the inclusion of assert.h from jit_disasm.c, and also from map.c
> where it is unused.
>
> Signed-off-by: Quentin Monnet <quentin@isovalent.com>

Acked-by: Song Liu <song@kernel.org>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 3/7] bpftool: Split FEATURE_TESTS/FEATURE_DISPLAY definitions in Makefile
  2022-09-06 13:36 ` [PATCH bpf-next 3/7] bpftool: Split FEATURE_TESTS/FEATURE_DISPLAY definitions in Makefile Quentin Monnet
@ 2022-09-06 23:18   ` Song Liu
  0 siblings, 0 replies; 24+ messages in thread
From: Song Liu @ 2022-09-06 23:18 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman, Andres Freund

On Tue, Sep 6, 2022 at 6:44 AM Quentin Monnet <quentin@isovalent.com> wrote:
>
> Make FEATURE_TESTS and FEATURE_DISPLAY easier to read and less likely to
> be subject to conflicts on updates by having one feature per line.
>
> Suggested-by: Andres Freund <andres@anarazel.de>
> Signed-off-by: Quentin Monnet <quentin@isovalent.com>

Acked-by: Song Liu <song@kernel.org>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 4/7] bpftool: Group libbfd defs in Makefile, only pass them if we use libbfd
  2022-09-06 13:36 ` [PATCH bpf-next 4/7] bpftool: Group libbfd defs in Makefile, only pass them if we use libbfd Quentin Monnet
@ 2022-09-06 23:31   ` Song Liu
  2022-09-07 14:19     ` Quentin Monnet
  0 siblings, 1 reply; 24+ messages in thread
From: Song Liu @ 2022-09-06 23:31 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman

On Tue, Sep 6, 2022 at 6:44 AM Quentin Monnet <quentin@isovalent.com> wrote:
>
[...]

>
> +# If one of the above feature combinations is set, we support libbfd
>  ifneq ($(filter -lbfd,$(LIBS)),)
> -CFLAGS += -DHAVE_LIBBFD_SUPPORT
> -SRCS += $(BFD_SRCS)
> +  CFLAGS += -DHAVE_LIBBFD_SUPPORT
> +
> +  # Libbfd interface changed over time, figure out what we need
> +  ifeq ($(feature-disassembler-four-args), 1)
> +    CFLAGS += -DDISASM_FOUR_ARGS_SIGNATURE
> +  endif
> +  ifeq ($(feature-disassembler-init-styled), 1)
> +    CFLAGS += -DDISASM_INIT_STYLED
> +  endif
> +endif


> +ifeq ($(filter -DHAVE_LIBBFD_SUPPORT,$(CFLAGS)),)
> +  # No support for JIT disassembly
> +  SRCS := $(filter-out jit_disasm.c,$(SRCS))
>  endif

This part could just be an else clause for the ifneq above.
Well, I guess the difference is minimal.

Acked-by: Song Liu <song@kernel.org>

>
>  HOST_CFLAGS = $(subst -I$(LIBBPF_INCLUDE),-I$(LIBBPF_BOOTSTRAP_INCLUDE),\
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 6/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs
  2022-09-06 13:36 ` [PATCH bpf-next 6/7] bpftool: Add LLVM as default library for disassembling " Quentin Monnet
@ 2022-09-06 23:46   ` Alexei Starovoitov
  2022-09-07 14:20     ` Quentin Monnet
  2022-09-07  0:06   ` Song Liu
  1 sibling, 1 reply; 24+ messages in thread
From: Alexei Starovoitov @ 2022-09-06 23:46 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman

On Tue, Sep 6, 2022 at 6:36 AM Quentin Monnet <quentin@isovalent.com> wrote:
>
> Naturally, the display of disassembled instructions comes with a few
> minor differences. Here is a sample output with libbfd (already
> supported before this patch):
>
>     # bpftool prog dump jited id 56
>     bpf_prog_6deef7357e7b4530:
>        0:   nopl   0x0(%rax,%rax,1)
>        5:   xchg   %ax,%ax
>        7:   push   %rbp
>        8:   mov    %rsp,%rbp
>        b:   push   %rbx
>        c:   push   %r13
>        e:   push   %r14
>       10:   mov    %rdi,%rbx
>       13:   movzwq 0xb0(%rbx),%r13
>       1b:   xor    %r14d,%r14d
>       1e:   or     $0x2,%r14d
>       22:   mov    $0x1,%eax
>       27:   cmp    $0x2,%r14
>       2b:   jne    0x000000000000002f
>       2d:   xor    %eax,%eax
>       2f:   pop    %r14
>       31:   pop    %r13
>       33:   pop    %rbx
>       34:   leave
>       35:   ret
>       36:   int3
>
> LLVM supports several variants that we could set when initialising the
> disassembler, for example with:
>
>     LLVMSetDisasmOptions(*ctx,
>                          LLVMDisassembler_Option_AsmPrinterVariant);
>
> but the default printer is kept for now. Here is the output with LLVM:
>
>     # bpftool prog dump jited id 56
>     bpf_prog_6deef7357e7b4530:
>        0:   nopl    (%rax,%rax)
>        5:   nop
>        7:   pushq   %rbp
>        8:   movq    %rsp, %rbp
>        b:   pushq   %rbx
>        c:   pushq   %r13
>        e:   pushq   %r14
>       10:   movq    %rdi, %rbx
>       13:   movzwq  176(%rbx), %r13
>       1b:   xorl    %r14d, %r14d
>       1e:   orl     $2, %r14d
>       22:   movl    $1, %eax
>       27:   cmpq    $2, %r14
>       2b:   jne     2
>       2d:   xorl    %eax, %eax
>       2f:   popq    %r14

If I'm reading the asm correctly the difference is significant.
jne 0x2f was an absolute address and jmps were easy
to follow.
While in llvm disasm it's 'jne 2' ?! What is 2 ?
2 bytes from the next insn of 0x2d ?
That is super hard to read.
Is there a way to tune/configure llvm disasm?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 5/7] bpftool: Refactor disassembler for JIT-ed programs
  2022-09-06 13:36 ` [PATCH bpf-next 5/7] bpftool: Refactor disassembler for JIT-ed programs Quentin Monnet
@ 2022-09-06 23:55   ` Song Liu
  0 siblings, 0 replies; 24+ messages in thread
From: Song Liu @ 2022-09-06 23:55 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman

On Tue, Sep 6, 2022 at 6:44 AM Quentin Monnet <quentin@isovalent.com> wrote:
>
> Refactor disasm_print_insn() to extract the code specific to libbfd and
> move it to dedicated functions. There is no functional change. This is
> in preparation for supporting an alternative library for disassembling
> the instructions.
>
> Signed-off-by: Quentin Monnet <quentin@isovalent.com>

Acked-by: Song Liu <song@kernel.org>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 6/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs
  2022-09-06 13:36 ` [PATCH bpf-next 6/7] bpftool: Add LLVM as default library for disassembling " Quentin Monnet
  2022-09-06 23:46   ` Alexei Starovoitov
@ 2022-09-07  0:06   ` Song Liu
  2022-09-07 14:20     ` Quentin Monnet
  1 sibling, 1 reply; 24+ messages in thread
From: Song Liu @ 2022-09-07  0:06 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman

On Tue, Sep 6, 2022 at 6:46 AM Quentin Monnet <quentin@isovalent.com> wrote:
>
[...]
> +
> +static int
> +init_context(disasm_ctx_t *ctx, const char *arch,
> +            __maybe_unused const char *disassembler_options,
> +            __maybe_unused unsigned char *image, __maybe_unused ssize_t len)
> +{
> +       char *triple;
> +
> +       if (arch) {
> +               p_err("Architecture %s not supported", arch);
> +               return -1;
> +       }

Does this mean we stop supporting arch by default (prefer llvm
over bfd)?

Thanks,
Song

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 4/7] bpftool: Group libbfd defs in Makefile, only pass them if we use libbfd
  2022-09-06 23:31   ` Song Liu
@ 2022-09-07 14:19     ` Quentin Monnet
  0 siblings, 0 replies; 24+ messages in thread
From: Quentin Monnet @ 2022-09-07 14:19 UTC (permalink / raw)
  To: Song Liu
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman

On 07/09/2022 00:31, Song Liu wrote:
> On Tue, Sep 6, 2022 at 6:44 AM Quentin Monnet <quentin@isovalent.com> wrote:
>>
> [...]
> 
>>
>> +# If one of the above feature combinations is set, we support libbfd
>>  ifneq ($(filter -lbfd,$(LIBS)),)
>> -CFLAGS += -DHAVE_LIBBFD_SUPPORT
>> -SRCS += $(BFD_SRCS)
>> +  CFLAGS += -DHAVE_LIBBFD_SUPPORT
>> +
>> +  # Libbfd interface changed over time, figure out what we need
>> +  ifeq ($(feature-disassembler-four-args), 1)
>> +    CFLAGS += -DDISASM_FOUR_ARGS_SIGNATURE
>> +  endif
>> +  ifeq ($(feature-disassembler-init-styled), 1)
>> +    CFLAGS += -DDISASM_INIT_STYLED
>> +  endif
>> +endif
> 
> 
>> +ifeq ($(filter -DHAVE_LIBBFD_SUPPORT,$(CFLAGS)),)
>> +  # No support for JIT disassembly
>> +  SRCS := $(filter-out jit_disasm.c,$(SRCS))
>>  endif
> 
> This part could just be an else clause for the ifneq above.
> Well, I guess the difference is minimal.

True for this patch, but please see patch 6 with the LLVM support: the
ifneq above gets embedded in an outer if/else block (we only run it if
LLVM is not found), whereas removing jit_disasm.c from the sources
occurs when none of the two libs is available.

Ideally we'd have "if LLVM ... else if libbfd ... else remove
jit_disasm.c", but the check on libbfd involved checking multiple
features so I didn't find a simple way to write that in Makefile syntax
and thought it more readable to have a separate block for jit_disasm.c.

> 
> Acked-by: Song Liu <song@kernel.org>

Thanks for the review!

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 6/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs
  2022-09-06 23:46   ` Alexei Starovoitov
@ 2022-09-07 14:20     ` Quentin Monnet
  2022-09-07 16:10       ` Alexei Starovoitov
  0 siblings, 1 reply; 24+ messages in thread
From: Quentin Monnet @ 2022-09-07 14:20 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman

On 07/09/2022 00:46, Alexei Starovoitov wrote:
> On Tue, Sep 6, 2022 at 6:36 AM Quentin Monnet <quentin@isovalent.com> wrote:
>>
>> Naturally, the display of disassembled instructions comes with a few
>> minor differences. Here is a sample output with libbfd (already
>> supported before this patch):
>>
>>     # bpftool prog dump jited id 56
>>     bpf_prog_6deef7357e7b4530:
>>        0:   nopl   0x0(%rax,%rax,1)
>>        5:   xchg   %ax,%ax
>>        7:   push   %rbp
>>        8:   mov    %rsp,%rbp
>>        b:   push   %rbx
>>        c:   push   %r13
>>        e:   push   %r14
>>       10:   mov    %rdi,%rbx
>>       13:   movzwq 0xb0(%rbx),%r13
>>       1b:   xor    %r14d,%r14d
>>       1e:   or     $0x2,%r14d
>>       22:   mov    $0x1,%eax
>>       27:   cmp    $0x2,%r14
>>       2b:   jne    0x000000000000002f
>>       2d:   xor    %eax,%eax
>>       2f:   pop    %r14
>>       31:   pop    %r13
>>       33:   pop    %rbx
>>       34:   leave
>>       35:   ret
>>       36:   int3
>>
>> LLVM supports several variants that we could set when initialising the
>> disassembler, for example with:
>>
>>     LLVMSetDisasmOptions(*ctx,
>>                          LLVMDisassembler_Option_AsmPrinterVariant);
>>
>> but the default printer is kept for now. Here is the output with LLVM:
>>
>>     # bpftool prog dump jited id 56
>>     bpf_prog_6deef7357e7b4530:
>>        0:   nopl    (%rax,%rax)
>>        5:   nop
>>        7:   pushq   %rbp
>>        8:   movq    %rsp, %rbp
>>        b:   pushq   %rbx
>>        c:   pushq   %r13
>>        e:   pushq   %r14
>>       10:   movq    %rdi, %rbx
>>       13:   movzwq  176(%rbx), %r13
>>       1b:   xorl    %r14d, %r14d
>>       1e:   orl     $2, %r14d
>>       22:   movl    $1, %eax
>>       27:   cmpq    $2, %r14
>>       2b:   jne     2
>>       2d:   xorl    %eax, %eax
>>       2f:   popq    %r14
> 
> If I'm reading the asm correctly the difference is significant.
> jne 0x2f was an absolute address and jmps were easy
> to follow.
> While in llvm disasm it's 'jne 2' ?! What is 2 ?
> 2 bytes from the next insn of 0x2d ?

Yes, that's it. Apparently, this is how the operand is encoded, and
libbfd does the translation to the absolute address:

    # bpftool prog dump jited id 7868 opcodes
    [...]
      2b:   jne    0x000000000000002f
            75 02
    [...]

The same difference is observable between objdump and llvm-objdump on an
x86-64 binary for example, although they usually have labels to refer to
("jne     -22 <_obstack_memory_used+0x7d0>"), making the navigation
easier. The only mention I could find of that difference is a report
from 2013 [0].

[0] https://discourse.llvm.org/t/llvm-objdump-disassembling-jmp/29584/2

> That is super hard to read.
> Is there a way to tune/configure llvm disasm?

There's a function and some options to tune it, but I tried them and
none applies to converting the jump operands.

    int LLVMSetDisasmOptions(LLVMDisasmContextRef DC, uint64_t Options);

    /* The option to produce marked up assembly. */
    #define LLVMDisassembler_Option_UseMarkup 1
    /* The option to print immediates as hex. */
    #define LLVMDisassembler_Option_PrintImmHex 2
    /* The option use the other assembler printer variant */
    #define LLVMDisassembler_Option_AsmPrinterVariant 4
    /* The option to set comment on instructions */
    #define LLVMDisassembler_Option_SetInstrComments 8
    /* The option to print latency information alongside instructions */
    #define LLVMDisassembler_Option_PrintLatency 16

I found that LLVMDisassembler_Option_AsmPrinterVariant read better,
although in my patch I kept the default output which looked closer to
the existing from libbfd. Here's what the option produces:

    bpf_prog_6deef7357e7b4530:
       0:   nop     dword ptr [rax + rax]
       5:   nop
       7:   push    rbp
       8:   mov     rbp, rsp
       b:   push    rbx
       c:   push    r13
       e:   push    r14
      10:   mov     rbx, rdi
      13:   movzx   r13, word ptr [rbx + 180]
      1b:   xor     r14d, r14d
      1e:   or      r14d, 2
      22:   mov     eax, 1
      27:   cmp     r14, 2
      2b:   jne     2
      2d:   xor     eax, eax
      2f:   pop     r14
      31:   pop     r13
      33:   pop     rbx
      34:   leave
      35:   re

But the jne operand remains a '2'. I'm not aware of any option to change
it in LLVM's disassembler :(.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 6/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs
  2022-09-07  0:06   ` Song Liu
@ 2022-09-07 14:20     ` Quentin Monnet
  2022-09-07 16:37       ` Song Liu
  0 siblings, 1 reply; 24+ messages in thread
From: Quentin Monnet @ 2022-09-07 14:20 UTC (permalink / raw)
  To: Song Liu
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman

On 07/09/2022 01:06, Song Liu wrote:
> On Tue, Sep 6, 2022 at 6:46 AM Quentin Monnet <quentin@isovalent.com> wrote:
>>
> [...]
>> +
>> +static int
>> +init_context(disasm_ctx_t *ctx, const char *arch,
>> +            __maybe_unused const char *disassembler_options,
>> +            __maybe_unused unsigned char *image, __maybe_unused ssize_t len)
>> +{
>> +       char *triple;
>> +
>> +       if (arch) {
>> +               p_err("Architecture %s not supported", arch);
>> +               return -1;
>> +       }
> 
> Does this mean we stop supporting arch by default (prefer llvm
> over bfd)?

We do drop support in practice, because the "arch" is only used for nfp
(we only use this when the program is not using the host architecture,
so when it's offloaded - see ifindex_to_bfd_params() in common.c), and
LLVM has no support for nfp.

Although on second thought, it would probably be cleaner to set the arch
anyway in the snippet above, and to let LLVM return an error if it
doesn't know about it, so that we don't have to update bpftool in the
future if a new arch is used for BPF offload. I can update for the next
iteration.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 6/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs
  2022-09-07 14:20     ` Quentin Monnet
@ 2022-09-07 16:10       ` Alexei Starovoitov
  2022-09-07 16:33         ` Quentin Monnet
  0 siblings, 1 reply; 24+ messages in thread
From: Alexei Starovoitov @ 2022-09-07 16:10 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman

On Wed, Sep 7, 2022 at 7:20 AM Quentin Monnet <quentin@isovalent.com> wrote:
>
> On 07/09/2022 00:46, Alexei Starovoitov wrote:
> > On Tue, Sep 6, 2022 at 6:36 AM Quentin Monnet <quentin@isovalent.com> wrote:
> >>
> >> Naturally, the display of disassembled instructions comes with a few
> >> minor differences. Here is a sample output with libbfd (already
> >> supported before this patch):
> >>
> >>     # bpftool prog dump jited id 56
> >>     bpf_prog_6deef7357e7b4530:
> >>        0:   nopl   0x0(%rax,%rax,1)
> >>        5:   xchg   %ax,%ax
> >>        7:   push   %rbp
> >>        8:   mov    %rsp,%rbp
> >>        b:   push   %rbx
> >>        c:   push   %r13
> >>        e:   push   %r14
> >>       10:   mov    %rdi,%rbx
> >>       13:   movzwq 0xb0(%rbx),%r13
> >>       1b:   xor    %r14d,%r14d
> >>       1e:   or     $0x2,%r14d
> >>       22:   mov    $0x1,%eax
> >>       27:   cmp    $0x2,%r14
> >>       2b:   jne    0x000000000000002f
> >>       2d:   xor    %eax,%eax
> >>       2f:   pop    %r14
> >>       31:   pop    %r13
> >>       33:   pop    %rbx
> >>       34:   leave
> >>       35:   ret
> >>       36:   int3
> >>
> >> LLVM supports several variants that we could set when initialising the
> >> disassembler, for example with:
> >>
> >>     LLVMSetDisasmOptions(*ctx,
> >>                          LLVMDisassembler_Option_AsmPrinterVariant);
> >>
> >> but the default printer is kept for now. Here is the output with LLVM:
> >>
> >>     # bpftool prog dump jited id 56
> >>     bpf_prog_6deef7357e7b4530:
> >>        0:   nopl    (%rax,%rax)
> >>        5:   nop
> >>        7:   pushq   %rbp
> >>        8:   movq    %rsp, %rbp
> >>        b:   pushq   %rbx
> >>        c:   pushq   %r13
> >>        e:   pushq   %r14
> >>       10:   movq    %rdi, %rbx
> >>       13:   movzwq  176(%rbx), %r13
> >>       1b:   xorl    %r14d, %r14d
> >>       1e:   orl     $2, %r14d
> >>       22:   movl    $1, %eax
> >>       27:   cmpq    $2, %r14
> >>       2b:   jne     2
> >>       2d:   xorl    %eax, %eax
> >>       2f:   popq    %r14
> >
> > If I'm reading the asm correctly the difference is significant.
> > jne 0x2f was an absolute address and jmps were easy
> > to follow.
> > While in llvm disasm it's 'jne 2' ?! What is 2 ?
> > 2 bytes from the next insn of 0x2d ?
>
> Yes, that's it. Apparently, this is how the operand is encoded, and
> libbfd does the translation to the absolute address:
>
>     # bpftool prog dump jited id 7868 opcodes
>     [...]
>       2b:   jne    0x000000000000002f
>             75 02
>     [...]
>
> The same difference is observable between objdump and llvm-objdump on an
> x86-64 binary for example, although they usually have labels to refer to
> ("jne     -22 <_obstack_memory_used+0x7d0>"), making the navigation
> easier. The only mention I could find of that difference is a report
> from 2013 [0].
>
> [0] https://discourse.llvm.org/t/llvm-objdump-disassembling-jmp/29584/2
>
> > That is super hard to read.
> > Is there a way to tune/configure llvm disasm?
>
> There's a function and some options to tune it, but I tried them and
> none applies to converting the jump operands.
>
>     int LLVMSetDisasmOptions(LLVMDisasmContextRef DC, uint64_t Options);
>
>     /* The option to produce marked up assembly. */
>     #define LLVMDisassembler_Option_UseMarkup 1
>     /* The option to print immediates as hex. */
>     #define LLVMDisassembler_Option_PrintImmHex 2
>     /* The option use the other assembler printer variant */
>     #define LLVMDisassembler_Option_AsmPrinterVariant 4
>     /* The option to set comment on instructions */
>     #define LLVMDisassembler_Option_SetInstrComments 8
>     /* The option to print latency information alongside instructions */
>     #define LLVMDisassembler_Option_PrintLatency 16
>
> I found that LLVMDisassembler_Option_AsmPrinterVariant read better,
> although in my patch I kept the default output which looked closer to
> the existing from libbfd. Here's what the option produces:
>
>     bpf_prog_6deef7357e7b4530:
>        0:   nop     dword ptr [rax + rax]
>        5:   nop
>        7:   push    rbp
>        8:   mov     rbp, rsp
>        b:   push    rbx
>        c:   push    r13
>        e:   push    r14
>       10:   mov     rbx, rdi
>       13:   movzx   r13, word ptr [rbx + 180]
>       1b:   xor     r14d, r14d
>       1e:   or      r14d, 2
>       22:   mov     eax, 1
>       27:   cmp     r14, 2
>       2b:   jne     2
>       2d:   xor     eax, eax
>       2f:   pop     r14
>       31:   pop     r13
>       33:   pop     rbx
>       34:   leave
>       35:   re
>
> But the jne operand remains a '2'. I'm not aware of any option to change
> it in LLVM's disassembler :(.

Hmm. llvm-objdump -d test_maps
looks fine:
  41bfcb: e8 6f f7 ff ff                   callq    0x41b73f
<find_extern_btf_id>

the must be something llvm disasm is missing when you feed raw bytes
into it.
Please keep investigating. In this form I'm afraid it's no go.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 6/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs
  2022-09-07 16:10       ` Alexei Starovoitov
@ 2022-09-07 16:33         ` Quentin Monnet
  2022-09-07 18:02           ` Yonghong Song
  0 siblings, 1 reply; 24+ messages in thread
From: Quentin Monnet @ 2022-09-07 16:33 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman

On 07/09/2022 17:10, Alexei Starovoitov wrote:
> On Wed, Sep 7, 2022 at 7:20 AM Quentin Monnet <quentin@isovalent.com> wrote:
>>
>> On 07/09/2022 00:46, Alexei Starovoitov wrote:
>>> On Tue, Sep 6, 2022 at 6:36 AM Quentin Monnet <quentin@isovalent.com> wrote:
>>>>
>>>> Naturally, the display of disassembled instructions comes with a few
>>>> minor differences. Here is a sample output with libbfd (already
>>>> supported before this patch):
>>>>
>>>>     # bpftool prog dump jited id 56
>>>>     bpf_prog_6deef7357e7b4530:
>>>>        0:   nopl   0x0(%rax,%rax,1)
>>>>        5:   xchg   %ax,%ax
>>>>        7:   push   %rbp
>>>>        8:   mov    %rsp,%rbp
>>>>        b:   push   %rbx
>>>>        c:   push   %r13
>>>>        e:   push   %r14
>>>>       10:   mov    %rdi,%rbx
>>>>       13:   movzwq 0xb0(%rbx),%r13
>>>>       1b:   xor    %r14d,%r14d
>>>>       1e:   or     $0x2,%r14d
>>>>       22:   mov    $0x1,%eax
>>>>       27:   cmp    $0x2,%r14
>>>>       2b:   jne    0x000000000000002f
>>>>       2d:   xor    %eax,%eax
>>>>       2f:   pop    %r14
>>>>       31:   pop    %r13
>>>>       33:   pop    %rbx
>>>>       34:   leave
>>>>       35:   ret
>>>>       36:   int3
>>>>
>>>> LLVM supports several variants that we could set when initialising the
>>>> disassembler, for example with:
>>>>
>>>>     LLVMSetDisasmOptions(*ctx,
>>>>                          LLVMDisassembler_Option_AsmPrinterVariant);
>>>>
>>>> but the default printer is kept for now. Here is the output with LLVM:
>>>>
>>>>     # bpftool prog dump jited id 56
>>>>     bpf_prog_6deef7357e7b4530:
>>>>        0:   nopl    (%rax,%rax)
>>>>        5:   nop
>>>>        7:   pushq   %rbp
>>>>        8:   movq    %rsp, %rbp
>>>>        b:   pushq   %rbx
>>>>        c:   pushq   %r13
>>>>        e:   pushq   %r14
>>>>       10:   movq    %rdi, %rbx
>>>>       13:   movzwq  176(%rbx), %r13
>>>>       1b:   xorl    %r14d, %r14d
>>>>       1e:   orl     $2, %r14d
>>>>       22:   movl    $1, %eax
>>>>       27:   cmpq    $2, %r14
>>>>       2b:   jne     2
>>>>       2d:   xorl    %eax, %eax
>>>>       2f:   popq    %r14
>>>
>>> If I'm reading the asm correctly the difference is significant.
>>> jne 0x2f was an absolute address and jmps were easy
>>> to follow.
>>> While in llvm disasm it's 'jne 2' ?! What is 2 ?
>>> 2 bytes from the next insn of 0x2d ?
>>
>> Yes, that's it. Apparently, this is how the operand is encoded, and
>> libbfd does the translation to the absolute address:
>>
>>     # bpftool prog dump jited id 7868 opcodes
>>     [...]
>>       2b:   jne    0x000000000000002f
>>             75 02
>>     [...]
>>
>> The same difference is observable between objdump and llvm-objdump on an
>> x86-64 binary for example, although they usually have labels to refer to
>> ("jne     -22 <_obstack_memory_used+0x7d0>"), making the navigation
>> easier. The only mention I could find of that difference is a report
>> from 2013 [0].
>>
>> [0] https://discourse.llvm.org/t/llvm-objdump-disassembling-jmp/29584/2
>>
>>> That is super hard to read.
>>> Is there a way to tune/configure llvm disasm?
>>
>> There's a function and some options to tune it, but I tried them and
>> none applies to converting the jump operands.
>>
>>     int LLVMSetDisasmOptions(LLVMDisasmContextRef DC, uint64_t Options);
>>
>>     /* The option to produce marked up assembly. */
>>     #define LLVMDisassembler_Option_UseMarkup 1
>>     /* The option to print immediates as hex. */
>>     #define LLVMDisassembler_Option_PrintImmHex 2
>>     /* The option use the other assembler printer variant */
>>     #define LLVMDisassembler_Option_AsmPrinterVariant 4
>>     /* The option to set comment on instructions */
>>     #define LLVMDisassembler_Option_SetInstrComments 8
>>     /* The option to print latency information alongside instructions */
>>     #define LLVMDisassembler_Option_PrintLatency 16
>>
>> I found that LLVMDisassembler_Option_AsmPrinterVariant read better,
>> although in my patch I kept the default output which looked closer to
>> the existing from libbfd. Here's what the option produces:
>>
>>     bpf_prog_6deef7357e7b4530:
>>        0:   nop     dword ptr [rax + rax]
>>        5:   nop
>>        7:   push    rbp
>>        8:   mov     rbp, rsp
>>        b:   push    rbx
>>        c:   push    r13
>>        e:   push    r14
>>       10:   mov     rbx, rdi
>>       13:   movzx   r13, word ptr [rbx + 180]
>>       1b:   xor     r14d, r14d
>>       1e:   or      r14d, 2
>>       22:   mov     eax, 1
>>       27:   cmp     r14, 2
>>       2b:   jne     2
>>       2d:   xor     eax, eax
>>       2f:   pop     r14
>>       31:   pop     r13
>>       33:   pop     rbx
>>       34:   leave
>>       35:   re
>>
>> But the jne operand remains a '2'. I'm not aware of any option to change
>> it in LLVM's disassembler :(.
> 
> Hmm. llvm-objdump -d test_maps
> looks fine:
>   41bfcb: e8 6f f7 ff ff                   callq    0x41b73f
> <find_extern_btf_id>
> 
> the must be something llvm disasm is missing when you feed raw bytes
> into it.
> Please keep investigating. In this form I'm afraid it's no go.

OK, I'll keep looking

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 6/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs
  2022-09-07 14:20     ` Quentin Monnet
@ 2022-09-07 16:37       ` Song Liu
  0 siblings, 0 replies; 24+ messages in thread
From: Song Liu @ 2022-09-07 16:37 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Song Liu, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin Lau, Yonghong Song, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman



> On Sep 7, 2022, at 7:20 AM, Quentin Monnet <quentin@isovalent.com> wrote:
> 
> On 07/09/2022 01:06, Song Liu wrote:
>> On Tue, Sep 6, 2022 at 6:46 AM Quentin Monnet <quentin@isovalent.com> wrote:
>>> 
>> [...]
>>> +
>>> +static int
>>> +init_context(disasm_ctx_t *ctx, const char *arch,
>>> +            __maybe_unused const char *disassembler_options,
>>> +            __maybe_unused unsigned char *image, __maybe_unused ssize_t len)
>>> +{
>>> +       char *triple;
>>> +
>>> +       if (arch) {
>>> +               p_err("Architecture %s not supported", arch);
>>> +               return -1;
>>> +       }
>> 
>> Does this mean we stop supporting arch by default (prefer llvm
>> over bfd)?
> 
> We do drop support in practice, because the "arch" is only used for nfp
> (we only use this when the program is not using the host architecture,
> so when it's offloaded - see ifindex_to_bfd_params() in common.c), and
> LLVM has no support for nfp.
> 
> Although on second thought, it would probably be cleaner to set the arch
> anyway in the snippet above, and to let LLVM return an error if it
> doesn't know about it, so that we don't have to update bpftool in the
> future if a new arch is used for BPF offload. I can update for the next
> iteration.

Sounds good! Thanks for looking into different options. 

Song


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 6/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs
  2022-09-07 16:33         ` Quentin Monnet
@ 2022-09-07 18:02           ` Yonghong Song
  2022-09-11 20:13             ` Quentin Monnet
  0 siblings, 1 reply; 24+ messages in thread
From: Yonghong Song @ 2022-09-07 18:02 UTC (permalink / raw)
  To: Quentin Monnet, Alexei Starovoitov
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman



On 9/7/22 9:33 AM, Quentin Monnet wrote:
> On 07/09/2022 17:10, Alexei Starovoitov wrote:
>> On Wed, Sep 7, 2022 at 7:20 AM Quentin Monnet <quentin@isovalent.com> wrote:
>>>
>>> On 07/09/2022 00:46, Alexei Starovoitov wrote:
>>>> On Tue, Sep 6, 2022 at 6:36 AM Quentin Monnet <quentin@isovalent.com> wrote:
>>>>>
>>>>> Naturally, the display of disassembled instructions comes with a few
>>>>> minor differences. Here is a sample output with libbfd (already
>>>>> supported before this patch):
>>>>>
>>>>>      # bpftool prog dump jited id 56
>>>>>      bpf_prog_6deef7357e7b4530:
>>>>>         0:   nopl   0x0(%rax,%rax,1)
>>>>>         5:   xchg   %ax,%ax
>>>>>         7:   push   %rbp
>>>>>         8:   mov    %rsp,%rbp
>>>>>         b:   push   %rbx
>>>>>         c:   push   %r13
>>>>>         e:   push   %r14
>>>>>        10:   mov    %rdi,%rbx
>>>>>        13:   movzwq 0xb0(%rbx),%r13
>>>>>        1b:   xor    %r14d,%r14d
>>>>>        1e:   or     $0x2,%r14d
>>>>>        22:   mov    $0x1,%eax
>>>>>        27:   cmp    $0x2,%r14
>>>>>        2b:   jne    0x000000000000002f
>>>>>        2d:   xor    %eax,%eax
>>>>>        2f:   pop    %r14
>>>>>        31:   pop    %r13
>>>>>        33:   pop    %rbx
>>>>>        34:   leave
>>>>>        35:   ret
>>>>>        36:   int3
>>>>>
>>>>> LLVM supports several variants that we could set when initialising the
>>>>> disassembler, for example with:
>>>>>
>>>>>      LLVMSetDisasmOptions(*ctx,
>>>>>                           LLVMDisassembler_Option_AsmPrinterVariant);
>>>>>
>>>>> but the default printer is kept for now. Here is the output with LLVM:
>>>>>
>>>>>      # bpftool prog dump jited id 56
>>>>>      bpf_prog_6deef7357e7b4530:
>>>>>         0:   nopl    (%rax,%rax)
>>>>>         5:   nop
>>>>>         7:   pushq   %rbp
>>>>>         8:   movq    %rsp, %rbp
>>>>>         b:   pushq   %rbx
>>>>>         c:   pushq   %r13
>>>>>         e:   pushq   %r14
>>>>>        10:   movq    %rdi, %rbx
>>>>>        13:   movzwq  176(%rbx), %r13
>>>>>        1b:   xorl    %r14d, %r14d
>>>>>        1e:   orl     $2, %r14d
>>>>>        22:   movl    $1, %eax
>>>>>        27:   cmpq    $2, %r14
>>>>>        2b:   jne     2
>>>>>        2d:   xorl    %eax, %eax
>>>>>        2f:   popq    %r14
>>>>
>>>> If I'm reading the asm correctly the difference is significant.
>>>> jne 0x2f was an absolute address and jmps were easy
>>>> to follow.
>>>> While in llvm disasm it's 'jne 2' ?! What is 2 ?
>>>> 2 bytes from the next insn of 0x2d ?
>>>
>>> Yes, that's it. Apparently, this is how the operand is encoded, and
>>> libbfd does the translation to the absolute address:
>>>
>>>      # bpftool prog dump jited id 7868 opcodes
>>>      [...]
>>>        2b:   jne    0x000000000000002f
>>>              75 02
>>>      [...]
>>>
>>> The same difference is observable between objdump and llvm-objdump on an
>>> x86-64 binary for example, although they usually have labels to refer to
>>> ("jne     -22 <_obstack_memory_used+0x7d0>"), making the navigation
>>> easier. The only mention I could find of that difference is a report
>>> from 2013 [0].
>>>
>>> [0] https://discourse.llvm.org/t/llvm-objdump-disassembling-jmp/29584/2
>>>
>>>> That is super hard to read.
>>>> Is there a way to tune/configure llvm disasm?
>>>
>>> There's a function and some options to tune it, but I tried them and
>>> none applies to converting the jump operands.
>>>
>>>      int LLVMSetDisasmOptions(LLVMDisasmContextRef DC, uint64_t Options);
>>>
>>>      /* The option to produce marked up assembly. */
>>>      #define LLVMDisassembler_Option_UseMarkup 1
>>>      /* The option to print immediates as hex. */
>>>      #define LLVMDisassembler_Option_PrintImmHex 2
>>>      /* The option use the other assembler printer variant */
>>>      #define LLVMDisassembler_Option_AsmPrinterVariant 4
>>>      /* The option to set comment on instructions */
>>>      #define LLVMDisassembler_Option_SetInstrComments 8
>>>      /* The option to print latency information alongside instructions */
>>>      #define LLVMDisassembler_Option_PrintLatency 16
>>>
>>> I found that LLVMDisassembler_Option_AsmPrinterVariant read better,
>>> although in my patch I kept the default output which looked closer to
>>> the existing from libbfd. Here's what the option produces:
>>>
>>>      bpf_prog_6deef7357e7b4530:
>>>         0:   nop     dword ptr [rax + rax]
>>>         5:   nop
>>>         7:   push    rbp
>>>         8:   mov     rbp, rsp
>>>         b:   push    rbx
>>>         c:   push    r13
>>>         e:   push    r14
>>>        10:   mov     rbx, rdi
>>>        13:   movzx   r13, word ptr [rbx + 180]
>>>        1b:   xor     r14d, r14d
>>>        1e:   or      r14d, 2
>>>        22:   mov     eax, 1
>>>        27:   cmp     r14, 2
>>>        2b:   jne     2
>>>        2d:   xor     eax, eax
>>>        2f:   pop     r14
>>>        31:   pop     r13
>>>        33:   pop     rbx
>>>        34:   leave
>>>        35:   re
>>>
>>> But the jne operand remains a '2'. I'm not aware of any option to change
>>> it in LLVM's disassembler :(.
>>
>> Hmm. llvm-objdump -d test_maps
>> looks fine:
>>    41bfcb: e8 6f f7 ff ff                   callq    0x41b73f
>> <find_extern_btf_id>
>>
>> the must be something llvm disasm is missing when you feed raw bytes
>> into it.
>> Please keep investigating. In this form I'm afraid it's no go.
> 
> OK, I'll keep looking

Quentin, if eventually there is no existing solution for this problem, 
we could improve llvm API to encode branch target in more 
easy-to-understand form.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 0/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs
  2022-09-06 13:36 [PATCH bpf-next 0/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs Quentin Monnet
                   ` (6 preceding siblings ...)
  2022-09-06 13:36 ` [PATCH bpf-next 7/7] bpftool: Add llvm feature to "bpftool version" Quentin Monnet
@ 2022-09-10 19:41 ` Niklas Söderlund
  7 siblings, 0 replies; 24+ messages in thread
From: Niklas Söderlund @ 2022-09-10 19:41 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Simon Horman

Hi Quentin,

Thanks for your work.

On 2022-09-06 14:36:06 +0100, Quentin Monnet wrote:
> To disassemble instructions for JIT-ed programs, bpftool has relied on the
> libbfd library. This has been problematic in the past: libbfd's interface
> is not meant to be stable and has changed several times, hence the
> detection of the two related features from the Makefile
> (disassembler-four-args and disassembler-init-styled). When it comes to
> shipping bpftool, this has also caused issues with several distribution
> maintainers unwilling to support the feature (for example, Debian's page
> for binutils-dev, libbfd's package, says: "Note that building Debian
> packages which depend on the shared libbfd is Not Allowed.").
> 
> This patchset adds support for LLVM as the primary library for
> disassembling instructions for JIT-ed programs.
> 
> We keep libbfd as a fallback. One reason for this is that currently it
> works well, we have all we need in terms of features detection in the
> Makefile, so it provides a fallback for disassembling JIT-ed programs if
> libbfd is installed but LLVM is not. The other reason is that libbfd
> supports nfp instruction for Netronome's SmartNICs and can be used to
> disassemble offloaded programs, something that LLVM cannot do (Niklas
> confirmed that the feature is still in use). However, if libbfd's interface
> breaks again in the future, we might reconsider keeping support for it.

I have tested the fallback method for NFP and it works as expected and 
one can dump the offloaded program. I know there is discussion about the 
output format of the LLVM path, but for the whole series from a NFP 
point of view,

Tested-by: Niklas Söderlund <niklas.soderlund@corigine.com>

> 
> Quentin Monnet (7):
>   bpftool: Define _GNU_SOURCE only once
>   bpftool: Remove asserts from JIT disassembler
>   bpftool: Split FEATURE_TESTS/FEATURE_DISPLAY definitions in Makefile
>   bpftool: Group libbfd defs in Makefile, only pass them if we use
>     libbfd
>   bpftool: Refactor disassembler for JIT-ed programs
>   bpftool: Add LLVM as default library for disassembling JIT-ed programs
>   bpftool: Add llvm feature to "bpftool version"
> 
>  .../bpftool/Documentation/common_options.rst  |   8 +-
>  tools/bpf/bpftool/Makefile                    |  65 +++--
>  tools/bpf/bpftool/common.c                    |   2 +
>  tools/bpf/bpftool/iter.c                      |   2 +
>  tools/bpf/bpftool/jit_disasm.c                | 244 ++++++++++++++----
>  tools/bpf/bpftool/main.c                      |  10 +
>  tools/bpf/bpftool/main.h                      |  29 ++-
>  tools/bpf/bpftool/map.c                       |   1 -
>  tools/bpf/bpftool/net.c                       |   2 +
>  tools/bpf/bpftool/perf.c                      |   2 +
>  tools/bpf/bpftool/prog.c                      |  17 +-
>  tools/bpf/bpftool/xlated_dumper.c             |   2 +
>  12 files changed, 291 insertions(+), 93 deletions(-)
> 
> -- 
> 2.34.1
> 

-- 
Kind Regards,
Niklas Söderlund

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 6/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs
  2022-09-07 18:02           ` Yonghong Song
@ 2022-09-11 20:13             ` Quentin Monnet
  0 siblings, 0 replies; 24+ messages in thread
From: Quentin Monnet @ 2022-09-11 20:13 UTC (permalink / raw)
  To: Yonghong Song, Alexei Starovoitov
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
	Niklas Söderlund, Simon Horman

On 07/09/2022 19:02, Yonghong Song wrote:
> 
> 
> On 9/7/22 9:33 AM, Quentin Monnet wrote:
>> On 07/09/2022 17:10, Alexei Starovoitov wrote:
>>> On Wed, Sep 7, 2022 at 7:20 AM Quentin Monnet <quentin@isovalent.com>
>>> wrote:
>>>>
>>>> On 07/09/2022 00:46, Alexei Starovoitov wrote:
>>>>> On Tue, Sep 6, 2022 at 6:36 AM Quentin Monnet
>>>>> <quentin@isovalent.com> wrote:

>>>>>>      # bpftool prog dump jited id 56
>>>>>>      bpf_prog_6deef7357e7b4530:
>>>>>>         0:   nopl    (%rax,%rax)
>>>>>>         5:   nop
>>>>>>         7:   pushq   %rbp
>>>>>>         8:   movq    %rsp, %rbp
>>>>>>         b:   pushq   %rbx
>>>>>>         c:   pushq   %r13
>>>>>>         e:   pushq   %r14
>>>>>>        10:   movq    %rdi, %rbx
>>>>>>        13:   movzwq  176(%rbx), %r13
>>>>>>        1b:   xorl    %r14d, %r14d
>>>>>>        1e:   orl     $2, %r14d
>>>>>>        22:   movl    $1, %eax
>>>>>>        27:   cmpq    $2, %r14
>>>>>>        2b:   jne     2
>>>>>>        2d:   xorl    %eax, %eax
>>>>>>        2f:   popq    %r14
>>>>>
>>>>> If I'm reading the asm correctly the difference is significant.
>>>>> jne 0x2f was an absolute address and jmps were easy
>>>>> to follow.
>>>>> While in llvm disasm it's 'jne 2' ?! What is 2 ?
>>>>> 2 bytes from the next insn of 0x2d ?
>>>>
>>>> Yes, that's it. Apparently, this is how the operand is encoded, and
>>>> libbfd does the translation to the absolute address:
>>>>
>>>>      # bpftool prog dump jited id 7868 opcodes
>>>>      [...]
>>>>        2b:   jne    0x000000000000002f
>>>>              75 02
>>>>      [...]
>>>>
>>>> The same difference is observable between objdump and llvm-objdump
>>>> on an
>>>> x86-64 binary for example, although they usually have labels to
>>>> refer to
>>>> ("jne     -22 <_obstack_memory_used+0x7d0>"), making the navigation
>>>> easier. The only mention I could find of that difference is a report
>>>> from 2013 [0].
>>>>
>>>> [0] https://discourse.llvm.org/t/llvm-objdump-disassembling-jmp/29584/2
>>>>
>>>>> That is super hard to read.
>>>>> Is there a way to tune/configure llvm disasm?
>>>>
>>>> There's a function and some options to tune it, but I tried them and
>>>> none applies to converting the jump operands.
>>>>
>>>>      int LLVMSetDisasmOptions(LLVMDisasmContextRef DC, uint64_t
>>>> Options);
>>>>
>>>>      /* The option to produce marked up assembly. */
>>>>      #define LLVMDisassembler_Option_UseMarkup 1
>>>>      /* The option to print immediates as hex. */
>>>>      #define LLVMDisassembler_Option_PrintImmHex 2
>>>>      /* The option use the other assembler printer variant */
>>>>      #define LLVMDisassembler_Option_AsmPrinterVariant 4
>>>>      /* The option to set comment on instructions */
>>>>      #define LLVMDisassembler_Option_SetInstrComments 8
>>>>      /* The option to print latency information alongside
>>>> instructions */
>>>>      #define LLVMDisassembler_Option_PrintLatency 16
>>>>
>>>> I found that LLVMDisassembler_Option_AsmPrinterVariant read better,
>>>> although in my patch I kept the default output which looked closer to
>>>> the existing from libbfd. Here's what the option produces:
>>>>
>>>>      bpf_prog_6deef7357e7b4530:
>>>>         0:   nop     dword ptr [rax + rax]
>>>>         5:   nop
>>>>         7:   push    rbp
>>>>         8:   mov     rbp, rsp
>>>>         b:   push    rbx
>>>>         c:   push    r13
>>>>         e:   push    r14
>>>>        10:   mov     rbx, rdi
>>>>        13:   movzx   r13, word ptr [rbx + 180]
>>>>        1b:   xor     r14d, r14d
>>>>        1e:   or      r14d, 2
>>>>        22:   mov     eax, 1
>>>>        27:   cmp     r14, 2
>>>>        2b:   jne     2
>>>>        2d:   xor     eax, eax
>>>>        2f:   pop     r14
>>>>        31:   pop     r13
>>>>        33:   pop     rbx
>>>>        34:   leave
>>>>        35:   re
>>>>
>>>> But the jne operand remains a '2'. I'm not aware of any option to
>>>> change
>>>> it in LLVM's disassembler :(.
>>>
>>> Hmm. llvm-objdump -d test_maps
>>> looks fine:
>>>    41bfcb: e8 6f f7 ff ff                   callq    0x41b73f
>>> <find_extern_btf_id>
>>>
>>> the must be something llvm disasm is missing when you feed raw bytes
>>> into it.
>>> Please keep investigating. In this form I'm afraid it's no go.
>>
>> OK, I'll keep looking
> 
> Quentin, if eventually there is no existing solution for this problem,
> we could improve llvm API to encode branch target in more
> easy-to-understand form.
> 

TL;DR: I figured it out, with no LLVM fix required. I'll send a v2.

---
The details:
In my previous example, I ran on Ubuntu 20.04 with llvm-objdump v10.
Looking at just the v11 with the same command, I get the relative
addresses like in Alexei's output:

    $ llvm-objdump-10 -d /usr/bin/grep | grep -m1 jne
        4f20: 0f 85 e7 0b 00 00             jne     3047 <.text+0xddd>

    $ llvm-objdump-11 -d /usr/bin/grep | grep -m1 jne
        4f20: 0f 85 e7 0b 00 00             jne     0x5b0d <.text+0xddd>

    $ printf '%#x\n' $((0x4f20 + 3047 + 6))
    0x5b0d

The change was introduced between the two versions by the following commits:

    5fad05e80dd0 ("[MCInstPrinter] Pass `Address` parameter to
        MCOI::OPERAND_PCREL typed operands. NFC")
    https://reviews.llvm.org/D76574

    87de9a0786d9 ("[X86InstPrinter] Change printPCRelImm to print the
        target address in hexadecimal form")
    https://reviews.llvm.org/D76580

The first one in particular introduces a PrintBranchImmAsAddress
variable in the MCInstPrinter, and makes llvm-obdump call
"setPrintBranchImmAsAddress(true);".

Now, this function is not exposed to the C interface, llvm-c, that
bpftool would use. Instead, it's available through the more
comprehensive C++ interface that llvm-objdump relies on. I've tried to
replicate these two commits in the MCDisassembler that llvm-c interfaces
with, and succeeded.

But while looking at unit tests to extend it with the relevant checks, I
realised that the existing test would already expect an address instead
of an operand [0]. Turns out that the symbolLookupCallback() callback
defined in the test and passed when creating the disassembler ends up
changing the display so we get the operands as addresses, as we want.

[0]
https://github.com/llvm/llvm-project/blob/llvmorg-15.0.0/llvm/unittests/MC/Disassembler.cpp#L64
---

I validated that this works in bpftool too, and will send v2 with that
change.

P.S. Thank you Niklas for testing!

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2022-09-11 20:13 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-06 13:36 [PATCH bpf-next 0/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs Quentin Monnet
2022-09-06 13:36 ` [PATCH bpf-next 1/7] bpftool: Define _GNU_SOURCE only once Quentin Monnet
2022-09-06 23:14   ` Song Liu
2022-09-06 13:36 ` [PATCH bpf-next 2/7] bpftool: Remove asserts from JIT disassembler Quentin Monnet
2022-09-06 23:16   ` Song Liu
2022-09-06 13:36 ` [PATCH bpf-next 3/7] bpftool: Split FEATURE_TESTS/FEATURE_DISPLAY definitions in Makefile Quentin Monnet
2022-09-06 23:18   ` Song Liu
2022-09-06 13:36 ` [PATCH bpf-next 4/7] bpftool: Group libbfd defs in Makefile, only pass them if we use libbfd Quentin Monnet
2022-09-06 23:31   ` Song Liu
2022-09-07 14:19     ` Quentin Monnet
2022-09-06 13:36 ` [PATCH bpf-next 5/7] bpftool: Refactor disassembler for JIT-ed programs Quentin Monnet
2022-09-06 23:55   ` Song Liu
2022-09-06 13:36 ` [PATCH bpf-next 6/7] bpftool: Add LLVM as default library for disassembling " Quentin Monnet
2022-09-06 23:46   ` Alexei Starovoitov
2022-09-07 14:20     ` Quentin Monnet
2022-09-07 16:10       ` Alexei Starovoitov
2022-09-07 16:33         ` Quentin Monnet
2022-09-07 18:02           ` Yonghong Song
2022-09-11 20:13             ` Quentin Monnet
2022-09-07  0:06   ` Song Liu
2022-09-07 14:20     ` Quentin Monnet
2022-09-07 16:37       ` Song Liu
2022-09-06 13:36 ` [PATCH bpf-next 7/7] bpftool: Add llvm feature to "bpftool version" Quentin Monnet
2022-09-10 19:41 ` [PATCH bpf-next 0/7] bpftool: Add LLVM as default library for disassembling JIT-ed programs Niklas Söderlund

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.