All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next] libbpf: fix internal USDT address translation logic for shared libraries
@ 2022-06-16  5:55 Andrii Nakryiko
  2022-06-16 19:59 ` Alexei Starovoitov
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Andrii Nakryiko @ 2022-06-16  5:55 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team

Perform the same virtual address to file offset translation that libbpf
is doing for executable ELF binaries also for shared libraries.
Currently libbpf is making a simplifying and sometimes wrong assumption
that for shared libraries relative virtual addresses inside ELF are
always equal to file offsets.

Unfortunately, this is not always the case with LLVM's lld linker, which
now by default generates quite more complicated ELF segments layout.
E.g., for liburandom_read.so from selftests/bpf, here's an excerpt from
readelf output listing ELF segments (a.k.a. program headers):

  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000000040 0x0000000000000040 0x0001f8 0x0001f8 R   0x8
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x0005e4 0x0005e4 R   0x1000
  LOAD           0x0005f0 0x00000000000015f0 0x00000000000015f0 0x000160 0x000160 R E 0x1000
  LOAD           0x000750 0x0000000000002750 0x0000000000002750 0x000210 0x000210 RW  0x1000
  LOAD           0x000960 0x0000000000003960 0x0000000000003960 0x000028 0x000029 RW  0x1000

Compare that to what is generated by GNU ld (or LLVM lld's with extra
-znoseparate-code argument which disables this cleverness in the name of
file size reduction):

  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x000550 0x000550 R   0x1000
  LOAD           0x001000 0x0000000000001000 0x0000000000001000 0x000131 0x000131 R E 0x1000
  LOAD           0x002000 0x0000000000002000 0x0000000000002000 0x0000ac 0x0000ac R   0x1000
  LOAD           0x002dc0 0x0000000000003dc0 0x0000000000003dc0 0x000262 0x000268 RW  0x1000

You can see from the first example above that for executable (Flg == "R E")
PT_LOAD segment (LOAD #2), Offset doesn't match VirtAddr columns.
And it does in the second case (GNU ld output).

This is important because all the addresses, including USDT specs,
operate in a virtual address space, while kernel is expecting file
offsets when performing uprobe attach. So such mismatches have to be
properly taken care of and compensated by libbpf, which is what this
patch is fixing.

Also patch clarifies few function and variable names, as well as updates
comments to reflect this important distinction (virtaddr vs file offset)
and to ephasize that shared libraries are not all that different from
executables in this regard.

This patch also changes selftests/bpf Makefile to force urand_read and
liburand_read.so to be built with Clang and LLVM's lld (and explicitly
request this ELF file size optimization through -znoseparate-code linker
parameter) to validate libbpf logic and ensure regressions don't happen
in the future. I've bundled these selftests changes together with libbpf
changes to keep the above description tied with both libbpf and
selftests changes.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/usdt.c                 | 123 ++++++++++++++-------------
 tools/testing/selftests/bpf/Makefile |  14 +--
 2 files changed, 72 insertions(+), 65 deletions(-)

diff --git a/tools/lib/bpf/usdt.c b/tools/lib/bpf/usdt.c
index f1c9339cfbbc..5159207cbfd9 100644
--- a/tools/lib/bpf/usdt.c
+++ b/tools/lib/bpf/usdt.c
@@ -441,7 +441,7 @@ static int parse_elf_segs(Elf *elf, const char *path, struct elf_seg **segs, siz
 	return 0;
 }
 
-static int parse_lib_segs(int pid, const char *lib_path, struct elf_seg **segs, size_t *seg_cnt)
+static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs, size_t *seg_cnt)
 {
 	char path[PATH_MAX], line[PATH_MAX], mode[16];
 	size_t seg_start, seg_end, seg_off;
@@ -531,35 +531,40 @@ static int parse_lib_segs(int pid, const char *lib_path, struct elf_seg **segs,
 	return err;
 }
 
-static struct elf_seg *find_elf_seg(struct elf_seg *segs, size_t seg_cnt, long addr, bool relative)
+static struct elf_seg *find_elf_seg(struct elf_seg *segs, size_t seg_cnt, long virtaddr)
 {
 	struct elf_seg *seg;
 	int i;
 
-	if (relative) {
-		/* for shared libraries, address is relative offset and thus
-		 * should be fall within logical offset-based range of
-		 * [offset_start, offset_end)
-		 */
-		for (i = 0, seg = segs; i < seg_cnt; i++, seg++) {
-			if (seg->offset <= addr && addr < seg->offset + (seg->end - seg->start))
-				return seg;
-		}
-	} else {
-		/* for binaries, address is absolute and thus should be within
-		 * absolute address range of [seg_start, seg_end)
-		 */
-		for (i = 0, seg = segs; i < seg_cnt; i++, seg++) {
-			if (seg->start <= addr && addr < seg->end)
-				return seg;
-		}
+	/* for ELF binaries (both executables and shared libraries), we are
+	 * given virtual address (absolute for executables, relative for
+	 * libraries) which should match address range of [seg_start, seg_end)
+	 */
+	for (i = 0, seg = segs; i < seg_cnt; i++, seg++) {
+		if (seg->start <= virtaddr && virtaddr < seg->end)
+			return seg;
 	}
+	return NULL;
+}
 
+static struct elf_seg *find_vma_seg(struct elf_seg *segs, size_t seg_cnt, long offset)
+{
+	struct elf_seg *seg;
+	int i;
+
+	/* for VMA segments from /proc/<pid>/maps file, provided "address" is
+	 * actually a file offset, so should be fall within logical
+	 * offset-based range of [offset_start, offset_end)
+	 */
+	for (i = 0, seg = segs; i < seg_cnt; i++, seg++) {
+		if (seg->offset <= offset && offset < seg->offset + (seg->end - seg->start))
+			return seg;
+	}
 	return NULL;
 }
 
-static int parse_usdt_note(Elf *elf, const char *path, long base_addr,
-			   GElf_Nhdr *nhdr, const char *data, size_t name_off, size_t desc_off,
+static int parse_usdt_note(Elf *elf, const char *path, GElf_Nhdr *nhdr,
+			   const char *data, size_t name_off, size_t desc_off,
 			   struct usdt_note *usdt_note);
 
 static int parse_usdt_spec(struct usdt_spec *spec, const struct usdt_note *note, __u64 usdt_cookie);
@@ -568,8 +573,8 @@ static int collect_usdt_targets(struct usdt_manager *man, Elf *elf, const char *
 				const char *usdt_provider, const char *usdt_name, __u64 usdt_cookie,
 				struct usdt_target **out_targets, size_t *out_target_cnt)
 {
-	size_t off, name_off, desc_off, seg_cnt = 0, lib_seg_cnt = 0, target_cnt = 0;
-	struct elf_seg *segs = NULL, *lib_segs = NULL;
+	size_t off, name_off, desc_off, seg_cnt = 0, vma_seg_cnt = 0, target_cnt = 0;
+	struct elf_seg *segs = NULL, *vma_segs = NULL;
 	struct usdt_target *targets = NULL, *target;
 	long base_addr = 0;
 	Elf_Scn *notes_scn, *base_scn;
@@ -613,8 +618,7 @@ static int collect_usdt_targets(struct usdt_manager *man, Elf *elf, const char *
 		struct elf_seg *seg = NULL;
 		void *tmp;
 
-		err = parse_usdt_note(elf, path, base_addr, &nhdr,
-				      data->d_buf, name_off, desc_off, &note);
+		err = parse_usdt_note(elf, path, &nhdr, data->d_buf, name_off, desc_off, &note);
 		if (err)
 			goto err_out;
 
@@ -654,30 +658,29 @@ static int collect_usdt_targets(struct usdt_manager *man, Elf *elf, const char *
 			usdt_rel_ip += base_addr - note.base_addr;
 		}
 
-		if (ehdr.e_type == ET_EXEC) {
-			/* When attaching uprobes (which what USDTs basically
-			 * are) kernel expects a relative IP to be specified,
-			 * so if we are attaching to an executable ELF binary
-			 * (i.e., not a shared library), we need to calculate
-			 * proper relative IP based on ELF's load address
-			 */
-			seg = find_elf_seg(segs, seg_cnt, usdt_abs_ip, false /* relative */);
-			if (!seg) {
-				err = -ESRCH;
-				pr_warn("usdt: failed to find ELF program segment for '%s:%s' in '%s' at IP 0x%lx\n",
-					usdt_provider, usdt_name, path, usdt_abs_ip);
-				goto err_out;
-			}
-			if (!seg->is_exec) {
-				err = -ESRCH;
-				pr_warn("usdt: matched ELF binary '%s' segment [0x%lx, 0x%lx) for '%s:%s' at IP 0x%lx is not executable\n",
-					path, seg->start, seg->end, usdt_provider, usdt_name,
-					usdt_abs_ip);
-				goto err_out;
-			}
+		/* When attaching uprobes (which is what USDTs basically are)
+		 * kernel expects file offset to be specified, not a relative
+		 * virtual address, so we need to translate virtual address to
+		 * file offset, for both ET_EXEC and ET_DYN binaries.
+		 */
+		seg = find_elf_seg(segs, seg_cnt, usdt_abs_ip);
+		if (!seg) {
+			err = -ESRCH;
+			pr_warn("usdt: failed to find ELF program segment for '%s:%s' in '%s' at IP 0x%lx\n",
+				usdt_provider, usdt_name, path, usdt_abs_ip);
+			goto err_out;
+		}
+		if (!seg->is_exec) {
+			err = -ESRCH;
+			pr_warn("usdt: matched ELF binary '%s' segment [0x%lx, 0x%lx) for '%s:%s' at IP 0x%lx is not executable\n",
+				path, seg->start, seg->end, usdt_provider, usdt_name,
+				usdt_abs_ip);
+			goto err_out;
+		}
+		/* translate from virtual address to file offset */
+		usdt_rel_ip = usdt_abs_ip - seg->start + seg->offset;
 
-			usdt_rel_ip = usdt_abs_ip - (seg->start - seg->offset);
-		} else if (!man->has_bpf_cookie) { /* ehdr.e_type == ET_DYN */
+		if (ehdr.e_type == ET_DYN && !man->has_bpf_cookie) {
 			/* If we don't have BPF cookie support but need to
 			 * attach to a shared library, we'll need to know and
 			 * record absolute addresses of attach points due to
@@ -697,9 +700,9 @@ static int collect_usdt_targets(struct usdt_manager *man, Elf *elf, const char *
 				goto err_out;
 			}
 
-			/* lib_segs are lazily initialized only if necessary */
-			if (lib_seg_cnt == 0) {
-				err = parse_lib_segs(pid, path, &lib_segs, &lib_seg_cnt);
+			/* vma_segs are lazily initialized only if necessary */
+			if (vma_seg_cnt == 0) {
+				err = parse_vma_segs(pid, path, &vma_segs, &vma_seg_cnt);
 				if (err) {
 					pr_warn("usdt: failed to get memory segments in PID %d for shared library '%s': %d\n",
 						pid, path, err);
@@ -707,7 +710,7 @@ static int collect_usdt_targets(struct usdt_manager *man, Elf *elf, const char *
 				}
 			}
 
-			seg = find_elf_seg(lib_segs, lib_seg_cnt, usdt_rel_ip, true /* relative */);
+			seg = find_vma_seg(vma_segs, vma_seg_cnt, usdt_rel_ip);
 			if (!seg) {
 				err = -ESRCH;
 				pr_warn("usdt: failed to find shared lib memory segment for '%s:%s' in '%s' at relative IP 0x%lx\n",
@@ -715,7 +718,7 @@ static int collect_usdt_targets(struct usdt_manager *man, Elf *elf, const char *
 				goto err_out;
 			}
 
-			usdt_abs_ip = seg->start + (usdt_rel_ip - seg->offset);
+			usdt_abs_ip = seg->start - seg->offset + usdt_rel_ip;
 		}
 
 		pr_debug("usdt: probe for '%s:%s' in %s '%s': addr 0x%lx base 0x%lx (resolved abs_ip 0x%lx rel_ip 0x%lx) args '%s' in segment [0x%lx, 0x%lx) at offset 0x%lx\n",
@@ -723,7 +726,7 @@ static int collect_usdt_targets(struct usdt_manager *man, Elf *elf, const char *
 			 note.loc_addr, note.base_addr, usdt_abs_ip, usdt_rel_ip, note.args,
 			 seg ? seg->start : 0, seg ? seg->end : 0, seg ? seg->offset : 0);
 
-		/* Adjust semaphore address to be a relative offset */
+		/* Adjust semaphore address to be a file offset */
 		if (note.sema_addr) {
 			if (!man->has_sema_refcnt) {
 				pr_warn("usdt: kernel doesn't support USDT semaphore refcounting for '%s:%s' in '%s'\n",
@@ -732,7 +735,7 @@ static int collect_usdt_targets(struct usdt_manager *man, Elf *elf, const char *
 				goto err_out;
 			}
 
-			seg = find_elf_seg(segs, seg_cnt, note.sema_addr, false /* relative */);
+			seg = find_elf_seg(segs, seg_cnt, note.sema_addr);
 			if (!seg) {
 				err = -ESRCH;
 				pr_warn("usdt: failed to find ELF loadable segment with semaphore of '%s:%s' in '%s' at 0x%lx\n",
@@ -747,7 +750,7 @@ static int collect_usdt_targets(struct usdt_manager *man, Elf *elf, const char *
 				goto err_out;
 			}
 
-			usdt_sema_off = note.sema_addr - (seg->start - seg->offset);
+			usdt_sema_off = note.sema_addr - seg->start + seg->offset;
 
 			pr_debug("usdt: sema  for '%s:%s' in %s '%s': addr 0x%lx base 0x%lx (resolved 0x%lx) in segment [0x%lx, 0x%lx] at offset 0x%lx\n",
 				 usdt_provider, usdt_name, ehdr.e_type == ET_EXEC ? "exec" : "lib ",
@@ -770,7 +773,7 @@ static int collect_usdt_targets(struct usdt_manager *man, Elf *elf, const char *
 		target->rel_ip = usdt_rel_ip;
 		target->sema_off = usdt_sema_off;
 
-		/* notes->args references strings from Elf itself, so they can
+		/* notes.args references strings from Elf itself, so they can
 		 * be referenced safely until elf_end() call
 		 */
 		target->spec_str = note.args;
@@ -788,7 +791,7 @@ static int collect_usdt_targets(struct usdt_manager *man, Elf *elf, const char *
 
 err_out:
 	free(segs);
-	free(lib_segs);
+	free(vma_segs);
 	if (err < 0)
 		free(targets);
 	return err;
@@ -1089,8 +1092,8 @@ struct bpf_link *usdt_manager_attach_usdt(struct usdt_manager *man, const struct
 /* Parse out USDT ELF note from '.note.stapsdt' section.
  * Logic inspired by perf's code.
  */
-static int parse_usdt_note(Elf *elf, const char *path, long base_addr,
-			   GElf_Nhdr *nhdr, const char *data, size_t name_off, size_t desc_off,
+static int parse_usdt_note(Elf *elf, const char *path, GElf_Nhdr *nhdr,
+			   const char *data, size_t name_off, size_t desc_off,
 			   struct usdt_note *note)
 {
 	const char *provider, *name, *args;
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 8ad7a733a505..e08e8e34e793 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -172,13 +172,15 @@ $(OUTPUT)/%:%.c
 # do not fail. Static builds leave urandom_read relying on system-wide shared libraries.
 $(OUTPUT)/liburandom_read.so: urandom_read_lib1.c urandom_read_lib2.c
 	$(call msg,LIB,,$@)
-	$(Q)$(CC) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS) -fPIC -shared -o $@
+	$(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS)   \
+		     -fuse-ld=lld -Wl,-znoseparate-code -fPIC -shared -o $@
 
 $(OUTPUT)/urandom_read: urandom_read.c urandom_read_aux.c $(OUTPUT)/liburandom_read.so
 	$(call msg,BINARY,,$@)
-	$(Q)$(CC) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^)  \
-		  liburandom_read.so $(LDLIBS)	       			       \
-		  -Wl,-rpath=. -Wl,--build-id=sha1 -o $@
+	$(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^) \
+		     liburandom_read.so $(LDLIBS)			       \
+		     -fuse-ld=lld -Wl,-znoseparate-code	       		       \
+		     -Wl,-rpath=. -Wl,--build-id=sha1 -o $@
 
 $(OUTPUT)/bpf_testmod.ko: $(VMLINUX_BTF) $(wildcard bpf_testmod/Makefile bpf_testmod/*.[ch])
 	$(call msg,MOD,,$@)
@@ -580,6 +582,8 @@ $(OUTPUT)/bench: $(OUTPUT)/bench.o \
 EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(SCRATCH_DIR) $(HOST_SCRATCH_DIR)	\
 	prog_tests/tests.h map_tests/tests.h verifier/tests.h		\
 	feature bpftool							\
-	$(addprefix $(OUTPUT)/,*.o *.skel.h *.lskel.h *.subskel.h no_alu32 bpf_gcc bpf_testmod.ko)
+	$(addprefix $(OUTPUT)/,*.o *.skel.h *.lskel.h *.subskel.h	\
+			       no_alu32 bpf_gcc bpf_testmod.ko		\
+			       liburandom_read.so)
 
 .PHONY: docs docs-clean
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH bpf-next] libbpf: fix internal USDT address translation logic for shared libraries
  2022-06-16  5:55 [PATCH bpf-next] libbpf: fix internal USDT address translation logic for shared libraries Andrii Nakryiko
@ 2022-06-16 19:59 ` Alexei Starovoitov
  2022-06-16 20:38   ` Andrii Nakryiko
  2022-06-16 23:30 ` patchwork-bot+netdevbpf
  2022-07-20 15:32 ` Jakub Sitnicki
  2 siblings, 1 reply; 7+ messages in thread
From: Alexei Starovoitov @ 2022-06-16 19:59 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team

On Wed, Jun 15, 2022 at 10:56 PM Andrii Nakryiko <andrii@kernel.org> wrote:
>
> Perform the same virtual address to file offset translation that libbpf
> is doing for executable ELF binaries also for shared libraries.
> Currently libbpf is making a simplifying and sometimes wrong assumption
> that for shared libraries relative virtual addresses inside ELF are
> always equal to file offsets.
>
> Unfortunately, this is not always the case with LLVM's lld linker, which
> now by default generates quite more complicated ELF segments layout.
> E.g., for liburandom_read.so from selftests/bpf, here's an excerpt from
> readelf output listing ELF segments (a.k.a. program headers):
>
>   Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
>   PHDR           0x000040 0x0000000000000040 0x0000000000000040 0x0001f8 0x0001f8 R   0x8
>   LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x0005e4 0x0005e4 R   0x1000
>   LOAD           0x0005f0 0x00000000000015f0 0x00000000000015f0 0x000160 0x000160 R E 0x1000
>   LOAD           0x000750 0x0000000000002750 0x0000000000002750 0x000210 0x000210 RW  0x1000
>   LOAD           0x000960 0x0000000000003960 0x0000000000003960 0x000028 0x000029 RW  0x1000
>
> Compare that to what is generated by GNU ld (or LLVM lld's with extra
> -znoseparate-code argument which disables this cleverness in the name of
> file size reduction):
>
>   Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
>   LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x000550 0x000550 R   0x1000
>   LOAD           0x001000 0x0000000000001000 0x0000000000001000 0x000131 0x000131 R E 0x1000
>   LOAD           0x002000 0x0000000000002000 0x0000000000002000 0x0000ac 0x0000ac R   0x1000
>   LOAD           0x002dc0 0x0000000000003dc0 0x0000000000003dc0 0x000262 0x000268 RW  0x1000
>
> You can see from the first example above that for executable (Flg == "R E")
> PT_LOAD segment (LOAD #2), Offset doesn't match VirtAddr columns.
> And it does in the second case (GNU ld output).
>
> This is important because all the addresses, including USDT specs,
> operate in a virtual address space, while kernel is expecting file
> offsets when performing uprobe attach. So such mismatches have to be
> properly taken care of and compensated by libbpf, which is what this
> patch is fixing.
>
> Also patch clarifies few function and variable names, as well as updates
> comments to reflect this important distinction (virtaddr vs file offset)
> and to ephasize that shared libraries are not all that different from
> executables in this regard.
>
> This patch also changes selftests/bpf Makefile to force urand_read and
> liburand_read.so to be built with Clang and LLVM's lld (and explicitly
> request this ELF file size optimization through -znoseparate-code linker
> parameter) to validate libbpf logic and ensure regressions don't happen
> in the future. I've bundled these selftests changes together with libbpf
> changes to keep the above description tied with both libbpf and
> selftests changes.
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
>  tools/lib/bpf/usdt.c                 | 123 ++++++++++++++-------------

What should be the Fixes tag here?
Back to the beginning of this usdt.c file?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH bpf-next] libbpf: fix internal USDT address translation logic for shared libraries
  2022-06-16 19:59 ` Alexei Starovoitov
@ 2022-06-16 20:38   ` Andrii Nakryiko
  0 siblings, 0 replies; 7+ messages in thread
From: Andrii Nakryiko @ 2022-06-16 20:38 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team

On Thu, Jun 16, 2022 at 1:00 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Jun 15, 2022 at 10:56 PM Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > Perform the same virtual address to file offset translation that libbpf
> > is doing for executable ELF binaries also for shared libraries.
> > Currently libbpf is making a simplifying and sometimes wrong assumption
> > that for shared libraries relative virtual addresses inside ELF are
> > always equal to file offsets.
> >
> > Unfortunately, this is not always the case with LLVM's lld linker, which
> > now by default generates quite more complicated ELF segments layout.
> > E.g., for liburandom_read.so from selftests/bpf, here's an excerpt from
> > readelf output listing ELF segments (a.k.a. program headers):
> >
> >   Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
> >   PHDR           0x000040 0x0000000000000040 0x0000000000000040 0x0001f8 0x0001f8 R   0x8
> >   LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x0005e4 0x0005e4 R   0x1000
> >   LOAD           0x0005f0 0x00000000000015f0 0x00000000000015f0 0x000160 0x000160 R E 0x1000
> >   LOAD           0x000750 0x0000000000002750 0x0000000000002750 0x000210 0x000210 RW  0x1000
> >   LOAD           0x000960 0x0000000000003960 0x0000000000003960 0x000028 0x000029 RW  0x1000
> >
> > Compare that to what is generated by GNU ld (or LLVM lld's with extra
> > -znoseparate-code argument which disables this cleverness in the name of
> > file size reduction):
> >
> >   Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
> >   LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x000550 0x000550 R   0x1000
> >   LOAD           0x001000 0x0000000000001000 0x0000000000001000 0x000131 0x000131 R E 0x1000
> >   LOAD           0x002000 0x0000000000002000 0x0000000000002000 0x0000ac 0x0000ac R   0x1000
> >   LOAD           0x002dc0 0x0000000000003dc0 0x0000000000003dc0 0x000262 0x000268 RW  0x1000
> >
> > You can see from the first example above that for executable (Flg == "R E")
> > PT_LOAD segment (LOAD #2), Offset doesn't match VirtAddr columns.
> > And it does in the second case (GNU ld output).
> >
> > This is important because all the addresses, including USDT specs,
> > operate in a virtual address space, while kernel is expecting file
> > offsets when performing uprobe attach. So such mismatches have to be
> > properly taken care of and compensated by libbpf, which is what this
> > patch is fixing.
> >
> > Also patch clarifies few function and variable names, as well as updates
> > comments to reflect this important distinction (virtaddr vs file offset)
> > and to ephasize that shared libraries are not all that different from
> > executables in this regard.
> >
> > This patch also changes selftests/bpf Makefile to force urand_read and
> > liburand_read.so to be built with Clang and LLVM's lld (and explicitly
> > request this ELF file size optimization through -znoseparate-code linker
> > parameter) to validate libbpf logic and ensure regressions don't happen
> > in the future. I've bundled these selftests changes together with libbpf
> > changes to keep the above description tied with both libbpf and
> > selftests changes.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> >  tools/lib/bpf/usdt.c                 | 123 ++++++++++++++-------------
>
> What should be the Fixes tag here?
> Back to the beginning of this usdt.c file?

Oh, sorry, forgot about the Fixes tag. Yeah, it's pretty much never
worked properly for such fancy ELF shared libraries.

Fixes: 74cc6311cec9 ("libbpf: Add USDT notes parsing and resolution logic")

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH bpf-next] libbpf: fix internal USDT address translation logic for shared libraries
  2022-06-16  5:55 [PATCH bpf-next] libbpf: fix internal USDT address translation logic for shared libraries Andrii Nakryiko
  2022-06-16 19:59 ` Alexei Starovoitov
@ 2022-06-16 23:30 ` patchwork-bot+netdevbpf
  2022-07-20 15:32 ` Jakub Sitnicki
  2 siblings, 0 replies; 7+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-06-16 23:30 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, daniel, kernel-team

Hello:

This patch was applied to bpf/bpf-next.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Wed, 15 Jun 2022 22:55:43 -0700 you wrote:
> Perform the same virtual address to file offset translation that libbpf
> is doing for executable ELF binaries also for shared libraries.
> Currently libbpf is making a simplifying and sometimes wrong assumption
> that for shared libraries relative virtual addresses inside ELF are
> always equal to file offsets.
> 
> Unfortunately, this is not always the case with LLVM's lld linker, which
> now by default generates quite more complicated ELF segments layout.
> E.g., for liburandom_read.so from selftests/bpf, here's an excerpt from
> readelf output listing ELF segments (a.k.a. program headers):
> 
> [...]

Here is the summary with links:
  - [bpf-next] libbpf: fix internal USDT address translation logic for shared libraries
    https://git.kernel.org/bpf/bpf-next/c/3e6fe5ce4d48

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH bpf-next] libbpf: fix internal USDT address translation logic for shared libraries
  2022-06-16  5:55 [PATCH bpf-next] libbpf: fix internal USDT address translation logic for shared libraries Andrii Nakryiko
  2022-06-16 19:59 ` Alexei Starovoitov
  2022-06-16 23:30 ` patchwork-bot+netdevbpf
@ 2022-07-20 15:32 ` Jakub Sitnicki
  2022-07-20 18:32   ` Jakub Sitnicki
  2 siblings, 1 reply; 7+ messages in thread
From: Jakub Sitnicki @ 2022-07-20 15:32 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, daniel, kernel-team, kernel-team

Hi Andrii,

On Wed, Jun 15, 2022 at 10:55 PM -07, Andrii Nakryiko wrote:

[...]

> This patch also changes selftests/bpf Makefile to force urand_read and
> liburand_read.so to be built with Clang and LLVM's lld (and explicitly
> request this ELF file size optimization through -znoseparate-code linker
> parameter) to validate libbpf logic and ensure regressions don't happen
> in the future. I've bundled these selftests changes together with libbpf
> changes to keep the above description tied with both libbpf and
> selftests changes.
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---

[...]

> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index 8ad7a733a505..e08e8e34e793 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -172,13 +172,15 @@ $(OUTPUT)/%:%.c
>  # do not fail. Static builds leave urandom_read relying on system-wide shared libraries.
>  $(OUTPUT)/liburandom_read.so: urandom_read_lib1.c urandom_read_lib2.c
>  	$(call msg,LIB,,$@)
> -	$(Q)$(CC) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS) -fPIC -shared -o $@
> +	$(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS)   \
> +		     -fuse-ld=lld -Wl,-znoseparate-code -fPIC -shared -o $@
>  
>  $(OUTPUT)/urandom_read: urandom_read.c urandom_read_aux.c $(OUTPUT)/liburandom_read.so
>  	$(call msg,BINARY,,$@)
> -	$(Q)$(CC) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^)  \
> -		  liburandom_read.so $(LDLIBS)	       			       \
> -		  -Wl,-rpath=. -Wl,--build-id=sha1 -o $@
> +	$(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^) \
> +		     liburandom_read.so $(LDLIBS)			       \
> +		     -fuse-ld=lld -Wl,-znoseparate-code	       		       \
> +		     -Wl,-rpath=. -Wl,--build-id=sha1 -o $@

[...]

Not sure if this was considered - adding a dependency on Clang for the
target platform makes cross-compiling bpf selftests much harder than it
was.

Maybe we could use $(CLANG) only when not cross-compiling, and execute
$(CC) like before otherwise?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH bpf-next] libbpf: fix internal USDT address translation logic for shared libraries
  2022-07-20 15:32 ` Jakub Sitnicki
@ 2022-07-20 18:32   ` Jakub Sitnicki
  2022-07-28 22:04     ` Andrii Nakryiko
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Sitnicki @ 2022-07-20 18:32 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, daniel, kernel-team, kernel-team

On Wed, Jul 20, 2022 at 05:32 PM +02, Jakub Sitnicki wrote:

[...]

>> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
>> index 8ad7a733a505..e08e8e34e793 100644
>> --- a/tools/testing/selftests/bpf/Makefile
>> +++ b/tools/testing/selftests/bpf/Makefile
>> @@ -172,13 +172,15 @@ $(OUTPUT)/%:%.c
>>  # do not fail. Static builds leave urandom_read relying on system-wide shared libraries.
>>  $(OUTPUT)/liburandom_read.so: urandom_read_lib1.c urandom_read_lib2.c
>>  	$(call msg,LIB,,$@)
>> -	$(Q)$(CC) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS) -fPIC -shared -o $@
>> +	$(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS)   \
>> +		     -fuse-ld=lld -Wl,-znoseparate-code -fPIC -shared -o $@
>>  
>>  $(OUTPUT)/urandom_read: urandom_read.c urandom_read_aux.c $(OUTPUT)/liburandom_read.so
>>  	$(call msg,BINARY,,$@)
>> -	$(Q)$(CC) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^)  \
>> -		  liburandom_read.so $(LDLIBS)	       			       \
>> -		  -Wl,-rpath=. -Wl,--build-id=sha1 -o $@
>> +	$(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^) \
>> +		     liburandom_read.so $(LDLIBS)			       \
>> +		     -fuse-ld=lld -Wl,-znoseparate-code	       		       \
>> +		     -Wl,-rpath=. -Wl,--build-id=sha1 -o $@
>
> [...]
>
> Not sure if this was considered - adding a dependency on Clang for the
> target platform makes cross-compiling bpf selftests much harder than it
> was.
>
> Maybe we could use $(CLANG) only when not cross-compiling, and execute
> $(CC) like before otherwise?

OTOH, there seems to be nothing Clang specific about the build
recipe. We just want t use LLVM lld. GCC accepts -fuse-ld as well (bfd
vs lld).

Perhaps we could partially revert to something as below? It "fixes" the
cross-compilation build of bpf selftests for arm64 for me.

WDYT?

--8<--
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 8d59ec7f4c2d..541e2b0de27a 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -170,24 +170,24 @@ $(OUTPUT)/%:%.c
 
 # LLVM's ld.lld doesn't support all the architectures, so use it only on x86
 ifeq ($(SRCARCH),x86)
-LLD := lld
+ALT_LD := lld
 else
-LLD := ld
+ALT_LD := bfd
 endif
 
 # Filter out -static for liburandom_read.so and its dependent targets so that static builds
 # do not fail. Static builds leave urandom_read relying on system-wide shared libraries.
 $(OUTPUT)/liburandom_read.so: urandom_read_lib1.c urandom_read_lib2.c
 	$(call msg,LIB,,$@)
-	$(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS)   \
-		     -fuse-ld=$(LLD) -Wl,-znoseparate-code -fPIC -shared -o $@
+	$(Q)$(CC) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS)	\
+	          -fuse-ld=$(ALT_LD) -Wl,-znoseparate-code -fPIC -shared -o $@
 
 $(OUTPUT)/urandom_read: urandom_read.c urandom_read_aux.c $(OUTPUT)/liburandom_read.so
 	$(call msg,BINARY,,$@)
-	$(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^) \
-		     liburandom_read.so $(LDLIBS)			       \
-		     -fuse-ld=$(LLD) -Wl,-znoseparate-code		       \
-		     -Wl,-rpath=. -Wl,--build-id=sha1 -o $@
+	$(Q)$(CC) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^)	\
+	          liburandom_read.so $(LDLIBS)					\
+	          -fuse-ld=$(ALT_LD) -Wl,-znoseparate-code			\
+	          -Wl,-rpath=. -Wl,--build-id=sha1 -o $@
 
 $(OUTPUT)/bpf_testmod.ko: $(VMLINUX_BTF) $(wildcard bpf_testmod/Makefile bpf_testmod/*.[ch])
 	$(call msg,MOD,,$@)

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH bpf-next] libbpf: fix internal USDT address translation logic for shared libraries
  2022-07-20 18:32   ` Jakub Sitnicki
@ 2022-07-28 22:04     ` Andrii Nakryiko
  0 siblings, 0 replies; 7+ messages in thread
From: Andrii Nakryiko @ 2022-07-28 22:04 UTC (permalink / raw)
  To: Jakub Sitnicki
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, kernel-team

On Wed, Jul 20, 2022 at 11:37 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Wed, Jul 20, 2022 at 05:32 PM +02, Jakub Sitnicki wrote:
>
> [...]
>
> >> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> >> index 8ad7a733a505..e08e8e34e793 100644
> >> --- a/tools/testing/selftests/bpf/Makefile
> >> +++ b/tools/testing/selftests/bpf/Makefile
> >> @@ -172,13 +172,15 @@ $(OUTPUT)/%:%.c
> >>  # do not fail. Static builds leave urandom_read relying on system-wide shared libraries.
> >>  $(OUTPUT)/liburandom_read.so: urandom_read_lib1.c urandom_read_lib2.c
> >>      $(call msg,LIB,,$@)
> >> -    $(Q)$(CC) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS) -fPIC -shared -o $@
> >> +    $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS)   \
> >> +                 -fuse-ld=lld -Wl,-znoseparate-code -fPIC -shared -o $@
> >>
> >>  $(OUTPUT)/urandom_read: urandom_read.c urandom_read_aux.c $(OUTPUT)/liburandom_read.so
> >>      $(call msg,BINARY,,$@)
> >> -    $(Q)$(CC) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^)  \
> >> -              liburandom_read.so $(LDLIBS)                                 \
> >> -              -Wl,-rpath=. -Wl,--build-id=sha1 -o $@
> >> +    $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^) \
> >> +                 liburandom_read.so $(LDLIBS)                              \
> >> +                 -fuse-ld=lld -Wl,-znoseparate-code                        \
> >> +                 -Wl,-rpath=. -Wl,--build-id=sha1 -o $@
> >
> > [...]
> >
> > Not sure if this was considered - adding a dependency on Clang for the
> > target platform makes cross-compiling bpf selftests much harder than it
> > was.
> >
> > Maybe we could use $(CLANG) only when not cross-compiling, and execute
> > $(CC) like before otherwise?
>
> OTOH, there seems to be nothing Clang specific about the build
> recipe. We just want t use LLVM lld. GCC accepts -fuse-ld as well (bfd
> vs lld).
>
> Perhaps we could partially revert to something as below? It "fixes" the
> cross-compilation build of bpf selftests for arm64 for me.
>
> WDYT?
>
> --8<--
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index 8d59ec7f4c2d..541e2b0de27a 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -170,24 +170,24 @@ $(OUTPUT)/%:%.c
>
>  # LLVM's ld.lld doesn't support all the architectures, so use it only on x86
>  ifeq ($(SRCARCH),x86)
> -LLD := lld
> +ALT_LD := lld
>  else
> -LLD := ld
> +ALT_LD := bfd
>  endif
>
>  # Filter out -static for liburandom_read.so and its dependent targets so that static builds
>  # do not fail. Static builds leave urandom_read relying on system-wide shared libraries.
>  $(OUTPUT)/liburandom_read.so: urandom_read_lib1.c urandom_read_lib2.c
>         $(call msg,LIB,,$@)
> -       $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS)   \
> -                    -fuse-ld=$(LLD) -Wl,-znoseparate-code -fPIC -shared -o $@
> +       $(Q)$(CC) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $^ $(LDLIBS)       \
> +                 -fuse-ld=$(ALT_LD) -Wl,-znoseparate-code -fPIC -shared -o $@
>
>  $(OUTPUT)/urandom_read: urandom_read.c urandom_read_aux.c $(OUTPUT)/liburandom_read.so
>         $(call msg,BINARY,,$@)
> -       $(Q)$(CLANG) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^) \
> -                    liburandom_read.so $(LDLIBS)                              \
> -                    -fuse-ld=$(LLD) -Wl,-znoseparate-code                     \
> -                    -Wl,-rpath=. -Wl,--build-id=sha1 -o $@
> +       $(Q)$(CC) $(filter-out -static,$(CFLAGS) $(LDFLAGS)) $(filter %.c,$^)   \
> +                 liburandom_read.so $(LDLIBS)                                  \
> +                 -fuse-ld=$(ALT_LD) -Wl,-znoseparate-code                      \
> +                 -Wl,-rpath=. -Wl,--build-id=sha1 -o $@
>

I tried building with CC locally and I still see ELF layout that would
exercise the logic that was fixed with my original patchset:

$ readelf -l liburandom_read.so | grep LOAD -A1
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x00000000000005ec 0x00000000000005ec  R      0x1000
  LOAD           0x00000000000005f0 0x00000000000015f0 0x00000000000015f0
                 0x0000000000000130 0x0000000000000130  R E    0x1000
  LOAD           0x0000000000000720 0x0000000000002720 0x0000000000002720
                 0x0000000000000200 0x0000000000000200  RW     0x1000
  LOAD           0x0000000000000920 0x0000000000003920 0x0000000000003920
                 0x0000000000000028 0x0000000000000029  RW     0x1000

So sure, sounds good to me, please send a patch with a change.



>  $(OUTPUT)/bpf_testmod.ko: $(VMLINUX_BTF) $(wildcard bpf_testmod/Makefile bpf_testmod/*.[ch])
>         $(call msg,MOD,,$@)

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-07-28 22:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-16  5:55 [PATCH bpf-next] libbpf: fix internal USDT address translation logic for shared libraries Andrii Nakryiko
2022-06-16 19:59 ` Alexei Starovoitov
2022-06-16 20:38   ` Andrii Nakryiko
2022-06-16 23:30 ` patchwork-bot+netdevbpf
2022-07-20 15:32 ` Jakub Sitnicki
2022-07-20 18:32   ` Jakub Sitnicki
2022-07-28 22:04     ` Andrii Nakryiko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.