All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 bpf-next 0/5] libbpf: name-based u[ret]probe attach
@ 2022-03-11 12:10 Alan Maguire
  2022-03-11 12:10 ` [PATCH v4 bpf-next 1/5] libbpf: bpf_program__attach_uprobe_opts() should determine paths for programs/libraries where possible Alan Maguire
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Alan Maguire @ 2022-03-11 12:10 UTC (permalink / raw)
  To: ast, daniel, andrii
  Cc: kafai, songliubraving, yhs, john.fastabend, kpsingh, toke,
	sunyucong, netdev, bpf, Alan Maguire

This patch series focuses on supporting name-based attach - similar
to that supported for kprobes - for uprobe BPF programs.

Currently attach for such probes is done by determining the offset
manually, so the aim is to try and mimic the simplicity of kprobe
attach, making use of uprobe opts to specify a name string.
Patch 1 supports expansion of the binary_path argument used for
bpf_program__attach_uprobe_opts(), allowing it to determine paths
for programs and shared objects automatically, allowing for
specification of "libc.so.6" rather than the full path
"/usr/lib64/libc.so.6".

Patch 2 adds the "func_name" option to allow uprobe attach by
name; the mechanics are described there.

Having name-based support allows us to support auto-attach for
uprobes; patch 3 adds auto-attach support while attempting
to handle backwards-compatibility issues that arise.  The format
supported is

u[ret]probe/path2binary:[raw_offset|function[+offset]]

For example, to attach to libc malloc:

SEC("uprobe//usr/lib64/libc.so.6:malloc")

..or, making use of the path computation mechanisms introduced in patch 1

SEC("uprobe/libc.so.6:malloc")

Finally patch 4 add tests to the attach_probe selftests covering
attach by name, with patch 5 covering skeleton auto-attach.

Changes since v3 [1]:
- reworked variable naming to fit better with libbpf conventions
  (Andrii, patch 2)
- use quoted binary path in log messages (Andrii, patch 2)
- added path determination mechanisms using LD_LIBRARY_PATH/PATH and
  standard locations (patch 1, Andrii)
- changed section lookup to be type+name (if name is specified) to
  simplify use cases (patch 2, Andrii)
- fixed .plt lookup scheme to match symbol table entries with .plt
  index via the .rela.plt table; also fix the incorrect assumption
  that the code in the .plt that does library linking is the same
  size as .plt entries (it just happens to be on x86_64)
- aligned with pluggable section support such that uprobe SEC() names
  that do not conform to auto-attach format do not cause skeleton load
  failure (patch 3, Andrii)
- no longer need to look up absolute path to libraries used by test_progs
  since we have mechanism to determine path automatically
- replaced CHECK()s with ASSERT*()s for attach_probe test (Andrii, patch 4)
- added auto-attach selftests also (Andrii, patch 5)
Changes since RFC [2]:
- used "long" for addresses instead of ssize_t (Andrii, patch 1).
- used gelf_ interfaces to avoid assumptions about 64-bit
  binaries (Andrii, patch 1)
- clarified string matching in symbol table lookups
  (Andrii, patch 1)
- added support for specification of shared object functions
  in a non-shared object binary.  This approach instruments
  the Procedure Linking Table (PLT) - malloc@PLT.
- changed logic in symbol search to check dynamic symbol table
  first, then fall back to symbol table (Andrii, patch 1).
- modified auto-attach string to require "/" separator prior
  to path prefix i.e. uprobe//path/to/binary (Andrii, patch 2)
- modified auto-attach string to use ':' separator (Andrii,
  patch 2)
- modified auto-attach to support raw offset (Andrii, patch 2)
- modified skeleton attach to interpret -ESRCH errors as
  a non-fatal "unable to auto-attach" (Andrii suggested
  -EOPNOTSUPP but my concern was it might collide with other
  instances where that value is returned and reflects a
  failure to attach a to-be-expected attachment rather than
  skip a program that does not present an auto-attachable
  section name. Admittedly -EOPNOTSUPP seems a more natural
  value here).
- moved library path retrieval code to trace_helpers (Andrii,
  patch 3)

[1] https://lore.kernel.org/bpf/1643645554-28723-1-git-send-email-alan.maguire@oracle.com/
[2] https://lore.kernel.org/bpf/1642678950-19584-1-git-send-email-alan.maguire@oracle.com/

Alan Maguire (5):
  libbpf: bpf_program__attach_uprobe_opts() should determine paths for
    programs/libraries where possible
  libbpf: support function name-based attach uprobes
  libbpf: add auto-attach for uprobes based on section name
  selftests/bpf: add tests for u[ret]probe attach by name
  selftests/bpf: add tests for uprobe auto-attach via skeleton

 tools/lib/bpf/libbpf.c                             | 429 ++++++++++++++++++++-
 tools/lib/bpf/libbpf.h                             |  10 +-
 .../selftests/bpf/prog_tests/attach_probe.c        |  89 ++++-
 .../selftests/bpf/prog_tests/uprobe_autoattach.c   |  48 +++
 .../selftests/bpf/progs/test_attach_probe.c        |  37 ++
 .../selftests/bpf/progs/test_uprobe_autoattach.c   |  69 ++++
 6 files changed, 665 insertions(+), 17 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_uprobe_autoattach.c

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v4 bpf-next 1/5] libbpf: bpf_program__attach_uprobe_opts() should determine paths for programs/libraries where possible
  2022-03-11 12:10 [PATCH v4 bpf-next 0/5] libbpf: name-based u[ret]probe attach Alan Maguire
@ 2022-03-11 12:10 ` Alan Maguire
  2022-03-16  4:33   ` Andrii Nakryiko
  2022-03-11 12:10 ` [PATCH v4 bpf-next 2/5] libbpf: support function name-based attach uprobes Alan Maguire
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Alan Maguire @ 2022-03-11 12:10 UTC (permalink / raw)
  To: ast, daniel, andrii
  Cc: kafai, songliubraving, yhs, john.fastabend, kpsingh, toke,
	sunyucong, netdev, bpf, Alan Maguire

bpf_program__attach_uprobe_opts() requires a binary_path argument
specifying binary to instrument.  Supporting simply specifying
"libc.so.6" or "foo" should be possible too.

Library search checks LD_LIBRARY_PATH, then /usr/lib64, /usr/lib.
This allows users to run BPF programs prefixed with
LD_LIBRARY_PATH=/path2/lib while still searching standard locations.
Similarly for non .so files, we check PATH and /usr/bin, /usr/sbin.

Path determination will be useful for auto-attach of BPF uprobe programs
using SEC() definition.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 tools/lib/bpf/libbpf.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 43161fd..b577577 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -10320,6 +10320,45 @@ static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
 	return pfd;
 }
 
+/* Get full path to program/shared library. */
+static int resolve_full_path(const char *file, char *result, size_t result_sz)
+{
+	char *search_paths[2];
+	int i;
+
+	if (strstr(file, ".so")) {
+		search_paths[0] = getenv("LD_LIBRARY_PATH");
+		search_paths[1] = (char *)"/usr/lib64:/usr/lib";
+	} else {
+		search_paths[0] = getenv("PATH");
+		search_paths[1] = (char *)"/usr/bin:/usr/sbin";
+	}
+
+	for (i = 0; i < ARRAY_SIZE(search_paths); i++) {
+		char *s, *search_path, *currpath, *saveptr = NULL;
+
+		if (!search_paths[i])
+			continue;
+		search_path = strdup(search_paths[i]);
+		s = search_path;
+		while ((currpath = strtok_r(s, ":", &saveptr)) != NULL) {
+			struct stat sb;
+
+			s = NULL;
+			snprintf(result, result_sz, "%s/%s", currpath, file);
+			/* ensure it is an executable file/link */
+			if (stat(result, &sb) == 0 && (sb.st_mode & (S_IFREG | S_IFLNK)) &&
+			    (sb.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH))) {
+				pr_debug("resolved '%s' to '%s'\n", file, result);
+				free(search_path);
+				return 0;
+			}
+		}
+		free(search_path);
+	}
+	return -ENOENT;
+}
+
 LIBBPF_API struct bpf_link *
 bpf_program__attach_uprobe_opts(const struct bpf_program *prog, pid_t pid,
 				const char *binary_path, size_t func_offset,
@@ -10327,6 +10366,7 @@ static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
 {
 	DECLARE_LIBBPF_OPTS(bpf_perf_event_opts, pe_opts);
 	char errmsg[STRERR_BUFSIZE], *legacy_probe = NULL;
+	char full_binary_path[PATH_MAX];
 	struct bpf_link *link;
 	size_t ref_ctr_off;
 	int pfd, err;
@@ -10338,13 +10378,22 @@ static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
 	retprobe = OPTS_GET(opts, retprobe, false);
 	ref_ctr_off = OPTS_GET(opts, ref_ctr_offset, 0);
 	pe_opts.bpf_cookie = OPTS_GET(opts, bpf_cookie, 0);
+	if (binary_path && !strchr(binary_path, '/')) {
+		err = resolve_full_path(binary_path, full_binary_path,
+					sizeof(full_binary_path));
+		if (err) {
+			pr_warn("could not find full path for %s\n", binary_path);
+			return libbpf_err_ptr(err);
+		}
+		binary_path = full_binary_path;
+	}
 
 	legacy = determine_uprobe_perf_type() < 0;
 	if (!legacy) {
 		pfd = perf_event_open_probe(true /* uprobe */, retprobe, binary_path,
 					    func_offset, pid, ref_ctr_off);
 	} else {
-		char probe_name[512];
+		char probe_name[PATH_MAX + 64];
 
 		if (ref_ctr_off)
 			return libbpf_err_ptr(-EINVAL);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 bpf-next 2/5] libbpf: support function name-based attach uprobes
  2022-03-11 12:10 [PATCH v4 bpf-next 0/5] libbpf: name-based u[ret]probe attach Alan Maguire
  2022-03-11 12:10 ` [PATCH v4 bpf-next 1/5] libbpf: bpf_program__attach_uprobe_opts() should determine paths for programs/libraries where possible Alan Maguire
@ 2022-03-11 12:10 ` Alan Maguire
  2022-03-16  4:33   ` Andrii Nakryiko
  2022-03-11 12:10 ` [PATCH v4 bpf-next 3/5] libbpf: add auto-attach for uprobes based on section name Alan Maguire
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Alan Maguire @ 2022-03-11 12:10 UTC (permalink / raw)
  To: ast, daniel, andrii
  Cc: kafai, songliubraving, yhs, john.fastabend, kpsingh, toke,
	sunyucong, netdev, bpf, Alan Maguire

kprobe attach is name-based, using lookups of kallsyms to translate
a function name to an address.  Currently uprobe attach is done
via an offset value as described in [1].  Extend uprobe opts
for attach to include a function name which can then be converted
into a uprobe-friendly offset.  The calcualation is done in
several steps:

1. First, determine the symbol address using libelf; this gives us
   the offset as reported by objdump
2. If the function is a shared library function - and the binary
   provided is a shared library - no further work is required;
   the address found is the required address
3. If the function is a shared library function in a program
   (as opposed to a shared library), the Procedure Linking Table
   (PLT) table address is found (it is indexed via the dynamic
   symbol table index).  This allows us to instrument a call
   to a shared library function locally in the calling binary,
   reducing overhead versus having a breakpoint in global lib.
4. Finally, if the function is local, subtract the base address
   associated with the object, retrieved from ELF program headers.

The resultant value is then added to the func_offset value passed
in to specify the uprobe attach address.  So specifying a func_offset
of 0 along with a function name "printf" will attach to printf entry.

The modes of operation supported are then

1. to attach to a local function in a binary; function "foo1" in
   "/usr/bin/foo"
2. to attach to a shared library function in a binary;
   function "malloc" in "/usr/bin/foo"
3. to attach to a shared library function in a shared library -
   function "malloc" in libc.

[1] https://www.kernel.org/doc/html/latest/trace/uprobetracer.html

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 tools/lib/bpf/libbpf.c | 310 +++++++++++++++++++++++++++++++++++++++++++++++++
 tools/lib/bpf/libbpf.h |  10 +-
 2 files changed, 319 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index b577577..2b50b01 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -10320,6 +10320,302 @@ static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
 	return pfd;
 }
 
+/* uprobes deal in relative offsets; subtract the base address associated with
+ * the mapped binary.  See Documentation/trace/uprobetracer.rst for more
+ * details.
+ */
+static long elf_find_relative_offset(Elf *elf, long addr)
+{
+	size_t n;
+	int i;
+
+	if (elf_getphdrnum(elf, &n)) {
+		pr_warn("elf: failed to find program headers: %s\n",
+			elf_errmsg(-1));
+		return -ENOENT;
+	}
+
+	for (i = 0; i < n; i++) {
+		int seg_start, seg_end, seg_offset;
+		GElf_Phdr phdr;
+
+		if (!gelf_getphdr(elf, i, &phdr)) {
+			pr_warn("elf: failed to get program header %d: %s\n",
+				i, elf_errmsg(-1));
+			return -ENOENT;
+		}
+		if (phdr.p_type != PT_LOAD || !(phdr.p_flags & PF_X))
+			continue;
+
+		seg_start = phdr.p_vaddr;
+		seg_end = seg_start + phdr.p_memsz;
+		seg_offset = phdr.p_offset;
+		if (addr >= seg_start && addr < seg_end)
+			return addr - seg_start + seg_offset;
+	}
+	pr_warn("elf: failed to find prog header containing 0x%lx\n", addr);
+	return -ENOENT;
+}
+
+/* Return next ELF section of sh_type after scn, or first of that type
+ * if scn is NULL, and if name is non-NULL, both name and type must match.
+ */
+static Elf_Scn *elf_find_next_scn_by_type_name(Elf *elf, int sh_type, const char *name,
+					       Elf_Scn *scn)
+{
+	size_t shstrndx;
+
+	if (name && elf_getshdrstrndx(elf, &shstrndx)) {
+		pr_debug("elf: failed to get section names section index: %s\n",
+			 elf_errmsg(-1));
+		return NULL;
+	}
+
+	while ((scn = elf_nextscn(elf, scn)) != NULL) {
+		const char *sname;
+		GElf_Shdr sh;
+
+		if (!gelf_getshdr(scn, &sh))
+			continue;
+		if (sh.sh_type != sh_type)
+			continue;
+
+		if (!name)
+			return scn;
+
+		sname = elf_strptr(elf, shstrndx, sh.sh_name);
+		if (sname && strcmp(sname, name) == 0)
+			return scn;
+	}
+	return NULL;
+}
+
+/* For Position-Independent Code-based libraries, a table of trampolines (Procedure Linking Table)
+ * is used to support resolution of symbol names at linking time.  The goal here is to find the
+ * offset associated with the jump to the actual library function, since consumers of the
+ * library function within the binary will jump to the plt table entry (malloc@plt) first.
+ * If we can instrument that .plt entry locally in the specific binary (rather than instrumenting
+ * glibc say), overheads are greatly reduced.
+ *
+ * However, we need to find the index into the .plt table.  There are two parts to this.
+ *
+ * First, we have to find the index that the .plt entry (malloc@plt) lives at.  To do that,
+ * we use the .rela.plt table, which consists of entries in the same order as the .plt,
+ * but crucially each entry contains the symbol index from the symbol table.  This allows
+ * us to match the index of the .plt entry to the desired library function.
+ *
+ * Second, we need to find the address associated with that indexed .plt entry.
+ * The .plt section provides section information about the overall section size and the
+ * size of each .plt entry.  However prior to the entries themselves, there is code
+ * that carries out the dynamic linking, and this code may not be the same size as the
+ * .plt entry size (it happens to be on x86_64, but not on aarch64).  So we have to
+ * determine that initial code size so we can index into the .plt entries that follow it.
+ * To do this, we simply subtract the .plt table size (nr_plt_entries * entry_size)
+ * from the overall section size, and then use that offset as the base into the array
+ * of .plt entries.
+ */
+static ssize_t elf_find_plt_offset(Elf *elf, size_t sym_idx)
+{
+	long plt_entry_sz, plt_sz, plt_base;
+	Elf_Scn *rela_plt_scn, *plt_scn;
+	size_t plt_idx, nr_plt_entries;
+	Elf_Data *rela_data;
+	GElf_Shdr sh;
+
+	/* First get .plt index and table size via .rela.plt */
+	rela_plt_scn = elf_find_next_scn_by_type_name(elf, SHT_RELA, ".rela.plt", NULL);
+	if (!rela_plt_scn) {
+		pr_debug("elf: failed to find .rela.plt section\n");
+		return -LIBBPF_ERRNO__FORMAT;
+	}
+	if (!gelf_getshdr(rela_plt_scn, &sh)) {
+		pr_debug("elf: failed to get shdr for .rela.plt section\n");
+		return -LIBBPF_ERRNO__FORMAT;
+	}
+	rela_data = elf_getdata(rela_plt_scn, 0);
+	if (!rela_data) {
+		pr_debug("elf: failed to get data for .rela.plt section\n");
+		return -LIBBPF_ERRNO__FORMAT;
+	}
+	nr_plt_entries = sh.sh_size / sh.sh_entsize;
+	for (plt_idx = 0; plt_idx < nr_plt_entries; plt_idx++) {
+		GElf_Rela rela;
+
+		if (!gelf_getrela(rela_data, plt_idx, &rela))
+			continue;
+		if (GELF_R_SYM(rela.r_info) == sym_idx)
+			break;
+	}
+	if (plt_idx == nr_plt_entries) {
+		pr_debug("elf: could not find sym index %ld in .rela.plt section\n",
+			 sym_idx);
+		return -LIBBPF_ERRNO__FORMAT;
+	}
+
+	/* Now determine base of .plt table and calculate where entry for function is */
+	plt_scn = elf_find_next_scn_by_type_name(elf, SHT_PROGBITS, ".plt", NULL);
+	if (!plt_scn || !gelf_getshdr(plt_scn, &sh)) {
+		pr_debug("elf: failed to find .plt section\n");
+		return -LIBBPF_ERRNO__FORMAT;
+	}
+	plt_base = sh.sh_addr;
+	plt_entry_sz = sh.sh_entsize;
+	plt_sz = sh.sh_size;
+	if (nr_plt_entries * plt_entry_sz >= plt_sz) {
+		pr_debug("elf: failed to calculate base .plt entry size with %ld plt entries\n",
+			 nr_plt_entries);
+		return -LIBBPF_ERRNO__FORMAT;
+	}
+	plt_base += plt_sz - (nr_plt_entries * plt_entry_sz);
+
+	return plt_base + (plt_idx * plt_entry_sz);
+}
+
+/* Find offset of function name in object specified by path.  "name" matches
+ * symbol name or name@@LIB for library functions.
+ */
+static long elf_find_func_offset(const char *binary_path, const char *name)
+{
+	int fd, i, sh_types[2] = { SHT_DYNSYM, SHT_SYMTAB };
+	bool is_shared_lib, is_name_qualified;
+	size_t name_len, sym_idx = 0;
+	char errmsg[STRERR_BUFSIZE];
+	long ret = -ENOENT;
+	GElf_Ehdr ehdr;
+	Elf *elf;
+
+	fd = open(binary_path, O_RDONLY | O_CLOEXEC);
+	if (fd < 0) {
+		ret = -errno;
+		pr_warn("failed to open %s: %s\n", binary_path,
+			libbpf_strerror_r(ret, errmsg, sizeof(errmsg)));
+		return ret;
+	}
+	elf = elf_begin(fd, ELF_C_READ_MMAP, NULL);
+	if (!elf) {
+		pr_warn("elf: could not read elf from %s: %s\n", binary_path, elf_errmsg(-1));
+		close(fd);
+		return -LIBBPF_ERRNO__FORMAT;
+	}
+	if (!gelf_getehdr(elf, &ehdr)) {
+		pr_warn("elf: failed to get ehdr from %s: %s\n", binary_path, elf_errmsg(-1));
+		ret = -LIBBPF_ERRNO__FORMAT;
+		goto out;
+	}
+	/* for shared lib case, we do not need to calculate relative offset */
+	is_shared_lib = ehdr.e_type == ET_DYN;
+
+	name_len = strlen(name);
+	/* Does name specify "@@LIB"? */
+	is_name_qualified = strstr(name, "@@") != NULL;
+
+	/* Search SHT_DYNSYM, SHT_SYMTAB for symbol.  This search order is used because if
+	 * the symbol is found in SHY_DYNSYM, the index in that table tells us which index
+	 * to use in the Procedure Linking Table to instrument calls to the shared library
+	 * function, but locally in the binary rather than in the shared library itself.
+	 * If a binary is stripped, it may also only have SHT_DYNSYM, and a fully-statically
+	 * linked binary may not have SHT_DYMSYM, so absence of a section should not be
+	 * reported as a warning/error.
+	 */
+	for (i = 0; i < ARRAY_SIZE(sh_types); i++) {
+		size_t nr_syms, strtabidx, idx;
+		Elf_Data *symbols = NULL;
+		Elf_Scn *scn = NULL;
+		int last_bind = -1;
+		const char *sname;
+		GElf_Shdr sh;
+
+		scn = elf_find_next_scn_by_type_name(elf, sh_types[i], NULL, NULL);
+		if (!scn) {
+			pr_debug("elf: failed to find symbol table ELF sections in '%s'\n",
+				 binary_path);
+			continue;
+		}
+		if (!gelf_getshdr(scn, &sh))
+			continue;
+		strtabidx = sh.sh_link;
+		symbols = elf_getdata(scn, 0);
+		if (!symbols) {
+			pr_warn("elf: failed to get symbols for symtab section in '%s': %s\n",
+				binary_path, elf_errmsg(-1));
+			ret = -LIBBPF_ERRNO__FORMAT;
+			goto out;
+		}
+		nr_syms = symbols->d_size / sh.sh_entsize;
+
+		for (idx = 0; idx < nr_syms; idx++) {
+			int curr_bind;
+			GElf_Sym sym;
+
+			if (!gelf_getsym(symbols, idx, &sym))
+				continue;
+
+			if (GELF_ST_TYPE(sym.st_info) != STT_FUNC)
+				continue;
+
+			sname = elf_strptr(elf, strtabidx, sym.st_name);
+			if (!sname)
+				continue;
+
+			curr_bind = GELF_ST_BIND(sym.st_info);
+
+			/* User can specify func, func@@LIB or func@@LIB_VERSION. */
+			if (strncmp(sname, name, name_len) != 0)
+				continue;
+			/* ...but we don't want a search for "foo" to match 'foo2" also, so any
+			 * additional characters in sname should be of the form "@@LIB".
+			 */
+			if (!is_name_qualified && strlen(sname) > name_len &&
+			    sname[name_len] != '@')
+				continue;
+
+			if (ret >= 0) {
+				/* handle multiple matches */
+				if (last_bind != STB_WEAK && curr_bind != STB_WEAK) {
+					/* Only accept one non-weak bind. */
+					pr_warn("elf: ambiguous match for '%s': %s\n",
+						sname, name);
+					ret = -LIBBPF_ERRNO__FORMAT;
+					goto out;
+				} else if (curr_bind == STB_WEAK) {
+					/* already have a non-weak bind, and
+					 * this is a weak bind, so ignore.
+					 */
+					continue;
+				}
+			}
+			ret = sym.st_value;
+			last_bind = curr_bind;
+			sym_idx = idx;
+		}
+		/* The index of the entry in SHT_DYNSYM helps find .plt entry */
+		if (ret == 0 && sh_types[i] == SHT_DYNSYM)
+			ret = elf_find_plt_offset(elf, sym_idx);
+		/* For binaries that are not shared libraries, we need relative offset */
+		if (ret > 0 && !is_shared_lib)
+			ret = elf_find_relative_offset(elf, ret);
+		if (ret > 0)
+			break;
+	}
+
+	if (ret > 0) {
+		pr_debug("elf: symbol address match for '%s': 0x%lx\n", name, ret);
+	} else {
+		if (ret == 0) {
+			pr_warn("elf: '%s' is 0 in symtab for '%s': %s\n", name, binary_path,
+				is_shared_lib ? "should not be 0 in a shared library" :
+						"try using shared library path instead");
+			ret = -ENOENT;
+		} else {
+			pr_warn("elf: failed to find symbol '%s' in '%s'\n", name, binary_path);
+		}
+	}
+out:
+	elf_end(elf);
+	close(fd);
+	return ret;
+}
+
 /* Get full path to program/shared library. */
 static int resolve_full_path(const char *file, char *result, size_t result_sz)
 {
@@ -10371,6 +10667,7 @@ static int resolve_full_path(const char *file, char *result, size_t result_sz)
 	size_t ref_ctr_off;
 	int pfd, err;
 	bool retprobe, legacy;
+	const char *func_name;
 
 	if (!OPTS_VALID(opts, bpf_uprobe_opts))
 		return libbpf_err_ptr(-EINVAL);
@@ -10387,6 +10684,19 @@ static int resolve_full_path(const char *file, char *result, size_t result_sz)
 		}
 		binary_path = full_binary_path;
 	}
+	func_name = OPTS_GET(opts, func_name, NULL);
+	if (func_name) {
+		long sym_off;
+
+		if (!binary_path) {
+			pr_warn("name-based attach requires binary_path\n");
+			return libbpf_err_ptr(-EINVAL);
+		}
+		sym_off = elf_find_func_offset(binary_path, func_name);
+		if (sym_off < 0)
+			return libbpf_err_ptr(sym_off);
+		func_offset += (size_t)sym_off;
+	}
 
 	legacy = determine_uprobe_perf_type() < 0;
 	if (!legacy) {
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index c1b0c2e..85c93e2 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -436,9 +436,17 @@ struct bpf_uprobe_opts {
 	__u64 bpf_cookie;
 	/* uprobe is return probe, invoked at function return time */
 	bool retprobe;
+	/* name of function name or function@@LIBRARY.  Partial matches
+	 * work for library functions, such as printf, printf@@GLIBC.
+	 * To specify function entry, func_offset argument should be 0 and
+	 * func_name should specify function to trace.  To trace an offset
+	 * within the function, specify func_name and use func_offset
+	 * argument to specify argument _within_ the function.
+	 */
+	const char *func_name;
 	size_t :0;
 };
-#define bpf_uprobe_opts__last_field retprobe
+#define bpf_uprobe_opts__last_field func_name
 
 /**
  * @brief **bpf_program__attach_uprobe()** attaches a BPF program
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 bpf-next 3/5] libbpf: add auto-attach for uprobes based on section name
  2022-03-11 12:10 [PATCH v4 bpf-next 0/5] libbpf: name-based u[ret]probe attach Alan Maguire
  2022-03-11 12:10 ` [PATCH v4 bpf-next 1/5] libbpf: bpf_program__attach_uprobe_opts() should determine paths for programs/libraries where possible Alan Maguire
  2022-03-11 12:10 ` [PATCH v4 bpf-next 2/5] libbpf: support function name-based attach uprobes Alan Maguire
@ 2022-03-11 12:10 ` Alan Maguire
  2022-03-16  4:33   ` Andrii Nakryiko
  2022-03-11 12:10 ` [PATCH v4 bpf-next 4/5] selftests/bpf: add tests for u[ret]probe attach by name Alan Maguire
  2022-03-11 12:10 ` [PATCH v4 bpf-next 5/5] selftests/bpf: add tests for uprobe auto-attach via skeleton Alan Maguire
  4 siblings, 1 reply; 12+ messages in thread
From: Alan Maguire @ 2022-03-11 12:10 UTC (permalink / raw)
  To: ast, daniel, andrii
  Cc: kafai, songliubraving, yhs, john.fastabend, kpsingh, toke,
	sunyucong, netdev, bpf, Alan Maguire

Now that u[ret]probes can use name-based specification, it makes
sense to add support for auto-attach based on SEC() definition.
The format proposed is

        SEC("u[ret]probe/prog:[raw_offset|[function_name[+offset]]")

For example, to trace malloc() in libc:

        SEC("uprobe/libc.so.6:malloc")

...or to trace function foo2 in /usr/bin/foo:

	SEC("uprobe//usr/bin/foo:foo2")

Auto-attach is done for all tasks (pid -1).  prog can be an absolute
path or simply a program/library name; in the latter case, we use
PATH/LD_LIBRARY_PATH to resolve the full path, falling back to
standard locations (/usr/bin:/usr/sbin or /usr/lib64:/usr/lib) if
the file is not found via environment-variable specified locations.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 tools/lib/bpf/libbpf.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 66 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 2b50b01..0dcbca8 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -8593,6 +8593,7 @@ int bpf_program__set_log_buf(struct bpf_program *prog, char *log_buf, size_t log
 }
 
 static int attach_kprobe(const struct bpf_program *prog, long cookie, struct bpf_link **link);
+static int attach_uprobe(const struct bpf_program *prog, long cookie, struct bpf_link **link);
 static int attach_tp(const struct bpf_program *prog, long cookie, struct bpf_link **link);
 static int attach_raw_tp(const struct bpf_program *prog, long cookie, struct bpf_link **link);
 static int attach_trace(const struct bpf_program *prog, long cookie, struct bpf_link **link);
@@ -8604,9 +8605,9 @@ int bpf_program__set_log_buf(struct bpf_program *prog, char *log_buf, size_t log
 	SEC_DEF("sk_reuseport/migrate",	SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT_OR_MIGRATE, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
 	SEC_DEF("sk_reuseport",		SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
 	SEC_DEF("kprobe/",		KPROBE,	0, SEC_NONE, attach_kprobe),
-	SEC_DEF("uprobe/",		KPROBE,	0, SEC_NONE),
+	SEC_DEF("uprobe/",		KPROBE,	0, SEC_NONE, attach_uprobe),
 	SEC_DEF("kretprobe/",		KPROBE, 0, SEC_NONE, attach_kprobe),
-	SEC_DEF("uretprobe/",		KPROBE, 0, SEC_NONE),
+	SEC_DEF("uretprobe/",		KPROBE, 0, SEC_NONE, attach_uprobe),
 	SEC_DEF("tc",			SCHED_CLS, 0, SEC_NONE),
 	SEC_DEF("classifier",		SCHED_CLS, 0, SEC_NONE | SEC_SLOPPY_PFX | SEC_DEPRECATED),
 	SEC_DEF("action",		SCHED_ACT, 0, SEC_NONE | SEC_SLOPPY_PFX),
@@ -10761,6 +10762,69 @@ struct bpf_link *bpf_program__attach_uprobe(const struct bpf_program *prog,
 	return bpf_program__attach_uprobe_opts(prog, pid, binary_path, func_offset, &opts);
 }
 
+/* Format of u[ret]probe section definition supporting auto-attach:
+ * u[ret]probe/prog:function[+offset]
+ *
+ * prog can be an absolute/relative path or a filename; the latter is resolved to a
+ * full path via bpf_program__attach_uprobe_opts.
+ *
+ * Many uprobe programs do not avail of auto-attach, so we need to handle the
+ * case where the format is uprobe/myfunc by returning 0 with *link set to NULL
+ * to identify the case where auto-attach is not supported.
+ */
+static int attach_uprobe(const struct bpf_program *prog, long cookie, struct bpf_link **link)
+{
+	DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, opts);
+	char *func, *probe_name, *func_end;
+	char *func_name, binary_path[512];
+	unsigned long long raw_offset;
+	size_t offset = 0;
+	int n;
+
+	*link = NULL;
+
+	opts.retprobe = str_has_pfx(prog->sec_name, "uretprobe/");
+	if (opts.retprobe)
+		probe_name = prog->sec_name + sizeof("uretprobe/") - 1;
+	else
+		probe_name = prog->sec_name + sizeof("uprobe/") - 1;
+
+	snprintf(binary_path, sizeof(binary_path), "%s", probe_name);
+	/* ':' should be prior to function+offset */
+	func_name = strrchr(binary_path, ':');
+	if (!func_name) {
+		pr_debug("section '%s' is old-style u[ret]probe/function, cannot auto-attach\n",
+			 prog->sec_name);
+		return 0;
+	}
+	func_name[0] = '\0';
+	func_name++;
+	n = sscanf(func_name, "%m[a-zA-Z0-9_.]+%li", &func, &offset);
+	if (n < 1) {
+		pr_warn("uprobe name '%s' is invalid\n", func_name);
+		return -EINVAL;
+	}
+	if (opts.retprobe && offset != 0) {
+		free(func);
+		pr_warn("uretprobes do not support offset specification\n");
+		return -EINVAL;
+	}
+
+	/* Is func a raw address? */
+	errno = 0;
+	raw_offset = strtoull(func, &func_end, 0);
+	if (!errno && !*func_end) {
+		free(func);
+		func = NULL;
+		offset = (size_t)raw_offset;
+	}
+	opts.func_name = func;
+
+	*link = bpf_program__attach_uprobe_opts(prog, -1, binary_path, offset, &opts);
+	free(func);
+	return 0;
+}
+
 static int determine_tracepoint_id(const char *tp_category,
 				   const char *tp_name)
 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 bpf-next 4/5] selftests/bpf: add tests for u[ret]probe attach by name
  2022-03-11 12:10 [PATCH v4 bpf-next 0/5] libbpf: name-based u[ret]probe attach Alan Maguire
                   ` (2 preceding siblings ...)
  2022-03-11 12:10 ` [PATCH v4 bpf-next 3/5] libbpf: add auto-attach for uprobes based on section name Alan Maguire
@ 2022-03-11 12:10 ` Alan Maguire
  2022-03-16  4:33   ` Andrii Nakryiko
  2022-03-11 12:10 ` [PATCH v4 bpf-next 5/5] selftests/bpf: add tests for uprobe auto-attach via skeleton Alan Maguire
  4 siblings, 1 reply; 12+ messages in thread
From: Alan Maguire @ 2022-03-11 12:10 UTC (permalink / raw)
  To: ast, daniel, andrii
  Cc: kafai, songliubraving, yhs, john.fastabend, kpsingh, toke,
	sunyucong, netdev, bpf, Alan Maguire

add tests that verify attaching by name for

1. local functions in a program
2. library functions in a shared object; and
3. library functions in a program

...succeed for uprobe and uretprobes using new "func_name"
option for bpf_program__attach_uprobe_opts().  Also verify
auto-attach works where uprobe, path to binary and function
name are specified, but fails with -EOPNOTSUPP when the format
does not match (the latter is to support backwards-compatibility).

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 .../selftests/bpf/prog_tests/attach_probe.c        | 89 ++++++++++++++++++----
 .../selftests/bpf/progs/test_attach_probe.c        | 37 +++++++++
 2 files changed, 113 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/attach_probe.c b/tools/testing/selftests/bpf/prog_tests/attach_probe.c
index d48f6e5..b770e0e 100644
--- a/tools/testing/selftests/bpf/prog_tests/attach_probe.c
+++ b/tools/testing/selftests/bpf/prog_tests/attach_probe.c
@@ -11,15 +11,22 @@ static void trigger_func(void)
 	asm volatile ("");
 }
 
+/* attach point for byname uprobe */
+static void trigger_func2(void)
+{
+	asm volatile ("");
+}
+
 void test_attach_probe(void)
 {
 	DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, uprobe_opts);
-	int duration = 0;
 	struct bpf_link *kprobe_link, *kretprobe_link;
 	struct bpf_link *uprobe_link, *uretprobe_link;
 	struct test_attach_probe* skel;
 	ssize_t uprobe_offset, ref_ctr_offset;
+	struct bpf_link *uprobe_err_link;
 	bool legacy;
+	char *mem;
 
 	/* Check if new-style kprobe/uprobe API is supported.
 	 * Kernels that support new FD-based kprobe and uprobe BPF attachment
@@ -43,9 +50,9 @@ void test_attach_probe(void)
 		return;
 
 	skel = test_attach_probe__open_and_load();
-	if (CHECK(!skel, "skel_open", "failed to open skeleton\n"))
+	if (!ASSERT_OK_PTR(skel, "skel_open"))
 		return;
-	if (CHECK(!skel->bss, "check_bss", ".bss wasn't mmap()-ed\n"))
+	if (!ASSERT_OK_PTR(skel->bss, "check_bss"))
 		goto cleanup;
 
 	kprobe_link = bpf_program__attach_kprobe(skel->progs.handle_kprobe,
@@ -90,24 +97,80 @@ void test_attach_probe(void)
 		goto cleanup;
 	skel->links.handle_uretprobe = uretprobe_link;
 
-	/* trigger & validate kprobe && kretprobe */
-	usleep(1);
+	/* verify auto-attach fails for old-style uprobe definition */
+	uprobe_err_link = bpf_program__attach(skel->progs.handle_uprobe_byname);
+	if (!ASSERT_EQ(libbpf_get_error(uprobe_err_link), -EOPNOTSUPP,
+		       "auto-attach should fail for old-style name"))
+		goto cleanup;
 
-	if (CHECK(skel->bss->kprobe_res != 1, "check_kprobe_res",
-		  "wrong kprobe res: %d\n", skel->bss->kprobe_res))
+	uprobe_opts.func_name = "trigger_func2";
+	uprobe_opts.retprobe = false;
+	uprobe_opts.ref_ctr_offset = 0;
+	skel->links.handle_uprobe_byname =
+			bpf_program__attach_uprobe_opts(skel->progs.handle_uprobe_byname,
+							0 /* this pid */,
+							"/proc/self/exe",
+							0, &uprobe_opts);
+	if (!ASSERT_OK_PTR(skel->links.handle_uprobe_byname, "attach_uprobe_byname"))
 		goto cleanup;
-	if (CHECK(skel->bss->kretprobe_res != 2, "check_kretprobe_res",
-		  "wrong kretprobe res: %d\n", skel->bss->kretprobe_res))
+
+	/* verify auto-attach works */
+	skel->links.handle_uretprobe_byname =
+			bpf_program__attach(skel->progs.handle_uretprobe_byname);
+	if (!ASSERT_OK_PTR(skel->links.handle_uretprobe_byname, "attach_uretprobe_byname"))
 		goto cleanup;
 
+	/* test attach by name for a library function, using the library
+	 * as the binary argument. libc.so.6 will be resolved via dlopen()/dlinfo().
+	 */
+	uprobe_opts.func_name = "malloc";
+	uprobe_opts.retprobe = false;
+	skel->links.handle_uprobe_byname2 =
+			bpf_program__attach_uprobe_opts(skel->progs.handle_uprobe_byname2,
+							0 /* this pid */,
+							"libc.so.6",
+							0, &uprobe_opts);
+	if (!ASSERT_OK_PTR(skel->links.handle_uprobe_byname2, "attach_uprobe_byname2"))
+		goto cleanup;
+
+	uprobe_opts.func_name = "free";
+	uprobe_opts.retprobe = true;
+	skel->links.handle_uretprobe_byname2 =
+			bpf_program__attach_uprobe_opts(skel->progs.handle_uretprobe_byname2,
+							-1 /* any pid */,
+							"/proc/self/exe",
+							0, &uprobe_opts);
+	if (!ASSERT_OK_PTR(skel->links.handle_uretprobe_byname2, "attach_uretprobe_byname2"))
+		goto cleanup;
+
+	/* trigger & validate kprobe && kretprobe */
+	usleep(1);
+
+	/* trigger & validate shared library u[ret]probes attached by name */
+	mem = malloc(1);
+	free(mem);
+
 	/* trigger & validate uprobe & uretprobe */
 	trigger_func();
 
-	if (CHECK(skel->bss->uprobe_res != 3, "check_uprobe_res",
-		  "wrong uprobe res: %d\n", skel->bss->uprobe_res))
+	/* trigger & validate uprobe attached by name */
+	trigger_func2();
+
+	if (!ASSERT_EQ(skel->bss->kprobe_res, 1, "check_kprobe_res"))
+		goto cleanup;
+	if (!ASSERT_EQ(skel->bss->kretprobe_res, 2, "check_kretprobe_res"))
+		goto cleanup;
+	if (!ASSERT_EQ(skel->bss->uprobe_res, 3, "check_uprobe_res"))
+		goto cleanup;
+	if (!ASSERT_EQ(skel->bss->uretprobe_res, 4, "check_uretprobe_res"))
+		goto cleanup;
+	if (!ASSERT_EQ(skel->bss->uprobe_byname_res, 5, "check_uprobe_byname_res"))
+		goto cleanup;
+	if (!ASSERT_EQ(skel->bss->uretprobe_byname_res, 6, "check_uretprobe_byname_res"))
+		goto cleanup;
+	if (!ASSERT_EQ(skel->bss->uprobe_byname2_res, 7, "check_uprobe_byname2_res"))
 		goto cleanup;
-	if (CHECK(skel->bss->uretprobe_res != 4, "check_uretprobe_res",
-		  "wrong uretprobe res: %d\n", skel->bss->uretprobe_res))
+	if (!ASSERT_EQ(skel->bss->uretprobe_byname2_res, 8, "check_uretprobe_byname2_res"))
 		goto cleanup;
 
 cleanup:
diff --git a/tools/testing/selftests/bpf/progs/test_attach_probe.c b/tools/testing/selftests/bpf/progs/test_attach_probe.c
index 8056a4c..9942461c 100644
--- a/tools/testing/selftests/bpf/progs/test_attach_probe.c
+++ b/tools/testing/selftests/bpf/progs/test_attach_probe.c
@@ -10,6 +10,10 @@
 int kretprobe_res = 0;
 int uprobe_res = 0;
 int uretprobe_res = 0;
+int uprobe_byname_res = 0;
+int uretprobe_byname_res = 0;
+int uprobe_byname2_res = 0;
+int uretprobe_byname2_res = 0;
 
 SEC("kprobe/sys_nanosleep")
 int handle_kprobe(struct pt_regs *ctx)
@@ -39,4 +43,37 @@ int handle_uretprobe(struct pt_regs *ctx)
 	return 0;
 }
 
+SEC("uprobe/trigger_func_byname")
+int handle_uprobe_byname(struct pt_regs *ctx)
+{
+	uprobe_byname_res = 5;
+	return 0;
+}
+
+/* use auto-attach format for section definition. */
+SEC("uretprobe//proc/self/exe:trigger_func2")
+int handle_uretprobe_byname(struct pt_regs *ctx)
+{
+	uretprobe_byname_res = 6;
+	return 0;
+}
+
+SEC("uprobe/trigger_func_byname2")
+int handle_uprobe_byname2(struct pt_regs *ctx)
+{
+	unsigned int size = PT_REGS_PARM1(ctx);
+
+	/* verify malloc size */
+	if (size == 1)
+		uprobe_byname2_res = 7;
+	return 0;
+}
+
+SEC("uretprobe/trigger_func_byname2")
+int handle_uretprobe_byname2(struct pt_regs *ctx)
+{
+	uretprobe_byname2_res = 8;
+	return 0;
+}
+
 char _license[] SEC("license") = "GPL";
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 bpf-next 5/5] selftests/bpf: add tests for uprobe auto-attach via skeleton
  2022-03-11 12:10 [PATCH v4 bpf-next 0/5] libbpf: name-based u[ret]probe attach Alan Maguire
                   ` (3 preceding siblings ...)
  2022-03-11 12:10 ` [PATCH v4 bpf-next 4/5] selftests/bpf: add tests for u[ret]probe attach by name Alan Maguire
@ 2022-03-11 12:10 ` Alan Maguire
  2022-03-14 22:26   ` Daniel Borkmann
  2022-03-16  4:33   ` Andrii Nakryiko
  4 siblings, 2 replies; 12+ messages in thread
From: Alan Maguire @ 2022-03-11 12:10 UTC (permalink / raw)
  To: ast, daniel, andrii
  Cc: kafai, songliubraving, yhs, john.fastabend, kpsingh, toke,
	sunyucong, netdev, bpf, Alan Maguire

tests that verify auto-attach works for function entry/return for
local functions in program, library functions in program and library
functions in library.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 .../selftests/bpf/prog_tests/uprobe_autoattach.c   | 48 +++++++++++++++
 .../selftests/bpf/progs/test_uprobe_autoattach.c   | 69 ++++++++++++++++++++++
 2 files changed, 117 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_uprobe_autoattach.c

diff --git a/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c b/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
new file mode 100644
index 0000000..57ed636
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022, Oracle and/or its affiliates. */
+
+#include <test_progs.h>
+#include "test_uprobe_autoattach.skel.h"
+
+/* uprobe attach point */
+static void autoattach_trigger_func(void)
+{
+	asm volatile ("");
+}
+
+void test_uprobe_autoattach(void)
+{
+	struct test_uprobe_autoattach *skel;
+	char *mem;
+
+	skel = test_uprobe_autoattach__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel_open"))
+		return;
+	if (!ASSERT_OK_PTR(skel->bss, "check_bss"))
+		goto cleanup;
+
+	if (!ASSERT_OK(test_uprobe_autoattach__attach(skel), "skel_attach"))
+		goto cleanup;
+
+	/* trigger & validate uprobe & uretprobe */
+	autoattach_trigger_func();
+
+	/* trigger & validate shared library u[ret]probes attached by name */
+	mem = malloc(1);
+	free(mem);
+
+	if (!ASSERT_EQ(skel->bss->uprobe_byname_res, 1, "check_uprobe_byname_res"))
+		goto cleanup;
+	if (!ASSERT_EQ(skel->bss->uretprobe_byname_res, 2, "check_uretprobe_byname_res"))
+		goto cleanup;
+	if (!ASSERT_EQ(skel->bss->uprobe_byname2_res, 3, "check_uprobe_byname2_res"))
+		goto cleanup;
+	if (!ASSERT_EQ(skel->bss->uretprobe_byname2_res, 4, "check_uretprobe_byname2_res"))
+		goto cleanup;
+	if (!ASSERT_EQ(skel->bss->uprobe_byname3_res, 5, "check_uprobe_byname3_res"))
+		goto cleanup;
+	if (!ASSERT_EQ(skel->bss->uretprobe_byname3_res, 6, "check_uretprobe_byname3_res"))
+		goto cleanup;
+cleanup:
+	test_uprobe_autoattach__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_uprobe_autoattach.c b/tools/testing/selftests/bpf/progs/test_uprobe_autoattach.c
new file mode 100644
index 0000000..a808bcf
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_uprobe_autoattach.c
@@ -0,0 +1,69 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022, Oracle and/or its affiliates. */
+
+#include <linux/ptrace.h>
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+int uprobe_byname_res = 0;
+int uretprobe_byname_res = 0;
+int uprobe_byname2_res = 0;
+int uretprobe_byname2_res = 0;
+int uprobe_byname3_res = 0;
+int uretprobe_byname3_res = 0;
+
+/* This program cannot auto-attach, but that should not stop other
+ * programs from attaching.
+ */
+SEC("uprobe/no_autoattach")
+int handle_uprobe_noautoattach(struct pt_regs *ctx)
+{
+	return 0;
+}
+
+SEC("uprobe//proc/self/exe:autoattach_trigger_func")
+int handle_uprobe_byname(struct pt_regs *ctx)
+{
+	uprobe_byname_res = 1;
+	return 0;
+}
+
+SEC("uretprobe//proc/self/exe:autoattach_trigger_func")
+int handle_uretprobe_byname(struct pt_regs *ctx)
+{
+	uretprobe_byname_res = 2;
+	return 0;
+}
+
+
+SEC("uprobe//proc/self/exe:malloc")
+int handle_uprobe_byname2(struct pt_regs *ctx)
+{
+	uprobe_byname2_res = 3;
+	return 0;
+}
+
+SEC("uretprobe//proc/self/exe:malloc")
+int handle_uretprobe_byname2(struct pt_regs *ctx)
+{
+	uretprobe_byname2_res = 4;
+	return 0;
+}
+
+
+SEC("uprobe/libc.so.6:free")
+int handle_uprobe_byname3(struct pt_regs *ctx)
+{
+	uprobe_byname3_res = 5;
+	return 0;
+}
+
+SEC("uretprobe/libc.so.6:free")
+int handle_uretprobe_byname3(struct pt_regs *ctx)
+{
+	uretprobe_byname3_res = 6;
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 bpf-next 5/5] selftests/bpf: add tests for uprobe auto-attach via skeleton
  2022-03-11 12:10 ` [PATCH v4 bpf-next 5/5] selftests/bpf: add tests for uprobe auto-attach via skeleton Alan Maguire
@ 2022-03-14 22:26   ` Daniel Borkmann
  2022-03-16  4:33   ` Andrii Nakryiko
  1 sibling, 0 replies; 12+ messages in thread
From: Daniel Borkmann @ 2022-03-14 22:26 UTC (permalink / raw)
  To: Alan Maguire, ast, andrii
  Cc: kafai, songliubraving, yhs, john.fastabend, kpsingh, toke,
	sunyucong, netdev, bpf

On 3/11/22 1:10 PM, Alan Maguire wrote:
> tests that verify auto-attach works for function entry/return for
> local functions in program, library functions in program and library
> functions in library.
> 
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>   .../selftests/bpf/prog_tests/uprobe_autoattach.c   | 48 +++++++++++++++
>   .../selftests/bpf/progs/test_uprobe_autoattach.c   | 69 ++++++++++++++++++++++
>   2 files changed, 117 insertions(+)
>   create mode 100644 tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
>   create mode 100644 tools/testing/selftests/bpf/progs/test_uprobe_autoattach.c
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c b/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
> new file mode 100644
> index 0000000..57ed636
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
> @@ -0,0 +1,48 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2022, Oracle and/or its affiliates. */
> +
> +#include <test_progs.h>
> +#include "test_uprobe_autoattach.skel.h"
> +
> +/* uprobe attach point */
> +static void autoattach_trigger_func(void)
> +{
> +	asm volatile ("");
> +}
> +
> +void test_uprobe_autoattach(void)
> +{
> +	struct test_uprobe_autoattach *skel;
> +	char *mem;
> +
> +	skel = test_uprobe_autoattach__open_and_load();
> +	if (!ASSERT_OK_PTR(skel, "skel_open"))
> +		return;
> +	if (!ASSERT_OK_PTR(skel->bss, "check_bss"))
> +		goto cleanup;
> +
> +	if (!ASSERT_OK(test_uprobe_autoattach__attach(skel), "skel_attach"))
> +		goto cleanup;
> +
> +	/* trigger & validate uprobe & uretprobe */
> +	autoattach_trigger_func();
> +
> +	/* trigger & validate shared library u[ret]probes attached by name */
> +	mem = malloc(1);
> +	free(mem);
> +
> +	if (!ASSERT_EQ(skel->bss->uprobe_byname_res, 1, "check_uprobe_byname_res"))
> +		goto cleanup;
> +	if (!ASSERT_EQ(skel->bss->uretprobe_byname_res, 2, "check_uretprobe_byname_res"))
> +		goto cleanup;
> +	if (!ASSERT_EQ(skel->bss->uprobe_byname2_res, 3, "check_uprobe_byname2_res"))
> +		goto cleanup;
> +	if (!ASSERT_EQ(skel->bss->uretprobe_byname2_res, 4, "check_uretprobe_byname2_res"))
> +		goto cleanup;
> +	if (!ASSERT_EQ(skel->bss->uprobe_byname3_res, 5, "check_uprobe_byname3_res"))
> +		goto cleanup;
> +	if (!ASSERT_EQ(skel->bss->uretprobe_byname3_res, 6, "check_uretprobe_byname3_res"))
> +		goto cleanup;
> +cleanup:
> +	test_uprobe_autoattach__destroy(skel);
> +}

Hmm, looks like this fails CI, ptal:

https://github.com/kernel-patches/bpf/runs/5517172330?check_suite_focus=true

[...]
   test_attach_probe:PASS:uprobe_offset 0 nsec
   test_attach_probe:PASS:ref_ctr_offset 0 nsec
   test_attach_probe:PASS:skel_open 0 nsec
   test_attach_probe:PASS:check_bss 0 nsec
   test_attach_probe:PASS:attach_kprobe 0 nsec
   test_attach_probe:PASS:attach_kretprobe 0 nsec
   test_attach_probe:PASS:uprobe_ref_ctr_before 0 nsec
   test_attach_probe:PASS:attach_uprobe 0 nsec
   test_attach_probe:PASS:uprobe_ref_ctr_after 0 nsec
   test_attach_probe:PASS:attach_uretprobe 0 nsec
   test_attach_probe:PASS:auto-attach should fail for old-style name 0 nsec
   test_attach_probe:PASS:attach_uprobe_byname 0 nsec
   test_attach_probe:PASS:attach_uretprobe_byname 0 nsec
   test_attach_probe:PASS:attach_uprobe_byname2 0 nsec
   test_attach_probe:PASS:attach_uretprobe_byname2 0 nsec
   test_attach_probe:PASS:check_kprobe_res 0 nsec
   test_attach_probe:PASS:check_kretprobe_res 0 nsec
   test_attach_probe:PASS:check_uprobe_res 0 nsec
   test_attach_probe:PASS:check_uretprobe_res 0 nsec
   test_attach_probe:PASS:check_uprobe_byname_res 0 nsec
   test_attach_probe:PASS:check_uretprobe_byname_res 0 nsec
   test_attach_probe:PASS:check_uprobe_byname2_res 0 nsec
   test_attach_probe:FAIL:check_uretprobe_byname2_res unexpected check_uretprobe_byname2_res: actual 0 != expected 8
   test_attach_probe:PASS:uprobe_ref_ctr_cleanup 0 nsec
   #4 attach_probe:FAIL
[...]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 bpf-next 1/5] libbpf: bpf_program__attach_uprobe_opts() should determine paths for programs/libraries where possible
  2022-03-11 12:10 ` [PATCH v4 bpf-next 1/5] libbpf: bpf_program__attach_uprobe_opts() should determine paths for programs/libraries where possible Alan Maguire
@ 2022-03-16  4:33   ` Andrii Nakryiko
  0 siblings, 0 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2022-03-16  4:33 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin Lau,
	Song Liu, Yonghong Song, john fastabend, KP Singh,
	Toke Høiland-Jørgensen, Yucong Sun, Networking, bpf

On Fri, Mar 11, 2022 at 4:11 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> bpf_program__attach_uprobe_opts() requires a binary_path argument
> specifying binary to instrument.  Supporting simply specifying
> "libc.so.6" or "foo" should be possible too.
>
> Library search checks LD_LIBRARY_PATH, then /usr/lib64, /usr/lib.
> This allows users to run BPF programs prefixed with
> LD_LIBRARY_PATH=/path2/lib while still searching standard locations.
> Similarly for non .so files, we check PATH and /usr/bin, /usr/sbin.
>
> Path determination will be useful for auto-attach of BPF uprobe programs
> using SEC() definition.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  tools/lib/bpf/libbpf.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 50 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 43161fd..b577577 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -10320,6 +10320,45 @@ static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
>         return pfd;
>  }
>
> +/* Get full path to program/shared library. */
> +static int resolve_full_path(const char *file, char *result, size_t result_sz)
> +{
> +       char *search_paths[2];
> +       int i;
> +
> +       if (strstr(file, ".so")) {
> +               search_paths[0] = getenv("LD_LIBRARY_PATH");
> +               search_paths[1] = (char *)"/usr/lib64:/usr/lib";
> +       } else {
> +               search_paths[0] = getenv("PATH");
> +               search_paths[1] = (char *)"/usr/bin:/usr/sbin";

It's strange you chose to cast to mutable char * instead of sticking
to `const char*`. getenv() returns char *, but that string is not
supposed to be modified, so effectively it is `const char *`. Let's
keep it safe (and it also won't require any casting)

> +       }
> +
> +       for (i = 0; i < ARRAY_SIZE(search_paths); i++) {
> +               char *s, *search_path, *currpath, *saveptr = NULL;

I'll nitpick on naming, and it feels like we've talked about this
before. Please stick to libbpf naming conventions: cur_path, save_ptr,
etc.

> +
> +               if (!search_paths[i])
> +                       continue;
> +               search_path = strdup(search_paths[i]);
> +               s = search_path;
> +               while ((currpath = strtok_r(s, ":", &saveptr)) != NULL) {

hm... I don't see any benefit to using strtok_r, which assumes mutable
input string, according to its input argument type, and thus requires
strdup, etc. Why so complicated? We have a single delimiter, ':',
right? strchr(s, ':'), advance read-only const char * pointer,
snprintf("%*s", len_of_segment, segment_start), where len_of_segment
should be a pointer difference between two ':' occurrences. No
strdup(), no strtok(), wdyt?

> +                       struct stat sb;
> +
> +                       s = NULL;
> +                       snprintf(result, result_sz, "%s/%s", currpath, file);
> +                       /* ensure it is an executable file/link */
> +                       if (stat(result, &sb) == 0 && (sb.st_mode & (S_IFREG | S_IFLNK)) &&
> +                           (sb.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH))) {

wouldn't access(path, R_OK) (or maybe `R_OK | X_OK`, not sure if it's
important) do the same?

> +                               pr_debug("resolved '%s' to '%s'\n", file, result);
> +                               free(search_path);
> +                               return 0;
> +                       }
> +               }
> +               free(search_path);
> +       }
> +       return -ENOENT;
> +}
> +
>  LIBBPF_API struct bpf_link *
>  bpf_program__attach_uprobe_opts(const struct bpf_program *prog, pid_t pid,
>                                 const char *binary_path, size_t func_offset,
> @@ -10327,6 +10366,7 @@ static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
>  {
>         DECLARE_LIBBPF_OPTS(bpf_perf_event_opts, pe_opts);
>         char errmsg[STRERR_BUFSIZE], *legacy_probe = NULL;
> +       char full_binary_path[PATH_MAX];
>         struct bpf_link *link;
>         size_t ref_ctr_off;
>         int pfd, err;
> @@ -10338,13 +10378,22 @@ static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
>         retprobe = OPTS_GET(opts, retprobe, false);
>         ref_ctr_off = OPTS_GET(opts, ref_ctr_offset, 0);
>         pe_opts.bpf_cookie = OPTS_GET(opts, bpf_cookie, 0);

nit: empty line to logically separate this piece of logic below?

> +       if (binary_path && !strchr(binary_path, '/')) {
> +               err = resolve_full_path(binary_path, full_binary_path,
> +                                       sizeof(full_binary_path));
> +               if (err) {
> +                       pr_warn("could not find full path for %s\n", binary_path);

consistency nit: "failed to resolve full path for '%s'\n"?

> +                       return libbpf_err_ptr(err);

we don't use libbpf_err*() helpers in internal helpers, this is
responsibility of user-facing API functions


> +               }
> +               binary_path = full_binary_path;
> +       }
>
>         legacy = determine_uprobe_perf_type() < 0;
>         if (!legacy) {
>                 pfd = perf_event_open_probe(true /* uprobe */, retprobe, binary_path,
>                                             func_offset, pid, ref_ctr_off);
>         } else {
> -               char probe_name[512];
> +               char probe_name[PATH_MAX + 64];
>
>                 if (ref_ctr_off)
>                         return libbpf_err_ptr(-EINVAL);
> --
> 1.8.3.1
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 bpf-next 2/5] libbpf: support function name-based attach uprobes
  2022-03-11 12:10 ` [PATCH v4 bpf-next 2/5] libbpf: support function name-based attach uprobes Alan Maguire
@ 2022-03-16  4:33   ` Andrii Nakryiko
  0 siblings, 0 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2022-03-16  4:33 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin Lau,
	Song Liu, Yonghong Song, john fastabend, KP Singh,
	Toke Høiland-Jørgensen, Yucong Sun, Networking, bpf

On Fri, Mar 11, 2022 at 4:11 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> kprobe attach is name-based, using lookups of kallsyms to translate
> a function name to an address.  Currently uprobe attach is done
> via an offset value as described in [1].  Extend uprobe opts
> for attach to include a function name which can then be converted
> into a uprobe-friendly offset.  The calcualation is done in
> several steps:
>
> 1. First, determine the symbol address using libelf; this gives us
>    the offset as reported by objdump
> 2. If the function is a shared library function - and the binary
>    provided is a shared library - no further work is required;
>    the address found is the required address
> 3. If the function is a shared library function in a program
>    (as opposed to a shared library), the Procedure Linking Table
>    (PLT) table address is found (it is indexed via the dynamic
>    symbol table index).  This allows us to instrument a call
>    to a shared library function locally in the calling binary,
>    reducing overhead versus having a breakpoint in global lib.
> 4. Finally, if the function is local, subtract the base address
>    associated with the object, retrieved from ELF program headers.
>
> The resultant value is then added to the func_offset value passed
> in to specify the uprobe attach address.  So specifying a func_offset
> of 0 along with a function name "printf" will attach to printf entry.
>
> The modes of operation supported are then
>
> 1. to attach to a local function in a binary; function "foo1" in
>    "/usr/bin/foo"
> 2. to attach to a shared library function in a binary;
>    function "malloc" in "/usr/bin/foo"
> 3. to attach to a shared library function in a shared library -
>    function "malloc" in libc.
>
> [1] https://www.kernel.org/doc/html/latest/trace/uprobetracer.html
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---

See some nits below, overall looks great. I have a temptation to move
all the uprobe-specific code (including bpf_program__attach_uprobe()
API) into a dedicated uprobe.c. Or maybe even all the tracing stuff
(tracepoints, perf_event, kprobe, uprobe) into tracing.c. We don't
have to do it right now, but I think we'll inevitably end up doing
this, as libbpf.c is huge and does a lot of different things already,
so adding more to it for uprobe code seems wrong.

>  tools/lib/bpf/libbpf.c | 310 +++++++++++++++++++++++++++++++++++++++++++++++++
>  tools/lib/bpf/libbpf.h |  10 +-
>  2 files changed, 319 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index b577577..2b50b01 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -10320,6 +10320,302 @@ static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
>         return pfd;
>  }
>
> +/* uprobes deal in relative offsets; subtract the base address associated with
> + * the mapped binary.  See Documentation/trace/uprobetracer.rst for more
> + * details.
> + */
> +static long elf_find_relative_offset(Elf *elf, long addr)
> +{
> +       size_t n;
> +       int i;
> +
> +       if (elf_getphdrnum(elf, &n)) {
> +               pr_warn("elf: failed to find program headers: %s\n",
> +                       elf_errmsg(-1));

nit: fits single line, prefer single lines when possible (up to 100
characters is totally ok)

> +               return -ENOENT;
> +       }
> +
> +       for (i = 0; i < n; i++) {
> +               int seg_start, seg_end, seg_offset;
> +               GElf_Phdr phdr;
> +
> +               if (!gelf_getphdr(elf, i, &phdr)) {
> +                       pr_warn("elf: failed to get program header %d: %s\n",
> +                               i, elf_errmsg(-1));

same about single line

> +                       return -ENOENT;
> +               }
> +               if (phdr.p_type != PT_LOAD || !(phdr.p_flags & PF_X))
> +                       continue;
> +
> +               seg_start = phdr.p_vaddr;
> +               seg_end = seg_start + phdr.p_memsz;
> +               seg_offset = phdr.p_offset;
> +               if (addr >= seg_start && addr < seg_end)
> +                       return addr - seg_start + seg_offset;
> +       }
> +       pr_warn("elf: failed to find prog header containing 0x%lx\n", addr);

all these warnings in this function would benefit from specifying in
which particular binary we failed to find something, maybe let's pass
`const char *filename` and use that for errors?

> +       return -ENOENT;
> +}
> +
> +/* Return next ELF section of sh_type after scn, or first of that type
> + * if scn is NULL, and if name is non-NULL, both name and type must match.
> + */
> +static Elf_Scn *elf_find_next_scn_by_type_name(Elf *elf, int sh_type, const char *name,
> +                                              Elf_Scn *scn)
> +{
> +       size_t shstrndx;
> +
> +       if (name && elf_getshdrstrndx(elf, &shstrndx)) {
> +               pr_debug("elf: failed to get section names section index: %s\n",
> +                        elf_errmsg(-1));
> +               return NULL;
> +       }
> +
> +       while ((scn = elf_nextscn(elf, scn)) != NULL) {
> +               const char *sname;
> +               GElf_Shdr sh;
> +
> +               if (!gelf_getshdr(scn, &sh))
> +                       continue;
> +               if (sh.sh_type != sh_type)
> +                       continue;
> +
> +               if (!name)
> +                       return scn;
> +
> +               sname = elf_strptr(elf, shstrndx, sh.sh_name);
> +               if (sname && strcmp(sname, name) == 0)
> +                       return scn;
> +       }
> +       return NULL;
> +}
> +
> +/* For Position-Independent Code-based libraries, a table of trampolines (Procedure Linking Table)
> + * is used to support resolution of symbol names at linking time.  The goal here is to find the
> + * offset associated with the jump to the actual library function, since consumers of the
> + * library function within the binary will jump to the plt table entry (malloc@plt) first.
> + * If we can instrument that .plt entry locally in the specific binary (rather than instrumenting
> + * glibc say), overheads are greatly reduced.
> + *
> + * However, we need to find the index into the .plt table.  There are two parts to this.
> + *
> + * First, we have to find the index that the .plt entry (malloc@plt) lives at.  To do that,
> + * we use the .rela.plt table, which consists of entries in the same order as the .plt,
> + * but crucially each entry contains the symbol index from the symbol table.  This allows
> + * us to match the index of the .plt entry to the desired library function.
> + *
> + * Second, we need to find the address associated with that indexed .plt entry.
> + * The .plt section provides section information about the overall section size and the
> + * size of each .plt entry.  However prior to the entries themselves, there is code
> + * that carries out the dynamic linking, and this code may not be the same size as the
> + * .plt entry size (it happens to be on x86_64, but not on aarch64).  So we have to
> + * determine that initial code size so we can index into the .plt entries that follow it.
> + * To do this, we simply subtract the .plt table size (nr_plt_entries * entry_size)
> + * from the overall section size, and then use that offset as the base into the array
> + * of .plt entries.
> + */
> +static ssize_t elf_find_plt_offset(Elf *elf, size_t sym_idx)
> +{
> +       long plt_entry_sz, plt_sz, plt_base;
> +       Elf_Scn *rela_plt_scn, *plt_scn;
> +       size_t plt_idx, nr_plt_entries;
> +       Elf_Data *rela_data;
> +       GElf_Shdr sh;
> +
> +       /* First get .plt index and table size via .rela.plt */
> +       rela_plt_scn = elf_find_next_scn_by_type_name(elf, SHT_RELA, ".rela.plt", NULL);
> +       if (!rela_plt_scn) {
> +               pr_debug("elf: failed to find .rela.plt section\n");

see above, knowing which file we failed to find it in would be useful
(especially with full path resolution where user might not know for
sure which specific binary libbpf decided to process). So same as for
other functions, let's pass `const char *filename` and use it in error
messages?

> +               return -LIBBPF_ERRNO__FORMAT;
> +       }
> +       if (!gelf_getshdr(rela_plt_scn, &sh)) {
> +               pr_debug("elf: failed to get shdr for .rela.plt section\n");
> +               return -LIBBPF_ERRNO__FORMAT;
> +       }
> +       rela_data = elf_getdata(rela_plt_scn, 0);
> +       if (!rela_data) {
> +               pr_debug("elf: failed to get data for .rela.plt section\n");
> +               return -LIBBPF_ERRNO__FORMAT;
> +       }
> +       nr_plt_entries = sh.sh_size / sh.sh_entsize;
> +       for (plt_idx = 0; plt_idx < nr_plt_entries; plt_idx++) {
> +               GElf_Rela rela;
> +
> +               if (!gelf_getrela(rela_data, plt_idx, &rela))
> +                       continue;
> +               if (GELF_R_SYM(rela.r_info) == sym_idx)
> +                       break;
> +       }
> +       if (plt_idx == nr_plt_entries) {
> +               pr_debug("elf: could not find sym index %ld in .rela.plt section\n",

size_t => %zu

> +                        sym_idx);
> +               return -LIBBPF_ERRNO__FORMAT;
> +       }
> +
> +       /* Now determine base of .plt table and calculate where entry for function is */
> +       plt_scn = elf_find_next_scn_by_type_name(elf, SHT_PROGBITS, ".plt", NULL);
> +       if (!plt_scn || !gelf_getshdr(plt_scn, &sh)) {
> +               pr_debug("elf: failed to find .plt section\n");
> +               return -LIBBPF_ERRNO__FORMAT;
> +       }
> +       plt_base = sh.sh_addr;
> +       plt_entry_sz = sh.sh_entsize;
> +       plt_sz = sh.sh_size;
> +       if (nr_plt_entries * plt_entry_sz >= plt_sz) {

if (plt_entry_sz == 0 || nr_plt_entries >= plt_sz / plt_entry_sz)

would be equivalent, but doesn't suffer from overflow and invalid sh_entsize

> +               pr_debug("elf: failed to calculate base .plt entry size with %ld plt entries\n",
> +                        nr_plt_entries);

%zu, this is a guaranteed compilation warning and a hassle for me to
fix this up during next Github sync, please pay attention to things
like this (and please double check all other case in case I missed
some)

> +               return -LIBBPF_ERRNO__FORMAT;
> +       }
> +       plt_base += plt_sz - (nr_plt_entries * plt_entry_sz);
> +
> +       return plt_base + (plt_idx * plt_entry_sz);
> +}

nit: unnecessary () in above two expressions

> +
> +/* Find offset of function name in object specified by path.  "name" matches
> + * symbol name or name@@LIB for library functions.
> + */
> +static long elf_find_func_offset(const char *binary_path, const char *name)
> +{

[...]

> +               for (idx = 0; idx < nr_syms; idx++) {
> +                       int curr_bind;
> +                       GElf_Sym sym;
> +
> +                       if (!gelf_getsym(symbols, idx, &sym))
> +                               continue;
> +
> +                       if (GELF_ST_TYPE(sym.st_info) != STT_FUNC)
> +                               continue;
> +
> +                       sname = elf_strptr(elf, strtabidx, sym.st_name);
> +                       if (!sname)
> +                               continue;
> +
> +                       curr_bind = GELF_ST_BIND(sym.st_info);
> +
> +                       /* User can specify func, func@@LIB or func@@LIB_VERSION. */
> +                       if (strncmp(sname, name, name_len) != 0)
> +                               continue;
> +                       /* ...but we don't want a search for "foo" to match 'foo2" also, so any
> +                        * additional characters in sname should be of the form "@@LIB".
> +                        */
> +                       if (!is_name_qualified && strlen(sname) > name_len &&
> +                           sname[name_len] != '@')

recalculating strlen() isn't unnecessary. Might not be a big deal, but
you can as well write the same simply as

if (!is_name_qualified && sname[name_len] != '\0' && sname[name_len] != '@')
    continue;

right?

> +                               continue;
> +
> +                       if (ret >= 0) {
> +                               /* handle multiple matches */
> +                               if (last_bind != STB_WEAK && curr_bind != STB_WEAK) {
> +                                       /* Only accept one non-weak bind. */
> +                                       pr_warn("elf: ambiguous match for '%s': %s\n",
> +                                               sname, name);

add binary_path like in all the previous error messages?

> +                                       ret = -LIBBPF_ERRNO__FORMAT;
> +                                       goto out;
> +                               } else if (curr_bind == STB_WEAK) {
> +                                       /* already have a non-weak bind, and
> +                                        * this is a weak bind, so ignore.
> +                                        */
> +                                       continue;
> +                               }
> +                       }
> +                       ret = sym.st_value;
> +                       last_bind = curr_bind;
> +                       sym_idx = idx;
> +               }
> +               /* The index of the entry in SHT_DYNSYM helps find .plt entry */
> +               if (ret == 0 && sh_types[i] == SHT_DYNSYM)
> +                       ret = elf_find_plt_offset(elf, sym_idx);
> +               /* For binaries that are not shared libraries, we need relative offset */
> +               if (ret > 0 && !is_shared_lib)
> +                       ret = elf_find_relative_offset(elf, ret);
> +               if (ret > 0)
> +                       break;
> +       }
> +
> +       if (ret > 0) {
> +               pr_debug("elf: symbol address match for '%s': 0x%lx\n", name, ret);

forgot to log binary_path ;)

> +       } else {
> +               if (ret == 0) {
> +                       pr_warn("elf: '%s' is 0 in symtab for '%s': %s\n", name, binary_path,
> +                               is_shared_lib ? "should not be 0 in a shared library" :
> +                                               "try using shared library path instead");
> +                       ret = -ENOENT;
> +               } else {
> +                       pr_warn("elf: failed to find symbol '%s' in '%s'\n", name, binary_path);
> +               }
> +       }
> +out:
> +       elf_end(elf);
> +       close(fd);
> +       return ret;
> +}
> +
>  /* Get full path to program/shared library. */
>  static int resolve_full_path(const char *file, char *result, size_t result_sz)
>  {
> @@ -10371,6 +10667,7 @@ static int resolve_full_path(const char *file, char *result, size_t result_sz)
>         size_t ref_ctr_off;
>         int pfd, err;
>         bool retprobe, legacy;
> +       const char *func_name;
>
>         if (!OPTS_VALID(opts, bpf_uprobe_opts))
>                 return libbpf_err_ptr(-EINVAL);
> @@ -10387,6 +10684,19 @@ static int resolve_full_path(const char *file, char *result, size_t result_sz)
>                 }
>                 binary_path = full_binary_path;
>         }
> +       func_name = OPTS_GET(opts, func_name, NULL);
> +       if (func_name) {
> +               long sym_off;
> +
> +               if (!binary_path) {
> +                       pr_warn("name-based attach requires binary_path\n");
> +                       return libbpf_err_ptr(-EINVAL);
> +               }
> +               sym_off = elf_find_func_offset(binary_path, func_name);
> +               if (sym_off < 0)
> +                       return libbpf_err_ptr(sym_off);

no need to use libbpf_err_ptr()

> +               func_offset += (size_t)sym_off;

is cast needed?

> +       }
>
>         legacy = determine_uprobe_perf_type() < 0;
>         if (!legacy) {
> diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> index c1b0c2e..85c93e2 100644
> --- a/tools/lib/bpf/libbpf.h
> +++ b/tools/lib/bpf/libbpf.h
> @@ -436,9 +436,17 @@ struct bpf_uprobe_opts {
>         __u64 bpf_cookie;
>         /* uprobe is return probe, invoked at function return time */
>         bool retprobe;
> +       /* name of function name or function@@LIBRARY.  Partial matches

"name of function name" reads poorly. "Function name to attach to.
Could be unqualified ("abc") or library-qualified ("abc@LIBXYZ")
name." A bit more verbose, but should be easier to understand
acceptable inputs?


> +        * work for library functions, such as printf, printf@@GLIBC.
> +        * To specify function entry, func_offset argument should be 0 and
> +        * func_name should specify function to trace.  To trace an offset
> +        * within the function, specify func_name and use func_offset
> +        * argument to specify argument _within_ the function.
> +        */
> +       const char *func_name;
>         size_t :0;
>  };
> -#define bpf_uprobe_opts__last_field retprobe
> +#define bpf_uprobe_opts__last_field func_name
>
>  /**
>   * @brief **bpf_program__attach_uprobe()** attaches a BPF program
> --
> 1.8.3.1
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 bpf-next 3/5] libbpf: add auto-attach for uprobes based on section name
  2022-03-11 12:10 ` [PATCH v4 bpf-next 3/5] libbpf: add auto-attach for uprobes based on section name Alan Maguire
@ 2022-03-16  4:33   ` Andrii Nakryiko
  0 siblings, 0 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2022-03-16  4:33 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin Lau,
	Song Liu, Yonghong Song, john fastabend, KP Singh,
	Toke Høiland-Jørgensen, Yucong Sun, Networking, bpf

On Fri, Mar 11, 2022 at 4:11 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> Now that u[ret]probes can use name-based specification, it makes
> sense to add support for auto-attach based on SEC() definition.
> The format proposed is
>
>         SEC("u[ret]probe/prog:[raw_offset|[function_name[+offset]]")

nit: prog -> binary ? or prog -> path?

>
> For example, to trace malloc() in libc:
>
>         SEC("uprobe/libc.so.6:malloc")
>
> ...or to trace function foo2 in /usr/bin/foo:
>
>         SEC("uprobe//usr/bin/foo:foo2")
>
> Auto-attach is done for all tasks (pid -1).  prog can be an absolute
> path or simply a program/library name; in the latter case, we use
> PATH/LD_LIBRARY_PATH to resolve the full path, falling back to
> standard locations (/usr/bin:/usr/sbin or /usr/lib64:/usr/lib) if
> the file is not found via environment-variable specified locations.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  tools/lib/bpf/libbpf.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 66 insertions(+), 2 deletions(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 2b50b01..0dcbca8 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -8593,6 +8593,7 @@ int bpf_program__set_log_buf(struct bpf_program *prog, char *log_buf, size_t log
>  }
>
>  static int attach_kprobe(const struct bpf_program *prog, long cookie, struct bpf_link **link);
> +static int attach_uprobe(const struct bpf_program *prog, long cookie, struct bpf_link **link);
>  static int attach_tp(const struct bpf_program *prog, long cookie, struct bpf_link **link);
>  static int attach_raw_tp(const struct bpf_program *prog, long cookie, struct bpf_link **link);
>  static int attach_trace(const struct bpf_program *prog, long cookie, struct bpf_link **link);
> @@ -8604,9 +8605,9 @@ int bpf_program__set_log_buf(struct bpf_program *prog, char *log_buf, size_t log
>         SEC_DEF("sk_reuseport/migrate", SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT_OR_MIGRATE, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
>         SEC_DEF("sk_reuseport",         SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
>         SEC_DEF("kprobe/",              KPROBE, 0, SEC_NONE, attach_kprobe),
> -       SEC_DEF("uprobe/",              KPROBE, 0, SEC_NONE),
> +       SEC_DEF("uprobe/",              KPROBE, 0, SEC_NONE, attach_uprobe),
>         SEC_DEF("kretprobe/",           KPROBE, 0, SEC_NONE, attach_kprobe),
> -       SEC_DEF("uretprobe/",           KPROBE, 0, SEC_NONE),
> +       SEC_DEF("uretprobe/",           KPROBE, 0, SEC_NONE, attach_uprobe),
>         SEC_DEF("tc",                   SCHED_CLS, 0, SEC_NONE),
>         SEC_DEF("classifier",           SCHED_CLS, 0, SEC_NONE | SEC_SLOPPY_PFX | SEC_DEPRECATED),
>         SEC_DEF("action",               SCHED_ACT, 0, SEC_NONE | SEC_SLOPPY_PFX),
> @@ -10761,6 +10762,69 @@ struct bpf_link *bpf_program__attach_uprobe(const struct bpf_program *prog,
>         return bpf_program__attach_uprobe_opts(prog, pid, binary_path, func_offset, &opts);
>  }
>
> +/* Format of u[ret]probe section definition supporting auto-attach:
> + * u[ret]probe/prog:function[+offset]

same about prog

> + *
> + * prog can be an absolute/relative path or a filename; the latter is resolved to a
> + * full path via bpf_program__attach_uprobe_opts.
> + *
> + * Many uprobe programs do not avail of auto-attach, so we need to handle the

do not avail of? meaning "can't be auto-attached due to missing information"?

> + * case where the format is uprobe/myfunc by returning 0 with *link set to NULL
> + * to identify the case where auto-attach is not supported.

it's true that we supported SEC("uprobe/whatever") before and that's
not enough to auto-attach. But let's not drag this legacy "syntax"
forward. How about we check if LIBBPF_STRICT_SEC_NAME is set, and if
yes, then it should either be plain SEC("uprobe") or a proper full
form of SEC("uprobe/path:func...") that you are adding? libbpf
supports such case, you just need to change SEC_DEF definition to
uprobe/ -> uprobe+, which means that it is either SEC("uprobe") or
SEC("uprobe/<something>").

In legacy mode we just won't support auto-attach for uprobe.

Thoughts?

> + */
> +static int attach_uprobe(const struct bpf_program *prog, long cookie, struct bpf_link **link)
> +{
> +       DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, opts);
> +       char *func, *probe_name, *func_end;
> +       char *func_name, binary_path[512];
> +       unsigned long long raw_offset;
> +       size_t offset = 0;
> +       int n;
> +
> +       *link = NULL;
> +

[...]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 bpf-next 4/5] selftests/bpf: add tests for u[ret]probe attach by name
  2022-03-11 12:10 ` [PATCH v4 bpf-next 4/5] selftests/bpf: add tests for u[ret]probe attach by name Alan Maguire
@ 2022-03-16  4:33   ` Andrii Nakryiko
  0 siblings, 0 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2022-03-16  4:33 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin Lau,
	Song Liu, Yonghong Song, john fastabend, KP Singh,
	Toke Høiland-Jørgensen, Yucong Sun, Networking, bpf

On Fri, Mar 11, 2022 at 4:11 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> add tests that verify attaching by name for
>
> 1. local functions in a program
> 2. library functions in a shared object; and
> 3. library functions in a program
>
> ...succeed for uprobe and uretprobes using new "func_name"
> option for bpf_program__attach_uprobe_opts().  Also verify
> auto-attach works where uprobe, path to binary and function
> name are specified, but fails with -EOPNOTSUPP when the format
> does not match (the latter is to support backwards-compatibility).
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  .../selftests/bpf/prog_tests/attach_probe.c        | 89 ++++++++++++++++++----
>  .../selftests/bpf/progs/test_attach_probe.c        | 37 +++++++++
>  2 files changed, 113 insertions(+), 13 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/attach_probe.c b/tools/testing/selftests/bpf/prog_tests/attach_probe.c
> index d48f6e5..b770e0e 100644
> --- a/tools/testing/selftests/bpf/prog_tests/attach_probe.c
> +++ b/tools/testing/selftests/bpf/prog_tests/attach_probe.c
> @@ -11,15 +11,22 @@ static void trigger_func(void)
>         asm volatile ("");
>  }
>
> +/* attach point for byname uprobe */
> +static void trigger_func2(void)
> +{
> +       asm volatile ("");
> +}
> +
>  void test_attach_probe(void)
>  {
>         DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, uprobe_opts);
> -       int duration = 0;
>         struct bpf_link *kprobe_link, *kretprobe_link;
>         struct bpf_link *uprobe_link, *uretprobe_link;
>         struct test_attach_probe* skel;
>         ssize_t uprobe_offset, ref_ctr_offset;
> +       struct bpf_link *uprobe_err_link;
>         bool legacy;
> +       char *mem;
>
>         /* Check if new-style kprobe/uprobe API is supported.
>          * Kernels that support new FD-based kprobe and uprobe BPF attachment

[...]

> -       if (CHECK(skel->bss->uprobe_res != 3, "check_uprobe_res",
> -                 "wrong uprobe res: %d\n", skel->bss->uprobe_res))
> +       /* trigger & validate uprobe attached by name */
> +       trigger_func2();
> +
> +       if (!ASSERT_EQ(skel->bss->kprobe_res, 1, "check_kprobe_res"))
> +               goto cleanup;
> +       if (!ASSERT_EQ(skel->bss->kretprobe_res, 2, "check_kretprobe_res"))
> +               goto cleanup;
> +       if (!ASSERT_EQ(skel->bss->uprobe_res, 3, "check_uprobe_res"))
> +               goto cleanup;
> +       if (!ASSERT_EQ(skel->bss->uretprobe_res, 4, "check_uretprobe_res"))
> +               goto cleanup;
> +       if (!ASSERT_EQ(skel->bss->uprobe_byname_res, 5, "check_uprobe_byname_res"))
> +               goto cleanup;
> +       if (!ASSERT_EQ(skel->bss->uretprobe_byname_res, 6, "check_uretprobe_byname_res"))
> +               goto cleanup;
> +       if (!ASSERT_EQ(skel->bss->uprobe_byname2_res, 7, "check_uprobe_byname2_res"))
>                 goto cleanup;

no need for all those goto cleanup. Just do unconditional ASSERT_EQ(),
it doesn't hurt and is much cleaner.

> -       if (CHECK(skel->bss->uretprobe_res != 4, "check_uretprobe_res",
> -                 "wrong uretprobe res: %d\n", skel->bss->uretprobe_res))
> +       if (!ASSERT_EQ(skel->bss->uretprobe_byname2_res, 8, "check_uretprobe_byname2_res"))
>                 goto cleanup;
>
>  cleanup:

[...]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 bpf-next 5/5] selftests/bpf: add tests for uprobe auto-attach via skeleton
  2022-03-11 12:10 ` [PATCH v4 bpf-next 5/5] selftests/bpf: add tests for uprobe auto-attach via skeleton Alan Maguire
  2022-03-14 22:26   ` Daniel Borkmann
@ 2022-03-16  4:33   ` Andrii Nakryiko
  1 sibling, 0 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2022-03-16  4:33 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin Lau,
	Song Liu, Yonghong Song, john fastabend, KP Singh,
	Toke Høiland-Jørgensen, Yucong Sun, Networking, bpf

On Fri, Mar 11, 2022 at 4:11 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> tests that verify auto-attach works for function entry/return for
> local functions in program, library functions in program and library
> functions in library.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  .../selftests/bpf/prog_tests/uprobe_autoattach.c   | 48 +++++++++++++++
>  .../selftests/bpf/progs/test_uprobe_autoattach.c   | 69 ++++++++++++++++++++++
>  2 files changed, 117 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_uprobe_autoattach.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c b/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
> new file mode 100644
> index 0000000..57ed636
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
> @@ -0,0 +1,48 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2022, Oracle and/or its affiliates. */
> +
> +#include <test_progs.h>
> +#include "test_uprobe_autoattach.skel.h"
> +
> +/* uprobe attach point */
> +static void autoattach_trigger_func(void)
> +{
> +       asm volatile ("");
> +}
> +
> +void test_uprobe_autoattach(void)
> +{
> +       struct test_uprobe_autoattach *skel;
> +       char *mem;
> +
> +       skel = test_uprobe_autoattach__open_and_load();
> +       if (!ASSERT_OK_PTR(skel, "skel_open"))
> +               return;
> +       if (!ASSERT_OK_PTR(skel->bss, "check_bss"))
> +               goto cleanup;

no need to check skel->bss, a lot of tests will be broken if that doesn't work

> +
> +       if (!ASSERT_OK(test_uprobe_autoattach__attach(skel), "skel_attach"))
> +               goto cleanup;
> +
> +       /* trigger & validate uprobe & uretprobe */
> +       autoattach_trigger_func();
> +
> +       /* trigger & validate shared library u[ret]probes attached by name */
> +       mem = malloc(1);
> +       free(mem);
> +
> +       if (!ASSERT_EQ(skel->bss->uprobe_byname_res, 1, "check_uprobe_byname_res"))
> +               goto cleanup;
> +       if (!ASSERT_EQ(skel->bss->uretprobe_byname_res, 2, "check_uretprobe_byname_res"))
> +               goto cleanup;
> +       if (!ASSERT_EQ(skel->bss->uprobe_byname2_res, 3, "check_uprobe_byname2_res"))
> +               goto cleanup;
> +       if (!ASSERT_EQ(skel->bss->uretprobe_byname2_res, 4, "check_uretprobe_byname2_res"))
> +               goto cleanup;
> +       if (!ASSERT_EQ(skel->bss->uprobe_byname3_res, 5, "check_uprobe_byname3_res"))
> +               goto cleanup;
> +       if (!ASSERT_EQ(skel->bss->uretprobe_byname3_res, 6, "check_uretprobe_byname3_res"))
> +               goto cleanup;

same about goto cleanup, for such sequences of value checks it's nice
and succinct to just use unconditional sequence of ASSERT_EQ()s

> +cleanup:
> +       test_uprobe_autoattach__destroy(skel);
> +}

[...]

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-03-16  4:33 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-11 12:10 [PATCH v4 bpf-next 0/5] libbpf: name-based u[ret]probe attach Alan Maguire
2022-03-11 12:10 ` [PATCH v4 bpf-next 1/5] libbpf: bpf_program__attach_uprobe_opts() should determine paths for programs/libraries where possible Alan Maguire
2022-03-16  4:33   ` Andrii Nakryiko
2022-03-11 12:10 ` [PATCH v4 bpf-next 2/5] libbpf: support function name-based attach uprobes Alan Maguire
2022-03-16  4:33   ` Andrii Nakryiko
2022-03-11 12:10 ` [PATCH v4 bpf-next 3/5] libbpf: add auto-attach for uprobes based on section name Alan Maguire
2022-03-16  4:33   ` Andrii Nakryiko
2022-03-11 12:10 ` [PATCH v4 bpf-next 4/5] selftests/bpf: add tests for u[ret]probe attach by name Alan Maguire
2022-03-16  4:33   ` Andrii Nakryiko
2022-03-11 12:10 ` [PATCH v4 bpf-next 5/5] selftests/bpf: add tests for uprobe auto-attach via skeleton Alan Maguire
2022-03-14 22:26   ` Daniel Borkmann
2022-03-16  4:33   ` Andrii Nakryiko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.