All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 bpf-next 0/4] libbpf: name-based u[ret]probe attach
@ 2022-01-31 16:12 Alan Maguire
  2022-01-31 16:12 ` [PATCH v3 bpf-next 1/4] libbpf: support function name-based attach uprobes Alan Maguire
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Alan Maguire @ 2022-01-31 16:12 UTC (permalink / raw)
  To: andrii, ast, daniel
  Cc: kafai, songliubraving, yhs, john.fastabend, kpsingh, jolsa,
	sunyucong, netdev, bpf, Alan Maguire

This patch series is a refinement of the RFC patchset [1], focusing
on support for attach by name for uprobes and uretprobes. v3 
because there was an earlier RFC [2].

Currently attach for such probes is done by determining the offset
manually, so the aim is to try and mimic the simplicity of kprobe
attach, making use of uprobe opts to specify a name string.
Patch 1 adds the "func_name" option to allow uprobe attach by
name; the mechanics are described there.

Having name-based support allows us to support auto-attach for
uprobes; patch 2 adds auto-attach support while attempting
to handle backwards-compatibility issues that arise.  The format
supported is

u[ret]probe//path/2/binary:[raw_offset|function[+offset]]

For example, to attach to libc malloc:

SEC("uprobe//usr/lib64/libc.so.6:malloc")

Patch 3 introduces a helper function to trace_helpers, allowing
us to retrieve the path to a library by reading /proc/self/maps.

Finally patch 4 add tests to the attach_probe selftests covering
attach by name, auto-attach and auto-attach failure.

Changes since RFC [1]:
- used "long" for addresses instead of ssize_t (Andrii, patch 1).
- used gelf_ interfaces to avoid assumptions about 64-bit
  binaries (Andrii, patch 1)
- clarified string matching in symbol table lookups
  (Andrii, patch 1)
- added support for specification of shared object functions
  in a non-shared object binary.  This approach instruments
  the Procedure Linking Table (PLT) - malloc@PLT.
- changed logic in symbol search to check dynamic symbol table
  first, then fall back to symbol table (Andrii, patch 1).
- modified auto-attach string to require "/" separator prior
  to path prefix i.e. uprobe//path/to/binary (Andrii, patch 2)
- modified auto-attach string to use ':' separator (Andrii,
  patch 2)
- modified auto-attach to support raw offset (Andrii, patch 2)
- modified skeleton attach to interpret -ESRCH errors as
  a non-fatal "unable to auto-attach" (Andrii suggested
  -EOPNOTSUPP but my concern was it might collide with other
  instances where that value is returned and reflects a
  failure to attach a to-be-expected attachment rather than
  skip a program that does not present an auto-attachable
  section name. Admittedly -EOPNOTSUPP seems a more natural
  value here).
- moved library path retrieval code to trace_helpers (Andrii,
  patch 3)

[1] https://lore.kernel.org/bpf/1642678950-19584-1-git-send-email-alan.maguire@oracle.com/
[2] https://lore.kernel.org/bpf/1642004329-23514-1-git-send-email-alan.maguire@oracle.com/

Alan Maguire (4):
  libbpf: support function name-based attach uprobes
  libbpf: add auto-attach for uprobes based on section name
  selftests/bpf: add get_lib_path() helper
  selftests/bpf: add tests for u[ret]probe attach by name

 tools/lib/bpf/libbpf.c                             | 327 ++++++++++++++++++++-
 tools/lib/bpf/libbpf.h                             |  10 +-
 .../selftests/bpf/prog_tests/attach_probe.c        |  89 +++++-
 .../selftests/bpf/progs/test_attach_probe.c        |  37 +++
 tools/testing/selftests/bpf/trace_helpers.c        |  17 ++
 tools/testing/selftests/bpf/trace_helpers.h        |   2 +
 6 files changed, 475 insertions(+), 7 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3 bpf-next 1/4] libbpf: support function name-based attach uprobes
  2022-01-31 16:12 [PATCH v3 bpf-next 0/4] libbpf: name-based u[ret]probe attach Alan Maguire
@ 2022-01-31 16:12 ` Alan Maguire
  2022-02-04 19:17   ` Andrii Nakryiko
  2022-01-31 16:12 ` [PATCH v3 bpf-next 2/4] libbpf: add auto-attach for uprobes based on section name Alan Maguire
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Alan Maguire @ 2022-01-31 16:12 UTC (permalink / raw)
  To: andrii, ast, daniel
  Cc: kafai, songliubraving, yhs, john.fastabend, kpsingh, jolsa,
	sunyucong, netdev, bpf, Alan Maguire

kprobe attach is name-based, using lookups of kallsyms to translate
a function name to an address.  Currently uprobe attach is done
via an offset value as described in [1].  Extend uprobe opts
for attach to include a function name which can then be converted
into a uprobe-friendly offset.  The calcualation is done in
several steps:

1. First, determine the symbol address using libelf; this gives us
   the offset as reported by objdump; then, in the case of local
   functions
2. If the function is a shared library function - and the binary
   provided is a shared library - no further work is required;
   the address found is the required address
3. If the function is a shared library function in a program
   (as opposed to a shared library), the Procedure Linking Table
   (PLT) table address is found (it is indexed via the dynamic
   symbol table index).  This allows us to instrument a call
   to a shared library function locally in the calling binary,
   reducing overhead versus having a breakpoint in global lib.
4. Finally, if the function is local, subtract the base address
   associated with the object, retrieved from ELF program headers.

The resultant value is then added to the func_offset value passed
in to specify the uprobe attach address.  So specifying a func_offset
of 0 along with a function name "printf" will attach to printf entry.

The modes of operation supported are then

1. to attach to a local function in a binary; function "foo1" in
   "/usr/bin/foo"
2. to attach to a shared library function in a binary;
   function "malloc" in "/usr/bin/foo"
3. to attach to a shared library function in a shared library -
   function "malloc" in libc.

[1] https://www.kernel.org/doc/html/latest/trace/uprobetracer.html

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 tools/lib/bpf/libbpf.c | 250 +++++++++++++++++++++++++++++++++++++++++++++++++
 tools/lib/bpf/libbpf.h |  10 +-
 2 files changed, 259 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 4ce94f4..eb95629 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -10203,6 +10203,241 @@ static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
 	return pfd;
 }
 
+/* uprobes deal in relative offsets; subtract the base address associated with
+ * the mapped binary.  See Documentation/trace/uprobetracer.rst for more
+ * details.
+ */
+static long elf_find_relative_offset(Elf *elf,  long addr)
+{
+	size_t n;
+	int i;
+
+	if (elf_getphdrnum(elf, &n)) {
+		pr_warn("elf: failed to find program headers: %s\n",
+			elf_errmsg(-1));
+		return -ENOENT;
+	}
+
+	for (i = 0; i < n; i++) {
+		int seg_start, seg_end, seg_offset;
+		GElf_Phdr phdr;
+
+		if (!gelf_getphdr(elf, i, &phdr)) {
+			pr_warn("elf: failed to get program header %d: %s\n",
+				i, elf_errmsg(-1));
+			return -ENOENT;
+		}
+		if (phdr.p_type != PT_LOAD || !(phdr.p_flags & PF_X))
+			continue;
+
+		seg_start = phdr.p_vaddr;
+		seg_end = seg_start + phdr.p_memsz;
+		seg_offset = phdr.p_offset;
+		if (addr >= seg_start && addr < seg_end)
+			return addr -  seg_start + seg_offset;
+	}
+	pr_warn("elf: failed to find prog header containing 0x%lx\n", addr);
+	return -ENOENT;
+}
+
+/* Return next ELF section of sh_type after scn, or first of that type
+ * if scn is NULL.
+ */
+static Elf_Scn *elf_find_next_scn_by_type(Elf *elf, int sh_type, Elf_Scn *scn)
+{
+	while ((scn = elf_nextscn(elf, scn)) != NULL) {
+		GElf_Shdr sh;
+
+		if (!gelf_getshdr(scn, &sh))
+			continue;
+		if (sh.sh_type == sh_type)
+			break;
+	}
+	return scn;
+}
+
+/* For Position-Independent Code-based libraries, a table of trampolines
+ * (Procedure Linking Table) is used to support resolution of symbol
+ * names at linking time.  The goal here is to find the offset associated
+ * with the jump to the actual library function.  If we can instrument that
+ * locally in the specific binary (rather than instrumenting glibc say),
+ * overheads are greatly reduced.
+ *
+ * The method used is to find the .plt section and determine the offset
+ * of the relevant entry (given by the base address plus the index
+ * of the function multiplied by the size of a .plt entry).
+ */
+static ssize_t elf_find_plt_offset(Elf *elf, size_t ndx)
+{
+	Elf_Scn *scn = NULL;
+	size_t shstrndx;
+
+	if (elf_getshdrstrndx(elf, &shstrndx)) {
+		pr_debug("elf: failed to get section names section index: %s\n",
+			 elf_errmsg(-1));
+		return -LIBBPF_ERRNO__FORMAT;
+	}
+	while ((scn = elf_find_next_scn_by_type(elf, SHT_PROGBITS, scn))) {
+		long plt_entry_sz, plt_base;
+		const char *name;
+		GElf_Shdr sh;
+
+		if (!gelf_getshdr(scn, &sh))
+			continue;
+		name = elf_strptr(elf, shstrndx, sh.sh_name);
+		if (!name || strcmp(name, ".plt") != 0)
+			continue;
+		plt_base = sh.sh_addr;
+		plt_entry_sz = sh.sh_entsize;
+		return plt_base + (ndx * plt_entry_sz);
+	}
+	pr_debug("elf: no .plt section found\n");
+	return -LIBBPF_ERRNO__FORMAT;
+}
+
+/* Find offset of function name in object specified by path.  "name" matches
+ * symbol name or name@@LIB for library functions.
+ */
+static long elf_find_func_offset(const char *binary_path, const char *name)
+{
+	int fd, i, sh_types[2] = { SHT_DYNSYM, SHT_SYMTAB };
+	bool is_shared_lib, is_name_qualified;
+	size_t name_len, sym_ndx = -1;
+	char errmsg[STRERR_BUFSIZE];
+	long ret = -ENOENT;
+	GElf_Ehdr ehdr;
+	Elf *elf;
+
+	fd = open(binary_path, O_RDONLY | O_CLOEXEC);
+	if (fd < 0) {
+		ret = -errno;
+		pr_warn("failed to open %s: %s\n", binary_path,
+			libbpf_strerror_r(ret, errmsg, sizeof(errmsg)));
+		return ret;
+	}
+	elf = elf_begin(fd, ELF_C_READ_MMAP, NULL);
+	if (!elf) {
+		pr_warn("elf: could not read elf from %s: %s\n", binary_path, elf_errmsg(-1));
+		close(fd);
+		return -LIBBPF_ERRNO__FORMAT;
+	}
+	if (!gelf_getehdr(elf, &ehdr)) {
+		pr_warn("elf: failed to get ehdr from %s: %s\n", binary_path, elf_errmsg(-1));
+		ret = -LIBBPF_ERRNO__FORMAT;
+		goto out;
+	}
+	/* for shared lib case, we do not need to calculate relative offset */
+	is_shared_lib = ehdr.e_type == ET_DYN;
+
+	name_len = strlen(name);
+	/* Does name specify "@@LIB"? */
+	is_name_qualified = strstr(name, "@@") != NULL;
+
+	/* Search SHT_DYNSYM, SHT_SYMTAB for symbol.  This search order is used because if
+	 * the symbol is found in SHY_DYNSYM, the index in that table tells us which index
+	 * to use in the Procedure Linking Table to instrument calls to the shared library
+	 * function, but locally in the binary rather than in the shared library ifself.
+	 * If a binary is stripped, it may also only have SHT_DYNSYM, and a fully-statically
+	 * linked binary may not have SHT_DYMSYM, so absence of a section should not be
+	 * reported as a warning/error.
+	 */
+	for (i = 0; i < ARRAY_SIZE(sh_types); i++) {
+		size_t strtabidx, ndx, nr_syms;
+		Elf_Data *symbols = NULL;
+		Elf_Scn *scn = NULL;
+		int last_bind = -1;
+		const char *sname;
+		GElf_Shdr sh;
+
+		scn = elf_find_next_scn_by_type(elf, sh_types[i], NULL);
+		if (!scn) {
+			pr_debug("elf: failed to find symbol table ELF sections in %s\n",
+				 binary_path);
+			continue;
+		}
+		if (!gelf_getshdr(scn, &sh))
+			continue;
+		strtabidx = sh.sh_link;
+		symbols = elf_getdata(scn, 0);
+		if (!symbols) {
+			pr_warn("elf: failed to get symbols for symtab section in %s: %s\n",
+				binary_path, elf_errmsg(-1));
+			ret = -LIBBPF_ERRNO__FORMAT;
+			goto out;
+		}
+		nr_syms = symbols->d_size / sh.sh_entsize;
+
+		for (ndx = 0; ndx < nr_syms; ndx++) {
+			int curr_bind;
+			GElf_Sym sym;
+
+			if (!gelf_getsym(symbols, ndx, &sym))
+				continue;
+			if (GELF_ST_TYPE(sym.st_info) != STT_FUNC)
+				continue;
+
+			sname = elf_strptr(elf, strtabidx, sym.st_name);
+			if (!sname)
+				continue;
+			curr_bind = GELF_ST_BIND(sym.st_info);
+
+			/* User can specify func, func@@LIB or func@@LIB_VERSION. */
+			if (strncmp(sname, name, name_len) != 0)
+				continue;
+			/* ...but we don't want a search for "foo" to match 'foo2" also, so any
+			 * additional characters in sname should be of the form "@@LIB".
+			 */
+			if (!is_name_qualified && strlen(sname) > name_len &&
+			    sname[name_len] != '@')
+				continue;
+
+			if (ret >= 0 && last_bind != -1) {
+				/* handle multiple matches */
+				if (last_bind != STB_WEAK && curr_bind != STB_WEAK) {
+					/* Only accept one non-weak bind. */
+					pr_warn("elf: ambiguous match for '%s': %s\n",
+						sname, name);
+					ret = -LIBBPF_ERRNO__FORMAT;
+					goto out;
+				} else if (curr_bind == STB_WEAK) {
+					/* already have a non-weak bind, and
+					 * this is a weak bind, so ignore.
+					 */
+					continue;
+				}
+			}
+			ret = sym.st_value;
+			last_bind = curr_bind;
+			sym_ndx = ndx;
+		}
+		/* The index of the entry in SHT_DYNSYM gives us the index into the PLT */
+		if (ret == 0 && sh_types[i] == SHT_DYNSYM)
+			ret = elf_find_plt_offset(elf, sym_ndx);
+		/* For binaries that are not shared libraries, we need relative offset */
+		if (ret > 0 && !is_shared_lib)
+			ret = elf_find_relative_offset(elf, ret);
+		if (ret > 0)
+			break;
+	}
+
+	if (ret > 0) {
+		pr_debug("elf: symbol address match for '%s': 0x%lx\n", name, ret);
+	} else {
+		if (ret == 0) {
+			pr_warn("elf: '%s' is 0 in symtab for '%s': %s\n", name, binary_path,
+				is_shared_lib ? "should not be 0 in a shared library" :
+						"try using shared library path instead");
+			ret = -ENOENT;
+		} else {
+			pr_warn("elf: failed to find symbol '%s' in '%s'\n", name, binary_path);
+		}
+	}
+out:
+	elf_end(elf);
+	close(fd);
+	return ret;
+}
+
 LIBBPF_API struct bpf_link *
 bpf_program__attach_uprobe_opts(const struct bpf_program *prog, pid_t pid,
 				const char *binary_path, size_t func_offset,
@@ -10214,6 +10449,7 @@ static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
 	size_t ref_ctr_off;
 	int pfd, err;
 	bool retprobe, legacy;
+	const char *func_name;
 
 	if (!OPTS_VALID(opts, bpf_uprobe_opts))
 		return libbpf_err_ptr(-EINVAL);
@@ -10222,6 +10458,20 @@ static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
 	ref_ctr_off = OPTS_GET(opts, ref_ctr_offset, 0);
 	pe_opts.bpf_cookie = OPTS_GET(opts, bpf_cookie, 0);
 
+	func_name = OPTS_GET(opts, func_name, NULL);
+	if (func_name) {
+		long sym_off;
+
+		if (!binary_path) {
+			pr_warn("name-based attach requires binary_path\n");
+			return libbpf_err_ptr(-EINVAL);
+		}
+		sym_off = elf_find_func_offset(binary_path, func_name);
+		if (sym_off < 0)
+			return libbpf_err_ptr(sym_off);
+		func_offset += (size_t)sym_off;
+	}
+
 	legacy = determine_uprobe_perf_type() < 0;
 	if (!legacy) {
 		pfd = perf_event_open_probe(true /* uprobe */, retprobe, binary_path,
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 5762b57..1de3eeb 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -433,9 +433,17 @@ struct bpf_uprobe_opts {
 	__u64 bpf_cookie;
 	/* uprobe is return probe, invoked at function return time */
 	bool retprobe;
+	/* name of function name or function@@LIBRARY.  Partial matches
+	 * work for library functions, such as printf, printf@@GLIBC.
+	 * To specify function entry, func_offset argument should be 0 and
+	 * func_name should specify function to trace.  To trace an offset
+	 * within the function, specify func_name and use func_offset
+	 * argument to specify argument _within_ the function.
+	 */
+	const char *func_name;
 	size_t :0;
 };
-#define bpf_uprobe_opts__last_field retprobe
+#define bpf_uprobe_opts__last_field func_name
 
 /**
  * @brief **bpf_program__attach_uprobe()** attaches a BPF program
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 bpf-next 2/4] libbpf: add auto-attach for uprobes based on section name
  2022-01-31 16:12 [PATCH v3 bpf-next 0/4] libbpf: name-based u[ret]probe attach Alan Maguire
  2022-01-31 16:12 ` [PATCH v3 bpf-next 1/4] libbpf: support function name-based attach uprobes Alan Maguire
@ 2022-01-31 16:12 ` Alan Maguire
  2022-02-04 19:22   ` Andrii Nakryiko
  2022-01-31 16:12 ` [PATCH v3 bpf-next 3/4] selftests/bpf: add get_lib_path() helper Alan Maguire
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Alan Maguire @ 2022-01-31 16:12 UTC (permalink / raw)
  To: andrii, ast, daniel
  Cc: kafai, songliubraving, yhs, john.fastabend, kpsingh, jolsa,
	sunyucong, netdev, bpf, Alan Maguire

Now that u[ret]probes can use name-based specification, it makes
sense to add support for auto-attach based on SEC() definition.
The format proposed is

	SEC("u[ret]probe//path/to/prog:[raw_offset|[function_name[+offset]]")

For example, to trace malloc() in libc:

        SEC("uprobe//usr/lib64/libc.so.6:malloc")

Auto-attach is done for all tasks (pid -1).

Note that there is a backwards-compatibility issue here.  Consider a BPF
object consisting of a set of BPF programs, including a uprobe program.
Because uprobes did not previously support auto-attach, it's possible that
because the uprobe section name is not in auto-attachable form, overall
BPF skeleton attach would now fail due to the failure of the uprobe program
to auto-attach.  So we need to handle the case of auto-attach failure where
the form of the section name is not suitable for auto-attach without
a complete attach failure.  On surveying the code, bpf_program__attach()
already returns -ESRCH in cases where no auto-attach function is
supplied, so for consistency with that - and because that return value
is less likely to collide with actual attach failures than -EOPNOTSUPP -
it is used as the attach function return value signalling auto-attach
is not possible.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 tools/lib/bpf/libbpf.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 74 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index eb95629..e2b4415 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -8581,6 +8581,7 @@ int bpf_program__set_log_buf(struct bpf_program *prog, char *log_buf, size_t log
 }
 
 static struct bpf_link *attach_kprobe(const struct bpf_program *prog, long cookie);
+static struct bpf_link *attach_uprobe(const struct bpf_program *prog, long cookie);
 static struct bpf_link *attach_tp(const struct bpf_program *prog, long cookie);
 static struct bpf_link *attach_raw_tp(const struct bpf_program *prog, long cookie);
 static struct bpf_link *attach_trace(const struct bpf_program *prog, long cookie);
@@ -8592,9 +8593,9 @@ int bpf_program__set_log_buf(struct bpf_program *prog, char *log_buf, size_t log
 	SEC_DEF("sk_reuseport/migrate",	SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT_OR_MIGRATE, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
 	SEC_DEF("sk_reuseport",		SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
 	SEC_DEF("kprobe/",		KPROBE,	0, SEC_NONE, attach_kprobe),
-	SEC_DEF("uprobe/",		KPROBE,	0, SEC_NONE),
+	SEC_DEF("uprobe/",		KPROBE, 0, SEC_NONE, attach_uprobe),
 	SEC_DEF("kretprobe/",		KPROBE, 0, SEC_NONE, attach_kprobe),
-	SEC_DEF("uretprobe/",		KPROBE, 0, SEC_NONE),
+	SEC_DEF("uretprobe/",		KPROBE, 0, SEC_NONE, attach_uprobe),
 	SEC_DEF("tc",			SCHED_CLS, 0, SEC_NONE),
 	SEC_DEF("classifier",		SCHED_CLS, 0, SEC_NONE | SEC_SLOPPY_PFX),
 	SEC_DEF("action",		SCHED_ACT, 0, SEC_NONE | SEC_SLOPPY_PFX),
@@ -10525,6 +10526,64 @@ static long elf_find_func_offset(const char *binary_path, const char *name)
 
 }
 
+/* Format of u[ret]probe section definition supporting auto-attach:
+ * u[ret]probe//path/to/prog:function[+offset]
+ *
+ * Many uprobe programs do not avail of auto-attach, so we need to handle the
+ * case where the format is uprobe/myfunc by returning NULL rather than an
+ * error.
+ */
+static struct bpf_link *attach_uprobe(const struct bpf_program *prog, long cookie)
+{
+	DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, opts);
+	char *func_name, binary_path[512];
+	unsigned int raw_offset;
+	char *func, *probe_name;
+	struct bpf_link *link;
+	size_t offset = 0;
+	int n, err;
+
+	opts.retprobe = str_has_pfx(prog->sec_name, "uretprobe/");
+	if (opts.retprobe)
+		probe_name = prog->sec_name + sizeof("uretprobe/") - 1;
+	else
+		probe_name = prog->sec_name + sizeof("uprobe/") - 1;
+
+	snprintf(binary_path, sizeof(binary_path), "%s", probe_name);
+	/* ':' should be prior to function+offset */
+	func_name = strrchr(binary_path, ':');
+	if (!func_name) {
+		pr_debug("section '%s' is old-style u[ret]probe/function, cannot auto-attach\n",
+			 prog->sec_name);
+		return libbpf_err_ptr(-ESRCH);
+	}
+	func_name[0] = '\0';
+	func_name++;
+	n = sscanf(func_name, "%m[a-zA-Z0-9_.]+%li", &func, &offset);
+	if (n < 1) {
+		err = -EINVAL;
+		pr_warn("uprobe name is invalid: %s\n", func_name);
+		return libbpf_err_ptr(err);
+	}
+	/* Is func a raw address? */
+	if (n == 1 && sscanf(func, "%x", &raw_offset) == 1) {
+		free(func);
+		func = NULL;
+		offset = (size_t)raw_offset;
+	}
+	if (opts.retprobe && offset != 0) {
+		free(func);
+		err = -EINVAL;
+		pr_warn("uretprobes do not support offset specification\n");
+		return libbpf_err_ptr(err);
+	}
+	opts.func_name = func;
+
+	link = bpf_program__attach_uprobe_opts(prog, -1, binary_path, offset, &opts);
+	free(func);
+	return link;
+}
+
 struct bpf_link *bpf_program__attach_uprobe(const struct bpf_program *prog,
 					    bool retprobe, pid_t pid,
 					    const char *binary_path,
@@ -12041,7 +12100,19 @@ int bpf_object__attach_skeleton(struct bpf_object_skeleton *s)
 
 		*link = bpf_program__attach(prog);
 		err = libbpf_get_error(*link);
-		if (err) {
+		switch (err) {
+		case 0:
+			break;
+		case -ESRCH:
+			/* -ESRCH is used as it is less likely to collide with other error
+			 * cases in program attach while being consistent with the value
+			 * returned by bpf_program__attach() where no auto-attach function
+			 * is provided.
+			 */
+			pr_warn("auto-attach not supported for program '%s'\n",
+				bpf_program__name(prog));
+			break;
+		default:
 			pr_warn("failed to auto-attach program '%s': %d\n",
 				bpf_program__name(prog), err);
 			return libbpf_err(err);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 bpf-next 3/4] selftests/bpf: add get_lib_path() helper
  2022-01-31 16:12 [PATCH v3 bpf-next 0/4] libbpf: name-based u[ret]probe attach Alan Maguire
  2022-01-31 16:12 ` [PATCH v3 bpf-next 1/4] libbpf: support function name-based attach uprobes Alan Maguire
  2022-01-31 16:12 ` [PATCH v3 bpf-next 2/4] libbpf: add auto-attach for uprobes based on section name Alan Maguire
@ 2022-01-31 16:12 ` Alan Maguire
  2022-02-04 19:24   ` Andrii Nakryiko
  2022-01-31 16:12 ` [PATCH v3 bpf-next 4/4] selftests/bpf: add tests for u[ret]probe attach by name Alan Maguire
  2022-02-01  8:52 ` [PATCH v3 bpf-next 0/4] libbpf: name-based u[ret]probe attach Daniel Borkmann
  4 siblings, 1 reply; 16+ messages in thread
From: Alan Maguire @ 2022-01-31 16:12 UTC (permalink / raw)
  To: andrii, ast, daniel
  Cc: kafai, songliubraving, yhs, john.fastabend, kpsingh, jolsa,
	sunyucong, netdev, bpf, Alan Maguire

get_lib_path(path_substr) returns full path to a library
containing path_substr (such as "libc-") found via
/proc/self/maps.  Caller is responsible for freeing
the returned string.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 tools/testing/selftests/bpf/trace_helpers.c | 17 +++++++++++++++++
 tools/testing/selftests/bpf/trace_helpers.h |  2 ++
 2 files changed, 19 insertions(+)

diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
index ca6abae..49e5f0d 100644
--- a/tools/testing/selftests/bpf/trace_helpers.c
+++ b/tools/testing/selftests/bpf/trace_helpers.c
@@ -216,3 +216,20 @@ ssize_t get_rel_offset(uintptr_t addr)
 	fclose(f);
 	return -EINVAL;
 }
+
+char *get_lib_path(const char *path_substr)
+{
+	char *found = NULL;
+	char lib_path[512];
+	FILE *f;
+
+	f = fopen("/proc/self/maps", "r");
+	while (fscanf(f, "%*s %*s %*s %*s %*s %[^\n]", lib_path) == 1) {
+		if (strstr(lib_path, path_substr) == NULL)
+			continue;
+		found = strdup(lib_path);
+		break;
+	}
+	fclose(f);
+	return found;
+}
diff --git a/tools/testing/selftests/bpf/trace_helpers.h b/tools/testing/selftests/bpf/trace_helpers.h
index 238a9c9..ff379f6 100644
--- a/tools/testing/selftests/bpf/trace_helpers.h
+++ b/tools/testing/selftests/bpf/trace_helpers.h
@@ -20,5 +20,7 @@ struct ksym {
 
 ssize_t get_uprobe_offset(const void *addr);
 ssize_t get_rel_offset(uintptr_t addr);
+/* Return allocated string path to library that contains path_substr. */
+char *get_lib_path(const char *path_substr);
 
 #endif
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 bpf-next 4/4] selftests/bpf: add tests for u[ret]probe attach by name
  2022-01-31 16:12 [PATCH v3 bpf-next 0/4] libbpf: name-based u[ret]probe attach Alan Maguire
                   ` (2 preceding siblings ...)
  2022-01-31 16:12 ` [PATCH v3 bpf-next 3/4] selftests/bpf: add get_lib_path() helper Alan Maguire
@ 2022-01-31 16:12 ` Alan Maguire
  2022-02-04 19:30   ` Andrii Nakryiko
  2022-02-26 16:00   ` Jiri Olsa
  2022-02-01  8:52 ` [PATCH v3 bpf-next 0/4] libbpf: name-based u[ret]probe attach Daniel Borkmann
  4 siblings, 2 replies; 16+ messages in thread
From: Alan Maguire @ 2022-01-31 16:12 UTC (permalink / raw)
  To: andrii, ast, daniel
  Cc: kafai, songliubraving, yhs, john.fastabend, kpsingh, jolsa,
	sunyucong, netdev, bpf, Alan Maguire

add tests that verify attaching by name for

1. local functions in a program
2. library functions in a shared object; and
3. library functions in a program

...succeed for uprobe and uretprobes using new "func_name"
option for bpf_program__attach_uprobe_opts().  Also verify
auto-attach works where uprobe, path to binary and function
name are specified, but fails with -ESRCH when the format
does not match (the latter is to support backwards-compatibility).

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 .../selftests/bpf/prog_tests/attach_probe.c        | 89 +++++++++++++++++++++-
 .../selftests/bpf/progs/test_attach_probe.c        | 37 +++++++++
 2 files changed, 123 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/attach_probe.c b/tools/testing/selftests/bpf/prog_tests/attach_probe.c
index d48f6e5..127c347 100644
--- a/tools/testing/selftests/bpf/prog_tests/attach_probe.c
+++ b/tools/testing/selftests/bpf/prog_tests/attach_probe.c
@@ -11,6 +11,12 @@ static void trigger_func(void)
 	asm volatile ("");
 }
 
+/* attach point for byname uprobe */
+static void trigger_func2(void)
+{
+	asm volatile ("");
+}
+
 void test_attach_probe(void)
 {
 	DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, uprobe_opts);
@@ -19,7 +25,10 @@ void test_attach_probe(void)
 	struct bpf_link *uprobe_link, *uretprobe_link;
 	struct test_attach_probe* skel;
 	ssize_t uprobe_offset, ref_ctr_offset;
+	struct bpf_link *uprobe_err_link;
+	char *libc_path;
 	bool legacy;
+	char *mem;
 
 	/* Check if new-style kprobe/uprobe API is supported.
 	 * Kernels that support new FD-based kprobe and uprobe BPF attachment
@@ -90,9 +99,72 @@ void test_attach_probe(void)
 		goto cleanup;
 	skel->links.handle_uretprobe = uretprobe_link;
 
+	/* verify auto-attach fails for old-style uprobe definition */
+	uprobe_err_link = bpf_program__attach(skel->progs.handle_uprobe_byname);
+	if (!ASSERT_EQ(libbpf_get_error(uprobe_err_link), -ESRCH,
+		       "auto-attach should fail for old-style name"))
+		goto cleanup;
+
+	uprobe_opts.func_name = "trigger_func2";
+	uprobe_opts.retprobe = false;
+	uprobe_opts.ref_ctr_offset = 0;
+	skel->links.handle_uprobe_byname =
+			bpf_program__attach_uprobe_opts(skel->progs.handle_uprobe_byname,
+							0 /* this pid */,
+							"/proc/self/exe",
+							0, &uprobe_opts);
+	if (!ASSERT_OK_PTR(skel->links.handle_uprobe_byname, "attach_uprobe_byname"))
+		goto cleanup;
+
+	/* verify auto-attach works */
+	skel->links.handle_uretprobe_byname =
+			bpf_program__attach(skel->progs.handle_uretprobe_byname);
+	if (!ASSERT_OK_PTR(skel->links.handle_uretprobe_byname, "attach_uretprobe_byname"))
+		goto cleanup;
+
+	/* test attach by name for a library function, using the library
+	 * as the binary argument.  To do this, find path to libc used
+	 * by test_progs via /proc/self/maps.
+	 */
+	libc_path = get_lib_path("libc-");
+	if (!ASSERT_OK_PTR(libc_path, "get path to libc"))
+		goto cleanup;
+	if (!ASSERT_NEQ(strstr(libc_path, "libc-"), NULL, "find libc path in /proc/self/maps"))
+		goto cleanup;
+
+	uprobe_opts.func_name = "malloc";
+	uprobe_opts.retprobe = false;
+	skel->links.handle_uprobe_byname2 =
+			bpf_program__attach_uprobe_opts(skel->progs.handle_uprobe_byname2,
+							0 /* this pid */,
+							libc_path,
+							0, &uprobe_opts);
+	if (!ASSERT_OK_PTR(skel->links.handle_uprobe_byname2, "attach_uprobe_byname2"))
+		goto cleanup;
+
+	uprobe_opts.func_name = "free";
+	uprobe_opts.retprobe = true;
+	skel->links.handle_uretprobe_byname2 =
+			bpf_program__attach_uprobe_opts(skel->progs.handle_uretprobe_byname2,
+							-1 /* any pid */,
+							"/proc/self/exe",
+							0, &uprobe_opts);
+	if (!ASSERT_OK_PTR(skel->links.handle_uretprobe_byname2, "attach_uretprobe_byname2"))
+		goto cleanup;
+
 	/* trigger & validate kprobe && kretprobe */
 	usleep(1);
 
+	/* trigger & validate shared library u[ret]probes attached by name */
+	mem = malloc(1);
+	free(mem);
+
+	/* trigger & validate uprobe & uretprobe */
+	trigger_func();
+
+	/* trigger & validate uprobe attached by name */
+	trigger_func2();
+
 	if (CHECK(skel->bss->kprobe_res != 1, "check_kprobe_res",
 		  "wrong kprobe res: %d\n", skel->bss->kprobe_res))
 		goto cleanup;
@@ -100,9 +172,6 @@ void test_attach_probe(void)
 		  "wrong kretprobe res: %d\n", skel->bss->kretprobe_res))
 		goto cleanup;
 
-	/* trigger & validate uprobe & uretprobe */
-	trigger_func();
-
 	if (CHECK(skel->bss->uprobe_res != 3, "check_uprobe_res",
 		  "wrong uprobe res: %d\n", skel->bss->uprobe_res))
 		goto cleanup;
@@ -110,7 +179,21 @@ void test_attach_probe(void)
 		  "wrong uretprobe res: %d\n", skel->bss->uretprobe_res))
 		goto cleanup;
 
+	if (CHECK(skel->bss->uprobe_byname_res != 5, "check_uprobe_byname_res",
+		  "wrong uprobe byname res: %d\n", skel->bss->uprobe_byname_res))
+		goto cleanup;
+	if (CHECK(skel->bss->uretprobe_byname_res != 6, "check_uretprobe_byname_res",
+		  "wrong uretprobe byname res: %d\n", skel->bss->uretprobe_byname_res))
+		goto cleanup;
+	if (CHECK(skel->bss->uprobe_byname2_res != 7, "check_uprobe_byname2_res",
+		  "wrong uprobe byname2 res: %d\n", skel->bss->uprobe_byname2_res))
+		goto cleanup;
+	if (CHECK(skel->bss->uretprobe_byname2_res != 8, "check_uretprobe_byname2_res",
+		  "wrong uretprobe byname2 res: %d\n", skel->bss->uretprobe_byname2_res))
+		goto cleanup;
+
 cleanup:
+	free(libc_path);
 	test_attach_probe__destroy(skel);
 	ASSERT_EQ(uprobe_ref_ctr, 0, "uprobe_ref_ctr_cleanup");
 }
diff --git a/tools/testing/selftests/bpf/progs/test_attach_probe.c b/tools/testing/selftests/bpf/progs/test_attach_probe.c
index 8056a4c..9942461c 100644
--- a/tools/testing/selftests/bpf/progs/test_attach_probe.c
+++ b/tools/testing/selftests/bpf/progs/test_attach_probe.c
@@ -10,6 +10,10 @@
 int kretprobe_res = 0;
 int uprobe_res = 0;
 int uretprobe_res = 0;
+int uprobe_byname_res = 0;
+int uretprobe_byname_res = 0;
+int uprobe_byname2_res = 0;
+int uretprobe_byname2_res = 0;
 
 SEC("kprobe/sys_nanosleep")
 int handle_kprobe(struct pt_regs *ctx)
@@ -39,4 +43,37 @@ int handle_uretprobe(struct pt_regs *ctx)
 	return 0;
 }
 
+SEC("uprobe/trigger_func_byname")
+int handle_uprobe_byname(struct pt_regs *ctx)
+{
+	uprobe_byname_res = 5;
+	return 0;
+}
+
+/* use auto-attach format for section definition. */
+SEC("uretprobe//proc/self/exe:trigger_func2")
+int handle_uretprobe_byname(struct pt_regs *ctx)
+{
+	uretprobe_byname_res = 6;
+	return 0;
+}
+
+SEC("uprobe/trigger_func_byname2")
+int handle_uprobe_byname2(struct pt_regs *ctx)
+{
+	unsigned int size = PT_REGS_PARM1(ctx);
+
+	/* verify malloc size */
+	if (size == 1)
+		uprobe_byname2_res = 7;
+	return 0;
+}
+
+SEC("uretprobe/trigger_func_byname2")
+int handle_uretprobe_byname2(struct pt_regs *ctx)
+{
+	uretprobe_byname2_res = 8;
+	return 0;
+}
+
 char _license[] SEC("license") = "GPL";
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 bpf-next 0/4] libbpf: name-based u[ret]probe attach
  2022-01-31 16:12 [PATCH v3 bpf-next 0/4] libbpf: name-based u[ret]probe attach Alan Maguire
                   ` (3 preceding siblings ...)
  2022-01-31 16:12 ` [PATCH v3 bpf-next 4/4] selftests/bpf: add tests for u[ret]probe attach by name Alan Maguire
@ 2022-02-01  8:52 ` Daniel Borkmann
  4 siblings, 0 replies; 16+ messages in thread
From: Daniel Borkmann @ 2022-02-01  8:52 UTC (permalink / raw)
  To: Alan Maguire, andrii, ast
  Cc: kafai, songliubraving, yhs, john.fastabend, kpsingh, jolsa,
	sunyucong, netdev, bpf

Hi Alan,

On 1/31/22 5:12 PM, Alan Maguire wrote:
> This patch series is a refinement of the RFC patchset [1], focusing
> on support for attach by name for uprobes and uretprobes. v3
> because there was an earlier RFC [2].
> 
> Currently attach for such probes is done by determining the offset
> manually, so the aim is to try and mimic the simplicity of kprobe
> attach, making use of uprobe opts to specify a name string.
> Patch 1 adds the "func_name" option to allow uprobe attach by
> name; the mechanics are described there.
> 
> Having name-based support allows us to support auto-attach for
> uprobes; patch 2 adds auto-attach support while attempting
> to handle backwards-compatibility issues that arise.  The format
> supported is
> 
> u[ret]probe//path/2/binary:[raw_offset|function[+offset]]
> 
> For example, to attach to libc malloc:
> 
> SEC("uprobe//usr/lib64/libc.so.6:malloc")
> 
> Patch 3 introduces a helper function to trace_helpers, allowing
> us to retrieve the path to a library by reading /proc/self/maps.
> 
> Finally patch 4 add tests to the attach_probe selftests covering
> attach by name, auto-attach and auto-attach failure.

Looks like the selftest in the series fails the BPF CI (test_progs & test_progs-no_alu32):

https://github.com/kernel-patches/bpf/runs/5012260907?check_suite_focus=true

   [...]
   test_attach_probe:PASS:uprobe_offset 0 nsec
   test_attach_probe:PASS:ref_ctr_offset 0 nsec
   test_attach_probe:PASS:skel_open 0 nsec
   test_attach_probe:PASS:check_bss 0 nsec
   test_attach_probe:PASS:attach_kprobe 0 nsec
   test_attach_probe:PASS:attach_kretprobe 0 nsec
   test_attach_probe:PASS:uprobe_ref_ctr_before 0 nsec
   test_attach_probe:PASS:attach_uprobe 0 nsec
   test_attach_probe:PASS:uprobe_ref_ctr_after 0 nsec
   test_attach_probe:PASS:attach_uretprobe 0 nsec
   test_attach_probe:PASS:auto-attach should fail for old-style name 0 nsec
   test_attach_probe:PASS:attach_uprobe_byname 0 nsec
   test_attach_probe:PASS:attach_uretprobe_byname 0 nsec
   test_attach_probe:PASS:get path to libc 0 nsec
   test_attach_probe:PASS:find libc path in /proc/self/maps 0 nsec
   libbpf: failed to open 7f55b225c000-7f55b2282000 r--p 00000000 fe:00 3381                       /usr/lib/libc-2.32.so: No such file or directory
   test_attach_probe:FAIL:attach_uprobe_byname2 unexpected error: -2
   test_attach_probe:PASS:uprobe_ref_ctr_cleanup 0 nsec
   #4 attach_probe:FAIL
   [...]

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] libbpf: support function name-based attach uprobes
  2022-01-31 16:12 ` [PATCH v3 bpf-next 1/4] libbpf: support function name-based attach uprobes Alan Maguire
@ 2022-02-04 19:17   ` Andrii Nakryiko
  2022-02-25 16:12     ` Alan Maguire
  0 siblings, 1 reply; 16+ messages in thread
From: Andrii Nakryiko @ 2022-02-04 19:17 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann, Martin Lau,
	Song Liu, Yonghong Song, john fastabend, KP Singh, Jiri Olsa,
	Yucong Sun, Networking, bpf

On Mon, Jan 31, 2022 at 8:13 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> kprobe attach is name-based, using lookups of kallsyms to translate
> a function name to an address.  Currently uprobe attach is done
> via an offset value as described in [1].  Extend uprobe opts
> for attach to include a function name which can then be converted
> into a uprobe-friendly offset.  The calcualation is done in
> several steps:
>
> 1. First, determine the symbol address using libelf; this gives us
>    the offset as reported by objdump; then, in the case of local
>    functions
> 2. If the function is a shared library function - and the binary
>    provided is a shared library - no further work is required;
>    the address found is the required address
> 3. If the function is a shared library function in a program
>    (as opposed to a shared library), the Procedure Linking Table
>    (PLT) table address is found (it is indexed via the dynamic
>    symbol table index).  This allows us to instrument a call
>    to a shared library function locally in the calling binary,
>    reducing overhead versus having a breakpoint in global lib.
> 4. Finally, if the function is local, subtract the base address
>    associated with the object, retrieved from ELF program headers.
>
> The resultant value is then added to the func_offset value passed
> in to specify the uprobe attach address.  So specifying a func_offset
> of 0 along with a function name "printf" will attach to printf entry.
>
> The modes of operation supported are then
>
> 1. to attach to a local function in a binary; function "foo1" in
>    "/usr/bin/foo"
> 2. to attach to a shared library function in a binary;
>    function "malloc" in "/usr/bin/foo"
> 3. to attach to a shared library function in a shared library -
>    function "malloc" in libc.
>
> [1] https://www.kernel.org/doc/html/latest/trace/uprobetracer.html
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---

This looks great and very clean. I left a few nits, but otherwise it
looks ready (still need to go through the rest of the patches)

>  tools/lib/bpf/libbpf.c | 250 +++++++++++++++++++++++++++++++++++++++++++++++++
>  tools/lib/bpf/libbpf.h |  10 +-
>  2 files changed, 259 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 4ce94f4..eb95629 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -10203,6 +10203,241 @@ static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
>         return pfd;
>  }
>
> +/* uprobes deal in relative offsets; subtract the base address associated with
> + * the mapped binary.  See Documentation/trace/uprobetracer.rst for more
> + * details.
> + */
> +static long elf_find_relative_offset(Elf *elf,  long addr)

nit: too many spaces after comma

> +{
> +       size_t n;
> +       int i;
> +
> +       if (elf_getphdrnum(elf, &n)) {
> +               pr_warn("elf: failed to find program headers: %s\n",
> +                       elf_errmsg(-1));
> +               return -ENOENT;
> +       }
> +
> +       for (i = 0; i < n; i++) {
> +               int seg_start, seg_end, seg_offset;
> +               GElf_Phdr phdr;
> +
> +               if (!gelf_getphdr(elf, i, &phdr)) {
> +                       pr_warn("elf: failed to get program header %d: %s\n",
> +                               i, elf_errmsg(-1));
> +                       return -ENOENT;
> +               }
> +               if (phdr.p_type != PT_LOAD || !(phdr.p_flags & PF_X))
> +                       continue;
> +
> +               seg_start = phdr.p_vaddr;
> +               seg_end = seg_start + phdr.p_memsz;
> +               seg_offset = phdr.p_offset;
> +               if (addr >= seg_start && addr < seg_end)
> +                       return addr -  seg_start + seg_offset;

nit: double space before seg_start

> +       }
> +       pr_warn("elf: failed to find prog header containing 0x%lx\n", addr);
> +       return -ENOENT;
> +}
> +
> +/* Return next ELF section of sh_type after scn, or first of that type
> + * if scn is NULL.
> + */
> +static Elf_Scn *elf_find_next_scn_by_type(Elf *elf, int sh_type, Elf_Scn *scn)
> +{
> +       while ((scn = elf_nextscn(elf, scn)) != NULL) {
> +               GElf_Shdr sh;
> +
> +               if (!gelf_getshdr(scn, &sh))
> +                       continue;
> +               if (sh.sh_type == sh_type)
> +                       break;
> +       }
> +       return scn;
> +}
> +
> +/* For Position-Independent Code-based libraries, a table of trampolines
> + * (Procedure Linking Table) is used to support resolution of symbol
> + * names at linking time.  The goal here is to find the offset associated
> + * with the jump to the actual library function.  If we can instrument that
> + * locally in the specific binary (rather than instrumenting glibc say),
> + * overheads are greatly reduced.
> + *
> + * The method used is to find the .plt section and determine the offset
> + * of the relevant entry (given by the base address plus the index
> + * of the function multiplied by the size of a .plt entry).
> + */
> +static ssize_t elf_find_plt_offset(Elf *elf, size_t ndx)

nit: ndx -> func_idx? libbpf generally uses "idx" naming, "ndx" is
purely libelf's convention (and it is more obvious if it is explicitly
called out that it's index of a function entry)

> +{
> +       Elf_Scn *scn = NULL;
> +       size_t shstrndx;
> +
> +       if (elf_getshdrstrndx(elf, &shstrndx)) {
> +               pr_debug("elf: failed to get section names section index: %s\n",
> +                        elf_errmsg(-1));
> +               return -LIBBPF_ERRNO__FORMAT;
> +       }
> +       while ((scn = elf_find_next_scn_by_type(elf, SHT_PROGBITS, scn))) {
> +               long plt_entry_sz, plt_base;
> +               const char *name;
> +               GElf_Shdr sh;
> +
> +               if (!gelf_getshdr(scn, &sh))
> +                       continue;
> +               name = elf_strptr(elf, shstrndx, sh.sh_name);
> +               if (!name || strcmp(name, ".plt") != 0)
> +                       continue;

Wouldn't it be simpler to use elf_sec_by_name(elf, ".plt") and then
Shdr and check PROGBITS? Given there will be only one .plt, it makes
more sense than this while loop?

> +               plt_base = sh.sh_addr;
> +               plt_entry_sz = sh.sh_entsize;
> +               return plt_base + (ndx * plt_entry_sz);
> +       }
> +       pr_debug("elf: no .plt section found\n");

Do we really need this, especially without a binary path?

> +       return -LIBBPF_ERRNO__FORMAT;
> +}
> +
> +/* Find offset of function name in object specified by path.  "name" matches
> + * symbol name or name@@LIB for library functions.
> + */
> +static long elf_find_func_offset(const char *binary_path, const char *name)
> +{
> +       int fd, i, sh_types[2] = { SHT_DYNSYM, SHT_SYMTAB };
> +       bool is_shared_lib, is_name_qualified;
> +       size_t name_len, sym_ndx = -1;
> +       char errmsg[STRERR_BUFSIZE];
> +       long ret = -ENOENT;
> +       GElf_Ehdr ehdr;
> +       Elf *elf;
> +
> +       fd = open(binary_path, O_RDONLY | O_CLOEXEC);
> +       if (fd < 0) {
> +               ret = -errno;
> +               pr_warn("failed to open %s: %s\n", binary_path,
> +                       libbpf_strerror_r(ret, errmsg, sizeof(errmsg)));
> +               return ret;
> +       }
> +       elf = elf_begin(fd, ELF_C_READ_MMAP, NULL);
> +       if (!elf) {
> +               pr_warn("elf: could not read elf from %s: %s\n", binary_path, elf_errmsg(-1));
> +               close(fd);
> +               return -LIBBPF_ERRNO__FORMAT;
> +       }
> +       if (!gelf_getehdr(elf, &ehdr)) {
> +               pr_warn("elf: failed to get ehdr from %s: %s\n", binary_path, elf_errmsg(-1));
> +               ret = -LIBBPF_ERRNO__FORMAT;
> +               goto out;
> +       }
> +       /* for shared lib case, we do not need to calculate relative offset */
> +       is_shared_lib = ehdr.e_type == ET_DYN;
> +
> +       name_len = strlen(name);
> +       /* Does name specify "@@LIB"? */
> +       is_name_qualified = strstr(name, "@@") != NULL;
> +
> +       /* Search SHT_DYNSYM, SHT_SYMTAB for symbol.  This search order is used because if
> +        * the symbol is found in SHY_DYNSYM, the index in that table tells us which index
> +        * to use in the Procedure Linking Table to instrument calls to the shared library
> +        * function, but locally in the binary rather than in the shared library ifself.

typo: itself

> +        * If a binary is stripped, it may also only have SHT_DYNSYM, and a fully-statically
> +        * linked binary may not have SHT_DYMSYM, so absence of a section should not be
> +        * reported as a warning/error.
> +        */
> +       for (i = 0; i < ARRAY_SIZE(sh_types); i++) {
> +               size_t strtabidx, ndx, nr_syms;
> +               Elf_Data *symbols = NULL;
> +               Elf_Scn *scn = NULL;
> +               int last_bind = -1;
> +               const char *sname;
> +               GElf_Shdr sh;
> +
> +               scn = elf_find_next_scn_by_type(elf, sh_types[i], NULL);
> +               if (!scn) {
> +                       pr_debug("elf: failed to find symbol table ELF sections in %s\n",
> +                                binary_path);

you consistently used '%s' for binary_path, let's do that here as well

> +                       continue;
> +               }
> +               if (!gelf_getshdr(scn, &sh))
> +                       continue;
> +               strtabidx = sh.sh_link;
> +               symbols = elf_getdata(scn, 0);
> +               if (!symbols) {
> +                       pr_warn("elf: failed to get symbols for symtab section in %s: %s\n",
> +                               binary_path, elf_errmsg(-1));

and here

> +                       ret = -LIBBPF_ERRNO__FORMAT;
> +                       goto out;
> +               }
> +               nr_syms = symbols->d_size / sh.sh_entsize;
> +
> +               for (ndx = 0; ndx < nr_syms; ndx++) {
> +                       int curr_bind;
> +                       GElf_Sym sym;
> +
> +                       if (!gelf_getsym(symbols, ndx, &sym))
> +                               continue;
> +                       if (GELF_ST_TYPE(sym.st_info) != STT_FUNC)
> +                               continue;
> +
> +                       sname = elf_strptr(elf, strtabidx, sym.st_name);
> +                       if (!sname)
> +                               continue;
> +                       curr_bind = GELF_ST_BIND(sym.st_info);
> +
> +                       /* User can specify func, func@@LIB or func@@LIB_VERSION. */
> +                       if (strncmp(sname, name, name_len) != 0)
> +                               continue;
> +                       /* ...but we don't want a search for "foo" to match 'foo2" also, so any
> +                        * additional characters in sname should be of the form "@@LIB".
> +                        */
> +                       if (!is_name_qualified && strlen(sname) > name_len &&
> +                           sname[name_len] != '@')
> +                               continue;

if both the symbol name and requested function name have @ in them,
what should be the comparison rule? Shouldn't it be an exact match
including '@@' and part after it?

> +
> +                       if (ret >= 0 && last_bind != -1) {

if ret >= 0, last_bind can't be invalid, so let's drop the last_bind check here

> +                               /* handle multiple matches */
> +                               if (last_bind != STB_WEAK && curr_bind != STB_WEAK) {
> +                                       /* Only accept one non-weak bind. */
> +                                       pr_warn("elf: ambiguous match for '%s': %s\n",
> +                                               sname, name);
> +                                       ret = -LIBBPF_ERRNO__FORMAT;
> +                                       goto out;

[...]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 bpf-next 2/4] libbpf: add auto-attach for uprobes based on section name
  2022-01-31 16:12 ` [PATCH v3 bpf-next 2/4] libbpf: add auto-attach for uprobes based on section name Alan Maguire
@ 2022-02-04 19:22   ` Andrii Nakryiko
  2022-02-23  9:32     ` Alan Maguire
  0 siblings, 1 reply; 16+ messages in thread
From: Andrii Nakryiko @ 2022-02-04 19:22 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann, Martin Lau,
	Song Liu, Yonghong Song, john fastabend, KP Singh, Jiri Olsa,
	Yucong Sun, Networking, bpf

On Mon, Jan 31, 2022 at 8:13 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> Now that u[ret]probes can use name-based specification, it makes
> sense to add support for auto-attach based on SEC() definition.
> The format proposed is
>
>         SEC("u[ret]probe//path/to/prog:[raw_offset|[function_name[+offset]]")
>
> For example, to trace malloc() in libc:
>
>         SEC("uprobe//usr/lib64/libc.so.6:malloc")

I assume that path to library can be relative path as well, right?

Also, should be look at trying to locate library in the system if it's
specified as "libc"? Or if the binary is "bash", for example. Just
bringing this up, because I think it came up before in the context of
one of libbpf-tools.

>
> Auto-attach is done for all tasks (pid -1).
>
> Note that there is a backwards-compatibility issue here.  Consider a BPF
> object consisting of a set of BPF programs, including a uprobe program.
> Because uprobes did not previously support auto-attach, it's possible that
> because the uprobe section name is not in auto-attachable form, overall
> BPF skeleton attach would now fail due to the failure of the uprobe program
> to auto-attach.  So we need to handle the case of auto-attach failure where
> the form of the section name is not suitable for auto-attach without
> a complete attach failure.  On surveying the code, bpf_program__attach()
> already returns -ESRCH in cases where no auto-attach function is
> supplied, so for consistency with that - and because that return value
> is less likely to collide with actual attach failures than -EOPNOTSUPP -
> it is used as the attach function return value signalling auto-attach
> is not possible.

I'm actually working on generalizing and extending this part of
libbpf's SEC() handling, I should post code today or early next week.
So we can base your code on those changes and we won't need to worry
about error code collisions.

>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  tools/lib/bpf/libbpf.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 74 insertions(+), 3 deletions(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index eb95629..e2b4415 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -8581,6 +8581,7 @@ int bpf_program__set_log_buf(struct bpf_program *prog, char *log_buf, size_t log
>  }
>

[...]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 bpf-next 3/4] selftests/bpf: add get_lib_path() helper
  2022-01-31 16:12 ` [PATCH v3 bpf-next 3/4] selftests/bpf: add get_lib_path() helper Alan Maguire
@ 2022-02-04 19:24   ` Andrii Nakryiko
  0 siblings, 0 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2022-02-04 19:24 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann, Martin Lau,
	Song Liu, Yonghong Song, john fastabend, KP Singh, Jiri Olsa,
	Yucong Sun, Networking, bpf

On Mon, Jan 31, 2022 at 8:13 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> get_lib_path(path_substr) returns full path to a library
> containing path_substr (such as "libc-") found via
> /proc/self/maps.  Caller is responsible for freeing
> the returned string.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  tools/testing/selftests/bpf/trace_helpers.c | 17 +++++++++++++++++
>  tools/testing/selftests/bpf/trace_helpers.h |  2 ++
>  2 files changed, 19 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
> index ca6abae..49e5f0d 100644
> --- a/tools/testing/selftests/bpf/trace_helpers.c
> +++ b/tools/testing/selftests/bpf/trace_helpers.c
> @@ -216,3 +216,20 @@ ssize_t get_rel_offset(uintptr_t addr)
>         fclose(f);
>         return -EINVAL;
>  }
> +
> +char *get_lib_path(const char *path_substr)
> +{
> +       char *found = NULL;
> +       char lib_path[512];
> +       FILE *f;
> +
> +       f = fopen("/proc/self/maps", "r");
> +       while (fscanf(f, "%*s %*s %*s %*s %*s %[^\n]", lib_path) == 1) {

I think it can be followed by " (deleted)", right? Do we want to
detect that and do something about it?

> +               if (strstr(lib_path, path_substr) == NULL)
> +                       continue;
> +               found = strdup(lib_path);
> +               break;
> +       }
> +       fclose(f);
> +       return found;
> +}
> diff --git a/tools/testing/selftests/bpf/trace_helpers.h b/tools/testing/selftests/bpf/trace_helpers.h
> index 238a9c9..ff379f6 100644
> --- a/tools/testing/selftests/bpf/trace_helpers.h
> +++ b/tools/testing/selftests/bpf/trace_helpers.h
> @@ -20,5 +20,7 @@ struct ksym {
>
>  ssize_t get_uprobe_offset(const void *addr);
>  ssize_t get_rel_offset(uintptr_t addr);
> +/* Return allocated string path to library that contains path_substr. */
> +char *get_lib_path(const char *path_substr);
>
>  #endif
> --
> 1.8.3.1
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 bpf-next 4/4] selftests/bpf: add tests for u[ret]probe attach by name
  2022-01-31 16:12 ` [PATCH v3 bpf-next 4/4] selftests/bpf: add tests for u[ret]probe attach by name Alan Maguire
@ 2022-02-04 19:30   ` Andrii Nakryiko
  2022-02-26 16:00   ` Jiri Olsa
  1 sibling, 0 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2022-02-04 19:30 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann, Martin Lau,
	Song Liu, Yonghong Song, john fastabend, KP Singh, Jiri Olsa,
	Yucong Sun, Networking, bpf

On Mon, Jan 31, 2022 at 8:13 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> add tests that verify attaching by name for
>
> 1. local functions in a program
> 2. library functions in a shared object; and
> 3. library functions in a program
>
> ...succeed for uprobe and uretprobes using new "func_name"
> option for bpf_program__attach_uprobe_opts().  Also verify
> auto-attach works where uprobe, path to binary and function
> name are specified, but fails with -ESRCH when the format
> does not match (the latter is to support backwards-compatibility).
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  .../selftests/bpf/prog_tests/attach_probe.c        | 89 +++++++++++++++++++++-
>  .../selftests/bpf/progs/test_attach_probe.c        | 37 +++++++++
>  2 files changed, 123 insertions(+), 3 deletions(-)
>

[...]

>         if (CHECK(skel->bss->uprobe_res != 3, "check_uprobe_res",
>                   "wrong uprobe res: %d\n", skel->bss->uprobe_res))
>                 goto cleanup;
> @@ -110,7 +179,21 @@ void test_attach_probe(void)
>                   "wrong uretprobe res: %d\n", skel->bss->uretprobe_res))
>                 goto cleanup;
>
> +       if (CHECK(skel->bss->uprobe_byname_res != 5, "check_uprobe_byname_res",
> +                 "wrong uprobe byname res: %d\n", skel->bss->uprobe_byname_res))
> +               goto cleanup;
> +       if (CHECK(skel->bss->uretprobe_byname_res != 6, "check_uretprobe_byname_res",
> +                 "wrong uretprobe byname res: %d\n", skel->bss->uretprobe_byname_res))
> +               goto cleanup;
> +       if (CHECK(skel->bss->uprobe_byname2_res != 7, "check_uprobe_byname2_res",
> +                 "wrong uprobe byname2 res: %d\n", skel->bss->uprobe_byname2_res))
> +               goto cleanup;
> +       if (CHECK(skel->bss->uretprobe_byname2_res != 8, "check_uretprobe_byname2_res",
> +                 "wrong uretprobe byname2 res: %d\n", skel->bss->uretprobe_byname2_res))
> +               goto cleanup;
> +

Please don't use CHECK()s for new code, only ASSERT_xxx() for new code.

>  cleanup:
> +       free(libc_path);
>         test_attach_probe__destroy(skel);
>         ASSERT_EQ(uprobe_ref_ctr, 0, "uprobe_ref_ctr_cleanup");
>  }
> diff --git a/tools/testing/selftests/bpf/progs/test_attach_probe.c b/tools/testing/selftests/bpf/progs/test_attach_probe.c
> index 8056a4c..9942461c 100644
> --- a/tools/testing/selftests/bpf/progs/test_attach_probe.c
> +++ b/tools/testing/selftests/bpf/progs/test_attach_probe.c
> @@ -10,6 +10,10 @@
>  int kretprobe_res = 0;
>  int uprobe_res = 0;
>  int uretprobe_res = 0;
> +int uprobe_byname_res = 0;
> +int uretprobe_byname_res = 0;
> +int uprobe_byname2_res = 0;
> +int uretprobe_byname2_res = 0;
>
>  SEC("kprobe/sys_nanosleep")
>  int handle_kprobe(struct pt_regs *ctx)
> @@ -39,4 +43,37 @@ int handle_uretprobe(struct pt_regs *ctx)
>         return 0;
>  }
>
> +SEC("uprobe/trigger_func_byname")
> +int handle_uprobe_byname(struct pt_regs *ctx)
> +{
> +       uprobe_byname_res = 5;
> +       return 0;
> +}
> +
> +/* use auto-attach format for section definition. */
> +SEC("uretprobe//proc/self/exe:trigger_func2")
> +int handle_uretprobe_byname(struct pt_regs *ctx)
> +{
> +       uretprobe_byname_res = 6;
> +       return 0;
> +}
> +
> +SEC("uprobe/trigger_func_byname2")
> +int handle_uprobe_byname2(struct pt_regs *ctx)

this one is for malloc, so why SEC() doesn't reflect this?

It would be great to also have (probably separate) selftest for
auto-attach logic of skeleton for uprobes.
I'd add a separate uprobe-specific selftests, there is plenty to test
without having kprobes intermingled.

> +{
> +       unsigned int size = PT_REGS_PARM1(ctx);
> +
> +       /* verify malloc size */
> +       if (size == 1)
> +               uprobe_byname2_res = 7;
> +       return 0;
> +}
> +
> +SEC("uretprobe/trigger_func_byname2")
> +int handle_uretprobe_byname2(struct pt_regs *ctx)
> +{
> +       uretprobe_byname2_res = 8;
> +       return 0;
> +}
> +
>  char _license[] SEC("license") = "GPL";
> --
> 1.8.3.1
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 bpf-next 2/4] libbpf: add auto-attach for uprobes based on section name
  2022-02-04 19:22   ` Andrii Nakryiko
@ 2022-02-23  9:32     ` Alan Maguire
  2022-02-24  1:49       ` Andrii Nakryiko
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Maguire @ 2022-02-23  9:32 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alan Maguire, Andrii Nakryiko, Alexei Starovoitov,
	Daniel Borkmann, Martin Lau, Song Liu, Yonghong Song,
	john fastabend, KP Singh, Jiri Olsa, Yucong Sun, Networking, bpf

On Fri, 4 Feb 2022, Andrii Nakryiko wrote:

> On Mon, Jan 31, 2022 at 8:13 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >
> > Now that u[ret]probes can use name-based specification, it makes
> > sense to add support for auto-attach based on SEC() definition.
> > The format proposed is
> >
> >         SEC("u[ret]probe//path/to/prog:[raw_offset|[function_name[+offset]]")
> >
> > For example, to trace malloc() in libc:
> >
> >         SEC("uprobe//usr/lib64/libc.so.6:malloc")
> 
> I assume that path to library can be relative path as well, right?
> 
> Also, should be look at trying to locate library in the system if it's
> specified as "libc"? Or if the binary is "bash", for example. Just
> bringing this up, because I think it came up before in the context of
> one of libbpf-tools.
>

This is a great suggestion for usability, but I'm trying to puzzle
out how to carry out the location search for cases where the path 
specified is not a relative or absolute path.

A few things we can can do - use search paths from PATH and
LD_LIBRARY_PATH, with an appended set of standard locations
such as /usr/bin, /usr/sbin for cases where those environment
variables are missing or incomplete.

However, when it comes to libraries, do we search in /usr/lib64 or 
/usr/lib? We could use whether the version of libbpf is 64-bit or not I 
suppose, but it's at least conceivable that the user might want to 
instrument a 32-bit library from a 64-bit libbpf.  Do you think that
approach is sufficient, or are there other things we should do? Thanks!

Alan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 bpf-next 2/4] libbpf: add auto-attach for uprobes based on section name
  2022-02-23  9:32     ` Alan Maguire
@ 2022-02-24  1:49       ` Andrii Nakryiko
  2022-02-24 15:39         ` Alan Maguire
  0 siblings, 1 reply; 16+ messages in thread
From: Andrii Nakryiko @ 2022-02-24  1:49 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann, Martin Lau,
	Song Liu, Yonghong Song, john fastabend, KP Singh, Jiri Olsa,
	Yucong Sun, Networking, bpf

On Wed, Feb 23, 2022 at 1:33 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On Fri, 4 Feb 2022, Andrii Nakryiko wrote:
>
> > On Mon, Jan 31, 2022 at 8:13 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > >
> > > Now that u[ret]probes can use name-based specification, it makes
> > > sense to add support for auto-attach based on SEC() definition.
> > > The format proposed is
> > >
> > >         SEC("u[ret]probe//path/to/prog:[raw_offset|[function_name[+offset]]")
> > >
> > > For example, to trace malloc() in libc:
> > >
> > >         SEC("uprobe//usr/lib64/libc.so.6:malloc")
> >
> > I assume that path to library can be relative path as well, right?
> >
> > Also, should be look at trying to locate library in the system if it's
> > specified as "libc"? Or if the binary is "bash", for example. Just
> > bringing this up, because I think it came up before in the context of
> > one of libbpf-tools.
> >
>
> This is a great suggestion for usability, but I'm trying to puzzle
> out how to carry out the location search for cases where the path
> specified is not a relative or absolute path.
>
> A few things we can can do - use search paths from PATH and
> LD_LIBRARY_PATH, with an appended set of standard locations
> such as /usr/bin, /usr/sbin for cases where those environment
> variables are missing or incomplete.
>
> However, when it comes to libraries, do we search in /usr/lib64 or
> /usr/lib? We could use whether the version of libbpf is 64-bit or not I
> suppose, but it's at least conceivable that the user might want to
> instrument a 32-bit library from a 64-bit libbpf.  Do you think that
> approach is sufficient, or are there other things we should do? Thanks!

How does dynamic linker do this? When I specify "libbpf.so", is there
some documented algorithm for finding the library? If it's more or
less codified, we could implement something like that. If not, well,
too bad, we can do some useful heuristic, but ultimately there will be
cases that won't be supported. Worst case user will have to specify an
absolute path.

>
> Alan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 bpf-next 2/4] libbpf: add auto-attach for uprobes based on section name
  2022-02-24  1:49       ` Andrii Nakryiko
@ 2022-02-24 15:39         ` Alan Maguire
  2022-03-01  1:45           ` Andrii Nakryiko
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Maguire @ 2022-02-24 15:39 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alan Maguire, Andrii Nakryiko, Alexei Starovoitov,
	Daniel Borkmann, Martin Lau, Song Liu, Yonghong Song,
	john fastabend, KP Singh, Jiri Olsa, Yucong Sun, Networking, bpf

On Thu, 24 Feb 2022, Andrii Nakryiko wrote:

> On Wed, Feb 23, 2022 at 1:33 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >
> > On Fri, 4 Feb 2022, Andrii Nakryiko wrote:
> >
> > > On Mon, Jan 31, 2022 at 8:13 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > > >
> > > > Now that u[ret]probes can use name-based specification, it makes
> > > > sense to add support for auto-attach based on SEC() definition.
> > > > The format proposed is
> > > >
> > > >         SEC("u[ret]probe//path/to/prog:[raw_offset|[function_name[+offset]]")
> > > >
> > > > For example, to trace malloc() in libc:
> > > >
> > > >         SEC("uprobe//usr/lib64/libc.so.6:malloc")
> > >
> > > I assume that path to library can be relative path as well, right?
> > >
> > > Also, should be look at trying to locate library in the system if it's
> > > specified as "libc"? Or if the binary is "bash", for example. Just
> > > bringing this up, because I think it came up before in the context of
> > > one of libbpf-tools.
> > >
> >
> > This is a great suggestion for usability, but I'm trying to puzzle
> > out how to carry out the location search for cases where the path
> > specified is not a relative or absolute path.
> >
> > A few things we can can do - use search paths from PATH and
> > LD_LIBRARY_PATH, with an appended set of standard locations
> > such as /usr/bin, /usr/sbin for cases where those environment
> > variables are missing or incomplete.
> >
> > However, when it comes to libraries, do we search in /usr/lib64 or
> > /usr/lib? We could use whether the version of libbpf is 64-bit or not I
> > suppose, but it's at least conceivable that the user might want to
> > instrument a 32-bit library from a 64-bit libbpf.  Do you think that
> > approach is sufficient, or are there other things we should do? Thanks!
> 
> How does dynamic linker do this? When I specify "libbpf.so", is there
> some documented algorithm for finding the library? If it's more or
> less codified, we could implement something like that. If not, well,
> too bad, we can do some useful heuristic, but ultimately there will be
> cases that won't be supported. Worst case user will have to specify an
> absolute path.
> 

There's a nice description in [1]:

       If filename is NULL, then the returned handle is for the main
       program.  If filename contains a slash ("/"), then it is
       interpreted as a (relative or absolute) pathname.  Otherwise, the
       dynamic linker searches for the object as follows (see ld.so(8)
       for further details):

       o   (ELF only) If the calling object (i.e., the shared library or
           executable from which dlopen() is called) contains a DT_RPATH
           tag, and does not contain a DT_RUNPATH tag, then the
           directories listed in the DT_RPATH tag are searched.

       o   If, at the time that the program was started, the environment
           variable LD_LIBRARY_PATH was defined to contain a colon-
           separated list of directories, then these are searched.  (As
           a security measure, this variable is ignored for set-user-ID
           and set-group-ID programs.)

       o   (ELF only) If the calling object contains a DT_RUNPATH tag,
           then the directories listed in that tag are searched.

       o   The cache file /etc/ld.so.cache (maintained by ldconfig(8))
           is checked to see whether it contains an entry for filename.

       o   The directories /lib and /usr/lib are searched (in that
           order).

Rather than re-inventing all of that however, we could use it
by dlopen()ing the file when it is a library (contains .so) and
is not a relative/absolute path, and then use dlinfo()'s
RTLD_DI_ORIGIN command to extract the path discovered, and then
dlclose() it. It would require linking libbpf with -ldl however.
What do you think?

Alan
 
[1] https://man7.org/linux/man-pages/man3/dlopen.3.html

> >
> > Alan
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 bpf-next 1/4] libbpf: support function name-based attach uprobes
  2022-02-04 19:17   ` Andrii Nakryiko
@ 2022-02-25 16:12     ` Alan Maguire
  0 siblings, 0 replies; 16+ messages in thread
From: Alan Maguire @ 2022-02-25 16:12 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alan Maguire, Andrii Nakryiko, Alexei Starovoitov,
	Daniel Borkmann, Martin Lau, Song Liu, Yonghong Song,
	john fastabend, KP Singh, Jiri Olsa, Yucong Sun, Networking, bpf

On Fri, 4 Feb 2022, Andrii Nakryiko wrote:

> On Mon, Jan 31, 2022 at 8:13 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >
> > kprobe attach is name-based, using lookups of kallsyms to translate
> > a function name to an address.  Currently uprobe attach is done
> > via an offset value as described in [1].  Extend uprobe opts
> > for attach to include a function name which can then be converted
> > into a uprobe-friendly offset.  The calcualation is done in
> > several steps:
> >
> > 1. First, determine the symbol address using libelf; this gives us
> >    the offset as reported by objdump; then, in the case of local
> >    functions
> > 2. If the function is a shared library function - and the binary
> >    provided is a shared library - no further work is required;
> >    the address found is the required address
> > 3. If the function is a shared library function in a program
> >    (as opposed to a shared library), the Procedure Linking Table
> >    (PLT) table address is found (it is indexed via the dynamic
> >    symbol table index).  This allows us to instrument a call
> >    to a shared library function locally in the calling binary,
> >    reducing overhead versus having a breakpoint in global lib.
> > 4. Finally, if the function is local, subtract the base address
> >    associated with the object, retrieved from ELF program headers.
> >
> > The resultant value is then added to the func_offset value passed
> > in to specify the uprobe attach address.  So specifying a func_offset
> > of 0 along with a function name "printf" will attach to printf entry.
> >
> > The modes of operation supported are then
> >
> > 1. to attach to a local function in a binary; function "foo1" in
> >    "/usr/bin/foo"
> > 2. to attach to a shared library function in a binary;
> >    function "malloc" in "/usr/bin/foo"
> > 3. to attach to a shared library function in a shared library -
> >    function "malloc" in libc.
> >
> > [1] https://www.kernel.org/doc/html/latest/trace/uprobetracer.html
> >
> > Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> > ---
> 
> This looks great and very clean. I left a few nits, but otherwise it
> looks ready (still need to go through the rest of the patches)
> 
> >  tools/lib/bpf/libbpf.c | 250 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  tools/lib/bpf/libbpf.h |  10 +-
> >  2 files changed, 259 insertions(+), 1 deletion(-)
> >
>

<snip>
 
> if both the symbol name and requested function name have @ in them,
> what should be the comparison rule? Shouldn't it be an exact match
> including '@@' and part after it?
>

In this case, we might want to support matching on malloc@GLIBC and
malloc@GLIBC_2.3.4; in other words letting the caller decide how
specific they want to be makes sense I think.  So the caller dictates
the matching length via the argument they provide - with the proviso that
if it's just a function name without a "@LIBRARY" suffix it must match 
fully. The problem with the version numbers associated with functions is 
they're the versions from the mapfiles, so the same library version has 
malloc@GLIBC_2.2.5, epoll_ctl@GLIBC_2.3.2 etc.

Thanks for the review! I'm working on incorporating all of these changes
into v4 now.

Alan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 bpf-next 4/4] selftests/bpf: add tests for u[ret]probe attach by name
  2022-01-31 16:12 ` [PATCH v3 bpf-next 4/4] selftests/bpf: add tests for u[ret]probe attach by name Alan Maguire
  2022-02-04 19:30   ` Andrii Nakryiko
@ 2022-02-26 16:00   ` Jiri Olsa
  1 sibling, 0 replies; 16+ messages in thread
From: Jiri Olsa @ 2022-02-26 16:00 UTC (permalink / raw)
  To: Alan Maguire
  Cc: andrii, ast, daniel, kafai, songliubraving, yhs, john.fastabend,
	kpsingh, jolsa, sunyucong, netdev, bpf

On Mon, Jan 31, 2022 at 04:12:34PM +0000, Alan Maguire wrote:

SNIP

> +	/* verify auto-attach fails for old-style uprobe definition */
> +	uprobe_err_link = bpf_program__attach(skel->progs.handle_uprobe_byname);
> +	if (!ASSERT_EQ(libbpf_get_error(uprobe_err_link), -ESRCH,
> +		       "auto-attach should fail for old-style name"))
> +		goto cleanup;
> +
> +	uprobe_opts.func_name = "trigger_func2";
> +	uprobe_opts.retprobe = false;
> +	uprobe_opts.ref_ctr_offset = 0;
> +	skel->links.handle_uprobe_byname =
> +			bpf_program__attach_uprobe_opts(skel->progs.handle_uprobe_byname,
> +							0 /* this pid */,
> +							"/proc/self/exe",
> +							0, &uprobe_opts);
> +	if (!ASSERT_OK_PTR(skel->links.handle_uprobe_byname, "attach_uprobe_byname"))
> +		goto cleanup;
> +
> +	/* verify auto-attach works */
> +	skel->links.handle_uretprobe_byname =
> +			bpf_program__attach(skel->progs.handle_uretprobe_byname);
> +	if (!ASSERT_OK_PTR(skel->links.handle_uretprobe_byname, "attach_uretprobe_byname"))
> +		goto cleanup;
> +
> +	/* test attach by name for a library function, using the library
> +	 * as the binary argument.  To do this, find path to libc used
> +	 * by test_progs via /proc/self/maps.
> +	 */
> +	libc_path = get_lib_path("libc-");

hi,
I'm getting crash in here because the libc line in maps for me
looks like: /usr/lib64/libc.so.6

plus the check below will let through null pointer

> +	if (!ASSERT_OK_PTR(libc_path, "get path to libc"))
> +		goto cleanup;
> +	if (!ASSERT_NEQ(strstr(libc_path, "libc-"), NULL, "find libc path in /proc/self/maps"))
> +		goto cleanup;

and when I tried to use 'libc' in here, it does not crash but
libc_path holds the whole maps line:

  7fdbef31d000-7fdbef349000 r--p 00000000 fd:01 201656665                  /usr/lib64/libc.so.6

so it fails, I guess there's some issue in get_lib_path

jirka

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 bpf-next 2/4] libbpf: add auto-attach for uprobes based on section name
  2022-02-24 15:39         ` Alan Maguire
@ 2022-03-01  1:45           ` Andrii Nakryiko
  0 siblings, 0 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2022-03-01  1:45 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann, Martin Lau,
	Song Liu, Yonghong Song, john fastabend, KP Singh, Jiri Olsa,
	Yucong Sun, Networking, bpf

On Thu, Feb 24, 2022 at 7:40 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On Thu, 24 Feb 2022, Andrii Nakryiko wrote:
>
> > On Wed, Feb 23, 2022 at 1:33 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > >
> > > On Fri, 4 Feb 2022, Andrii Nakryiko wrote:
> > >
> > > > On Mon, Jan 31, 2022 at 8:13 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > > > >
> > > > > Now that u[ret]probes can use name-based specification, it makes
> > > > > sense to add support for auto-attach based on SEC() definition.
> > > > > The format proposed is
> > > > >
> > > > >         SEC("u[ret]probe//path/to/prog:[raw_offset|[function_name[+offset]]")
> > > > >
> > > > > For example, to trace malloc() in libc:
> > > > >
> > > > >         SEC("uprobe//usr/lib64/libc.so.6:malloc")
> > > >
> > > > I assume that path to library can be relative path as well, right?
> > > >
> > > > Also, should be look at trying to locate library in the system if it's
> > > > specified as "libc"? Or if the binary is "bash", for example. Just
> > > > bringing this up, because I think it came up before in the context of
> > > > one of libbpf-tools.
> > > >
> > >
> > > This is a great suggestion for usability, but I'm trying to puzzle
> > > out how to carry out the location search for cases where the path
> > > specified is not a relative or absolute path.
> > >
> > > A few things we can can do - use search paths from PATH and
> > > LD_LIBRARY_PATH, with an appended set of standard locations
> > > such as /usr/bin, /usr/sbin for cases where those environment
> > > variables are missing or incomplete.
> > >
> > > However, when it comes to libraries, do we search in /usr/lib64 or
> > > /usr/lib? We could use whether the version of libbpf is 64-bit or not I
> > > suppose, but it's at least conceivable that the user might want to
> > > instrument a 32-bit library from a 64-bit libbpf.  Do you think that
> > > approach is sufficient, or are there other things we should do? Thanks!
> >
> > How does dynamic linker do this? When I specify "libbpf.so", is there
> > some documented algorithm for finding the library? If it's more or
> > less codified, we could implement something like that. If not, well,
> > too bad, we can do some useful heuristic, but ultimately there will be
> > cases that won't be supported. Worst case user will have to specify an
> > absolute path.
> >
>
> There's a nice description in [1]:
>
>        If filename is NULL, then the returned handle is for the main
>        program.  If filename contains a slash ("/"), then it is
>        interpreted as a (relative or absolute) pathname.  Otherwise, the
>        dynamic linker searches for the object as follows (see ld.so(8)
>        for further details):
>
>        o   (ELF only) If the calling object (i.e., the shared library or
>            executable from which dlopen() is called) contains a DT_RPATH
>            tag, and does not contain a DT_RUNPATH tag, then the
>            directories listed in the DT_RPATH tag are searched.
>
>        o   If, at the time that the program was started, the environment
>            variable LD_LIBRARY_PATH was defined to contain a colon-
>            separated list of directories, then these are searched.  (As
>            a security measure, this variable is ignored for set-user-ID
>            and set-group-ID programs.)
>
>        o   (ELF only) If the calling object contains a DT_RUNPATH tag,
>            then the directories listed in that tag are searched.
>
>        o   The cache file /etc/ld.so.cache (maintained by ldconfig(8))
>            is checked to see whether it contains an entry for filename.
>
>        o   The directories /lib and /usr/lib are searched (in that
>            order).
>
> Rather than re-inventing all of that however, we could use it
> by dlopen()ing the file when it is a library (contains .so) and
> is not a relative/absolute path, and then use dlinfo()'s
> RTLD_DI_ORIGIN command to extract the path discovered, and then
> dlclose() it. It would require linking libbpf with -ldl however.
> What do you think?

What do I think about dlopen()'ing some random library under root by
libbpf into the host process?.. I'd say that's a bad idea.

I'd probably start with just checking /lib, /usr/lib (and maybe those
32-bit and 64-bit specific ones, depending on host architecture; not
sure about all the details there, tbh). Or just say that the path to
the shared library has to be specified.

There is a similar problem with doing something like
SEC("uprobe/bash:readline"). Do we want to "search" for bash? I think
bpftrace is supporting that, but I haven't checked what it is doing.


>
> Alan
>
> [1] https://man7.org/linux/man-pages/man3/dlopen.3.html
>
> > >
> > > Alan
> >

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-03-01  1:45 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-31 16:12 [PATCH v3 bpf-next 0/4] libbpf: name-based u[ret]probe attach Alan Maguire
2022-01-31 16:12 ` [PATCH v3 bpf-next 1/4] libbpf: support function name-based attach uprobes Alan Maguire
2022-02-04 19:17   ` Andrii Nakryiko
2022-02-25 16:12     ` Alan Maguire
2022-01-31 16:12 ` [PATCH v3 bpf-next 2/4] libbpf: add auto-attach for uprobes based on section name Alan Maguire
2022-02-04 19:22   ` Andrii Nakryiko
2022-02-23  9:32     ` Alan Maguire
2022-02-24  1:49       ` Andrii Nakryiko
2022-02-24 15:39         ` Alan Maguire
2022-03-01  1:45           ` Andrii Nakryiko
2022-01-31 16:12 ` [PATCH v3 bpf-next 3/4] selftests/bpf: add get_lib_path() helper Alan Maguire
2022-02-04 19:24   ` Andrii Nakryiko
2022-01-31 16:12 ` [PATCH v3 bpf-next 4/4] selftests/bpf: add tests for u[ret]probe attach by name Alan Maguire
2022-02-04 19:30   ` Andrii Nakryiko
2022-02-26 16:00   ` Jiri Olsa
2022-02-01  8:52 ` [PATCH v3 bpf-next 0/4] libbpf: name-based u[ret]probe attach Daniel Borkmann

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.