bpf.vger.kernel.org archive mirror
* [PATCH bpf-next v10 0/2] bpf: adding get_file_path helper
@ 2019-11-19 13:27 Wenbo Zhang
  2019-11-19 13:27 ` [PATCH bpf-next v10 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
  2019-11-19 13:27 ` [PATCH bpf-next v10 " Wenbo Zhang
  0 siblings, 2 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-11-19 13:27 UTC (permalink / raw)
  To: bpf; +Cc: ast, daniel, yhs, andrii.nakryiko, netdev

This patch series introduces a bpf helper that can be used to map a file
descriptor to a pathname.

This requirement is mainly discussed here:

  https://github.com/iovisor/bcc/issues/237

This implementation supports both local and mountable pseudo file systems,
and ensures we're in user context, where it is safe for this helper to run.

Changes since v9:

* Associate the helper patch with its selftests patch in this series

* Refactor selftests code for further simplification  


Changes since v8:

* format helper description 
 

Changes since v7:

* Use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/

* Ensure we're in user context, where it is safe for the helper to run

* Filter out unmountable pseudo filesystems, because they don't have a real path

* Supplement the description of this helper function


Changes since v6:

* Fix missing signed-off-by line


Changes since v5:

* Refactor helper to avoid unnecessary goto end by having two explicit returns


Changes since v4:

* Rename bpf_fd2path to bpf_get_file_path to be consistent with other
helper's names

* When fdget_raw fails, set ret to -EBADF instead of -EINVAL

* Remove fdput from fdget_raw's error path

* Use IS_ERR instead of IS_ERR_OR_NULL as d_path either returns a pointer
into the buffer or an error code if the path was too long

* Modify the normal path's return value to return copied string length
including NUL

* Update helper description's Return bits.

* Refactor selftests code for further simplification  


Changes since v3:

* Remove unnecessary LOCKDOWN_BPF_READ

* Refactor error handling section for enhanced readability

* Provide a test case in tools/testing/selftests/bpf

* Refactor selftests code to use real global variables instead of maps


Changes since v2:

* Fix backward compatibility

* Add helper description

* Refactor selftests to use global data instead of perf_buffer to simplify
the code

* Fix signed-off name


Wenbo Zhang (2):
  bpf: add new helper get_file_path for mapping a file descriptor to a
    pathname
  selftests/bpf: test for bpf_get_file_path() from tracepoint

 include/uapi/linux/bpf.h                      |  29 ++-
 kernel/trace/bpf_trace.c                      |  63 +++++++
 tools/include/uapi/linux/bpf.h                |  29 ++-
 .../selftests/bpf/prog_tests/get_file_path.c  | 171 ++++++++++++++++++
 .../selftests/bpf/progs/test_get_file_path.c  |  43 +++++
 5 files changed, 333 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_file_path.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_get_file_path.c

-- 
2.17.1



* [PATCH bpf-next v10 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-11-19 13:27 [PATCH bpf-next v10 0/2] bpf: adding get_file_path helper Wenbo Zhang
@ 2019-11-19 13:27 ` Wenbo Zhang
  2019-11-23  3:18   ` Alexei Starovoitov
  2019-12-05  4:20   ` [PATCH bpf-next v11 0/2] bpf: adding get_file_path helper Wenbo Zhang
  2019-11-19 13:27 ` [PATCH bpf-next v10 " Wenbo Zhang
  1 sibling, 2 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-11-19 13:27 UTC (permalink / raw)
  To: bpf; +Cc: ast, daniel, yhs, andrii.nakryiko, netdev

When people want to identify which files are being opened, read, and
written to, they can pass a file descriptor to this helper to get the
corresponding pathname. Mountable pseudo filesystems are also supported.

This requirement is mainly discussed here:

  https://github.com/iovisor/bcc/issues/237

v9->v10: addressed Andrii's feedback
- send this patch together with the selftests patch as one patch series

v8->v9:
- format helper description

v7->v8: addressed Alexei's feedback
- use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
- ensure we're in user context, where it is safe for the helper to run
- filter out unmountable pseudo filesystems, because they don't have a real path
- supplement the description of this helper function

v6->v7:
- fix missing signed-off-by line

v5->v6: addressed Andrii's feedback
- avoid unnecessary goto end by having two explicit returns

v4->v5: addressed Andrii and Daniel's feedback
- rename bpf_fd2path to bpf_get_file_path to be consistent with other
helper's names
- when fdget_raw fails, set ret to -EBADF instead of -EINVAL
- remove fdput from fdget_raw's error path
- use IS_ERR instead of IS_ERR_OR_NULL as d_path either returns a pointer
into the buffer or an error code if the path was too long
- modify the normal path's return value to return copied string length
including NUL
- update this helper description's Return bits.

v3->v4: addressed Daniel's feedback
- fix missing fdput()
- move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
- move fd2path's test code to another patch
- add comment to explain why use fdget_raw instead of fdget

v2->v3: addressed Yonghong's feedback
- remove unnecessary LOCKDOWN_BPF_READ
- refactor error handling section for enhanced readability
- provide a test case in tools/testing/selftests/bpf

v1->v2: addressed Daniel's feedback
- fix backward compatibility
- add this helper description
- fix signed-off name

Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
---
 include/uapi/linux/bpf.h       | 29 +++++++++++++++-
 kernel/trace/bpf_trace.c       | 63 ++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h | 29 +++++++++++++++-
 3 files changed, 119 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index dbbcf0b02970..71d9705df120 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2821,6 +2821,32 @@ union bpf_attr {
  * 	Return
  * 		On success, the strictly positive length of the string,	including
  * 		the trailing NUL character. On error, a negative value.
+ *
+ * int bpf_get_file_path(char *path, u32 size, int fd)
+ *	Description
+ *		Get **file** attribute from the current task by *fd*, then call
+ *		**d_path** to get its absolute path and copy it as string into
+ *		*path* of *size*. Note that the helper doesn't support
+ *		unmountable pseudo filesystems as they don't have a path
+ *		(e.g. SOCKFS, PIPEFS). The *size* must be strictly positive.
+ *		On success, the helper makes sure that the *path* is
+ *		NUL-terminated, and the buffer could be:
+ *		- a regular full path (including mountable fs, e.g. /proc, /sys)
+ *		- a regular full path with "(deleted)" at the end.
+ *		On failure, it is filled with zeroes.
+ *	Return
+ *		On success, returns the length of the copied string INCLUDING
+ *		the trailing NUL.
+ *
+ *		On failure, the returned value is one of the following:
+ *
+ *		**-EPERM** if no permission to get the path (eg: in irq ctx).
+ *
+ *		**-EBADF** if *fd* is invalid.
+ *
+ *		**-EINVAL** if *fd* corresponds to an unmountable pseudo fs
+ *
+ *		**-ENAMETOOLONG** if full path is longer than *size*
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2938,7 +2964,8 @@ union bpf_attr {
 	FN(probe_read_user),		\
 	FN(probe_read_kernel),		\
 	FN(probe_read_user_str),	\
-	FN(probe_read_kernel_str),
+	FN(probe_read_kernel_str),	\
+	FN(get_file_path),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index ffc91d4935ac..c77e55418f1e 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -762,6 +762,67 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
 	.arg1_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_3(bpf_get_file_path, char *, dst, u32, size, int, fd)
+{
+	struct file *f;
+	char *p;
+	int ret = -EBADF;
+
+	/* Ensure we're in user context which is safe for the helper to
+	 * run. This helper has no business in a kthread.
+	 */
+	if (unlikely(in_interrupt() ||
+		     current->flags & (PF_KTHREAD | PF_EXITING)))
+		return -EPERM;
+
+	/* Use fget_raw instead of fget to support O_PATH, and it doesn't
+	 * have any sleepable code, so it's ok to be here.
+	 */
+	f = fget_raw(fd);
+	if (!f)
+		goto error;
+
+	/* For unmountable pseudo filesystems, getting their fake paths
+	 * seems meaningless as they don't have a real path, and there is
+	 * no way to validate that the d_dname function pointer is always
+	 * safe to call in the current context.
+	 */
+	if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname)
+		return -EINVAL;
+
+	/* After filtering out unmountable pseudo filesystems, d_path won't
+	 * call dentry->d_op->d_dname(). The normal path doesn't have any
+	 * sleepable code, and although it uses the current macro to get
+	 * fs_struct (current->fs), we've already ensured we're in user
+	 * context, so it's ok to be here.
+	 */
+	p = d_path(&f->f_path, dst, size);
+	if (IS_ERR(p)) {
+		ret = PTR_ERR(p);
+		fput(f);
+		goto error;
+	}
+
+	ret = strlen(p);
+	memmove(dst, p, ret);
+	dst[ret++] = '\0';
+	fput(f);
+	return ret;
+
+error:
+	memset(dst, 0, size);
+	return ret;
+}
+
+static const struct bpf_func_proto bpf_get_file_path_proto = {
+	.func       = bpf_get_file_path,
+	.gpl_only   = true,
+	.ret_type   = RET_INTEGER,
+	.arg1_type  = ARG_PTR_TO_UNINIT_MEM,
+	.arg2_type  = ARG_CONST_SIZE,
+	.arg3_type  = ARG_ANYTHING,
+};
+
 static const struct bpf_func_proto *
 tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -822,6 +883,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 #endif
 	case BPF_FUNC_send_signal:
 		return &bpf_send_signal_proto;
+	case BPF_FUNC_get_file_path:
+		return &bpf_get_file_path_proto;
 	default:
 		return NULL;
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index dbbcf0b02970..71d9705df120 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2821,6 +2821,32 @@ union bpf_attr {
  * 	Return
  * 		On success, the strictly positive length of the string,	including
  * 		the trailing NUL character. On error, a negative value.
+ *
+ * int bpf_get_file_path(char *path, u32 size, int fd)
+ *	Description
+ *		Get **file** attribute from the current task by *fd*, then call
+ *		**d_path** to get its absolute path and copy it as string into
+ *		*path* of *size*. Note that the helper doesn't support
+ *		unmountable pseudo filesystems as they don't have a path
+ *		(e.g. SOCKFS, PIPEFS). The *size* must be strictly positive.
+ *		On success, the helper makes sure that the *path* is
+ *		NUL-terminated, and the buffer could be:
+ *		- a regular full path (including mountable fs, e.g. /proc, /sys)
+ *		- a regular full path with "(deleted)" at the end.
+ *		On failure, it is filled with zeroes.
+ *	Return
+ *		On success, returns the length of the copied string INCLUDING
+ *		the trailing NUL.
+ *
+ *		On failure, the returned value is one of the following:
+ *
+ *		**-EPERM** if no permission to get the path (eg: in irq ctx).
+ *
+ *		**-EBADF** if *fd* is invalid.
+ *
+ *		**-EINVAL** if *fd* corresponds to an unmountable pseudo fs
+ *
+ *		**-ENAMETOOLONG** if full path is longer than *size*
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2938,7 +2964,8 @@ union bpf_attr {
 	FN(probe_read_user),		\
 	FN(probe_read_kernel),		\
 	FN(probe_read_user_str),	\
-	FN(probe_read_kernel_str),
+	FN(probe_read_kernel_str),	\
+	FN(get_file_path),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
-- 
2.17.1



* [PATCH bpf-next v10 2/2] selftests/bpf: test for bpf_get_file_path() from tracepoint
  2019-11-19 13:27 [PATCH bpf-next v10 0/2] bpf: adding get_file_path helper Wenbo Zhang
  2019-11-19 13:27 ` [PATCH bpf-next v10 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
@ 2019-11-19 13:27 ` Wenbo Zhang
  1 sibling, 0 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-11-19 13:27 UTC (permalink / raw)
  To: bpf; +Cc: ast, daniel, yhs, andrii.nakryiko, netdev

Trace fstat events via the syscalls/sys_enter_newfstat tracepoint, and handle
only the events produced by test_get_file_path, which calls fstat on several
different types of files to exercise bpf_get_file_path.

v4->v5: addressed Andrii's feedback
- pass NULL for opts as bpf_object__open_file's PARAM2, as not really
using any
- modify patch subject to keep up with test code
- as this test is single-threaded, use getpid instead of SYS_gettid
- remove unnecessary parens around the check after if (i < 3)
- in the kernel part, use bpf_get_current_pid_tgid() >> 32 to match getpid()
in the userspace part
- send together with the patch adding the helper as one patch series

v3->v4: addressed Andrii's feedback
- use a set of fd instead of fds array
- use global variables instead of maps (in v3, I mistakenly thought that
the bpf maps are global variables.)
- remove unnecessary global variable path_info_index
- remove fd compare as the fstat's order is fixed

v2->v3: addressed Andrii's feedback
- use global data instead of perf_buffer to simplify the code

v1->v2: addressed Daniel's feedback
- rename bpf_fd2path to bpf_get_file_path to be consistent with other
helper's names

Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
---
 .../selftests/bpf/prog_tests/get_file_path.c  | 171 ++++++++++++++++++
 .../selftests/bpf/progs/test_get_file_path.c  |  43 +++++
 2 files changed, 214 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_file_path.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_get_file_path.c

diff --git a/tools/testing/selftests/bpf/prog_tests/get_file_path.c b/tools/testing/selftests/bpf/prog_tests/get_file_path.c
new file mode 100644
index 000000000000..db88545e127b
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/get_file_path.c
@@ -0,0 +1,171 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include <test_progs.h>
+#include <sys/stat.h>
+#include <linux/sched.h>
+#include <sys/syscall.h>
+
+#define MAX_PATH_LEN		128
+#define MAX_FDS			7
+#define MAX_EVENT_NUM		16
+
+static struct file_path_test_data {
+	pid_t pid;
+	__u32 cnt;
+	__u32 fds[MAX_EVENT_NUM];
+	char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
+} src, dst;
+
+static inline int set_pathname(int fd)
+{
+	char buf[MAX_PATH_LEN];
+
+	snprintf(buf, MAX_PATH_LEN, "/proc/%d/fd/%d", src.pid, fd);
+	src.fds[src.cnt] = fd;
+	return readlink(buf, src.paths[src.cnt++], MAX_PATH_LEN);
+}
+
+static int trigger_fstat_events(pid_t pid)
+{
+	int pipefd[2] = { -1, -1 };
+	int sockfd = -1, procfd = -1, devfd = -1;
+	int localfd = -1, indicatorfd = -1;
+	struct stat fileStat;
+	int ret = -1;
+
+	/* unmountable pseudo-filesystems */
+	if (CHECK_FAIL(pipe(pipefd) < 0))
+		return ret;
+	/* unmountable pseudo-filesystems */
+	sockfd = socket(AF_INET, SOCK_STREAM, 0);
+	if (CHECK_FAIL(sockfd < 0))
+		goto out_close;
+	/* mountable pseudo-filesystems */
+	procfd = open("/proc/self/comm", O_RDONLY);
+	if (CHECK_FAIL(procfd < 0))
+		goto out_close;
+	devfd = open("/dev/urandom", O_RDONLY);
+	if (CHECK_FAIL(devfd < 0))
+		goto out_close;
+	localfd = open("/tmp/fd2path_loadgen.txt", O_CREAT | O_RDONLY, 0644);
+	if (CHECK_FAIL(localfd < 0))
+		goto out_close;
+	/* bpf_get_file_path will return path with (deleted) */
+	remove("/tmp/fd2path_loadgen.txt");
+	indicatorfd = open("/tmp/", O_PATH);
+	if (CHECK_FAIL(indicatorfd < 0))
+		goto out_close;
+
+	src.pid = pid;
+
+	ret = set_pathname(pipefd[0]);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(pipefd[1]);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(sockfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(procfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(devfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(localfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(indicatorfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+
+	fstat(pipefd[0], &fileStat);
+	fstat(pipefd[1], &fileStat);
+	fstat(sockfd, &fileStat);
+	fstat(procfd, &fileStat);
+	fstat(devfd, &fileStat);
+	fstat(localfd, &fileStat);
+	fstat(indicatorfd, &fileStat);
+
+out_close:
+	close(indicatorfd);
+	close(localfd);
+	close(devfd);
+	close(procfd);
+	close(sockfd);
+	close(pipefd[1]);
+	close(pipefd[0]);
+
+	return ret;
+}
+
+void test_get_file_path(void)
+{
+	const char *prog_name = "tracepoint/syscalls/sys_enter_newfstat";
+	const char *obj_file = "test_get_file_path.o";
+	int err, results_map_fd, duration = 0;
+	struct bpf_program *tp_prog = NULL;
+	struct bpf_link *tp_link = NULL;
+	struct bpf_object *obj = NULL;
+	const int zero = 0;
+
+	obj = bpf_object__open_file(obj_file, NULL);
+	if (CHECK(IS_ERR(obj), "obj_open_file", "err %ld\n", PTR_ERR(obj)))
+		return;
+
+	tp_prog = bpf_object__find_program_by_title(obj, prog_name);
+	if (CHECK(!tp_prog, "find_tp",
+		  "prog '%s' not found\n", prog_name))
+		goto cleanup;
+
+	err = bpf_object__load(obj);
+	if (CHECK(err, "obj_load", "err %d\n", err))
+		goto cleanup;
+
+	results_map_fd = bpf_find_map(__func__, obj, "test_get.bss");
+	if (CHECK(results_map_fd < 0, "find_bss_map",
+		  "err %d\n", results_map_fd))
+		goto cleanup;
+
+	tp_link = bpf_program__attach_tracepoint(tp_prog, "syscalls",
+						 "sys_enter_newfstat");
+	if (CHECK(IS_ERR(tp_link), "attach_tp",
+		  "err %ld\n", PTR_ERR(tp_link))) {
+		tp_link = NULL;
+		goto cleanup;
+	}
+
+	dst.pid = getpid();
+	err = bpf_map_update_elem(results_map_fd, &zero, &dst, 0);
+	if (CHECK(err, "update_elem",
+		  "failed to set pid filter: %d\n", err))
+		goto cleanup;
+
+	err = trigger_fstat_events(dst.pid);
+	if (CHECK_FAIL(err < 0))
+		goto cleanup;
+
+	err = bpf_map_lookup_elem(results_map_fd, &zero, &dst);
+	if (CHECK(err, "get_results",
+		  "failed to get results: %d\n", err))
+		goto cleanup;
+
+	for (int i = 0; i < MAX_FDS; i++) {
+		if (i < 3)
+			CHECK((dst.paths[i][0] != '\0'), "get_file_path",
+			      "failed to filter fs [%d]: %u(%s) vs %u(%s)\n",
+			      i, src.fds[i], src.paths[i], dst.fds[i],
+			      dst.paths[i]);
+		else
+			CHECK(strncmp(src.paths[i], dst.paths[i], MAX_PATH_LEN),
+			      "get_file_path",
+			      "failed to get path[%d]: %u(%s) vs %u(%s)\n",
+			      i, src.fds[i], src.paths[i], dst.fds[i],
+			      dst.paths[i]);
+	}
+
+cleanup:
+	bpf_link__destroy(tp_link);
+	bpf_object__close(obj);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_get_file_path.c b/tools/testing/selftests/bpf/progs/test_get_file_path.c
new file mode 100644
index 000000000000..eae663c1262a
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_get_file_path.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <linux/ptrace.h>
+#include <string.h>
+#include <unistd.h>
+#include "bpf_helpers.h"
+#include "bpf_tracing.h"
+
+#define MAX_PATH_LEN		128
+#define MAX_EVENT_NUM		16
+
+static struct file_path_test_data {
+	pid_t pid;
+	__u32 cnt;
+	__u32 fds[MAX_EVENT_NUM];
+	char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
+} data;
+
+struct sys_enter_newfstat_args {
+	unsigned long long pad1;
+	unsigned long long pad2;
+	unsigned int fd;
+};
+
+SEC("tracepoint/syscalls/sys_enter_newfstat")
+int bpf_prog(struct sys_enter_newfstat_args *args)
+{
+	pid_t pid = bpf_get_current_pid_tgid() >> 32;
+
+	if (pid != data.pid)
+		return 0;
+	if (data.cnt >= MAX_EVENT_NUM)
+		return 0;
+
+	data.fds[data.cnt] = args->fd;
+	bpf_get_file_path(data.paths[data.cnt], MAX_PATH_LEN, args->fd);
+	data.cnt++;
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.17.1



* Re: [PATCH bpf-next v10 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-11-19 13:27 ` [PATCH bpf-next v10 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
@ 2019-11-23  3:18   ` Alexei Starovoitov
  2019-11-23  4:43     ` Al Viro
  2019-11-23  4:51     ` Al Viro
  2019-12-05  4:20   ` [PATCH bpf-next v11 0/2] bpf: adding get_file_path helper Wenbo Zhang
  1 sibling, 2 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2019-11-23  3:18 UTC (permalink / raw)
  To: Wenbo Zhang
  Cc: bpf, ast, daniel, yhs, andrii.nakryiko, netdev, viro, linux-fsdevel

On Tue, Nov 19, 2019 at 08:27:37AM -0500, Wenbo Zhang wrote:
> When people want to identify which file system files are being opened,
> read, and written to, they can use this helper with file descriptor as
> input to achieve this goal. Other pseudo filesystems are also supported.
> 
> This requirement is mainly discussed here:
> 
>   https://github.com/iovisor/bcc/issues/237
> 
> v9->v10: addressed Andrii's feedback
> - send this patch together with the patch selftests as one patch series
> 
> v8->v9:
> - format helper description
> 
> v7->v8: addressed Alexei's feedback
> - use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
> - ensure we're in user context, where it is safe for the helper to run
> - filter unmountable pseudo filesystem, because they don't have real path
> - supplement the description of this helper function
> 
> v6->v7:
> - fix missing signed-off-by line
> 
> v5->v6: addressed Andrii's feedback
> - avoid unnecessary goto end by having two explicit returns
> 
> v4->v5: addressed Andrii and Daniel's feedback
> - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> helper's names
> - when fdget_raw fails, set ret to -EBADF instead of -EINVAL
> - remove fdput from fdget_raw's error path
> - use IS_ERR instead of IS_ERR_OR_NULL as d_path either returns a pointer
> into the buffer or an error code if the path was too long
> - modify the normal path's return value to return copied string length
> including NUL
> - update this helper description's Return bits.
> 
> v3->v4: addressed Daniel's feedback
> - fix missing fdput()
> - move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
> - move fd2path's test code to another patch
> - add comment to explain why use fdget_raw instead of fdget
> 
> v2->v3: addressed Yonghong's feedback
> - remove unnecessary LOCKDOWN_BPF_READ
> - refactor error handling section for enhanced readability
> - provide a test case in tools/testing/selftests/bpf
> 
> v1->v2: addressed Daniel's feedback
> - fix backward compatibility
> - add this helper description
> - fix signed-off name
> 
> Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
...
> +BPF_CALL_3(bpf_get_file_path, char *, dst, u32, size, int, fd)
> +{
> +	struct file *f;
> +	char *p;
> +	int ret = -EBADF;
> +
> +	/* Ensure we're in user context which is safe for the helper to
> +	 * run. This helper has no business in a kthread.
> +	 */
> +	if (unlikely(in_interrupt() ||
> +		     current->flags & (PF_KTHREAD | PF_EXITING)))
> +		return -EPERM;
> +
> +	/* Use fget_raw instead of fget to support O_PATH, and it doesn't
> +	 * have any sleepable code, so it's ok to be here.
> +	 */
> +	f = fget_raw(fd);
> +	if (!f)
> +		goto error;
> +
> +	/* For unmountable pseudo filesystem, it seems to have no meaning
> +	 * to get their fake paths as they don't have path, and to be no
> +	 * way to validate this function pointer can be always safe to call
> +	 * in the current context.
> +	 */
> +	if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname)
> +		return -EINVAL;
> +
> +	/* After filter unmountable pseudo filesytem, d_path won't call
> +	 * dentry->d_op->d_name(), the normally path doesn't have any
> +	 * sleepable code, and despite it uses the current macro to get
> +	 * fs_struct (current->fs), we've already ensured we're in user
> +	 * context, so it's ok to be here.
> +	 */
> +	p = d_path(&f->f_path, dst, size);
> +	if (IS_ERR(p)) {
> +		ret = PTR_ERR(p);
> +		fput(f);
> +		goto error;
> +	}
> +
> +	ret = strlen(p);
> +	memmove(dst, p, ret);
> +	dst[ret++] = '\0';
> +	fput(f);
> +	return ret;
> +
> +error:
> +	memset(dst, 0, size);
> +	return ret;
> +}

Al,

could you please review the above code and say whether it's doing enough checks
to be called safely from a preempt_disabled region?

It's been under review for many weeks and looks good from the bpf pov.
Essentially, tracing folks need an easy way to convert an FD to a full path
name. This feature request first came in 2015.

Thanks!



* Re: [PATCH bpf-next v10 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-11-23  3:18   ` Alexei Starovoitov
@ 2019-11-23  4:43     ` Al Viro
  2019-11-23  4:51     ` Al Viro
  1 sibling, 0 replies; 52+ messages in thread
From: Al Viro @ 2019-11-23  4:43 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Wenbo Zhang, bpf, ast, daniel, yhs, andrii.nakryiko, netdev,
	linux-fsdevel

On Fri, Nov 22, 2019 at 07:18:28PM -0800, Alexei Starovoitov wrote:
> > +	/* After filter unmountable pseudo filesytem, d_path won't call
> > +	 * dentry->d_op->d_name(), the normally path doesn't have any
> > +	 * sleepable code, and despite it uses the current macro to get
> > +	 * fs_struct (current->fs), we've already ensured we're in user
> > +	 * context, so it's ok to be here.
> > +	 */
> > +	p = d_path(&f->f_path, dst, size);
> > +	if (IS_ERR(p)) {
> > +		ret = PTR_ERR(p);
> > +		fput(f);
> > +		goto error;
> > +	}
> > +
> > +	ret = strlen(p);
> > +	memmove(dst, p, ret);
> > +	dst[ret++] = '\0';
> > +	fput(f);
> > +	return ret;
> > +
> > +error:
> > +	memset(dst, 0, size);
> > +	return ret;
> > +}
> 
> Al,
> 
> could you please review about code whether it's doing enough checks to be
> called safely from preempt_disabled region?

Depends.  Which context is it running in?  In particular, which
locks might be already held?


* Re: [PATCH bpf-next v10 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-11-23  3:18   ` Alexei Starovoitov
  2019-11-23  4:43     ` Al Viro
@ 2019-11-23  4:51     ` Al Viro
  2019-11-23  5:19       ` Alexei Starovoitov
  1 sibling, 1 reply; 52+ messages in thread
From: Al Viro @ 2019-11-23  4:51 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Wenbo Zhang, bpf, ast, daniel, yhs, andrii.nakryiko, netdev,
	linux-fsdevel

On Fri, Nov 22, 2019 at 07:18:28PM -0800, Alexei Starovoitov wrote:
> > +	f = fget_raw(fd);
> > +	if (!f)
> > +		goto error;
> > +
> > +	/* For unmountable pseudo filesystem, it seems to have no meaning
> > +	 * to get their fake paths as they don't have path, and to be no
> > +	 * way to validate this function pointer can be always safe to call
> > +	 * in the current context.
> > +	 */
> > +	if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname)
> > +		return -EINVAL;

An obvious leak here, BTW.
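
The early return -EINVAL skips fput(), so the reference taken by fget_raw()
is never dropped. A user-space sketch of the usual fix follows. It is only a
model: struct obj, obj_get() and obj_put() are hypothetical stand-ins for
struct file, fget_raw() and fput(), and none of them are kernel APIs.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file: a reference count and a flag. */
struct obj {
	int refs;
	int has_dname;	/* models dentry->d_op->d_dname being set */
};

/* Models fget_raw(): take a reference if the object exists. */
static struct obj *obj_get(struct obj *o)
{
	if (o)
		o->refs++;
	return o;
}

/* Models fput(): drop the reference taken by obj_get(). */
static void obj_put(struct obj *o)
{
	o->refs--;
}

/* Mirrors the control flow of the patch: the -EINVAL path leaks. */
static int lookup_leaky(struct obj *o)
{
	struct obj *f = obj_get(o);

	if (!f)
		return -EBADF;
	if (f->has_dname)
		return -EINVAL;		/* reference is never dropped */
	obj_put(f);
	return 0;
}

/* Every exit taken after a successful get passes through obj_put(). */
static int lookup_fixed(struct obj *o)
{
	struct obj *f = obj_get(o);
	int ret = 0;

	if (!f)
		return -EBADF;
	if (f->has_dname)
		ret = -EINVAL;
	obj_put(f);
	return ret;
}
```

In the helper itself the equivalent change would be to call fput(f) before
returning -EINVAL, or to route that path through a shared cleanup label.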

Anyway, what could that be used for?  I mean, if you want to check
something about syscall arguments, that's an unfixably racy way to go.
Descriptor table can be a shared data structure, and two consequent
fdget() on the same number can bloody well yield completely unrelated
struct file references.

IOW, anything that does descriptor -> struct file * translation more than
once is an instant TOCTOU suspect.  In this particular case, the function
will produce a pathname of something that was once reachable via descriptor
with this number; quite possibly never before that function had been called
_and_ not once after it has returned.
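
The point that a descriptor number does not name a stable file can be shown
from user space. This is a sequential sketch, assuming Linux with /proc
mounted; resolve_fd and fd_number_reused are hypothetical names. With a
shared descriptor table the same reuse can happen from another thread between
the tracing hook and the syscall's own lookup.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Resolve a descriptor of the calling process via /proc (Linux-specific). */
static int resolve_fd(int fd, char *buf, size_t size)
{
	char link[64];
	ssize_t n;

	snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
	n = readlink(link, buf, size - 1);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	return 0;
}

/*
 * Open path_a, resolve its descriptor, close it, then open path_b.
 * open() returns the lowest free number, so in a single-threaded process
 * the second open reuses the number that was just released.
 * Returns 1 if one number named two different files, 0 if not, -1 on error.
 */
static int fd_number_reused(const char *path_a, const char *path_b)
{
	char first[256], second[256];
	int fd_a, fd_b, ret;

	fd_a = open(path_a, O_RDONLY);
	if (fd_a < 0)
		return -1;
	ret = resolve_fd(fd_a, first, sizeof(first));
	close(fd_a);
	if (ret < 0)
		return -1;

	fd_b = open(path_b, O_RDONLY);
	if (fd_b < 0)
		return -1;
	ret = resolve_fd(fd_b, second, sizeof(second));
	close(fd_b);
	if (ret < 0)
		return -1;

	return fd_a == fd_b && strcmp(first, second) != 0;
}
```

A path obtained from the number alone therefore describes whatever file held
that slot at lookup time, not necessarily the file the syscall operates on.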


* Re: [PATCH bpf-next v10 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-11-23  4:51     ` Al Viro
@ 2019-11-23  5:19       ` Alexei Starovoitov
  2019-11-23  5:35         ` Al Viro
  0 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2019-11-23  5:19 UTC (permalink / raw)
  To: Al Viro
  Cc: Wenbo Zhang, bpf, ast, daniel, yhs, andrii.nakryiko, netdev,
	linux-fsdevel

On Sat, Nov 23, 2019 at 04:51:51AM +0000, Al Viro wrote:
> On Fri, Nov 22, 2019 at 07:18:28PM -0800, Alexei Starovoitov wrote:
> > > +	f = fget_raw(fd);
> > > +	if (!f)
> > > +		goto error;
> > > +
> > > +	/* For unmountable pseudo filesystem, it seems to have no meaning
> > > +	 * to get their fake paths as they don't have path, and to be no
> > > +	 * way to validate this function pointer can be always safe to call
> > > +	 * in the current context.
> > > +	 */
> > > +	if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname)
> > > +		return -EINVAL;
> 
> An obvious leak here, BTW.

ohh. right.

> Depends.  Which context is it running in?  In particular, which
> locks might be already held?

hard to tell. It will be run out of bpf prog that attaches to kprobe or
tracepoint. What is the concern about locking?
d_path() doesn't take any locks and doesn't depend on any locks. Above 'if'
checks that plain d_path() is used and not some specialized callback with
unknown logic.

> Anyway, what could that be used for?  I mean, if you want to check
> something about syscall arguments, that's an unfixably racy way to go.
> Descriptor table can be a shared data structure, and two consequent
> fdget() on the same number can bloody well yield completely unrelated
> struct file references.

yes. It is racy. There are no guarantees on correctness of FD.
The program can pass arbitrary integer into this helper.

> IOW, anything that does descriptor -> struct file * translation more than
> once is an instant TOCTOU suspect.  In this particular case, the function
> will produce a pathname of something that was once reachable via descriptor
> with this number; quite possibly never before that function had been called
> _and_ not once after it has returned.

Right. TOCTOU is not a concern here. It's tracing. It's ok for full path to be
'one time deal'. Right now people use bpf_probe_read() to replicate what
d_path() does.
See https://github.com/iovisor/bcc/issues/237#issuecomment-547564661
It sort of works, but calling d_path() is simpler and more accurate.
The key thing that bpf helpers need to make sure is that regardless of
how they're called and what integer is passed in as an FD the helper
must not crash or lockup the kernel or cause it to misbehave.
Hence above in_interrupt() and other checks to limit the context.



* Re: [PATCH bpf-next v10 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-11-23  5:19       ` Alexei Starovoitov
@ 2019-11-23  5:35         ` Al Viro
  2019-11-23  6:04           ` Alexei Starovoitov
  0 siblings, 1 reply; 52+ messages in thread
From: Al Viro @ 2019-11-23  5:35 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Wenbo Zhang, bpf, ast, daniel, yhs, andrii.nakryiko, netdev,
	linux-fsdevel

On Fri, Nov 22, 2019 at 09:19:21PM -0800, Alexei Starovoitov wrote:

> hard to tell. It will be run out of bpf prog that attaches to kprobe or
> tracepoint. What is the concern about locking?
> d_path() doesn't take any locks and doesn't depend on any locks. Above 'if'
> checks that plain d_path() is used and not some specialized callback with
> unknown logic.

It sure as hell does.  It might end up taking rename_lock and/or mount_lock
spinlock components.  It'll try not to, but if the first pass ends up with
seqlock mismatch, it will just grab the spinlock the second time around.
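
The retry behaviour described above can be modelled as follows. This is a
single-threaded sketch of the control flow only, not kernel code: struct
model_seqlock, model_write() and model_read_walk() are hypothetical, and the
writes_during_first_pass parameter merely simulates concurrent writers. In
the kernel the corresponding primitives are read_seqbegin_or_lock(),
need_seqretry() and done_seqretry().

```c
#include <stdbool.h>

/* Hypothetical single-threaded model of a seqlock; not a kernel type. */
struct model_seqlock {
	unsigned int sequence;		/* bumped by 2 per completed write */
	bool locked;			/* models the spinlock half */
	int lock_acquisitions;		/* how often a reader fell back to it */
};

/* Model one complete writer critical section. */
static void model_write(struct model_seqlock *sl)
{
	sl->sequence += 2;
}

/*
 * Model a reader. The first pass is lockless and only snapshots the
 * sequence. If the sequence moved during the walk, the second pass takes
 * the lock. Returns the number of passes that were needed.
 */
static int model_read_walk(struct model_seqlock *sl,
			   int writes_during_first_pass)
{
	unsigned int seq = 0;	/* even: lockless pass, odd: locked pass */
	int passes = 0;
	int i;

restart:
	if (!(seq & 1)) {
		seq = sl->sequence;		/* lockless snapshot */
	} else {
		sl->locked = true;		/* fallback: exclude writers */
		sl->lock_acquisitions++;
	}
	passes++;

	/* ... the dentry/mount walk would happen here ... */
	if (passes == 1)
		for (i = 0; i < writes_during_first_pass; i++)
			model_write(sl);

	if (!(seq & 1) && sl->sequence != seq) {
		seq = 1;			/* retry, this time locked */
		goto restart;
	}
	if (seq & 1)
		sl->locked = false;
	return passes;
}
```

The caller has no control over whether the second, locked pass happens, which
is why the context the helper runs in matters: if the lock is already held
there, the fallback would spin forever.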

> > with this number; quite possibly never before that function had been called
> > _and_ not once after it has returned.
> 
> Right. TOCTOU is not a concern here. It's tracing. It's ok for full path to be
> 'one time deal'.

It might very well be a full path of something completely unrelated to what
the syscall ends up operating upon.  It's not that the file might've been
moved; it might be a different file.  IOW, results of that tracing might be
misleading.
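
A small user-space sketch of that effect, assuming a Linux system with
/proc mounted and /dev/null and /dev/zero present (fd_to_path and
fd_number_can_change_meaning are illustrative names): the same
descriptor number resolves to two different paths once something
replaces the file behind it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Resolve fd to a path via readlink() on /proc/self/fd/<fd>.
 * Returns 0 on success, -1 on failure.
 */
static int fd_to_path(int fd, char *buf, size_t size)
{
	char link[64];
	ssize_t n;

	assert(size > 1);
	snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
	n = readlink(link, buf, size - 1);
	if (n < 0)
		return -1;
	buf[n] = '\0';		/* readlink() does not NUL-terminate */
	return 0;
}

/* Returns 1 if one fd number resolved to two different paths, 0 if it
 * did not, and -1 if the environment lacks /proc or the device nodes.
 */
int fd_number_can_change_meaning(void)
{
	char before[256], after[256];
	int fd, other, ret = -1;

	fd = open("/dev/null", O_RDONLY);
	other = open("/dev/zero", O_RDONLY);
	if (fd < 0 || other < 0)
		goto out;
	if (fd_to_path(fd, before, sizeof(before)))
		goto out;
	/* Another thread could do this between the trace and the syscall. */
	if (dup2(other, fd) < 0)
		goto out;
	if (fd_to_path(fd, after, sizeof(after)))
		goto out;
	ret = strcmp(before, after) != 0;
out:
	if (other >= 0)
		close(other);
	if (fd >= 0)
		close(fd);
	return ret;
}
```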

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v10 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-11-23  5:35         ` Al Viro
@ 2019-11-23  6:04           ` Alexei Starovoitov
  2019-12-13 19:51             ` Brendan Gregg
  0 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2019-11-23  6:04 UTC (permalink / raw)
  To: Al Viro
  Cc: Wenbo Zhang, bpf, ast, daniel, yhs, andrii.nakryiko, netdev,
	linux-fsdevel

On Sat, Nov 23, 2019 at 05:35:14AM +0000, Al Viro wrote:
> On Fri, Nov 22, 2019 at 09:19:21PM -0800, Alexei Starovoitov wrote:
> 
> > hard to tell. It will be run out of bpf prog that attaches to kprobe or
> > tracepoint. What is the concern about locking?
> > d_path() doesn't take any locks and doesn't depend on any locks. Above 'if'
> > checks that plain d_path() is used and not some specialized callback with
> > unknown logic.
> 
> It sure as hell does.  It might end up taking rename_lock and/or mount_lock
> spinlock components.  It'll try not to, but if the first pass ends up with
> seqlock mismatch, it will just grab the spinlock the second time around.

ohh. got it. I missed _or_lock() part in there.
The need_seqretry() logic is tricky. afaics there is no way for the checks
outside of prepend_path() to prevent spin_lock to happen. And adding a flag to
prepend_path() to return early if retry is needed is too ugly. So this helper
won't be safe to be run out of kprobe. But if we allow it for tracepoints only
it should be ok. I think. There are no tracepoints in inner guts of vfs and I
don't think they will ever be. So running in tracepoint->bpf_prog->d_path we
will be sure that rename_lock+mount_lock can be safely spinlocked. Am I missing
something?
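
To make the kprobe concern concrete, here is a toy user-space model
(toy_spinlock and the function names are invented for illustration, and
a failed trylock stands in for a spin_lock() that would never return).
A probe that fires while the interrupted code already holds the lock
cannot let the helper take the same lock again:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock. */
struct toy_spinlock {
	bool held;
};

static bool toy_trylock(struct toy_spinlock *l)
{
	if (l->held)
		return false;	/* a real spin_lock() would spin here forever */
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_spinlock *l)
{
	l->held = false;
}

/* Stand-in for the helper's slow path, which needs the lock. */
bool helper_would_complete(struct toy_spinlock *rename_lock)
{
	if (!toy_trylock(rename_lock))
		return false;
	toy_unlock(rename_lock);
	return true;
}

/* Probe placed on code that runs with the lock already held. */
bool probe_inside_critical_section(struct toy_spinlock *rename_lock)
{
	bool got = toy_trylock(rename_lock);
	bool ok;

	assert(got);
	(void)got;
	ok = helper_would_complete(rename_lock);
	toy_unlock(rename_lock);
	return ok;
}

/* Probe placed on code that never holds the lock. */
bool probe_outside_critical_section(struct toy_spinlock *rename_lock)
{
	return helper_would_complete(rename_lock);
}
```

The second function corresponds to the tracepoint case argued for above:
as long as no attach point sits inside a region that holds the lock, the
helper can always make progress.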

> > > with this number; quite possibly never before that function had been called
> > > _and_ not once after it has returned.
> > 
> > Right. TOCTOU is not a concern here. It's tracing. It's ok for full path to be
> > 'one time deal'.
> 
> It might very well be a full path of something completely unrelated to what
> the syscall ends up operating upon.  It's not that the file might've been
> moved; it might be a different file.  IOW, results of that tracing might be
> misleading.

That is correct. Tracing is fine with such limitation. Still better than probe_read.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH bpf-next v11 0/2] bpf: adding get_file_path helper
  2019-11-19 13:27 ` [PATCH bpf-next v10 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
  2019-11-23  3:18   ` Alexei Starovoitov
@ 2019-12-05  4:20   ` Wenbo Zhang
  2019-12-05  4:20     ` [PATCH bpf-next v11 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
  2019-12-05  4:20     ` [PATCH bpf-next v11 " Wenbo Zhang
  1 sibling, 2 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-05  4:20 UTC (permalink / raw)
  To: bpf; +Cc: ast, daniel, yhs, andrii.nakryiko, netdev

This patch series introduces a bpf helper that can be used to map a file
descriptor to a pathname.

This requirement is mainly discussed here:

  https://github.com/iovisor/bcc/issues/237

This implementation supports both local and mountable pseudo file systems,
and ensures we're in user context, which is safe for this helper to run.

Changes since v10:

* fix missing fput


Changes since v9:

* Associate the helper patch with its selftests patch in this series

* Refactor selftests code for further simplification  


Changes since v8:

* format helper description 
 

Changes since v7:

* Use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/

* Ensure we're in user context, which is safe for the helper to run

* Filter unmountable pseudo filesystem, because they don't have real path

* Supplement the description of this helper function


Changes since v6:

* Fix missing signed-off-by line


Changes since v5:

* Refactor helper to avoid unnecessary goto end by having two explicit returns


Changes since v4:

* Rename bpf_fd2path to bpf_get_file_path to be consistent with other
helper's names

* When fdget_raw fails, set ret to -EBADF instead of -EINVAL

* Remove fdput from fdget_raw's error path

* Use IS_ERR instead of IS_ERR_OR_NULL as d_path either returns a pointer
into the buffer or an error code if the path was too long

* Modify the normal path's return value to return copied string length
including NUL

* Update helper description's Return bits.

* Refactor selftests code for further simplification  


Changes since v3:

* Remove unnecessary LOCKDOWN_BPF_READ

* Refactor error handling section for enhanced readability

* Provide a test case in tools/testing/selftests/bpf

* Refactor selftests code to use real global variables instead of maps


Changes since v2:

* Fix backward compatibility

* Add helper description

* Refactor selftests to use global data instead of perf_buffer to simplify
the code

* Fix signed-off name


Wenbo Zhang (2):
  bpf: add new helper get_file_path for mapping a file descriptor to a
    pathname
  selftests/bpf: test for bpf_get_file_path() from tracepoint

 include/uapi/linux/bpf.h                      |  29 ++-
 kernel/trace/bpf_trace.c                      |  68 +++++++
 tools/include/uapi/linux/bpf.h                |  29 ++-
 .../selftests/bpf/prog_tests/get_file_path.c  | 171 ++++++++++++++++++
 .../selftests/bpf/progs/test_get_file_path.c  |  43 +++++
 5 files changed, 338 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_file_path.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_get_file_path.c

-- 
2.17.1


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH bpf-next v11 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-12-05  4:20   ` [PATCH bpf-next v11 0/2] bpf: adding get_file_path helper Wenbo Zhang
@ 2019-12-05  4:20     ` Wenbo Zhang
  2019-12-05  7:19       ` Alexei Starovoitov
  2019-12-15  4:01       ` [PATCH bpf-next v12 0/2] bpf: adding get_file_path helper Wenbo Zhang
  2019-12-05  4:20     ` [PATCH bpf-next v11 " Wenbo Zhang
  1 sibling, 2 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-05  4:20 UTC (permalink / raw)
  To: bpf; +Cc: ast, daniel, yhs, andrii.nakryiko, netdev

When people want to identify which files are being opened, read, or
written, they can use this helper with a file descriptor as input.
Mountable pseudo filesystems (e.g. /proc, /sys) are also supported.

This requirement is mainly discussed here:

  https://github.com/iovisor/bcc/issues/237

v10->v11: addressed Al and Alexei's feedback
- fix missing fput()

v9->v10: addressed Andrii's feedback
- send this patch together with the selftests patch as one patch series

v8->v9:
- format helper description

v7->v8: addressed Alexei's feedback
- use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
- ensure we're in user context, which is safe for the helper to run
- filter unmountable pseudo filesystem, because they don't have real path
- supplement the description of this helper function

v6->v7:
- fix missing signed-off-by line

v5->v6: addressed Andrii's feedback
- avoid unnecessary goto end by having two explicit returns

v4->v5: addressed Andrii and Daniel's feedback
- rename bpf_fd2path to bpf_get_file_path to be consistent with other
helper's names
- when fdget_raw fails, set ret to -EBADF instead of -EINVAL
- remove fdput from fdget_raw's error path
- use IS_ERR instead of IS_ERR_OR_NULL as d_path either returns a pointer
into the buffer or an error code if the path was too long
- modify the normal path's return value to return copied string length
including NUL
- update this helper description's Return bits.

v3->v4: addressed Daniel's feedback
- fix missing fdput()
- move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
- move fd2path's test code to another patch
- add comment to explain why use fdget_raw instead of fdget

v2->v3: addressed Yonghong's feedback
- remove unnecessary LOCKDOWN_BPF_READ
- refactor error handling section for enhanced readability
- provide a test case in tools/testing/selftests/bpf

v1->v2: addressed Daniel's feedback
- fix backward compatibility
- add this helper description
- fix signed-off name

Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
---
 include/uapi/linux/bpf.h       | 29 ++++++++++++++-
 kernel/trace/bpf_trace.c       | 68 ++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h | 29 ++++++++++++++-
 3 files changed, 124 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index dbbcf0b02970..71d9705df120 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2821,6 +2821,32 @@ union bpf_attr {
  * 	Return
  * 		On success, the strictly positive length of the string,	including
  * 		the trailing NUL character. On error, a negative value.
+ *
+ * int bpf_get_file_path(char *path, u32 size, int fd)
+ *	Description
+ *		Get **file** attribute from the current task by *fd*, then call
+ *		**d_path** to get its absolute path and copy it as string into
+ *		*path* of *size*. Notice the **path** doesn't support unmountable
+ *		pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
+ *		The *size* must be strictly positive. On success, the helper
+ *		makes sure that the *path* is NUL-terminated, and the buffer
+ *		could be:
+ *		- a regular full path (include mountable fs eg: /proc, /sys)
+ *		- a regular full path with "(deleted)" at the end.
+ *		On failure, it is filled with zeroes.
+ *	Return
+ *		On success, returns the length of the copied string INCLUDING
+ *		the trailing NUL.
+ *
+ *		On failure, the returned value is one of the following:
+ *
+ *		**-EPERM** if no permission to get the path (eg: in irq ctx).
+ *
+ *		**-EBADF** if *fd* is invalid.
+ *
+ *		**-EINVAL** if *fd* corresponds to an unmountable pseudo fs
+ *
+ *		**-ENAMETOOLONG** if full path is longer than *size*
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2938,7 +2964,8 @@ union bpf_attr {
 	FN(probe_read_user),		\
 	FN(probe_read_kernel),		\
 	FN(probe_read_user_str),	\
-	FN(probe_read_kernel_str),
+	FN(probe_read_kernel_str),	\
+	FN(get_file_path),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index ffc91d4935ac..16df8163d681 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -762,6 +762,72 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
 	.arg1_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_3(bpf_get_file_path, char *, dst, u32, size, int, fd)
+{
+	struct file *f;
+	char *p;
+	int ret = -EBADF;
+
+	/* Ensure we're in user context which is safe for the helper to
+	 * run. This helper has no business in a kthread.
+	 */
+	if (unlikely(in_interrupt() ||
+		     current->flags & (PF_KTHREAD | PF_EXITING))) {
+		ret = -EPERM;
+		goto error;
+	}
+
+	/* Use fget_raw instead of fget to support O_PATH, and it doesn't
+	 * have any sleepable code, so it's ok to be here.
+	 */
+	f = fget_raw(fd);
+	if (!f)
+		goto error;
+
+	/* For unmountable pseudo filesystems, getting their fake paths is
+	 * meaningless because they have no real path, and there is no way
+	 * to validate that this function pointer is always safe to call
+	 * in the current context.
+	 */
+	if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
+		ret = -EINVAL;
+		fput(f);
+		goto error;
+	}
+
+	/* After filtering out unmountable pseudo filesystems, d_path won't
+	 * call dentry->d_op->d_dname(). The normal path doesn't have any
+	 * sleepable code, and although it uses the current macro to get
+	 * fs_struct (current->fs), we've already ensured we're in user
+	 * context, so it's ok to be here.
+	 */
+	p = d_path(&f->f_path, dst, size);
+	if (IS_ERR(p)) {
+		ret = PTR_ERR(p);
+		fput(f);
+		goto error;
+	}
+
+	ret = strlen(p);
+	memmove(dst, p, ret);
+	dst[ret++] = '\0';
+	fput(f);
+	return ret;
+
+error:
+	memset(dst, 0, size);
+	return ret;
+}
+
+static const struct bpf_func_proto bpf_get_file_path_proto = {
+	.func       = bpf_get_file_path,
+	.gpl_only   = true,
+	.ret_type   = RET_INTEGER,
+	.arg1_type  = ARG_PTR_TO_UNINIT_MEM,
+	.arg2_type  = ARG_CONST_SIZE,
+	.arg3_type  = ARG_ANYTHING,
+};
+
 static const struct bpf_func_proto *
 tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -822,6 +888,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 #endif
 	case BPF_FUNC_send_signal:
 		return &bpf_send_signal_proto;
+	case BPF_FUNC_get_file_path:
+		return &bpf_get_file_path_proto;
 	default:
 		return NULL;
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index dbbcf0b02970..71d9705df120 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2821,6 +2821,32 @@ union bpf_attr {
  * 	Return
  * 		On success, the strictly positive length of the string,	including
  * 		the trailing NUL character. On error, a negative value.
+ *
+ * int bpf_get_file_path(char *path, u32 size, int fd)
+ *	Description
+ *		Get **file** attribute from the current task by *fd*, then call
+ *		**d_path** to get its absolute path and copy it as string into
+ *		*path* of *size*. Notice the **path** doesn't support unmountable
+ *		pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
+ *		The *size* must be strictly positive. On success, the helper
+ *		makes sure that the *path* is NUL-terminated, and the buffer
+ *		could be:
+ *		- a regular full path (include mountable fs eg: /proc, /sys)
+ *		- a regular full path with "(deleted)" at the end.
+ *		On failure, it is filled with zeroes.
+ *	Return
+ *		On success, returns the length of the copied string INCLUDING
+ *		the trailing NUL.
+ *
+ *		On failure, the returned value is one of the following:
+ *
+ *		**-EPERM** if no permission to get the path (eg: in irq ctx).
+ *
+ *		**-EBADF** if *fd* is invalid.
+ *
+ *		**-EINVAL** if *fd* corresponds to an unmountable pseudo fs
+ *
+ *		**-ENAMETOOLONG** if full path is longer than *size*
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2938,7 +2964,8 @@ union bpf_attr {
 	FN(probe_read_user),		\
 	FN(probe_read_kernel),		\
 	FN(probe_read_user_str),	\
-	FN(probe_read_kernel_str),
+	FN(probe_read_kernel_str),	\
+	FN(get_file_path),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH bpf-next v11 2/2] selftests/bpf: test for bpf_get_file_path() from tracepoint
  2019-12-05  4:20   ` [PATCH bpf-next v11 0/2] bpf: adding get_file_path helper Wenbo Zhang
  2019-12-05  4:20     ` [PATCH bpf-next v11 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
@ 2019-12-05  4:20     ` Wenbo Zhang
  1 sibling, 0 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-05  4:20 UTC (permalink / raw)
  To: bpf; +Cc: ast, daniel, yhs, andrii.nakryiko, netdev

Trace fstat events via the tracepoint syscalls/sys_enter_newfstat, and
handle only events produced by test_get_file_path, which calls fstat on
several different types of files to exercise bpf_get_file_path.

v4->v5: addressed Andrii's feedback
- pass NULL for opts as bpf_object__open_file's PARAM2, as not really
using any
- modify patch subject to keep up with test code
- as this test is single-threaded, use getpid instead of SYS_gettid
- remove unnecessary parens around the check after if (i < 3)
- in kern use bpf_get_current_pid_tgid() >> 32 to fit getpid() in
userspace part
- with the patch adding helper as one patch series
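
For reference, the ">> 32" in the changelog above relies on
bpf_get_current_pid_tgid() packing the thread group id (what getpid()
reports in user space) into the upper 32 bits and the thread id into the
lower 32 bits. A minimal sketch with illustrative helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Pack the two ids the way bpf_get_current_pid_tgid() returns them:
 * tgid in the upper 32 bits, thread id in the lower 32 bits.
 */
uint64_t toy_pid_tgid(uint32_t tgid, uint32_t tid)
{
	return ((uint64_t)tgid << 32) | tid;
}

/* What user space calls the process id (getpid()). */
uint32_t process_id_of(uint64_t pid_tgid)
{
	return (uint32_t)(pid_tgid >> 32);
}

/* What user space calls the thread id (gettid()). */
uint32_t thread_id_of(uint64_t pid_tgid)
{
	return (uint32_t)pid_tgid;
}
```

In a single-threaded test both values are equal, which is why comparing
the shifted value against getpid() is sufficient there.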

v3->v4: addressed Andrii's feedback
- use a set of fd instead of fds array
- use global variables instead of maps (in v3, I mistakenly thought that
the bpf maps are global variables.)
- remove unnecessary global variable path_info_index
- remove fd compare as the fstat's order is fixed

v2->v3: addressed Andrii's feedback
- use global data instead of perf_buffer to simplify code

v1->v2: addressed Daniel's feedback
- rename bpf_fd2path to bpf_get_file_path to be consistent with other
helper's names

Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
---
 .../selftests/bpf/prog_tests/get_file_path.c  | 171 ++++++++++++++++++
 .../selftests/bpf/progs/test_get_file_path.c  |  43 +++++
 2 files changed, 214 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_file_path.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_get_file_path.c

diff --git a/tools/testing/selftests/bpf/prog_tests/get_file_path.c b/tools/testing/selftests/bpf/prog_tests/get_file_path.c
new file mode 100644
index 000000000000..db88545e127b
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/get_file_path.c
@@ -0,0 +1,171 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include <test_progs.h>
+#include <sys/stat.h>
+#include <linux/sched.h>
+#include <sys/syscall.h>
+
+#define MAX_PATH_LEN		128
+#define MAX_FDS			7
+#define MAX_EVENT_NUM		16
+
+static struct file_path_test_data {
+	pid_t pid;
+	__u32 cnt;
+	__u32 fds[MAX_EVENT_NUM];
+	char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
+} src, dst;
+
+static inline int set_pathname(int fd)
+{
+	char buf[MAX_PATH_LEN];
+
+	snprintf(buf, MAX_PATH_LEN, "/proc/%d/fd/%d", src.pid, fd);
+	src.fds[src.cnt] = fd;
+	return readlink(buf, src.paths[src.cnt++], MAX_PATH_LEN);
+}
+
+static int trigger_fstat_events(pid_t pid)
+{
+	int pipefd[2] = { -1, -1 };
+	int sockfd = -1, procfd = -1, devfd = -1;
+	int localfd = -1, indicatorfd = -1;
+	struct stat fileStat;
+	int ret = -1;
+
+	/* unmountable pseudo-filesystems */
+	if (CHECK_FAIL(pipe(pipefd) < 0))
+		return ret;
+	/* unmountable pseudo-filesystems */
+	sockfd = socket(AF_INET, SOCK_STREAM, 0);
+	if (CHECK_FAIL(sockfd < 0))
+		goto out_close;
+	/* mountable pseudo-filesystems */
+	procfd = open("/proc/self/comm", O_RDONLY);
+	if (CHECK_FAIL(procfd < 0))
+		goto out_close;
+	devfd = open("/dev/urandom", O_RDONLY);
+	if (CHECK_FAIL(devfd < 0))
+		goto out_close;
+	localfd = open("/tmp/fd2path_loadgen.txt", O_CREAT | O_RDONLY, 0644);
+	if (CHECK_FAIL(localfd < 0))
+		goto out_close;
+	/* bpf_get_file_path will return path with (deleted) */
+	remove("/tmp/fd2path_loadgen.txt");
+	indicatorfd = open("/tmp/", O_PATH);
+	if (CHECK_FAIL(indicatorfd < 0))
+		goto out_close;
+
+	src.pid = pid;
+
+	ret = set_pathname(pipefd[0]);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(pipefd[1]);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(sockfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(procfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(devfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(localfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(indicatorfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+
+	fstat(pipefd[0], &fileStat);
+	fstat(pipefd[1], &fileStat);
+	fstat(sockfd, &fileStat);
+	fstat(procfd, &fileStat);
+	fstat(devfd, &fileStat);
+	fstat(localfd, &fileStat);
+	fstat(indicatorfd, &fileStat);
+
+out_close:
+	close(indicatorfd);
+	close(localfd);
+	close(devfd);
+	close(procfd);
+	close(sockfd);
+	close(pipefd[1]);
+	close(pipefd[0]);
+
+	return ret;
+}
+
+void test_get_file_path(void)
+{
+	const char *prog_name = "tracepoint/syscalls/sys_enter_newfstat";
+	const char *obj_file = "test_get_file_path.o";
+	int err, results_map_fd, duration = 0;
+	struct bpf_program *tp_prog = NULL;
+	struct bpf_link *tp_link = NULL;
+	struct bpf_object *obj = NULL;
+	const int zero = 0;
+
+	obj = bpf_object__open_file(obj_file, NULL);
+	if (CHECK(IS_ERR(obj), "obj_open_file", "err %ld\n", PTR_ERR(obj)))
+		return;
+
+	tp_prog = bpf_object__find_program_by_title(obj, prog_name);
+	if (CHECK(!tp_prog, "find_tp",
+		  "prog '%s' not found\n", prog_name))
+		goto cleanup;
+
+	err = bpf_object__load(obj);
+	if (CHECK(err, "obj_load", "err %d\n", err))
+		goto cleanup;
+
+	results_map_fd = bpf_find_map(__func__, obj, "test_get.bss");
+	if (CHECK(results_map_fd < 0, "find_bss_map",
+		  "err %d\n", results_map_fd))
+		goto cleanup;
+
+	tp_link = bpf_program__attach_tracepoint(tp_prog, "syscalls",
+						 "sys_enter_newfstat");
+	if (CHECK(IS_ERR(tp_link), "attach_tp",
+		  "err %ld\n", PTR_ERR(tp_link))) {
+		tp_link = NULL;
+		goto cleanup;
+	}
+
+	dst.pid = getpid();
+	err = bpf_map_update_elem(results_map_fd, &zero, &dst, 0);
+	if (CHECK(err, "update_elem",
+		  "failed to set pid filter: %d\n", err))
+		goto cleanup;
+
+	err = trigger_fstat_events(dst.pid);
+	if (CHECK_FAIL(err < 0))
+		goto cleanup;
+
+	err = bpf_map_lookup_elem(results_map_fd, &zero, &dst);
+	if (CHECK(err, "get_results",
+		  "failed to get results: %d\n", err))
+		goto cleanup;
+
+	for (int i = 0; i < MAX_FDS; i++) {
+		if (i < 3)
+			CHECK((dst.paths[i][0] != '\0'), "get_file_path",
+			      "failed to filter fs [%d]: %u(%s) vs %u(%s)\n",
+			      i, src.fds[i], src.paths[i], dst.fds[i],
+			      dst.paths[i]);
+		else
+			CHECK(strncmp(src.paths[i], dst.paths[i], MAX_PATH_LEN),
+			      "get_file_path",
+			      "failed to get path[%d]: %u(%s) vs %u(%s)\n",
+			      i, src.fds[i], src.paths[i], dst.fds[i],
+			      dst.paths[i]);
+	}
+
+cleanup:
+	bpf_link__destroy(tp_link);
+	bpf_object__close(obj);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_get_file_path.c b/tools/testing/selftests/bpf/progs/test_get_file_path.c
new file mode 100644
index 000000000000..eae663c1262a
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_get_file_path.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <linux/ptrace.h>
+#include <string.h>
+#include <unistd.h>
+#include "bpf_helpers.h"
+#include "bpf_tracing.h"
+
+#define MAX_PATH_LEN		128
+#define MAX_EVENT_NUM		16
+
+static struct file_path_test_data {
+	pid_t pid;
+	__u32 cnt;
+	__u32 fds[MAX_EVENT_NUM];
+	char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
+} data;
+
+struct sys_enter_newfstat_args {
+	unsigned long long pad1;
+	unsigned long long pad2;
+	unsigned int fd;
+};
+
+SEC("tracepoint/syscalls/sys_enter_newfstat")
+int bpf_prog(struct sys_enter_newfstat_args *args)
+{
+	pid_t pid = bpf_get_current_pid_tgid() >> 32;
+
+	if (pid != data.pid)
+		return 0;
+	if (data.cnt >= MAX_EVENT_NUM)
+		return 0;
+
+	data.fds[data.cnt] = args->fd;
+	bpf_get_file_path(data.paths[data.cnt], MAX_PATH_LEN, args->fd);
+	data.cnt++;
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v11 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-12-05  4:20     ` [PATCH bpf-next v11 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
@ 2019-12-05  7:19       ` Alexei Starovoitov
  2019-12-05  9:47         ` Wenbo Zhang
  2019-12-15  4:01       ` [PATCH bpf-next v12 0/2] bpf: adding get_file_path helper Wenbo Zhang
  1 sibling, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2019-12-05  7:19 UTC (permalink / raw)
  To: Wenbo Zhang; +Cc: bpf, ast, daniel, yhs, andrii.nakryiko, netdev

On Wed, Dec 04, 2019 at 11:20:35PM -0500, Wenbo Zhang wrote:
>  
> +BPF_CALL_3(bpf_get_file_path, char *, dst, u32, size, int, fd)
> +{
> +	struct file *f;
> +	char *p;
> +	int ret = -EBADF;
> +
> +	/* Ensure we're in user context which is safe for the helper to
> +	 * run. This helper has no business in a kthread.
> +	 */
> +	if (unlikely(in_interrupt() ||
> +		     current->flags & (PF_KTHREAD | PF_EXITING))) {
> +		ret = -EPERM;
> +		goto error;
> +	}
> +
> +	/* Use fget_raw instead of fget to support O_PATH, and it doesn't
> +	 * have any sleepable code, so it's ok to be here.
> +	 */
> +	f = fget_raw(fd);
> +	if (!f)
> +		goto error;
> +
> +	/* For unmountable pseudo filesystem, it seems to have no meaning
> +	 * to get their fake paths as they don't have path, and to be no
> +	 * way to validate this function pointer can be always safe to call
> +	 * in the current context.
> +	 */
> +	if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
> +		ret = -EINVAL;
> +		fput(f);
> +		goto error;
> +	}
> +
> +	/* After filter unmountable pseudo filesytem, d_path won't call
> +	 * dentry->d_op->d_name(), the normally path doesn't have any
> +	 * sleepable code, and despite it uses the current macro to get
> +	 * fs_struct (current->fs), we've already ensured we're in user
> +	 * context, so it's ok to be here.
> +	 */
> +	p = d_path(&f->f_path, dst, size);

Above 'if's are not enough to make sure that it won't deadlock.
Allowing it in tracing_func_proto() means that it's available to kprobe too.
Hence deadlock is possible. Please see previous email thread.
This helper is safe in tracepoint+bpf only.
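
A toy model of that restriction, with all names invented for
illustration (this is not the kernel's func_proto code): a helper placed
in the lookup shared by every tracing program type is visible to kprobe
programs too, so the restricted helper has to be handed out by a
per-type lookup instead.

```c
#include <assert.h>
#include <stddef.h>

enum toy_prog_type { TOY_PROG_KPROBE, TOY_PROG_TRACEPOINT };
enum toy_func_id { TOY_FUNC_send_signal, TOY_FUNC_get_file_path };

struct toy_func_proto {
	const char *name;
};

static const struct toy_func_proto send_signal_proto = { "send_signal" };
static const struct toy_func_proto get_file_path_proto = { "get_file_path" };

/* Lookup shared by all tracing program types. Anything returned here is
 * callable from kprobe programs as well.
 */
static const struct toy_func_proto *toy_tracing_func_proto(enum toy_func_id id)
{
	switch (id) {
	case TOY_FUNC_send_signal:
		return &send_signal_proto;
	default:
		return NULL;
	}
}

/* Per-type lookup: the restricted helper is offered to tracepoint
 * programs only; everything else falls through to the shared lookup.
 */
const struct toy_func_proto *toy_func_proto(enum toy_prog_type type,
					    enum toy_func_id id)
{
	if (id == TOY_FUNC_get_file_path)
		return type == TOY_PROG_TRACEPOINT ? &get_file_path_proto : NULL;
	return toy_tracing_func_proto(id);
}
```

A NULL result is what makes a program that calls the helper fail to
load for that program type.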


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v11 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-12-05  7:19       ` Alexei Starovoitov
@ 2019-12-05  9:47         ` Wenbo Zhang
  0 siblings, 0 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-05  9:47 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, ast, Daniel Borkmann, Yonghong Song, Andrii Nakryiko, Networking

> On Sat, Nov 23, 2019 at 05:35:14AM +0000, Al Viro wrote:
> > On Fri, Nov 22, 2019 at 09:19:21PM -0800, Alexei Starovoitov wrote:
> >
> > > hard to tell. It will be run out of bpf prog that attaches to kprobe or
> > > tracepoint. What is the concern about locking?
> > > d_path() doesn't take any locks and doesn't depend on any locks. Above 'if'
> > > checks that plain d_path() is used and not some specialized callback with
> > > unknown logic.
> >
> > It sure as hell does.  It might end up taking rename_lock and/or mount_lock
> > spinlock components.  It'll try not to, but if the first pass ends up with
> > seqlock mismatch, it will just grab the spinlock the second time around.

> ohh. got it. I missed _or_lock() part in there.
> The need_seqretry() logic is tricky. afaics there is no way for the checks
> outside of prepend_path() to prevent spin_lock to happen. And adding a flag to
> prepend_path() to return early if retry is needed is too ugly. So this helper
> won't be safe to be run out of kprobe. But if we allow it for tracepoints only
> it should be ok. I think. There are no tracepoints in inner guts of vfs and I
> don't think they will ever be. So running in tracepoint->bpf_prog->d_path we
> will be sure that rename_lock+mount_lock can be safely spinlocked. Am I missing
> something?

Hi Alexei,

Would you please give me an example of a deadlock condition under kprobe+bpf?
I'm not familiar with this detail and want to learn more.

> Above 'if's are not enough to make sure that it won't dead lock.
> Allowing it in tracing_func_proto() means that it's available to kprobe too.
> Hence deadlock is possible. Please see previous email thread.
> This helper is safe in tracepoint+bpf only.

So I should move it to `tp_prog_func_proto` and `raw_tp_prog_func_proto`,
right? Is raw tracepoint+bpf safe?

Thank you.

On Thu, Dec 5, 2019 at 3:19 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Dec 04, 2019 at 11:20:35PM -0500, Wenbo Zhang wrote:
> >
> > +BPF_CALL_3(bpf_get_file_path, char *, dst, u32, size, int, fd)
> > +{
> > +     struct file *f;
> > +     char *p;
> > +     int ret = -EBADF;
> > +
> > +     /* Ensure we're in user context which is safe for the helper to
> > +      * run. This helper has no business in a kthread.
> > +      */
> > +     if (unlikely(in_interrupt() ||
> > +                  current->flags & (PF_KTHREAD | PF_EXITING))) {
> > +             ret = -EPERM;
> > +             goto error;
> > +     }
> > +
> > +     /* Use fget_raw instead of fget to support O_PATH, and it doesn't
> > +      * have any sleepable code, so it's ok to be here.
> > +      */
> > +     f = fget_raw(fd);
> > +     if (!f)
> > +             goto error;
> > +
> > +     /* For unmountable pseudo filesystem, it seems to have no meaning
> > +      * to get their fake paths as they don't have path, and to be no
> > +      * way to validate this function pointer can be always safe to call
> > +      * in the current context.
> > +      */
> > +     if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
> > +             ret = -EINVAL;
> > +             fput(f);
> > +             goto error;
> > +     }
> > +
> > +     /* After filter unmountable pseudo filesytem, d_path won't call
> > +      * dentry->d_op->d_name(), the normally path doesn't have any
> > +      * sleepable code, and despite it uses the current macro to get
> > +      * fs_struct (current->fs), we've already ensured we're in user
> > +      * context, so it's ok to be here.
> > +      */
> > +     p = d_path(&f->f_path, dst, size);
>
> Above 'if's are not enough to make sure that it won't dead lock.
> Allowing it in tracing_func_proto() means that it's available to kprobe too.
> Hence deadlock is possible. Please see previous email thread.
> This helper is safe in tracepoint+bpf only.
>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v10 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-11-23  6:04           ` Alexei Starovoitov
@ 2019-12-13 19:51             ` Brendan Gregg
  0 siblings, 0 replies; 52+ messages in thread
From: Brendan Gregg @ 2019-12-13 19:51 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Al Viro, Wenbo Zhang, bpf, ast, Daniel Borkmann, Yonghong Song,
	andrii.nakryiko, netdev, linux-fsdevel

On Fri, Nov 22, 2019 at 10:05 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Sat, Nov 23, 2019 at 05:35:14AM +0000, Al Viro wrote:
> > On Fri, Nov 22, 2019 at 09:19:21PM -0800, Alexei Starovoitov wrote:
> >
> > > hard to tell. It will be run out of bpf prog that attaches to kprobe or
> > > tracepoint. What is the concern about locking?
> > > d_path() doesn't take any locks and doesn't depend on any locks. Above 'if'
> > > checks that plain d_path() is used and not some specialized callback with
> > > unknown logic.
> >
> > It sure as hell does.  It might end up taking rename_lock and/or mount_lock
> > spinlock components.  It'll try not to, but if the first pass ends up with
> > seqlock mismatch, it will just grab the spinlock the second time around.
>
> ohh. got it. I missed _or_lock() part in there.
> The need_seqretry() logic is tricky. afaics there is no way for the checks
> outside of prepend_path() to prevent spin_lock to happen. And adding a flag to
> prepend_path() to return early if retry is needed is too ugly. So this helper
> won't be safe to be run out of kprobe. But if we allow it for tracepoints only
> it should be ok. I think. There are no tracepoints in inner guts of vfs and I
> don't think they will ever be. So running in tracepoint->bpf_prog->d_path we
> will be sure that rename_lock+mount_lock can be safely spinlocked. Am I missing
> something?

It seems rather restrictive to only allow tracepoints (especially
without VFS tracepoints), although I'll use it to improve my syscall
tracepoint tools, so I'd be happy to see this merged even with that
restriction.

Just a thought: if *buffer is in BPF memory, can prepend_path() check
its memory location and not try to grab the lock based on that? This
would be to avoid adding a flag.

>
> > > > with this number; quite possibly never before that function had been called
> > > > _and_ not once after it has returned.
> > >
> > > Right. TOCTOU is not a concern here. It's tracing. It's ok for full path to be
> > > 'one time deal'.
> >
> > It might very well be a full path of something completely unrelated to what
> > the syscall ends up operating upon.  It's not that the file might've been
> > moved; it might be a different file.  IOW, results of that tracing might be
> > misleading.
>
> That is correct. Tracing is fine with such limitation. Still better than probe_read.
>

+1

Tracing tools are for observability, and we document these caveats. This
won't be the first time I've published tools where the printed path
may not be the one you think (e.g., the case of hard links).

Brendan

-- 
Brendan Gregg, Senior Performance Architect, Netflix

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH bpf-next v12 0/2] bpf: adding get_file_path helper
  2019-12-05  4:20     ` [PATCH bpf-next v11 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
  2019-12-05  7:19       ` Alexei Starovoitov
@ 2019-12-15  4:01       ` Wenbo Zhang
  2019-12-15  4:01         ` [PATCH bpf-next v12 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
  2019-12-15  4:01         ` [PATCH bpf-next v12 2/2] selftests/bpf: test for bpf_get_file_path() " Wenbo Zhang
  1 sibling, 2 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-15  4:01 UTC (permalink / raw)
  To: bpf; +Cc: ast, daniel, yhs, andrii.nakryiko, netdev

This patch series introduces a bpf helper that can be used to map a file
descriptor to a pathname.

This requirement is mainly discussed here:

  https://github.com/iovisor/bcc/issues/237

This implementation supports both local and mountable pseudo file systems,
and ensures we're in user context, which is safe for this helper to run.

Changes since v11:

* only allow tracepoints to make sure it won't deadlock


Changes since v10:

* fix missing fput


Changes since v9:

* Associate help patch with its selftests patch to this series

* Refactor selftests code for further simplification  


Changes since v8:

* format helper description 
 

Changes since v7:

* Use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/

* Ensure we're in user context, which is safe for the helper to run

* Filter unmountable pseudo filesystem, because they don't have real path

* Supplement the description of this helper function


Changes since v6:

* Fix missing signed-off-by line


Changes since v5:

* Refactor helper to avoid unnecessary goto end by having two explicit returns


Changes since v4:

* Rename bpf_fd2path to bpf_get_file_path to be consistent with other
helper's names

* When fdget_raw fails, set ret to -EBADF instead of -EINVAL

* Remove fdput from fdget_raw's error path

* Use IS_ERR instead of IS_ERR_OR_NULL as d_path either returns a pointer
into the buffer or an error code if the path was too long

* Modify the normal path's return value to return copied string length
including NUL

* Update helper description's Return bits.

* Refactor selftests code for further simplification  


Changes since v3:

* Remove unnecessary LOCKDOWN_BPF_READ

* Refactor error handling section for enhanced readability

* Provide a test case in tools/testing/selftests/bpf

* Refactor selftests code to use real global variables instead of maps


Changes since v2:

* Fix backward compatibility

* Add helper description

* Refactor selftests to use global data instead of perf_buffer to simplify
the code

* Fix signed-off name


Wenbo Zhang (2):
  bpf: add new helper get_file_path for mapping a file descriptor to a
    pathname
  selftests/bpf: test for bpf_get_file_path() from tracepoint

 include/uapi/linux/bpf.h                      |  29 ++-
 kernel/trace/bpf_trace.c                      |  70 +++++++
 tools/include/uapi/linux/bpf.h                |  29 ++-
 .../selftests/bpf/prog_tests/get_file_path.c  | 171 ++++++++++++++++++
 .../selftests/bpf/progs/test_get_file_path.c  |  43 +++++
 5 files changed, 340 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_file_path.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_get_file_path.c

-- 
2.17.1


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH bpf-next v12 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-12-15  4:01       ` [PATCH bpf-next v12 0/2] bpf: adding get_file_path helper Wenbo Zhang
@ 2019-12-15  4:01         ` Wenbo Zhang
  2019-12-15 16:05           ` Yonghong Song
                             ` (3 more replies)
  2019-12-15  4:01         ` [PATCH bpf-next v12 2/2] selftests/bpf: test for bpf_get_file_path() " Wenbo Zhang
  1 sibling, 4 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-15  4:01 UTC (permalink / raw)
  To: bpf; +Cc: ast, daniel, yhs, andrii.nakryiko, netdev

When people want to identify which file system files are being opened,
read, and written to, they can use this helper with file descriptor as
input to achieve this goal. Mountable pseudo filesystems are also supported.

This requirement is mainly discussed here:

  https://github.com/iovisor/bcc/issues/237

v11->v12: addressed Alexei's feedback
- only allow tracepoints to make sure it won't deadlock

v10->v11: addressed Al and Alexei's feedback
- fix missing fput()

v9->v10: addressed Andrii's feedback
- send this patch together with the patch selftests as one patch series

v8->v9:
- format helper description

v7->v8: addressed Alexei's feedback
- use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
- ensure we're in user context, which is safe for the helper to run
- filter unmountable pseudo filesystem, because they don't have real path
- supplement the description of this helper function

v6->v7:
- fix missing signed-off-by line

v5->v6: addressed Andrii's feedback
- avoid unnecessary goto end by having two explicit returns

v4->v5: addressed Andrii and Daniel's feedback
- rename bpf_fd2path to bpf_get_file_path to be consistent with other
helper's names
- when fdget_raw fails, set ret to -EBADF instead of -EINVAL
- remove fdput from fdget_raw's error path
- use IS_ERR instead of IS_ERR_OR_NULL as d_path either returns a pointer
into the buffer or an error code if the path was too long
- modify the normal path's return value to return copied string length
including NUL
- update this helper description's Return bits.

v3->v4: addressed Daniel's feedback
- fix missing fdput()
- move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
- move fd2path's test code to another patch
- add comment to explain why use fdget_raw instead of fdget

v2->v3: addressed Yonghong's feedback
- remove unnecessary LOCKDOWN_BPF_READ
- refactor error handling section for enhanced readability
- provide a test case in tools/testing/selftests/bpf

v1->v2: addressed Daniel's feedback
- fix backward compatibility
- add this helper description
- fix signed-off name

Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
---
 include/uapi/linux/bpf.h       | 29 +++++++++++++-
 kernel/trace/bpf_trace.c       | 70 ++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h | 29 +++++++++++++-
 3 files changed, 126 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index dbbcf0b02970..71d9705df120 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2821,6 +2821,32 @@ union bpf_attr {
  * 	Return
  * 		On success, the strictly positive length of the string,	including
  * 		the trailing NUL character. On error, a negative value.
+ *
+ * int bpf_get_file_path(char *path, u32 size, int fd)
+ *	Description
+ *		Get **file** attribute from the current task by *fd*, then call
+ *		**d_path** to get its absolute path and copy it as string into
+ *		*path* of *size*. Notice the **path** doesn't support unmountable
+ *		pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
+ *		The *size* must be strictly positive. On success, the helper
+ *		makes sure that the *path* is NUL-terminated, and the buffer
+ *		could be:
+ *		- a regular full path (include mountable fs eg: /proc, /sys)
+ *		- a regular full path with "(deleted)" at the end.
+ *		On failure, it is filled with zeroes.
+ *	Return
+ *		On success, returns the length of the copied string INCLUDING
+ *		the trailing NUL.
+ *
+ *		On failure, the returned value is one of the following:
+ *
+ *		**-EPERM** if no permission to get the path (eg: in irq ctx).
+ *
+ *		**-EBADF** if *fd* is invalid.
+ *
+ *		**-EINVAL** if *fd* corresponds to an unmountable pseudo fs
+ *
+ *		**-ENAMETOOLONG** if full path is longer than *size*
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2938,7 +2964,8 @@ union bpf_attr {
 	FN(probe_read_user),		\
 	FN(probe_read_kernel),		\
 	FN(probe_read_user_str),	\
-	FN(probe_read_kernel_str),
+	FN(probe_read_kernel_str),	\
+	FN(get_file_path),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index e5ef4ae9edb5..db9c0ec46a5d 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -762,6 +762,72 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
 	.arg1_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_3(bpf_get_file_path, char *, dst, u32, size, int, fd)
+{
+	struct file *f;
+	char *p;
+	int ret = -EBADF;
+
+	/* Ensure we're in user context which is safe for the helper to
+	 * run. This helper has no business in a kthread.
+	 */
+	if (unlikely(in_interrupt() ||
+		     current->flags & (PF_KTHREAD | PF_EXITING))) {
+		ret = -EPERM;
+		goto error;
+	}
+
+	/* Use fget_raw instead of fget to support O_PATH, and it doesn't
+	 * have any sleepable code, so it's ok to be here.
+	 */
+	f = fget_raw(fd);
+	if (!f)
+		goto error;
+
+	/* For unmountable pseudo filesystem, it seems to have no meaning
+	 * to get their fake paths as they don't have path, and to be no
+	 * way to validate this function pointer can be always safe to call
+	 * in the current context.
+	 */
+	if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
+		ret = -EINVAL;
+		fput(f);
+		goto error;
+	}
+
+	/* After filtering unmountable pseudo filesystems, d_path won't call
+	 * dentry->d_op->d_dname(), the normal path doesn't have any
+	 * sleepable code, and although it uses the current macro to get
+	 * fs_struct (current->fs), we've already ensured we're in user
+	 * context, so it's ok to be here.
+	 */
+	p = d_path(&f->f_path, dst, size);
+	if (IS_ERR(p)) {
+		ret = PTR_ERR(p);
+		fput(f);
+		goto error;
+	}
+
+	ret = strlen(p);
+	memmove(dst, p, ret);
+	dst[ret++] = '\0';
+	fput(f);
+	return ret;
+
+error:
+	memset(dst, '0', size);
+	return ret;
+}
+
+static const struct bpf_func_proto bpf_get_file_path_proto = {
+	.func       = bpf_get_file_path,
+	.gpl_only   = true,
+	.ret_type   = RET_INTEGER,
+	.arg1_type  = ARG_PTR_TO_UNINIT_MEM,
+	.arg2_type  = ARG_CONST_SIZE,
+	.arg3_type  = ARG_ANYTHING,
+};
+
 static const struct bpf_func_proto *
 tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -953,6 +1019,8 @@ tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_stackid_proto_tp;
 	case BPF_FUNC_get_stack:
 		return &bpf_get_stack_proto_tp;
+	case BPF_FUNC_get_file_path:
+		return &bpf_get_file_path_proto;
 	default:
 		return tracing_func_proto(func_id, prog);
 	}
@@ -1146,6 +1214,8 @@ raw_tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_stackid_proto_raw_tp;
 	case BPF_FUNC_get_stack:
 		return &bpf_get_stack_proto_raw_tp;
+	case BPF_FUNC_get_file_path:
+		return &bpf_get_file_path_proto;
 	default:
 		return tracing_func_proto(func_id, prog);
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index dbbcf0b02970..71d9705df120 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2821,6 +2821,32 @@ union bpf_attr {
  * 	Return
  * 		On success, the strictly positive length of the string,	including
  * 		the trailing NUL character. On error, a negative value.
+ *
+ * int bpf_get_file_path(char *path, u32 size, int fd)
+ *	Description
+ *		Get **file** attribute from the current task by *fd*, then call
+ *		**d_path** to get its absolute path and copy it as string into
+ *		*path* of *size*. Notice the **path** doesn't support unmountable
+ *		pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
+ *		The *size* must be strictly positive. On success, the helper
+ *		makes sure that the *path* is NUL-terminated, and the buffer
+ *		could be:
+ *		- a regular full path (include mountable fs eg: /proc, /sys)
+ *		- a regular full path with "(deleted)" at the end.
+ *		On failure, it is filled with zeroes.
+ *	Return
+ *		On success, returns the length of the copied string INCLUDING
+ *		the trailing NUL.
+ *
+ *		On failure, the returned value is one of the following:
+ *
+ *		**-EPERM** if no permission to get the path (eg: in irq ctx).
+ *
+ *		**-EBADF** if *fd* is invalid.
+ *
+ *		**-EINVAL** if *fd* corresponds to an unmountable pseudo fs
+ *
+ *		**-ENAMETOOLONG** if full path is longer than *size*
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2938,7 +2964,8 @@ union bpf_attr {
 	FN(probe_read_user),		\
 	FN(probe_read_kernel),		\
 	FN(probe_read_user_str),	\
-	FN(probe_read_kernel_str),
+	FN(probe_read_kernel_str),	\
+	FN(get_file_path),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH bpf-next v12 2/2] selftests/bpf: test for bpf_get_file_path() from tracepoint
  2019-12-15  4:01       ` [PATCH bpf-next v12 0/2] bpf: adding get_file_path helper Wenbo Zhang
  2019-12-15  4:01         ` [PATCH bpf-next v12 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
@ 2019-12-15  4:01         ` Wenbo Zhang
  2019-12-15 16:24           ` Yonghong Song
  1 sibling, 1 reply; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-15  4:01 UTC (permalink / raw)
  To: bpf; +Cc: ast, daniel, yhs, andrii.nakryiko, netdev

Trace fstat events via the tracepoint syscalls/sys_enter_newfstat, and handle
only the events produced by test_get_file_path, which calls fstat on several
different types of files to exercise bpf_get_file_path.

v4->v5: addressed Andrii's feedback
- pass NULL for opts as bpf_object__open_file's PARAM2, as not really
using any
- modify patch subject to keep up with test code
- as this test is single-threaded, use getpid instead of SYS_gettid
- remove unnecessary parens around the check after if (i < 3)
- in kern use bpf_get_current_pid_tgid() >> 32 to fit getpid() in
userspace part
- with the patch adding helper as one patch series

v3->v4: addressed Andrii's feedback
- use a set of fd instead of fds array
- use global variables instead of maps (in v3, I mistakenly thought that
the bpf maps are global variables.)
- remove unnecessary global variable path_info_index
- remove fd compare as the fstat's order is fixed

v2->v3: addressed Andrii's feedback
- use global data instead of perf_buffer to simplify the code

v1->v2: addressed Daniel's feedback
- rename bpf_fd2path to bpf_get_file_path to be consistent with other
helper's names

Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
---
 .../selftests/bpf/prog_tests/get_file_path.c  | 171 ++++++++++++++++++
 .../selftests/bpf/progs/test_get_file_path.c  |  43 +++++
 2 files changed, 214 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_file_path.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_get_file_path.c

diff --git a/tools/testing/selftests/bpf/prog_tests/get_file_path.c b/tools/testing/selftests/bpf/prog_tests/get_file_path.c
new file mode 100644
index 000000000000..7ec11e43e0fc
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/get_file_path.c
@@ -0,0 +1,171 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include <test_progs.h>
+#include <sys/stat.h>
+#include <linux/sched.h>
+#include <sys/syscall.h>
+
+#define MAX_PATH_LEN		128
+#define MAX_FDS			7
+#define MAX_EVENT_NUM		16
+
+static struct file_path_test_data {
+	pid_t pid;
+	__u32 cnt;
+	__u32 fds[MAX_EVENT_NUM];
+	char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
+} src, dst;
+
+static inline int set_pathname(int fd)
+{
+	char buf[MAX_PATH_LEN];
+
+	snprintf(buf, MAX_PATH_LEN, "/proc/%d/fd/%d", src.pid, fd);
+	src.fds[src.cnt] = fd;
+	return readlink(buf, src.paths[src.cnt++], MAX_PATH_LEN);
+}
+
+static int trigger_fstat_events(pid_t pid)
+{
+	int pipefd[2] = { -1, -1 };
+	int sockfd = -1, procfd = -1, devfd = -1;
+	int localfd = -1, indicatorfd = -1;
+	struct stat fileStat;
+	int ret = -1;
+
+	/* unmountable pseudo-filesystems */
+	if (CHECK_FAIL(pipe(pipefd) < 0))
+		return ret;
+	/* unmountable pseudo-filesystems */
+	sockfd = socket(AF_INET, SOCK_STREAM, 0);
+	if (CHECK_FAIL(sockfd < 0))
+		goto out_close;
+	/* mountable pseudo-filesystems */
+	procfd = open("/proc/self/comm", O_RDONLY);
+	if (CHECK_FAIL(procfd < 0))
+		goto out_close;
+	devfd = open("/dev/urandom", O_RDONLY);
+	if (CHECK_FAIL(devfd < 0))
+		goto out_close;
+	localfd = open("/tmp/fd2path_loadgen.txt", O_CREAT | O_RDONLY, 0644);
+	if (CHECK_FAIL(localfd < 0))
+		goto out_close;
+	/* bpf_get_file_path will return path with (deleted) */
+	remove("/tmp/fd2path_loadgen.txt");
+	indicatorfd = open("/tmp/", O_PATH);
+	if (CHECK_FAIL(indicatorfd < 0))
+		goto out_close;
+
+	src.pid = pid;
+
+	ret = set_pathname(pipefd[0]);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(pipefd[1]);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(sockfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(procfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(devfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(localfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(indicatorfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+
+	fstat(pipefd[0], &fileStat);
+	fstat(pipefd[1], &fileStat);
+	fstat(sockfd, &fileStat);
+	fstat(procfd, &fileStat);
+	fstat(devfd, &fileStat);
+	fstat(localfd, &fileStat);
+	fstat(indicatorfd, &fileStat);
+
+out_close:
+	close(indicatorfd);
+	close(localfd);
+	close(devfd);
+	close(procfd);
+	close(sockfd);
+	close(pipefd[1]);
+	close(pipefd[0]);
+
+	return ret;
+}
+
+void test_get_file_path(void)
+{
+	const char *prog_name = "tracepoint/syscalls/sys_enter_newfstat";
+	const char *obj_file = "test_get_file_path.o";
+	int err, results_map_fd, duration = 0;
+	struct bpf_program *tp_prog = NULL;
+	struct bpf_link *tp_link = NULL;
+	struct bpf_object *obj = NULL;
+	const int zero = 0;
+
+	obj = bpf_object__open_file(obj_file, NULL);
+	if (CHECK(IS_ERR(obj), "obj_open_file", "err %ld\n", PTR_ERR(obj)))
+		return;
+
+	tp_prog = bpf_object__find_program_by_title(obj, prog_name);
+	if (CHECK(!tp_prog, "find_tp",
+		  "prog '%s' not found\n", prog_name))
+		goto cleanup;
+
+	err = bpf_object__load(obj);
+	if (CHECK(err, "obj_load", "err %d\n", err))
+		goto cleanup;
+
+	results_map_fd = bpf_find_map(__func__, obj, "test_get.bss");
+	if (CHECK(results_map_fd < 0, "find_bss_map",
+		  "err %d\n", results_map_fd))
+		goto cleanup;
+
+	tp_link = bpf_program__attach_tracepoint(tp_prog, "syscalls",
+						 "sys_enter_newfstat");
+	if (CHECK(IS_ERR(tp_link), "attach_tp",
+		  "err %ld\n", PTR_ERR(tp_link))) {
+		tp_link = NULL;
+		goto cleanup;
+	}
+
+	dst.pid = getpid();
+	err = bpf_map_update_elem(results_map_fd, &zero, &dst, 0);
+	if (CHECK(err, "update_elem",
+		  "failed to set pid filter: %d\n", err))
+		goto cleanup;
+
+	err = trigger_fstat_events(dst.pid);
+	if (CHECK_FAIL(err < 0))
+		goto cleanup;
+
+	err = bpf_map_lookup_elem(results_map_fd, &zero, &dst);
+	if (CHECK(err, "get_results",
+		  "failed to get results: %d\n", err))
+		goto cleanup;
+
+	for (int i = 0; i < MAX_FDS; i++) {
+		if (i < 3)
+			CHECK((dst.paths[i][0] != '0'), "get_file_path",
+			      "failed to filter fs [%d]: %u(%s) vs %u(%s)\n",
+			      i, src.fds[i], src.paths[i], dst.fds[i],
+			      dst.paths[i]);
+		else
+			CHECK(strncmp(src.paths[i], dst.paths[i], MAX_PATH_LEN),
+			      "get_file_path",
+			      "failed to get path[%d]: %u(%s) vs %u(%s)\n",
+			      i, src.fds[i], src.paths[i], dst.fds[i],
+			      dst.paths[i]);
+	}
+
+cleanup:
+	bpf_link__destroy(tp_link);
+	bpf_object__close(obj);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_get_file_path.c b/tools/testing/selftests/bpf/progs/test_get_file_path.c
new file mode 100644
index 000000000000..eae663c1262a
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_get_file_path.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <linux/ptrace.h>
+#include <string.h>
+#include <unistd.h>
+#include "bpf_helpers.h"
+#include "bpf_tracing.h"
+
+#define MAX_PATH_LEN		128
+#define MAX_EVENT_NUM		16
+
+static struct file_path_test_data {
+	pid_t pid;
+	__u32 cnt;
+	__u32 fds[MAX_EVENT_NUM];
+	char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
+} data;
+
+struct sys_enter_newfstat_args {
+	unsigned long long pad1;
+	unsigned long long pad2;
+	unsigned int fd;
+};
+
+SEC("tracepoint/syscalls/sys_enter_newfstat")
+int bpf_prog(struct sys_enter_newfstat_args *args)
+{
+	pid_t pid = bpf_get_current_pid_tgid() >> 32;
+
+	if (pid != data.pid)
+		return 0;
+	if (data.cnt >= MAX_EVENT_NUM)
+		return 0;
+
+	data.fds[data.cnt] = args->fd;
+	bpf_get_file_path(data.paths[data.cnt], MAX_PATH_LEN, args->fd);
+	data.cnt++;
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v12 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-12-15  4:01         ` [PATCH bpf-next v12 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
@ 2019-12-15 16:05           ` Yonghong Song
  2019-12-17  6:26             ` Wenbo Zhang
  2019-12-15 16:10           ` Yonghong Song
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 52+ messages in thread
From: Yonghong Song @ 2019-12-15 16:05 UTC (permalink / raw)
  To: Wenbo Zhang, bpf; +Cc: ast, daniel, andrii.nakryiko, netdev



On 12/14/19 8:01 PM, Wenbo Zhang wrote:
> When people want to identify which file system files are being opened,
> read, and written to, they can use this helper with file descriptor as
> input to achieve this goal. Other pseudo filesystems are also supported.
> 
> This requirement is mainly discussed here:
> 
>    https://github.com/iovisor/bcc/issues/237
> 
> v11->v12: addressed Alexei's feedback
> - only allow tracepoints to make sure it won't dead lock
> 
> v10->v11: addressed Al and Alexei's feedback
> - fix missing fput()
> 
> v9->v10: addressed Andrii's feedback
> - send this patch together with the patch selftests as one patch series
> 
> v8->v9:
> - format helper description
> 
> v7->v8: addressed Alexei's feedback
> - use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
> - ensure we're in user context which is safe fot the help to run
> - filter unmountable pseudo filesystem, because they don't have real path
> - supplement the description of this helper function
> 
> v6->v7:
> - fix missing signed-off-by line
> 
> v5->v6: addressed Andrii's feedback
> - avoid unnecessary goto end by having two explicit returns
> 
> v4->v5: addressed Andrii and Daniel's feedback
> - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> helper's names
> - when fdget_raw fails, set ret to -EBADF instead of -EINVAL
> - remove fdput from fdget_raw's error path
> - use IS_ERR instead of IS_ERR_OR_NULL as d_path ether returns a pointer
> into the buffer or an error code if the path was too long
> - modify the normal path's return value to return copied string length
> including NUL
> - update this helper description's Return bits.
> 
> v3->v4: addressed Daniel's feedback
> - fix missing fdput()
> - move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
> - move fd2path's test code to another patch
> - add comment to explain why use fdget_raw instead of fdget
> 
> v2->v3: addressed Yonghong's feedback
> - remove unnecessary LOCKDOWN_BPF_READ
> - refactor error handling section for enhanced readability
> - provide a test case in tools/testing/selftests/bpf
> 
> v1->v2: addressed Daniel's feedback
> - fix backward compatibility
> - add this helper description
> - fix signed-off name
> 
> Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
> ---
>   include/uapi/linux/bpf.h       | 29 +++++++++++++-
>   kernel/trace/bpf_trace.c       | 70 ++++++++++++++++++++++++++++++++++
>   tools/include/uapi/linux/bpf.h | 29 +++++++++++++-
>   3 files changed, 126 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index dbbcf0b02970..71d9705df120 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2821,6 +2821,32 @@ union bpf_attr {
>    * 	Return
>    * 		On success, the strictly positive length of the string,	including
>    * 		the trailing NUL character. On error, a negative value.
> + *
> + * int bpf_get_file_path(char *path, u32 size, int fd)
> + *	Description
> + *		Get **file** atrribute from the current task by *fd*, then call
> + *		**d_path** to get it's absolute path and copy it as string into
> + *		*path* of *size*. Notice the **path** don't support unmountable
> + *		pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
> + *		The *size* must be strictly positive. On success, the helper
> + *		makes sure that the *path* is NUL-terminated, and the buffer
> + *		could be:
> + *		- a regular full path (include mountable fs eg: /proc, /sys)
> + *		- a regular full path with "(deleted)" at the end.

Let us say that " (deleted)" is appended, to be consistent with the comments
in d_path() and to make it clearer to the user what the format will look like.
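
To illustrate why the leading space matters to users: a consumer of the
helper's output would most likely match on the exact suffix. A rough
userspace sketch, where has_deleted_suffix() is a hypothetical name and not
something this patch provides:

```c
#include <stdbool.h>
#include <string.h>

/* The exact string d_path() appends for an unlinked file. */
#define DELETED_SUFFIX " (deleted)"

/* Hypothetical consumer-side check for the suffix, leading space included. */
static bool has_deleted_suffix(const char *path)
{
	size_t len = strlen(path);
	size_t slen = strlen(DELETED_SUFFIX);

	return len >= slen &&
	       strcmp(path + len - slen, DELETED_SUFFIX) == 0;
}
```

Note that a live file whose name literally ends in " (deleted)" would also
match, so this can only be treated as a hint.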

> + *		On failure, it is filled with zeroes.
> + *	Return
> + *		On success, returns the length of the copied string INCLUDING
> + *		the trailing NUL.

trailing '\0'.

> + *
> + *		On failure, the returned value is one of the following:
> + *
> + *		**-EPERM** if no permission to get the path (eg: in irq ctx).
> + *
> + *		**-EBADF** if *fd* is invalid.
> + *
> + *		**-EINVAL** if *fd* corresponds to a unmountable pseudo fs
> + *
> + *		**-ENAMETOOLONG** if full path is longer than *size*
>    */
>   #define __BPF_FUNC_MAPPER(FN)		\
>   	FN(unspec),			\
> @@ -2938,7 +2964,8 @@ union bpf_attr {
>   	FN(probe_read_user),		\
>   	FN(probe_read_kernel),		\
>   	FN(probe_read_user_str),	\
> -	FN(probe_read_kernel_str),
> +	FN(probe_read_kernel_str),	\
> +	FN(get_file_path),
>   
>   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>    * function eBPF program intends to call
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index e5ef4ae9edb5..db9c0ec46a5d 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -762,6 +762,72 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
>   	.arg1_type	= ARG_ANYTHING,
>   };
>   
> +BPF_CALL_3(bpf_get_file_path, char *, dst, u32, size, int, fd)
> +{
> +	struct file *f;
> +	char *p;
> +	int ret = -EBADF;
> +
> +	/* Ensure we're in user context which is safe for the helper to
> +	 * run. This helper has no business in a kthread.
> +	 */
> +	if (unlikely(in_interrupt() ||
> +		     current->flags & (PF_KTHREAD | PF_EXITING))) {
> +		ret = -EPERM;
> +		goto error;
> +	}
> +
> +	/* Use fget_raw instead of fget to support O_PATH, and it doesn't
> +	 * have any sleepable code, so it's ok to be here.
> +	 */
> +	f = fget_raw(fd);
> +	if (!f)
> +		goto error;
> +
> +	/* For unmountable pseudo filesystem, it seems to have no meaning
> +	 * to get their fake paths as they don't have path, and to be no
> +	 * way to validate this function pointer can be always safe to call
> +	 * in the current context.
> +	 */
> +	if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
> +		ret = -EINVAL;
> +		fput(f);
> +		goto error;
> +	}
> +
> +	/* After filter unmountable pseudo filesytem, d_path won't call
> +	 * dentry->d_op->d_name(), the normally path doesn't have any
> +	 * sleepable code, and despite it uses the current macro to get
> +	 * fs_struct (current->fs), we've already ensured we're in user
> +	 * context, so it's ok to be here.
> +	 */
> +	p = d_path(&f->f_path, dst, size);
> +	if (IS_ERR(p)) {
> +		ret = PTR_ERR(p);
> +		fput(f);
> +		goto error;
> +	}
> +
> +	ret = strlen(p);
> +	memmove(dst, p, ret);
> +	dst[ret++] = '\0';

nit: you could do memmove(dst, p, ret + 1)?
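
A small userspace sketch of the two variants, for comparison. The function
names are made up for illustration; p is assumed to point at a
NUL-terminated string that may overlap dst, as it does when d_path() builds
the path at the end of the same buffer.

```c
#include <string.h>

/* Variant in the patch: move the characters, then store the terminator. */
static int copy_two_step(char *dst, const char *p)
{
	int ret = strlen(p);

	memmove(dst, p, ret);
	dst[ret++] = '\0';
	return ret;
}

/* Suggested variant: let memmove() carry the terminator along. */
static int copy_one_step(char *dst, const char *p)
{
	int ret = strlen(p);

	memmove(dst, p, ret + 1);
	return ret + 1;
}
```

In this sketch both variants leave the same bytes in dst and return the
same count; the second just avoids the separate store.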

> +	fput(f);
> +	return ret;

The description says the return value length including trailing '\0'.
The above 'ret' does not include trailing '\0'.

> +
> +error:
> +	memset(dst, '0', size);
> +	return ret;
> +}
> +
> +static const struct bpf_func_proto bpf_get_file_path_proto = {
> +	.func       = bpf_get_file_path,
> +	.gpl_only   = true,
> +	.ret_type   = RET_INTEGER,
> +	.arg1_type  = ARG_PTR_TO_UNINIT_MEM,
> +	.arg2_type  = ARG_CONST_SIZE,
> +	.arg3_type  = ARG_ANYTHING,
> +};
> +
>   static const struct bpf_func_proto *
>   tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>   {
> @@ -953,6 +1019,8 @@ tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>   		return &bpf_get_stackid_proto_tp;
>   	case BPF_FUNC_get_stack:
>   		return &bpf_get_stack_proto_tp;
> +	case BPF_FUNC_get_file_path:
> +		return &bpf_get_file_path_proto;
>   	default:
>   		return tracing_func_proto(func_id, prog);
>   	}
> @@ -1146,6 +1214,8 @@ raw_tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>   		return &bpf_get_stackid_proto_raw_tp;
>   	case BPF_FUNC_get_stack:
>   		return &bpf_get_stack_proto_raw_tp;
> +	case BPF_FUNC_get_file_path:
> +		return &bpf_get_file_path_proto;
>   	default:
>   		return tracing_func_proto(func_id, prog);
>   	}
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index dbbcf0b02970..71d9705df120 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -2821,6 +2821,32 @@ union bpf_attr {
>    * 	Return
>    * 		On success, the strictly positive length of the string,	including
>    * 		the trailing NUL character. On error, a negative value.
> + *
> + * int bpf_get_file_path(char *path, u32 size, int fd)
> + *	Description
> + *		Get **file** atrribute from the current task by *fd*, then call
> + *		**d_path** to get it's absolute path and copy it as string into
> + *		*path* of *size*. Notice the **path** don't support unmountable
> + *		pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
> + *		The *size* must be strictly positive. On success, the helper
> + *		makes sure that the *path* is NUL-terminated, and the buffer
> + *		could be:
> + *		- a regular full path (include mountable fs eg: /proc, /sys)
> + *		- a regular full path with "(deleted)" at the end.

ditto

> + *		On failure, it is filled with zeroes.
> + *	Return
> + *		On success, returns the length of the copied string INCLUDING
> + *		the trailing NUL.

ditto

> + *
> + *		On failure, the returned value is one of the following:
> + *
> + *		**-EPERM** if no permission to get the path (eg: in irq ctx).
> + *
> + *		**-EBADF** if *fd* is invalid.
> + *
> + *		**-EINVAL** if *fd* corresponds to a unmountable pseudo fs
> + *
> + *		**-ENAMETOOLONG** if full path is longer than *size*
>    */
>   #define __BPF_FUNC_MAPPER(FN)		\
>   	FN(unspec),			\
> @@ -2938,7 +2964,8 @@ union bpf_attr {
>   	FN(probe_read_user),		\
>   	FN(probe_read_kernel),		\
>   	FN(probe_read_user_str),	\
> -	FN(probe_read_kernel_str),
> +	FN(probe_read_kernel_str),	\
> +	FN(get_file_path),
>   
>   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>    * function eBPF program intends to call
> 


* Re: [PATCH bpf-next v12 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-12-15  4:01         ` [PATCH bpf-next v12 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
  2019-12-15 16:05           ` Yonghong Song
@ 2019-12-15 16:10           ` Yonghong Song
  2019-12-17  6:27             ` Wenbo Zhang
  2019-12-16 22:09           ` Brendan Gregg
  2019-12-17  9:47           ` [PATCH bpf-next v13 0/2] bpf: adding get_fd_path helper Wenbo Zhang
  3 siblings, 1 reply; 52+ messages in thread
From: Yonghong Song @ 2019-12-15 16:10 UTC (permalink / raw)
  To: Wenbo Zhang, bpf; +Cc: ast, daniel, andrii.nakryiko, netdev



On 12/14/19 8:01 PM, Wenbo Zhang wrote:
> When people want to identify which file system files are being opened,
> read, and written to, they can use this helper with file descriptor as
> input to achieve this goal. Other pseudo filesystems are also supported.
> 
> This requirement is mainly discussed here:
> 
>    https://github.com/iovisor/bcc/issues/237
> 
> v11->v12: addressed Alexei's feedback
> - only allow tracepoints to make sure it won't dead lock
> 
> v10->v11: addressed Al and Alexei's feedback
> - fix missing fput()
> 
> v9->v10: addressed Andrii's feedback
> - send this patch together with the patch selftests as one patch series
> 
> v8->v9:
> - format helper description
> 
> v7->v8: addressed Alexei's feedback
> - use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
> - ensure we're in user context which is safe fot the help to run
> - filter unmountable pseudo filesystem, because they don't have real path
> - supplement the description of this helper function
> 
> v6->v7:
> - fix missing signed-off-by line
> 
> v5->v6: addressed Andrii's feedback
> - avoid unnecessary goto end by having two explicit returns
> 
> v4->v5: addressed Andrii and Daniel's feedback
> - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> helper's names
> - when fdget_raw fails, set ret to -EBADF instead of -EINVAL
> - remove fdput from fdget_raw's error path
> - use IS_ERR instead of IS_ERR_OR_NULL as d_path ether returns a pointer
> into the buffer or an error code if the path was too long
> - modify the normal path's return value to return copied string length
> including NUL
> - update this helper description's Return bits.
> 
> v3->v4: addressed Daniel's feedback
> - fix missing fdput()
> - move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
> - move fd2path's test code to another patch
> - add comment to explain why use fdget_raw instead of fdget
> 
> v2->v3: addressed Yonghong's feedback
> - remove unnecessary LOCKDOWN_BPF_READ
> - refactor error handling section for enhanced readability
> - provide a test case in tools/testing/selftests/bpf
> 
> v1->v2: addressed Daniel's feedback
> - fix backward compatibility
> - add this helper description
> - fix signed-off name
> 
> Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
> ---
>   include/uapi/linux/bpf.h       | 29 +++++++++++++-
>   kernel/trace/bpf_trace.c       | 70 ++++++++++++++++++++++++++++++++++
>   tools/include/uapi/linux/bpf.h | 29 +++++++++++++-
>   3 files changed, 126 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index dbbcf0b02970..71d9705df120 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2821,6 +2821,32 @@ union bpf_attr {
>    * 	Return
>    * 		On success, the strictly positive length of the string,	including
>    * 		the trailing NUL character. On error, a negative value.
> + *
> + * int bpf_get_file_path(char *path, u32 size, int fd)
> + *	Description
> + *		Get **file** atrribute from the current task by *fd*, then call
> + *		**d_path** to get it's absolute path and copy it as string into
> + *		*path* of *size*. Notice the **path** don't support unmountable
> + *		pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
> + *		The *size* must be strictly positive. On success, the helper
> + *		makes sure that the *path* is NUL-terminated, and the buffer
> + *		could be:
> + *		- a regular full path (include mountable fs eg: /proc, /sys)
> + *		- a regular full path with "(deleted)" at the end.
> + *		On failure, it is filled with zeroes.
> + *	Return
> + *		On success, returns the length of the copied string INCLUDING
> + *		the trailing NUL.
> + *
> + *		On failure, the returned value is one of the following:
> + *
> + *		**-EPERM** if no permission to get the path (eg: in irq ctx).
> + *
> + *		**-EBADF** if *fd* is invalid.
> + *
> + *		**-EINVAL** if *fd* corresponds to a unmountable pseudo fs
> + *
> + *		**-ENAMETOOLONG** if full path is longer than *size*
>    */
>   #define __BPF_FUNC_MAPPER(FN)		\
>   	FN(unspec),			\
> @@ -2938,7 +2964,8 @@ union bpf_attr {
>   	FN(probe_read_user),		\
>   	FN(probe_read_kernel),		\
>   	FN(probe_read_user_str),	\
> -	FN(probe_read_kernel_str),
> +	FN(probe_read_kernel_str),	\
> +	FN(get_file_path),
>   
>   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>    * function eBPF program intends to call
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index e5ef4ae9edb5..db9c0ec46a5d 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -762,6 +762,72 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
>   	.arg1_type	= ARG_ANYTHING,
>   };
>   
> +BPF_CALL_3(bpf_get_file_path, char *, dst, u32, size, int, fd)
> +{
> +	struct file *f;
> +	char *p;
> +	int ret = -EBADF;

Please use reverse Christmas tree order for the declarations (longest line first).

> +
> +	/* Ensure we're in user context which is safe for the helper to
> +	 * run. This helper has no business in a kthread.
> +	 */
> +	if (unlikely(in_interrupt() ||
> +		     current->flags & (PF_KTHREAD | PF_EXITING))) {
> +		ret = -EPERM;
> +		goto error;
> +	}
> +
> +	/* Use fget_raw instead of fget to support O_PATH, and it doesn't
> +	 * have any sleepable code, so it's ok to be here.
> +	 */
> +	f = fget_raw(fd);
> +	if (!f)
> +		goto error;
> +
> +	/* For unmountable pseudo filesystem, it seems to have no meaning
> +	 * to get their fake paths as they don't have path, and to be no
> +	 * way to validate this function pointer can be always safe to call
> +	 * in the current context.
> +	 */
> +	if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
> +		ret = -EINVAL;
> +		fput(f);
> +		goto error;
> +	}
> +
> +	/* After filter unmountable pseudo filesytem, d_path won't call
> +	 * dentry->d_op->d_name(), the normally path doesn't have any
> +	 * sleepable code, and despite it uses the current macro to get
> +	 * fs_struct (current->fs), we've already ensured we're in user
> +	 * context, so it's ok to be here.
> +	 */
> +	p = d_path(&f->f_path, dst, size);
> +	if (IS_ERR(p)) {
> +		ret = PTR_ERR(p);
> +		fput(f);
> +		goto error;
> +	}
> +
> +	ret = strlen(p);
> +	memmove(dst, p, ret);
> +	dst[ret++] = '\0';
> +	fput(f);
> +	return ret;
> +
> +error:
> +	memset(dst, '0', size);
> +	return ret;
> +}
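
A userspace sketch of the copy-out logic quoted above, not kernel code. It assumes only what the quoted code and changelog imply: d_path() builds the string at the end of the caller's buffer and returns a pointer into it, so the result has to be moved to the front with memmove(). The names model_d_path() and model_get_path() are hypothetical, and -1 stands in for the negative errno values. Note that the quoted error path calls memset(dst, '0', size), which writes the ASCII digit 0x30, while the helper description says the buffer is "filled with zeroes"; the sketch uses a plain 0. The declarations also follow the reverse Christmas tree order requested above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for d_path(): places the string at the END of buf
 * and returns a pointer into buf, or NULL if it does not fit (the kernel
 * function returns ERR_PTR(-ENAMETOOLONG) in that case).
 */
static char *model_d_path(const char *path, char *buf, size_t buflen)
{
	size_t need = strlen(path) + 1;	/* include the trailing NUL */

	if (need > buflen)
		return NULL;
	return memcpy(buf + buflen - need, path, need);
}

/*
 * Hypothetical model of the helper body. Returns the string length
 * INCLUDING the trailing NUL, or -1 after zero-filling dst.
 */
static int model_get_path(char *dst, size_t size, const char *path)
{
	/* longest declaration first (reverse Christmas tree) */
	char *p = model_d_path(path, dst, size);
	size_t len;

	if (!p) {
		memset(dst, 0, size);	/* byte value 0, not the character '0' */
		return -1;
	}

	len = strlen(p);
	memmove(dst, p, len + 1);	/* source and destination may overlap */
	return (int)(len + 1);
}
```

With a 16-byte buffer and the path "/tmp/x", the model returns 7 (six characters plus the NUL); with a 4-byte buffer it returns -1 and leaves four zero bytes behind.
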
> +
[...]


* Re: [PATCH bpf-next v12 2/2] selftests/bpf: test for bpf_get_file_path() from tracepoint
  2019-12-15  4:01         ` [PATCH bpf-next v12 2/2] selftests/bpf: test for bpf_get_file_path() " Wenbo Zhang
@ 2019-12-15 16:24           ` Yonghong Song
  2019-12-17  4:01             ` Wenbo Zhang
  0 siblings, 1 reply; 52+ messages in thread
From: Yonghong Song @ 2019-12-15 16:24 UTC (permalink / raw)
  To: Wenbo Zhang, bpf; +Cc: ast, daniel, andrii.nakryiko, netdev



On 12/14/19 8:01 PM, Wenbo Zhang wrote:
> trace fstat events by tracepoint syscalls/sys_enter_newfstat, and handle
> events only produced by test_file_get_path, which call fstat on several
> different types of files to test bpf_get_file_path's feature.
> 
> v4->v5: addressed Andrii's feedback
> - pass NULL for opts as bpf_object__open_file's PARAM2, as not really
> using any
> - modify patch subject to keep up with test code
> - as this test is single-threaded, so use getpid instead of SYS_gettid
> - remove unnecessary parens around check which after if (i < 3)
> - in kern use bpf_get_current_pid_tgid() >> 32 to fit getpid() in
> userspace part
> - with the patch adding helper as one patch series
> 
> v3->v4: addressed Andrii's feedback
> - use a set of fd instead of fds array
> - use global variables instead of maps (in v3, I mistakenly thought that
> the bpf maps are global variables.)
> - remove uncessary global variable path_info_index
> - remove fd compare as the fstat's order is fixed
> 
> v2->v3: addressed Andrii's feedback
> - use global data instead of perf_buffer to simplified code
> 
> v1->v2: addressed Daniel's feedback
> - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> helper's names
> 
> Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
> ---
>   .../selftests/bpf/prog_tests/get_file_path.c  | 171 ++++++++++++++++++
>   .../selftests/bpf/progs/test_get_file_path.c  |  43 +++++
>   2 files changed, 214 insertions(+)
>   create mode 100644 tools/testing/selftests/bpf/prog_tests/get_file_path.c
>   create mode 100644 tools/testing/selftests/bpf/progs/test_get_file_path.c
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/get_file_path.c b/tools/testing/selftests/bpf/prog_tests/get_file_path.c
> new file mode 100644
> index 000000000000..7ec11e43e0fc
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/get_file_path.c
> @@ -0,0 +1,171 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#define _GNU_SOURCE
> +#include <test_progs.h>
> +#include <sys/stat.h>
> +#include <linux/sched.h>
> +#include <sys/syscall.h>
> +
> +#define MAX_PATH_LEN		128
> +#define MAX_FDS			7
> +#define MAX_EVENT_NUM		16
> +
> +static struct file_path_test_data {
> +	pid_t pid;
> +	__u32 cnt;
> +	__u32 fds[MAX_EVENT_NUM];
> +	char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
> +} src, dst;
> +
> +static inline int set_pathname(int fd)

In non-BPF .c files, we typically do not add the 'inline' attribute.
It is up to the compiler to decide whether the function should be inlined.

> +{
> +	char buf[MAX_PATH_LEN];
> +
> +	snprintf(buf, MAX_PATH_LEN, "/proc/%d/fd/%d", src.pid, fd);
> +	src.fds[src.cnt] = fd;
> +	return readlink(buf, src.paths[src.cnt++], MAX_PATH_LEN);
> +}
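
As a sketch of the 'inline' remark above: the formatting step of set_pathname() written as a plain static function. The name format_fd_link() and its parameters are hypothetical; only the "/proc/%d/fd/%d" format string comes from the quoted test.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Plain "static", no "inline": in an ordinary userspace .c file the
 * compiler is free to inline this on its own.
 * Returns the number of characters snprintf() would have written,
 * excluding the trailing NUL.
 */
static int format_fd_link(char *buf, size_t len, int pid, int fd)
{
	return snprintf(buf, len, "/proc/%d/fd/%d", pid, fd);
}
```
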
> +
[...]
> diff --git a/tools/testing/selftests/bpf/progs/test_get_file_path.c b/tools/testing/selftests/bpf/progs/test_get_file_path.c
> new file mode 100644
> index 000000000000..eae663c1262a
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_get_file_path.c
> @@ -0,0 +1,43 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/bpf.h>
> +#include <linux/ptrace.h>
> +#include <string.h>
> +#include <unistd.h>
> +#include "bpf_helpers.h"
> +#include "bpf_tracing.h"
> +
> +#define MAX_PATH_LEN		128
> +#define MAX_EVENT_NUM		16
> +
> +static struct file_path_test_data {
> +	pid_t pid;
> +	__u32 cnt;
> +	__u32 fds[MAX_EVENT_NUM];
> +	char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
> +} data;
> +
> +struct sys_enter_newfstat_args {
> +	unsigned long long pad1;
> +	unsigned long long pad2;
> +	unsigned int fd;
> +};

The BTF-generated vmlinux.h has the following structures:
struct trace_entry {
         short unsigned int type;
         unsigned char flags;
         unsigned char preempt_count;
         int pid;
};
struct trace_event_raw_sys_enter {
         struct trace_entry ent;
         long int id;
         long unsigned int args[6];
         char __data[0];
};

Shouldn't the type of the third field (fd) be long? Otherwise,
it may cause problems on big-endian machines.
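
A sketch of what a widened field might look like, assuming a 64-bit kernel where struct trace_entry and "long int id" each occupy 8 bytes, so that args[0] starts at offset 16. The struct name sys_enter_newfstat_args_wide and the read_fd() accessor are hypothetical and are not taken from the patch.

```c
#include <stddef.h>

/*
 * Hypothetical variant of the test's context struct: the fd slot is as
 * wide as one element of trace_event_raw_sys_enter.args[].
 */
struct sys_enter_newfstat_args_wide {
	unsigned long long pad1;	/* covers struct trace_entry */
	unsigned long long pad2;	/* covers "long int id" */
	unsigned long fd;		/* args[0], a full unsigned long */
};

/* Read the fd argument from a raw context pointer. */
static unsigned long read_fd(const void *ctx)
{
	const struct sys_enter_newfstat_args_wide *args = ctx;

	return args->fd;
}
```

Reading the whole slot avoids depending on which half of it a narrower field would overlay.
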


* Re: [PATCH bpf-next v12 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-12-15  4:01         ` [PATCH bpf-next v12 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
  2019-12-15 16:05           ` Yonghong Song
  2019-12-15 16:10           ` Yonghong Song
@ 2019-12-16 22:09           ` Brendan Gregg
  2019-12-17  4:05             ` Wenbo Zhang
  2019-12-17  9:47           ` [PATCH bpf-next v13 0/2] bpf: adding get_fd_path helper Wenbo Zhang
  3 siblings, 1 reply; 52+ messages in thread
From: Brendan Gregg @ 2019-12-16 22:09 UTC (permalink / raw)
  To: Wenbo Zhang
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Yonghong Song,
	andrii.nakryiko, netdev

On Sat, Dec 14, 2019 at 8:01 PM Wenbo Zhang <ethercflow@gmail.com> wrote:
>
> When people want to identify which file system files are being opened,
> read, and written to, they can use this helper with file descriptor as
> input to achieve this goal. Other pseudo filesystems are also supported.
>
> This requirement is mainly discussed here:
>
>   https://github.com/iovisor/bcc/issues/237
>
> v11->v12: addressed Alexei's feedback
> - only allow tracepoints to make sure it won't dead lock
>
> v10->v11: addressed Al and Alexei's feedback
> - fix missing fput()
>
> v9->v10: addressed Andrii's feedback
> - send this patch together with the patch selftests as one patch series
>
> v8->v9:
> - format helper description
>
> v7->v8: addressed Alexei's feedback
> - use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
> - ensure we're in user context which is safe fot the help to run
> - filter unmountable pseudo filesystem, because they don't have real path
> - supplement the description of this helper function
>
> v6->v7:
> - fix missing signed-off-by line
>
> v5->v6: addressed Andrii's feedback
> - avoid unnecessary goto end by having two explicit returns
>
> v4->v5: addressed Andrii and Daniel's feedback
> - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> helper's names
> - when fdget_raw fails, set ret to -EBADF instead of -EINVAL
> - remove fdput from fdget_raw's error path
> - use IS_ERR instead of IS_ERR_OR_NULL as d_path ether returns a pointer
> into the buffer or an error code if the path was too long
> - modify the normal path's return value to return copied string length
> including NUL
> - update this helper description's Return bits.
>
> v3->v4: addressed Daniel's feedback
> - fix missing fdput()
> - move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
> - move fd2path's test code to another patch
> - add comment to explain why use fdget_raw instead of fdget
>
> v2->v3: addressed Yonghong's feedback
> - remove unnecessary LOCKDOWN_BPF_READ
> - refactor error handling section for enhanced readability
> - provide a test case in tools/testing/selftests/bpf
>
> v1->v2: addressed Daniel's feedback
> - fix backward compatibility
> - add this helper description
> - fix signed-off name
>
> Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
> ---
>  include/uapi/linux/bpf.h       | 29 +++++++++++++-
>  kernel/trace/bpf_trace.c       | 70 ++++++++++++++++++++++++++++++++++
>  tools/include/uapi/linux/bpf.h | 29 +++++++++++++-
>  3 files changed, 126 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index dbbcf0b02970..71d9705df120 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2821,6 +2821,32 @@ union bpf_attr {
>   *     Return
>   *             On success, the strictly positive length of the string, including
>   *             the trailing NUL character. On error, a negative value.
> + *
> + * int bpf_get_file_path(char *path, u32 size, int fd)
> + *     Description
> + *             Get **file** atrribute from the current task by *fd*, then call
> + *             **d_path** to get it's absolute path and copy it as string into
> + *             *path* of *size*. Notice the **path** don't support unmountable
> + *             pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
> + *             The *size* must be strictly positive. On success, the helper
> + *             makes sure that the *path* is NUL-terminated, and the buffer
> + *             could be:
> + *             - a regular full path (include mountable fs eg: /proc, /sys)
> + *             - a regular full path with "(deleted)" at the end.
> + *             On failure, it is filled with zeroes.
> + *     Return
> + *             On success, returns the length of the copied string INCLUDING
> + *             the trailing NUL.
> + *
> + *             On failure, the returned value is one of the following:
> + *
> + *             **-EPERM** if no permission to get the path (eg: in irq ctx).
> + *
> + *             **-EBADF** if *fd* is invalid.
> + *
> + *             **-EINVAL** if *fd* corresponds to a unmountable pseudo fs
> + *
> + *             **-ENAMETOOLONG** if full path is longer than *size*
>   */
>  #define __BPF_FUNC_MAPPER(FN)          \
>         FN(unspec),                     \
> @@ -2938,7 +2964,8 @@ union bpf_attr {
>         FN(probe_read_user),            \
>         FN(probe_read_kernel),          \
>         FN(probe_read_user_str),        \
> -       FN(probe_read_kernel_str),
> +       FN(probe_read_kernel_str),      \
> +       FN(get_file_path),
>
>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>   * function eBPF program intends to call
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index e5ef4ae9edb5..db9c0ec46a5d 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -762,6 +762,72 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
>         .arg1_type      = ARG_ANYTHING,
>  };
>
> +BPF_CALL_3(bpf_get_file_path, char *, dst, u32, size, int, fd)
> +{
> +       struct file *f;
> +       char *p;
> +       int ret = -EBADF;
> +
> +       /* Ensure we're in user context which is safe for the helper to
> +        * run. This helper has no business in a kthread.
> +        */
> +       if (unlikely(in_interrupt() ||
> +                    current->flags & (PF_KTHREAD | PF_EXITING))) {
> +               ret = -EPERM;
> +               goto error;
> +       }
> +
> +       /* Use fget_raw instead of fget to support O_PATH, and it doesn't
> +        * have any sleepable code, so it's ok to be here.
> +        */
> +       f = fget_raw(fd);
> +       if (!f)
> +               goto error;
> +
> +       /* For unmountable pseudo filesystem, it seems to have no meaning
> +        * to get their fake paths as they don't have path, and to be no
> +        * way to validate this function pointer can be always safe to call
> +        * in the current context.
> +        */
> +       if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
> +               ret = -EINVAL;
> +               fput(f);
> +               goto error;
> +       }
> +
> +       /* After filter unmountable pseudo filesytem, d_path won't call
> +        * dentry->d_op->d_name(), the normally path doesn't have any
> +        * sleepable code, and despite it uses the current macro to get
> +        * fs_struct (current->fs), we've already ensured we're in user
> +        * context, so it's ok to be here.
> +        */
> +       p = d_path(&f->f_path, dst, size);
> +       if (IS_ERR(p)) {
> +               ret = PTR_ERR(p);
> +               fput(f);
> +               goto error;
> +       }
> +
> +       ret = strlen(p);
> +       memmove(dst, p, ret);
> +       dst[ret++] = '\0';
> +       fput(f);
> +       return ret;
> +
> +error:
> +       memset(dst, '0', size);
> +       return ret;
> +}
> +
> +static const struct bpf_func_proto bpf_get_file_path_proto = {
> +       .func       = bpf_get_file_path,
> +       .gpl_only   = true,
> +       .ret_type   = RET_INTEGER,
> +       .arg1_type  = ARG_PTR_TO_UNINIT_MEM,
> +       .arg2_type  = ARG_CONST_SIZE,
> +       .arg3_type  = ARG_ANYTHING,
> +};
> +
>  static const struct bpf_func_proto *
>  tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>  {
> @@ -953,6 +1019,8 @@ tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>                 return &bpf_get_stackid_proto_tp;
>         case BPF_FUNC_get_stack:
>                 return &bpf_get_stack_proto_tp;
> +       case BPF_FUNC_get_file_path:
> +               return &bpf_get_file_path_proto;
>         default:
>                 return tracing_func_proto(func_id, prog);
>         }
> @@ -1146,6 +1214,8 @@ raw_tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>                 return &bpf_get_stackid_proto_raw_tp;
>         case BPF_FUNC_get_stack:
>                 return &bpf_get_stack_proto_raw_tp;
> +       case BPF_FUNC_get_file_path:
> +               return &bpf_get_file_path_proto;
>         default:
>                 return tracing_func_proto(func_id, prog);
>         }
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index dbbcf0b02970..71d9705df120 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -2821,6 +2821,32 @@ union bpf_attr {
>   *     Return
>   *             On success, the strictly positive length of the string, including
>   *             the trailing NUL character. On error, a negative value.
> + *
> + * int bpf_get_file_path(char *path, u32 size, int fd)
> + *     Description
> + *             Get **file** atrribute from the current task by *fd*, then call
> + *             **d_path** to get it's absolute path and copy it as string into
> + *             *path* of *size*. Notice the **path** don't support unmountable
> + *             pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
> + *             The *size* must be strictly positive. On success, the helper
> + *             makes sure that the *path* is NUL-terminated, and the buffer
> + *             could be:
> + *             - a regular full path (include mountable fs eg: /proc, /sys)
> + *             - a regular full path with "(deleted)" at the end.
> + *             On failure, it is filled with zeroes.
> + *     Return
> + *             On success, returns the length of the copied string INCLUDING
> + *             the trailing NUL.
> + *
> + *             On failure, the returned value is one of the following:
> + *
> + *             **-EPERM** if no permission to get the path (eg: in irq ctx).
> + *
> + *             **-EBADF** if *fd* is invalid.
> + *
> + *             **-EINVAL** if *fd* corresponds to a unmountable pseudo fs
> + *
> + *             **-ENAMETOOLONG** if full path is longer than *size*
>   */
>  #define __BPF_FUNC_MAPPER(FN)          \
>         FN(unspec),                     \
> @@ -2938,7 +2964,8 @@ union bpf_attr {
>         FN(probe_read_user),            \
>         FN(probe_read_kernel),          \
>         FN(probe_read_user_str),        \
> -       FN(probe_read_kernel_str),
> +       FN(probe_read_kernel_str),      \
> +       FN(get_file_path),


I just realized that among my tools that want the path, the input is either:

A) syscall tracepoints: int fd
B) kprobes: struct file *

This serves (A). If we ever add a different helper for (B), we might
think that this helper was misnamed. Should it be called get_fd_path
instead? That leaves get_file_path available for a later "struct file
*" -> pathname helper.

Brendan

-- 
Brendan Gregg, Senior Performance Architect, Netflix


* Re: [PATCH bpf-next v12 2/2] selftests/bpf: test for bpf_get_file_path() from tracepoint
  2019-12-15 16:24           ` Yonghong Song
@ 2019-12-17  4:01             ` Wenbo Zhang
  2019-12-17  4:13               ` Yonghong Song
  0 siblings, 1 reply; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-17  4:01 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf, ast, daniel, andrii.nakryiko, netdev

> In non-BPF .c files, we typically do not add the 'inline' attribute.
> It is up to the compiler to decide whether the function should be inlined.

Thank you, I'll fix this.

> > +struct sys_enter_newfstat_args {
> > +     unsigned long long pad1;
> > +     unsigned long long pad2;
> > +     unsigned int fd;
> > +};

> The BTF-generated vmlinux.h has the following structures:
> struct trace_entry {
>          short unsigned int type;
>          unsigned char flags;
>          unsigned char preempt_count;
>          int pid;
> };
> struct trace_event_raw_sys_enter {
>          struct trace_entry ent;
>          long int id;
>          long unsigned int args[6];
>          char __data[0];
> };

> Shouldn't the type of the third field (fd) be long? Otherwise,
> it may cause problems on big-endian machines.

Sorry, I don't understand why there is a problem on big-endian machines.
Would you please explain that in more detail? Thank you.
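
For reference, a small userspace sketch of the big-endian concern. It assumes a 64-bit kernel where each args[] slot is an 8-byte unsigned long and the test struct overlays a 4-byte unsigned int at the start of that slot. Byte order is simulated explicitly, so the result does not depend on the machine running the code; all function names are hypothetical.

```c
#include <stdint.h>

/* Store a 64-bit value into an 8-byte slot, most significant byte first. */
static void store_be64(uint8_t slot[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		slot[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Store a 64-bit value into an 8-byte slot, least significant byte first. */
static void store_le64(uint8_t slot[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		slot[i] = (uint8_t)(v >> (8 * i));
}

/* Decode the first 4 bytes of the slot as a big-endian CPU would. */
static uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decode the first 4 bytes of the slot as a little-endian CPU would. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Value a 4-byte field at offset 0 of the slot sees on little endian. */
static uint32_t fd_seen_on_le(uint64_t arg)
{
	uint8_t slot[8];

	store_le64(slot, arg);
	return load_le32(slot);
}

/* Value a 4-byte field at offset 0 of the slot sees on big endian. */
static uint32_t fd_seen_on_be(uint64_t arg)
{
	uint8_t slot[8];

	store_be64(slot, arg);
	return load_be32(slot);
}
```

On little endian the first four bytes of the slot hold the low half of the value, so fd 5 reads back as 5. On big endian they hold the high half, so the same field reads back as 0.
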

Yonghong Song <yhs@fb.com> wrote on Mon, Dec 16, 2019 at 12:25 AM:
>
>
>
> On 12/14/19 8:01 PM, Wenbo Zhang wrote:
> > trace fstat events by tracepoint syscalls/sys_enter_newfstat, and handle
> > events only produced by test_file_get_path, which call fstat on several
> > different types of files to test bpf_get_file_path's feature.
> >
> > v4->v5: addressed Andrii's feedback
> > - pass NULL for opts as bpf_object__open_file's PARAM2, as not really
> > using any
> > - modify patch subject to keep up with test code
> > - as this test is single-threaded, so use getpid instead of SYS_gettid
> > - remove unnecessary parens around check which after if (i < 3)
> > - in kern use bpf_get_current_pid_tgid() >> 32 to fit getpid() in
> > userspace part
> > - with the patch adding helper as one patch series
> >
> > v3->v4: addressed Andrii's feedback
> > - use a set of fd instead of fds array
> > - use global variables instead of maps (in v3, I mistakenly thought that
> > the bpf maps are global variables.)
> > - remove uncessary global variable path_info_index
> > - remove fd compare as the fstat's order is fixed
> >
> > v2->v3: addressed Andrii's feedback
> > - use global data instead of perf_buffer to simplified code
> >
> > v1->v2: addressed Daniel's feedback
> > - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> > helper's names
> >
> > Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
> > ---
> >   .../selftests/bpf/prog_tests/get_file_path.c  | 171 ++++++++++++++++++
> >   .../selftests/bpf/progs/test_get_file_path.c  |  43 +++++
> >   2 files changed, 214 insertions(+)
> >   create mode 100644 tools/testing/selftests/bpf/prog_tests/get_file_path.c
> >   create mode 100644 tools/testing/selftests/bpf/progs/test_get_file_path.c
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/get_file_path.c b/tools/testing/selftests/bpf/prog_tests/get_file_path.c
> > new file mode 100644
> > index 000000000000..7ec11e43e0fc
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/prog_tests/get_file_path.c
> > @@ -0,0 +1,171 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#define _GNU_SOURCE
> > +#include <test_progs.h>
> > +#include <sys/stat.h>
> > +#include <linux/sched.h>
> > +#include <sys/syscall.h>
> > +
> > +#define MAX_PATH_LEN         128
> > +#define MAX_FDS                      7
> > +#define MAX_EVENT_NUM                16
> > +
> > +static struct file_path_test_data {
> > +     pid_t pid;
> > +     __u32 cnt;
> > +     __u32 fds[MAX_EVENT_NUM];
> > +     char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
> > +} src, dst;
> > +
> > +static inline int set_pathname(int fd)
>
> In non-BPF .c files, we typically do not add the 'inline' attribute.
> It is up to the compiler to decide whether the function should be inlined.
>
> > +{
> > +     char buf[MAX_PATH_LEN];
> > +
> > +     snprintf(buf, MAX_PATH_LEN, "/proc/%d/fd/%d", src.pid, fd);
> > +     src.fds[src.cnt] = fd;
> > +     return readlink(buf, src.paths[src.cnt++], MAX_PATH_LEN);
> > +}
> > +
> [...]
> > diff --git a/tools/testing/selftests/bpf/progs/test_get_file_path.c b/tools/testing/selftests/bpf/progs/test_get_file_path.c
> > new file mode 100644
> > index 000000000000..eae663c1262a
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/test_get_file_path.c
> > @@ -0,0 +1,43 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include <linux/bpf.h>
> > +#include <linux/ptrace.h>
> > +#include <string.h>
> > +#include <unistd.h>
> > +#include "bpf_helpers.h"
> > +#include "bpf_tracing.h"
> > +
> > +#define MAX_PATH_LEN         128
> > +#define MAX_EVENT_NUM                16
> > +
> > +static struct file_path_test_data {
> > +     pid_t pid;
> > +     __u32 cnt;
> > +     __u32 fds[MAX_EVENT_NUM];
> > +     char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
> > +} data;
> > +
> > +struct sys_enter_newfstat_args {
> > +     unsigned long long pad1;
> > +     unsigned long long pad2;
> > +     unsigned int fd;
> > +};
>
> The BTF-generated vmlinux.h has the following structures:
> struct trace_entry {
>          short unsigned int type;
>          unsigned char flags;
>          unsigned char preempt_count;
>          int pid;
> };
> struct trace_event_raw_sys_enter {
>          struct trace_entry ent;
>          long int id;
>          long unsigned int args[6];
>          char __data[0];
> };
>
> Shouldn't the type of the third field (fd) be long? Otherwise,
> it may cause problems on big-endian machines.


* Re: [PATCH bpf-next v12 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-12-16 22:09           ` Brendan Gregg
@ 2019-12-17  4:05             ` Wenbo Zhang
  0 siblings, 0 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-17  4:05 UTC (permalink / raw)
  To: Brendan Gregg
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Yonghong Song,
	Andrii Nakryiko, Networking

> I just realized that among my tools that want the path, the input is either:

> A) syscall tracepoints: int fd
> B) kprobes: struct file *

> This serves (A). If we ever add a different helper for (B), we might
> think that this helper was misnamed. Should it be called get_fd_path
> instead? That leaves get_file_path available for a later "struct file
> *" -> pathname helper.

+1, I'll rename it in the next version.

Brendan Gregg <bgregg@netflix.com> wrote on Tue, Dec 17, 2019 at 6:09 AM:
>
> On Sat, Dec 14, 2019 at 8:01 PM Wenbo Zhang <ethercflow@gmail.com> wrote:
> >
> > When people want to identify which file system files are being opened,
> > read, and written to, they can use this helper with file descriptor as
> > input to achieve this goal. Other pseudo filesystems are also supported.
> >
> > This requirement is mainly discussed here:
> >
> >   https://github.com/iovisor/bcc/issues/237
> >
> > v11->v12: addressed Alexei's feedback
> > - only allow tracepoints to make sure it won't dead lock
> >
> > v10->v11: addressed Al and Alexei's feedback
> > - fix missing fput()
> >
> > v9->v10: addressed Andrii's feedback
> > - send this patch together with the patch selftests as one patch series
> >
> > v8->v9:
> > - format helper description
> >
> > v7->v8: addressed Alexei's feedback
> > - use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
> > - ensure we're in user context which is safe fot the help to run
> > - filter unmountable pseudo filesystem, because they don't have real path
> > - supplement the description of this helper function
> >
> > v6->v7:
> > - fix missing signed-off-by line
> >
> > v5->v6: addressed Andrii's feedback
> > - avoid unnecessary goto end by having two explicit returns
> >
> > v4->v5: addressed Andrii and Daniel's feedback
> > - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> > helper's names
> > - when fdget_raw fails, set ret to -EBADF instead of -EINVAL
> > - remove fdput from fdget_raw's error path
> > - use IS_ERR instead of IS_ERR_OR_NULL as d_path either returns a pointer
> > into the buffer or an error code if the path was too long
> > - modify the normal path's return value to return copied string length
> > including NUL
> > - update this helper description's Return bits.
> >
> > v3->v4: addressed Daniel's feedback
> > - fix missing fdput()
> > - move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
> > - move fd2path's test code to another patch
> > - add comment to explain why use fdget_raw instead of fdget
> >
> > v2->v3: addressed Yonghong's feedback
> > - remove unnecessary LOCKDOWN_BPF_READ
> > - refactor error handling section for enhanced readability
> > - provide a test case in tools/testing/selftests/bpf
> >
> > v1->v2: addressed Daniel's feedback
> > - fix backward compatibility
> > - add this helper description
> > - fix signed-off name
> >
> > Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
> > ---
> >  include/uapi/linux/bpf.h       | 29 +++++++++++++-
> >  kernel/trace/bpf_trace.c       | 70 ++++++++++++++++++++++++++++++++++
> >  tools/include/uapi/linux/bpf.h | 29 +++++++++++++-
> >  3 files changed, 126 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index dbbcf0b02970..71d9705df120 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -2821,6 +2821,32 @@ union bpf_attr {
> >   *     Return
> >   *             On success, the strictly positive length of the string, including
> >   *             the trailing NUL character. On error, a negative value.
> > + *
> > + * int bpf_get_file_path(char *path, u32 size, int fd)
> > + *     Description
> > + *             Get **file** attribute from the current task by *fd*, then call
> > + *             **d_path** to get its absolute path and copy it as string into
> > + *             *path* of *size*. Notice the **path** doesn't support unmountable
> > + *             pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
> > + *             The *size* must be strictly positive. On success, the helper
> > + *             makes sure that the *path* is NUL-terminated, and the buffer
> > + *             could be:
> > + *             - a regular full path (include mountable fs eg: /proc, /sys)
> > + *             - a regular full path with "(deleted)" at the end.
> > + *             On failure, it is filled with zeroes.
> > + *     Return
> > + *             On success, returns the length of the copied string INCLUDING
> > + *             the trailing NUL.
> > + *
> > + *             On failure, the returned value is one of the following:
> > + *
> > + *             **-EPERM** if no permission to get the path (eg: in irq ctx).
> > + *
> > + *             **-EBADF** if *fd* is invalid.
> > + *
> > + *             **-EINVAL** if *fd* corresponds to an unmountable pseudo fs
> > + *
> > + *             **-ENAMETOOLONG** if full path is longer than *size*
> >   */
> >  #define __BPF_FUNC_MAPPER(FN)          \
> >         FN(unspec),                     \
> > @@ -2938,7 +2964,8 @@ union bpf_attr {
> >         FN(probe_read_user),            \
> >         FN(probe_read_kernel),          \
> >         FN(probe_read_user_str),        \
> > -       FN(probe_read_kernel_str),
> > +       FN(probe_read_kernel_str),      \
> > +       FN(get_file_path),
> >
> >  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> >   * function eBPF program intends to call
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index e5ef4ae9edb5..db9c0ec46a5d 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -762,6 +762,72 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
> >         .arg1_type      = ARG_ANYTHING,
> >  };
> >
> > +BPF_CALL_3(bpf_get_file_path, char *, dst, u32, size, int, fd)
> > +{
> > +       struct file *f;
> > +       char *p;
> > +       int ret = -EBADF;
> > +
> > +       /* Ensure we're in user context which is safe for the helper to
> > +        * run. This helper has no business in a kthread.
> > +        */
> > +       if (unlikely(in_interrupt() ||
> > +                    current->flags & (PF_KTHREAD | PF_EXITING))) {
> > +               ret = -EPERM;
> > +               goto error;
> > +       }
> > +
> > +       /* Use fget_raw instead of fget to support O_PATH, and it doesn't
> > +        * have any sleepable code, so it's ok to be here.
> > +        */
> > +       f = fget_raw(fd);
> > +       if (!f)
> > +               goto error;
> > +
> > +       /* For unmountable pseudo filesystems, getting their fake paths
> > +        * is meaningless as they don't have a real path, and there is
> > +        * no way to validate that this function pointer is always safe
> > +        * to call in the current context.
> > +        */
> > +       if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
> > +               ret = -EINVAL;
> > +               fput(f);
> > +               goto error;
> > +       }
> > +
> > +       /* After filtering out unmountable pseudo filesystems, d_path
> > +        * won't call dentry->d_op->d_dname(). The normal path doesn't
> > +        * have any sleepable code, and although it uses the current
> > +        * macro to get fs_struct (current->fs), we've already ensured
> > +        * we're in user context, so it's ok to be here.
> > +        */
> > +       p = d_path(&f->f_path, dst, size);
> > +       if (IS_ERR(p)) {
> > +               ret = PTR_ERR(p);
> > +               fput(f);
> > +               goto error;
> > +       }
> > +
> > +       ret = strlen(p);
> > +       memmove(dst, p, ret);
> > +       dst[ret++] = '\0';
> > +       fput(f);
> > +       return ret;
> > +
> > +error:
> > +       memset(dst, 0, size);
> > +       return ret;
> > +}
> > +
> > +static const struct bpf_func_proto bpf_get_file_path_proto = {
> > +       .func       = bpf_get_file_path,
> > +       .gpl_only   = true,
> > +       .ret_type   = RET_INTEGER,
> > +       .arg1_type  = ARG_PTR_TO_UNINIT_MEM,
> > +       .arg2_type  = ARG_CONST_SIZE,
> > +       .arg3_type  = ARG_ANYTHING,
> > +};
> > +
> >  static const struct bpf_func_proto *
> >  tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> >  {
> > @@ -953,6 +1019,8 @@ tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> >                 return &bpf_get_stackid_proto_tp;
> >         case BPF_FUNC_get_stack:
> >                 return &bpf_get_stack_proto_tp;
> > +       case BPF_FUNC_get_file_path:
> > +               return &bpf_get_file_path_proto;
> >         default:
> >                 return tracing_func_proto(func_id, prog);
> >         }
> > @@ -1146,6 +1214,8 @@ raw_tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> >                 return &bpf_get_stackid_proto_raw_tp;
> >         case BPF_FUNC_get_stack:
> >                 return &bpf_get_stack_proto_raw_tp;
> > +       case BPF_FUNC_get_file_path:
> > +               return &bpf_get_file_path_proto;
> >         default:
> >                 return tracing_func_proto(func_id, prog);
> >         }
> > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> > index dbbcf0b02970..71d9705df120 100644
> > --- a/tools/include/uapi/linux/bpf.h
> > +++ b/tools/include/uapi/linux/bpf.h
> > @@ -2821,6 +2821,32 @@ union bpf_attr {
> >   *     Return
> >   *             On success, the strictly positive length of the string, including
> >   *             the trailing NUL character. On error, a negative value.
> > + *
> > + * int bpf_get_file_path(char *path, u32 size, int fd)
> > + *     Description
> > + *             Get **file** attribute from the current task by *fd*, then call
> > + *             **d_path** to get its absolute path and copy it as string into
> > + *             *path* of *size*. Notice the **path** doesn't support unmountable
> > + *             pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
> > + *             The *size* must be strictly positive. On success, the helper
> > + *             makes sure that the *path* is NUL-terminated, and the buffer
> > + *             could be:
> > + *             - a regular full path (include mountable fs eg: /proc, /sys)
> > + *             - a regular full path with "(deleted)" at the end.
> > + *             On failure, it is filled with zeroes.
> > + *     Return
> > + *             On success, returns the length of the copied string INCLUDING
> > + *             the trailing NUL.
> > + *
> > + *             On failure, the returned value is one of the following:
> > + *
> > + *             **-EPERM** if no permission to get the path (eg: in irq ctx).
> > + *
> > + *             **-EBADF** if *fd* is invalid.
> > + *
> > + *             **-EINVAL** if *fd* corresponds to an unmountable pseudo fs
> > + *
> > + *             **-ENAMETOOLONG** if full path is longer than *size*
> >   */
> >  #define __BPF_FUNC_MAPPER(FN)          \
> >         FN(unspec),                     \
> > @@ -2938,7 +2964,8 @@ union bpf_attr {
> >         FN(probe_read_user),            \
> >         FN(probe_read_kernel),          \
> >         FN(probe_read_user_str),        \
> > -       FN(probe_read_kernel_str),
> > +       FN(probe_read_kernel_str),      \
> > +       FN(get_file_path),
>
>
> I just realized that among my tools that want the path, the input is either:
>
> A) syscall tracepoints: int fd
> B) kprobes: struct file *
>
> This serves (A). If we ever add a different helper for (B), we might
> think that this helper was misnamed. Should it be called get_fd_path
> instead? That leaves get_file_path available for a later "struct file
> *" -> pathname helper.
>
> Brendan
>
> --
> Brendan Gregg, Senior Performance Architect, Netflix

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v12 2/2] selftests/bpf: test for bpf_get_file_path() from tracepoint
  2019-12-17  4:01             ` Wenbo Zhang
@ 2019-12-17  4:13               ` Yonghong Song
  2019-12-17  9:44                 ` Wenbo Zhang
  0 siblings, 1 reply; 52+ messages in thread
From: Yonghong Song @ 2019-12-17  4:13 UTC (permalink / raw)
  To: Wenbo Zhang; +Cc: bpf, ast, daniel, andrii.nakryiko, netdev



On 12/16/19 8:01 PM, Wenbo Zhang wrote:
>> In non-bpf .c file, typically we do not add 'inline' attribute.
>> It is up to compiler to decide whether it should be inlined.
> 
> Thank you, I'll fix this.
> 
>>> +struct sys_enter_newfstat_args {
>>> +     unsigned long long pad1;
>>> +     unsigned long long pad2;
>>> +     unsigned int fd;
>>> +};
> 
>> The BTF generated vmlinux.h has the following structure,
>> struct trace_entry {
>>           short unsigned int type;
>>           unsigned char flags;
>>           unsigned char preempt_count;
>>           int pid;
>> };
>> struct trace_event_raw_sys_enter {
>>           struct trace_entry ent;
>>           long int id;
>>           long unsigned int args[6];
>>           char __data[0];
>> };
> 
>> The third parameter type should be long, otherwise,
>> it may have issue on big endian machines?
> 
> Sorry, I don't understand why there is a problem on big-endian machines.
> Would you please explain that in more detail? Thank you.

The kernel actually uses 8 bytes of memory to store fd,
based on trace_event_raw_sys_enter.

On a little-endian machine, the 4 bytes read through your
sys_enter_newfstat_args are the first 4 bytes of that slot, which
"accidentally" hold the low 4 bytes of the u64, so you get the
correct answer.

On a big-endian machine, those same first 4 bytes hold the high
4 bytes of the u64, which is incorrect.
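
To make the point concrete, here is a minimal userspace sketch. It is not
part of the patch; store_u64() and load_prefix() are hypothetical names
used only for illustration. It writes an fd into an 8-byte slot in a
chosen byte order and then reads back only the first 4 bytes, the way a
struct with "unsigned int fd" would:

```c
#include <stdint.h>

/* Hypothetical helper: store v into an 8-byte slot in the given byte order. */
static void store_u64(unsigned char slot[8], uint64_t v, int big_endian)
{
	for (int i = 0; i < 8; i++) {
		int shift = big_endian ? 8 * (7 - i) : 8 * i;

		slot[i] = (unsigned char)(v >> shift);
	}
}

/* Hypothetical helper: load the first n bytes of the slot as an integer
 * in the given byte order (n = 4 models "unsigned int fd", n = 8 models
 * a field as wide as the kernel's "long unsigned int args[]" entry).
 */
static uint64_t load_prefix(const unsigned char *slot, int n, int big_endian)
{
	uint64_t v = 0;

	for (int i = 0; i < n; i++) {
		int shift = big_endian ? 8 * (n - 1 - i) : 8 * i;

		v |= (uint64_t)slot[i] << shift;
	}
	return v;
}
```

With fd = 5, the 4-byte read gives 5 in the little-endian layout but 0 in
the big-endian layout, while the 8-byte read gives 5 in both.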

> 
> On Mon, Dec 16, 2019 at 12:25 AM Yonghong Song <yhs@fb.com> wrote:
>>
>>
>>
>> On 12/14/19 8:01 PM, Wenbo Zhang wrote:
>>> trace fstat events by tracepoint syscalls/sys_enter_newfstat, and handle
>>> events only produced by test_file_get_path, which calls fstat on several
>>> different types of files to test bpf_get_file_path's feature.
>>>
>>> v4->v5: addressed Andrii's feedback
>>> - pass NULL for opts as bpf_object__open_file's PARAM2, as not really
>>> using any
>>> - modify patch subject to keep up with test code
>>> - as this test is single-threaded, so use getpid instead of SYS_gettid
>>> - remove unnecessary parens around check which after if (i < 3)
>>> - in kern use bpf_get_current_pid_tgid() >> 32 to fit getpid() in
>>> userspace part
>>> - with the patch adding helper as one patch series
>>>
>>> v3->v4: addressed Andrii's feedback
>>> - use a set of fd instead of fds array
>>> - use global variables instead of maps (in v3, I mistakenly thought that
>>> the bpf maps are global variables.)
>>> - remove unnecessary global variable path_info_index
>>> - remove fd compare as the fstat's order is fixed
>>>
>>> v2->v3: addressed Andrii's feedback
>>> - use global data instead of perf_buffer to simplify code
>>>
>>> v1->v2: addressed Daniel's feedback
>>> - rename bpf_fd2path to bpf_get_file_path to be consistent with other
>>> helper's names
>>>
>>> Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
>>> ---
>>>    .../selftests/bpf/prog_tests/get_file_path.c  | 171 ++++++++++++++++++
>>>    .../selftests/bpf/progs/test_get_file_path.c  |  43 +++++
>>>    2 files changed, 214 insertions(+)
>>>    create mode 100644 tools/testing/selftests/bpf/prog_tests/get_file_path.c
>>>    create mode 100644 tools/testing/selftests/bpf/progs/test_get_file_path.c
>>>
>>> diff --git a/tools/testing/selftests/bpf/prog_tests/get_file_path.c b/tools/testing/selftests/bpf/prog_tests/get_file_path.c
>>> new file mode 100644
>>> index 000000000000..7ec11e43e0fc
>>> --- /dev/null
>>> +++ b/tools/testing/selftests/bpf/prog_tests/get_file_path.c
>>> @@ -0,0 +1,171 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +#define _GNU_SOURCE
>>> +#include <test_progs.h>
>>> +#include <sys/stat.h>
>>> +#include <linux/sched.h>
>>> +#include <sys/syscall.h>
>>> +
>>> +#define MAX_PATH_LEN         128
>>> +#define MAX_FDS                      7
>>> +#define MAX_EVENT_NUM                16
>>> +
>>> +static struct file_path_test_data {
>>> +     pid_t pid;
>>> +     __u32 cnt;
>>> +     __u32 fds[MAX_EVENT_NUM];
>>> +     char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
>>> +} src, dst;
>>> +
>>> +static inline int set_pathname(int fd)
>>
>> In non-bpf .c file, typically we do not add 'inline' attribute.
>> It is up to compiler to decide whether it should be inlined.
>>
>>> +{
>>> +     char buf[MAX_PATH_LEN];
>>> +
>>> +     snprintf(buf, MAX_PATH_LEN, "/proc/%d/fd/%d", src.pid, fd);
>>> +     src.fds[src.cnt] = fd;
>>> +     return readlink(buf, src.paths[src.cnt++], MAX_PATH_LEN);
>>> +}
>>> +
>> [...]
>>> diff --git a/tools/testing/selftests/bpf/progs/test_get_file_path.c b/tools/testing/selftests/bpf/progs/test_get_file_path.c
>>> new file mode 100644
>>> index 000000000000..eae663c1262a
>>> --- /dev/null
>>> +++ b/tools/testing/selftests/bpf/progs/test_get_file_path.c
>>> @@ -0,0 +1,43 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +
>>> +#include <linux/bpf.h>
>>> +#include <linux/ptrace.h>
>>> +#include <string.h>
>>> +#include <unistd.h>
>>> +#include "bpf_helpers.h"
>>> +#include "bpf_tracing.h"
>>> +
>>> +#define MAX_PATH_LEN         128
>>> +#define MAX_EVENT_NUM                16
>>> +
>>> +static struct file_path_test_data {
>>> +     pid_t pid;
>>> +     __u32 cnt;
>>> +     __u32 fds[MAX_EVENT_NUM];
>>> +     char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
>>> +} data;
>>> +
>>> +struct sys_enter_newfstat_args {
>>> +     unsigned long long pad1;
>>> +     unsigned long long pad2;
>>> +     unsigned int fd;
>>> +};
>>
>> The BTF generated vmlinux.h has the following structure,
>> struct trace_entry {
>>           short unsigned int type;
>>           unsigned char flags;
>>           unsigned char preempt_count;
>>           int pid;
>> };
>> struct trace_event_raw_sys_enter {
>>           struct trace_entry ent;
>>           long int id;
>>           long unsigned int args[6];
>>           char __data[0];
>> };
>>
>> The third parameter type should be long, otherwise,
>> it may have issue on big endian machines?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v12 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-12-15 16:05           ` Yonghong Song
@ 2019-12-17  6:26             ` Wenbo Zhang
  2019-12-17  6:33               ` Yonghong Song
  0 siblings, 1 reply; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-17  6:26 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf, ast, daniel, andrii.nakryiko, netdev

> > + *           - a regular full path (include mountable fs eg: /proc, /sys)
> > + *           - a regular full path with "(deleted)" at the end.

> Let us say that " (deleted)" is appended, to be consistent with the comments
> in d_path() and to make the format clearer to the user.

Thank you, I'll fix this.

> > +     ret = strlen(p);
> > +     memmove(dst, p, ret);
> > +     dst[ret++] = '\0';

> nit: you could do memmove(dst, p, ret + 1)?

I used `dst[ret++] = '\0';` so that the return value includes the
trailing '\0', as you mentioned below:

> > +     fput(f);
> > +     return ret;

> The description says the return value length including trailing '\0'.
> The above 'ret' does not include trailing '\0'.

It seems `dst[ret++]` is not very clear to read, and the '\0' can be copied
by `memmove`. I think I'll refactor to:

```
ret = strlen(p) + 1;
memmove(dst, p, ret);
fput(f);
return ret;
```

Is this better?
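
For reference, a small userspace sketch of the two forms side by side. It
is not kernel code; copy_path_v12() and copy_path_proposed() are
hypothetical names used only for illustration. As in d_path(), the source
string is assumed to sit later in the same buffer as the destination,
which is why memmove is used rather than memcpy:

```c
#include <string.h>

/* v12 form: copy the bytes, then terminate and count the NUL. */
static int copy_path_v12(char *dst, const char *p)
{
	int ret = strlen(p);

	memmove(dst, p, ret);
	dst[ret++] = '\0';
	return ret;
}

/* Proposed form: let memmove copy the trailing NUL as well. */
static int copy_path_proposed(char *dst, const char *p)
{
	int ret = strlen(p) + 1;

	memmove(dst, p, ret);
	return ret;
}
```

Both return the string length including the trailing NUL and leave the
same bytes at the start of the buffer; the second form only drops the
post-increment.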

On Mon, Dec 16, 2019 at 12:06 AM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 12/14/19 8:01 PM, Wenbo Zhang wrote:
> > When people want to identify which file system files are being opened,
> > read, and written to, they can use this helper with file descriptor as
> > input to achieve this goal. Other pseudo filesystems are also supported.
> >
> > This requirement is mainly discussed here:
> >
> >    https://github.com/iovisor/bcc/issues/237
> >
> > v11->v12: addressed Alexei's feedback
> > - only allow tracepoints to make sure it won't deadlock
> >
> > v10->v11: addressed Al and Alexei's feedback
> > - fix missing fput()
> >
> > v9->v10: addressed Andrii's feedback
> > - send this patch together with the patch selftests as one patch series
> >
> > v8->v9:
> > - format helper description
> >
> > v7->v8: addressed Alexei's feedback
> > - use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
> > - ensure we're in user context which is safe for the helper to run
> > - filter unmountable pseudo filesystem, because they don't have real path
> > - supplement the description of this helper function
> >
> > v6->v7:
> > - fix missing signed-off-by line
> >
> > v5->v6: addressed Andrii's feedback
> > - avoid unnecessary goto end by having two explicit returns
> >
> > v4->v5: addressed Andrii and Daniel's feedback
> > - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> > helper's names
> > - when fdget_raw fails, set ret to -EBADF instead of -EINVAL
> > - remove fdput from fdget_raw's error path
> > - use IS_ERR instead of IS_ERR_OR_NULL as d_path either returns a pointer
> > into the buffer or an error code if the path was too long
> > - modify the normal path's return value to return copied string length
> > including NUL
> > - update this helper description's Return bits.
> >
> > v3->v4: addressed Daniel's feedback
> > - fix missing fdput()
> > - move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
> > - move fd2path's test code to another patch
> > - add comment to explain why use fdget_raw instead of fdget
> >
> > v2->v3: addressed Yonghong's feedback
> > - remove unnecessary LOCKDOWN_BPF_READ
> > - refactor error handling section for enhanced readability
> > - provide a test case in tools/testing/selftests/bpf
> >
> > v1->v2: addressed Daniel's feedback
> > - fix backward compatibility
> > - add this helper description
> > - fix signed-off name
> >
> > Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
> > ---
> >   include/uapi/linux/bpf.h       | 29 +++++++++++++-
> >   kernel/trace/bpf_trace.c       | 70 ++++++++++++++++++++++++++++++++++
> >   tools/include/uapi/linux/bpf.h | 29 +++++++++++++-
> >   3 files changed, 126 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index dbbcf0b02970..71d9705df120 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -2821,6 +2821,32 @@ union bpf_attr {
> >    *  Return
> >    *          On success, the strictly positive length of the string, including
> >    *          the trailing NUL character. On error, a negative value.
> > + *
> > + * int bpf_get_file_path(char *path, u32 size, int fd)
> > + *   Description
> > + *           Get **file** attribute from the current task by *fd*, then call
> > + *           **d_path** to get its absolute path and copy it as string into
> > + *           *path* of *size*. Notice the **path** doesn't support unmountable
> > + *           pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
> > + *           The *size* must be strictly positive. On success, the helper
> > + *           makes sure that the *path* is NUL-terminated, and the buffer
> > + *           could be:
> > + *           - a regular full path (include mountable fs eg: /proc, /sys)
> > + *           - a regular full path with "(deleted)" at the end.
>
> Let us say that " (deleted)" is appended, to be consistent with the comments
> in d_path() and to make the format clearer to the user.
>
> > + *           On failure, it is filled with zeroes.
> > + *   Return
> > + *           On success, returns the length of the copied string INCLUDING
> > + *           the trailing NUL.
>
> trailing '\0'.
>
> > + *
> > + *           On failure, the returned value is one of the following:
> > + *
> > + *           **-EPERM** if no permission to get the path (eg: in irq ctx).
> > + *
> > + *           **-EBADF** if *fd* is invalid.
> > + *
> > + *           **-EINVAL** if *fd* corresponds to an unmountable pseudo fs
> > + *
> > + *           **-ENAMETOOLONG** if full path is longer than *size*
> >    */
> >   #define __BPF_FUNC_MAPPER(FN)               \
> >       FN(unspec),                     \
> > @@ -2938,7 +2964,8 @@ union bpf_attr {
> >       FN(probe_read_user),            \
> >       FN(probe_read_kernel),          \
> >       FN(probe_read_user_str),        \
> > -     FN(probe_read_kernel_str),
> > +     FN(probe_read_kernel_str),      \
> > +     FN(get_file_path),
> >
> >   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> >    * function eBPF program intends to call
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index e5ef4ae9edb5..db9c0ec46a5d 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -762,6 +762,72 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
> >       .arg1_type      = ARG_ANYTHING,
> >   };
> >
> > +BPF_CALL_3(bpf_get_file_path, char *, dst, u32, size, int, fd)
> > +{
> > +     struct file *f;
> > +     char *p;
> > +     int ret = -EBADF;
> > +
> > +     /* Ensure we're in user context which is safe for the helper to
> > +      * run. This helper has no business in a kthread.
> > +      */
> > +     if (unlikely(in_interrupt() ||
> > +                  current->flags & (PF_KTHREAD | PF_EXITING))) {
> > +             ret = -EPERM;
> > +             goto error;
> > +     }
> > +
> > +     /* Use fget_raw instead of fget to support O_PATH, and it doesn't
> > +      * have any sleepable code, so it's ok to be here.
> > +      */
> > +     f = fget_raw(fd);
> > +     if (!f)
> > +             goto error;
> > +
> > +     /* For unmountable pseudo filesystems, getting their fake paths
> > +      * is meaningless as they don't have a real path, and there is
> > +      * no way to validate that this function pointer is always safe
> > +      * to call in the current context.
> > +      */
> > +     if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
> > +             ret = -EINVAL;
> > +             fput(f);
> > +             goto error;
> > +     }
> > +
> > +     /* After filtering out unmountable pseudo filesystems, d_path
> > +      * won't call dentry->d_op->d_dname(). The normal path doesn't
> > +      * have any sleepable code, and although it uses the current
> > +      * macro to get fs_struct (current->fs), we've already ensured
> > +      * we're in user context, so it's ok to be here.
> > +      */
> > +     p = d_path(&f->f_path, dst, size);
> > +     if (IS_ERR(p)) {
> > +             ret = PTR_ERR(p);
> > +             fput(f);
> > +             goto error;
> > +     }
> > +
> > +     ret = strlen(p);
> > +     memmove(dst, p, ret);
> > +     dst[ret++] = '\0';
>
> nit: you could do memmove(dst, p, ret + 1)?
>
> > +     fput(f);
> > +     return ret;
>
> The description says the return value length including trailing '\0'.
> The above 'ret' does not include trailing '\0'.
>
> > +
> > +error:
> > +     memset(dst, 0, size);
> > +     return ret;
> > +}
> > +
> > +static const struct bpf_func_proto bpf_get_file_path_proto = {
> > +     .func       = bpf_get_file_path,
> > +     .gpl_only   = true,
> > +     .ret_type   = RET_INTEGER,
> > +     .arg1_type  = ARG_PTR_TO_UNINIT_MEM,
> > +     .arg2_type  = ARG_CONST_SIZE,
> > +     .arg3_type  = ARG_ANYTHING,
> > +};
> > +
> >   static const struct bpf_func_proto *
> >   tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> >   {
> > @@ -953,6 +1019,8 @@ tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> >               return &bpf_get_stackid_proto_tp;
> >       case BPF_FUNC_get_stack:
> >               return &bpf_get_stack_proto_tp;
> > +     case BPF_FUNC_get_file_path:
> > +             return &bpf_get_file_path_proto;
> >       default:
> >               return tracing_func_proto(func_id, prog);
> >       }
> > @@ -1146,6 +1214,8 @@ raw_tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> >               return &bpf_get_stackid_proto_raw_tp;
> >       case BPF_FUNC_get_stack:
> >               return &bpf_get_stack_proto_raw_tp;
> > +     case BPF_FUNC_get_file_path:
> > +             return &bpf_get_file_path_proto;
> >       default:
> >               return tracing_func_proto(func_id, prog);
> >       }
> > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> > index dbbcf0b02970..71d9705df120 100644
> > --- a/tools/include/uapi/linux/bpf.h
> > +++ b/tools/include/uapi/linux/bpf.h
> > @@ -2821,6 +2821,32 @@ union bpf_attr {
> >    *  Return
> >    *          On success, the strictly positive length of the string, including
> >    *          the trailing NUL character. On error, a negative value.
> > + *
> > + * int bpf_get_file_path(char *path, u32 size, int fd)
> > + *   Description
> > + *           Get **file** attribute from the current task by *fd*, then call
> > + *           **d_path** to get its absolute path and copy it as string into
> > + *           *path* of *size*. Notice the **path** doesn't support unmountable
> > + *           pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
> > + *           The *size* must be strictly positive. On success, the helper
> > + *           makes sure that the *path* is NUL-terminated, and the buffer
> > + *           could be:
> > + *           - a regular full path (include mountable fs eg: /proc, /sys)
> > + *           - a regular full path with "(deleted)" at the end.
>
> ditto
>
> > + *           On failure, it is filled with zeroes.
> > + *   Return
> > + *           On success, returns the length of the copied string INCLUDING
> > + *           the trailing NUL.
>
> ditto
>
> > + *
> > + *           On failure, the returned value is one of the following:
> > + *
> > + *           **-EPERM** if no permission to get the path (eg: in irq ctx).
> > + *
> > + *           **-EBADF** if *fd* is invalid.
> > + *
> > + *           **-EINVAL** if *fd* corresponds to an unmountable pseudo fs
> > + *
> > + *           **-ENAMETOOLONG** if full path is longer than *size*
> >    */
> >   #define __BPF_FUNC_MAPPER(FN)               \
> >       FN(unspec),                     \
> > @@ -2938,7 +2964,8 @@ union bpf_attr {
> >       FN(probe_read_user),            \
> >       FN(probe_read_kernel),          \
> >       FN(probe_read_user_str),        \
> > -     FN(probe_read_kernel_str),
> > +     FN(probe_read_kernel_str),      \
> > +     FN(get_file_path),
> >
> >   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> >    * function eBPF program intends to call
> >

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v12 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-12-15 16:10           ` Yonghong Song
@ 2019-12-17  6:27             ` Wenbo Zhang
  0 siblings, 0 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-17  6:27 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf, ast, daniel, andrii.nakryiko, netdev

> > +BPF_CALL_3(bpf_get_file_path, char *, dst, u32, size, int, fd)
> > +{
> > +     struct file *f;
> > +     char *p;
> > +     int ret = -EBADF;

> please try to use reverse Christmas tree for declarations.

Thank you, I'll fix this.
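
For readers unfamiliar with the convention: reverse Christmas tree
ordering sorts local declarations from the longest line to the shortest.
A minimal userspace sketch of how the three declarations above could be
reordered; decl_order_sketch() and the opaque struct file are stand-ins
for illustration only, not the actual follow-up patch:

```c
#include <errno.h>
#include <stddef.h>

struct file;	/* opaque stand-in for the kernel type */

static int decl_order_sketch(void)
{
	/* Longest declaration first, shortest last. */
	int ret = -EBADF;
	struct file *f;
	char *p;

	/* Assigned only so the sketch compiles without warnings. */
	f = NULL;
	p = NULL;
	(void)f;
	(void)p;
	return ret;
}
```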

On Mon, Dec 16, 2019 at 12:10 AM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 12/14/19 8:01 PM, Wenbo Zhang wrote:
> > When people want to identify which file system files are being opened,
> > read, and written to, they can use this helper with file descriptor as
> > input to achieve this goal. Other pseudo filesystems are also supported.
> >
> > This requirement is mainly discussed here:
> >
> >    https://github.com/iovisor/bcc/issues/237
> >
> > v11->v12: addressed Alexei's feedback
> > - only allow tracepoints to make sure it won't deadlock
> >
> > v10->v11: addressed Al and Alexei's feedback
> > - fix missing fput()
> >
> > v9->v10: addressed Andrii's feedback
> > - send this patch together with the patch selftests as one patch series
> >
> > v8->v9:
> > - format helper description
> >
> > v7->v8: addressed Alexei's feedback
> > - use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
> > - ensure we're in user context which is safe for the helper to run
> > - filter unmountable pseudo filesystem, because they don't have real path
> > - supplement the description of this helper function
> >
> > v6->v7:
> > - fix missing signed-off-by line
> >
> > v5->v6: addressed Andrii's feedback
> > - avoid unnecessary goto end by having two explicit returns
> >
> > v4->v5: addressed Andrii and Daniel's feedback
> > - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> > helper's names
> > - when fdget_raw fails, set ret to -EBADF instead of -EINVAL
> > - remove fdput from fdget_raw's error path
> > - use IS_ERR instead of IS_ERR_OR_NULL as d_path either returns a pointer
> > into the buffer or an error code if the path was too long
> > - modify the normal path's return value to return copied string length
> > including NUL
> > - update this helper description's Return bits.
> >
> > v3->v4: addressed Daniel's feedback
> > - fix missing fdput()
> > - move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
> > - move fd2path's test code to another patch
> > - add comment to explain why use fdget_raw instead of fdget
> >
> > v2->v3: addressed Yonghong's feedback
> > - remove unnecessary LOCKDOWN_BPF_READ
> > - refactor error handling section for enhanced readability
> > - provide a test case in tools/testing/selftests/bpf
> >
> > v1->v2: addressed Daniel's feedback
> > - fix backward compatibility
> > - add this helper description
> > - fix signed-off name
> >
> > Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
> > ---
> >   include/uapi/linux/bpf.h       | 29 +++++++++++++-
> >   kernel/trace/bpf_trace.c       | 70 ++++++++++++++++++++++++++++++++++
> >   tools/include/uapi/linux/bpf.h | 29 +++++++++++++-
> >   3 files changed, 126 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index dbbcf0b02970..71d9705df120 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -2821,6 +2821,32 @@ union bpf_attr {
> >    *  Return
> >    *          On success, the strictly positive length of the string, including
> >    *          the trailing NUL character. On error, a negative value.
> > + *
> > + * int bpf_get_file_path(char *path, u32 size, int fd)
> > + *   Description
> > + *           Get the **file** attribute from the current task by *fd*, then
> > + *           call **d_path** to get its absolute path and copy it as a string
> > + *           into *path* of *size*. Note that unmountable pseudo filesystems
> > + *           are not supported, as they have no path (e.g. SOCKFS, PIPEFS).
> > + *           The *size* must be strictly positive. On success, the helper
> > + *           makes sure that the *path* is NUL-terminated, and the buffer
> > + *           could be:
> > + *           - a regular full path (include mountable fs eg: /proc, /sys)
> > + *           - a regular full path with "(deleted)" at the end.
> > + *           On failure, it is filled with zeroes.
> > + *   Return
> > + *           On success, returns the length of the copied string INCLUDING
> > + *           the trailing NUL.
> > + *
> > + *           On failure, the returned value is one of the following:
> > + *
> > + *           **-EPERM** if no permission to get the path (eg: in irq ctx).
> > + *
> > + *           **-EBADF** if *fd* is invalid.
> > + *
> > + *           **-EINVAL** if *fd* corresponds to an unmountable pseudo fs
> > + *
> > + *           **-ENAMETOOLONG** if full path is longer than *size*
> >    */
> >   #define __BPF_FUNC_MAPPER(FN)               \
> >       FN(unspec),                     \
> > @@ -2938,7 +2964,8 @@ union bpf_attr {
> >       FN(probe_read_user),            \
> >       FN(probe_read_kernel),          \
> >       FN(probe_read_user_str),        \
> > -     FN(probe_read_kernel_str),
> > +     FN(probe_read_kernel_str),      \
> > +     FN(get_file_path),
> >
> >   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> >    * function eBPF program intends to call
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index e5ef4ae9edb5..db9c0ec46a5d 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -762,6 +762,72 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
> >       .arg1_type      = ARG_ANYTHING,
> >   };
> >
> > +BPF_CALL_3(bpf_get_file_path, char *, dst, u32, size, int, fd)
> > +{
> > +     struct file *f;
> > +     char *p;
> > +     int ret = -EBADF;
>
> please try to use reverse Christmas tree for declarations.
>
> > +
> > +     /* Ensure we're in user context which is safe for the helper to
> > +      * run. This helper has no business in a kthread.
> > +      */
> > +     if (unlikely(in_interrupt() ||
> > +                  current->flags & (PF_KTHREAD | PF_EXITING))) {
> > +             ret = -EPERM;
> > +             goto error;
> > +     }
> > +
> > +     /* Use fget_raw instead of fget to support O_PATH, and it doesn't
> > +      * have any sleepable code, so it's ok to be here.
> > +      */
> > +     f = fget_raw(fd);
> > +     if (!f)
> > +             goto error;
> > +
> > +     /* For unmountable pseudo filesystems, getting a path is meaningless
> > +      * because they don't have one, and there is no way to validate
> > +      * that their d_dname() function pointer is always safe to call
> > +      * in the current context.
> > +      */
> > +     if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
> > +             ret = -EINVAL;
> > +             fput(f);
> > +             goto error;
> > +     }
> > +
> > +     /* After filtering out unmountable pseudo filesystems, d_path won't
> > +      * call dentry->d_op->d_dname(). The normal path doesn't have any
> > +      * sleepable code, and although it uses the current macro to get
> > +      * fs_struct (current->fs), we've already ensured we're in user
> > +      * context, so it's ok to be here.
> > +      */
> > +     p = d_path(&f->f_path, dst, size);
> > +     if (IS_ERR(p)) {
> > +             ret = PTR_ERR(p);
> > +             fput(f);
> > +             goto error;
> > +     }
> > +
> > +     ret = strlen(p);
> > +     memmove(dst, p, ret);
> > +     dst[ret++] = '\0';
> > +     fput(f);
> > +     return ret;
> > +
> > +error:
> > +     memset(dst, '0', size);
> > +     return ret;
> > +}
> > +
> [...]

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v12 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname
  2019-12-17  6:26             ` Wenbo Zhang
@ 2019-12-17  6:33               ` Yonghong Song
  0 siblings, 0 replies; 52+ messages in thread
From: Yonghong Song @ 2019-12-17  6:33 UTC (permalink / raw)
  To: Wenbo Zhang; +Cc: bpf, ast, daniel, andrii.nakryiko, netdev



On 12/16/19 10:26 PM, Wenbo Zhang wrote:
>>> + *           - a regular full path (include mountable fs eg: /proc, /sys)
>>> + *           - a regular full path with "(deleted)" at the end.
> 
>> Let us say that " (deleted)" is appended, to be consistent with the comments
>> in d_path() and to make the resulting format clearer to the user.
> 
> Thank you, I'll fix this.
> 
>>> +     ret = strlen(p);
>>> +     memmove(dst, p, ret);
>>> +     dst[ret++] = '\0';
> 
>> nit: you could do memmove(dst, p, ret + 1)?
> 
> I used `dst[ret++] = '\0';` so that the returned length includes the
> trailing '\0', as you mentioned below:
> 
>>> +     fput(f);
>>> +     return ret;
> 
>> The description says the return value length including trailing '\0'.
>> The above 'ret' does not include trailing '\0'.
> 
> It seems `dst[ret++]` is not very clear to read, and the '\0' can be copied
> by `memmove` itself. I think I'll refactor to
> 
> ```
> ret = strlen(p) + 1;
> memmove(dst, p, ret);
> fput(f);
> return ret;
> ```
> 
> Is this better?

Ah, I missed ret++ in dst[ret++]. Indeed the above code is better.
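
For readers following along, the agreed pattern can be tried outside the kernel.
What follows is only a userspace sketch: fill_from_end() is a hypothetical
stand-in that mimics how d_path() builds the string at the end of the buffer
and returns a pointer into it, and copy_path_to_front() is an illustrative name,
not code from the patch. The sketch zero-fills with 0 bytes on failure, which is
an assumption based on the helper description ("filled with zeroes").

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for d_path(): place the string at the END of buf
 * and return a pointer into buf, or NULL if it does not fit.
 */
static char *fill_from_end(const char *path, char *buf, size_t size)
{
	size_t len = strlen(path) + 1;	/* length including the NUL */

	if (len > size)
		return NULL;
	return memcpy(buf + size - len, path, len);
}

/* Move the string to the front of dst and return its length INCLUDING
 * the trailing NUL, or -ENAMETOOLONG with dst zero-filled.
 */
static int copy_path_to_front(const char *path, char *dst, size_t size)
{
	char *p = fill_from_end(path, dst, size);
	int ret;

	if (!p) {
		memset(dst, 0, size);
		return -ENAMETOOLONG;
	}

	ret = strlen(p) + 1;
	/* Source and destination may overlap, hence memmove, not memcpy. */
	memmove(dst, p, ret);
	return ret;
}
```

Copying ret bytes with ret = strlen(p) + 1 moves the NUL together with the
string, so no separate dst[ret++] = '\0' step is needed.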

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v12 2/2] selftests/bpf: test for bpf_get_file_path() from tracepoint
  2019-12-17  4:13               ` Yonghong Song
@ 2019-12-17  9:44                 ` Wenbo Zhang
  0 siblings, 0 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-17  9:44 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf, ast, daniel, andrii.nakryiko, netdev

> The kernel will actually have 8 bytes of memory to store fd
> based on trace_event_raw_sys_enter.

> For little endian machine, the lower 4 bytes are read based on
> your sys_enter_newfstat_args, which is "accidentally" the lower
> 4 bytes in u64, so you get the correct answer.

> For big endian machine, the lower 4 bytes read based on
> your sys_enter_newfstat_args will be high 4 bytes in u64, which
> is incorrect.

Oh, I get it. Thank you, I'll fix this in the next version.
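
To make the byte-order point concrete, here is a small host-independent sketch.
It stores fd = 5 into an 8-byte slot the way a little-endian and a big-endian
kernel would, then reads only the first 4 bytes, as a struct with a 32-bit fd
field laid over the 64-bit args[0] slot would. All function names are
illustrative assumptions and do not appear in the selftest.

```c
#include <stdint.h>

/* Store a 64-bit value into slot using little-endian byte order. */
static void store_le64(unsigned char slot[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		slot[i] = (unsigned char)(v >> (8 * i));
}

/* Store a 64-bit value into slot using big-endian byte order. */
static void store_be64(unsigned char slot[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		slot[i] = (unsigned char)(v >> (8 * (7 - i)));
}

/* Decode 4 bytes as a little-endian machine would. */
static uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Decode 4 bytes as a big-endian machine would. */
static uint32_t load_be32(const unsigned char *p)
{
	return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
	       (uint32_t)p[2] << 8 | (uint32_t)p[3];
}
```

With the little-endian layout, load_le32(slot) on the first 4 bytes yields 5.
With the big-endian layout, load_be32(slot) yields 0 and the value only shows up
at slot + 4, which is why the field needs to be as wide as the slot itself.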

On Tue, Dec 17, 2019 at 12:14 PM, Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 12/16/19 8:01 PM, Wenbo Zhang wrote:
> >> In non-bpf .c file, typically we do not add 'inline' attribute.
> >> It is up to compiler to decide whether it should be inlined.
> >
> > Thank you, I'll fix this.
> >
> >>> +struct sys_enter_newfstat_args {
> >>> +     unsigned long long pad1;
> >>> +     unsigned long long pad2;
> >>> +     unsigned int fd;
> >>> +};
> >
> >> The BTF generated vmlinux.h has the following structure,
> >> struct trace_entry {
> >>           short unsigned int type;
> >>           unsigned char flags;
> >>           unsigned char preempt_count;
> >>           int pid;
> >> };
> >> struct trace_event_raw_sys_enter {
> >>           struct trace_entry ent;
> >>           long int id;
> >>           long unsigned int args[6];
> >>           char __data[0];
> >> };
> >
> >> The third parameter type should be long, otherwise,
> >> it may have issue on big endian machines?
> >
> > Sorry, I don't understand why there is a problem on big-endian machines.
> > Would you please explain that in more detail? Thank you.
>
> The kernel will actually have 8 bytes of memory to store fd
> based on trace_event_raw_sys_enter.
>
> For little endian machine, the lower 4 bytes are read based on
> your sys_enter_newfstat_args, which is "accidentally" the lower
> 4 bytes in u64, so you get the correct answer.
>
> For big endian machine, the lower 4 bytes read based on
> your sys_enter_newfstat_args will be high 4 bytes in u64, which
> is incorrect.
>
> >
> > On Mon, Dec 16, 2019 at 12:25 AM, Yonghong Song <yhs@fb.com> wrote:
> >>
> >>
> >>
> >> On 12/14/19 8:01 PM, Wenbo Zhang wrote:
> >>> trace fstat events by tracepoint syscalls/sys_enter_newfstat, and handle
> >>> events only produced by test_file_get_path, which call fstat on several
> >>> different types of files to test bpf_get_file_path's feature.
> >>>
> >>> v4->v5: addressed Andrii's feedback
> >>> - pass NULL for opts as bpf_object__open_file's PARAM2, as not really
> >>> using any
> >>> - modify patch subject to keep up with test code
> >>> - as this test is single-threaded, so use getpid instead of SYS_gettid
> >>> - remove unnecessary parens around check which after if (i < 3)
> >>> - in kern use bpf_get_current_pid_tgid() >> 32 to fit getpid() in
> >>> userspace part
> >>> - with the patch adding helper as one patch series
> >>>
> >>> v3->v4: addressed Andrii's feedback
> >>> - use a set of fd instead of fds array
> >>> - use global variables instead of maps (in v3, I mistakenly thought that
> >>> the bpf maps are global variables.)
> >>> - remove unnecessary global variable path_info_index
> >>> - remove fd compare as the fstat's order is fixed
> >>>
> >>> v2->v3: addressed Andrii's feedback
> >>> - use global data instead of perf_buffer to simplify the code
> >>>
> >>> v1->v2: addressed Daniel's feedback
> >>> - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> >>> helper's names
> >>>
> >>> Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
> >>> ---
> >>>    .../selftests/bpf/prog_tests/get_file_path.c  | 171 ++++++++++++++++++
> >>>    .../selftests/bpf/progs/test_get_file_path.c  |  43 +++++
> >>>    2 files changed, 214 insertions(+)
> >>>    create mode 100644 tools/testing/selftests/bpf/prog_tests/get_file_path.c
> >>>    create mode 100644 tools/testing/selftests/bpf/progs/test_get_file_path.c
> >>>
> >>> diff --git a/tools/testing/selftests/bpf/prog_tests/get_file_path.c b/tools/testing/selftests/bpf/prog_tests/get_file_path.c
> >>> new file mode 100644
> >>> index 000000000000..7ec11e43e0fc
> >>> --- /dev/null
> >>> +++ b/tools/testing/selftests/bpf/prog_tests/get_file_path.c
> >>> @@ -0,0 +1,171 @@
> >>> +// SPDX-License-Identifier: GPL-2.0
> >>> +#define _GNU_SOURCE
> >>> +#include <test_progs.h>
> >>> +#include <sys/stat.h>
> >>> +#include <linux/sched.h>
> >>> +#include <sys/syscall.h>
> >>> +
> >>> +#define MAX_PATH_LEN         128
> >>> +#define MAX_FDS                      7
> >>> +#define MAX_EVENT_NUM                16
> >>> +
> >>> +static struct file_path_test_data {
> >>> +     pid_t pid;
> >>> +     __u32 cnt;
> >>> +     __u32 fds[MAX_EVENT_NUM];
> >>> +     char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
> >>> +} src, dst;
> >>> +
> >>> +static inline int set_pathname(int fd)
> >>
> >> In non-bpf .c file, typically we do not add 'inline' attribute.
> >> It is up to compiler to decide whether it should be inlined.
> >>
> >>> +{
> >>> +     char buf[MAX_PATH_LEN];
> >>> +
> >>> +     snprintf(buf, MAX_PATH_LEN, "/proc/%d/fd/%d", src.pid, fd);
> >>> +     src.fds[src.cnt] = fd;
> >>> +     return readlink(buf, src.paths[src.cnt++], MAX_PATH_LEN);
> >>> +}
> >>> +
> >> [...]
> >>> diff --git a/tools/testing/selftests/bpf/progs/test_get_file_path.c b/tools/testing/selftests/bpf/progs/test_get_file_path.c
> >>> new file mode 100644
> >>> index 000000000000..eae663c1262a
> >>> --- /dev/null
> >>> +++ b/tools/testing/selftests/bpf/progs/test_get_file_path.c
> >>> @@ -0,0 +1,43 @@
> >>> +// SPDX-License-Identifier: GPL-2.0
> >>> +
> >>> +#include <linux/bpf.h>
> >>> +#include <linux/ptrace.h>
> >>> +#include <string.h>
> >>> +#include <unistd.h>
> >>> +#include "bpf_helpers.h"
> >>> +#include "bpf_tracing.h"
> >>> +
> >>> +#define MAX_PATH_LEN         128
> >>> +#define MAX_EVENT_NUM                16
> >>> +
> >>> +static struct file_path_test_data {
> >>> +     pid_t pid;
> >>> +     __u32 cnt;
> >>> +     __u32 fds[MAX_EVENT_NUM];
> >>> +     char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
> >>> +} data;
> >>> +
> >>> +struct sys_enter_newfstat_args {
> >>> +     unsigned long long pad1;
> >>> +     unsigned long long pad2;
> >>> +     unsigned int fd;
> >>> +};
> >>
> >> The BTF generated vmlinux.h has the following structure,
> >> struct trace_entry {
> >>           short unsigned int type;
> >>           unsigned char flags;
> >>           unsigned char preempt_count;
> >>           int pid;
> >> };
> >> struct trace_event_raw_sys_enter {
> >>           struct trace_entry ent;
> >>           long int id;
> >>           long unsigned int args[6];
> >>           char __data[0];
> >> };
> >>
> >> The third parameter type should be long, otherwise,
> >> it may have issue on big endian machines?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH bpf-next v13 0/2] bpf: adding get_fd_path helper
  2019-12-15  4:01         ` [PATCH bpf-next v12 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
                             ` (2 preceding siblings ...)
  2019-12-16 22:09           ` Brendan Gregg
@ 2019-12-17  9:47           ` Wenbo Zhang
  2019-12-17  9:47             ` [PATCH bpf-next v13 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname Wenbo Zhang
  2019-12-17  9:47             ` [PATCH bpf-next v13 " Wenbo Zhang
  3 siblings, 2 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-17  9:47 UTC (permalink / raw)
  To: bpf; +Cc: ast, daniel, yhs, bgregg, andrii.nakryiko, netdev

This patch series introduces a bpf helper that can be used to map a file
descriptor to a pathname.

This requirement is mainly discussed here:

  https://github.com/iovisor/bcc/issues/237

This implementation supports both local and mountable pseudo file systems,
and ensures we're in user context, where it is safe for this helper to run.

Changes since v12:

* Rename to get_fd_path

* Fix test issue on big-endian machines


Changes since v11:

* Only allow tracepoints to make sure it won't deadlock


Changes since v10:

* Fix missing fput


Changes since v9:

* Send the helper patch together with its selftests patch as one series

* Refactor selftests code for further simplification  


Changes since v8:

* Format helper description 
 

Changes since v7:

* Use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/

* Ensure we're in user context which is safe for the helper to run

* Filter unmountable pseudo filesystem, because they don't have real path

* Supplement the description of this helper function


Changes since v6:

* Fix missing signed-off-by line


Changes since v5:

* Refactor helper to avoid unnecessary goto end by having two explicit returns


Changes since v4:

* Rename bpf_fd2path to bpf_get_file_path to be consistent with other
helper's names

* When fdget_raw fails, set ret to -EBADF instead of -EINVAL

* Remove fdput from fdget_raw's error path

* Use IS_ERR instead of IS_ERR_OR_NULL as d_path either returns a pointer
into the buffer or an error code if the path was too long

* Modify the normal path's return value to return copied string length
including NUL

* Update helper description's Return bits.

* Refactor selftests code for further simplification  


Changes since v3:

* Remove unnecessary LOCKDOWN_BPF_READ

* Refactor error handling section for enhanced readability

* Provide a test case in tools/testing/selftests/bpf

* Refactor selftests code to use real global variables instead of maps


Changes since v2:

* Fix backward compatibility

* Add helper description

* Refactor selftests to use global data instead of perf_buffer to simplify
the code

* Fix signed-off name


Wenbo Zhang (2):
  bpf: add new helper get_fd_path for mapping a file descriptor to a
    pathname
  selftests/bpf: test for bpf_get_fd_path() from tracepoint

 include/uapi/linux/bpf.h                      |  29 ++-
 kernel/trace/bpf_trace.c                      |  69 +++++++
 tools/include/uapi/linux/bpf.h                |  29 ++-
 .../selftests/bpf/prog_tests/get_fd_path.c    | 171 ++++++++++++++++++
 .../selftests/bpf/progs/test_get_fd_path.c    |  43 +++++
 5 files changed, 339 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_fd_path.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_get_fd_path.c

-- 
2.17.1


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH bpf-next v13 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2019-12-17  9:47           ` [PATCH bpf-next v13 0/2] bpf: adding get_fd_path helper Wenbo Zhang
@ 2019-12-17  9:47             ` Wenbo Zhang
  2019-12-17 16:29               ` Yonghong Song
  2019-12-18  0:56               ` [PATCH bpf-next v14 0/2] bpf: adding get_fd_path helper Wenbo Zhang
  2019-12-17  9:47             ` [PATCH bpf-next v13 " Wenbo Zhang
  1 sibling, 2 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-17  9:47 UTC (permalink / raw)
  To: bpf; +Cc: ast, daniel, yhs, bgregg, andrii.nakryiko, netdev

When people want to identify which files are being opened, read, and
written to, they can pass a file descriptor to this helper to get the
corresponding pathname. Mountable pseudo filesystems are also supported.

This requirement is mainly discussed here:

  https://github.com/iovisor/bcc/issues/237

v12->v13: addressed Gregg and Yonghong's feedback
- rename to get_fd_path
- refactor code & comment to be clearer and more compliant

v11->v12: addressed Alexei's feedback
- only allow tracepoints to make sure it won't deadlock

v10->v11: addressed Al and Alexei's feedback
- fix missing fput()

v9->v10: addressed Andrii's feedback
- send this patch together with the patch selftests as one patch series

v8->v9:
- format helper description

v7->v8: addressed Alexei's feedback
- use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
- ensure we're in user context which is safe for the helper to run
- filter unmountable pseudo filesystem, because they don't have real path
- supplement the description of this helper function

v6->v7:
- fix missing signed-off-by line

v5->v6: addressed Andrii's feedback
- avoid unnecessary goto end by having two explicit returns

v4->v5: addressed Andrii and Daniel's feedback
- rename bpf_fd2path to bpf_get_file_path to be consistent with other
helper's names
- when fdget_raw fails, set ret to -EBADF instead of -EINVAL
- remove fdput from fdget_raw's error path
- use IS_ERR instead of IS_ERR_OR_NULL as d_path either returns a pointer
into the buffer or an error code if the path was too long
- modify the normal path's return value to return copied string length
including NUL
- update this helper description's Return bits.

v3->v4: addressed Daniel's feedback
- fix missing fdput()
- move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
- move fd2path's test code to another patch
- add comment to explain why use fdget_raw instead of fdget

v2->v3: addressed Yonghong's feedback
- remove unnecessary LOCKDOWN_BPF_READ
- refactor error handling section for enhanced readability
- provide a test case in tools/testing/selftests/bpf

v1->v2: addressed Daniel's feedback
- fix backward compatibility
- add this helper description
- fix signed-off name

Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
---
 include/uapi/linux/bpf.h       | 29 +++++++++++++-
 kernel/trace/bpf_trace.c       | 69 ++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h | 29 +++++++++++++-
 3 files changed, 125 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index dbbcf0b02970..c1e4fd286614 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2821,6 +2821,32 @@ union bpf_attr {
  * 	Return
  * 		On success, the strictly positive length of the string,	including
  * 		the trailing NUL character. On error, a negative value.
+ *
+ * int bpf_get_fd_path(char *path, u32 size, int fd)
+ *	Description
+ *		Get the **file** attribute from the current task by *fd*, then
+ *		call **d_path** to get its absolute path and copy it as a string
+ *		into *path* of *size*. Note that unmountable pseudo filesystems
+ *		are not supported, as they have no path (e.g. SOCKFS, PIPEFS).
+ *		The *size* must be strictly positive. On success, the helper
+ *		makes sure that the *path* is NUL-terminated, and the buffer
+ *		could be:
+ *		- a regular full path (including mountable fs, e.g. /proc, /sys)
+ *		- a regular full path with " (deleted)" appended.
+ *		On failure, it is filled with zeroes.
+ *	Return
+ *		On success, returns the length of the copied string INCLUDING
+ *		the trailing '\0'.
+ *
+ *		On failure, the returned value is one of the following:
+ *
+ *		**-EPERM** if no permission to get the path (eg: in irq ctx).
+ *
+ *		**-EBADF** if *fd* is invalid.
+ *
+ *		**-EINVAL** if *fd* corresponds to an unmountable pseudo fs
+ *
+ *		**-ENAMETOOLONG** if full path is longer than *size*
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2938,7 +2964,8 @@ union bpf_attr {
 	FN(probe_read_user),		\
 	FN(probe_read_kernel),		\
 	FN(probe_read_user_str),	\
-	FN(probe_read_kernel_str),
+	FN(probe_read_kernel_str),	\
+	FN(get_fd_path),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index e5ef4ae9edb5..43a6aa6ad967 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -762,6 +762,71 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
 	.arg1_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_3(bpf_get_fd_path, char *, dst, u32, size, int, fd)
+{
+	int ret = -EBADF;
+	struct file *f;
+	char *p;
+
+	/* Ensure we're in user context which is safe for the helper to
+	 * run. This helper has no business in a kthread.
+	 */
+	if (unlikely(in_interrupt() ||
+		     current->flags & (PF_KTHREAD | PF_EXITING))) {
+		ret = -EPERM;
+		goto error;
+	}
+
+	/* Use fget_raw instead of fget to support O_PATH, and it doesn't
+	 * have any sleepable code, so it's ok to be here.
+	 */
+	f = fget_raw(fd);
+	if (!f)
+		goto error;
+
+	/* For unmountable pseudo filesystems, getting a path is meaningless
+	 * because they don't have one, and there is no way to validate
+	 * that their d_dname() function pointer is always safe to call
+	 * in the current context.
+	 */
+	if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
+		ret = -EINVAL;
+		fput(f);
+		goto error;
+	}
+
+	/* After filtering out unmountable pseudo filesystems, d_path won't
+	 * call dentry->d_op->d_dname(). The normal path doesn't have any
+	 * sleepable code, and although it uses the current macro to get
+	 * fs_struct (current->fs), we've already ensured we're in user
+	 * context, so it's ok to be here.
+	 */
+	p = d_path(&f->f_path, dst, size);
+	if (IS_ERR(p)) {
+		ret = PTR_ERR(p);
+		fput(f);
+		goto error;
+	}
+
+	ret = strlen(p) + 1;
+	memmove(dst, p, ret);
+	fput(f);
+	return ret;
+
+error:
+	memset(dst, '0', size);
+	return ret;
+}
+
+static const struct bpf_func_proto bpf_get_fd_path_proto = {
+	.func       = bpf_get_fd_path,
+	.gpl_only   = true,
+	.ret_type   = RET_INTEGER,
+	.arg1_type  = ARG_PTR_TO_UNINIT_MEM,
+	.arg2_type  = ARG_CONST_SIZE,
+	.arg3_type  = ARG_ANYTHING,
+};
+
 static const struct bpf_func_proto *
 tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -953,6 +1018,8 @@ tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_stackid_proto_tp;
 	case BPF_FUNC_get_stack:
 		return &bpf_get_stack_proto_tp;
+	case BPF_FUNC_get_fd_path:
+		return &bpf_get_fd_path_proto;
 	default:
 		return tracing_func_proto(func_id, prog);
 	}
@@ -1146,6 +1213,8 @@ raw_tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_stackid_proto_raw_tp;
 	case BPF_FUNC_get_stack:
 		return &bpf_get_stack_proto_raw_tp;
+	case BPF_FUNC_get_fd_path:
+		return &bpf_get_fd_path_proto;
 	default:
 		return tracing_func_proto(func_id, prog);
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index dbbcf0b02970..c1e4fd286614 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2821,6 +2821,32 @@ union bpf_attr {
  * 	Return
  * 		On success, the strictly positive length of the string,	including
  * 		the trailing NUL character. On error, a negative value.
+ *
+ * int bpf_get_fd_path(char *path, u32 size, int fd)
+ *	Description
+ *		Get the **file** attribute from the current task by *fd*, then
+ *		call **d_path** to get its absolute path and copy it as a string
+ *		into *path* of *size*. Note that unmountable pseudo filesystems
+ *		are not supported, as they have no path (e.g. SOCKFS, PIPEFS).
+ *		The *size* must be strictly positive. On success, the helper
+ *		makes sure that the *path* is NUL-terminated, and the buffer
+ *		could be:
+ *		- a regular full path (including mountable fs, e.g. /proc, /sys)
+ *		- a regular full path with " (deleted)" appended.
+ *		On failure, it is filled with zeroes.
+ *	Return
+ *		On success, returns the length of the copied string INCLUDING
+ *		the trailing '\0'.
+ *
+ *		On failure, the returned value is one of the following:
+ *
+ *		**-EPERM** if no permission to get the path (eg: in irq ctx).
+ *
+ *		**-EBADF** if *fd* is invalid.
+ *
+ *		**-EINVAL** if *fd* corresponds to an unmountable pseudo fs
+ *
+ *		**-ENAMETOOLONG** if full path is longer than *size*
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2938,7 +2964,8 @@ union bpf_attr {
 	FN(probe_read_user),		\
 	FN(probe_read_kernel),		\
 	FN(probe_read_user_str),	\
-	FN(probe_read_kernel_str),
+	FN(probe_read_kernel_str),	\
+	FN(get_fd_path),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH bpf-next v13 2/2] selftests/bpf: test for bpf_get_fd_path() from tracepoint
  2019-12-17  9:47           ` [PATCH bpf-next v13 0/2] bpf: adding get_fd_path helper Wenbo Zhang
  2019-12-17  9:47             ` [PATCH bpf-next v13 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname Wenbo Zhang
@ 2019-12-17  9:47             ` Wenbo Zhang
  2019-12-17 16:32               ` Yonghong Song
  1 sibling, 1 reply; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-17  9:47 UTC (permalink / raw)
  To: bpf; +Cc: ast, daniel, yhs, bgregg, andrii.nakryiko, netdev

Trace fstat events via the tracepoint syscalls/sys_enter_newfstat, and handle
only the events produced by test_get_fd_path, which calls fstat on several
different types of files to exercise bpf_get_fd_path.

v5->v6: addressed Gregg and Yonghong's feedback
- rename to get_fd_path
- change sys_enter_newfstat_args's fd type to long to fix issue on
big-endian machines

v4->v5: addressed Andrii's feedback
- pass NULL for opts as bpf_object__open_file's PARAM2, as not really
using any
- modify patch subject to keep up with test code
- as this test is single-threaded, use getpid instead of SYS_gettid
- remove unnecessary parens around the check after if (i < 3)
- in kern use bpf_get_current_pid_tgid() >> 32 to match getpid() in the
userspace part
- send together with the patch adding the helper as one patch series

v3->v4: addressed Andrii's feedback
- use a set of fd instead of fds array
- use global variables instead of maps (in v3, I mistakenly thought that
the bpf maps are global variables.)
- remove unnecessary global variable path_info_index
- remove fd compare as the fstat's order is fixed

v2->v3: addressed Andrii's feedback
- use global data instead of perf_buffer to simplify the code

v1->v2: addressed Daniel's feedback
- rename bpf_fd2path to bpf_get_file_path to be consistent with other
helper's names

Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
---
 .../selftests/bpf/prog_tests/get_fd_path.c    | 171 ++++++++++++++++++
 .../selftests/bpf/progs/test_get_fd_path.c    |  43 +++++
 2 files changed, 214 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_fd_path.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_get_fd_path.c

diff --git a/tools/testing/selftests/bpf/prog_tests/get_fd_path.c b/tools/testing/selftests/bpf/prog_tests/get_fd_path.c
new file mode 100644
index 000000000000..5cf58028a8d2
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/get_fd_path.c
@@ -0,0 +1,171 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include <test_progs.h>
+#include <sys/stat.h>
+#include <linux/sched.h>
+#include <sys/syscall.h>
+
+#define MAX_PATH_LEN		128
+#define MAX_FDS			7
+#define MAX_EVENT_NUM		16
+
+static struct fd_path_test_data {
+	pid_t pid;
+	__u32 cnt;
+	__u32 fds[MAX_EVENT_NUM];
+	char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
+} src, dst;
+
+static int set_pathname(int fd)
+{
+	char buf[MAX_PATH_LEN];
+
+	snprintf(buf, MAX_PATH_LEN, "/proc/%d/fd/%d", src.pid, fd);
+	src.fds[src.cnt] = fd;
+	return readlink(buf, src.paths[src.cnt++], MAX_PATH_LEN);
+}
+
+static int trigger_fstat_events(pid_t pid)
+{
+	int pipefd[2] = { -1, -1 };
+	int sockfd = -1, procfd = -1, devfd = -1;
+	int localfd = -1, indicatorfd = -1;
+	struct stat fileStat;
+	int ret = -1;
+
+	/* unmountable pseudo-filesystems */
+	if (CHECK_FAIL(pipe(pipefd) < 0))
+		return ret;
+	/* unmountable pseudo-filesystems */
+	sockfd = socket(AF_INET, SOCK_STREAM, 0);
+	if (CHECK_FAIL(sockfd < 0))
+		goto out_close;
+	/* mountable pseudo-filesystems */
+	procfd = open("/proc/self/comm", O_RDONLY);
+	if (CHECK_FAIL(procfd < 0))
+		goto out_close;
+	devfd = open("/dev/urandom", O_RDONLY);
+	if (CHECK_FAIL(devfd < 0))
+		goto out_close;
+	localfd = open("/tmp/fd2path_loadgen.txt", O_CREAT | O_RDONLY, 0644);
+	if (CHECK_FAIL(localfd < 0))
+		goto out_close;
+	/* bpf_get_fd_path will return path with (deleted) */
+	remove("/tmp/fd2path_loadgen.txt");
+	indicatorfd = open("/tmp/", O_PATH);
+	if (CHECK_FAIL(indicatorfd < 0))
+		goto out_close;
+
+	src.pid = pid;
+
+	ret = set_pathname(pipefd[0]);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(pipefd[1]);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(sockfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(procfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(devfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(localfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(indicatorfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+
+	fstat(pipefd[0], &fileStat);
+	fstat(pipefd[1], &fileStat);
+	fstat(sockfd, &fileStat);
+	fstat(procfd, &fileStat);
+	fstat(devfd, &fileStat);
+	fstat(localfd, &fileStat);
+	fstat(indicatorfd, &fileStat);
+
+out_close:
+	close(indicatorfd);
+	close(localfd);
+	close(devfd);
+	close(procfd);
+	close(sockfd);
+	close(pipefd[1]);
+	close(pipefd[0]);
+
+	return ret;
+}
+
+void test_get_fd_path(void)
+{
+	const char *prog_name = "tracepoint/syscalls/sys_enter_newfstat";
+	const char *obj_file = "test_get_fd_path.o";
+	int err, results_map_fd, duration = 0;
+	struct bpf_program *tp_prog = NULL;
+	struct bpf_link *tp_link = NULL;
+	struct bpf_object *obj = NULL;
+	const int zero = 0;
+
+	obj = bpf_object__open_file(obj_file, NULL);
+	if (CHECK(IS_ERR(obj), "obj_open_file", "err %ld\n", PTR_ERR(obj)))
+		return;
+
+	tp_prog = bpf_object__find_program_by_title(obj, prog_name);
+	if (CHECK(!tp_prog, "find_tp",
+		  "prog '%s' not found\n", prog_name))
+		goto cleanup;
+
+	err = bpf_object__load(obj);
+	if (CHECK(err, "obj_load", "err %d\n", err))
+		goto cleanup;
+
+	results_map_fd = bpf_find_map(__func__, obj, "test_get.bss");
+	if (CHECK(results_map_fd < 0, "find_bss_map",
+		  "err %d\n", results_map_fd))
+		goto cleanup;
+
+	tp_link = bpf_program__attach_tracepoint(tp_prog, "syscalls",
+						 "sys_enter_newfstat");
+	if (CHECK(IS_ERR(tp_link), "attach_tp",
+		  "err %ld\n", PTR_ERR(tp_link))) {
+		tp_link = NULL;
+		goto cleanup;
+	}
+
+	dst.pid = getpid();
+	err = bpf_map_update_elem(results_map_fd, &zero, &dst, 0);
+	if (CHECK(err, "update_elem",
+		  "failed to set pid filter: %d\n", err))
+		goto cleanup;
+
+	err = trigger_fstat_events(dst.pid);
+	if (CHECK_FAIL(err < 0))
+		goto cleanup;
+
+	err = bpf_map_lookup_elem(results_map_fd, &zero, &dst);
+	if (CHECK(err, "get_results",
+		  "failed to get results: %d\n", err))
+		goto cleanup;
+
+	for (int i = 0; i < MAX_FDS; i++) {
+		if (i < 3)
+			CHECK((dst.paths[i][0] != '0'), "get_fd_path",
+			      "failed to filter fs [%d]: %u(%s) vs %u(%s)\n",
+			      i, src.fds[i], src.paths[i], dst.fds[i],
+			      dst.paths[i]);
+		else
+			CHECK(strncmp(src.paths[i], dst.paths[i], MAX_PATH_LEN),
+			      "get_fd_path",
+			      "failed to get path[%d]: %u(%s) vs %u(%s)\n",
+			      i, src.fds[i], src.paths[i], dst.fds[i],
+			      dst.paths[i]);
+	}
+
+cleanup:
+	bpf_link__destroy(tp_link);
+	bpf_object__close(obj);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_get_fd_path.c b/tools/testing/selftests/bpf/progs/test_get_fd_path.c
new file mode 100644
index 000000000000..8bb58f87755e
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_get_fd_path.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <linux/ptrace.h>
+#include <string.h>
+#include <unistd.h>
+#include "bpf_helpers.h"
+#include "bpf_tracing.h"
+
+#define MAX_PATH_LEN		128
+#define MAX_EVENT_NUM		16
+
+static struct fd_path_test_data {
+	pid_t pid;
+	__u32 cnt;
+	__u32 fds[MAX_EVENT_NUM];
+	char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
+} data;
+
+struct sys_enter_newfstat_args {
+	unsigned long long pad1;
+	unsigned long long pad2;
+	unsigned long fd;
+};
+
+SEC("tracepoint/syscalls/sys_enter_newfstat")
+int bpf_prog(struct sys_enter_newfstat_args *args)
+{
+	pid_t pid = bpf_get_current_pid_tgid() >> 32;
+
+	if (pid != data.pid)
+		return 0;
+	if (data.cnt >= MAX_EVENT_NUM)
+		return 0;
+
+	data.fds[data.cnt] = args->fd;
+	bpf_get_fd_path(data.paths[data.cnt], MAX_PATH_LEN, args->fd);
+	data.cnt++;
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v13 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2019-12-17  9:47             ` [PATCH bpf-next v13 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname Wenbo Zhang
@ 2019-12-17 16:29               ` Yonghong Song
  2019-12-17 19:39                 ` Daniel Borkmann
  2019-12-18  0:06                 ` Wenbo Zhang
  2019-12-18  0:56               ` [PATCH bpf-next v14 0/2] bpf: adding get_fd_path helper Wenbo Zhang
  1 sibling, 2 replies; 52+ messages in thread
From: Yonghong Song @ 2019-12-17 16:29 UTC (permalink / raw)
  To: Wenbo Zhang, bpf; +Cc: ast, daniel, bgregg, andrii.nakryiko, netdev



On 12/17/19 1:47 AM, Wenbo Zhang wrote:
> When people want to identify which file system files are being opened,
> read, and written to, they can use this helper with file descriptor as
> input to achieve this goal. Other pseudo filesystems are also supported.
> 
> This requirement is mainly discussed here:
> 
>    https://github.com/iovisor/bcc/issues/237
> 
> v12->v13: addressed Gregg and Yonghong's feedback
> - rename to get_fd_path
> - refactor code & comment to be clearer and more compliant
> 
> v11->v12: addressed Alexei's feedback
> - only allow tracepoints to make sure it won't dead lock
> 
> v10->v11: addressed Al and Alexei's feedback
> - fix missing fput()
> 
> v9->v10: addressed Andrii's feedback
> - send this patch together with the patch selftests as one patch series
> 
> v8->v9:
> - format helper description
> 
> v7->v8: addressed Alexei's feedback
> - use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
> - ensure we're in user context which is safe fot the help to run
> - filter unmountable pseudo filesystem, because they don't have real path
> - supplement the description of this helper function
> 
> v6->v7:
> - fix missing signed-off-by line
> 
> v5->v6: addressed Andrii's feedback
> - avoid unnecessary goto end by having two explicit returns
> 
> v4->v5: addressed Andrii and Daniel's feedback
> - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> helper's names
> - when fdget_raw fails, set ret to -EBADF instead of -EINVAL
> - remove fdput from fdget_raw's error path
> - use IS_ERR instead of IS_ERR_OR_NULL as d_path ether returns a pointer
> into the buffer or an error code if the path was too long
> - modify the normal path's return value to return copied string length
> including NUL
> - update this helper description's Return bits.
> 
> v3->v4: addressed Daniel's feedback
> - fix missing fdput()
> - move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
> - move fd2path's test code to another patch
> - add comment to explain why use fdget_raw instead of fdget
> 
> v2->v3: addressed Yonghong's feedback
> - remove unnecessary LOCKDOWN_BPF_READ
> - refactor error handling section for enhanced readability
> - provide a test case in tools/testing/selftests/bpf
> 
> v1->v2: addressed Daniel's feedback
> - fix backward compatibility
> - add this helper description
> - fix signed-off name
> 
> Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>

Ack, with one minor issue remaining below. I am not sure whether another
revision is needed or whether the maintainer can just fix it up before merging.

Acked-by: Yonghong Song <yhs@fb.com>


> ---
>   include/uapi/linux/bpf.h       | 29 +++++++++++++-
>   kernel/trace/bpf_trace.c       | 69 ++++++++++++++++++++++++++++++++++
>   tools/include/uapi/linux/bpf.h | 29 +++++++++++++-
>   3 files changed, 125 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index dbbcf0b02970..c1e4fd286614 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2821,6 +2821,32 @@ union bpf_attr {
>    * 	Return
>    * 		On success, the strictly positive length of the string,	including
>    * 		the trailing NUL character. On error, a negative value.
> + *
> + * int bpf_get_fd_path(char *path, u32 size, int fd)
> + *	Description
> + *		Get **file** atrribute from the current task by *fd*, then call
> + *		**d_path** to get it's absolute path and copy it as string into
> + *		*path* of *size*. Notice the **path** don't support unmountable
> + *		pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
> + *		The *size* must be strictly positive. On success, the helper
> + *		makes sure that the *path* is NUL-terminated, and the buffer
> + *		could be:
> + *		- a regular full path (include mountable fs eg: /proc, /sys)
> + *		- a regular full path with "(deleted)" is appended.

Sorry for being a little pedantic. The d_path() function comment says:
  * Convert a dentry into an ASCII path name. If the entry has been deleted
  * the string " (deleted)" is appended. Note that this is ambiguous.

Note that there is a space before "(deleted)". I would like the above
changed to
    - a regular full path with " (deleted)" appended.
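
To illustrate why the leading space matters to a consumer of the helper's
output, here is a minimal user-space sketch. The name strip_deleted_suffix()
is hypothetical and not part of this patch; it only assumes the suffix is
exactly " (deleted)", as the d_path() comment states.

```c
#include <stdbool.h>
#include <string.h>

/* Suffix appended by d_path() for an unlinked dentry, leading space included. */
#define DELETED_SUFFIX " (deleted)"

/*
 * Hypothetical helper: if path ends with " (deleted)", cut the suffix off
 * in place and return true; otherwise leave path untouched and return false.
 * A live file whose name really ends in " (deleted)" cannot be told apart,
 * which is the ambiguity the d_path() comment warns about.
 */
static bool strip_deleted_suffix(char *path)
{
	size_t len = strlen(path);
	size_t slen = strlen(DELETED_SUFFIX);

	if (len < slen || strcmp(path + len - slen, DELETED_SUFFIX) != 0)
		return false;

	path[len - slen] = '\0';
	return true;
}
```

A matcher written against "(deleted)" without the space would leave a
trailing blank on the stripped path, so the documentation should show the
suffix exactly as d_path() emits it.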

> + *		On failure, it is filled with zeroes.
> + *	Return
> + *		On success, returns the length of the copied string INCLUDING
> + *		the trailing '\0'.
> + *
> + *		On failure, the returned value is one of the following:
> + *
> + *		**-EPERM** if no permission to get the path (eg: in irq ctx).
> + *
> + *		**-EBADF** if *fd* is invalid.
> + *
> + *		**-EINVAL** if *fd* corresponds to a unmountable pseudo fs
> + *
> + *		**-ENAMETOOLONG** if full path is longer than *size*
>    */
>   #define __BPF_FUNC_MAPPER(FN)		\
>   	FN(unspec),			\
> @@ -2938,7 +2964,8 @@ union bpf_attr {
>   	FN(probe_read_user),		\
>   	FN(probe_read_kernel),		\
>   	FN(probe_read_user_str),	\
> -	FN(probe_read_kernel_str),
> +	FN(probe_read_kernel_str),	\
> +	FN(get_fd_path),
>   
>   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>    * function eBPF program intends to call
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index e5ef4ae9edb5..43a6aa6ad967 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -762,6 +762,71 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
>   	.arg1_type	= ARG_ANYTHING,
>   };
>   
> +BPF_CALL_3(bpf_get_fd_path, char *, dst, u32, size, int, fd)
> +{
> +	int ret = -EBADF;
> +	struct file *f;
> +	char *p;
> +
> +	/* Ensure we're in user context which is safe for the helper to
> +	 * run. This helper has no business in a kthread.
> +	 */
> +	if (unlikely(in_interrupt() ||
> +		     current->flags & (PF_KTHREAD | PF_EXITING))) {
> +		ret = -EPERM;
> +		goto error;
> +	}
> +
> +	/* Use fget_raw instead of fget to support O_PATH, and it doesn't
> +	 * have any sleepable code, so it's ok to be here.
> +	 */
> +	f = fget_raw(fd);
> +	if (!f)
> +		goto error;
> +
> +	/* For unmountable pseudo filesystem, it seems to have no meaning
> +	 * to get their fake paths as they don't have path, and to be no
> +	 * way to validate this function pointer can be always safe to call
> +	 * in the current context.
> +	 */
> +	if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
> +		ret = -EINVAL;
> +		fput(f);
> +		goto error;
> +	}
> +
> +	/* After filter unmountable pseudo filesytem, d_path won't call
> +	 * dentry->d_op->d_name(), the normally path doesn't have any
> +	 * sleepable code, and despite it uses the current macro to get
> +	 * fs_struct (current->fs), we've already ensured we're in user
> +	 * context, so it's ok to be here.
> +	 */
> +	p = d_path(&f->f_path, dst, size);
> +	if (IS_ERR(p)) {
> +		ret = PTR_ERR(p);
> +		fput(f);
> +		goto error;
> +	}
> +
> +	ret = strlen(p) + 1;
> +	memmove(dst, p, ret);
> +	fput(f);
> +	return ret;
> +
> +error:
> +	memset(dst, '0', size);
> +	return ret;
> +}
> +
> +static const struct bpf_func_proto bpf_get_fd_path_proto = {
> +	.func       = bpf_get_fd_path,
> +	.gpl_only   = true,
> +	.ret_type   = RET_INTEGER,
> +	.arg1_type  = ARG_PTR_TO_UNINIT_MEM,
> +	.arg2_type  = ARG_CONST_SIZE,
> +	.arg3_type  = ARG_ANYTHING,
> +};
> +
>   static const struct bpf_func_proto *
>   tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>   {
> @@ -953,6 +1018,8 @@ tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>   		return &bpf_get_stackid_proto_tp;
>   	case BPF_FUNC_get_stack:
>   		return &bpf_get_stack_proto_tp;
> +	case BPF_FUNC_get_fd_path:
> +		return &bpf_get_fd_path_proto;
>   	default:
>   		return tracing_func_proto(func_id, prog);
>   	}
> @@ -1146,6 +1213,8 @@ raw_tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>   		return &bpf_get_stackid_proto_raw_tp;
>   	case BPF_FUNC_get_stack:
>   		return &bpf_get_stack_proto_raw_tp;
> +	case BPF_FUNC_get_fd_path:
> +		return &bpf_get_fd_path_proto;
>   	default:
>   		return tracing_func_proto(func_id, prog);
>   	}
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index dbbcf0b02970..c1e4fd286614 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -2821,6 +2821,32 @@ union bpf_attr {
>    * 	Return
>    * 		On success, the strictly positive length of the string,	including
>    * 		the trailing NUL character. On error, a negative value.
> + *
> + * int bpf_get_fd_path(char *path, u32 size, int fd)
> + *	Description
> + *		Get **file** atrribute from the current task by *fd*, then call
> + *		**d_path** to get it's absolute path and copy it as string into
> + *		*path* of *size*. Notice the **path** don't support unmountable
> + *		pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
> + *		The *size* must be strictly positive. On success, the helper
> + *		makes sure that the *path* is NUL-terminated, and the buffer
> + *		could be:
> + *		- a regular full path (include mountable fs eg: /proc, /sys)
> + *		- a regular full path with "(deleted)" is appended.
> + *		On failure, it is filled with zeroes.
> + *	Return
> + *		On success, returns the length of the copied string INCLUDING
> + *		the trailing '\0'.
> + *
> + *		On failure, the returned value is one of the following:
> + *
> + *		**-EPERM** if no permission to get the path (eg: in irq ctx).
> + *
> + *		**-EBADF** if *fd* is invalid.
> + *
> + *		**-EINVAL** if *fd* corresponds to a unmountable pseudo fs
> + *
> + *		**-ENAMETOOLONG** if full path is longer than *size*
>    */
>   #define __BPF_FUNC_MAPPER(FN)		\
>   	FN(unspec),			\
> @@ -2938,7 +2964,8 @@ union bpf_attr {
>   	FN(probe_read_user),		\
>   	FN(probe_read_kernel),		\
>   	FN(probe_read_user_str),	\
> -	FN(probe_read_kernel_str),
> +	FN(probe_read_kernel_str),	\
> +	FN(get_fd_path),
>   
>   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>    * function eBPF program intends to call
> 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v13 2/2] selftests/bpf: test for bpf_get_fd_path() from tracepoint
  2019-12-17  9:47             ` [PATCH bpf-next v13 " Wenbo Zhang
@ 2019-12-17 16:32               ` Yonghong Song
  0 siblings, 0 replies; 52+ messages in thread
From: Yonghong Song @ 2019-12-17 16:32 UTC (permalink / raw)
  To: Wenbo Zhang, bpf; +Cc: ast, daniel, bgregg, andrii.nakryiko, netdev



On 12/17/19 1:47 AM, Wenbo Zhang wrote:
> trace fstat events by tracepoint syscalls/sys_enter_newfstat, and handle
> events only produced by test_file_fd_path, which call fstat on several
> different types of files to test bpf_fd_file_path's feature.
> 
> v5->v6: addressed Gregg and Yonghong's feedback
> - rename to get_fd_path
> - change sys_enter_newfstat_args's fd type to long to fix issue on
> big-endian machines
> 
> v4->v5: addressed Andrii's feedback
> - pass NULL for opts as bpf_object__open_file's PARAM2, as not really
> using any
> - modify patch subject to keep up with test code
> - as this test is single-threaded, so use getpid instead of SYS_gettid
> - remove unnecessary parens around check which after if (i < 3)
> - in kern use bpf_get_current_pid_tgid() >> 32 to fit getpid() in
> userspace part
> - with the patch adding helper as one patch series
> 
> v3->v4: addressed Andrii's feedback
> - use a set of fd instead of fds array
> - use global variables instead of maps (in v3, I mistakenly thought that
> the bpf maps are global variables.)
> - remove uncessary global variable path_info_index
> - remove fd compare as the fstat's order is fixed
> 
> v2->v3: addressed Andrii's feedback
> - use global data instead of perf_buffer to simplified code
> 
> v1->v2: addressed Daniel's feedback
> - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> helper's names
> 
> Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>

Acked-by: Yonghong Song <yhs@fb.com>
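
The changelog item about using bpf_get_current_pid_tgid() >> 32 to match
getpid() can be sketched as follows. The value returned by the helper holds
the tgid in the upper 32 bits and the kernel pid (the thread id) in the lower
32 bits, and getpid() in user space returns the tgid. The function names
tgid_of() and tid_of() are illustrative only and do not appear in the patch.

```c
#include <stdint.h>

/* Upper 32 bits: thread group id, i.e. what getpid() reports in user space. */
static uint32_t tgid_of(uint64_t pid_tgid)
{
	return (uint32_t)(pid_tgid >> 32);
}

/* Lower 32 bits: kernel pid, i.e. the thread id that gettid() reports. */
static uint32_t tid_of(uint64_t pid_tgid)
{
	return (uint32_t)pid_tgid;
}
```

For a single-threaded test both halves are equal, but filtering on the upper
half keeps the check correct if the test ever spawns threads.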

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v13 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2019-12-17 16:29               ` Yonghong Song
@ 2019-12-17 19:39                 ` Daniel Borkmann
  2019-12-18  0:11                   ` Wenbo Zhang
  2019-12-18  0:06                 ` Wenbo Zhang
  1 sibling, 1 reply; 52+ messages in thread
From: Daniel Borkmann @ 2019-12-17 19:39 UTC (permalink / raw)
  To: Yonghong Song, Wenbo Zhang, bpf; +Cc: ast, bgregg, andrii.nakryiko, netdev

On 12/17/19 5:29 PM, Yonghong Song wrote:
> On 12/17/19 1:47 AM, Wenbo Zhang wrote:
[...]
>> + *		On failure, it is filled with zeroes.
[...]
>>     */
>>    #define __BPF_FUNC_MAPPER(FN)		\
>>    	FN(unspec),			\
>> @@ -2938,7 +2964,8 @@ union bpf_attr {
>>    	FN(probe_read_user),		\
>>    	FN(probe_read_kernel),		\
>>    	FN(probe_read_user_str),	\
>> -	FN(probe_read_kernel_str),
>> +	FN(probe_read_kernel_str),	\
>> +	FN(get_fd_path),
>>    
>>    /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>>     * function eBPF program intends to call
>> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
>> index e5ef4ae9edb5..43a6aa6ad967 100644
>> --- a/kernel/trace/bpf_trace.c
>> +++ b/kernel/trace/bpf_trace.c
>> @@ -762,6 +762,71 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
>>    	.arg1_type	= ARG_ANYTHING,
>>    };
>>    
>> +BPF_CALL_3(bpf_get_fd_path, char *, dst, u32, size, int, fd)
>> +{
>> +	int ret = -EBADF;
>> +	struct file *f;
>> +	char *p;
>> +
>> +	/* Ensure we're in user context which is safe for the helper to
>> +	 * run. This helper has no business in a kthread.
>> +	 */
>> +	if (unlikely(in_interrupt() ||
>> +		     current->flags & (PF_KTHREAD | PF_EXITING))) {
>> +		ret = -EPERM;
>> +		goto error;
>> +	}
>> +
>> +	/* Use fget_raw instead of fget to support O_PATH, and it doesn't
>> +	 * have any sleepable code, so it's ok to be here.
>> +	 */
>> +	f = fget_raw(fd);
>> +	if (!f)
>> +		goto error;
>> +
>> +	/* For unmountable pseudo filesystem, it seems to have no meaning
>> +	 * to get their fake paths as they don't have path, and to be no
>> +	 * way to validate this function pointer can be always safe to call
>> +	 * in the current context.
>> +	 */
>> +	if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
>> +		ret = -EINVAL;
>> +		fput(f);
>> +		goto error;
>> +	}
>> +
>> +	/* After filter unmountable pseudo filesytem, d_path won't call
>> +	 * dentry->d_op->d_name(), the normally path doesn't have any
>> +	 * sleepable code, and despite it uses the current macro to get
>> +	 * fs_struct (current->fs), we've already ensured we're in user
>> +	 * context, so it's ok to be here.
>> +	 */
>> +	p = d_path(&f->f_path, dst, size);
>> +	if (IS_ERR(p)) {
>> +		ret = PTR_ERR(p);
>> +		fput(f);
>> +		goto error;
>> +	}
>> +
>> +	ret = strlen(p) + 1;
>> +	memmove(dst, p, ret);
>> +	fput(f);
>> +	return ret;
>> +
>> +error:
>> +	memset(dst, '0', size);

You fill it with 0x30's ...

>> +	return ret;
>> +}
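
To spell out the mismatch between the documentation ("filled with zeroes")
and the error path quoted above, here is a small user-space sketch. The
function names fill_ascii_zero() and fill_nul() are made up for illustration;
only the two memset() calls matter.

```c
#include <stddef.h>
#include <string.h>

/* What the quoted error path does: every byte becomes the digit '0' (0x30). */
static void fill_ascii_zero(char *dst, size_t size)
{
	memset(dst, '0', size);
}

/* What the helper description promises: every byte becomes NUL (0x00). */
static void fill_nul(char *dst, size_t size)
{
	memset(dst, 0, size);
}
```

With the first variant the buffer is not even NUL-terminated, so a caller that
treats it as a string on failure would read past the end; the second variant
yields an empty string.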

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v13 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2019-12-17 16:29               ` Yonghong Song
  2019-12-17 19:39                 ` Daniel Borkmann
@ 2019-12-18  0:06                 ` Wenbo Zhang
  1 sibling, 0 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-18  0:06 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf, ast, daniel, bgregg, andrii.nakryiko, netdev

> Sorry about a little pedantic. In d_path() function comments, we have:
> * Convert a dentry into an ASCII path name. If the entry has been deleted
>  * the string " (deleted)" is appended. Note that this is ambiguous.

> Note that there is a space before "(deleted)". I would like to the above
> changed to
> - a regular full path with " (deleted)" is appended.

Ah, sorry about this. I should have been more precise. Thanks again for your
precision and patience. I'll submit another revision to fix this.

On Wed, Dec 18, 2019 at 12:29 AM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 12/17/19 1:47 AM, Wenbo Zhang wrote:
> > When people want to identify which file system files are being opened,
> > read, and written to, they can use this helper with file descriptor as
> > input to achieve this goal. Other pseudo filesystems are also supported.
> >
> > This requirement is mainly discussed here:
> >
> >    https://github.com/iovisor/bcc/issues/237
> >
> > v12->v13: addressed Gregg and Yonghong's feedback
> > - rename to get_fd_path
> > - refactor code & comment to be clearer and more compliant
> >
> > v11->v12: addressed Alexei's feedback
> > - only allow tracepoints to make sure it won't dead lock
> >
> > v10->v11: addressed Al and Alexei's feedback
> > - fix missing fput()
> >
> > v9->v10: addressed Andrii's feedback
> > - send this patch together with the patch selftests as one patch series
> >
> > v8->v9:
> > - format helper description
> >
> > v7->v8: addressed Alexei's feedback
> > - use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
> > - ensure we're in user context which is safe fot the help to run
> > - filter unmountable pseudo filesystem, because they don't have real path
> > - supplement the description of this helper function
> >
> > v6->v7:
> > - fix missing signed-off-by line
> >
> > v5->v6: addressed Andrii's feedback
> > - avoid unnecessary goto end by having two explicit returns
> >
> > v4->v5: addressed Andrii and Daniel's feedback
> > - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> > helper's names
> > - when fdget_raw fails, set ret to -EBADF instead of -EINVAL
> > - remove fdput from fdget_raw's error path
> > - use IS_ERR instead of IS_ERR_OR_NULL as d_path ether returns a pointer
> > into the buffer or an error code if the path was too long
> > - modify the normal path's return value to return copied string length
> > including NUL
> > - update this helper description's Return bits.
> >
> > v3->v4: addressed Daniel's feedback
> > - fix missing fdput()
> > - move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
> > - move fd2path's test code to another patch
> > - add comment to explain why use fdget_raw instead of fdget
> >
> > v2->v3: addressed Yonghong's feedback
> > - remove unnecessary LOCKDOWN_BPF_READ
> > - refactor error handling section for enhanced readability
> > - provide a test case in tools/testing/selftests/bpf
> >
> > v1->v2: addressed Daniel's feedback
> > - fix backward compatibility
> > - add this helper description
> > - fix signed-off name
> >
> > Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
>
> Ack with still the minor issue below, not sure whether another revision
> will be needed or not or the maintainer can just fix up before merging.
>
> Acked-by: Yonghong Song <yhs@fb.com>
>
>
> > ---
> >   include/uapi/linux/bpf.h       | 29 +++++++++++++-
> >   kernel/trace/bpf_trace.c       | 69 ++++++++++++++++++++++++++++++++++
> >   tools/include/uapi/linux/bpf.h | 29 +++++++++++++-
> >   3 files changed, 125 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index dbbcf0b02970..c1e4fd286614 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -2821,6 +2821,32 @@ union bpf_attr {
> >    *  Return
> >    *          On success, the strictly positive length of the string, including
> >    *          the trailing NUL character. On error, a negative value.
> > + *
> > + * int bpf_get_fd_path(char *path, u32 size, int fd)
> > + *   Description
> > + *           Get **file** atrribute from the current task by *fd*, then call
> > + *           **d_path** to get it's absolute path and copy it as string into
> > + *           *path* of *size*. Notice the **path** don't support unmountable
> > + *           pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
> > + *           The *size* must be strictly positive. On success, the helper
> > + *           makes sure that the *path* is NUL-terminated, and the buffer
> > + *           could be:
> > + *           - a regular full path (include mountable fs eg: /proc, /sys)
> > + *           - a regular full path with "(deleted)" is appended.
>
> Sorry about a little pedantic. In d_path() function comments, we have:
>   * Convert a dentry into an ASCII path name. If the entry has been deleted
>   * the string " (deleted)" is appended. Note that this is ambiguous.
>
> Note that there is a space before "(deleted)". I would like to the above
> changed to
>     - a regular full path with " (deleted)" is appended.
>
> > + *           On failure, it is filled with zeroes.
> > + *   Return
> > + *           On success, returns the length of the copied string INCLUDING
> > + *           the trailing '\0'.
> > + *
> > + *           On failure, the returned value is one of the following:
> > + *
> > + *           **-EPERM** if no permission to get the path (eg: in irq ctx).
> > + *
> > + *           **-EBADF** if *fd* is invalid.
> > + *
> > + *           **-EINVAL** if *fd* corresponds to a unmountable pseudo fs
> > + *
> > + *           **-ENAMETOOLONG** if full path is longer than *size*
> >    */
> >   #define __BPF_FUNC_MAPPER(FN)               \
> >       FN(unspec),                     \
> > @@ -2938,7 +2964,8 @@ union bpf_attr {
> >       FN(probe_read_user),            \
> >       FN(probe_read_kernel),          \
> >       FN(probe_read_user_str),        \
> > -     FN(probe_read_kernel_str),
> > +     FN(probe_read_kernel_str),      \
> > +     FN(get_fd_path),
> >
> >   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> >    * function eBPF program intends to call
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index e5ef4ae9edb5..43a6aa6ad967 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -762,6 +762,71 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
> >       .arg1_type      = ARG_ANYTHING,
> >   };
> >
> > +BPF_CALL_3(bpf_get_fd_path, char *, dst, u32, size, int, fd)
> > +{
> > +     int ret = -EBADF;
> > +     struct file *f;
> > +     char *p;
> > +
> > +     /* Ensure we're in user context which is safe for the helper to
> > +      * run. This helper has no business in a kthread.
> > +      */
> > +     if (unlikely(in_interrupt() ||
> > +                  current->flags & (PF_KTHREAD | PF_EXITING))) {
> > +             ret = -EPERM;
> > +             goto error;
> > +     }
> > +
> > +     /* Use fget_raw instead of fget to support O_PATH, and it doesn't
> > +      * have any sleepable code, so it's ok to be here.
> > +      */
> > +     f = fget_raw(fd);
> > +     if (!f)
> > +             goto error;
> > +
> > +     /* For unmountable pseudo filesystem, it seems to have no meaning
> > +      * to get their fake paths as they don't have path, and to be no
> > +      * way to validate this function pointer can be always safe to call
> > +      * in the current context.
> > +      */
> > +     if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
> > +             ret = -EINVAL;
> > +             fput(f);
> > +             goto error;
> > +     }
> > +
> > +     /* After filter unmountable pseudo filesytem, d_path won't call
> > +      * dentry->d_op->d_name(), the normally path doesn't have any
> > +      * sleepable code, and despite it uses the current macro to get
> > +      * fs_struct (current->fs), we've already ensured we're in user
> > +      * context, so it's ok to be here.
> > +      */
> > +     p = d_path(&f->f_path, dst, size);
> > +     if (IS_ERR(p)) {
> > +             ret = PTR_ERR(p);
> > +             fput(f);
> > +             goto error;
> > +     }
> > +
> > +     ret = strlen(p) + 1;
> > +     memmove(dst, p, ret);
> > +     fput(f);
> > +     return ret;
> > +
> > +error:
> > +     memset(dst, '0', size);
> > +     return ret;
> > +}
> > +
> > +static const struct bpf_func_proto bpf_get_fd_path_proto = {
> > +     .func       = bpf_get_fd_path,
> > +     .gpl_only   = true,
> > +     .ret_type   = RET_INTEGER,
> > +     .arg1_type  = ARG_PTR_TO_UNINIT_MEM,
> > +     .arg2_type  = ARG_CONST_SIZE,
> > +     .arg3_type  = ARG_ANYTHING,
> > +};
> > +
> >   static const struct bpf_func_proto *
> >   tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> >   {
> > @@ -953,6 +1018,8 @@ tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> >               return &bpf_get_stackid_proto_tp;
> >       case BPF_FUNC_get_stack:
> >               return &bpf_get_stack_proto_tp;
> > +     case BPF_FUNC_get_fd_path:
> > +             return &bpf_get_fd_path_proto;
> >       default:
> >               return tracing_func_proto(func_id, prog);
> >       }
> > @@ -1146,6 +1213,8 @@ raw_tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> >               return &bpf_get_stackid_proto_raw_tp;
> >       case BPF_FUNC_get_stack:
> >               return &bpf_get_stack_proto_raw_tp;
> > +     case BPF_FUNC_get_fd_path:
> > +             return &bpf_get_fd_path_proto;
> >       default:
> >               return tracing_func_proto(func_id, prog);
> >       }
> > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> > index dbbcf0b02970..c1e4fd286614 100644
> > --- a/tools/include/uapi/linux/bpf.h
> > +++ b/tools/include/uapi/linux/bpf.h
> > @@ -2821,6 +2821,32 @@ union bpf_attr {
> >    *  Return
> >    *          On success, the strictly positive length of the string, including
> >    *          the trailing NUL character. On error, a negative value.
> > + *
> > + * int bpf_get_fd_path(char *path, u32 size, int fd)
> > + *   Description
> > + *           Get **file** atrribute from the current task by *fd*, then call
> > + *           **d_path** to get it's absolute path and copy it as string into
> > + *           *path* of *size*. Notice the **path** don't support unmountable
> > + *           pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
> > + *           The *size* must be strictly positive. On success, the helper
> > + *           makes sure that the *path* is NUL-terminated, and the buffer
> > + *           could be:
> > + *           - a regular full path (include mountable fs eg: /proc, /sys)
> > + *           - a regular full path with "(deleted)" is appended.
> > + *           On failure, it is filled with zeroes.
> > + *   Return
> > + *           On success, returns the length of the copied string INCLUDING
> > + *           the trailing '\0'.
> > + *
> > + *           On failure, the returned value is one of the following:
> > + *
> > + *           **-EPERM** if no permission to get the path (eg: in irq ctx).
> > + *
> > + *           **-EBADF** if *fd* is invalid.
> > + *
> > + *           **-EINVAL** if *fd* corresponds to a unmountable pseudo fs
> > + *
> > + *           **-ENAMETOOLONG** if full path is longer than *size*
> >    */
> >   #define __BPF_FUNC_MAPPER(FN)               \
> >       FN(unspec),                     \
> > @@ -2938,7 +2964,8 @@ union bpf_attr {
> >       FN(probe_read_user),            \
> >       FN(probe_read_kernel),          \
> >       FN(probe_read_user_str),        \
> > -     FN(probe_read_kernel_str),
> > +     FN(probe_read_kernel_str),      \
> > +     FN(get_fd_path),
> >
> >   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> >    * function eBPF program intends to call
> >

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v13 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2019-12-17 19:39                 ` Daniel Borkmann
@ 2019-12-18  0:11                   ` Wenbo Zhang
  0 siblings, 0 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-18  0:11 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Yonghong Song, bpf, ast, bgregg, andrii.nakryiko, netdev

> [...]
>>> + *          On failure, it is filled with zeroes.
> [...]
> You fill it with 0x30's ...

Sorry about this. I'll submit another revision to fix it. Thanks again for
your precision and patience.

On Wed, Dec 18, 2019 at 3:39 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 12/17/19 5:29 PM, Yonghong Song wrote:
> > On 12/17/19 1:47 AM, Wenbo Zhang wrote:
> [...]
> >> + *          On failure, it is filled with zeroes.
> [...]
> >>     */
> >>    #define __BPF_FUNC_MAPPER(FN)             \
> >>      FN(unspec),                     \
> >> @@ -2938,7 +2964,8 @@ union bpf_attr {
> >>      FN(probe_read_user),            \
> >>      FN(probe_read_kernel),          \
> >>      FN(probe_read_user_str),        \
> >> -    FN(probe_read_kernel_str),
> >> +    FN(probe_read_kernel_str),      \
> >> +    FN(get_fd_path),
> >>
> >>    /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> >>     * function eBPF program intends to call
> >> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> >> index e5ef4ae9edb5..43a6aa6ad967 100644
> >> --- a/kernel/trace/bpf_trace.c
> >> +++ b/kernel/trace/bpf_trace.c
> >> @@ -762,6 +762,71 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
> >>      .arg1_type      = ARG_ANYTHING,
> >>    };
> >>
> >> +BPF_CALL_3(bpf_get_fd_path, char *, dst, u32, size, int, fd)
> >> +{
> >> +    int ret = -EBADF;
> >> +    struct file *f;
> >> +    char *p;
> >> +
> >> +    /* Ensure we're in user context which is safe for the helper to
> >> +     * run. This helper has no business in a kthread.
> >> +     */
> >> +    if (unlikely(in_interrupt() ||
> >> +                 current->flags & (PF_KTHREAD | PF_EXITING))) {
> >> +            ret = -EPERM;
> >> +            goto error;
> >> +    }
> >> +
> >> +    /* Use fget_raw instead of fget to support O_PATH, and it doesn't
> >> +     * have any sleepable code, so it's ok to be here.
> >> +     */
> >> +    f = fget_raw(fd);
> >> +    if (!f)
> >> +            goto error;
> >> +
> >> +    /* For unmountable pseudo filesystem, it seems to have no meaning
> >> +     * to get their fake paths as they don't have path, and to be no
> >> +     * way to validate this function pointer can be always safe to call
> >> +     * in the current context.
> >> +     */
> >> +    if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
> >> +            ret = -EINVAL;
> >> +            fput(f);
> >> +            goto error;
> >> +    }
> >> +
> >> +    /* After filter unmountable pseudo filesytem, d_path won't call
> >> +     * dentry->d_op->d_name(), the normally path doesn't have any
> >> +     * sleepable code, and despite it uses the current macro to get
> >> +     * fs_struct (current->fs), we've already ensured we're in user
> >> +     * context, so it's ok to be here.
> >> +     */
> >> +    p = d_path(&f->f_path, dst, size);
> >> +    if (IS_ERR(p)) {
> >> +            ret = PTR_ERR(p);
> >> +            fput(f);
> >> +            goto error;
> >> +    }
> >> +
> >> +    ret = strlen(p) + 1;
> >> +    memmove(dst, p, ret);
> >> +    fput(f);
> >> +    return ret;
> >> +
> >> +error:
> >> +    memset(dst, '0', size);
>
> You fill it with 0x30's ...
>
> >> +    return ret;
> >> +}


* [PATCH bpf-next v14 0/2] bpf: adding get_fd_path helper
  2019-12-17  9:47             ` [PATCH bpf-next v13 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname Wenbo Zhang
  2019-12-17 16:29               ` Yonghong Song
@ 2019-12-18  0:56               ` Wenbo Zhang
  2019-12-18  0:56                 ` [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname Wenbo Zhang
  2019-12-18  0:56                 ` [PATCH bpf-next v14 2/2] selftests/bpf: test for bpf_get_fd_path() from tracepoint Wenbo Zhang
  1 sibling, 2 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-18  0:56 UTC (permalink / raw)
  To: bpf; +Cc: ast, daniel, yhs, bgregg, andrii.nakryiko, netdev

This patch series introduces a bpf helper that can be used to map a file
descriptor to a pathname.

This requirement is mainly discussed here:

  https://github.com/iovisor/bcc/issues/237

This implementation supports both local and mountable pseudo file systems,
and ensures the helper runs only in user context, where it is safe to run.

Changes since v13:

* Fix this helper's description to be consistent with comments in d_path

* Fix error handling to fill the buffer with zero bytes, not '0' characters


Changes since v12:

* Rename to get_fd_path

* Fix test issue on big-endian machines


Changes since v11:

* Only allow tracepoints to make sure it won't deadlock


Changes since v10:

* Fix missing fput


Changes since v9:

* Send the helper patch together with its selftests patch as one series

* Refactor selftests code for further simplification


Changes since v8:

* Format helper description 
 

Changes since v7:

* Use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/

* Ensure we're in user context, which is safe for the helper to run

* Filter out unmountable pseudo filesystems, because they have no real path

* Supplement the description of this helper function


Changes since v6:

* Fix missing signed-off-by line


Changes since v5:

* Refactor helper to avoid unnecessary goto end by having two explicit returns


Changes since v4:

* Rename bpf_fd2path to bpf_get_file_path to be consistent with other
helper's names

* When fdget_raw fails, set ret to -EBADF instead of -EINVAL

* Remove fdput from fdget_raw's error path

* Use IS_ERR instead of IS_ERR_OR_NULL as d_path either returns a pointer
into the buffer or an error code if the path was too long

* Modify the normal path's return value to return copied string length
including NUL

* Update helper description's Return bits.

* Refactor selftests code for further simplification  


Changes since v3:

* Remove unnecessary LOCKDOWN_BPF_READ

* Refactor error handling section for enhanced readability

* Provide a test case in tools/testing/selftests/bpf

* Refactor selftests code to use real global variables instead of maps


Changes since v2:

* Fix backward compatibility

* Add helper description

* Refactor selftests to use global data instead of perf_buffer to simplify
the code

* Fix signed-off name


Wenbo Zhang (2):
  bpf: add new helper get_fd_path for mapping a file descriptor to a
    pathname
  selftests/bpf: test for bpf_get_fd_path() from tracepoint

 include/uapi/linux/bpf.h                      |  29 ++-
 kernel/trace/bpf_trace.c                      |  69 +++++++
 tools/include/uapi/linux/bpf.h                |  29 ++-
 .../selftests/bpf/prog_tests/get_fd_path.c    | 171 ++++++++++++++++++
 .../selftests/bpf/progs/test_get_fd_path.c    |  43 +++++
 5 files changed, 339 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_fd_path.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_get_fd_path.c

-- 
2.17.1



* [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2019-12-18  0:56               ` [PATCH bpf-next v14 0/2] bpf: adding get_fd_path helper Wenbo Zhang
@ 2019-12-18  0:56                 ` Wenbo Zhang
  2019-12-18  3:27                   ` Yonghong Song
  2019-12-19 16:14                   ` Daniel Borkmann
  2019-12-18  0:56                 ` [PATCH bpf-next v14 2/2] selftests/bpf: test for bpf_get_fd_path() from tracepoint Wenbo Zhang
  1 sibling, 2 replies; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-18  0:56 UTC (permalink / raw)
  To: bpf; +Cc: ast, daniel, yhs, bgregg, andrii.nakryiko, netdev

When people want to identify which file system files are being opened,
read, and written to, they can use this helper with file descriptor as
input to achieve this goal. Other pseudo filesystems are also supported.
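
The helper in the diff below relies on the d_path() convention noted in the
v4->v5 changes: it returns a pointer into the caller's buffer (the string is
built towards the end of it) or an error if the path does not fit. A
userspace sketch of that convention follows. fake_d_path() and
get_path_sketch() are hypothetical stand-ins written for illustration, not
kernel APIs, and the error is reported through NULL/errno rather than
ERR_PTR().

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for d_path(): copies "path" to the
 * END of buf and returns a pointer to its first character, or NULL with
 * errno set to ENAMETOOLONG if it does not fit.
 */
static char *fake_d_path(const char *path, char *buf, size_t buflen)
{
	size_t len = strlen(path) + 1;	/* include the trailing NUL */

	if (len > buflen) {
		errno = ENAMETOOLONG;
		return NULL;
	}
	return memcpy(buf + buflen - len, path, len);
}

/*
 * Mirrors the helper's success and error paths: move the string to the
 * front of dst and return its length including the NUL, or zero dst and
 * return a negative errno.
 */
static int get_path_sketch(const char *path, char *dst, size_t size)
{
	char *p;
	int ret;

	assert(dst != NULL && size > 0);
	p = fake_d_path(path, dst, size);
	if (!p) {
		memset(dst, 0, size);
		return -ENAMETOOLONG;
	}
	ret = (int)strlen(p) + 1;
	memmove(dst, p, ret);	/* source and destination may overlap */
	return ret;
}
```

Because source and destination live in the same buffer, memmove() rather
than memcpy() is the safe choice, which matches what the patch does.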

This requirement is mainly discussed here:

  https://github.com/iovisor/bcc/issues/237

v13->v14: addressed Yonghong and Daniel's feedback
- fix this helper's description to be consistent with comments in d_path
- fix error handling to fill the buffer with zero bytes, not '0' characters

v12->v13: addressed Brendan and Yonghong's feedback
- rename to get_fd_path
- refactor code & comment to be clearer and more compliant

v11->v12: addressed Alexei's feedback
- only allow tracepoints to make sure it won't deadlock

v10->v11: addressed Al and Alexei's feedback
- fix missing fput()

v9->v10: addressed Andrii's feedback
- send this patch together with the patch selftests as one patch series

v8->v9:
- format helper description

v7->v8: addressed Alexei's feedback
- use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
- ensure we're in user context, which is safe for the helper to run
- filter out unmountable pseudo filesystems, because they have no real path
- supplement the description of this helper function

v6->v7:
- fix missing signed-off-by line

v5->v6: addressed Andrii's feedback
- avoid unnecessary goto end by having two explicit returns

v4->v5: addressed Andrii and Daniel's feedback
- rename bpf_fd2path to bpf_get_file_path to be consistent with other
helper's names
- when fdget_raw fails, set ret to -EBADF instead of -EINVAL
- remove fdput from fdget_raw's error path
- use IS_ERR instead of IS_ERR_OR_NULL as d_path either returns a pointer
into the buffer or an error code if the path was too long
- modify the normal path's return value to return copied string length
including NUL
- update this helper description's Return bits.

v3->v4: addressed Daniel's feedback
- fix missing fdput()
- move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
- move fd2path's test code to another patch
- add comment to explain why use fdget_raw instead of fdget

v2->v3: addressed Yonghong's feedback
- remove unnecessary LOCKDOWN_BPF_READ
- refactor error handling section for enhanced readability
- provide a test case in tools/testing/selftests/bpf

v1->v2: addressed Daniel's feedback
- fix backward compatibility
- add this helper description
- fix signed-off name

Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
---
 include/uapi/linux/bpf.h       | 29 +++++++++++++-
 kernel/trace/bpf_trace.c       | 69 ++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h | 29 +++++++++++++-
 3 files changed, 125 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index dbbcf0b02970..4534ce49f838 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2821,6 +2821,32 @@ union bpf_attr {
  * 	Return
  * 		On success, the strictly positive length of the string,	including
  * 		the trailing NUL character. On error, a negative value.
+ *
+ * int bpf_get_fd_path(char *path, u32 size, int fd)
+ *	Description
+ *		Get **file** attribute from the current task by *fd*, then call
+ *		**d_path** to get its absolute path and copy it as string into
+ *		*path* of *size*. Notice the *path* doesn't support unmountable
+ *		pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
+ *		The *size* must be strictly positive. On success, the helper
+ *		makes sure that the *path* is NUL-terminated, and the buffer
+ *		could be:
+ *		- a regular full path (include mountable fs eg: /proc, /sys)
+ *		- a regular full path with " (deleted)" appended.
+ *		On failure, it is filled with zeroes.
+ *	Return
+ *		On success, returns the length of the copied string INCLUDING
+ *		the trailing '\0'.
+ *
+ *		On failure, the returned value is one of the following:
+ *
+ *		**-EPERM** if no permission to get the path (eg: in irq ctx).
+ *
+ *		**-EBADF** if *fd* is invalid.
+ *
+ *		**-EINVAL** if *fd* corresponds to an unmountable pseudo fs
+ *
+ *		**-ENAMETOOLONG** if full path is longer than *size*
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2938,7 +2964,8 @@ union bpf_attr {
 	FN(probe_read_user),		\
 	FN(probe_read_kernel),		\
 	FN(probe_read_user_str),	\
-	FN(probe_read_kernel_str),
+	FN(probe_read_kernel_str),	\
+	FN(get_fd_path),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index e5ef4ae9edb5..a2c18b193141 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -762,6 +762,71 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
 	.arg1_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_3(bpf_get_fd_path, char *, dst, u32, size, int, fd)
+{
+	int ret = -EBADF;
+	struct file *f;
+	char *p;
+
+	/* Ensure we're in user context which is safe for the helper to
+	 * run. This helper has no business in a kthread.
+	 */
+	if (unlikely(in_interrupt() ||
+		     current->flags & (PF_KTHREAD | PF_EXITING))) {
+		ret = -EPERM;
+		goto error;
+	}
+
+	/* Use fget_raw instead of fget to support O_PATH, and it doesn't
+	 * have any sleepable code, so it's ok to be here.
+	 */
+	f = fget_raw(fd);
+	if (!f)
+		goto error;
+
+	/* For unmountable pseudo filesystems, getting their fake paths
+	 * seems meaningless as they don't have a real path, and there is
+	 * no way to validate that this function pointer is always safe to
+	 * call in the current context.
+	 */
+	if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
+		ret = -EINVAL;
+		fput(f);
+		goto error;
+	}
+
+	/* After filtering unmountable pseudo filesystems, d_path won't call
+	 * dentry->d_op->d_dname(), the normal path doesn't have any
+	 * sleepable code, and although it uses the current macro to get
+	 * fs_struct (current->fs), we've already ensured we're in user
+	 * context, so it's ok to be here.
+	 */
+	p = d_path(&f->f_path, dst, size);
+	if (IS_ERR(p)) {
+		ret = PTR_ERR(p);
+		fput(f);
+		goto error;
+	}
+
+	ret = strlen(p) + 1;
+	memmove(dst, p, ret);
+	fput(f);
+	return ret;
+
+error:
+	memset(dst, 0, size);
+	return ret;
+}
+
+static const struct bpf_func_proto bpf_get_fd_path_proto = {
+	.func       = bpf_get_fd_path,
+	.gpl_only   = true,
+	.ret_type   = RET_INTEGER,
+	.arg1_type  = ARG_PTR_TO_UNINIT_MEM,
+	.arg2_type  = ARG_CONST_SIZE,
+	.arg3_type  = ARG_ANYTHING,
+};
+
 static const struct bpf_func_proto *
 tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -953,6 +1018,8 @@ tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_stackid_proto_tp;
 	case BPF_FUNC_get_stack:
 		return &bpf_get_stack_proto_tp;
+	case BPF_FUNC_get_fd_path:
+		return &bpf_get_fd_path_proto;
 	default:
 		return tracing_func_proto(func_id, prog);
 	}
@@ -1146,6 +1213,8 @@ raw_tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_stackid_proto_raw_tp;
 	case BPF_FUNC_get_stack:
 		return &bpf_get_stack_proto_raw_tp;
+	case BPF_FUNC_get_fd_path:
+		return &bpf_get_fd_path_proto;
 	default:
 		return tracing_func_proto(func_id, prog);
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index dbbcf0b02970..4534ce49f838 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2821,6 +2821,32 @@ union bpf_attr {
  * 	Return
  * 		On success, the strictly positive length of the string,	including
  * 		the trailing NUL character. On error, a negative value.
+ *
+ * int bpf_get_fd_path(char *path, u32 size, int fd)
+ *	Description
+ *		Get **file** attribute from the current task by *fd*, then call
+ *		**d_path** to get its absolute path and copy it as string into
+ *		*path* of *size*. Notice the *path* doesn't support unmountable
+ *		pseudo filesystems as they don't have path (eg: SOCKFS, PIPEFS).
+ *		The *size* must be strictly positive. On success, the helper
+ *		makes sure that the *path* is NUL-terminated, and the buffer
+ *		could be:
+ *		- a regular full path (include mountable fs eg: /proc, /sys)
+ *		- a regular full path with " (deleted)" appended.
+ *		On failure, it is filled with zeroes.
+ *	Return
+ *		On success, returns the length of the copied string INCLUDING
+ *		the trailing '\0'.
+ *
+ *		On failure, the returned value is one of the following:
+ *
+ *		**-EPERM** if no permission to get the path (eg: in irq ctx).
+ *
+ *		**-EBADF** if *fd* is invalid.
+ *
+ *		**-EINVAL** if *fd* corresponds to an unmountable pseudo fs
+ *
+ *		**-ENAMETOOLONG** if full path is longer than *size*
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2938,7 +2964,8 @@ union bpf_attr {
 	FN(probe_read_user),		\
 	FN(probe_read_kernel),		\
 	FN(probe_read_user_str),	\
-	FN(probe_read_kernel_str),
+	FN(probe_read_kernel_str),	\
+	FN(get_fd_path),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
-- 
2.17.1



* [PATCH bpf-next v14 2/2] selftests/bpf: test for bpf_get_fd_path() from tracepoint
  2019-12-18  0:56               ` [PATCH bpf-next v14 0/2] bpf: adding get_fd_path helper Wenbo Zhang
  2019-12-18  0:56                 ` [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname Wenbo Zhang
@ 2019-12-18  0:56                 ` Wenbo Zhang
  2019-12-18  3:27                   ` Yonghong Song
  1 sibling, 1 reply; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-18  0:56 UTC (permalink / raw)
  To: bpf; +Cc: ast, daniel, yhs, bgregg, andrii.nakryiko, netdev

Trace fstat events via the syscalls/sys_enter_newfstat tracepoint and handle
only the events produced by test_get_fd_path, which calls fstat on several
different types of files to exercise bpf_get_fd_path.
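
The userspace side of the test obtains its reference paths by reading the
/proc/<pid>/fd/<fd> symlinks (see set_pathname() in the diff below). A
minimal standalone sketch of that lookup, assuming /proc is mounted, is shown
here; fd_to_path() is a hypothetical name, and unlike the test it uses
/proc/self and always NUL-terminates the result.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Resolve fd to a path through /proc/self/fd. Returns the path length,
 * or -1 on error. readlink() does not NUL-terminate, so one byte of buf
 * is reserved for the terminator.
 */
static ssize_t fd_to_path(int fd, char *buf, size_t size)
{
	char link[64];
	ssize_t n;

	assert(buf != NULL);
	if (size == 0)
		return -1;
	snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
	n = readlink(link, buf, size - 1);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	return n;
}
```

For pipes and sockets the symlink target is a synthetic name rather than a
filesystem path, which is why the test only expects an empty string from the
helper for those descriptors.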

v5->v6: addressed Gregg and Yonghong's feedback
- rename to get_fd_path
- change sys_enter_newfstat_args's fd type to long to fix issue on
big-endian machines

v4->v5: addressed Andrii's feedback
- pass NULL for opts as bpf_object__open_file's PARAM2, as not really
using any
- modify patch subject to keep up with test code
- as this test is single-threaded, use getpid instead of SYS_gettid
- remove unnecessary parens around the check after if (i < 3)
- in kern use bpf_get_current_pid_tgid() >> 32 to fit getpid() in
userspace part
- with the patch adding helper as one patch series

v3->v4: addressed Andrii's feedback
- use a set of fd instead of fds array
- use global variables instead of maps (in v3, I mistakenly thought that
the bpf maps are global variables.)
- remove unnecessary global variable path_info_index
- remove fd compare as the fstat's order is fixed

v2->v3: addressed Andrii's feedback
- use global data instead of perf_buffer to simplify the code

v1->v2: addressed Daniel's feedback
- rename bpf_fd2path to bpf_get_file_path to be consistent with other
helper's names

Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
---
 .../selftests/bpf/prog_tests/get_fd_path.c    | 171 ++++++++++++++++++
 .../selftests/bpf/progs/test_get_fd_path.c    |  43 +++++
 2 files changed, 214 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/get_fd_path.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_get_fd_path.c

diff --git a/tools/testing/selftests/bpf/prog_tests/get_fd_path.c b/tools/testing/selftests/bpf/prog_tests/get_fd_path.c
new file mode 100644
index 000000000000..2846f0a4e84b
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/get_fd_path.c
@@ -0,0 +1,171 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include <test_progs.h>
+#include <sys/stat.h>
+#include <linux/sched.h>
+#include <sys/syscall.h>
+
+#define MAX_PATH_LEN		128
+#define MAX_FDS			7
+#define MAX_EVENT_NUM		16
+
+static struct fd_path_test_data {
+	pid_t pid;
+	__u32 cnt;
+	__u32 fds[MAX_EVENT_NUM];
+	char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
+} src, dst;
+
+static int set_pathname(int fd)
+{
+	char buf[MAX_PATH_LEN];
+
+	snprintf(buf, MAX_PATH_LEN, "/proc/%d/fd/%d", src.pid, fd);
+	src.fds[src.cnt] = fd;
+	return readlink(buf, src.paths[src.cnt++], MAX_PATH_LEN);
+}
+
+static int trigger_fstat_events(pid_t pid)
+{
+	int pipefd[2] = { -1, -1 };
+	int sockfd = -1, procfd = -1, devfd = -1;
+	int localfd = -1, indicatorfd = -1;
+	struct stat fileStat;
+	int ret = -1;
+
+	/* unmountable pseudo-filesystems */
+	if (CHECK_FAIL(pipe(pipefd) < 0))
+		return ret;
+	/* unmountable pseudo-filesystems */
+	sockfd = socket(AF_INET, SOCK_STREAM, 0);
+	if (CHECK_FAIL(sockfd < 0))
+		goto out_close;
+	/* mountable pseudo-filesystems */
+	procfd = open("/proc/self/comm", O_RDONLY);
+	if (CHECK_FAIL(procfd < 0))
+		goto out_close;
+	devfd = open("/dev/urandom", O_RDONLY);
+	if (CHECK_FAIL(devfd < 0))
+		goto out_close;
+	localfd = open("/tmp/fd2path_loadgen.txt", O_CREAT | O_RDONLY, 0644);
+	if (CHECK_FAIL(localfd < 0))
+		goto out_close;
+	/* bpf_get_fd_path will return path with (deleted) */
+	remove("/tmp/fd2path_loadgen.txt");
+	indicatorfd = open("/tmp/", O_PATH);
+	if (CHECK_FAIL(indicatorfd < 0))
+		goto out_close;
+
+	src.pid = pid;
+
+	ret = set_pathname(pipefd[0]);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(pipefd[1]);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(sockfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(procfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(devfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(localfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+	ret = set_pathname(indicatorfd);
+	if (CHECK_FAIL(ret < 0))
+		goto out_close;
+
+	fstat(pipefd[0], &fileStat);
+	fstat(pipefd[1], &fileStat);
+	fstat(sockfd, &fileStat);
+	fstat(procfd, &fileStat);
+	fstat(devfd, &fileStat);
+	fstat(localfd, &fileStat);
+	fstat(indicatorfd, &fileStat);
+
+out_close:
+	close(indicatorfd);
+	close(localfd);
+	close(devfd);
+	close(procfd);
+	close(sockfd);
+	close(pipefd[1]);
+	close(pipefd[0]);
+
+	return ret;
+}
+
+void test_get_fd_path(void)
+{
+	const char *prog_name = "tracepoint/syscalls/sys_enter_newfstat";
+	const char *obj_file = "test_get_fd_path.o";
+	int err, results_map_fd, duration = 0;
+	struct bpf_program *tp_prog = NULL;
+	struct bpf_link *tp_link = NULL;
+	struct bpf_object *obj = NULL;
+	const int zero = 0;
+
+	obj = bpf_object__open_file(obj_file, NULL);
+	if (CHECK(IS_ERR(obj), "obj_open_file", "err %ld\n", PTR_ERR(obj)))
+		return;
+
+	tp_prog = bpf_object__find_program_by_title(obj, prog_name);
+	if (CHECK(!tp_prog, "find_tp",
+		  "prog '%s' not found\n", prog_name))
+		goto cleanup;
+
+	err = bpf_object__load(obj);
+	if (CHECK(err, "obj_load", "err %d\n", err))
+		goto cleanup;
+
+	results_map_fd = bpf_find_map(__func__, obj, "test_get.bss");
+	if (CHECK(results_map_fd < 0, "find_bss_map",
+		  "err %d\n", results_map_fd))
+		goto cleanup;
+
+	tp_link = bpf_program__attach_tracepoint(tp_prog, "syscalls",
+						 "sys_enter_newfstat");
+	if (CHECK(IS_ERR(tp_link), "attach_tp",
+		  "err %ld\n", PTR_ERR(tp_link))) {
+		tp_link = NULL;
+		goto cleanup;
+	}
+
+	dst.pid = getpid();
+	err = bpf_map_update_elem(results_map_fd, &zero, &dst, 0);
+	if (CHECK(err, "update_elem",
+		  "failed to set pid filter: %d\n", err))
+		goto cleanup;
+
+	err = trigger_fstat_events(dst.pid);
+	if (CHECK_FAIL(err < 0))
+		goto cleanup;
+
+	err = bpf_map_lookup_elem(results_map_fd, &zero, &dst);
+	if (CHECK(err, "get_results",
+		  "failed to get results: %d\n", err))
+		goto cleanup;
+
+	for (int i = 0; i < MAX_FDS; i++) {
+		if (i < 3)
+			CHECK((dst.paths[i][0] != 0), "get_fd_path",
+			      "failed to filter fs [%d]: %u(%s) vs %u(%s)\n",
+			      i, src.fds[i], src.paths[i], dst.fds[i],
+			      dst.paths[i]);
+		else
+			CHECK(strncmp(src.paths[i], dst.paths[i], MAX_PATH_LEN),
+			      "get_fd_path",
+			      "failed to get path[%d]: %u(%s) vs %u(%s)\n",
+			      i, src.fds[i], src.paths[i], dst.fds[i],
+			      dst.paths[i]);
+	}
+
+cleanup:
+	bpf_link__destroy(tp_link);
+	bpf_object__close(obj);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_get_fd_path.c b/tools/testing/selftests/bpf/progs/test_get_fd_path.c
new file mode 100644
index 000000000000..8bb58f87755e
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_get_fd_path.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <linux/ptrace.h>
+#include <string.h>
+#include <unistd.h>
+#include "bpf_helpers.h"
+#include "bpf_tracing.h"
+
+#define MAX_PATH_LEN		128
+#define MAX_EVENT_NUM		16
+
+static struct fd_path_test_data {
+	pid_t pid;
+	__u32 cnt;
+	__u32 fds[MAX_EVENT_NUM];
+	char paths[MAX_EVENT_NUM][MAX_PATH_LEN];
+} data;
+
+struct sys_enter_newfstat_args {
+	unsigned long long pad1;
+	unsigned long long pad2;
+	unsigned long fd;
+};
+
+SEC("tracepoint/syscalls/sys_enter_newfstat")
+int bpf_prog(struct sys_enter_newfstat_args *args)
+{
+	pid_t pid = bpf_get_current_pid_tgid() >> 32;
+
+	if (pid != data.pid)
+		return 0;
+	if (data.cnt >= MAX_EVENT_NUM)
+		return 0;
+
+	data.fds[data.cnt] = args->fd;
+	bpf_get_fd_path(data.paths[data.cnt], MAX_PATH_LEN, args->fd);
+	data.cnt++;
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.17.1



* Re: [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2019-12-18  0:56                 ` [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname Wenbo Zhang
@ 2019-12-18  3:27                   ` Yonghong Song
  2019-12-19 16:14                   ` Daniel Borkmann
  1 sibling, 0 replies; 52+ messages in thread
From: Yonghong Song @ 2019-12-18  3:27 UTC (permalink / raw)
  To: Wenbo Zhang, bpf; +Cc: ast, daniel, bgregg, andrii.nakryiko, netdev



On 12/17/19 4:56 PM, Wenbo Zhang wrote:
> When people want to identify which file system files are being opened,
> read, and written to, they can use this helper with file descriptor as
> input to achieve this goal. Other pseudo filesystems are also supported.
> 
> This requirement is mainly discussed here:
> 
>    https://github.com/iovisor/bcc/issues/237
> 
> v13->v14: addressed Yonghong and Daniel's feedback
> - fix this helper's description to be consistent with comments in d_path
> - fix error handling logic fill zeroes not '0's
> 
> v12->v13: addressed Brendan and Yonghong's feedback
> - rename to get_fd_path
> - refactor code & comment to be clearer and more compliant
> 
> v11->v12: addressed Alexei's feedback
> - only allow tracepoints to make sure it won't dead lock
> 
> v10->v11: addressed Al and Alexei's feedback
> - fix missing fput()
> 
> v9->v10: addressed Andrii's feedback
> - send this patch together with the patch selftests as one patch series
> 
> v8->v9:
> - format helper description
> 
> v7->v8: addressed Alexei's feedback
> - use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
> - ensure we're in user context which is safe fot the help to run
> - filter unmountable pseudo filesystem, because they don't have real path
> - supplement the description of this helper function
> 
> v6->v7:
> - fix missing signed-off-by line
> 
> v5->v6: addressed Andrii's feedback
> - avoid unnecessary goto end by having two explicit returns
> 
> v4->v5: addressed Andrii and Daniel's feedback
> - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> helper's names
> - when fdget_raw fails, set ret to -EBADF instead of -EINVAL
> - remove fdput from fdget_raw's error path
> - use IS_ERR instead of IS_ERR_OR_NULL as d_path ether returns a pointer
> into the buffer or an error code if the path was too long
> - modify the normal path's return value to return copied string length
> including NUL
> - update this helper description's Return bits.
> 
> v3->v4: addressed Daniel's feedback
> - fix missing fdput()
> - move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
> - move fd2path's test code to another patch
> - add comment to explain why use fdget_raw instead of fdget
> 
> v2->v3: addressed Yonghong's feedback
> - remove unnecessary LOCKDOWN_BPF_READ
> - refactor error handling section for enhanced readability
> - provide a test case in tools/testing/selftests/bpf
> 
> v1->v2: addressed Daniel's feedback
> - fix backward compatibility
> - add this helper description
> - fix signed-off name
> 
> Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>

Acked-by: Yonghong Song <yhs@fb.com>



* Re: [PATCH bpf-next v14 2/2] selftests/bpf: test for bpf_get_fd_path() from tracepoint
  2019-12-18  0:56                 ` [PATCH bpf-next v14 2/2] selftests/bpf: test for bpf_get_fd_path() from tracepoint Wenbo Zhang
@ 2019-12-18  3:27                   ` Yonghong Song
  0 siblings, 0 replies; 52+ messages in thread
From: Yonghong Song @ 2019-12-18  3:27 UTC (permalink / raw)
  To: Wenbo Zhang, bpf; +Cc: ast, daniel, bgregg, andrii.nakryiko, netdev



On 12/17/19 4:56 PM, Wenbo Zhang wrote:
> trace fstat events by tracepoint syscalls/sys_enter_newfstat, and handle
> events only produced by test_file_fd_path, which call fstat on several
> different types of files to test bpf_fd_file_path's feature.
> 
> v5->v6: addressed Gregg and Yonghong's feedback
> - rename to get_fd_path
> - change sys_enter_newfstat_args's fd type to long to fix issue on
> big-endian machines
> 
> v4->v5: addressed Andrii's feedback
> - pass NULL for opts as bpf_object__open_file's PARAM2, as not really
> using any
> - modify patch subject to keep up with test code
> - as this test is single-threaded, so use getpid instead of SYS_gettid
> - remove unnecessary parens around check which after if (i < 3)
> - in kern use bpf_get_current_pid_tgid() >> 32 to fit getpid() in
> userspace part
> - with the patch adding helper as one patch series
> 
> v3->v4: addressed Andrii's feedback
> - use a set of fd instead of fds array
> - use global variables instead of maps (in v3, I mistakenly thought that
> the bpf maps are global variables.)
> - remove uncessary global variable path_info_index
> - remove fd compare as the fstat's order is fixed
> 
> v2->v3: addressed Andrii's feedback
> - use global data instead of perf_buffer to simplified code
> 
> v1->v2: addressed Daniel's feedback
> - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> helper's names
> 
> Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>

Acked-by: Yonghong Song <yhs@fb.com>


* Re: [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2019-12-18  0:56                 ` [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname Wenbo Zhang
  2019-12-18  3:27                   ` Yonghong Song
@ 2019-12-19 16:14                   ` Daniel Borkmann
  2019-12-20  3:35                     ` Wenbo Zhang
  1 sibling, 1 reply; 52+ messages in thread
From: Daniel Borkmann @ 2019-12-19 16:14 UTC (permalink / raw)
  To: Wenbo Zhang, bpf; +Cc: ast, yhs, bgregg, andrii.nakryiko, netdev, viro

[ Wenbo, please keep also Al (added here) in the loop since he was providing
   feedback on prior submissions as well wrt vfs bits. ]

On 12/18/19 1:56 AM, Wenbo Zhang wrote:
> When people want to identify which file system files are being opened,
> read, and written to, they can use this helper with file descriptor as
> input to achieve this goal. Other pseudo filesystems are also supported.
> 
> This requirement is mainly discussed here:
> 
>    https://github.com/iovisor/bcc/issues/237
> 
> v13->v14: addressed Yonghong and Daniel's feedback
> - fix this helper's description to be consistent with comments in d_path
> - fix error handling logic fill zeroes not '0's
> 
> v12->v13: addressed Brendan and Yonghong's feedback
> - rename to get_fd_path
> - refactor code & comment to be clearer and more compliant
> 
> v11->v12: addressed Alexei's feedback
> - only allow tracepoints to make sure it won't dead lock
> 
> v10->v11: addressed Al and Alexei's feedback
> - fix missing fput()
> 
> v9->v10: addressed Andrii's feedback
> - send this patch together with the patch selftests as one patch series
> 
> v8->v9:
> - format helper description
> 
> v7->v8: addressed Alexei's feedback
> - use fget_raw instead of fdget_raw, as fdget_raw is only used inside fs/
> - ensure we're in user context, which is safe for the helper to run
> - filter unmountable pseudo filesystem, because they don't have real path
> - supplement the description of this helper function
> 
> v6->v7:
> - fix missing signed-off-by line
> 
> v5->v6: addressed Andrii's feedback
> - avoid unnecessary goto end by having two explicit returns
> 
> v4->v5: addressed Andrii and Daniel's feedback
> - rename bpf_fd2path to bpf_get_file_path to be consistent with other
> helper's names
> - when fdget_raw fails, set ret to -EBADF instead of -EINVAL
> - remove fdput from fdget_raw's error path
> - use IS_ERR instead of IS_ERR_OR_NULL as d_path either returns a pointer
> into the buffer or an error code if the path was too long
> - modify the normal path's return value to return copied string length
> including NUL
> - update this helper description's Return bits.
> 
> v3->v4: addressed Daniel's feedback
> - fix missing fdput()
> - move fd2path from kernel/bpf/trace.c to kernel/trace/bpf_trace.c
> - move fd2path's test code to another patch
> - add comment to explain why use fdget_raw instead of fdget
> 
> v2->v3: addressed Yonghong's feedback
> - remove unnecessary LOCKDOWN_BPF_READ
> - refactor error handling section for enhanced readability
> - provide a test case in tools/testing/selftests/bpf
> 
> v1->v2: addressed Daniel's feedback
> - fix backward compatibility
> - add this helper description
> - fix signed-off name
> 
> Signed-off-by: Wenbo Zhang <ethercflow@gmail.com>
> ---
>   include/uapi/linux/bpf.h       | 29 +++++++++++++-
>   kernel/trace/bpf_trace.c       | 69 ++++++++++++++++++++++++++++++++++
>   tools/include/uapi/linux/bpf.h | 29 +++++++++++++-
>   3 files changed, 125 insertions(+), 2 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index dbbcf0b02970..4534ce49f838 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2821,6 +2821,32 @@ union bpf_attr {
>    * 	Return
>    * 		On success, the strictly positive length of the string,	including
>    * 		the trailing NUL character. On error, a negative value.
> + *
> + * int bpf_get_fd_path(char *path, u32 size, int fd)
> + *	Description
> + *		Get **file** attribute from the current task by *fd*, then call
> + *		**d_path** to get its absolute path and copy it as string into
> + *		*path* of *size*. Note that the helper doesn't support unmountable
> + *		pseudo filesystems as they don't have a path (eg: SOCKFS, PIPEFS).
> + *		The *size* must be strictly positive. On success, the helper
> + *		makes sure that the *path* is NUL-terminated, and the buffer
> + *		could be:
> + *		- a regular full path (including mountable fs, eg: /proc, /sys)
> + *		- a regular full path with " (deleted)" appended.
> + *		On failure, it is filled with zeroes.
> + *	Return
> + *		On success, returns the length of the copied string INCLUDING
> + *		the trailing '\0'.
> + *
> + *		On failure, the returned value is one of the following:
> + *
> + *		**-EPERM** if no permission to get the path (eg: in irq ctx).
> + *
> + *		**-EBADF** if *fd* is invalid.
> + *
> + *		**-EINVAL** if *fd* corresponds to an unmountable pseudo fs
> + *
> + *		**-ENAMETOOLONG** if full path is longer than *size*
>    */
>   #define __BPF_FUNC_MAPPER(FN)		\
>   	FN(unspec),			\
> @@ -2938,7 +2964,8 @@ union bpf_attr {
>   	FN(probe_read_user),		\
>   	FN(probe_read_kernel),		\
>   	FN(probe_read_user_str),	\
> -	FN(probe_read_kernel_str),
> +	FN(probe_read_kernel_str),	\
> +	FN(get_fd_path),
>   
>   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>    * function eBPF program intends to call
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index e5ef4ae9edb5..a2c18b193141 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -762,6 +762,71 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
>   	.arg1_type	= ARG_ANYTHING,
>   };
>   
> +BPF_CALL_3(bpf_get_fd_path, char *, dst, u32, size, int, fd)
> +{
> +	int ret = -EBADF;
> +	struct file *f;
> +	char *p;
> +
> +	/* Ensure we're in user context which is safe for the helper to
> +	 * run. This helper has no business in a kthread.
> +	 */
> +	if (unlikely(in_interrupt() ||
> +		     current->flags & (PF_KTHREAD | PF_EXITING))) {
> +		ret = -EPERM;
> +		goto error;
> +	}
> +
> +	/* Use fget_raw instead of fget to support O_PATH, and it doesn't
> +	 * have any sleepable code, so it's ok to be here.
> +	 */
> +	f = fget_raw(fd);
> +	if (!f)
> +		goto error;
> +
> +	/* For unmountable pseudo filesystems there is no point in getting
> +	 * their fake paths, as they don't have a real path, and there is
> +	 * no way to validate that the d_dname function pointer is always
> +	 * safe to call in the current context.
> +	 */
> +	if (f->f_path.dentry->d_op && f->f_path.dentry->d_op->d_dname) {
> +		ret = -EINVAL;
> +		fput(f);
> +		goto error;
> +	}
> +
> +	/* After filtering out unmountable pseudo filesystems, d_path won't
> +	 * call dentry->d_op->d_dname(). The normal path doesn't have any
> +	 * sleepable code, and although it uses the current macro to get
> +	 * fs_struct (current->fs), we've already ensured we're in user
> +	 * context, so it's ok to be here.
> +	 */
> +	p = d_path(&f->f_path, dst, size);
> +	if (IS_ERR(p)) {
> +		ret = PTR_ERR(p);
> +		fput(f);
> +		goto error;
> +	}
> +
> +	ret = strlen(p) + 1;
> +	memmove(dst, p, ret);
> +	fput(f);
> +	return ret;
> +
> +error:
> +	memset(dst, 0, size);
> +	return ret;
> +}
> +
> +static const struct bpf_func_proto bpf_get_fd_path_proto = {
> +	.func       = bpf_get_fd_path,
> +	.gpl_only   = true,
> +	.ret_type   = RET_INTEGER,
> +	.arg1_type  = ARG_PTR_TO_UNINIT_MEM,
> +	.arg2_type  = ARG_CONST_SIZE,
> +	.arg3_type  = ARG_ANYTHING,
> +};
> +
>   static const struct bpf_func_proto *
>   tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>   {
> @@ -953,6 +1018,8 @@ tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>   		return &bpf_get_stackid_proto_tp;
>   	case BPF_FUNC_get_stack:
>   		return &bpf_get_stack_proto_tp;
> +	case BPF_FUNC_get_fd_path:
> +		return &bpf_get_fd_path_proto;
>   	default:
>   		return tracing_func_proto(func_id, prog);
>   	}
> @@ -1146,6 +1213,8 @@ raw_tp_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>   		return &bpf_get_stackid_proto_raw_tp;
>   	case BPF_FUNC_get_stack:
>   		return &bpf_get_stack_proto_raw_tp;
> +	case BPF_FUNC_get_fd_path:
> +		return &bpf_get_fd_path_proto;
>   	default:
>   		return tracing_func_proto(func_id, prog);
>   	}
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index dbbcf0b02970..4534ce49f838 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -2821,6 +2821,32 @@ union bpf_attr {
>    * 	Return
>    * 		On success, the strictly positive length of the string,	including
>    * 		the trailing NUL character. On error, a negative value.
> + *
> + * int bpf_get_fd_path(char *path, u32 size, int fd)
> + *	Description
> + *		Get **file** attribute from the current task by *fd*, then call
> + *		**d_path** to get its absolute path and copy it as string into
> + *		*path* of *size*. Note that the helper doesn't support unmountable
> + *		pseudo filesystems as they don't have a path (eg: SOCKFS, PIPEFS).
> + *		The *size* must be strictly positive. On success, the helper
> + *		makes sure that the *path* is NUL-terminated, and the buffer
> + *		could be:
> + *		- a regular full path (including mountable fs, eg: /proc, /sys)
> + *		- a regular full path with " (deleted)" appended.
> + *		On failure, it is filled with zeroes.
> + *	Return
> + *		On success, returns the length of the copied string INCLUDING
> + *		the trailing '\0'.
> + *
> + *		On failure, the returned value is one of the following:
> + *
> + *		**-EPERM** if no permission to get the path (eg: in irq ctx).
> + *
> + *		**-EBADF** if *fd* is invalid.
> + *
> + *		**-EINVAL** if *fd* corresponds to an unmountable pseudo fs
> + *
> + *		**-ENAMETOOLONG** if full path is longer than *size*
>    */
>   #define __BPF_FUNC_MAPPER(FN)		\
>   	FN(unspec),			\
> @@ -2938,7 +2964,8 @@ union bpf_attr {
>   	FN(probe_read_user),		\
>   	FN(probe_read_kernel),		\
>   	FN(probe_read_user_str),	\
> -	FN(probe_read_kernel_str),
> +	FN(probe_read_kernel_str),	\
> +	FN(get_fd_path),
>   
>   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>    * function eBPF program intends to call
> 
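
The tail of the quoted helper relies on d_path returning a pointer into
the caller's buffer rather than filling it from the start, which is why
the result is moved to the front with memmove before the length
(including the NUL) is returned. A minimal userspace sketch of that
return convention follows. fake_d_path and copy_path_to_front are
hypothetical names invented for this illustration; the stand-in returns
NULL where the kernel's d_path returns an error pointer, and nothing
here is kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for d_path(): it builds the string at the END of
 * buf and returns a pointer to its first character, or NULL when the
 * string does not fit. (The kernel function reports the failure with an
 * error pointer instead of NULL.)
 */
static char *fake_d_path(const char *src, char *buf, size_t size)
{
	size_t len = strlen(src) + 1;	/* include the trailing NUL */

	if (len > size)
		return NULL;
	return memcpy(buf + size - len, src, len);
}

/* Mirrors the success and error tails of the quoted helper: on success
 * the string is moved to the front of dst and its length including the
 * NUL is returned; on failure dst is zero-filled and a negative errno
 * value is returned.
 */
static int copy_path_to_front(const char *src, char *dst, size_t size)
{
	char *p = fake_d_path(src, dst, size);
	int ret;

	if (!p) {
		memset(dst, 0, size);
		return -ENAMETOOLONG;
	}

	ret = (int)strlen(p) + 1;
	memmove(dst, p, ret);	/* source and destination may overlap */
	return ret;
}
```

A caller would pass a stack buffer and treat any negative return as an
error, relying on the buffer being all zeroes in that case.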


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2019-12-19 16:14                   ` Daniel Borkmann
@ 2019-12-20  3:35                     ` Wenbo Zhang
  2020-01-16  8:59                       ` Jiri Olsa
  0 siblings, 1 reply; 52+ messages in thread
From: Wenbo Zhang @ 2019-12-20  3:35 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: bpf, Alexei Starovoitov, Yonghong Song, Brendan Gregg,
	Andrii Nakryiko, Networking, viro

> [ Wenbo, please keep also Al (added here) in the loop since he was providing
>    feedback on prior submissions as well wrt vfs bits. ]

Got it, thank you!


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2019-12-20  3:35                     ` Wenbo Zhang
@ 2020-01-16  8:59                       ` Jiri Olsa
  2020-02-10  4:43                         ` Brendan Gregg
  0 siblings, 1 reply; 52+ messages in thread
From: Jiri Olsa @ 2020-01-16  8:59 UTC (permalink / raw)
  To: Wenbo Zhang
  Cc: Daniel Borkmann, bpf, Alexei Starovoitov, Yonghong Song,
	Brendan Gregg, Andrii Nakryiko, Networking, viro

On Fri, Dec 20, 2019 at 11:35:08AM +0800, Wenbo Zhang wrote:
> > [ Wenbo, please keep also Al (added here) in the loop since he was providing
> >    feedback on prior submissions as well wrt vfs bits. ]
> 
> Got it, thank you!

hi,
is this stuck on review? I'd like to see this merged ;-)
we have bpftrace change using it already.. from that side:

Tested-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2020-01-16  8:59                       ` Jiri Olsa
@ 2020-02-10  4:43                         ` Brendan Gregg
  2020-02-11  0:01                           ` Daniel Borkmann
  0 siblings, 1 reply; 52+ messages in thread
From: Brendan Gregg @ 2020-02-10  4:43 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Wenbo Zhang, Daniel Borkmann, bpf, Alexei Starovoitov,
	Yonghong Song, Andrii Nakryiko, Networking, Al Viro

On Thu, Jan 16, 2020 at 12:59 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Fri, Dec 20, 2019 at 11:35:08AM +0800, Wenbo Zhang wrote:
> > > [ Wenbo, please keep also Al (added here) in the loop since he was providing
> > >    feedback on prior submissions as well wrt vfs bits. ]
> >
> > Got it, thank you!
>
> hi,
> is this stuck on review? I'd like to see this merged ;-)

Is this still waiting on someone? I'm writing some docs on analyzing
file systems via syscall tracing and this will be a big improvement.
Thanks,

Brendan

>
> we have bpftrace change using it already.. from that side:
>
> Tested-by: Jiri Olsa <jolsa@kernel.org>
>
> thanks,
> jirka
>


-- 
Brendan Gregg, Senior Performance Architect, Netflix

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2020-02-10  4:43                         ` Brendan Gregg
@ 2020-02-11  0:01                           ` Daniel Borkmann
  2020-02-12 15:21                             ` Jiri Olsa
  0 siblings, 1 reply; 52+ messages in thread
From: Daniel Borkmann @ 2020-02-11  0:01 UTC (permalink / raw)
  To: Brendan Gregg, Jiri Olsa
  Cc: Wenbo Zhang, bpf, Alexei Starovoitov, Yonghong Song,
	Andrii Nakryiko, Networking, Al Viro

On 2/10/20 5:43 AM, Brendan Gregg wrote:
> On Thu, Jan 16, 2020 at 12:59 AM Jiri Olsa <jolsa@redhat.com> wrote:
>> On Fri, Dec 20, 2019 at 11:35:08AM +0800, Wenbo Zhang wrote:
>>>> [ Wenbo, please keep also Al (added here) in the loop since he was providing
>>>>     feedback on prior submissions as well wrt vfs bits. ]
>>>
>>> Got it, thank you!
>>
>> hi,
>> is this stuck on review? I'd like to see this merged ;-)
> 
> Is this still waiting on someone? I'm writing some docs on analyzing
> file systems via syscall tracing and this will be a big improvement.

It was waiting on final review/ACK from vfs folks, but given that didn't
happen so far :/, this series should get rebased for proceeding with merge.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2020-02-11  0:01                           ` Daniel Borkmann
@ 2020-02-12 15:21                             ` Jiri Olsa
  2020-06-01 14:17                               ` Wenbo Zhang
  0 siblings, 1 reply; 52+ messages in thread
From: Jiri Olsa @ 2020-02-12 15:21 UTC (permalink / raw)
  To: Daniel Borkmann, Al Viro
  Cc: Brendan Gregg, Wenbo Zhang, bpf, Alexei Starovoitov,
	Yonghong Song, Andrii Nakryiko, Al Viro

On Tue, Feb 11, 2020 at 01:01:16AM +0100, Daniel Borkmann wrote:
> On 2/10/20 5:43 AM, Brendan Gregg wrote:
> > On Thu, Jan 16, 2020 at 12:59 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > On Fri, Dec 20, 2019 at 11:35:08AM +0800, Wenbo Zhang wrote:
> > > > > [ Wenbo, please keep also Al (added here) in the loop since he was providing
> > > > >     feedback on prior submissions as well wrt vfs bits. ]
> > > > 
> > > > Got it, thank you!
> > > 
> > > hi,
> > > is this stuck on review? I'd like to see this merged ;-)
> > 
> > Is this still waiting on someone? I'm writing some docs on analyzing
> > file systems via syscall tracing and this will be a big improvement.
> 
> It was waiting on final review/ACK from vfs folks, but given that didn't
> happen so far :/, this series should get rebased for proceeding with merge.
> 

Al Viro, any chance you could check on the latest version?

thanks,
jirka


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2020-02-12 15:21                             ` Jiri Olsa
@ 2020-06-01 14:17                               ` Wenbo Zhang
  2020-06-01 16:38                                 ` Alexei Starovoitov
  0 siblings, 1 reply; 52+ messages in thread
From: Wenbo Zhang @ 2020-06-01 14:17 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Jiri Olsa, Al Viro, Brendan Gregg, bpf, Alexei Starovoitov,
	Yonghong Song, Andrii Nakryiko

Hi Daniel,

I see that the current state of this patch on patchwork is Awaiting Upstream:
https://patchwork.ozlabs.org/project/netdev/patch/7464919bd9c15f2496ca29dceb6a4048b3199774.1576629200.git.ethercflow@gmail.com/
I don't know much about this state. Will this patch be merged?

Thank you
Wenbo

On Wed, Feb 12, 2020 at 11:22 PM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Tue, Feb 11, 2020 at 01:01:16AM +0100, Daniel Borkmann wrote:
> > On 2/10/20 5:43 AM, Brendan Gregg wrote:
> > > On Thu, Jan 16, 2020 at 12:59 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > > On Fri, Dec 20, 2019 at 11:35:08AM +0800, Wenbo Zhang wrote:
> > > > > > [ Wenbo, please keep also Al (added here) in the loop since he was providing
> > > > > >     feedback on prior submissions as well wrt vfs bits. ]
> > > > >
> > > > > Got it, thank you!
> > > >
> > > > hi,
> > > > is this stuck on review? I'd like to see this merged ;-)
> > >
> > > Is this still waiting on someone? I'm writing some docs on analyzing
> > > file systems via syscall tracing and this will be a big improvement.
> >
> > It was waiting on final review/ACK from vfs folks, but given that didn't
> > happen so far :/, this series should get rebased for proceeding with merge.
> >
>
> Al Viro, any chance you could check on the latest version?
>
> thanks,
> jirka
>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2020-06-01 14:17                               ` Wenbo Zhang
@ 2020-06-01 16:38                                 ` Alexei Starovoitov
  2020-06-02  3:04                                   ` Wenbo Zhang
  0 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2020-06-01 16:38 UTC (permalink / raw)
  To: Wenbo Zhang
  Cc: Daniel Borkmann, Jiri Olsa, Al Viro, Brendan Gregg, bpf,
	Alexei Starovoitov, Yonghong Song, Andrii Nakryiko

On Mon, Jun 1, 2020 at 7:17 AM Wenbo Zhang <ethercflow@gmail.com> wrote:
>
> Hi Daniel,
>
> I see that the current state of this patch on patchwork is Awaiting Upstream:
> https://patchwork.ozlabs.org/project/netdev/patch/7464919bd9c15f2496ca29dceb6a4048b3199774.1576629200.git.ethercflow@gmail.com/
> I don't know much about this state. Will this patch be merged?

This one won't be merged.
Jiri has sent patches based on a whitelist approach.
That's the proper direction to address the locking concerns.
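
As a rough picture of what a whitelist approach means in general: the
helper is only made available at attach points that have been reviewed
as safe, and every other attach point is refused up front. The sketch
below is plain userspace C and only illustrates that idea. Jiri's
patches are not shown in this thread, so the ID values and the names
allowed_attach_ids, cmp_id and helper_allowed are hypothetical and are
not taken from them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical IDs of attach points where calling a d_path-style helper
 * has been reviewed as safe. Kept sorted in ascending order so that
 * bsearch() can be used for the lookup.
 */
static const unsigned int allowed_attach_ids[] = { 12, 47, 301, 1024 };

/* Three-way comparison of two unsigned IDs for bsearch(). */
static int cmp_id(const void *a, const void *b)
{
	unsigned int x = *(const unsigned int *)a;
	unsigned int y = *(const unsigned int *)b;

	return (x > y) - (x < y);
}

/* Returns true only when attach_id is on the allowlist; every other
 * attach point is rejected, so the helper never runs in an unreviewed
 * locking context.
 */
static bool helper_allowed(unsigned int attach_id)
{
	size_t n = sizeof(allowed_attach_ids) / sizeof(allowed_attach_ids[0]);

	return bsearch(&attach_id, allowed_attach_ids, n,
		       sizeof(allowed_attach_ids[0]), cmp_id) != NULL;
}
```

The design point is that the check happens when the program is loaded or
attached, not on each call, so an unlisted attach point is refused
before the helper can ever run there.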

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2020-06-01 16:38                                 ` Alexei Starovoitov
@ 2020-06-02  3:04                                   ` Wenbo Zhang
  2020-06-02  8:14                                     ` Jiri Olsa
  0 siblings, 1 reply; 52+ messages in thread
From: Wenbo Zhang @ 2020-06-02  3:04 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Daniel Borkmann, Jiri Olsa, Al Viro, Brendan Gregg, bpf,
	Alexei Starovoitov, Yonghong Song, Andrii Nakryiko

Got it, I'll look up Jiri's patches to see how to use that. Thanks.

On Tue, Jun 2, 2020 at 12:38 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Jun 1, 2020 at 7:17 AM Wenbo Zhang <ethercflow@gmail.com> wrote:
> >
> > Hi Daniel,
> >
> > I see that the current state of this patch on patchwork is Awaiting Upstream:
> > https://patchwork.ozlabs.org/project/netdev/patch/7464919bd9c15f2496ca29dceb6a4048b3199774.1576629200.git.ethercflow@gmail.com/
> > I don't know much about this state. Will this patch be merged?
>
> This one won't be merged.
> Jiri has sent patches based on a whitelist approach.
> That's the proper direction to address the locking concerns.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname
  2020-06-02  3:04                                   ` Wenbo Zhang
@ 2020-06-02  8:14                                     ` Jiri Olsa
  0 siblings, 0 replies; 52+ messages in thread
From: Jiri Olsa @ 2020-06-02  8:14 UTC (permalink / raw)
  To: Wenbo Zhang
  Cc: Alexei Starovoitov, Daniel Borkmann, Al Viro, Brendan Gregg, bpf,
	Alexei Starovoitov, Yonghong Song, Andrii Nakryiko

On Tue, Jun 02, 2020 at 11:04:02AM +0800, Wenbo Zhang wrote:
> Got it, I'll look up Jiri's patches to see how to use that. Thanks.

I'll cc you in the next post

jirka

> 
> On Tue, Jun 2, 2020 at 12:38 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> >
> > On Mon, Jun 1, 2020 at 7:17 AM Wenbo Zhang <ethercflow@gmail.com> wrote:
> > >
> > > Hi Daniel,
> > >
> > > I see that the current state of this patch on patchwork is Awaiting Upstream:
> > > https://patchwork.ozlabs.org/project/netdev/patch/7464919bd9c15f2496ca29dceb6a4048b3199774.1576629200.git.ethercflow@gmail.com/
> > > I don't know much about this state. Will this patch be merged?
> >
> > This one won't be merged.
> > Jiri has sent patches based on a whitelist approach.
> > That's the proper direction to address the locking concerns.
> 


^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2020-06-02  8:15 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-19 13:27 [PATCH bpf-next v10 0/2] bpf: adding get_file_path helper Wenbo Zhang
2019-11-19 13:27 ` [PATCH bpf-next v10 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
2019-11-23  3:18   ` Alexei Starovoitov
2019-11-23  4:43     ` Al Viro
2019-11-23  4:51     ` Al Viro
2019-11-23  5:19       ` Alexei Starovoitov
2019-11-23  5:35         ` Al Viro
2019-11-23  6:04           ` Alexei Starovoitov
2019-12-13 19:51             ` Brendan Gregg
2019-12-05  4:20   ` [PATCH bpf-next v11 0/2] bpf: adding get_file_path helper Wenbo Zhang
2019-12-05  4:20     ` [PATCH bpf-next v11 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
2019-12-05  7:19       ` Alexei Starovoitov
2019-12-05  9:47         ` Wenbo Zhang
2019-12-15  4:01       ` [PATCH bpf-next v12 0/2] bpf: adding get_file_path helper Wenbo Zhang
2019-12-15  4:01         ` [PATCH bpf-next v12 1/2] bpf: add new helper get_file_path for mapping a file descriptor to a pathname Wenbo Zhang
2019-12-15 16:05           ` Yonghong Song
2019-12-17  6:26             ` Wenbo Zhang
2019-12-17  6:33               ` Yonghong Song
2019-12-15 16:10           ` Yonghong Song
2019-12-17  6:27             ` Wenbo Zhang
2019-12-16 22:09           ` Brendan Gregg
2019-12-17  4:05             ` Wenbo Zhang
2019-12-17  9:47           ` [PATCH bpf-next v13 0/2] bpf: adding get_fd_path helper Wenbo Zhang
2019-12-17  9:47             ` [PATCH bpf-next v13 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname Wenbo Zhang
2019-12-17 16:29               ` Yonghong Song
2019-12-17 19:39                 ` Daniel Borkmann
2019-12-18  0:11                   ` Wenbo Zhang
2019-12-18  0:06                 ` Wenbo Zhang
2019-12-18  0:56               ` [PATCH bpf-next v14 0/2] bpf: adding get_fd_path helper Wenbo Zhang
2019-12-18  0:56                 ` [PATCH bpf-next v14 1/2] bpf: add new helper get_fd_path for mapping a file descriptor to a pathname Wenbo Zhang
2019-12-18  3:27                   ` Yonghong Song
2019-12-19 16:14                   ` Daniel Borkmann
2019-12-20  3:35                     ` Wenbo Zhang
2020-01-16  8:59                       ` Jiri Olsa
2020-02-10  4:43                         ` Brendan Gregg
2020-02-11  0:01                           ` Daniel Borkmann
2020-02-12 15:21                             ` Jiri Olsa
2020-06-01 14:17                               ` Wenbo Zhang
2020-06-01 16:38                                 ` Alexei Starovoitov
2020-06-02  3:04                                   ` Wenbo Zhang
2020-06-02  8:14                                     ` Jiri Olsa
2019-12-18  0:56                 ` [PATCH bpf-next v14 2/2] selftests/bpf: test for bpf_get_fd_path() from tracepoint Wenbo Zhang
2019-12-18  3:27                   ` Yonghong Song
2019-12-17  9:47             ` [PATCH bpf-next v13 " Wenbo Zhang
2019-12-17 16:32               ` Yonghong Song
2019-12-15  4:01         ` [PATCH bpf-next v12 2/2] selftests/bpf: test for bpf_get_file_path() " Wenbo Zhang
2019-12-15 16:24           ` Yonghong Song
2019-12-17  4:01             ` Wenbo Zhang
2019-12-17  4:13               ` Yonghong Song
2019-12-17  9:44                 ` Wenbo Zhang
2019-12-05  4:20     ` [PATCH bpf-next v11 " Wenbo Zhang
2019-11-19 13:27 ` [PATCH bpf-next v10 " Wenbo Zhang
