[PATCH bpf-next 00/10] bpf_link observability APIs

bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH bpf-next 00/10] bpf_link observability APIs
@ 2020-04-24  5:34 Andrii Nakryiko
  2020-04-24  5:34 ` [PATCH bpf-next 01/10] bpf: refactor bpf_link update handling Andrii Nakryiko
                   ` (10 more replies)
  0 siblings, 11 replies; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-24  5:34 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

This patch series adds various observability APIs to bpf_link:
  - each bpf_link now gets ID, similar to bpf_map and bpf_prog, by which
    user-space can iterate over all existing bpf_links and create limited FD
    from ID;
  - allows to get extra object information with bpf_link general and
    type-specific information;
  - implements `bpf link show` command which lists all active bpf_links in the
    system;
  - implements `bpf link pin` allowing to pin bpf_link by ID or from other
    pinned path.

rfc->v1:
  - dropped read-only bpf_links (Alexei);
  - fixed bug in bpf_link_cleanup() not removing ID;
  - fixed bpftool link pinning search logic;
  - added bash-completion and man page.

Andrii Nakryiko (10):
  bpf: refactor bpf_link update handling
  bpf: allocate ID for bpf_link
  bpf: support GET_FD_BY_ID and GET_NEXT_ID for bpf_link
  bpf: add support for BPF_OBJ_GET_INFO_BY_FD for bpf_link
  libbpf: add low-level APIs for new bpf_link commands
  selftests/bpf: test bpf_link's get_next_id, get_fd_by_id, and
    get_obj_info
  bpftool: expose attach_type-to-string array to non-cgroup code
  bpftool: add bpf_link show and pin support
  bpftool: add bpftool-link manpage
  bpftool: add link bash completions

 include/linux/bpf-cgroup.h                    |  14 -
 include/linux/bpf.h                           |  28 +-
 include/linux/bpf_types.h                     |   6 +
 include/uapi/linux/bpf.h                      |  31 ++
 kernel/bpf/btf.c                              |   2 +
 kernel/bpf/cgroup.c                           |  89 +++-
 kernel/bpf/syscall.c                          | 363 +++++++++++++---
 kernel/bpf/verifier.c                         |   2 +
 kernel/cgroup/cgroup.c                        |  27 --
 .../bpftool/Documentation/bpftool-link.rst    | 119 ++++++
 tools/bpf/bpftool/bash-completion/bpftool     |  39 ++
 tools/bpf/bpftool/cgroup.c                    |  28 +-
 tools/bpf/bpftool/common.c                    |   2 +
 tools/bpf/bpftool/link.c                      | 402 ++++++++++++++++++
 tools/bpf/bpftool/main.c                      |   6 +-
 tools/bpf/bpftool/main.h                      |  37 ++
 tools/include/uapi/linux/bpf.h                |  31 ++
 tools/lib/bpf/bpf.c                           |  19 +-
 tools/lib/bpf/bpf.h                           |   4 +-
 tools/lib/bpf/libbpf.map                      |   6 +
 .../selftests/bpf/prog_tests/bpf_obj_id.c     | 110 ++++-
 .../testing/selftests/bpf/progs/test_obj_id.c |  14 +-
 22 files changed, 1203 insertions(+), 176 deletions(-)
 create mode 100644 tools/bpf/bpftool/Documentation/bpftool-link.rst
 create mode 100644 tools/bpf/bpftool/link.c

-- 
2.24.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 01/10] bpf: refactor bpf_link update handling
  2020-04-24  5:34 [PATCH bpf-next 00/10] bpf_link observability APIs Andrii Nakryiko
@ 2020-04-24  5:34 ` Andrii Nakryiko
  2020-04-24  5:34 ` [PATCH bpf-next 02/10] bpf: allocate ID for bpf_link Andrii Nakryiko
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-24  5:34 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Make bpf_link update support more generic by making it into another
bpf_link_ops methods. This allows generic syscall handling code to be agnostic
to various conditionally compiled features (e.g., the case of
CONFIG_CGROUP_BPF). This also allows to keep link type-specific code to remain
static within respective code base. Refactor existing bpf_cgroup_link code and
take advantage of this.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 include/linux/bpf-cgroup.h | 12 ------------
 include/linux/bpf.h        |  3 ++-
 kernel/bpf/cgroup.c        | 30 ++++++++++++++++++++++++++++--
 kernel/bpf/syscall.c       | 11 ++++-------
 kernel/cgroup/cgroup.c     | 27 ---------------------------
 5 files changed, 34 insertions(+), 49 deletions(-)

diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index c11b413d5b1a..d2d969669564 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -100,8 +100,6 @@ int __cgroup_bpf_attach(struct cgroup *cgrp,
 int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 			struct bpf_cgroup_link *link,
 			enum bpf_attach_type type);
-int __cgroup_bpf_replace(struct cgroup *cgrp, struct bpf_cgroup_link *link,
-			 struct bpf_prog *new_prog);
 int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 		       union bpf_attr __user *uattr);
 
@@ -112,8 +110,6 @@ int cgroup_bpf_attach(struct cgroup *cgrp,
 		      u32 flags);
 int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 		      enum bpf_attach_type type);
-int cgroup_bpf_replace(struct bpf_link *link, struct bpf_prog *old_prog,
-		       struct bpf_prog *new_prog);
 int cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 		     union bpf_attr __user *uattr);
 
@@ -354,7 +350,6 @@ int cgroup_bpf_prog_query(const union bpf_attr *attr,
 #else
 
 struct bpf_prog;
-struct bpf_link;
 struct cgroup_bpf {};
 static inline int cgroup_bpf_inherit(struct cgroup *cgrp) { return 0; }
 static inline void cgroup_bpf_offline(struct cgroup *cgrp) {}
@@ -378,13 +373,6 @@ static inline int cgroup_bpf_link_attach(const union bpf_attr *attr,
 	return -EINVAL;
 }
 
-static inline int cgroup_bpf_replace(struct bpf_link *link,
-				     struct bpf_prog *old_prog,
-				     struct bpf_prog *new_prog)
-{
-	return -EINVAL;
-}
-
 static inline int cgroup_bpf_prog_query(const union bpf_attr *attr,
 					union bpf_attr __user *uattr)
 {
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 25da6ff2a880..20bcac258d1f 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1093,7 +1093,8 @@ struct bpf_link {
 struct bpf_link_ops {
 	void (*release)(struct bpf_link *link);
 	void (*dealloc)(struct bpf_link *link);
-
+	int (*update_prog)(struct bpf_link *link, struct bpf_prog *new_prog,
+			   struct bpf_prog *old_prog);
 };
 
 void bpf_link_init(struct bpf_link *link, const struct bpf_link_ops *ops,
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 4d748c5785bc..27a223bcc44c 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -557,8 +557,9 @@ static void replace_effective_prog(struct cgroup *cgrp,
  *
  * Must be called with cgroup_mutex held.
  */
-int __cgroup_bpf_replace(struct cgroup *cgrp, struct bpf_cgroup_link *link,
-			 struct bpf_prog *new_prog)
+static int __cgroup_bpf_replace(struct cgroup *cgrp,
+				struct bpf_cgroup_link *link,
+				struct bpf_prog *new_prog)
 {
 	struct list_head *progs = &cgrp->bpf.progs[link->type];
 	struct bpf_prog *old_prog;
@@ -583,6 +584,30 @@ int __cgroup_bpf_replace(struct cgroup *cgrp, struct bpf_cgroup_link *link,
 	return 0;
 }
 
+static int cgroup_bpf_replace(struct bpf_link *link, struct bpf_prog *new_prog,
+			      struct bpf_prog *old_prog)
+{
+	struct bpf_cgroup_link *cg_link;
+	int ret;
+
+	cg_link = container_of(link, struct bpf_cgroup_link, link);
+
+	mutex_lock(&cgroup_mutex);
+	/* link might have been auto-released by dying cgroup, so fail */
+	if (!cg_link->cgroup) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+	if (old_prog && link->prog != old_prog) {
+		ret = -EPERM;
+		goto out_unlock;
+	}
+	ret = __cgroup_bpf_replace(cg_link->cgroup, cg_link, new_prog);
+out_unlock:
+	mutex_unlock(&cgroup_mutex);
+	return ret;
+}
+
 static struct bpf_prog_list *find_detach_entry(struct list_head *progs,
 					       struct bpf_prog *prog,
 					       struct bpf_cgroup_link *link,
@@ -811,6 +836,7 @@ static void bpf_cgroup_link_dealloc(struct bpf_link *link)
 const struct bpf_link_ops bpf_cgroup_link_lops = {
 	.release = bpf_cgroup_link_release,
 	.dealloc = bpf_cgroup_link_dealloc,
+	.update_prog = cgroup_bpf_replace,
 };
 
 int cgroup_bpf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index d85f37239540..84518853648a 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3640,13 +3640,10 @@ static int link_update(union bpf_attr *attr)
 		}
 	}
 
-#ifdef CONFIG_CGROUP_BPF
-	if (link->ops == &bpf_cgroup_link_lops) {
-		ret = cgroup_bpf_replace(link, old_prog, new_prog);
-		goto out_put_progs;
-	}
-#endif
-	ret = -EINVAL;
+	if (link->ops->update_prog)
+		ret = link->ops->update_prog(link, new_prog, old_prog);
+	else
+		ret = EINVAL;
 
 out_put_progs:
 	if (old_prog)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 06b5ea9d899d..557a9b9d2244 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6508,33 +6508,6 @@ int cgroup_bpf_attach(struct cgroup *cgrp,
 	return ret;
 }
 
-int cgroup_bpf_replace(struct bpf_link *link, struct bpf_prog *old_prog,
-		       struct bpf_prog *new_prog)
-{
-	struct bpf_cgroup_link *cg_link;
-	int ret;
-
-	if (link->ops != &bpf_cgroup_link_lops)
-		return -EINVAL;
-
-	cg_link = container_of(link, struct bpf_cgroup_link, link);
-
-	mutex_lock(&cgroup_mutex);
-	/* link might have been auto-released by dying cgroup, so fail */
-	if (!cg_link->cgroup) {
-		ret = -EINVAL;
-		goto out_unlock;
-	}
-	if (old_prog && link->prog != old_prog) {
-		ret = -EPERM;
-		goto out_unlock;
-	}
-	ret = __cgroup_bpf_replace(cg_link->cgroup, cg_link, new_prog);
-out_unlock:
-	mutex_unlock(&cgroup_mutex);
-	return ret;
-}
-
 int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 		      enum bpf_attach_type type)
 {
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 02/10] bpf: allocate ID for bpf_link
  2020-04-24  5:34 [PATCH bpf-next 00/10] bpf_link observability APIs Andrii Nakryiko
  2020-04-24  5:34 ` [PATCH bpf-next 01/10] bpf: refactor bpf_link update handling Andrii Nakryiko
@ 2020-04-24  5:34 ` Andrii Nakryiko
  2020-04-24  5:34 ` [PATCH bpf-next 03/10] bpf: support GET_FD_BY_ID and GET_NEXT_ID " Andrii Nakryiko
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-24  5:34 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Generate ID for each bpf_link using IDR, similarly to bpf_map and bpf_prog.
bpf_link creation, initialization, attachment, and exposing to user-space
through FD and ID is a complicated multi-step process, abstract it away
through bpf_link_primer and bpf_link_prime(), bpf_link_settle(), and
bpf_link_cleanup() internal API. They guarantee that until bpf_link is
properly attached, user-space won't be able to access partially-initialized
bpf_link either from FD or ID. All this allows to simplify bpf_link attachment
and error handling code.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 include/linux/bpf.h      |  17 +++--
 include/uapi/linux/bpf.h |   1 +
 kernel/bpf/cgroup.c      |  14 ++--
 kernel/bpf/syscall.c     | 143 +++++++++++++++++++++++++++------------
 4 files changed, 118 insertions(+), 57 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 20bcac258d1f..caea78cadc73 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1085,11 +1085,19 @@ int bpf_prog_new_fd(struct bpf_prog *prog);
 
 struct bpf_link {
 	atomic64_t refcnt;
+	u32 id;
 	const struct bpf_link_ops *ops;
 	struct bpf_prog *prog;
 	struct work_struct work;
 };
 
+struct bpf_link_primer {
+	struct bpf_link *link;
+	struct file *file;
+	int fd;
+	u32 id;
+};
+
 struct bpf_link_ops {
 	void (*release)(struct bpf_link *link);
 	void (*dealloc)(struct bpf_link *link);
@@ -1097,10 +1105,11 @@ struct bpf_link_ops {
 			   struct bpf_prog *old_prog);
 };
 
-void bpf_link_init(struct bpf_link *link, const struct bpf_link_ops *ops,
-		   struct bpf_prog *prog);
-void bpf_link_cleanup(struct bpf_link *link, struct file *link_file,
-		      int link_fd);
+void bpf_link_init(struct bpf_link *link,
+		   const struct bpf_link_ops *ops, struct bpf_prog *prog);
+int bpf_link_prime(struct bpf_link *link, struct bpf_link_primer *primer);
+int bpf_link_settle(struct bpf_link_primer *primer);
+void bpf_link_cleanup(struct bpf_link_primer *primer);
 void bpf_link_inc(struct bpf_link *link);
 void bpf_link_put(struct bpf_link *link);
 int bpf_link_new_fd(struct bpf_link *link);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 2e29a671d67e..eccfd1dea951 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -523,6 +523,7 @@ union bpf_attr {
 			__u32		prog_id;
 			__u32		map_id;
 			__u32		btf_id;
+			__u32		link_id;
 		};
 		__u32		next_id;
 		__u32		open_flags;
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 27a223bcc44c..41a03bb84eb8 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -841,10 +841,10 @@ const struct bpf_link_ops bpf_cgroup_link_lops = {
 
 int cgroup_bpf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
 {
+	struct bpf_link_primer link_primer;
 	struct bpf_cgroup_link *link;
-	struct file *link_file;
 	struct cgroup *cgrp;
-	int err, link_fd;
+	int err;
 
 	if (attr->link_create.flags)
 		return -EINVAL;
@@ -862,22 +862,20 @@ int cgroup_bpf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
 	link->cgroup = cgrp;
 	link->type = attr->link_create.attach_type;
 
-	link_file = bpf_link_new_file(&link->link, &link_fd);
-	if (IS_ERR(link_file)) {
+	err  = bpf_link_prime(&link->link, &link_primer);
+	if (err) {
 		kfree(link);
-		err = PTR_ERR(link_file);
 		goto out_put_cgroup;
 	}
 
 	err = cgroup_bpf_attach(cgrp, NULL, NULL, link, link->type,
 				BPF_F_ALLOW_MULTI);
 	if (err) {
-		bpf_link_cleanup(&link->link, link_file, link_fd);
+		bpf_link_cleanup(&link_primer);
 		goto out_put_cgroup;
 	}
 
-	fd_install(link_fd, link_file);
-	return link_fd;
+	return bpf_link_settle(&link_primer);
 
 out_put_cgroup:
 	cgroup_put(cgrp);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 84518853648a..66781b3cac59 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -42,6 +42,8 @@ static DEFINE_IDR(prog_idr);
 static DEFINE_SPINLOCK(prog_idr_lock);
 static DEFINE_IDR(map_idr);
 static DEFINE_SPINLOCK(map_idr_lock);
+static DEFINE_IDR(link_idr);
+static DEFINE_SPINLOCK(link_idr_lock);
 
 int sysctl_unprivileged_bpf_disabled __read_mostly;
 
@@ -2181,25 +2183,40 @@ static int bpf_obj_get(const union bpf_attr *attr)
 				attr->file_flags);
 }
 
-void bpf_link_init(struct bpf_link *link, const struct bpf_link_ops *ops,
-		   struct bpf_prog *prog)
+void bpf_link_init(struct bpf_link *link,
+		   const struct bpf_link_ops *ops, struct bpf_prog *prog)
 {
 	atomic64_set(&link->refcnt, 1);
+	link->id = 0;
 	link->ops = ops;
 	link->prog = prog;
 }
 
+static void bpf_link_free_id(int id)
+{
+	unsigned long flags;
+
+	if (!id)
+		return;
+
+	spin_lock_irqsave(&link_idr_lock, flags);
+	idr_remove(&link_idr, id);
+	spin_unlock_irqrestore(&link_idr_lock, flags);
+}
+
 /* Clean up bpf_link and corresponding anon_inode file and FD. After
  * anon_inode is created, bpf_link can't be just kfree()'d due to deferred
- * anon_inode's release() call. This helper manages marking bpf_link as
- * defunct, releases anon_inode file and puts reserved FD.
+ * anon_inode's release() call. This helper marksbpf_link as
+ * defunct, releases anon_inode file and puts reserved FD. bpf_prog's refcnt
+ * is not decremented, it's the responsibility of a calling code that failed
+ * to complete bpf_link initialization.
  */
-void bpf_link_cleanup(struct bpf_link *link, struct file *link_file,
-		      int link_fd)
+void bpf_link_cleanup(struct bpf_link_primer *primer)
 {
-	link->prog = NULL;
-	fput(link_file);
-	put_unused_fd(link_fd);
+	primer->link->prog = NULL;
+	bpf_link_free_id(primer->id);
+	fput(primer->file);
+	put_unused_fd(primer->fd);
 }
 
 void bpf_link_inc(struct bpf_link *link)
@@ -2210,6 +2227,7 @@ void bpf_link_inc(struct bpf_link *link)
 /* bpf_link_free is guaranteed to be called from process context */
 static void bpf_link_free(struct bpf_link *link)
 {
+	bpf_link_free_id(link->id);
 	if (link->prog) {
 		/* detach BPF program, clean up used resources */
 		link->ops->release(link);
@@ -2275,9 +2293,11 @@ static void bpf_link_show_fdinfo(struct seq_file *m, struct file *filp)
 	bin2hex(prog_tag, prog->tag, sizeof(prog->tag));
 	seq_printf(m,
 		   "link_type:\t%s\n"
+		   "link_id:\t%u\n"
 		   "prog_tag:\t%s\n"
 		   "prog_id:\t%u\n",
 		   link_type,
+		   link->id,
 		   prog_tag,
 		   prog->aux->id);
 }
@@ -2292,36 +2312,74 @@ const struct file_operations bpf_link_fops = {
 	.write		= bpf_dummy_write,
 };
 
-int bpf_link_new_fd(struct bpf_link *link)
+static int bpf_link_alloc_id(struct bpf_link *link)
 {
-	return anon_inode_getfd("bpf-link", &bpf_link_fops, link, O_CLOEXEC);
-}
+	int id;
+
+	idr_preload(GFP_KERNEL);
+	spin_lock_bh(&link_idr_lock);
+	id = idr_alloc_cyclic(&link_idr, link, 1, INT_MAX, GFP_ATOMIC);
+	spin_unlock_bh(&link_idr_lock);
+	idr_preload_end();
 
-/* Similar to bpf_link_new_fd, create anon_inode for given bpf_link, but
- * instead of immediately installing fd in fdtable, just reserve it and
- * return. Caller then need to either install it with fd_install(fd, file) or
- * release with put_unused_fd(fd).
- * This is useful for cases when bpf_link attachment/detachment are
- * complicated and expensive operations and should be delayed until all the fd
- * reservation and anon_inode creation succeeds.
+	return id;
+}
+
+/* Prepare bpf_link to be exposed to user-space by allocating anon_inode file,
+ * reserving unused FD and allocating ID from link_idr. This is to be paired
+ * with bpf_link_settle() to install FD and ID and expose bpf_link to
+ * user-space, if bpf_link is successfully attached. If not, bpf_link and
+ * pre-allocated resources are to be freed with bpf_cleanup() call. All the
+ * transient state is passed around in struct bpf_link_primer.
+ * This is preferred way to create and initialize bpf_link, especially when
+ * there are complicated and expensive operations inbetween creating bpf_link
+ * itself and attaching it to BPF hook. By using bpf_link_prime() and
+ * bpf_link_settle() kernel code using bpf_link doesn't have to perform
+ * expensive (and potentially failing) roll back operations in a rare case
+ * that file, FD, or ID can't be allocated.
  */
-struct file *bpf_link_new_file(struct bpf_link *link, int *reserved_fd)
+int bpf_link_prime(struct bpf_link *link, struct bpf_link_primer *primer)
 {
 	struct file *file;
-	int fd;
+	int fd, id;
 
 	fd = get_unused_fd_flags(O_CLOEXEC);
 	if (fd < 0)
-		return ERR_PTR(fd);
+		return fd;
 
 	file = anon_inode_getfile("bpf_link", &bpf_link_fops, link, O_CLOEXEC);
 	if (IS_ERR(file)) {
 		put_unused_fd(fd);
-		return file;
+		return PTR_ERR(file);
 	}
 
-	*reserved_fd = fd;
-	return file;
+	id = bpf_link_alloc_id(link);
+	if (id < 0) {
+		put_unused_fd(fd);
+		fput(file);
+		return id;
+	}
+
+	primer->link = link;
+	primer->file = file;
+	primer->fd = fd;
+	primer->id = id;
+	return 0;
+}
+
+int bpf_link_settle(struct bpf_link_primer *primer)
+{
+	/* make bpf_link fetchable by ID */
+	WRITE_ONCE(primer->link->id, primer->id);
+	/* make bpf_link fetchable by FD */
+	fd_install(primer->fd, primer->file);
+	/* pass through installed FD */
+	return primer->fd;
+}
+
+int bpf_link_new_fd(struct bpf_link *link)
+{
+	return anon_inode_getfd("bpf-link", &bpf_link_fops, link, O_CLOEXEC);
 }
 
 struct bpf_link *bpf_link_get_from_fd(u32 ufd)
@@ -2367,9 +2425,9 @@ static const struct bpf_link_ops bpf_tracing_link_lops = {
 
 static int bpf_tracing_prog_attach(struct bpf_prog *prog)
 {
+	struct bpf_link_primer link_primer;
 	struct bpf_tracing_link *link;
-	struct file *link_file;
-	int link_fd, err;
+	int err;
 
 	switch (prog->type) {
 	case BPF_PROG_TYPE_TRACING:
@@ -2404,22 +2462,19 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog)
 	}
 	bpf_link_init(&link->link, &bpf_tracing_link_lops, prog);
 
-	link_file = bpf_link_new_file(&link->link, &link_fd);
-	if (IS_ERR(link_file)) {
+	err = bpf_link_prime(&link->link, &link_primer);
+	if (err) {
 		kfree(link);
-		err = PTR_ERR(link_file);
 		goto out_put_prog;
 	}
 
 	err = bpf_trampoline_link_prog(prog);
 	if (err) {
-		bpf_link_cleanup(&link->link, link_file, link_fd);
+		bpf_link_cleanup(&link_primer);
 		goto out_put_prog;
 	}
 
-	fd_install(link_fd, link_file);
-	return link_fd;
-
+	return bpf_link_settle(&link_primer);
 out_put_prog:
 	bpf_prog_put(prog);
 	return err;
@@ -2447,7 +2502,7 @@ static void bpf_raw_tp_link_dealloc(struct bpf_link *link)
 	kfree(raw_tp);
 }
 
-static const struct bpf_link_ops bpf_raw_tp_lops = {
+static const struct bpf_link_ops bpf_raw_tp_link_lops = {
 	.release = bpf_raw_tp_link_release,
 	.dealloc = bpf_raw_tp_link_dealloc,
 };
@@ -2456,13 +2511,13 @@ static const struct bpf_link_ops bpf_raw_tp_lops = {
 
 static int bpf_raw_tracepoint_open(const union bpf_attr *attr)
 {
+	struct bpf_link_primer link_primer;
 	struct bpf_raw_tp_link *link;
 	struct bpf_raw_event_map *btp;
-	struct file *link_file;
 	struct bpf_prog *prog;
 	const char *tp_name;
 	char buf[128];
-	int link_fd, err;
+	int err;
 
 	if (CHECK_ATTR(BPF_RAW_TRACEPOINT_OPEN))
 		return -EINVAL;
@@ -2515,24 +2570,22 @@ static int bpf_raw_tracepoint_open(const union bpf_attr *attr)
 		err = -ENOMEM;
 		goto out_put_btp;
 	}
-	bpf_link_init(&link->link, &bpf_raw_tp_lops, prog);
+	bpf_link_init(&link->link, &bpf_raw_tp_link_lops, prog);
 	link->btp = btp;
 
-	link_file = bpf_link_new_file(&link->link, &link_fd);
-	if (IS_ERR(link_file)) {
+	err = bpf_link_prime(&link->link, &link_primer);
+	if (err) {
 		kfree(link);
-		err = PTR_ERR(link_file);
 		goto out_put_btp;
 	}
 
 	err = bpf_probe_register(link->btp, prog);
 	if (err) {
-		bpf_link_cleanup(&link->link, link_file, link_fd);
+		bpf_link_cleanup(&link_primer);
 		goto out_put_btp;
 	}
 
-	fd_install(link_fd, link_file);
-	return link_fd;
+	return bpf_link_settle(&link_primer);
 
 out_put_btp:
 	bpf_put_raw_tracepoint(btp);
@@ -3464,7 +3517,7 @@ static int bpf_task_fd_query(const union bpf_attr *attr,
 	if (file->f_op == &bpf_link_fops) {
 		struct bpf_link *link = file->private_data;
 
-		if (link->ops == &bpf_raw_tp_lops) {
+		if (link->ops == &bpf_raw_tp_link_lops) {
 			struct bpf_raw_tp_link *raw_tp =
 				container_of(link, struct bpf_raw_tp_link, link);
 			struct bpf_raw_event_map *btp = raw_tp->btp;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 03/10] bpf: support GET_FD_BY_ID and GET_NEXT_ID for bpf_link
  2020-04-24  5:34 [PATCH bpf-next 00/10] bpf_link observability APIs Andrii Nakryiko
  2020-04-24  5:34 ` [PATCH bpf-next 01/10] bpf: refactor bpf_link update handling Andrii Nakryiko
  2020-04-24  5:34 ` [PATCH bpf-next 02/10] bpf: allocate ID for bpf_link Andrii Nakryiko
@ 2020-04-24  5:34 ` Andrii Nakryiko
  2020-04-24  5:34 ` [PATCH bpf-next 04/10] bpf: add support for BPF_OBJ_GET_INFO_BY_FD " Andrii Nakryiko
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-24  5:34 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Add support to look up bpf_link by ID and iterate over all existing bpf_links
in the system. GET_FD_BY_ID code handles not-yet-ready bpf_link by checking
that its ID hasn't been set to non-zero value yet. Setting bpf_link's ID is
done as the very last step in finalizing bpf_link, together with installing
FD. This approach allows users of bpf_link in kernel code to not worry about
races between user-space and kernel code that hasn't finished attaching and
initializing bpf_link.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 include/uapi/linux/bpf.h |  2 ++
 kernel/bpf/syscall.c     | 49 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index eccfd1dea951..407c086bc9e4 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -113,6 +113,8 @@ enum bpf_cmd {
 	BPF_MAP_DELETE_BATCH,
 	BPF_LINK_CREATE,
 	BPF_LINK_UPDATE,
+	BPF_LINK_GET_FD_BY_ID,
+	BPF_LINK_GET_NEXT_ID,
 };
 
 enum bpf_map_type {
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 66781b3cac59..ac352e7fd37e 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3706,6 +3706,48 @@ static int link_update(union bpf_attr *attr)
 	return ret;
 }
 
+static int bpf_link_inc_not_zero(struct bpf_link *link)
+{
+	return atomic64_fetch_add_unless(&link->refcnt, 1, 0) ? 0 : -ENOENT;
+}
+
+#define BPF_LINK_GET_FD_BY_ID_LAST_FIELD link_id
+
+static int bpf_link_get_fd_by_id(const union bpf_attr *attr)
+{
+	struct bpf_link *link;
+	u32 id = attr->link_id;
+	int fd, err;
+
+	if (CHECK_ATTR(BPF_LINK_GET_FD_BY_ID))
+		return -EINVAL;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	spin_lock_bh(&link_idr_lock);
+	link = idr_find(&link_idr, id);
+	/* before link is "settled", ID is 0, pretend it doesn't exist yet */
+	if (link) {
+		if (link->id)
+			err = bpf_link_inc_not_zero(link);
+		else
+			err = -EAGAIN;
+	} else {
+		err = -ENOENT;
+	}
+	spin_unlock_bh(&link_idr_lock);
+
+	if (err)
+		return err;
+
+	fd = bpf_link_new_fd(link);
+	if (fd < 0)
+		bpf_link_put(link);
+
+	return fd;
+}
+
 SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
 {
 	union bpf_attr attr;
@@ -3823,6 +3865,13 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	case BPF_LINK_UPDATE:
 		err = link_update(&attr);
 		break;
+	case BPF_LINK_GET_FD_BY_ID:
+		err = bpf_link_get_fd_by_id(&attr);
+		break;
+	case BPF_LINK_GET_NEXT_ID:
+		err = bpf_obj_get_next_id(&attr, uattr,
+					  &link_idr, &link_idr_lock);
+		break;
 	default:
 		err = -EINVAL;
 		break;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 04/10] bpf: add support for BPF_OBJ_GET_INFO_BY_FD for bpf_link
  2020-04-24  5:34 [PATCH bpf-next 00/10] bpf_link observability APIs Andrii Nakryiko
                   ` (2 preceding siblings ...)
  2020-04-24  5:34 ` [PATCH bpf-next 03/10] bpf: support GET_FD_BY_ID and GET_NEXT_ID " Andrii Nakryiko
@ 2020-04-24  5:34 ` Andrii Nakryiko
  2020-04-24  5:35 ` [PATCH bpf-next 05/10] libbpf: add low-level APIs for new bpf_link commands Andrii Nakryiko
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-24  5:34 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Add ability to fetch bpf_link details through BPF_OBJ_GET_INFO_BY_FD command.
Also enhance show_fdinfo to potentially include bpf_link type-specific
information (similarly to obj_info).

Also introduce enum bpf_link_type stored in bpf_link itself and expose it in
UAPI. bpf_link_tracing also now will store and return bpf_attach_type.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 include/linux/bpf-cgroup.h     |   2 -
 include/linux/bpf.h            |  10 +-
 include/linux/bpf_types.h      |   6 ++
 include/uapi/linux/bpf.h       |  28 ++++++
 kernel/bpf/btf.c               |   2 +
 kernel/bpf/cgroup.c            |  45 ++++++++-
 kernel/bpf/syscall.c           | 164 +++++++++++++++++++++++++++++----
 kernel/bpf/verifier.c          |   2 +
 tools/include/uapi/linux/bpf.h |  31 +++++++
 9 files changed, 266 insertions(+), 24 deletions(-)

diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index d2d969669564..ab95824a1d99 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -57,8 +57,6 @@ struct bpf_cgroup_link {
 	enum bpf_attach_type type;
 };
 
-extern const struct bpf_link_ops bpf_cgroup_link_lops;
-
 struct bpf_prog_list {
 	struct list_head node;
 	struct bpf_prog *prog;
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index caea78cadc73..5ff97ee64065 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1026,9 +1026,11 @@ extern const struct file_operations bpf_prog_fops;
 	extern const struct bpf_verifier_ops _name ## _verifier_ops;
 #define BPF_MAP_TYPE(_id, _ops) \
 	extern const struct bpf_map_ops _ops;
+#define BPF_LINK_TYPE(_id, _name)
 #include <linux/bpf_types.h>
 #undef BPF_PROG_TYPE
 #undef BPF_MAP_TYPE
+#undef BPF_LINK_TYPE
 
 extern const struct bpf_prog_ops bpf_offload_prog_ops;
 extern const struct bpf_verifier_ops tc_cls_act_analyzer_ops;
@@ -1086,6 +1088,7 @@ int bpf_prog_new_fd(struct bpf_prog *prog);
 struct bpf_link {
 	atomic64_t refcnt;
 	u32 id;
+	enum bpf_link_type type;
 	const struct bpf_link_ops *ops;
 	struct bpf_prog *prog;
 	struct work_struct work;
@@ -1103,9 +1106,14 @@ struct bpf_link_ops {
 	void (*dealloc)(struct bpf_link *link);
 	int (*update_prog)(struct bpf_link *link, struct bpf_prog *new_prog,
 			   struct bpf_prog *old_prog);
+	void (*show_fdinfo)(const struct bpf_link *link, struct seq_file *seq);
+	int (*fill_link_info)(const struct bpf_link *link,
+			      struct bpf_link_info *info,
+			      const struct bpf_link_info *uinfo,
+			      u32 info_len);
 };
 
-void bpf_link_init(struct bpf_link *link,
+void bpf_link_init(struct bpf_link *link, enum bpf_link_type type,
 		   const struct bpf_link_ops *ops, struct bpf_prog *prog);
 int bpf_link_prime(struct bpf_link *link, struct bpf_link_primer *primer);
 int bpf_link_settle(struct bpf_link_primer *primer);
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index ba0c2d56f8a3..8345cdf553b8 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -118,3 +118,9 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_STACK, stack_map_ops)
 #if defined(CONFIG_BPF_JIT)
 BPF_MAP_TYPE(BPF_MAP_TYPE_STRUCT_OPS, bpf_struct_ops_map_ops)
 #endif
+
+BPF_LINK_TYPE(BPF_LINK_TYPE_RAW_TRACEPOINT, raw_tracepoint)
+BPF_LINK_TYPE(BPF_LINK_TYPE_TRACING, tracing)
+#ifdef CONFIG_CGROUP_BPF
+BPF_LINK_TYPE(BPF_LINK_TYPE_CGROUP, cgroup)
+#endif
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 407c086bc9e4..d2f269082a33 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -222,6 +222,15 @@ enum bpf_attach_type {
 
 #define MAX_BPF_ATTACH_TYPE __MAX_BPF_ATTACH_TYPE
 
+enum bpf_link_type {
+	BPF_LINK_TYPE_UNSPEC = 0,
+	BPF_LINK_TYPE_RAW_TRACEPOINT = 1,
+	BPF_LINK_TYPE_TRACING = 2,
+	BPF_LINK_TYPE_CGROUP = 3,
+
+	MAX_BPF_LINK_TYPE,
+};
+
 /* cgroup-bpf attach flags used in BPF_PROG_ATTACH command
  *
  * NONE(default): No further bpf programs allowed in the subtree.
@@ -3601,6 +3610,25 @@ struct bpf_btf_info {
 	__u32 id;
 } __attribute__((aligned(8)));
 
+struct bpf_link_info {
+	__u32 type;
+	__u32 id;
+	__u32 prog_id;
+	union {
+		struct {
+			__aligned_u64 tp_name; /* in/out: tp_name buffer ptr */
+			__u32 tp_name_len;     /* in/out: tp_name buffer len */
+		} raw_tracepoint;
+		struct {
+			__u32 attach_type;
+		} tracing;
+		struct {
+			__u64 cgroup_id;
+			__u32 attach_type;
+		} cgroup;
+	};
+} __attribute__((aligned(8)));
+
 /* User bpf_sock_addr struct to access socket fields and sockaddr struct passed
  * by user and intended to be used by socket (e.g. to bind to, depends on
  * attach attach type).
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index d65c6912bdaf..a2cfba89a8e1 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3482,6 +3482,7 @@ extern char __weak __stop_BTF[];
 extern struct btf *btf_vmlinux;
 
 #define BPF_MAP_TYPE(_id, _ops)
+#define BPF_LINK_TYPE(_id, _name)
 static union {
 	struct bpf_ctx_convert {
 #define BPF_PROG_TYPE(_id, _name, prog_ctx_type, kern_ctx_type) \
@@ -3508,6 +3509,7 @@ static u8 bpf_ctx_convert_map[] = {
 	0, /* avoid empty array */
 };
 #undef BPF_MAP_TYPE
+#undef BPF_LINK_TYPE
 
 static const struct btf_member *
 btf_get_prog_ctx_type(struct bpf_verifier_log *log, struct btf *btf,
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 41a03bb84eb8..b8392db7b901 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -833,10 +833,50 @@ static void bpf_cgroup_link_dealloc(struct bpf_link *link)
 	kfree(cg_link);
 }
 
-const struct bpf_link_ops bpf_cgroup_link_lops = {
+static void bpf_cgroup_link_show_fdinfo(const struct bpf_link *link,
+					struct seq_file *seq)
+{
+	struct bpf_cgroup_link *cg_link =
+		container_of(link, struct bpf_cgroup_link, link);
+	u64 cg_id = 0;
+
+	mutex_lock(&cgroup_mutex);
+	if (cg_link->cgroup)
+		cg_id = cgroup_id(cg_link->cgroup);
+	mutex_unlock(&cgroup_mutex);
+
+	seq_printf(seq,
+		   "cgroup_id:\t%llu\n"
+		   "attach_type:\t%d\n",
+		   cg_id,
+		   cg_link->type);
+}
+
+static int bpf_cgroup_link_fill_link_info(const struct bpf_link *link,
+					  struct bpf_link_info *info,
+					  const struct bpf_link_info *uinfo,
+					  u32 info_len)
+{
+	struct bpf_cgroup_link *cg_link =
+		container_of(link, struct bpf_cgroup_link, link);
+	u64 cg_id = 0;
+
+	mutex_lock(&cgroup_mutex);
+	if (cg_link->cgroup)
+		cg_id = cgroup_id(cg_link->cgroup);
+	mutex_unlock(&cgroup_mutex);
+
+	info->cgroup.cgroup_id = cg_id;
+	info->cgroup.attach_type = cg_link->type;
+	return 0;
+}
+
+static const struct bpf_link_ops bpf_cgroup_link_lops = {
 	.release = bpf_cgroup_link_release,
 	.dealloc = bpf_cgroup_link_dealloc,
 	.update_prog = cgroup_bpf_replace,
+	.show_fdinfo = bpf_cgroup_link_show_fdinfo,
+	.fill_link_info = bpf_cgroup_link_fill_link_info,
 };
 
 int cgroup_bpf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
@@ -858,7 +898,8 @@ int cgroup_bpf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
 		err = -ENOMEM;
 		goto out_put_cgroup;
 	}
-	bpf_link_init(&link->link, &bpf_cgroup_link_lops, prog);
+	bpf_link_init(&link->link, BPF_LINK_TYPE_CGROUP, &bpf_cgroup_link_lops,
+		      prog);
 	link->cgroup = cgrp;
 	link->type = attr->link_create.attach_type;
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index ac352e7fd37e..0d1975c58153 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -51,9 +51,11 @@ static const struct bpf_map_ops * const bpf_map_types[] = {
 #define BPF_PROG_TYPE(_id, _name, prog_ctx_type, kern_ctx_type)
 #define BPF_MAP_TYPE(_id, _ops) \
 	[_id] = &_ops,
+#define BPF_LINK_TYPE(_id, _name)
 #include <linux/bpf_types.h>
 #undef BPF_PROG_TYPE
 #undef BPF_MAP_TYPE
+#undef BPF_LINK_TYPE
 };
 
 /*
@@ -1548,9 +1550,11 @@ static const struct bpf_prog_ops * const bpf_prog_types[] = {
 #define BPF_PROG_TYPE(_id, _name, prog_ctx_type, kern_ctx_type) \
 	[_id] = & _name ## _prog_ops,
 #define BPF_MAP_TYPE(_id, _ops)
+#define BPF_LINK_TYPE(_id, _name)
 #include <linux/bpf_types.h>
 #undef BPF_PROG_TYPE
 #undef BPF_MAP_TYPE
+#undef BPF_LINK_TYPE
 };
 
 static int find_prog_type(enum bpf_prog_type type, struct bpf_prog *prog)
@@ -2183,10 +2187,11 @@ static int bpf_obj_get(const union bpf_attr *attr)
 				attr->file_flags);
 }
 
-void bpf_link_init(struct bpf_link *link,
+void bpf_link_init(struct bpf_link *link, enum bpf_link_type type,
 		   const struct bpf_link_ops *ops, struct bpf_prog *prog)
 {
 	atomic64_set(&link->refcnt, 1);
+	link->type = type;
 	link->id = 0;
 	link->ops = ops;
 	link->prog = prog;
@@ -2268,27 +2273,23 @@ static int bpf_link_release(struct inode *inode, struct file *filp)
 	return 0;
 }
 
-#ifdef CONFIG_PROC_FS
-static const struct bpf_link_ops bpf_raw_tp_lops;
-static const struct bpf_link_ops bpf_tracing_link_lops;
+#define BPF_PROG_TYPE(_id, _name, prog_ctx_type, kern_ctx_type)
+#define BPF_MAP_TYPE(_id, _ops)
+#define BPF_LINK_TYPE(_id, _name) [_id] = #_name,
+static const char *bpf_link_type_strs[] = {
+	[BPF_LINK_TYPE_UNSPEC] = "<invalid>",
+#include <linux/bpf_types.h>
+};
+#undef BPF_PROG_TYPE
+#undef BPF_MAP_TYPE
+#undef BPF_LINK_TYPE
 
+#ifdef CONFIG_PROC_FS
 static void bpf_link_show_fdinfo(struct seq_file *m, struct file *filp)
 {
 	const struct bpf_link *link = filp->private_data;
 	const struct bpf_prog *prog = link->prog;
 	char prog_tag[sizeof(prog->tag) * 2 + 1] = { };
-	const char *link_type;
-
-	if (link->ops == &bpf_raw_tp_lops)
-		link_type = "raw_tracepoint";
-	else if (link->ops == &bpf_tracing_link_lops)
-		link_type = "tracing";
-#ifdef CONFIG_CGROUP_BPF
-	else if (link->ops == &bpf_cgroup_link_lops)
-		link_type = "cgroup";
-#endif
-	else
-		link_type = "unknown";
 
 	bin2hex(prog_tag, prog->tag, sizeof(prog->tag));
 	seq_printf(m,
@@ -2296,10 +2297,12 @@ static void bpf_link_show_fdinfo(struct seq_file *m, struct file *filp)
 		   "link_id:\t%u\n"
 		   "prog_tag:\t%s\n"
 		   "prog_id:\t%u\n",
-		   link_type,
+		   bpf_link_type_strs[link->type],
 		   link->id,
 		   prog_tag,
 		   prog->aux->id);
+	if (link->ops->show_fdinfo)
+		link->ops->show_fdinfo(link, m);
 }
 #endif
 
@@ -2403,6 +2406,7 @@ struct bpf_link *bpf_link_get_from_fd(u32 ufd)
 
 struct bpf_tracing_link {
 	struct bpf_link link;
+	enum bpf_attach_type attach_type;
 };
 
 static void bpf_tracing_link_release(struct bpf_link *link)
@@ -2418,9 +2422,35 @@ static void bpf_tracing_link_dealloc(struct bpf_link *link)
 	kfree(tr_link);
 }
 
+static void bpf_tracing_link_show_fdinfo(const struct bpf_link *link,
+					 struct seq_file *seq)
+{
+	struct bpf_tracing_link *tr_link =
+		container_of(link, struct bpf_tracing_link, link);
+
+	seq_printf(seq,
+		   "attach_type:\t%d\n",
+		   tr_link->attach_type);
+}
+
+static int bpf_tracing_link_fill_link_info(const struct bpf_link *link,
+					   struct bpf_link_info *info,
+					   const struct bpf_link_info *uinfo,
+					   u32 info_len)
+{
+	struct bpf_tracing_link *tr_link =
+		container_of(link, struct bpf_tracing_link, link);
+
+	info->tracing.attach_type = tr_link->attach_type;
+
+	return 0;
+}
+
 static const struct bpf_link_ops bpf_tracing_link_lops = {
 	.release = bpf_tracing_link_release,
 	.dealloc = bpf_tracing_link_dealloc,
+	.show_fdinfo = bpf_tracing_link_show_fdinfo,
+	.fill_link_info = bpf_tracing_link_fill_link_info,
 };
 
 static int bpf_tracing_prog_attach(struct bpf_prog *prog)
@@ -2460,7 +2490,9 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog)
 		err = -ENOMEM;
 		goto out_put_prog;
 	}
-	bpf_link_init(&link->link, &bpf_tracing_link_lops, prog);
+	bpf_link_init(&link->link, BPF_LINK_TYPE_TRACING,
+		      &bpf_tracing_link_lops, prog);
+	link->attach_type = prog->expected_attach_type;
 
 	err = bpf_link_prime(&link->link, &link_primer);
 	if (err) {
@@ -2502,9 +2534,66 @@ static void bpf_raw_tp_link_dealloc(struct bpf_link *link)
 	kfree(raw_tp);
 }
 
+static void bpf_raw_tp_link_show_fdinfo(const struct bpf_link *link,
+					struct seq_file *seq)
+{
+	struct bpf_raw_tp_link *raw_tp_link =
+		container_of(link, struct bpf_raw_tp_link, link);
+
+	seq_printf(seq,
+		   "tp_name:\t%s\n",
+		   raw_tp_link->btp->tp->name);
+}
+
+static int bpf_raw_tp_link_fill_link_info(const struct bpf_link *link,
+					  struct bpf_link_info *info,
+					  const struct bpf_link_info *uinfo,
+					  u32 info_len)
+{
+	struct bpf_raw_tp_link *raw_tp_link =
+		container_of(link, struct bpf_raw_tp_link, link);
+	u64 ubuf_ptr;
+	char __user *ubuf = u64_to_user_ptr(uinfo->raw_tracepoint.tp_name);
+	const char *tp_name = raw_tp_link->btp->tp->name;
+	size_t tp_len;
+	u32 ulen;
+
+	if (get_user(ulen, &uinfo->raw_tracepoint.tp_name_len))
+		return -EFAULT;
+	if (get_user(ubuf_ptr, &uinfo->raw_tracepoint.tp_name))
+		return -EFAULT;
+	ubuf = u64_to_user_ptr(ubuf_ptr);
+
+	if (ulen && !ubuf)
+		return -EINVAL;
+	if (!ubuf)
+		return 0;
+
+	tp_len = strlen(raw_tp_link->btp->tp->name);
+	info->raw_tracepoint.tp_name_len = tp_len + 1;
+	info->raw_tracepoint.tp_name = (u64)(unsigned long)ubuf;
+
+	if (ulen >= tp_len + 1) {
+		if (copy_to_user(ubuf, tp_name, tp_len + 1))
+			return -EFAULT;
+	} else {
+		char zero = '\0';
+
+		if (copy_to_user(ubuf, tp_name, ulen - 1))
+			return -EFAULT;
+		if (put_user(zero, ubuf + ulen - 1))
+			return -EFAULT;
+		return -ENOSPC;
+	}
+
+	return 0;
+}
+
 static const struct bpf_link_ops bpf_raw_tp_link_lops = {
 	.release = bpf_raw_tp_link_release,
 	.dealloc = bpf_raw_tp_link_dealloc,
+	.show_fdinfo = bpf_raw_tp_link_show_fdinfo,
+	.fill_link_info = bpf_raw_tp_link_fill_link_info,
 };
 
 #define BPF_RAW_TRACEPOINT_OPEN_LAST_FIELD raw_tracepoint.prog_fd
@@ -2570,7 +2659,8 @@ static int bpf_raw_tracepoint_open(const union bpf_attr *attr)
 		err = -ENOMEM;
 		goto out_put_btp;
 	}
-	bpf_link_init(&link->link, &bpf_raw_tp_link_lops, prog);
+	bpf_link_init(&link->link, BPF_LINK_TYPE_RAW_TRACEPOINT,
+		      &bpf_raw_tp_link_lops, prog);
 	link->btp = btp;
 
 	err = bpf_link_prime(&link->link, &link_primer);
@@ -3366,6 +3456,39 @@ static int bpf_btf_get_info_by_fd(struct btf *btf,
 	return btf_get_info_by_fd(btf, attr, uattr);
 }
 
+static int bpf_link_get_info_by_fd(struct bpf_link *link,
+				  const union bpf_attr *attr,
+				  union bpf_attr __user *uattr)
+{
+	struct bpf_link_info __user *uinfo = u64_to_user_ptr(attr->info.info);
+	struct bpf_link_info info;
+	u32 info_len = attr->info.info_len;
+	int err;
+
+	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
+	if (err)
+		return err;
+	info_len = min_t(u32, sizeof(info), info_len);
+
+	memset(&info, 0, sizeof(info));
+	info.type = link->type;
+	info.id = link->id;
+	info.prog_id = link->prog->aux->id;
+
+	if (link->ops->fill_link_info) {
+		err = link->ops->fill_link_info(link, &info, uinfo, info_len);
+		if (err)
+			return err;
+	}
+
+	if (copy_to_user(uinfo, &info, info_len) ||
+	    put_user(info_len, &uattr->info.info_len))
+		return -EFAULT;
+
+	return 0;
+}
+
+
 #define BPF_OBJ_GET_INFO_BY_FD_LAST_FIELD info.info
 
 static int bpf_obj_get_info_by_fd(const union bpf_attr *attr,
@@ -3390,6 +3513,9 @@ static int bpf_obj_get_info_by_fd(const union bpf_attr *attr,
 					     uattr);
 	else if (f.file->f_op == &btf_fops)
 		err = bpf_btf_get_info_by_fd(f.file->private_data, attr, uattr);
+	else if (f.file->f_op == &bpf_link_fops)
+		err = bpf_link_get_info_by_fd(f.file->private_data,
+					      attr, uattr);
 	else
 		err = -EINVAL;
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 513d9c545176..04fec8809fa0 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -28,9 +28,11 @@ static const struct bpf_verifier_ops * const bpf_verifier_ops[] = {
 #define BPF_PROG_TYPE(_id, _name, prog_ctx_type, kern_ctx_type) \
 	[_id] = & _name ## _verifier_ops,
 #define BPF_MAP_TYPE(_id, _ops)
+#define BPF_LINK_TYPE(_id, _name)
 #include <linux/bpf_types.h>
 #undef BPF_PROG_TYPE
 #undef BPF_MAP_TYPE
+#undef BPF_LINK_TYPE
 };
 
 /* bpf_check() is a static code analyzer that walks eBPF program
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 2e29a671d67e..d2f269082a33 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -113,6 +113,8 @@ enum bpf_cmd {
 	BPF_MAP_DELETE_BATCH,
 	BPF_LINK_CREATE,
 	BPF_LINK_UPDATE,
+	BPF_LINK_GET_FD_BY_ID,
+	BPF_LINK_GET_NEXT_ID,
 };
 
 enum bpf_map_type {
@@ -220,6 +222,15 @@ enum bpf_attach_type {
 
 #define MAX_BPF_ATTACH_TYPE __MAX_BPF_ATTACH_TYPE
 
+enum bpf_link_type {
+	BPF_LINK_TYPE_UNSPEC = 0,
+	BPF_LINK_TYPE_RAW_TRACEPOINT = 1,
+	BPF_LINK_TYPE_TRACING = 2,
+	BPF_LINK_TYPE_CGROUP = 3,
+
+	MAX_BPF_LINK_TYPE,
+};
+
 /* cgroup-bpf attach flags used in BPF_PROG_ATTACH command
  *
  * NONE(default): No further bpf programs allowed in the subtree.
@@ -523,6 +534,7 @@ union bpf_attr {
 			__u32		prog_id;
 			__u32		map_id;
 			__u32		btf_id;
+			__u32		link_id;
 		};
 		__u32		next_id;
 		__u32		open_flags;
@@ -3598,6 +3610,25 @@ struct bpf_btf_info {
 	__u32 id;
 } __attribute__((aligned(8)));
 
+struct bpf_link_info {
+	__u32 type;
+	__u32 id;
+	__u32 prog_id;
+	union {
+		struct {
+			__aligned_u64 tp_name; /* in/out: tp_name buffer ptr */
+			__u32 tp_name_len;     /* in/out: tp_name buffer len */
+		} raw_tracepoint;
+		struct {
+			__u32 attach_type;
+		} tracing;
+		struct {
+			__u64 cgroup_id;
+			__u32 attach_type;
+		} cgroup;
+	};
+} __attribute__((aligned(8)));
+
 /* User bpf_sock_addr struct to access socket fields and sockaddr struct passed
  * by user and intended to be used by socket (e.g. to bind to, depends on
  * attach attach type).
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 05/10] libbpf: add low-level APIs for new bpf_link commands
  2020-04-24  5:34 [PATCH bpf-next 00/10] bpf_link observability APIs Andrii Nakryiko
                   ` (3 preceding siblings ...)
  2020-04-24  5:34 ` [PATCH bpf-next 04/10] bpf: add support for BPF_OBJ_GET_INFO_BY_FD " Andrii Nakryiko
@ 2020-04-24  5:35 ` Andrii Nakryiko
  2020-04-24  5:35 ` [PATCH bpf-next 06/10] selftests/bpf: test bpf_link's get_next_id, get_fd_by_id, and get_obj_info Andrii Nakryiko
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-24  5:35 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Add low-level API calls for bpf_link_get_next_id() and
bpf_link_get_fd_by_id().

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/lib/bpf/bpf.c      | 19 +++++++++++++++++--
 tools/lib/bpf/bpf.h      |  4 +++-
 tools/lib/bpf/libbpf.map |  6 ++++++
 3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 5cc1b0785d18..8f2f0958d446 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -721,6 +721,11 @@ int bpf_btf_get_next_id(__u32 start_id, __u32 *next_id)
 	return bpf_obj_get_next_id(start_id, next_id, BPF_BTF_GET_NEXT_ID);
 }
 
+int bpf_link_get_next_id(__u32 start_id, __u32 *next_id)
+{
+	return bpf_obj_get_next_id(start_id, next_id, BPF_LINK_GET_NEXT_ID);
+}
+
 int bpf_prog_get_fd_by_id(__u32 id)
 {
 	union bpf_attr attr;
@@ -751,13 +756,23 @@ int bpf_btf_get_fd_by_id(__u32 id)
 	return sys_bpf(BPF_BTF_GET_FD_BY_ID, &attr, sizeof(attr));
 }
 
-int bpf_obj_get_info_by_fd(int prog_fd, void *info, __u32 *info_len)
+int bpf_link_get_fd_by_id(__u32 id)
+{
+	union bpf_attr attr;
+
+	memset(&attr, 0, sizeof(attr));
+	attr.link_id = id;
+
+	return sys_bpf(BPF_LINK_GET_FD_BY_ID, &attr, sizeof(attr));
+}
+
+int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len)
 {
 	union bpf_attr attr;
 	int err;
 
 	memset(&attr, 0, sizeof(attr));
-	attr.info.bpf_fd = prog_fd;
+	attr.info.bpf_fd = bpf_fd;
 	attr.info.info_len = *info_len;
 	attr.info.info = ptr_to_u64(info);
 
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 46d47afdd887..335b457b3a25 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -216,10 +216,12 @@ LIBBPF_API int bpf_prog_test_run(int prog_fd, int repeat, void *data,
 LIBBPF_API int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id);
 LIBBPF_API int bpf_map_get_next_id(__u32 start_id, __u32 *next_id);
 LIBBPF_API int bpf_btf_get_next_id(__u32 start_id, __u32 *next_id);
+LIBBPF_API int bpf_link_get_next_id(__u32 start_id, __u32 *next_id);
 LIBBPF_API int bpf_prog_get_fd_by_id(__u32 id);
 LIBBPF_API int bpf_map_get_fd_by_id(__u32 id);
 LIBBPF_API int bpf_btf_get_fd_by_id(__u32 id);
-LIBBPF_API int bpf_obj_get_info_by_fd(int prog_fd, void *info, __u32 *info_len);
+LIBBPF_API int bpf_link_get_fd_by_id(__u32 id);
+LIBBPF_API int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len);
 LIBBPF_API int bpf_prog_query(int target_fd, enum bpf_attach_type type,
 			      __u32 query_flags, __u32 *attach_flags,
 			      __u32 *prog_ids, __u32 *prog_cnt);
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index bb8831605b25..7cd49aa38005 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -254,3 +254,9 @@ LIBBPF_0.0.8 {
 		bpf_program__set_lsm;
 		bpf_set_link_xdp_fd_opts;
 } LIBBPF_0.0.7;
+
+LIBBPF_0.0.9 {
+	global:
+		bpf_link_get_fd_by_id;
+		bpf_link_get_next_id;
+} LIBBPF_0.0.8;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 06/10] selftests/bpf: test bpf_link's get_next_id, get_fd_by_id, and get_obj_info
  2020-04-24  5:34 [PATCH bpf-next 00/10] bpf_link observability APIs Andrii Nakryiko
                   ` (4 preceding siblings ...)
  2020-04-24  5:35 ` [PATCH bpf-next 05/10] libbpf: add low-level APIs for new bpf_link commands Andrii Nakryiko
@ 2020-04-24  5:35 ` Andrii Nakryiko
  2020-04-24  5:35 ` [PATCH bpf-next 07/10] bpftool: expose attach_type-to-string array to non-cgroup code Andrii Nakryiko
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-24  5:35 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Extend bpf_obj_id selftest to verify bpf_link's observability APIs.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 .../selftests/bpf/prog_tests/bpf_obj_id.c     | 110 ++++++++++++++++--
 .../testing/selftests/bpf/progs/test_obj_id.c |  14 +--
 2 files changed, 104 insertions(+), 20 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c b/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c
index f10029821e16..7afa4160416f 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_obj_id.c
@@ -1,26 +1,30 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <test_progs.h>
 
+#define nr_iters 2
+
 void test_bpf_obj_id(void)
 {
 	const __u64 array_magic_value = 0xfaceb00c;
 	const __u32 array_key = 0;
-	const int nr_iters = 2;
 	const char *file = "./test_obj_id.o";
 	const char *expected_prog_name = "test_obj_id";
 	const char *expected_map_name = "test_map_id";
 	const __u64 nsec_per_sec = 1000000000;
 
-	struct bpf_object *objs[nr_iters];
+	struct bpf_object *objs[nr_iters] = {};
+	struct bpf_link *links[nr_iters] = {};
+	struct bpf_program *prog;
 	int prog_fds[nr_iters], map_fds[nr_iters];
 	/* +1 to test for the info_len returned by kernel */
 	struct bpf_prog_info prog_infos[nr_iters + 1];
 	struct bpf_map_info map_infos[nr_iters + 1];
+	struct bpf_link_info link_infos[nr_iters + 1];
 	/* Each prog only uses one map. +1 to test nr_map_ids
 	 * returned by kernel.
 	 */
 	__u32 map_ids[nr_iters + 1];
-	char jited_insns[128], xlated_insns[128], zeros[128];
+	char jited_insns[128], xlated_insns[128], zeros[128], tp_name[128];
 	__u32 i, next_id, info_len, nr_id_found, duration = 0;
 	struct timespec real_time_ts, boot_time_ts;
 	int err = 0;
@@ -36,14 +40,15 @@ void test_bpf_obj_id(void)
 	CHECK(err >= 0 || errno != ENOENT,
 	      "get-fd-by-notexist-map-id", "err %d errno %d\n", err, errno);
 
-	for (i = 0; i < nr_iters; i++)
-		objs[i] = NULL;
+	err = bpf_link_get_fd_by_id(0);
+	CHECK(err >= 0 || errno != ENOENT,
+	      "get-fd-by-notexist-link-id", "err %d errno %d\n", err, errno);
 
 	/* Check bpf_obj_get_info_by_fd() */
 	bzero(zeros, sizeof(zeros));
 	for (i = 0; i < nr_iters; i++) {
 		now = time(NULL);
-		err = bpf_prog_load(file, BPF_PROG_TYPE_SOCKET_FILTER,
+		err = bpf_prog_load(file, BPF_PROG_TYPE_RAW_TRACEPOINT,
 				    &objs[i], &prog_fds[i]);
 		/* test_obj_id.o is a dumb prog. It should never fail
 		 * to load.
@@ -60,6 +65,17 @@ void test_bpf_obj_id(void)
 		if (CHECK_FAIL(err))
 			goto done;
 
+		prog = bpf_object__find_program_by_title(objs[i],
+							 "raw_tp/sys_enter");
+		if (CHECK_FAIL(!prog))
+			goto done;
+		links[i] = bpf_program__attach(prog);
+		err = libbpf_get_error(links[i]);
+		if (CHECK(err, "prog_attach", "prog #%d, err %d\n", i, err)) {
+			links[i] = NULL;
+			goto done;
+		}
+
 		/* Check getting map info */
 		info_len = sizeof(struct bpf_map_info) * 2;
 		bzero(&map_infos[i], info_len);
@@ -107,7 +123,7 @@ void test_bpf_obj_id(void)
 		load_time = (real_time_ts.tv_sec - boot_time_ts.tv_sec)
 			+ (prog_infos[i].load_time / nsec_per_sec);
 		if (CHECK(err ||
-			  prog_infos[i].type != BPF_PROG_TYPE_SOCKET_FILTER ||
+			  prog_infos[i].type != BPF_PROG_TYPE_RAW_TRACEPOINT ||
 			  info_len != sizeof(struct bpf_prog_info) ||
 			  (env.jit_enabled && !prog_infos[i].jited_prog_len) ||
 			  (env.jit_enabled &&
@@ -120,7 +136,11 @@ void test_bpf_obj_id(void)
 			  *(int *)(long)prog_infos[i].map_ids != map_infos[i].id ||
 			  strcmp((char *)prog_infos[i].name, expected_prog_name),
 			  "get-prog-info(fd)",
-			  "err %d errno %d i %d type %d(%d) info_len %u(%zu) jit_enabled %d jited_prog_len %u xlated_prog_len %u jited_prog %d xlated_prog %d load_time %lu(%lu) uid %u(%u) nr_map_ids %u(%u) map_id %u(%u) name %s(%s)\n",
+			  "err %d errno %d i %d type %d(%d) info_len %u(%zu) "
+			  "jit_enabled %d jited_prog_len %u xlated_prog_len %u "
+			  "jited_prog %d xlated_prog %d load_time %lu(%lu) "
+			  "uid %u(%u) nr_map_ids %u(%u) map_id %u(%u) "
+			  "name %s(%s)\n",
 			  err, errno, i,
 			  prog_infos[i].type, BPF_PROG_TYPE_SOCKET_FILTER,
 			  info_len, sizeof(struct bpf_prog_info),
@@ -135,6 +155,33 @@ void test_bpf_obj_id(void)
 			  *(int *)(long)prog_infos[i].map_ids, map_infos[i].id,
 			  prog_infos[i].name, expected_prog_name))
 			goto done;
+
+		/* Check getting link info */
+		info_len = sizeof(struct bpf_link_info) * 2;
+		bzero(&link_infos[i], info_len);
+		link_infos[i].raw_tracepoint.tp_name = (__u64)&tp_name;
+		link_infos[i].raw_tracepoint.tp_name_len = sizeof(tp_name);
+		err = bpf_obj_get_info_by_fd(bpf_link__fd(links[i]),
+					     &link_infos[i], &info_len);
+		if (CHECK(err ||
+			  link_infos[i].type != BPF_LINK_TYPE_RAW_TRACEPOINT ||
+			  link_infos[i].prog_id != prog_infos[i].id ||
+			  link_infos[i].raw_tracepoint.tp_name != (__u64)&tp_name ||
+			  strcmp((char *)link_infos[i].raw_tracepoint.tp_name,
+				 "sys_enter") ||
+			  info_len != sizeof(struct bpf_link_info),
+			  "get-link-info(fd)",
+			  "err %d errno %d info_len %u(%zu) type %d(%d) id %d "
+			  "prog_id %d (%d) tp_name %s(%s)\n",
+			  err, errno,
+			  info_len, sizeof(struct bpf_link_info),
+			  link_infos[i].type, BPF_LINK_TYPE_RAW_TRACEPOINT,
+			  link_infos[i].id,
+			  link_infos[i].prog_id, prog_infos[i].id,
+			  (char *)link_infos[i].raw_tracepoint.tp_name,
+			  "sys_enter"))
+			goto done;
+
 	}
 
 	/* Check bpf_prog_get_next_id() */
@@ -247,7 +294,52 @@ void test_bpf_obj_id(void)
 	      "nr_id_found %u(%u)\n",
 	      nr_id_found, nr_iters);
 
+	/* Check bpf_link_get_next_id() */
+	nr_id_found = 0;
+	next_id = 0;
+	while (!bpf_link_get_next_id(next_id, &next_id)) {
+		struct bpf_link_info link_info;
+		int link_fd, cmp_res;
+
+		info_len = sizeof(link_info);
+		memset(&link_info, 0, info_len);
+
+		link_fd = bpf_link_get_fd_by_id(next_id);
+		if (link_fd < 0 && errno == ENOENT)
+			/* The bpf_link is in the dead row */
+			continue;
+		if (CHECK(link_fd < 0, "get-link-fd(next_id)",
+			  "link_fd %d next_id %u errno %d\n",
+			  link_fd, next_id, errno))
+			break;
+
+		for (i = 0; i < nr_iters; i++)
+			if (link_infos[i].id == next_id)
+				break;
+
+		if (i == nr_iters)
+			continue;
+
+		nr_id_found++;
+
+		err = bpf_obj_get_info_by_fd(link_fd, &link_info, &info_len);
+		cmp_res = memcmp(&link_info, &link_infos[i],
+				offsetof(struct bpf_link_info, raw_tracepoint));
+		CHECK(err || info_len != sizeof(link_info) || cmp_res,
+		      "check get-link-info(next_id->fd)",
+		      "err %d errno %d info_len %u(%zu) memcmp %d\n",
+		      err, errno, info_len, sizeof(struct bpf_link_info),
+		      cmp_res);
+
+		close(link_fd);
+	}
+	CHECK(nr_id_found != nr_iters,
+	      "check total link id found by get_next_id",
+	      "nr_id_found %u(%u)\n", nr_id_found, nr_iters);
+
 done:
-	for (i = 0; i < nr_iters; i++)
+	for (i = 0; i < nr_iters; i++) {
+		bpf_link__destroy(links[i]);
 		bpf_object__close(objs[i]);
+	}
 }
diff --git a/tools/testing/selftests/bpf/progs/test_obj_id.c b/tools/testing/selftests/bpf/progs/test_obj_id.c
index 98b9de2fafd0..ded71b3ff6b4 100644
--- a/tools/testing/selftests/bpf/progs/test_obj_id.c
+++ b/tools/testing/selftests/bpf/progs/test_obj_id.c
@@ -3,16 +3,8 @@
  */
 #include <stddef.h>
 #include <linux/bpf.h>
-#include <linux/pkt_cls.h>
 #include <bpf/bpf_helpers.h>
 
-/* It is a dumb bpf program such that it must have no
- * issue to be loaded since testing the verifier is
- * not the focus here.
- */
-
-int _version SEC("version") = 1;
-
 struct {
 	__uint(type, BPF_MAP_TYPE_ARRAY);
 	__uint(max_entries, 1);
@@ -20,13 +12,13 @@ struct {
 	__type(value, __u64);
 } test_map_id SEC(".maps");
 
-SEC("test_obj_id_dummy")
-int test_obj_id(struct __sk_buff *skb)
+SEC("raw_tp/sys_enter")
+int test_obj_id(void *ctx)
 {
 	__u32 key = 0;
 	__u64 *value;
 
 	value = bpf_map_lookup_elem(&test_map_id, &key);
 
-	return TC_ACT_OK;
+	return 0;
 }
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 07/10] bpftool: expose attach_type-to-string array to non-cgroup code
  2020-04-24  5:34 [PATCH bpf-next 00/10] bpf_link observability APIs Andrii Nakryiko
                   ` (5 preceding siblings ...)
  2020-04-24  5:35 ` [PATCH bpf-next 06/10] selftests/bpf: test bpf_link's get_next_id, get_fd_by_id, and get_obj_info Andrii Nakryiko
@ 2020-04-24  5:35 ` Andrii Nakryiko
  2020-04-24 10:32   ` Quentin Monnet
  2020-04-24  5:35 ` [PATCH bpf-next 08/10] bpftool: add bpf_link show and pin support Andrii Nakryiko
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-24  5:35 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Move attach_type_strings into main.h for access in non-cgroup code.
bpf_attach_type is used for non-cgroup attach types quite widely now. So also
complete missing string translations for non-cgroup attach types.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/bpf/bpftool/cgroup.c | 28 +++-------------------------
 tools/bpf/bpftool/main.h   | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+), 25 deletions(-)

diff --git a/tools/bpf/bpftool/cgroup.c b/tools/bpf/bpftool/cgroup.c
index 62c6a1d7cd18..d1fd9c9f2690 100644
--- a/tools/bpf/bpftool/cgroup.c
+++ b/tools/bpf/bpftool/cgroup.c
@@ -31,35 +31,13 @@
 
 static unsigned int query_flags;
 
-static const char * const attach_type_strings[] = {
-	[BPF_CGROUP_INET_INGRESS] = "ingress",
-	[BPF_CGROUP_INET_EGRESS] = "egress",
-	[BPF_CGROUP_INET_SOCK_CREATE] = "sock_create",
-	[BPF_CGROUP_SOCK_OPS] = "sock_ops",
-	[BPF_CGROUP_DEVICE] = "device",
-	[BPF_CGROUP_INET4_BIND] = "bind4",
-	[BPF_CGROUP_INET6_BIND] = "bind6",
-	[BPF_CGROUP_INET4_CONNECT] = "connect4",
-	[BPF_CGROUP_INET6_CONNECT] = "connect6",
-	[BPF_CGROUP_INET4_POST_BIND] = "post_bind4",
-	[BPF_CGROUP_INET6_POST_BIND] = "post_bind6",
-	[BPF_CGROUP_UDP4_SENDMSG] = "sendmsg4",
-	[BPF_CGROUP_UDP6_SENDMSG] = "sendmsg6",
-	[BPF_CGROUP_SYSCTL] = "sysctl",
-	[BPF_CGROUP_UDP4_RECVMSG] = "recvmsg4",
-	[BPF_CGROUP_UDP6_RECVMSG] = "recvmsg6",
-	[BPF_CGROUP_GETSOCKOPT] = "getsockopt",
-	[BPF_CGROUP_SETSOCKOPT] = "setsockopt",
-	[__MAX_BPF_ATTACH_TYPE] = NULL,
-};
-
 static enum bpf_attach_type parse_attach_type(const char *str)
 {
 	enum bpf_attach_type type;
 
 	for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++) {
-		if (attach_type_strings[type] &&
-		    is_prefix(str, attach_type_strings[type]))
+		if (attach_type_name[type] &&
+		    is_prefix(str, attach_type_name[type]))
 			return type;
 	}
 
@@ -171,7 +149,7 @@ static int show_attached_bpf_progs(int cgroup_fd, enum bpf_attach_type type,
 	}
 
 	for (iter = 0; iter < prog_cnt; iter++)
-		show_bpf_prog(prog_ids[iter], attach_type_strings[type],
+		show_bpf_prog(prog_ids[iter], attach_type_name[type],
 			      attach_flags_str, level);
 
 	return 0;
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index 86f14ce26fd7..dd212d2af923 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -83,6 +83,38 @@ static const char * const prog_type_name[] = {
 	[BPF_PROG_TYPE_EXT]			= "ext",
 };
 
+static const char * const attach_type_name[] = {
+	[BPF_CGROUP_INET_INGRESS] = "ingress",
+	[BPF_CGROUP_INET_EGRESS] = "egress",
+	[BPF_CGROUP_INET_SOCK_CREATE] = "sock_create",
+	[BPF_CGROUP_SOCK_OPS] = "sock_ops",
+	[BPF_CGROUP_DEVICE] = "device",
+	[BPF_CGROUP_INET4_BIND] = "bind4",
+	[BPF_CGROUP_INET6_BIND] = "bind6",
+	[BPF_CGROUP_INET4_CONNECT] = "connect4",
+	[BPF_CGROUP_INET6_CONNECT] = "connect6",
+	[BPF_CGROUP_INET4_POST_BIND] = "post_bind4",
+	[BPF_CGROUP_INET6_POST_BIND] = "post_bind6",
+	[BPF_CGROUP_UDP4_SENDMSG] = "sendmsg4",
+	[BPF_CGROUP_UDP6_SENDMSG] = "sendmsg6",
+	[BPF_CGROUP_SYSCTL] = "sysctl",
+	[BPF_CGROUP_UDP4_RECVMSG] = "recvmsg4",
+	[BPF_CGROUP_UDP6_RECVMSG] = "recvmsg6",
+	[BPF_CGROUP_GETSOCKOPT] = "getsockopt",
+	[BPF_CGROUP_SETSOCKOPT] = "setsockopt",
+
+	[BPF_SK_SKB_STREAM_PARSER] = "sk_skb_stream_parser",
+	[BPF_SK_SKB_STREAM_VERDICT] = "sk_skb_stream_verdict",
+	[BPF_SK_MSG_VERDICT] = "sk_msg_verdict",
+	[BPF_LIRC_MODE2] = "lirc_mode2",
+	[BPF_FLOW_DISSECTOR] = "flow_dissector",
+	[BPF_TRACE_RAW_TP] = "raw_tp",
+	[BPF_TRACE_FENTRY] = "fentry",
+	[BPF_TRACE_FEXIT] = "fexit",
+	[BPF_MODIFY_RETURN] = "mod_ret",
+	[BPF_LSM_MAC] = "lsm_mac",
+};
+
 extern const char * const map_type_name[];
 extern const size_t map_type_name_size;
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 08/10] bpftool: add bpf_link show and pin support
  2020-04-24  5:34 [PATCH bpf-next 00/10] bpf_link observability APIs Andrii Nakryiko
                   ` (6 preceding siblings ...)
  2020-04-24  5:35 ` [PATCH bpf-next 07/10] bpftool: expose attach_type-to-string array to non-cgroup code Andrii Nakryiko
@ 2020-04-24  5:35 ` Andrii Nakryiko
  2020-04-24 10:32   ` Quentin Monnet
  2020-04-24  5:35 ` [PATCH bpf-next 09/10] bpftool: add bpftool-link manpage Andrii Nakryiko
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-24  5:35 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Add `bpftool link show` and `bpftool link pin` commands.

Example plain output for `link show` (with showing pinned paths):

[vmuser@archvm bpf]$ sudo ~/local/linux/tools/bpf/bpftool/bpftool -f link
1: tracing  prog 12
        prog_type tracing  attach_type fentry
        pinned /sys/fs/bpf/my_test_link
        pinned /sys/fs/bpf/my_test_link2
2: tracing  prog 13
        prog_type tracing  attach_type fentry
3: tracing  prog 14
        prog_type tracing  attach_type fentry
4: tracing  prog 15
        prog_type tracing  attach_type fentry
5: tracing  prog 16
        prog_type tracing  attach_type fentry
6: tracing  prog 17
        prog_type tracing  attach_type fentry
7: raw_tracepoint  prog 21
        tp 'sys_enter'
8: cgroup  prog 25
        cgroup_id 584  attach_type egress
9: cgroup  prog 25
        cgroup_id 599  attach_type egress
10: cgroup  prog 25
        cgroup_id 614  attach_type egress
11: cgroup  prog 25
        cgroup_id 629  attach_type egress

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/bpf/bpftool/common.c |   2 +
 tools/bpf/bpftool/link.c   | 402 +++++++++++++++++++++++++++++++++++++
 tools/bpf/bpftool/main.c   |   6 +-
 tools/bpf/bpftool/main.h   |   5 +
 4 files changed, 414 insertions(+), 1 deletion(-)
 create mode 100644 tools/bpf/bpftool/link.c

diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
index f2223dbdfb0a..c47bdc65de8e 100644
--- a/tools/bpf/bpftool/common.c
+++ b/tools/bpf/bpftool/common.c
@@ -262,6 +262,8 @@ int get_fd_type(int fd)
 		return BPF_OBJ_MAP;
 	else if (strstr(buf, "bpf-prog"))
 		return BPF_OBJ_PROG;
+	else if (strstr(buf, "bpf-link"))
+		return BPF_OBJ_LINK;
 
 	return BPF_OBJ_UNKNOWN;
 }
diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
new file mode 100644
index 000000000000..d5dcf9e46536
--- /dev/null
+++ b/tools/bpf/bpftool/link.c
@@ -0,0 +1,402 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+/* Copyright (C) 2020 Facebook */
+
+#include <errno.h>
+#include <net/if.h>
+#include <stdio.h>
+#include <unistd.h>
+
+#include <bpf/bpf.h>
+
+#include "json_writer.h"
+#include "main.h"
+
+static const char * const link_type_name[] = {
+	[BPF_LINK_TYPE_UNSPEC]			= "unspec",
+	[BPF_LINK_TYPE_RAW_TRACEPOINT]		= "raw_tracepoint",
+	[BPF_LINK_TYPE_TRACING]			= "tracing",
+	[BPF_LINK_TYPE_CGROUP]			= "cgroup",
+};
+
+static int link_parse_fds(int *argc, char ***argv, int **fds)
+{
+	if (is_prefix(**argv, "id")) {
+		unsigned int id;
+		char *endptr;
+
+		NEXT_ARGP();
+
+		id = strtoul(**argv, &endptr, 0);
+		if (*endptr) {
+			p_err("can't parse %s as ID", **argv);
+			return -1;
+		}
+		NEXT_ARGP();
+
+		(*fds)[0] = bpf_link_get_fd_by_id(id);
+		if ((*fds)[0] < 0) {
+			p_err("get link by id (%u): %s", id, strerror(errno));
+			return -1;
+		}
+		return 1;
+	} else if (is_prefix(**argv, "pinned")) {
+		char *path;
+
+		NEXT_ARGP();
+
+		path = **argv;
+		NEXT_ARGP();
+
+		(*fds)[0] = open_obj_pinned_any(path, BPF_OBJ_LINK);
+		if ((*fds)[0] < 0)
+			return -1;
+		return 1;
+	}
+
+	p_err("expected 'id' or 'pinned', got: '%s'?", **argv);
+	return -1;
+}
+
+static int link_parse_fd(int *argc, char ***argv)
+{
+	int *fds = NULL;
+	int nb_fds, fd;
+
+	fds = malloc(sizeof(int));
+	if (!fds) {
+		p_err("mem alloc failed");
+		return -1;
+	}
+	nb_fds = link_parse_fds(argc, argv, &fds);
+	if (nb_fds != 1) {
+		if (nb_fds > 1) {
+			p_err("several links match this handle");
+			while (nb_fds--)
+				close(fds[nb_fds]);
+		}
+		fd = -1;
+		goto exit_free;
+	}
+
+	fd = fds[0];
+exit_free:
+	free(fds);
+	return fd;
+}
+
+static void
+show_link_header_json(struct bpf_link_info *info, json_writer_t *wtr)
+{
+	jsonw_uint_field(wtr, "id", info->id);
+	if (info->type < ARRAY_SIZE(link_type_name))
+		jsonw_string_field(wtr, "type", link_type_name[info->type]);
+	else
+		jsonw_uint_field(wtr, "type", info->type);
+
+	jsonw_uint_field(json_wtr, "prog_id", info->prog_id);
+}
+
+static int get_prog_info(int prog_id, struct bpf_prog_info *info)
+{
+	__u32 len = sizeof(*info);
+	int err, prog_fd;
+
+	prog_fd = bpf_prog_get_fd_by_id(prog_id);
+	if (prog_fd < 0)
+		return prog_fd;
+
+	memset(info, 0, sizeof(*info));
+	err = bpf_obj_get_info_by_fd(prog_fd, info, &len);
+	if (err) {
+		p_err("can't get prog info: %s", strerror(errno));
+		close(prog_fd);
+		return err;
+	}
+
+	close(prog_fd);
+	return 0;
+}
+
+static int show_link_close_json(int fd, struct bpf_link_info *info)
+{
+	struct bpf_prog_info prog_info;
+	int err;
+
+	jsonw_start_object(json_wtr);
+
+	show_link_header_json(info, json_wtr);
+
+	switch (info->type) {
+	case BPF_LINK_TYPE_RAW_TRACEPOINT:
+		jsonw_string_field(json_wtr, "tp_name",
+				   (const char *)info->raw_tracepoint.tp_name);
+		break;
+	case BPF_LINK_TYPE_TRACING:
+		err = get_prog_info(info->prog_id, &prog_info);
+		if (err)
+			return err;
+
+		if (prog_info.type < ARRAY_SIZE(prog_type_name))
+			jsonw_string_field(json_wtr, "prog_type",
+					   prog_type_name[prog_info.type]);
+		else
+			jsonw_uint_field(json_wtr, "prog_type",
+					 prog_info.type);
+
+		if (info->tracing.attach_type < ARRAY_SIZE(attach_type_name))
+			jsonw_string_field(json_wtr, "attach_type",
+			       attach_type_name[info->tracing.attach_type]);
+		else
+			jsonw_uint_field(json_wtr, "attach_type",
+					 info->tracing.attach_type);
+		break;
+	case BPF_LINK_TYPE_CGROUP:
+		jsonw_lluint_field(json_wtr, "cgroup_id",
+				   info->cgroup.cgroup_id);
+		if (info->cgroup.attach_type < ARRAY_SIZE(attach_type_name))
+			jsonw_string_field(json_wtr, "attach_type",
+			       attach_type_name[info->cgroup.attach_type]);
+		else
+			jsonw_uint_field(json_wtr, "attach_type",
+					 info->cgroup.attach_type);
+		break;
+	default:
+		break;
+	}
+
+	if (!hash_empty(link_table.table)) {
+		struct pinned_obj *obj;
+
+		jsonw_name(json_wtr, "pinned");
+		jsonw_start_array(json_wtr);
+		hash_for_each_possible(link_table.table, obj, hash, info->id) {
+			if (obj->id == info->id)
+				jsonw_string(json_wtr, obj->path);
+		}
+		jsonw_end_array(json_wtr);
+	}
+	jsonw_end_object(json_wtr);
+
+	return 0;
+}
+
+static void show_link_header_plain(struct bpf_link_info *info)
+{
+	printf("%u: ", info->id);
+	if (info->type < ARRAY_SIZE(link_type_name))
+		printf("%s  ", link_type_name[info->type]);
+	else
+		printf("type %u  ", info->type);
+
+	printf("prog %u  ", info->prog_id);
+}
+
+static int show_link_close_plain(int fd, struct bpf_link_info *info)
+{
+	struct bpf_prog_info prog_info;
+	int err;
+
+	show_link_header_plain(info);
+
+	switch (info->type) {
+	case BPF_LINK_TYPE_RAW_TRACEPOINT:
+		printf("\n\ttp '%s'  ",
+		       (const char *)info->raw_tracepoint.tp_name);
+		break;
+	case BPF_LINK_TYPE_TRACING:
+		err = get_prog_info(info->prog_id, &prog_info);
+		if (err)
+			return err;
+
+		if (prog_info.type < ARRAY_SIZE(prog_type_name))
+			printf("\n\tprog_type %s  ",
+			       prog_type_name[prog_info.type]);
+		else
+			printf("\n\tprog_type %u  ", prog_info.type);
+
+		if (info->tracing.attach_type < ARRAY_SIZE(attach_type_name))
+			printf("attach_type %s  ",
+			       attach_type_name[info->tracing.attach_type]);
+		else
+			printf("attach_type %u  ", info->tracing.attach_type);
+		break;
+	case BPF_LINK_TYPE_CGROUP:
+		printf("\n\tcgroup_id %zu  ", (size_t)info->cgroup.cgroup_id);
+		if (info->cgroup.attach_type < ARRAY_SIZE(attach_type_name))
+			printf("attach_type %s  ",
+			       attach_type_name[info->cgroup.attach_type]);
+		else
+			printf("attach_type %u  ", info->cgroup.attach_type);
+		break;
+	default:
+		break;
+	}
+
+	if (!hash_empty(link_table.table)) {
+		struct pinned_obj *obj;
+
+		hash_for_each_possible(link_table.table, obj, hash, info->id) {
+			if (obj->id == info->id)
+				printf("\n\tpinned %s", obj->path);
+		}
+	}
+
+	printf("\n");
+
+	return 0;
+}
+
+static int do_show_link(int fd)
+{
+	struct bpf_link_info info;
+	__u32 len = sizeof(info);
+	char raw_tp_name[256];
+	int err;
+
+	memset(&info, 0, sizeof(info));
+again:
+	err = bpf_obj_get_info_by_fd(fd, &info, &len);
+	if (err) {
+		p_err("can't get link info: %s",
+		      strerror(errno));
+		close(fd);
+		return err;
+	}
+	if (info.type == BPF_LINK_TYPE_RAW_TRACEPOINT &&
+	    !info.raw_tracepoint.tp_name) {
+		info.raw_tracepoint.tp_name = (unsigned long)&raw_tp_name;
+		info.raw_tracepoint.tp_name_len = sizeof(raw_tp_name);
+		goto again;
+	}
+
+	if (json_output)
+		show_link_close_json(fd, &info);
+	else
+		show_link_close_plain(fd, &info);
+
+	close(fd);
+	return 0;
+}
+
+static int do_show_subset(int argc, char **argv)
+{
+	int *fds = NULL;
+	int nb_fds, i;
+	int err = -1;
+
+	fds = malloc(sizeof(int));
+	if (!fds) {
+		p_err("mem alloc failed");
+		return -1;
+	}
+	nb_fds = link_parse_fds(&argc, &argv, &fds);
+	if (nb_fds < 1)
+		goto exit_free;
+
+	if (json_output && nb_fds > 1)
+		jsonw_start_array(json_wtr);	/* root array */
+	for (i = 0; i < nb_fds; i++) {
+		err = do_show_link(fds[i]);
+		if (err) {
+			for (; i + 1 < nb_fds; i++)
+				close(fds[i]);
+			break;
+		}
+	}
+	if (json_output && nb_fds > 1)
+		jsonw_end_array(json_wtr);	/* root array */
+
+exit_free:
+	free(fds);
+	return err;
+}
+
+static int do_show(int argc, char **argv)
+{
+	__u32 id = 0;
+	int err;
+	int fd;
+
+	if (show_pinned)
+		build_pinned_obj_table(&link_table, BPF_OBJ_LINK);
+
+	if (argc == 2)
+		return do_show_subset(argc, argv);
+
+	if (argc)
+		return BAD_ARG();
+
+	if (json_output)
+		jsonw_start_array(json_wtr);
+	while (true) {
+		err = bpf_link_get_next_id(id, &id);
+		if (err) {
+			if (errno == ENOENT)
+				break;
+			p_err("can't get next link: %s%s", strerror(errno),
+			      errno == EINVAL ? " -- kernel too old?" : "");
+			break;
+		}
+
+		fd = bpf_link_get_fd_by_id(id);
+		if (fd < 0) {
+			if (errno == ENOENT)
+				continue;
+			p_err("can't get link by id (%u): %s",
+			      id, strerror(errno));
+			break;
+		}
+
+		err = do_show_link(fd);
+		if (err)
+			break;
+	}
+	if (json_output)
+		jsonw_end_array(json_wtr);
+
+	return errno == ENOENT ? 0 : -1;
+}
+
+static int do_pin(int argc, char **argv)
+{
+	int err;
+
+	err = do_pin_any(argc, argv, link_parse_fd);
+	if (!err && json_output)
+		jsonw_null(json_wtr);
+	return err;
+}
+
+static int do_help(int argc, char **argv)
+{
+	if (json_output) {
+		jsonw_null(json_wtr);
+		return 0;
+	}
+
+	fprintf(stderr,
+		"Usage: %1$s %2$s { show | list }   [LINK]\n"
+		"       %1$s %2$s pin        LINK  FILE\n"
+		"       %1$s %2$s help\n"
+		"\n"
+		"       " HELP_SPEC_LINK "\n"
+		"       " HELP_SPEC_PROGRAM "\n"
+		"       " HELP_SPEC_OPTIONS "\n"
+		"",
+		bin_name, argv[-2]);
+
+	return 0;
+}
+
+static const struct cmd cmds[] = {
+	{ "show",	do_show },
+	{ "list",	do_show },
+	{ "help",	do_help },
+	{ "pin",	do_pin },
+	{ 0 }
+};
+
+int do_link(int argc, char **argv)
+{
+	return cmd_select(cmds, argc, argv, do_help);
+}
diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
index 466c269eabdd..1413a154806e 100644
--- a/tools/bpf/bpftool/main.c
+++ b/tools/bpf/bpftool/main.c
@@ -30,6 +30,7 @@ bool verifier_logs;
 bool relaxed_maps;
 struct pinned_obj_table prog_table;
 struct pinned_obj_table map_table;
+struct pinned_obj_table link_table;
 
 static void __noreturn clean_and_exit(int i)
 {
@@ -58,7 +59,7 @@ static int do_help(int argc, char **argv)
 		"       %s batch file FILE\n"
 		"       %s version\n"
 		"\n"
-		"       OBJECT := { prog | map | cgroup | perf | net | feature | btf | gen | struct_ops }\n"
+		"       OBJECT := { prog | map | link | cgroup | perf | net | feature | btf | gen | struct_ops }\n"
 		"       " HELP_SPEC_OPTIONS "\n"
 		"",
 		bin_name, bin_name, bin_name);
@@ -215,6 +216,7 @@ static const struct cmd cmds[] = {
 	{ "batch",	do_batch },
 	{ "prog",	do_prog },
 	{ "map",	do_map },
+	{ "link",	do_link },
 	{ "cgroup",	do_cgroup },
 	{ "perf",	do_perf },
 	{ "net",	do_net },
@@ -364,6 +366,7 @@ int main(int argc, char **argv)
 
 	hash_init(prog_table.table);
 	hash_init(map_table.table);
+	hash_init(link_table.table);
 
 	opterr = 0;
 	while ((opt = getopt_long(argc, argv, "Vhpjfmnd",
@@ -422,6 +425,7 @@ int main(int argc, char **argv)
 	if (show_pinned) {
 		delete_pinned_obj_table(&prog_table);
 		delete_pinned_obj_table(&map_table);
+		delete_pinned_obj_table(&link_table);
 	}
 
 	return ret;
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index dd212d2af923..ff00a91982f7 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -50,6 +50,8 @@
 	"\t            {-m|--mapcompat} | {-n|--nomount} }"
 #define HELP_SPEC_MAP							\
 	"MAP := { id MAP_ID | pinned FILE | name MAP_NAME }"
+#define HELP_SPEC_LINK							\
+	"LINK := { id LINK_ID | pinned FILE }"
 
 static const char * const prog_type_name[] = {
 	[BPF_PROG_TYPE_UNSPEC]			= "unspec",
@@ -122,6 +124,7 @@ enum bpf_obj_type {
 	BPF_OBJ_UNKNOWN,
 	BPF_OBJ_PROG,
 	BPF_OBJ_MAP,
+	BPF_OBJ_LINK,
 };
 
 extern const char *bin_name;
@@ -134,6 +137,7 @@ extern bool verifier_logs;
 extern bool relaxed_maps;
 extern struct pinned_obj_table prog_table;
 extern struct pinned_obj_table map_table;
+extern struct pinned_obj_table link_table;
 
 void __printf(1, 2) p_err(const char *fmt, ...);
 void __printf(1, 2) p_info(const char *fmt, ...);
@@ -185,6 +189,7 @@ int do_pin_fd(int fd, const char *name);
 
 int do_prog(int argc, char **arg);
 int do_map(int argc, char **arg);
+int do_link(int argc, char **arg);
 int do_event_pipe(int argc, char **argv);
 int do_cgroup(int argc, char **arg);
 int do_perf(int argc, char **arg);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 09/10] bpftool: add bpftool-link manpage
  2020-04-24  5:34 [PATCH bpf-next 00/10] bpf_link observability APIs Andrii Nakryiko
                   ` (7 preceding siblings ...)
  2020-04-24  5:35 ` [PATCH bpf-next 08/10] bpftool: add bpf_link show and pin support Andrii Nakryiko
@ 2020-04-24  5:35 ` Andrii Nakryiko
  2020-04-24 10:33   ` Quentin Monnet
  2020-04-24  5:35 ` [PATCH bpf-next 10/10] bpftool: add link bash completions Andrii Nakryiko
  2020-04-24 11:40 ` [PATCH bpf-next 00/10] bpf_link observability APIs Toke Høiland-Jørgensen
  10 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-24  5:35 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Add bpftool-link manpage with information and examples of link-related
commands.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 .../bpftool/Documentation/bpftool-link.rst    | 119 ++++++++++++++++++
 1 file changed, 119 insertions(+)
 create mode 100644 tools/bpf/bpftool/Documentation/bpftool-link.rst

diff --git a/tools/bpf/bpftool/Documentation/bpftool-link.rst b/tools/bpf/bpftool/Documentation/bpftool-link.rst
new file mode 100644
index 000000000000..2866128cd6b2
--- /dev/null
+++ b/tools/bpf/bpftool/Documentation/bpftool-link.rst
@@ -0,0 +1,119 @@
+================
+bpftool-link
+================
+-------------------------------------------------------------------------------
+tool for inspection and simple manipulation of eBPF links
+-------------------------------------------------------------------------------
+
+:Manual section: 8
+
+SYNOPSIS
+========
+
+	**bpftool** [*OPTIONS*] **link *COMMAND*
+
+	*OPTIONS* := { { **-j** | **--json** } [{ **-p** | **--pretty** }] | { **-f** | **--bpffs** } }
+
+	*COMMANDS* := { **show** | **list** | **pin** | **help** }
+
+LINK COMMANDS
+=============
+
+|	**bpftool** **link { show | list }** [*LINK*]
+|	**bpftool** **link pin** *LINK* *FILE*
+|	**bpftool** **link help**
+|
+|	*LINK* := { **id** *LINK_ID* | **pinned** *FILE* }
+
+
+DESCRIPTION
+===========
+	**bpftool link { show | list }** [*LINK*]
+		  Show information about active links. If *LINK* is
+		  specified show information only about given link,
+		  otherwise list all links currently active on the system.
+
+		  Output will start with link ID followed by link type and
+		  zero or more named attributes, some of which depend on type
+                  of link.
+
+	**bpftool link pin** *LINK* *FILE*
+		  Pin link *LINK* as *FILE*.
+
+		  Note: *FILE* must be located in *bpffs* mount. It must not
+		  contain a dot character ('.'), which is reserved for future
+		  extensions of *bpffs*.
+
+	**bpftool link help**
+		  Print short help message.
+
+OPTIONS
+=======
+	-h, --help
+		  Print short generic help message (similar to **bpftool help**).
+
+	-V, --version
+		  Print version number (similar to **bpftool version**).
+
+	-j, --json
+		  Generate JSON output. For commands that cannot produce JSON, this
+		  option has no effect.
+
+	-p, --pretty
+		  Generate human-readable JSON output. Implies **-j**.
+
+	-f, --bpffs
+		  When showing BPF links, show file names of pinned
+		  links.
+
+	-n, --nomount
+		  Do not automatically attempt to mount any virtual file system
+		  (such as tracefs or BPF virtual file system) when necessary.
+
+	-d, --debug
+		  Print all logs available, even debug-level information. This
+		  includes logs from libbpf.
+
+EXAMPLES
+========
+**# bpftool link show**
+
+::
+
+    10: cgroup  prog 25
+            cgroup_id 614  attach_type egress
+
+**# bpftool --json --pretty link show**
+
+::
+
+    [{
+            "type": "cgroup",
+            "prog_id": 25,
+            "cgroup_id": 614,
+            "attach_type": "egress"
+        }
+    ]
+
+|
+| **# mount -t bpf none /sys/fs/bpf/**
+| **# bpftool link pin id 10 /sys/fs/bpf/link**
+| **# ls -l /sys/fs/bpf/**
+
+::
+
+    -rw------- 1 root root 0 Apr 23 21:39 link
+
+
+SEE ALSO
+========
+	**bpf**\ (2),
+	**bpf-helpers**\ (7),
+	**bpftool**\ (8),
+	**bpftool-prog\ (8),
+	**bpftool-map**\ (8),
+	**bpftool-cgroup**\ (8),
+	**bpftool-feature**\ (8),
+	**bpftool-net**\ (8),
+	**bpftool-perf**\ (8),
+	**bpftool-btf**\ (8)
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next 10/10] bpftool: add link bash completions
  2020-04-24  5:34 [PATCH bpf-next 00/10] bpf_link observability APIs Andrii Nakryiko
                   ` (8 preceding siblings ...)
  2020-04-24  5:35 ` [PATCH bpf-next 09/10] bpftool: add bpftool-link manpage Andrii Nakryiko
@ 2020-04-24  5:35 ` Andrii Nakryiko
  2020-04-24 10:33   ` Quentin Monnet
  2020-04-24 11:40 ` [PATCH bpf-next 00/10] bpf_link observability APIs Toke Høiland-Jørgensen
  10 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-24  5:35 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Extend bpftool's bash-completion script to handle new link command and its
sub-commands.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/bpf/bpftool/bash-completion/bpftool | 39 +++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
index 45ee99b159e2..c033c3329f73 100644
--- a/tools/bpf/bpftool/bash-completion/bpftool
+++ b/tools/bpf/bpftool/bash-completion/bpftool
@@ -98,6 +98,12 @@ _bpftool_get_btf_ids()
         command sed -n 's/.*"id": \(.*\),$/\1/p' )" -- "$cur" ) )
 }
 
+_bpftool_get_link_ids()
+{
+    COMPREPLY+=( $( compgen -W "$( bpftool -jp link 2>&1 | \
+        command sed -n 's/.*"id": \(.*\),$/\1/p' )" -- "$cur" ) )
+}
+
 _bpftool_get_obj_map_names()
 {
     local obj
@@ -1082,6 +1088,39 @@ _bpftool()
                     ;;
             esac
             ;;
+        link)
+            case $command in
+                show|list|pin)
+                    case $prev in
+                        id)
+                            _bpftool_get_link_ids
+                            return 0
+                            ;;
+                    esac
+                    ;;
+            esac
+
+            local LINK_TYPE='id pinned'
+            case $command in
+                show|list)
+                    [[ $prev != "$command" ]] && return 0
+                    COMPREPLY=( $( compgen -W "$LINK_TYPE" -- "$cur" ) )
+                    return 0
+                    ;;
+                pin)
+                    if [[ $prev == "$command" ]]; then
+                        COMPREPLY=( $( compgen -W "$LINK_TYPE" -- "$cur" ) )
+                    else
+                        _filedir
+                    fi
+                    return 0
+                    ;;
+                *)
+                    [[ $prev == $object ]] && \
+                        COMPREPLY=( $( compgen -W 'help pin show list' -- "$cur" ) )
+                    ;;
+            esac
+            ;;
     esac
 } &&
 complete -F _bpftool bpftool
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 07/10] bpftool: expose attach_type-to-string array to non-cgroup code
  2020-04-24  5:35 ` [PATCH bpf-next 07/10] bpftool: expose attach_type-to-string array to non-cgroup code Andrii Nakryiko
@ 2020-04-24 10:32   ` Quentin Monnet
  2020-04-24 16:27     ` Andrii Nakryiko
  0 siblings, 1 reply; 24+ messages in thread
From: Quentin Monnet @ 2020-04-24 10:32 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team

2020-04-23 22:35 UTC-0700 ~ Andrii Nakryiko <andriin@fb.com>
> Move attach_type_strings into main.h for access in non-cgroup code.
> bpf_attach_type is used for non-cgroup attach types quite widely now. So also
> complete missing string translations for non-cgroup attach types.
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
>  tools/bpf/bpftool/cgroup.c | 28 +++-------------------------
>  tools/bpf/bpftool/main.h   | 32 ++++++++++++++++++++++++++++++++
>  2 files changed, 35 insertions(+), 25 deletions(-)
> 
> diff --git a/tools/bpf/bpftool/cgroup.c b/tools/bpf/bpftool/cgroup.c
> index 62c6a1d7cd18..d1fd9c9f2690 100644
> --- a/tools/bpf/bpftool/cgroup.c
> +++ b/tools/bpf/bpftool/cgroup.c
> @@ -31,35 +31,13 @@
>  
>  static unsigned int query_flags;
>  
> -static const char * const attach_type_strings[] = {
> -	[BPF_CGROUP_INET_INGRESS] = "ingress",
> -	[BPF_CGROUP_INET_EGRESS] = "egress",
> -	[BPF_CGROUP_INET_SOCK_CREATE] = "sock_create",
> -	[BPF_CGROUP_SOCK_OPS] = "sock_ops",
> -	[BPF_CGROUP_DEVICE] = "device",
> -	[BPF_CGROUP_INET4_BIND] = "bind4",
> -	[BPF_CGROUP_INET6_BIND] = "bind6",
> -	[BPF_CGROUP_INET4_CONNECT] = "connect4",
> -	[BPF_CGROUP_INET6_CONNECT] = "connect6",
> -	[BPF_CGROUP_INET4_POST_BIND] = "post_bind4",
> -	[BPF_CGROUP_INET6_POST_BIND] = "post_bind6",
> -	[BPF_CGROUP_UDP4_SENDMSG] = "sendmsg4",
> -	[BPF_CGROUP_UDP6_SENDMSG] = "sendmsg6",
> -	[BPF_CGROUP_SYSCTL] = "sysctl",
> -	[BPF_CGROUP_UDP4_RECVMSG] = "recvmsg4",
> -	[BPF_CGROUP_UDP6_RECVMSG] = "recvmsg6",
> -	[BPF_CGROUP_GETSOCKOPT] = "getsockopt",
> -	[BPF_CGROUP_SETSOCKOPT] = "setsockopt",
> -	[__MAX_BPF_ATTACH_TYPE] = NULL,

So you removed the "[__MAX_BPF_ATTACH_TYPE] = NULL" from the new array,
if I understand correctly this is because all attach type enum members
are now in the new attach_type_name[] so we're safe by looping until we
reach __MAX_BPF_ATTACH_TYPE. Sounds good in theory but...

> -};
> -
>  static enum bpf_attach_type parse_attach_type(const char *str)
>  {
>  	enum bpf_attach_type type;
>  
>  	for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++) {
> -		if (attach_type_strings[type] &&
> -		    is_prefix(str, attach_type_strings[type]))
> +		if (attach_type_name[type] &&
> +		    is_prefix(str, attach_type_name[type]))
>  			return type;
>  	}

... I'm concerned the "attach_type_name[type]" here could segfault if we
add a new attach type to the kernel, but don't report it immediately to
bpftool's array.

Is there any drawback with keeping the "[__MAX_BPF_ATTACH_TYPE] = NULL"?
Or change here to loop on ARRAY_SIZE(), as you do in your own patch for
link?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 08/10] bpftool: add bpf_link show and pin support
  2020-04-24  5:35 ` [PATCH bpf-next 08/10] bpftool: add bpf_link show and pin support Andrii Nakryiko
@ 2020-04-24 10:32   ` Quentin Monnet
  2020-04-24 16:30     ` Andrii Nakryiko
  0 siblings, 1 reply; 24+ messages in thread
From: Quentin Monnet @ 2020-04-24 10:32 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team

2020-04-23 22:35 UTC-0700 ~ Andrii Nakryiko <andriin@fb.com>
> Add `bpftool link show` and `bpftool link pin` commands.
> 
> Example plain output for `link show` (with showing pinned paths):
> 
> [vmuser@archvm bpf]$ sudo ~/local/linux/tools/bpf/bpftool/bpftool -f link
> 1: tracing  prog 12
>         prog_type tracing  attach_type fentry
>         pinned /sys/fs/bpf/my_test_link
>         pinned /sys/fs/bpf/my_test_link2
> 2: tracing  prog 13
>         prog_type tracing  attach_type fentry
> 3: tracing  prog 14
>         prog_type tracing  attach_type fentry
> 4: tracing  prog 15
>         prog_type tracing  attach_type fentry
> 5: tracing  prog 16
>         prog_type tracing  attach_type fentry
> 6: tracing  prog 17
>         prog_type tracing  attach_type fentry
> 7: raw_tracepoint  prog 21
>         tp 'sys_enter'
> 8: cgroup  prog 25
>         cgroup_id 584  attach_type egress
> 9: cgroup  prog 25
>         cgroup_id 599  attach_type egress
> 10: cgroup  prog 25
>         cgroup_id 614  attach_type egress
> 11: cgroup  prog 25
>         cgroup_id 629  attach_type egress
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
>  tools/bpf/bpftool/common.c |   2 +
>  tools/bpf/bpftool/link.c   | 402 +++++++++++++++++++++++++++++++++++++
>  tools/bpf/bpftool/main.c   |   6 +-
>  tools/bpf/bpftool/main.h   |   5 +
>  4 files changed, 414 insertions(+), 1 deletion(-)
>  create mode 100644 tools/bpf/bpftool/link.c
> 
> diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
> index f2223dbdfb0a..c47bdc65de8e 100644
> --- a/tools/bpf/bpftool/common.c
> +++ b/tools/bpf/bpftool/common.c
> @@ -262,6 +262,8 @@ int get_fd_type(int fd)
>  		return BPF_OBJ_MAP;
>  	else if (strstr(buf, "bpf-prog"))
>  		return BPF_OBJ_PROG;
> +	else if (strstr(buf, "bpf-link"))
> +		return BPF_OBJ_LINK;
>  
>  	return BPF_OBJ_UNKNOWN;
>  }
> diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
> new file mode 100644
> index 000000000000..d5dcf9e46536
> --- /dev/null
> +++ b/tools/bpf/bpftool/link.c

[...]

> +
> +static int link_parse_fd(int *argc, char ***argv)
> +{
> +	int *fds = NULL;
> +	int nb_fds, fd;
> +
> +	fds = malloc(sizeof(int));
> +	if (!fds) {
> +		p_err("mem alloc failed");
> +		return -1;
> +	}
> +	nb_fds = link_parse_fds(argc, argv, &fds);
> +	if (nb_fds != 1) {
> +		if (nb_fds > 1) {
> +			p_err("several links match this handle");

Can this ever happen? "bpftool prog show" has this because "name" or
"tag" can match multiple programs. But "id" and "pinned" for link should
not, as far as I understand.

> +			while (nb_fds--)
> +				close(fds[nb_fds]);
> +		}
> +		fd = -1;
> +		goto exit_free;
> +	}
> +
> +	fd = fds[0];
> +exit_free:
> +	free(fds);
> +	return fd;
> +}
> +
> +static void
> +show_link_header_json(struct bpf_link_info *info, json_writer_t *wtr)
> +{
> +	jsonw_uint_field(wtr, "id", info->id);
> +	if (info->type < ARRAY_SIZE(link_type_name))
> +		jsonw_string_field(wtr, "type", link_type_name[info->type]);
> +	else
> +		jsonw_uint_field(wtr, "type", info->type);
> +
> +	jsonw_uint_field(json_wtr, "prog_id", info->prog_id);
> +}
> +
> +static int get_prog_info(int prog_id, struct bpf_prog_info *info)
> +{
> +	__u32 len = sizeof(*info);
> +	int err, prog_fd;
> +
> +	prog_fd = bpf_prog_get_fd_by_id(prog_id);
> +	if (prog_fd < 0)
> +		return prog_fd;
> +
> +	memset(info, 0, sizeof(*info));
> +	err = bpf_obj_get_info_by_fd(prog_fd, info, &len);
> +	if (err) {
> +		p_err("can't get prog info: %s", strerror(errno));
> +		close(prog_fd);
> +		return err;

Nit: you could "return err;" at the end of the function, and remove the
"close()" and "return" from this if block.

> +	}
> +
> +	close(prog_fd);
> +	return 0;
> +}
> +

[...]

> +
> +static int do_show_subset(int argc, char **argv)
> +{
> +	int *fds = NULL;
> +	int nb_fds, i;
> +	int err = -1;
> +
> +	fds = malloc(sizeof(int));
> +	if (!fds) {
> +		p_err("mem alloc failed");
> +		return -1;
> +	}
> +	nb_fds = link_parse_fds(&argc, &argv, &fds);
> +	if (nb_fds < 1)
> +		goto exit_free;
> +
> +	if (json_output && nb_fds > 1)
> +		jsonw_start_array(json_wtr);	/* root array */
> +	for (i = 0; i < nb_fds; i++) {
> +		err = do_show_link(fds[i]);
> +		if (err) {
> +			for (; i + 1 < nb_fds; i++)
> +				close(fds[i]);
> +			break;
> +		}
> +	}
> +	if (json_output && nb_fds > 1)
> +		jsonw_end_array(json_wtr);	/* root array */
> +
> +exit_free:
> +	free(fds);
> +	return err;
> +}
> +
> +static int do_show(int argc, char **argv)
> +{
> +	__u32 id = 0;
> +	int err;
> +	int fd;
> +
> +	if (show_pinned)
> +		build_pinned_obj_table(&link_table, BPF_OBJ_LINK);
> +
> +	if (argc == 2)
> +		return do_show_subset(argc, argv);

I understand the "_subset" aspect was taken from prog.c. But it was
added there (ec2025095cf6) because "bpftool prog show <name|tag>" can
match several programs, and the array with fds would be reallocated as
required while parsing names/tags.

I do not think that by restricting the selection on "id" or "pinned",
you can get more than one link at a time. So we can probably avoid
juggling with fd arrays for link_parse_fd() / link_parse_fds() and show
just the single relevant link.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 09/10] bpftool: add bpftool-link manpage
  2020-04-24  5:35 ` [PATCH bpf-next 09/10] bpftool: add bpftool-link manpage Andrii Nakryiko
@ 2020-04-24 10:33   ` Quentin Monnet
  2020-04-24 16:32     ` Andrii Nakryiko
  0 siblings, 1 reply; 24+ messages in thread
From: Quentin Monnet @ 2020-04-24 10:33 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team

2020-04-23 22:35 UTC-0700 ~ Andrii Nakryiko <andriin@fb.com>
> Add bpftool-link manpage with information and examples of link-related
> commands.
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
>  .../bpftool/Documentation/bpftool-link.rst    | 119 ++++++++++++++++++
>  1 file changed, 119 insertions(+)
>  create mode 100644 tools/bpf/bpftool/Documentation/bpftool-link.rst
> 
> diff --git a/tools/bpf/bpftool/Documentation/bpftool-link.rst b/tools/bpf/bpftool/Documentation/bpftool-link.rst
> new file mode 100644
> index 000000000000..2866128cd6b2
> --- /dev/null
> +++ b/tools/bpf/bpftool/Documentation/bpftool-link.rst
> @@ -0,0 +1,119 @@
> +================
> +bpftool-link
> +================
> +-------------------------------------------------------------------------------
> +tool for inspection and simple manipulation of eBPF links
> +-------------------------------------------------------------------------------
> +
> +:Manual section: 8
> +
> +SYNOPSIS
> +========
> +
> +	**bpftool** [*OPTIONS*] **link *COMMAND*

Missing the ending "**" after "**link", please fix.

> +
> +	*OPTIONS* := { { **-j** | **--json** } [{ **-p** | **--pretty** }] | { **-f** | **--bpffs** } }
> +
> +	*COMMANDS* := { **show** | **list** | **pin** | **help** }
> +
> +LINK COMMANDS
> +=============
> +
> +|	**bpftool** **link { show | list }** [*LINK*]
> +|	**bpftool** **link pin** *LINK* *FILE*
> +|	**bpftool** **link help**
> +|
> +|	*LINK* := { **id** *LINK_ID* | **pinned** *FILE* }
> +
> +
> +DESCRIPTION
> +===========
> +	**bpftool link { show | list }** [*LINK*]
> +		  Show information about active links. If *LINK* is
> +		  specified show information only about given link,
> +		  otherwise list all links currently active on the system.
> +
> +		  Output will start with link ID followed by link type and
> +		  zero or more named attributes, some of which depend on type
> +                  of link.

Nit: indent issue on the line above.

> +
> +	**bpftool link pin** *LINK* *FILE*
> +		  Pin link *LINK* as *FILE*.
> +
> +		  Note: *FILE* must be located in *bpffs* mount. It must not
> +		  contain a dot character ('.'), which is reserved for future
> +		  extensions of *bpffs*.
> +
> +	**bpftool link help**
> +		  Print short help message.
> +
> +OPTIONS
> +=======
> +	-h, --help
> +		  Print short generic help message (similar to **bpftool help**).
> +
> +	-V, --version
> +		  Print version number (similar to **bpftool version**).
> +
> +	-j, --json
> +		  Generate JSON output. For commands that cannot produce JSON, this
> +		  option has no effect.
> +
> +	-p, --pretty
> +		  Generate human-readable JSON output. Implies **-j**.
> +
> +	-f, --bpffs
> +		  When showing BPF links, show file names of pinned
> +		  links.
> +
> +	-n, --nomount
> +		  Do not automatically attempt to mount any virtual file system
> +		  (such as tracefs or BPF virtual file system) when necessary.
> +
> +	-d, --debug
> +		  Print all logs available, even debug-level information. This
> +		  includes logs from libbpf.
> +
> +EXAMPLES
> +========
> +**# bpftool link show**
> +
> +::
> +
> +    10: cgroup  prog 25
> +            cgroup_id 614  attach_type egress
> +
> +**# bpftool --json --pretty link show**
> +
> +::
> +
> +    [{
> +            "type": "cgroup",
> +            "prog_id": 25,
> +            "cgroup_id": 614,
> +            "attach_type": "egress"
> +        }
> +    ]
> +
> +|
> +| **# mount -t bpf none /sys/fs/bpf/**

[ Mounting should not be required, as you call
do_pin_any()->do_pin_fd()->mount_bpffs_for_pin().

Although on second thought I'm fine with keeping it, just in case users
call bpftool --nomount. ]

> +| **# bpftool link pin id 10 /sys/fs/bpf/link**
> +| **# ls -l /sys/fs/bpf/**
> +
> +::
> +
> +    -rw------- 1 root root 0 Apr 23 21:39 link
> +
> +
> +SEE ALSO
> +========
> +	**bpf**\ (2),
> +	**bpf-helpers**\ (7),
> +	**bpftool**\ (8),
> +	**bpftool-prog\ (8),
> +	**bpftool-map**\ (8),
> +	**bpftool-cgroup**\ (8),
> +	**bpftool-feature**\ (8),
> +	**bpftool-net**\ (8),
> +	**bpftool-perf**\ (8),
> +	**bpftool-btf**\ (8)
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 10/10] bpftool: add link bash completions
  2020-04-24  5:35 ` [PATCH bpf-next 10/10] bpftool: add link bash completions Andrii Nakryiko
@ 2020-04-24 10:33   ` Quentin Monnet
  0 siblings, 0 replies; 24+ messages in thread
From: Quentin Monnet @ 2020-04-24 10:33 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team

2020-04-23 22:35 UTC-0700 ~ Andrii Nakryiko <andriin@fb.com>
> Extend bpftool's bash-completion script to handle new link command and its
> sub-commands.
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

Looks great, thanks!
Reviewed-by: Quentin Monnet <quentin@isovalent.com>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 00/10] bpf_link observability APIs
  2020-04-24  5:34 [PATCH bpf-next 00/10] bpf_link observability APIs Andrii Nakryiko
                   ` (9 preceding siblings ...)
  2020-04-24  5:35 ` [PATCH bpf-next 10/10] bpftool: add link bash completions Andrii Nakryiko
@ 2020-04-24 11:40 ` Toke Høiland-Jørgensen
  2020-04-24 15:54   ` Andrii Nakryiko
  10 siblings, 1 reply; 24+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-04-24 11:40 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, netdev, ast, daniel
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Andrii Nakryiko <andriin@fb.com> writes:

> This patch series adds various observability APIs to bpf_link:
>   - each bpf_link now gets ID, similar to bpf_map and bpf_prog, by which
>     user-space can iterate over all existing bpf_links and create limited FD
>     from ID;
>   - allows to get extra object information with bpf_link general and
>     type-specific information;
>   - implements `bpf link show` command which lists all active bpf_links in the
>     system;
>   - implements `bpf link pin` allowing to pin bpf_link by ID or from other
>     pinned path.
>
> rfc->v1:
>   - dropped read-only bpf_links (Alexei);

Just to make sure I understand this right: With this change, the
GET_FD_BY_ID operation will always return a r/w bpf_link fd that can
subsequently be used to detach the link? And you're doing the 'access
limiting' by just requiring CAP_SYS_ADMIN for the whole thing. Right? :)

-Toke


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 00/10] bpf_link observability APIs
  2020-04-24 11:40 ` [PATCH bpf-next 00/10] bpf_link observability APIs Toke Høiland-Jørgensen
@ 2020-04-24 15:54   ` Andrii Nakryiko
  2020-04-24 16:39     ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-24 15:54 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Fri, Apr 24, 2020 at 4:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Andrii Nakryiko <andriin@fb.com> writes:
>
> > This patch series adds various observability APIs to bpf_link:
> >   - each bpf_link now gets ID, similar to bpf_map and bpf_prog, by which
> >     user-space can iterate over all existing bpf_links and create limited FD
> >     from ID;
> >   - allows to get extra object information with bpf_link general and
> >     type-specific information;
> >   - implements `bpf link show` command which lists all active bpf_links in the
> >     system;
> >   - implements `bpf link pin` allowing to pin bpf_link by ID or from other
> >     pinned path.
> >
> > rfc->v1:
> >   - dropped read-only bpf_links (Alexei);
>
> Just to make sure I understand this right: With this change, the
> GET_FD_BY_ID operation will always return a r/w bpf_link fd that can
> subsequently be used to detach the link? And you're doing the 'access
> limiting' by just requiring CAP_SYS_ADMIN for the whole thing. Right? :)

Right.

>
> -Toke
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 07/10] bpftool: expose attach_type-to-string array to non-cgroup code
  2020-04-24 10:32   ` Quentin Monnet
@ 2020-04-24 16:27     ` Andrii Nakryiko
  2020-04-24 17:08       ` Quentin Monnet
  0 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-24 16:27 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Fri, Apr 24, 2020 at 3:32 AM Quentin Monnet <quentin@isovalent.com> wrote:
>
> 2020-04-23 22:35 UTC-0700 ~ Andrii Nakryiko <andriin@fb.com>
> > Move attach_type_strings into main.h for access in non-cgroup code.
> > bpf_attach_type is used for non-cgroup attach types quite widely now. So also
> > complete missing string translations for non-cgroup attach types.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> >  tools/bpf/bpftool/cgroup.c | 28 +++-------------------------
> >  tools/bpf/bpftool/main.h   | 32 ++++++++++++++++++++++++++++++++
> >  2 files changed, 35 insertions(+), 25 deletions(-)
> >
> > diff --git a/tools/bpf/bpftool/cgroup.c b/tools/bpf/bpftool/cgroup.c
> > index 62c6a1d7cd18..d1fd9c9f2690 100644
> > --- a/tools/bpf/bpftool/cgroup.c
> > +++ b/tools/bpf/bpftool/cgroup.c
> > @@ -31,35 +31,13 @@
> >
> >  static unsigned int query_flags;
> >
> > -static const char * const attach_type_strings[] = {
> > -     [BPF_CGROUP_INET_INGRESS] = "ingress",
> > -     [BPF_CGROUP_INET_EGRESS] = "egress",
> > -     [BPF_CGROUP_INET_SOCK_CREATE] = "sock_create",
> > -     [BPF_CGROUP_SOCK_OPS] = "sock_ops",
> > -     [BPF_CGROUP_DEVICE] = "device",
> > -     [BPF_CGROUP_INET4_BIND] = "bind4",
> > -     [BPF_CGROUP_INET6_BIND] = "bind6",
> > -     [BPF_CGROUP_INET4_CONNECT] = "connect4",
> > -     [BPF_CGROUP_INET6_CONNECT] = "connect6",
> > -     [BPF_CGROUP_INET4_POST_BIND] = "post_bind4",
> > -     [BPF_CGROUP_INET6_POST_BIND] = "post_bind6",
> > -     [BPF_CGROUP_UDP4_SENDMSG] = "sendmsg4",
> > -     [BPF_CGROUP_UDP6_SENDMSG] = "sendmsg6",
> > -     [BPF_CGROUP_SYSCTL] = "sysctl",
> > -     [BPF_CGROUP_UDP4_RECVMSG] = "recvmsg4",
> > -     [BPF_CGROUP_UDP6_RECVMSG] = "recvmsg6",
> > -     [BPF_CGROUP_GETSOCKOPT] = "getsockopt",
> > -     [BPF_CGROUP_SETSOCKOPT] = "setsockopt",
> > -     [__MAX_BPF_ATTACH_TYPE] = NULL,
>
> So you removed the "[__MAX_BPF_ATTACH_TYPE] = NULL" from the new array,
> if I understand correctly this is because all attach type enum members
> are now in the new attach_type_name[] so we're safe by looping until we
> reach __MAX_BPF_ATTACH_TYPE. Sounds good in theory but...
>

Well, NULL is default value, so having [__MAX_BPF_ATTACH_TYPE] = NULL
just increases ARRAY_SIZE(attach_type_names) by one. Which is
generally not needed, because we do proper < ARRAY_SIZE() checks
everywhere... except for one place. show_bpf_prog in cgroup.c looks up
name directly and can pass NULL into jsonw_string_field which will
crash.

I can fix that by setting [__MAX_BPF_ATTACH_TYPE] to "unknown" or
adding extra check in show_bpf_prog() code? Any preferences?

> > -};
> > -
> >  static enum bpf_attach_type parse_attach_type(const char *str)
> >  {
> >       enum bpf_attach_type type;
> >
> >       for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++) {
> > -             if (attach_type_strings[type] &&
> > -                 is_prefix(str, attach_type_strings[type]))
> > +             if (attach_type_name[type] &&
> > +                 is_prefix(str, attach_type_name[type]))
> >                       return type;
> >       }
>
> ... I'm concerned the "attach_type_name[type]" here could segfault if we
> add a new attach type to the kernel, but don't report it immediately to
> bpftool's array.

I don't think so. Here we'll iterate over all possible bpf_attach_type
(as far as our copy of UAPI header is concerned, of course). If some
of the values don't have entries in attach_type_name array, we'll get
back NULL (same as with explicit [__MAX_BPF_ATTACH_TYPE] = NULL, btw),
which will get handled properly in the loop. And caller will get back
__MAX_BPF_ATTACH_TYPE as bpf_attach_type value. So unless I'm still
missing something, it seems to be working exactly the same as before?

>
> Is there any drawback with keeping the "[__MAX_BPF_ATTACH_TYPE] = NULL"?
> Or change here to loop on ARRAY_SIZE(), as you do in your own patch for
> link?

ARRAY_SIZE() == __MAX_BPF_ATTACH_TYPE, isn't it? Previously ARRAY_SIZE
was (__MAX_BPF_ATTACH_TYPE + 1), but I don't think it's necessary?

The only difference is show_bpf_prog() which now is going to do out of
array reads, while previously it would get NULL. But both cases are
bad and needs fixing.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 08/10] bpftool: add bpf_link show and pin support
  2020-04-24 10:32   ` Quentin Monnet
@ 2020-04-24 16:30     ` Andrii Nakryiko
  0 siblings, 0 replies; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-24 16:30 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Fri, Apr 24, 2020 at 3:32 AM Quentin Monnet <quentin@isovalent.com> wrote:
>
> 2020-04-23 22:35 UTC-0700 ~ Andrii Nakryiko <andriin@fb.com>
> > Add `bpftool link show` and `bpftool link pin` commands.
> >
> > Example plain output for `link show` (with showing pinned paths):
> >
> > [vmuser@archvm bpf]$ sudo ~/local/linux/tools/bpf/bpftool/bpftool -f link
> > 1: tracing  prog 12
> >         prog_type tracing  attach_type fentry
> >         pinned /sys/fs/bpf/my_test_link
> >         pinned /sys/fs/bpf/my_test_link2
> > 2: tracing  prog 13
> >         prog_type tracing  attach_type fentry
> > 3: tracing  prog 14
> >         prog_type tracing  attach_type fentry
> > 4: tracing  prog 15
> >         prog_type tracing  attach_type fentry
> > 5: tracing  prog 16
> >         prog_type tracing  attach_type fentry
> > 6: tracing  prog 17
> >         prog_type tracing  attach_type fentry
> > 7: raw_tracepoint  prog 21
> >         tp 'sys_enter'
> > 8: cgroup  prog 25
> >         cgroup_id 584  attach_type egress
> > 9: cgroup  prog 25
> >         cgroup_id 599  attach_type egress
> > 10: cgroup  prog 25
> >         cgroup_id 614  attach_type egress
> > 11: cgroup  prog 25
> >         cgroup_id 629  attach_type egress
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> >  tools/bpf/bpftool/common.c |   2 +
> >  tools/bpf/bpftool/link.c   | 402 +++++++++++++++++++++++++++++++++++++
> >  tools/bpf/bpftool/main.c   |   6 +-
> >  tools/bpf/bpftool/main.h   |   5 +
> >  4 files changed, 414 insertions(+), 1 deletion(-)
> >  create mode 100644 tools/bpf/bpftool/link.c
> >
> > diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
> > index f2223dbdfb0a..c47bdc65de8e 100644
> > --- a/tools/bpf/bpftool/common.c
> > +++ b/tools/bpf/bpftool/common.c
> > @@ -262,6 +262,8 @@ int get_fd_type(int fd)
> >               return BPF_OBJ_MAP;
> >       else if (strstr(buf, "bpf-prog"))
> >               return BPF_OBJ_PROG;
> > +     else if (strstr(buf, "bpf-link"))
> > +             return BPF_OBJ_LINK;
> >
> >       return BPF_OBJ_UNKNOWN;
> >  }
> > diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
> > new file mode 100644
> > index 000000000000..d5dcf9e46536
> > --- /dev/null
> > +++ b/tools/bpf/bpftool/link.c
>
> [...]
>
> > +
> > +static int link_parse_fd(int *argc, char ***argv)
> > +{
> > +     int *fds = NULL;
> > +     int nb_fds, fd;
> > +
> > +     fds = malloc(sizeof(int));
> > +     if (!fds) {
> > +             p_err("mem alloc failed");
> > +             return -1;
> > +     }
> > +     nb_fds = link_parse_fds(argc, argv, &fds);
> > +     if (nb_fds != 1) {
> > +             if (nb_fds > 1) {
> > +                     p_err("several links match this handle");
>
> Can this ever happen? "bpftool prog show" has this because "name" or
> "tag" can match multiple programs. But "id" and "pinned" for link should
> not, as far as I understand.

No, it shouldn't. I'll keep only link_parse_fd and will always return
singe fd, thanks.

>
> > +                     while (nb_fds--)
> > +                             close(fds[nb_fds]);
> > +             }
> > +             fd = -1;
> > +             goto exit_free;
> > +     }
> > +
> > +     fd = fds[0];
> > +exit_free:
> > +     free(fds);
> > +     return fd;
> > +}
> > +
> > +static void
> > +show_link_header_json(struct bpf_link_info *info, json_writer_t *wtr)
> > +{
> > +     jsonw_uint_field(wtr, "id", info->id);
> > +     if (info->type < ARRAY_SIZE(link_type_name))
> > +             jsonw_string_field(wtr, "type", link_type_name[info->type]);
> > +     else
> > +             jsonw_uint_field(wtr, "type", info->type);
> > +
> > +     jsonw_uint_field(json_wtr, "prog_id", info->prog_id);
> > +}
> > +
> > +static int get_prog_info(int prog_id, struct bpf_prog_info *info)
> > +{
> > +     __u32 len = sizeof(*info);
> > +     int err, prog_fd;
> > +
> > +     prog_fd = bpf_prog_get_fd_by_id(prog_id);
> > +     if (prog_fd < 0)
> > +             return prog_fd;
> > +
> > +     memset(info, 0, sizeof(*info));
> > +     err = bpf_obj_get_info_by_fd(prog_fd, info, &len);
> > +     if (err) {
> > +             p_err("can't get prog info: %s", strerror(errno));
> > +             close(prog_fd);
> > +             return err;
>
> Nit: you could "return err;" at the end of the function, and remove the
> "close()" and "return" from this if block.

yep

>
> > +     }
> > +
> > +     close(prog_fd);
> > +     return 0;
> > +}
> > +
>
> [...]
>
> > +
> > +static int do_show_subset(int argc, char **argv)
> > +{
> > +     int *fds = NULL;
> > +     int nb_fds, i;
> > +     int err = -1;
> > +
> > +     fds = malloc(sizeof(int));
> > +     if (!fds) {
> > +             p_err("mem alloc failed");
> > +             return -1;
> > +     }
> > +     nb_fds = link_parse_fds(&argc, &argv, &fds);
> > +     if (nb_fds < 1)
> > +             goto exit_free;
> > +
> > +     if (json_output && nb_fds > 1)
> > +             jsonw_start_array(json_wtr);    /* root array */
> > +     for (i = 0; i < nb_fds; i++) {
> > +             err = do_show_link(fds[i]);
> > +             if (err) {
> > +                     for (; i + 1 < nb_fds; i++)
> > +                             close(fds[i]);
> > +                     break;
> > +             }
> > +     }
> > +     if (json_output && nb_fds > 1)
> > +             jsonw_end_array(json_wtr);      /* root array */
> > +
> > +exit_free:
> > +     free(fds);
> > +     return err;
> > +}
> > +
> > +static int do_show(int argc, char **argv)
> > +{
> > +     __u32 id = 0;
> > +     int err;
> > +     int fd;
> > +
> > +     if (show_pinned)
> > +             build_pinned_obj_table(&link_table, BPF_OBJ_LINK);
> > +
> > +     if (argc == 2)
> > +             return do_show_subset(argc, argv);
>
> I understand the "_subset" aspect was taken from prog.c. But it was
> added there (ec2025095cf6) because "bpftool prog show <name|tag>" can
> match several programs, and the array with fds would be reallocated as
> required while parsing names/tags.
>
> I do not think that by restricting the selection on "id" or "pinned",
> you can get more than one link at a time. So we can probably avoid
> juggling with fd arrays for link_parse_fd() / link_parse_fds() and show
> just the single relevant link.

yep, I did quite mechanical conversion of progs code to links, I'll do
another pass with this in mind. Thanks for review!

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 09/10] bpftool: add bpftool-link manpage
  2020-04-24 10:33   ` Quentin Monnet
@ 2020-04-24 16:32     ` Andrii Nakryiko
  0 siblings, 0 replies; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-24 16:32 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Fri, Apr 24, 2020 at 3:33 AM Quentin Monnet <quentin@isovalent.com> wrote:
>
> 2020-04-23 22:35 UTC-0700 ~ Andrii Nakryiko <andriin@fb.com>
> > Add bpftool-link manpage with information and examples of link-related
> > commands.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> >  .../bpftool/Documentation/bpftool-link.rst    | 119 ++++++++++++++++++
> >  1 file changed, 119 insertions(+)
> >  create mode 100644 tools/bpf/bpftool/Documentation/bpftool-link.rst
> >
> > diff --git a/tools/bpf/bpftool/Documentation/bpftool-link.rst b/tools/bpf/bpftool/Documentation/bpftool-link.rst
> > new file mode 100644
> > index 000000000000..2866128cd6b2
> > --- /dev/null
> > +++ b/tools/bpf/bpftool/Documentation/bpftool-link.rst
> > @@ -0,0 +1,119 @@
> > +================
> > +bpftool-link
> > +================
> > +-------------------------------------------------------------------------------
> > +tool for inspection and simple manipulation of eBPF links
> > +-------------------------------------------------------------------------------
> > +
> > +:Manual section: 8
> > +
> > +SYNOPSIS
> > +========
> > +
> > +     **bpftool** [*OPTIONS*] **link *COMMAND*
>
> Missing the ending "**" after "**link", please fix.

will do

>
> > +
> > +     *OPTIONS* := { { **-j** | **--json** } [{ **-p** | **--pretty** }] | { **-f** | **--bpffs** } }
> > +
> > +     *COMMANDS* := { **show** | **list** | **pin** | **help** }
> > +
> > +LINK COMMANDS
> > +=============
> > +
> > +|    **bpftool** **link { show | list }** [*LINK*]
> > +|    **bpftool** **link pin** *LINK* *FILE*
> > +|    **bpftool** **link help**
> > +|
> > +|    *LINK* := { **id** *LINK_ID* | **pinned** *FILE* }
> > +
> > +
> > +DESCRIPTION
> > +===========
> > +     **bpftool link { show | list }** [*LINK*]
> > +               Show information about active links. If *LINK* is
> > +               specified show information only about given link,
> > +               otherwise list all links currently active on the system.
> > +
> > +               Output will start with link ID followed by link type and
> > +               zero or more named attributes, some of which depend on type
> > +                  of link.
>
> Nit: indent issue on the line above.

fixing

>
> > +
> > +     **bpftool link pin** *LINK* *FILE*
> > +               Pin link *LINK* as *FILE*.
> > +
> > +               Note: *FILE* must be located in *bpffs* mount. It must not
> > +               contain a dot character ('.'), which is reserved for future
> > +               extensions of *bpffs*.
> > +
> > +     **bpftool link help**
> > +               Print short help message.
> > +
> > +OPTIONS
> > +=======
> > +     -h, --help
> > +               Print short generic help message (similar to **bpftool help**).
> > +
> > +     -V, --version
> > +               Print version number (similar to **bpftool version**).
> > +
> > +     -j, --json
> > +               Generate JSON output. For commands that cannot produce JSON, this
> > +               option has no effect.
> > +
> > +     -p, --pretty
> > +               Generate human-readable JSON output. Implies **-j**.
> > +
> > +     -f, --bpffs
> > +               When showing BPF links, show file names of pinned
> > +               links.
> > +
> > +     -n, --nomount
> > +               Do not automatically attempt to mount any virtual file system
> > +               (such as tracefs or BPF virtual file system) when necessary.
> > +
> > +     -d, --debug
> > +               Print all logs available, even debug-level information. This
> > +               includes logs from libbpf.
> > +
> > +EXAMPLES
> > +========
> > +**# bpftool link show**
> > +
> > +::
> > +
> > +    10: cgroup  prog 25
> > +            cgroup_id 614  attach_type egress
> > +
> > +**# bpftool --json --pretty link show**
> > +
> > +::
> > +
> > +    [{
> > +            "type": "cgroup",
> > +            "prog_id": 25,
> > +            "cgroup_id": 614,
> > +            "attach_type": "egress"
> > +        }
> > +    ]
> > +
> > +|
> > +| **# mount -t bpf none /sys/fs/bpf/**
>
> [ Mounting should not be required, as you call
> do_pin_any()->do_pin_fd()->mount_bpffs_for_pin().
>
> Although on second thought I'm fine with keeping it, just in case users
> call bpftool --nomount. ]

It was a copy/paste from bpftool-prog.rst, but I think I'll drop it
for this one (and keep it in bpftool-prog.rst).

>
> > +| **# bpftool link pin id 10 /sys/fs/bpf/link**
> > +| **# ls -l /sys/fs/bpf/**
> > +
> > +::
> > +
> > +    -rw------- 1 root root 0 Apr 23 21:39 link
> > +
> > +
> > +SEE ALSO
> > +========
> > +     **bpf**\ (2),
> > +     **bpf-helpers**\ (7),
> > +     **bpftool**\ (8),
> > +     **bpftool-prog\ (8),
> > +     **bpftool-map**\ (8),
> > +     **bpftool-cgroup**\ (8),
> > +     **bpftool-feature**\ (8),
> > +     **bpftool-net**\ (8),
> > +     **bpftool-perf**\ (8),
> > +     **bpftool-btf**\ (8)
> >
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 00/10] bpf_link observability APIs
  2020-04-24 15:54   ` Andrii Nakryiko
@ 2020-04-24 16:39     ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 24+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-04-24 16:39 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:

> On Fri, Apr 24, 2020 at 4:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Andrii Nakryiko <andriin@fb.com> writes:
>>
>> > This patch series adds various observability APIs to bpf_link:
>> >   - each bpf_link now gets ID, similar to bpf_map and bpf_prog, by which
>> >     user-space can iterate over all existing bpf_links and create limited FD
>> >     from ID;
>> >   - allows to get extra object information with bpf_link general and
>> >     type-specific information;
>> >   - implements `bpf link show` command which lists all active bpf_links in the
>> >     system;
>> >   - implements `bpf link pin` allowing to pin bpf_link by ID or from other
>> >     pinned path.
>> >
>> > rfc->v1:
>> >   - dropped read-only bpf_links (Alexei);
>>
>> Just to make sure I understand this right: With this change, the
>> GET_FD_BY_ID operation will always return a r/w bpf_link fd that can
>> subsequently be used to detach the link? And you're doing the 'access
>> limiting' by just requiring CAP_SYS_ADMIN for the whole thing. Right? :)
>
> Right.

Great! SGTM; thanks for confirming :)

-Toke


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 07/10] bpftool: expose attach_type-to-string array to non-cgroup code
  2020-04-24 16:27     ` Andrii Nakryiko
@ 2020-04-24 17:08       ` Quentin Monnet
  2020-04-25  0:12         ` Andrii Nakryiko
  0 siblings, 1 reply; 24+ messages in thread
From: Quentin Monnet @ 2020-04-24 17:08 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

2020-04-24 09:27 UTC-0700 ~ Andrii Nakryiko <andrii.nakryiko@gmail.com>
> On Fri, Apr 24, 2020 at 3:32 AM Quentin Monnet <quentin@isovalent.com> wrote:
>>
>> 2020-04-23 22:35 UTC-0700 ~ Andrii Nakryiko <andriin@fb.com>
>>> Move attach_type_strings into main.h for access in non-cgroup code.
>>> bpf_attach_type is used for non-cgroup attach types quite widely now. So also
>>> complete missing string translations for non-cgroup attach types.
>>>
>>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>>> ---
>>>  tools/bpf/bpftool/cgroup.c | 28 +++-------------------------
>>>  tools/bpf/bpftool/main.h   | 32 ++++++++++++++++++++++++++++++++
>>>  2 files changed, 35 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/tools/bpf/bpftool/cgroup.c b/tools/bpf/bpftool/cgroup.c
>>> index 62c6a1d7cd18..d1fd9c9f2690 100644
>>> --- a/tools/bpf/bpftool/cgroup.c
>>> +++ b/tools/bpf/bpftool/cgroup.c
>>> @@ -31,35 +31,13 @@
>>>
>>>  static unsigned int query_flags;
>>>
>>> -static const char * const attach_type_strings[] = {
>>> -     [BPF_CGROUP_INET_INGRESS] = "ingress",
>>> -     [BPF_CGROUP_INET_EGRESS] = "egress",
>>> -     [BPF_CGROUP_INET_SOCK_CREATE] = "sock_create",
>>> -     [BPF_CGROUP_SOCK_OPS] = "sock_ops",
>>> -     [BPF_CGROUP_DEVICE] = "device",
>>> -     [BPF_CGROUP_INET4_BIND] = "bind4",
>>> -     [BPF_CGROUP_INET6_BIND] = "bind6",
>>> -     [BPF_CGROUP_INET4_CONNECT] = "connect4",
>>> -     [BPF_CGROUP_INET6_CONNECT] = "connect6",
>>> -     [BPF_CGROUP_INET4_POST_BIND] = "post_bind4",
>>> -     [BPF_CGROUP_INET6_POST_BIND] = "post_bind6",
>>> -     [BPF_CGROUP_UDP4_SENDMSG] = "sendmsg4",
>>> -     [BPF_CGROUP_UDP6_SENDMSG] = "sendmsg6",
>>> -     [BPF_CGROUP_SYSCTL] = "sysctl",
>>> -     [BPF_CGROUP_UDP4_RECVMSG] = "recvmsg4",
>>> -     [BPF_CGROUP_UDP6_RECVMSG] = "recvmsg6",
>>> -     [BPF_CGROUP_GETSOCKOPT] = "getsockopt",
>>> -     [BPF_CGROUP_SETSOCKOPT] = "setsockopt",
>>> -     [__MAX_BPF_ATTACH_TYPE] = NULL,
>>
>> So you removed the "[__MAX_BPF_ATTACH_TYPE] = NULL" from the new array,
>> if I understand correctly this is because all attach type enum members
>> are now in the new attach_type_name[] so we're safe by looping until we
>> reach __MAX_BPF_ATTACH_TYPE. Sounds good in theory but...
>>
> 
> Well, NULL is default value, so having [__MAX_BPF_ATTACH_TYPE] = NULL
> just increases ARRAY_SIZE(attach_type_names) by one. Which is
> generally not needed, because we do proper < ARRAY_SIZE() checks
> everywhere... except for one place. show_bpf_prog in cgroup.c looks up
> name directly and can pass NULL into jsonw_string_field which will
> crash.
> 
> I can fix that by setting [__MAX_BPF_ATTACH_TYPE] to "unknown" or
> adding extra check in show_bpf_prog() code? Any preferences?

Maybe add the extra check, so we remove this [__MAX_BPF_ATTACH_TYPE]
indeed. It will be more consistent with the array with program names,
and as you say, all other places loop on ARRAY_SIZE() just fine.

Maybe we could print the integer value for the type if we don't know the
name? Not sure if this is good for JSON though.

> 
>>> -};
>>> -
>>>  static enum bpf_attach_type parse_attach_type(const char *str)
>>>  {
>>>       enum bpf_attach_type type;
>>>
>>>       for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++) {
>>> -             if (attach_type_strings[type] &&
>>> -                 is_prefix(str, attach_type_strings[type]))
>>> +             if (attach_type_name[type] &&
>>> +                 is_prefix(str, attach_type_name[type]))
>>>                       return type;
>>>       }
>>
>> ... I'm concerned the "attach_type_name[type]" here could segfault if we
>> add a new attach type to the kernel, but don't report it immediately to
>> bpftool's array.
> 
> I don't think so. Here we'll iterate over all possible bpf_attach_type
> (as far as our copy of UAPI header is concerned, of course). If some
> of the values don't have entries in attach_type_name array, we'll get
> back NULL (same as with explicit [__MAX_BPF_ATTACH_TYPE] = NULL, btw),
> which will get handled properly in the loop. And caller will get back
> __MAX_BPF_ATTACH_TYPE as bpf_attach_type value. So unless I'm still
> missing something, it seems to be working exactly the same as before?
> 
>>
>> Is there any drawback with keeping the "[__MAX_BPF_ATTACH_TYPE] = NULL"?
>> Or change here to loop on ARRAY_SIZE(), as you do in your own patch for
>> link?
> 
> ARRAY_SIZE() == __MAX_BPF_ATTACH_TYPE, isn't it? Previously ARRAY_SIZE
> was (__MAX_BPF_ATTACH_TYPE + 1), but I don't think it's necessary?

ARRAY_SIZE() /should/ be equal to __MAX_BPF_ATTACH_TYPE, the concern is
only if new attach types get added to UAPI header and we forget to add
them to the array. In that case, the assumption is not longer valid and
we risk reading out of the array in parse_attach_type(). That was not
the case before, because we knew that the array was always big enough.
There was no risk to read beyond index __MAX_BPF_ATTACH_TYPE, there is
one now to read beyond index BPF_LSM_MAC when new types are added. Or am
I the one missing something?

> 
> The only difference is show_bpf_prog() which now is going to do out of
> array reads, while previously it would get NULL. But both cases are
> bad and needs fixing.
> 

Right, nice catch, this needs a fix.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 07/10] bpftool: expose attach_type-to-string array to non-cgroup code
  2020-04-24 17:08       ` Quentin Monnet
@ 2020-04-25  0:12         ` Andrii Nakryiko
  2020-04-25 10:19           ` Quentin Monnet
  0 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2020-04-25  0:12 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Fri, Apr 24, 2020 at 10:08 AM Quentin Monnet <quentin@isovalent.com> wrote:
>
> 2020-04-24 09:27 UTC-0700 ~ Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > On Fri, Apr 24, 2020 at 3:32 AM Quentin Monnet <quentin@isovalent.com> wrote:
> >>
> >> 2020-04-23 22:35 UTC-0700 ~ Andrii Nakryiko <andriin@fb.com>
> >>> Move attach_type_strings into main.h for access in non-cgroup code.
> >>> bpf_attach_type is used for non-cgroup attach types quite widely now. So also
> >>> complete missing string translations for non-cgroup attach types.
> >>>
> >>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> >>> ---
> >>>  tools/bpf/bpftool/cgroup.c | 28 +++-------------------------
> >>>  tools/bpf/bpftool/main.h   | 32 ++++++++++++++++++++++++++++++++
> >>>  2 files changed, 35 insertions(+), 25 deletions(-)
> >>>
> >>> diff --git a/tools/bpf/bpftool/cgroup.c b/tools/bpf/bpftool/cgroup.c
> >>> index 62c6a1d7cd18..d1fd9c9f2690 100644
> >>> --- a/tools/bpf/bpftool/cgroup.c
> >>> +++ b/tools/bpf/bpftool/cgroup.c
> >>> @@ -31,35 +31,13 @@
> >>>
> >>>  static unsigned int query_flags;
> >>>
> >>> -static const char * const attach_type_strings[] = {
> >>> -     [BPF_CGROUP_INET_INGRESS] = "ingress",
> >>> -     [BPF_CGROUP_INET_EGRESS] = "egress",
> >>> -     [BPF_CGROUP_INET_SOCK_CREATE] = "sock_create",
> >>> -     [BPF_CGROUP_SOCK_OPS] = "sock_ops",
> >>> -     [BPF_CGROUP_DEVICE] = "device",
> >>> -     [BPF_CGROUP_INET4_BIND] = "bind4",
> >>> -     [BPF_CGROUP_INET6_BIND] = "bind6",
> >>> -     [BPF_CGROUP_INET4_CONNECT] = "connect4",
> >>> -     [BPF_CGROUP_INET6_CONNECT] = "connect6",
> >>> -     [BPF_CGROUP_INET4_POST_BIND] = "post_bind4",
> >>> -     [BPF_CGROUP_INET6_POST_BIND] = "post_bind6",
> >>> -     [BPF_CGROUP_UDP4_SENDMSG] = "sendmsg4",
> >>> -     [BPF_CGROUP_UDP6_SENDMSG] = "sendmsg6",
> >>> -     [BPF_CGROUP_SYSCTL] = "sysctl",
> >>> -     [BPF_CGROUP_UDP4_RECVMSG] = "recvmsg4",
> >>> -     [BPF_CGROUP_UDP6_RECVMSG] = "recvmsg6",
> >>> -     [BPF_CGROUP_GETSOCKOPT] = "getsockopt",
> >>> -     [BPF_CGROUP_SETSOCKOPT] = "setsockopt",
> >>> -     [__MAX_BPF_ATTACH_TYPE] = NULL,
> >>
> >> So you removed the "[__MAX_BPF_ATTACH_TYPE] = NULL" from the new array,
> >> if I understand correctly this is because all attach type enum members
> >> are now in the new attach_type_name[] so we're safe by looping until we
> >> reach __MAX_BPF_ATTACH_TYPE. Sounds good in theory but...
> >>
> >
> > Well, NULL is default value, so having [__MAX_BPF_ATTACH_TYPE] = NULL
> > just increases ARRAY_SIZE(attach_type_names) by one. Which is
> > generally not needed, because we do proper < ARRAY_SIZE() checks
> > everywhere... except for one place. show_bpf_prog in cgroup.c looks up
> > name directly and can pass NULL into jsonw_string_field which will
> > crash.
> >
> > I can fix that by setting [__MAX_BPF_ATTACH_TYPE] to "unknown" or
> > adding extra check in show_bpf_prog() code? Any preferences?
>
> Maybe add the extra check, so we remove this [__MAX_BPF_ATTACH_TYPE]
> indeed. It will be more consistent with the array with program names,
> and as you say, all other places loop on ARRAY_SIZE() just fine.

Sounds good.

>
> Maybe we could print the integer value for the type if we don't know the
> name? Not sure if this is good for JSON though.

We do that in a bunch of places, I'll see if that's easy to do.

>
> >
> >>> -};
> >>> -
> >>>  static enum bpf_attach_type parse_attach_type(const char *str)
> >>>  {
> >>>       enum bpf_attach_type type;
> >>>
> >>>       for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++) {
> >>> -             if (attach_type_strings[type] &&
> >>> -                 is_prefix(str, attach_type_strings[type]))
> >>> +             if (attach_type_name[type] &&
> >>> +                 is_prefix(str, attach_type_name[type]))
> >>>                       return type;
> >>>       }
> >>
> >> ... I'm concerned the "attach_type_name[type]" here could segfault if we
> >> add a new attach type to the kernel, but don't report it immediately to
> >> bpftool's array.
> >
> > I don't think so. Here we'll iterate over all possible bpf_attach_type
> > (as far as our copy of UAPI header is concerned, of course). If some
> > of the values don't have entries in attach_type_name array, we'll get
> > back NULL (same as with explicit [__MAX_BPF_ATTACH_TYPE] = NULL, btw),
> > which will get handled properly in the loop. And caller will get back
> > __MAX_BPF_ATTACH_TYPE as bpf_attach_type value. So unless I'm still
> > missing something, it seems to be working exactly the same as before?
> >
> >>
> >> Is there any drawback with keeping the "[__MAX_BPF_ATTACH_TYPE] = NULL"?
> >> Or change here to loop on ARRAY_SIZE(), as you do in your own patch for
> >> link?
> >
> > ARRAY_SIZE() == __MAX_BPF_ATTACH_TYPE, isn't it? Previously ARRAY_SIZE
> > was (__MAX_BPF_ATTACH_TYPE + 1), but I don't think it's necessary?
>
> ARRAY_SIZE() /should/ be equal to __MAX_BPF_ATTACH_TYPE, the concern is
> only if new attach types get added to UAPI header and we forget to add
> them to the array. In that case, the assumption is not longer valid and
> we risk reading out of the array in parse_attach_type(). That was not
> the case before, because we knew that the array was always big enough.
> There was no risk to read beyond index __MAX_BPF_ATTACH_TYPE, there is
> one now to read beyond index BPF_LSM_MAC when new types are added. Or am
> I the one missing something?

Ah, I see what you are saying... I can just declare array as

const char *attach_type_strings[__MAX_BPF_ATTACH_TYPE] = { ... }

to prevent this. There is still this issue of potentially getting back
NULL pointer. But that warrants separate "audit" of the code usage and
fixing appropriately, I don't think it belongs in this patch set.

>
> >
> > The only difference is show_bpf_prog() which now is going to do out of
> > array reads, while previously it would get NULL. But both cases are
> > bad and needs fixing.
> >
>
> Right, nice catch, this needs a fix.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next 07/10] bpftool: expose attach_type-to-string array to non-cgroup code
  2020-04-25  0:12         ` Andrii Nakryiko
@ 2020-04-25 10:19           ` Quentin Monnet
  0 siblings, 0 replies; 24+ messages in thread
From: Quentin Monnet @ 2020-04-25 10:19 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Sat, 25 Apr 2020 at 01:12, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

>
> Ah, I see what you are saying... I can just declare array as
>
> const char *attach_type_strings[__MAX_BPF_ATTACH_TYPE] = { ... }
>

That would address my concern, thanks!

> to prevent this. There is still this issue of potentially getting back
> NULL pointer. But that warrants separate "audit" of the code usage and
> fixing appropriately, I don't think it belongs in this patch set.

Agreed, we can keep that for later. My main concern for this review
was the other point above anyway.

Thank you,
Quentin

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2020-04-25 10:19 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-24  5:34 [PATCH bpf-next 00/10] bpf_link observability APIs Andrii Nakryiko
2020-04-24  5:34 ` [PATCH bpf-next 01/10] bpf: refactor bpf_link update handling Andrii Nakryiko
2020-04-24  5:34 ` [PATCH bpf-next 02/10] bpf: allocate ID for bpf_link Andrii Nakryiko
2020-04-24  5:34 ` [PATCH bpf-next 03/10] bpf: support GET_FD_BY_ID and GET_NEXT_ID " Andrii Nakryiko
2020-04-24  5:34 ` [PATCH bpf-next 04/10] bpf: add support for BPF_OBJ_GET_INFO_BY_FD " Andrii Nakryiko
2020-04-24  5:35 ` [PATCH bpf-next 05/10] libbpf: add low-level APIs for new bpf_link commands Andrii Nakryiko
2020-04-24  5:35 ` [PATCH bpf-next 06/10] selftests/bpf: test bpf_link's get_next_id, get_fd_by_id, and get_obj_info Andrii Nakryiko
2020-04-24  5:35 ` [PATCH bpf-next 07/10] bpftool: expose attach_type-to-string array to non-cgroup code Andrii Nakryiko
2020-04-24 10:32   ` Quentin Monnet
2020-04-24 16:27     ` Andrii Nakryiko
2020-04-24 17:08       ` Quentin Monnet
2020-04-25  0:12         ` Andrii Nakryiko
2020-04-25 10:19           ` Quentin Monnet
2020-04-24  5:35 ` [PATCH bpf-next 08/10] bpftool: add bpf_link show and pin support Andrii Nakryiko
2020-04-24 10:32   ` Quentin Monnet
2020-04-24 16:30     ` Andrii Nakryiko
2020-04-24  5:35 ` [PATCH bpf-next 09/10] bpftool: add bpftool-link manpage Andrii Nakryiko
2020-04-24 10:33   ` Quentin Monnet
2020-04-24 16:32     ` Andrii Nakryiko
2020-04-24  5:35 ` [PATCH bpf-next 10/10] bpftool: add link bash completions Andrii Nakryiko
2020-04-24 10:33   ` Quentin Monnet
2020-04-24 11:40 ` [PATCH bpf-next 00/10] bpf_link observability APIs Toke Høiland-Jørgensen
2020-04-24 15:54   ` Andrii Nakryiko
2020-04-24 16:39     ` Toke Høiland-Jørgensen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).