bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next 0/2] Document some recent core kfunc additions
@ 2022-12-02 22:07 David Vernet
  2022-12-02 22:07 ` [PATCH bpf-next 1/2] bpf/docs: Document struct task_struct * kfuncs David Vernet
  2022-12-02 22:07 ` [PATCH bpf-next 2/2] bpf/docs: Document struct cgroup " David Vernet
  0 siblings, 2 replies; 6+ messages in thread
From: David Vernet @ 2022-12-02 22:07 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, linux-kernel, kernel-team

A series of recent patch sets introduced kfuncs that allowed struct
task_struct and struct cgroup objects to be used as kptrs. These were
introduced in [0], [1], and [2].

[0]: https://lore.kernel.org/lkml/20221120051004.3605026-1-void@manifault.com/
[1]: https://lore.kernel.org/lkml/20221122145300.251210-2-void@manifault.com/T/
[2]: https://lore.kernel.org/lkml/20221122055458.173143-1-void@manifault.com/

These are "core" kfuncs, in that they may be used by a wide variety of
possible BPF tracepoint or struct_ops programs, and are defined in
kernel/bpf/helpers.c. Even though as kfuncs they have no ABI stability
guarantees, they should still be properly documented. This patch set
adds that documentation.

Some other kfuncs were added recently as well, such as
bpf_rcu_read_lock() and bpf_rcu_read_unlock(). Those could and should be
added to this "Core kfuncs" section as well in other patch sets.

David Vernet (2):
  bpf/docs: Document struct task_struct * kfuncs
  bpf/docs: Document struct cgroup * kfuncs

 Documentation/bpf/kfuncs.rst | 198 +++++++++++++++++++++++++++++++++++
 kernel/bpf/helpers.c         |  10 +-
 2 files changed, 203 insertions(+), 5 deletions(-)

-- 
2.38.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH bpf-next 1/2] bpf/docs: Document struct task_struct * kfuncs
  2022-12-02 22:07 [PATCH bpf-next 0/2] Document some recent core kfunc additions David Vernet
@ 2022-12-02 22:07 ` David Vernet
  2022-12-03  2:15   ` Alexei Starovoitov
  2022-12-02 22:07 ` [PATCH bpf-next 2/2] bpf/docs: Document struct cgroup " David Vernet
  1 sibling, 1 reply; 6+ messages in thread
From: David Vernet @ 2022-12-02 22:07 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, linux-kernel, kernel-team

bpf_task_acquire(), bpf_task_release(), bpf_task_kptr_get(), and
bpf_task_from_pid() are kfuncs that were recently added to
kernel/bpf/helpers.c. These are "core" kfuncs in that they're available
for use for any tracepoint or struct_ops BPF program. Though they have
no ABI stability guarantees, we should still document them. This patch
adds a new Core kfuncs section to the BPF kfuncs doc, and adds entries
for all of these task kfuncs.

Signed-off-by: David Vernet <void@manifault.com>
---
 Documentation/bpf/kfuncs.rst | 148 +++++++++++++++++++++++++++++++++++
 kernel/bpf/helpers.c         |   8 +-
 2 files changed, 152 insertions(+), 4 deletions(-)

diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
index 90774479ab7a..b0c35ad6fad4 100644
--- a/Documentation/bpf/kfuncs.rst
+++ b/Documentation/bpf/kfuncs.rst
@@ -213,3 +213,151 @@ type. An example is shown below::
                 return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_task_kfunc_set);
         }
         late_initcall(init_subsystem);
+
+3. Core kfuncs
+==============
+
+The BPF subsystem provides a number of "core" kfuncs that are potentially
+applicable to a wide variety of different possible use cases and programs.
+Those kfuncs are documented here.
+
+3.1 struct task_struct * kfuncs
+-------------------------------
+
+There are a number of kfuncs that allow ``struct task_struct *`` objects to be
+used as kptrs:
+
+.. kernel-doc:: kernel/bpf/helpers.c
+   :identifiers: bpf_task_acquire bpf_task_release
+
+These kfuncs are useful when you want to acquire or release a reference to a
+``struct task_struct *`` that was passed as e.g. a tracepoint arg, or a
+struct_ops callback arg. For example:
+
+.. code-block:: c
+
+	/**
+	 * A trivial example tracepoint program that shows how to
+	 * acquire and release a struct task_struct * pointer.
+	 */
+	SEC("tp_btf/task_newtask")
+	int BPF_PROG(task_acquire_release_example, struct task_struct *task, u64 clone_flags)
+	{
+		struct task_struct *acquired;
+
+		acquired = bpf_task_acquire(task);
+
+		/*
+		 * In a typical program you'd do something like store
+		 * the task in a map. Here, we just release it.
+		 */
+		bpf_task_release(acquired);
+		return 0;
+	}
+
+If you want to acquire a reference to a ``struct task_struct`` kptr that's
+already stored in a map, you can use bpf_task_kptr_get():
+
+.. kernel-doc:: kernel/bpf/helpers.c
+   :identifiers: bpf_task_kptr_get
+
+Here's an example of how it can be used:
+
+.. code-block:: c
+
+	/* struct containing the struct task_struct kptr which is actually stored in the map. */
+	struct __tasks_kfunc_map_value {
+		struct task_struct __kptr_ref * task;
+	};
+
+	/* The map containing struct __tasks_kfunc_map_value entries. */
+	struct hash_map {
+		__uint(type, BPF_MAP_TYPE_HASH);
+		__type(key, int);
+		__type(value, struct __tasks_kfunc_map_value);
+		__uint(max_entries, 1);
+	} __tasks_kfunc_map SEC(".maps");
+
+	/* ... */
+
+	/**
+	 * A simple example tracepoint program showing how a
+	 * struct task_struct kptr that is stored in a map can
+	 * be acquired using the bpf_task_kptr_get() kfunc.
+	 */
+	 SEC("tp_btf/task_newtask")
+	 int BPF_PROG(task_kptr_get_example, struct task_struct *task, u64 clone_flags)
+	 {
+		struct task_struct *kptr;
+		struct __tasks_kfunc_map_value *v;
+		s32 pid;
+		long status;
+
+		status = bpf_probe_read_kernel(&pid, sizeof(pid), &task->pid);
+		if (status)
+			return status;
+
+		/* Assume a task kptr was previously stored in the map. */
+		v = bpf_map_lookup_elem(&__tasks_kfunc_map, &pid);
+		if (!v)
+			return -ENOENT;
+
+		/* Acquire a reference to the task kptr that's already stored in the map. */
+		kptr = bpf_task_kptr_get(&v->task);
+		if (!kptr)
+			/* If no task was present in the map, it's because
+			 * we're racing with another CPU that removed it with
+			 * bpf_kptr_xchg() between the bpf_map_lookup_elem()
+			 * above, and our call to bpf_task_kptr_get().
+			 * bpf_task_kptr_get() internally safely handles this
+			 * race, and will return NULL if the task is no longer
+			 * present in the map by the time we invoke the kfunc.
+			 */
+			return -EBUSY;
+
+		/* Free the reference we just took above. Note that the
+		 * original struct task_struct kptr is still in the map.
+		 * It will be freed either at a later time if another
+		 * context deletes it from the map, or automatically by
+		 * the BPF subsystem if it's still present when the map
+		 * is destroyed.
+		 */
+		bpf_task_release(kptr);
+
+		return 0;
+        }
+
+Finally, a BPF program can also look up a task from a pid. This can be useful
+if the caller doesn't have a trusted pointer to a ``struct task_struct *``
+object that it can acquire a reference on with bpf_task_acquire().
+
+.. kernel-doc:: kernel/bpf/helpers.c
+   :identifiers: bpf_task_from_pid
+
+Here is an example of it being used:
+
+.. code-block:: c
+
+	SEC("tp_btf/task_newtask")
+	int BPF_PROG(task_get_pid_example, struct task_struct *task, u64 clone_flags)
+	{
+		struct task_struct *lookup;
+
+		lookup = bpf_task_from_pid(task->pid);
+		if (!lookup)
+			/* A task should always be found, as %task is a tracepoint arg. */
+			return -ENOENT;
+
+		if (lookup->pid != task->pid) {
+			/* The pid of the lookup task should be the same as the input task. */
+			bpf_task_release(lookup);
+			return -EINVAL;
+		}
+
+		/* bpf_task_from_pid() returns an acquired reference,
+		 * so it must be dropped before returning from the
+		 * tracepoint handler.
+		 */
+		bpf_task_release(lookup);
+		return 0;
+	}
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index a5a511430f2a..004afbc14bbf 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1868,10 +1868,10 @@ struct task_struct *bpf_task_kptr_get(struct task_struct **pp)
 }
 
 /**
- * bpf_task_release - Release the reference acquired on a struct task_struct *.
- * If this kfunc is invoked in an RCU read region, the task_struct is
- * guaranteed to not be freed until the current grace period has ended, even if
- * its refcount drops to 0.
+ * bpf_task_release - Release the reference acquired on a task.  If this kfunc
+ * is invoked in an RCU read region, the task_struct is guaranteed to not be
+ * freed until the current grace period has ended, even if its refcount drops
+ * to 0.
  * @p: The task on which a reference is being released.
  */
 void bpf_task_release(struct task_struct *p)
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH bpf-next 2/2] bpf/docs: Document struct cgroup * kfuncs
  2022-12-02 22:07 [PATCH bpf-next 0/2] Document some recent core kfunc additions David Vernet
  2022-12-02 22:07 ` [PATCH bpf-next 1/2] bpf/docs: Document struct task_struct * kfuncs David Vernet
@ 2022-12-02 22:07 ` David Vernet
  1 sibling, 0 replies; 6+ messages in thread
From: David Vernet @ 2022-12-02 22:07 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, linux-kernel, kernel-team

bpf_cgroup_acquire(), bpf_cgroup_release(), bpf_cgroup_kptr_get(), and
bpf_cgroup_ancestor(), are kfuncs that were recnetly added to
kernel/bpf/helpers.c. These are "core" kfuncs in that they're available
for use in any tracepoint or struct_ops BPF program. Though they have no
ABI stability guarantees, we should still document them. This patch adds
a struct cgroup * subsection to the Core kfuncs section which describes
each of these kfuncs.

Signed-off-by: David Vernet <void@manifault.com>
---
 Documentation/bpf/kfuncs.rst | 49 ++++++++++++++++++++++++++++++++++++
 kernel/bpf/helpers.c         |  2 +-
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
index b0c35ad6fad4..fd2c8c3b2d52 100644
--- a/Documentation/bpf/kfuncs.rst
+++ b/Documentation/bpf/kfuncs.rst
@@ -361,3 +361,52 @@ Here is an example of it being used:
 		bpf_task_release(lookup);
 		return 0;
 	}
+
+3.2 struct cgroup * kfuncs
+--------------------------
+
+``struct cgroup *`` objects also have acquire, release, and kptr_get functions:
+
+.. kernel-doc:: kernel/bpf/helpers.c
+   :identifiers: bpf_cgroup_acquire bpf_cgroup_release
+
+.. kernel-doc:: kernel/bpf/helpers.c
+   :identifiers: bpf_cgroup_kptr_get
+
+These kfuncs are used in exactly the same manner as bpf_task_acquire(),
+bpf_task_release(), and bpf_task_kptr_get() respectively, so we won't provide
+examples for them.
+
+Another kfunc available for interacting with ``struct cgroup *`` objects is
+bpf_cgroup_ancestor(). This allows callers to access the ancestor of a cgroup,
+and return it as a cgroup kptr.
+
+.. kernel-doc:: kernel/bpf/helpers.c
+   :identifiers: bpf_cgroup_ancestor
+
+Eventually, BPF should be updated to allow this to happen with a normal memory
+load in the program itself. This is currently not possible without more work in
+the verifier. bpf_cgroup_ancestor() can be used as follows:
+
+.. code-block:: c
+
+	/**
+	 * Simple tracepoint example that illustrates how a cgroup's
+	 * ancestor can be accessed using bpf_cgroup_ancestor().
+	 */
+	SEC("tp_btf/cgroup_mkdir")
+	int BPF_PROG(cgrp_ancestor_example, struct cgroup *cgrp, const char *path)
+	{
+		struct cgroup *parent;
+
+		/* The parent cgroup resides at the level before the current cgroup's level. */
+		parent = bpf_cgroup_ancestor(cgrp, cgrp->level - 1);
+		if (!parent)
+			return -ENOENT;
+
+		bpf_printk("Parent id is %d", parent->self.id);
+
+		/* Return the parent cgroup that was acquired above. */
+		bpf_cgroup_release(parent);
+		return 0;
+	}
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 004afbc14bbf..2a07d216c8f3 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1927,7 +1927,7 @@ struct cgroup *bpf_cgroup_kptr_get(struct cgroup **cgrpp)
 }
 
 /**
- * bpf_cgroup_release - Release the reference acquired on a struct cgroup *.
+ * bpf_cgroup_release - Release the reference acquired on a struct cgroup kptr.
  * If this kfunc is invoked in an RCU read region, the cgroup is guaranteed to
  * not be freed until the current grace period has ended, even if its refcount
  * drops to 0.
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf/docs: Document struct task_struct * kfuncs
  2022-12-02 22:07 ` [PATCH bpf-next 1/2] bpf/docs: Document struct task_struct * kfuncs David Vernet
@ 2022-12-03  2:15   ` Alexei Starovoitov
  2022-12-05 16:10     ` David Vernet
  0 siblings, 1 reply; 6+ messages in thread
From: Alexei Starovoitov @ 2022-12-03  2:15 UTC (permalink / raw)
  To: David Vernet
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, linux-kernel, kernel-team

On Fri, Dec 02, 2022 at 04:07:35PM -0600, David Vernet wrote:
> bpf_task_acquire(), bpf_task_release(), bpf_task_kptr_get(), and
> bpf_task_from_pid() are kfuncs that were recently added to
> kernel/bpf/helpers.c. These are "core" kfuncs in that they're available
> for use for any tracepoint or struct_ops BPF program. Though they have
> no ABI stability guarantees, we should still document them. This patch
> adds a new Core kfuncs section to the BPF kfuncs doc, and adds entries
> for all of these task kfuncs.
> 
> Signed-off-by: David Vernet <void@manifault.com>
> ---
>  Documentation/bpf/kfuncs.rst | 148 +++++++++++++++++++++++++++++++++++
>  kernel/bpf/helpers.c         |   8 +-
>  2 files changed, 152 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
> index 90774479ab7a..b0c35ad6fad4 100644
> --- a/Documentation/bpf/kfuncs.rst
> +++ b/Documentation/bpf/kfuncs.rst
> @@ -213,3 +213,151 @@ type. An example is shown below::
>                  return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_task_kfunc_set);
>          }
>          late_initcall(init_subsystem);
> +
> +3. Core kfuncs
> +==============
> +
> +The BPF subsystem provides a number of "core" kfuncs that are potentially
> +applicable to a wide variety of different possible use cases and programs.
> +Those kfuncs are documented here.
> +
> +3.1 struct task_struct * kfuncs
> +-------------------------------
> +
> +There are a number of kfuncs that allow ``struct task_struct *`` objects to be
> +used as kptrs:
> +
> +.. kernel-doc:: kernel/bpf/helpers.c
> +   :identifiers: bpf_task_acquire bpf_task_release
> +
> +These kfuncs are useful when you want to acquire or release a reference to a
> +``struct task_struct *`` that was passed as e.g. a tracepoint arg, or a
> +struct_ops callback arg. For example:
> +
> +.. code-block:: c
> +
> +	/**
> +	 * A trivial example tracepoint program that shows how to
> +	 * acquire and release a struct task_struct * pointer.
> +	 */
> +	SEC("tp_btf/task_newtask")
> +	int BPF_PROG(task_acquire_release_example, struct task_struct *task, u64 clone_flags)
> +	{
> +		struct task_struct *acquired;
> +
> +		acquired = bpf_task_acquire(task);
> +
> +		/*
> +		 * In a typical program you'd do something like store
> +		 * the task in a map. Here, we just release it.

There is a sentence later in this patch about what happens with the pointer
that was stored in a map, but I would add some part of it here as well. Like:

 * In a typical program you'd do something like store
 * the task in a map and the map will automatically release it later.
 * Here, we release it manually.

> +		 */
> +		bpf_task_release(acquired);
> +		return 0;
> +	}
> +
> +If you want to acquire a reference to a ``struct task_struct`` kptr that's
> +already stored in a map, you can use bpf_task_kptr_get():
> +
> +.. kernel-doc:: kernel/bpf/helpers.c
> +   :identifiers: bpf_task_kptr_get
> +
> +Here's an example of how it can be used:
> +
> +.. code-block:: c
> +
> +	/* struct containing the struct task_struct kptr which is actually stored in the map. */
> +	struct __tasks_kfunc_map_value {
> +		struct task_struct __kptr_ref * task;
> +	};
> +
> +	/* The map containing struct __tasks_kfunc_map_value entries. */
> +	struct hash_map {
> +		__uint(type, BPF_MAP_TYPE_HASH);
> +		__type(key, int);
> +		__type(value, struct __tasks_kfunc_map_value);
> +		__uint(max_entries, 1);
> +	} __tasks_kfunc_map SEC(".maps");
> +
> +	/* ... */
> +
> +	/**
> +	 * A simple example tracepoint program showing how a
> +	 * struct task_struct kptr that is stored in a map can
> +	 * be acquired using the bpf_task_kptr_get() kfunc.
> +	 */
> +	 SEC("tp_btf/task_newtask")
> +	 int BPF_PROG(task_kptr_get_example, struct task_struct *task, u64 clone_flags)
> +	 {
> +		struct task_struct *kptr;
> +		struct __tasks_kfunc_map_value *v;
> +		s32 pid;
> +		long status;
> +
> +		status = bpf_probe_read_kernel(&pid, sizeof(pid), &task->pid);

why use the slow bpf_probe_read_kernel() here?
I think the example should follow modern coding practices.
Just: pid = task->pid; instead ?

> +		if (status)
> +			return status;
> +
> +		/* Assume a task kptr was previously stored in the map. */
> +		v = bpf_map_lookup_elem(&__tasks_kfunc_map, &pid);
> +		if (!v)
> +			return -ENOENT;
> +
> +		/* Acquire a reference to the task kptr that's already stored in the map. */
> +		kptr = bpf_task_kptr_get(&v->task);
> +		if (!kptr)
> +			/* If no task was present in the map, it's because
> +			 * we're racing with another CPU that removed it with
> +			 * bpf_kptr_xchg() between the bpf_map_lookup_elem()
> +			 * above, and our call to bpf_task_kptr_get().
> +			 * bpf_task_kptr_get() internally safely handles this
> +			 * race, and will return NULL if the task is no longer
> +			 * present in the map by the time we invoke the kfunc.
> +			 */
> +			return -EBUSY;
> +
> +		/* Free the reference we just took above. Note that the
> +		 * original struct task_struct kptr is still in the map.
> +		 * It will be freed either at a later time if another
> +		 * context deletes it from the map, or automatically by
> +		 * the BPF subsystem if it's still present when the map
> +		 * is destroyed.
> +		 */
> +		bpf_task_release(kptr);
> +
> +		return 0;
> +        }
> +
> +Finally, a BPF program can also look up a task from a pid. This can be useful
> +if the caller doesn't have a trusted pointer to a ``struct task_struct *``
> +object that it can acquire a reference on with bpf_task_acquire().
> +
> +.. kernel-doc:: kernel/bpf/helpers.c
> +   :identifiers: bpf_task_from_pid
> +
> +Here is an example of it being used:
> +
> +.. code-block:: c
> +
> +	SEC("tp_btf/task_newtask")
> +	int BPF_PROG(task_get_pid_example, struct task_struct *task, u64 clone_flags)
> +	{
> +		struct task_struct *lookup;
> +
> +		lookup = bpf_task_from_pid(task->pid);
> +		if (!lookup)
> +			/* A task should always be found, as %task is a tracepoint arg. */
> +			return -ENOENT;
> +
> +		if (lookup->pid != task->pid) {
> +			/* The pid of the lookup task should be the same as the input task. */

I suspect both "errors" are actually possible in practice,
since bpf_task_from_pid is using init_pid_ns.
But this taskd might be in different pid_ns. See task_active_pid_ns.
Probably worth mentioning this aspect of bpf_task_from_pid.

> +			bpf_task_release(lookup);
> +			return -EINVAL;
> +		}
> +
> +		/* bpf_task_from_pid() returns an acquired reference,
> +		 * so it must be dropped before returning from the
> +		 * tracepoint handler.
> +		 */
> +		bpf_task_release(lookup);
> +		return 0;
> +	}
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index a5a511430f2a..004afbc14bbf 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -1868,10 +1868,10 @@ struct task_struct *bpf_task_kptr_get(struct task_struct **pp)
>  }
>  
>  /**
> - * bpf_task_release - Release the reference acquired on a struct task_struct *.
> - * If this kfunc is invoked in an RCU read region, the task_struct is
> - * guaranteed to not be freed until the current grace period has ended, even if
> - * its refcount drops to 0.
> + * bpf_task_release - Release the reference acquired on a task.  If this kfunc
> + * is invoked in an RCU read region, the task_struct is guaranteed to not be
> + * freed until the current grace period has ended, even if its refcount drops
> + * to 0.
>   * @p: The task on which a reference is being released.
>   */
>  void bpf_task_release(struct task_struct *p)
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf/docs: Document struct task_struct * kfuncs
  2022-12-03  2:15   ` Alexei Starovoitov
@ 2022-12-05 16:10     ` David Vernet
  2022-12-05 20:58       ` David Vernet
  0 siblings, 1 reply; 6+ messages in thread
From: David Vernet @ 2022-12-05 16:10 UTC (permalink / raw)
  To: Alexei Starovoitov, F
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, linux-kernel, kernel-team

On Fri, Dec 02, 2022 at 06:15:00PM -0800, Alexei Starovoitov wrote:

[...]

> > +.. code-block:: c
> > +
> > +	/**
> > +	 * A trivial example tracepoint program that shows how to
> > +	 * acquire and release a struct task_struct * pointer.
> > +	 */
> > +	SEC("tp_btf/task_newtask")
> > +	int BPF_PROG(task_acquire_release_example, struct task_struct *task, u64 clone_flags)
> > +	{
> > +		struct task_struct *acquired;
> > +
> > +		acquired = bpf_task_acquire(task);
> > +
> > +		/*
> > +		 * In a typical program you'd do something like store
> > +		 * the task in a map. Here, we just release it.
> 
> There is a sentence later in this patch about what happens with the pointer
> that was stored in a map, but I would add some part of it here as well. Like:
> 
>  * In a typical program you'd do something like store
>  * the task in a map and the map will automatically release it later.
>  * Here, we release it manually.

Will do

> > +		 */
> > +		bpf_task_release(acquired);
> > +		return 0;
> > +	}
> > +
> > +If you want to acquire a reference to a ``struct task_struct`` kptr that's
> > +already stored in a map, you can use bpf_task_kptr_get():
> > +
> > +.. kernel-doc:: kernel/bpf/helpers.c
> > +   :identifiers: bpf_task_kptr_get
> > +
> > +Here's an example of how it can be used:
> > +
> > +.. code-block:: c
> > +
> > +	/* struct containing the struct task_struct kptr which is actually stored in the map. */
> > +	struct __tasks_kfunc_map_value {
> > +		struct task_struct __kptr_ref * task;
> > +	};
> > +
> > +	/* The map containing struct __tasks_kfunc_map_value entries. */
> > +	struct hash_map {
> > +		__uint(type, BPF_MAP_TYPE_HASH);
> > +		__type(key, int);
> > +		__type(value, struct __tasks_kfunc_map_value);
> > +		__uint(max_entries, 1);
> > +	} __tasks_kfunc_map SEC(".maps");
> > +
> > +	/* ... */
> > +
> > +	/**
> > +	 * A simple example tracepoint program showing how a
> > +	 * struct task_struct kptr that is stored in a map can
> > +	 * be acquired using the bpf_task_kptr_get() kfunc.
> > +	 */
> > +	 SEC("tp_btf/task_newtask")
> > +	 int BPF_PROG(task_kptr_get_example, struct task_struct *task, u64 clone_flags)
> > +	 {
> > +		struct task_struct *kptr;
> > +		struct __tasks_kfunc_map_value *v;
> > +		s32 pid;
> > +		long status;
> > +
> > +		status = bpf_probe_read_kernel(&pid, sizeof(pid), &task->pid);
> 
> why use the slow bpf_probe_read_kernel() here?
> I think the example should follow modern coding practices.
> Just: pid = task->pid; instead ?

Yeah, I'll fix this.

[...]

> > +		if (status)
> > +			return status;
> > +
> > +		/* Assume a task kptr was previously stored in the map. */
> > +		v = bpf_map_lookup_elem(&__tasks_kfunc_map, &pid);
> > +		if (!v)
> > +			return -ENOENT;
> > +
> > +		/* Acquire a reference to the task kptr that's already stored in the map. */
> > +		kptr = bpf_task_kptr_get(&v->task);
> > +		if (!kptr)
> > +			/* If no task was present in the map, it's because
> > +			 * we're racing with another CPU that removed it with
> > +			 * bpf_kptr_xchg() between the bpf_map_lookup_elem()
> > +			 * above, and our call to bpf_task_kptr_get().
> > +			 * bpf_task_kptr_get() internally safely handles this
> > +			 * race, and will return NULL if the task is no longer
> > +			 * present in the map by the time we invoke the kfunc.
> > +			 */
> > +			return -EBUSY;
> > +
> > +		/* Free the reference we just took above. Note that the
> > +		 * original struct task_struct kptr is still in the map.
> > +		 * It will be freed either at a later time if another
> > +		 * context deletes it from the map, or automatically by
> > +		 * the BPF subsystem if it's still present when the map
> > +		 * is destroyed.
> > +		 */
> > +		bpf_task_release(kptr);
> > +
> > +		return 0;
> > +        }
> > +
> > +Finally, a BPF program can also look up a task from a pid. This can be useful
> > +if the caller doesn't have a trusted pointer to a ``struct task_struct *``
> > +object that it can acquire a reference on with bpf_task_acquire().
> > +
> > +.. kernel-doc:: kernel/bpf/helpers.c
> > +   :identifiers: bpf_task_from_pid
> > +
> > +Here is an example of it being used:
> > +
> > +.. code-block:: c
> > +
> > +	SEC("tp_btf/task_newtask")
> > +	int BPF_PROG(task_get_pid_example, struct task_struct *task, u64 clone_flags)
> > +	{
> > +		struct task_struct *lookup;
> > +
> > +		lookup = bpf_task_from_pid(task->pid);
> > +		if (!lookup)
> > +			/* A task should always be found, as %task is a tracepoint arg. */
> > +			return -ENOENT;
> > +
> > +		if (lookup->pid != task->pid) {
> > +			/* The pid of the lookup task should be the same as the input task. */
> 
> I suspect both "errors" are actually possible in practice,
> since bpf_task_from_pid is using init_pid_ns.
> But this taskd might be in different pid_ns. See task_active_pid_ns.
> Probably worth mentioning this aspect of bpf_task_from_pid.

Yep, agreed. Will add

[...]

Thanks,
David

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next 1/2] bpf/docs: Document struct task_struct * kfuncs
  2022-12-05 16:10     ` David Vernet
@ 2022-12-05 20:58       ` David Vernet
  0 siblings, 0 replies; 6+ messages in thread
From: David Vernet @ 2022-12-05 20:58 UTC (permalink / raw)
  To: Alexei Starovoitov, F
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, linux-kernel, kernel-team

On Mon, Dec 05, 2022 at 10:10:14AM -0600, David Vernet wrote:
> > > +.. code-block:: c
> > > +
> > > +	SEC("tp_btf/task_newtask")
> > > +	int BPF_PROG(task_get_pid_example, struct task_struct *task, u64 clone_flags)
> > > +	{
> > > +		struct task_struct *lookup;
> > > +
> > > +		lookup = bpf_task_from_pid(task->pid);
> > > +		if (!lookup)
> > > +			/* A task should always be found, as %task is a tracepoint arg. */
> > > +			return -ENOENT;
> > > +
> > > +		if (lookup->pid != task->pid) {
> > > +			/* The pid of the lookup task should be the same as the input task. */
> > 
> > I suspect both "errors" are actually possible in practice,
> > since bpf_task_from_pid is using init_pid_ns.
> > But this taskd might be in different pid_ns. See task_active_pid_ns.
> > Probably worth mentioning this aspect of bpf_task_from_pid.
> 
> Yep, agreed. Will add

Actually, I don't think either error can ever happen. p->pid is globally
unique, and always uses the init_pid_ns. See [0] where p->pid is set,
and [1] for the implementation of pid_nr(). So I think the existing
example is actually correct, though I'll still add some comments to
explain that the kfunc only works for p->pid / the init_pid_ns.

[0]: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/kernel/fork.c#n2326
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/include/linux/pid.h#n181

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-12-05 20:58 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-02 22:07 [PATCH bpf-next 0/2] Document some recent core kfunc additions David Vernet
2022-12-02 22:07 ` [PATCH bpf-next 1/2] bpf/docs: Document struct task_struct * kfuncs David Vernet
2022-12-03  2:15   ` Alexei Starovoitov
2022-12-05 16:10     ` David Vernet
2022-12-05 20:58       ` David Vernet
2022-12-02 22:07 ` [PATCH bpf-next 2/2] bpf/docs: Document struct cgroup " David Vernet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).