rcu.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining
@ 2022-10-14 11:39 Hou Tao
  2022-10-14 11:39 ` [PATCH bpf-next v2 1/4] rcu-tasks: Provide rcu_trace_implies_rcu_gp() Hou Tao
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Hou Tao @ 2022-10-14 11:39 UTC (permalink / raw)
  To: bpf
  Cc: Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
	Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
	John Fastabend, houtao1, Paul E . McKenney, Delyan Kratunov, rcu

From: Hou Tao <houtao1@huawei.com>

Hi,

Now bpf uses RCU grace period chaining to wait for the completion of
access from both sleepable and non-sleepable bpf program: calling
call_rcu_tasks_trace() firstly to wait for a RCU-tasks-trace grace
period, then in its callback calls call_rcu() or kfree_rcu() to wait for
a normal RCU grace period.

According to the implementation of RCU Tasks Trace, it inovkes
->postscan_func() to wait for one RCU-tasks-trace grace period and
rcu_tasks_trace_postscan() inovkes synchronize_rcu() to wait for one
normal RCU grace period in turn, so one RCU-tasks-trace grace period
will imply one normal RCU grace period. To codify the implication,
introduces rcu_trace_implies_rcu_gp() in patch #1. And using it in patch
#2~#4 to remove unnecessary call_rcu() or kfree_rcu() in bpf subsystem.
Other two uses of call_rcu_tasks_trace() are unchanged: for
__bpf_prog_put_rcu() there is no gp chain and for
__bpf_tramp_image_put_rcu_tasks() it chains RCU tasks trace GP and RCU
tasks GP.

An alternative way to remove these unnecessary RCU grace period
chainings is using the RCU polling API to check whether or not a normal
RCU grace period has passed (e.g. get_state_synchronize_rcu()). But it
needs an unsigned long space for each free element or each call, and
it is not affordable for local storage element, so as for now always
rcu_trace_implies_rcu_gp().

Comments are always welcome.

Change Log:

v2:
 * codify the implication of RCU Tasks Trace grace period instead of
   assuming for it

v1: https://lore.kernel.org/bpf/20221011071128.3470622-1-houtao@huaweicloud.com

Hou Tao (3):
  bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
  bpf: Use rcu_trace_implies_rcu_gp() in local storage map
  bpf: Use rcu_trace_implies_rcu_gp() for program array freeing

Paul E. McKenney (1):
  rcu-tasks: Provide rcu_trace_implies_rcu_gp()

 include/linux/rcupdate.h       | 12 ++++++++++++
 kernel/bpf/bpf_local_storage.c | 13 +++++++++++--
 kernel/bpf/core.c              |  8 +++++++-
 kernel/bpf/memalloc.c          | 15 ++++++++++-----
 kernel/rcu/tasks.h             |  2 ++
 5 files changed, 42 insertions(+), 8 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH bpf-next v2 1/4] rcu-tasks: Provide rcu_trace_implies_rcu_gp()
  2022-10-14 11:39 [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Hou Tao
@ 2022-10-14 11:39 ` Hou Tao
  2022-10-14 11:39 ` [PATCH bpf-next v2 2/4] bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator Hou Tao
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Hou Tao @ 2022-10-14 11:39 UTC (permalink / raw)
  To: bpf
  Cc: Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
	Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
	John Fastabend, houtao1, Paul E . McKenney, Delyan Kratunov, rcu

From: "Paul E. McKenney" <paulmck@kernel.org>

As an accident of implementation, an RCU Tasks Trace grace period also
acts as an RCU grace period.  However, this could change at any time.
This commit therefore creates an rcu_trace_implies_rcu_gp() that currently
returns true to codify this accident.  Code relying on this accident
must call this function to verify that this accident is still happening.

Reported-by: Hou Tao <houtao@huaweicloud.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
---
 include/linux/rcupdate.h | 12 ++++++++++++
 kernel/rcu/tasks.h       |  2 ++
 2 files changed, 14 insertions(+)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 08605ce7379d..8822f06e4b40 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -240,6 +240,18 @@ static inline void exit_tasks_rcu_start(void) { }
 static inline void exit_tasks_rcu_finish(void) { }
 #endif /* #else #ifdef CONFIG_TASKS_RCU_GENERIC */
 
+/**
+ * rcu_trace_implies_rcu_gp - does an RCU Tasks Trace grace period imply an RCU grace period?
+ *
+ * As an accident of implementation, an RCU Tasks Trace grace period also
+ * acts as an RCU grace period.  However, this could change at any time.
+ * Code relying on this accident must call this function to verify that
+ * this accident is still happening.
+ *
+ * You have been warned!
+ */
+static inline bool rcu_trace_implies_rcu_gp(void) { return true; }
+
 /**
  * cond_resched_tasks_rcu_qs - Report potential quiescent states to RCU
  *
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index f5bf6fb430da..9435e5a7b53e 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1535,6 +1535,8 @@ static void rcu_tasks_trace_postscan(struct list_head *hop)
 {
 	// Wait for late-stage exiting tasks to finish exiting.
 	// These might have passed the call to exit_tasks_rcu_finish().
+
+	// If you remove the following line, update rcu_trace_implies_rcu_gp()!!!
 	synchronize_rcu();
 	// Any tasks that exit after this point will set
 	// TRC_NEED_QS_CHECKED in ->trc_reader_special.b.need_qs.
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next v2 2/4] bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
  2022-10-14 11:39 [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Hou Tao
  2022-10-14 11:39 ` [PATCH bpf-next v2 1/4] rcu-tasks: Provide rcu_trace_implies_rcu_gp() Hou Tao
@ 2022-10-14 11:39 ` Hou Tao
  2022-10-14 11:39 ` [PATCH bpf-next v2 3/4] bpf: Use rcu_trace_implies_rcu_gp() in local storage map Hou Tao
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Hou Tao @ 2022-10-14 11:39 UTC (permalink / raw)
  To: bpf
  Cc: Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
	Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
	John Fastabend, houtao1, Paul E . McKenney, Delyan Kratunov, rcu

From: Hou Tao <houtao1@huawei.com>

The memory free logic in bpf memory allocator chains a RCU Tasks Trace
grace period and a normal RCU grace period one after the other, so it
can ensure that both sleepable and non-sleepable programs have finished.

With the introduction of rcu_trace_implies_rcu_gp(),
__free_rcu_tasks_trace() can check whether or not a normal RCU grace
period has also passed after a RCU Tasks Trace grace period has passed.
If it is true, freeing these elements directly, else freeing through
call_rcu().

Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 kernel/bpf/memalloc.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 5f83be1d2018..2433be58bb85 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -222,9 +222,13 @@ static void __free_rcu(struct rcu_head *head)
 
 static void __free_rcu_tasks_trace(struct rcu_head *head)
 {
-	struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu);
-
-	call_rcu(&c->rcu, __free_rcu);
+	/* If RCU Tasks Trace grace period implies RCU grace period,
+	 * there is no need to invoke call_rcu().
+	 */
+	if (rcu_trace_implies_rcu_gp())
+		__free_rcu(head);
+	else
+		call_rcu(head, __free_rcu);
 }
 
 static void enque_to_free(struct bpf_mem_cache *c, void *obj)
@@ -253,8 +257,9 @@ static void do_call_rcu(struct bpf_mem_cache *c)
 		 */
 		__llist_add(llnode, &c->waiting_for_gp);
 	/* Use call_rcu_tasks_trace() to wait for sleepable progs to finish.
-	 * Then use call_rcu() to wait for normal progs to finish
-	 * and finally do free_one() on each element.
+	 * If RCU Tasks Trace grace period implies RCU grace period, free
+	 * these elements directly, else use call_rcu() to wait for normal
+	 * progs to finish and finally do free_one() on each element.
 	 */
 	call_rcu_tasks_trace(&c->rcu, __free_rcu_tasks_trace);
 }
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next v2 3/4] bpf: Use rcu_trace_implies_rcu_gp() in local storage map
  2022-10-14 11:39 [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Hou Tao
  2022-10-14 11:39 ` [PATCH bpf-next v2 1/4] rcu-tasks: Provide rcu_trace_implies_rcu_gp() Hou Tao
  2022-10-14 11:39 ` [PATCH bpf-next v2 2/4] bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator Hou Tao
@ 2022-10-14 11:39 ` Hou Tao
  2022-10-14 11:39 ` [PATCH bpf-next v2 4/4] bpf: Use rcu_trace_implies_rcu_gp() for program array freeing Hou Tao
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Hou Tao @ 2022-10-14 11:39 UTC (permalink / raw)
  To: bpf
  Cc: Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
	Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
	John Fastabend, houtao1, Paul E . McKenney, Delyan Kratunov, rcu

From: Hou Tao <houtao1@huawei.com>

Local storage map is accessible for both sleepable and non-sleepable bpf
program, and its memory is freed by using both call_rcu_tasks_trace() and
kfree_rcu() to wait for both RCU-tasks-trace grace period and RCU grace
period to pass.

With the introduction of rcu_trace_implies_rcu_gp(), both
bpf_selem_free_rcu() and bpf_local_storage_free_rcu() can check whether
or not a normal RCU grace period has also passed after a RCU-tasks-trace
grace period has passed. If it is true, it is safe to call kfree()
directly.

Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 kernel/bpf/bpf_local_storage.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index 802fc15b0d73..9dc6de1cf185 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -88,8 +88,14 @@ void bpf_local_storage_free_rcu(struct rcu_head *rcu)
 {
 	struct bpf_local_storage *local_storage;
 
+	/* If RCU Tasks Trace grace period implies RCU grace period, do
+	 * kfree(), else do kfree_rcu().
+	 */
 	local_storage = container_of(rcu, struct bpf_local_storage, rcu);
-	kfree_rcu(local_storage, rcu);
+	if (rcu_trace_implies_rcu_gp())
+		kfree(local_storage);
+	else
+		kfree_rcu(local_storage, rcu);
 }
 
 static void bpf_selem_free_rcu(struct rcu_head *rcu)
@@ -97,7 +103,10 @@ static void bpf_selem_free_rcu(struct rcu_head *rcu)
 	struct bpf_local_storage_elem *selem;
 
 	selem = container_of(rcu, struct bpf_local_storage_elem, rcu);
-	kfree_rcu(selem, rcu);
+	if (rcu_trace_implies_rcu_gp())
+		kfree(selem);
+	else
+		kfree_rcu(selem, rcu);
 }
 
 /* local_storage->lock must be held and selem->local_storage == local_storage.
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next v2 4/4] bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
  2022-10-14 11:39 [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Hou Tao
                   ` (2 preceding siblings ...)
  2022-10-14 11:39 ` [PATCH bpf-next v2 3/4] bpf: Use rcu_trace_implies_rcu_gp() in local storage map Hou Tao
@ 2022-10-14 11:39 ` Hou Tao
  2022-10-17 13:39 ` [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Paul E. McKenney
  2022-10-18 17:40 ` patchwork-bot+netdevbpf
  5 siblings, 0 replies; 11+ messages in thread
From: Hou Tao @ 2022-10-14 11:39 UTC (permalink / raw)
  To: bpf
  Cc: Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
	Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
	John Fastabend, houtao1, Paul E . McKenney, Delyan Kratunov, rcu

From: Hou Tao <houtao1@huawei.com>

To support both sleepable and normal uprobe bpf program, the freeing of
trace program array chains a RCU-tasks-trace grace period and a normal
RCU grace period one after the other.

With the introduction of rcu_trace_implies_rcu_gp(),
__bpf_prog_array_free_sleepable_cb() can check whether or not a normal
RCU grace period has also passed after a RCU-tasks-trace grace period
has passed. If it is true, it is safe to invoke kfree() directly.

Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 kernel/bpf/core.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 711fd293b6de..4bc5f46d7030 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2251,8 +2251,14 @@ static void __bpf_prog_array_free_sleepable_cb(struct rcu_head *rcu)
 {
 	struct bpf_prog_array *progs;
 
+	/* If RCU Tasks Trace grace period implies RCU grace period, there is
+	 * no need to call kfree_rcu(), just call kfree() directly.
+	 */
 	progs = container_of(rcu, struct bpf_prog_array, rcu);
-	kfree_rcu(progs, rcu);
+	if (rcu_trace_implies_rcu_gp())
+		kfree(progs);
+	else
+		kfree_rcu(progs, rcu);
 }
 
 void bpf_prog_array_free_sleepable(struct bpf_prog_array *progs)
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining
  2022-10-14 11:39 [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Hou Tao
                   ` (3 preceding siblings ...)
  2022-10-14 11:39 ` [PATCH bpf-next v2 4/4] bpf: Use rcu_trace_implies_rcu_gp() for program array freeing Hou Tao
@ 2022-10-17 13:39 ` Paul E. McKenney
  2022-10-18  7:31   ` Hou Tao
  2022-10-18 17:40 ` patchwork-bot+netdevbpf
  5 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2022-10-17 13:39 UTC (permalink / raw)
  To: Hou Tao
  Cc: bpf, Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
	Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
	John Fastabend, houtao1, Delyan Kratunov, rcu

On Fri, Oct 14, 2022 at 07:39:42PM +0800, Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>
> 
> Hi,
> 
> Now bpf uses RCU grace period chaining to wait for the completion of
> access from both sleepable and non-sleepable bpf program: calling
> call_rcu_tasks_trace() firstly to wait for a RCU-tasks-trace grace
> period, then in its callback calls call_rcu() or kfree_rcu() to wait for
> a normal RCU grace period.
> 
> According to the implementation of RCU Tasks Trace, it inovkes
> ->postscan_func() to wait for one RCU-tasks-trace grace period and
> rcu_tasks_trace_postscan() inovkes synchronize_rcu() to wait for one
> normal RCU grace period in turn, so one RCU-tasks-trace grace period
> will imply one normal RCU grace period. To codify the implication,
> introduces rcu_trace_implies_rcu_gp() in patch #1. And using it in patch
> #2~#4 to remove unnecessary call_rcu() or kfree_rcu() in bpf subsystem.
> Other two uses of call_rcu_tasks_trace() are unchanged: for
> __bpf_prog_put_rcu() there is no gp chain and for
> __bpf_tramp_image_put_rcu_tasks() it chains RCU tasks trace GP and RCU
> tasks GP.
> 
> An alternative way to remove these unnecessary RCU grace period
> chainings is using the RCU polling API to check whether or not a normal
> RCU grace period has passed (e.g. get_state_synchronize_rcu()). But it
> needs an unsigned long space for each free element or each call, and
> it is not affordable for local storage element, so as for now always
> rcu_trace_implies_rcu_gp().
> 
> Comments are always welcome.

For #2-#4:

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

(#1 already has my Signed-off-by, in case anyone was wondering.)

> Change Log:
> 
> v2:
>  * codify the implication of RCU Tasks Trace grace period instead of
>    assuming for it
> 
> v1: https://lore.kernel.org/bpf/20221011071128.3470622-1-houtao@huaweicloud.com
> 
> Hou Tao (3):
>   bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
>   bpf: Use rcu_trace_implies_rcu_gp() in local storage map
>   bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
> 
> Paul E. McKenney (1):
>   rcu-tasks: Provide rcu_trace_implies_rcu_gp()
> 
>  include/linux/rcupdate.h       | 12 ++++++++++++
>  kernel/bpf/bpf_local_storage.c | 13 +++++++++++--
>  kernel/bpf/core.c              |  8 +++++++-
>  kernel/bpf/memalloc.c          | 15 ++++++++++-----
>  kernel/rcu/tasks.h             |  2 ++
>  5 files changed, 42 insertions(+), 8 deletions(-)
> 
> -- 
> 2.29.2
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining
  2022-10-17 13:39 ` [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Paul E. McKenney
@ 2022-10-18  7:31   ` Hou Tao
  2022-10-18 15:08     ` Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Hou Tao @ 2022-10-18  7:31 UTC (permalink / raw)
  To: paulmck
  Cc: bpf, Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
	Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
	John Fastabend, houtao1, Delyan Kratunov, rcu

Hi,

On 10/17/2022 9:39 PM, Paul E. McKenney wrote:
> On Fri, Oct 14, 2022 at 07:39:42PM +0800, Hou Tao wrote:
>> From: Hou Tao <houtao1@huawei.com>
>>
>> Hi,
>>
>> Now bpf uses RCU grace period chaining to wait for the completion of
>> access from both sleepable and non-sleepable bpf program: calling
>> call_rcu_tasks_trace() firstly to wait for a RCU-tasks-trace grace
>> period, then in its callback calls call_rcu() or kfree_rcu() to wait for
>> a normal RCU grace period.
>>
>> According to the implementation of RCU Tasks Trace, it inovkes
>> ->postscan_func() to wait for one RCU-tasks-trace grace period and
>> rcu_tasks_trace_postscan() inovkes synchronize_rcu() to wait for one
>> normal RCU grace period in turn, so one RCU-tasks-trace grace period
>> will imply one normal RCU grace period. To codify the implication,
>> introduces rcu_trace_implies_rcu_gp() in patch #1. And using it in patch
>> #2~#4 to remove unnecessary call_rcu() or kfree_rcu() in bpf subsystem.
>> Other two uses of call_rcu_tasks_trace() are unchanged: for
>> __bpf_prog_put_rcu() there is no gp chain and for
>> __bpf_tramp_image_put_rcu_tasks() it chains RCU tasks trace GP and RCU
>> tasks GP.
>>
>> An alternative way to remove these unnecessary RCU grace period
>> chainings is using the RCU polling API to check whether or not a normal
>> RCU grace period has passed (e.g. get_state_synchronize_rcu()). But it
>> needs an unsigned long space for each free element or each call, and
>> it is not affordable for local storage element, so as for now always
>> rcu_trace_implies_rcu_gp().
>>
>> Comments are always welcome.
> For #2-#4:
>
> Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
>
> (#1 already has my Signed-off-by, in case anyone was wondering.)
Thanks for the review. But it seems I missed another possible use case for
rcu_trace_implies_rcu_gp() in bpf memory allocator. The code snippet for
free_mem_alloc() is as following:

static void free_mem_alloc(struct bpf_mem_alloc *ma)
{
        /* waiting_for_gp lists was drained, but __free_rcu might
         * still execute. Wait for it now before we freeing percpu caches.
         */
        rcu_barrier_tasks_trace();
        rcu_barrier();
        free_mem_alloc_no_barrier(ma);
}

It uses rcu_barrier_tasks_trace() and rcu_barrier() to wait for the completion
of pending call_rcu_tasks_trace()s and call_rcu()s. I think it is also safe to
check rcu_trace_implies_rcu_gp() in free_mem_alloc() and if it is true, there is
no need to call rcu_barrier().

static void free_mem_alloc(struct bpf_mem_alloc *ma)
{
        /* waiting_for_gp lists was drained, but __free_rcu_tasks_trace()
         * or __free_rcu() might still execute. Wait for it now before we
         * freeing percpu caches.
         */
        rcu_barrier_tasks_trace();
        if (!rcu_trace_implies_rcu_gp())
                rcu_barrier();
        free_mem_alloc_no_barrier(ma);
}

Does the above change look good to you ? If it is, I will post v3 to include the
above change and add your Reviewed-by tag.
>
>> Change Log:
>>
>> v2:
>>  * codify the implication of RCU Tasks Trace grace period instead of
>>    assuming for it
>>
>> v1: https://lore.kernel.org/bpf/20221011071128.3470622-1-houtao@huaweicloud.com
>>
>> Hou Tao (3):
>>   bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
>>   bpf: Use rcu_trace_implies_rcu_gp() in local storage map
>>   bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
>>
>> Paul E. McKenney (1):
>>   rcu-tasks: Provide rcu_trace_implies_rcu_gp()
>>
>>  include/linux/rcupdate.h       | 12 ++++++++++++
>>  kernel/bpf/bpf_local_storage.c | 13 +++++++++++--
>>  kernel/bpf/core.c              |  8 +++++++-
>>  kernel/bpf/memalloc.c          | 15 ++++++++++-----
>>  kernel/rcu/tasks.h             |  2 ++
>>  5 files changed, 42 insertions(+), 8 deletions(-)
>>
>> -- 
>> 2.29.2
>>
> .


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining
  2022-10-18  7:31   ` Hou Tao
@ 2022-10-18 15:08     ` Paul E. McKenney
  2022-10-21  7:08       ` Hou Tao
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2022-10-18 15:08 UTC (permalink / raw)
  To: Hou Tao
  Cc: bpf, Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
	Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
	John Fastabend, houtao1, Delyan Kratunov, rcu

On Tue, Oct 18, 2022 at 03:31:20PM +0800, Hou Tao wrote:
> Hi,
> 
> On 10/17/2022 9:39 PM, Paul E. McKenney wrote:
> > On Fri, Oct 14, 2022 at 07:39:42PM +0800, Hou Tao wrote:
> >> From: Hou Tao <houtao1@huawei.com>
> >>
> >> Hi,
> >>
> >> Now bpf uses RCU grace period chaining to wait for the completion of
> >> access from both sleepable and non-sleepable bpf program: calling
> >> call_rcu_tasks_trace() firstly to wait for a RCU-tasks-trace grace
> >> period, then in its callback calls call_rcu() or kfree_rcu() to wait for
> >> a normal RCU grace period.
> >>
> >> According to the implementation of RCU Tasks Trace, it inovkes
> >> ->postscan_func() to wait for one RCU-tasks-trace grace period and
> >> rcu_tasks_trace_postscan() inovkes synchronize_rcu() to wait for one
> >> normal RCU grace period in turn, so one RCU-tasks-trace grace period
> >> will imply one normal RCU grace period. To codify the implication,
> >> introduces rcu_trace_implies_rcu_gp() in patch #1. And using it in patch
> >> #2~#4 to remove unnecessary call_rcu() or kfree_rcu() in bpf subsystem.
> >> Other two uses of call_rcu_tasks_trace() are unchanged: for
> >> __bpf_prog_put_rcu() there is no gp chain and for
> >> __bpf_tramp_image_put_rcu_tasks() it chains RCU tasks trace GP and RCU
> >> tasks GP.
> >>
> >> An alternative way to remove these unnecessary RCU grace period
> >> chainings is using the RCU polling API to check whether or not a normal
> >> RCU grace period has passed (e.g. get_state_synchronize_rcu()). But it
> >> needs an unsigned long space for each free element or each call, and
> >> it is not affordable for local storage element, so as for now always
> >> rcu_trace_implies_rcu_gp().
> >>
> >> Comments are always welcome.
> > For #2-#4:
> >
> > Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> >
> > (#1 already has my Signed-off-by, in case anyone was wondering.)
> Thanks for the review. But it seems I missed another possible use case for
> rcu_trace_implies_rcu_gp() in bpf memory allocator. The code snippet for
> free_mem_alloc() is as following:
> 
> static void free_mem_alloc(struct bpf_mem_alloc *ma)
> {
>         /* waiting_for_gp lists was drained, but __free_rcu might
>          * still execute. Wait for it now before we freeing percpu caches.
>          */
>         rcu_barrier_tasks_trace();
>         rcu_barrier();
>         free_mem_alloc_no_barrier(ma);
> }
> 
> It uses rcu_barrier_tasks_trace() and rcu_barrier() to wait for the completion
> of pending call_rcu_tasks_trace()s and call_rcu()s. I think it is also safe to
> check rcu_trace_implies_rcu_gp() in free_mem_alloc() and if it is true, there is
> no need to call rcu_barrier().
> 
> static void free_mem_alloc(struct bpf_mem_alloc *ma)
> {
>         /* waiting_for_gp lists was drained, but __free_rcu_tasks_trace()
>          * or __free_rcu() might still execute. Wait for it now before we
>          * freeing percpu caches.
>          */
>         rcu_barrier_tasks_trace();
>         if (!rcu_trace_implies_rcu_gp())
>                 rcu_barrier();
>         free_mem_alloc_no_barrier(ma);
> }
> 
> Does the above change look good to you ? If it is, I will post v3 to include the
> above change and add your Reviewed-by tag.

Unfortunately, although synchronize_rcu_tasks_trace() implies
that synchronize_rcu(), there is no relationship between the
callbacks.  Furthermore, rcu_barrier_tasks_trace() does not imply
synchronize_rcu_tasks_trace().

So the above change really would break things.  Please do not do it.

You could use workqueues or similar to make the rcu_barrier_tasks_trace()
and the rcu_barrier() wait concurrently, though.  This would of course
require some synchronization.

							Thanx, Paul

> >> Change Log:
> >>
> >> v2:
> >>  * codify the implication of RCU Tasks Trace grace period instead of
> >>    assuming for it
> >>
> >> v1: https://lore.kernel.org/bpf/20221011071128.3470622-1-houtao@huaweicloud.com
> >>
> >> Hou Tao (3):
> >>   bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
> >>   bpf: Use rcu_trace_implies_rcu_gp() in local storage map
> >>   bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
> >>
> >> Paul E. McKenney (1):
> >>   rcu-tasks: Provide rcu_trace_implies_rcu_gp()
> >>
> >>  include/linux/rcupdate.h       | 12 ++++++++++++
> >>  kernel/bpf/bpf_local_storage.c | 13 +++++++++++--
> >>  kernel/bpf/core.c              |  8 +++++++-
> >>  kernel/bpf/memalloc.c          | 15 ++++++++++-----
> >>  kernel/rcu/tasks.h             |  2 ++
> >>  5 files changed, 42 insertions(+), 8 deletions(-)
> >>
> >> -- 
> >> 2.29.2
> >>
> > .
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining
  2022-10-14 11:39 [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Hou Tao
                   ` (4 preceding siblings ...)
  2022-10-17 13:39 ` [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Paul E. McKenney
@ 2022-10-18 17:40 ` patchwork-bot+netdevbpf
  5 siblings, 0 replies; 11+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-10-18 17:40 UTC (permalink / raw)
  To: Hou Tao
  Cc: bpf, kafai, andrii, songliubraving, haoluo, yhs, ast, daniel,
	kpsingh, davem, kuba, sdf, jolsa, john.fastabend, houtao1,
	paulmck, delyank, rcu

Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Fri, 14 Oct 2022 19:39:42 +0800 you wrote:
> From: Hou Tao <houtao1@huawei.com>
> 
> Hi,
> 
> Now bpf uses RCU grace period chaining to wait for the completion of
> access from both sleepable and non-sleepable bpf program: calling
> call_rcu_tasks_trace() firstly to wait for a RCU-tasks-trace grace
> period, then in its callback calls call_rcu() or kfree_rcu() to wait for
> a normal RCU grace period.
> 
> [...]

Here is the summary with links:
  - [bpf-next,v2,1/4] rcu-tasks: Provide rcu_trace_implies_rcu_gp()
    https://git.kernel.org/bpf/bpf-next/c/e6c86c513f44
  - [bpf-next,v2,2/4] bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
    https://git.kernel.org/bpf/bpf-next/c/59be91e5e70a
  - [bpf-next,v2,3/4] bpf: Use rcu_trace_implies_rcu_gp() in local storage map
    https://git.kernel.org/bpf/bpf-next/c/d39d1445d377
  - [bpf-next,v2,4/4] bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
    https://git.kernel.org/bpf/bpf-next/c/4835f9ee980c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining
  2022-10-18 15:08     ` Paul E. McKenney
@ 2022-10-21  7:08       ` Hou Tao
  2022-10-21 18:50         ` Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Hou Tao @ 2022-10-21  7:08 UTC (permalink / raw)
  To: paulmck
  Cc: bpf, Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
	Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
	John Fastabend, houtao1, Delyan Kratunov, rcu

Hi,

On 10/18/2022 11:08 PM, Paul E. McKenney wrote:
> On Tue, Oct 18, 2022 at 03:31:20PM +0800, Hou Tao wrote:
>> Hi,
>>
>> On 10/17/2022 9:39 PM, Paul E. McKenney wrote:
>>> On Fri, Oct 14, 2022 at 07:39:42PM +0800, Hou Tao wrote:
SNIP
>>>
>> Thanks for the review. But it seems I missed another possible use case for
>> rcu_trace_implies_rcu_gp() in bpf memory allocator. The code snippet for
>> free_mem_alloc() is as following:
>>
>> static void free_mem_alloc(struct bpf_mem_alloc *ma)
>> {
>>         /* waiting_for_gp lists was drained, but __free_rcu might
>>          * still execute. Wait for it now before we freeing percpu caches.
>>          */
>>         rcu_barrier_tasks_trace();
>>         rcu_barrier();
>>         free_mem_alloc_no_barrier(ma);
>> }
>>
>> It uses rcu_barrier_tasks_trace() and rcu_barrier() to wait for the completion
>> of pending call_rcu_tasks_trace()s and call_rcu()s. I think it is also safe to
>> check rcu_trace_implies_rcu_gp() in free_mem_alloc() and if it is true, there is
>> no need to call rcu_barrier().
>>
>> static void free_mem_alloc(struct bpf_mem_alloc *ma)
>> {
>>         /* waiting_for_gp lists was drained, but __free_rcu_tasks_trace()
>>          * or __free_rcu() might still execute. Wait for it now before we
>>          * freeing percpu caches.
>>          */
>>         rcu_barrier_tasks_trace();
>>         if (!rcu_trace_implies_rcu_gp())
>>                 rcu_barrier();
>>         free_mem_alloc_no_barrier(ma);
>> }
>>
>> Does the above change look good to you ? If it is, I will post v3 to include the
>> above change and add your Reviewed-by tag.
> Unfortunately, although synchronize_rcu_tasks_trace() implies
> that synchronize_rcu(), there is no relationship between the
> callbacks.  Furthermore, rcu_barrier_tasks_trace() does not imply
> synchronize_rcu_tasks_trace().
Yes. I see. And according to the code, if there is not pending cb,
rcu_barrier_tasks_trace() will returned immediately. It is also possible
rcu_tasks_trace kthread is in the middle of grace period waiting when invoking
rcu_barrier_task_trace(), so rcu_barrier_task_trace() does not imply
synchronize_rcu_tasks_trace().
>
> So the above change really would break things.  Please do not do it.
However I am a little confused about the conclusion. If only considering the
invocations of call_rcu() and call_rcu_tasks_trace() in kernel/bpf/memalloc.c, I
think it is safe to do so, right ? Because if  rcu_trace_implies_rcu_gp() is
true, there will be no invocation of call_rcu() and rcu_barrier_tasks_trace()
will wait for the completion of pending call_rcu_tasks_trace(). If
rcu_trace_implies_rcu_gp(), rcu_barrier_tasks_trace() and rcu_barrier() will do
the job. If considering the invocations of call_rcu() in other places, I think
it is definitely unsafe to do that, right ?
>
> You could use workqueues or similar to make the rcu_barrier_tasks_trace()
> and the rcu_barrier() wait concurrently, though.  This would of course
> require some synchronization.
Thanks for the suggestion. Will check it later.
>
> 							Thanx, Paul
>
>>>> Change Log:
>>>>
>>>> v2:
>>>>  * codify the implication of RCU Tasks Trace grace period instead of
>>>>    assuming for it
>>>>
>>>> v1: https://lore.kernel.org/bpf/20221011071128.3470622-1-houtao@huaweicloud.com
>>>>
>>>> Hou Tao (3):
>>>>   bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
>>>>   bpf: Use rcu_trace_implies_rcu_gp() in local storage map
>>>>   bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
>>>>
>>>> Paul E. McKenney (1):
>>>>   rcu-tasks: Provide rcu_trace_implies_rcu_gp()
>>>>
>>>>  include/linux/rcupdate.h       | 12 ++++++++++++
>>>>  kernel/bpf/bpf_local_storage.c | 13 +++++++++++--
>>>>  kernel/bpf/core.c              |  8 +++++++-
>>>>  kernel/bpf/memalloc.c          | 15 ++++++++++-----
>>>>  kernel/rcu/tasks.h             |  2 ++
>>>>  5 files changed, 42 insertions(+), 8 deletions(-)
>>>>
>>>> -- 
>>>> 2.29.2
>>>>
>>> .


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining
  2022-10-21  7:08       ` Hou Tao
@ 2022-10-21 18:50         ` Paul E. McKenney
  0 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2022-10-21 18:50 UTC (permalink / raw)
  To: Hou Tao
  Cc: bpf, Martin KaFai Lau, Andrii Nakryiko, Song Liu, Hao Luo,
	Yonghong Song, Alexei Starovoitov, Daniel Borkmann, KP Singh,
	David S . Miller, Jakub Kicinski, Stanislav Fomichev, Jiri Olsa,
	John Fastabend, houtao1, Delyan Kratunov, rcu

On Fri, Oct 21, 2022 at 03:08:21PM +0800, Hou Tao wrote:
> Hi,
> 
> On 10/18/2022 11:08 PM, Paul E. McKenney wrote:
> > On Tue, Oct 18, 2022 at 03:31:20PM +0800, Hou Tao wrote:
> >> Hi,
> >>
> >> On 10/17/2022 9:39 PM, Paul E. McKenney wrote:
> >>> On Fri, Oct 14, 2022 at 07:39:42PM +0800, Hou Tao wrote:
> SNIP
> >>>
> >> Thanks for the review. But it seems I missed another possible use case for
> >> rcu_trace_implies_rcu_gp() in bpf memory allocator. The code snippet for
> >> free_mem_alloc() is as following:
> >>
> >> static void free_mem_alloc(struct bpf_mem_alloc *ma)
> >> {
> >>         /* waiting_for_gp lists was drained, but __free_rcu might
> >>          * still execute. Wait for it now before we freeing percpu caches.
> >>          */
> >>         rcu_barrier_tasks_trace();
> >>         rcu_barrier();
> >>         free_mem_alloc_no_barrier(ma);
> >> }
> >>
> >> It uses rcu_barrier_tasks_trace() and rcu_barrier() to wait for the completion
> >> of pending call_rcu_tasks_trace()s and call_rcu()s. I think it is also safe to
> >> check rcu_trace_implies_rcu_gp() in free_mem_alloc() and if it is true, there is
> >> no need to call rcu_barrier().
> >>
> >> static void free_mem_alloc(struct bpf_mem_alloc *ma)
> >> {
> >>         /* waiting_for_gp lists was drained, but __free_rcu_tasks_trace()
> >>          * or __free_rcu() might still execute. Wait for it now before we
> >>          * freeing percpu caches.
> >>          */
> >>         rcu_barrier_tasks_trace();
> >>         if (!rcu_trace_implies_rcu_gp())
> >>                 rcu_barrier();
> >>         free_mem_alloc_no_barrier(ma);
> >> }
> >>
> >> Does the above change look good to you ? If it is, I will post v3 to include the
> >> above change and add your Reviewed-by tag.
> > Unfortunately, although synchronize_rcu_tasks_trace() implies
> > that synchronize_rcu(), there is no relationship between the
> > callbacks.  Furthermore, rcu_barrier_tasks_trace() does not imply
> > synchronize_rcu_tasks_trace().
> 
> Yes. I see. And according to the code, if there is not pending cb,
> rcu_barrier_tasks_trace() will returned immediately. It is also possible
> rcu_tasks_trace kthread is in the middle of grace period waiting when invoking
> rcu_barrier_task_trace(), so rcu_barrier_task_trace() does not imply
> synchronize_rcu_tasks_trace().

Very good!

> > So the above change really would break things.  Please do not do it.
> 
> However I am a little confused about the conclusion. If only considering the
> invocations of call_rcu() and call_rcu_tasks_trace() in kernel/bpf/memalloc.c, I
> think it is safe to do so, right ? Because if  rcu_trace_implies_rcu_gp() is
> true, there will be no invocation of call_rcu() and rcu_barrier_tasks_trace()
> will wait for the completion of pending call_rcu_tasks_trace(). If
> rcu_trace_implies_rcu_gp(), rcu_barrier_tasks_trace() and rcu_barrier() will do
> the job. If considering the invocations of call_rcu() in other places, I think
> it is definitely unsafe to do that, right ?

Agreed, I am being cautious and pessimistic in assuming that there are
other call_rcu() invocations.  On the other hand, my caution and pessimism
is based on their having been other call_rcu() invocations in the past.
So please verify that there are none before making this sort of change.

							Thanx, Paul

> > You could use workqueues or similar to make the rcu_barrier_tasks_trace()
> > and the rcu_barrier() wait concurrently, though.  This would of course
> > require some synchronization.
> Thanks for the suggestion. Will check it later.
> >
> > 							Thanx, Paul
> >
> >>>> Change Log:
> >>>>
> >>>> v2:
> >>>>  * codify the implication of RCU Tasks Trace grace period instead of
> >>>>    assuming for it
> >>>>
> >>>> v1: https://lore.kernel.org/bpf/20221011071128.3470622-1-houtao@huaweicloud.com
> >>>>
> >>>> Hou Tao (3):
> >>>>   bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
> >>>>   bpf: Use rcu_trace_implies_rcu_gp() in local storage map
> >>>>   bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
> >>>>
> >>>> Paul E. McKenney (1):
> >>>>   rcu-tasks: Provide rcu_trace_implies_rcu_gp()
> >>>>
> >>>>  include/linux/rcupdate.h       | 12 ++++++++++++
> >>>>  kernel/bpf/bpf_local_storage.c | 13 +++++++++++--
> >>>>  kernel/bpf/core.c              |  8 +++++++-
> >>>>  kernel/bpf/memalloc.c          | 15 ++++++++++-----
> >>>>  kernel/rcu/tasks.h             |  2 ++
> >>>>  5 files changed, 42 insertions(+), 8 deletions(-)
> >>>>
> >>>> -- 
> >>>> 2.29.2
> >>>>
> >>> .
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-10-21 18:50 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-14 11:39 [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Hou Tao
2022-10-14 11:39 ` [PATCH bpf-next v2 1/4] rcu-tasks: Provide rcu_trace_implies_rcu_gp() Hou Tao
2022-10-14 11:39 ` [PATCH bpf-next v2 2/4] bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator Hou Tao
2022-10-14 11:39 ` [PATCH bpf-next v2 3/4] bpf: Use rcu_trace_implies_rcu_gp() in local storage map Hou Tao
2022-10-14 11:39 ` [PATCH bpf-next v2 4/4] bpf: Use rcu_trace_implies_rcu_gp() for program array freeing Hou Tao
2022-10-17 13:39 ` [PATCH bpf-next v2 0/4] Remove unnecessary RCU grace period chaining Paul E. McKenney
2022-10-18  7:31   ` Hou Tao
2022-10-18 15:08     ` Paul E. McKenney
2022-10-21  7:08       ` Hou Tao
2022-10-21 18:50         ` Paul E. McKenney
2022-10-18 17:40 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).