All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] tracepoint static call fixes
@ 2021-08-05 13:27 Mathieu Desnoyers
  2021-08-05 13:27 ` [PATCH 1/3] Fix: tracepoint: static call: compare data on transition from 2->1 callees Mathieu Desnoyers
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Mathieu Desnoyers @ 2021-08-05 13:27 UTC (permalink / raw)
  To: Steven Rostedt, Ingo Molnar, Peter Zijlstra, Andrew Morton,
	Paul E. McKenney, Stefan Metzmacher, stable
  Cc: linux-kernel, Mathieu Desnoyers

This series fixes issues with tracepoint callback function vs data
argument mismatch when back-to-back registrations/unregistrations are
performed.

Those issues were introduced by the tracepoint optimizations using
static_call().

Thanks,

Mathieu

Mathieu Desnoyers (3):
  Fix: tracepoint: static call: compare data on transition from 2->1
    callees
  Fix: tracepoint: static call function vs data state mismatch (v2)
  Fix: tracepoint: rcu get state and cond sync for static call updates
    (v2)

 kernel/tracepoint.c | 159 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 139 insertions(+), 20 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] Fix: tracepoint: static call: compare data on transition from 2->1 callees
  2021-08-05 13:27 [PATCH 0/3] tracepoint static call fixes Mathieu Desnoyers
@ 2021-08-05 13:27 ` Mathieu Desnoyers
  2021-08-05 17:07   ` Steven Rostedt
  2021-08-05 13:27 ` [PATCH 2/3] Fix: tracepoint: static call function vs data state mismatch (v2) Mathieu Desnoyers
  2021-08-05 13:27 ` [PATCH 3/3] Fix: tracepoint: rcu get state and cond sync for static call updates (v2) Mathieu Desnoyers
  2 siblings, 1 reply; 12+ messages in thread
From: Mathieu Desnoyers @ 2021-08-05 13:27 UTC (permalink / raw)
  To: Steven Rostedt, Ingo Molnar, Peter Zijlstra, Andrew Morton,
	Paul E. McKenney, Stefan Metzmacher, stable
  Cc: linux-kernel, Mathieu Desnoyers

On transition from 2->1 callees, we should be comparing .data rather
than .func, because the same callback can be registered twice with
different data, and what we care about here is that the data of array
element 0 is unchanged to skip rcu sync.

Link: https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba.org/
Fixes: d25e37d89dd2 ("tracepoint: Optimize using static_call()")
Fixes: 547305a64632 ("tracepoint: Fix out of sync data passing by static caller")
Fixes: 352384d5c84e ("tracepoints: Update static_call before tp_funcs when adding a tracepoint")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Stefan Metzmacher <metze@samba.org>
Cc: <stable@vger.kernel.org> # 5.10+
---
 kernel/tracepoint.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index fc32821f8240..133b6454b287 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -338,7 +338,7 @@ static int tracepoint_remove_func(struct tracepoint *tp,
 	} else {
 		rcu_assign_pointer(tp->funcs, tp_funcs);
 		tracepoint_update_call(tp, tp_funcs,
-				       tp_funcs[0].func != old[0].func);
+				       tp_funcs[0].data != old[0].data);
 	}
 	release_probes(old);
 	return 0;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/3] Fix: tracepoint: static call function vs data state mismatch (v2)
  2021-08-05 13:27 [PATCH 0/3] tracepoint static call fixes Mathieu Desnoyers
  2021-08-05 13:27 ` [PATCH 1/3] Fix: tracepoint: static call: compare data on transition from 2->1 callees Mathieu Desnoyers
@ 2021-08-05 13:27 ` Mathieu Desnoyers
  2021-08-05 18:56   ` Steven Rostedt
  2021-08-05 13:27 ` [PATCH 3/3] Fix: tracepoint: rcu get state and cond sync for static call updates (v2) Mathieu Desnoyers
  2 siblings, 1 reply; 12+ messages in thread
From: Mathieu Desnoyers @ 2021-08-05 13:27 UTC (permalink / raw)
  To: Steven Rostedt, Ingo Molnar, Peter Zijlstra, Andrew Morton,
	Paul E. McKenney, Stefan Metzmacher, stable
  Cc: linux-kernel, Mathieu Desnoyers

On a 1->0->1 callbacks transition, there is an issue with the new
callback using the old callback's data.

Considering __DO_TRACE_CALL:

        do {                                                            \
                struct tracepoint_func *it_func_ptr;                    \
                void *__data;                                           \
                it_func_ptr =                                           \
                        rcu_dereference_raw((&__tracepoint_##name)->funcs); \
                if (it_func_ptr) {                                      \
                        __data = (it_func_ptr)->data;                   \

----> [ delayed here on one CPU (e.g. vcpu preempted by the host) ]

                        static_call(tp_func_##name)(__data, args);      \
                }                                                       \
        } while (0)

It has loaded the tp->funcs of the old callback, so it will try to use the old
data. This can be fixed by adding a RCU sync anywhere in the 1->0->1
transition chain.

On a N->2->1 transition, we need an rcu-sync because you may have a
sequence of 3->2->1 (or 1->2->1) where the element 0 data is unchanged
between 2->1, but was changed from 3->2 (or from 1->2), which may be
observed by the static call. This can be fixed by adding an
unconditional RCU sync in transition 2->1.

A follow up fix will introduce a more lightweight scheme based on RCU
get_state and cond_sync.

Link: https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba.org/
Fixes: d25e37d89dd2 ("tracepoint: Optimize using static_call()")
Fixes: 547305a64632 ("tracepoint: Fix out of sync data passing by static caller")
Fixes: 352384d5c84e ("tracepoints: Update static_call before tp_funcs when adding a tracepoint")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Stefan Metzmacher <metze@samba.org>
Cc: <stable@vger.kernel.org> # 5.10+
---
Changes since v1:
- Add missing TP_FUNC_2 fallthrough to TP_FUNC_N in
  tracepoint_remove_func. This handles the missing transition from 3->2
  probes.
---
 kernel/tracepoint.c | 102 +++++++++++++++++++++++++++++++++++---------
 1 file changed, 82 insertions(+), 20 deletions(-)

diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index 133b6454b287..8d772bd6894d 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -15,6 +15,13 @@
 #include <linux/sched/task.h>
 #include <linux/static_key.h>
 
+enum tp_func_state {
+	TP_FUNC_0,
+	TP_FUNC_1,
+	TP_FUNC_2,
+	TP_FUNC_N,
+};
+
 extern tracepoint_ptr_t __start___tracepoints_ptrs[];
 extern tracepoint_ptr_t __stop___tracepoints_ptrs[];
 
@@ -246,26 +253,29 @@ static void *func_remove(struct tracepoint_func **funcs,
 	return old;
 }
 
-static void tracepoint_update_call(struct tracepoint *tp, struct tracepoint_func *tp_funcs, bool sync)
+/*
+ * Count the number of functions (enum tp_func_state) in a tp_funcs array.
+ */
+static enum tp_func_state nr_func_state(const struct tracepoint_func *tp_funcs)
+{
+	if (!tp_funcs)
+		return TP_FUNC_0;
+	if (!tp_funcs[1].func)
+		return TP_FUNC_1;
+	if (!tp_funcs[2].func)
+		return TP_FUNC_2;
+	return TP_FUNC_N;	/* 3 or more */
+}
+
+static void tracepoint_update_call(struct tracepoint *tp, struct tracepoint_func *tp_funcs)
 {
 	void *func = tp->iterator;
 
 	/* Synthetic events do not have static call sites */
 	if (!tp->static_call_key)
 		return;
-
-	if (!tp_funcs[1].func) {
+	if (nr_func_state(tp_funcs) == TP_FUNC_1)
 		func = tp_funcs[0].func;
-		/*
-		 * If going from the iterator back to a single caller,
-		 * we need to synchronize with __DO_TRACE to make sure
-		 * that the data passed to the callback is the one that
-		 * belongs to that callback.
-		 */
-		if (sync)
-			tracepoint_synchronize_unregister();
-	}
-
 	__static_call_update(tp->static_call_key, tp->static_call_tramp, func);
 }
 
@@ -299,9 +309,31 @@ static int tracepoint_add_func(struct tracepoint *tp,
 	 * a pointer to it.  This array is referenced by __DO_TRACE from
 	 * include/linux/tracepoint.h using rcu_dereference_sched().
 	 */
-	tracepoint_update_call(tp, tp_funcs, false);
-	rcu_assign_pointer(tp->funcs, tp_funcs);
-	static_key_enable(&tp->key);
+	switch (nr_func_state(tp_funcs)) {
+	case TP_FUNC_1:		/* 0->1 */
+		/* Set static call to first function */
+		tracepoint_update_call(tp, tp_funcs);
+		/* Both iterator and static call handle NULL tp->funcs */
+		rcu_assign_pointer(tp->funcs, tp_funcs);
+		static_key_enable(&tp->key);
+		break;
+	case TP_FUNC_2:		/* 1->2 */
+		/* Set iterator static call */
+		tracepoint_update_call(tp, tp_funcs);
+		/*
+		 * Iterator callback installed before updating tp->funcs.
+		 * Requires ordering between RCU assign/dereference and
+		 * static call update/call.
+		 */
+		rcu_assign_pointer(tp->funcs, tp_funcs);
+		break;
+	case TP_FUNC_N:		/* N->N+1 (N>1) */
+		rcu_assign_pointer(tp->funcs, tp_funcs);
+		break;
+	default:
+		WARN_ON_ONCE(1);
+		break;
+	}
 
 	release_probes(old);
 	return 0;
@@ -328,17 +360,47 @@ static int tracepoint_remove_func(struct tracepoint *tp,
 		/* Failed allocating new tp_funcs, replaced func with stub */
 		return 0;
 
-	if (!tp_funcs) {
+	switch (nr_func_state(tp_funcs)) {
+	case TP_FUNC_0:		/* 1->0 */
 		/* Removed last function */
 		if (tp->unregfunc && static_key_enabled(&tp->key))
 			tp->unregfunc();
 
 		static_key_disable(&tp->key);
+		/* Set iterator static call */
+		tracepoint_update_call(tp, tp_funcs);
+		/* Both iterator and static call handle NULL tp->funcs */
+		rcu_assign_pointer(tp->funcs, NULL);
+		/*
+		 * Make sure new func never uses old data after a 1->0->1
+		 * transition sequence.
+		 * Considering that transition 0->1 is the common case
+		 * and don't have rcu-sync, issue rcu-sync after
+		 * transition 1->0 to break that sequence by waiting for
+		 * readers to be quiescent.
+		 */
+		tracepoint_synchronize_unregister();
+		break;
+	case TP_FUNC_1:		/* 2->1 */
 		rcu_assign_pointer(tp->funcs, tp_funcs);
-	} else {
+		/*
+		 * On 2->1 transition, RCU sync is needed before setting
+		 * static call to first callback, because the observer
+		 * may have loaded any prior tp->funcs after the last one
+		 * associated with an rcu-sync.
+		 */
+		tracepoint_synchronize_unregister();
+		/* Set static call to first function */
+		tracepoint_update_call(tp, tp_funcs);
+		break;
+	case TP_FUNC_2:		/* N->N-1 (N>2) */
+		fallthrough;
+	case TP_FUNC_N:
 		rcu_assign_pointer(tp->funcs, tp_funcs);
-		tracepoint_update_call(tp, tp_funcs,
-				       tp_funcs[0].data != old[0].data);
+		break;
+	default:
+		WARN_ON_ONCE(1);
+		break;
 	}
 	release_probes(old);
 	return 0;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] Fix: tracepoint: rcu get state and cond sync for static call updates (v2)
  2021-08-05 13:27 [PATCH 0/3] tracepoint static call fixes Mathieu Desnoyers
  2021-08-05 13:27 ` [PATCH 1/3] Fix: tracepoint: static call: compare data on transition from 2->1 callees Mathieu Desnoyers
  2021-08-05 13:27 ` [PATCH 2/3] Fix: tracepoint: static call function vs data state mismatch (v2) Mathieu Desnoyers
@ 2021-08-05 13:27 ` Mathieu Desnoyers
  2021-08-05 19:12   ` Steven Rostedt
  2 siblings, 1 reply; 12+ messages in thread
From: Mathieu Desnoyers @ 2021-08-05 13:27 UTC (permalink / raw)
  To: Steven Rostedt, Ingo Molnar, Peter Zijlstra, Andrew Morton,
	Paul E. McKenney, Stefan Metzmacher, stable
  Cc: linux-kernel, Mathieu Desnoyers

State transitions from 1->0->1 and N->2->1 callbacks require RCU
synchronization. Rather than performing the RCU synchronization every
time the state change occurs, which is quite slow when many tracepoints
are registered in batch, instead keep a snapshot of the RCU state on the
most recent transitions which belong to a chain, and conditionally wait
for a grace period on the last transition of the chain if one g.p. has
not elapsed since the last snapshot.

This applies to both RCU and SRCU.

Link: https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba.org/
Fixes: d25e37d89dd2 ("tracepoint: Optimize using static_call()")
Fixes: 547305a64632 ("tracepoint: Fix out of sync data passing by static caller")
Fixes: 352384d5c84e ("tracepoints: Update static_call before tp_funcs when adding a tracepoint")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Stefan Metzmacher <metze@samba.org>
Cc: <stable@vger.kernel.org> # 5.10+
---
Changes since v1:
- Use tp_rcu_get_state/tp_rcu_cond_sync on 2->1 transition when
  tp_funcs[0].data != old[0].data rather than
  tracepoint_synchronize_unregister.
---
 kernel/tracepoint.c | 81 ++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 69 insertions(+), 12 deletions(-)

diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index 8d772bd6894d..d8f69580001c 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -28,6 +28,44 @@ extern tracepoint_ptr_t __stop___tracepoints_ptrs[];
 DEFINE_SRCU(tracepoint_srcu);
 EXPORT_SYMBOL_GPL(tracepoint_srcu);
 
+enum tp_transition_sync {
+	TP_TRANSITION_SYNC_1_0_1,
+	TP_TRANSITION_SYNC_N_2_1,
+
+	_NR_TP_TRANSITION_SYNC,
+};
+
+struct tp_transition_snapshot {
+	unsigned long rcu;
+	unsigned long srcu;
+	bool ongoing;
+};
+
+/* Protected by tracepoints_mutex */
+static struct tp_transition_snapshot tp_transition_snapshot[_NR_TP_TRANSITION_SYNC];
+
+static void tp_rcu_get_state(enum tp_transition_sync sync)
+{
+	struct tp_transition_snapshot *snapshot = &tp_transition_snapshot[sync];
+
+	/* Keep the latest get_state snapshot. */
+	snapshot->rcu = get_state_synchronize_rcu();
+	snapshot->srcu = start_poll_synchronize_srcu(&tracepoint_srcu);
+	snapshot->ongoing = true;
+}
+
+static void tp_rcu_cond_sync(enum tp_transition_sync sync)
+{
+	struct tp_transition_snapshot *snapshot = &tp_transition_snapshot[sync];
+
+	if (!snapshot->ongoing)
+		return;
+	cond_synchronize_rcu(snapshot->rcu);
+	if (!poll_state_synchronize_srcu(&tracepoint_srcu, snapshot->srcu))
+		synchronize_srcu(&tracepoint_srcu);
+	snapshot->ongoing = false;
+}
+
 /* Set to 1 to enable tracepoint debug output */
 static const int tracepoint_debug;
 
@@ -311,6 +349,11 @@ static int tracepoint_add_func(struct tracepoint *tp,
 	 */
 	switch (nr_func_state(tp_funcs)) {
 	case TP_FUNC_1:		/* 0->1 */
+		/*
+		 * Make sure new static func never uses old data after a
+		 * 1->0->1 transition sequence.
+		 */
+		tp_rcu_cond_sync(TP_TRANSITION_SYNC_1_0_1);
 		/* Set static call to first function */
 		tracepoint_update_call(tp, tp_funcs);
 		/* Both iterator and static call handle NULL tp->funcs */
@@ -326,9 +369,21 @@ static int tracepoint_add_func(struct tracepoint *tp,
 		 * static call update/call.
 		 */
 		rcu_assign_pointer(tp->funcs, tp_funcs);
+		/*
+		 * Make sure static func never uses incorrect data after a
+		 * 1->...->2->1 transition sequence.
+		 */
+		if (tp_funcs[0].data != old[0].data)
+			tp_rcu_get_state(TP_TRANSITION_SYNC_N_2_1);
 		break;
 	case TP_FUNC_N:		/* N->N+1 (N>1) */
 		rcu_assign_pointer(tp->funcs, tp_funcs);
+		/*
+		 * Make sure static func never uses incorrect data after a
+		 * N->...->2->1 (N>1) transition sequence.
+		 */
+		if (tp_funcs[0].data != old[0].data)
+			tp_rcu_get_state(TP_TRANSITION_SYNC_N_2_1);
 		break;
 	default:
 		WARN_ON_ONCE(1);
@@ -372,24 +427,20 @@ static int tracepoint_remove_func(struct tracepoint *tp,
 		/* Both iterator and static call handle NULL tp->funcs */
 		rcu_assign_pointer(tp->funcs, NULL);
 		/*
-		 * Make sure new func never uses old data after a 1->0->1
-		 * transition sequence.
-		 * Considering that transition 0->1 is the common case
-		 * and don't have rcu-sync, issue rcu-sync after
-		 * transition 1->0 to break that sequence by waiting for
-		 * readers to be quiescent.
+		 * Make sure new static func never uses old data after a
+		 * 1->0->1 transition sequence.
 		 */
-		tracepoint_synchronize_unregister();
+		tp_rcu_get_state(TP_TRANSITION_SYNC_1_0_1);
 		break;
 	case TP_FUNC_1:		/* 2->1 */
 		rcu_assign_pointer(tp->funcs, tp_funcs);
 		/*
-		 * On 2->1 transition, RCU sync is needed before setting
-		 * static call to first callback, because the observer
-		 * may have loaded any prior tp->funcs after the last one
-		 * associated with an rcu-sync.
+		 * Make sure static func never uses incorrect data after a
+		 * N->...->2->1 (N>2) transition sequence.
 		 */
-		tracepoint_synchronize_unregister();
+		if (tp_funcs[0].data != old[0].data)
+			tp_rcu_get_state(TP_TRANSITION_SYNC_N_2_1);
+		tp_rcu_cond_sync(TP_TRANSITION_SYNC_N_2_1);
 		/* Set static call to first function */
 		tracepoint_update_call(tp, tp_funcs);
 		break;
@@ -397,6 +448,12 @@ static int tracepoint_remove_func(struct tracepoint *tp,
 		fallthrough;
 	case TP_FUNC_N:
 		rcu_assign_pointer(tp->funcs, tp_funcs);
+		/*
+		 * Make sure static func never uses incorrect data after a
+		 * N->...->2->1 (N>2) transition sequence.
+		 */
+		if (tp_funcs[0].data != old[0].data)
+			tp_rcu_get_state(TP_TRANSITION_SYNC_N_2_1);
 		break;
 	default:
 		WARN_ON_ONCE(1);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] Fix: tracepoint: static call: compare data on transition from 2->1 callees
  2021-08-05 13:27 ` [PATCH 1/3] Fix: tracepoint: static call: compare data on transition from 2->1 callees Mathieu Desnoyers
@ 2021-08-05 17:07   ` Steven Rostedt
  2021-08-05 17:57     ` Mathieu Desnoyers
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2021-08-05 17:07 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, Paul E. McKenney,
	Stefan Metzmacher, stable, linux-kernel

On Thu,  5 Aug 2021 09:27:15 -0400
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> On transition from 2->1 callees, we should be comparing .data rather
> than .func, because the same callback can be registered twice with
> different data, and what we care about here is that the data of array
> element 0 is unchanged to skip rcu sync.
> 
> Link: https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba.org/

FYI, You only need to show one Fixes.

> Fixes: d25e37d89dd2 ("tracepoint: Optimize using static_call()")

The above is fixed by the one below. Which means all the stable kernels
that have the above, will also have the below, and thus the above is
just redundant.

> Fixes: 547305a64632 ("tracepoint: Fix out of sync data passing by static caller")

The above is what the patch actually fixes.

> Fixes: 352384d5c84e ("tracepoints: Update static_call before tp_funcs when adding a tracepoint")

How does this patch fix the above? Perhaps the above did not go enough
to fix the issue, but it's unrelated.

I'll remove the first and last Fixes tag.

> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Stefan Metzmacher <metze@samba.org>
> Cc: <stable@vger.kernel.org> # 5.10+

The "# 5.10+" is now obsolete, and not needed. The Fixes tag is used to
determine where this gets backported to.

Other than that. This patch looks good.

-- Steve

> ---
>  kernel/tracepoint.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> index fc32821f8240..133b6454b287 100644
> --- a/kernel/tracepoint.c
> +++ b/kernel/tracepoint.c
> @@ -338,7 +338,7 @@ static int tracepoint_remove_func(struct tracepoint *tp,
>  	} else {
>  		rcu_assign_pointer(tp->funcs, tp_funcs);
>  		tracepoint_update_call(tp, tp_funcs,
> -				       tp_funcs[0].func != old[0].func);
> +				       tp_funcs[0].data != old[0].data);
>  	}
>  	release_probes(old);
>  	return 0;


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] Fix: tracepoint: static call: compare data on transition from 2->1 callees
  2021-08-05 17:07   ` Steven Rostedt
@ 2021-08-05 17:57     ` Mathieu Desnoyers
  0 siblings, 0 replies; 12+ messages in thread
From: Mathieu Desnoyers @ 2021-08-05 17:57 UTC (permalink / raw)
  To: rostedt
  Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, paulmck,
	Stefan Metzmacher, stable, linux-kernel

----- On Aug 5, 2021, at 1:07 PM, rostedt rostedt@goodmis.org wrote:

> On Thu,  5 Aug 2021 09:27:15 -0400
> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> 
>> On transition from 2->1 callees, we should be comparing .data rather
>> than .func, because the same callback can be registered twice with
>> different data, and what we care about here is that the data of array
>> element 0 is unchanged to skip rcu sync.
>> 
>> Link:
>> https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba.org/
> 
> FYI, You only need to show one Fixes.
> 
>> Fixes: d25e37d89dd2 ("tracepoint: Optimize using static_call()")
> 
> The above is fixed by the one below. Which means all the stable kernels
> that have the above, will also have the below, and thus the above is
> just redundant.
> 
>> Fixes: 547305a64632 ("tracepoint: Fix out of sync data passing by static
>> caller")
> 
> The above is what the patch actually fixes.
> 
>> Fixes: 352384d5c84e ("tracepoints: Update static_call before tp_funcs when
>> adding a tracepoint")
> 
> How does this patch fix the above? Perhaps the above did not go enough
> to fix the issue, but it's unrelated.

OK

> 
> I'll remove the first and last Fixes tag.

OK

> 
>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> Cc: Steven Rostedt <rostedt@goodmis.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: "Paul E. McKenney" <paulmck@kernel.org>
>> Cc: Stefan Metzmacher <metze@samba.org>
>> Cc: <stable@vger.kernel.org> # 5.10+
> 
> The "# 5.10+" is now obsolete, and not needed. The Fixes tag is used to
> determine where this gets backported to.
> 
> Other than that. This patch looks good.

Great, thanks!

Mathieu

> 
> -- Steve
> 
>> ---
>>  kernel/tracepoint.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
>> index fc32821f8240..133b6454b287 100644
>> --- a/kernel/tracepoint.c
>> +++ b/kernel/tracepoint.c
>> @@ -338,7 +338,7 @@ static int tracepoint_remove_func(struct tracepoint *tp,
>>  	} else {
>>  		rcu_assign_pointer(tp->funcs, tp_funcs);
>>  		tracepoint_update_call(tp, tp_funcs,
>> -				       tp_funcs[0].func != old[0].func);
>> +				       tp_funcs[0].data != old[0].data);
>>  	}
>>  	release_probes(old);
> >  	return 0;

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] Fix: tracepoint: static call function vs data state mismatch (v2)
  2021-08-05 13:27 ` [PATCH 2/3] Fix: tracepoint: static call function vs data state mismatch (v2) Mathieu Desnoyers
@ 2021-08-05 18:56   ` Steven Rostedt
  2021-08-05 19:15     ` Mathieu Desnoyers
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2021-08-05 18:56 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, Paul E. McKenney,
	Stefan Metzmacher, stable, linux-kernel


Note, there shouldn't be a "(v2)" outside the "[PATCH ]" part.
Otherwise it gets added into the git commit during "git am".

On Thu,  5 Aug 2021 09:27:16 -0400
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> On a 1->0->1 callbacks transition, there is an issue with the new
> callback using the old callback's data.
> 
> Considering __DO_TRACE_CALL:
> 
>         do {                                                            \
>                 struct tracepoint_func *it_func_ptr;                    \
>                 void *__data;                                           \
>                 it_func_ptr =                                           \
>                         rcu_dereference_raw((&__tracepoint_##name)->funcs); \
>                 if (it_func_ptr) {                                      \
>                         __data = (it_func_ptr)->data;                   \
> 
> ----> [ delayed here on one CPU (e.g. vcpu preempted by the host) ]  
> 
>                         static_call(tp_func_##name)(__data, args);      \
>                 }                                                       \
>         } while (0)
> 
> It has loaded the tp->funcs of the old callback, so it will try to use the old
> data. This can be fixed by adding a RCU sync anywhere in the 1->0->1
> transition chain.
> 
> On a N->2->1 transition, we need an rcu-sync because you may have a
> sequence of 3->2->1 (or 1->2->1) where the element 0 data is unchanged
> between 2->1, but was changed from 3->2 (or from 1->2), which may be
> observed by the static call. This can be fixed by adding an
> unconditional RCU sync in transition 2->1.
> 
> A follow up fix will introduce a more lightweight scheme based on RCU
> get_state and cond_sync.

I'll add here that this patch will cause a huge performance regression
on disabling the trace events, but the follow up patch will fix that.

Before this patch:

  # trace-cmd start -e all
  # time trace-cmd start -p nop

  real	0m0.778s
  user	0m0.000s
  sys	0m0.061s

After this patch:

  # trace-cmd start -e all
  # time trace-cmd start -p nop

  real	0m10.593s
  user	0m0.017s
  sys	0m0.259s


That's more than 10x slow down. Just under a second to disable all
events now goes to over 10 seconds!

But after the next patch:

  # trace-cmd start -e all
  # time trace-cmd start -p nop

  real	0m0.878s
  user	0m0.000s
  sys	0m0.103s

Which is in the noise from before this patch.

This is a big enough regression, I'll even add a Fixes tag to the next
patch on the final sha1 of this patch! Such that this patch won't be
backported without the next patch.

> 
> Link: https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba.org/
> Fixes: d25e37d89dd2 ("tracepoint: Optimize using static_call()")

For this patch, I would say the above is what this fixes.

-- Steve

> Fixes: 547305a64632 ("tracepoint: Fix out of sync data passing by static caller")
> Fixes: 352384d5c84e ("tracepoints: Update static_call before tp_funcs when adding a tracepoint")
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Stefan Metzmacher <metze@samba.org>
> Cc: <stable@vger.kernel.org> # 5.10+
> ---

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] Fix: tracepoint: rcu get state and cond sync for static call updates (v2)
  2021-08-05 13:27 ` [PATCH 3/3] Fix: tracepoint: rcu get state and cond sync for static call updates (v2) Mathieu Desnoyers
@ 2021-08-05 19:12   ` Steven Rostedt
  2021-08-05 19:29     ` [PATCH v3 1/1] Fix: tracepoint: rcu get state and cond sync for static call updates Mathieu Desnoyers
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2021-08-05 19:12 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, Paul E. McKenney,
	Stefan Metzmacher, stable, linux-kernel

On Thu,  5 Aug 2021 09:27:17 -0400
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> State transitions from 1->0->1 and N->2->1 callbacks require RCU
> synchronization. Rather than performing the RCU synchronization every
> time the state change occurs, which is quite slow when many tracepoints
> are registered in batch, instead keep a snapshot of the RCU state on the
> most recent transitions which belong to a chain, and conditionally wait
> for a grace period on the last transition of the chain if one g.p. has
> not elapsed since the last snapshot.
> 
> This applies to both RCU and SRCU.
> 
> Link: https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba.org/
> Fixes: d25e37d89dd2 ("tracepoint: Optimize using static_call()")
> Fixes: 547305a64632 ("tracepoint: Fix out of sync data passing by static caller")
> Fixes: 352384d5c84e ("tracepoints: Update static_call before tp_funcs when adding a tracepoint")
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Stefan Metzmacher <metze@samba.org>
> Cc: <stable@vger.kernel.org> # 5.10+
> ---
> Changes since v1:
> - Use tp_rcu_get_state/tp_rcu_cond_sync on 2->1 transition when
>   tp_funcs[0].data != old[0].data rather than
>   tracepoint_synchronize_unregister.
> ---
>  kernel/tracepoint.c | 81 ++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 69 insertions(+), 12 deletions(-)
> 
> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> index 8d772bd6894d..d8f69580001c 100644
> --- a/kernel/tracepoint.c
> +++ b/kernel/tracepoint.c
> @@ -28,6 +28,44 @@ extern tracepoint_ptr_t __stop___tracepoints_ptrs[];
>  DEFINE_SRCU(tracepoint_srcu);
>  EXPORT_SYMBOL_GPL(tracepoint_srcu);
>  
> +enum tp_transition_sync {
> +	TP_TRANSITION_SYNC_1_0_1,
> +	TP_TRANSITION_SYNC_N_2_1,
> +
> +	_NR_TP_TRANSITION_SYNC,
> +};
> +
> +struct tp_transition_snapshot {
> +	unsigned long rcu;
> +	unsigned long srcu;
> +	bool ongoing;
> +};
> +
> +/* Protected by tracepoints_mutex */
> +static struct tp_transition_snapshot tp_transition_snapshot[_NR_TP_TRANSITION_SYNC];
> +
> +static void tp_rcu_get_state(enum tp_transition_sync sync)
> +{
> +	struct tp_transition_snapshot *snapshot = &tp_transition_snapshot[sync];
> +
> +	/* Keep the latest get_state snapshot. */
> +	snapshot->rcu = get_state_synchronize_rcu();
> +	snapshot->srcu = start_poll_synchronize_srcu(&tracepoint_srcu);
> +	snapshot->ongoing = true;
> +}
> +
> +static void tp_rcu_cond_sync(enum tp_transition_sync sync)
> +{
> +	struct tp_transition_snapshot *snapshot = &tp_transition_snapshot[sync];
> +
> +	if (!snapshot->ongoing)
> +		return;
> +	cond_synchronize_rcu(snapshot->rcu);
> +	if (!poll_state_synchronize_srcu(&tracepoint_srcu, snapshot->srcu))
> +		synchronize_srcu(&tracepoint_srcu);
> +	snapshot->ongoing = false;
> +}
> +
>  /* Set to 1 to enable tracepoint debug output */
>  static const int tracepoint_debug;
>  
> @@ -311,6 +349,11 @@ static int tracepoint_add_func(struct tracepoint *tp,
>  	 */
>  	switch (nr_func_state(tp_funcs)) {
>  	case TP_FUNC_1:		/* 0->1 */
> +		/*
> +		 * Make sure new static func never uses old data after a
> +		 * 1->0->1 transition sequence.
> +		 */
> +		tp_rcu_cond_sync(TP_TRANSITION_SYNC_1_0_1);
>  		/* Set static call to first function */
>  		tracepoint_update_call(tp, tp_funcs);
>  		/* Both iterator and static call handle NULL tp->funcs */
> @@ -326,9 +369,21 @@ static int tracepoint_add_func(struct tracepoint *tp,
>  		 * static call update/call.
>  		 */
>  		rcu_assign_pointer(tp->funcs, tp_funcs);
> +		/*
> +		 * Make sure static func never uses incorrect data after a
> +		 * 1->...->2->1 transition sequence.
> +		 */
> +		if (tp_funcs[0].data != old[0].data)
> +			tp_rcu_get_state(TP_TRANSITION_SYNC_N_2_1);
>  		break;
>  	case TP_FUNC_N:		/* N->N+1 (N>1) */
>  		rcu_assign_pointer(tp->funcs, tp_funcs);
> +		/*
> +		 * Make sure static func never uses incorrect data after a
> +		 * N->...->2->1 (N>1) transition sequence.
> +		 */
> +		if (tp_funcs[0].data != old[0].data)
> +			tp_rcu_get_state(TP_TRANSITION_SYNC_N_2_1);
>  		break;

Looks to me that the above can be replaced with:

	case TP_FUNC_2:		/* 1->2 */
		/* Set iterator static call */
		tracepoint_update_call(tp, tp_funcs);
		/*
		 * Iterator callback installed before updating tp->funcs.
		 * Requires ordering between RCU assign/dereference and
		 * static call update/call.
		 */
		fallthrough;
	case TP_FUNC_N:		/* N->N+1 (N>1) */
		rcu_assign_pointer(tp->funcs, tp_funcs);
		/*
		 * Make sure static func never uses incorrect data after a
		 * N->...->2->1 (N>1) transition sequence.
		 */
		if (tp_funcs[0].data != old[0].data)
			tp_rcu_get_state(TP_TRANSITION_SYNC_N_2_1);
		break;

>  	default:
>  		WARN_ON_ONCE(1);
> @@ -372,24 +427,20 @@ static int tracepoint_remove_func(struct tracepoint *tp,
>  		/* Both iterator and static call handle NULL tp->funcs */
>  		rcu_assign_pointer(tp->funcs, NULL);
>  		/*
> -		 * Make sure new func never uses old data after a 1->0->1
> -		 * transition sequence.
> -		 * Considering that transition 0->1 is the common case
> -		 * and don't have rcu-sync, issue rcu-sync after
> -		 * transition 1->0 to break that sequence by waiting for
> -		 * readers to be quiescent.
> +		 * Make sure new static func never uses old data after a
> +		 * 1->0->1 transition sequence.
>  		 */
> -		tracepoint_synchronize_unregister();
> +		tp_rcu_get_state(TP_TRANSITION_SYNC_1_0_1);
>  		break;
>  	case TP_FUNC_1:		/* 2->1 */
>  		rcu_assign_pointer(tp->funcs, tp_funcs);
>  		/*
> -		 * On 2->1 transition, RCU sync is needed before setting
> -		 * static call to first callback, because the observer
> -		 * may have loaded any prior tp->funcs after the last one
> -		 * associated with an rcu-sync.
> +		 * Make sure static func never uses incorrect data after a
> +		 * N->...->2->1 (N>2) transition sequence.
>  		 */
> -		tracepoint_synchronize_unregister();

We should add a comment here with:

		/*
		 * If the first element's data has changed, then force the
		 * synchronization, to prevent current readers that have loaded
		 * the old data from calling the new function.
		 */

Can you send a v3 of just this patch? I'll pull in the other patches.

-- Steve

> +		if (tp_funcs[0].data != old[0].data)
> +			tp_rcu_get_state(TP_TRANSITION_SYNC_N_2_1);
> +		tp_rcu_cond_sync(TP_TRANSITION_SYNC_N_2_1);
>  		/* Set static call to first function */
>  		tracepoint_update_call(tp, tp_funcs);
>  		break;
> @@ -397,6 +448,12 @@ static int tracepoint_remove_func(struct tracepoint *tp,
>  		fallthrough;
>  	case TP_FUNC_N:
>  		rcu_assign_pointer(tp->funcs, tp_funcs);
> +		/*
> +		 * Make sure static func never uses incorrect data after a
> +		 * N->...->2->1 (N>2) transition sequence.
> +		 */
> +		if (tp_funcs[0].data != old[0].data)
> +			tp_rcu_get_state(TP_TRANSITION_SYNC_N_2_1);
>  		break;
>  	default:
>  		WARN_ON_ONCE(1);


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] Fix: tracepoint: static call function vs data state mismatch (v2)
  2021-08-05 18:56   ` Steven Rostedt
@ 2021-08-05 19:15     ` Mathieu Desnoyers
  2021-08-05 19:38       ` Steven Rostedt
  0 siblings, 1 reply; 12+ messages in thread
From: Mathieu Desnoyers @ 2021-08-05 19:15 UTC (permalink / raw)
  To: rostedt
  Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, paulmck,
	Stefan Metzmacher, stable, linux-kernel

----- On Aug 5, 2021, at 2:56 PM, rostedt rostedt@goodmis.org wrote:

> Note, there shouldn't be a "(v2)" outside the "[PATCH ]" part.
> Otherwise it gets added into the git commit during "git am".

Out of curiosity, do you know any way to annotate my local commits to have the
[PATCH v2] tag automatically generated by git send-email ?

> 
> On Thu,  5 Aug 2021 09:27:16 -0400
> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> 
>> On a 1->0->1 callbacks transition, there is an issue with the new
>> callback using the old callback's data.
>> 
>> Considering __DO_TRACE_CALL:
>> 
>>         do {                                                            \
>>                 struct tracepoint_func *it_func_ptr;                    \
>>                 void *__data;                                           \
>>                 it_func_ptr =                                           \
>>                         rcu_dereference_raw((&__tracepoint_##name)->funcs); \
>>                 if (it_func_ptr) {                                      \
>>                         __data = (it_func_ptr)->data;                   \
>> 
>> ----> [ delayed here on one CPU (e.g. vcpu preempted by the host) ]
>> 
>>                         static_call(tp_func_##name)(__data, args);      \
>>                 }                                                       \
>>         } while (0)
>> 
>> It has loaded the tp->funcs of the old callback, so it will try to use the old
>> data. This can be fixed by adding a RCU sync anywhere in the 1->0->1
>> transition chain.
>> 
>> On a N->2->1 transition, we need an rcu-sync because you may have a
>> sequence of 3->2->1 (or 1->2->1) where the element 0 data is unchanged
>> between 2->1, but was changed from 3->2 (or from 1->2), which may be
>> observed by the static call. This can be fixed by adding an
>> unconditional RCU sync in transition 2->1.
>> 
>> A follow up fix will introduce a more lightweight scheme based on RCU
>> get_state and cond_sync.
> 
> I'll add here that this patch will cause a huge performance regression
> on disabling the trace events, but the follow up patch will fix that.
> 
> Before this patch:
> 
>  # trace-cmd start -e all
>  # time trace-cmd start -p nop
> 
>  real	0m0.778s
>  user	0m0.000s
>  sys	0m0.061s
> 
> After this patch:
> 
>  # trace-cmd start -e all
>  # time trace-cmd start -p nop
> 
>  real	0m10.593s
>  user	0m0.017s
>  sys	0m0.259s
> 
> 
> That's more than 10x slow down. Just under a second to disable all
> events now goes to over 10 seconds!
> 
> But after the next patch:
> 
>  # trace-cmd start -e all
>  # time trace-cmd start -p nop
> 
>  real	0m0.878s
>  user	0m0.000s
>  sys	0m0.103s
> 
> Which is in the noise from before this patch.
> 
> This is a big enough regression, I'll even add a Fixes tag to the next
> patch on the final sha1 of this patch! Such that this patch won't be
> backported without the next patch.

This makes sense. I still wanted to keep the two patches separate so we would
introduce the (slow) state machine in the first patch, and optimize for
speed in the second. My intent is to facilitate of small logical changes,
and make bisection more precise in the future if we introduce an issue
here.

Calling out more clearly how slow things become with this patch is indeed
important.

> 
>> 
>> Link:
>> https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba.org/
>> Fixes: d25e37d89dd2 ("tracepoint: Optimize using static_call()")
> 
> For this patch, I would say the above is what this fixes.

Yes.

Thanks,

Mathieu

> 
> -- Steve
> 
>> Fixes: 547305a64632 ("tracepoint: Fix out of sync data passing by static
>> caller")
>> Fixes: 352384d5c84e ("tracepoints: Update static_call before tp_funcs when
>> adding a tracepoint")
>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> Cc: Steven Rostedt <rostedt@goodmis.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: "Paul E. McKenney" <paulmck@kernel.org>
>> Cc: Stefan Metzmacher <metze@samba.org>
>> Cc: <stable@vger.kernel.org> # 5.10+
> > ---

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v3 1/1] Fix: tracepoint: rcu get state and cond sync for static call updates
  2021-08-05 19:12   ` Steven Rostedt
@ 2021-08-05 19:29     ` Mathieu Desnoyers
  0 siblings, 0 replies; 12+ messages in thread
From: Mathieu Desnoyers @ 2021-08-05 19:29 UTC (permalink / raw)
  To: rostedt
  Cc: linux-kernel, Mathieu Desnoyers, Ingo Molnar, Peter Zijlstra,
	Andrew Morton, Paul E. McKenney, Stefan Metzmacher, stable

State transitions from 1->0->1 and N->2->1 callbacks require RCU
synchronization. Rather than performing the RCU synchronization every
time the state change occurs, which is quite slow when many tracepoints
are registered in batch, instead keep a snapshot of the RCU state on the
most recent transitions which belong to a chain, and conditionally wait
for a grace period on the last transition of the chain if one g.p. has
not elapsed since the last snapshot.

This applies to both RCU and SRCU.

Link: https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba.org/
Fixes: <!!!COMMIT ID TBD!!!> ("Fix: tracepoint: static call function vs data state mismatch")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Stefan Metzmacher <metze@samba.org>
Cc: <stable@vger.kernel.org> # 5.10+
---
Changes since v1:
- Use tp_rcu_get_state/tp_rcu_cond_sync on 2->1 transition when
  tp_funcs[0].data != old[0].data rather than
  tracepoint_synchronize_unregister.
Changes since v2:
- Combine duplicated code from add func case TP_FUNC_2 and TP_FUNC_N
  using a fallthrough.
- Add comment.
---
 kernel/tracepoint.c | 81 +++++++++++++++++++++++++++++++++++++--------
 1 file changed, 67 insertions(+), 14 deletions(-)

diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index 8d772bd6894d..efd14c79fab4 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -28,6 +28,44 @@ extern tracepoint_ptr_t __stop___tracepoints_ptrs[];
 DEFINE_SRCU(tracepoint_srcu);
 EXPORT_SYMBOL_GPL(tracepoint_srcu);
 
+enum tp_transition_sync {
+	TP_TRANSITION_SYNC_1_0_1,
+	TP_TRANSITION_SYNC_N_2_1,
+
+	_NR_TP_TRANSITION_SYNC,
+};
+
+struct tp_transition_snapshot {
+	unsigned long rcu;
+	unsigned long srcu;
+	bool ongoing;
+};
+
+/* Protected by tracepoints_mutex */
+static struct tp_transition_snapshot tp_transition_snapshot[_NR_TP_TRANSITION_SYNC];
+
+static void tp_rcu_get_state(enum tp_transition_sync sync)
+{
+	struct tp_transition_snapshot *snapshot = &tp_transition_snapshot[sync];
+
+	/* Keep the latest get_state snapshot. */
+	snapshot->rcu = get_state_synchronize_rcu();
+	snapshot->srcu = start_poll_synchronize_srcu(&tracepoint_srcu);
+	snapshot->ongoing = true;
+}
+
+static void tp_rcu_cond_sync(enum tp_transition_sync sync)
+{
+	struct tp_transition_snapshot *snapshot = &tp_transition_snapshot[sync];
+
+	if (!snapshot->ongoing)
+		return;
+	cond_synchronize_rcu(snapshot->rcu);
+	if (!poll_state_synchronize_srcu(&tracepoint_srcu, snapshot->srcu))
+		synchronize_srcu(&tracepoint_srcu);
+	snapshot->ongoing = false;
+}
+
 /* Set to 1 to enable tracepoint debug output */
 static const int tracepoint_debug;
 
@@ -311,6 +349,11 @@ static int tracepoint_add_func(struct tracepoint *tp,
 	 */
 	switch (nr_func_state(tp_funcs)) {
 	case TP_FUNC_1:		/* 0->1 */
+		/*
+		 * Make sure new static func never uses old data after a
+		 * 1->0->1 transition sequence.
+		 */
+		tp_rcu_cond_sync(TP_TRANSITION_SYNC_1_0_1);
 		/* Set static call to first function */
 		tracepoint_update_call(tp, tp_funcs);
 		/* Both iterator and static call handle NULL tp->funcs */
@@ -325,10 +368,15 @@ static int tracepoint_add_func(struct tracepoint *tp,
 		 * Requires ordering between RCU assign/dereference and
 		 * static call update/call.
 		 */
-		rcu_assign_pointer(tp->funcs, tp_funcs);
-		break;
+		fallthrough;
 	case TP_FUNC_N:		/* N->N+1 (N>1) */
 		rcu_assign_pointer(tp->funcs, tp_funcs);
+		/*
+		 * Make sure static func never uses incorrect data after a
+		 * N->...->2->1 (N>1) transition sequence.
+		 */
+		if (tp_funcs[0].data != old[0].data)
+			tp_rcu_get_state(TP_TRANSITION_SYNC_N_2_1);
 		break;
 	default:
 		WARN_ON_ONCE(1);
@@ -372,24 +420,23 @@ static int tracepoint_remove_func(struct tracepoint *tp,
 		/* Both iterator and static call handle NULL tp->funcs */
 		rcu_assign_pointer(tp->funcs, NULL);
 		/*
-		 * Make sure new func never uses old data after a 1->0->1
-		 * transition sequence.
-		 * Considering that transition 0->1 is the common case
-		 * and don't have rcu-sync, issue rcu-sync after
-		 * transition 1->0 to break that sequence by waiting for
-		 * readers to be quiescent.
+		 * Make sure new static func never uses old data after a
+		 * 1->0->1 transition sequence.
 		 */
-		tracepoint_synchronize_unregister();
+		tp_rcu_get_state(TP_TRANSITION_SYNC_1_0_1);
 		break;
 	case TP_FUNC_1:		/* 2->1 */
 		rcu_assign_pointer(tp->funcs, tp_funcs);
 		/*
-		 * On 2->1 transition, RCU sync is needed before setting
-		 * static call to first callback, because the observer
-		 * may have loaded any prior tp->funcs after the last one
-		 * associated with an rcu-sync.
+		 * Make sure static func never uses incorrect data after a
+		 * N->...->2->1 (N>2) transition sequence. If the first
+		 * element's data has changed, then force the synchronization
+		 * to prevent current readers that have loaded the old data
+		 * from calling the new function.
 		 */
-		tracepoint_synchronize_unregister();
+		if (tp_funcs[0].data != old[0].data)
+			tp_rcu_get_state(TP_TRANSITION_SYNC_N_2_1);
+		tp_rcu_cond_sync(TP_TRANSITION_SYNC_N_2_1);
 		/* Set static call to first function */
 		tracepoint_update_call(tp, tp_funcs);
 		break;
@@ -397,6 +444,12 @@ static int tracepoint_remove_func(struct tracepoint *tp,
 		fallthrough;
 	case TP_FUNC_N:
 		rcu_assign_pointer(tp->funcs, tp_funcs);
+		/*
+		 * Make sure static func never uses incorrect data after a
+		 * N->...->2->1 (N>2) transition sequence.
+		 */
+		if (tp_funcs[0].data != old[0].data)
+			tp_rcu_get_state(TP_TRANSITION_SYNC_N_2_1);
 		break;
 	default:
 		WARN_ON_ONCE(1);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] Fix: tracepoint: static call function vs data state mismatch (v2)
  2021-08-05 19:15     ` Mathieu Desnoyers
@ 2021-08-05 19:38       ` Steven Rostedt
  2021-08-05 19:42         ` Mathieu Desnoyers
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Rostedt @ 2021-08-05 19:38 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, paulmck,
	Stefan Metzmacher, stable, linux-kernel

On Thu, 5 Aug 2021 15:15:43 -0400 (EDT)
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> ----- On Aug 5, 2021, at 2:56 PM, rostedt rostedt@goodmis.org wrote:
> 
> > Note, there shouldn't be a "(v2)" outside the "[PATCH ]" part.
> > Otherwise it gets added into the git commit during "git am".  
> 
> Out of curiosity, do you know any way to annotate my local commits to have the
> [PATCH v2] tag automatically generated by git send-email ?

I pass -v2 to git send-email, and it adds the v2 for me.

> > This is a big enough regression, I'll even add a Fixes tag to the next
> > patch on the final sha1 of this patch! Such that this patch won't be
> > backported without the next patch.  
> 
> This makes sense. I still wanted to keep the two patches separate so we would
> introduce the (slow) state machine in the first patch, and optimize for
> speed in the second. My intent is to facilitate of small logical changes,
> and make bisection more precise in the future if we introduce an issue
> here.

I agree which is why I didn't ask you to fold them. The logic in this
code was a big enough change, where I agree it should be kept separate.
Unfortunately, it caused a huge performance regression :-(, but at the
same time, fixed a correctness issue, which Thomas always says that
correctness trumps performance.

But the compromise is to add a Fixes tag to the next patch and document
why they are separated, but still required to act as "one". I'll add
that commentary.

-- Steve

> 
> Calling out more clearly how slow things become with this patch is indeed
> important.
> 
> >   
> >> 
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] Fix: tracepoint: static call function vs data state mismatch (v2)
  2021-08-05 19:38       ` Steven Rostedt
@ 2021-08-05 19:42         ` Mathieu Desnoyers
  0 siblings, 0 replies; 12+ messages in thread
From: Mathieu Desnoyers @ 2021-08-05 19:42 UTC (permalink / raw)
  To: rostedt
  Cc: Ingo Molnar, Peter Zijlstra, Andrew Morton, paulmck,
	Stefan Metzmacher, stable, linux-kernel

----- On Aug 5, 2021, at 3:38 PM, rostedt rostedt@goodmis.org wrote:

> On Thu, 5 Aug 2021 15:15:43 -0400 (EDT)
> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> 
>> ----- On Aug 5, 2021, at 2:56 PM, rostedt rostedt@goodmis.org wrote:
>> 
>> > Note, there shouldn't be a "(v2)" outside the "[PATCH ]" part.
>> > Otherwise it gets added into the git commit during "git am".
>> 
>> Out of curiosity, do you know any way to annotate my local commits to have the
>> [PATCH v2] tag automatically generated by git send-email ?
> 
> I pass -v2 to git send-email, and it adds the v2 for me.

OK, so you version the entire patch series in one go. It makes sense.

> 
>> > This is a big enough regression, I'll even add a Fixes tag to the next
>> > patch on the final sha1 of this patch! Such that this patch won't be
>> > backported without the next patch.
>> 
>> This makes sense. I still wanted to keep the two patches separate so we would
>> introduce the (slow) state machine in the first patch, and optimize for
>> speed in the second. My intent is to facilitate of small logical changes,
>> and make bisection more precise in the future if we introduce an issue
>> here.
> 
> I agree which is why I didn't ask you to fold them. The logic in this
> code was a big enough change, where I agree it should be kept separate.
> Unfortunately, it caused a huge performance regression :-(, but at the
> same time, fixed a correctness issue, which Thomas always says that
> correctness trumps performance.
> 
> But the compromise is to add a Fixes tag to the next patch and document
> why they are separated, but still required to act as "one". I'll add
> that commentary.

Perfect, thanks!

Mathieu

> 
> -- Steve
> 
>> 
>> Calling out more clearly how slow things become with this patch is indeed
>> important.
>> 
>> >   
>> >> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-08-05 19:42 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-05 13:27 [PATCH 0/3] tracepoint static call fixes Mathieu Desnoyers
2021-08-05 13:27 ` [PATCH 1/3] Fix: tracepoint: static call: compare data on transition from 2->1 callees Mathieu Desnoyers
2021-08-05 17:07   ` Steven Rostedt
2021-08-05 17:57     ` Mathieu Desnoyers
2021-08-05 13:27 ` [PATCH 2/3] Fix: tracepoint: static call function vs data state mismatch (v2) Mathieu Desnoyers
2021-08-05 18:56   ` Steven Rostedt
2021-08-05 19:15     ` Mathieu Desnoyers
2021-08-05 19:38       ` Steven Rostedt
2021-08-05 19:42         ` Mathieu Desnoyers
2021-08-05 13:27 ` [PATCH 3/3] Fix: tracepoint: rcu get state and cond sync for static call updates (v2) Mathieu Desnoyers
2021-08-05 19:12   ` Steven Rostedt
2021-08-05 19:29     ` [PATCH v3 1/1] Fix: tracepoint: rcu get state and cond sync for static call updates Mathieu Desnoyers

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.