linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	linux-trace-devel <linux-trace-devel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Stefan Metzmacher <metze@samba.org>,
	io-uring <io-uring@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH] tracepoints: Update static_call before tp_funcs when adding a tracepoint
Date: Mon, 26 Jul 2021 12:56:04 -0400	[thread overview]
Message-ID: <20210726125604.55bb6655@oasis.local.home> (raw)
In-Reply-To: <715282075.6481.1627314401745.JavaMail.zimbra@efficios.com>

On Mon, 26 Jul 2021 11:46:41 -0400 (EDT)
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> 
> [...]
> 
> Looking into the various transitions, I suspect the issue runs much deeper than
> this.
> 
> The sequence of transitions (number of probes) I'm considering is:
> 
> 0->1
> 1->2
> 2->1
> 1->0
> 0->1
> 1->2
> 
> I come to three conclusions:
> 
> Where we have:
> 
> tracepoint_remove_func()
> 
>                 tracepoint_update_call(tp, tp_funcs,
>                                        tp_funcs[0].func != old[0].func);
> 
> We should be comparing .data rather than .func, because the same callback
> can be registered twice with different data, and what we care about here
> is that the data of array element 0 is unchanged to skip rcu sync.

I guess we could do that, as you are right, we are worried about
passing the wrong data to the wrong function. If the function is the
same, at least it wont crash the kernel as the function can handle that
data. But, it could miss the callback that is staying while calling the
one that is going instead.

Unlikely to happen, but in theory it is enough to fix.

> 
> My second conclusion is that it's odd that transition 1->0 leaves the
> prior function call in place even after it's been removed. When we go
> back to 0->1, that function call may still be called even though the
> function is not there anymore. And there is no RCU synchronization on
> these transitions, so those are all possible scenarios.

How so? When doing this transition we have:

	tracepoint_update_call(tp, tp_funcs, false);
	rcu_assign_pointer(tp->funcs, tp_funcs);
	static_key_enable(&tp->key);

Where that tracepoint_update_call() will reinstall the iterator, and
that's a full memory barrier. It even sends IPIs to all other CPUs to
make sure all CPUs are synchronized before continuing.

By the time we get to static_key_enable(), there will not be any CPUs
that see the old function. And the process of updating a static_key
also does the same kind of synchronization.

> 
> My third conclusion is that we'd need synchronize RCU whenever tp_funcs[0].data
> changes for transitions 1->2, 2->1, and 1->2 because the priorities don't guarantee
> that the first callback stays in the first position, and we also need to rcu sync
> unconditionally on transition 1->0. We currently only have sync RCU on transition
> from 2->1 when tp_funcs[0].func changes, which is bogus in many ways.

Going from 1 to 2, there's no issue. We switch to the iterator, which
is the old method anyway. It looks directly at the array and matches
the data with the func for each element of that array, and the data
read initially (before calling the iterator) is ignored.

> 
> Basically, transitions from the iterator to a specific function should be handled
> with care (making sure the tp_funcs array is updated and rcu-sync is done), except
> in the specific case where the prior tp->funcs was NULL, which skips the function
> call. And unless there is a rcu-sync between the state transitions, we need to consider
> all prior states as additional original state as well. Therefore, in a 1->0->1
> transition sequence, it's very much possible that the old function ends up observing
> the new callback's data unless we add some rcu sync in between.

I disagree with the last part, as I explained above.

But I do agree that comparing data is probably the better check.

-- Steve

> 
> Thoughts ?
> 
> Thanks,
> 
> Mathieu
> 


  reply	other threads:[~2021-07-26 16:58 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-23  2:33 [PATCH] tracepoints: Update static_call before tp_funcs when adding a tracepoint Steven Rostedt
2021-07-26 15:46 ` Mathieu Desnoyers
2021-07-26 16:56   ` Steven Rostedt [this message]
2021-07-26 17:39     ` Mathieu Desnoyers
2021-07-26 18:49       ` Steven Rostedt
2021-07-26 19:12         ` Mathieu Desnoyers
2021-07-27 11:44         ` Peter Zijlstra
2021-07-27 13:46           ` Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210726125604.55bb6655@oasis.local.home \
    --to=rostedt@goodmis.org \
    --cc=akpm@linux-foundation.org \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-devel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=metze@samba.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).