From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: rostedt <rostedt@goodmis.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
Ingo Molnar <mingo@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Josh Poimboeuf <jpoimboe@redhat.com>,
Ingo Molnar <mingo@redhat.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Dmitry Vyukov <dvyukov@google.com>,
Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
Yonghong Song <yhs@fb.com>, Andrii Nakryiko <andriin@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@chromium.org>, netdev <netdev@vger.kernel.org>,
bpf <bpf@vger.kernel.org>, Kees Cook <keescook@chromium.org>,
Florian Weimer <fw@deneb.enyo.de>,
syzbot+83aa762ef23b6f0d1991@syzkaller.appspotmail.com,
syzbot+d29e58bb557324e55e5e@syzkaller.appspotmail.com,
Matt Mullins <mmullins@mmlx.us>
Subject: Re: [for-next][PATCH 14/15] tracepoint: Do not fail unregistering a probe due to memory failure
Date: Wed, 3 Feb 2021 12:57:24 -0500 (EST) [thread overview]
Message-ID: <1836191179.6214.1612375044968.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <20210203160550.710877069@goodmis.org>
----- On Feb 3, 2021, at 11:05 AM, rostedt rostedt@goodmis.org wrote:
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
>
> The list of tracepoint callbacks is managed by an array that is protected
> by RCU. To update this array, a new array is allocated, the updates are
> copied over to the new array, and then the list of functions for the
> tracepoint is switched over to the new array. After a completion of an RCU
> grace period, the old array is freed.
>
> This process happens for both adding a callback as well as removing one.
> But on removing a callback, if the new array fails to be allocated, the
> callback is not removed, and may be used after it is freed by the clients
> of the tracepoint.
>
> There's really no reason to fail if the allocation for a new array fails
> when removing a function. Instead, the function can simply be replaced by a
> stub function that could be cleaned up on the next modification of the
> array. That is, instead of calling the function registered to the
> tracepoint, it would call a stub function in its place.
>
> Link: https://lore.kernel.org/r/20201115055256.65625-1-mmullins@mmlx.us
> Link: https://lore.kernel.org/r/20201116175107.02db396d@gandalf.local.home
> Link: https://lore.kernel.org/r/20201117211836.54acaef2@oasis.local.home
> Link: https://lkml.kernel.org/r/20201118093405.7a6d2290@gandalf.local.home
>
> [ Note, this version does use undefined compiler behavior (assuming that
> a stub function with no parameters or return, can be called by a location
> that thinks it has parameters but still no return value. Static calls
> do the same thing, so this trick is not without precedent.
>
> There's another solution that uses RCU tricks and is more complex, but
> can be an alternative if this solution becomes an issue.
>
> Link: https://lore.kernel.org/lkml/20210127170721.58bce7cc@gandalf.local.home/
> ]
>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Josh Poimboeuf <jpoimboe@redhat.com>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Martin KaFai Lau <kafai@fb.com>
> Cc: Song Liu <songliubraving@fb.com>
> Cc: Yonghong Song <yhs@fb.com>
> Cc: Andrii Nakryiko <andriin@fb.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: KP Singh <kpsingh@chromium.org>
> Cc: netdev <netdev@vger.kernel.org>
> Cc: bpf <bpf@vger.kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Florian Weimer <fw@deneb.enyo.de>
> Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints")
> Reported-by: syzbot+83aa762ef23b6f0d1991@syzkaller.appspotmail.com
> Reported-by: syzbot+d29e58bb557324e55e5e@syzkaller.appspotmail.com
> Reported-by: Matt Mullins <mmullins@mmlx.us>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> Tested-by: Matt Mullins <mmullins@mmlx.us>
> ---
> kernel/tracepoint.c | 80 ++++++++++++++++++++++++++++++++++++---------
> 1 file changed, 64 insertions(+), 16 deletions(-)
>
> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> index 7261fa0f5e3c..e8f20ae29c18 100644
> --- a/kernel/tracepoint.c
> +++ b/kernel/tracepoint.c
> @@ -53,6 +53,12 @@ struct tp_probes {
> struct tracepoint_func probes[];
> };
>
> +/* Called in removal of a func but failed to allocate a new tp_funcs */
> +static void tp_stub_func(void)
> +{
> + return;
> +}
> +
> static inline void *allocate_probes(int count)
> {
> struct tp_probes *p = kmalloc(struct_size(p, probes, count),
> @@ -131,6 +137,7 @@ func_add(struct tracepoint_func **funcs, struct
> tracepoint_func *tp_func,
> {
> struct tracepoint_func *old, *new;
> int nr_probes = 0;
> + int stub_funcs = 0;
> int pos = -1;
>
> if (WARN_ON(!tp_func->func))
> @@ -147,14 +154,34 @@ func_add(struct tracepoint_func **funcs, struct
> tracepoint_func *tp_func,
> if (old[nr_probes].func == tp_func->func &&
> old[nr_probes].data == tp_func->data)
> return ERR_PTR(-EEXIST);
> + if (old[nr_probes].func == tp_stub_func)
> + stub_funcs++;
> }
> }
> - /* + 2 : one for new probe, one for NULL func */
> - new = allocate_probes(nr_probes + 2);
> + /* + 2 : one for new probe, one for NULL func - stub functions */
> + new = allocate_probes(nr_probes + 2 - stub_funcs);
> if (new == NULL)
> return ERR_PTR(-ENOMEM);
> if (old) {
> - if (pos < 0) {
> + if (stub_funcs) {
Considering that we end up implementing a case where we carefully copy over
each item, I recommend we replace the two "memcpy" branches by a single item-wise
implementation. It's a slow-path anyway, and reducing the overall complexity
is a benefit for slow paths. Fewer bugs, less code to review, and it's easier to
reach a decent testing state-space coverage.
> + /* Need to copy one at a time to remove stubs */
> + int probes = 0;
> +
> + pos = -1;
> + for (nr_probes = 0; old[nr_probes].func; nr_probes++) {
> + if (old[nr_probes].func == tp_stub_func)
> + continue;
> + if (pos < 0 && old[nr_probes].prio < prio)
> + pos = probes++;
> + new[probes++] = old[nr_probes];
> + }
> + nr_probes = probes;
Repurposing "nr_probes" from accounting for the number of items in the old
array to counting the number of items in the new array in the middle of the
function is confusing.
> + if (pos < 0)
> + pos = probes;
> + else
> + nr_probes--; /* Account for insertion */
This is probably why you need to play tricks with nr_probes here.
> + } else if (pos < 0) {
> pos = nr_probes;
> memcpy(new, old, nr_probes * sizeof(struct tracepoint_func));
> } else {
> @@ -188,8 +215,9 @@ static void *func_remove(struct tracepoint_func **funcs,
> /* (N -> M), (N > 1, M >= 0) probes */
> if (tp_func->func) {
> for (nr_probes = 0; old[nr_probes].func; nr_probes++) {
> - if (old[nr_probes].func == tp_func->func &&
> - old[nr_probes].data == tp_func->data)
> + if ((old[nr_probes].func == tp_func->func &&
> + old[nr_probes].data == tp_func->data) ||
> + old[nr_probes].func == tp_stub_func)
> nr_del++;
> }
> }
> @@ -208,14 +236,32 @@ static void *func_remove(struct tracepoint_func **funcs,
> /* N -> M, (N > 1, M > 0) */
> /* + 1 for NULL */
> new = allocate_probes(nr_probes - nr_del + 1);
> - if (new == NULL)
> - return ERR_PTR(-ENOMEM);
> - for (i = 0; old[i].func; i++)
> - if (old[i].func != tp_func->func
> - || old[i].data != tp_func->data)
> - new[j++] = old[i];
> - new[nr_probes - nr_del].func = NULL;
> - *funcs = new;
> + if (new) {
> + for (i = 0; old[i].func; i++)
> + if ((old[i].func != tp_func->func
> + || old[i].data != tp_func->data)
> + && old[i].func != tp_stub_func)
> + new[j++] = old[i];
> + new[nr_probes - nr_del].func = NULL;
> + *funcs = new;
> + } else {
> + /*
> + * Failed to allocate, replace the old function
> + * with calls to tp_stub_func.
> + */
> + for (i = 0; old[i].func; i++)
> + if (old[i].func == tp_func->func &&
> + old[i].data == tp_func->data) {
> + old[i].func = tp_stub_func;
This updates "func" while readers are loading it concurrently. I would recommend
using WRITE_ONCE here paired with READ_ONCE within __traceiter_##_name.
> + /* Set the prio to the next event. */
I don't get why the priority needs to be changed here. Could it simply stay
at its original value ? It's already in the correct priority order anyway.
> + if (old[i + 1].func)
> + old[i].prio =
> + old[i + 1].prio;
> + else
> + old[i].prio = -1;
> + }
> + *funcs = old;
I'm not sure what setting *funcs to old achieves ? Isn't it already pointing
to old ?
I'll send a patch which applies on top of yours implementing my recommendations.
It shrinks the code complexity nicely:
include/linux/tracepoint.h | 2 +-
kernel/tracepoint.c | 80 +++++++++++++-------------------------
2 files changed, 28 insertions(+), 54 deletions(-)
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
next prev parent reply other threads:[~2021-02-03 17:58 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-02-03 16:05 [for-next][PATCH 00/15] tracing: Updates for 5.12 Steven Rostedt
2021-02-03 16:05 ` [for-next][PATCH 01/15] tracing: Add printf attribute to log function Steven Rostedt
2021-02-03 16:05 ` [for-next][PATCH 02/15] tracing: Update trace_ignore_this_task() kernel-doc comment Steven Rostedt
2021-02-03 16:05 ` [for-next][PATCH 03/15] tracing: Remove get/put_cpu() from function_trace_init Steven Rostedt
2021-02-03 16:05 ` [for-next][PATCH 04/15] ring-buffer: Remove cpu_buffer argument from the rb_inc_page() Steven Rostedt
2021-02-03 16:05 ` [for-next][PATCH 05/15] ring-buffer: Drop unneeded check in ring_buffer_resize() Steven Rostedt
2021-02-03 16:05 ` [for-next][PATCH 07/15] tracing: Inline tracing_gen_ctx_flags() Steven Rostedt
2021-02-03 16:05 ` [for-next][PATCH 08/15] tracing: Use in_serving_softirq() to deduct softirq status Steven Rostedt
2021-02-03 16:05 ` [for-next][PATCH 09/15] tracing: Remove NULL check from current in tracing_generic_entry_update() Steven Rostedt
2021-02-03 16:05 ` [for-next][PATCH 10/15] tracing: Fix spelling mistake in Kconfig "infinit" -> "infinite" Steven Rostedt
2021-02-03 16:05 ` [for-next][PATCH 11/15] tracing: Fix spelling of controlling in uprobes Steven Rostedt
2021-02-03 16:05 ` [for-next][PATCH 12/15] tracing: Fix a kernel doc warning Steven Rostedt
2021-02-03 16:05 ` [for-next][PATCH 13/15] tracing: Remove definition of DEBUG in trace_mmiotrace.c Steven Rostedt
2021-02-03 16:05 ` [for-next][PATCH 14/15] tracepoint: Do not fail unregistering a probe due to memory failure Steven Rostedt
2021-02-03 17:05 ` Peter Zijlstra
2021-02-03 17:09 ` Peter Zijlstra
2021-02-03 17:23 ` Steven Rostedt
2021-02-03 17:57 ` Mathieu Desnoyers [this message]
2021-02-04 16:50 ` Steven Rostedt
2021-02-03 17:57 ` [PATCH 1/1] tracepoint: cleanup: do not fail unregistration Mathieu Desnoyers
2021-02-04 17:47 ` Steven Rostedt
2021-02-03 16:05 ` [for-next][PATCH 15/15] kernel: trace: preemptirq_delay_test: add cpu affinity Steven Rostedt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1836191179.6214.1612375044968.JavaMail.zimbra@efficios.com \
--to=mathieu.desnoyers@efficios.com \
--cc=akpm@linux-foundation.org \
--cc=andriin@fb.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=dvyukov@google.com \
--cc=fw@deneb.enyo.de \
--cc=john.fastabend@gmail.com \
--cc=jpoimboe@redhat.com \
--cc=kafai@fb.com \
--cc=keescook@chromium.org \
--cc=kpsingh@chromium.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=mingo@redhat.com \
--cc=mmullins@mmlx.us \
--cc=netdev@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=songliubraving@fb.com \
--cc=syzbot+83aa762ef23b6f0d1991@syzkaller.appspotmail.com \
--cc=syzbot+d29e58bb557324e55e5e@syzkaller.appspotmail.com \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).