From: Steven Rostedt <rostedt@goodmis.org>
To: Petr Mladek <pmladek@suse.com>
Cc: Anton Vorontsov <anton@enomsg.org>,
linux-doc@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>,
"James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>,
Guo Ren <guoren@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>,
live-patching@vger.kernel.org, Miroslav Benes <mbenes@suse.cz>,
Ingo Molnar <mingo@kernel.org>,
linux-s390@vger.kernel.org,
Joe Lawrence <joe.lawrence@redhat.com>,
Jonathan Corbet <corbet@lwn.net>,
Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
Helge Deller <deller@gmx.de>,
x86@kernel.org, linux-csky@vger.kernel.org,
Christian Borntraeger <borntraeger@de.ibm.com>,
Kees Cook <keescook@chromium.org>,
Vasily Gorbik <gor@linux.ibm.com>,
Heiko Carstens <hca@linux.ibm.com>,
Jiri Kosina <jikos@kernel.org>, Borislav Petkov <bp@alien8.de>,
Josh Poimboeuf <jpoimboe@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
Tony Luck <tony.luck@intel.com>,
linux-parisc@vger.kernel.org, linux-kernel@vger.kernel.org,
Masami Hiramatsu <mhiramat@kernel.org>,
Colin Cross <ccross@android.com>,
Paul Mackerras <paulus@samba.org>,
Andrew Morton <akpm@linux-foundation.org>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion
Date: Mon, 2 Nov 2020 12:09:07 -0500 [thread overview]
Message-ID: <20201102120907.457ad2f7@gandalf.local.home> (raw)
In-Reply-To: <20201102164147.GJ20201@alley>
On Mon, 2 Nov 2020 17:41:47 +0100
Petr Mladek <pmladek@suse.com> wrote:
> On Fri 2020-10-30 17:31:53, Steven Rostedt wrote:
> > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> >
> > This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file
> > "recursed_functions" all the functions that caused recursion while a
> > callback to the function tracer was running.
> >
>
> > --- /dev/null
> > +++ b/kernel/trace/trace_recursion_record.c
> > @@ -0,0 +1,220 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include <linux/seq_file.h>
> > +#include <linux/kallsyms.h>
> > +#include <linux/module.h>
> > +#include <linux/ftrace.h>
> > +#include <linux/fs.h>
> > +
> > +#include "trace_output.h"
> > +
> > +struct recursed_functions {
> > + unsigned long ip;
> > + unsigned long parent_ip;
> > +};
> > +
> > +static struct recursed_functions recursed_functions[CONFIG_FTRACE_RECORD_RECURSION_SIZE];
>
> The code tries to be lockless safe as much as possible. It would make
> sense to allign the array.
Hmm, is there an arch where the compiler would put an array of structures
with two unsigned long, misaligned?
>
>
> > +static atomic_t nr_records;
> > +
> > +/*
> > + * Cache the last found function. Yes, updates to this is racey, but
> > + * so is memory cache ;-)
> > + */
> > +static unsigned long cached_function;
> > +
> > +void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
> > +{
> > + int index;
> > + int i = 0;
> > + unsigned long old;
> > +
> > + again:
> > + /* First check the last one recorded */
> > + if (ip == cached_function)
> > + return;
> > +
> > + index = atomic_read(&nr_records);
> > + /* nr_records is -1 when clearing records */
> > + smp_mb__after_atomic();
> > + if (index < 0)
> > + return;
> > +
> > + /* See below */
> > + if (i > index)
> > + index = i;
>
> This looks like a complicated way to do index++ via "i" variable.
> I guess that it was needed only in some older variant of the code.
> See below.
Because we reread the index above, and index could be bigger than i (more
than index + 1).
>
> > + if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE)
> > + return;
> > +
> > + for (i = index - 1; i >= 0; i--) {
> > + if (recursed_functions[i].ip == ip) {
> > + cached_function = ip;
> > + return;
> > + }
> > + }
> > +
> > + cached_function = ip;
> > +
> > + /*
> > + * We only want to add a function if it hasn't been added before.
> > + * Add to the current location before incrementing the count.
> > + * If it fails to add, then increment the index (save in i)
> > + * and try again.
> > + */
> > + old = cmpxchg(&recursed_functions[index].ip, 0, ip);
> > + if (old != 0) {
> > + /* Did something else already added this for us? */
> > + if (old == ip)
> > + return;
> > + /* Try the next location (use i for the next index) */
> > + i = index + 1;
>
> What about
>
> index++;
>
> We basically want to run the code again with index + 1 limit.
But something else could update nr_records, and we want to use that if
nr_records is greater than i.
Now, we could swap the use case, and have
int index = 0;
[..]
i = atomic_read(&nr_records);
if (i > index)
index = i;
[..]
index++;
goto again;
>
> Maybe, it even does not make sense to check the array again
> and we should just try to store the value into the next slot.
We do this dance to prevent duplicates.
But you are correct, that this went through a few iterations. And the first
ones didn't have the cmpxchg on the ip itself, and that could make it so
that we don't need this index = i dance.
>
> > + goto again;
> > + }
> > +
> > + recursed_functions[index].parent_ip = parent_ip;
>
> WRITE_ONCE() ?
Does it really matter?
>
> > +
> > + /*
> > + * It's still possible that we could race with the clearing
> > + * CPU0 CPU1
> > + * ---- ----
> > + * ip = func
> > + * nr_records = -1;
> > + * recursed_functions[0] = 0;
> > + * i = -1
> > + * if (i < 0)
> > + * nr_records = 0;
> > + * (new recursion detected)
> > + * recursed_functions[0] = func
> > + * cmpxchg(recursed_functions[0],
> > + * func, 0)
> > + *
> > + * But the worse that could happen is that we get a zero in
> > + * the recursed_functions array, and it's likely that "func" will
> > + * be recorded again.
> > + */
> > + i = atomic_read(&nr_records);
> > + smp_mb__after_atomic();
> > + if (i < 0)
> > + cmpxchg(&recursed_functions[index].ip, ip, 0);
> > + else if (i <= index)
> > + atomic_cmpxchg(&nr_records, i, index + 1);
>
> This looks weird. It would shift nr_records past the record added
> in this call. It might skip many slots that were zeroed when clearing.
> Also we do not know if our entry was not zeroed as well.
>
> I would suggest to do it some other way (not even compile tested):
>
> void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
> {
> int index, old_index;
> int i = 0;
> unsigned long old_ip;
>
> again:
> /* First check the last one recorded. */
> if (ip == READ_ONCE(cached_function))
> return;
>
> index = atomic_read(&nr_records);
> /* nr_records is -1 when clearing records. */
> smp_mb__after_atomic();
> if (index < 0)
> return;
>
> /* Already cached? */
> for (i = index - 1; i >= 0; i--) {
> if (recursed_functions[i].ip == ip) {
> WRITE_ONCE(cached_function, ip);
> return;
> }
> }
>
> if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE)
> return;
>
> /*
> * Try to reserve the slot. It might be already taken
> * or the entire cache cleared.
> */
> old_index = atomic_cmpxchg(&nr_records, index, index + 1);
> if (old_index != index)
> goto again;
>
> /*
> * Be careful. The entire cache might have been cleared and reused in
> * the meantime. Replace only empty slot.
> */
> old_ip = cmpxchg(&recursed_functions[index].ip, 0, ip);
> if (old_ip != 0)
> goto again;
>
> old_ip = cmpxchg(&recursed_functions[index].parent_ip, 0, parrent_ip);
> if (old_ip != 0)
> goto again;
>
> /*
> * No ip is better than non-consistent one. The race with
> * clearing should be rare and not worth a perfect solution.
> */
> if (READ_ONCE(recursed_functions[index].ip) != ip) {
> cmpxchg(&recursed_functions[index].ip, ip, 0UL)
> goto again;
> }
> }
Let me go and rewrite it, this time considering the cmpxchg in the ip
update code. I may end up with what you have above ;-)
>
> The last check probably is not needed. Inconsistent entries
> should be prevented by the way how this func is called:
>
> static atomic_t paranoid_test; \
> if (!atomic_read(¶noid_test)) { \
> atomic_inc(¶noid_test); \
> ftrace_record_recursion(ip, pip); \
> atomic_dec(¶noid_test); \
> } \
>
>
>
>
> The rest of the patchset looks fine. I do not feel comfortable to give
> it Reviewed-by because I did not review it in depth.
>
> I spent more time with the above lockless code. I took it is a
> training. I need to improve this skill to feel more comfortable with
> the lockless printk ring buffer ;-)
Yeah, everything becomes exponentially complex when you make it lockless
with multiple concurrent writers.
-- Steve
next prev parent reply other threads:[~2020-11-02 17:11 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20201030213142.096102821@goodmis.org>
2020-10-30 21:31 ` [PATCH 05/11 v2] kprobes/ftrace: Add recursion protection to the ftrace callback Steven Rostedt
2020-11-03 11:22 ` Masami Hiramatsu
2020-11-04 18:46 ` [PATCH 05/11 v2.1] " Steven Rostedt
2020-10-30 21:31 ` [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion Steven Rostedt
2020-11-02 16:41 ` Petr Mladek
2020-11-02 17:09 ` Steven Rostedt [this message]
2020-11-02 17:19 ` Steven Rostedt
2020-11-03 10:40 ` Petr Mladek
2020-11-02 17:37 ` Steven Rostedt
2020-11-02 17:46 ` Steven Rostedt
2020-11-02 19:23 ` [PATCH 11/11 v2.2] " Steven Rostedt
2020-11-03 14:10 ` Petr Mladek
2020-11-03 16:14 ` Steven Rostedt
2020-11-04 19:13 ` Steven Rostedt
2020-11-02 19:14 ` [PATCH 11/11 v2.1] " Steven Rostedt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201102120907.457ad2f7@gandalf.local.home \
--to=rostedt@goodmis.org \
--cc=James.Bottomley@hansenpartnership.com \
--cc=akpm@linux-foundation.org \
--cc=anton@enomsg.org \
--cc=bigeasy@linutronix.de \
--cc=borntraeger@de.ibm.com \
--cc=bp@alien8.de \
--cc=ccross@android.com \
--cc=corbet@lwn.net \
--cc=deller@gmx.de \
--cc=gor@linux.ibm.com \
--cc=guoren@kernel.org \
--cc=hca@linux.ibm.com \
--cc=hpa@zytor.com \
--cc=jikos@kernel.org \
--cc=joe.lawrence@redhat.com \
--cc=jpoimboe@redhat.com \
--cc=kamalesh@linux.vnet.ibm.com \
--cc=keescook@chromium.org \
--cc=linux-csky@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-parisc@vger.kernel.org \
--cc=linux-s390@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=live-patching@vger.kernel.org \
--cc=mbenes@suse.cz \
--cc=mchehab+huawei@kernel.org \
--cc=mhiramat@kernel.org \
--cc=mingo@kernel.org \
--cc=paulus@samba.org \
--cc=peterz@infradead.org \
--cc=pmladek@suse.com \
--cc=tglx@linutronix.de \
--cc=tony.luck@intel.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).