linuxppc-dev.lists.ozlabs.org archive mirror
From: Steven Rostedt <rostedt@goodmis.org>
To: Petr Mladek <pmladek@suse.com>
Cc: Anton Vorontsov <anton@enomsg.org>,
	linux-doc@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>,
	"James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>,
	Guo Ren <guoren@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>,
	live-patching@vger.kernel.org, Miroslav Benes <mbenes@suse.cz>,
	Ingo Molnar <mingo@kernel.org>,
	linux-s390@vger.kernel.org,
	Joe Lawrence <joe.lawrence@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
	Helge Deller <deller@gmx.de>,
	x86@kernel.org, linux-csky@vger.kernel.org,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Kees Cook <keescook@chromium.org>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Jiri Kosina <jikos@kernel.org>, Borislav Petkov <bp@alien8.de>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Tony Luck <tony.luck@intel.com>,
	linux-parisc@vger.kernel.org, linux-kernel@vger.kernel.org,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Colin Cross <ccross@android.com>,
	Paul Mackerras <paulus@samba.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 11/11 v2] ftrace: Add recording of functions that caused recursion
Date: Mon, 2 Nov 2020 12:09:07 -0500
Message-ID: <20201102120907.457ad2f7@gandalf.local.home>
In-Reply-To: <20201102164147.GJ20201@alley>

On Mon, 2 Nov 2020 17:41:47 +0100
Petr Mladek <pmladek@suse.com> wrote:

> On Fri 2020-10-30 17:31:53, Steven Rostedt wrote:
> > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > 
> > This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file
> > "recursed_functions" all the functions that caused recursion while a
> > callback to the function tracer was running.
> >   
> 
> > --- /dev/null
> > +++ b/kernel/trace/trace_recursion_record.c
> > @@ -0,0 +1,220 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include <linux/seq_file.h>
> > +#include <linux/kallsyms.h>
> > +#include <linux/module.h>
> > +#include <linux/ftrace.h>
> > +#include <linux/fs.h>
> > +
> > +#include "trace_output.h"
> > +
> > +struct recursed_functions {
> > +	unsigned long		ip;
> > +	unsigned long		parent_ip;
> > +};
> > +
> > +static struct recursed_functions recursed_functions[CONFIG_FTRACE_RECORD_RECURSION_SIZE];  
> 
> The code tries to be lockless-safe as much as possible. It would make
> sense to align the array.

Hmm, is there an arch where the compiler would misalign an array of
structures that each hold two unsigned longs?
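
One way to check the alignment question on a given toolchain is with
compile-time assertions. The following is a minimal userspace sketch, not
kernel code; table and all_ip_slots_aligned() are hypothetical names, and
the "no padding" assertion is an assumption that holds on the common ABIs
rather than something the C standard guarantees.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the struct in the patch. */
struct recursed_functions {
	unsigned long ip;
	unsigned long parent_ip;
};

/* A struct is at least as aligned as its most-aligned member. */
static_assert(alignof(struct recursed_functions) >= alignof(unsigned long),
	      "struct must be aligned at least like unsigned long");

/* Assumption: no padding, so array elements sit back to back. */
static_assert(sizeof(struct recursed_functions) == 2 * sizeof(unsigned long),
	      "unexpected padding in struct recursed_functions");

/* Hypothetical stand-in for the recursed_functions[] array. */
static struct recursed_functions table[4];

/* Returns 1 if every .ip slot is naturally aligned, which cmpxchg needs. */
int all_ip_slots_aligned(void)
{
	for (size_t i = 0; i < 4; i++) {
		if ((uintptr_t)&table[i].ip % alignof(unsigned long))
			return 0;
	}
	return 1;
}
```

If either static_assert fails to compile on some target, that target would
be a case where an explicit alignment attribute on the array matters.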

> 
> 
> > +static atomic_t nr_records;
> > +
> > +/*
> > + * Cache the last found function. Yes, updates to this are racy, but
> > + * so is memory cache ;-)
> > + */
> > +static unsigned long cached_function;
> > +
> > +void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
> > +{
> > +	int index;
> > +	int i = 0;
> > +	unsigned long old;
> > +
> > + again:
> > +	/* First check the last one recorded */
> > +	if (ip == cached_function)
> > +		return;
> > +
> > +	index = atomic_read(&nr_records);
> > +	/* nr_records is -1 when clearing records */
> > +	smp_mb__after_atomic();
> > +	if (index < 0)
> > +		return;
> > +
> > +	/* See below */
> > +	if (i > index)
> > +		index = i;  
> 
> This looks like a complicated way to do index++ via "i" variable.
> I guess that it was needed only in some older variant of the code.
> See below.

Because we reread nr_records into index above, and that value could be
bigger than i (that is, more than the old index + 1).

> 
> > +	if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE)
> > +		return;
> > +
> > +	for (i = index - 1; i >= 0; i--) {
> > +		if (recursed_functions[i].ip == ip) {
> > +			cached_function = ip;
> > +			return;
> > +		}
> > +	}
> > +
> > +	cached_function = ip;
> > +
> > +	/*
> > +	 * We only want to add a function if it hasn't been added before.
> > +	 * Add to the current location before incrementing the count.
> > +	 * If it fails to add, then increment the index (save in i)
> > +	 * and try again.
> > +	 */
> > +	old = cmpxchg(&recursed_functions[index].ip, 0, ip);
> > +	if (old != 0) {
> > +		/* Did something else already add this for us? */
> > +		if (old == ip)
> > +			return;
> > +		/* Try the next location (use i for the next index) */
> > +		i = index + 1;  
> 
> What about
> 
> 		index++;
> 
> We basically want to run the code again with index + 1 limit.

But something else could update nr_records, and we want to use that if
nr_records is greater than i.

Now, we could swap the use case, and have

	int index = 0;

	[..]
	i = atomic_read(&nr_records);
	if (i > index)
		index = i;

	[..]

		index++;
		goto again;
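
The swapped variant above can be modelled in userspace to see how the
"index = max(index, nr_records)" step behaves. This is a sketch under
stated assumptions: C11 atomics stand in for the kernel's atomic_t and
cmpxchg(), RECORD_SIZE stands in for CONFIG_FTRACE_RECORD_RECURSION_SIZE,
and record_recursion() with its return value is a hypothetical name and
convention. The cached_function shortcut and the rollback for a racing
clear are left out to keep it short.

```c
#include <stdatomic.h>

#define RECORD_SIZE 8	/* stand-in for CONFIG_FTRACE_RECORD_RECURSION_SIZE */

struct recursed_functions {
	_Atomic unsigned long	ip;
	unsigned long		parent_ip;
};

static struct recursed_functions recursed_functions[RECORD_SIZE];
static atomic_int nr_records;

/* Returns the slot that was filled, or -1 if nothing was recorded. */
int record_recursion(unsigned long ip, unsigned long parent_ip)
{
	int index = 0;
	int i;
	unsigned long old;

again:
	i = atomic_load(&nr_records);
	if (i < 0)		/* -1 means a clear is in progress */
		return -1;
	if (i > index)		/* another writer moved the count past us */
		index = i;
	if (index >= RECORD_SIZE)
		return -1;

	/* Skip functions that are already recorded. */
	for (i = index - 1; i >= 0; i--) {
		if (atomic_load(&recursed_functions[i].ip) == ip)
			return -1;
	}

	/* Claim the slot only if it is still empty. */
	old = 0;
	if (!atomic_compare_exchange_strong(&recursed_functions[index].ip,
					    &old, ip)) {
		if (old == ip)	/* something else recorded it for us */
			return -1;
		index++;	/* slot taken: try the next one */
		goto again;
	}

	recursed_functions[index].parent_ip = parent_ip;

	/* Publish the new count unless it already moved past this slot. */
	i = atomic_load(&nr_records);
	if (i >= 0 && i <= index)
		atomic_compare_exchange_strong(&nr_records, &i, index + 1);
	return index;
}
```

Because index only ever grows within one call, a failed cmpxchg on the
slot always moves forward, while a larger nr_records read on the retry
still takes priority over the local index++.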


> 
> Maybe it does not even make sense to check the array again
> and we should just try to store the value into the next slot.

We do this dance to prevent duplicates.

But you are correct that this went through a few iterations. The first
ones didn't have the cmpxchg on the ip itself, and having it now could
mean that we don't need this index = i dance.

> 
> > +		goto again;
> > +	}
> > +
> > +	recursed_functions[index].parent_ip = parent_ip;  
> 
> WRITE_ONCE() ?

Does it really matter?

> 
> > +
> > +	/*
> > +	 * It's still possible that we could race with the clearing
> > +	 *    CPU0                                    CPU1
> > +	 *    ----                                    ----
> > +	 *                                       ip = func
> > +	 *  nr_records = -1;
> > +	 *  recursed_functions[0] = 0;
> > +	 *                                       i = -1
> > +	 *                                       if (i < 0)
> > +	 *  nr_records = 0;
> > +	 *  (new recursion detected)
> > +	 *      recursed_functions[0] = func
> > +	 *                                            cmpxchg(recursed_functions[0],
> > +	 *                                                    func, 0)
> > +	 *
> > +	 * But the worst that could happen is that we get a zero in
> > +	 * the recursed_functions array, and it's likely that "func" will
> > +	 * be recorded again.
> > +	 */
> > +	i = atomic_read(&nr_records);
> > +	smp_mb__after_atomic();
> > +	if (i < 0)
> > +		cmpxchg(&recursed_functions[index].ip, ip, 0);
> > +	else if (i <= index)
> > +		atomic_cmpxchg(&nr_records, i, index + 1);  
> 
> This looks weird. It would shift nr_records past the record added
> in this call. It might skip many slots that were zeroed when clearing.
> Also we do not know if our entry was not zeroed as well.
> 
> I would suggest doing it some other way (not even compile tested):
> 
> void ftrace_record_recursion(unsigned long ip, unsigned long parent_ip)
> {
> 	int index, old_index;
> 	int i = 0;
> 	unsigned long old_ip;
> 
>  again:
> 	/* First check the last one recorded. */
> 	if (ip == READ_ONCE(cached_function))
> 		return;
> 
> 	index = atomic_read(&nr_records);
> 	/* nr_records is -1 when clearing records. */
> 	smp_mb__after_atomic();
> 	if (index < 0)
> 		return;
> 
> 	/* Already cached? */
> 	for (i = index - 1; i >= 0; i--) {
> 		if (recursed_functions[i].ip == ip) {
> 			WRITE_ONCE(cached_function, ip);
> 			return;
> 		}
> 	}
> 
> 	if (index >= CONFIG_FTRACE_RECORD_RECURSION_SIZE)
> 		return;
> 
> 	/*
> 	 * Try to reserve the slot. It might be already taken
> 	 * or the entire cache cleared.
> 	 */
> 	old_index = atomic_cmpxchg(&nr_records, index, index + 1);
> 	if (old_index != index)
> 		goto again;
> 
> 	/*
> 	 * Be careful. The entire cache might have been cleared and reused in
> 	 * the meantime. Replace only empty slot.
> 	 */
> 	old_ip = cmpxchg(&recursed_functions[index].ip, 0, ip);
> 	if (old_ip != 0)
> 		goto again;
> 
> 	old_ip = cmpxchg(&recursed_functions[index].parent_ip, 0, parent_ip);
> 	if (old_ip != 0)
> 		goto again;
> 
> 	/*
> 	 * No ip is better than non-consistent one. The race with
> 	 * clearing should be rare and not worth a perfect solution.
> 	 */
> 	if (READ_ONCE(recursed_functions[index].ip) != ip) {
> 		cmpxchg(&recursed_functions[index].ip, ip, 0UL);
> 		goto again;
> 	}
> }

Let me go and rewrite it, this time considering the cmpxchg in the ip
update code. I may end up with what you have above ;-)
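
The core of the suggestion above is the ordering: reserve the slot by
bumping nr_records first, then fill it. A minimal userspace sketch of just
that ordering follows, assuming C11 atomics in place of the kernel
primitives; RECORD_SIZE, slots_ip, nr_reserved and reserve_and_record()
are hypothetical names, and parent_ip and the cached_function shortcut are
omitted.

```c
#include <stdatomic.h>

#define RECORD_SIZE 8	/* stand-in for CONFIG_FTRACE_RECORD_RECURSION_SIZE */

static _Atomic unsigned long slots_ip[RECORD_SIZE];
static atomic_int nr_reserved;	/* plays the role of nr_records */

/* Returns the slot that was filled, or -1 if nothing was recorded. */
int reserve_and_record(unsigned long ip)
{
	int index, expected, i;
	unsigned long old_ip;

again:
	index = atomic_load(&nr_reserved);
	if (index < 0)		/* -1 means a clear is in progress */
		return -1;

	/* Already recorded? */
	for (i = index - 1; i >= 0; i--) {
		if (atomic_load(&slots_ip[i]) == ip)
			return -1;
	}

	if (index >= RECORD_SIZE)
		return -1;

	/* Reserve the slot; retry if another writer or a clear got in. */
	expected = index;
	if (!atomic_compare_exchange_strong(&nr_reserved, &expected, index + 1))
		goto again;

	/* The table may have been cleared and reused: fill only if empty. */
	old_ip = 0;
	if (!atomic_compare_exchange_strong(&slots_ip[index], &old_ip, ip))
		goto again;

	return index;
}
```

One trade-off of reserving first, as far as this sketch shows: between the
reservation and the fill, the slot is counted but still zero, so a
concurrent writer scanning for the same ip would not see it and could
record a duplicate.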


> 
> The last check is probably not needed. Inconsistent entries
> should be prevented by the way this func is called:
> 
> 		static atomic_t paranoid_test;				\
> 		if (!atomic_read(&paranoid_test)) {			\
> 			atomic_inc(&paranoid_test);			\
> 			ftrace_record_recursion(ip, pip);		\
> 			atomic_dec(&paranoid_test);			\
> 		}							\
> 
> 
> 
> 
> The rest of the patchset looks fine. I do not feel comfortable giving
> it a Reviewed-by because I did not review it in depth.
> 
> I spent more time with the above lockless code. I took it as
> training. I need to improve this skill to feel more comfortable with
> the lockless printk ring buffer ;-)

Yeah, everything becomes exponentially complex when you make it lockless
with multiple concurrent writers.

-- Steve
