All of lore.kernel.org
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: Joerg Roedel <jroedel@suse.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Borislav Petkov <bp@alien8.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Shile Zhang <shile.zhang@linux.alibaba.com>,
	Andy Lutomirski <luto@amacapital.net>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Tzvetomir Stoyanov <tz.stoyanov@gmail.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Subject: Re: [RFC][PATCH] x86/mm: Sync all vmalloc mappings before text_poke()
Date: Thu, 30 Apr 2020 12:11:36 -0400	[thread overview]
Message-ID: <20200430121136.6d7aeb22@gandalf.local.home> (raw)
In-Reply-To: <20200430141120.GA8135@suse.de>

On Thu, 30 Apr 2020 16:11:21 +0200
Joerg Roedel <jroedel@suse.de> wrote:

> Hi,
> 
> On Wed, Apr 29, 2020 at 10:07:31AM -0400, Steven Rostedt wrote:
> > Talking with Mathieu about this on IRC, he pointed out that my code does
> > have a vzalloc() that is called:
> > 
> > in trace_pid_write()
> > 
> > 	pid_list->pids = vzalloc((pid_list->pid_max + 7) >> 3);
> > 
> > This is done when -P1,2 is on the trace-cmd command line.  
> 
> Okay, tracked it down, some instrumentation in the page-fault and
> double-fault handler gave me the stack-traces. Here is what happens:
> 
> As already pointed out, it all happens because of page-faults on the
> vzalloc'ed pid bitmap. It starts with this stack-trace:
> 
>  RIP: 0010:trace_event_ignore_this_pid+0x23/0x30

Interesting. Because that function is this:

bool trace_event_ignore_this_pid(struct trace_event_file *trace_file)
{
	struct trace_array *tr = trace_file->tr;
	struct trace_array_cpu *data;
	struct trace_pid_list *no_pid_list;
	struct trace_pid_list *pid_list;

	pid_list = rcu_dereference_raw(tr->filtered_pids);
	no_pid_list = rcu_dereference_raw(tr->filtered_no_pids);

	if (!pid_list && !no_pid_list)
		return false;

	data = this_cpu_ptr(tr->array_buffer.data);

	return data->ignore_pid;
}

Where it only sees if the pid masks exist. That is, it looks to see if
there's pointers to them, it doesn't actually touch the vmalloc'd area.
This check is to handle a race between allocating and deallocating the
buffers and setting the ignore_pid bit. The reading of these arrays is done
at sched_switch time, which sets or clears the ignore_pid field.

That said, since this only happens on buffer instances (it does not trigger
on the top level instance, which uses the same code for the pid masks)

Could this possibly be for the tr->array_buffer.data, which is allocated
with:

allocate_trace_buffer() {
	[..]
	buf->data = alloc_percpu(struct trace_array_cpu);

That is, the bug isn't the vmalloc being a problem, but perhaps the per_cpu
allocation. This would explain why this crashes with the buffer instance
and not with the top level instance. If it was related to the pid masks,
then it would trigger for either (because they act the same in allocating
at time of use). But when an instance is made, the tr->array_buffer.data is
created. Which for the top level happens at boot up and the pages would
have been synced long ago. But for a newly created instance, this happens
just before its used. This could possibly explain why it's not a problem
when doing it manually by hand, because the time between creating the
instance, and the time to start and stop the tracing, is long enough for
something to sync them page tables.

tl;dr; It's not an issue with the vmalloc, it's an issue with per_cpu
allocations!


-- Steve

  parent reply	other threads:[~2020-04-30 16:11 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-29  9:48 [RFC][PATCH] x86/mm: Sync all vmalloc mappings before text_poke() Steven Rostedt
2020-04-29 10:59 ` Joerg Roedel
2020-04-29 12:28   ` Steven Rostedt
2020-04-29 14:07     ` Steven Rostedt
2020-04-29 14:10       ` Joerg Roedel
2020-04-29 14:32         ` Steven Rostedt
2020-04-29 15:44           ` Peter Zijlstra
2020-04-29 16:17       ` Joerg Roedel
2020-04-29 16:20         ` Joerg Roedel
2020-04-29 16:52           ` Steven Rostedt
2020-04-29 17:29             ` Mathieu Desnoyers
2020-04-29 18:51               ` Peter Zijlstra
2020-04-30 14:11       ` Joerg Roedel
2020-04-30 14:50         ` Joerg Roedel
2020-04-30 15:20           ` Mathieu Desnoyers
2020-04-30 16:16             ` Steven Rostedt
2020-04-30 16:18               ` Mathieu Desnoyers
2020-04-30 16:30                 ` Steven Rostedt
2020-04-30 16:35                   ` Mathieu Desnoyers
2020-04-30 15:23         ` Mathieu Desnoyers
2020-04-30 16:12           ` Steven Rostedt
2020-04-30 16:11         ` Steven Rostedt [this message]
2020-04-30 16:16           ` Mathieu Desnoyers
2020-04-30 16:25             ` Steven Rostedt
2020-04-30 19:14           ` Joerg Roedel
2020-05-01  1:13             ` Steven Rostedt
2020-05-01  2:26               ` Mathieu Desnoyers
2020-05-01  2:39                 ` Steven Rostedt
2020-05-01 10:16                   ` Joerg Roedel
2020-05-01 13:35                   ` Mathieu Desnoyers
2020-05-04 15:12                   ` [PATCH] percpu: Sync vmalloc mappings in pcpu_alloc() and free_percpu() Joerg Roedel
2020-05-04 15:28                     ` Mathieu Desnoyers
2020-05-04 15:31                       ` Joerg Roedel
2020-05-04 15:38                         ` Mathieu Desnoyers
2020-05-04 15:51                           ` Joerg Roedel
2020-05-04 17:04                           ` Steven Rostedt
2020-05-04 17:40                     ` Steven Rostedt
2020-05-04 18:38                       ` Joerg Roedel
2020-05-04 19:10                         ` Steven Rostedt
2020-05-05 12:31                           ` [PATCH] tracing: Call vmalloc_sync_mappings() after alloc_percpu() Joerg Roedel
2020-05-06 15:17                             ` Steven Rostedt
2020-05-08 14:42                               ` Joerg Roedel
2020-05-04 20:25                     ` [PATCH] percpu: Sync vmalloc mappings in pcpu_alloc() and free_percpu() Peter Zijlstra
2020-05-04 20:43                       ` Steven Rostedt
2020-05-01  4:20                 ` [RFC][PATCH] x86/mm: Sync all vmalloc mappings before text_poke() Steven Rostedt
2020-05-01 13:22                   ` Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200430121136.6d7aeb22@gandalf.local.home \
    --to=rostedt@goodmis.org \
    --cc=akpm@linux-foundation.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=jroedel@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rafael.j.wysocki@intel.com \
    --cc=shile.zhang@linux.alibaba.com \
    --cc=tglx@linutronix.de \
    --cc=tz.stoyanov@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.