From: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
To: Masami Hiramatsu <mhiramat@kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Nicholas Piggin <npiggin@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 4/7] powerpc/ftrace: Additionally nop out the preceding mflr with -mprofile-kernel
Date: Wed, 19 Jun 2019 15:23:26 +0530
Message-ID: <1560935530.70niyxru6o.naveen@linux.ibm.com>
In-Reply-To: <1560927184.kqsg9x9bd1.astroid@bobo.none>

Nicholas Piggin wrote:
> Michael Ellerman's on June 19, 2019 3:14 pm:
>> Hi Naveen,
>> 
>> Sorry I meant to reply to this earlier .. :/

No problem. Thanks for the questions.

>> 
>> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> writes:
>>> With -mprofile-kernel, gcc emits 'mflr r0', followed by 'bl _mcount' to
>>> enable function tracing and profiling. So far, with dynamic ftrace, we
>>> used to only patch out the branch to _mcount(). However, mflr is
>>> executed by the branch unit that can only execute one per cycle on
>>> POWER9 and shared with branches, so it would be nice to avoid it where
>>> possible.
>>>
>>> We cannot simply nop out the mflr either. When enabling function
>>> tracing, there can be a race if tracing is enabled when some thread was
>>> interrupted after executing a nop'ed out mflr. In this case, the thread
>>> would execute the now-patched-in branch to _mcount() without having
>>> executed the preceding mflr.
>>>
>>> To solve this, we now enable function tracing in 2 steps: patch in the
>>> mflr instruction, use synchronize_rcu_tasks() to ensure all existing
>>> threads make progress, and then patch in the branch to _mcount(). We
>>> override ftrace_replace_code() with a powerpc64 variant for this
>>> purpose.
>> 
>> According to the ISA we're not allowed to patch mflr at runtime. See the
>> section on "CMODX".
> 
> According to "quasi patch class" engineering note, we can patch
> anything with a preferred nop. But that's written as an optional
> facility, which we don't have a feature to test for.
> 

Hmm... I wonder what the implications are. We've been patching in a 
'trap' for kprobes for a long time now, along with patching back the 
original instruction (which can be anything) when the probe is 
removed.

>> 
>> I'm also not convinced the ordering between the two patches is
>> guaranteed by the ISA, given that there's possibly no isync on the other
>> CPU.
> 
> Will they go through a context synchronizing event?
> 
> synchronize_rcu_tasks() should ensure a thread is scheduled away, but
> I'm not actually sure it guarantees CSI if it's kernel->kernel. Could
> do a smp_call_function to do the isync on each CPU to be sure.

Good point. Per 
Documentation/RCU/Design/Requirements/Requirements.html#Tasks RCU:
"The solution, in the form of Tasks RCU, is to have implicit read-side 
critical sections that are delimited by voluntary context switches, that 
is, calls to schedule(), cond_resched(), and synchronize_rcu_tasks(). In 
addition, transitions to and from userspace execution also delimit 
tasks-RCU read-side critical sections."

I suppose transitions to/from userspace, as well as calls to schedule(), 
result in a context synchronizing instruction being executed. But, if some 
tasks only go through cond_resched() or synchronize_rcu_tasks(), we 
probably won't have a CSI executed.

Also:
"In CONFIG_PREEMPT=n kernels, trampolines cannot be preempted, so these 
APIs map to call_rcu(), synchronize_rcu(), and rcu_barrier(), 
respectively."

In this scenario as well, I think we won't have a CSI executed in case 
of cond_resched().

Should we enhance patch_instruction() to handle that?
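
To make sure we get a CSI on all cpus, per your smp_call_function() 
suggestion above, something like the below is what I have in mind. This 
is just a sketch to illustrate the idea -- ftrace_sync_isync() is a 
made-up name, and whether this belongs in patch_instruction() or in the 
ftrace code is an open question:

	/* Illustrative only -- not part of the posted series. */
	#include <linux/smp.h>
	#include <asm/synch.h>

	static void do_isync(void *info)
	{
		isync();
	}

	/*
	 * Force a context synchronizing instruction on every cpu after
	 * the first patch step, so that no thread can execute the
	 * newly patched-in 'bl _mcount' without also having seen the
	 * patched-in mflr.
	 */
	static void ftrace_sync_isync(void)
	{
		/* Run do_isync() on all other online cpus and wait. */
		smp_call_function(do_isync, NULL, 1);
		isync();	/* ...and on the patching cpu itself */
	}

This may well be overkill if synchronize_rcu_tasks() ends up guaranteeing 
a CSI on every cpu anyway, but it would make the ordering explicit rather 
than relying on the scheduler path.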


- Naveen


