linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joel Fernandes <joelaf@google.com>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-rt-users@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>,
	Peter Zilstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Tom Zanussi <tom.zanussi@linux.intel.com>,
	Thomas Glexiner <tglx@linutronix.de>,
	Boqun Feng <boqun.feng@gmail.com>,
	Paul McKenney <paulmck@linux.vnet.ibm.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	Fenguang Wu <fengguang.wu@intel.com>,
	Baohong Liu <baohong.liu@intel.com>,
	Vedang Patel <vedang.patel@intel.com>,
	kernel-team@lge.com
Subject: Re: [RFC v4 3/4] irqflags: Avoid unnecessary calls to trace_ if you can
Date: Sun, 22 Apr 2018 18:14:18 -0700	[thread overview]
Message-ID: <CAJWu+opGN5+h4=ggKcQGnqG8h1LvtMjD_XQywU7=ULjjyS-v6w@mail.gmail.com> (raw)
In-Reply-To: <CAJWu+ooK5jETwB-L37=8PX+A4ercM8BrVb6g23t7F_xYhH4BSw@mail.gmail.com>

On Fri, Apr 20, 2018 at 12:07 AM, Joel Fernandes <joelaf@google.com> wrote:
> Hi,
>
> Thanks Matsami and Namhyung for the suggestions!
>
> On Wed, Apr 18, 2018 at 10:43 PM, Namhyung Kim <namhyung@kernel.org> wrote:
>> On Wed, Apr 18, 2018 at 06:02:50PM +0900, Masami Hiramatsu wrote:
>>> On Mon, 16 Apr 2018 21:07:47 -0700
>>> Joel Fernandes <joelaf@google.com> wrote:
>>>
>>> > With TRACE_IRQFLAGS, we call trace_ API too many times. We don't need
>>> > to if local_irq_restore or local_irq_save didn't actually do anything.
>>> >
>>> > This gives around a 4% improvement in performance when doing the
>>> > following command: "time find / > /dev/null"
>>> >
>>> > Also its best to avoid these calls where possible, since in this series,
>>> > the RCU code in tracepoint.h seems to be call these quite a bit and I'd
>>> > like to keep this overhead low.
>>>
>>> Can we assume that the "flags" has only 1 bit irq-disable flag?
>>> Since it skips calling raw_local_irq_restore(flags); too,
>>
>> I don't know how many it impacts on performance but maybe we can have
>> an arch-specific config option something like below?
>
> The flags restoration I am hoping is "cheap" but I haven't measured
> specifically the cost of this though.
>
>>
>>
>>> if there is any state in the flags on any arch, it may change the
>>> result. In that case, we can do it as below (just skipping trace_hardirqs_*)
>>>
>>> int disabled = irqs_disabled();
>>
>>   if (disabled == raw_irqs_disabled_flags(flags)) {
>> #ifndef CONFIG_ARCH_CAN_SKIP_NESTED_IRQ_RESTORE
>>         raw_local_irq_restore(flags);
>> #endif
>>         return;
>>   }
>
> Hmm, somehow I feel this part should be written generically enough
> that it applies to all architectures (as a first step).
>
>>
>>>
>>> if (!raw_irqs_disabled_flags(flags) && disabled)
>>>       trace_hardirqs_on();
>>>
>>> raw_local_irq_restore(flags);
>>>
>>> if (raw_irqs_disabled_flags(flags) && !disabled)
>>>       trace_hardirqs_off();
>
> I like this idea since its a good thing to do the flag restoration
> just to be safe and preserve the current behaviors. Also my goal was
> to reduce the trace_ calls in this series, so its probably better I
> just do as you're suggesting. I will do some experiments and make the
> changes for the next series.

So about performance of this series..

lockdep hooking into tracepoint code is a bit heavy, compared to
without this series. That's because of the design approach of
IRQ on/off -> Trace point -> lockdep

Versus without this series which does
IRQ on/off -> lockdep

So we lose performance because of that.

This particular patch improves the situation, as such so this
particular patch is probably good to merge once we can test
performance of Matsami's suggestion as well.

However, patch 4/4 which makes lockdep use the tracepoint causes a
performance hit of around 8% of mean time when I run:
hackbench -g 4 -f 2 -l 30000

I narrowed the performance hit down to the call to
rcu_irq_enter_irqson() and rcu_irq_exit_irqson() in __DO_TRACE.
Commenting these 2 functions brings the perf level back.

I was thinking about RCU usage here, and really we never change this
particular performance-sensitive tracepoint's function table 99.9% of
the time, so it seems there's quite in a win if we just had another
read-mostly synchronization mechanism that doesn't do all the RCU
tracking that's currently done here and such a mechanism can be
simpler..

If I understand correctly, RCU also adds other complications such as
that it can't be used from the idle path, that's why the
rcu_irq_enter_* was added in the first place. Would be nice if we can
just avoid these RCU calls for the preempt/irq tracepoints... Any
thoughts about this or any other ideas to solve this?

Meanwhile I'll also do some performance testing with Matsami's idea as well..

thanks,

- Joel

  reply	other threads:[~2018-04-23  1:14 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-17  4:07 [RFC v4 0/4] Centralize and unify usage of preempt/irq tracepoints Joel Fernandes
2018-04-17  4:07 ` [RFC v4 1/4] tracepoint: Add API to not do lockdep checks during RCU ops Joel Fernandes
2018-04-17  4:07 ` [RFC v4 2/4] softirq: reorder trace_softirqs_on to prevent lockdep splat Joel Fernandes
2018-04-17  4:07 ` [RFC v4 3/4] irqflags: Avoid unnecessary calls to trace_ if you can Joel Fernandes
2018-04-18  9:02   ` Masami Hiramatsu
2018-04-19  5:43     ` Namhyung Kim
2018-04-20  7:07       ` Joel Fernandes
2018-04-23  1:14         ` Joel Fernandes [this message]
2018-04-23  3:19           ` Paul E. McKenney
2018-04-23 14:31             ` Mathieu Desnoyers
2018-04-23 14:53               ` Steven Rostedt
2018-04-23 14:59                 ` Mathieu Desnoyers
2018-04-23 15:12                   ` Paul E. McKenney
2018-04-23 16:18                   ` Steven Rostedt
2018-04-23 17:12                     ` Mathieu Desnoyers
2018-04-23 17:24                       ` Joel Fernandes
2018-04-23 21:22                       ` Steven Rostedt
2018-04-24 15:56                         ` Paul E. McKenney
2018-04-24 16:01                           ` Joel Fernandes
2018-04-24 17:26                             ` Paul E. McKenney
2018-04-24 18:23                               ` Paul E. McKenney
2018-04-24 18:26                                 ` Paul E. McKenney
2018-04-24 18:59                                   ` Joel Fernandes
2018-04-24 19:01                                     ` Joel Fernandes
2018-04-24 19:09                                     ` Paul E. McKenney
2018-04-24 19:16                                       ` Joel Fernandes
2018-04-24 23:21                                     ` Mathieu Desnoyers
2018-04-24 23:46                                       ` Joel Fernandes
2018-04-25  0:10                                         ` Paul E. McKenney
2018-04-25  4:20                                           ` Paul E. McKenney
2018-04-25 21:27                                             ` Joel Fernandes
2018-04-25 21:35                                               ` Paul E. McKenney
2018-04-25 21:40                                               ` Mathieu Desnoyers
2018-04-25 22:51                                                 ` Steven Rostedt
2018-04-26 15:03                                                   ` Mathieu Desnoyers
2018-04-26 16:08                                                     ` Mathieu Desnoyers
2018-04-25 23:13                                                 ` Joel Fernandes
2018-04-26 15:13                                                   ` Mathieu Desnoyers
2018-04-26 15:20                                                     ` Joel Fernandes
2018-04-26 15:49                                                     ` Paul E. McKenney
2018-04-23 15:49                 ` Joel Fernandes
2018-04-26  2:18             ` Joel Fernandes
2018-05-01  1:18     ` Joel Fernandes
2018-04-17  4:07 ` [RFC v4 4/4] tracing: Centralize preemptirq tracepoints and unify their usage Joel Fernandes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJWu+opGN5+h4=ggKcQGnqG8h1LvtMjD_XQywU7=ULjjyS-v6w@mail.gmail.com' \
    --to=joelaf@google.com \
    --cc=baohong.liu@intel.com \
    --cc=boqun.feng@gmail.com \
    --cc=fengguang.wu@intel.com \
    --cc=fweisbec@gmail.com \
    --cc=kernel-team@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhiramat@kernel.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=tom.zanussi@linux.intel.com \
    --cc=vedang.patel@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).