linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Mark Rutland <mark.rutland@arm.com>
To: He Ying <heying24@huawei.com>
Cc: catalin.marinas@arm.com, dvyukov@google.com, elver@google.com,
	james.morse@arm.com, linux-arm-kernel@lists.infradead.org,
	paulmck@kernel.org, peterz@infradead.org, will@kernel.org
Subject: Re: [PATCHv2 06/11] arm64: entry: move el1 irq/nmi logic to C
Date: Thu, 6 May 2021 11:58:32 +0100	[thread overview]
Message-ID: <20210506105832.GB32366@C02TD0UTHF1T.local> (raw)
In-Reply-To: <e3843e03-173e-10a6-5b14-0d8c14219e09@huawei.com>

On Thu, May 06, 2021 at 06:25:40PM +0800, He Ying wrote:
> 
> 在 2021/5/6 17:16, Mark Rutland 写道:
> > On Thu, May 06, 2021 at 04:28:09PM +0800, He Ying wrote:
> > > Hi Mark,
> > Hi,
> > 
> > > I have faced a performance regression for handling IPIs since this commit.
> > > 
> > > I caculate the cycles from the entry of el1_irq to the entry of
> > > gic_handle_irq.
> > > 
> > >  From my test, this commit may overhead an average of 200 cycles. Do you
> > > 
> > > have any ideas about this? Looking forward to your reply.
> > On that path, the only meaningfull difference is the call to
> > enter_el1_irq_or_nmi(), since that's now unconditional, and it's an
> > extra layer in the callchain.
> > 
> > When either CONFIG_ARM64_PSEUDO_NMI or CONFIG_TRACE_IRQFLAGS are
> > selected, enter_el1_irq_or_nmi() is a wrapper for functions we'd already
> > call, and I'd expectthe cost of the callees to dominate.
> > 
> > When neither CONFIG_ARM64_PSEUDO_NMI nor CONFIG_TRACE_IRQFLAGS are
> > selected, this should add a trivial function that immediately returns,
> > and so 200 cycles seems excessive.
> > 
> > Building that commit with defconfig, I see that GCC 10.1.0 generates:
> > 
> > | ffff800010dfc864 <enter_el1_irq_or_nmi>:
> > | ffff800010dfc864:       d503233f        paciasp
> > | ffff800010dfc868:       d50323bf        autiasp
> > | ffff800010dfc86c:       d65f03c0        ret
> 
> CONFIG_ARM64_PSEUDO_NMI is not set in my test. And I generate a different
> object
> 
> from yours:
> 
> 00000000000002b8 <enter_el1_irq_or_nmi>:
> 
>  2b8:        d503233f         paciasp
>  2bc:        a9bf7bfd          stp  x29, x30, [sp, #-16]!
>  2c0:        91052000         add x0, x0, #0x148
>  2c4:        910003fd          mov x29, sp
>  2c8:        97ffff57             bl 24 <enter_from_kernel_mode.isra.6>
>  2cc:        a8c17bfd           ldp x29, x30, [sp], #16
>  2d0:       d50323bf           autiasp
>  2d4:       d65f03c0            ret

Which commit are you testing with?

The call to enter_from_kernel_mode() was introduced later in commit:

  7cd1ea1010acbede ("rm64: entry: fix non-NMI kernel<->kernel transitions")

... and doesn't exist in commit:

  105fc3352077bba5 ("arm64: entry: move el1 irq/nmi logic to C")

Do you see the 200 cycle penalty with 105fc3352077bba5 alone? ... or
only only after the whole series is applied?

If enter_from_kernel_mode() is what's taking the bulk of the cycles,
then this is likely unavoidable work that previously (erroneously)
omitted.

> > ... so perhaps the PACIASP and AUTIASP have an impact?
> I'm not sure...
> > 
> > I have a few questions:
> > 
> > * Which CPU do you see this on?
> Hisilicon hip05-d02.
> > 
> > * Does that CPU implement pointer authentication?
> I'm not sure. How to check?

Does the dmesg contain "Address authentication" anywhere?

> > 
> > * What kernel config are you using? e.g. is this seen with defconfig?
> 
> Our own. But CONFIG_ARM64_PSEUDO_NMI is not set.
> 
> Should I provide it as an attachment?

From your attachment I see that TRACE_IRQFLAGS and LOCKDEP aren't
selected either, so IIUC the only non-trivial bits in
enter_from_kernel_mode() will be the RCU accounting.

> > * What's the total cycle count from el1_irq to gic_handle_irq?
> 
> Applying the patchset:   249 cycles.
> 
> Reverting the patchset:  77 cycles.
> 
> Maybe 170 cycles is more correct.
> 
> > 
> > * Does this measurably impact a real workload?
> Have some impact to scheduling perf test.

Does it affect a real workload? i.e. not a microbenchmark?

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  parent reply	other threads:[~2021-05-06 11:01 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-30 11:59 [PATCHv2 00/11] arm64: entry lockdep/rcu/tracing fixes Mark Rutland
2020-11-30 11:59 ` [PATCHv2 01/11] arm64: syscall: exit userspace before unmasking exceptions Mark Rutland
2020-11-30 11:59 ` [PATCHv2 02/11] arm64: mark idle code as noinstr Mark Rutland
2020-11-30 11:59 ` [PATCHv2 03/11] arm64: entry: mark entry " Mark Rutland
2020-11-30 11:59 ` [PATCHv2 04/11] arm64: entry: move enter_from_user_mode to entry-common.c Mark Rutland
2020-11-30 11:59 ` [PATCHv2 05/11] arm64: entry: prepare ret_to_user for function call Mark Rutland
2020-12-17 17:57   ` Guenter Roeck
2020-12-17 18:38     ` Catalin Marinas
2020-11-30 11:59 ` [PATCHv2 06/11] arm64: entry: move el1 irq/nmi logic to C Mark Rutland
2021-05-06  8:28   ` He Ying
2021-05-06  9:16     ` Mark Rutland
     [not found]       ` <e3843e03-173e-10a6-5b14-0d8c14219e09@huawei.com>
2021-05-06 10:58         ` Mark Rutland [this message]
2021-05-07  3:25           ` He Ying
2021-05-07  9:41             ` Mark Rutland
2021-05-07 10:02               ` He Ying
2020-11-30 11:59 ` [PATCHv2 07/11] arm64: entry: fix non-NMI user<->kernel transitions Mark Rutland
2020-11-30 11:59 ` [PATCHv2 08/11] arm64: ptrace: prepare for EL1 irq/rcu tracking Mark Rutland
2020-11-30 11:59 ` [PATCHv2 09/11] arm64: entry: fix non-NMI kernel<->kernel transitions Mark Rutland
2021-04-25  5:29   ` Zenghui Yu
2021-04-26  9:21     ` Mark Rutland
2021-04-26 13:39       ` Zenghui Yu
2021-04-27  7:15         ` Zenghui Yu
2021-04-27 14:43           ` Mark Rutland
2020-11-30 11:59 ` [PATCHv2 10/11] arm64: entry: fix NMI {user, kernel}->kernel transitions Mark Rutland
2020-11-30 11:59 ` [PATCHv2 11/11] arm64: entry: fix EL1 debug transitions Mark Rutland
2020-11-30 13:52 ` [PATCHv2 00/11] arm64: entry lockdep/rcu/tracing fixes Will Deacon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210506105832.GB32366@C02TD0UTHF1T.local \
    --to=mark.rutland@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=dvyukov@google.com \
    --cc=elver@google.com \
    --cc=heying24@huawei.com \
    --cc=james.morse@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).