linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Punit Agrawal <punitagrawal@gmail.com>
To: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	<linux-kernel@vger.kernel.org>,
	Mark Rutland <mark.rutland@arm.com>
Subject: Re: [PATCH] arm64: entry: Improve the performance of system calls
Date: Tue, 14 Sep 2021 08:01:22 +0900	[thread overview]
Message-ID: <87o88w84hp.fsf@stealth> (raw)
In-Reply-To: <20210903121950.2284-1-thunder.leizhen@huawei.com> (Zhen Lei's message of "Fri, 3 Sep 2021 20:19:50 +0800")

Hi Zhen Lei,

Zhen Lei <thunder.leizhen@huawei.com> writes:

> Commit 582f95835a8f ("arm64: entry: convert el0_sync to C") converted lots
> of functions from assembly to C, this greatly improves readability. But
> el0_svc()/el0_svc_compat() is in response to system call requests from
> user mode and may be in the hot path.
>
> Although the SVC is in the first case of the switch statement in C, the
> compiler optimizes the switch statement as a whole, and does not give SVC
> a small boost.
>
> Use "likely()" to help SVC directly invoke its handler after a simple
> judgment to avoid entering the switch table lookup process.
>
> After:
> 0000000000000ff0 <el0t_64_sync_handler>:
>      ff0:       d503245f        bti     c
>      ff4:       d503233f        paciasp
>      ff8:       a9bf7bfd        stp     x29, x30, [sp, #-16]!
>      ffc:       910003fd        mov     x29, sp
>     1000:       d5385201        mrs     x1, esr_el1
>     1004:       531a7c22        lsr     w2, w1, #26
>     1008:       f100545f        cmp     x2, #0x15
>     100c:       540000a1        b.ne    1020 <el0t_64_sync_handler+0x30>
>     1010:       97fffe14        bl      860 <el0_svc>
>     1014:       a8c17bfd        ldp     x29, x30, [sp], #16
>     1018:       d50323bf        autiasp
>     101c:       d65f03c0        ret
>     1020:       f100705f        cmp     x2, #0x1c
>
> Execute "./lat_syscall null" on my board (BogoMIPS : 200.00), it can save
> about 10ns.
>
> Before:
> Simple syscall: 0.2365 microseconds
> Simple syscall: 0.2354 microseconds
> Simple syscall: 0.2339 microseconds
>
> After:
> Simple syscall: 0.2255 microseconds
> Simple syscall: 0.2254 microseconds
> Simple syscall: 0.2256 microseconds

I was curious about the impact of the patch on other
micro-architectures. Following are the results from using the patch
applied to v5.14 on A72 and A53 on a RK399.

For the A72 -
Before:
Simple syscall: 0.4311 microseconds
Simple syscall: 0.4311 microseconds
Simple syscall: 0.4313 microseconds

After:
Simple syscall: 0.4249 microseconds
Simple syscall: 0.4248 microseconds
Simple syscall: 0.4248 microseconds

For the A53 -
Before:
Simple syscall: 0.4130 microseconds
Simple syscall: 0.4128 microseconds
Simple syscall: 0.4124 microseconds

After:
Simple syscall: 0.4031 microseconds
Simple syscall: 0.4078 microseconds
Simple syscall: 0.4030 microseconds

Although there is a small benefit, they are not as big as on your board
/ micro-architecture.

Were you able to see any impact on real workloads?

I imagine that other code paths in the sync handler would also benefit
from similar special casing - did you try any others. Page fault
handling came to mind.

Overall, I feel a little uneasy about the special casing introduced here
but at the same time see that it does benefit certain workloads.

One more comment below.

>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  arch/arm64/kernel/entry-common.c | 18 ++++++++++++------
>  1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
> index 32f9796c4ffe77b..062eb5a895ec6f3 100644
> --- a/arch/arm64/kernel/entry-common.c
> +++ b/arch/arm64/kernel/entry-common.c
> @@ -607,11 +607,14 @@ static void noinstr el0_fpac(struct pt_regs *regs, unsigned long esr)
>  asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs)
>  {
>  	unsigned long esr = read_sysreg(esr_el1);
> +	unsigned long ec = ESR_ELx_EC(esr);
>  
> -	switch (ESR_ELx_EC(esr)) {
> -	case ESR_ELx_EC_SVC64:
> +	if (likely(ec == ESR_ELx_EC_SVC64)) {
>  		el0_svc(regs);
> -		break;
> +		return;
> +	}
> +
> +	switch (ec) {
>  	case ESR_ELx_EC_DABT_LOW:
>  		el0_da(regs, esr);
>  		break;

Please include a big fat comment on why SVC (or any other patch) is
being separated out of the switch case - both here and below.

Thanks,
Punit

> @@ -730,11 +733,14 @@ static void noinstr el0_svc_compat(struct pt_regs *regs)
>  asmlinkage void noinstr el0t_32_sync_handler(struct pt_regs *regs)
>  {
>  	unsigned long esr = read_sysreg(esr_el1);
> +	unsigned long ec = ESR_ELx_EC(esr);
>  
> -	switch (ESR_ELx_EC(esr)) {
> -	case ESR_ELx_EC_SVC32:
> +	if (likely(ec == ESR_ELx_EC_SVC32)) {
>  		el0_svc_compat(regs);
> -		break;
> +		return;
> +	}
> +
> +	switch (ec) {
>  	case ESR_ELx_EC_DABT_LOW:
>  		el0_da(regs, esr);
>  		break;

  reply	other threads:[~2021-09-13 23:02 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-03 12:19 [PATCH] arm64: entry: Improve the performance of system calls Zhen Lei
2021-09-13 23:01 ` Punit Agrawal [this message]
2021-09-14  2:01   ` Leizhen (ThunderTown)
2021-09-14  9:55 ` Mark Rutland
2021-09-14 11:23   ` Leizhen (ThunderTown)
2021-09-14 11:48     ` Leizhen (ThunderTown)
2021-09-14 15:17   ` Joey Gouly

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87o88w84hp.fsf@stealth \
    --to=punitagrawal@gmail.com \
    --cc=catalin.marinas@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=thunder.leizhen@huawei.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).