From: Mark Rutland <mark.rutland@arm.com>
To: Pingfan Liu <kernelfans@gmail.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>,
Vladimir Murzin <vladimir.murzin@arm.com>,
Steve Capper <steve.capper@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] arm64/mm: save memory access in check_and_switch_context() fast switch path
Date: Fri, 10 Jul 2020 10:35:16 +0100 [thread overview]
Message-ID: <20200710093516.GA25856@C02TD0UTHF1T.local> (raw)
In-Reply-To: <CAFgQCTviLCPkvCfrZ0Cwubqfzpht6n6=hJW-RsRQejYNHozT9Q@mail.gmail.com>
On Fri, Jul 10, 2020 at 04:03:39PM +0800, Pingfan Liu wrote:
> On Thu, Jul 9, 2020 at 7:48 PM Mark Rutland <mark.rutland@arm.com> wrote:
> [...]
> >
> > IIUC that's a 0.3% improvement. It'd be worth putting these results in
> > the commit message.
> Sure, I will.
> >
> > Could you also try that with "perf bench sched messaging" as the
> > workload? As a microbenchmark, that might show the highest potential
> > benefit, and it'd be nice to have those figures too if possible.
> I have run this test 10 times, and will put the results in the
> commit log too. In summary, this microbenchmark shows about a 1.69%
> improvement with this patch.
Great; thanks for gathering this data!
Mark.
>
> Test data:
>
> 1. without this patch, total 0.707 sec over 10 runs
>
> # perf stat -r 10 perf bench sched messaging
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.074 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.071 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.068 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.072 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.070 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.070 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.072 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.072 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.068 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.070 [sec]
>
>  Performance counter stats for 'perf bench sched messaging' (10 runs):
>
>            3,102.15 msec task-clock          #   11.018 CPUs utilized   ( +- 0.47% )
>              16,468      context-switches    #    0.005 M/sec           ( +- 2.56% )
>               6,877      cpu-migrations      #    0.002 M/sec           ( +- 3.44% )
>              83,645      page-faults         #    0.027 M/sec           ( +- 0.05% )
>       6,440,897,966      cycles              #    2.076 GHz             ( +- 0.37% )
>       3,620,264,483      instructions        #    0.56 insn per cycle   ( +- 0.11% )
>     <not supported>      branches
>          11,187,394      branch-misses                                  ( +- 0.73% )
>
>            0.28155 +- 0.00166 seconds time elapsed  ( +- 0.59% )
>
> 2. with this patch, total 0.695 sec over 10 runs
> # perf stat -r 10 perf bench sched messaging
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.069 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.070 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.070 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.070 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.071 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.069 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.072 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.066 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.069 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.069 [sec]
>
>  Performance counter stats for 'perf bench sched messaging' (10 runs):
>
>            3,098.48 msec task-clock          #   11.182 CPUs utilized   ( +- 0.38% )
>              15,485      context-switches    #    0.005 M/sec           ( +- 2.28% )
>               6,707      cpu-migrations      #    0.002 M/sec           ( +- 2.80% )
>              83,606      page-faults         #    0.027 M/sec           ( +- 0.00% )
>       6,435,068,186      cycles              #    2.077 GHz             ( +- 0.26% )
>       3,611,197,297      instructions        #    0.56 insn per cycle   ( +- 0.08% )
>     <not supported>      branches
>          11,323,244      branch-misses                                  ( +- 0.51% )
>
>            0.277087 +- 0.000625 seconds time elapsed  ( +- 0.23% )
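
For reference, the ~1.69% figure quoted above can be reproduced by summing the ten per-run "Total time" values from each set; a quick arithmetic check (using only the figures quoted above):

```python
# Per-run "Total time" figures from the two sets of ten runs above.
before = [0.074, 0.071, 0.068, 0.072, 0.070, 0.070, 0.072, 0.072, 0.068, 0.070]
after  = [0.069, 0.070, 0.070, 0.070, 0.071, 0.069, 0.072, 0.066, 0.069, 0.069]

total_before = sum(before)  # 0.707 sec
total_after = sum(after)    # 0.695 sec

# Relative improvement of the patched kernel over the baseline.
improvement = (total_before - total_after) / total_before * 100
print(f"{total_before:.3f} -> {total_after:.3f} sec, {improvement:.2f}% improvement")
```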
>
>
> Thanks,
> Pingfan
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Thread overview: 7+ messages [~2020-07-10 9:36 UTC | newest]
2020-07-03 5:44 [PATCH] arm64/mm: save memory access in check_and_switch_context() fast switch path Pingfan Liu
2020-07-03 10:13 ` Mark Rutland
2020-07-06 8:10 ` Pingfan Liu
2020-07-07 1:50 ` Pingfan Liu
2020-07-09 11:48 ` Mark Rutland
2020-07-10 8:03 ` Pingfan Liu
2020-07-10 9:35 ` Mark Rutland [this message]