From: Mark Rutland <mark.rutland@arm.com>
To: Pingfan Liu <kernelfans@gmail.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>,
Vladimir Murzin <vladimir.murzin@arm.com>,
Steve Capper <steve.capper@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] arm64/mm: save memory access in check_and_switch_context() fast switch path
Date: Fri, 10 Jul 2020 10:35:16 +0100 [thread overview]
Message-ID: <20200710093516.GA25856@C02TD0UTHF1T.local> (raw)
In-Reply-To: <CAFgQCTviLCPkvCfrZ0Cwubqfzpht6n6=hJW-RsRQejYNHozT9Q@mail.gmail.com>
On Fri, Jul 10, 2020 at 04:03:39PM +0800, Pingfan Liu wrote:
> On Thu, Jul 9, 2020 at 7:48 PM Mark Rutland <mark.rutland@arm.com> wrote:
> [...]
> >
> > IIUC that's a 0.3% improvement. It'd be worth putting these results in
> > the commit message.
> Sure, I will.
> >
> > Could you also try that with "perf bench sched messaging" as the
> > workload? As a microbenchmark, that might show the highest potential
> > benefit, and it'd be nice to have those figures too if possible.
> I have run this test 10 times, and will put the results in the
> commit log too. In summary, this microbenchmark shows about a 1.69%
> improvement with this patch.
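[Archival note: the quoted 1.69% figure follows directly from the two run totals reported below; a minimal sketch of the arithmetic, using only the summed "Total time" values from this message:]

```python
# Summed "Total time" over 10 runs, taken from the logs in this message.
before = 0.707  # seconds, without the patch
after = 0.695   # seconds, with the patch

improvement = (before - after) / before * 100
print(f"improvement: {improvement:.2f}%")  # roughly 1.7%
```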
Great; thanks for gathering this data!
Mark.
>
> Test data:
>
> 1. without this patch, total 0.707 sec over 10 runs
>
> # perf stat -r 10 perf bench sched messaging
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.074 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.071 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.068 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.072 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.070 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.070 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.072 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.072 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.068 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.070 [sec]
>
> Performance counter stats for 'perf bench sched messaging' (10 runs):
>
>       3,102.15 msec task-clock        # 11.018 CPUs utilized   ( +- 0.47% )
>         16,468      context-switches  #  0.005 M/sec           ( +- 2.56% )
>          6,877      cpu-migrations    #  0.002 M/sec           ( +- 3.44% )
>         83,645      page-faults       #  0.027 M/sec           ( +- 0.05% )
>  6,440,897,966      cycles            #  2.076 GHz             ( +- 0.37% )
>  3,620,264,483      instructions      #  0.56  insn per cycle  ( +- 0.11% )
>  <not supported>    branches
>     11,187,394      branch-misses                              ( +- 0.73% )
>
> 0.28155 +- 0.00166 seconds time elapsed ( +- 0.59% )
>
> 2. with this patch, total 0.695 sec over 10 runs
> # perf stat -r 10 perf bench sched messaging
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.069 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.070 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.070 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.070 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.071 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.069 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.072 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.066 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.069 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 0.069 [sec]
>
> Performance counter stats for 'perf bench sched messaging' (10 runs):
>
>       3,098.48 msec task-clock        # 11.182 CPUs utilized   ( +- 0.38% )
>         15,485      context-switches  #  0.005 M/sec           ( +- 2.28% )
>          6,707      cpu-migrations    #  0.002 M/sec           ( +- 2.80% )
>         83,606      page-faults       #  0.027 M/sec           ( +- 0.00% )
>  6,435,068,186      cycles            #  2.077 GHz             ( +- 0.26% )
>  3,611,197,297      instructions      #  0.56  insn per cycle  ( +- 0.08% )
>  <not supported>    branches
>     11,323,244      branch-misses                              ( +- 0.51% )
>
> 0.277087 +- 0.000625 seconds time elapsed ( +- 0.23% )
>
>
> Thanks,
> Pingfan
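[Archival note: the per-run totals quoted above can be summed straight from the benchmark output; a hypothetical one-liner, assuming the output of `perf bench sched messaging` was captured to a file named run.log:]

```shell
# Sum every "Total time: X.XXX [sec]" line printed by the benchmark.
# run.log is a hypothetical capture of the output quoted above.
awk '/Total time:/ { sum += $3 } END { printf "%.3f\n", sum }' run.log
```

Applied to the ten runs in each set above, this yields the 0.707 and 0.695 second totals quoted in the message.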
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Thread overview: 7+ messages
2020-07-03 5:44 [PATCH] arm64/mm: save memory access in check_and_switch_context() fast switch path Pingfan Liu
2020-07-03 10:13 ` Mark Rutland
2020-07-06 8:10 ` Pingfan Liu
2020-07-07 1:50 ` Pingfan Liu
2020-07-09 11:48 ` Mark Rutland
2020-07-10 8:03 ` Pingfan Liu
2020-07-10 9:35 ` Mark Rutland [this message]