From: Mark Rutland <mark.rutland@arm.com>
To: Pingfan Liu <kernelfans@gmail.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>,
	Vladimir Murzin <vladimir.murzin@arm.com>,
	Steve Capper <steve.capper@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] arm64/mm: save memory access in check_and_switch_context() fast switch path
Date: Fri, 10 Jul 2020 10:35:16 +0100
Message-ID: <20200710093516.GA25856@C02TD0UTHF1T.local>
In-Reply-To: <CAFgQCTviLCPkvCfrZ0Cwubqfzpht6n6=hJW-RsRQejYNHozT9Q@mail.gmail.com>

On Fri, Jul 10, 2020 at 04:03:39PM +0800, Pingfan Liu wrote:
> On Thu, Jul 9, 2020 at 7:48 PM Mark Rutland <mark.rutland@arm.com> wrote:
> [...]
> >
> > IIUC that's a 0.3% improvement. It'd be worth putting these results in
> > the commit message.
> Sure, I will.
> >
> > Could you also try that with "perf bench sched messaging" as the
> > workload? As a microbenchmark, that might show the highest potential
> > benefit, and it'd be nice to have those figures too if possible.
> I have run this test 10 times and will put the results in the
> commit log too. In summary, this microbenchmark shows about a 1.69%
> improvement with this patch.
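
That is consistent with the raw numbers below: (0.707 - 0.695) / 0.707
comes out at roughly 1.7%. For anyone reading this from the archive,
the pattern being optimised looks roughly like the sketch below (my
paraphrase of the idea in the subject line, not the verbatim diff; see
the patch at the head of the thread for the real change):

	/* arch/arm64/mm/context.c tracks the active ASID per cpu. */
	static DEFINE_PER_CPU(atomic64_t, active_asids);

	/*
	 * Before: per_cpu() via smp_processor_id() first loads the
	 * per-cpu cpu_number variable, then indexes the in-memory
	 * __per_cpu_offset[] array -- extra memory accesses on every
	 * fast switch.
	 */
	atomic64_t *p = &per_cpu(active_asids, smp_processor_id());

	/*
	 * After: this_cpu_ptr() derives the address directly from the
	 * per-cpu offset already held in the TPIDR_EL1 register, so
	 * the fast path avoids those loads.
	 */
	atomic64_t *p = this_cpu_ptr(&active_asids);

The saving per switch is small, which is why a context-switch-heavy
microbenchmark like sched messaging is where it should be most
visible.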

Great; thanks for gathering this data!

Mark.

> 
> Test data:
> 
> 1. without this patch, total 0.707 sec for 10 times
> 
> # perf stat -r 10 perf bench sched messaging
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.074 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.071 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.068 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.072 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.070 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.070 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.072 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.072 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.068 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.070 [sec]
> 
>  Performance counter stats for 'perf bench sched messaging' (10 runs):
> 
>           3,102.15 msec task-clock                #   11.018 CPUs utilized            ( +-  0.47% )
>             16,468      context-switches          #    0.005 M/sec                    ( +-  2.56% )
>              6,877      cpu-migrations            #    0.002 M/sec                    ( +-  3.44% )
>             83,645      page-faults               #    0.027 M/sec                    ( +-  0.05% )
>      6,440,897,966      cycles                    #    2.076 GHz                      ( +-  0.37% )
>      3,620,264,483      instructions              #    0.56  insn per cycle          ( +-  0.11% )
>    <not supported>      branches
>         11,187,394      branch-misses                                                 ( +-  0.73% )
> 
>            0.28155 +- 0.00166 seconds time elapsed  ( +-  0.59% )
> 
> 2. with this patch, total 0.695 sec for 10 times
> 
> # perf stat -r 10 perf bench sched messaging
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.069 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.070 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.070 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.070 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.071 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.069 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.072 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.066 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.069 [sec]
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
> 
>      Total time: 0.069 [sec]
> 
>  Performance counter stats for 'perf bench sched messaging' (10 runs):
> 
>           3,098.48 msec task-clock                #   11.182 CPUs utilized            ( +-  0.38% )
>             15,485      context-switches          #    0.005 M/sec                    ( +-  2.28% )
>              6,707      cpu-migrations            #    0.002 M/sec                    ( +-  2.80% )
>             83,606      page-faults               #    0.027 M/sec                    ( +-  0.00% )
>      6,435,068,186      cycles                    #    2.077 GHz                      ( +-  0.26% )
>      3,611,197,297      instructions              #    0.56  insn per cycle          ( +-  0.08% )
>    <not supported>      branches
>         11,323,244      branch-misses                                                 ( +-  0.51% )
> 
>           0.277087 +- 0.000625 seconds time elapsed  ( +-  0.23% )
> 
> 
> Thanks,
> Pingfan
