From: kernel test robot <oliver.sang@intel.com>
To: Raghavendra K T <raghavendra.kt@amd.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
Bharata B Rao <bharata@amd.com>, <linux-kernel@vger.kernel.org>,
<ying.huang@intel.com>, <feng.tang@intel.com>,
<fengwei.yin@intel.com>, <aubrey.li@linux.intel.com>,
<yu.c.chen@intel.com>, <linux-mm@kvack.org>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Mel Gorman <mgorman@suse.de>,
"Andrew Morton" <akpm@linux-foundation.org>,
David Hildenbrand <david@redhat.com>, <rppt@kernel.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Aithal Srikanth <sraithal@amd.com>,
"kernel test robot" <oliver.sang@intel.com>,
Raghavendra K T <raghavendra.kt@amd.com>,
Sapkal Swapnil <Swapnil.Sapkal@amd.com>,
K Prateek Nayak <kprateek.nayak@amd.com>
Subject: Re: [RFC PATCH V1 5/6] sched/numa: Allow recently accessed VMAs to be scanned
Date: Sun, 10 Sep 2023 23:29:28 +0800
Message-ID: <202309102311.84b42068-oliver.sang@intel.com>
In-Reply-To: <109ca1ea59b9dd6f2daf7b7fbc74e83ae074fbdf.1693287931.git.raghavendra.kt@amd.com>
Hello,
kernel test robot noticed a -33.6% improvement of autonuma-benchmark.numa02.seconds on:
commit: af46f3c9ca2d16485912f8b9c896ef48bbfe1388 ("[RFC PATCH V1 5/6] sched/numa: Allow recently accessed VMAs to be scanned")
url: https://github.com/intel-lab-lkp/linux/commits/Raghavendra-K-T/sched-numa-Move-up-the-access-pid-reset-logic/20230829-141007
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 2f88c8e802c8b128a155976631f4eb2ce4f3c805
patch link: https://lore.kernel.org/all/109ca1ea59b9dd6f2daf7b7fbc74e83ae074fbdf.1693287931.git.raghavendra.kt@amd.com/
patch subject: [RFC PATCH V1 5/6] sched/numa: Allow recently accessed VMAs to be scanned
testcase: autonuma-benchmark
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
parameters:
	iterations: 4x
	test: numa01_THREAD_ALLOC
	cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230910/202309102311.84b42068-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/iterations/kconfig/rootfs/tbox_group/test/testcase:
gcc-12/performance/4x/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp6/numa01_THREAD_ALLOC/autonuma-benchmark
commit:
167773d1dd ("sched/numa: Increase tasks' access history")
af46f3c9ca ("sched/numa: Allow recently accessed VMAs to be scanned")
167773d1ddb5ffdd af46f3c9ca2d16485912f8b9c89
---------------- ---------------------------
%stddev %change %stddev
\ | \
2.534e+10 ± 10% -13.0% 2.204e+10 ± 7% cpuidle..time
26431366 ± 10% -13.2% 22948978 ± 7% cpuidle..usage
0.15 ± 4% -0.0 0.12 ± 3% mpstat.cpu.all.soft%
2.92 ± 3% +0.4 3.32 ± 4% mpstat.cpu.all.sys%
2243 ± 2% -12.7% 1957 ± 3% uptime.boot
29811 ± 8% -11.1% 26507 ± 6% uptime.idle
5.32 ± 79% -64.2% 1.91 ± 60% perf-sched.sch_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
2.70 ± 18% +37.8% 3.72 ± 9% perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.do_select.core_sys_select.kern_select
0.64 ±137% +26644.2% 169.91 ±220% perf-sched.wait_time.avg.ms.__cond_resched.task_work_run.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
0.08 ± 20% +0.0 0.12 ± 10% perf-profile.children.cycles-pp.terminate_walk
0.10 ± 25% +0.0 0.14 ± 10% perf-profile.children.cycles-pp.wake_up_q
0.06 ± 50% +0.0 0.10 ± 10% perf-profile.children.cycles-pp.vfs_readlink
0.15 ± 36% +0.1 0.22 ± 13% perf-profile.children.cycles-pp.readlink
1.31 ± 19% +0.4 1.69 ± 12% perf-profile.children.cycles-pp.unmap_vmas
2.46 ± 19% +0.5 2.99 ± 4% perf-profile.children.cycles-pp.exit_mmap
311653 ± 10% -23.7% 237884 ± 9% turbostat.C1E
26018024 ± 10% -13.1% 22597563 ± 7% turbostat.C6
6.41 ± 9% -13.6% 5.54 ± 8% turbostat.CPU%c1
2.47 ± 11% +36.0% 3.36 ± 6% turbostat.CPU%c6
2.881e+08 ± 2% -12.8% 2.513e+08 ± 3% turbostat.IRQ
212.86 +2.8% 218.84 turbostat.RAMWatt
341.49 -4.1% 327.42 ± 2% autonuma-benchmark.numa01.seconds
186.67 ± 6% -27.1% 136.12 ± 7% autonuma-benchmark.numa01_THREAD_ALLOC.seconds
21.17 ± 7% -33.6% 14.05 autonuma-benchmark.numa02.seconds
2200 ± 2% -13.0% 1913 ± 3% autonuma-benchmark.time.elapsed_time
2200 ± 2% -13.0% 1913 ± 3% autonuma-benchmark.time.elapsed_time.max
1159380 ± 2% -12.0% 1019969 ± 3% autonuma-benchmark.time.involuntary_context_switches
3363550 -5.0% 3194802 autonuma-benchmark.time.minor_page_faults
243046 ± 2% -13.3% 210725 ± 3% autonuma-benchmark.time.user_time
7494239 -6.8% 6984234 proc-vmstat.numa_hit
118829 ± 6% +13.7% 135136 ± 6% proc-vmstat.numa_huge_pte_updates
6207618 -8.4% 5686795 ± 2% proc-vmstat.numa_local
8834573 ± 3% +20.2% 10616944 ± 4% proc-vmstat.numa_pages_migrated
61094857 ± 6% +13.6% 69409875 ± 6% proc-vmstat.numa_pte_updates
8602789 -9.0% 7827793 ± 2% proc-vmstat.pgfault
8834573 ± 3% +20.2% 10616944 ± 4% proc-vmstat.pgmigrate_success
371818 -10.1% 334391 ± 2% proc-vmstat.pgreuse
17200 ± 3% +20.3% 20686 ± 4% proc-vmstat.thp_migration_success
16401792 ± 2% -12.7% 14322816 ± 3% proc-vmstat.unevictable_pgs_scanned
1.606e+08 ± 2% -13.8% 1.385e+08 ± 3% sched_debug.cfs_rq:/.avg_vruntime.avg
1.666e+08 ± 2% -14.0% 1.433e+08 ± 3% sched_debug.cfs_rq:/.avg_vruntime.max
1.364e+08 ± 2% -11.7% 1.204e+08 ± 3% sched_debug.cfs_rq:/.avg_vruntime.min
4795327 ± 7% -17.5% 3956991 ± 7% sched_debug.cfs_rq:/.avg_vruntime.stddev
1.606e+08 ± 2% -13.8% 1.385e+08 ± 3% sched_debug.cfs_rq:/.min_vruntime.avg
1.666e+08 ± 2% -14.0% 1.433e+08 ± 3% sched_debug.cfs_rq:/.min_vruntime.max
1.364e+08 ± 2% -11.7% 1.204e+08 ± 3% sched_debug.cfs_rq:/.min_vruntime.min
4795327 ± 7% -17.5% 3956991 ± 7% sched_debug.cfs_rq:/.min_vruntime.stddev
364.96 ± 6% +16.6% 425.70 ± 5% sched_debug.cfs_rq:/.util_est_enqueued.avg
1099114 -13.0% 956021 ± 2% sched_debug.cpu.clock.avg
1099477 -13.0% 956344 ± 2% sched_debug.cpu.clock.max
1098702 -13.0% 955643 ± 2% sched_debug.cpu.clock.min
1080712 -13.0% 940415 ± 2% sched_debug.cpu.clock_task.avg
1085309 -13.1% 943557 ± 2% sched_debug.cpu.clock_task.max
1064613 -13.0% 925993 ± 2% sched_debug.cpu.clock_task.min
28890 ± 3% -11.7% 25504 ± 3% sched_debug.cpu.curr->pid.avg
35200 -11.0% 31344 sched_debug.cpu.curr->pid.max
862245 ± 3% -8.7% 786984 sched_debug.cpu.max_idle_balance_cost.max
74019 ± 9% -28.2% 53158 ± 7% sched_debug.cpu.max_idle_balance_cost.stddev
15507 -11.9% 13667 ± 2% sched_debug.cpu.nr_switches.avg
57616 ± 6% -19.0% 46642 ± 8% sched_debug.cpu.nr_switches.max
8460 ± 6% -12.9% 7368 ± 5% sched_debug.cpu.nr_switches.stddev
1098689 -13.0% 955631 ± 2% sched_debug.cpu_clk
1097964 -13.0% 954907 ± 2% sched_debug.ktime
0.00 +15.0% 0.00 ± 2% sched_debug.rt_rq:.rt_nr_migratory.avg
0.03 +15.0% 0.03 ± 2% sched_debug.rt_rq:.rt_nr_migratory.max
0.00 +15.0% 0.00 ± 2% sched_debug.rt_rq:.rt_nr_migratory.stddev
0.00 +15.0% 0.00 ± 2% sched_debug.rt_rq:.rt_nr_running.avg
0.03 +15.0% 0.03 ± 2% sched_debug.rt_rq:.rt_nr_running.max
0.00 +15.0% 0.00 ± 2% sched_debug.rt_rq:.rt_nr_running.stddev
1099511 -13.0% 956501 ± 2% sched_debug.sched_clk
1162 ± 2% +15.2% 1339 ± 3% perf-stat.i.MPKI
1.656e+08 +3.6% 1.716e+08 perf-stat.i.branch-instructions
0.95 ± 4% +0.1 1.03 perf-stat.i.branch-miss-rate%
1538367 ± 6% +11.0% 1707146 ± 2% perf-stat.i.branch-misses
6.327e+08 ± 3% +18.7% 7.513e+08 ± 4% perf-stat.i.cache-misses
8.282e+08 ± 2% +15.2% 9.542e+08 ± 3% perf-stat.i.cache-references
658.12 ± 3% -11.4% 582.98 ± 6% perf-stat.i.cycles-between-cache-misses
2.201e+08 +2.8% 2.263e+08 perf-stat.i.dTLB-loads
579771 +0.9% 584915 perf-stat.i.dTLB-store-misses
1.122e+08 +1.4% 1.138e+08 perf-stat.i.dTLB-stores
8.278e+08 +3.1% 8.538e+08 perf-stat.i.instructions
13.98 ± 2% +14.3% 15.98 ± 3% perf-stat.i.metric.M/sec
3797 +4.3% 3958 perf-stat.i.minor-faults
258749 +8.0% 279391 ± 2% perf-stat.i.node-load-misses
261169 ± 2% +7.4% 280417 ± 5% perf-stat.i.node-loads
40.91 ± 3% -3.0 37.89 ± 3% perf-stat.i.node-store-miss-rate%
3.841e+08 ± 6% +27.6% 4.902e+08 ± 7% perf-stat.i.node-stores
3797 +4.3% 3958 perf-stat.i.page-faults
998.24 ± 2% +11.8% 1116 ± 2% perf-stat.overall.MPKI
463.91 -3.2% 448.99 perf-stat.overall.cpi
604.23 ± 3% -15.9% 508.08 ± 4% perf-stat.overall.cycles-between-cache-misses
0.00 +3.3% 0.00 perf-stat.overall.ipc
39.20 ± 5% -4.5 34.70 ± 6% perf-stat.overall.node-store-miss-rate%
1.636e+08 +3.8% 1.698e+08 perf-stat.ps.branch-instructions
1499760 ± 6% +11.1% 1665855 ± 2% perf-stat.ps.branch-misses
6.296e+08 ± 3% +19.0% 7.489e+08 ± 4% perf-stat.ps.cache-misses
8.178e+08 ± 2% +15.5% 9.447e+08 ± 3% perf-stat.ps.cache-references
2.18e+08 +2.9% 2.244e+08 perf-stat.ps.dTLB-loads
578148 +0.9% 583328 perf-stat.ps.dTLB-store-misses
1.117e+08 +1.4% 1.132e+08 perf-stat.ps.dTLB-stores
8.192e+08 +3.3% 8.46e+08 perf-stat.ps.instructions
3744 +4.3% 3906 perf-stat.ps.minor-faults
255974 +8.2% 276924 ± 2% perf-stat.ps.node-load-misses
263796 ± 2% +7.7% 284110 ± 5% perf-stat.ps.node-loads
3.82e+08 ± 6% +27.7% 4.879e+08 ± 7% perf-stat.ps.node-stores
3744 +4.3% 3906 perf-stat.ps.page-faults
1.805e+12 ± 2% -10.1% 1.622e+12 ± 2% perf-stat.total.instructions
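A minimal sketch of how the %change column above can be read, assuming it is the relative difference between the two per-commit means, and that rows showing a bare number (such as mpstat.cpu.all.sys%) report an absolute difference in percentage points. The helper names pct_change and point_change are hypothetical and not part of lkp-tests; the input values are copied from the table. Because the table prints rounded means, recomputed figures may differ slightly in the last digit for other rows.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from the base commit to the patched commit, in percent."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Absolute difference, used for metrics that are already percentages."""
    return new - base


# (base mean, patched mean) taken from the autonuma-benchmark rows above.
seconds = {
    "autonuma-benchmark.numa01.seconds": (341.49, 327.42),
    "autonuma-benchmark.numa01_THREAD_ALLOC.seconds": (186.67, 136.12),
    "autonuma-benchmark.numa02.seconds": (21.17, 14.05),
}

for name, (base, new) in seconds.items():
    # A negative change in a "seconds" metric means the run got faster.
    print(f"{pct_change(base, new):+6.1f}%  {name}")

# mpstat.cpu.all.sys% is itself a percentage, so the table shows points.
print(f"{point_change(2.92, 3.32):+6.1f}   mpstat.cpu.all.sys%")
```

Under these assumptions the recomputed values match the reported -4.1%, -27.1%, -33.6%, and +0.4 entries.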
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Thread overview: 25+ messages
2023-08-29 6:06 [RFC PATCH V1 0/6] sched/numa: Enhance disjoint VMA scanning Raghavendra K T
2023-08-29 6:06 ` [RFC PATCH V1 1/6] sched/numa: Move up the access pid reset logic Raghavendra K T
2023-08-29 6:06 ` [RFC PATCH V1 2/6] sched/numa: Add disjoint vma unconditional scan logic Raghavendra K T
2023-09-12 7:50 ` kernel test robot
2023-09-13 6:21 ` Raghavendra K T
2023-08-29 6:06 ` [RFC PATCH V1 3/6] sched/numa: Remove unconditional scan logic using mm numa_scan_seq Raghavendra K T
2023-08-29 6:06 ` [RFC PATCH V1 4/6] sched/numa: Increase tasks' access history Raghavendra K T
2023-09-12 14:24 ` kernel test robot
2023-09-13 6:15 ` Raghavendra K T
2023-09-13 7:34 ` Oliver Sang
2023-08-29 6:06 ` [RFC PATCH V1 5/6] sched/numa: Allow recently accessed VMAs to be scanned Raghavendra K T
2023-09-10 15:29 ` kernel test robot [this message]
2023-09-11 11:25 ` Raghavendra K T
2023-09-12 2:22 ` Oliver Sang
2023-09-12 6:43 ` Raghavendra K T
2023-08-29 6:06 ` [RFC PATCH V1 6/6] sched/numa: Allow scanning of shared VMAs Raghavendra K T
2023-09-13 5:28 ` [RFC PATCH V1 0/6] sched/numa: Enhance disjoint VMA scanning Swapnil Sapkal
2023-09-13 6:24 ` Raghavendra K T
2023-09-19 6:30 ` Raghavendra K T
2023-09-19 7:15 ` Ingo Molnar
2023-09-19 8:06 ` Raghavendra K T
2023-09-19 9:28 ` Peter Zijlstra
2023-09-19 16:22 ` Mel Gorman
2023-09-19 19:11 ` Peter Zijlstra
2023-09-20 10:42 ` Raghavendra K T