From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: kernel test robot <oliver.sang@intel.com>
Cc: Jay Patel <jaypatel@linux.ibm.com>,
oe-lkp@lists.linux.dev, lkp@intel.com, linux-mm@kvack.org,
ying.huang@intel.com, feng.tang@intel.com,
fengwei.yin@intel.com, cl@linux.com, penberg@kernel.org,
rientjes@google.com, iamjoonsoo.kim@lge.com,
akpm@linux-foundation.org, vbabka@suse.cz,
aneesh.kumar@linux.ibm.com, tsahu@linux.ibm.com,
piyushs@linux.ibm.com
Subject: Re: [PATCH] [RFC PATCH v2]mm/slub: Optimize slub memory usage
Date: Tue, 18 Jul 2023 15:43:16 +0900 [thread overview]
Message-ID: <CAB=+i9QY99=NzQugoMCdbEwkCKJObxx4DwWXwNjMqyMRYrgOHA@mail.gmail.com> (raw)
In-Reply-To: <202307172140.3b34825a-oliver.sang@intel.com>
On Mon, Jul 17, 2023 at 10:41 PM kernel test robot
<oliver.sang@intel.com> wrote:
>
>
>
> Hello,
>
> kernel test robot noticed a -12.5% regression of hackbench.throughput on:
>
>
> commit: a0fd217e6d6fbd23e91f8796787b621e7d576088 ("[PATCH] [RFC PATCH v2]mm/slub: Optimize slub memory usage")
> url: https://github.com/intel-lab-lkp/linux/commits/Jay-Patel/mm-slub-Optimize-slub-memory-usage/20230628-180050
> base: git://git.kernel.org/cgit/linux/kernel/git/vbabka/slab.git for-next
> patch link: https://lore.kernel.org/all/20230628095740.589893-1-jaypatel@linux.ibm.com/
> patch subject: [PATCH] [RFC PATCH v2]mm/slub: Optimize slub memory usage
>
> testcase: hackbench
> test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
> parameters:
>
> nr_threads: 100%
> iterations: 4
> mode: process
> ipc: socket
> cpufreq_governor: performance
>
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202307172140.3b34825a-oliver.sang@intel.com
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> To reproduce:
>
> git clone https://github.com/intel/lkp-tests.git
> cd lkp-tests
> sudo bin/lkp install job.yaml # job file is attached in this email
> bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
> sudo bin/lkp run generated-yaml-file
>
> # if come across any failure that blocks the test,
> # please remove ~/.lkp and /lkp dir to run from a clean state.
>
> =========================================================================================
> compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
> gcc-12/performance/socket/4/x86_64-rhel-8.3/process/100%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp2/hackbench
>
> commit:
> 7bc162d5cc ("Merge branches 'slab/for-6.5/prandom', 'slab/for-6.5/slab_no_merge' and 'slab/for-6.5/slab-deprecate' into slab/for-next")
> a0fd217e6d ("mm/slub: Optimize slub memory usage")
>
> 7bc162d5cc4de5c3 a0fd217e6d6fbd23e91f8796787
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 222503 ± 86% +108.7% 464342 ± 58% numa-meminfo.node1.Active
> 222459 ± 86% +108.7% 464294 ± 58% numa-meminfo.node1.Active(anon)
> 55573 ± 85% +108.0% 115619 ± 58% numa-vmstat.node1.nr_active_anon
> 55573 ± 85% +108.0% 115618 ± 58% numa-vmstat.node1.nr_zone_active_anon
I'm quite baffled while reading this.
How did changing the slab order calculation double the number of active anon pages?
I doubt the two experiments were performed with the same settings.
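
For context, below is a minimal standalone sketch of the kind of order-selection
heuristic in question: pick the smallest page order whose wasted tail is a small
enough fraction of the slab. It assumes 4K pages, and the helper names
(order_for_bytes, sketch_slab_order) are hypothetical; this is only an
illustration of the shape of mm/slub.c's search, not the exact kernel code. The
point is that raising the min_objects target or the allowed order gives larger
slabs (more pages per slab), but it does not touch anon page accounting, which is
why the Active(anon) jump above looks like a setup difference rather than an
effect of the patch.

#include <stdio.h>

#define PAGE_SIZE 4096u

/* hypothetical helper: smallest order whose slab covers 'bytes' */
static unsigned int order_for_bytes(unsigned int bytes)
{
	unsigned int order = 0;

	while ((PAGE_SIZE << order) < bytes)
		order++;
	return order;
}

/*
 * hypothetical sketch of a slab order search: accept the first order
 * whose wasted tail (slab size modulo object size) is no more than
 * 1/fract_leftover of the slab.
 */
static unsigned int sketch_slab_order(unsigned int size, unsigned int min_objects,
				      unsigned int max_order, unsigned int fract_leftover)
{
	unsigned int order;

	for (order = order_for_bytes(min_objects * size); order <= max_order; order++) {
		unsigned int slab_bytes = PAGE_SIZE << order;
		unsigned int rem = slab_bytes % size;

		if (rem <= slab_bytes / fract_leftover)
			return order;
	}
	return max_order;
}

int main(void)
{
	/* 256-byte objects: a larger min_objects target pushes the order up */
	printf("order(min_objects=8)  = %u\n", sketch_slab_order(256, 8, 3, 16));
	printf("order(min_objects=64) = %u\n", sketch_slab_order(256, 64, 3, 16));
	return 0;
}

In this sketch, bumping min_objects from 8 to 64 for 256-byte objects moves the
order from 0 to 2, i.e. 4x the pages per slab, which affects slab memory usage
and allocation contention but not anon pages.
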
> 1377834 ± 2% -10.7% 1230013 sched_debug.cpu.nr_switches.avg
> 1218144 ± 2% -13.3% 1055659 ± 2% sched_debug.cpu.nr_switches.min
> 3047631 ± 2% -13.2% 2646560 vmstat.system.cs
> 561797 -13.8% 484137 vmstat.system.in
> 280976 ± 66% +122.6% 625459 ± 52% meminfo.Active
> 280881 ± 66% +122.6% 625365 ± 52% meminfo.Active(anon)
> 743351 ± 4% -9.7% 671534 ± 6% meminfo.AnonPages
> 1.36 -0.1 1.21 mpstat.cpu.all.irq%
> 0.04 ± 4% -0.0 0.03 ± 4% mpstat.cpu.all.soft%
> 5.38 -0.8 4.58 mpstat.cpu.all.usr%
> 0.26 -11.9% 0.23 turbostat.IPC
> 160.93 -19.3 141.61 turbostat.PKG_%
> 60.48 -8.9% 55.10 turbostat.RAMWatt
> 70049 ± 68% +124.5% 157279 ± 52% proc-vmstat.nr_active_anon
> 185963 ± 4% -9.8% 167802 ± 6% proc-vmstat.nr_anon_pages
> 37302 -1.2% 36837 proc-vmstat.nr_slab_reclaimable
> 70049 ± 68% +124.5% 157279 ± 52% proc-vmstat.nr_zone_active_anon
> 1101451 +12.0% 1233638 proc-vmstat.unevictable_pgs_scanned
> 477310 -12.5% 417480 hackbench.throughput
> 464064 -12.0% 408333 hackbench.throughput_avg
> 477310 -12.5% 417480 hackbench.throughput_best
> 435294 -9.5% 394098 hackbench.throughput_worst
> 131.28 +13.4% 148.89 hackbench.time.elapsed_time
> 131.28 +13.4% 148.89 hackbench.time.elapsed_time.max
> 90404617 -5.2% 85662614 ± 2% hackbench.time.involuntary_context_switches
> 15342 +15.0% 17642 hackbench.time.system_time
> 866.32 -3.2% 838.32 hackbench.time.user_time
> 4.581e+10 -11.2% 4.069e+10 perf-stat.i.branch-instructions
> 0.45 +0.1 0.56 perf-stat.i.branch-miss-rate%
> 2.024e+08 +11.8% 2.263e+08 perf-stat.i.branch-misses
> 21.49 -1.1 20.42 perf-stat.i.cache-miss-rate%
> 4.202e+08 -16.6% 3.505e+08 perf-stat.i.cache-misses
> 1.935e+09 -11.5% 1.711e+09 perf-stat.i.cache-references
> 3115707 ± 2% -13.9% 2681887 perf-stat.i.context-switches
> 1.31 +13.2% 1.48 perf-stat.i.cpi
> 375155 ± 3% -16.3% 314001 ± 2% perf-stat.i.cpu-migrations
> 6.727e+10 -11.2% 5.972e+10 perf-stat.i.dTLB-loads
> 4.169e+10 -12.2% 3.661e+10 perf-stat.i.dTLB-stores
> 2.465e+11 -11.4% 2.185e+11 perf-stat.i.instructions
> 0.77 -11.8% 0.68 perf-stat.i.ipc
> 818.18 ± 5% +61.8% 1323 ± 2% perf-stat.i.metric.K/sec
> 1225 -11.6% 1083 perf-stat.i.metric.M/sec
> 11341 ± 4% -12.6% 9916 ± 4% perf-stat.i.minor-faults
> 1.27e+08 -13.2% 1.102e+08 perf-stat.i.node-load-misses
> 3376198 -15.4% 2855906 perf-stat.i.node-loads
> 72756698 -22.9% 56082330 perf-stat.i.node-store-misses
> 4118986 ± 2% -19.3% 3322276 perf-stat.i.node-stores
> 11432 ± 3% -12.6% 9991 ± 4% perf-stat.i.page-faults
> 0.44 +0.1 0.56 perf-stat.overall.branch-miss-rate%
> 21.76 -1.3 20.49 perf-stat.overall.cache-miss-rate%
> 1.29 +13.5% 1.47 perf-stat.overall.cpi
> 755.39 +21.1% 914.82 perf-stat.overall.cycles-between-cache-misses
> 0.77 -11.9% 0.68 perf-stat.overall.ipc
> 4.546e+10 -11.0% 4.046e+10 perf-stat.ps.branch-instructions
> 2.006e+08 +12.0% 2.246e+08 perf-stat.ps.branch-misses
> 4.183e+08 -16.8% 3.48e+08 perf-stat.ps.cache-misses
> 1.923e+09 -11.7% 1.699e+09 perf-stat.ps.cache-references
> 3073921 ± 2% -13.9% 2647497 perf-stat.ps.context-switches
> 367849 ± 3% -16.1% 308496 ± 2% perf-stat.ps.cpu-migrations
> 6.683e+10 -11.2% 5.938e+10 perf-stat.ps.dTLB-loads
> 4.144e+10 -12.2% 3.639e+10 perf-stat.ps.dTLB-stores
> 2.447e+11 -11.2% 2.172e+11 perf-stat.ps.instructions
> 10654 ± 4% -11.5% 9428 ± 4% perf-stat.ps.minor-faults
> 1.266e+08 -13.5% 1.095e+08 perf-stat.ps.node-load-misses
> 3361116 -15.6% 2836863 perf-stat.ps.node-loads
> 72294146 -23.1% 55573600 perf-stat.ps.node-store-misses
> 4043240 ± 2% -19.4% 3258771 perf-stat.ps.node-stores
> 10734 ± 4% -11.6% 9494 ± 4% perf-stat.ps.page-faults
<...>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>
>