From: kernel test robot <oliver.sang@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
Yury Norov <yury.norov@gmail.com>, Jan Kara <jack@suse.cz>,
<linux-kernel@vger.kernel.org>, <ying.huang@intel.com>,
<feng.tang@intel.com>, <fengwei.yin@intel.com>,
Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
Rasmus Villemoes <linux@rasmusvillemoes.dk>,
Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>,
Matthew Wilcox <willy@infradead.org>,
<linux-fsdevel@vger.kernel.org>, <oliver.sang@intel.com>
Subject: Re: [PATCH 1/2] lib/find: Make functions safe on changing bitmaps
Date: Wed, 25 Oct 2023 15:18:18 +0800
Message-ID: <202310251458.48b4452d-oliver.sang@intel.com>
In-Reply-To: <20231011150252.32737-1-jack@suse.cz>
Hello,
kernel test robot noticed a 3.7% improvement of will-it-scale.per_thread_ops on:
commit: df671b17195cd6526e029c70d04dfb72561082d7 ("[PATCH 1/2] lib/find: Make functions safe on changing bitmaps")
url: https://github.com/intel-lab-lkp/linux/commits/Jan-Kara/lib-find-Make-functions-safe-on-changing-bitmaps/20231011-230553
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 1c8b86a3799f7e5be903c3f49fcdaee29fd385b5
patch link: https://lore.kernel.org/all/20231011150252.32737-1-jack@suse.cz/
patch subject: [PATCH 1/2] lib/find: Make functions safe on changing bitmaps
testcase: will-it-scale
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

	nr_task: 50%
	mode: thread
	test: tlb_flush3
	cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231025/202310251458.48b4452d-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-8.3/thread/50%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/tlb_flush3/will-it-scale
commit:
1c8b86a379 ("Merge tag 'xsa441-6.6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip")
df671b1719 ("lib/find: Make functions safe on changing bitmaps")
1c8b86a3799f7e5b df671b17195cd6526e029c70d04
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
0.14 ± 19% +36.9% 0.19 ± 17% perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__vm_munmap
2.26e+08 +3.6% 2.343e+08 proc-vmstat.pgfault
0.04 +25.0% 0.05 turbostat.IPC
32666 -15.5% 27605 ± 2% turbostat.POLL
7856 +2.2% 8025 vmstat.system.cs
6331931 +2.3% 6478704 vmstat.system.in
700119 +3.7% 725931 will-it-scale.52.threads
13463 +3.7% 13959 will-it-scale.per_thread_ops
700119 +3.7% 725931 will-it-scale.workload
8.36 -7.3% 7.74 perf-stat.i.MPKI
4.591e+09 +3.4% 4.747e+09 perf-stat.i.branch-instructions
1.832e+08 +2.8% 1.883e+08 perf-stat.i.branch-misses
26.70 -0.3 26.40 perf-stat.i.cache-miss-rate%
7852 +2.2% 8021 perf-stat.i.context-switches
6.43 -7.2% 5.97 perf-stat.i.cpi
769.61 +1.8% 783.29 perf-stat.i.cpu-migrations
6.39e+09 +3.4% 6.606e+09 perf-stat.i.dTLB-loads
2.94e+09 +3.2% 3.035e+09 perf-stat.i.dTLB-stores
78.29 -0.9 77.44 perf-stat.i.iTLB-load-miss-rate%
18959450 +3.5% 19621273 perf-stat.i.iTLB-load-misses
5254435 +8.7% 5713444 perf-stat.i.iTLB-loads
2.236e+10 +7.7% 2.408e+10 perf-stat.i.instructions
1181 +4.0% 1228 perf-stat.i.instructions-per-iTLB-miss
0.16 +7.7% 0.17 perf-stat.i.ipc
0.02 ± 36% -49.6% 0.01 ± 53% perf-stat.i.major-faults
485.08 +3.0% 499.67 perf-stat.i.metric.K/sec
141.71 +3.2% 146.25 perf-stat.i.metric.M/sec
747997 +3.7% 775416 perf-stat.i.minor-faults
3127957 -13.9% 2693728 perf-stat.i.node-loads
26089697 +3.4% 26965335 perf-stat.i.node-store-misses
767569 +3.7% 796095 perf-stat.i.node-stores
747997 +3.7% 775416 perf-stat.i.page-faults
8.35 -7.3% 7.74 perf-stat.overall.MPKI
26.70 -0.3 26.40 perf-stat.overall.cache-miss-rate%
6.43 -7.1% 5.97 perf-stat.overall.cpi
78.30 -0.9 77.45 perf-stat.overall.iTLB-load-miss-rate%
1179 +4.0% 1226 perf-stat.overall.instructions-per-iTLB-miss
0.16 +7.7% 0.17 perf-stat.overall.ipc
9644584 +3.8% 10011125 perf-stat.overall.path-length
4.575e+09 +3.4% 4.731e+09 perf-stat.ps.branch-instructions
1.825e+08 +2.8% 1.876e+08 perf-stat.ps.branch-misses
7825 +2.2% 7995 perf-stat.ps.context-switches
767.16 +1.8% 780.76 perf-stat.ps.cpu-migrations
6.368e+09 +3.4% 6.583e+09 perf-stat.ps.dTLB-loads
2.93e+09 +3.2% 3.025e+09 perf-stat.ps.dTLB-stores
18896725 +3.5% 19555325 perf-stat.ps.iTLB-load-misses
5236456 +8.7% 5693636 perf-stat.ps.iTLB-loads
2.229e+10 +7.6% 2.399e+10 perf-stat.ps.instructions
745423 +3.7% 772705 perf-stat.ps.minor-faults
3117663 -13.9% 2684861 perf-stat.ps.node-loads
26002765 +3.4% 26875267 perf-stat.ps.node-store-misses
764789 +3.7% 793098 perf-stat.ps.node-stores
745423 +3.7% 772705 perf-stat.ps.page-faults
6.752e+12 +7.6% 7.267e+12 perf-stat.total.instructions
19.21 -1.0 18.18 perf-profile.calltrace.cycles-pp.llist_add_batch.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu
17.00 -0.9 16.09 perf-profile.calltrace.cycles-pp.llist_add_batch.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.zap_pte_range
65.30 -0.6 64.69 perf-profile.calltrace.cycles-pp.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise.do_syscall_64
65.34 -0.6 64.75 perf-profile.calltrace.cycles-pp.madvise_vma_behavior.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe
65.98 -0.5 65.45 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
65.96 -0.5 65.42 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
9.72 ± 2% -0.5 9.20 perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.llist_add_batch.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range
66.33 -0.5 65.81 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
66.46 -0.5 65.95 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
31.88 -0.4 31.43 perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single
67.72 -0.4 67.28 perf-profile.calltrace.cycles-pp.__madvise
32.15 -0.4 31.73 perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior
32.60 -0.4 32.21 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior.do_madvise
32.93 -0.3 32.58 perf-profile.calltrace.cycles-pp.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise
31.07 -0.3 30.74 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.zap_pte_range.zap_pmd_range.unmap_page_range.zap_page_range_single
31.58 -0.3 31.28 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.zap_page_range_single.madvise_vma_behavior
31.61 -0.3 31.30 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.zap_page_range_single.madvise_vma_behavior.do_madvise
31.80 -0.3 31.51 perf-profile.calltrace.cycles-pp.unmap_page_range.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise
8.34 -0.1 8.22 perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.llist_add_batch.smp_call_function_many_cond.on_each_cpu_cond_mask
8.06 -0.1 7.95 perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.llist_add_batch.smp_call_function_many_cond
7.98 -0.1 7.87 perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.llist_add_batch
0.59 ± 3% +0.1 0.65 ± 2% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.testcase
1.46 +0.1 1.53 perf-profile.calltrace.cycles-pp.filemap_map_pages.do_read_fault.do_fault.__handle_mm_fault.handle_mm_fault
1.48 +0.1 1.55 perf-profile.calltrace.cycles-pp.do_read_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
1.53 +0.1 1.62 perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
2.92 +0.1 3.02 perf-profile.calltrace.cycles-pp.flush_tlb_func.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function
1.26 ± 2% +0.1 1.36 perf-profile.calltrace.cycles-pp.default_send_IPI_mask_sequence_phys.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.zap_pte_range
1.84 +0.1 1.96 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
7.87 +0.1 8.00 perf-profile.calltrace.cycles-pp.llist_reverse_order.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function
2.03 ± 2% +0.1 2.17 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
2.90 +0.2 3.06 perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.zap_pte_range
2.62 ± 3% +0.2 2.80 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
2.58 ± 3% +0.2 2.76 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
2.95 ± 3% +0.2 3.14 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
2.75 +0.2 2.94 perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu
4.96 +0.3 5.29 perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.smp_call_function_many_cond.on_each_cpu_cond_mask
4.92 +0.3 5.25 perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.smp_call_function_many_cond
5.13 +0.3 5.46 perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range
5.08 +0.4 5.44 perf-profile.calltrace.cycles-pp.testcase
37.25 -2.0 35.24 perf-profile.children.cycles-pp.llist_add_batch
62.82 -0.8 62.04 perf-profile.children.cycles-pp.on_each_cpu_cond_mask
62.82 -0.8 62.04 perf-profile.children.cycles-pp.smp_call_function_many_cond
63.70 -0.7 62.98 perf-profile.children.cycles-pp.flush_tlb_mm_range
65.30 -0.6 64.70 perf-profile.children.cycles-pp.zap_page_range_single
65.34 -0.6 64.75 perf-profile.children.cycles-pp.madvise_vma_behavior
65.98 -0.5 65.45 perf-profile.children.cycles-pp.__x64_sys_madvise
65.96 -0.5 65.43 perf-profile.children.cycles-pp.do_madvise
66.52 -0.5 66.01 perf-profile.children.cycles-pp.do_syscall_64
66.65 -0.5 66.16 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
67.79 -0.4 67.36 perf-profile.children.cycles-pp.__madvise
32.94 -0.3 32.60 perf-profile.children.cycles-pp.tlb_finish_mmu
31.74 -0.3 31.43 perf-profile.children.cycles-pp.zap_pte_range
31.76 -0.3 31.46 perf-profile.children.cycles-pp.zap_pmd_range
31.95 -0.3 31.66 perf-profile.children.cycles-pp.unmap_page_range
0.42 ± 2% +0.0 0.46 perf-profile.children.cycles-pp.error_entry
0.20 ± 3% +0.0 0.24 ± 5% perf-profile.children.cycles-pp.up_read
0.69 +0.0 0.74 perf-profile.children.cycles-pp.native_flush_tlb_local
1.47 +0.1 1.55 perf-profile.children.cycles-pp.filemap_map_pages
1.48 +0.1 1.56 perf-profile.children.cycles-pp.do_read_fault
1.54 +0.1 1.62 perf-profile.children.cycles-pp.do_fault
2.75 +0.1 2.86 perf-profile.children.cycles-pp.default_send_IPI_mask_sequence_phys
1.85 +0.1 1.98 perf-profile.children.cycles-pp.__handle_mm_fault
2.04 ± 2% +0.1 2.18 perf-profile.children.cycles-pp.handle_mm_fault
2.63 ± 3% +0.2 2.81 perf-profile.children.cycles-pp.exc_page_fault
2.62 ± 3% +0.2 2.80 perf-profile.children.cycles-pp.do_user_addr_fault
3.24 ± 3% +0.2 3.44 perf-profile.children.cycles-pp.asm_exc_page_fault
3.83 +0.2 4.04 perf-profile.children.cycles-pp.flush_tlb_func
0.69 ± 2% +0.2 0.92 perf-profile.children.cycles-pp._find_next_bit
9.92 +0.3 10.23 perf-profile.children.cycles-pp.llist_reverse_order
5.45 +0.4 5.81 perf-profile.children.cycles-pp.testcase
18.42 +0.5 18.96 perf-profile.children.cycles-pp.asm_sysvec_call_function
16.24 +0.5 16.78 perf-profile.children.cycles-pp.__flush_smp_call_function_queue
15.78 +0.5 16.32 perf-profile.children.cycles-pp.__sysvec_call_function
16.36 +0.5 16.90 perf-profile.children.cycles-pp.sysvec_call_function
27.92 -1.9 26.04 perf-profile.self.cycles-pp.llist_add_batch
0.16 ± 2% +0.0 0.18 ± 4% perf-profile.self.cycles-pp.up_read
0.42 ± 2% +0.0 0.45 perf-profile.self.cycles-pp.error_entry
0.21 ± 4% +0.0 0.24 ± 5% perf-profile.self.cycles-pp.down_read
0.26 ± 2% +0.0 0.29 ± 3% perf-profile.self.cycles-pp.tlb_finish_mmu
2.01 +0.0 2.05 perf-profile.self.cycles-pp.default_send_IPI_mask_sequence_phys
0.68 +0.0 0.73 perf-profile.self.cycles-pp.native_flush_tlb_local
3.10 +0.2 3.26 perf-profile.self.cycles-pp.flush_tlb_func
0.50 ± 2% +0.2 0.68 perf-profile.self.cycles-pp._find_next_bit
9.92 +0.3 10.22 perf-profile.self.cycles-pp.llist_reverse_order
16.10 +0.5 16.64 perf-profile.self.cycles-pp.smp_call_function_many_cond
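For readers cross-checking the table above, here is a minimal sketch of how the columns appear to relate. It rests on assumptions inferred from the numbers in this report, not on the lkp-tests source: %change is taken to be the relative difference between the two commits for plain counters, an absolute difference for metrics that are already percentages (cache-miss-rate%, perf-profile.*), ipc is taken as the reciprocal of cpi, and path-length as total instructions divided by the workload. The helper names are hypothetical.

```python
# Sketch only: helper names are hypothetical and the formulas are
# assumptions inferred from the figures in the report above.

def pct_change(base: float, patched: float) -> float:
    """Relative change in percent, as shown for plain counters."""
    return (patched - base) / base * 100.0


def point_change(base: float, patched: float) -> float:
    """Absolute difference, as shown for metrics that are already percentages."""
    return patched - base


def ipc_from_cpi(cpi: float) -> float:
    """Instructions per cycle is the reciprocal of cycles per instruction."""
    return 1.0 / cpi


def path_length(total_instructions: float, workload: float) -> float:
    """Instructions retired per unit of benchmark work (assumed definition)."""
    return total_instructions / workload


if __name__ == "__main__":
    # Values copied from the table: (base commit, patched commit).
    print(f"per_thread_ops  {pct_change(13463, 13959):+.1f}%")
    print(f"turbostat.POLL  {pct_change(32666, 27605):+.1f}%")
    print(f"cache-miss-rate {point_change(26.70, 26.40):+.1f}")
    print(f"ipc             {ipc_from_cpi(6.43):.2f} -> {ipc_from_cpi(5.97):.2f}")
    print(f"path-length     {path_length(6.752e12, 700119):.0f}")
```

The path-length result only approximates the reported 9644584 because the instruction total in the table is rounded to four significant digits. Read this way, the patched kernel retires about 3.8% more instructions per operation yet still completes 3.7% more operations, because cpi drops by about 7%.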
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki