oe-lkp.lists.linux.dev archive mirror
From: kernel test robot <oliver.sang@intel.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<linux-kernel@vger.kernel.org>,
	Christian Brauner <brauner@kernel.org>,
	Josef Bacik <josef@toxicpanda.com>,
	Christoph Hellwig <hch@lst.de>, Jan Kara <jack@suse.cz>,
	<linux-fsdevel@vger.kernel.org>, <ying.huang@intel.com>,
	<feng.tang@intel.com>, <fengwei.yin@intel.com>,
	<oliver.sang@intel.com>
Subject: [linus:master] [remap_range]  dfad37051a: stress-ng.file-ioctl.ops_per_sec -11.2% regression
Date: Wed, 31 Jan 2024 22:13:16 +0800
Message-ID: <202401312229.eddeb9a6-oliver.sang@intel.com>



Hello,

kernel test robot noticed a -11.2% regression of stress-ng.file-ioctl.ops_per_sec on:


commit: dfad37051ade6ac0d404ef4913f3bd01954ee51c ("remap_range: move permission hooks out of do_clone_file_range()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_threads: 10%
	disk: 1HDD
	testtime: 60s
	fs: btrfs
	test: file-ioctl
	cpufreq_governor: performance
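
As a rough illustration of how the parameters above could translate into a
stress-ng invocation, here is a minimal sketch. The helper names are
hypothetical, and the rule that "nr_threads: 10%" means 10% of the machine's
64 hardware threads, rounded down, is an assumption about the test harness,
not something stated in this report. The disk, fs and cpufreq_governor
settings are left out because they are configured outside stress-ng.

```python
def workers_from_percent(nr_cpus, nr_threads):
    """Turn a percentage such as "10%" into a worker count.

    Assumption: the percentage is relative to the number of hardware
    threads and is rounded down, with at least one worker.
    """
    pct = int(nr_threads.rstrip("%"))
    return max(1, nr_cpus * pct // 100)


def stress_ng_command(nr_cpus, nr_threads, test, testtime):
    """Build an argument list for stress-ng (hypothetical helper).

    --timeout and --metrics-brief are regular stress-ng options; the
    stressor option is derived from the "test" parameter.
    """
    workers = workers_from_percent(nr_cpus, nr_threads)
    return [
        "stress-ng",
        f"--{test}", str(workers),
        "--timeout", testtime,
        "--metrics-brief",
    ]


if __name__ == "__main__":
    # Values taken from the parameter list of this report.
    print(" ".join(stress_ng_command(64, "10%", "file-ioctl", "60s")))
```

The list form can be passed directly to subprocess.run() on a machine where
stress-ng is installed and the working directory sits on the filesystem
under test.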




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202401312229.eddeb9a6-oliver.sang@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240131/202401312229.eddeb9a6-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/1HDD/btrfs/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp8/file-ioctl/stress-ng/60s

commit: 
  d53471ba6f ("splice: remove permission hook from iter_file_splice_write()")
  dfad37051a ("remap_range: move permission hooks out of do_clone_file_range()")

d53471ba6f7ae97a dfad37051ade6ac0d404ef4913f 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      2.57            -0.3        2.27        mpstat.cpu.all.usr%
      7.40            +3.4%       7.65        iostat.cpu.system
      2.50           -11.5%       2.22        iostat.cpu.user
  95739218           -11.2%   84990543 ±  2%  stress-ng.file-ioctl.ops
   1595650           -11.2%    1416506 ±  2%  stress-ng.file-ioctl.ops_per_sec
    267.41            +4.2%     278.66        stress-ng.time.system_time
     90.19           -12.5%      78.96        stress-ng.time.user_time
      0.12 ±  9%     +37.6%       0.16 ±  3%  perf-stat.i.MPKI
 5.619e+09            -4.9%  5.346e+09        perf-stat.i.branch-instructions
     25.26 ± 12%      +5.4       30.67 ±  2%  perf-stat.i.cache-miss-rate%
   3226271 ±  8%     +32.3%    4268159 ±  2%  perf-stat.i.cache-misses
  13880671 ±  2%      +7.6%   14934433        perf-stat.i.cache-references
      0.83            +3.9%       0.86        perf-stat.i.cpi
      7405 ±  8%     -26.1%       5473 ±  2%  perf-stat.i.cycles-between-cache-misses
 5.186e+09            -6.0%  4.873e+09        perf-stat.i.dTLB-stores
 2.807e+10            -3.9%  2.696e+10        perf-stat.i.instructions
      1.21            -3.7%       1.17        perf-stat.i.ipc
    257.16           +12.9%     290.46        perf-stat.i.metric.K/sec
    290.80            -4.2%     278.45        perf-stat.i.metric.M/sec
   1580051 ± 11%     +38.0%    2180479 ±  5%  perf-stat.i.node-load-misses
    228848 ± 22%    +116.2%     494834 ± 27%  perf-stat.i.node-loads
      0.11 ±  9%     +37.7%       0.16 ±  3%  perf-stat.overall.MPKI
     23.29 ± 11%      +5.3       28.58 ±  2%  perf-stat.overall.cache-miss-rate%
      0.82            +3.9%       0.86        perf-stat.overall.cpi
      7231 ±  8%     -25.1%       5416 ±  2%  perf-stat.overall.cycles-between-cache-misses
      1.21            -3.7%       1.17        perf-stat.overall.ipc
 5.524e+09            -4.8%  5.257e+09        perf-stat.ps.branch-instructions
   3170718 ±  8%     +32.4%    4196610 ±  2%  perf-stat.ps.cache-misses
  13646445 ±  2%      +7.6%   14686495 ±  2%  perf-stat.ps.cache-references
 5.099e+09            -6.0%  4.792e+09        perf-stat.ps.dTLB-stores
 2.759e+10            -3.9%  2.651e+10        perf-stat.ps.instructions
   1553350 ± 11%     +38.1%    2144498 ±  5%  perf-stat.ps.node-load-misses
    224907 ± 22%    +116.2%     486304 ± 27%  perf-stat.ps.node-loads
 1.668e+12            -3.4%  1.611e+12 ±  2%  perf-stat.total.instructions
      5.57 ±  3%      -0.7        4.85 ±  2%  perf-profile.calltrace.cycles-pp.__fget_light.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      0.89 ± 23%      -0.4        0.45 ± 44%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      2.30 ±  2%      -0.3        2.00        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
      1.69 ±  3%      -0.3        1.39 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
      1.99 ±  2%      -0.3        1.72        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.16 ±  3%      -0.2        1.00 ±  3%  perf-profile.calltrace.cycles-pp.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.60 ±  4%      -0.2        0.44 ± 45%  perf-profile.calltrace.cycles-pp.__fget_light.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +1.5        1.52 ±  2%  perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl
      0.00            +6.9        6.94 ±  6%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl
      0.00            +7.4        7.41 ±  6%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl
     21.11            +7.4       28.53        perf-profile.calltrace.cycles-pp.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      3.18 ±  2%      +8.7       11.87 ±  3%  perf-profile.calltrace.cycles-pp.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.46 ±  9%      +8.9       10.36 ±  4%  perf-profile.calltrace.cycles-pp.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64
     10.70            -1.3        9.39 ±  3%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
     11.31            -1.1       10.24 ±  2%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      7.87 ±  3%      -1.0        6.90        perf-profile.children.cycles-pp.__fget_light
      5.13            -0.7        4.46 ±  2%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.89            -0.4        0.46 ±  5%  perf-profile.children.cycles-pp.do_clone_file_range
      3.45 ±  2%      -0.4        3.10        perf-profile.children.cycles-pp.llseek
      1.80 ±  4%      -0.3        1.49 ±  3%  perf-profile.children.cycles-pp.stress_file_ioctl
      1.83            -0.2        1.63 ±  4%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      1.53 ±  3%      -0.2        1.34 ±  4%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      2.32 ±  3%      -0.2        2.13        perf-profile.children.cycles-pp.syscall_return_via_sysret
      1.58 ±  2%      -0.2        1.40        perf-profile.children.cycles-pp.memdup_user
      1.81            -0.2        1.62        perf-profile.children.cycles-pp.__get_user_4
      1.26 ±  3%      -0.2        1.08 ±  3%  perf-profile.children.cycles-pp.__x64_sys_fcntl
      1.32 ±  2%      -0.2        1.14 ±  2%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      2.06 ±  2%      -0.2        1.90 ±  3%  perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      1.12 ±  3%      -0.1        0.99 ±  2%  perf-profile.children.cycles-pp.security_file_ioctl
      0.84 ±  3%      -0.1        0.73 ±  3%  perf-profile.children.cycles-pp.ksys_lseek
      0.29 ±  4%      -0.1        0.18 ±  4%  perf-profile.children.cycles-pp.generic_file_rw_checks
      0.76 ±  3%      -0.1        0.68        perf-profile.children.cycles-pp.amd_clear_divider
      0.84 ±  3%      -0.1        0.75 ±  3%  perf-profile.children.cycles-pp.__put_user_4
      0.86 ±  4%      -0.1        0.78 ±  3%  perf-profile.children.cycles-pp._raw_spin_lock
      0.53 ±  3%      -0.1        0.46 ±  4%  perf-profile.children.cycles-pp.__fdget_pos
      0.19 ± 11%      -0.1        0.12 ± 10%  perf-profile.children.cycles-pp.stress_mwc8
      0.54 ±  5%      -0.1        0.48 ±  6%  perf-profile.children.cycles-pp.__check_object_size
      0.73 ±  2%      -0.1        0.67 ±  5%  perf-profile.children.cycles-pp.__fdget
      0.49 ±  2%      -0.1        0.43 ±  3%  perf-profile.children.cycles-pp.__kmalloc_node_track_caller
      0.51 ±  4%      -0.1        0.45 ±  5%  perf-profile.children.cycles-pp.ioctl@plt
      0.58 ±  3%      -0.0        0.54 ±  4%  perf-profile.children.cycles-pp.__get_user_2
      0.38 ±  3%      -0.0        0.33 ±  4%  perf-profile.children.cycles-pp.__kmem_cache_alloc_node
      0.44 ±  3%      -0.0        0.40 ±  3%  perf-profile.children.cycles-pp.__libc_fcntl64
      0.24 ±  6%      -0.0        0.20 ±  7%  perf-profile.children.cycles-pp.do_fcntl
      0.48 ±  3%      -0.0        0.44 ±  2%  perf-profile.children.cycles-pp.set_close_on_exec
      0.16 ±  8%      -0.0        0.14 ±  8%  perf-profile.children.cycles-pp.__check_heap_object
      0.00            +0.2        0.25 ±  4%  perf-profile.children.cycles-pp.fsnotify_perm
      0.57            +0.6        1.15 ±  3%  perf-profile.children.cycles-pp.aa_file_perm
     85.52            +1.4       86.91        perf-profile.children.cycles-pp.ioctl
      0.00            +1.6        1.55        perf-profile.children.cycles-pp.__fsnotify_parent
     62.60            +4.0       66.55        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     59.77            +4.3       64.05        perf-profile.children.cycles-pp.do_syscall_64
     47.98            +5.7       53.66        perf-profile.children.cycles-pp.__x64_sys_ioctl
     21.64            +7.3       28.98        perf-profile.children.cycles-pp.do_vfs_ioctl
      8.29 ±  4%      +7.4       15.74 ±  6%  perf-profile.children.cycles-pp.apparmor_file_permission
      8.78 ±  4%      +7.9       16.64 ±  5%  perf-profile.children.cycles-pp.security_file_permission
      3.30 ±  2%      +8.7       11.96 ±  3%  perf-profile.children.cycles-pp.ioctl_file_clone
      1.68            +8.9       10.55 ±  3%  perf-profile.children.cycles-pp.vfs_clone_file_range
     10.33            -1.3        9.02 ±  3%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
     11.15            -1.2        9.92 ±  2%  perf-profile.self.cycles-pp.ioctl
      7.55 ±  3%      -0.9        6.61        perf-profile.self.cycles-pp.__fget_light
      3.16 ±  4%      -0.5        2.69 ±  2%  perf-profile.self.cycles-pp.do_vfs_ioctl
      2.95 ±  2%      -0.4        2.55 ±  2%  perf-profile.self.cycles-pp.__x64_sys_ioctl
      3.32            -0.4        2.93 ±  2%  perf-profile.self.cycles-pp.do_syscall_64
      3.08 ±  2%      -0.4        2.72 ±  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      3.13            -0.4        2.78 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      2.39 ±  2%      -0.3        2.10 ±  2%  perf-profile.self.cycles-pp.ioctl_preallocate
      0.57 ±  2%      -0.3        0.31 ±  9%  perf-profile.self.cycles-pp.do_clone_file_range
      2.02 ±  2%      -0.3        1.77 ±  3%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      1.54 ±  4%      -0.2        1.29 ±  3%  perf-profile.self.cycles-pp.stress_file_ioctl
      1.83            -0.2        1.62 ±  4%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      2.32 ±  3%      -0.2        2.13        perf-profile.self.cycles-pp.syscall_return_via_sysret
      1.77            -0.2        1.58        perf-profile.self.cycles-pp.__get_user_4
      1.28 ±  2%      -0.2        1.11 ±  4%  perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      1.76 ±  2%      -0.1        1.62 ±  3%  perf-profile.self.cycles-pp.syscall_enter_from_user_mode
      0.25 ±  6%      -0.1        0.12 ±  8%  perf-profile.self.cycles-pp.generic_file_rw_checks
      0.48 ±  2%      -0.1        0.38 ±  4%  perf-profile.self.cycles-pp.ioctl_file_clone
      0.79 ±  3%      -0.1        0.70 ±  2%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      0.81 ±  3%      -0.1        0.73 ±  4%  perf-profile.self.cycles-pp.__put_user_4
      0.81 ±  5%      -0.1        0.73 ±  3%  perf-profile.self.cycles-pp._raw_spin_lock
      0.52 ±  4%      -0.1        0.44 ±  3%  perf-profile.self.cycles-pp.amd_clear_divider
      0.17 ± 11%      -0.1        0.12 ± 10%  perf-profile.self.cycles-pp.stress_mwc8
      0.57 ±  3%      -0.0        0.52 ±  4%  perf-profile.self.cycles-pp.__get_user_2
      0.42 ±  4%      -0.0        0.38 ±  3%  perf-profile.self.cycles-pp.__libc_fcntl64
      0.30 ±  3%      -0.0        0.26 ±  5%  perf-profile.self.cycles-pp.__x64_sys_fcntl
      0.22 ±  5%      -0.0        0.18 ±  6%  perf-profile.self.cycles-pp.do_fcntl
      0.28 ±  3%      -0.0        0.24 ±  2%  perf-profile.self.cycles-pp.__kmem_cache_alloc_node
      0.00            +0.2        0.22 ±  4%  perf-profile.self.cycles-pp.fsnotify_perm
      0.49 ±  3%      +0.4        0.92 ±  2%  perf-profile.self.cycles-pp.security_file_permission
      0.46 ±  2%      +0.5        0.96 ±  2%  perf-profile.self.cycles-pp.aa_file_perm
      0.00            +1.5        1.52 ±  2%  perf-profile.self.cycles-pp.__fsnotify_parent
      7.75 ±  4%      +6.8       14.58 ±  7%  perf-profile.self.cycles-pp.apparmor_file_permission
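
For readers who want to check the table, the following is a minimal sketch
of how its rows can be parsed and the change column recomputed. It assumes
the layout seen above: base value, optional "± N%" stddev, change, new
value, optional stddev, metric name. It also assumes that a change ending
in "%" is relative to the base value, while a bare number is an absolute
difference (as in the perf-profile rows). The function names are
hypothetical and not part of any LKP tool.

```python
import re

# One table row: base, optional stddev, change, new value, optional stddev, metric.
ROW = re.compile(
    r"^\s*(?P<base>\S+)(?:\s*±\s*\d+%)?"
    r"\s+(?P<change>[+-][\d.]+%?)"
    r"\s+(?P<new>\S+)(?:\s*±\s*\d+%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line):
    """Return a dict for a data row, or None for headers and separators."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "metric": m["metric"],
        "base": float(m["base"]),
        "new": float(m["new"]),
        # "%" suffix: relative change; otherwise absolute difference.
        "relative": m["change"].endswith("%"),
    }


def recompute_change(row):
    """Recompute the change column from the base and new values."""
    delta = row["new"] - row["base"]
    if row["relative"]:
        return round(delta / row["base"] * 100, 1)
    return round(delta, 1)


SAMPLE = [
    "   1595650           -11.2%    1416506 ±  2%  stress-ng.file-ioctl.ops_per_sec",
    "      1.68            +8.9       10.55 ±  3%  perf-profile.children.cycles-pp.vfs_clone_file_range",
    "      7.75 ±  4%      +6.8       14.58 ±  7%  perf-profile.self.cycles-pp.apparmor_file_permission",
]

if __name__ == "__main__":
    for line in SAMPLE:
        row = parse_row(line)
        unit = "%" if row["relative"] else ""
        print(f'{row["metric"]}: {recompute_change(row):+}{unit}')
```

Values recomputed this way can differ from the printed column in the last
digit, because the table shows rounded base and new values.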




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

