linux-kernel.vger.kernel.org archive mirror
* [lkp] [locks] 7f3697e24d: +35.1% will-it-scale.per_thread_ops
@ 2016-01-29  1:32 kernel test robot
  2016-01-29  2:38 ` Jeff Layton
  0 siblings, 1 reply; 5+ messages in thread
From: kernel test robot @ 2016-01-29  1:32 UTC (permalink / raw)
  To: Jeff Layton; +Cc: lkp, LKML, J. Bruce Fields, Dmitry Vyukov, Alexander Viro

[-- Attachment #1: Type: text/plain, Size: 18360 bytes --]

FYI, we noticed the below changes on

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
commit 7f3697e24dc3820b10f445a4a7d914fc356012d1 ("locks: fix unlock when fcntl_setlk races with a close")
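For context, the headline metric comes from the will-it-scale "lock1" testcase,
which counts how many fcntl() byte-range lock/unlock pairs per second each
worker completes (per_process runs separate processes, per_thread runs threads
that share one file table).  Below is a minimal, hypothetical sketch of that
inner loop; the temp-file location, iteration count and overall structure are
illustrative assumptions, not the benchmark's actual source:

/*
 * Hedged sketch of a will-it-scale lock1-style worker loop (illustrative
 * only): repeatedly take and release a POSIX write lock with fcntl().
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	char path[] = "/tmp/willitscale.XXXXXX";	/* assumed location */
	int fd = mkstemp(path);
	struct flock fl = { .l_whence = SEEK_SET };	/* l_start = l_len = 0: whole file */
	unsigned long ops;

	if (fd < 0)
		return 1;
	unlink(path);					/* keep the fd, drop the name */

	for (ops = 0; ops < 1000000; ops++) {
		fl.l_type = F_WRLCK;
		if (fcntl(fd, F_SETLK, &fl) < 0)	/* acquire the lock */
			break;
		fl.l_type = F_UNLCK;
		if (fcntl(fd, F_SETLK, &fl) < 0)	/* release the lock */
			break;
	}
	printf("%lu lock/unlock pairs\n", ops);
	return 0;
}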


=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
  gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/lkp-snb01/lock1/will-it-scale

commit: 
  9189922675ecca0fab38931d86b676e9d79602dc
  7f3697e24dc3820b10f445a4a7d914fc356012d1

9189922675ecca0f 7f3697e24dc3820b10f445a4a7 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   2376432 ±  0%      +2.1%    2427484 ±  0%  will-it-scale.per_process_ops
    807889 ±  0%     +35.1%    1091496 ±  4%  will-it-scale.per_thread_ops
     22.08 ±  2%     +89.1%      41.75 ±  5%  will-it-scale.time.user_time
   1238371 ± 14%    +100.4%    2481345 ± 39%  cpuidle.C1E-SNB.time
      3098 ± 57%     -66.6%       1035 ±171%  numa-numastat.node1.other_node
    379.25 ±  8%     -21.4%     298.00 ± 12%  numa-vmstat.node0.nr_alloc_batch
     22.08 ±  2%     +89.1%      41.75 ±  5%  time.user_time
      1795 ±  4%      +7.5%       1930 ±  2%  vmstat.system.cs
      0.54 ±  5%    +136.9%       1.28 ± 10%  perf-profile.cycles.___might_sleep.__might_sleep.kmem_cache_alloc.locks_alloc_lock.__posix_lock_file
      1.65 ± 57%    +245.2%       5.70 ± 29%  perf-profile.cycles.__fdget_raw.sys_fcntl.entry_SYSCALL_64_fastpath
      1.58 ± 59%    +248.3%       5.50 ± 31%  perf-profile.cycles.__fget.__fget_light.__fdget_raw.sys_fcntl.entry_SYSCALL_64_fastpath
      1.62 ± 58%    +246.3%       5.63 ± 30%  perf-profile.cycles.__fget_light.__fdget_raw.sys_fcntl.entry_SYSCALL_64_fastpath
      0.00 ± -1%      +Inf%       5.88 ± 11%  perf-profile.cycles.__memset.locks_alloc_lock.__posix_lock_file.vfs_lock_file.do_lock_file_wait
      2.50 ±  2%    -100.0%       0.00 ± -1%  perf-profile.cycles.__memset.locks_alloc_lock.__posix_lock_file.vfs_lock_file.fcntl_setlk
      1.29 ±  4%    +138.8%       3.09 ± 11%  perf-profile.cycles.__memset.locks_alloc_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
      0.47 ±  9%    +144.4%       1.16 ± 11%  perf-profile.cycles.__might_fault.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
      0.37 ± 12%    +140.3%       0.90 ±  9%  perf-profile.cycles.__might_sleep.__might_fault.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
      0.86 ±  6%    +137.7%       2.05 ± 10%  perf-profile.cycles.__might_sleep.kmem_cache_alloc.locks_alloc_lock.__posix_lock_file.vfs_lock_file
      0.61 ± 14%     +56.8%       0.95 ± 14%  perf-profile.cycles.__might_sleep.kmem_cache_alloc.locks_alloc_lock.fcntl_setlk.sys_fcntl
      0.00 ± -1%      +Inf%      39.84 ± 12%  perf-profile.cycles.__posix_lock_file.vfs_lock_file.do_lock_file_wait.fcntl_setlk.sys_fcntl
     16.44 ±  3%    -100.0%       0.00 ± -1%  perf-profile.cycles.__posix_lock_file.vfs_lock_file.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
      0.00 ± -1%      +Inf%       1.77 ± 11%  perf-profile.cycles._raw_spin_lock.__posix_lock_file.vfs_lock_file.do_lock_file_wait.fcntl_setlk
     59.34 ±  1%     -72.4%      16.36 ± 33%  perf-profile.cycles._raw_spin_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
      0.46 ± 11%    +144.9%       1.13 ± 19%  perf-profile.cycles.avc_has_perm.inode_has_perm.file_has_perm.selinux_file_fcntl.security_file_fcntl
      0.87 ±  6%    +103.2%       1.77 ± 12%  perf-profile.cycles.avc_has_perm.inode_has_perm.file_has_perm.selinux_file_lock.security_file_lock
      0.81 ±  4%    +135.7%       1.90 ± 10%  perf-profile.cycles.copy_user_generic_string.sys_fcntl.entry_SYSCALL_64_fastpath
      0.00 ± -1%      +Inf%      41.86 ± 12%  perf-profile.cycles.do_lock_file_wait.part.29.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
      0.88 ±  6%    +127.8%       2.00 ±  9%  perf-profile.cycles.entry_SYSCALL_64
      0.86 ±  4%    +122.6%       1.92 ± 12%  perf-profile.cycles.entry_SYSCALL_64_after_swapgs
     84.98 ±  0%      -9.1%      77.20 ±  2%  perf-profile.cycles.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
      0.76 ± 10%    +142.1%       1.84 ± 14%  perf-profile.cycles.file_has_perm.selinux_file_fcntl.security_file_fcntl.sys_fcntl.entry_SYSCALL_64_fastpath
      1.35 ±  4%    +106.3%       2.78 ± 11%  perf-profile.cycles.file_has_perm.selinux_file_lock.security_file_lock.fcntl_setlk.sys_fcntl
      0.00 ± -1%      +Inf%       0.89 ± 12%  perf-profile.cycles.flock_to_posix_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
      6.90 ±  4%     -48.6%       3.55 ± 27%  perf-profile.cycles.fput.entry_SYSCALL_64_fastpath
      0.51 ± 10%    +140.5%       1.23 ± 16%  perf-profile.cycles.inode_has_perm.isra.31.file_has_perm.selinux_file_fcntl.security_file_fcntl.sys_fcntl
      0.98 ±  4%     +97.7%       1.93 ± 11%  perf-profile.cycles.inode_has_perm.isra.31.file_has_perm.selinux_file_lock.security_file_lock.fcntl_setlk
      0.00 ± -1%      +Inf%       6.56 ± 10%  perf-profile.cycles.kmem_cache_alloc.locks_alloc_lock.__posix_lock_file.vfs_lock_file.do_lock_file_wait
      2.75 ±  4%    -100.0%       0.00 ± -1%  perf-profile.cycles.kmem_cache_alloc.locks_alloc_lock.__posix_lock_file.vfs_lock_file.fcntl_setlk
      1.53 ±  7%    +119.7%       3.37 ± 13%  perf-profile.cycles.kmem_cache_alloc.locks_alloc_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
      0.00 ± -1%      +Inf%       1.79 ± 11%  perf-profile.cycles.kmem_cache_free.locks_free_lock.__posix_lock_file.vfs_lock_file.do_lock_file_wait
      0.46 ± 14%    +257.0%       1.66 ± 11%  perf-profile.cycles.kmem_cache_free.locks_free_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
      0.40 ±  7%    +158.6%       1.05 ± 17%  perf-profile.cycles.kmem_cache_free.locks_free_lock.locks_dispose_list.__posix_lock_file.vfs_lock_file
      0.00 ± -1%      +Inf%       0.96 ± 10%  perf-profile.cycles.lg_local_lock.locks_insert_lock_ctx.__posix_lock_file.vfs_lock_file.do_lock_file_wait
      0.00 ± -1%      +Inf%      14.69 ± 10%  perf-profile.cycles.locks_alloc_lock.__posix_lock_file.vfs_lock_file.do_lock_file_wait.fcntl_setlk
      6.38 ±  3%    -100.0%       0.00 ± -1%  perf-profile.cycles.locks_alloc_lock.__posix_lock_file.vfs_lock_file.fcntl_setlk.sys_fcntl
      3.28 ±  6%    +127.1%       7.45 ± 12%  perf-profile.cycles.locks_alloc_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
      0.00 ± -1%      +Inf%       9.75 ± 13%  perf-profile.cycles.locks_delete_lock_ctx.__posix_lock_file.vfs_lock_file.do_lock_file_wait.fcntl_setlk
      3.61 ±  1%    -100.0%       0.00 ± -1%  perf-profile.cycles.locks_delete_lock_ctx.__posix_lock_file.vfs_lock_file.fcntl_setlk.sys_fcntl
      0.00 ± -1%      +Inf%       1.84 ± 11%  perf-profile.cycles.locks_dispose_list.__posix_lock_file.vfs_lock_file.do_lock_file_wait.fcntl_setlk
      0.00 ± -1%      +Inf%       2.42 ± 10%  perf-profile.cycles.locks_free_lock.__posix_lock_file.vfs_lock_file.do_lock_file_wait.fcntl_setlk
      1.00 ±  3%    -100.0%       0.00 ± -1%  perf-profile.cycles.locks_free_lock.__posix_lock_file.vfs_lock_file.fcntl_setlk.sys_fcntl
      0.63 ± 11%    +224.1%       2.05 ± 10%  perf-profile.cycles.locks_free_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
      0.00 ± -1%      +Inf%       1.22 ± 14%  perf-profile.cycles.locks_free_lock.locks_dispose_list.__posix_lock_file.vfs_lock_file.do_lock_file_wait
      0.00 ± -1%      +Inf%       6.17 ± 15%  perf-profile.cycles.locks_insert_lock_ctx.__posix_lock_file.vfs_lock_file.do_lock_file_wait.fcntl_setlk
      2.31 ±  6%    -100.0%       0.00 ± -1%  perf-profile.cycles.locks_insert_lock_ctx.__posix_lock_file.vfs_lock_file.fcntl_setlk.sys_fcntl
      0.00 ± -1%      +Inf%       8.96 ± 13%  perf-profile.cycles.locks_unlink_lock_ctx.locks_delete_lock_ctx.__posix_lock_file.vfs_lock_file.do_lock_file_wait
      3.27 ±  1%    -100.0%       0.00 ± -1%  perf-profile.cycles.locks_unlink_lock_ctx.locks_delete_lock_ctx.__posix_lock_file.vfs_lock_file.fcntl_setlk
     53.88 ±  1%     -79.7%      10.92 ± 46%  perf-profile.cycles.native_queued_spin_lock_slowpath._raw_spin_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
      2.75 ±  0%    +183.3%       7.79 ± 13%  perf-profile.cycles.put_pid.locks_unlink_lock_ctx.locks_delete_lock_ctx.__posix_lock_file.vfs_lock_file
      1.11 ±  9%    +137.2%       2.63 ± 14%  perf-profile.cycles.security_file_fcntl.sys_fcntl.entry_SYSCALL_64_fastpath
      1.69 ±  4%    +118.2%       3.69 ± 11%  perf-profile.cycles.security_file_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
      0.91 ±  9%    +139.0%       2.17 ± 14%  perf-profile.cycles.selinux_file_fcntl.security_file_fcntl.sys_fcntl.entry_SYSCALL_64_fastpath
      1.39 ±  4%    +114.6%       2.97 ± 10%  perf-profile.cycles.selinux_file_lock.security_file_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
      0.00 ± -1%      +Inf%      41.12 ± 12%  perf-profile.cycles.vfs_lock_file.do_lock_file_wait.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
     17.04 ±  3%    -100.0%       0.00 ± -1%  perf-profile.cycles.vfs_lock_file.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath
     34.75 ±148%    +132.4%      80.75 ± 82%  sched_debug.cfs_rq:/.load.8
     15.00 ±  9%    +198.3%      44.75 ± 72%  sched_debug.cfs_rq:/.load_avg.21
     25.00 ± 29%    +574.0%     168.50 ± 78%  sched_debug.cfs_rq:/.load_avg.9
     38.47 ±  5%     +29.1%      49.65 ± 26%  sched_debug.cfs_rq:/.load_avg.avg
     63.17 ± 10%     +44.3%      91.16 ± 36%  sched_debug.cfs_rq:/.load_avg.stddev
    893865 ± 12%     -12.5%     782455 ±  0%  sched_debug.cfs_rq:/.min_vruntime.25
     18.25 ± 26%     +52.1%      27.75 ± 25%  sched_debug.cfs_rq:/.runnable_load_avg.9
    -57635 ±-68%    -196.4%      55548 ±130%  sched_debug.cfs_rq:/.spread0.1
   -802264 ±-25%     -29.5%    -565458 ±-49%  sched_debug.cfs_rq:/.spread0.8
   -804662 ±-25%     -29.4%    -567811 ±-48%  sched_debug.cfs_rq:/.spread0.min
      1233 ±  5%     +30.9%       1614 ± 28%  sched_debug.cfs_rq:/.tg_load_avg.0
      1233 ±  5%     +30.9%       1614 ± 28%  sched_debug.cfs_rq:/.tg_load_avg.1
      1228 ±  5%     +30.3%       1601 ± 27%  sched_debug.cfs_rq:/.tg_load_avg.10
      1228 ±  5%     +30.4%       1601 ± 27%  sched_debug.cfs_rq:/.tg_load_avg.11
      1228 ±  5%     +30.3%       1601 ± 27%  sched_debug.cfs_rq:/.tg_load_avg.12
      1229 ±  5%     +30.0%       1598 ± 27%  sched_debug.cfs_rq:/.tg_load_avg.13
      1228 ±  5%     +30.1%       1598 ± 27%  sched_debug.cfs_rq:/.tg_load_avg.14
      1229 ±  5%     +30.0%       1598 ± 27%  sched_debug.cfs_rq:/.tg_load_avg.15
      1226 ±  5%     +30.3%       1598 ± 27%  sched_debug.cfs_rq:/.tg_load_avg.16
      1226 ±  5%     +30.2%       1597 ± 27%  sched_debug.cfs_rq:/.tg_load_avg.17
      1227 ±  5%     +30.1%       1595 ± 27%  sched_debug.cfs_rq:/.tg_load_avg.18
      1227 ±  5%     +29.4%       1588 ± 26%  sched_debug.cfs_rq:/.tg_load_avg.19
      1233 ±  5%     +30.4%       1609 ± 27%  sched_debug.cfs_rq:/.tg_load_avg.2
      1222 ±  5%     +29.9%       1587 ± 26%  sched_debug.cfs_rq:/.tg_load_avg.20
      1223 ±  5%     +24.2%       1519 ± 20%  sched_debug.cfs_rq:/.tg_load_avg.21
      1223 ±  5%     +23.8%       1515 ± 20%  sched_debug.cfs_rq:/.tg_load_avg.22
      1223 ±  5%     +23.9%       1515 ± 20%  sched_debug.cfs_rq:/.tg_load_avg.23
      1223 ±  5%     +23.9%       1515 ± 20%  sched_debug.cfs_rq:/.tg_load_avg.24
      1223 ±  5%     +23.5%       1511 ± 19%  sched_debug.cfs_rq:/.tg_load_avg.25
      1224 ±  5%     +23.5%       1512 ± 19%  sched_debug.cfs_rq:/.tg_load_avg.26
      1223 ±  5%     +23.1%       1506 ± 19%  sched_debug.cfs_rq:/.tg_load_avg.27
      1223 ±  5%     +22.5%       1499 ± 19%  sched_debug.cfs_rq:/.tg_load_avg.28
      1224 ±  5%     +22.5%       1499 ± 19%  sched_debug.cfs_rq:/.tg_load_avg.29
      1233 ±  5%     +30.3%       1607 ± 27%  sched_debug.cfs_rq:/.tg_load_avg.3
      1223 ±  5%     +22.2%       1495 ± 18%  sched_debug.cfs_rq:/.tg_load_avg.30
      1224 ±  5%     +22.0%       1493 ± 19%  sched_debug.cfs_rq:/.tg_load_avg.31
      1234 ±  5%     +30.0%       1604 ± 28%  sched_debug.cfs_rq:/.tg_load_avg.4
      1233 ±  5%     +30.0%       1604 ± 28%  sched_debug.cfs_rq:/.tg_load_avg.5
      1231 ±  5%     +30.3%       1604 ± 28%  sched_debug.cfs_rq:/.tg_load_avg.6
      1233 ±  5%     +30.0%       1603 ± 27%  sched_debug.cfs_rq:/.tg_load_avg.7
      1231 ±  5%     +30.1%       1601 ± 27%  sched_debug.cfs_rq:/.tg_load_avg.8
      1228 ±  5%     +30.3%       1601 ± 27%  sched_debug.cfs_rq:/.tg_load_avg.9
      1228 ±  5%     +27.8%       1569 ± 24%  sched_debug.cfs_rq:/.tg_load_avg.avg
      1246 ±  5%     +30.7%       1628 ± 27%  sched_debug.cfs_rq:/.tg_load_avg.max
      1212 ±  5%     +22.2%       1481 ± 19%  sched_debug.cfs_rq:/.tg_load_avg.min
     15.00 ±  9%    +198.3%      44.75 ± 72%  sched_debug.cfs_rq:/.tg_load_avg_contrib.21
     25.00 ± 29%    +574.0%     168.50 ± 78%  sched_debug.cfs_rq:/.tg_load_avg_contrib.9
     38.53 ±  5%     +29.0%      49.71 ± 26%  sched_debug.cfs_rq:/.tg_load_avg_contrib.avg
     63.34 ± 10%     +44.1%      91.30 ± 36%  sched_debug.cfs_rq:/.tg_load_avg_contrib.stddev
    532.25 ±  2%      +8.5%     577.50 ±  6%  sched_debug.cfs_rq:/.util_avg.15
    210.75 ± 14%     -14.4%     180.50 ±  4%  sched_debug.cfs_rq:/.util_avg.29
    450.00 ± 22%     +50.7%     678.00 ± 18%  sched_debug.cfs_rq:/.util_avg.9
    955572 ±  4%     -10.2%     857813 ±  5%  sched_debug.cpu.avg_idle.6
     23.99 ± 60%     -76.2%       5.71 ± 24%  sched_debug.cpu.clock.stddev
     23.99 ± 60%     -76.2%       5.71 ± 24%  sched_debug.cpu.clock_task.stddev
      2840 ± 37%     -47.4%       1492 ± 65%  sched_debug.cpu.curr->pid.25
     34.75 ±148%    +132.4%      80.75 ± 82%  sched_debug.cpu.load.8
     61776 ±  7%      -7.1%      57380 ±  0%  sched_debug.cpu.nr_load_updates.25
      6543 ±  2%     +20.4%       7879 ±  9%  sched_debug.cpu.nr_switches.0
      5256 ± 23%    +177.1%      14566 ± 52%  sched_debug.cpu.nr_switches.27
      7915 ±  3%      +8.7%       8605 ±  3%  sched_debug.cpu.nr_switches.avg
     -0.25 ±-519%   +1900.0%      -5.00 ±-24%  sched_debug.cpu.nr_uninterruptible.12
      2.00 ± 93%    -125.0%      -0.50 ±-300%  sched_debug.cpu.nr_uninterruptible.24
     17468 ± 14%    +194.3%      51413 ± 75%  sched_debug.cpu.sched_count.15
      2112 ±  2%     +20.8%       2552 ± 11%  sched_debug.cpu.sched_goidle.0
      2103 ± 34%    +219.0%       6709 ± 55%  sched_debug.cpu.sched_goidle.27
      3159 ±  3%      +8.2%       3418 ±  4%  sched_debug.cpu.sched_goidle.avg
      1323 ± 64%     -72.7%     361.50 ± 15%  sched_debug.cpu.ttwu_count.23
      3264 ± 12%     +94.4%       6347 ± 41%  sched_debug.cpu.ttwu_count.27
      3860 ±  3%      +9.0%       4208 ±  3%  sched_debug.cpu.ttwu_count.avg
      2358 ±  3%     +28.7%       3035 ±  9%  sched_debug.cpu.ttwu_local.0
      1110 ± 22%     +54.6%       1716 ± 28%  sched_debug.cpu.ttwu_local.27
      1814 ±  8%     +16.1%       2106 ±  5%  sched_debug.cpu.ttwu_local.stddev


lkp-snb01: Sandy Bridge-EP
Memory: 32G

                             will-it-scale.per_thread_ops

   1.2e+06 ++---------------------------------------------------------------+
           |                                  O                             |
  1.15e+06 O+O O   O O   O   O   O   O                                      |
   1.1e+06 ++                                                               |
           |     O             O   O   O O OO                               |
  1.05e+06 ++          O   O                                                |
     1e+06 ++                                                               |
           |                                                                |
    950000 ++                                                               |
    900000 ++                                                               |
           |                                                                |
    850000 ++                                                               |
    800000 *+*.*.*.*.*.*.*.*.*.*.*.*. .*.*. *.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
           |                         *     *                                |
    750000 ++---------------------------------------------------------------+


                          will-it-scale.time.user_time

  50 ++---------------------------------------------------------------------+
     |                                                                      |
  45 ++         O   O   O    O   O          O                               |
     O O O    O                                                             |
     |     O          O        O   O O                                      |
  40 ++           O        O           O  O                                 |
     |                                                                      |
  35 ++                                                                     |
     |                                                                      |
  30 ++                                                                     |
     |                                                                      |
     |            *                                                         |
  25 ++          + +                                                        |
     *.*.*.*..*.*   *.*.*..*.*.*.*.*.*.*..*.*.*.*.*.*.*..*.*.*.*.*.*..*.*.*.*
  20 ++---------------------------------------------------------------------+


	[*] bisect-good sample
	[O] bisect-bad  sample

To reproduce:

        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Ying Huang

[-- Attachment #2: job.yaml --]
[-- Type: text/plain, Size: 3311 bytes --]

---
LKP_SERVER: inn
LKP_CGI_PORT: 80
LKP_CIFS_PORT: 139
testcase: will-it-scale
default-monitors:
  wait: activate-monitor
  kmsg: 
  uptime: 
  iostat: 
  vmstat: 
  numa-numastat: 
  numa-vmstat: 
  numa-meminfo: 
  proc-vmstat: 
  proc-stat:
    interval: 10
  meminfo: 
  slabinfo: 
  interrupts: 
  lock_stat: 
  latency_stats: 
  softirqs: 
  bdi_dev_mapping: 
  diskstats: 
  nfsstat: 
  cpuidle: 
  cpufreq-stats: 
  turbostat: 
  pmeter: 
  sched_debug:
    interval: 60
cpufreq_governor: performance
default-watchdogs:
  oom-killer: 
  watchdog: 
commit: 7f3697e24dc3820b10f445a4a7d914fc356012d1
model: Sandy Bridge-EP
memory: 32G
hdd_partitions: "/dev/sda2"
swap_partitions: 
category: benchmark
perf-profile:
  freq: 800
will-it-scale:
  test: lock1
queue: bisect
testbox: lkp-snb01
tbox_group: lkp-snb01
kconfig: x86_64-rhel
enqueue_time: 2016-01-28 14:50:21.699178965 +08:00
id: c3ed72938d383a211effce7facc978c2cc247aa8
user: lkp
compiler: gcc-4.9
head_commit: 92e963f50fc74041b5e9e744c330dca48e04f08d
base_commit: 5348c1e9e0dc2b62a484c4b74a8d1d59aa9620a4
branch: linus/master
rootfs: debian-x86_64-2015-02-07.cgz
result_root: "/result/will-it-scale/performance-lock1/lkp-snb01/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/7f3697e24dc3820b10f445a4a7d914fc356012d1/0"
job_file: "/lkp/scheduled/lkp-snb01/bisect_will-it-scale-performance-lock1-debian-x86_64-2015-02-07.cgz-x86_64-rhel-7f3697e24dc3820b10f445a4a7d914fc356012d1-20160128-75911-11hdapy-0.yaml"
nr_cpu: "$(nproc)"
max_uptime: 1500
initrd: "/osimage/debian/debian-x86_64-2015-02-07.cgz"
bootloader_append:
- root=/dev/ram0
- user=lkp
- job=/lkp/scheduled/lkp-snb01/bisect_will-it-scale-performance-lock1-debian-x86_64-2015-02-07.cgz-x86_64-rhel-7f3697e24dc3820b10f445a4a7d914fc356012d1-20160128-75911-11hdapy-0.yaml
- ARCH=x86_64
- kconfig=x86_64-rhel
- branch=linus/master
- commit=7f3697e24dc3820b10f445a4a7d914fc356012d1
- BOOT_IMAGE=/pkg/linux/x86_64-rhel/gcc-4.9/7f3697e24dc3820b10f445a4a7d914fc356012d1/vmlinuz-4.4.0-rc1-00005-g7f3697e
- max_uptime=1500
- RESULT_ROOT=/result/will-it-scale/performance-lock1/lkp-snb01/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/7f3697e24dc3820b10f445a4a7d914fc356012d1/0
- LKP_SERVER=inn
- |2-


  earlyprintk=ttyS0,115200 systemd.log_level=err
  debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100
  panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0
  console=ttyS0,115200 console=tty0 vga=normal

  rw
lkp_initrd: "/lkp/lkp/lkp-x86_64.cgz"
modules_initrd: "/pkg/linux/x86_64-rhel/gcc-4.9/7f3697e24dc3820b10f445a4a7d914fc356012d1/modules.cgz"
bm_initrd: "/osimage/deps/debian-x86_64-2015-02-07.cgz/lkp.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/run-ipconfig.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/turbostat.cgz,/lkp/benchmarks/turbostat.cgz,/lkp/benchmarks/will-it-scale.cgz"
linux_headers_initrd: "/pkg/linux/x86_64-rhel/gcc-4.9/7f3697e24dc3820b10f445a4a7d914fc356012d1/linux-headers.cgz"
repeat_to: 2
kernel: "/pkg/linux/x86_64-rhel/gcc-4.9/7f3697e24dc3820b10f445a4a7d914fc356012d1/vmlinuz-4.4.0-rc1-00005-g7f3697e"
dequeue_time: 2016-01-28 15:11:56.910896619 +08:00
job_state: finished
loadavg: 27.88 12.58 4.93 2/368 9093
start_time: '1453965158'
end_time: '1453965468'
version: "/lkp/lkp/.src-20160127-223853"

[-- Attachment #3: reproduce.sh --]
[-- Type: application/x-sh, Size: 3058 bytes --]


* Re: [lkp] [locks] 7f3697e24d: +35.1% will-it-scale.per_thread_ops
  2016-01-29  1:32 [lkp] [locks] 7f3697e24d: +35.1% will-it-scale.per_thread_ops kernel test robot
@ 2016-01-29  2:38 ` Jeff Layton
  2016-01-29  2:52   ` [LKP] " Huang, Ying
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff Layton @ 2016-01-29  2:38 UTC (permalink / raw)
  To: kernel test robot
  Cc: lkp, LKML, J. Bruce Fields, Dmitry Vyukov, Alexander Viro

On Fri, 29 Jan 2016 09:32:19 +0800
kernel test robot <ying.huang@linux.intel.com> wrote:

> FYI, we noticed the below changes on
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> commit 7f3697e24dc3820b10f445a4a7d914fc356012d1 ("locks: fix unlock when fcntl_setlk races with a close")
> 
>    2376432 ±  0%      +2.1%    2427484 ±  0%  will-it-scale.per_process_ops
>     807889 ±  0%     +35.1%    1091496 ±  4%  will-it-scale.per_thread_ops
>      22.08 ±  2%     +89.1%      41.75 ±  5%  will-it-scale.time.user_time
> 
> [...]

Thanks...

Huh...I'm stumped on this one. If anything I would have expected better
performance with this patch since we don't even take the file_lock or
do the fcheck in the F_UNLCK codepath now, or when there is an error.

I'll see if I can reproduce it on my own test rig, but I'd welcome
ideas of where and how this performance regression could have crept in.
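
For reference, a schematic paraphrase of the fcntl_setlk() tail in question
(not the literal fs/locks.c code; helper names as in v4.4-era kernels).  The
close/fcntl race re-check needs current->files->file_lock, which all threads
of a process share, and after the change it only runs when a lock was
actually acquired, so the F_UNLCK half of the lock1 loop never touches it:

	error = do_lock_file_wait(filp, cmd, file_lock);

	/*
	 * Attempt to detect a close/fcntl race and recover by releasing
	 * the lock that was just acquired.  Skipped for unlock requests
	 * and on error, so threads sharing a files_struct stop contending
	 * on files->file_lock here (the native_queued_spin_lock_slowpath
	 * drop in the profile above).
	 */
	if (!error && file_lock->fl_type != F_UNLCK) {
		struct file *f;

		spin_lock(&current->files->file_lock);
		f = fcheck(fd);
		spin_unlock(&current->files->file_lock);
		if (f != filp) {
			/* the fd was closed concurrently: undo the lock we set */
			file_lock->fl_type = F_UNLCK;
			do_lock_file_wait(filp, cmd, file_lock);
			error = -EBADF;
		}
	}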

-- 
Jeff Layton <jeff.layton@primarydata.com>


* Re: [LKP] [lkp] [locks] 7f3697e24d: +35.1% will-it-scale.per_thread_ops
  2016-01-29  2:38 ` Jeff Layton
@ 2016-01-29  2:52   ` Huang, Ying
  2016-01-29 12:13     ` Jeff Layton
  2016-02-01 13:39     ` J. Bruce Fields
  0 siblings, 2 replies; 5+ messages in thread
From: Huang, Ying @ 2016-01-29  2:52 UTC (permalink / raw)
  To: Jeff Layton; +Cc: J. Bruce Fields, lkp, LKML, Dmitry Vyukov, Alexander Viro

Jeff Layton <jeff.layton@primarydata.com> writes:

> On Fri, 29 Jan 2016 09:32:19 +0800
> kernel test robot <ying.huang@linux.intel.com> wrote:
>
>> FYI, we noticed the below changes on
>> 
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>> commit 7f3697e24dc3820b10f445a4a7d914fc356012d1 ("locks: fix unlock when fcntl_setlk races with a close")
>> 
>>    2376432 ±  0%      +2.1%    2427484 ±  0%  will-it-scale.per_process_ops
>>     807889 ±  0%     +35.1%    1091496 ±  4%  will-it-scale.per_thread_ops
>>      22.08 ±  2%     +89.1%      41.75 ±  5%  will-it-scale.time.user_time
>> 
>> [...]
>> 
>> To reproduce:
>> 
>>         git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>>         cd lkp-tests
>>         bin/lkp install job.yaml  # job file is attached in this email
>>         bin/lkp run     job.yaml
>> 
>> 
>> Disclaimer:
>> Results have been estimated based on internal Intel analysis and are provided
>> for informational purposes only. Any difference in system hardware or software
>> design or configuration may affect actual performance.
>> 
>> 
>> Thanks,
>> Ying Huang
>
> Thanks...
>
> Huh...I'm stumped on this one. If anything I would have expected better
> performance with this patch since we don't even take the file_lock or
> do the fcheck in the F_UNLCK codepath now, or when there is an error.
>
> I'll see if I can reproduce it on my own test rig, but I'd welcome
> ideas of where and how this performance regression could have crept in.

This is a performance increase instead of a performance regression.

Best Regards,
Huang, Ying

^ permalink raw reply	[flat|nested] 5+ messages in thread
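
To make the codepath change Jeff describes above concrete, here is a rough
sketch of the tail of fcntl_setlk() after commit 7f3697e24d, reconstructed
from his description rather than copied verbatim from fs/locks.c (error,
filp, fd, cmd and file_lock are the function's usual locals, and the sketch
is simplified).  The close/fcntl race re-check is only done once a lock has
actually been acquired, so F_UNLCK requests and error returns never take
current->files->file_lock or call fcheck():

        struct file *f;

        error = do_lock_file_wait(filp, cmd, file_lock);
        if (!error && file_lock->fl_type != F_UNLCK) {
                /*
                 * Re-check that the descriptor still refers to the same
                 * struct file; a concurrent close() and fd reuse would
                 * otherwise leave the just-acquired lock behind.
                 */
                spin_lock(&current->files->file_lock);
                f = fcheck(fd);
                spin_unlock(&current->files->file_lock);
                if (f != filp) {
                        /* fd changed under us: drop the lock we just set */
                        file_lock->fl_type = F_UNLCK;
                        do_lock_file_wait(filp, cmd, file_lock);
                        error = -EBADF;
                }
        }

That shape is consistent with the profile in the report: the lock1 unlock
path no longer hits the spinlock in fcntl_setlk at all, which is where most
of the native_queued_spin_lock_slowpath time was being spent before the
patch.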

* Re: [LKP] [lkp] [locks] 7f3697e24d: +35.1% will-it-scale.per_thread_ops
  2016-01-29  2:52   ` [LKP] " Huang, Ying
@ 2016-01-29 12:13     ` Jeff Layton
  2016-02-01 13:39     ` J. Bruce Fields
  1 sibling, 0 replies; 5+ messages in thread
From: Jeff Layton @ 2016-01-29 12:13 UTC (permalink / raw)
  To: Huang, Ying; +Cc: J. Bruce Fields, lkp, LKML, Dmitry Vyukov, Alexander Viro

On Fri, 29 Jan 2016 10:52:20 +0800
"Huang\, Ying" <ying.huang@intel.com> wrote:

> Jeff Layton <jeff.layton@primarydata.com> writes:
> 
> > On Fri, 29 Jan 2016 09:32:19 +0800
> > kernel test robot <ying.huang@linux.intel.com> wrote:
> >  
> >> FYI, we noticed the below changes on
> >> 
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> >> commit 7f3697e24dc3820b10f445a4a7d914fc356012d1 ("locks: fix unlock when fcntl_setlk races with a close")
> >> 
> >> [...]
> >
> > Thanks...
> >
> > Huh...I'm stumped on this one. If anything I would have expected better
> > performance with this patch since we don't even take the file_lock or
> > do the fcheck in the F_UNLCK codepath now, or when there is an error.
> >
> > I'll see if I can reproduce it on my own test rig, but I'd welcome
> > ideas of where and how this performance regression could have crept in.  
> 
> This is a performance increase instead of a performance regression.
> 

Hah, no wonder that made no sense. Ok, thanks for letting me know!

-- 
Jeff Layton <jeff.layton@primarydata.com>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [LKP] [lkp] [locks] 7f3697e24d: +35.1% will-it-scale.per_thread_ops
  2016-01-29  2:52   ` [LKP] " Huang, Ying
  2016-01-29 12:13     ` Jeff Layton
@ 2016-02-01 13:39     ` J. Bruce Fields
  1 sibling, 0 replies; 5+ messages in thread
From: J. Bruce Fields @ 2016-02-01 13:39 UTC (permalink / raw)
  To: Huang, Ying; +Cc: Jeff Layton, lkp, LKML, Dmitry Vyukov, Alexander Viro

On Fri, Jan 29, 2016 at 10:52:20AM +0800, Huang, Ying wrote:
> Jeff Layton <jeff.layton@primarydata.com> writes:
> 
> > On Fri, 29 Jan 2016 09:32:19 +0800
> > kernel test robot <ying.huang@linux.intel.com> wrote:
> >
> >> FYI, we noticed the below changes on
> >> 
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> >> commit 7f3697e24dc3820b10f445a4a7d914fc356012d1 ("locks: fix unlock when fcntl_setlk races with a close")
> >> 
> >> [...]
> >> 
> >>                              will-it-scale.per_thread_ops
> >> 
> >>    1.2e+06 ++---------------------------------------------------------------+
> >>            |                                  O                             |
> >>   1.15e+06 O+O O   O O   O   O   O   O                                      |
> >>    1.1e+06 ++                                                               |
> >>            |     O             O   O   O O OO                               |
> >>   1.05e+06 ++          O   O                                                |
> >>      1e+06 ++                                                               |
> >>            |                                                                |
> >>     950000 ++                                                               |
> >>     900000 ++                                                               |
> >>            |                                                                |
> >>     850000 ++                                                               |
> >>     800000 *+*.*.*.*.*.*.*.*.*.*.*.*. .*.*. *.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*
> >>            |                         *     *                                |
> >>     750000 ++---------------------------------------------------------------+
> >> 
> >> 
> >>                           will-it-scale.time.user_time
> >> 
> >>   50 ++---------------------------------------------------------------------+
> >>      |                                                                      |
> >>   45 ++         O   O   O    O   O          O                               |
> >>      O O O    O                                                             |
> >>      |     O          O        O   O O                                      |
> >>   40 ++           O        O           O  O                                 |
> >>      |                                                                      |
> >>   35 ++                                                                     |
> >>      |                                                                      |
> >>   30 ++                                                                     |
> >>      |                                                                      |
> >>      |            *                                                         |
> >>   25 ++          + +                                                        |
> >>      *.*.*.*..*.*   *.*.*..*.*.*.*.*.*.*..*.*.*.*.*.*.*..*.*.*.*.*.*..*.*.*.*
> >>   20 ++---------------------------------------------------------------------+
> >> 
> >> 
> >> 	[*] bisect-good sample
> >> 	[O] bisect-bad  sample
> >> 
> >> [...]
> >
> > Thanks...
> >
> > Huh...I'm stumped on this one. If anything I would have expected better
> > performance with this patch since we don't even take the file_lock or
> > do the fcheck in the F_UNLCK codepath now, or when there is an error.
> >
> > I'll see if I can reproduce it on my own test rig, but I'd welcome
> > ideas of where and how this performance regression could have crept in.
> 
> This is a performance increase instead of a performance regression.

Could you provide any help reading the above graphs?  For example, is
the "bisect-good" case before or after the given commit?  And are lower
or higher numbers better on the graph?

Thanks for doing this testing.  I'm impressed that it's happening and
curious to learn anything more about it.

--b.

^ permalink raw reply	[flat|nested] 5+ messages in thread
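
For anyone trying to map the graphs back to the workload: will-it-scale's
lock1 testcase essentially counts fcntl() byte-range lock/unlock round
trips, so per_thread_ops in the first graph is roughly iterations of a loop
like the one below.  This is a minimal stand-in written for illustration
only -- the file name and iteration count are made up and it is not the
actual will-it-scale source:

        #include <fcntl.h>
        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
                /* each will-it-scale worker hammers its own descriptor */
                int fd = open("/tmp/lock1_scratch", O_CREAT | O_RDWR, 0600);
                struct flock fl = { .l_whence = SEEK_SET, .l_start = 0, .l_len = 0 };
                unsigned long ops;

                if (fd < 0)
                        return 1;

                for (ops = 0; ops < 1000000; ops++) {
                        fl.l_type = F_WRLCK;            /* take a whole-file write lock */
                        if (fcntl(fd, F_SETLK, &fl) < 0)
                                break;
                        fl.l_type = F_UNLCK;            /* and drop it again */
                        if (fcntl(fd, F_SETLK, &fl) < 0)
                                break;
                }
                printf("%lu lock/unlock pairs\n", ops);
                close(fd);
                return 0;
        }

In the threaded mode every worker shares one files_struct, so each fcntl()
that has to take current->files->file_lock is a point of contention; only
taking it when a lock was actually acquired fits the per_thread_ops jump in
the first graph, while the per-process numbers barely move.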

end of thread, other threads:[~2016-02-01 13:39 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-29  1:32 [lkp] [locks] 7f3697e24d: +35.1% will-it-scale.per_thread_ops kernel test robot
2016-01-29  2:38 ` Jeff Layton
2016-01-29  2:52   ` [LKP] " Huang, Ying
2016-01-29 12:13     ` Jeff Layton
2016-02-01 13:39     ` J. Bruce Fields
