Subject: [lkp] [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
From: kernel test robot @ 2016-02-29  8:36 UTC
  To: Mel Gorman
  Cc: lkp, LKML, Sebastian Andrzej Siewior, Peter Zijlstra, Mel Gorman,
	Linus Torvalds, Hugh Dickins, Darren Hart, Chris Mason,
	Thomas Gleixner, Davidlohr Bueso, Ingo Molnar

[-- Attachment #1: Type: text/plain, Size: 9522 bytes --]

FYI, we noticed the following changes on

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
commit 65d8fc777f6dcfee12785c057a6b57f679641c90 ("futex: Remove requirement for lock_page() in get_futex_key()")


=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
  gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/lkp-sbx04/futex1/will-it-scale

commit: 
  8ad7b378d0d016309014cae0f640434bca7b5e11
  65d8fc777f6dcfee12785c057a6b57f679641c90

8ad7b378d0d01630 65d8fc777f6dcfee12785c057a 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   5076304 ±  0%     +25.6%    6374220 ±  0%  will-it-scale.per_process_ops
   1194117 ±  0%     +14.4%    1366153 ±  1%  will-it-scale.per_thread_ops
      0.58 ±  0%      -2.0%       0.57 ±  0%  will-it-scale.scalability
      6820 ±  0%     -19.6%       5483 ± 15%  meminfo.AnonHugePages
      2652 ±  5%     -10.4%       2375 ±  2%  vmstat.system.cs
      2848 ± 32%    +141.2%       6870 ± 65%  numa-meminfo.node1.Active(anon)
      2832 ± 31%     +57.6%       4465 ± 27%  numa-meminfo.node1.AnonPages
     15018 ± 12%     -23.3%      11515 ± 15%  numa-meminfo.node2.AnonPages
      1214 ± 14%     -22.8%     936.75 ± 20%  numa-meminfo.node3.PageTables
    712.25 ± 32%    +141.2%       1718 ± 65%  numa-vmstat.node1.nr_active_anon
    708.25 ± 31%     +57.7%       1116 ± 27%  numa-vmstat.node1.nr_anon_pages
      3754 ± 12%     -23.3%       2879 ± 15%  numa-vmstat.node2.nr_anon_pages
    304.75 ± 14%     -23.1%     234.50 ± 20%  numa-vmstat.node3.nr_page_table_pages
      3.53 ±  1%    -100.0%       0.00 ± -1%  perf-profile.cycles.___might_sleep.__might_sleep.get_futex_key.futex_wake.do_futex
      4.34 ±  1%    -100.0%       0.00 ± -1%  perf-profile.cycles.__might_sleep.get_futex_key.futex_wake.do_futex.sys_futex
      1.27 ±  3%    -100.0%       0.00 ± -1%  perf-profile.cycles.__wake_up_bit.unlock_page.get_futex_key.futex_wake.do_futex
      4.36 ±  1%     +29.6%       5.65 ±  1%  perf-profile.cycles.drop_futex_key_refs.isra.12.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
      6.69 ±  1%     +28.1%       8.57 ±  0%  perf-profile.cycles.entry_SYSCALL_64
      6.73 ±  0%     +30.6%       8.79 ±  0%  perf-profile.cycles.entry_SYSCALL_64_after_swapgs
     74.21 ±  0%     -11.0%      66.06 ±  0%  perf-profile.cycles.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
     59.05 ±  0%     -21.4%      46.40 ±  0%  perf-profile.cycles.get_futex_key.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
      4.12 ±  0%     +78.5%       7.36 ±  1%  perf-profile.cycles.get_futex_key_refs.isra.11.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
      2.27 ±  3%     +24.1%       2.82 ±  4%  perf-profile.cycles.get_user_pages_fast.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
     26.95 ±  0%     +30.0%      35.04 ±  1%  perf-profile.cycles.get_user_pages_fast.get_futex_key.futex_wake.do_futex.sys_futex
     13.43 ±  0%     +27.2%      17.09 ±  1%  perf-profile.cycles.gup_pte_range.gup_pud_range.get_user_pages_fast.get_futex_key.futex_wake
     19.66 ±  1%     +28.4%      25.24 ±  0%  perf-profile.cycles.gup_pud_range.get_user_pages_fast.get_futex_key.futex_wake.do_futex
      4.33 ±  1%     +37.0%       5.93 ±  4%  perf-profile.cycles.hash_futex.do_futex.sys_futex.entry_SYSCALL_64_fastpath
     13.59 ±  0%    -100.0%       0.00 ± -1%  perf-profile.cycles.unlock_page.get_futex_key.futex_wake.do_futex.sys_futex
     15160 ± 19%     -34.8%       9883 ±  0%  sched_debug.cfs_rq:/.exec_clock.min
     27.25 ± 15%     -37.6%      17.00 ±  8%  sched_debug.cfs_rq:/.load_avg.7
     21.00 ± 38%     -27.4%      15.25 ±  2%  sched_debug.cpu.cpu_load[2].1
     21.00 ± 38%     -27.4%      15.25 ±  2%  sched_debug.cpu.cpu_load[3].1
     21.00 ± 38%     -27.4%      15.25 ±  2%  sched_debug.cpu.cpu_load[4].1
      1790 ±  0%     +42.4%       2549 ± 45%  sched_debug.cpu.curr->pid.21
     50033 ±  4%      -6.8%      46622 ±  4%  sched_debug.cpu.nr_load_updates.29
      4398 ± 42%    +103.5%       8949 ± 23%  sched_debug.cpu.nr_switches.11
      7452 ± 34%    +111.3%      15744 ± 54%  sched_debug.cpu.nr_switches.20
      3739 ± 13%    +213.5%      11723 ± 40%  sched_debug.cpu.nr_switches.23
      1648 ± 53%     +96.5%       3239 ± 63%  sched_debug.cpu.nr_switches.51
      0.25 ±519%   -1300.0%      -3.00 ±-52%  sched_debug.cpu.nr_uninterruptible.24
      8632 ± 16%     -32.5%       5823 ± 19%  sched_debug.cpu.sched_count.1
      5091 ± 36%    +137.5%      12092 ± 31%  sched_debug.cpu.sched_count.11
     12453 ± 90%     -74.6%       3159 ± 24%  sched_debug.cpu.sched_count.2
      7782 ± 32%    +118.2%      16977 ± 46%  sched_debug.cpu.sched_count.20
      2665 ± 48%     -49.8%       1337 ± 30%  sched_debug.cpu.sched_count.32
      1365 ± 11%     -14.0%       1174 ±  3%  sched_debug.cpu.sched_count.45
      1693 ± 51%    +147.7%       4193 ± 42%  sched_debug.cpu.sched_count.51
      5023 ± 57%     -51.5%       2434 ± 43%  sched_debug.cpu.sched_count.57
      1705 ± 16%    +129.6%       3915 ± 48%  sched_debug.cpu.sched_goidle.23
    536.25 ± 14%     -18.7%     435.75 ±  2%  sched_debug.cpu.sched_goidle.45
      1228 ± 19%     -27.3%     892.50 ± 17%  sched_debug.cpu.sched_goidle.5
      1919 ± 55%     +88.5%       3617 ± 37%  sched_debug.cpu.ttwu_count.11
      7699 ± 35%     -43.7%       4335 ± 43%  sched_debug.cpu.ttwu_count.24
      5380 ± 36%     -45.6%       2926 ± 18%  sched_debug.cpu.ttwu_count.30
    563.25 ± 20%    +140.3%       1353 ± 38%  sched_debug.cpu.ttwu_local.11
      4297 ± 46%     -49.1%       2186 ± 39%  sched_debug.cpu.ttwu_local.24
      2828 ± 47%     -47.8%       1475 ± 34%  sched_debug.cpu.ttwu_local.27
      3243 ± 36%     -54.3%       1482 ± 32%  sched_debug.cpu.ttwu_local.30
    199.25 ±  6%    +100.6%     399.75 ± 32%  sched_debug.cpu.ttwu_local.44
      1158 ± 64%     -67.3%     379.00 ± 46%  sched_debug.cpu.ttwu_local.54
    242.25 ± 21%     +51.0%     365.75 ± 19%  sched_debug.cpu.ttwu_local.55
      1009 ± 26%     -50.8%     496.50 ± 44%  sched_debug.cpu.ttwu_local.59
      1736 ± 53%     -67.8%     559.25 ± 22%  sched_debug.cpu.ttwu_local.9
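
The %change column can be recomputed from the two mean columns. The following is a minimal Python sketch, assuming each row follows the layout shown above (base mean ± stddev, %change, new mean ± stddev, metric name). The helper names `parse_row` and `pct_change` are illustrative and are not part of lkp-tests.

```python
import re

# Row layout assumed from the comparison table:
#   <base mean> ± <stddev>%   <change>%   <new mean> ± <stddev>%  <metric>
ROW_RE = re.compile(
    r"^\s*(-?[\d.]+)\s*±\s*-?\d+%"   # mean and stddev for the base commit
    r"\s+([+-][\d.]+)%"              # reported relative change
    r"\s+(-?[\d.]+)\s*±\s*-?\d+%"    # mean and stddev for the tested commit
    r"\s+(\S+)\s*$"                  # metric name
)


def parse_row(line):
    """Return (base_mean, reported_change, new_mean, metric) for one table row."""
    m = ROW_RE.match(line)
    if m is None:
        raise ValueError("unrecognised row: %r" % line)
    base, reported, new, metric = m.groups()
    return float(base), float(reported), float(new), metric


def pct_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


# Rows copied verbatim from the report.
rows = [
    "   5076304 ±  0%     +25.6%    6374220 ±  0%  will-it-scale.per_process_ops",
    "   1194117 ±  0%     +14.4%    1366153 ±  1%  will-it-scale.per_thread_ops",
    "     59.05 ±  0%     -21.4%      46.40 ±  0%  perf-profile.cycles.get_futex_key.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath",
    "     13.59 ±  0%    -100.0%       0.00 ± -1%  perf-profile.cycles.unlock_page.get_futex_key.futex_wake.do_futex.sys_futex",
]

for line in rows:
    base, reported, new, metric = parse_row(line)
    print("%s: reported %+.1f%%, recomputed %+.1f%%"
          % (metric, reported, pct_change(base, new)))
```

Rows whose displayed means are heavily rounded (for example will-it-scale.scalability, shown as 0.58 and 0.57) will not reproduce the reported figure exactly, since the report presumably computes the change from unrounded values.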


lkp-sbx04: Sandy Bridge-EX
Memory: 64G


   perf-profile.cycles.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath

  76 ++---------------------------------------------------------------------+
     |                                                                      |
  74 ++   .*..        .*..*..*..     .*..    .*..  .*..  .*..  .*..*..*..*  |
     *..*.    *..*..*.          *..*.    *.*.    *.    *.    *.             |
     |                                                                      |
  72 ++                                                                     |
     |                                                                      |
  70 ++                                                                     |
     |                                                                      |
  68 ++                                                                     |
     |                                                                      |
     |                          O  O  O    O  O  O  O     O  O     O  O     |
  66 O+                                  O             O        O        O  O
     |  O  O  O  O     O  O  O                                              |
  64 ++-------------O-------------------------------------------------------+



                             will-it-scale.per_process_ops

  6.6e+06 O+----O-O--O------------------------------------------------------+
          |  O          O  O O  O                                           |
  6.4e+06 ++                       O  O O  O  O  O O  O  O  O O  O  O  O O  O
          |                                                                 |
  6.2e+06 ++                                                                |
    6e+06 ++                                                                |
          |                                                                 |
  5.8e+06 ++                                                                |
          |                                                                 |
  5.6e+06 ++                                                                |
  5.4e+06 ++                                                                |
          |                                                                 |
  5.2e+06 ++                                                                |
          *..*..*.*..*..*..*.*..*..*..*.*..*..*..*.*..*..*..*.*..*..*..*.*  |
    5e+06 ++----------------------------------------------------------------+


	[*] bisect-good sample
	[O] bisect-bad  sample

To reproduce:

        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Ying Huang

[-- Attachment #2: job.yaml --]
[-- Type: text/plain, Size: 3522 bytes --]

---
LKP_SERVER: inn
LKP_CGI_PORT: 80
LKP_CIFS_PORT: 139
testcase: will-it-scale
default-monitors:
  wait: activate-monitor
  kmsg: 
  uptime: 
  iostat: 
  heartbeat: 
  vmstat: 
  numa-numastat: 
  numa-vmstat: 
  numa-meminfo: 
  proc-vmstat: 
  proc-stat:
    interval: 10
  meminfo: 
  slabinfo: 
  interrupts: 
  lock_stat: 
  latency_stats: 
  softirqs: 
  bdi_dev_mapping: 
  diskstats: 
  nfsstat: 
  cpuidle: 
  cpufreq-stats: 
  turbostat: 
  pmeter: 
  sched_debug:
    interval: 60
cpufreq_governor: performance
default-watchdogs:
  oom-killer: 
  watchdog: 
commit: 65d8fc777f6dcfee12785c057a6b57f679641c90
model: Sandy Bridge-EX
nr_cpu: 64
memory: 64G
nr_ssd_partitions: 7
ssd_partitions: "/dev/disk/by-id/ata-INTEL_SSDSC2*-part1"
swap_partitions: 
category: benchmark
perf-profile:
  freq: 800
will-it-scale:
  test: futex1
queue: bisect
testbox: lkp-sbx04
tbox_group: lkp-sbx04
kconfig: x86_64-rhel
enqueue_time: 2016-02-28 23:45:52.199165563 +08:00
compiler: gcc-4.9
rootfs: debian-x86_64-2015-02-07.cgz
id: 6b2c2bd744dd898009648cb82de7e0ba77de33f1
user: lkp
head_commit: ed520c327c4259ec08b1677023087f658329b961
base_commit: 81f70ba233d5f660e1ea5fe23260ee323af5d53a
branch: linux-devel/devel-hourly-2016022811
result_root: "/result/will-it-scale/performance-futex1/lkp-sbx04/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/0"
job_file: "/lkp/scheduled/lkp-sbx04/bisect_will-it-scale-performance-futex1-debian-x86_64-2015-02-07.cgz-x86_64-rhel-65d8fc777f6dcfee12785c057a6b57f679641c90-20160228-23650-17c4qc1-0.yaml"
max_uptime: 1500
initrd: "/osimage/debian/debian-x86_64-2015-02-07.cgz"
bootloader_append:
- root=/dev/ram0
- user=lkp
- job=/lkp/scheduled/lkp-sbx04/bisect_will-it-scale-performance-futex1-debian-x86_64-2015-02-07.cgz-x86_64-rhel-65d8fc777f6dcfee12785c057a6b57f679641c90-20160228-23650-17c4qc1-0.yaml
- ARCH=x86_64
- kconfig=x86_64-rhel
- branch=linux-devel/devel-hourly-2016022811
- commit=65d8fc777f6dcfee12785c057a6b57f679641c90
- BOOT_IMAGE=/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/vmlinuz-4.5.0-rc3-00235-g65d8fc7
- max_uptime=1500
- RESULT_ROOT=/result/will-it-scale/performance-futex1/lkp-sbx04/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/0
- LKP_SERVER=inn
- |2-


  earlyprintk=ttyS0,115200 systemd.log_level=err
  debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100
  panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0
  console=ttyS0,115200 console=tty0 vga=normal

  rw
lkp_initrd: "/lkp/lkp/lkp-x86_64.cgz"
modules_initrd: "/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/modules.cgz"
bm_initrd: "/osimage/deps/debian-x86_64-2015-02-07.cgz/lkp.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/run-ipconfig.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/turbostat.cgz,/lkp/benchmarks/turbostat.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/will-it-scale.cgz,/lkp/benchmarks/will-it-scale.cgz,/lkp/benchmarks/will-it-scale-x86_64.cgz"
linux_headers_initrd: "/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/linux-headers.cgz"
repeat_to: 2
kernel: "/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/vmlinuz-4.5.0-rc3-00235-g65d8fc7"
dequeue_time: 2016-02-28 23:46:33.938915178 +08:00
job_state: finished
loadavg: 45.27 20.12 7.84 2/649 11559
start_time: '1456674445'
end_time: '1456674754'
version: "/lkp/lkp/.src-20160226-194908"
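
The start_time and end_time fields in job.yaml look like Unix epoch seconds. A short Python sketch, under that assumption, converts them to the +08:00 offset used by enqueue_time and dequeue_time and derives the length of the run:

```python
from datetime import datetime, timedelta, timezone

# Values copied from start_time / end_time in job.yaml.
# Treating them as Unix epoch seconds is an assumption.
START_EPOCH = 1456674445
END_EPOCH = 1456674754

# Offset taken from the enqueue_time / dequeue_time fields (+08:00).
TZ = timezone(timedelta(hours=8))

start = datetime.fromtimestamp(START_EPOCH, tz=TZ)
end = datetime.fromtimestamp(END_EPOCH, tz=TZ)
elapsed = end - start

print("start  :", start.isoformat())
print("end    :", end.isoformat())
print("elapsed:", int(elapsed.total_seconds()), "seconds")
```

Under this reading the job started at 2016-02-28 23:47:25 +08:00, which matches the timestamp of the runtest.py line in the reproduce log, and ran for 309 seconds.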

[-- Attachment #3: reproduce --]
[-- Type: text/plain, Size: 6073 bytes --]

2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu16/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu17/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu18/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu19/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu20/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu21/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu22/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu23/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu24/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu25/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu26/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu27/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu28/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu29/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu30/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu31/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu32/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu33/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu34/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu35/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu36/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu37/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu38/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu39/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu40/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu41/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu42/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu43/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu44/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu45/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu46/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu47/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu48/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu49/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu50/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu51/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu52/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu53/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu54/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu55/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu56/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu57/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu58/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu59/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu60/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu61/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu62/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu63/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor
2016-02-28 23:47:25 ./runtest.py futex1 16 both 1 8 16 24 32 48 64
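
The 64 echo lines above set the cpufreq governor of each CPU to "performance" before the benchmark starts. The same step can be sketched as a loop in Python. The function name `set_governor` and its `cpu_root` parameter are hypothetical. The demonstration below runs against a scratch directory tree rather than the real sysfs, so it needs no privileges.

```python
import glob
import os
import tempfile


def set_governor(governor="performance", cpu_root="/sys/devices/system/cpu"):
    """Write `governor` to every cpuN/cpufreq/scaling_governor under cpu_root.

    Returns the list of files written, in lexicographic path order.
    """
    pattern = os.path.join(cpu_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    written = []
    for path in sorted(glob.glob(pattern)):
        with open(path, "w") as f:
            f.write(governor + "\n")
        written.append(path)
    return written


# Demonstration on a fake tree (assumption: same layout as the sysfs paths
# in the log above). On a real machine, call set_governor() as root instead.
demo_root = tempfile.mkdtemp()
for n in (0, 1, 10, 2):
    cpufreq_dir = os.path.join(demo_root, "cpu%d" % n, "cpufreq")
    os.makedirs(cpufreq_dir)
    with open(os.path.join(cpufreq_dir, "scaling_governor"), "w") as f:
        f.write("powersave\n")

for path in set_governor("performance", demo_root):
    print("set", os.path.relpath(path, demo_root))
```

Sorting the paths lexicographically visits cpu0, cpu1, cpu10, cpu2 in that order, which is the same ordering seen in the log.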


2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu19/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu20/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu21/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu22/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu23/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu24/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu25/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu26/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu27/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu28/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu29/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu30/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu31/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu32/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu33/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu34/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu35/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu36/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu37/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu38/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu39/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu40/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu41/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu42/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu43/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu44/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu45/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu46/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu47/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu48/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu49/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu50/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu51/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu52/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu53/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu54/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu55/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu56/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu57/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu58/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu59/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu60/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu61/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu62/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu63/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor
2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor
2016-02-28 23:47:25 ./runtest.py futex1 16 both 1 8 16 24 32 48 64
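
The 64 per-CPU writes in reproduce.ksh above all do the same thing, so they can
be collapsed into a loop. A minimal sketch, assuming a POSIX shell; the function
name set_governor and its optional second argument (an alternative sysfs root,
handy for dry runs) are hypothetical and not part of lkp-tests:

```shell
#!/bin/sh
# set_governor GOVERNOR [CPU_ROOT]
# Hypothetical helper (not part of lkp-tests): write GOVERNOR into every
# cpuN/cpufreq/scaling_governor file found under CPU_ROOT.
# CPU_ROOT defaults to the sysfs location used in reproduce.ksh.
set_governor() {
    gov=$1
    root=${2:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip CPUs without a writable cpufreq node; this also covers
        # the case where the glob matches nothing.
        [ -w "$f" ] || continue
        echo "$gov" > "$f"
    done
}

# Usage (as root), equivalent to the echo lines in the log:
#   set_governor performance
```

The cpu0, cpu1, cpu10, ... ordering in the log is the lexical order a glob
expansion like the one above would produce.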

* Re: [lkp] [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
  2016-02-29  8:36 ` kernel test robot
@ 2016-02-29  9:37   ` Ingo Molnar
  -1 siblings, 0 replies; 28+ messages in thread
From: Ingo Molnar @ 2016-02-29  9:37 UTC (permalink / raw)
  To: kernel test robot
  Cc: Mel Gorman, lkp, LKML, Sebastian Andrzej Siewior, Peter Zijlstra,
	Mel Gorman, Linus Torvalds, Hugh Dickins, Darren Hart,
	Chris Mason, Thomas Gleixner, Davidlohr Bueso, Peter Zijlstra


* kernel test robot <ying.huang@linux.intel.com> wrote:

> FYI, we noticed the below changes on
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master 
> commit 65d8fc777f6dcfee12785c057a6b57f679641c90 ("futex: Remove requirement for 
> lock_page() in get_futex_key()")

I have asked for this before, but let me try again: could you _PLEASE_ make these 
emails more readable?

For example, what are the 'below changes'? Changes in the profile output? Profiles
always change from run to run, so that alone is not informative.

Also, there are a lot of changes - which ones prompted the email to be generated?

All in all, this email is hard to parse, because it just dumps a lot of
information with very little explanatory structure for someone not versed in the
format. Please try to create an easy-to-parse 'story' that leads the reader
towards what you want these emails to tell, not just a raw dump of seemingly
unconnected pieces of data ...

Thanks,

	Ingo

> 
> 
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
>   gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/lkp-sbx04/futex1/will-it-scale
> 
> commit: 
>   8ad7b378d0d016309014cae0f640434bca7b5e11
>   65d8fc777f6dcfee12785c057a6b57f679641c90
> 
> 8ad7b378d0d01630 65d8fc777f6dcfee12785c057a 
> ---------------- -------------------------- 
>          %stddev     %change         %stddev
>              \          |                \  
>    5076304 ±  0%     +25.6%    6374220 ±  0%  will-it-scale.per_process_ops
>    1194117 ±  0%     +14.4%    1366153 ±  1%  will-it-scale.per_thread_ops
>       0.58 ±  0%      -2.0%       0.57 ±  0%  will-it-scale.scalability
>       6820 ±  0%     -19.6%       5483 ± 15%  meminfo.AnonHugePages
>       2652 ±  5%     -10.4%       2375 ±  2%  vmstat.system.cs
>       2848 ± 32%    +141.2%       6870 ± 65%  numa-meminfo.node1.Active(anon)
>       2832 ± 31%     +57.6%       4465 ± 27%  numa-meminfo.node1.AnonPages
>      15018 ± 12%     -23.3%      11515 ± 15%  numa-meminfo.node2.AnonPages
>       1214 ± 14%     -22.8%     936.75 ± 20%  numa-meminfo.node3.PageTables
>     712.25 ± 32%    +141.2%       1718 ± 65%  numa-vmstat.node1.nr_active_anon
>     708.25 ± 31%     +57.7%       1116 ± 27%  numa-vmstat.node1.nr_anon_pages
>       3754 ± 12%     -23.3%       2879 ± 15%  numa-vmstat.node2.nr_anon_pages
>     304.75 ± 14%     -23.1%     234.50 ± 20%  numa-vmstat.node3.nr_page_table_pages
>       3.53 ±  1%    -100.0%       0.00 ± -1%  perf-profile.cycles.___might_sleep.__might_sleep.get_futex_key.futex_wake.do_futex
>       4.34 ±  1%    -100.0%       0.00 ± -1%  perf-profile.cycles.__might_sleep.get_futex_key.futex_wake.do_futex.sys_futex
>       1.27 ±  3%    -100.0%       0.00 ± -1%  perf-profile.cycles.__wake_up_bit.unlock_page.get_futex_key.futex_wake.do_futex
>       4.36 ±  1%     +29.6%       5.65 ±  1%  perf-profile.cycles.drop_futex_key_refs.isra.12.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>       6.69 ±  1%     +28.1%       8.57 ±  0%  perf-profile.cycles.entry_SYSCALL_64
>       6.73 ±  0%     +30.6%       8.79 ±  0%  perf-profile.cycles.entry_SYSCALL_64_after_swapgs
>      74.21 ±  0%     -11.0%      66.06 ±  0%  perf-profile.cycles.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>      59.05 ±  0%     -21.4%      46.40 ±  0%  perf-profile.cycles.get_futex_key.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>       4.12 ±  0%     +78.5%       7.36 ±  1%  perf-profile.cycles.get_futex_key_refs.isra.11.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>       2.27 ±  3%     +24.1%       2.82 ±  4%  perf-profile.cycles.get_user_pages_fast.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>      26.95 ±  0%     +30.0%      35.04 ±  1%  perf-profile.cycles.get_user_pages_fast.get_futex_key.futex_wake.do_futex.sys_futex
>      13.43 ±  0%     +27.2%      17.09 ±  1%  perf-profile.cycles.gup_pte_range.gup_pud_range.get_user_pages_fast.get_futex_key.futex_wake
>      19.66 ±  1%     +28.4%      25.24 ±  0%  perf-profile.cycles.gup_pud_range.get_user_pages_fast.get_futex_key.futex_wake.do_futex
>       4.33 ±  1%     +37.0%       5.93 ±  4%  perf-profile.cycles.hash_futex.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>      13.59 ±  0%    -100.0%       0.00 ± -1%  perf-profile.cycles.unlock_page.get_futex_key.futex_wake.do_futex.sys_futex
>      15160 ± 19%     -34.8%       9883 ±  0%  sched_debug.cfs_rq:/.exec_clock.min
>      27.25 ± 15%     -37.6%      17.00 ±  8%  sched_debug.cfs_rq:/.load_avg.7
>      21.00 ± 38%     -27.4%      15.25 ±  2%  sched_debug.cpu.cpu_load[2].1
>      21.00 ± 38%     -27.4%      15.25 ±  2%  sched_debug.cpu.cpu_load[3].1
>      21.00 ± 38%     -27.4%      15.25 ±  2%  sched_debug.cpu.cpu_load[4].1
>       1790 ±  0%     +42.4%       2549 ± 45%  sched_debug.cpu.curr->pid.21
>      50033 ±  4%      -6.8%      46622 ±  4%  sched_debug.cpu.nr_load_updates.29
>       4398 ± 42%    +103.5%       8949 ± 23%  sched_debug.cpu.nr_switches.11
>       7452 ± 34%    +111.3%      15744 ± 54%  sched_debug.cpu.nr_switches.20
>       3739 ± 13%    +213.5%      11723 ± 40%  sched_debug.cpu.nr_switches.23
>       1648 ± 53%     +96.5%       3239 ± 63%  sched_debug.cpu.nr_switches.51
>       0.25 ±519%   -1300.0%      -3.00 ±-52%  sched_debug.cpu.nr_uninterruptible.24
>       8632 ± 16%     -32.5%       5823 ± 19%  sched_debug.cpu.sched_count.1
>       5091 ± 36%    +137.5%      12092 ± 31%  sched_debug.cpu.sched_count.11
>      12453 ± 90%     -74.6%       3159 ± 24%  sched_debug.cpu.sched_count.2
>       7782 ± 32%    +118.2%      16977 ± 46%  sched_debug.cpu.sched_count.20
>       2665 ± 48%     -49.8%       1337 ± 30%  sched_debug.cpu.sched_count.32
>       1365 ± 11%     -14.0%       1174 ±  3%  sched_debug.cpu.sched_count.45
>       1693 ± 51%    +147.7%       4193 ± 42%  sched_debug.cpu.sched_count.51
>       5023 ± 57%     -51.5%       2434 ± 43%  sched_debug.cpu.sched_count.57
>       1705 ± 16%    +129.6%       3915 ± 48%  sched_debug.cpu.sched_goidle.23
>     536.25 ± 14%     -18.7%     435.75 ±  2%  sched_debug.cpu.sched_goidle.45
>       1228 ± 19%     -27.3%     892.50 ± 17%  sched_debug.cpu.sched_goidle.5
>       1919 ± 55%     +88.5%       3617 ± 37%  sched_debug.cpu.ttwu_count.11
>       7699 ± 35%     -43.7%       4335 ± 43%  sched_debug.cpu.ttwu_count.24
>       5380 ± 36%     -45.6%       2926 ± 18%  sched_debug.cpu.ttwu_count.30
>     563.25 ± 20%    +140.3%       1353 ± 38%  sched_debug.cpu.ttwu_local.11
>       4297 ± 46%     -49.1%       2186 ± 39%  sched_debug.cpu.ttwu_local.24
>       2828 ± 47%     -47.8%       1475 ± 34%  sched_debug.cpu.ttwu_local.27
>       3243 ± 36%     -54.3%       1482 ± 32%  sched_debug.cpu.ttwu_local.30
>     199.25 ±  6%    +100.6%     399.75 ± 32%  sched_debug.cpu.ttwu_local.44
>       1158 ± 64%     -67.3%     379.00 ± 46%  sched_debug.cpu.ttwu_local.54
>     242.25 ± 21%     +51.0%     365.75 ± 19%  sched_debug.cpu.ttwu_local.55
>       1009 ± 26%     -50.8%     496.50 ± 44%  sched_debug.cpu.ttwu_local.59
>       1736 ± 53%     -67.8%     559.25 ± 22%  sched_debug.cpu.ttwu_local.9
> 
> 
> lkp-sbx04: Sandy Bridge-EX
> Memory: 64G
> 
> 
>    perf-profile.cycles.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
> 
>   76 ++---------------------------------------------------------------------+
>      |                                                                      |
>   74 ++   .*..        .*..*..*..     .*..    .*..  .*..  .*..  .*..*..*..*  |
>      *..*.    *..*..*.          *..*.    *.*.    *.    *.    *.             |
>      |                                                                      |
>   72 ++                                                                     |
>      |                                                                      |
>   70 ++                                                                     |
>      |                                                                      |
>   68 ++                                                                     |
>      |                                                                      |
>      |                          O  O  O    O  O  O  O     O  O     O  O     |
>   66 O+                                  O             O        O        O  O
>      |  O  O  O  O     O  O  O                                              |
>   64 ++-------------O-------------------------------------------------------+
> 
> 
> 
>                              will-it-scale.per_process_ops
> 
>   6.6e+06 O+----O-O--O------------------------------------------------------+
>           |  O          O  O O  O                                           |
>   6.4e+06 ++                       O  O O  O  O  O O  O  O  O O  O  O  O O  O
>           |                                                                 |
>   6.2e+06 ++                                                                |
>     6e+06 ++                                                                |
>           |                                                                 |
>   5.8e+06 ++                                                                |
>           |                                                                 |
>   5.6e+06 ++                                                                |
>   5.4e+06 ++                                                                |
>           |                                                                 |
>   5.2e+06 ++                                                                |
>           *..*..*.*..*..*..*.*..*..*..*.*..*..*..*.*..*..*..*.*..*..*..*.*  |
>     5e+06 ++----------------------------------------------------------------+
> 
> 
> 	[*] bisect-good sample
> 	[O] bisect-bad  sample
> 
> To reproduce:
> 
>         git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>         cd lkp-tests
>         bin/lkp install job.yaml  # job file is attached in this email
>         bin/lkp run     job.yaml
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 
> Thanks,
> Ying Huang
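
For reading the comparison table quoted above: the %change column appears to be
the relative difference between the mean for the first listed commit
(8ad7b378d0d0, left) and the mean for the second (65d8fc777f6d, right). A
minimal sketch that reproduces the headline figures under that assumption;
pct_change is a hypothetical helper name, and awk does the floating-point
arithmetic:

```shell
#!/bin/sh
# pct_change BASE NEW
# Hypothetical helper: print the relative change from BASE to NEW as a
# signed percentage with one decimal place, e.g. "+25.6%".
pct_change() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%+.1f%%\n", (new - base) / base * 100 }'
}

# Values copied from the quoted table:
pct_change 5076304 6374220   # will-it-scale.per_process_ops -> +25.6%
pct_change 1194117 1366153   # will-it-scale.per_thread_ops  -> +14.4%
```

Rows whose printed values are heavily rounded need not reproduce exactly: for
will-it-scale.scalability the printed 0.58 and 0.57 give about -1.7% rather
than the reported -2.0%, presumably because the report works from unrounded
means.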

> ---
> LKP_SERVER: inn
> LKP_CGI_PORT: 80
> LKP_CIFS_PORT: 139
> testcase: will-it-scale
> default-monitors:
>   wait: activate-monitor
>   kmsg: 
>   uptime: 
>   iostat: 
>   heartbeat: 
>   vmstat: 
>   numa-numastat: 
>   numa-vmstat: 
>   numa-meminfo: 
>   proc-vmstat: 
>   proc-stat:
>     interval: 10
>   meminfo: 
>   slabinfo: 
>   interrupts: 
>   lock_stat: 
>   latency_stats: 
>   softirqs: 
>   bdi_dev_mapping: 
>   diskstats: 
>   nfsstat: 
>   cpuidle: 
>   cpufreq-stats: 
>   turbostat: 
>   pmeter: 
>   sched_debug:
>     interval: 60
> cpufreq_governor: performance
> default-watchdogs:
>   oom-killer: 
>   watchdog: 
> commit: 65d8fc777f6dcfee12785c057a6b57f679641c90
> model: Sandy Bridge-EX
> nr_cpu: 64
> memory: 64G
> nr_ssd_partitions: 7
> ssd_partitions: "/dev/disk/by-id/ata-INTEL_SSDSC2*-part1"
> swap_partitions: 
> category: benchmark
> perf-profile:
>   freq: 800
> will-it-scale:
>   test: futex1
> queue: bisect
> testbox: lkp-sbx04
> tbox_group: lkp-sbx04
> kconfig: x86_64-rhel
> enqueue_time: 2016-02-28 23:45:52.199165563 +08:00
> compiler: gcc-4.9
> rootfs: debian-x86_64-2015-02-07.cgz
> id: 6b2c2bd744dd898009648cb82de7e0ba77de33f1
> user: lkp
> head_commit: ed520c327c4259ec08b1677023087f658329b961
> base_commit: 81f70ba233d5f660e1ea5fe23260ee323af5d53a
> branch: linux-devel/devel-hourly-2016022811
> result_root: "/result/will-it-scale/performance-futex1/lkp-sbx04/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/0"
> job_file: "/lkp/scheduled/lkp-sbx04/bisect_will-it-scale-performance-futex1-debian-x86_64-2015-02-07.cgz-x86_64-rhel-65d8fc777f6dcfee12785c057a6b57f679641c90-20160228-23650-17c4qc1-0.yaml"
> max_uptime: 1500
> initrd: "/osimage/debian/debian-x86_64-2015-02-07.cgz"
> bootloader_append:
> - root=/dev/ram0
> - user=lkp
> - job=/lkp/scheduled/lkp-sbx04/bisect_will-it-scale-performance-futex1-debian-x86_64-2015-02-07.cgz-x86_64-rhel-65d8fc777f6dcfee12785c057a6b57f679641c90-20160228-23650-17c4qc1-0.yaml
> - ARCH=x86_64
> - kconfig=x86_64-rhel
> - branch=linux-devel/devel-hourly-2016022811
> - commit=65d8fc777f6dcfee12785c057a6b57f679641c90
> - BOOT_IMAGE=/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/vmlinuz-4.5.0-rc3-00235-g65d8fc7
> - max_uptime=1500
> - RESULT_ROOT=/result/will-it-scale/performance-futex1/lkp-sbx04/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/0
> - LKP_SERVER=inn
> - |2-
> 
> 
>   earlyprintk=ttyS0,115200 systemd.log_level=err
>   debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100
>   panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0
>   console=ttyS0,115200 console=tty0 vga=normal
> 
>   rw
> lkp_initrd: "/lkp/lkp/lkp-x86_64.cgz"
> modules_initrd: "/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/modules.cgz"
> bm_initrd: "/osimage/deps/debian-x86_64-2015-02-07.cgz/lkp.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/run-ipconfig.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/turbostat.cgz,/lkp/benchmarks/turbostat.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/will-it-scale.cgz,/lkp/benchmarks/will-it-scale.cgz,/lkp/benchmarks/will-it-scale-x86_64.cgz"
> linux_headers_initrd: "/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/linux-headers.cgz"
> repeat_to: 2
> kernel: "/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/vmlinuz-4.5.0-rc3-00235-g65d8fc7"
> dequeue_time: 2016-02-28 23:46:33.938915178 +08:00
> job_state: finished
> loadavg: 45.27 20.12 7.84 2/649 11559
> start_time: '1456674445'
> end_time: '1456674754'
> version: "/lkp/lkp/.src-20160226-194908"

> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu16/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu17/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu18/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu19/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu20/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu21/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu22/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu23/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu24/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu25/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu26/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu27/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu28/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu29/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu30/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu31/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu32/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu33/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu34/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu35/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu36/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu37/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu38/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu39/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu40/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu41/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu42/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu43/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu44/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu45/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu46/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu47/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu48/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu49/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu50/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu51/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu52/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu53/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu54/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu55/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu56/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu57/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu58/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu59/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu60/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu61/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu62/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu63/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor
> 2016-02-28 23:47:25 ./runtest.py futex1 16 both 1 8 16 24 32 48 64

* Re: [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
@ 2016-02-29  9:37   ` Ingo Molnar
  0 siblings, 0 replies; 28+ messages in thread
From: Ingo Molnar @ 2016-02-29  9:37 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 20813 bytes --]


* kernel test robot <ying.huang@linux.intel.com> wrote:

> FYI, we noticed the below changes on
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master 
> commit 65d8fc777f6dcfee12785c057a6b57f679641c90 ("futex: Remove requirement for 
> lock_page() in get_futex_key()")

I have asked for this before, but let me try again: could you _PLEASE_ make these 
emails more readable?

For example, what are the 'below changes'? Changes in the profile output? Profiles
always change from run to run, so that alone is not informative.

Also, there are a lot of changes - which ones prompted the email to be generated?

All in all, this email is hard to parse, because it just dumps a lot of
information with very little explanatory structure for someone not versed in the
format. Please try to create an easy-to-parse 'story' that leads the reader
towards what you want these emails to tell, not just a raw dump of seemingly
unconnected pieces of data ...

Thanks,

	Ingo

> 
> 
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
>   gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/lkp-sbx04/futex1/will-it-scale
> 
> commit: 
>   8ad7b378d0d016309014cae0f640434bca7b5e11
>   65d8fc777f6dcfee12785c057a6b57f679641c90
> 
> 8ad7b378d0d01630 65d8fc777f6dcfee12785c057a 
> ---------------- -------------------------- 
>          %stddev     %change         %stddev
>              \          |                \  
>    5076304 ±  0%     +25.6%    6374220 ±  0%  will-it-scale.per_process_ops
>    1194117 ±  0%     +14.4%    1366153 ±  1%  will-it-scale.per_thread_ops
>       0.58 ±  0%      -2.0%       0.57 ±  0%  will-it-scale.scalability
>       6820 ±  0%     -19.6%       5483 ± 15%  meminfo.AnonHugePages
>       2652 ±  5%     -10.4%       2375 ±  2%  vmstat.system.cs
>       2848 ± 32%    +141.2%       6870 ± 65%  numa-meminfo.node1.Active(anon)
>       2832 ± 31%     +57.6%       4465 ± 27%  numa-meminfo.node1.AnonPages
>      15018 ± 12%     -23.3%      11515 ± 15%  numa-meminfo.node2.AnonPages
>       1214 ± 14%     -22.8%     936.75 ± 20%  numa-meminfo.node3.PageTables
>     712.25 ± 32%    +141.2%       1718 ± 65%  numa-vmstat.node1.nr_active_anon
>     708.25 ± 31%     +57.7%       1116 ± 27%  numa-vmstat.node1.nr_anon_pages
>       3754 ± 12%     -23.3%       2879 ± 15%  numa-vmstat.node2.nr_anon_pages
>     304.75 ± 14%     -23.1%     234.50 ± 20%  numa-vmstat.node3.nr_page_table_pages
>       3.53 ±  1%    -100.0%       0.00 ± -1%  perf-profile.cycles.___might_sleep.__might_sleep.get_futex_key.futex_wake.do_futex
>       4.34 ±  1%    -100.0%       0.00 ± -1%  perf-profile.cycles.__might_sleep.get_futex_key.futex_wake.do_futex.sys_futex
>       1.27 ±  3%    -100.0%       0.00 ± -1%  perf-profile.cycles.__wake_up_bit.unlock_page.get_futex_key.futex_wake.do_futex
>       4.36 ±  1%     +29.6%       5.65 ±  1%  perf-profile.cycles.drop_futex_key_refs.isra.12.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>       6.69 ±  1%     +28.1%       8.57 ±  0%  perf-profile.cycles.entry_SYSCALL_64
>       6.73 ±  0%     +30.6%       8.79 ±  0%  perf-profile.cycles.entry_SYSCALL_64_after_swapgs
>      74.21 ±  0%     -11.0%      66.06 ±  0%  perf-profile.cycles.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>      59.05 ±  0%     -21.4%      46.40 ±  0%  perf-profile.cycles.get_futex_key.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>       4.12 ±  0%     +78.5%       7.36 ±  1%  perf-profile.cycles.get_futex_key_refs.isra.11.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>       2.27 ±  3%     +24.1%       2.82 ±  4%  perf-profile.cycles.get_user_pages_fast.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>      26.95 ±  0%     +30.0%      35.04 ±  1%  perf-profile.cycles.get_user_pages_fast.get_futex_key.futex_wake.do_futex.sys_futex
>      13.43 ±  0%     +27.2%      17.09 ±  1%  perf-profile.cycles.gup_pte_range.gup_pud_range.get_user_pages_fast.get_futex_key.futex_wake
>      19.66 ±  1%     +28.4%      25.24 ±  0%  perf-profile.cycles.gup_pud_range.get_user_pages_fast.get_futex_key.futex_wake.do_futex
>       4.33 ±  1%     +37.0%       5.93 ±  4%  perf-profile.cycles.hash_futex.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>      13.59 ±  0%    -100.0%       0.00 ± -1%  perf-profile.cycles.unlock_page.get_futex_key.futex_wake.do_futex.sys_futex
>      15160 ± 19%     -34.8%       9883 ±  0%  sched_debug.cfs_rq:/.exec_clock.min
>      27.25 ± 15%     -37.6%      17.00 ±  8%  sched_debug.cfs_rq:/.load_avg.7
>      21.00 ± 38%     -27.4%      15.25 ±  2%  sched_debug.cpu.cpu_load[2].1
>      21.00 ± 38%     -27.4%      15.25 ±  2%  sched_debug.cpu.cpu_load[3].1
>      21.00 ± 38%     -27.4%      15.25 ±  2%  sched_debug.cpu.cpu_load[4].1
>       1790 ±  0%     +42.4%       2549 ± 45%  sched_debug.cpu.curr->pid.21
>      50033 ±  4%      -6.8%      46622 ±  4%  sched_debug.cpu.nr_load_updates.29
>       4398 ± 42%    +103.5%       8949 ± 23%  sched_debug.cpu.nr_switches.11
>       7452 ± 34%    +111.3%      15744 ± 54%  sched_debug.cpu.nr_switches.20
>       3739 ± 13%    +213.5%      11723 ± 40%  sched_debug.cpu.nr_switches.23
>       1648 ± 53%     +96.5%       3239 ± 63%  sched_debug.cpu.nr_switches.51
>       0.25 ±519%   -1300.0%      -3.00 ±-52%  sched_debug.cpu.nr_uninterruptible.24
>       8632 ± 16%     -32.5%       5823 ± 19%  sched_debug.cpu.sched_count.1
>       5091 ± 36%    +137.5%      12092 ± 31%  sched_debug.cpu.sched_count.11
>      12453 ± 90%     -74.6%       3159 ± 24%  sched_debug.cpu.sched_count.2
>       7782 ± 32%    +118.2%      16977 ± 46%  sched_debug.cpu.sched_count.20
>       2665 ± 48%     -49.8%       1337 ± 30%  sched_debug.cpu.sched_count.32
>       1365 ± 11%     -14.0%       1174 ±  3%  sched_debug.cpu.sched_count.45
>       1693 ± 51%    +147.7%       4193 ± 42%  sched_debug.cpu.sched_count.51
>       5023 ± 57%     -51.5%       2434 ± 43%  sched_debug.cpu.sched_count.57
>       1705 ± 16%    +129.6%       3915 ± 48%  sched_debug.cpu.sched_goidle.23
>     536.25 ± 14%     -18.7%     435.75 ±  2%  sched_debug.cpu.sched_goidle.45
>       1228 ± 19%     -27.3%     892.50 ± 17%  sched_debug.cpu.sched_goidle.5
>       1919 ± 55%     +88.5%       3617 ± 37%  sched_debug.cpu.ttwu_count.11
>       7699 ± 35%     -43.7%       4335 ± 43%  sched_debug.cpu.ttwu_count.24
>       5380 ± 36%     -45.6%       2926 ± 18%  sched_debug.cpu.ttwu_count.30
>     563.25 ± 20%    +140.3%       1353 ± 38%  sched_debug.cpu.ttwu_local.11
>       4297 ± 46%     -49.1%       2186 ± 39%  sched_debug.cpu.ttwu_local.24
>       2828 ± 47%     -47.8%       1475 ± 34%  sched_debug.cpu.ttwu_local.27
>       3243 ± 36%     -54.3%       1482 ± 32%  sched_debug.cpu.ttwu_local.30
>     199.25 ±  6%    +100.6%     399.75 ± 32%  sched_debug.cpu.ttwu_local.44
>       1158 ± 64%     -67.3%     379.00 ± 46%  sched_debug.cpu.ttwu_local.54
>     242.25 ± 21%     +51.0%     365.75 ± 19%  sched_debug.cpu.ttwu_local.55
>       1009 ± 26%     -50.8%     496.50 ± 44%  sched_debug.cpu.ttwu_local.59
>       1736 ± 53%     -67.8%     559.25 ± 22%  sched_debug.cpu.ttwu_local.9
> 
> 
> lkp-sbx04: Sandy Bridge-EX
> Memory: 64G
> 
> 
>    perf-profile.cycles.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
> 
>   76 ++---------------------------------------------------------------------+
>      |                                                                      |
>   74 ++   .*..        .*..*..*..     .*..    .*..  .*..  .*..  .*..*..*..*  |
>      *..*.    *..*..*.          *..*.    *.*.    *.    *.    *.             |
>      |                                                                      |
>   72 ++                                                                     |
>      |                                                                      |
>   70 ++                                                                     |
>      |                                                                      |
>   68 ++                                                                     |
>      |                                                                      |
>      |                          O  O  O    O  O  O  O     O  O     O  O     |
>   66 O+                                  O             O        O        O  O
>      |  O  O  O  O     O  O  O                                              |
>   64 ++-------------O-------------------------------------------------------+
> 
> 
> 
>                              will-it-scale.per_process_ops
> 
>   6.6e+06 O+----O-O--O------------------------------------------------------+
>           |  O          O  O O  O                                           |
>   6.4e+06 ++                       O  O O  O  O  O O  O  O  O O  O  O  O O  O
>           |                                                                 |
>   6.2e+06 ++                                                                |
>     6e+06 ++                                                                |
>           |                                                                 |
>   5.8e+06 ++                                                                |
>           |                                                                 |
>   5.6e+06 ++                                                                |
>   5.4e+06 ++                                                                |
>           |                                                                 |
>   5.2e+06 ++                                                                |
>           *..*..*.*..*..*..*.*..*..*..*.*..*..*..*.*..*..*..*.*..*..*..*.*  |
>     5e+06 ++----------------------------------------------------------------+
> 
> 
> 	[*] bisect-good sample
> 	[O] bisect-bad  sample
> 
> To reproduce:
> 
>         git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>         cd lkp-tests
>         bin/lkp install job.yaml  # job file is attached in this email
>         bin/lkp run     job.yaml
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 
> Thanks,
> Ying Huang

> ---
> LKP_SERVER: inn
> LKP_CGI_PORT: 80
> LKP_CIFS_PORT: 139
> testcase: will-it-scale
> default-monitors:
>   wait: activate-monitor
>   kmsg: 
>   uptime: 
>   iostat: 
>   heartbeat: 
>   vmstat: 
>   numa-numastat: 
>   numa-vmstat: 
>   numa-meminfo: 
>   proc-vmstat: 
>   proc-stat:
>     interval: 10
>   meminfo: 
>   slabinfo: 
>   interrupts: 
>   lock_stat: 
>   latency_stats: 
>   softirqs: 
>   bdi_dev_mapping: 
>   diskstats: 
>   nfsstat: 
>   cpuidle: 
>   cpufreq-stats: 
>   turbostat: 
>   pmeter: 
>   sched_debug:
>     interval: 60
> cpufreq_governor: performance
> default-watchdogs:
>   oom-killer: 
>   watchdog: 
> commit: 65d8fc777f6dcfee12785c057a6b57f679641c90
> model: Sandy Bridge-EX
> nr_cpu: 64
> memory: 64G
> nr_ssd_partitions: 7
> ssd_partitions: "/dev/disk/by-id/ata-INTEL_SSDSC2*-part1"
> swap_partitions: 
> category: benchmark
> perf-profile:
>   freq: 800
> will-it-scale:
>   test: futex1
> queue: bisect
> testbox: lkp-sbx04
> tbox_group: lkp-sbx04
> kconfig: x86_64-rhel
> enqueue_time: 2016-02-28 23:45:52.199165563 +08:00
> compiler: gcc-4.9
> rootfs: debian-x86_64-2015-02-07.cgz
> id: 6b2c2bd744dd898009648cb82de7e0ba77de33f1
> user: lkp
> head_commit: ed520c327c4259ec08b1677023087f658329b961
> base_commit: 81f70ba233d5f660e1ea5fe23260ee323af5d53a
> branch: linux-devel/devel-hourly-2016022811
> result_root: "/result/will-it-scale/performance-futex1/lkp-sbx04/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/0"
> job_file: "/lkp/scheduled/lkp-sbx04/bisect_will-it-scale-performance-futex1-debian-x86_64-2015-02-07.cgz-x86_64-rhel-65d8fc777f6dcfee12785c057a6b57f679641c90-20160228-23650-17c4qc1-0.yaml"
> max_uptime: 1500
> initrd: "/osimage/debian/debian-x86_64-2015-02-07.cgz"
> bootloader_append:
> - root=/dev/ram0
> - user=lkp
> - job=/lkp/scheduled/lkp-sbx04/bisect_will-it-scale-performance-futex1-debian-x86_64-2015-02-07.cgz-x86_64-rhel-65d8fc777f6dcfee12785c057a6b57f679641c90-20160228-23650-17c4qc1-0.yaml
> - ARCH=x86_64
> - kconfig=x86_64-rhel
> - branch=linux-devel/devel-hourly-2016022811
> - commit=65d8fc777f6dcfee12785c057a6b57f679641c90
> - BOOT_IMAGE=/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/vmlinuz-4.5.0-rc3-00235-g65d8fc7
> - max_uptime=1500
> - RESULT_ROOT=/result/will-it-scale/performance-futex1/lkp-sbx04/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/0
> - LKP_SERVER=inn
> - |2-
> 
> 
>   earlyprintk=ttyS0,115200 systemd.log_level=err
>   debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100
>   panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0
>   console=ttyS0,115200 console=tty0 vga=normal
> 
>   rw
> lkp_initrd: "/lkp/lkp/lkp-x86_64.cgz"
> modules_initrd: "/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/modules.cgz"
> bm_initrd: "/osimage/deps/debian-x86_64-2015-02-07.cgz/lkp.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/run-ipconfig.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/turbostat.cgz,/lkp/benchmarks/turbostat.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/will-it-scale.cgz,/lkp/benchmarks/will-it-scale.cgz,/lkp/benchmarks/will-it-scale-x86_64.cgz"
> linux_headers_initrd: "/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/linux-headers.cgz"
> repeat_to: 2
> kernel: "/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/vmlinuz-4.5.0-rc3-00235-g65d8fc7"
> dequeue_time: 2016-02-28 23:46:33.938915178 +08:00
> job_state: finished
> loadavg: 45.27 20.12 7.84 2/649 11559
> start_time: '1456674445'
> end_time: '1456674754'
> version: "/lkp/lkp/.src-20160226-194908"

> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu16/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu17/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu18/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu19/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu20/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu21/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu22/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu23/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu24/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu25/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu26/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu27/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu28/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu29/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu30/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu31/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu32/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu33/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu34/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu35/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu36/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu37/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu38/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu39/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu40/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu41/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu42/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu43/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu44/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu45/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu46/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu47/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu48/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu49/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu50/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu51/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu52/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu53/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu54/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu55/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu56/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu57/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu58/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu59/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu60/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu61/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu62/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu63/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor
> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor
> 2016-02-28 23:47:25 ./runtest.py futex1 16 both 1 8 16 24 32 48 64


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [lkp] [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
  2016-02-29  9:37   ` Ingo Molnar
@ 2016-02-29 17:37     ` Davidlohr Bueso
  -1 siblings, 0 replies; 28+ messages in thread
From: Davidlohr Bueso @ 2016-02-29 17:37 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: kernel test robot, Mel Gorman, lkp, LKML,
	Sebastian Andrzej Siewior, Peter Zijlstra, Mel Gorman,
	Linus Torvalds, Hugh Dickins, Darren Hart, Chris Mason,
	Thomas Gleixner, Davidlohr Bueso, Peter Zijlstra

On Mon, 29 Feb 2016, Ingo Molnar wrote:

>
>* kernel test robot <ying.huang@linux.intel.com> wrote:
>
>> FYI, we noticed the below changes on
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
>> commit 65d8fc777f6dcfee12785c057a6b57f679641c90 ("futex: Remove requirement for
>> lock_page() in get_futex_key()")
>
>I have asked for this before, but let me try again: could you _PLEASE_ make these
>emails more readable?
>
>For example what are the 'below changes'? Changes in the profile output? Profiles
>always change from run to run, so that alone is not informative.
>
>Also, there are a lot of changes - which ones prompted the email to be generated?
>
>All in one, this email is hard to parse, because it just dumps a lot of
>information with very little explanatory structure for someone not versed in their
>format. Please try to create an easy to parse 'story' that leads the reader
>towards what you want these emails to tell - not just a raw dump of seemingly
>unconnected pieces of data ...
>
>Thanks,
>
>	Ingo
>
>>
>>
>> =========================================================================================
>> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
>>   gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/lkp-sbx04/futex1/will-it-scale

If I'm reading this correctly, it is similar to what I measured wrt the ~lockless get_futex_key()
work using the perf runs, with similar performance improvement numbers (per-process/thread ops).
The futex1 test just pounds on FUTEX_WAKE without anyone actually blocked on a futex, so it
mainly measures the key/hashing part of the operation.
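
To make the point about futex1 concrete, here is a minimal sketch of that kind of loop. It is not the will-it-scale source; the helper names futex_wake_one() and pound_futex_wake() are hypothetical, and the choice of a non-private FUTEX_WAKE is an assumption made so that the call goes through the shared-key lookup (get_futex_key() and get_user_pages_fast()) that shows up in the profile above.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Issue one FUTEX_WAKE on uaddr, asking to wake at most one waiter.
 * No FUTEX_PRIVATE_FLAG is set (an assumption of this sketch).
 * Returns the number of waiters woken, or -1 on error.
 */
static long futex_wake_one(uint32_t *uaddr)
{
	return syscall(SYS_futex, uaddr, FUTEX_WAKE, 1, NULL, NULL, 0);
}

/*
 * Hypothetical helper: call FUTEX_WAKE `iterations` times on a word
 * that nobody is waiting on.  Returns the total number of waiters
 * woken, or -1 on the first failing call.
 */
static long pound_futex_wake(unsigned long iterations)
{
	uint32_t word = 0;
	long woken = 0;

	for (unsigned long i = 0; i < iterations; i++) {
		long ret = futex_wake_one(&word);

		if (ret < 0)
			return -1;
		woken += ret;
	}
	return woken;
}
```

Since no task ever blocks on the word, every call should wake zero waiters, so the time goes into computing and hashing the futex key rather than into any wakeup.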

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [LKP] [lkp] [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
  2016-02-29  9:37   ` Ingo Molnar
@ 2016-03-01  8:38     ` Huang, Ying
  -1 siblings, 0 replies; 28+ messages in thread
From: Huang, Ying @ 2016-03-01  8:38 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Chris Mason, Peter Zijlstra, lkp,
	Sebastian Andrzej Siewior, Davidlohr Bueso, Hugh Dickins, LKML,
	Linus Torvalds, Mel Gorman, Thomas Gleixner, Mel Gorman,
	Darren Hart, Wu Fengguang

Hi, Ingo,

Ingo Molnar <mingo@kernel.org> writes:

> * kernel test robot <ying.huang@linux.intel.com> wrote:
>
>> FYI, we noticed the below changes on
>> 
>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master 
>> commit 65d8fc777f6dcfee12785c057a6b57f679641c90 ("futex: Remove requirement for 
>> lock_page() in get_futex_key()")
>
> I have asked for this before, but let me try again: could you _PLEASE_ make these 
> emails more readable?
>
> For example what are the 'below changes'? Changes in the profile output? Profiles 
> always change from run to run, so that alone is not informative.
>
> Also, there are a lot of changes - which ones prompted the email to be generated?
>
> All in one, this email is hard to parse, because it just dumps a lot of 
> information with very little explanatory structure for someone not versed in their 
> format. Please try to create an easy to parse 'story' that leads the reader 
> towards what you want these emails to tell - not just a raw dump of seemingly 
> unconnected pieces of data ...

Your input is valuable to us.  We are discussing how to improve our
reporting to make it more helpful for kernel developers.  We will get
back to you soon on this.

Best Regards,
Huang, Ying

> Thanks,
>
> 	Ingo
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [LKP] [lkp] [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
  2016-02-29  9:37   ` Ingo Molnar
@ 2016-03-18  6:12     ` Huang, Ying
  -1 siblings, 0 replies; 28+ messages in thread
From: Huang, Ying @ 2016-03-18  6:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Chris Mason, Peter Zijlstra, lkp,
	Sebastian Andrzej Siewior, Davidlohr Bueso, Hugh Dickins, LKML,
	Linus Torvalds, Mel Gorman, Thomas Gleixner, Mel Gorman,
	Darren Hart, Wu Fengguang


Sorry for the late reply.

Ingo Molnar <mingo@kernel.org> writes:
> * kernel test robot <ying.huang@linux.intel.com> wrote:
>
>> FYI, we noticed the below changes on
>> 
>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master 
>> commit 65d8fc777f6dcfee12785c057a6b57f679641c90 ("futex: Remove requirement for 
>> lock_page() in get_futex_key()")
>
> I have asked for this before, but let me try again: could you _PLEASE_ make these 
> emails more readable?
>
> For example what are the 'below changes'? Changes in the profile output? Profiles 
> always change from run to run, so that alone is not informative.
>
> Also, there are a lot of changes - which ones prompted the email to be generated?

Usually we put the change we think is most important in the subject of
the mail.  For this email, it is

+25.6% will-it-scale.per_process_ops

and we try to put the most important changes at the top of the comparison
result below.  Those are the will-it-scale.xxx lines below.
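
As an illustration of how the %change column relates to the two mean values on a row, here is a small sketch. It assumes the column is the plain relative difference between the base-commit and head-commit means; the function name percent_change() is hypothetical and is not taken from the lkp-tests code.

```c
/*
 * Relative change from the base-commit mean to the head-commit mean,
 * in percent.  Assumed definition: (head - base) / base * 100.
 */
static double percent_change(double base, double head)
{
	return (head - base) / base * 100.0;
}
```

With the per_process_ops means from the table, percent_change(5076304, 6374220) comes out at about 25.57, which matches the +25.6% in the subject after rounding.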

We are thinking about how to improve this.  Your input is valuable to
us.  We are thinking of changing the "below changes" line to something
like the following:

FYI, we noticed the +25.6% will-it-scale.per_process_ops improvement on

...

Does this look better?

> All in one, this email is hard to parse, because it just dumps a lot of 
> information with very little explanatory structure for someone not versed in their 
> format. Please try to create an easy to parse 'story' that leads the reader 
> towards what you want these emails to tell - not just a raw dump of seemingly 
> unconnected pieces of data ...

What kind of story?  Something like: we ran some benchmark on some
machine, which showed a regression; we bisected and found that the
first bad commit is xxx.  In addition to the benchmark result, we collected
some other information that we hope is helpful to you.

We are just trying to help.  Sorry for the confusion.  We try to provide
all the information needed to describe the change and to help root-cause
it.  But we know we are still far from that, so your input is
important for us to improve our test reports.  Which part of the report
do you think should be changed first?  The overall structure?  Or the
data format?

Best Regards,
Huang, Ying

> Thanks,
>
> 	Ingo
>
>> 
>> 
>> =========================================================================================
>> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
>>   gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/lkp-sbx04/futex1/will-it-scale
>> 
>> commit: 
>>   8ad7b378d0d016309014cae0f640434bca7b5e11
>>   65d8fc777f6dcfee12785c057a6b57f679641c90
>> 
>> 8ad7b378d0d01630 65d8fc777f6dcfee12785c057a 
>> ---------------- -------------------------- 
>>          %stddev     %change         %stddev
>>              \          |                \  
>>    5076304 ±  0%     +25.6%    6374220 ±  0%  will-it-scale.per_process_ops
>>    1194117 ±  0%     +14.4%    1366153 ±  1%  will-it-scale.per_thread_ops
>>       0.58 ±  0%      -2.0%       0.57 ±  0%  will-it-scale.scalability
>>       6820 ±  0%     -19.6%       5483 ± 15%  meminfo.AnonHugePages
>>       2652 ±  5%     -10.4%       2375 ±  2%  vmstat.system.cs
>>       2848 ± 32%    +141.2%       6870 ± 65%  numa-meminfo.node1.Active(anon)
>>       2832 ± 31%     +57.6%       4465 ± 27%  numa-meminfo.node1.AnonPages
>>      15018 ± 12%     -23.3%      11515 ± 15%  numa-meminfo.node2.AnonPages
>>       1214 ± 14%     -22.8%     936.75 ± 20%  numa-meminfo.node3.PageTables
>>     712.25 ± 32%    +141.2%       1718 ± 65%  numa-vmstat.node1.nr_active_anon
>>     708.25 ± 31%     +57.7%       1116 ± 27%  numa-vmstat.node1.nr_anon_pages
>>       3754 ± 12%     -23.3%       2879 ± 15%  numa-vmstat.node2.nr_anon_pages
>>     304.75 ± 14%     -23.1%     234.50 ± 20%  numa-vmstat.node3.nr_page_table_pages
>>       3.53 ±  1%    -100.0%       0.00 ± -1%  perf-profile.cycles.___might_sleep.__might_sleep.get_futex_key.futex_wake.do_futex
>>       4.34 ±  1%    -100.0%       0.00 ± -1%  perf-profile.cycles.__might_sleep.get_futex_key.futex_wake.do_futex.sys_futex
>>       1.27 ±  3%    -100.0%       0.00 ± -1%  perf-profile.cycles.__wake_up_bit.unlock_page.get_futex_key.futex_wake.do_futex
>>       4.36 ±  1%     +29.6%       5.65 ±  1%  perf-profile.cycles.drop_futex_key_refs.isra.12.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>>       6.69 ±  1%     +28.1%       8.57 ±  0%  perf-profile.cycles.entry_SYSCALL_64
>>       6.73 ±  0%     +30.6%       8.79 ±  0%  perf-profile.cycles.entry_SYSCALL_64_after_swapgs
>>      74.21 ±  0%     -11.0%      66.06 ±  0%  perf-profile.cycles.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>>      59.05 ±  0%     -21.4%      46.40 ±  0%  perf-profile.cycles.get_futex_key.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>>       4.12 ±  0%     +78.5%       7.36 ±  1%  perf-profile.cycles.get_futex_key_refs.isra.11.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>>       2.27 ±  3%     +24.1%       2.82 ±  4%  perf-profile.cycles.get_user_pages_fast.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>>      26.95 ±  0%     +30.0%      35.04 ±  1%  perf-profile.cycles.get_user_pages_fast.get_futex_key.futex_wake.do_futex.sys_futex
>>      13.43 ±  0%     +27.2%      17.09 ±  1%  perf-profile.cycles.gup_pte_range.gup_pud_range.get_user_pages_fast.get_futex_key.futex_wake
>>      19.66 ±  1%     +28.4%      25.24 ±  0%  perf-profile.cycles.gup_pud_range.get_user_pages_fast.get_futex_key.futex_wake.do_futex
>>       4.33 ±  1%     +37.0%       5.93 ±  4%  perf-profile.cycles.hash_futex.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>>      13.59 ±  0%    -100.0%       0.00 ± -1%  perf-profile.cycles.unlock_page.get_futex_key.futex_wake.do_futex.sys_futex
>>      15160 ± 19%     -34.8%       9883 ±  0%  sched_debug.cfs_rq:/.exec_clock.min
>>      27.25 ± 15%     -37.6%      17.00 ±  8%  sched_debug.cfs_rq:/.load_avg.7
>>      21.00 ± 38%     -27.4%      15.25 ±  2%  sched_debug.cpu.cpu_load[2].1
>>      21.00 ± 38%     -27.4%      15.25 ±  2%  sched_debug.cpu.cpu_load[3].1
>>      21.00 ± 38%     -27.4%      15.25 ±  2%  sched_debug.cpu.cpu_load[4].1
>>       1790 ±  0%     +42.4%       2549 ± 45%  sched_debug.cpu.curr->pid.21
>>      50033 ±  4%      -6.8%      46622 ±  4%  sched_debug.cpu.nr_load_updates.29
>>       4398 ± 42%    +103.5%       8949 ± 23%  sched_debug.cpu.nr_switches.11
>>       7452 ± 34%    +111.3%      15744 ± 54%  sched_debug.cpu.nr_switches.20
>>       3739 ± 13%    +213.5%      11723 ± 40%  sched_debug.cpu.nr_switches.23
>>       1648 ± 53%     +96.5%       3239 ± 63%  sched_debug.cpu.nr_switches.51
>>       0.25 ±519%   -1300.0%      -3.00 ±-52%  sched_debug.cpu.nr_uninterruptible.24
>>       8632 ± 16%     -32.5%       5823 ± 19%  sched_debug.cpu.sched_count.1
>>       5091 ± 36%    +137.5%      12092 ± 31%  sched_debug.cpu.sched_count.11
>>      12453 ± 90%     -74.6%       3159 ± 24%  sched_debug.cpu.sched_count.2
>>       7782 ± 32%    +118.2%      16977 ± 46%  sched_debug.cpu.sched_count.20
>>       2665 ± 48%     -49.8%       1337 ± 30%  sched_debug.cpu.sched_count.32
>>       1365 ± 11%     -14.0%       1174 ±  3%  sched_debug.cpu.sched_count.45
>>       1693 ± 51%    +147.7%       4193 ± 42%  sched_debug.cpu.sched_count.51
>>       5023 ± 57%     -51.5%       2434 ± 43%  sched_debug.cpu.sched_count.57
>>       1705 ± 16%    +129.6%       3915 ± 48%  sched_debug.cpu.sched_goidle.23
>>     536.25 ± 14%     -18.7%     435.75 ±  2%  sched_debug.cpu.sched_goidle.45
>>       1228 ± 19%     -27.3%     892.50 ± 17%  sched_debug.cpu.sched_goidle.5
>>       1919 ± 55%     +88.5%       3617 ± 37%  sched_debug.cpu.ttwu_count.11
>>       7699 ± 35%     -43.7%       4335 ± 43%  sched_debug.cpu.ttwu_count.24
>>       5380 ± 36%     -45.6%       2926 ± 18%  sched_debug.cpu.ttwu_count.30
>>     563.25 ± 20%    +140.3%       1353 ± 38%  sched_debug.cpu.ttwu_local.11
>>       4297 ± 46%     -49.1%       2186 ± 39%  sched_debug.cpu.ttwu_local.24
>>       2828 ± 47%     -47.8%       1475 ± 34%  sched_debug.cpu.ttwu_local.27
>>       3243 ± 36%     -54.3%       1482 ± 32%  sched_debug.cpu.ttwu_local.30
>>     199.25 ±  6%    +100.6%     399.75 ± 32%  sched_debug.cpu.ttwu_local.44
>>       1158 ± 64%     -67.3%     379.00 ± 46%  sched_debug.cpu.ttwu_local.54
>>     242.25 ± 21%     +51.0%     365.75 ± 19%  sched_debug.cpu.ttwu_local.55
>>       1009 ± 26%     -50.8%     496.50 ± 44%  sched_debug.cpu.ttwu_local.59
>>       1736 ± 53%     -67.8%     559.25 ± 22%  sched_debug.cpu.ttwu_local.9
>> 
>> 
>> lkp-sbx04: Sandy Bridge-EX
>> Memory: 64G
>> 
>> 
>>    perf-profile.cycles.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>> 
>>   76 ++---------------------------------------------------------------------+
>>      |                                                                      |
>>   74 ++   .*..        .*..*..*..     .*..    .*..  .*..  .*..  .*..*..*..*  |
>>      *..*.    *..*..*.          *..*.    *.*.    *.    *.    *.             |
>>      |                                                                      |
>>   72 ++                                                                     |
>>      |                                                                      |
>>   70 ++                                                                     |
>>      |                                                                      |
>>   68 ++                                                                     |
>>      |                                                                      |
>>      |                          O  O  O    O  O  O  O     O  O     O  O     |
>>   66 O+                                  O             O        O        O  O
>>      |  O  O  O  O     O  O  O                                              |
>>   64 ++-------------O-------------------------------------------------------+
>> 
>> 
>> 
>>                              will-it-scale.per_process_ops
>> 
>>   6.6e+06 O+----O-O--O------------------------------------------------------+
>>           |  O          O  O O  O                                           |
>>   6.4e+06 ++                       O  O O  O  O  O O  O  O  O O  O  O  O O  O
>>           |                                                                 |
>>   6.2e+06 ++                                                                |
>>     6e+06 ++                                                                |
>>           |                                                                 |
>>   5.8e+06 ++                                                                |
>>           |                                                                 |
>>   5.6e+06 ++                                                                |
>>   5.4e+06 ++                                                                |
>>           |                                                                 |
>>   5.2e+06 ++                                                                |
>>           *..*..*.*..*..*..*.*..*..*..*.*..*..*..*.*..*..*..*.*..*..*..*.*  |
>>     5e+06 ++----------------------------------------------------------------+
>> 
>> 
>> 	[*] bisect-good sample
>> 	[O] bisect-bad  sample
>> 
>> To reproduce:
>> 
>>         git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>>         cd lkp-tests
>>         bin/lkp install job.yaml  # job file is attached in this email
>>         bin/lkp run     job.yaml
>> 
>> 
>> Disclaimer:
>> Results have been estimated based on internal Intel analysis and are provided
>> for informational purposes only. Any difference in system hardware or software
>> design or configuration may affect actual performance.
>> 
>> 
>> Thanks,
>> Ying Huang
>
>> ---
>> LKP_SERVER: inn
>> LKP_CGI_PORT: 80
>> LKP_CIFS_PORT: 139
>> testcase: will-it-scale
>> default-monitors:
>>   wait: activate-monitor
>>   kmsg: 
>>   uptime: 
>>   iostat: 
>>   heartbeat: 
>>   vmstat: 
>>   numa-numastat: 
>>   numa-vmstat: 
>>   numa-meminfo: 
>>   proc-vmstat: 
>>   proc-stat:
>>     interval: 10
>>   meminfo: 
>>   slabinfo: 
>>   interrupts: 
>>   lock_stat: 
>>   latency_stats: 
>>   softirqs: 
>>   bdi_dev_mapping: 
>>   diskstats: 
>>   nfsstat: 
>>   cpuidle: 
>>   cpufreq-stats: 
>>   turbostat: 
>>   pmeter: 
>>   sched_debug:
>>     interval: 60
>> cpufreq_governor: performance
>> default-watchdogs:
>>   oom-killer: 
>>   watchdog: 
>> commit: 65d8fc777f6dcfee12785c057a6b57f679641c90
>> model: Sandy Bridge-EX
>> nr_cpu: 64
>> memory: 64G
>> nr_ssd_partitions: 7
>> ssd_partitions: "/dev/disk/by-id/ata-INTEL_SSDSC2*-part1"
>> swap_partitions: 
>> category: benchmark
>> perf-profile:
>>   freq: 800
>> will-it-scale:
>>   test: futex1
>> queue: bisect
>> testbox: lkp-sbx04
>> tbox_group: lkp-sbx04
>> kconfig: x86_64-rhel
>> enqueue_time: 2016-02-28 23:45:52.199165563 +08:00
>> compiler: gcc-4.9
>> rootfs: debian-x86_64-2015-02-07.cgz
>> id: 6b2c2bd744dd898009648cb82de7e0ba77de33f1
>> user: lkp
>> head_commit: ed520c327c4259ec08b1677023087f658329b961
>> base_commit: 81f70ba233d5f660e1ea5fe23260ee323af5d53a
>> branch: linux-devel/devel-hourly-2016022811
>> result_root: "/result/will-it-scale/performance-futex1/lkp-sbx04/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/0"
>> job_file: "/lkp/scheduled/lkp-sbx04/bisect_will-it-scale-performance-futex1-debian-x86_64-2015-02-07.cgz-x86_64-rhel-65d8fc777f6dcfee12785c057a6b57f679641c90-20160228-23650-17c4qc1-0.yaml"
>> max_uptime: 1500
>> initrd: "/osimage/debian/debian-x86_64-2015-02-07.cgz"
>> bootloader_append:
>> - root=/dev/ram0
>> - user=lkp
>> - job=/lkp/scheduled/lkp-sbx04/bisect_will-it-scale-performance-futex1-debian-x86_64-2015-02-07.cgz-x86_64-rhel-65d8fc777f6dcfee12785c057a6b57f679641c90-20160228-23650-17c4qc1-0.yaml
>> - ARCH=x86_64
>> - kconfig=x86_64-rhel
>> - branch=linux-devel/devel-hourly-2016022811
>> - commit=65d8fc777f6dcfee12785c057a6b57f679641c90
>> - BOOT_IMAGE=/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/vmlinuz-4.5.0-rc3-00235-g65d8fc7
>> - max_uptime=1500
>> - RESULT_ROOT=/result/will-it-scale/performance-futex1/lkp-sbx04/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/0
>> - LKP_SERVER=inn
>> - |2-
>> 
>> 
>>   earlyprintk=ttyS0,115200 systemd.log_level=err
>>   debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100
>>   panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0
>>   console=ttyS0,115200 console=tty0 vga=normal
>> 
>>   rw
>> lkp_initrd: "/lkp/lkp/lkp-x86_64.cgz"
>> modules_initrd: "/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/modules.cgz"
>> bm_initrd: "/osimage/deps/debian-x86_64-2015-02-07.cgz/lkp.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/run-ipconfig.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/turbostat.cgz,/lkp/benchmarks/turbostat.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/will-it-scale.cgz,/lkp/benchmarks/will-it-scale.cgz,/lkp/benchmarks/will-it-scale-x86_64.cgz"
>> linux_headers_initrd: "/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/linux-headers.cgz"
>> repeat_to: 2
>> kernel: "/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/vmlinuz-4.5.0-rc3-00235-g65d8fc7"
>> dequeue_time: 2016-02-28 23:46:33.938915178 +08:00
>> job_state: finished
>> loadavg: 45.27 20.12 7.84 2/649 11559
>> start_time: '1456674445'
>> end_time: '1456674754'
>> version: "/lkp/lkp/.src-20160226-194908"
>
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu16/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu17/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu18/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu19/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu20/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu21/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu22/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu23/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu24/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu25/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu26/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu27/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu28/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu29/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu30/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu31/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu32/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu33/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu34/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu35/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu36/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu37/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu38/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu39/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu40/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu41/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu42/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu43/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu44/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu45/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu46/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu47/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu48/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu49/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu50/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu51/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu52/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu53/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu54/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu55/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu56/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu57/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu58/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu59/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu60/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu61/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu62/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu63/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor
>> 2016-02-28 23:47:25 ./runtest.py futex1 16 both 1 8 16 24 32 48 64
>
> _______________________________________________
> LKP mailing list
> LKP@lists.01.org
> https://lists.01.org/mailman/listinfo/lkp

* Re: [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
@ 2016-03-18  6:12     ` Huang, Ying
  0 siblings, 0 replies; 28+ messages in thread
From: Huang, Ying @ 2016-03-18  6:12 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 22267 bytes --]


Sorry for the late reply.

Ingo Molnar <mingo@kernel.org> writes:
> * kernel test robot <ying.huang@linux.intel.com> wrote:
>
>> FYI, we noticed the below changes on
>> 
>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master 
>> commit 65d8fc777f6dcfee12785c057a6b57f679641c90 ("futex: Remove requirement for 
>> lock_page() in get_futex_key()")
>
> I have asked for this before, but let me try again: could you _PLEASE_ make these 
> emails more readable?
>
> For example what are the 'below changes'? Changes in the profile output? Profiles 
> always change from run to run, so that alone is not informative.
>
> Also, there are a lot of changes - which ones prompted the email to be generated?

Usually we put the change we consider most important in the subject of
the mail.  For this email, it is

+25.6% will-it-scale.per_process_ops

and we try to put the most important changes at the top of the comparison
result below.  Those are the will-it-scale.xxx lines below.
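
A quick way to check the headline figure: the %change column appears to
be the relative difference between the mean values of the two commits.
A minimal Python sketch under that assumption (the helper name
pct_change is illustrative only, it is not something from lkp-tests):

```python
def pct_change(base, new):
    """Relative change from base to new, in percent (assumed definition)."""
    return (new - base) / base * 100.0


# Mean will-it-scale.per_process_ops for the parent commit (8ad7b378d0d0)
# and for 65d8fc777f, copied from the comparison table in the report.
print(f"{pct_change(5076304, 6374220):+.1f}%")  # prints +25.6%
```

Feeding in the two will-it-scale.per_thread_ops means (1194117 and
1366153) gives +14.4% the same way, which matches the report.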

We are thinking about how to improve this.  Your input is valuable to
us.  We are thinking of changing the "below changes" line to something
like the following.

FYI, we noticed the +25.6% will-it-scale.per_process_ops improvement on

...

Does this look better?
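
To make the proposed first line concrete, here is a rough Python sketch
of how such a sentence could be assembled from one comparison row.  The
function name headline and the higher_is_better flag are hypothetical
and not part of lkp-tests; whether a positive change counts as an
improvement depends on the metric, so the caller has to say so.

```python
def headline(metric, base, new, higher_is_better=True):
    """Build the proposed one-line summary for a single metric.

    Hypothetical helper: the wording follows the proposal in this mail,
    the improvement/regression rule is an assumption.
    """
    change = (new - base) / base * 100.0
    # A change in the "good" direction is called an improvement;
    # anything else (including no change) is reported as a regression.
    improved = (change > 0) == higher_is_better
    kind = "improvement" if improved else "regression"
    return f"FYI, we noticed the {change:+.1f}% {metric} {kind} on"


print(headline("will-it-scale.per_process_ops", 5076304, 6374220))
```

With the numbers from this report it prints the line proposed above.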

> All in one, this email is hard to parse, because it just dumps a lot of 
> information with very little explanatory structure for someone not versed in their 
> format. Please try to create an easy to parse 'story' that leads the reader 
> towards what you want these emails to tell - not just a raw dump of seemingly 
> unconnected pieces of data ...

Which kind of story?  Something like: we tested some benchmark on some
machine, which triggered some regression; we bisected and found that the
first bad commit is xxx.  In addition to the benchmark result, we
collected some other information that we hope is helpful for you.

We are just trying to help.  Sorry for the confusion.  We try to provide
all the information needed to describe the change and to help root-cause
it.  But we know we are still far from there, so your input is
important for us to improve our test report.  Which part of the report
do you think should be changed first?  The overall structure?  Or the
data format?

Best Regards,
Huang, Ying

> Thanks,
>
> 	Ingo
>
>> 
>> 
>> =========================================================================================
>> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
>>   gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/lkp-sbx04/futex1/will-it-scale
>> 
>> commit: 
>>   8ad7b378d0d016309014cae0f640434bca7b5e11
>>   65d8fc777f6dcfee12785c057a6b57f679641c90
>> 
>> 8ad7b378d0d01630 65d8fc777f6dcfee12785c057a 
>> ---------------- -------------------------- 
>>          %stddev     %change         %stddev
>>              \          |                \  
>>    5076304 ±  0%     +25.6%    6374220 ±  0%  will-it-scale.per_process_ops
>>    1194117 ±  0%     +14.4%    1366153 ±  1%  will-it-scale.per_thread_ops
>>       0.58 ±  0%      -2.0%       0.57 ±  0%  will-it-scale.scalability
>>       6820 ±  0%     -19.6%       5483 ± 15%  meminfo.AnonHugePages
>>       2652 ±  5%     -10.4%       2375 ±  2%  vmstat.system.cs
>>       2848 ± 32%    +141.2%       6870 ± 65%  numa-meminfo.node1.Active(anon)
>>       2832 ± 31%     +57.6%       4465 ± 27%  numa-meminfo.node1.AnonPages
>>      15018 ± 12%     -23.3%      11515 ± 15%  numa-meminfo.node2.AnonPages
>>       1214 ± 14%     -22.8%     936.75 ± 20%  numa-meminfo.node3.PageTables
>>     712.25 ± 32%    +141.2%       1718 ± 65%  numa-vmstat.node1.nr_active_anon
>>     708.25 ± 31%     +57.7%       1116 ± 27%  numa-vmstat.node1.nr_anon_pages
>>       3754 ± 12%     -23.3%       2879 ± 15%  numa-vmstat.node2.nr_anon_pages
>>     304.75 ± 14%     -23.1%     234.50 ± 20%  numa-vmstat.node3.nr_page_table_pages
>>       3.53 ±  1%    -100.0%       0.00 ± -1%  perf-profile.cycles.___might_sleep.__might_sleep.get_futex_key.futex_wake.do_futex
>>       4.34 ±  1%    -100.0%       0.00 ± -1%  perf-profile.cycles.__might_sleep.get_futex_key.futex_wake.do_futex.sys_futex
>>       1.27 ±  3%    -100.0%       0.00 ± -1%  perf-profile.cycles.__wake_up_bit.unlock_page.get_futex_key.futex_wake.do_futex
>>       4.36 ±  1%     +29.6%       5.65 ±  1%  perf-profile.cycles.drop_futex_key_refs.isra.12.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>>       6.69 ±  1%     +28.1%       8.57 ±  0%  perf-profile.cycles.entry_SYSCALL_64
>>       6.73 ±  0%     +30.6%       8.79 ±  0%  perf-profile.cycles.entry_SYSCALL_64_after_swapgs
>>      74.21 ±  0%     -11.0%      66.06 ±  0%  perf-profile.cycles.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>>      59.05 ±  0%     -21.4%      46.40 ±  0%  perf-profile.cycles.get_futex_key.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>>       4.12 ±  0%     +78.5%       7.36 ±  1%  perf-profile.cycles.get_futex_key_refs.isra.11.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>>       2.27 ±  3%     +24.1%       2.82 ±  4%  perf-profile.cycles.get_user_pages_fast.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>>      26.95 ±  0%     +30.0%      35.04 ±  1%  perf-profile.cycles.get_user_pages_fast.get_futex_key.futex_wake.do_futex.sys_futex
>>      13.43 ±  0%     +27.2%      17.09 ±  1%  perf-profile.cycles.gup_pte_range.gup_pud_range.get_user_pages_fast.get_futex_key.futex_wake
>>      19.66 ±  1%     +28.4%      25.24 ±  0%  perf-profile.cycles.gup_pud_range.get_user_pages_fast.get_futex_key.futex_wake.do_futex
>>       4.33 ±  1%     +37.0%       5.93 ±  4%  perf-profile.cycles.hash_futex.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>>      13.59 ±  0%    -100.0%       0.00 ± -1%  perf-profile.cycles.unlock_page.get_futex_key.futex_wake.do_futex.sys_futex
>>      15160 ± 19%     -34.8%       9883 ±  0%  sched_debug.cfs_rq:/.exec_clock.min
>>      27.25 ± 15%     -37.6%      17.00 ±  8%  sched_debug.cfs_rq:/.load_avg.7
>>      21.00 ± 38%     -27.4%      15.25 ±  2%  sched_debug.cpu.cpu_load[2].1
>>      21.00 ± 38%     -27.4%      15.25 ±  2%  sched_debug.cpu.cpu_load[3].1
>>      21.00 ± 38%     -27.4%      15.25 ±  2%  sched_debug.cpu.cpu_load[4].1
>>       1790 ±  0%     +42.4%       2549 ± 45%  sched_debug.cpu.curr->pid.21
>>      50033 ±  4%      -6.8%      46622 ±  4%  sched_debug.cpu.nr_load_updates.29
>>       4398 ± 42%    +103.5%       8949 ± 23%  sched_debug.cpu.nr_switches.11
>>       7452 ± 34%    +111.3%      15744 ± 54%  sched_debug.cpu.nr_switches.20
>>       3739 ± 13%    +213.5%      11723 ± 40%  sched_debug.cpu.nr_switches.23
>>       1648 ± 53%     +96.5%       3239 ± 63%  sched_debug.cpu.nr_switches.51
>>       0.25 ±519%   -1300.0%      -3.00 ±-52%  sched_debug.cpu.nr_uninterruptible.24
>>       8632 ± 16%     -32.5%       5823 ± 19%  sched_debug.cpu.sched_count.1
>>       5091 ± 36%    +137.5%      12092 ± 31%  sched_debug.cpu.sched_count.11
>>      12453 ± 90%     -74.6%       3159 ± 24%  sched_debug.cpu.sched_count.2
>>       7782 ± 32%    +118.2%      16977 ± 46%  sched_debug.cpu.sched_count.20
>>       2665 ± 48%     -49.8%       1337 ± 30%  sched_debug.cpu.sched_count.32
>>       1365 ± 11%     -14.0%       1174 ±  3%  sched_debug.cpu.sched_count.45
>>       1693 ± 51%    +147.7%       4193 ± 42%  sched_debug.cpu.sched_count.51
>>       5023 ± 57%     -51.5%       2434 ± 43%  sched_debug.cpu.sched_count.57
>>       1705 ± 16%    +129.6%       3915 ± 48%  sched_debug.cpu.sched_goidle.23
>>     536.25 ± 14%     -18.7%     435.75 ±  2%  sched_debug.cpu.sched_goidle.45
>>       1228 ± 19%     -27.3%     892.50 ± 17%  sched_debug.cpu.sched_goidle.5
>>       1919 ± 55%     +88.5%       3617 ± 37%  sched_debug.cpu.ttwu_count.11
>>       7699 ± 35%     -43.7%       4335 ± 43%  sched_debug.cpu.ttwu_count.24
>>       5380 ± 36%     -45.6%       2926 ± 18%  sched_debug.cpu.ttwu_count.30
>>     563.25 ± 20%    +140.3%       1353 ± 38%  sched_debug.cpu.ttwu_local.11
>>       4297 ± 46%     -49.1%       2186 ± 39%  sched_debug.cpu.ttwu_local.24
>>       2828 ± 47%     -47.8%       1475 ± 34%  sched_debug.cpu.ttwu_local.27
>>       3243 ± 36%     -54.3%       1482 ± 32%  sched_debug.cpu.ttwu_local.30
>>     199.25 ±  6%    +100.6%     399.75 ± 32%  sched_debug.cpu.ttwu_local.44
>>       1158 ± 64%     -67.3%     379.00 ± 46%  sched_debug.cpu.ttwu_local.54
>>     242.25 ± 21%     +51.0%     365.75 ± 19%  sched_debug.cpu.ttwu_local.55
>>       1009 ± 26%     -50.8%     496.50 ± 44%  sched_debug.cpu.ttwu_local.59
>>       1736 ± 53%     -67.8%     559.25 ± 22%  sched_debug.cpu.ttwu_local.9
>> 
>> 
>> lkp-sbx04: Sandy Bridge-EX
>> Memory: 64G
>> 
>> 
>>    perf-profile.cycles.futex_wake.do_futex.sys_futex.entry_SYSCALL_64_fastpath
>> 
>>   76 ++---------------------------------------------------------------------+
>>      |                                                                      |
>>   74 ++   .*..        .*..*..*..     .*..    .*..  .*..  .*..  .*..*..*..*  |
>>      *..*.    *..*..*.          *..*.    *.*.    *.    *.    *.             |
>>      |                                                                      |
>>   72 ++                                                                     |
>>      |                                                                      |
>>   70 ++                                                                     |
>>      |                                                                      |
>>   68 ++                                                                     |
>>      |                                                                      |
>>      |                          O  O  O    O  O  O  O     O  O     O  O     |
>>   66 O+                                  O             O        O        O  O
>>      |  O  O  O  O     O  O  O                                              |
>>   64 ++-------------O-------------------------------------------------------+
>> 
>> 
>> 
>>                              will-it-scale.per_process_ops
>> 
>>   6.6e+06 O+----O-O--O------------------------------------------------------+
>>           |  O          O  O O  O                                           |
>>   6.4e+06 ++                       O  O O  O  O  O O  O  O  O O  O  O  O O  O
>>           |                                                                 |
>>   6.2e+06 ++                                                                |
>>     6e+06 ++                                                                |
>>           |                                                                 |
>>   5.8e+06 ++                                                                |
>>           |                                                                 |
>>   5.6e+06 ++                                                                |
>>   5.4e+06 ++                                                                |
>>           |                                                                 |
>>   5.2e+06 ++                                                                |
>>           *..*..*.*..*..*..*.*..*..*..*.*..*..*..*.*..*..*..*.*..*..*..*.*  |
>>     5e+06 ++----------------------------------------------------------------+
>> 
>> 
>> 	[*] bisect-good sample
>> 	[O] bisect-bad  sample
>> 
>> To reproduce:
>> 
>>         git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>>         cd lkp-tests
>>         bin/lkp install job.yaml  # job file is attached in this email
>>         bin/lkp run     job.yaml
>> 
>> 
>> Disclaimer:
>> Results have been estimated based on internal Intel analysis and are provided
>> for informational purposes only. Any difference in system hardware or software
>> design or configuration may affect actual performance.
>> 
>> 
>> Thanks,
>> Ying Huang
>
>> ---
>> LKP_SERVER: inn
>> LKP_CGI_PORT: 80
>> LKP_CIFS_PORT: 139
>> testcase: will-it-scale
>> default-monitors:
>>   wait: activate-monitor
>>   kmsg: 
>>   uptime: 
>>   iostat: 
>>   heartbeat: 
>>   vmstat: 
>>   numa-numastat: 
>>   numa-vmstat: 
>>   numa-meminfo: 
>>   proc-vmstat: 
>>   proc-stat:
>>     interval: 10
>>   meminfo: 
>>   slabinfo: 
>>   interrupts: 
>>   lock_stat: 
>>   latency_stats: 
>>   softirqs: 
>>   bdi_dev_mapping: 
>>   diskstats: 
>>   nfsstat: 
>>   cpuidle: 
>>   cpufreq-stats: 
>>   turbostat: 
>>   pmeter: 
>>   sched_debug:
>>     interval: 60
>> cpufreq_governor: performance
>> default-watchdogs:
>>   oom-killer: 
>>   watchdog: 
>> commit: 65d8fc777f6dcfee12785c057a6b57f679641c90
>> model: Sandy Bridge-EX
>> nr_cpu: 64
>> memory: 64G
>> nr_ssd_partitions: 7
>> ssd_partitions: "/dev/disk/by-id/ata-INTEL_SSDSC2*-part1"
>> swap_partitions: 
>> category: benchmark
>> perf-profile:
>>   freq: 800
>> will-it-scale:
>>   test: futex1
>> queue: bisect
>> testbox: lkp-sbx04
>> tbox_group: lkp-sbx04
>> kconfig: x86_64-rhel
>> enqueue_time: 2016-02-28 23:45:52.199165563 +08:00
>> compiler: gcc-4.9
>> rootfs: debian-x86_64-2015-02-07.cgz
>> id: 6b2c2bd744dd898009648cb82de7e0ba77de33f1
>> user: lkp
>> head_commit: ed520c327c4259ec08b1677023087f658329b961
>> base_commit: 81f70ba233d5f660e1ea5fe23260ee323af5d53a
>> branch: linux-devel/devel-hourly-2016022811
>> result_root: "/result/will-it-scale/performance-futex1/lkp-sbx04/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/0"
>> job_file: "/lkp/scheduled/lkp-sbx04/bisect_will-it-scale-performance-futex1-debian-x86_64-2015-02-07.cgz-x86_64-rhel-65d8fc777f6dcfee12785c057a6b57f679641c90-20160228-23650-17c4qc1-0.yaml"
>> max_uptime: 1500
>> initrd: "/osimage/debian/debian-x86_64-2015-02-07.cgz"
>> bootloader_append:
>> - root=/dev/ram0
>> - user=lkp
>> - job=/lkp/scheduled/lkp-sbx04/bisect_will-it-scale-performance-futex1-debian-x86_64-2015-02-07.cgz-x86_64-rhel-65d8fc777f6dcfee12785c057a6b57f679641c90-20160228-23650-17c4qc1-0.yaml
>> - ARCH=x86_64
>> - kconfig=x86_64-rhel
>> - branch=linux-devel/devel-hourly-2016022811
>> - commit=65d8fc777f6dcfee12785c057a6b57f679641c90
>> - BOOT_IMAGE=/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/vmlinuz-4.5.0-rc3-00235-g65d8fc7
>> - max_uptime=1500
>> - RESULT_ROOT=/result/will-it-scale/performance-futex1/lkp-sbx04/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/0
>> - LKP_SERVER=inn
>> - |2-
>> 
>> 
>>   earlyprintk=ttyS0,115200 systemd.log_level=err
>>   debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100
>>   panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0
>>   console=ttyS0,115200 console=tty0 vga=normal
>> 
>>   rw
>> lkp_initrd: "/lkp/lkp/lkp-x86_64.cgz"
>> modules_initrd: "/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/modules.cgz"
>> bm_initrd: "/osimage/deps/debian-x86_64-2015-02-07.cgz/lkp.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/run-ipconfig.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/turbostat.cgz,/lkp/benchmarks/turbostat.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/will-it-scale.cgz,/lkp/benchmarks/will-it-scale.cgz,/lkp/benchmarks/will-it-scale-x86_64.cgz"
>> linux_headers_initrd: "/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/linux-headers.cgz"
>> repeat_to: 2
>> kernel: "/pkg/linux/x86_64-rhel/gcc-4.9/65d8fc777f6dcfee12785c057a6b57f679641c90/vmlinuz-4.5.0-rc3-00235-g65d8fc7"
>> dequeue_time: 2016-02-28 23:46:33.938915178 +08:00
>> job_state: finished
>> loadavg: 45.27 20.12 7.84 2/649 11559
>> start_time: '1456674445'
>> end_time: '1456674754'
>> version: "/lkp/lkp/.src-20160226-194908"
>
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu16/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu17/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu18/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu19/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu20/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu21/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu22/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu23/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu24/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu25/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu26/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu27/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu28/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu29/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu30/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu31/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu32/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu33/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu34/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu35/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu36/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu37/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu38/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu39/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu40/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu41/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu42/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu43/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu44/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu45/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu46/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu47/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu48/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu49/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu50/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu51/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu52/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu53/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu54/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu55/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu56/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu57/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu58/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu59/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu60/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu61/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu62/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu63/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor
>> 2016-02-28 23:47:24 echo performance > /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor
>> 2016-02-28 23:47:25 ./runtest.py futex1 16 both 1 8 16 24 32 48 64
>
> _______________________________________________
> LKP mailing list
> LKP(a)lists.01.org
> https://lists.01.org/mailman/listinfo/lkp

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [LKP] [lkp] [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
  2016-03-18  6:12     ` Huang, Ying
@ 2016-03-18  8:35       ` Thomas Gleixner
  -1 siblings, 0 replies; 28+ messages in thread
From: Thomas Gleixner @ 2016-03-18  8:35 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Ingo Molnar, Peter Zijlstra, Chris Mason, Peter Zijlstra, lkp,
	Sebastian Andrzej Siewior, Davidlohr Bueso, Hugh Dickins, LKML,
	Linus Torvalds, Mel Gorman, Mel Gorman, Darren Hart,
	Wu Fengguang

On Fri, 18 Mar 2016, Huang, Ying wrote:
> Usually we put what we think is the most important change in the subject
> of the mail; for this email, it is
> 
> +25.6% will-it-scale.per_process_ops

That is confusing on its own, because the reader does not know at all whether
this is an improvement or a regression.

So something like this might be useful:

Subject: subsystem: 12digitsha1: 25% performance improvement

or in some other case

Subject: subsystem: 12digitsha1: 25% performance regression

So in the latter case I will look into that mail immediately. The improvement
one can wait until I have taken care of the urgent stuff.

In the subject line it is pretty much irrelevant which foo-bla-ops test has
produced that result. It really does not matter. If it's a regression, it's
urgent. If it's an improvement, it's informational and it can wait to be read.

So in that case it would be:

futex: 65d8fc777f6d: 25% performance improvement

You can grab the subsystem prefix from the commit.
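
A rough sketch of how such a subject line could be generated. This is only an
illustration: report_subject() and its higher_is_better parameter are
hypothetical names, not part of lkp-tests, and it assumes the commit title
starts with the usual "subsystem: " prefix.

```python
def report_subject(commit_title: str, sha: str, percent_change: float,
                   higher_is_better: bool = True) -> str:
    """Build 'subsystem: 12digitsha1: N% performance improvement|regression'.

    higher_is_better is an assumption about the metric: for an ops/sec
    counter a positive change is an improvement, for a latency it is not.
    """
    # The subsystem prefix is whatever precedes the first colon in the title.
    subsystem = commit_title.split(":", 1)[0].strip()
    # A change is an improvement when its sign matches the metric's direction.
    improved = (percent_change > 0) == higher_is_better
    kind = "improvement" if improved else "regression"
    return f"{subsystem}: {sha[:12]}: {abs(percent_change):.1f}% performance {kind}"


if __name__ == "__main__":
    print(report_subject(
        "futex: Remove requirement for lock_page() in get_futex_key()",
        "65d8fc777f6dcfee12785c057a6b57f679641c90",
        25.6,
    ))
```

The sign of the raw number alone does not say whether the result is good or
bad, which is why the direction of the metric is passed in explicitly.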

> and, we try to put most important changes at the top of the comparison
> result below.  That is the will-it-scale.xxx below.
> 
> We are thinking about how to improve this.  Your input is valuable to
> us.  We are thinking of changing the "below changes" line to something like
> below.
> 
> FYI, we noticed the +25.6% will-it-scale.per_process_ops improvement on
> ...
> 
> Does this look better?

A bit, but it still does not tell me much. It's completely non-obvious what
'will-it-scale.per_process_ops' means. Let me give you an example of what a
useful and easy to understand summary of the change could look like:


 FYI, we noticed a 25.6% performance improvement due to commit

   65d8fc777f6d "futex: Remove requirement for lock_page() in get_futex_key()"

 in the will-it-scale.per_process_ops test.

 will-it-scale.per_process_ops tests the futex operations for process shared
 futexes (Or whatever that test really does).

 The commit has no significant impact on any other test in the test suite.

So those few lines tell precisely what this is about. It's something I already
expected, so I really can skip the rest of the mail unless I'm interested in
reproducing the result.

Now let's look at a performance regression.

Subject: futex: 65d8fc777f6d: 25% performance regression

 FYI, we noticed a 25.2% performance regression due to commit

 65d8fc777f6d "futex: Remove requirement for lock_page() in get_futex_key()"

 in the will-it-scale.per_process_ops test.

 will-it-scale.per_process_ops tests the futex operations for process shared
 futexes (Or whatever that test really does).

 The commit has no significant impact on any other test in the test suite.

In that case I will certainly be interested in how to reproduce that test. So I
need the following information:

Machine description: Intel IvyBridge 2 sockets, 32 cores, 64G RAM
Config file: http://wherever.you.store/your/results/test-nr/config
Test: http://wherever.you.store/your/tests/will-it-scale.per_process_ops.tar.bz2

That tarball should contain:

     README
     test_script.sh
     test_binary

README should tell:

   will-it-scale.per_process_ops

   Short explanation of the test

   Preliminaries: 
   	- perf
	- whatever

So that allows me to reproduce that test more or less with no effort. And
that's the really important part.

You can provide nice charts and full comparison tables for all tests on a web
site for those who are interested in large stats and pretty charts.

Full results: http://wherever.you.store/your/results/test-nr/results

So now let's look at a scenario where that commit results in a performance
improvement on will-it-scale.per_process_ops and a regression on
will-it-scale.per_task.

Subject: futex: 65d8fc777f6d: 25% performance regression

 FYI, we noticed a 25.2% performance regression due to commit

 65d8fc777f6d "futex: Remove requirement for lock_page() in get_futex_key()"

 in the will-it-scale.per_task test.

 will-it-scale.per_task tests the futex operations for process private
 futexes (Or whatever that test really does).

 The commit has significant impact on the following tests in the test suite:
 
	will-it-scale.per_process_ops: 25% improvement

The 25% improvement is really not interesting. What's interesting is the
regression. That's what people need to look at. You still can provide the
information about all the tests including the 25% improvement data on 

Full results: http://wherever.you.store/your/results/test-nr/results

Now, if you have commits which have mixed impact on various tests, then it's
important to point out the most relevant issue clearly in the
summary. I.e. you need to filter the tests for the maximum
regression/improvement and use that as the main information. You still can
provide the information about the impact on other tests in a very condensed
form.

 The commit has significant impact on the following tests in the test suite:

     test1: 	0.5% Regression
     test2:	2.0% Improvement
     ....
     testN:	0.2% Regression

That's useful information, but it might be completely irrelevant if the
primary impact is, e.g. a 10% Regression. Then it's nice to know that it gave
a 2% improvement on test2, but that's about it. The important information is
the 10% regression, which needs to be addressed urgently.
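
The selection described above could be sketched roughly like this. It is an
illustration only: Result, pick_primary(), condensed() and the threshold value
are hypothetical, and the rule that any regression outranks any improvement is
my reading of the priorities above, not something a tool implements today.

```python
from typing import List, NamedTuple, Optional, Tuple


class Result(NamedTuple):
    test: str
    percent: float  # signed: negative = regression, positive = improvement


def pick_primary(results: List[Result],
                 threshold: float = 0.1) -> Tuple[Optional[Result], List[Result]]:
    """Return (primary result, remaining significant results)."""
    # Drop changes too small to mention (threshold is an assumed cut-off).
    significant = [r for r in results if abs(r.percent) >= threshold]
    regressions = [r for r in significant if r.percent < 0]
    # Regressions are urgent, so the headline is the largest regression if
    # there is one; otherwise it is the largest improvement.
    pool = regressions or significant
    if not pool:
        return None, []
    primary = max(pool, key=lambda r: abs(r.percent))
    others = [r for r in significant if r is not primary]
    return primary, others


def condensed(others: List[Result]) -> List[str]:
    """Format the secondary results in the very condensed form."""
    lines = []
    for r in others:
        kind = "Regression" if r.percent < 0 else "Improvement"
        lines.append(f"{r.test}: {abs(r.percent):.1f}% {kind}")
    return lines


if __name__ == "__main__":
    data = [Result("test1", -0.5), Result("test2", 2.0), Result("testN", -10.0)]
    primary, others = pick_primary(data)
    print(primary)
    print("\n".join(condensed(others)))
```

With the sample data, testN becomes the headline and test1/test2 end up in the
condensed list, which matches the 10% regression case described above.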

Hope that helps.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [LKP] [lkp] [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
  2016-03-18  8:35       ` Thomas Gleixner
@ 2016-03-21  2:53         ` Huang, Ying
  -1 siblings, 0 replies; 28+ messages in thread
From: Huang, Ying @ 2016-03-21  2:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Huang, Ying, Ingo Molnar, Peter Zijlstra, Chris Mason,
	Peter Zijlstra, lkp, Sebastian Andrzej Siewior, Davidlohr Bueso,
	Hugh Dickins, LKML, Linus Torvalds, Mel Gorman, Mel Gorman,
	Darren Hart, Wu Fengguang, Xiaolong Ye

Hi, Thomas,

Thanks a lot for your valuable input!

Thomas Gleixner <tglx@linutronix.de> writes:

> On Fri, 18 Mar 2016, Huang, Ying wrote:
>> Usually we put what we think is the most important change in the subject
>> of the mail; for this email, it is
>> 
>> +25.6% will-it-scale.per_process_ops
>
> That is confusing on its own, because the reader does not know at all whether
> this is an improvement or a regression.
>
> So something like this might be useful:
>
> Subject: subsystem: 12digitsha1: 25% performance improvement
>
> or in some other case
>
> Subject: subsystem: 12digitsha1: 25% performance regression
>
> So in the latter case I will look into that mail immediately. The improvement
> one can wait until I have taken care of the urgent stuff.
>
> In the subject line it is pretty much irrelevant which foo-bla-ops test has
> produced that result. It really does not matter. If it's a regression, it's
> urgent. If it's an improvement, it's informational and it can wait to be read.
>
> So in that case it would be:
>
> futex: 65d8fc777f6d: 25% performance improvement
>
> You can grab the subsystem prefix from the commit.

At the least, we will include regression/improvement information in the
subject.

>> and, we try to put most important changes at the top of the comparison
>> result below.  That is the will-it-scale.xxx below.
>> 
>> We are thinking about how to improve this.  Your input is valuable to
>> us.  We are thinking of changing the "below changes" line to something like
>> below.
>> 
>> FYI, we noticed the +25.6% will-it-scale.per_process_ops improvement on
>> ...
>> 
>> Does this look better?
>
> A bit, but it still does not tell me much. It's completely non-obvious what
> 'will-it-scale.per_process_ops' means.

will-it-scale is a test suite; per_process_ops is one of its results.
That is the convention used in the original report.

> Let me give you an example of what a useful
> and easy to understand summary of the change could look like:
>
>
>  FYI, we noticed a 25.6% performance improvement due to commit
>
>    65d8fc777f6d "futex: Remove requirement for lock_page() in get_futex_key()"
>
>  in the will-it-scale.per_process_ops test.
>
>  will-it-scale.per_process_ops tests the futex operations for process shared
>  futexes (Or whatever that test really does).

There is a futex sub-test case in the will-it-scale test suite.  But I get
your point: we need some description of the test case.  If email is too
limited for the full description, we will put it on a web site and include
a short description and a link to the full description in the email.

>  The commit has no significant impact on any other test in the test suite.

Sorry, we do not have enough machine power to run all test cases for each
bisect result, so we will not have such information until we find a way
to do that.

> So those few lines tell precisely what this is about. It's something I already
> expected, so I really can skip the rest of the mail unless I'm interested in
> reproducing the result.

We will put the important information first in the email and the details
later, with a clear marker between them, so people can get the important
information and ignore the details if they want.

> Now let's look at a performance regression.
>
> Subject: futex: 65d8fc777f6d: 25% performance regression
>
>  FYI, we noticed a 25.2% performance regression due to commit
>
>  65d8fc777f6d "futex: Remove requirement for lock_page() in get_futex_key()"
>
>  in the will-it-scale.per_process_ops test.
>
>  will-it-scale.per_process_ops tests the futex operations for process shared
>  futexes (Or whatever that test really does).
>
>  The commit has no significant impact on any other test in the test suite.
>
> In that case I will certainly be interested in how to reproduce that test. So I
> need the following information:
>
> Machine description: Intel IvyBridge 2 sockets, 32 cores, 64G RAM
> Config file: http://wherever.you.store/your/results/test-nr/config

We already provide some of this information, but it is not organized well
enough.  We will improve it.

> Test: http://wherever.you.store/your/tests/will-it-scale.per_process_ops.tar.bz2
>
> That tarball should contain:
>
>      README
>      test_script.sh
>      test_binary
>
> README should tell:
>
>    will-it-scale.per_process_ops
>
>    Short explanation of the test
>
>    Preliminaries: 
>    	- perf
> 	- whatever
>
> So that allows me to reproduce that test more or less with no effort. And
> that's the really important part.

For reproducing, we currently use the lkp-tests tool, which includes scripts
to build the test case, run the test, collect various information and compare
the test results, together with the job file attached to the report email.
That is not the easiest way; we will keep improving it.

> You can provide nice charts and full comparison tables for all tests on a web
> site for those who are interested in large stats and pretty charts.
>
> Full results: http://wherever.you.store/your/results/test-nr/results

Until we have a website for detailed information, we will still put
some details into the report email.

> So now let's look at a scenario where that commit results in a performance
> improvement on will-it-scale.per_process_ops and a regression on
> will-it-scale.per_task.
>
> Subject: futex: 65d8fc777f6d: 25% performance regression
>
>  FYI, we noticed a 25.2% performance regression due to commit
>
>  65d8fc777f6d "futex: Remove requirement for lock_page() in get_futex_key()"
>
>  in the will-it-scale.per_task test.
>
>  will-it-scale.per_task tests the futex operations for process private
>  futexes (Or whatever that test really does).
>
>  The commit has significant impact on the following tests in the test suite:
>  
> 	will-it-scale.per_process_ops: 25% improvement
>
> The 25% improvement is really not interesting. What's interesting is the
> regression. That's what people need to look at. You still can provide the
> information about all the tests including the 25% improvement data on 
>
> Full results: http://wherever.you.store/your/results/test-nr/results
>
> Now, if you have commits which have mixed impact on various tests, then it's
> important to point out the most relevant issue clearly in the
> summary. I.e. you need to filter the tests for the maximum
> regression/improvement and use that as the main information. You still can
> provide the information about the impact on other tests in a very condensed
> form.
>
>  The commit has significant impact on the following tests in the test suite:
>
>      test1: 	0.5% Regression
>      test2:	2.0% Improvement
>      ....
>      testN:	0.2% Regression
>
> That's useful information, but it might be completely irrelevant if the
> primary impact is, e.g. a 10% Regression. Then it's nice to know that it gave
> a 2% improvement on test2, but that's about it. The important information is
> the 10% regression, which needs to be addressed urgently.

Yes.  We will separate the summary from the details and put the most
important information at the front of the summary.

> Hope that helps.

That really helps us a lot!  Thanks a lot!

Best Regards,
Huang, Ying

> Thanks,
>
> 	tglx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [LKP] [lkp] [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
  2016-03-21  2:53         ` Huang, Ying
@ 2016-03-21  8:00           ` Thomas Gleixner
  -1 siblings, 0 replies; 28+ messages in thread
From: Thomas Gleixner @ 2016-03-21  8:00 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Ingo Molnar, Peter Zijlstra, Chris Mason, Peter Zijlstra, lkp,
	Sebastian Andrzej Siewior, Davidlohr Bueso, Hugh Dickins, LKML,
	Linus Torvalds, Mel Gorman, Mel Gorman, Darren Hart,
	Wu Fengguang, Xiaolong Ye

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3162 bytes --]

On Mon, 21 Mar 2016, Huang, Ying wrote:
> >  FYI, we noticed 25.6% performance improvement due to commit
> >
> >    65d8fc777f6d "futex: Remove requirement for lock_page() in get_futex_key()"
> >
> >  in the will-it-scale.per_process_ops test.
> >
> >  will-it-scale.per_process_ops tests the futex operations for process shared
> >  futexes (Or whatever that test really does).
> 
> There is a futex sub test case for will-it-scale test suite.  But I got your
> point, we need some description for the test case.  If email is not too
> limited for the full description, we will put it in some web site and
> include short description and link to the full description in email.

Ok. Just make sure the short description gives enough information for the
casual reader.
 
> >  The commit has no significant impact on any other test in the test suite.
> 
> Sorry, we have no enough machine power to test all test cases for each
> bisect result.  So we will have no such information until we find a way
> to do that.

Well, then I really have to ask how I should interpret the data here:

   5076304 ±  0%     +25.6%    6374220 ±  0%  will-it-scale.per_process_ops

   ^^^ That's the reason why you sent the mail in the first place

   1194117 ±  0%     +14.4%    1366153 ±  1%  will-it-scale.per_thread_ops
      0.58 ±  0%      -2.0%       0.57 ±  0%  will-it-scale.scalability
      6820 ±  0%     -19.6%       5483 ± 15%  meminfo.AnonHugePages
      2652 ±  5%     -10.4%       2375 ±  2%  vmstat.system.cs
      2848 ± 32%    +141.2%       6870 ± 65%  numa-meminfo.node1.Active(anon)
      2832 ± 31%     +57.6%       4465 ± 27%  numa-meminfo.node1.AnonPages
     15018 ± 12%     -23.3%      11515 ± 15%  numa-meminfo.node2.AnonPages
      1214 ± 14%     -22.8%     936.75 ± 20%  numa-meminfo.node3.PageTables
    712.25 ± 32%    +141.2%       1718 ± 65%  numa-vmstat.node1.nr_active_anon
    708.25 ± 31%     +57.7%       1116 ± 27%  numa-vmstat.node1.nr_anon_pages

How is this related and what should I do about this information? 
 
 If it's important then I have to admit, that I fail to understand why.

 If it's not important then I have to ask why is this included.

> > So that allows me to reproduce that test more or less with no effort. And
> > that's the really important part.
> 
> For reproducing, now we use lkp-tests tool, which includes scripts to
> build the test case, run the test, collect various information, compare
> the test result, with the job file attached with the report email.  That
> is not the easiest way, we will continuously improve it.

I know, and lkp-tests is a pain to work with. So please look into a way to
extract the relevant binaries, so it's simple for developers to reproduce.
 
> > You can provide nice charts and full comparison tables for all tests on a web
> > site for those who are interested in large stats and pretty charts.
> >
> > Full results: http://wherever.you.store/your/results/test-nr/results
> 
> Before we have a website for detailed information, we will still put
> some details into report email.

Ok, but please make them understandable for mere mortals.
 
Thanks,

	tglx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [LKP] [lkp] [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
  2016-03-21  8:00           ` Thomas Gleixner
@ 2016-03-21  8:42             ` Huang, Ying
  -1 siblings, 0 replies; 28+ messages in thread
From: Huang, Ying @ 2016-03-21  8:42 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Huang, Ying, Ingo Molnar, Peter Zijlstra, Chris Mason,
	Peter Zijlstra, lkp, Sebastian Andrzej Siewior, Davidlohr Bueso,
	Hugh Dickins, LKML, Linus Torvalds, Mel Gorman, Mel Gorman,
	Darren Hart, Wu Fengguang, Xiaolong Ye

Thomas Gleixner <tglx@linutronix.de> writes:

> On Mon, 21 Mar 2016, Huang, Ying wrote:
>> >  FYI, we noticed 25.6% performance improvement due to commit
>> >
>> >    65d8fc777f6d "futex: Remove requirement for lock_page() in get_futex_key()"
>> >
>> >  in the will-it-scale.per_process_ops test.
>> >
>> >  will-it-scale.per_process_ops tests the futex operations for process shared
>> >  futexes (Or whatever that test really does).
>> 
>> There is a futex sub test case for will-it-scale test suite.  But I got your
>> point, we need some description for the test case.  If email is not too
>> limited for the full description, we will put it in some web site and
>> include short description and link to the full description in email.
>
> Ok. Just make sure the short description gives enough information for the
> casual reader.
>  
>> >  The commit has no significant impact on any other test in the test suite.
>> 
>> Sorry, we have no enough machine power to test all test cases for each
>> bisect result.  So we will have no such information until we find a way
>> to do that.
>
> Well, then I really have to ask how I should interpret the data here:
>
>    5076304 ±  0%     +25.6%    6374220 ±  0%  will-it-scale.per_process_ops
>
>    ^^^ That's the reason why you sent the mail in the first place
>
>    1194117 ±  0%     +14.4%    1366153 ±  1%  will-it-scale.per_thread_ops
>       0.58 ±  0%      -2.0%       0.57 ±  0%  will-it-scale.scalability
>       6820 ±  0%     -19.6%       5483 ± 15%  meminfo.AnonHugePages
>       2652 ±  5%     -10.4%       2375 ±  2%  vmstat.system.cs
>       2848 ± 32%    +141.2%       6870 ± 65%  numa-meminfo.node1.Active(anon)
>       2832 ± 31%     +57.6%       4465 ± 27%  numa-meminfo.node1.AnonPages
>      15018 ± 12%     -23.3%      11515 ± 15%  numa-meminfo.node2.AnonPages
>       1214 ± 14%     -22.8%     936.75 ± 20%  numa-meminfo.node3.PageTables
>     712.25 ± 32%    +141.2%       1718 ± 65%  numa-vmstat.node1.nr_active_anon
>     708.25 ± 31%     +57.7%       1116 ± 27%  numa-vmstat.node1.nr_anon_pages
>
> How is this related and what should I do about this information? 

Each will-it-scale sub test case is run in both process mode and thread
mode, with the task number varying from 1 to the number of CPUs.
will-it-scale.per_thread_ops shows the main result for thread mode.
will-it-scale.scalability is calculated to measure how per_process_ops
and per_thread_ops scale with the task number.  This is the default
behavior of the will-it-scale test suite.
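
As a cross-check on how to read those rows, here is a small sketch that
recomputes the %change column from the two mean values in the table quoted
above.  It assumes %change is simply the relative difference of the means;
that is an assumption about the report format, not something taken from
the lkp-tests sources.

```python
def percent_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


# (mean on parent commit, mean on tested commit), copied from the table.
rows = {
    "will-it-scale.per_process_ops": (5076304, 6374220),
    "will-it-scale.per_thread_ops": (1194117, 1366153),
}

for name, (base, new) in rows.items():
    print(f"{percent_change(base, new):+.1f}%  {name}")
```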

The others are monitor output, that is, other information collected
during the test.  For example, meminfo is a monitor that samples the
contents of /proc/meminfo, and AnonHugePages is a line in it.
meminfo.AnonHugePages is the average value of the AnonHugePages line of
/proc/meminfo.  Similarly, vmstat.system.cs is the average value of the
cs column in the system column group of /usr/bin/vmstat output.
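
To illustrate the idea, here is a minimal sketch of how such a monitor
value could be derived: take several snapshots of /proc/meminfo during
the run, pull out one field, and average it.  The snapshot contents and
the helper names are hypothetical and only show the principle; the real
monitor scripts in lkp-tests may work differently.

```python
import re

# Hypothetical /proc/meminfo snapshots taken at different times during a
# run.  Only the lines needed for the example are shown; values are made up.
samples = [
    "MemTotal:  65843072 kB\nAnonHugePages:  6144 kB\n",
    "MemTotal:  65843072 kB\nAnonHugePages:  8192 kB\n",
    "MemTotal:  65843072 kB\nAnonHugePages:  4096 kB\n",
]


def field_value(snapshot, field):
    """Extract the numeric value of one 'Field: value' line from a snapshot."""
    pattern = rf"^{re.escape(field)}:\s+(\d+)"
    match = re.search(pattern, snapshot, re.MULTILINE)
    if match is None:
        raise KeyError(field)
    return int(match.group(1))


def monitor_average(snapshots, field):
    """Average one field over all snapshots, e.g. meminfo.AnonHugePages."""
    values = [field_value(s, field) for s in snapshots]
    return sum(values) / len(values)


print(monitor_average(samples, "AnonHugePages"))
```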

We hope this information is helpful for root-causing the regression.

>  If it's important then I have to admit, that I fail to understand why.
>
>  If it's not important then I have to ask why is this included.
>
>> > So that allows me to reproduce that test more or less with no effort. And
>> > that's the really important part.
>> 
>> For reproducing, now we use lkp-tests tool, which includes scripts to
>> build the test case, run the test, collect various information, compare
>> the test result, with the job file attached with the report email.  That
>> is not the easiest way, we will continuously improve it.
>
> I know and lkp-tests is a pain to work with. So please look into a way to
> extract the relevant binaries, so it's simple for developers to reproduce.

OK.  We will try to improve on this side.  But it is not an easy task
for us to provide simple, easy-to-use binaries.  Do you think something
like a Docker image would be easy to use?

>> > You can provide nice charts and full comparison tables for all tests on a web
>> > site for those who are interested in large stats and pretty charts.
>> >
>> > Full results: http://wherever.you.store/your/results/test-nr/results
>> 
>> Before we have a website for detailed information, we will still put
>> some details into report email.
>
> Ok, but please make them understandable for mere mortals.

Sure.

Best Regards,
Huang, Ying

> Thanks,
>
> 	tglx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [LKP] [lkp] [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
  2016-03-21  8:42             ` Huang, Ying
@ 2016-03-28 17:14               ` Darren Hart
  -1 siblings, 0 replies; 28+ messages in thread
From: Darren Hart @ 2016-03-28 17:14 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Chris Mason,
	Peter Zijlstra, lkp, Sebastian Andrzej Siewior, Davidlohr Bueso,
	Hugh Dickins, LKML, Linus Torvalds, Mel Gorman, Mel Gorman,
	Darren Hart, Wu Fengguang, Xiaolong Ye

On Mon, Mar 21, 2016 at 04:42:47PM +0800, Huang, Ying wrote:
> Thomas Gleixner <tglx@linutronix.de> writes:
> 
> > On Mon, 21 Mar 2016, Huang, Ying wrote:
> >> >  FYI, we noticed 25.6% performance improvement due to commit
> >> >
> >> >    65d8fc777f6d "futex: Remove requirement for lock_page() in get_futex_key()"
> >> >
> >> >  in the will-it-scale.per_process_ops test.
> >> >
> >> >  will-it-scale.per_process_ops tests the futex operations for process shared
> >> >  futexes (Or whatever that test really does).
> >> 
> >> There is a futex sub test case for will-it-scale test suite.  But I got your
> >> point, we need some description for the test case.  If email is not too
> >> limited for the full description, we will put it in some web site and
> >> include short description and link to the full description in email.
> >
> > Ok. Just make sure the short description gives enough information for the
> > casual reader.
> >  
> >> >  The commit has no significant impact on any other test in the test suite.
> >> 
> >> Sorry, we have no enough machine power to test all test cases for each
> >> bisect result.  So we will have no such information until we find a way
> >> to do that.
> >
> > Well, then I really have to ask how I should interpret the data here:
> >
> >    5076304 ±  0%     +25.6%    6374220 ±  0%  will-it-scale.per_process_ops
> >
> >    ^^^ That's the reason why you sent the mail in the first place
> >
> >    1194117 ±  0%     +14.4%    1366153 ±  1%  will-it-scale.per_thread_ops
> >       0.58 ±  0%      -2.0%       0.57 ±  0%  will-it-scale.scalability
> >       6820 ±  0%     -19.6%       5483 ± 15%  meminfo.AnonHugePages
> >       2652 ±  5%     -10.4%       2375 ±  2%  vmstat.system.cs
> >       2848 ± 32%    +141.2%       6870 ± 65%  numa-meminfo.node1.Active(anon)
> >       2832 ± 31%     +57.6%       4465 ± 27%  numa-meminfo.node1.AnonPages
> >      15018 ± 12%     -23.3%      11515 ± 15%  numa-meminfo.node2.AnonPages
> >       1214 ± 14%     -22.8%     936.75 ± 20%  numa-meminfo.node3.PageTables
> >     712.25 ± 32%    +141.2%       1718 ± 65%  numa-vmstat.node1.nr_active_anon
> >     708.25 ± 31%     +57.7%       1116 ± 27%  numa-vmstat.node1.nr_anon_pages
> >
> > How is this related and what should I do about this information? 
> 
> For each will-it-scale sub test case, it will be run in both process
> mode and thread mode, and task number will change from 1 to CPU number.
> will-it-scale.per_thread_ops shows thread mode main result.
> will-it-scale.scalability is calculated to measure how per_process_ops
> and per_thread_ops scaled along with the task number.  These are default
> behavior of will-it-scale test suite.
> 
> Others are monitors output.  That is, other information collected during
> test.  For example, meminfo is a monitor to sampling /proc/meminfo
> contents,  AnonHugePages is a line in it.  meminfo.AnonHugePages is for
> the average value of AnonHugePages line of /proc/meminfo.  Similarly
> vmstat.system.cs is the average value of cs column of system column
> group of /usr/bin/vmstat.
> 
> We hope these information are helpful for root causing the regression.
> 
> >  If it's important then I have to admit, that I fail to understand why.
> >
> >  If it's not important then I have to ask why is this included.
> >
> >> > So that allows me to reproduce that test more or less with no effort. And
> >> > that's the really important part.
> >> 
> >> For reproducing, now we use lkp-tests tool, which includes scripts to
> >> build the test case, run the test, collect various information, compare
> >> the test result, with the job file attached with the report email.  That
> >> is not the easiest way, we will continuously improve it.
> >
> > I know and lkp-tests is a pain to work with. So please look into a way to
> > extract the relevant binaries, so it's simple for developers to reproduce.
> 
> OK.  We will try to improve on this side.  But it is not an easy task
> for us to provided easy to use simple binaries.  Do you think something
> like Docker image is easy to use?

Thomas, I presume you are interested in binaries to be positive we're
reproducing with exactly the same bits? I agree that's good to have. I'd also
want to have or be pointed to the sources with a straightforward way to
rebuild and to inspect what exactly the test is doing. (I assume this is
implied, but just to make sure it's stated).

Huang, what makes the binaries difficult to package? And how would docker make
that any simpler?

> 
> >> > You can provide nice charts and full comparison tables for all tests on a web
> >> > site for those who are interested in large stats and pretty charts.
> >> >
> >> > Full results: http://wherever.you.store/your/results/test-nr/results
> >> 
> >> Before we have a website for detailed information, we will still put
> >> some details into report email.
> >
> > Ok, but please make them understandable for mere mortals.

Thanks for speaking up for us mortals Thomas ;-)

-- 
Darren Hart
Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [LKP] [lkp] [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
  2016-03-28 17:14               ` Darren Hart
@ 2016-03-29  1:12                 ` Huang, Ying
  -1 siblings, 0 replies; 28+ messages in thread
From: Huang, Ying @ 2016-03-29  1:12 UTC (permalink / raw)
  To: Darren Hart
  Cc: Huang, Ying, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Chris Mason, Peter Zijlstra, lkp, Sebastian Andrzej Siewior,
	Davidlohr Bueso, Hugh Dickins, LKML, Linus Torvalds, Mel Gorman,
	Mel Gorman, Darren Hart, Wu Fengguang, Xiaolong Ye

Darren Hart <dvhart@infradead.org> writes:

> On Mon, Mar 21, 2016 at 04:42:47PM +0800, Huang, Ying wrote:
>> Thomas Gleixner <tglx@linutronix.de> writes:
>> 
>> > On Mon, 21 Mar 2016, Huang, Ying wrote:
>> >> >  FYI, we noticed 25.6% performance improvement due to commit
>> >> >
>> >> >    65d8fc777f6d "futex: Remove requirement for lock_page() in get_futex_key()"
>> >> >
>> >> >  in the will-it-scale.per_process_ops test.
>> >> >
>> >> >  will-it-scale.per_process_ops tests the futex operations for process shared
>> >> >  futexes (Or whatever that test really does).
>> >> 
>> >> There is a futex sub test case for will-it-scale test suite.  But I got your
>> >> point, we need some description for the test case.  If email is not too
>> >> limited for the full description, we will put it in some web site and
>> >> include short description and link to the full description in email.
>> >
>> > Ok. Just make sure the short description gives enough information for the
>> > casual reader.
>> >  
>> >> >  The commit has no significant impact on any other test in the test suite.
>> >> 
>> >> Sorry, we have no enough machine power to test all test cases for each
>> >> bisect result.  So we will have no such information until we find a way
>> >> to do that.
>> >
>> > Well, then I really have to ask how I should interpret the data here:
>> >
>> >    5076304 ±  0%     +25.6%    6374220 ±  0%  will-it-scale.per_process_ops
>> >
>> >    ^^^ That's the reason why you sent the mail in the first place
>> >
>> >    1194117 ±  0%     +14.4%    1366153 ±  1%  will-it-scale.per_thread_ops
>> >       0.58 ±  0%      -2.0%       0.57 ±  0%  will-it-scale.scalability
>> >       6820 ±  0%     -19.6%       5483 ± 15%  meminfo.AnonHugePages
>> >       2652 ±  5%     -10.4%       2375 ±  2%  vmstat.system.cs
>> >       2848 ± 32%    +141.2%       6870 ± 65%  numa-meminfo.node1.Active(anon)
>> >       2832 ± 31%     +57.6%       4465 ± 27%  numa-meminfo.node1.AnonPages
>> >      15018 ± 12%     -23.3%      11515 ± 15%  numa-meminfo.node2.AnonPages
>> >       1214 ± 14%     -22.8%     936.75 ± 20%  numa-meminfo.node3.PageTables
>> >     712.25 ± 32%    +141.2%       1718 ± 65%  numa-vmstat.node1.nr_active_anon
>> >     708.25 ± 31%     +57.7%       1116 ± 27%  numa-vmstat.node1.nr_anon_pages
>> >
>> > How is this related and what should I do about this information? 
>> 
>> For each will-it-scale sub test case, it will be run in both process
>> mode and thread mode, and task number will change from 1 to CPU number.
>> will-it-scale.per_thread_ops shows thread mode main result.
>> will-it-scale.scalability is calculated to measure how per_process_ops
>> and per_thread_ops scaled along with the task number.  These are default
>> behavior of will-it-scale test suite.
>> 
>> Others are monitors output.  That is, other information collected during
>> test.  For example, meminfo is a monitor to sampling /proc/meminfo
>> contents,  AnonHugePages is a line in it.  meminfo.AnonHugePages is for
>> the average value of AnonHugePages line of /proc/meminfo.  Similarly
>> vmstat.system.cs is the average value of cs column of system column
>> group of /usr/bin/vmstat.
>> 
>> We hope these information are helpful for root causing the regression.
>> 
>> >  If it's important then I have to admit, that I fail to understand why.
>> >
>> >  If it's not important then I have to ask why is this included.
>> >
>> >> > So that allows me to reproduce that test more or less with no effort. And
>> >> > that's the really important part.
>> >> 
>> >> For reproducing, now we use lkp-tests tool, which includes scripts to
>> >> build the test case, run the test, collect various information, compare
>> >> the test result, with the job file attached with the report email.  That
>> >> is not the easiest way, we will continuously improve it.
>> >
>> > I know and lkp-tests is a pain to work with. So please look into a way to
>> > extract the relevant binaries, so it's simple for developers to reproduce.
>> 
>> OK.  We will try to improve on this side.  But it is not an easy task
>> for us to provided easy to use simple binaries.  Do you think something
>> like Docker image is easy to use?
>
> Thomas, I presume you are interested in binaries to be positive we're
> reproducing with exactly the same bits? I agree that's good to have. I'd also
> want to have or be pointed to the sources with a straight forward way to
> rebuild and to inspect what exactly the test is doing. (I assume this is
> implied, but just to make sure it's stated).

lkp-tests has scripts to download the source, apply some patches, and
build the binary.  It is not a very straightforward way, but the scripts
are quite simple.

> Huang, what makes the binaries difficult to package? And how would docker make
> that any simpler?

The binaries are not difficult to package.  But the test is not only the
benchmark binary.  We may do some setup for the specific test, for
example, change the CPU frequency governor, run mkfs on a partition (which
may be a ramdisk), etc.  And we have a mechanism to collect various
information during the test, for example, running vmstat, sampling
/proc/sched_debug, etc.
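
To illustrate what such a monitor does, here is a minimal sketch in
Python.  It is not the lkp-tests code: the function names and the sample
values are made up for illustration.  It parses /proc/meminfo-style text
and averages one field over all samples, which is the kind of number
reported as meminfo.AnonHugePages.

```python
# Illustrative sketch only: not the lkp-tests implementation.
# A monitor samples a /proc file periodically during the test; the report
# then shows the average of a field across those samples.
from statistics import mean


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.split():
            continue  # skip lines that are not "Name: value [unit]"
        fields[name.strip()] = int(rest.split()[0])
    return fields


def average_field(samples, field):
    """Average one field over a list of sampled file contents."""
    return mean(parse_meminfo(sample)[field] for sample in samples)


if __name__ == "__main__":
    # Two hypothetical samples, as if read from /proc/meminfo at two
    # different times during the test run.
    samples = [
        "MemTotal:       65843292 kB\nAnonHugePages:      6144 kB\n",
        "MemTotal:       65843292 kB\nAnonHugePages:      8192 kB\n",
    ]
    print(average_field(samples, "AnonHugePages"))
```

A real monitor would read the file at a fixed interval while the
benchmark runs; here the samples are passed in as strings so the sketch
runs anywhere.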

As for docker, we just want to reduce the pain of using lkp-tests.
 
>> >> > You can provide nice charts and full comparison tables for all tests on a web
>> >> > site for those who are interested in large stats and pretty charts.
>> >> >
>> >> > Full results: http://wherever.you.store/your/results/test-nr/results
>> >> 
>> >> Before we have a website for detailed information, we will still put
>> >> some details into report email.
>> >
>> > Ok, but please make them understandable for mere mortals.
>
> Thanks for speaking up for us mortals Thomas ;-)

Best Regards,
Huang, Ying

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [LKP] [lkp] [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
  2016-03-29  1:12                 ` Huang, Ying
@ 2016-03-29  5:17                   ` Darren Hart
  -1 siblings, 0 replies; 28+ messages in thread
From: Darren Hart @ 2016-03-29  5:17 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Chris Mason,
	Peter Zijlstra, lkp, Sebastian Andrzej Siewior, Davidlohr Bueso,
	Hugh Dickins, LKML, Linus Torvalds, Mel Gorman, Mel Gorman,
	Darren Hart, Wu Fengguang, Xiaolong Ye

On Tue, Mar 29, 2016 at 09:12:56AM +0800, Huang, Ying wrote:
> Darren Hart <dvhart@infradead.org> writes:
> 
> > On Mon, Mar 21, 2016 at 04:42:47PM +0800, Huang, Ying wrote:
> >> Thomas Gleixner <tglx@linutronix.de> writes:
> >> 
> >> > On Mon, 21 Mar 2016, Huang, Ying wrote:
> >> >> >  FYI, we noticed 25.6% performance improvement due to commit
> >> >> >
> >> >> >    65d8fc777f6d "futex: Remove requirement for lock_page() in get_futex_key()"
> >> >> >
> >> >> >  in the will-it-scale.per_process_ops test.
> >> >> >
> >> >> >  will-it-scale.per_process_ops tests the futex operations for process shared
> >> >> >  futexes (Or whatever that test really does).
> >> >> 
> >> >> There is a futex sub test case for will-it-scale test suite.  But I got your
> >> >> point, we need some description for the test case.  If email is not too
> >> >> limited for the full description, we will put it in some web site and
> >> >> include short description and link to the full description in email.
> >> >
> >> > Ok. Just make sure the short description gives enough information for the
> >> > casual reader.
> >> >  
> >> >> >  The commit has no significant impact on any other test in the test suite.
> >> >> 
> >> >> Sorry, we have no enough machine power to test all test cases for each
> >> >> bisect result.  So we will have no such information until we find a way
> >> >> to do that.
> >> >
> >> > Well, then I really have to ask how I should interpret the data here:
> >> >
> >> >    5076304 ±  0%     +25.6%    6374220 ±  0%  will-it-scale.per_process_ops
> >> >
> >> >    ^^^ That's the reason why you sent the mail in the first place
> >> >
> >> >    1194117 ±  0%     +14.4%    1366153 ±  1%  will-it-scale.per_thread_ops
> >> >       0.58 ±  0%      -2.0%       0.57 ±  0%  will-it-scale.scalability
> >> >       6820 ±  0%     -19.6%       5483 ± 15%  meminfo.AnonHugePages
> >> >       2652 ±  5%     -10.4%       2375 ±  2%  vmstat.system.cs
> >> >       2848 ± 32%    +141.2%       6870 ± 65%  numa-meminfo.node1.Active(anon)
> >> >       2832 ± 31%     +57.6%       4465 ± 27%  numa-meminfo.node1.AnonPages
> >> >      15018 ± 12%     -23.3%      11515 ± 15%  numa-meminfo.node2.AnonPages
> >> >       1214 ± 14%     -22.8%     936.75 ± 20%  numa-meminfo.node3.PageTables
> >> >     712.25 ± 32%    +141.2%       1718 ± 65%  numa-vmstat.node1.nr_active_anon
> >> >     708.25 ± 31%     +57.7%       1116 ± 27%  numa-vmstat.node1.nr_anon_pages
> >> >
> >> > How is this related and what should I do about this information? 
> >> 
> >> For each will-it-scale sub test case, it will be run in both process
> >> mode and thread mode, and task number will change from 1 to CPU number.
> >> will-it-scale.per_thread_ops shows thread mode main result.
> >> will-it-scale.scalability is calculated to measure how per_process_ops
> >> and per_thread_ops scaled along with the task number.  These are default
> >> behavior of will-it-scale test suite.
> >> 
> >> Others are monitors output.  That is, other information collected during
> >> test.  For example, meminfo is a monitor to sampling /proc/meminfo
> >> contents,  AnonHugePages is a line in it.  meminfo.AnonHugePages is for
> >> the average value of AnonHugePages line of /proc/meminfo.  Similarly
> >> vmstat.system.cs is the average value of cs column of system column
> >> group of /usr/bin/vmstat.
> >> 
> >> We hope these information are helpful for root causing the regression.
> >> 
> >> >  If it's important then I have to admit, that I fail to understand why.
> >> >
> >> >  If it's not important then I have to ask why is this included.
> >> >
> >> >> > So that allows me to reproduce that test more or less with no effort. And
> >> >> > that's the really important part.
> >> >> 
> >> >> For reproducing, now we use lkp-tests tool, which includes scripts to
> >> >> build the test case, run the test, collect various information, compare
> >> >> the test result, with the job file attached with the report email.  That
> >> >> is not the easiest way, we will continuously improve it.
> >> >
> >> > I know and lkp-tests is a pain to work with. So please look into a way to
> >> > extract the relevant binaries, so it's simple for developers to reproduce.
> >> 
> >> OK.  We will try to improve on this side.  But it is not an easy task
> >> for us to provided easy to use simple binaries.  Do you think something
> >> like Docker image is easy to use?
> >
> > Thomas, I presume you are interested in binaries to be positive we're
> > reproducing with exactly the same bits? I agree that's good to have. I'd also
> > want to have or be pointed to the sources with a straight forward way to
> > rebuild and to inspect what exactly the test is doing. (I assume this is
> > implied, but just to make sure it's stated).
> 
> lkp-tests has the scripts to download the source, apply some patch, and
> build the binary.  It is not a very straight forward way, but the script
> is quite simple.
> 
> > Huang, what makes the binaries difficult to package? And how would docker make
> > that any simpler?
> 
> The binaries is not difficult to package.  But the test is not only
> benchmark binary.  We may do some setup for the specific test, for
> example, change the cpu frequency governor, mkfs on a partition (may be
> ramdisk), etc.  And we have mechanism to collect various information
> during the test, for example, run vmstat, sampling /proc/sched_debug,
> etc.

In that case, it would be really useful to document your setup and include a
link to that in every report you send out. It should include things like the
partition layout, which ideally would use normalized symlinks (for example,
/dev/testpart -> /dev/sde3) to facilitate reproduction outside your lab, and the
scripts/tests should only use the symlinks. That's an oversimplification, of
course, but that kind of configuration and documentation would be very helpful
in lowering the barrier to getting people to look at the issues your testing
discovers - and that, of course, is the whole point.
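
A minimal sketch of the symlink idea, in Python, under stated
assumptions: the alias name testpart is taken from the example above, a
temporary directory stands in for /dev, and the helper names are
hypothetical. The point is only that test scripts refer to the stable
alias, and each machine maps it to whatever device it really has.

```python
# Illustrative sketch only; a temporary directory stands in for /dev and
# the helper names are hypothetical.
import os
import tempfile


def make_alias(alias_path, device_path):
    """Point a stable alias at the machine-specific device node."""
    if os.path.islink(alias_path):
        os.remove(alias_path)  # replace a stale alias
    os.symlink(device_path, alias_path)


def resolve_alias(alias_path):
    """Return the real device behind the alias; tests only use the alias."""
    return os.path.realpath(alias_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as dev:
        device = os.path.join(dev, "sde3")
        open(device, "w").close()  # stand-in for a real block device
        alias = os.path.join(dev, "testpart")
        make_alias(alias, device)
        print(alias, "->", resolve_alias(alias))
```

On another machine only the make_alias() call changes; everything that
consumes the alias stays the same.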

Thank you for doing the work and accepting all this feedback.

> As for docker, we just want to reduce the pain of using lkp-tests.
>  

Sometimes docker is a pain as well. It became completely unusable on my Debian 8
system not too long ago. That can become another barrier if it isn't also
documented without docker, even if it requires more steps.

-- 
Darren Hart
Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [LKP] [lkp] [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
  2016-03-29  5:17                   ` Darren Hart
@ 2016-03-29  5:57                     ` Huang, Ying
  -1 siblings, 0 replies; 28+ messages in thread
From: Huang, Ying @ 2016-03-29  5:57 UTC (permalink / raw)
  To: Darren Hart
  Cc: Huang, Ying, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Chris Mason, Peter Zijlstra, lkp, Sebastian Andrzej Siewior,
	Davidlohr Bueso, Hugh Dickins, LKML, Linus Torvalds, Mel Gorman,
	Mel Gorman, Darren Hart, Wu Fengguang, Xiaolong Ye

Darren Hart <dvhart@infradead.org> writes:

> On Tue, Mar 29, 2016 at 09:12:56AM +0800, Huang, Ying wrote:
>> Darren Hart <dvhart@infradead.org> writes:
>> 
>> > On Mon, Mar 21, 2016 at 04:42:47PM +0800, Huang, Ying wrote:
>> >> Thomas Gleixner <tglx@linutronix.de> writes:
>> >> 
>> >> > On Mon, 21 Mar 2016, Huang, Ying wrote:
>> >> >> >  FYI, we noticed 25.6% performance improvement due to commit
>> >> >> >
>> >> >> >    65d8fc777f6d "futex: Remove requirement for lock_page() in get_futex_key()"
>> >> >> >
>> >> >> >  in the will-it-scale.per_process_ops test.
>> >> >> >
>> >> >> >  will-it-scale.per_process_ops tests the futex operations for process shared
>> >> >> >  futexes (Or whatever that test really does).
>> >> >> 
>> >> >> There is a futex sub test case for will-it-scale test suite.  But I got your
>> >> >> point, we need some description for the test case.  If email is not too
>> >> >> limited for the full description, we will put it in some web site and
>> >> >> include short description and link to the full description in email.
>> >> >
>> >> > Ok. Just make sure the short description gives enough information for the
>> >> > casual reader.
>> >> >  
>> >> >> >  The commit has no significant impact on any other test in the test suite.
>> >> >> 
>> >> >> Sorry, we have no enough machine power to test all test cases for each
>> >> >> bisect result.  So we will have no such information until we find a way
>> >> >> to do that.
>> >> >
>> >> > Well, then I really have to ask how I should interpret the data here:
>> >> >
>> >> >    5076304 ±  0%     +25.6%    6374220 ±  0%  will-it-scale.per_process_ops
>> >> >
>> >> >    ^^^ That's the reason why you sent the mail in the first place
>> >> >
>> >> >    1194117 ±  0%     +14.4%    1366153 ±  1%  will-it-scale.per_thread_ops
>> >> >       0.58 ±  0%      -2.0%       0.57 ±  0%  will-it-scale.scalability
>> >> >       6820 ±  0%     -19.6%       5483 ± 15%  meminfo.AnonHugePages
>> >> >       2652 ±  5%     -10.4%       2375 ±  2%  vmstat.system.cs
>> >> >       2848 ± 32%    +141.2%       6870 ± 65%  numa-meminfo.node1.Active(anon)
>> >> >       2832 ± 31%     +57.6%       4465 ± 27%  numa-meminfo.node1.AnonPages
>> >> >      15018 ± 12%     -23.3%      11515 ± 15%  numa-meminfo.node2.AnonPages
>> >> >       1214 ± 14%     -22.8%     936.75 ± 20%  numa-meminfo.node3.PageTables
>> >> >     712.25 ± 32%    +141.2%       1718 ± 65%  numa-vmstat.node1.nr_active_anon
>> >> >     708.25 ± 31%     +57.7%       1116 ± 27%  numa-vmstat.node1.nr_anon_pages
>> >> >
>> >> > How is this related and what should I do about this information? 
>> >> 
>> >> For each will-it-scale sub test case, it will be run in both process
>> >> mode and thread mode, and task number will change from 1 to CPU number.
>> >> will-it-scale.per_thread_ops shows thread mode main result.
>> >> will-it-scale.scalability is calculated to measure how per_process_ops
>> >> and per_thread_ops scaled along with the task number.  These are default
>> >> behavior of will-it-scale test suite.
>> >> 
>> >> Others are monitors output.  That is, other information collected during
>> >> test.  For example, meminfo is a monitor to sampling /proc/meminfo
>> >> contents,  AnonHugePages is a line in it.  meminfo.AnonHugePages is for
>> >> the average value of AnonHugePages line of /proc/meminfo.  Similarly
>> >> vmstat.system.cs is the average value of cs column of system column
>> >> group of /usr/bin/vmstat.
>> >> 
>> >> We hope these information are helpful for root causing the regression.
>> >> 
>> >> >  If it's important then I have to admit, that I fail to understand why.
>> >> >
>> >> >  If it's not important then I have to ask why is this included.
>> >> >
>> >> >> > So that allows me to reproduce that test more or less with no effort. And
>> >> >> > that's the really important part.
>> >> >> 
>> >> >> For reproducing, now we use lkp-tests tool, which includes scripts to
>> >> >> build the test case, run the test, collect various information, compare
>> >> >> the test result, with the job file attached with the report email.  That
>> >> >> is not the easiest way, we will continuously improve it.
>> >> >
>> >> > I know and lkp-tests is a pain to work with. So please look into a way to
>> >> > extract the relevant binaries, so it's simple for developers to reproduce.
>> >> 
>> >> OK.  We will try to improve on this side.  But it is not an easy task
>> >> for us to provided easy to use simple binaries.  Do you think something
>> >> like Docker image is easy to use?
>> >
>> > Thomas, I presume you are interested in binaries to be positive we're
>> > reproducing with exactly the same bits? I agree that's good to have. I'd also
>> > want to have or be pointed to the sources with a straight forward way to
>> > rebuild and to inspect what exactly the test is doing. (I assume this is
>> > implied, but just to make sure it's stated).
>> 
>> lkp-tests has the scripts to download the source, apply some patch, and
>> build the binary.  It is not a very straight forward way, but the script
>> is quite simple.
>> 
>> > Huang, what makes the binaries difficult to package? And how would docker make
>> > that any simpler?
>> 
>> The binaries is not difficult to package.  But the test is not only
>> benchmark binary.  We may do some setup for the specific test, for
>> example, change the cpu frequency governor, mkfs on a partition (may be
>> ramdisk), etc.  And we have mechanism to collect various information
>> during the test, for example, run vmstat, sampling /proc/sched_debug,
>> etc.
>
> In that case, it would be really useful to document your setup and include a
> link to that in every report you send out.

The setup information is in the job file attached to the report email.
The job file can be used together with lkp-tests to reproduce exactly what
we did on our test machine.

> It should include things like the
> partition layout (which ideally would have normalized symlinks to facilitate
> reproduction outside your lab) (/dev/testpart -> /dev/sde3) and the
> scripts/tests should only use the symlinks. That's an over simplification of
> course, but that kind of configuration and documentation would be very helpful
> in reducing the barrier to getting people to look at the issues your testing
> discovers - and that, of course, is the whole point.

The test partition of your test machine can be specified in the <host>
file in lkp-tests.

Our current reproduction solution is based on lkp-tests.  We hope we can
improve the lkp-tests tool to make reproduction easier.

> Thank you for doing the work and accepting all this feedback.

Our pleasure.

>> As for docker, we just want to reduce the pain of using lkp-tests.
>>  
>
> Sometimes docker is pain as well. It became completely unusable on my Debian 8
> system not too long ago. That can become another barrier if it isn't also
> documented without docker, even if it requires more steps.

OK.  I see.  Your input is valuable to us.

Best Regards,
Huang, Ying

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
@ 2016-03-29  5:57                     ` Huang, Ying
  0 siblings, 0 replies; 28+ messages in thread
From: Huang, Ying @ 2016-03-29  5:57 UTC (permalink / raw)
  To: lkp

[-- Attachment #1: Type: text/plain, Size: 7169 bytes --]

Darren Hart <dvhart@infradead.org> writes:

> On Tue, Mar 29, 2016 at 09:12:56AM +0800, Huang, Ying wrote:
>> Darren Hart <dvhart@infradead.org> writes:
>> 
>> > On Mon, Mar 21, 2016 at 04:42:47PM +0800, Huang, Ying wrote:
>> >> Thomas Gleixner <tglx@linutronix.de> writes:
>> >> 
>> >> > On Mon, 21 Mar 2016, Huang, Ying wrote:
>> >> >> >  FYI, we noticed 25.6% performance improvement due to commit
>> >> >> >
>> >> >> >    65d8fc777f6d "futex: Remove requirement for lock_page() in get_futex_key()"
>> >> >> >
>> >> >> >  in the will-it-scale.per_process_ops test.
>> >> >> >
>> >> >> >  will-it-scale.per_process_ops tests the futex operations for process shared
>> >> >> >  futexes (Or whatever that test really does).
>> >> >> 
>> >> >> There is a futex sub test case for will-it-scale test suite.  But I got your
>> >> >> point, we need some description for the test case.  If email is not too
>> >> >> limited for the full description, we will put it in some web site and
>> >> >> include short description and link to the full description in email.
>> >> >
>> >> > Ok. Just make sure the short description gives enough information for the
>> >> > casual reader.
>> >> >  
>> >> >> >  The commit has no significant impact on any other test in the test suite.
>> >> >> 
>> >> >> Sorry, we have no enough machine power to test all test cases for each
>> >> >> bisect result.  So we will have no such information until we find a way
>> >> >> to do that.
>> >> >
>> >> > Well, then I really have to ask how I should interpret the data here:
>> >> >
>> >> >    5076304 ±  0%     +25.6%    6374220 ±  0%  will-it-scale.per_process_ops
>> >> >
>> >> >    ^^^ That's the reason why you sent the mail in the first place
>> >> >
>> >> >    1194117 ±  0%     +14.4%    1366153 ±  1%  will-it-scale.per_thread_ops
>> >> >       0.58 ±  0%      -2.0%       0.57 ±  0%  will-it-scale.scalability
>> >> >       6820 ±  0%     -19.6%       5483 ± 15%  meminfo.AnonHugePages
>> >> >       2652 ±  5%     -10.4%       2375 ±  2%  vmstat.system.cs
>> >> >       2848 ± 32%    +141.2%       6870 ± 65%  numa-meminfo.node1.Active(anon)
>> >> >       2832 ± 31%     +57.6%       4465 ± 27%  numa-meminfo.node1.AnonPages
>> >> >      15018 ± 12%     -23.3%      11515 ± 15%  numa-meminfo.node2.AnonPages
>> >> >       1214 ± 14%     -22.8%     936.75 ± 20%  numa-meminfo.node3.PageTables
>> >> >     712.25 ± 32%    +141.2%       1718 ± 65%  numa-vmstat.node1.nr_active_anon
>> >> >     708.25 ± 31%     +57.7%       1116 ± 27%  numa-vmstat.node1.nr_anon_pages
>> >> >
>> >> > How is this related and what should I do about this information? 
>> >> 
>> >> >> Each will-it-scale sub test case is run in both process mode and thread
>> >> >> mode, with the task number varying from 1 to the number of CPUs.
>> >> >> will-it-scale.per_thread_ops shows the main result for thread mode.
>> >> >> will-it-scale.scalability is calculated to measure how per_process_ops
>> >> >> and per_thread_ops scale with the task number.  This is the default
>> >> >> behavior of the will-it-scale test suite.
>> >> 
>> >> >> The others are monitor output, that is, other information collected during
>> >> >> the test.  For example, meminfo is a monitor that samples the contents of
>> >> >> /proc/meminfo, and AnonHugePages is a line in it.  meminfo.AnonHugePages
>> >> >> is the average value of the AnonHugePages line of /proc/meminfo.  Similarly,
>> >> >> vmstat.system.cs is the average value of the cs column of the system column
>> >> >> group of /usr/bin/vmstat.
>> >> >> 
>> >> >> We hope this information is helpful for root-causing the regression.
>> >> 
>> >> >  If it's important then I have to admit, that I fail to understand why.
>> >> >
>> >> >  If it's not important then I have to ask why is this included.
>> >> >
>> >> >> > So that allows me to reproduce that test more or less with no effort. And
>> >> >> > that's the really important part.
>> >> >> 
>> >> >> >> For reproducing, we now use the lkp-tests tool, which includes scripts to
>> >> >> >> build the test case, run the test, collect various information, and
>> >> >> >> compare the test results, using the job file attached to the report
>> >> >> >> email.  That is not the easiest way; we will continuously improve it.
>> >> >
>> >> > I know and lkp-tests is a pain to work with. So please look into a way to
>> >> > extract the relevant binaries, so it's simple for developers to reproduce.
>> >> 
>> >> >> OK.  We will try to improve on this side.  But it is not an easy task
>> >> >> for us to provide easy-to-use simple binaries.  Do you think something
>> >> >> like a Docker image is easy to use?
>> >
>> > Thomas, I presume you are interested in binaries to be positive we're
>> > reproducing with exactly the same bits? I agree that's good to have. I'd also
>> > want to have or be pointed to the sources with a straight forward way to
>> > rebuild and to inspect what exactly the test is doing. (I assume this is
>> > implied, but just to make sure it's stated).
>> 
>> lkp-tests has the scripts to download the source, apply some patches, and
>> build the binary.  It is not a very straightforward way, but the script
>> is quite simple.
>> 
>> > Huang, what makes the binaries difficult to package? And how would docker make
>> > that any simpler?
>> 
>> The binaries are not difficult to package.  But the test is not only the
>> benchmark binary.  We may do some setup for the specific test, for
>> example, change the cpu frequency governor, mkfs on a partition (may be
>> ramdisk), etc.  And we have a mechanism to collect various information
>> during the test, for example, running vmstat, sampling /proc/sched_debug,
>> etc.
>
> In that case, it would be really useful to document your setup and include a
> link to that in every report you send out.

The setup information is in the job file attached to the report email.
The job file can be used together with lkp-tests to reproduce exactly what
we did on our test machine.

> It should include things like the
> partition layout (which ideally would have normalized symlinks to facilitate
> reproduction outside your lab) (/dev/testpart -> /dev/sde3) and the
> scripts/tests should only use the symlinks. That's an over simplification of
> course, but that kind of configuration and documentation would be very helpful
> in reducing the barrier to getting people to look at the issues your testing
> discovers - and that, of course, is the whole point.

The test partition of your test machine can be specified in the <host>
file in lkp-tests.

Our current reproduction solution is based on lkp-tests.  We hope we can
improve the lkp-tests tool to make reproduction easier.

> Thank you for doing the work and accepting all this feedback.

Our pleasure.

>> As for docker, we just want to reduce the pain of using lkp-tests.
>>  
>
> Sometimes docker is a pain as well. It became completely unusable on my Debian 8
> system not too long ago. That can become another barrier if it isn't also
> documented without docker, even if it requires more steps.

OK.  I see.  Your input is valuable to us.

Best Regards,
Huang, Ying

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [LKP] [lkp] [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops
  2016-03-21  8:42             ` Huang, Ying
@ 2016-03-29  7:39               ` Peter Zijlstra
  -1 siblings, 0 replies; 28+ messages in thread
From: Peter Zijlstra @ 2016-03-29  7:39 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Thomas Gleixner, Ingo Molnar, Chris Mason, lkp,
	Sebastian Andrzej Siewior, Davidlohr Bueso, Hugh Dickins, LKML,
	Linus Torvalds, Mel Gorman, Mel Gorman, Darren Hart,
	Wu Fengguang, Xiaolong Ye

On Mon, Mar 21, 2016 at 04:42:47PM +0800, Huang, Ying wrote:
> OK.  We will try to improve on this side.  But it is not an easy task
> for us to provide easy-to-use simple binaries.  Do you think something
> like a Docker image is easy to use?

Docker muck would just make me turn around and run.


end of thread, other threads:[~2016-03-29  7:40 UTC | newest]

Thread overview: 28+ messages
2016-02-29  8:36 [lkp] [futex] 65d8fc777f: +25.6% will-it-scale.per_process_ops kernel test robot
2016-02-29  8:36 ` kernel test robot
2016-02-29  9:37 ` [lkp] " Ingo Molnar
2016-02-29  9:37   ` Ingo Molnar
2016-02-29 17:37   ` [lkp] " Davidlohr Bueso
2016-02-29 17:37     ` Davidlohr Bueso
2016-03-01  8:38   ` [LKP] [lkp] " Huang, Ying
2016-03-01  8:38     ` Huang, Ying
2016-03-18  6:12   ` [LKP] [lkp] " Huang, Ying
2016-03-18  6:12     ` Huang, Ying
2016-03-18  8:35     ` [LKP] [lkp] " Thomas Gleixner
2016-03-18  8:35       ` Thomas Gleixner
2016-03-21  2:53       ` [LKP] [lkp] " Huang, Ying
2016-03-21  2:53         ` Huang, Ying
2016-03-21  8:00         ` [LKP] [lkp] " Thomas Gleixner
2016-03-21  8:00           ` Thomas Gleixner
2016-03-21  8:42           ` [LKP] [lkp] " Huang, Ying
2016-03-21  8:42             ` Huang, Ying
2016-03-28 17:14             ` [LKP] [lkp] " Darren Hart
2016-03-28 17:14               ` Darren Hart
2016-03-29  1:12               ` [LKP] [lkp] " Huang, Ying
2016-03-29  1:12                 ` Huang, Ying
2016-03-29  5:17                 ` [LKP] [lkp] " Darren Hart
2016-03-29  5:17                   ` Darren Hart
2016-03-29  5:57                   ` [LKP] [lkp] " Huang, Ying
2016-03-29  5:57                     ` Huang, Ying
2016-03-29  7:39             ` [LKP] [lkp] " Peter Zijlstra
2016-03-29  7:39               ` Peter Zijlstra
