* [lkp] [mm/vmstat] 6cdb18ad98: -8.5% will-it-scale.per_thread_ops
@ 2016-01-06  3:20 kernel test robot
  2016-01-07 11:23 ` Heiko Carstens
  0 siblings, 1 reply; 5+ messages in thread
From: kernel test robot @ 2016-01-06  3:20 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: lkp, LKML, Andrew Morton, Christoph Lameter, Linus Torvalds

[-- Attachment #1: Type: text/plain, Size: 7487 bytes --]

FYI, we noticed the following changes on

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
commit 6cdb18ad98a49f7e9b95d538a0614cde827404b8 ("mm/vmstat: fix overflow in mod_zone_page_state()")


=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
  gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/ivb42/pread1/will-it-scale

commit: 
  cc28d6d80f6ab494b10f0e2ec949eacd610f66e3
  6cdb18ad98a49f7e9b95d538a0614cde827404b8

cc28d6d80f6ab494 6cdb18ad98a49f7e9b95d538a0 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   2733943 ±  0%      -8.5%    2502129 ±  0%  will-it-scale.per_thread_ops
      3410 ±  0%      -2.0%       3343 ±  0%  will-it-scale.time.system_time
    340.08 ±  0%     +19.7%     406.99 ±  0%  will-it-scale.time.user_time
  69882822 ±  2%     -24.3%   52926191 ±  5%  cpuidle.C1-IVT.time
    340.08 ±  0%     +19.7%     406.99 ±  0%  time.user_time
    491.25 ±  6%     -17.7%     404.25 ±  7%  numa-vmstat.node0.nr_alloc_batch
      2799 ± 20%     -36.6%       1776 ±  0%  numa-vmstat.node0.nr_mapped
    630.00 ±140%    +244.4%       2169 ±  1%  numa-vmstat.node1.nr_inactive_anon
      6440 ± 11%     -15.5%       5440 ± 16%  numa-vmstat.node1.nr_slab_reclaimable
     11204 ± 20%     -36.6%       7106 ±  0%  numa-meminfo.node0.Mapped
      1017 ±173%    +450.3%       5598 ± 15%  numa-meminfo.node1.AnonHugePages
      2521 ±140%    +244.1%       8678 ±  1%  numa-meminfo.node1.Inactive(anon)
     25762 ± 11%     -15.5%      21764 ± 16%  numa-meminfo.node1.SReclaimable
     70103 ±  9%      -9.8%      63218 ±  9%  numa-meminfo.node1.Slab
      2.29 ±  3%     +32.8%       3.04 ±  4%  perf-profile.cycles-pp.atime_needs_update.touch_atime.shmem_file_read_iter.__vfs_read.vfs_read
      1.10 ±  3%     -27.4%       0.80 ±  5%  perf-profile.cycles-pp.current_fs_time.atime_needs_update.touch_atime.shmem_file_read_iter.__vfs_read
      2.33 ±  2%     -13.0%       2.02 ±  3%  perf-profile.cycles-pp.fput.entry_SYSCALL_64_fastpath
      0.89 ±  2%     +29.6%       1.15 ±  7%  perf-profile.cycles-pp.fsnotify.vfs_read.sys_pread64.entry_SYSCALL_64_fastpath
      2.85 ±  2%     +45.4%       4.14 ±  5%  perf-profile.cycles-pp.touch_atime.shmem_file_read_iter.__vfs_read.vfs_read.sys_pread64
     63939 ±  0%     +17.9%      75370 ± 15%  sched_debug.cfs_rq:/.exec_clock.25
     72.50 ± 73%     -63.1%      26.75 ± 19%  sched_debug.cfs_rq:/.load_avg.1
     34.00 ± 62%     -61.8%      13.00 ± 12%  sched_debug.cfs_rq:/.load_avg.14
     18.00 ± 11%     -11.1%      16.00 ± 10%  sched_debug.cfs_rq:/.load_avg.20
     14.75 ± 41%    +122.0%      32.75 ± 26%  sched_debug.cfs_rq:/.load_avg.25
    278.88 ± 11%     +18.8%     331.25 ±  7%  sched_debug.cfs_rq:/.load_avg.max
     51.89 ± 11%     +13.6%      58.97 ±  4%  sched_debug.cfs_rq:/.load_avg.stddev
      7.25 ±  5%    +255.2%      25.75 ± 53%  sched_debug.cfs_rq:/.runnable_load_avg.25
     28.50 ±  1%     +55.3%      44.25 ± 46%  sched_debug.cfs_rq:/.runnable_load_avg.7
     72.50 ± 73%     -63.1%      26.75 ± 19%  sched_debug.cfs_rq:/.tg_load_avg_contrib.1
     34.00 ± 62%     -61.8%      13.00 ± 12%  sched_debug.cfs_rq:/.tg_load_avg_contrib.14
     18.00 ± 11%     -11.1%      16.00 ± 10%  sched_debug.cfs_rq:/.tg_load_avg_contrib.20
     14.75 ± 41%    +122.0%      32.75 ± 25%  sched_debug.cfs_rq:/.tg_load_avg_contrib.25
    279.29 ± 11%     +19.1%     332.67 ±  7%  sched_debug.cfs_rq:/.tg_load_avg_contrib.max
     52.01 ± 11%     +13.8%      59.18 ±  4%  sched_debug.cfs_rq:/.tg_load_avg_contrib.stddev
    359.50 ±  6%     +41.5%     508.75 ± 22%  sched_debug.cfs_rq:/.util_avg.25
    206.25 ± 16%     -13.1%     179.25 ± 11%  sched_debug.cfs_rq:/.util_avg.40
    688.75 ±  1%     +18.5%     816.00 ±  1%  sched_debug.cfs_rq:/.util_avg.7
    953467 ±  1%     -17.9%     782518 ± 10%  sched_debug.cpu.avg_idle.5
      9177 ± 43%     +73.9%      15957 ± 29%  sched_debug.cpu.nr_switches.13
      7365 ± 19%     -35.4%       4755 ± 11%  sched_debug.cpu.nr_switches.20
     12203 ± 28%     -62.2%       4608 ±  9%  sched_debug.cpu.nr_switches.22
      1868 ± 49%     -51.1%     913.50 ± 27%  sched_debug.cpu.nr_switches.27
      2546 ± 56%     -70.0%     763.00 ± 18%  sched_debug.cpu.nr_switches.28
      3003 ± 78%     -77.9%     663.00 ± 18%  sched_debug.cpu.nr_switches.33
      1820 ± 19%     +68.0%       3058 ± 33%  sched_debug.cpu.nr_switches.8
     -4.00 ±-35%    -156.2%       2.25 ± 85%  sched_debug.cpu.nr_uninterruptible.11
      4.00 ±133%    -187.5%      -3.50 ±-24%  sched_debug.cpu.nr_uninterruptible.17
      1.75 ± 74%    -214.3%      -2.00 ±-127%  sched_debug.cpu.nr_uninterruptible.25
      0.00 ±  2%      +Inf%       4.00 ± 39%  sched_debug.cpu.nr_uninterruptible.26
      2.50 ± 44%    -110.0%      -0.25 ±-591%  sched_debug.cpu.nr_uninterruptible.27
      1.33 ±154%    -287.5%      -2.50 ±-72%  sched_debug.cpu.nr_uninterruptible.32
     -1.00 ±-244%    -250.0%       1.50 ±251%  sched_debug.cpu.nr_uninterruptible.45
      3.50 ± 82%    -135.7%      -1.25 ±-66%  sched_debug.cpu.nr_uninterruptible.46
     -4.50 ±-40%    -133.3%       1.50 ±242%  sched_debug.cpu.nr_uninterruptible.6
     -3.00 ±-78%    -433.3%      10.00 ±150%  sched_debug.cpu.nr_uninterruptible.7
     10124 ± 39%     +65.8%      16783 ± 23%  sched_debug.cpu.sched_count.13
     12833 ± 23%     -54.6%       5823 ± 32%  sched_debug.cpu.sched_count.22
      1934 ± 48%     -49.8%     971.00 ± 26%  sched_debug.cpu.sched_count.27
      3065 ± 76%     -76.2%     728.25 ± 16%  sched_debug.cpu.sched_count.33
      2098 ± 24%    +664.1%      16030 ±126%  sched_debug.cpu.sched_count.5
      4653 ± 33%     +83.4%       8536 ± 25%  sched_debug.cpu.sched_goidle.15
      5061 ± 41%     -61.1%       1968 ± 13%  sched_debug.cpu.sched_goidle.22
    834.75 ± 57%     -60.2%     332.00 ± 35%  sched_debug.cpu.sched_goidle.27
    719.00 ± 71%     -63.3%     264.00 ± 19%  sched_debug.cpu.sched_goidle.28
    943.25 ±115%     -76.3%     223.25 ± 21%  sched_debug.cpu.sched_goidle.33
      2520 ± 26%    +112.4%       5353 ± 19%  sched_debug.cpu.ttwu_count.13
      5324 ± 22%     -49.7%       2679 ± 45%  sched_debug.cpu.ttwu_count.22
      2926 ± 38%    +231.1%       9690 ± 37%  sched_debug.cpu.ttwu_count.23
    277.25 ± 18%    +166.7%     739.50 ± 83%  sched_debug.cpu.ttwu_count.27
      1247 ± 61%     -76.6%     292.25 ± 11%  sched_debug.cpu.ttwu_count.28
    751.75 ± 22%    +183.9%       2134 ±  9%  sched_debug.cpu.ttwu_count.3
      6405 ± 97%     -75.9%       1542 ± 48%  sched_debug.cpu.ttwu_count.41
      5582 ±104%     -76.2%       1327 ± 55%  sched_debug.cpu.ttwu_count.43
      3201 ± 26%     -75.1%     796.75 ± 18%  sched_debug.cpu.ttwu_local.22


ivb42: Ivytown Ivy Bridge-EP
Memory: 64G

To reproduce:

        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Ying Huang

[-- Attachment #2: job.yaml --]
[-- Type: text/plain, Size: 3321 bytes --]

---
LKP_SERVER: inn
LKP_CGI_PORT: 80
LKP_CIFS_PORT: 139
testcase: will-it-scale
default-monitors:
  wait: activate-monitor
  kmsg: 
  uptime: 
  iostat: 
  vmstat: 
  numa-numastat: 
  numa-vmstat: 
  numa-meminfo: 
  proc-vmstat: 
  proc-stat:
    interval: 10
  meminfo: 
  slabinfo: 
  interrupts: 
  lock_stat: 
  latency_stats: 
  softirqs: 
  bdi_dev_mapping: 
  diskstats: 
  nfsstat: 
  cpuidle: 
  cpufreq-stats: 
  turbostat: 
  pmeter: 
  sched_debug:
    interval: 60
cpufreq_governor: performance
default-watchdogs:
  oom-killer: 
  watchdog: 
commit: 6cdb18ad98a49f7e9b95d538a0614cde827404b8
model: Ivytown Ivy Bridge-EP
nr_cpu: 48
memory: 64G
swap_partitions: LABEL=SWAP
rootfs_partition: LABEL=LKP-ROOTFS
category: benchmark
perf-profile:
  freq: 800
will-it-scale:
  test: pread1
queue: bisect
testbox: ivb42
tbox_group: ivb42
kconfig: x86_64-rhel
enqueue_time: 2016-01-05 05:00:40.511744641 +08:00
id: 374e605d3cbf102941031de6640b9edf424e5409
user: lkp
compiler: gcc-4.9
head_commit: f4366aad18b531cf15057f70e3cea09fef88c310
base_commit: 168309855a7d1e16db751e9c647119fe2d2dc878
branch: internal-eywa/master
rootfs: debian-x86_64-2015-02-07.cgz
result_root: "/result/will-it-scale/performance-pread1/ivb42/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/6cdb18ad98a49f7e9b95d538a0614cde827404b8/0"
job_file: "/lkp/scheduled/ivb42/bisect_will-it-scale-performance-pread1-debian-x86_64-2015-02-07.cgz-x86_64-rhel-6cdb18ad98a49f7e9b95d538a0614cde827404b8-20160105-44334-2cdw3i-0.yaml"
max_uptime: 1500
initrd: "/osimage/debian/debian-x86_64-2015-02-07.cgz"
bootloader_append:
- root=/dev/ram0
- user=lkp
- job=/lkp/scheduled/ivb42/bisect_will-it-scale-performance-pread1-debian-x86_64-2015-02-07.cgz-x86_64-rhel-6cdb18ad98a49f7e9b95d538a0614cde827404b8-20160105-44334-2cdw3i-0.yaml
- ARCH=x86_64
- kconfig=x86_64-rhel
- branch=internal-eywa/master
- commit=6cdb18ad98a49f7e9b95d538a0614cde827404b8
- BOOT_IMAGE=/pkg/linux/x86_64-rhel/gcc-4.9/6cdb18ad98a49f7e9b95d538a0614cde827404b8/vmlinuz-4.4.0-rc7-00013-g6cdb18a
- max_uptime=1500
- RESULT_ROOT=/result/will-it-scale/performance-pread1/ivb42/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/6cdb18ad98a49f7e9b95d538a0614cde827404b8/0
- LKP_SERVER=inn
- |2-


  earlyprintk=ttyS0,115200 systemd.log_level=err
  debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100
  panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0
  console=ttyS0,115200 console=tty0 vga=normal

  rw
lkp_initrd: "/lkp/lkp/lkp-x86_64.cgz"
modules_initrd: "/pkg/linux/x86_64-rhel/gcc-4.9/6cdb18ad98a49f7e9b95d538a0614cde827404b8/modules.cgz"
bm_initrd: "/osimage/deps/debian-x86_64-2015-02-07.cgz/lkp.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/run-ipconfig.cgz,/osimage/deps/debian-x86_64-2015-02-07.cgz/turbostat.cgz,/lkp/benchmarks/turbostat.cgz,/lkp/benchmarks/will-it-scale.cgz"
linux_headers_initrd: "/pkg/linux/x86_64-rhel/gcc-4.9/6cdb18ad98a49f7e9b95d538a0614cde827404b8/linux-headers.cgz"
repeat_to: 2
kernel: "/pkg/linux/x86_64-rhel/gcc-4.9/6cdb18ad98a49f7e9b95d538a0614cde827404b8/vmlinuz-4.4.0-rc7-00013-g6cdb18a"
dequeue_time: 2016-01-05 05:06:46.434314862 +08:00
job_state: finished
loadavg: 38.55 18.46 7.28 1/506 9298
start_time: '1451941650'
end_time: '1451941960'
version: "/lkp/lkp/.src-20160104-165204"

[-- Attachment #3: reproduce.sh --]
[-- Type: application/x-sh, Size: 4564 bytes --]


* Re: [lkp] [mm/vmstat] 6cdb18ad98: -8.5% will-it-scale.per_thread_ops
  2016-01-06  3:20 [lkp] [mm/vmstat] 6cdb18ad98: -8.5% will-it-scale.per_thread_ops kernel test robot
@ 2016-01-07 11:23 ` Heiko Carstens
  2016-01-08  5:24   ` [LKP] " Huang, Ying
  2016-01-21  6:47   ` Huang, Ying
  0 siblings, 2 replies; 5+ messages in thread
From: Heiko Carstens @ 2016-01-07 11:23 UTC (permalink / raw)
  To: kernel test robot
  Cc: lkp, LKML, Andrew Morton, Christoph Lameter, Linus Torvalds

On Wed, Jan 06, 2016 at 11:20:55AM +0800, kernel test robot wrote:
> FYI, we noticed the following changes on
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> commit 6cdb18ad98a49f7e9b95d538a0614cde827404b8 ("mm/vmstat: fix overflow in mod_zone_page_state()")
> 
> 
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
>   gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/ivb42/pread1/will-it-scale
> 
> commit: 
>   cc28d6d80f6ab494b10f0e2ec949eacd610f66e3
>   6cdb18ad98a49f7e9b95d538a0614cde827404b8
> 
> cc28d6d80f6ab494 6cdb18ad98a49f7e9b95d538a0 
> ---------------- -------------------------- 
>          %stddev     %change         %stddev
>              \          |                \  
>    2733943 ±  0%      -8.5%    2502129 ±  0%  will-it-scale.per_thread_ops
>       3410 ±  0%      -2.0%       3343 ±  0%  will-it-scale.time.system_time
>     340.08 ±  0%     +19.7%     406.99 ±  0%  will-it-scale.time.user_time
>   69882822 ±  2%     -24.3%   52926191 ±  5%  cpuidle.C1-IVT.time
>     340.08 ±  0%     +19.7%     406.99 ±  0%  time.user_time
>     491.25 ±  6%     -17.7%     404.25 ±  7%  numa-vmstat.node0.nr_alloc_batch
>       2799 ± 20%     -36.6%       1776 ±  0%  numa-vmstat.node0.nr_mapped
>     630.00 ±140%    +244.4%       2169 ±  1%  numa-vmstat.node1.nr_inactive_anon

Hmm... this is odd. I did review all callers of mod_zone_page_state() and
couldn't find anything obvious that would go wrong after the int -> long
change.
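
For context, the overflow in question comes from the delta argument of
the vmstat updaters being an int while callers can pass a long: a value
above INT_MAX gets silently truncated. Here is a minimal userspace toy
(not the kernel code; the function names are made up) showing the
effect of the widened parameter:

#include <stdio.h>

static long counter;

/* Stand-in for the old interface: delta was only an int. */
static void mod_state_int(int delta)   { counter += delta; }
/* Stand-in for the fixed interface: delta is a long. */
static void mod_state_long(long delta) { counter += delta; }

int main(void)
{
	long nr_pages = 3L * 1024 * 1024 * 1024;	/* > INT_MAX */

	counter = 0;
	mod_state_int(nr_pages);	/* implicitly truncated to int */
	printf("int delta:  %ld\n", counter);

	counter = 0;
	mod_state_long(nr_pages);
	printf("long delta: %ld\n", counter);
	return 0;
}

With the int parameter the counter ends up at -2^30 instead of 3 * 2^30,
which is the kind of corruption the int -> long change avoids.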

I also tried the "pread1_threads" test case from
https://github.com/antonblanchard/will-it-scale.git
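
Its hot loop is essentially just repeated pread() calls at offset 0 on
a small file. A rough single-threaded approximation (the real harness
runs the loop in several pinned threads and samples an ops counter; the
file location and buffer size below are illustrative):

#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	char path[] = "/tmp/willitscale.XXXXXX";
	char buf[4096];
	unsigned long ops = 0;
	int fd = mkstemp(path);

	if (fd < 0) {
		perror("mkstemp");
		return 1;
	}
	unlink(path);	/* keep the fd, drop the name */
	memset(buf, 0, sizeof(buf));
	if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
		return 1;

	for (int i = 0; i < 10000000; i++) {
		if (pread(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
			return 1;
		ops++;
	}
	printf("ops: %lu\n", ops);
	return 0;
}

The shmem_file_read_iter/touch_atime entries in the perf profile above
are consistent with such a loop when the file lives on tmpfs.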

However the results seem to vary a lot after a reboot(!), at least on s390.

So I'm not sure if this is really a regression.



* Re: [LKP] [lkp] [mm/vmstat] 6cdb18ad98: -8.5% will-it-scale.per_thread_ops
  2016-01-07 11:23 ` Heiko Carstens
@ 2016-01-08  5:24   ` Huang, Ying
  2016-01-08 11:13     ` Heiko Carstens
  2016-01-21  6:47   ` Huang, Ying
  1 sibling, 1 reply; 5+ messages in thread
From: Huang, Ying @ 2016-01-08  5:24 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Christoph Lameter, Linus Torvalds, Andrew Morton, lkp, LKML

Heiko Carstens <heiko.carstens@de.ibm.com> writes:

> On Wed, Jan 06, 2016 at 11:20:55AM +0800, kernel test robot wrote:
>> FYI, we noticed the following changes on
>> 
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>> commit 6cdb18ad98a49f7e9b95d538a0614cde827404b8 ("mm/vmstat: fix overflow in mod_zone_page_state()")
>> 
>> 
>> =========================================================================================
>> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
>>   gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/ivb42/pread1/will-it-scale
>> 
>> commit: 
>>   cc28d6d80f6ab494b10f0e2ec949eacd610f66e3
>>   6cdb18ad98a49f7e9b95d538a0614cde827404b8
>> 
>> cc28d6d80f6ab494 6cdb18ad98a49f7e9b95d538a0 
>> ---------------- -------------------------- 
>>          %stddev     %change         %stddev
>>              \          |                \  
>>    2733943 ±  0%      -8.5%    2502129 ±  0%  will-it-scale.per_thread_ops
>>       3410 ±  0%      -2.0%       3343 ±  0%  will-it-scale.time.system_time
>>     340.08 ±  0%     +19.7%     406.99 ±  0%  will-it-scale.time.user_time
>>   69882822 ±  2%     -24.3%   52926191 ±  5%  cpuidle.C1-IVT.time
>>     340.08 ±  0%     +19.7%     406.99 ±  0%  time.user_time
>>     491.25 ±  6%     -17.7%     404.25 ±  7%  numa-vmstat.node0.nr_alloc_batch
>>       2799 ± 20%     -36.6%       1776 ±  0%  numa-vmstat.node0.nr_mapped
>>     630.00 ±140%    +244.4%       2169 ±  1%  numa-vmstat.node1.nr_inactive_anon
>
> Hmm... this is odd. I did review all callers of mod_zone_page_state() and
> couldn't find anything obvious that would go wrong after the int -> long
> change.
>
> I also tried the "pread1_threads" test case from
> https://github.com/antonblanchard/will-it-scale.git
>
> However the results seem to vary a lot after a reboot(!), at least on s390.
>
> So I'm not sure if this is really a regression.

The test is quite stable on my side.  We ran the test case 7 times for
your commit and its parent.  The standard deviation is very low.

your commit:

[2493136, 2510964, 2508784, 2495632, 2506735, 2503016, 2510121]

parent commit:

[2735669, 2719566, 2739052, 2741485, 2735152, 2739356, 2739125]
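
As a quick sanity check, the mean and sample standard deviation of the
two sets of runs (a standalone snippet; the values are copied verbatim
from above) reproduce the reported -8.5%:

#include <math.h>
#include <stdio.h>

static double stats(const char *label, const double *v, int n)
{
	double mean = 0.0, var = 0.0;

	for (int i = 0; i < n; i++)
		mean += v[i];
	mean /= n;
	for (int i = 0; i < n; i++)
		var += (v[i] - mean) * (v[i] - mean);
	var /= n - 1;	/* sample variance */
	printf("%s: mean %.0f, stddev %.0f (%.2f%% of mean)\n",
	       label, mean, sqrt(var), 100.0 * sqrt(var) / mean);
	return mean;
}

int main(void)
{
	const double commit[] = { 2493136, 2510964, 2508784, 2495632,
				  2506735, 2503016, 2510121 };
	const double parent[] = { 2735669, 2719566, 2739052, 2741485,
				  2735152, 2739356, 2739125 };
	double mc = stats("commit", commit, 7);
	double mp = stats("parent", parent, 7);

	printf("change: %+.1f%%\n", 100.0 * (mc - mp) / mp);
	return 0;
}

Both standard deviations come out below 0.3% of the mean, while the gap
between the means is about 8.5%.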

The test result is stable for bisection too.  The figure below shows
the results of the good and bad commits.  The distance between them is
quite big, and the variation is quite small.

                             will-it-scale.per_thread_ops

  2.75e+06 ++--*---*--------------*---*------*---*---*-*-*-*----------*---*-+
           *.*  + + +  .*. .*.*.*  + + + .*.* + + + +        *.*.**.*   *   *
   2.7e+06 ++    *   **   *         *   *      *   *                        |
           |                                                                |
           |                                                                |
  2.65e+06 ++                                                               |
           |                                                                |
   2.6e+06 ++                                                               |
           |                                                                |
  2.55e+06 ++                                                               |
           |                                                                |
           O O O O      O O   O   O O   O   O  O O   O   O                  |
   2.5e+06 ++      O OO     O   O     O   O  O     O   O                    |
           |                                                                |
  2.45e+06 ++---------------------------------------------------------------+


FYI, I tested your patch on an x86 platform.  I have no s390 system.

Best Regards,
Huang, Ying


* Re: [LKP] [lkp] [mm/vmstat] 6cdb18ad98: -8.5% will-it-scale.per_thread_ops
  2016-01-08  5:24   ` [LKP] " Huang, Ying
@ 2016-01-08 11:13     ` Heiko Carstens
  0 siblings, 0 replies; 5+ messages in thread
From: Heiko Carstens @ 2016-01-08 11:13 UTC (permalink / raw)
  To: Huang, Ying; +Cc: Christoph Lameter, Linus Torvalds, Andrew Morton, lkp, LKML

On Fri, Jan 08, 2016 at 01:24:30PM +0800, Huang, Ying wrote:
> Heiko Carstens <heiko.carstens@de.ibm.com> writes:
> 
> > On Wed, Jan 06, 2016 at 11:20:55AM +0800, kernel test robot wrote:
> >> FYI, we noticed the following changes on
> >> 
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> >> commit 6cdb18ad98a49f7e9b95d538a0614cde827404b8 ("mm/vmstat: fix overflow in mod_zone_page_state()")
> >> 
> >> 
> >> =========================================================================================
> >> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
> >>   gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/ivb42/pread1/will-it-scale
> >> 
> >> commit: 
> >>   cc28d6d80f6ab494b10f0e2ec949eacd610f66e3
> >>   6cdb18ad98a49f7e9b95d538a0614cde827404b8
> >> 
> >> cc28d6d80f6ab494 6cdb18ad98a49f7e9b95d538a0 
> >> ---------------- -------------------------- 
> >>          %stddev     %change         %stddev
> >>              \          |                \  
> >>    2733943 ±  0%      -8.5%    2502129 ±  0%  will-it-scale.per_thread_ops
> >>       3410 ±  0%      -2.0%       3343 ±  0%  will-it-scale.time.system_time
> >>     340.08 ±  0%     +19.7%     406.99 ±  0%  will-it-scale.time.user_time
> >>   69882822 ±  2%     -24.3%   52926191 ±  5%  cpuidle.C1-IVT.time
> >>     340.08 ±  0%     +19.7%     406.99 ±  0%  time.user_time
> >>     491.25 ±  6%     -17.7%     404.25 ±  7%  numa-vmstat.node0.nr_alloc_batch
> >>       2799 ± 20%     -36.6%       1776 ±  0%  numa-vmstat.node0.nr_mapped
> >>     630.00 ±140%    +244.4%       2169 ±  1%  numa-vmstat.node1.nr_inactive_anon
> >
> > Hmm... this is odd. I did review all callers of mod_zone_page_state() and
> > couldn't find anything obvious that would go wrong after the int -> long
> > change.
> >
> > I also tried the "pread1_threads" test case from
> > https://github.com/antonblanchard/will-it-scale.git
> >
> > However the results seem to vary a lot after a reboot(!), at least on s390.
> >
> > So I'm not sure if this is really a regression.
> 
> The test is quite stable on my side.  We ran the test case 7 times for
> your commit and its parent.  The standard deviation is very low.
> 
> your commit:
> 
> [2493136, 2510964, 2508784, 2495632, 2506735, 2503016, 2510121]
> 
> parent commit:
> 
> [2735669, 2719566, 2739052, 2741485, 2735152, 2739356, 2739125]
> 
> The test result is stable for bisection too.  The figure below shows
> the results of the good and bad commits.  The distance between them is
> quite big, and the variation is quite small.

Ok, so it seems to be quite stable on your machine across reboots.

I have to admit I still cannot make much sense of this. Is the "pread1"
testcase the only one that performs worse, or are there more?

Also, could you please provide the output of /proc/zoneinfo and the output
of "perf top" for the good and bad cases? Maybe that will help figure out
what is happening.

> FYI, I tested your patch on an x86 platform.  I have no s390 system.

Sure, I wouldn't expect that.


* Re: [lkp] [mm/vmstat] 6cdb18ad98: -8.5% will-it-scale.per_thread_ops
  2016-01-07 11:23 ` Heiko Carstens
  2016-01-08  5:24   ` [LKP] " Huang, Ying
@ 2016-01-21  6:47   ` Huang, Ying
  1 sibling, 0 replies; 5+ messages in thread
From: Huang, Ying @ 2016-01-21  6:47 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: lkp, LKML, Andrew Morton, Christoph Lameter, Linus Torvalds

Heiko Carstens <heiko.carstens@de.ibm.com> writes:

> On Wed, Jan 06, 2016 at 11:20:55AM +0800, kernel test robot wrote:
>> FYI, we noticed the following changes on
>> 
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>> commit 6cdb18ad98a49f7e9b95d538a0614cde827404b8 ("mm/vmstat: fix overflow in mod_zone_page_state()")
>> 
>> 
>> =========================================================================================
>> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
>>   gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/ivb42/pread1/will-it-scale
>> 
>> commit: 
>>   cc28d6d80f6ab494b10f0e2ec949eacd610f66e3
>>   6cdb18ad98a49f7e9b95d538a0614cde827404b8
>> 
>> cc28d6d80f6ab494 6cdb18ad98a49f7e9b95d538a0 
>> ---------------- -------------------------- 
>>          %stddev     %change         %stddev
>>              \          |                \  
>>    2733943 ±  0%      -8.5%    2502129 ±  0%  will-it-scale.per_thread_ops
>>       3410 ±  0%      -2.0%       3343 ±  0%  will-it-scale.time.system_time
>>     340.08 ±  0%     +19.7%     406.99 ±  0%  will-it-scale.time.user_time
>>   69882822 ±  2%     -24.3%   52926191 ±  5%  cpuidle.C1-IVT.time
>>     340.08 ±  0%     +19.7%     406.99 ±  0%  time.user_time
>>     491.25 ±  6%     -17.7%     404.25 ±  7%  numa-vmstat.node0.nr_alloc_batch
>>       2799 ± 20%     -36.6%       1776 ±  0%  numa-vmstat.node0.nr_mapped
>>     630.00 ±140%    +244.4%       2169 ±  1%  numa-vmstat.node1.nr_inactive_anon
>
> Hmm... this is odd. I did review all callers of mod_zone_page_state() and
> couldn't find anything obvious that would go wrong after the int -> long
> change.
>
> I also tried the "pread1_threads" test case from
> https://github.com/antonblanchard/will-it-scale.git
>
> However the results seem to vary a lot after a reboot(!), at least on s390.
>
> So I'm not sure if this is really a regression.

Most of the regression is recovered in v4.4.  But because the changes
follow a "V" shape, it is hard to bisect where the recovery happened.

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-4.9/performance/x86_64-rhel/thread/24/debian-x86_64-2015-02-07.cgz/ivb42/pread1/will-it-scale

commit: 
  cc28d6d80f6ab494b10f0e2ec949eacd610f66e3
  6cdb18ad98a49f7e9b95d538a0614cde827404b8
  v4.4

cc28d6d80f6ab494 6cdb18ad98a49f7e9b95d538a0                       v4.4 
---------------- -------------------------- -------------------------- 
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \  
   3083436 ±  0%      -9.6%    2788374 ±  0%      -3.7%    2970130 ±  0%  will-it-scale.per_thread_ops
      6447 ±  0%      -2.2%       6308 ±  0%      -0.3%       6425 ±  0%  will-it-scale.time.system_time
    776.90 ±  0%     +17.9%     915.71 ±  0%      +2.9%     799.12 ±  0%  will-it-scale.time.user_time
    316177 ±  4%      -4.6%     301616 ±  3%     -10.3%     283563 ±  3%  softirqs.RCU
    776.90 ±  0%     +17.9%     915.71 ±  0%      +2.9%     799.12 ±  0%  time.user_time
    777.33 ±  7%     +20.8%     938.67 ±  7%      +7.5%     836.00 ±  8%  slabinfo.blkdev_requests.active_objs
    777.33 ±  7%     +20.8%     938.67 ±  7%      +7.5%     836.00 ±  8%  slabinfo.blkdev_requests.num_objs
  74313962 ± 44%     -16.5%   62053062 ± 41%     -49.9%   37246967 ±  8%  cpuidle.C1-IVT.time
  43381614 ± 79%     +24.4%   53966568 ±111%    +123.9%   97135791 ± 33%  cpuidle.C1E-IVT.time
     97.67 ± 36%     +95.2%     190.67 ± 63%    +122.5%     217.33 ± 41%  cpuidle.C3-IVT.usage
   3679437 ± 69%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  latency_stats.avg.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
   5177475 ± 82%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  latency_stats.max.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
  11726393 ±112%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  latency_stats.sum.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
    178.07 ±  0%      -1.3%     175.79 ±  0%      -0.8%     176.62 ±  0%  turbostat.CorWatt
      0.20 ±  2%     -16.9%       0.16 ± 18%     -11.9%       0.17 ± 17%  turbostat.Pkg%pc6
    207.38 ±  0%      -1.1%     205.13 ±  0%      -0.7%     205.99 ±  0%  turbostat.PkgWatt
      6889 ± 33%     -49.2%       3497 ± 86%     -19.4%       5552 ± 27%  numa-vmstat.node0.nr_active_anon
    483.33 ± 29%     -32.3%     327.00 ± 48%      +0.1%     483.67 ± 29%  numa-vmstat.node0.nr_page_table_pages
     27536 ± 96%     +10.9%      30535 ± 78%    +148.5%      68418 ±  2%  numa-vmstat.node0.numa_other
    214.00 ± 11%     +18.1%     252.67 ±  4%      +2.8%     220.00 ±  9%  numa-vmstat.node1.nr_kernel_stack
    370.67 ± 38%     +42.0%     526.33 ± 30%      -0.2%     370.00 ± 39%  numa-vmstat.node1.nr_page_table_pages
     61177 ± 43%      -5.2%      57976 ± 41%     -66.3%      20644 ± 10%  numa-vmstat.node1.numa_other
     78172 ± 13%     -16.1%      65573 ± 18%      -5.8%      73626 ±  9%  numa-meminfo.node0.Active
     27560 ± 33%     -49.2%      14006 ± 86%     -19.4%      22203 ± 27%  numa-meminfo.node0.Active(anon)
      3891 ± 58%     -38.1%       2407 ±100%     -58.8%       1604 ±110%  numa-meminfo.node0.AnonHugePages
      1934 ± 29%     -32.3%       1309 ± 48%      +0.1%       1936 ± 29%  numa-meminfo.node0.PageTables
     63139 ± 17%     +19.8%      75670 ± 16%      +6.0%      66937 ± 10%  numa-meminfo.node1.Active
      3432 ± 11%     +18.0%       4049 ±  4%      +2.8%       3527 ±  9%  numa-meminfo.node1.KernelStack
      1483 ± 38%     +42.0%       2106 ± 30%      -0.2%       1481 ± 39%  numa-meminfo.node1.PageTables
      1.47 ±  1%     -11.8%       1.30 ±  2%      -7.0%       1.37 ±  3%  perf-profile.cycles-pp.___might_sleep.__might_sleep.find_lock_entry.shmem_getpage_gfp.shmem_file_read_iter
      2.00 ±  2%     -11.3%       1.78 ±  2%      -7.2%       1.86 ±  2%  perf-profile.cycles-pp.__might_sleep.find_lock_entry.shmem_getpage_gfp.shmem_file_read_iter.__vfs_read
      2.30 ±  4%     +33.6%       3.07 ±  0%      -1.9%       2.26 ±  1%  perf-profile.cycles-pp.atime_needs_update.touch_atime.shmem_file_read_iter.__vfs_read.vfs_read
      1.05 ±  1%     -27.7%       0.76 ±  1%      -8.0%       0.96 ±  0%  perf-profile.cycles-pp.current_fs_time.atime_needs_update.touch_atime.shmem_file_read_iter.__vfs_read
      2.21 ±  3%     -11.9%       1.94 ±  2%      -9.4%       2.00 ±  0%  perf-profile.cycles-pp.fput.entry_SYSCALL_64_fastpath
      0.78 ±  2%     +38.5%       1.08 ±  2%     +23.1%       0.96 ±  3%  perf-profile.cycles-pp.fsnotify.vfs_read.sys_pread64.entry_SYSCALL_64_fastpath
      2.87 ±  7%     +42.6%       4.09 ±  1%      -0.3%       2.86 ±  2%  perf-profile.cycles-pp.touch_atime.shmem_file_read_iter.__vfs_read.vfs_read.sys_pread64
      6.68 ±  2%      -7.3%       6.19 ±  1%      -6.7%       6.23 ±  1%  perf-profile.cycles-pp.unlock_page.shmem_file_read_iter.__vfs_read.vfs_read.sys_pread64


Best Regards,
Huang, Ying

