* [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
From: Aaron Lu @ 2014-07-29  5:24 UTC
  To: Rik van Riel; +Cc: LKML, lkp

FYI, we noticed the following changes on

git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
commit a43455a1d572daf7b730fe12eb747d1e17411365 ("sched/numa: Ensure task_numa_migrate() checks the preferred node")

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
     94500 ~ 3%    +115.6%     203711 ~ 6%  ivb42/hackbench/50%-threads-pipe
     67745 ~ 4%     +64.1%     111174 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
    162245 ~ 3%     +94.1%     314885 ~ 6%  TOTAL proc-vmstat.numa_hint_faults_local

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
    147474 ~ 3%     +70.6%     251650 ~ 5%  ivb42/hackbench/50%-threads-pipe
     94889 ~ 3%     +46.3%     138815 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
    242364 ~ 3%     +61.1%     390465 ~ 5%  TOTAL proc-vmstat.numa_pte_updates

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
    147104 ~ 3%     +69.5%     249306 ~ 5%  ivb42/hackbench/50%-threads-pipe
     94431 ~ 3%     +43.9%     135902 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
    241535 ~ 3%     +59.5%     385209 ~ 5%  TOTAL proc-vmstat.numa_hint_faults

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
       308 ~ 8%     +24.1%        382 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
       308 ~ 8%     +24.1%        382 ~ 5%  TOTAL numa-vmstat.node0.nr_page_table_pages

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
      1234 ~ 8%     +24.0%       1530 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
      1234 ~ 8%     +24.0%       1530 ~ 5%  TOTAL numa-meminfo.node0.PageTables

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
       381 ~ 6%     -17.9%        313 ~ 6%  lkp-snb01/hackbench/50%-threads-socket
       381 ~ 6%     -17.9%        313 ~ 6%  TOTAL numa-vmstat.node1.nr_page_table_pages

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
      1528 ~ 6%     -18.0%       1253 ~ 6%  lkp-snb01/hackbench/50%-threads-socket
      1528 ~ 6%     -18.0%       1253 ~ 6%  TOTAL numa-meminfo.node1.PageTables

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
     24533 ~ 2%     -16.2%      20560 ~ 3%  ivb42/hackbench/50%-threads-pipe
     13551 ~ 2%     -10.7%      12096 ~ 2%  lkp-snb01/hackbench/50%-threads-socket
     38084 ~ 2%     -14.2%      32657 ~ 3%  TOTAL proc-vmstat.numa_pages_migrated

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
     24533 ~ 2%     -16.2%      20560 ~ 3%  ivb42/hackbench/50%-threads-pipe
     13551 ~ 2%     -10.7%      12096 ~ 2%  lkp-snb01/hackbench/50%-threads-socket
     38084 ~ 2%     -14.2%      32657 ~ 3%  TOTAL proc-vmstat.pgmigrate_success

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
      3538 ~ 7%     +11.6%       3949 ~ 7%  lkp-snb01/hackbench/50%-threads-socket
      3538 ~ 7%     +11.6%       3949 ~ 7%  TOTAL numa-vmstat.node0.nr_anon_pages

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
     14154 ~ 7%     +11.6%      15799 ~ 7%  lkp-snb01/hackbench/50%-threads-socket
     14154 ~ 7%     +11.6%      15799 ~ 7%  TOTAL numa-meminfo.node0.AnonPages

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
      3511 ~ 7%     +11.0%       3898 ~ 7%  lkp-snb01/hackbench/50%-threads-socket
      3511 ~ 7%     +11.0%       3898 ~ 7%  TOTAL numa-vmstat.node0.nr_active_anon

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
     14044 ~ 7%     +11.1%      15597 ~ 7%  lkp-snb01/hackbench/50%-threads-socket
     14044 ~ 7%     +11.1%      15597 ~ 7%  TOTAL numa-meminfo.node0.Active(anon)

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
    187958 ~ 2%     +56.6%     294375 ~ 5%  ivb42/hackbench/50%-threads-pipe
    124490 ~ 2%     +35.0%     168004 ~ 4%  lkp-snb01/hackbench/50%-threads-socket
    312448 ~ 2%     +48.0%     462379 ~ 5%  TOTAL time.minor_page_faults

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
     11.47 ~ 1%      -2.8%      11.15 ~ 1%  ivb42/hackbench/50%-threads-pipe
     11.47 ~ 1%      -2.8%      11.15 ~ 1%  TOTAL turbostat.RAM_W

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
 3.649e+08 ~ 0%      -2.4%  3.562e+08 ~ 0%  lkp-snb01/hackbench/50%-threads-socket
 3.649e+08 ~ 0%      -2.4%  3.562e+08 ~ 0%  TOTAL time.involuntary_context_switches

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
   1924472 ~ 0%      -2.6%    1874425 ~ 0%  ivb42/hackbench/50%-threads-pipe
   1924472 ~ 0%      -2.6%    1874425 ~ 0%  TOTAL vmstat.system.in

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
---------------  -------------------------  
  1.38e+09 ~ 0%      -1.8%  1.355e+09 ~ 0%  lkp-snb01/hackbench/50%-threads-socket
  1.38e+09 ~ 0%      -1.8%  1.355e+09 ~ 0%  TOTAL time.voluntary_context_switches


Legend:
	~XX%    - stddev percent
	[+-]XX% - change percent


	[*] bisect-good sample
	[O] bisect-bad  sample
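
As a worked example of the legend: in the first table, the +115.6%
change for ivb42/hackbench/50%-threads-pipe comes from the two means,
(203711 - 94500) / 94500, and the ~3% / ~6% figures are the relative
standard deviations of the runs behind each mean.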


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Thanks,
Aaron

[-- Attachment #2: reproduce --]

# Set every CPU's cpufreq governor to "performance" (cpu0..cpu31 here).
for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
do
	echo performance > "$g"
done

# Run hackbench 13 times.
for i in $(seq 13)
do
	/usr/bin/hackbench -g 16 --threads -l 60000
done


* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
From: Rik van Riel @ 2014-07-29  6:39 UTC
  To: Aaron Lu; +Cc: LKML, lkp, peterz, jhladky

On Tue, 29 Jul 2014 13:24:05 +0800
Aaron Lu <aaron.lu@intel.com> wrote:

> FYI, we noticed the following changes on
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> commit a43455a1d572daf7b730fe12eb747d1e17411365 ("sched/numa: Ensure task_numa_migrate() checks the preferred node")
> 
> ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
> ---------------  -------------------------  
>      94500 ~ 3%    +115.6%     203711 ~ 6%  ivb42/hackbench/50%-threads-pipe
>      67745 ~ 4%     +64.1%     111174 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
>     162245 ~ 3%     +94.1%     314885 ~ 6%  TOTAL proc-vmstat.numa_hint_faults_local

Hi Aaron,

Jirka Hladky has reported a regression with that changeset as
well, and I have already spent some time debugging the issue.

I added tracing code to task_numa_compare() and saw a number
of thread swaps with tiny improvements.

Does preventing those help your workload, or am I barking up
the wrong tree again?  (I have been looking at this for a while...)
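
A minimal user-space sketch (illustrative only, not the kernel code) of
the difference between the old hysteresis and the threshold in the patch
below: "imp -= imp/16" only shrinks a tiny positive improvement, so the
swap still proceeds, while a hard threshold rejects it outright.

#include <stdio.h>

/*
 * Standalone illustration (not kernel code).  NUMA_SCALE and
 * NUMA_MOVE_THRESH mirror the values used in the patch below.
 */
#define NUMA_SCALE 1000
#define NUMA_MOVE_THRESH 50

int main(void)
{
	long imp;

	for (imp = 10; imp <= 90; imp += 20) {
		/* old behaviour: shrink imp by 1/16; a small positive imp stays positive */
		long shrunk = imp - imp / 16;
		/* new behaviour: reject anything below the 5% threshold outright */
		const char *verdict = imp < NUMA_MOVE_THRESH ? "rejected" : "allowed";

		printf("imp=%2ld: hysteresis leaves %2ld (swap proceeds), threshold: %s\n",
		       imp, shrunk, verdict);
	}
	return 0;
}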

---8<---

Subject: sched,numa: prevent task moves with marginal benefit

Commit a43455a1d57 makes task_numa_migrate() always check the
preferred node for task placement. This is causing a performance
regression with hackbench, as well as SPECjbb2005.

Tracing task_numa_compare() with a single instance of SPECjbb2005
on a 4 node system, I have seen several thread swaps with tiny
improvements. 

It appears that the hysteresis code that was added to task_numa_compare
is not doing what we needed it to do, and a simple threshold could be
better.

Reported-by: Aaron Lu <aaron.lu@intel.com>
Reported-by: Jirka Hladky <jhladky@redhat.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
---
 kernel/sched/fair.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4f5e3c2..bedbc3e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -924,10 +924,12 @@ static inline unsigned long group_faults_cpu(struct numa_group *group, int nid)
 
 /*
  * These return the fraction of accesses done by a particular task, or
- * task group, on a particular numa node.  The group weight is given a
- * larger multiplier, in order to group tasks together that are almost
- * evenly spread out between numa nodes.
+ * task group, on a particular numa node.  The NUMA move threshold
+ * prevents task moves with marginal improvement, and is set to 5%.
  */
+#define NUMA_SCALE 1000
+#define NUMA_MOVE_THRESH 50
+
 static inline unsigned long task_weight(struct task_struct *p, int nid)
 {
 	unsigned long total_faults;
@@ -940,7 +942,7 @@ static inline unsigned long task_weight(struct task_struct *p, int nid)
 	if (!total_faults)
 		return 0;
 
-	return 1000 * task_faults(p, nid) / total_faults;
+	return NUMA_SCALE * task_faults(p, nid) / total_faults;
 }
 
 static inline unsigned long group_weight(struct task_struct *p, int nid)
@@ -948,7 +950,7 @@ static inline unsigned long group_weight(struct task_struct *p, int nid)
 	if (!p->numa_group || !p->numa_group->total_faults)
 		return 0;
 
-	return 1000 * group_faults(p, nid) / p->numa_group->total_faults;
+	return NUMA_SCALE * group_faults(p, nid) / p->numa_group->total_faults;
 }
 
 bool should_numa_migrate_memory(struct task_struct *p, struct page * page,
@@ -1181,11 +1183,11 @@ static void task_numa_compare(struct task_numa_env *env,
 			imp = taskimp + task_weight(cur, env->src_nid) -
 			      task_weight(cur, env->dst_nid);
 			/*
-			 * Add some hysteresis to prevent swapping the
-			 * tasks within a group over tiny differences.
+			 * Do not swap tasks within a group around unless
+			 * there is a significant improvement.
 			 */
-			if (cur->numa_group)
-				imp -= imp/16;
+			if (cur->numa_group && imp < NUMA_MOVE_THRESH)
+				goto unlock;
 		} else {
 			/*
 			 * Compare the group weights. If a task is all by
@@ -1205,6 +1207,10 @@ static void task_numa_compare(struct task_numa_env *env,
 		goto unlock;
 
 	if (!cur) {
+		/* Only move if there is a significant improvement. */
+		if (imp < NUMA_MOVE_THRESH)
+			goto unlock;
+
 		/* Is there capacity at our destination? */
 		if (env->src_stats.has_free_capacity &&
 		    !env->dst_stats.has_free_capacity)

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
From: Peter Zijlstra @ 2014-07-29  8:17 UTC
  To: Rik van Riel; +Cc: Aaron Lu, LKML, lkp, jhladky

On Tue, Jul 29, 2014 at 02:39:40AM -0400, Rik van Riel wrote:
> Subject: sched,numa: prevent task moves with marginal benefit
> 
> Commit a43455a1d57 makes task_numa_migrate() always check the
> preferred node for task placement. This is causing a performance
> regression with hackbench, as well as SPECjbb2005.
> 
> Tracing task_numa_compare() with a single instance of SPECjbb2005
> on a 4 node system, I have seen several thread swaps with tiny
> improvements. 
> 
> It appears that the hysteresis code that was added to task_numa_compare
> is not doing what we needed it to do, and a simple threshold could be
> better.
> 
> Reported-by: Aaron Lu <aaron.lu@intel.com>
> Reported-by: Jirka Hladky <jhladky@redhat.com>
> Signed-off-by: Rik van Riel <riel@redhat.com>
> ---
>  kernel/sched/fair.c | 24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 4f5e3c2..bedbc3e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -924,10 +924,12 @@ static inline unsigned long group_faults_cpu(struct numa_group *group, int nid)
>  
>  /*
>   * These return the fraction of accesses done by a particular task, or
> - * task group, on a particular numa node.  The group weight is given a
> - * larger multiplier, in order to group tasks together that are almost
> - * evenly spread out between numa nodes.
> + * task group, on a particular numa node.  The NUMA move threshold
> + * prevents task moves with marginal improvement, and is set to 5%.
>   */
> +#define NUMA_SCALE 1000
> +#define NUMA_MOVE_THRESH 50

Please make that 1024, there's no reason not to use a power of two here.
This base 10 factor thing has annoyed me no end already; it's time for it
to die.
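
A small illustrative aside (not from the thread): with a power-of-two
scale, binary fractions of the scale are exact, while percentages
truncate.

#include <stdio.h>

/* NUMA_SCALE = 1024 is 1 << 10, so 1/16 of it is exact (6.25%),
 * whereas 5 * 1024 / 100 truncates 51.2 down to 51. */
int main(void)
{
	const unsigned long scale = 1024;

	printf("scale / 16      = %lu (exactly 6.25%%)\n", scale / 16);
	printf("5 * scale / 100 = %lu (~5%%, truncated)\n", 5 * scale / 100);
	return 0;
}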

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
From: Rik van Riel @ 2014-07-29 20:04 UTC
  To: Peter Zijlstra; +Cc: Aaron Lu, LKML, lkp, jhladky

On Tue, 29 Jul 2014 10:17:12 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> > +#define NUMA_SCALE 1000
> > +#define NUMA_MOVE_THRESH 50
> 
> Please make that 1024, there's no reason not to use a power of two here.
> This base 10 factor thing has annoyed me no end already; it's time for it
> to die.

That's easy enough.  However, it would be good to know whether
this actually helps with the regression Aaron found :)

---8<---

Subject: sched,numa: prevent task moves with marginal benefit

Commit a43455a1d57 makes task_numa_migrate() always check the
preferred node for task placement. This is causing a performance
regression with hackbench, as well as SPECjbb2005.

Tracing task_numa_compare() with a single instance of SPECjbb2005
on a 4 node system, I have seen several thread swaps with tiny
improvements. 

It appears that the hysteresis code that was added to task_numa_compare
is not doing what we needed it to do, and a simple threshold could be
better.

Aaron, does this patch help, or am I barking up the wrong tree?

Reported-by: Aaron Lu <aaron.lu@intel.com>
Reported-by: Jirka Hladky <jhladky@redhat.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
---
 kernel/sched/fair.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4f5e3c2..9bd283b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -924,10 +924,12 @@ static inline unsigned long group_faults_cpu(struct numa_group *group, int nid)
 
 /*
  * These return the fraction of accesses done by a particular task, or
- * task group, on a particular numa node.  The group weight is given a
- * larger multiplier, in order to group tasks together that are almost
- * evenly spread out between numa nodes.
+ * task group, on a particular numa node.  The NUMA move threshold
+ * prevents task moves with marginal improvement, and is set to 5%.
  */
+#define NUMA_SCALE 1024
+#define NUMA_MOVE_THRESH (5 * NUMA_SCALE / 100)
+
 static inline unsigned long task_weight(struct task_struct *p, int nid)
 {
 	unsigned long total_faults;
@@ -940,7 +942,7 @@ static inline unsigned long task_weight(struct task_struct *p, int nid)
 	if (!total_faults)
 		return 0;
 
-	return 1000 * task_faults(p, nid) / total_faults;
+	return NUMA_SCALE * task_faults(p, nid) / total_faults;
 }
 
 static inline unsigned long group_weight(struct task_struct *p, int nid)
@@ -948,7 +950,7 @@ static inline unsigned long group_weight(struct task_struct *p, int nid)
 	if (!p->numa_group || !p->numa_group->total_faults)
 		return 0;
 
-	return 1000 * group_faults(p, nid) / p->numa_group->total_faults;
+	return NUMA_SCALE * group_faults(p, nid) / p->numa_group->total_faults;
 }
 
 bool should_numa_migrate_memory(struct task_struct *p, struct page * page,
@@ -1181,11 +1183,11 @@ static void task_numa_compare(struct task_numa_env *env,
 			imp = taskimp + task_weight(cur, env->src_nid) -
 			      task_weight(cur, env->dst_nid);
 			/*
-			 * Add some hysteresis to prevent swapping the
-			 * tasks within a group over tiny differences.
+			 * Do not swap tasks within a group around unless
+			 * there is a significant improvement.
 			 */
-			if (cur->numa_group)
-				imp -= imp/16;
+			if (cur->numa_group && imp < NUMA_MOVE_THRESH)
+				goto unlock;
 		} else {
 			/*
 			 * Compare the group weights. If a task is all by
@@ -1205,6 +1207,10 @@ static void task_numa_compare(struct task_numa_env *env,
 		goto unlock;
 
 	if (!cur) {
+		/* Only move if there is a significant improvement. */
+		if (imp < NUMA_MOVE_THRESH)
+			goto unlock;
+
 		/* Is there capacity at our destination? */
 		if (env->src_stats.has_free_capacity &&
 		    !env->dst_stats.has_free_capacity)


* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
From: Aaron Lu @ 2014-07-30  2:14 UTC
  To: Rik van Riel; +Cc: Peter Zijlstra, LKML, lkp, jhladky

On Tue, Jul 29, 2014 at 04:04:37PM -0400, Rik van Riel wrote:
> On Tue, 29 Jul 2014 10:17:12 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > > +#define NUMA_SCALE 1000
> > > +#define NUMA_MOVE_THRESH 50
> > 
> > Please make that 1024, there's no reason not to use a power of two here.
> > This base 10 factor thing has annoyed me no end already; it's time for it
> > to die.
> 
> That's easy enough.  However, it would be good to know whether
> this actually helps with the regression Aaron found :)

Sorry for the delay.

I applied the last patch and queued the hackbench job to the ivb42 test
machine for it to run 5 times, and here is the result (regarding the
proc-vmstat.numa_hint_faults_local field):
173565
201262
192317
198342
198595
avg:
192816

It seems it is still much bigger than on previous kernels.
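
(For scale: an average of 192816 on ivb42 is still about +104% over the
94500 measured on the parent commit, only modestly below the +115.6%
from the original report.)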

BTW, to highlight changes we only include metrics that changed
significantly in the report; metrics that do not appear in the report
did not change much. But just in case, here is the throughput metric
for commit a43455a1d (compared to its parent):

ebe06187bf2aec1   a43455a1d572daf7b730fe12e  
---------------   -------------------------  
118881 ~ 0%            +1.2%    120325 ~ 0% ivb42/hackbench/50%-threads-pipe
 78410 ~ 0%            +0.6%     78857 ~ 0% lkp-snb01/hackbench/50%-threads-socket
197292 ~ 0%            +1.0%    199182 ~ 0% TOTAL hackbench.throughput

Feel free to let me know if you need more information.

Thanks,
Aaron

> 
> ---8<---
> 
> Subject: sched,numa: prevent task moves with marginal benefit
> 
> Commit a43455a1d57 makes task_numa_migrate() always check the
> preferred node for task placement. This is causing a performance
> regression with hackbench, as well as SPECjbb2005.
> 
> Tracing task_numa_compare() with a single instance of SPECjbb2005
> on a 4 node system, I have seen several thread swaps with tiny
> improvements. 
> 
> It appears that the hysteresis code that was added to task_numa_compare
> is not doing what we needed it to do, and a simple threshold could be
> better.
> 
> Aaron, does this patch help, or am I barking up the wrong tree?
> 
> Reported-by: Aaron Lu <aaron.lu@intel.com>
> Reported-by: Jirka Hladky <jhladky@redhat.com>
> Signed-off-by: Rik van Riel <riel@redhat.com>
> ---
>  kernel/sched/fair.c | 24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 4f5e3c2..9bd283b 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -924,10 +924,12 @@ static inline unsigned long group_faults_cpu(struct numa_group *group, int nid)
>  
>  /*
>   * These return the fraction of accesses done by a particular task, or
> - * task group, on a particular numa node.  The group weight is given a
> - * larger multiplier, in order to group tasks together that are almost
> - * evenly spread out between numa nodes.
> + * task group, on a particular numa node.  The NUMA move threshold
> + * prevents task moves with marginal improvement, and is set to 5%.
>   */
> +#define NUMA_SCALE 1024
> +#define NUMA_MOVE_THRESH (5 * NUMA_SCALE / 100)
> +
>  static inline unsigned long task_weight(struct task_struct *p, int nid)
>  {
>  	unsigned long total_faults;
> @@ -940,7 +942,7 @@ static inline unsigned long task_weight(struct task_struct *p, int nid)
>  	if (!total_faults)
>  		return 0;
>  
> -	return 1000 * task_faults(p, nid) / total_faults;
> +	return NUMA_SCALE * task_faults(p, nid) / total_faults;
>  }
>  
>  static inline unsigned long group_weight(struct task_struct *p, int nid)
> @@ -948,7 +950,7 @@ static inline unsigned long group_weight(struct task_struct *p, int nid)
>  	if (!p->numa_group || !p->numa_group->total_faults)
>  		return 0;
>  
> -	return 1000 * group_faults(p, nid) / p->numa_group->total_faults;
> +	return NUMA_SCALE * group_faults(p, nid) / p->numa_group->total_faults;
>  }
>  
>  bool should_numa_migrate_memory(struct task_struct *p, struct page * page,
> @@ -1181,11 +1183,11 @@ static void task_numa_compare(struct task_numa_env *env,
>  			imp = taskimp + task_weight(cur, env->src_nid) -
>  			      task_weight(cur, env->dst_nid);
>  			/*
> -			 * Add some hysteresis to prevent swapping the
> -			 * tasks within a group over tiny differences.
> +			 * Do not swap tasks within a group around unless
> +			 * there is a significant improvement.
>  			 */
> -			if (cur->numa_group)
> -				imp -= imp/16;
> +			if (cur->numa_group && imp < NUMA_MOVE_THRESH)
> +				goto unlock;
>  		} else {
>  			/*
>  			 * Compare the group weights. If a task is all by
> @@ -1205,6 +1207,10 @@ static void task_numa_compare(struct task_numa_env *env,
>  		goto unlock;
>  
>  	if (!cur) {
> +		/* Only move if there is a significant improvement. */
> +		if (imp < NUMA_MOVE_THRESH)
> +			goto unlock;
> +
>  		/* Is there capacity at our destination? */
>  		if (env->src_stats.has_free_capacity &&
>  		    !env->dst_stats.has_free_capacity)
> 

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
From: Rik van Riel @ 2014-07-30 14:25 UTC
  To: Aaron Lu; +Cc: Peter Zijlstra, LKML, lkp, jhladky

On 07/29/2014 10:14 PM, Aaron Lu wrote:
> On Tue, Jul 29, 2014 at 04:04:37PM -0400, Rik van Riel wrote:
>> On Tue, 29 Jul 2014 10:17:12 +0200
>> Peter Zijlstra <peterz@infradead.org> wrote:
>>
>>>> +#define NUMA_SCALE 1000
>>>> +#define NUMA_MOVE_THRESH 50
>>>
>>> Please make that 1024, there's no reason not to use a power of two here.
>>> This base 10 factor thing has annoyed me no end already; it's time for it
>>> to die.
>>
>> That's easy enough.  However, it would be good to know whether
>> this actually helps with the regression Aaron found :)
> 
> Sorry for the delay.
> 
> I applied the last patch and queued the hackbench job to the ivb42 test
> machine for it to run 5 times, and here is the result(regarding the
> proc-vmstat.numa_hint_faults_local field):
> 173565
> 201262
> 192317
> 198342
> 198595
> avg:
> 192816
> 
> It seems it is still much bigger than on previous kernels.

It looks like a step in the right direction, though.

Could you try running with a larger threshold?

>> +++ b/kernel/sched/fair.c
>> @@ -924,10 +924,12 @@ static inline unsigned long group_faults_cpu(struct numa_group *group, int nid)
>>  
>>  /*
>>   * These return the fraction of accesses done by a particular task, or
>> - * task group, on a particular numa node.  The group weight is given a
>> - * larger multiplier, in order to group tasks together that are almost
>> - * evenly spread out between numa nodes.
>> + * task group, on a particular numa node.  The NUMA move threshold
>> + * prevents task moves with marginal improvement, and is set to 5%.
>>   */
>> +#define NUMA_SCALE 1024
>> +#define NUMA_MOVE_THRESH (5 * NUMA_SCALE / 100)

It would be good to see if changing NUMA_MOVE_THRESH to
(NUMA_SCALE / 8) does the trick.
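
(For reference, 5 * NUMA_SCALE / 100 is 51, roughly 5% of 1024, while
NUMA_SCALE / 8 is 128, an exact 12.5%, i.e. about 2.5 times the current
threshold.)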

I will run the same thing here with SPECjbb2005.


* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
From: Aaron Lu @ 2014-07-31  5:04 UTC
  To: Rik van Riel; +Cc: Peter Zijlstra, LKML, lkp, jhladky

On Wed, Jul 30, 2014 at 10:25:03AM -0400, Rik van Riel wrote:
> On 07/29/2014 10:14 PM, Aaron Lu wrote:
> > On Tue, Jul 29, 2014 at 04:04:37PM -0400, Rik van Riel wrote:
> >> On Tue, 29 Jul 2014 10:17:12 +0200
> >> Peter Zijlstra <peterz@infradead.org> wrote:
> >>
> >>>> +#define NUMA_SCALE 1000
> >>>> +#define NUMA_MOVE_THRESH 50
> >>>
> >>> Please make that 1024, there's no reason not to use a power of two here.
> >>> This base 10 factor thing has annoyed me no end already; it's time for it
> >>> to die.
> >>
> >> That's easy enough.  However, it would be good to know whether
> >> this actually helps with the regression Aaron found :)
> > 
> > Sorry for the delay.
> > 
> > I applied the last patch and queued the hackbench job to the ivb42 test
> > machine for it to run 5 times, and here is the result(regarding the
> > proc-vmstat.numa_hint_faults_local field):
> > 173565
> > 201262
> > 192317
> > 198342
> > 198595
> > avg:
> > 192816
> > 
> > It seems it is still much bigger than on previous kernels.
> 
> It looks like a step in the right direction, though.
> 
> Could you try running with a larger threshold?
> 
> >> +++ b/kernel/sched/fair.c
> >> @@ -924,10 +924,12 @@ static inline unsigned long group_faults_cpu(struct numa_group *group, int nid)
> >>  
> >>  /*
> >>   * These return the fraction of accesses done by a particular task, or
> >> - * task group, on a particular numa node.  The group weight is given a
> >> - * larger multiplier, in order to group tasks together that are almost
> >> - * evenly spread out between numa nodes.
> >> + * task group, on a particular numa node.  The NUMA move threshold
> >> + * prevents task moves with marginal improvement, and is set to 5%.
> >>   */
> >> +#define NUMA_SCALE 1024
> >> +#define NUMA_MOVE_THRESH (5 * NUMA_SCALE / 100)
> 
> It would be good to see if changing NUMA_MOVE_THRESH to
> (NUMA_SCALE / 8) does the trick.

With your 2nd patch and the above change, the result is:

"proc-vmstat.numa_hint_faults_local": [
  199708,
  209152,
  200638,
  187324,
  196654
  ],

avg:
198695

Regards,
Aaron

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-31  5:04               ` Aaron Lu
@ 2014-07-31  6:22                 ` Rik van Riel
  -1 siblings, 0 replies; 66+ messages in thread
From: Rik van Riel @ 2014-07-31  6:22 UTC (permalink / raw)
  To: Aaron Lu; +Cc: Peter Zijlstra, LKML, lkp, jhladky

On 07/31/2014 01:04 AM, Aaron Lu wrote:
> On Wed, Jul 30, 2014 at 10:25:03AM -0400, Rik van Riel wrote:
>> On 07/29/2014 10:14 PM, Aaron Lu wrote:
>>> On Tue, Jul 29, 2014 at 04:04:37PM -0400, Rik van Riel wrote:
>>>> On Tue, 29 Jul 2014 10:17:12 +0200 Peter Zijlstra
>>>> <peterz@infradead.org> wrote:
>>>> 
>>>>>> +#define NUMA_SCALE 1000 +#define NUMA_MOVE_THRESH 50
>>>>> 
>>>>> Please make that 1024, there's no reason not to use a power
>>>>> of two here. This base 10 factor thing annoyed me no end
>>>>> already, it's time for it to die.
>>>> 
>>>> That's easy enough.  However, it would be good to know
>>>> whether this actually helps with the regression Aaron found
>>>> :)
>>> 
>>> Sorry for the delay.
>>> 
>>> I applied the last patch and queued the hackbench job to the
>>> ivb42 test machine for it to run 5 times, and here is the
>>> result (regarding the proc-vmstat.numa_hint_faults_local
>>> field): 173565 201262 192317 198342 198595 avg: 192816
>>> 
>>> It seems it is still much bigger than with previous kernels.
>> 
>> It looks like a step in the right direction, though.
>> 
>> Could you try running with a larger threshold?
>> 
>>>> +++ b/kernel/sched/fair.c @@ -924,10 +924,12 @@ static inline
>>>> unsigned long group_faults_cpu(struct numa_group *group, int
>>>> nid)
>>>> 
>>>> /* * These return the fraction of accesses done by a
>>>> particular task, or - * task group, on a particular numa
>>>> node.  The group weight is given a - * larger multiplier, in
>>>> order to group tasks together that are almost - * evenly
>>>> spread out between numa nodes. + * task group, on a
>>>> particular numa node.  The NUMA move threshold + * prevents
>>>> task moves with marginal improvement, and is set to 5%. */ 
>>>> +#define NUMA_SCALE 1024 +#define NUMA_MOVE_THRESH (5 *
>>>> NUMA_SCALE / 100)
>> 
>> It would be good to see if changing NUMA_MOVE_THRESH to 
>> (NUMA_SCALE / 8) does the trick.
> 
> With your 2nd patch and the above change, the result is:
> 
> "proc-vmstat.numa_hint_faults_local": [ 199708, 209152, 200638, 
> 187324, 196654 ],
> 
> avg: 198695

OK, so it is still a little higher than your original 162245.

I guess this is to be expected, since the code will be more
successful at placing a task on the right node, which results
in the task scanning its memory more rapidly for a little bit.

Are you seeing any changes in throughput?

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-31  5:04               ` Aaron Lu
@ 2014-07-31  6:42                 ` Rik van Riel
  -1 siblings, 0 replies; 66+ messages in thread
From: Rik van Riel @ 2014-07-31  6:42 UTC (permalink / raw)
  To: Aaron Lu; +Cc: Peter Zijlstra, LKML, lkp, jhladky

On Thu, 31 Jul 2014 13:04:54 +0800
Aaron Lu <aaron.lu@intel.com> wrote:

> On Wed, Jul 30, 2014 at 10:25:03AM -0400, Rik van Riel wrote:
> > On 07/29/2014 10:14 PM, Aaron Lu wrote:

> > >> +#define NUMA_SCALE 1024
> > >> +#define NUMA_MOVE_THRESH (5 * NUMA_SCALE / 100)
> > 
> > It would be good to see if changing NUMA_MOVE_THRESH to
> > (NUMA_SCALE / 8) does the trick.

FWIW, running with NUMA_MOVE_THRESH set to (NUMA_SCALE / 8)
seems to resolve the SPECjbb2005 regression on my system.

I will run some more sanity tests later today...

-- 
All rights reversed.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-31  6:22                 ` Rik van Riel
@ 2014-07-31  6:53                   ` Aaron Lu
  -1 siblings, 0 replies; 66+ messages in thread
From: Aaron Lu @ 2014-07-31  6:53 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Peter Zijlstra, LKML, lkp, jhladky

On Thu, Jul 31, 2014 at 02:22:55AM -0400, Rik van Riel wrote:
> 
> On 07/31/2014 01:04 AM, Aaron Lu wrote:
> > On Wed, Jul 30, 2014 at 10:25:03AM -0400, Rik van Riel wrote:
> >> On 07/29/2014 10:14 PM, Aaron Lu wrote:
> >>> On Tue, Jul 29, 2014 at 04:04:37PM -0400, Rik van Riel wrote:
> >>>> On Tue, 29 Jul 2014 10:17:12 +0200 Peter Zijlstra
> >>>> <peterz@infradead.org> wrote:
> >>>> 
> >>>>>> +#define NUMA_SCALE 1000 +#define NUMA_MOVE_THRESH 50
> >>>>> 
> >>>>> Please make that 1024, there's no reason not to use a power
> >>>>> of two here. This base 10 factor thing annoyed me no end
> >>>>> already, it's time for it to die.
> >>>> 
> >>>> That's easy enough.  However, it would be good to know
> >>>> whether this actually helps with the regression Aaron found
> >>>> :)
> >>> 
> >>> Sorry for the delay.
> >>> 
> >>> I applied the last patch and queued the hackbench job to the
> >>> ivb42 test machine for it to run 5 times, and here is the
> >>> result (regarding the proc-vmstat.numa_hint_faults_local
> >>> field): 173565 201262 192317 198342 198595 avg: 192816
> >>> 
> >>> It seems it is still much bigger than with previous kernels.
> >> 
> >> It looks like a step in the right direction, though.
> >> 
> >> Could you try running with a larger threshold?
> >> 
> >>>> +++ b/kernel/sched/fair.c @@ -924,10 +924,12 @@ static inline
> >>>> unsigned long group_faults_cpu(struct numa_group *group, int
> >>>> nid)
> >>>> 
> >>>> /* * These return the fraction of accesses done by a
> >>>> particular task, or - * task group, on a particular numa
> >>>> node.  The group weight is given a - * larger multiplier, in
> >>>> order to group tasks together that are almost - * evenly
> >>>> spread out between numa nodes. + * task group, on a
> >>>> particular numa node.  The NUMA move threshold + * prevents
> >>>> task moves with marginal improvement, and is set to 5%. */ 
> >>>> +#define NUMA_SCALE 1024 +#define NUMA_MOVE_THRESH (5 *
> >>>> NUMA_SCALE / 100)
> >> 
> >> It would be good to see if changing NUMA_MOVE_THRESH to 
> >> (NUMA_SCALE / 8) does the trick.
> > 
> > With your 2nd patch and the above change, the result is:
> > 
> > "proc-vmstat.numa_hint_faults_local": [ 199708, 209152, 200638, 
> > 187324, 196654 ],
> > 
> > avg: 198695
> 
> OK, so it is still a little higher than your original 162245.

The original number is 94500 for the ivb42 machine; 162245 is the sum
of the two numbers above it, which were measured on two machines - one
is the number for ivb42 and one is for lkp-snb01. Sorry if that was not
clear.

And the numbers I have given with your patch applied are all for ivb42
alone.

> 
> I guess this is to be expected, since the code will be more
> successful at placing a task on the right node, which results
> in the task scanning its memory more rapidly for a little bit.
> 
> Are you seeing any changes in throughput?

The throughput shows almost no change. Your 2nd patch with the scale
changed shows a decrease of 0.1% compared to your original commit that
triggered the report, and that original commit shows an increase of 1.2%
compared to its parent commit.

Regards,
Aaron

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-30  2:14           ` Aaron Lu
@ 2014-07-31  8:33             ` Peter Zijlstra
  -1 siblings, 0 replies; 66+ messages in thread
From: Peter Zijlstra @ 2014-07-31  8:33 UTC (permalink / raw)
  To: Aaron Lu; +Cc: Rik van Riel, LKML, lkp, jhladky

[-- Attachment #1: Type: text/plain, Size: 249 bytes --]

On Wed, Jul 30, 2014 at 10:14:25AM +0800, Aaron Lu wrote:
> 118881 ~ 0%            +1.2%    120325 ~ 0% ivb42/hackbench/50%-threads-pipe

What kind of IVB is that, EP or EX (or rather, how many sockets)? Also,
what arguments to hackbench do you use?


[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-31  8:33             ` Peter Zijlstra
@ 2014-07-31  8:56               ` Aaron Lu
  -1 siblings, 0 replies; 66+ messages in thread
From: Aaron Lu @ 2014-07-31  8:56 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Rik van Riel, LKML, lkp, jhladky

On Thu, Jul 31, 2014 at 10:33:30AM +0200, Peter Zijlstra wrote:
> On Wed, Jul 30, 2014 at 10:14:25AM +0800, Aaron Lu wrote:
> > 118881 ~ 0%            +1.2%    120325 ~ 0% ivb42/hackbench/50%-threads-pipe
> 
> What kind of IVB is that EP or EX (or rather, how many sockets)? Also
> what arguments to hackbench do you use?
> 

2 sockets EP.

The cmdline is:
/usr/bin/hackbench -g 24 --threads --pipe -l 60000
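
For reference, the flags mean (standard hackbench semantics):
  -g 24      24 sender/receiver groups; by default each group is
             20 senders plus 20 receivers, so 960 tasks here
  --threads  use threads rather than processes
  --pipe     communicate over pipes rather than sockets
  -l 60000   each sender transmits 60000 messages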

Regards,
Aaron

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-29  6:39     ` Rik van Riel
@ 2014-07-31 10:42       ` Peter Zijlstra
  -1 siblings, 0 replies; 66+ messages in thread
From: Peter Zijlstra @ 2014-07-31 10:42 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Aaron Lu, LKML, lkp, jhladky

[-- Attachment #1: Type: text/plain, Size: 2392 bytes --]

On Tue, Jul 29, 2014 at 02:39:40AM -0400, Rik van Riel wrote:
> On Tue, 29 Jul 2014 13:24:05 +0800
> Aaron Lu <aaron.lu@intel.com> wrote:
> 
> > FYI, we noticed the below changes on
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> > commit a43455a1d572daf7b730fe12eb747d1e17411365 ("sched/numa: Ensure task_numa_migrate() checks the preferred node")
> > 
> > ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
> > ---------------  -------------------------  
> >      94500 ~ 3%    +115.6%     203711 ~ 6%  ivb42/hackbench/50%-threads-pipe
> >      67745 ~ 4%     +64.1%     111174 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
> >     162245 ~ 3%     +94.1%     314885 ~ 6%  TOTAL proc-vmstat.numa_hint_faults_local
> 
> Hi Aaron,
> 
> Jirka Hladky has reported a regression with that changeset as
> well, and I have already spent some time debugging the issue.

So assuming those numbers above are the difference in
numa_hint_faults_local, the report is actually a significant
_improvement_, not a regression.

On my IVB-EP I get similar numbers; using:

  PRE=`grep numa_hint_faults_local /proc/vmstat | cut -d' ' -f2`
  perf bench sched messaging -g 24 -t -p -l 60000
  POST=`grep numa_hint_faults_local /proc/vmstat | cut -d' ' -f2`
  echo $((POST-PRE))
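
The same recipe works for the total counter as well; a sketch, using an
exact field match since "numa_hint_faults" is also a prefix of the
_local counter:

  PRE=`awk '$1 == "numa_hint_faults" {print $2}' /proc/vmstat`
  perf bench sched messaging -g 24 -t -p -l 60000
  POST=`awk '$1 == "numa_hint_faults" {print $2}' /proc/vmstat`
  echo $((POST-PRE))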


tip/master+origin/master		tip/master+origin/master-a43455a1d57

local	total                   local	total
faults  time                    faults  time

19971	51.384                  10104	50.838
17193	50.564                  9116	50.208
13435	49.057                  8332	51.344
23794	50.795                  9954	51.364
20255	49.463                  9598	51.258

18929.6	50.2526                 9420.8	51.0024	(avg)
3863.61	0.96                    717.78	0.49	(stdev)

So that patch improves both local faults and runtime. It's good (even
though for the runtime we're still inside stdev overlap, so ideally I'd
do more runs).


Now I also did a run with the proposed patch, NUMA_SCALE/8 variant, and
that slightly reduces both again:

tip/master+origin/master+patch

local	total
faults  time

21296	50.541
12771	50.54
13872	52.224
23352	50.85
16516	50.705

17561.4	50.972	(avg)
4613.32	0.71	(stdev)

So for hackbench a43455a1d57 is good and the proposed patch is making
things worse.

Let me see if I can still find my SPECjbb2005 copy to see what that
does.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-31 10:42       ` Peter Zijlstra
@ 2014-07-31 15:57         ` Peter Zijlstra
  -1 siblings, 0 replies; 66+ messages in thread
From: Peter Zijlstra @ 2014-07-31 15:57 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Aaron Lu, LKML, lkp, jhladky

[-- Attachment #1: Type: text/plain, Size: 1278 bytes --]

On Thu, Jul 31, 2014 at 12:42:41PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 29, 2014 at 02:39:40AM -0400, Rik van Riel wrote:
> > On Tue, 29 Jul 2014 13:24:05 +0800
> > Aaron Lu <aaron.lu@intel.com> wrote:
> > 
> > > FYI, we noticed the below changes on
> > > 
> > > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> > > commit a43455a1d572daf7b730fe12eb747d1e17411365 ("sched/numa: Ensure task_numa_migrate() checks the preferred node")
> > > 
> > > ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
> > > ---------------  -------------------------  
> > >      94500 ~ 3%    +115.6%     203711 ~ 6%  ivb42/hackbench/50%-threads-pipe
> > >      67745 ~ 4%     +64.1%     111174 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
> > >     162245 ~ 3%     +94.1%     314885 ~ 6%  TOTAL proc-vmstat.numa_hint_faults_local
> > 
> > Hi Aaron,
> > 
> > Jirka Hladky has reported a regression with that changeset as
> > well, and I have already spent some time debugging the issue.
> 
> Let me see if I can still find my SPECjbb2005 copy to see what that
> does.

Jirka, on what kind of setup were you seeing the SPECjbb regressions?

I'm not seeing any on 2 sockets with a single SPECjbb instance; I'll go
check one instance per socket now.



[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-31 15:57         ` Peter Zijlstra
@ 2014-07-31 16:16           ` Jirka Hladky
  -1 siblings, 0 replies; 66+ messages in thread
From: Jirka Hladky @ 2014-07-31 16:16 UTC (permalink / raw)
  To: Peter Zijlstra, Rik van Riel; +Cc: Aaron Lu, LKML, lkp

[-- Attachment #1: Type: text/plain, Size: 1549 bytes --]

On 07/31/2014 05:57 PM, Peter Zijlstra wrote:
> On Thu, Jul 31, 2014 at 12:42:41PM +0200, Peter Zijlstra wrote:
>> On Tue, Jul 29, 2014 at 02:39:40AM -0400, Rik van Riel wrote:
>>> On Tue, 29 Jul 2014 13:24:05 +0800
>>> Aaron Lu <aaron.lu@intel.com> wrote:
>>>
>>>> FYI, we noticed the below changes on
>>>>
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
>>>> commit a43455a1d572daf7b730fe12eb747d1e17411365 ("sched/numa: Ensure task_numa_migrate() checks the preferred node")
>>>>
>>>> ebe06187bf2aec1  a43455a1d572daf7b730fe12e
>>>> ---------------  -------------------------
>>>>       94500 ~ 3%    +115.6%     203711 ~ 6%  ivb42/hackbench/50%-threads-pipe
>>>>       67745 ~ 4%     +64.1%     111174 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
>>>>      162245 ~ 3%     +94.1%     314885 ~ 6%  TOTAL proc-vmstat.numa_hint_faults_local
>>> Hi Aaron,
>>>
>>> Jirka Hladky has reported a regression with that changeset as
>>> well, and I have already spent some time debugging the issue.
>> Let me see if I can still find my SPECjbb2005 copy to see what that
>> does.
> Jirka, what kind of setup were you seeing SPECjbb regressions?
>
> I'm not seeing any on 2 sockets with a single SPECjbb instance, I'll go
> check one instance per socket now.
>
>
Peter, I'm seeing regressions for a

SINGLE SPECjbb instance when the number of warehouses is the same as the
total number of cores in the box.

Example: 4 NUMA node box, each CPU has 6 cores => biggest regression is 
for 24 warehouses.

See the attached snapshot.

Jirka

[-- Attachment #2: SPECjbb2005_-127.el7numafixes9.png --]
[-- Type: image/png, Size: 91443 bytes --]

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-31 16:16           ` Jirka Hladky
@ 2014-07-31 16:27             ` Peter Zijlstra
  -1 siblings, 0 replies; 66+ messages in thread
From: Peter Zijlstra @ 2014-07-31 16:27 UTC (permalink / raw)
  To: Jirka Hladky; +Cc: Rik van Riel, Aaron Lu, LKML, lkp

[-- Attachment #1: Type: text/plain, Size: 3731 bytes --]

On Thu, Jul 31, 2014 at 06:16:26PM +0200, Jirka Hladky wrote:
> On 07/31/2014 05:57 PM, Peter Zijlstra wrote:
> >On Thu, Jul 31, 2014 at 12:42:41PM +0200, Peter Zijlstra wrote:
> >>On Tue, Jul 29, 2014 at 02:39:40AM -0400, Rik van Riel wrote:
> >>>On Tue, 29 Jul 2014 13:24:05 +0800
> >>>Aaron Lu <aaron.lu@intel.com> wrote:
> >>>
> >>>>FYI, we noticed the below changes on
> >>>>
> >>>>git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> >>>>commit a43455a1d572daf7b730fe12eb747d1e17411365 ("sched/numa: Ensure task_numa_migrate() checks the preferred node")
> >>>>
> >>>>ebe06187bf2aec1  a43455a1d572daf7b730fe12e
> >>>>---------------  -------------------------
> >>>>      94500 ~ 3%    +115.6%     203711 ~ 6%  ivb42/hackbench/50%-threads-pipe
> >>>>      67745 ~ 4%     +64.1%     111174 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
> >>>>     162245 ~ 3%     +94.1%     314885 ~ 6%  TOTAL proc-vmstat.numa_hint_faults_local
> >>>Hi Aaron,
> >>>
> >>>Jirka Hladky has reported a regression with that changeset as
> >>>well, and I have already spent some time debugging the issue.
> >>Let me see if I can still find my SPECjbb2005 copy to see what that
> >>does.
> >Jirka, what kind of setup were you seeing SPECjbb regressions?
> >
> >I'm not seeing any on 2 sockets with a single SPECjbb instance, I'll go
> >check one instance per socket now.
> >
> >
> Peter, I'm seeing regressions for
> 
> SINGLE SPECjbb instance for number of warehouses being the same as total
> number of cores in the box.
> 
> Example: 4 NUMA node box, each CPU has 6 cores => biggest regression is for
> 24 warehouses.

IVB-EP: 2 node, 10 cores, 2 threads per core:

tip/master+origin/master:

     Warehouses               Thrput
              4               196781
              8               358064
             12               511318
             16               589251
             20               656123
             24               710789
             28               765426
             32               787059
             36               777899
           * 40               748568
                                    
Throughput      18258   

     Warehouses               Thrput
              4               201598
              8               363470
             12               512968
             16               584289
             20               605299
             24               720142
             28               776066
             32               791263
             36               776965
           * 40               760572
                                    
Throughput      18551   


tip/master+origin/master-a43455a1d57

                   SPEC scores                                                                                        
     Warehouses               Thrput
              4               198667
              8               362481
             12               503344
             16               582602
             20               647688
             24               731639
             28               786135
             32               794124
             36               774567
           * 40               757559
                                    
Throughput      18477  


Given that there's fairly large variance between the two runs with the
commit in, I'm not sure I can say there's a problem here.

The one run without the patch is more or less between the two runs with
the patch.

And doing this many runs takes ages, so I'm not tempted to either make
the runs longer or do more of them.

Lemme try on a 4 node box though, who knows.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-31 16:27             ` Peter Zijlstra
@ 2014-07-31 16:39               ` Jirka Hladky
  -1 siblings, 0 replies; 66+ messages in thread
From: Jirka Hladky @ 2014-07-31 16:39 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Rik van Riel, Aaron Lu, LKML, lkp

On 07/31/2014 06:27 PM, Peter Zijlstra wrote:
> On Thu, Jul 31, 2014 at 06:16:26PM +0200, Jirka Hladky wrote:
>> On 07/31/2014 05:57 PM, Peter Zijlstra wrote:
>>> On Thu, Jul 31, 2014 at 12:42:41PM +0200, Peter Zijlstra wrote:
>>>> On Tue, Jul 29, 2014 at 02:39:40AM -0400, Rik van Riel wrote:
>>>>> On Tue, 29 Jul 2014 13:24:05 +0800
>>>>> Aaron Lu <aaron.lu@intel.com> wrote:
>>>>>
>>>>>> FYI, we noticed the below changes on
>>>>>>
>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
>>>>>> commit a43455a1d572daf7b730fe12eb747d1e17411365 ("sched/numa: Ensure task_numa_migrate() checks the preferred node")
>>>>>>
>>>>>> ebe06187bf2aec1  a43455a1d572daf7b730fe12e
>>>>>> ---------------  -------------------------
>>>>>>       94500 ~ 3%    +115.6%     203711 ~ 6%  ivb42/hackbench/50%-threads-pipe
>>>>>>       67745 ~ 4%     +64.1%     111174 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
>>>>>>      162245 ~ 3%     +94.1%     314885 ~ 6%  TOTAL proc-vmstat.numa_hint_faults_local
>>>>> Hi Aaron,
>>>>>
>>>>> Jirka Hladky has reported a regression with that changeset as
>>>>> well, and I have already spent some time debugging the issue.
>>>> Let me see if I can still find my SPECjbb2005 copy to see what that
>>>> does.
>>> Jirka, what kind of setup were you seeing SPECjbb regressions?
>>>
>>> I'm not seeing any on 2 sockets with a single SPECjbb instance, I'll go
>>> check one instance per socket now.
>>>
>>>
>> Peter, I'm seeing regressions for
>>
>> SINGLE SPECjbb instance for number of warehouses being the same as total
>> number of cores in the box.
>>
>> Example: 4 NUMA node box, each CPU has 6 cores => biggest regression is for
>> 24 warehouses.
> IVB-EP: 2 node, 10 cores, 2 threads per core:
>
> tip/master+origin/master:
>
>       Warehouses               Thrput
>                4               196781
>                8               358064
>               12               511318
>               16               589251
>               20               656123
>               24               710789
>               28               765426
>               32               787059
>               36               777899
>             * 40               748568
>                                      
> Throughput      18258
>
>       Warehouses               Thrput
>                4               201598
>                8               363470
>               12               512968
>               16               584289
>               20               605299
>               24               720142
>               28               776066
>               32               791263
>               36               776965
>             * 40               760572
>                                      
> Throughput      18551
>
>
> tip/master+origin/master-a43455a1d57
>
>                     SPEC scores
>       Warehouses               Thrput
>                4               198667
>                8               362481
>               12               503344
>               16               582602
>               20               647688
>               24               731639
>               28               786135
>               32               794124
>               36               774567
>             * 40               757559
>                                      
> Throughput      18477
>
>
> Given that there's fairly large variance between the two runs with the
> commit in, I'm not sure I can say there's a problem here.
>
> The one run without the patch is more or less between the two runs with
> the patch.
>
> And doing this many runs takes ages, so I'm not tempted to either make
> the runs longer or do more of them.
>
> Lemme try on a 4 node box though, who knows.

IVB-EP: 2 node, 10 cores, 2 threads per core
=> on such a system, I run only 20 warehouses at maximum (number of
nodes * number of PHYSICAL cores).

The kernels you have tested show the following results at 20 warehouses:
656123/605299/647688


I'm doing 3 iterations (3 runs) to get some statistics. To speed up the
test significantly, please do the run with 20 warehouses only
(or, in general, with #warehouses == number of nodes * number of PHYSICAL
cores; see the sketch below).
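
One way to compute that bound on a given box - a sketch that counts the
unique physical core/socket pairs reported by lscpu (with one socket
per node this equals nodes * cores per socket):

  lscpu -p=Core,Socket | grep -v '^#' | sort -u | wc -l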

Jirka

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-31 16:39               ` Jirka Hladky
@ 2014-07-31 17:37                 ` Peter Zijlstra
  -1 siblings, 0 replies; 66+ messages in thread
From: Peter Zijlstra @ 2014-07-31 17:37 UTC (permalink / raw)
  To: Jirka Hladky; +Cc: Rik van Riel, Aaron Lu, LKML, lkp

[-- Attachment #1: Type: text/plain, Size: 697 bytes --]

On Thu, Jul 31, 2014 at 06:39:05PM +0200, Jirka Hladky wrote:
> I'm doing 3 iterations (3 runs) to get some statistics. To speed up the test
> significantly please do the run with 20 warehouses only
> (or in general with #warehouses ==  number of nodes * number of PHYSICAL
> cores)

Yeah, went and did that on my 4 node machine; it's got a ton more cores,
but I matched the warehouses to it:

-a43455a1d57	tip/master

979996.47	1144715.44
876146		1098499.07
1058974.18	1019499.38
1055951.59	1139405.22
970504.01	1099659.09

988314.45	1100355.64	(avg)
75059.546179565	50085.7473975167	(stdev)
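
(The avg/stdev rows can be reproduced with something like the one-liner
below - a sketch that assumes one throughput value per line in a file
runs.txt and computes the sample standard deviation:)

  awk '{ s += $1; q += $1 * $1; n++ }
       END { m = s / n; print m, sqrt((q - n * m * m) / (n - 1)) }' runs.txt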

So for 5 runs, tip/master (which includes the offending patch) wins hands down.

Each run is 2 minutes.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-08-01  7:29           ` Peter Zijlstra
@ 2014-07-31 23:58             ` Yuyang Du
  -1 siblings, 0 replies; 66+ messages in thread
From: Yuyang Du @ 2014-07-31 23:58 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Aaron Lu, Rik van Riel, LKML, lkp, jhladky, Fengguang Wu

On Fri, Aug 01, 2014 at 09:29:11AM +0200, Peter Zijlstra wrote:
> On Fri, Aug 01, 2014 at 10:03:30AM +0800, Aaron Lu wrote:
> > > > > ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
> > > > > ---------------  -------------------------  
> > > > >      94500 ~ 3%    +115.6%     203711 ~ 6%  ivb42/hackbench/50%-threads-pipe
> > > > >      67745 ~ 4%     +64.1%     111174 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
> > > > >     162245 ~ 3%     +94.1%     314885 ~ 6%  TOTAL proc-vmstat.numa_hint_faults_local
> 
> > It means, for commit ebe06187bf2aec1, the number for
> > numa_hint_faults_local is 94500 for the ivb42 machine and 67745 for the
> > lkp-snb01 machine. The 3%, 4% following that number is the deviation of
> > the different runs from their average (we usually run it multiple times
> > to smooth out occasional outliers). We should probably remove that
> > percentage, as it causes confusion without a detailed explanation and
> > may not mean much to the commit author and others (if the deviation is
> > big enough, we should simply drop that result).
> 
> Nah, variance is good, but the typical symbol would be +- or the fancy
> ±.
> 
> ~ when used as a unary op means 'approx' or 'about' or 'same order'
> ~ when used as a binary op means equivalence, a weaker equal, often in
> the vein of the unary op meaning.
> 
> Also see: http://en.wikipedia.org/wiki/Tilde#Mathematics
> 
> So while I think having a measure of variance is good, I think you
> picked entirely the wrong symbol.

Or, maybe you can use σ (lower case sigma) to indicate stddev, :)

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-31 10:42       ` Peter Zijlstra
@ 2014-08-01  0:18         ` Davidlohr Bueso
  -1 siblings, 0 replies; 66+ messages in thread
From: Davidlohr Bueso @ 2014-08-01  0:18 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Rik van Riel, Aaron Lu, LKML, lkp, jhladky

On Thu, 2014-07-31 at 12:42 +0200, Peter Zijlstra wrote:
> On Tue, Jul 29, 2014 at 02:39:40AM -0400, Rik van Riel wrote:
> > On Tue, 29 Jul 2014 13:24:05 +0800
> > Aaron Lu <aaron.lu@intel.com> wrote:
> > 
> > > FYI, we noticed the below changes on
> > > 
> > > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> > > commit a43455a1d572daf7b730fe12eb747d1e17411365 ("sched/numa: Ensure task_numa_migrate() checks the preferred node")
> > > 
> > > ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
> > > ---------------  -------------------------  
> > >      94500 ~ 3%    +115.6%     203711 ~ 6%  ivb42/hackbench/50%-threads-pipe
> > >      67745 ~ 4%     +64.1%     111174 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
> > >     162245 ~ 3%     +94.1%     314885 ~ 6%  TOTAL proc-vmstat.numa_hint_faults_local
> > 
> > Hi Aaron,
> > 
> > Jirka Hladky has reported a regression with that changeset as
> > well, and I have already spent some time debugging the issue.
> 
> So assuming those numbers above are the difference in
> numa_hint_faults_local, the report is actually a significant
> _improvement_, not a regression.
> 
> On my IVB-EP I get similar numbers; using:
> 
>   PRE=`grep numa_hint_faults_local /proc/vmstat | cut -d' ' -f2`
>   perf bench sched messaging -g 24 -t -p -l 60000
>   POST=`grep numa_hint_faults_local /proc/vmstat | cut -d' ' -f2`
>   echo $((POST-PRE))
> 
> 
> tip/master+origin/master		tip/master+origin/master-a43455a1d57
> 
> local	total                   local	total
> faults  time                    faults  time
> 
> 19971	51.384                  10104	50.838
> 17193	50.564                  9116	50.208
> 13435	49.057                  8332	51.344
> 23794	50.795                  9954	51.364
> 20255	49.463                  9598	51.258
> 
> 18929.6	50.2526                 9420.8	51.0024
> 3863.61	0.96                    717.78	0.49
> 
> So that patch improves both local faults and runtime. It's good (even
> though for the runtime we're still inside stdev overlap, so ideally I'd
> do more runs).
> 
> 
> Now I also did a run with the proposed patch, NUMA_SCALE/8 variant, and
> that slightly reduces both again:
> 
> tip/master+origin/master+patch
> 
> local	total
> faults  time
> 
> 21296	50.541
> 12771	50.54
> 13872	52.224
> 23352	50.85
> 16516	50.705
> 
> 17561.4	50.972
> 4613.32	0.71
> 
> So for hackbench a43455a1d57 is good and the proposed patch is making
> things worse.

It also seems to be the case on an 8-socket, 80-core DL980:

tip/master baseline:
67276 169.590 [sec]
82400 188.406 [sec]
87827 201.122 [sec]
96659 228.243 [sec]
83180 192.422 [sec]

tip/master + a43455a1d57 reverted
36686 170.373 [sec]
52670 187.904 [sec]
55723 203.597 [sec]
41780 174.354 [sec]
36070 173.179 [sec]

Runtimes are pretty much all over the place; I cannot really say whether it's
gotten slower or faster. However, on avg, we nearly double the number of
local hint faults with the commit in question.

After adding the proposed fix (NUMA_SCALE/8 variant), it goes down
again, closer to without a43455a1d57:

tip/master + patch
50591 175.272 [sec]
57858 191.969 [sec]
77564 215.429 [sec]
50613 179.384 [sec]
61673 201.694 [sec]

> Let me see if I can still find my SPECjbb2005 copy to see what that
> does.

I'll try to dig it up as well.
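
For completeness, a sketch of how both columns above (local faults delta and
wall time) could be captured in one go; same workload as Peter's snippet, with
the timing added via GNU date, so treat it as illustrative:

  PRE=$(grep numa_hint_faults_local /proc/vmstat | cut -d' ' -f2)
  T0=$(date +%s.%N)
  perf bench sched messaging -g 24 -t -p -l 60000
  T1=$(date +%s.%N)
  POST=$(grep numa_hint_faults_local /proc/vmstat | cut -d' ' -f2)
  echo "$((POST - PRE)) $(echo "$T1 - $T0" | bc)"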


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-31 10:42       ` Peter Zijlstra
@ 2014-08-01  2:03         ` Aaron Lu
  -1 siblings, 0 replies; 66+ messages in thread
From: Aaron Lu @ 2014-08-01  2:03 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Rik van Riel, LKML, lkp, jhladky, Fengguang Wu

On Thu, Jul 31, 2014 at 12:42:41PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 29, 2014 at 02:39:40AM -0400, Rik van Riel wrote:
> > On Tue, 29 Jul 2014 13:24:05 +0800
> > Aaron Lu <aaron.lu@intel.com> wrote:
> > 
> > > FYI, we noticed the below changes on
> > > 
> > > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> > > commit a43455a1d572daf7b730fe12eb747d1e17411365 ("sched/numa: Ensure task_numa_migrate() checks the preferred node")
> > > 
> > > ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
> > > ---------------  -------------------------  
> > >      94500 ~ 3%    +115.6%     203711 ~ 6%  ivb42/hackbench/50%-threads-pipe
> > >      67745 ~ 4%     +64.1%     111174 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
> > >     162245 ~ 3%     +94.1%     314885 ~ 6%  TOTAL proc-vmstat.numa_hint_faults_local
> > 
> > Hi Aaron,
> > 
> > Jirka Hladky has reported a regression with that changeset as
> > well, and I have already spent some time debugging the issue.
> 
> So assuming those numbers above are the difference in

Yes, they are.

It means that, for commit ebe06187bf2aec1, the number for
numa_hint_faults_local is 94500 on the ivb42 machine and 67745 on the
lkp-snb01 machine. The 3% and 4% following those numbers are the deviation
of the individual runs from their average (we usually run the test multiple
times to filter out outliers). We should probably remove that percentage,
as it causes confusion without a detailed explanation and may not mean much
to the commit author and others (if the deviation is big enough, we should
simply drop that result).

The percentage in the middle is the change between the two commits.

Another thing is the meaning of the numbers: it isn't evident that they are
for proc-vmstat.numa_hint_faults_local. Maybe something like this would be
better?

ebe06187bf2aec1  a43455a1d572daf7b730fe12e  proc-vmstat.numa_hint_faults_local
---------------  -------------------------  -----------------------------
     94500         +115.6%     203711       ivb42/hackbench/50%-threads-pipe
     67745          +64.1%     111174       lkp-snb01/hackbench/50%-threads-socket
    162245          +94.1%     314885       TOTAL 
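
As an illustration, a minimal awk sketch of how the per-commit average and
the ~N% deviation column could be computed; it assumes a hypothetical
runs.txt holding one result per line (not the actual LKP tooling, just the
idea):

  awk '{ n++; s += $1; ss += $1 * $1 }
       END { avg = s / n; sd = sqrt(ss / n - avg * avg);
             printf "%d ~ %d%%\n", avg, 100 * sd / avg }' runs.txt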

Regards,
Aaron

> numa_hint_faults_local, the report is actually a significant
> _improvement_, not a regression.
> 
> On my IVB-EP I get similar numbers; using:
> 
>   PRE=`grep numa_hint_faults_local /proc/vmstat | cut -d' ' -f2`
>   perf bench sched messaging -g 24 -t -p -l 60000
>   POST=`grep numa_hint_faults_local /proc/vmstat | cut -d' ' -f2`
>   echo $((POST-PRE))
> 
> 
> tip/master+origin/master		tip/master+origin/master-a43455a1d57
> 
> local	total                   local	total
> faults  time                    faults  time
> 
> 19971	51.384                  10104	50.838
> 17193	50.564                  9116	50.208
> 13435	49.057                  8332	51.344
> 23794	50.795                  9954	51.364
> 20255	49.463                  9598	51.258
> 
> 18929.6	50.2526                 9420.8	51.0024
> 3863.61	0.96                    717.78	0.49
> 
> So that patch improves both local faults and runtime. It's good (even
> though for the runtime we're still inside stdev overlap, so ideally I'd
> do more runs).
> 
> 
> Now I also did a run with the proposed patch, NUMA_SCALE/8 variant, and
> that slightly reduces both again:
> 
> tip/master+origin/master+patch
> 
> local	total
> faults  time
> 
> 21296	50.541
> 12771	50.54
> 13872	52.224
> 23352	50.85
> 16516	50.705
> 
> 17561.4	50.972
> 4613.32	0.71
> 
> So for hackbench a43455a1d57 is good and the proposed patch is making
> things worse.
> 
> Let me see if I can still find my SPECjbb2005 copy to see what that
> does.



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-08-01  2:03         ` Aaron Lu
@ 2014-08-01  4:03           ` Davidlohr Bueso
  -1 siblings, 0 replies; 66+ messages in thread
From: Davidlohr Bueso @ 2014-08-01  4:03 UTC (permalink / raw)
  To: Aaron Lu; +Cc: Peter Zijlstra, Rik van Riel, LKML, lkp, jhladky, Fengguang Wu

On Fri, 2014-08-01 at 10:03 +0800, Aaron Lu wrote:
> On Thu, Jul 31, 2014 at 12:42:41PM +0200, Peter Zijlstra wrote:
> > On Tue, Jul 29, 2014 at 02:39:40AM -0400, Rik van Riel wrote:
> > > On Tue, 29 Jul 2014 13:24:05 +0800
> > > Aaron Lu <aaron.lu@intel.com> wrote:
> > > 
> > > > FYI, we noticed the below changes on
> > > > 
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> > > > commit a43455a1d572daf7b730fe12eb747d1e17411365 ("sched/numa: Ensure task_numa_migrate() checks the preferred node")
> > > > 
> > > > ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
> > > > ---------------  -------------------------  
> > > >      94500 ~ 3%    +115.6%     203711 ~ 6%  ivb42/hackbench/50%-threads-pipe
> > > >      67745 ~ 4%     +64.1%     111174 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
> > > >     162245 ~ 3%     +94.1%     314885 ~ 6%  TOTAL proc-vmstat.numa_hint_faults_local
> > > 
> > > Hi Aaron,
> > > 
> > > Jirka Hladky has reported a regression with that changeset as
> > > well, and I have already spent some time debugging the issue.
> > 
> > So assuming those numbers above are the difference in
> 
> Yes, they are.
> 
> It means that, for commit ebe06187bf2aec1, the number for
> numa_hint_faults_local is 94500 on the ivb42 machine and 67745 on the
> lkp-snb01 machine. The 3% and 4% following those numbers are the deviation
> of the individual runs from their average (we usually run the test multiple
> times to filter out outliers). We should probably remove that percentage,
> as it causes confusion without a detailed explanation and may not mean much
> to the commit author and others (if the deviation is big enough, we should
> simply drop that result).
> 
> The percentage in the middle is the change between the two commits.
> 
> Another thing is the meaning of the numbers: it isn't evident that they are
> for proc-vmstat.numa_hint_faults_local. Maybe something like this would be
> better?

Instead of removing info, why not document what each piece of data
represents, or add headers to the table, etc.?

Thanks,
Davidlohr


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-08-01  2:03         ` Aaron Lu
@ 2014-08-01  7:29           ` Peter Zijlstra
  -1 siblings, 0 replies; 66+ messages in thread
From: Peter Zijlstra @ 2014-08-01  7:29 UTC (permalink / raw)
  To: Aaron Lu; +Cc: Rik van Riel, LKML, lkp, jhladky, Fengguang Wu

[-- Attachment #1: Type: text/plain, Size: 1399 bytes --]

On Fri, Aug 01, 2014 at 10:03:30AM +0800, Aaron Lu wrote:
> > > > ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
> > > > ---------------  -------------------------  
> > > >      94500 ~ 3%    +115.6%     203711 ~ 6%  ivb42/hackbench/50%-threads-pipe
> > > >      67745 ~ 4%     +64.1%     111174 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
> > > >     162245 ~ 3%     +94.1%     314885 ~ 6%  TOTAL proc-vmstat.numa_hint_faults_local

> It means that, for commit ebe06187bf2aec1, the number for
> numa_hint_faults_local is 94500 on the ivb42 machine and 67745 on the
> lkp-snb01 machine. The 3% and 4% following those numbers are the deviation
> of the individual runs from their average (we usually run the test multiple
> times to filter out outliers). We should probably remove that percentage,
> as it causes confusion without a detailed explanation and may not mean much
> to the commit author and others (if the deviation is big enough, we should
> simply drop that result).

Nah, variance is good, but the typical symbol would be +- or the fancy
±.

~ when used as a unary op means 'approx' or 'about' or 'same order'
~ when used as a binary op means equivalence, a weaker equal, often in
the vein of the unary op meaning.

Also see: http://en.wikipedia.org/wiki/Tilde#Mathematics

So while I think having a measure of variance is good, I think you
picked entirely the wrong symbol.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-08-01  4:03           ` Davidlohr Bueso
@ 2014-08-01  7:29             ` Peter Zijlstra
  -1 siblings, 0 replies; 66+ messages in thread
From: Peter Zijlstra @ 2014-08-01  7:29 UTC (permalink / raw)
  To: Davidlohr Bueso; +Cc: Aaron Lu, Rik van Riel, LKML, lkp, jhladky, Fengguang Wu

[-- Attachment #1: Type: text/plain, Size: 285 bytes --]

On Thu, Jul 31, 2014 at 09:03:22PM -0700, Davidlohr Bueso wrote:
> 
> Instead of removing info, why not document what each piece of data
> represents, or add headers to the table, etc.?

Yes, headers are good; knowing exactly what a number is often removes a
lot of confusion ;-)

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-08-01  7:29           ` Peter Zijlstra
@ 2014-08-01  8:14             ` Fengguang Wu
  -1 siblings, 0 replies; 66+ messages in thread
From: Fengguang Wu @ 2014-08-01  8:14 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Aaron Lu, Rik van Riel, LKML, lkp, jhladky

On Fri, Aug 01, 2014 at 09:29:11AM +0200, Peter Zijlstra wrote:
> On Fri, Aug 01, 2014 at 10:03:30AM +0800, Aaron Lu wrote:
> > > > > ebe06187bf2aec1  a43455a1d572daf7b730fe12e  
> > > > > ---------------  -------------------------  
> > > > >      94500 ~ 3%    +115.6%     203711 ~ 6%  ivb42/hackbench/50%-threads-pipe
> > > > >      67745 ~ 4%     +64.1%     111174 ~ 5%  lkp-snb01/hackbench/50%-threads-socket
> > > > >     162245 ~ 3%     +94.1%     314885 ~ 6%  TOTAL proc-vmstat.numa_hint_faults_local
> 
> > It means that, for commit ebe06187bf2aec1, the number for
> > numa_hint_faults_local is 94500 on the ivb42 machine and 67745 on the
> > lkp-snb01 machine. The 3% and 4% following those numbers are the deviation
> > of the individual runs from their average (we usually run the test multiple
> > times to filter out outliers). We should probably remove that percentage,
> > as it causes confusion without a detailed explanation and may not mean much
> > to the commit author and others (if the deviation is big enough, we should
> > simply drop that result).
> 
> Nah, variance is good, but the typical symbol would be +- or the fancy
> ±.
> 
> ~ when used as a unary op means 'approx' or 'about' or 'same order'
> ~ when used as a binary op means equivalence, a weaker equal, often in
> the vein of the unary op meaning.
> 
> Also see: http://en.wikipedia.org/wiki/Tilde#Mathematics
> 
> So while I think having a measure of variance is good, I think you
> picked entirely the wrong symbol.

Good point! We'll first try ± for the stddev percentage and fall back to
+- if it turns out not to work well in some cases.
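
Something along these lines, perhaps; a sketch only, assuming the report
generator can key off the environment's charmap to pick the symbol:

  pct=3
  case $(locale charmap 2>/dev/null) in
      UTF-8) printf '94500 ±%d%%\n' "$pct" ;;
      *)     printf '94500 +-%d%%\n' "$pct" ;;
  esac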

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-31 17:37                 ` Peter Zijlstra
@ 2014-08-01 15:02                   ` Peter Zijlstra
  -1 siblings, 0 replies; 66+ messages in thread
From: Peter Zijlstra @ 2014-08-01 15:02 UTC (permalink / raw)
  To: Jirka Hladky; +Cc: Rik van Riel, Aaron Lu, LKML, lkp

[-- Attachment #1: Type: text/plain, Size: 1461 bytes --]

On Thu, Jul 31, 2014 at 07:37:05PM +0200, Peter Zijlstra wrote:
> On Thu, Jul 31, 2014 at 06:39:05PM +0200, Jirka Hladky wrote:
> > I'm doing 3 iterations (3 runs) to get some statistics. To speed up the test
> > significantly please do the run with 20 warehouses only
> > (or in general with #warehouses ==  number of nodes * number of PHYSICAL
> > cores)
> 
> Yeah, went and did that for my 4-node machine; it's got a ton more cores, but I
> matched the warehouses to it:
> 
> -a43455a1d57	tip/master
> 
> 979996.47	1144715.44
> 876146		1098499.07
> 1058974.18	1019499.38
> 1055951.59	1139405.22
> 970504.01	1099659.09
> 
> 988314.45	1100355.64	(avg)
> 75059.546179565	50085.7473975167 (stdev)
> 
> So for 5 runs, tip/master (which includes the offending patch) wins hands down.
> 
> Each run is 2 minutes.

Because Rik asked for a43455a1d57^1 numbers:

546423.08
546558.63
545990.01
546015.98

some a43455a1d57 numbers:

538652.93
544333.57
542684.77

same setup and everything. So clearly the patches after that made 'some'
difference indeed, seeing how tip/master is almost twice that.

So the reason I didn't do a43455a1d57^1 vs a43455a1d57 is that we had already
fingered a commit; after that, what you test is the revert of that commit,
because a revert is what you typically end up doing if a commit is bad.

But on the state of tip/master, taking that commit out is a net negative for
everything I've tested.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-31 16:16           ` Jirka Hladky
@ 2014-08-01 20:46             ` Davidlohr Bueso
  -1 siblings, 0 replies; 66+ messages in thread
From: Davidlohr Bueso @ 2014-08-01 20:46 UTC (permalink / raw)
  To: Jirka Hladky; +Cc: Peter Zijlstra, Rik van Riel, Aaron Lu, LKML, lkp

On Thu, 2014-07-31 at 18:16 +0200, Jirka Hladky wrote:
> Peter, I'm seeing regressions for
> 
> SINGLE SPECjbb instance for number of warehouses being the same as total 
> number of cores in the box.
> 
> Example: 4 NUMA node box, each CPU has 6 cores => biggest regression is 
> for 24 warehouses.

By looking at your graph, that's around a 10% difference.

So I'm not seeing anywhere near as bad a regression on a 80-core box.
Testing single with 80 warehouses, I get:

tip/master baseline:
677476.36 bops
705826.70 bops
704870.87 bops
681741.20 bops 
707014.59 bops

Avg: 695385.94 bops

tip/master + patch (NUMA_SCALE/8 variant):
698242.66 bops
693873.18 bops 
707852.28 bops
691785.96 bops 
747206.03 bops

Avg: 707792.022 bops

So both of these are pretty similar; however, when reverting, on avg we
increase the bops by a mere ~4%:

tip/master + reverted:
778416.02 bops 
702602.62 bops 
712557.32 bops 
713982.90 bops
783300.36 bops

Avg: 738171.84 bops

Are there perhaps any special specjbb options you are using?


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-08-01 20:46             ` Davidlohr Bueso
@ 2014-08-01 20:48               ` Davidlohr Bueso
  -1 siblings, 0 replies; 66+ messages in thread
From: Davidlohr Bueso @ 2014-08-01 20:48 UTC (permalink / raw)
  To: Jirka Hladky; +Cc: Peter Zijlstra, Rik van Riel, Aaron Lu, LKML, lkp

On Fri, 2014-08-01 at 13:46 -0700, Davidlohr Bueso wrote:
> So both of these are pretty similar; however, when reverting, on avg we
> increase the bops by a mere ~4%:
> 
> tip/master + reverted:

Just to be clear, this is reverting a43455a1d57.


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-08-01 20:46             ` Davidlohr Bueso
@ 2014-08-01 21:30               ` Jirka Hladky
  -1 siblings, 0 replies; 66+ messages in thread
From: Jirka Hladky @ 2014-08-01 21:30 UTC (permalink / raw)
  To: Davidlohr Bueso; +Cc: Peter Zijlstra, Rik van Riel, Aaron Lu, LKML, lkp

On 08/01/2014 10:46 PM, Davidlohr Bueso wrote:
> On Thu, 2014-07-31 at 18:16 +0200, Jirka Hladky wrote:
>> Peter, I'm seeing regressions for
>>
>> SINGLE SPECjbb instance for number of warehouses being the same as total
>> number of cores in the box.
>>
>> Example: 4 NUMA node box, each CPU has 6 cores => biggest regression is
>> for 24 warehouses.
> By looking at your graph, that's around a 10% difference.
>
> So I'm not seeing anywhere near as bad a regression on a 80-core box.
> Testing single with 80 warehouses, I get:
>
> tip/master baseline:
> 677476.36 bops
> 705826.70 bops
> 704870.87 bops
> 681741.20 bops
> 707014.59 bops
>
> Avg: 695385.94 bops
>
> tip/master + patch (NUMA_SCALE/8 variant):
> 698242.66 bops
> 693873.18 bops
> 707852.28 bops
> 691785.96 bops
> 747206.03 bops
>
> Avg: 707792.022 bops
>
> So both of these are pretty similar; however, when reverting, on avg we
> increase the bops by a mere ~4%:
>
> tip/master + reverted:
> 778416.02 bops
> 702602.62 bops
> 712557.32 bops
> 713982.90 bops
> 783300.36 bops
>
> Avg: 738171.84 bops
>
> Are there perhaps any special specjbb options you are using?
>

I see the regression only on this box. It has 4 "Ivy Bridge-EX" Xeon 
E7-4890 v2 CPUs.

http://ark.intel.com/products/75251
http://en.wikipedia.org/wiki/List_of_Intel_Xeon_microprocessors#.22Ivy_Bridge-EX.22_.2822_nm.29_Expandable_2

Please rerun the test on a box with Ivy Bridge CPUs. Older CPU
generations do not seem to be affected.

Thanks
Jirka



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-08-01 21:30               ` Jirka Hladky
@ 2014-08-02  4:17                 ` Rik van Riel
  -1 siblings, 0 replies; 66+ messages in thread
From: Rik van Riel @ 2014-08-02  4:17 UTC (permalink / raw)
  To: Jirka Hladky, Davidlohr Bueso
  Cc: Peter Zijlstra, Aaron Lu, LKML, lkp, Hai Huang

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/01/2014 05:30 PM, Jirka Hladky wrote:

> I see the regression only on this box. It has 4 "Ivy Bridge-EX"
> Xeon E7-4890 v2 CPUs.
> 
> http://ark.intel.com/products/75251 
> http://en.wikipedia.org/wiki/List_of_Intel_Xeon_microprocessors#.22Ivy_Bridge-EX.22_.2822_nm.29_Expandable_2
>
> 
> 
> Please rerun the test on box with Ivy Bridge CPUs. It seems that
> older CPU generations are not affected.

That would have been good info to know :)

I've been spending about a month trying to reproduce your issue on a
Westmere E7-4860.

Good thing I found all kinds of other scheduler issues along the way...

- -- 
All rights reversed
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBAgAGBQJT3GZBAAoJEM553pKExN6D4FcH/2c/kYOZkbJeLBEWJHB0yWNR
tqI2Lt/qxPxOKADlDylJwj2Dq8R19Cc4tnJZAdPh+wgCivFefseQY0MI1TI8CO/Z
vEH+dCG8hokygFxKqAX9udI0MD1OxfTKKIk4fdjInZ632JG+JHnqVH6qWxBsriXD
151jzCR/zQEjg6gyCc8YsL06Q9YHyVv7dakggtRkYnE1GIUAtTDhFttRpNYoiVQQ
y/d32adq//PywTmsyWwJMu1ZGe1eGC57JBYzjoUo2iOlFQ9QR+fe4W2/6ZCbekwK
O8ZYbrJzDGrNQP2yDYd+o040KeVfzYkOtwz7+/40TYIvqFiuvKxEAxbJ32+krxA=
=XxCE
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-08-01 21:30               ` Jirka Hladky
@ 2014-08-02  4:26                 ` Peter Zijlstra
  -1 siblings, 0 replies; 66+ messages in thread
From: Peter Zijlstra @ 2014-08-02  4:26 UTC (permalink / raw)
  To: Jirka Hladky; +Cc: Davidlohr Bueso, Rik van Riel, Aaron Lu, LKML, lkp

[-- Attachment #1: Type: text/plain, Size: 226 bytes --]

On Fri, Aug 01, 2014 at 11:30:34PM +0200, Jirka Hladky wrote:
> I see the regression only on this box. It has 4 "Ivy Bridge-EX" Xeon E7-4890
> v2 CPUs.

That's the exact CPU I've got in the 4-node machine I did the tests on.


[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-08-02  4:17                 ` Rik van Riel
@ 2014-08-02  5:28                   ` Jirka Hladky
  -1 siblings, 0 replies; 66+ messages in thread
From: Jirka Hladky @ 2014-08-02  5:28 UTC (permalink / raw)
  To: Rik van Riel, Davidlohr Bueso
  Cc: Peter Zijlstra, Aaron Lu, LKML, lkp, Hai Huang, Kamil Kolakowski

On 08/02/2014 06:17 AM, Rik van Riel wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 08/01/2014 05:30 PM, Jirka Hladky wrote:
>
>> I see the regression only on this box. It has 4 "Ivy Bridge-EX"
>> Xeon E7-4890 v2 CPUs.
>>
>> http://ark.intel.com/products/75251
>> http://en.wikipedia.org/wiki/List_of_Intel_Xeon_microprocessors#.22Ivy_Bridge-EX.22_.2822_nm.29_Expandable_2
>>
>>
>>
>> Please rerun the test on a box with Ivy Bridge CPUs. Older CPU
>> generations do not seem to be affected.
> That would have been good info to know :)
>
> I've been spending about a month trying to reproduce your issue on a
> Westmere E7-4860.
>
> Good thing I found all kinds of other scheduler issues along the way...

Hi Rik,

until recently I saw the regression on all systems.

With the latest kernel, only the Ivy Bridge system seems to be affected.

Jirka


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local
  2014-07-31  5:04               ` Aaron Lu
@ 2014-08-05 21:43                 ` Rik van Riel
  -1 siblings, 0 replies; 66+ messages in thread
From: Rik van Riel @ 2014-08-05 21:43 UTC (permalink / raw)
  To: Aaron Lu; +Cc: Peter Zijlstra, LKML, lkp, jhladky

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 07/31/2014 01:04 AM, Aaron Lu wrote:

>>>> +++ b/kernel/sched/fair.c
>>>> @@ -924,10 +924,12 @@ static inline unsigned long group_faults_cpu(struct numa_group *group, int nid)
>>>> 
>>>>  /*
>>>>   * These return the fraction of accesses done by a particular task, or
>>>> - * task group, on a particular numa node.  The group weight is given a
>>>> - * larger multiplier, in order to group tasks together that are almost
>>>> - * evenly spread out between numa nodes.
>>>> + * task group, on a particular numa node.  The NUMA move threshold
>>>> + * prevents task moves with marginal improvement, and is set to 5%.
>>>>  */
>>>> +#define NUMA_SCALE 1024
>>>> +#define NUMA_MOVE_THRESH (5 * NUMA_SCALE / 100)
>> 
>> It would be good to see if changing NUMA_MOVE_THRESH to 
>> (NUMA_SCALE / 8) does the trick.
> 
> With your 2nd patch and the above change, the result is:

Peter,

the threshold does not seem to make a difference for the
performance tests on my system; I guess you can drop this
patch :)
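
For context, on the NUMA_SCALE=1024 scale from the patch above, the two
threshold variants work out quite differently; a quick shell check to make
the numbers concrete:

  echo $((5 * 1024 / 100))   # 5% variant        -> 51, i.e. ~5% of 1024
  echo $((1024 / 8))         # NUMA_SCALE/8 one  -> 128, i.e. 12.5% of 1024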

- -- 
All rights reversed
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBAgAGBQJT4U/xAAoJEM553pKExN6DY4oH/ihJDmcCSZ0sKqGbyzJqLrFY
KWCEXhfiN6hQJBrmeOvrbzlHsMH0LzYfgTVnc1nteAcnUXiBeqkgxwf+S1dmvoFr
DZSxC+9tQ68ho0YcLd7rpEMfsnwOQAB9BgX8GxxwMb8q5zZ9Bz3r9NKVF0P2D3cj
eeJ8Z3EGaKOteVhwAPVPeuTf7xwhqoqp4ujLgTL7BcaifqvGhi3+uo9/KcavE15d
eale3MuhbCIsAQeyB4SwgGwilE/oZTPTos4BNdUrIyxO4nDajbeLb1qsLSHYcirH
CA7++bTE9V6TvO1tBLVpeYdSAGcDKKUBHM6N+0UDwkR/Tp4oRyQ115Peo2H34ak=
=kFxZ
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 66+ messages in thread

end of thread, other threads:[~2014-08-05 21:43 UTC | newest]

Thread overview: 66+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <53d70ee6.JsUEmW5dWsv8dev+%fengguang.wu@intel.com>
2014-07-29  5:24 ` [LKP] [sched/numa] a43455a1d57: +94.1% proc-vmstat.numa_hint_faults_local Aaron Lu
2014-07-29  6:39   ` [LKP] " Rik van Riel
2014-07-29  8:17     ` [LKP] " Peter Zijlstra
2014-07-29 20:04       ` [LKP] " Rik van Riel
2014-07-30  2:14         ` [LKP] " Aaron Lu
2014-07-30 14:25           ` [LKP] " Rik van Riel
2014-07-31  5:04             ` [LKP] " Aaron Lu
2014-07-31  6:22               ` [LKP] " Rik van Riel
2014-07-31  6:53                 ` [LKP] " Aaron Lu
2014-07-31  6:42               ` [LKP] " Rik van Riel
2014-08-05 21:43               ` [LKP] " Rik van Riel
2014-07-31  8:33           ` [LKP] " Peter Zijlstra
2014-07-31  8:56             ` [LKP] " Aaron Lu
2014-07-31 10:42     ` [LKP] " Peter Zijlstra
2014-07-31 15:57       ` [LKP] " Peter Zijlstra
2014-07-31 16:16         ` [LKP] " Jirka Hladky
2014-07-31 16:27           ` [LKP] " Peter Zijlstra
2014-07-31 16:39             ` [LKP] " Jirka Hladky
2014-07-31 17:37               ` [LKP] " Peter Zijlstra
2014-08-01 15:02                 ` [LKP] " Peter Zijlstra
2014-08-01 20:46           ` [LKP] " Davidlohr Bueso
2014-08-01 20:48             ` [LKP] " Davidlohr Bueso
2014-08-01 21:30             ` [LKP] " Jirka Hladky
2014-08-02  4:17               ` [LKP] " Rik van Riel
2014-08-02  5:28                 ` [LKP] " Jirka Hladky
2014-08-02  4:26               ` [LKP] " Peter Zijlstra
2014-08-01  0:18       ` [LKP] " Davidlohr Bueso
2014-08-01  2:03       ` [LKP] " Aaron Lu
2014-08-01  4:03         ` [LKP] " Davidlohr Bueso
2014-08-01  7:29           ` [LKP] " Peter Zijlstra
2014-08-01  7:29         ` [LKP] " Peter Zijlstra
2014-07-31 23:58           ` [LKP] " Yuyang Du
2014-08-01  8:14           ` [LKP] " Fengguang Wu
