* Regression in perf bench numa convergence stats
@ 2015-06-24 11:10 Srikar Dronamraju
  2015-06-24 12:49 ` Ingo Molnar
  2015-06-26  8:43 ` [tip:perf/urgent] perf bench numa: Fix to show proper " tip-bot for Srikar Dronamraju
  0 siblings, 2 replies; 4+ messages in thread
From: Srikar Dronamraju @ 2015-06-24 11:10 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Vinson Lee, Ingo Molnar
  Cc: LKML, Namhyung Kim, Masami Hiramatsu


perf bench numa mem with the -c / -m options on v4.1 and latest tip isn't
showing correct convergence statistics. I ran git bisect between v4.0 and
v4.1, and I have included the patch that fixed the problem for me.

After the bisect, git bisect visualize shows:

From e1e455f4f4d35850c30235747620d0d078fe9f64 Mon Sep 17 00:00:00 2001
From: Vinson Lee <vlee@twitter.com>
Date: Mon, 23 Mar 2015 12:09:16 -0700
Subject: [PATCH] perf tools: Work around lack of sched_getcpu in glibc < 2.6.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This patch fixes this build error with glibc < 2.6.

  CC       util/cloexec.o
cc1: warnings being treated as errors
util/cloexec.c: In function ‘perf_flag_probe’:
util/cloexec.c:24: error: implicit declaration of function
‘sched_getcpu’
util/cloexec.c:24: error: nested extern declaration of ‘sched_getcpu’
make: *** [util/cloexec.o] Error 1

Signed-off-by: Vinson Lee <vlee@twitter.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yann Droneaud <ydroneaud@opteya.com>
Cc: stable@vger.kernel.org # 3.18+
Link: http://lkml.kernel.org/r/1427137761-16119-1-git-send-email-vlee@twopensource.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


# git log --oneline e1e455f
e1e455f perf tools: Work around lack of sched_getcpu in glibc < 2.6.
77cfe38 perf kmem: Print big numbers using thousands' group
929a6bb tools lib traceevent: Factor out allocating and processing args
e6d7c91 perf probe: Fix to get ummapped symbol address on kernel
228f14f perf tools: Remove (null) value of "Sort order" for perf mem report
2c7da8c perf annotate: Allow annotation for decompressed kernel modules
bc84f46 perf tools: Try to lookup kernel module map before creating one
907fb50 perf tools: Remove is_kmodule_extension function
e746b3e perf tools: Remove compressed argument from is_kernel_module
8dee9ff perf tools: Use kmod_path__parse in is_kernel_module

To further verify that commit e1e455f causes the problem, I tested both
e1e455f and its parent 77cfe38. I see the problem on more than one system.

# rpm -qa | grep glibc-2
glibc-2.17-55.el7.x86_64
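
For illustration only (this is not part of the original report): a minimal
C sketch, assuming a Linux system linked against glibc, of why an unguarded
weak definition could make sched_getcpu() return -1 even with glibc 2.17. A
definition that lives in the program itself, weak or not, satisfies the
reference before the C library is consulted. __attribute__((weak)) stands in
for perf's __weak macro, and observed_cpu() is a hypothetical helper.

```c
#include <errno.h>

/*
 * Same shape as the unguarded fallback: nothing restricts it to old
 * glibc, so it is always compiled into the program.
 */
int __attribute__((weak)) sched_getcpu(void)
{
	errno = ENOSYS;
	return -1;
}

/*
 * Hypothetical helper: what any caller in the same program sees.
 * The call binds to the definition above, not to the C library's.
 */
int observed_cpu(void)
{
	return sched_getcpu();
}
```

Under these assumptions, every caller in the program sees the stub's result
regardless of which CPU it runs on.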


# git reset --hard e1e455f

# Running 'numa/mem' benchmark:

# Running main, "perf bench numa numa-mem --no-data_rand_walk -p 1 -t 64 -G 0 -P 0 -T 32 -l 800 -zZ0c"
#
#

 ###
 # 64 tasks will execute (on 4 nodes, 64 CPUs):
 #        800x     0MB global  shared mem operations
 #        800x     0MB process shared mem operations
 #        800x    32MB thread  local  mem operations
 ###

 ###
 #
 # Startup synchronization: ... threads initialized in 0.512908 seconds.
 #
 #    0.1%  [0.0 mins]  0/0   0/0   0/0   0/0  [ 0/0 ] l: -1-0   (  1) {0-0}
 #    0.6%  [0.0 mins]  0/0   0/0   0/0   0/0  [ 0/0 ] l: -1-0   (  1) {0-0}
 #    5.1%  [0.0 mins]  0/0   0/0   0/0   0/0  [ 0/0 ] l: -1-0   (  1) {0-0}
 #    9.6%  [0.1 mins]  0/0   0/0   0/0   0/0  [ 0/0 ] l: -1-0   (  1) {0-0}
 #   14.0%  [0.1 mins]  0/0   0/0   0/0   0/0  [ 0/0 ] l: -1-0   (  1) {0-0}

 ###

          4.903 secs slowest (max) thread-runtime
          4.873 secs fastest (min) thread-runtime
          4.941 secs average thread-runtime
          0.301 % difference between max/avg runtime
          4.228 GB data processed, per thread
        270.583 GB data processed, total
          1.160 nsecs/byte/thread runtime
          0.862 GB/sec/thread speed
         55.193 GB/sec total speed

And with its parent, 77cfe38:
# git reset --hard 77cfe38

# Running 'numa/mem' benchmark:


# Running main, "perf bench numa numa-mem --no-data_rand_walk -p 1 -t 64 -G 0 -P 0 -T 32 -l 800 -zZ0c"
#
#

 ###
 # 64 tasks will execute (on 4 nodes, 64 CPUs):
 #        800x     0MB global  shared mem operations
 #        800x     0MB process shared mem operations
 #        800x    32MB thread  local  mem operations
 ###

 ###
 #
 # Startup synchronization: ... threads initialized in 0.421336 seconds.
 #
 #    0.4%  [0.0 mins] 16/1  16/1  16/1  16/1  [ 0/4 ] l:  1-20  ( 19) [95.0%] {4-4}
 #    2.6%  [0.0 mins] 17/1  15/1  16/1  16/1  [ 2/4 ] l:  3-37  ( 34) [91.9%] {4-4}
 #    7.1%  [0.0 mins] 17/1  15/1  16/1  16/1  [ 2/4 ] l: 32-67  ( 35) [52.2%] {4-4}
 #   11.8%  [0.1 mins] 17/1  15/1  16/1  16/1  [ 2/4 ] l: 65-103 ( 38) [36.9%] {4-4}
 #   15.9%  [0.1 mins] 17/1  15/1  16/1  16/1  [ 2/4 ] l: 98-136 ( 38) [27.9%] {4-4}

 ###

          4.970 secs slowest (max) thread-runtime
          4.940 secs fastest (min) thread-runtime
          4.980 secs average thread-runtime
          0.300 % difference between max/avg runtime
          4.237 GB data processed, per thread
        271.187 GB data processed, total
          1.173 nsecs/byte/thread runtime
          0.853 GB/sec/thread speed
         54.562 GB/sec total speed


Reverting e1e455f on top of tip/master also avoids the problem.
The patch below fixes it.
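
For illustration only (not part of the original mail): a standalone C sketch
of the guard used in the patch below, assuming a Linux libc that ships
<features.h> and declares sched_getcpu() under _GNU_SOURCE. The macro
SCHED_GETCPU_STUB_COMPILED and the helper report_current_cpu() are
hypothetical names added for the demonstration, and __attribute__((weak))
stands in for perf's __weak macro.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <features.h>	/* provides __GLIBC_PREREQ on glibc */
#include <sched.h>
#include <stdio.h>

/*
 * Compile the fallback only when the C library is glibc and is known to
 * be older than 2.6. On any other libc __GLIBC_PREREQ is undefined and
 * the stub is skipped.
 */
#ifdef __GLIBC_PREREQ
#if !__GLIBC_PREREQ(2, 6)
#define SCHED_GETCPU_STUB_COMPILED 1
int __attribute__((weak)) sched_getcpu(void)
{
	errno = ENOSYS;
	return -1;
}
#endif
#endif

#ifndef SCHED_GETCPU_STUB_COMPILED
#define SCHED_GETCPU_STUB_COMPILED 0
#endif

/* Hypothetical helper: print which implementation is in use. */
int report_current_cpu(void)
{
	int cpu = sched_getcpu();

	printf("stub compiled: %d, sched_getcpu() = %d\n",
	       SCHED_GETCPU_STUB_COMPILED, cpu);
	return cpu;
}
```

With glibc 2.6 or later the stub is not compiled, so the call should reach
the library implementation.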

-- 
Thanks and Regards
Srikar Dronamraju

---->8--------------------------------------------

From 88199ad8a3d6495080eaa016b87a612bc742b1c4 Mon Sep 17 00:00:00 2001
From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Date: Wed, 24 Jun 2015 16:23:22 +0530
Subject: [PATCH] perf tools:Fix perf_bench to show proper convergence

With commit: e1e455f (perf tools: Work around lack of sched_getcpu in
glibc < 2.6), perf_bench numa mem with -c or -m option is not able to
correctly calculate convergence. With the above commit, sched_getcpu
always seems to return -1. The intention of commit e1e455f was to add a
sched_getcpu in glibc < 2.6. Hence keep the sched_getcpu definition
under an ifdef.

This regression occurred between v4.0 and v4.1.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
 tools/perf/util/cloexec.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
index 85b5238..2babdda 100644
--- a/tools/perf/util/cloexec.c
+++ b/tools/perf/util/cloexec.c
@@ -7,11 +7,15 @@
 
 static unsigned long flag = PERF_FLAG_FD_CLOEXEC;
 
+#ifdef __GLIBC_PREREQ
+#if !__GLIBC_PREREQ(2, 6)
 int __weak sched_getcpu(void)
 {
 	errno = ENOSYS;
 	return -1;
 }
+#endif
+#endif
 
 static int perf_flag_probe(void)
 {
-- 
1.8.3.1



* Re: Regression in perf bench numa convergence stats
  2015-06-24 11:10 Regression in perf bench numa convergence stats Srikar Dronamraju
@ 2015-06-24 12:49 ` Ingo Molnar
  2015-06-25 15:30   ` Arnaldo Carvalho de Melo
  2015-06-26  8:43 ` [tip:perf/urgent] perf bench numa: Fix to show proper " tip-bot for Srikar Dronamraju
  1 sibling, 1 reply; 4+ messages in thread
From: Ingo Molnar @ 2015-06-24 12:49 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Vinson Lee, Ingo Molnar,
	LKML, Namhyung Kim, Masami Hiramatsu


* Srikar Dronamraju <srikar@linux.vnet.ibm.com> wrote:

> perf bench numa mem with the -c / -m options on v4.1 and latest tip isn't
> showing correct convergence statistics. I ran git bisect between v4.0 and
> v4.1, and I have included the patch that fixed the problem for me.

> From 88199ad8a3d6495080eaa016b87a612bc742b1c4 Mon Sep 17 00:00:00 2001
> From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> Date: Wed, 24 Jun 2015 16:23:22 +0530
> Subject: [PATCH] perf tools:Fix perf_bench to show proper convergence
> 
> With commit: e1e455f (perf tools: Work around lack of sched_getcpu in
> glibc < 2.6), perf_bench numa mem with -c or -m option is not able to
> correctly calculate convergence. With the above commit, sched_getcpu
> always seems to return -1. The intention of commit e1e455f was to add a
> sched_getcpu in glibc < 2.6. Hence keep the sched_getcpu definition
> under an ifdef.
> 
> This regression occurred between v4.0 and v4.1.
> 
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---
>  tools/perf/util/cloexec.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
> index 85b5238..2babdda 100644
> --- a/tools/perf/util/cloexec.c
> +++ b/tools/perf/util/cloexec.c
> @@ -7,11 +7,15 @@
>  
>  static unsigned long flag = PERF_FLAG_FD_CLOEXEC;
>  
> +#ifdef __GLIBC_PREREQ
> +#if !__GLIBC_PREREQ(2, 6)
>  int __weak sched_getcpu(void)
>  {
>  	errno = ENOSYS;
>  	return -1;
>  }
> +#endif
> +#endif
>  

Thanks Srikar!

Acked-by: Ingo Molnar <mingo@kernel.org>

	Ingo


* Re: Regression in perf bench numa convergence stats
  2015-06-24 12:49 ` Ingo Molnar
@ 2015-06-25 15:30   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 4+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-06-25 15:30 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Srikar Dronamraju, Jiri Olsa, Vinson Lee, Ingo Molnar, LKML,
	Namhyung Kim, Masami Hiramatsu

On Wed, Jun 24, 2015 at 02:49:28PM +0200, Ingo Molnar wrote:
> 
> Thanks Srikar!
> 
> Acked-by: Ingo Molnar <mingo@kernel.org>

Thanks, applied to perf/urgent.

- Arnaldo


* [tip:perf/urgent] perf bench numa: Fix to show proper convergence stats
  2015-06-24 11:10 Regression in perf bench numa convergence stats Srikar Dronamraju
  2015-06-24 12:49 ` Ingo Molnar
@ 2015-06-26  8:43 ` tip-bot for Srikar Dronamraju
  1 sibling, 0 replies; 4+ messages in thread
From: tip-bot for Srikar Dronamraju @ 2015-06-26  8:43 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, mingo, tglx, namhyung, jolsa, srikar, masami.hiramatsu.pt,
	vlee, linux-kernel, hpa

Commit-ID:  2b42b09b88c831ba4da2d669581dde371c38c2af
Gitweb:     http://git.kernel.org/tip/2b42b09b88c831ba4da2d669581dde371c38c2af
Author:     Srikar Dronamraju <srikar@linux.vnet.ibm.com>
AuthorDate: Wed, 24 Jun 2015 16:40:04 +0530
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 25 Jun 2015 12:28:35 -0300

perf bench numa: Fix to show proper convergence stats

With commit: e1e455f4f4d3 (perf tools: Work around lack of sched_getcpu
in glibc < 2.6), perf_bench numa mem with -c or -m option is not able to
correctly calculate convergence.

With the above commit, sched_getcpu always seems to return -1. The
intention of commit e1e455f was to add a sched_getcpu in glibc < 2.6.
Hence keep the sched_getcpu definition under an ifdef.

This regression occurred between v4.0 and v4.1.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vinson Lee <vlee@twitter.com>
Fixes:  e1e455f4f4d3 ("perf tools: Work around lack of sched_getcpu in glibc < 2.6")
Link: http://lkml.kernel.org/r/20150624111004.GA5220@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/cloexec.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/util/cloexec.c b/tools/perf/util/cloexec.c
index 85b5238..2babdda 100644
--- a/tools/perf/util/cloexec.c
+++ b/tools/perf/util/cloexec.c
@@ -7,11 +7,15 @@
 
 static unsigned long flag = PERF_FLAG_FD_CLOEXEC;
 
+#ifdef __GLIBC_PREREQ
+#if !__GLIBC_PREREQ(2, 6)
 int __weak sched_getcpu(void)
 {
 	errno = ENOSYS;
 	return -1;
 }
+#endif
+#endif
 
 static int perf_flag_probe(void)
 {
