All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 01/36] tools headers: Follow the upstream UAPI header version 100% differ from the kernel
       [not found] <20171206144115.15097-1-acme@kernel.org>
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 02/36] perf test: Disable test cases 19 and 20 on s390x Arnaldo Carvalho de Melo
                   ` (34 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Adrian Hunter, Arnaldo Carvalho de Melo

From: Ingo Molnar <mingo@kernel.org>

Remove this from check-headers.sh:

  opts="--ignore-blank-lines --ignore-space-change"

as the easiest policy is to just follow the upstream UAPI header version 100%.
Pure space-only changes are comparatively rare.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/20171121084111.y6p5zwqso2cbms5s@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/check-headers.sh | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/perf/check-headers.sh b/tools/perf/check-headers.sh
index 77406d25e521..e66a8a7bcced 100755
--- a/tools/perf/check-headers.sh
+++ b/tools/perf/check-headers.sh
@@ -45,7 +45,6 @@ include/uapi/asm-generic/mman-common.h
 
 check () {
   file=$1
-  opts="--ignore-blank-lines --ignore-space-change"
 
   shift
   while [ -n "$*" ]; do
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 02/36] perf test: Disable test cases 19 and 20 on s390x
       [not found] <20171206144115.15097-1-acme@kernel.org>
  2017-12-06 14:40 ` [PATCH 01/36] tools headers: Follow the upstream UAPI header version 100% differ from the kernel Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 03/36] perf record: Synthesize unit/scale/... in event update Arnaldo Carvalho de Melo
                   ` (33 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Thomas Richter,
	Martin Schwidefsky, Arnaldo Carvalho de Melo

From: Thomas Richter <tmricht@linux.vnet.ibm.com>

The s390x CPU sampling and measurement facilities do not support perf
events of type PERF_TYPE_BREAKPOINT. The test cases are executed and
fail with -ENOENT due to missing hardware support.

Disable the execution of both test cases based on a
platform check. This is the same approach as done for
PowerPC.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
LPU-Reference: 20171123074623.20817-1-tmricht@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-uqvoy6a1tsu8jddo5jjg4h85@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/bp_signal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
index 335b695f4970..a467615c5a0e 100644
--- a/tools/perf/tests/bp_signal.c
+++ b/tools/perf/tests/bp_signal.c
@@ -296,7 +296,7 @@ bool test__bp_signal_is_supported(void)
  * instruction breakpoint using the perf event interface.
  * Once it's there we can release this.
  */
-#ifdef __powerpc__
+#if defined(__powerpc__) || defined(__s390x__)
 	return false;
 #else
 	return true;
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 03/36] perf record: Synthesize unit/scale/... in event update
       [not found] <20171206144115.15097-1-acme@kernel.org>
  2017-12-06 14:40 ` [PATCH 01/36] tools headers: Follow the upstream UAPI header version 100% differ from the kernel Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 02/36] perf test: Disable test cases 19 and 20 on s390x Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 04/36] perf record: Synthesize thread map and cpu map Arnaldo Carvalho de Melo
                   ` (32 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Move the code to synthesize event updates for scale/unit/cpus to a
common utility file, and use it both from stat and record.

This allows to access scale and other extra qualifiers from perf script.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20171117214300.32746-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c |  9 ++++++
 tools/perf/builtin-stat.c   | 62 +++--------------------------------------
 tools/perf/util/header.c    | 68 +++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/header.h    |  5 ++++
 4 files changed, 86 insertions(+), 58 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 003255910c05..b92d6d67bca8 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -372,6 +372,8 @@ static int record__open(struct record *rec)
 			ui__error("%s\n", msg);
 			goto out;
 		}
+
+		pos->supported = true;
 	}
 
 	if (perf_evlist__apply_filters(evlist, &pos)) {
@@ -784,6 +786,13 @@ static int record__synthesize(struct record *rec, bool tail)
 					 perf_event__synthesize_guest_os, tool);
 	}
 
+	err = perf_event__synthesize_extra_attr(&rec->tool,
+						rec->evlist,
+						process_synthesized_event,
+						data->is_pipe);
+	if (err)
+		goto out;
+
 	err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads,
 					    process_synthesized_event, opts->sample_address,
 					    opts->proc_map_timeout, 1);
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 59af5a8419e2..a027b4712e48 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -458,19 +458,8 @@ static void workload_exec_failed_signal(int signo __maybe_unused, siginfo_t *inf
 	workload_exec_errno = info->si_value.sival_int;
 }
 
-static bool has_unit(struct perf_evsel *counter)
-{
-	return counter->unit && *counter->unit;
-}
-
-static bool has_scale(struct perf_evsel *counter)
-{
-	return counter->scale != 1;
-}
-
 static int perf_stat_synthesize_config(bool is_pipe)
 {
-	struct perf_evsel *counter;
 	int err;
 
 	if (is_pipe) {
@@ -482,53 +471,10 @@ static int perf_stat_synthesize_config(bool is_pipe)
 		}
 	}
 
-	/*
-	 * Synthesize other events stuff not carried within
-	 * attr event - unit, scale, name
-	 */
-	evlist__for_each_entry(evsel_list, counter) {
-		if (!counter->supported)
-			continue;
-
-		/*
-		 * Synthesize unit and scale only if it's defined.
-		 */
-		if (has_unit(counter)) {
-			err = perf_event__synthesize_event_update_unit(NULL, counter, process_synthesized_event);
-			if (err < 0) {
-				pr_err("Couldn't synthesize evsel unit.\n");
-				return err;
-			}
-		}
-
-		if (has_scale(counter)) {
-			err = perf_event__synthesize_event_update_scale(NULL, counter, process_synthesized_event);
-			if (err < 0) {
-				pr_err("Couldn't synthesize evsel scale.\n");
-				return err;
-			}
-		}
-
-		if (counter->own_cpus) {
-			err = perf_event__synthesize_event_update_cpus(NULL, counter, process_synthesized_event);
-			if (err < 0) {
-				pr_err("Couldn't synthesize evsel scale.\n");
-				return err;
-			}
-		}
-
-		/*
-		 * Name is needed only for pipe output,
-		 * perf.data carries event names.
-		 */
-		if (is_pipe) {
-			err = perf_event__synthesize_event_update_name(NULL, counter, process_synthesized_event);
-			if (err < 0) {
-				pr_err("Couldn't synthesize evsel name.\n");
-				return err;
-			}
-		}
-	}
+	err = perf_event__synthesize_extra_attr(NULL,
+						evsel_list,
+						process_synthesized_event,
+						is_pipe);
 
 	err = perf_event__synthesize_thread_map2(NULL, evsel_list->threads,
 						process_synthesized_event,
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 7c0e9d587bfa..5890e08e0754 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3258,6 +3258,74 @@ int perf_event__synthesize_attrs(struct perf_tool *tool,
 	return err;
 }
 
+static bool has_unit(struct perf_evsel *counter)
+{
+	return counter->unit && *counter->unit;
+}
+
+static bool has_scale(struct perf_evsel *counter)
+{
+	return counter->scale != 1;
+}
+
+int perf_event__synthesize_extra_attr(struct perf_tool *tool,
+				      struct perf_evlist *evsel_list,
+				      perf_event__handler_t process,
+				      bool is_pipe)
+{
+	struct perf_evsel *counter;
+	int err;
+
+	/*
+	 * Synthesize other events stuff not carried within
+	 * attr event - unit, scale, name
+	 */
+	evlist__for_each_entry(evsel_list, counter) {
+		if (!counter->supported)
+			continue;
+
+		/*
+		 * Synthesize unit and scale only if it's defined.
+		 */
+		if (has_unit(counter)) {
+			err = perf_event__synthesize_event_update_unit(tool, counter, process);
+			if (err < 0) {
+				pr_err("Couldn't synthesize evsel unit.\n");
+				return err;
+			}
+		}
+
+		if (has_scale(counter)) {
+			err = perf_event__synthesize_event_update_scale(tool, counter, process);
+			if (err < 0) {
+				pr_err("Couldn't synthesize evsel counter.\n");
+				return err;
+			}
+		}
+
+		if (counter->own_cpus) {
+			err = perf_event__synthesize_event_update_cpus(tool, counter, process);
+			if (err < 0) {
+				pr_err("Couldn't synthesize evsel cpus.\n");
+				return err;
+			}
+		}
+
+		/*
+		 * Name is needed only for pipe output,
+		 * perf.data carries event names.
+		 */
+		if (is_pipe) {
+			err = perf_event__synthesize_event_update_name(tool, counter, process);
+			if (err < 0) {
+				pr_err("Couldn't synthesize evsel name.\n");
+				return err;
+			}
+		}
+	}
+	return 0;
+}
+
 int perf_event__process_attr(struct perf_tool *tool __maybe_unused,
 			     union perf_event *event,
 			     struct perf_evlist **pevlist)
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 29ccbfdf8724..91befc3b550d 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -107,6 +107,11 @@ int perf_event__synthesize_features(struct perf_tool *tool,
 				    struct perf_evlist *evlist,
 				    perf_event__handler_t process);
 
+int perf_event__synthesize_extra_attr(struct perf_tool *tool,
+				      struct perf_evlist *evsel_list,
+				      perf_event__handler_t process,
+				      bool is_pipe);
+
 int perf_event__process_feature(struct perf_tool *tool,
 				union perf_event *event,
 				struct perf_session *session);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 04/36] perf record: Synthesize thread map and cpu map
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (2 preceding siblings ...)
  2017-12-06 14:40 ` [PATCH 03/36] perf record: Synthesize unit/scale/... in event update Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 05/36] perf script: Allow computing 'perf stat' style metrics Arnaldo Carvalho de Melo
                   ` (31 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Synthesize the per attr thread maps and cpu maps in 'perf record'.

This allows code from 'perf stat' called from 'perf script' to access
this information.

Committer testing:

Please see the PERF_RECORD_THREAD_MAP and PERF_RECORD_CPU_MAP records,
added by this patch:

  $ perf record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ]
  $ perf report -D | grep PERF_RECORD_ | head
  0xe8 [0x20]: PERF_RECORD_TIME_CONV: unhandled!
  0x108 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 23568
  0x130 [0x18]: PERF_RECORD_CPU_MAP: 0-3
  0 0x148 [0x28]: PERF_RECORD_COMM: perf:23568/23568
  0x570 [0x8]: PERF_RECORD_FINISHED_ROUND
  445342677837144 0x170 [0x28]: PERF_RECORD_COMM exec: sleep:23568/23568
  445342677847339 0x198 [0x68]: PERF_RECORD_MMAP2 23568/23568: [0x564c943a4000(0x208000) @ 0 fd:00 3147174 2566255743]: r-xp /usr/bin/sleep
  445342677862450 0x200 [0x70]: PERF_RECORD_MMAP2 23568/23568: [0x7f25968a8000(0x229000) @ 0 fd:00 3151761 2566238119]: r-xp /usr/lib64/ld-2.25.so
  445342677873174 0x270 [0x60]: PERF_RECORD_MMAP2 23568/23568: [0x7ffc98176000(0x2000) @ 0 00:00 0 0]: r-xp [vdso]
  445342677891928 0x2d0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4002): 23568/23568: 0xffffffff8f84c7e7 period: 1 addr: 0
  $

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/20171117214300.32746-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index b92d6d67bca8..e304bc47fe9b 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -793,6 +793,21 @@ static int record__synthesize(struct record *rec, bool tail)
 	if (err)
 		goto out;
 
+	err = perf_event__synthesize_thread_map2(&rec->tool, rec->evlist->threads,
+						 process_synthesized_event,
+						NULL);
+	if (err < 0) {
+		pr_err("Couldn't synthesize thread map.\n");
+		return err;
+	}
+
+	err = perf_event__synthesize_cpu_map(&rec->tool, rec->evlist->cpus,
+					     process_synthesized_event, NULL);
+	if (err < 0) {
+		pr_err("Couldn't synthesize cpu map.\n");
+		return err;
+	}
+
 	err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads,
 					    process_synthesized_event, opts->sample_address,
 					    opts->proc_map_timeout, 1);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 05/36] perf script: Allow computing 'perf stat' style metrics
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (3 preceding siblings ...)
  2017-12-06 14:40 ` [PATCH 04/36] perf record: Synthesize thread map and cpu map Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 06/36] perf buildid-cache: Document for Node.js USDT Arnaldo Carvalho de Melo
                   ` (30 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Andi Kleen, Arnaldo Carvalho de Melo

From: Andi Kleen <ak@linux.intel.com>

Add support for computing 'perf stat' style metrics in 'perf script'.

When using leader sampling we can get metrics for each sampling period
by computing formulas over the values of the different group members.

This allows things like fine grained IPC tracking through sampling, much
more fine grained than with 'perf stat'.

The metric is still averaged over the sampling period, it is not just
for the sampling point.

This patch adds a new metric output field for 'perf script' that uses
the existing 'perf stat' metrics infrastructure to compute any metrics
supported by 'perf stat'.

For example to sample IPC:

  $ perf record -e '{ref-cycles,cycles,instructions}:S' -a sleep 1
  $ perf script -F metric,ip,sym,time,cpu,comm
  ...
   alsa-sink-ALC32 [000] 42815.856074:      7fd65937d6cc [unknown]
   alsa-sink-ALC32 [000] 42815.856074:      7fd65937d6cc [unknown]
   alsa-sink-ALC32 [000] 42815.856074:      7fd65937d6cc [unknown]
   alsa-sink-ALC32 [000] 42815.856074:    metric:    0.13  insn per cycle
           swapper [000] 42815.857961:  ffffffff81655df0 __schedule
           swapper [000] 42815.857961:  ffffffff81655df0 __schedule
           swapper [000] 42815.857961:  ffffffff81655df0 __schedule
           swapper [000] 42815.857961:    metric:    0.23  insn per cycle
   qemu-system-x86 [000] 42815.858130:  ffffffff8165ad0e _raw_spin_unlock_irqrestore
   qemu-system-x86 [000] 42815.858130:  ffffffff8165ad0e _raw_spin_unlock_irqrestore
   qemu-system-x86 [000] 42815.858130:  ffffffff8165ad0e _raw_spin_unlock_irqrestore
   qemu-system-x86 [000] 42815.858130:    metric:    0.46  insn per cycle
             :4972 [000] 42815.858312:  ffffffffa080e5f2 vmx_vcpu_run
             :4972 [000] 42815.858312:  ffffffffa080e5f2 vmx_vcpu_run
             :4972 [000] 42815.858312:  ffffffffa080e5f2 vmx_vcpu_run
             :4972 [000] 42815.858312:    metric:    0.45  insn per cycle

TopDown:

This requires disabling SMT if you have it enabled, because SMT would
require sampling per core, which is not supported.

  $ perf record -e '{ref-cycles,topdown-fetch-bubbles,\
                     topdown-recovery-bubbles,\
                     topdown-slots-retired,topdown-total-slots,\
                     topdown-slots-issued}:S' -a sleep 1
  $ perf script --header -I -F cpu,ip,sym,event,metric,period
  ...
  [000]     121108               ref-cycles:  ffffffff8165222e copy_user_enhanced_fast_string
  [000]     190350    topdown-fetch-bubbles:  ffffffff8165222e copy_user_enhanced_fast_string
  [000]       2055 topdown-recovery-bubbles:  ffffffff8165222e copy_user_enhanced_fast_string
  [000]     148729    topdown-slots-retired:  ffffffff8165222e copy_user_enhanced_fast_string
  [000]     144324      topdown-total-slots:  ffffffff8165222e copy_user_enhanced_fast_string
  [000]     160852     topdown-slots-issued:  ffffffff8165222e copy_user_enhanced_fast_string
  [000]   metric:     33.0% frontend bound
  [000]   metric:      3.5% bad speculation
  [000]   metric:     25.8% retiring
  [000]   metric:     37.7% backend bound
  [000]     112112               ref-cycles:  ffffffff8165aec8 _raw_spin_lock_irqsave
  [000]     357222    topdown-fetch-bubbles:  ffffffff8165aec8 _raw_spin_lock_irqsave
  [000]       3325 topdown-recovery-bubbles:  ffffffff8165aec8 _raw_spin_lock_irqsave
  [000]     323553    topdown-slots-retired:  ffffffff8165aec8 _raw_spin_lock_irqsave
  [000]     270507      topdown-total-slots:  ffffffff8165aec8 _raw_spin_lock_irqsave
  [000]     341226     topdown-slots-issued:  ffffffff8165aec8 _raw_spin_lock_irqsave
  [000]   metric:     33.0% frontend bound
  [000]   metric:      2.9% bad speculation
  [000]   metric:     29.9% retiring
  [000]   metric:     34.2% backend bound
...

v2:
Use evsel->priv for new fields
Port to new base line, support fp output.
Handle stats in ->stats, not ->priv
Minor cleanups

Extra explanation about the use of the term 'averaging', from Andi in the
thread in the Link: tag below:

<quote Andi>
The current samples contains the sum of event counts for a sampling period.

EventA-1           EventA-2                EventA-3      EventA-4
EventB-1     EventB-2                             EventC-3

                         gap with no events                overflow
|-----------------------------------------------------------------|
period-start                                             period-end
^                                                                 ^
|                                                                 |
previous sample                                      current sample

So EventA = 4 and EventB = 3 at the sample point

I generate a metric, let's say EventA / EventB. It applies to the whole period.

But the metric is over a longer time which does not have the same behavior. For
example the gap above doesn't have any events, while they are clustered at the
beginning and end of the sample period.

But we're summing everything together. The metric doesn't know that the gap is
different than the busy period.

That's what I'm trying to express with averaging.
</quote>

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20171117214300.32746-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt | 10 +++-
 tools/perf/builtin-script.c              | 97 +++++++++++++++++++++++++++++++-
 tools/perf/util/metricgroup.c            |  4 ++
 3 files changed, 108 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 2811fcf684cb..974ceb12c7f3 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -117,7 +117,7 @@ OPTIONS
         Comma separated list of fields to print. Options are:
         comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
         srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
-        brstackoff, callindent, insn, insnlen, synth, phys_addr.
+	brstackoff, callindent, insn, insnlen, synth, phys_addr, metric.
         Field list can be prepended with the type, trace, sw or hw,
         to indicate to which event type the field list applies.
         e.g., -F sw:comm,tid,time,ip,sym  and -F trace:time,cpu,trace
@@ -217,6 +217,14 @@ OPTIONS
 
 	The brstackoff field will print an offset into a specific dso/binary.
 
+	With the metric option perf script can compute metrics for
+	sampling periods, similar to perf stat. This requires
+	specifying a group with multiple metrics with the :S option
+	for perf record. perf will sample on the first event, and
+	compute metrics for all the events in the group. Please note
+	that the metric computed is averaged over the whole sampling
+	period, not just for the sample point.
+
 -k::
 --vmlinux=<file>::
         vmlinux pathname
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index ee7c7aaaae72..39d8b55f0db3 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -22,6 +22,7 @@
 #include "util/cpumap.h"
 #include "util/thread_map.h"
 #include "util/stat.h"
+#include "util/color.h"
 #include "util/string2.h"
 #include "util/thread-stack.h"
 #include "util/time-utils.h"
@@ -90,6 +91,7 @@ enum perf_output_field {
 	PERF_OUTPUT_SYNTH           = 1U << 25,
 	PERF_OUTPUT_PHYS_ADDR       = 1U << 26,
 	PERF_OUTPUT_UREGS	    = 1U << 27,
+	PERF_OUTPUT_METRIC	    = 1U << 28,
 };
 
 struct output_option {
@@ -124,6 +126,7 @@ struct output_option {
 	{.str = "brstackoff", .field = PERF_OUTPUT_BRSTACKOFF},
 	{.str = "synth", .field = PERF_OUTPUT_SYNTH},
 	{.str = "phys_addr", .field = PERF_OUTPUT_PHYS_ADDR},
+	{.str = "metric", .field = PERF_OUTPUT_METRIC},
 };
 
 enum {
@@ -215,12 +218,20 @@ struct perf_evsel_script {
        char *filename;
        FILE *fp;
        u64  samples;
+       /* For metric output */
+       u64  val;
+       int  gnum;
 };
 
+static inline struct perf_evsel_script *evsel_script(struct perf_evsel *evsel)
+{
+	return (struct perf_evsel_script *)evsel->priv;
+}
+
 static struct perf_evsel_script *perf_evsel_script__new(struct perf_evsel *evsel,
 							struct perf_data *data)
 {
-	struct perf_evsel_script *es = malloc(sizeof(*es));
+	struct perf_evsel_script *es = zalloc(sizeof(*es));
 
 	if (es != NULL) {
 		if (asprintf(&es->filename, "%s.%s.dump", data->file.path, perf_evsel__name(evsel)) < 0)
@@ -228,7 +239,6 @@ static struct perf_evsel_script *perf_evsel_script__new(struct perf_evsel *evsel
 		es->fp = fopen(es->filename, "w");
 		if (es->fp == NULL)
 			goto out_free_filename;
-		es->samples = 0;
 	}
 
 	return es;
@@ -1472,6 +1482,86 @@ static int data_src__fprintf(u64 data_src, FILE *fp)
 	return fprintf(fp, "%-*s", maxlen, out);
 }
 
+struct metric_ctx {
+	struct perf_sample	*sample;
+	struct thread		*thread;
+	struct perf_evsel	*evsel;
+	FILE 			*fp;
+};
+
+static void script_print_metric(void *ctx, const char *color,
+			        const char *fmt,
+			        const char *unit, double val)
+{
+	struct metric_ctx *mctx = ctx;
+
+	if (!fmt)
+		return;
+	perf_sample__fprintf_start(mctx->sample, mctx->thread, mctx->evsel,
+				   mctx->fp);
+	fputs("\tmetric: ", mctx->fp);
+	if (color)
+		color_fprintf(mctx->fp, color, fmt, val);
+	else
+		printf(fmt, val);
+	fprintf(mctx->fp, " %s\n", unit);
+}
+
+static void script_new_line(void *ctx)
+{
+	struct metric_ctx *mctx = ctx;
+
+	perf_sample__fprintf_start(mctx->sample, mctx->thread, mctx->evsel,
+				   mctx->fp);
+	fputs("\tmetric: ", mctx->fp);
+}
+
+static void perf_sample__fprint_metric(struct perf_script *script,
+				       struct thread *thread,
+				       struct perf_evsel *evsel,
+				       struct perf_sample *sample,
+				       FILE *fp)
+{
+	struct perf_stat_output_ctx ctx = {
+		.print_metric = script_print_metric,
+		.new_line = script_new_line,
+		.ctx = &(struct metric_ctx) {
+				.sample = sample,
+				.thread = thread,
+				.evsel  = evsel,
+				.fp     = fp,
+			 },
+		.force_header = false,
+	};
+	struct perf_evsel *ev2;
+	static bool init;
+	u64 val;
+
+	if (!init) {
+		perf_stat__init_shadow_stats();
+		init = true;
+	}
+	if (!evsel->stats)
+		perf_evlist__alloc_stats(script->session->evlist, false);
+	if (evsel_script(evsel->leader)->gnum++ == 0)
+		perf_stat__reset_shadow_stats();
+	val = sample->period * evsel->scale;
+	perf_stat__update_shadow_stats(evsel,
+				       val,
+				       sample->cpu);
+	evsel_script(evsel)->val = val;
+	if (evsel_script(evsel->leader)->gnum == evsel->leader->nr_members) {
+		for_each_group_member (ev2, evsel->leader) {
+			perf_stat__print_shadow_stats(ev2,
+						      evsel_script(ev2)->val,
+						      sample->cpu,
+						      &ctx,
+						      NULL);
+		}
+		evsel_script(evsel->leader)->gnum = 0;
+	}
+}
+
 static void process_event(struct perf_script *script,
 			  struct perf_sample *sample, struct perf_evsel *evsel,
 			  struct addr_location *al,
@@ -1559,6 +1649,9 @@ static void process_event(struct perf_script *script,
 	if (PRINT_FIELD(PHYS_ADDR))
 		fprintf(fp, "%16" PRIx64, sample->phys_addr);
 	fprintf(fp, "\n");
+
+	if (PRINT_FIELD(METRIC))
+		perf_sample__fprint_metric(script, thread, evsel, sample, fp);
 }
 
 static struct scripting_ops	*scripting_ops;
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 0ddd9c199227..6fd709017bbc 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -38,6 +38,10 @@ struct metric_event *metricgroup__lookup(struct rblist *metric_events,
 	struct metric_event me = {
 		.evsel = evsel
 	};
+
+	if (!metric_events)
+		return NULL;
+
 	nd = rblist__find(metric_events, &me);
 	if (nd)
 		return container_of(nd, struct metric_event, nd);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 06/36] perf buildid-cache: Document for Node.js USDT
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (4 preceding siblings ...)
  2017-12-06 14:40 ` [PATCH 05/36] perf script: Allow computing 'perf stat' style metrics Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 07/36] perf report: Fix -D output for user metadata events Arnaldo Carvalho de Melo
                   ` (29 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Hansuk Hong, Jiri Olsa,
	Namhyung Kim, Arnaldo Carvalho de Melo

From: Hansuk Hong <flavono123@gmail.com>

Add a tip for Node.js USDT(User-Level Statically Defined Tracing) probes
in tips.txt

Signed-off-by: Hansuk Hong <flavono123@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20171123160546.9722-1-flavono123@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/tips.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/Documentation/tips.txt b/tools/perf/Documentation/tips.txt
index db0ca3063eae..3dd1dbe28407 100644
--- a/tools/perf/Documentation/tips.txt
+++ b/tools/perf/Documentation/tips.txt
@@ -32,3 +32,4 @@ Order by the overhead of source file name and line number: perf report -s srclin
 System-wide collection from all CPUs: perf record -a
 Show current config key-value pairs: perf config --list
 Show user configuration overrides: perf config --user --list
+To add Node.js USDT(User-Level Statically Defined Tracing): perf buildid-cache --add `which node`
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 07/36] perf report: Fix -D output for user metadata events
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (5 preceding siblings ...)
  2017-12-06 14:40 ` [PATCH 06/36] perf buildid-cache: Document for Node.js USDT Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 08/36] perf intel-pt: Improve build messages for files that differ from the kernel Arnaldo Carvalho de Melo
                   ` (28 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, Andi Kleen, David Ahern, Jiri Olsa, Namhyung Kim,
	Wang Nan, Thomas Gleixner

From: Arnaldo Carvalho de Melo <acme@redhat.com>

The PERF_RECORD_USER_ events are synthesized by the tool to assist in
processing the PERF_RECORD_ ones generated by the kernel, the printing
of that information doesn't come with a perf_sample structure, so, when
dumping the event fields using 'perf report -D' there were columns that
end up not being printed.

To tidy up a bit this, fake a perf_sample structure with zeroes to have
the missing columns printed and avoid the occasional surprise with that.

Before:

0 0x45b8 [0x68]: PERF_RECORD_MMAP -1/0: [0xffffffffc12ec000(0x4000) @ 0]: x /lib/modules/4.14.0+/kernel/fs/nls/nls_utf8.ko
0x4620 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 27820
0x4648 [0x18]: PERF_RECORD_CPU_MAP: 0-3
0 0x4660 [0x28]: PERF_RECORD_COMM: perf:27820/27820
0x4a58 [0x8]: PERF_RECORD_FINISHED_ROUND
447723433020976 0x4688 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 27820/27820: 0xffffffff8f1b6d7a period: 1 addr: 0

After:

  $ perf report -D | grep PERF_RECORD_ | head
  0 0xe8 [0x20]: PERF_RECORD_TIME_CONV: unhandled!
  0 0x108 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 32555
  0 0x130 [0x18]: PERF_RECORD_CPU_MAP: 0-3
  0 0x148 [0x28]: PERF_RECORD_COMM: perf:32555/32555
  0 0x4e8 [0x8]: PERF_RECORD_FINISHED_ROUND
  448743409421205 0x170 [0x28]: PERF_RECORD_COMM exec: sleep:32555/32555
  448743409431883 0x198 [0x68]: PERF_RECORD_MMAP2 32555/32555: [0x55e11d75a000(0x208000) @ 0 fd:00 3147174 2566255743]: r-xp /usr/bin/sleep
  448743409443873 0x200 [0x70]: PERF_RECORD_MMAP2 32555/32555: [0x7f0ced316000(0x229000) @ 0 fd:00 3151761 2566238119]: r-xp /usr/lib64/ld-2.25.so
  448743409454790 0x270 [0x60]: PERF_RECORD_MMAP2 32555/32555: [0x7ffe84f6d000(0x2000) @ 0 00:00 0 0]: r-xp [vdso]
  448743409479500 0x2d0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4002): 32555/32555: 0xffffffff8f84c7e7 period: 1 addr: 0
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 9aefcab0de47 ("perf session: Consolidate the dump code")
Link: https://lkml.kernel.org/n/tip-todcu15x0cwgppkh1gi6uhru@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index df2857137908..54e30f1bcbd7 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1348,10 +1348,11 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 {
 	struct ordered_events *oe = &session->ordered_events;
 	struct perf_tool *tool = session->tool;
+	struct perf_sample sample = { .time = 0, };
 	int fd = perf_data__fd(session->data);
 	int err;
 
-	dump_event(session->evlist, event, file_offset, NULL);
+	dump_event(session->evlist, event, file_offset, &sample);
 
 	/* These events are processed right away */
 	switch (event->header.type) {
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 08/36] perf intel-pt: Improve build messages for files that differ from the kernel
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (6 preceding siblings ...)
  2017-12-06 14:40 ` [PATCH 07/36] perf report: Fix -D output for user metadata events Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 09/36] Documentation: Add Arnaldo Melo to list of enforcement statement endorsers Arnaldo Carvalho de Melo
                   ` (27 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Adrian Hunter, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Print file names of files that differ. For example, instead of:

  Warning: Intel PT: x86 instruction decoder differs from kernel

print:

  Warning: Intel PT: x86 instruction decoder header at 'tools/perf/util/intel-pt-decoder/inat.h' differs from latest version at 'arch/x86/include/asm/inat.h'

Reported-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/1511253326-22308-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt-decoder/Build | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/Build b/tools/perf/util/intel-pt-decoder/Build
index 10e0814bb8d2..1b704fbea9de 100644
--- a/tools/perf/util/intel-pt-decoder/Build
+++ b/tools/perf/util/intel-pt-decoder/Build
@@ -11,15 +11,21 @@ $(OUTPUT)util/intel-pt-decoder/inat-tables.c: $(inat_tables_script) $(inat_table
 
 $(OUTPUT)util/intel-pt-decoder/intel-pt-insn-decoder.o: util/intel-pt-decoder/intel-pt-insn-decoder.c util/intel-pt-decoder/inat.c $(OUTPUT)util/intel-pt-decoder/inat-tables.c
 	@(diff -I 2>&1 | grep -q 'option requires an argument' && \
-	test -d ../../kernel -a -d ../../tools -a -d ../perf && (( \
-	diff -B -I'^#include' util/intel-pt-decoder/insn.c ../../arch/x86/lib/insn.c >/dev/null && \
-	diff -B -I'^#include' util/intel-pt-decoder/inat.c ../../arch/x86/lib/inat.c >/dev/null && \
-	diff -B util/intel-pt-decoder/x86-opcode-map.txt ../../arch/x86/lib/x86-opcode-map.txt >/dev/null && \
-	diff -B util/intel-pt-decoder/gen-insn-attr-x86.awk ../../arch/x86/tools/gen-insn-attr-x86.awk >/dev/null && \
-	diff -B -I'^#include' util/intel-pt-decoder/insn.h ../../arch/x86/include/asm/insn.h >/dev/null && \
-	diff -B -I'^#include' util/intel-pt-decoder/inat.h ../../arch/x86/include/asm/inat.h >/dev/null && \
-	diff -B -I'^#include' util/intel-pt-decoder/inat_types.h ../../arch/x86/include/asm/inat_types.h >/dev/null) \
-	|| echo "Warning: Intel PT: x86 instruction decoder differs from kernel" >&2 )) || true
+	test -d ../../kernel -a -d ../../tools -a -d ../perf && ( \
+	((diff -B -I'^#include' util/intel-pt-decoder/insn.c ../../arch/x86/lib/insn.c >/dev/null) || \
+	(echo "Warning: Intel PT: x86 instruction decoder C file at 'tools/perf/util/intel-pt-decoder/insn.c' differs from latest version at 'arch/x86/lib/insn.c'" >&2)) && \
+	((diff -B -I'^#include' util/intel-pt-decoder/inat.c ../../arch/x86/lib/inat.c >/dev/null) || \
+	(echo "Warning: Intel PT: x86 instruction decoder C file at 'tools/perf/util/intel-pt-decoder/inat.c' differs from latest version at 'arch/x86/lib/inat.c'" >&2)) && \
+	((diff -B util/intel-pt-decoder/x86-opcode-map.txt ../../arch/x86/lib/x86-opcode-map.txt >/dev/null) || \
+	(echo "Warning: Intel PT: x86 instruction decoder map file at 'tools/perf/util/intel-pt-decoder/x86-opcode-map.txt' differs from latest version at 'arch/x86/lib/x86-opcode-map.txt'" >&2)) && \
+	((diff -B util/intel-pt-decoder/gen-insn-attr-x86.awk ../../arch/x86/tools/gen-insn-attr-x86.awk >/dev/null) || \
+	(echo "Warning: Intel PT: x86 instruction decoder script at 'tools/perf/util/intel-pt-decoder/gen-insn-attr-x86.awk' differs from latest version at 'arch/x86/tools/gen-insn-attr-x86.awk'" >&2)) && \
+	((diff -B -I'^#include' util/intel-pt-decoder/insn.h ../../arch/x86/include/asm/insn.h >/dev/null) || \
+	(echo "Warning: Intel PT: x86 instruction decoder header at 'tools/perf/util/intel-pt-decoder/insn.h' differs from latest version at 'arch/x86/include/asm/insn.h'" >&2)) && \
+	((diff -B -I'^#include' util/intel-pt-decoder/inat.h ../../arch/x86/include/asm/inat.h >/dev/null) || \
+	(echo "Warning: Intel PT: x86 instruction decoder header at 'tools/perf/util/intel-pt-decoder/inat.h' differs from latest version at 'arch/x86/include/asm/inat.h'" >&2)) && \
+	((diff -B -I'^#include' util/intel-pt-decoder/inat_types.h ../../arch/x86/include/asm/inat_types.h >/dev/null) || \
+	(echo "Warning: Intel PT: x86 instruction decoder header at 'tools/perf/util/intel-pt-decoder/inat_types.h' differs from latest version at 'arch/x86/include/asm/inat_types.h'" >&2)))) || true
 	$(call rule_mkdir)
 	$(call if_changed_dep,cc_o_c)
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 09/36] Documentation: Add Arnaldo Melo to list of enforcement statement endorsers
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (7 preceding siblings ...)
  2017-12-06 14:40 ` [PATCH 08/36] perf intel-pt: Improve build messages for files that differ from the kernel Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 10/36] perf bench futex: Use cpumaps Arnaldo Carvalho de Melo
                   ` (26 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Add my name to the list.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 Documentation/process/kernel-enforcement-statement.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/process/kernel-enforcement-statement.rst b/Documentation/process/kernel-enforcement-statement.rst
index b3170671a1df..bfa6a78103d8 100644
--- a/Documentation/process/kernel-enforcement-statement.rst
+++ b/Documentation/process/kernel-enforcement-statement.rst
@@ -118,6 +118,7 @@ we might work for today, have in the past, or will in the future.
   - Mike Marshall
   - Chris Mason
   - Paul E. McKenney
+  - Arnaldo Carvalho de Melo
   - David S. Miller
   - Ingo Molnar
   - Kuninori Morimoto
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 10/36] perf bench futex: Use cpumaps
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (8 preceding siblings ...)
  2017-12-06 14:40 ` [PATCH 09/36] Documentation: Add Arnaldo Melo to list of enforcement statement endorsers Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 11/36] tools build feature: Check if pthread_barrier_t is available Arnaldo Carvalho de Melo
                   ` (25 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Davidlohr Bueso, Davidlohr Bueso,
	Kim Phillips, Arnaldo Carvalho de Melo

From: Davidlohr Bueso <dave@stgolabs.net>

It was reported that the whole futex bench breaks when dealing with
non-contiguously numbered cpus.

$ echo 0 | sudo tee /sys/devices/system/cpu/cpu3/online
$ ./perf bench futex all
 perf: pthread_create: Operation not permitted
 Run summary [PID 14934]: 7 threads, each ....

James had implemented an approach with cpumaps that use an in house
flavor. Instead of re-inventing the wheel, I've redone the patch such
that we use the perf's util/cpumap.c interface instead.

Applies to all futex benchmarks.

Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Originally-from: James Yang <james.yang@arm.com>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Kim Phillips <kim.phillips@arm.com>
Link: http://lkml.kernel.org/r/20171127042101.3659-2-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/bench/futex-hash.c          | 19 ++++++++++++-------
 tools/perf/bench/futex-lock-pi.c       | 23 ++++++++++++++---------
 tools/perf/bench/futex-requeue.c       | 22 +++++++++++++---------
 tools/perf/bench/futex-wake-parallel.c | 24 +++++++++++++++---------
 tools/perf/bench/futex-wake.c          | 18 +++++++++++-------
 5 files changed, 65 insertions(+), 41 deletions(-)

diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index 58ae6ed8f38b..2defb6df7fd0 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -24,6 +24,7 @@
 #include <subcmd/parse-options.h>
 #include "bench.h"
 #include "futex.h"
+#include "cpumap.h"
 
 #include <err.h>
 #include <sys/time.h>
@@ -118,11 +119,12 @@ static void print_summary(void)
 int bench_futex_hash(int argc, const char **argv)
 {
 	int ret = 0;
-	cpu_set_t cpu;
+	cpu_set_t cpuset;
 	struct sigaction act;
-	unsigned int i, ncpus;
+	unsigned int i;
 	pthread_attr_t thread_attr;
 	struct worker *worker = NULL;
+	struct cpu_map *cpu;
 
 	argc = parse_options(argc, argv, options, bench_futex_hash_usage, 0);
 	if (argc) {
@@ -130,14 +132,16 @@ int bench_futex_hash(int argc, const char **argv)
 		exit(EXIT_FAILURE);
 	}
 
-	ncpus = sysconf(_SC_NPROCESSORS_ONLN);
+	cpu = cpu_map__new(NULL);
+	if (!cpu)
+		goto errmem;
 
 	sigfillset(&act.sa_mask);
 	act.sa_sigaction = toggle_done;
 	sigaction(SIGINT, &act, NULL);
 
 	if (!nthreads) /* default to the number of CPUs */
-		nthreads = ncpus;
+		nthreads = cpu->nr;
 
 	worker = calloc(nthreads, sizeof(*worker));
 	if (!worker)
@@ -163,10 +167,10 @@ int bench_futex_hash(int argc, const char **argv)
 		if (!worker[i].futex)
 			goto errmem;
 
-		CPU_ZERO(&cpu);
-		CPU_SET(i % ncpus, &cpu);
+		CPU_ZERO(&cpuset);
+		CPU_SET(cpu->map[i % cpu->nr], &cpuset);
 
-		ret = pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpu);
+		ret = pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset);
 		if (ret)
 			err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
 
@@ -217,6 +221,7 @@ int bench_futex_hash(int argc, const char **argv)
 	print_summary();
 
 	free(worker);
+	free(cpu);
 	return ret;
 errmem:
 	err(EXIT_FAILURE, "calloc");
diff --git a/tools/perf/bench/futex-lock-pi.c b/tools/perf/bench/futex-lock-pi.c
index 08653ae8a8c4..8e9c4753e304 100644
--- a/tools/perf/bench/futex-lock-pi.c
+++ b/tools/perf/bench/futex-lock-pi.c
@@ -15,6 +15,7 @@
 #include <errno.h>
 #include "bench.h"
 #include "futex.h"
+#include "cpumap.h"
 
 #include <err.h>
 #include <stdlib.h>
@@ -32,7 +33,7 @@ static struct worker *worker;
 static unsigned int nsecs = 10;
 static bool silent = false, multi = false;
 static bool done = false, fshared = false;
-static unsigned int ncpus, nthreads = 0;
+static unsigned int nthreads = 0;
 static int futex_flag = 0;
 struct timeval start, end, runtime;
 static pthread_mutex_t thread_lock;
@@ -113,9 +114,10 @@ static void *workerfn(void *arg)
 	return NULL;
 }
 
-static void create_threads(struct worker *w, pthread_attr_t thread_attr)
+static void create_threads(struct worker *w, pthread_attr_t thread_attr,
+			   struct cpu_map *cpu)
 {
-	cpu_set_t cpu;
+	cpu_set_t cpuset;
 	unsigned int i;
 
 	threads_starting = nthreads;
@@ -130,10 +132,10 @@ static void create_threads(struct worker *w, pthread_attr_t thread_attr)
 		} else
 			worker[i].futex = &global_futex;
 
-		CPU_ZERO(&cpu);
-		CPU_SET(i % ncpus, &cpu);
+		CPU_ZERO(&cpuset);
+		CPU_SET(cpu->map[i % cpu->nr], &cpuset);
 
-		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpu))
+		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset))
 			err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
 
 		if (pthread_create(&w[i].thread, &thread_attr, workerfn, &worker[i]))
@@ -147,19 +149,22 @@ int bench_futex_lock_pi(int argc, const char **argv)
 	unsigned int i;
 	struct sigaction act;
 	pthread_attr_t thread_attr;
+	struct cpu_map *cpu;
 
 	argc = parse_options(argc, argv, options, bench_futex_lock_pi_usage, 0);
 	if (argc)
 		goto err;
 
-	ncpus = sysconf(_SC_NPROCESSORS_ONLN);
+	cpu = cpu_map__new(NULL);
+	if (!cpu)
+		err(EXIT_FAILURE, "calloc");
 
 	sigfillset(&act.sa_mask);
 	act.sa_sigaction = toggle_done;
 	sigaction(SIGINT, &act, NULL);
 
 	if (!nthreads)
-		nthreads = ncpus;
+		nthreads = cpu->nr;
 
 	worker = calloc(nthreads, sizeof(*worker));
 	if (!worker)
@@ -180,7 +185,7 @@ int bench_futex_lock_pi(int argc, const char **argv)
 	pthread_attr_init(&thread_attr);
 	gettimeofday(&start, NULL);
 
-	create_threads(worker, thread_attr);
+	create_threads(worker, thread_attr, cpu);
 	pthread_attr_destroy(&thread_attr);
 
 	pthread_mutex_lock(&thread_lock);
diff --git a/tools/perf/bench/futex-requeue.c b/tools/perf/bench/futex-requeue.c
index 1058c194608a..fc692efa0c05 100644
--- a/tools/perf/bench/futex-requeue.c
+++ b/tools/perf/bench/futex-requeue.c
@@ -22,6 +22,7 @@
 #include <errno.h>
 #include "bench.h"
 #include "futex.h"
+#include "cpumap.h"
 
 #include <err.h>
 #include <stdlib.h>
@@ -40,7 +41,7 @@ static bool done = false, silent = false, fshared = false;
 static pthread_mutex_t thread_lock;
 static pthread_cond_t thread_parent, thread_worker;
 static struct stats requeuetime_stats, requeued_stats;
-static unsigned int ncpus, threads_starting, nthreads = 0;
+static unsigned int threads_starting, nthreads = 0;
 static int futex_flag = 0;
 
 static const struct option options[] = {
@@ -83,19 +84,19 @@ static void *workerfn(void *arg __maybe_unused)
 }
 
 static void block_threads(pthread_t *w,
-			  pthread_attr_t thread_attr)
+			  pthread_attr_t thread_attr, struct cpu_map *cpu)
 {
-	cpu_set_t cpu;
+	cpu_set_t cpuset;
 	unsigned int i;
 
 	threads_starting = nthreads;
 
 	/* create and block all threads */
 	for (i = 0; i < nthreads; i++) {
-		CPU_ZERO(&cpu);
-		CPU_SET(i % ncpus, &cpu);
+		CPU_ZERO(&cpuset);
+		CPU_SET(cpu->map[i % cpu->nr], &cpuset);
 
-		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpu))
+		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset))
 			err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
 
 		if (pthread_create(&w[i], &thread_attr, workerfn, NULL))
@@ -116,19 +117,22 @@ int bench_futex_requeue(int argc, const char **argv)
 	unsigned int i, j;
 	struct sigaction act;
 	pthread_attr_t thread_attr;
+	struct cpu_map *cpu;
 
 	argc = parse_options(argc, argv, options, bench_futex_requeue_usage, 0);
 	if (argc)
 		goto err;
 
-	ncpus = sysconf(_SC_NPROCESSORS_ONLN);
+	cpu = cpu_map__new(NULL);
+	if (!cpu)
+		err(EXIT_FAILURE, "cpu_map__new");
 
 	sigfillset(&act.sa_mask);
 	act.sa_sigaction = toggle_done;
 	sigaction(SIGINT, &act, NULL);
 
 	if (!nthreads)
-		nthreads = ncpus;
+		nthreads = cpu->nr;
 
 	worker = calloc(nthreads, sizeof(*worker));
 	if (!worker)
@@ -156,7 +160,7 @@ int bench_futex_requeue(int argc, const char **argv)
 		struct timeval start, end, runtime;
 
 		/* create, launch & block all threads */
-		block_threads(worker, thread_attr);
+		block_threads(worker, thread_attr, cpu);
 
 		/* make sure all threads are already blocked */
 		pthread_mutex_lock(&thread_lock);
diff --git a/tools/perf/bench/futex-wake-parallel.c b/tools/perf/bench/futex-wake-parallel.c
index b4732dad9f89..4488c27e8a43 100644
--- a/tools/perf/bench/futex-wake-parallel.c
+++ b/tools/perf/bench/futex-wake-parallel.c
@@ -21,6 +21,7 @@
 #include <errno.h>
 #include "bench.h"
 #include "futex.h"
+#include "cpumap.h"
 
 #include <err.h>
 #include <stdlib.h>
@@ -43,7 +44,7 @@ static unsigned int nblocked_threads = 0, nwaking_threads = 0;
 static pthread_mutex_t thread_lock;
 static pthread_cond_t thread_parent, thread_worker;
 static struct stats waketime_stats, wakeup_stats;
-static unsigned int ncpus, threads_starting;
+static unsigned int threads_starting;
 static int futex_flag = 0;
 
 static const struct option options[] = {
@@ -119,19 +120,20 @@ static void *blocked_workerfn(void *arg __maybe_unused)
 	return NULL;
 }
 
-static void block_threads(pthread_t *w, pthread_attr_t thread_attr)
+static void block_threads(pthread_t *w, pthread_attr_t thread_attr,
+			  struct cpu_map *cpu)
 {
-	cpu_set_t cpu;
+	cpu_set_t cpuset;
 	unsigned int i;
 
 	threads_starting = nblocked_threads;
 
 	/* create and block all threads */
 	for (i = 0; i < nblocked_threads; i++) {
-		CPU_ZERO(&cpu);
-		CPU_SET(i % ncpus, &cpu);
+		CPU_ZERO(&cpuset);
+		CPU_SET(cpu->map[i % cpu->nr], &cpuset);
 
-		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpu))
+		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset))
 			err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
 
 		if (pthread_create(&w[i], &thread_attr, blocked_workerfn, NULL))
@@ -205,6 +207,7 @@ int bench_futex_wake_parallel(int argc, const char **argv)
 	struct sigaction act;
 	pthread_attr_t thread_attr;
 	struct thread_data *waking_worker;
+	struct cpu_map *cpu;
 
 	argc = parse_options(argc, argv, options,
 			     bench_futex_wake_parallel_usage, 0);
@@ -217,9 +220,12 @@ int bench_futex_wake_parallel(int argc, const char **argv)
 	act.sa_sigaction = toggle_done;
 	sigaction(SIGINT, &act, NULL);
 
-	ncpus = sysconf(_SC_NPROCESSORS_ONLN);
+	cpu = cpu_map__new(NULL);
+	if (!cpu)
+		err(EXIT_FAILURE, "calloc");
+
 	if (!nblocked_threads)
-		nblocked_threads = ncpus;
+		nblocked_threads = cpu->nr;
 
 	/* some sanity checks */
 	if (nwaking_threads > nblocked_threads || !nwaking_threads)
@@ -259,7 +265,7 @@ int bench_futex_wake_parallel(int argc, const char **argv)
 			err(EXIT_FAILURE, "calloc");
 
 		/* create, launch & block all threads */
-		block_threads(blocked_worker, thread_attr);
+		block_threads(blocked_worker, thread_attr, cpu);
 
 		/* make sure all threads are already blocked */
 		pthread_mutex_lock(&thread_lock);
diff --git a/tools/perf/bench/futex-wake.c b/tools/perf/bench/futex-wake.c
index 8c5c0b6b5c97..e8181ad7d088 100644
--- a/tools/perf/bench/futex-wake.c
+++ b/tools/perf/bench/futex-wake.c
@@ -22,6 +22,7 @@
 #include <errno.h>
 #include "bench.h"
 #include "futex.h"
+#include "cpumap.h"
 
 #include <err.h>
 #include <stdlib.h>
@@ -89,19 +90,19 @@ static void print_summary(void)
 }
 
 static void block_threads(pthread_t *w,
-			  pthread_attr_t thread_attr)
+			  pthread_attr_t thread_attr, struct cpu_map *cpu)
 {
-	cpu_set_t cpu;
+	cpu_set_t cpuset;
 	unsigned int i;
 
 	threads_starting = nthreads;
 
 	/* create and block all threads */
 	for (i = 0; i < nthreads; i++) {
-		CPU_ZERO(&cpu);
-		CPU_SET(i % ncpus, &cpu);
+		CPU_ZERO(&cpuset);
+		CPU_SET(cpu->map[i % cpu->nr], &cpuset);
 
-		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpu))
+		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset))
 			err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
 
 		if (pthread_create(&w[i], &thread_attr, workerfn, NULL))
@@ -122,6 +123,7 @@ int bench_futex_wake(int argc, const char **argv)
 	unsigned int i, j;
 	struct sigaction act;
 	pthread_attr_t thread_attr;
+	struct cpu_map *cpu;
 
 	argc = parse_options(argc, argv, options, bench_futex_wake_usage, 0);
 	if (argc) {
@@ -129,7 +131,9 @@ int bench_futex_wake(int argc, const char **argv)
 		exit(EXIT_FAILURE);
 	}
 
-	ncpus = sysconf(_SC_NPROCESSORS_ONLN);
+	cpu = cpu_map__new(NULL);
+	if (!cpu)
+		err(EXIT_FAILURE, "calloc");
 
 	sigfillset(&act.sa_mask);
 	act.sa_sigaction = toggle_done;
@@ -161,7 +165,7 @@ int bench_futex_wake(int argc, const char **argv)
 		struct timeval start, end, runtime;
 
 		/* create, launch & block all threads */
-		block_threads(worker, thread_attr);
+		block_threads(worker, thread_attr, cpu);
 
 		/* make sure all threads are already blocked */
 		pthread_mutex_lock(&thread_lock);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 11/36] tools build feature: Check if pthread_barrier_t is available
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (9 preceding siblings ...)
  2017-12-06 14:40 ` [PATCH 10/36] perf bench futex: Use cpumaps Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 21:31   ` Philippe Ombredanne
  2017-12-06 14:40 ` [PATCH 12/36] perf bench futex: Sync waker threads Arnaldo Carvalho de Melo
                   ` (24 subsequent siblings)
  35 siblings, 1 reply; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, David Ahern, Davidlohr Bueso, James Yang,
	Jiri Olsa, Kim Phillips, Namhyung Kim, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

As 'perf bench futex wake-parallel" will use this, which is not
available in older systems such as versions of the android NDK used in
my container build tests (r12b and r15c at the moment).

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: James Yang <james.yang@arm.com
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-1i7iv54in4wj08lwo55b0pzv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/build/Makefile.feature               |  1 +
 tools/build/feature/Makefile               |  4 ++++
 tools/build/feature/test-all.c             |  5 +++++
 tools/build/feature/test-pthread-barrier.c | 12 ++++++++++++
 tools/perf/Makefile.config                 |  4 ++++
 5 files changed, 26 insertions(+)
 create mode 100644 tools/build/feature/test-pthread-barrier.c

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index c71a05b9c984..e52fcefee379 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -56,6 +56,7 @@ FEATURE_TESTS_BASIC :=                  \
         libunwind-arm                   \
         libunwind-aarch64               \
         pthread-attr-setaffinity-np     \
+        pthread-barrier     		\
         stackprotector-all              \
         timerfd                         \
         libdw-dwarf-unwind              \
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index 96982640fbf8..cff38f342283 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -37,6 +37,7 @@ FILES=                                          \
          test-libunwind-debug-frame-arm.bin     \
          test-libunwind-debug-frame-aarch64.bin \
          test-pthread-attr-setaffinity-np.bin   \
+         test-pthread-barrier.bin		\
          test-stackprotector-all.bin            \
          test-timerfd.bin                       \
          test-libdw-dwarf-unwind.bin            \
@@ -79,6 +80,9 @@ $(OUTPUT)test-hello.bin:
 $(OUTPUT)test-pthread-attr-setaffinity-np.bin:
 	$(BUILD) -D_GNU_SOURCE -lpthread
 
+$(OUTPUT)test-pthread-barrier.bin:
+	$(BUILD) -lpthread
+
 $(OUTPUT)test-stackprotector-all.bin:
 	$(BUILD) -fstack-protector-all
 
diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c
index 4112702e4aed..6fdf83263ab7 100644
--- a/tools/build/feature/test-all.c
+++ b/tools/build/feature/test-all.c
@@ -118,6 +118,10 @@
 # include "test-pthread-attr-setaffinity-np.c"
 #undef main
 
+#define main main_test_pthread_barrier
+# include "test-pthread-barrier.c"
+#undef main
+
 #define main main_test_sched_getcpu
 # include "test-sched_getcpu.c"
 #undef main
@@ -187,6 +191,7 @@ int main(int argc, char *argv[])
 	main_test_sync_compare_and_swap(argc, argv);
 	main_test_zlib();
 	main_test_pthread_attr_setaffinity_np();
+	main_test_pthread_barrier();
 	main_test_lzma();
 	main_test_get_cpuid();
 	main_test_bpf();
diff --git a/tools/build/feature/test-pthread-barrier.c b/tools/build/feature/test-pthread-barrier.c
new file mode 100644
index 000000000000..0558d9334d97
--- /dev/null
+++ b/tools/build/feature/test-pthread-barrier.c
@@ -0,0 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdint.h>
+#include <pthread.h>
+
+int main(void)
+{
+	pthread_barrier_t barrier;
+
+	pthread_barrier_init(&barrier, NULL, 1);
+	pthread_barrier_wait(&barrier);
+	return pthread_barrier_destroy(&barrier);
+}
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index f6786fa2419f..2c437baf8364 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -263,6 +263,10 @@ ifeq ($(feature-pthread-attr-setaffinity-np), 1)
   CFLAGS += -DHAVE_PTHREAD_ATTR_SETAFFINITY_NP
 endif
 
+ifeq ($(feature-pthread-barrier), 1)
+  CFLAGS += -DHAVE_PTHREAD_BARRIER
+endif
+
 ifndef NO_BIONIC
   $(call feature_check,bionic)
   ifeq ($(feature-bionic), 1)
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 12/36] perf bench futex: Sync waker threads
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (10 preceding siblings ...)
  2017-12-06 14:40 ` [PATCH 11/36] tools build feature: Check if pthread_barrier_t is available Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 13/36] perf annotate: Fix unnecessary memory allocation for s390x Arnaldo Carvalho de Melo
                   ` (23 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, James Yang, Kim Phillips,
	Davidlohr Bueso, Arnaldo Carvalho de Melo

From: James Yang <james.yang@arm.com>

Waker threads in the futex wake-parallel benchmark are started by a loop
using pthread_create().  However, there is no synchronization for when
the waker threads wake the waiting threads.  Comparison of the waker
threads' measurement timestamps show they are not all running
concurrently because older waker threads finish their task before newer
waker threads even start.

This patch uses a barrier to better synchronize the waker threads.

Signed-off-by: James Yang <james.yang@arm.com
Cc: Kim Phillips <kim.phillips@arm.com>
Link: http://lkml.kernel.org/r/20171127042101.3659-4-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
[ Disable the wake-parallel test for systems without pthread_barrier_t ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/bench/futex-wake-parallel.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/tools/perf/bench/futex-wake-parallel.c b/tools/perf/bench/futex-wake-parallel.c
index 4488c27e8a43..69d8fdc87315 100644
--- a/tools/perf/bench/futex-wake-parallel.c
+++ b/tools/perf/bench/futex-wake-parallel.c
@@ -7,7 +7,17 @@
  * for each individual thread to service its share of work. Ultimately
  * it can be used to measure futex_wake() changes.
  */
+#include "bench.h"
+#include <linux/compiler.h>
+#include "../util/debug.h"
 
+#ifndef HAVE_PTHREAD_BARRIER
+int bench_futex_wake_parallel(int argc __maybe_unused, const char **argv __maybe_unused)
+{
+	pr_err("%s: pthread_barrier_t unavailable, disabling this test...\n", __func__);
+	return 0;
+}
+#else /* HAVE_PTHREAD_BARRIER */
 /* For the CLR_() macros */
 #include <string.h>
 #include <pthread.h>
@@ -15,11 +25,9 @@
 #include <signal.h>
 #include "../util/stat.h"
 #include <subcmd/parse-options.h>
-#include <linux/compiler.h>
 #include <linux/kernel.h>
 #include <linux/time64.h>
 #include <errno.h>
-#include "bench.h"
 #include "futex.h"
 #include "cpumap.h"
 
@@ -43,6 +51,7 @@ static bool done = false, silent = false, fshared = false;
 static unsigned int nblocked_threads = 0, nwaking_threads = 0;
 static pthread_mutex_t thread_lock;
 static pthread_cond_t thread_parent, thread_worker;
+static pthread_barrier_t barrier;
 static struct stats waketime_stats, wakeup_stats;
 static unsigned int threads_starting;
 static int futex_flag = 0;
@@ -65,6 +74,8 @@ static void *waking_workerfn(void *arg)
 	struct thread_data *waker = (struct thread_data *) arg;
 	struct timeval start, end;
 
+	pthread_barrier_wait(&barrier);
+
 	gettimeofday(&start, NULL);
 
 	waker->nwoken = futex_wake(&futex, nwakes, futex_flag);
@@ -85,6 +96,8 @@ static void wakeup_threads(struct thread_data *td, pthread_attr_t thread_attr)
 
 	pthread_attr_setdetachstate(&thread_attr, PTHREAD_CREATE_JOINABLE);
 
+	pthread_barrier_init(&barrier, NULL, nwaking_threads + 1);
+
 	/* create and block all threads */
 	for (i = 0; i < nwaking_threads; i++) {
 		/*
@@ -97,9 +110,13 @@ static void wakeup_threads(struct thread_data *td, pthread_attr_t thread_attr)
 			err(EXIT_FAILURE, "pthread_create");
 	}
 
+	pthread_barrier_wait(&barrier);
+
 	for (i = 0; i < nwaking_threads; i++)
 		if (pthread_join(td[i].worker, NULL))
 			err(EXIT_FAILURE, "pthread_join");
+
+	pthread_barrier_destroy(&barrier);
 }
 
 static void *blocked_workerfn(void *arg __maybe_unused)
@@ -303,3 +320,4 @@ int bench_futex_wake_parallel(int argc, const char **argv)
 	free(blocked_worker);
 	return ret;
 }
+#endif /* HAVE_PTHREAD_BARRIER */
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 13/36] perf annotate: Fix unnecessary memory allocation for s390x
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (11 preceding siblings ...)
  2017-12-06 14:40 ` [PATCH 12/36] perf bench futex: Sync waker threads Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 14/36] perf annotate: Fix objdump comment parsing for Intel mov dissassembly Arnaldo Carvalho de Melo
                   ` (22 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Thomas Richter, Heiko Carstens,
	Martin Schwidefsky, Arnaldo Carvalho de Melo

From: Thomas Richter <tmricht@linux.vnet.ibm.com>

This patch fixes a bug introduced with commit d9f8dfa9baf9 ("perf
annotate s390: Implement jump types for perf annotate").

'perf annotate' displays annotated assembler output by reading output of
command objdump and parsing the disassembled lines. For each shown
mnemonic this function sequence is executed:

  disasm_line__new()
  |
  +--> disasm_line__init_ins()
       |
       +--> ins__find()
            |
            +--> arch->associate_instruction_ops()

The s390x specific function assigned to function pointer
associate_instruction_ops refers to function s390__associate_ins_ops().

This function checks for supported mnemonics and assigns a NULL pointer
to unsupported mnemonics.  However even the NULL pointer is added to the
architecture dependend instruction array.

This leads to an extremely large architecture instruction array
(due to array resize logic in function arch__grow_instructions()).

Depending on the objdump output being parsed the array can end up
with several ten-thousand elements.

This patch checks if a mnemonic is supported and only adds supported
ones into the architecture instruction array. The array does not contain
elements with NULL pointers anymore.

Before the patch (With some debug printf output):

[root@s35lp76 perf]# time ./perf annotate --stdio > /tmp/xxxbb

real	8m49.679s
user	7m13.008s
sys	0m1.649s
[root@s35lp76 perf]# fgrep '__ins__find sorted:1 nr_instructions:'
			/tmp/xxxbb | tail -1
__ins__find sorted:1 nr_instructions:87433 ins:0x341583c0
[root@s35lp76 perf]#

The number of different s390x branch/jump/call/return instructions
entered into the array is 87433.

After the patch (With some printf debug output:)

[root@s35lp76 perf]# time ./perf annotate --stdio > /tmp/xxxaa

real	1m24.553s
user	0m0.587s
sys	0m1.530s
[root@s35lp76 perf]# fgrep '__ins__find sorted:1 nr_instructions:'
			/tmp/xxxaa | tail -1
__ins__find sorted:1 nr_instructions:56 ins:0x3f406570
[root@s35lp76 perf]#

The number of different s390x branch/jump/call/return instructions
entered into the array is 56 which is sensible.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20171124094637.55558-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/s390/annotate/instructions.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/s390/annotate/instructions.c b/tools/perf/arch/s390/annotate/instructions.c
index e0e466c650df..8c72b44444cb 100644
--- a/tools/perf/arch/s390/annotate/instructions.c
+++ b/tools/perf/arch/s390/annotate/instructions.c
@@ -18,7 +18,8 @@ static struct ins_ops *s390__associate_ins_ops(struct arch *arch, const char *na
 	if (!strcmp(name, "br"))
 		ops = &ret_ops;
 
-	arch__associate_ins_ops(arch, name, ops);
+	if (ops)
+		arch__associate_ins_ops(arch, name, ops);
 	return ops;
 }
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 14/36] perf annotate: Fix objdump comment parsing for Intel mov dissassembly
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (12 preceding siblings ...)
  2017-12-06 14:40 ` [PATCH 13/36] perf annotate: Fix unnecessary memory allocation for s390x Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 15/36] perf rblist: Create rblist__exit() function Arnaldo Carvalho de Melo
                   ` (21 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Thomas Richter, Heiko Carstens,
	Martin Schwidefsky, Arnaldo Carvalho de Melo

From: Thomas Richter <tmricht@linux.vnet.ibm.com>

The command 'perf annotate' parses the output of objdump and also
investigates the comments produced by objdump. For example the
output of objdump produces (on x86):

23eee:  4c 8b 3d 13 01 21 00 mov 0x210113(%rip),%r15
                                # 234008 <stderr@@GLIBC_2.2.5+0x9a8>

and the function mov__parse() is called to investigate the complete
line. Mov__parse() breaks this line into several parts and finally
calls function comment__symbol() to parse the data after the comment
character '#'. Comment__symbol() expects a hexadecimal address followed
by a symbol in '<' and '>' brackets.

However the 2nd parameter given to function comment__symbol()
always points to the comment character '#'. The address parsing
always returns 0 because the character '#' is not a digit and
strtoull() fails without being noticed.

Fix this by advancing the second parameter to function comment__symbol()
by one byte before invocation and add an error check after strtoull()
has been called.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fixes: 6de783b6f50f ("perf annotate: Resolve symbols using objdump comment")
Link: http://lkml.kernel.org/r/20171128075632.72182-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/annotate.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 22ea7936d92f..facad1e279a8 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -322,6 +322,8 @@ static int comment__symbol(char *raw, char *comment, u64 *addrp, char **namep)
 		return 0;
 
 	*addrp = strtoull(comment, &endptr, 16);
+	if (endptr == comment)
+		return 0;
 	name = strchr(endptr, '<');
 	if (name == NULL)
 		return -1;
@@ -435,8 +437,8 @@ static int mov__parse(struct arch *arch, struct ins_operands *ops, struct map *m
 		return 0;
 
 	comment = ltrim(comment);
-	comment__symbol(ops->source.raw, comment, &ops->source.addr, &ops->source.name);
-	comment__symbol(ops->target.raw, comment, &ops->target.addr, &ops->target.name);
+	comment__symbol(ops->source.raw, comment + 1, &ops->source.addr, &ops->source.name);
+	comment__symbol(ops->target.raw, comment + 1, &ops->target.addr, &ops->target.name);
 
 	return 0;
 
@@ -480,7 +482,7 @@ static int dec__parse(struct arch *arch __maybe_unused, struct ins_operands *ops
 		return 0;
 
 	comment = ltrim(comment);
-	comment__symbol(ops->target.raw, comment, &ops->target.addr, &ops->target.name);
+	comment__symbol(ops->target.raw, comment + 1, &ops->target.addr, &ops->target.name);
 
 	return 0;
 }
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 15/36] perf rblist: Create rblist__exit() function
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (13 preceding siblings ...)
  2017-12-06 14:40 ` [PATCH 14/36] perf annotate: Fix objdump comment parsing for Intel mov dissassembly Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 16/36] perf stat: Add rbtree node_delete op Arnaldo Carvalho de Melo
                   ` (20 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
	Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Jin Yao <yao.jin@linux.intel.com>

Currently we have a rblist__delete() which is used to delete a rblist.
While rblist__delete() will free the pointer of rblist at the end.

It's an inconvenience for the user to delete a rblist which is not
allocated by something like malloc(). For example, the rblist is
embedded in a larger data structure.

This patch creates a new function rblist__exit() which is similar to
rblist__delete() but it will not free the pointer of rblist.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512125856-22056-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/rblist.c | 19 ++++++++++++-------
 tools/perf/util/rblist.h |  1 +
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/rblist.c b/tools/perf/util/rblist.c
index 0dfe27d99458..0efc3258c648 100644
--- a/tools/perf/util/rblist.c
+++ b/tools/perf/util/rblist.c
@@ -101,16 +101,21 @@ void rblist__init(struct rblist *rblist)
 	return;
 }
 
+void rblist__exit(struct rblist *rblist)
+{
+	struct rb_node *pos, *next = rb_first(&rblist->entries);
+
+	while (next) {
+		pos = next;
+		next = rb_next(pos);
+		rblist__remove_node(rblist, pos);
+	}
+}
+
 void rblist__delete(struct rblist *rblist)
 {
 	if (rblist != NULL) {
-		struct rb_node *pos, *next = rb_first(&rblist->entries);
-
-		while (next) {
-			pos = next;
-			next = rb_next(pos);
-			rblist__remove_node(rblist, pos);
-		}
+		rblist__exit(rblist);
 		free(rblist);
 	}
 }
diff --git a/tools/perf/util/rblist.h b/tools/perf/util/rblist.h
index 4c8638a22571..76df15c27f5f 100644
--- a/tools/perf/util/rblist.h
+++ b/tools/perf/util/rblist.h
@@ -29,6 +29,7 @@ struct rblist {
 };
 
 void rblist__init(struct rblist *rblist);
+void rblist__exit(struct rblist *rblist);
 void rblist__delete(struct rblist *rblist);
 int rblist__add_node(struct rblist *rblist, const void *new_entry);
 void rblist__remove_node(struct rblist *rblist, struct rb_node *rb_node);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 16/36] perf stat: Add rbtree node_delete op
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (14 preceding siblings ...)
  2017-12-06 14:40 ` [PATCH 15/36] perf rblist: Create rblist__exit() function Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 17/36] perf thread_map: Add method to map all threads in the system Arnaldo Carvalho de Melo
                   ` (19 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jin Yao, Alexander Shishkin,
	Andi Kleen, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Jin Yao <yao.jin@linux.intel.com>

In current stat-shadow.c, the rbtree deleting is ignored.

The patch adds the implementation to node_delete method of rblist.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512125856-22056-5-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/stat-shadow.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 855e35cbb1dc..57ec22513971 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -87,6 +87,16 @@ static struct rb_node *saved_value_new(struct rblist *rblist __maybe_unused,
 	return &nd->rb_node;
 }
 
+static void saved_value_delete(struct rblist *rblist __maybe_unused,
+			       struct rb_node *rb_node)
+{
+	struct saved_value *v;
+
+	BUG_ON(!rb_node);
+	v = container_of(rb_node, struct saved_value, rb_node);
+	free(v);
+}
+
 static struct saved_value *saved_value_lookup(struct perf_evsel *evsel,
 					      int cpu,
 					      bool create)
@@ -114,7 +124,7 @@ void perf_stat__init_shadow_stats(void)
 	rblist__init(&runtime_saved_values);
 	runtime_saved_values.node_cmp = saved_value_cmp;
 	runtime_saved_values.node_new = saved_value_new;
-	/* No delete for now */
+	runtime_saved_values.node_delete = saved_value_delete;
 }
 
 static int evsel_context(struct perf_evsel *evsel)
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 17/36] perf thread_map: Add method to map all threads in the system
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (15 preceding siblings ...)
  2017-12-06 14:40 ` [PATCH 16/36] perf stat: Add rbtree node_delete op Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 18/36] perf s390: Always build with -fPIC Arnaldo Carvalho de Melo
                   ` (18 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Andi Kleen, Kan Liang, Peter Zijlstra

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Reusing the thread_map__new_by_uid() proc scanning already in place to
return a map with all threads in the system.

Based-on-a-patch-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/n/tip-khh28q0wwqbqtrk32bfe07hd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/thread_map.c | 22 ++++++++++++++++------
 tools/perf/util/thread_map.h |  1 +
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index be0d5a736dea..2b653853eec2 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -92,7 +92,7 @@ struct thread_map *thread_map__new_by_tid(pid_t tid)
 	return threads;
 }
 
-struct thread_map *thread_map__new_by_uid(uid_t uid)
+static struct thread_map *__thread_map__new_all_cpus(uid_t uid)
 {
 	DIR *proc;
 	int max_threads = 32, items, i;
@@ -113,7 +113,6 @@ struct thread_map *thread_map__new_by_uid(uid_t uid)
 	while ((dirent = readdir(proc)) != NULL) {
 		char *end;
 		bool grow = false;
-		struct stat st;
 		pid_t pid = strtol(dirent->d_name, &end, 10);
 
 		if (*end) /* only interested in proper numerical dirents */
@@ -121,11 +120,12 @@ struct thread_map *thread_map__new_by_uid(uid_t uid)
 
 		snprintf(path, sizeof(path), "/proc/%s", dirent->d_name);
 
-		if (stat(path, &st) != 0)
-			continue;
+		if (uid != UINT_MAX) {
+			struct stat st;
 
-		if (st.st_uid != uid)
-			continue;
+			if (stat(path, &st) != 0 || st.st_uid != uid)
+				continue;
+		}
 
 		snprintf(path, sizeof(path), "/proc/%d/task", pid);
 		items = scandir(path, &namelist, filter, NULL);
@@ -178,6 +178,16 @@ struct thread_map *thread_map__new_by_uid(uid_t uid)
 	goto out_closedir;
 }
 
+struct thread_map *thread_map__new_all_cpus(void)
+{
+	return __thread_map__new_all_cpus(UINT_MAX);
+}
+
+struct thread_map *thread_map__new_by_uid(uid_t uid)
+{
+	return __thread_map__new_all_cpus(uid);
+}
+
 struct thread_map *thread_map__new(pid_t pid, pid_t tid, uid_t uid)
 {
 	if (pid != -1)
diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index f15803985435..07a765fb22bb 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -23,6 +23,7 @@ struct thread_map *thread_map__new_dummy(void);
 struct thread_map *thread_map__new_by_pid(pid_t pid);
 struct thread_map *thread_map__new_by_tid(pid_t tid);
 struct thread_map *thread_map__new_by_uid(uid_t uid);
+struct thread_map *thread_map__new_all_cpus(void);
 struct thread_map *thread_map__new(pid_t pid, pid_t tid, uid_t uid);
 struct thread_map *thread_map__new_event(struct thread_map_event *event);
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 18/36] perf s390: Always build with -fPIC
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (16 preceding siblings ...)
  2017-12-06 14:40 ` [PATCH 17/36] perf thread_map: Add method to map all threads in the system Arnaldo Carvalho de Melo
@ 2017-12-06 14:40 ` Arnaldo Carvalho de Melo
  2017-12-07  8:09   ` Hendrik Brueckner
  2017-12-06 14:40   ` Arnaldo Carvalho de Melo
                   ` (17 subsequent siblings)
  35 siblings, 1 reply; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Hendrik Brueckner,
	Heiko Carstens, Martin Schwidefsky, Thomas Richter,
	linux s390 list, Arnaldo Carvalho de Melo

From: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>

On s390, object files must be compiled with position-indepedent code in
order to be incrementally linked or linked to shared libraries.
Therefore, add -fPIC to the CFLAGS for s390 to ensure each object file
is built properly.

Reported-by: Jonathan Hermann <jonathan.hermann@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux s390 list <linux-s390@vger.kernel.org>
LPU-Reference: 1512031765-9382-1-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-a8wga8hrl0d0r84cal96fmgv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Makefile.config | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 2c437baf8364..bf86c09ca889 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -41,6 +41,7 @@ ifeq ($(SRCARCH),x86)
     LIBUNWIND_LIBS = -lunwind-x86 -llzma -lunwind
   endif
   NO_PERF_REGS := 0
+  CFLAGS += -fPIC
 endif
 
 ifeq ($(SRCARCH),arm)
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 19/36] perf pmu: Pass pmu as a parameter to get_cpuid_str()
       [not found] <20171206144115.15097-1-acme@kernel.org>
@ 2017-12-06 14:40   ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 02/36] perf test: Disable test cases 19 and 20 on s390x Arnaldo Carvalho de Melo
                     ` (34 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Ganapatrao Kulkarni,
	Alexander Shishkin, Catalin Marinas, Jayachandran C,
	Jonathan Cameron, linux-arm-kernel, Mark Rutland, Peter Zijlstra,
	Robert Richter, Shaokun Zhang, Arnaldo Carvalho de Melo

From: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>

The cpuid string will not be same on all CPUs on heterogeneous platforms
like ARM's big.LITTLE, adding provision(using pmu->cpus) to find cpuid
string from associated CPUs of PMU CORE device.

Also optimise arguments to function pmu_add_cpu_aliases.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@cavium.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: http://lkml.kernel.org/r/20171016183222.25750-2-ganapatrao.kulkarni@cavium.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/powerpc/util/header.c |  2 +-
 tools/perf/arch/x86/util/header.c     |  2 +-
 tools/perf/util/header.h              |  3 ++-
 tools/perf/util/metricgroup.c         |  4 ++--
 tools/perf/util/pmu.c                 | 22 +++++++++++-----------
 tools/perf/util/pmu.h                 |  2 +-
 6 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/header.c b/tools/perf/arch/powerpc/util/header.c
index 7a4cf80c207a..0b242664f5ea 100644
--- a/tools/perf/arch/powerpc/util/header.c
+++ b/tools/perf/arch/powerpc/util/header.c
@@ -35,7 +35,7 @@ get_cpuid(char *buffer, size_t sz)
 }
 
 char *
-get_cpuid_str(void)
+get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
 {
 	char *bufp;
 
diff --git a/tools/perf/arch/x86/util/header.c b/tools/perf/arch/x86/util/header.c
index 33027c5e6f92..b626d2bad9f1 100644
--- a/tools/perf/arch/x86/util/header.c
+++ b/tools/perf/arch/x86/util/header.c
@@ -66,7 +66,7 @@ get_cpuid(char *buffer, size_t sz)
 }
 
 char *
-get_cpuid_str(void)
+get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
 {
 	char *buf = malloc(128);
 
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 91befc3b550d..317fb901e47f 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -9,6 +9,7 @@
 #include <linux/types.h>
 #include "event.h"
 #include "env.h"
+#include "pmu.h"
 
 enum {
 	HEADER_RESERVED		= 0,	/* always cleared */
@@ -171,5 +172,5 @@ int write_padded(struct feat_fd *fd, const void *bf,
  */
 int get_cpuid(char *buffer, size_t sz);
 
-char *get_cpuid_str(void);
+char *get_cpuid_str(struct perf_pmu *pmu __maybe_unused);
 #endif /* __PERF_HEADER_H */
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 6fd709017bbc..e48410c99b39 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -274,7 +274,7 @@ static void metricgroup__print_strlist(struct strlist *metrics, bool raw)
 void metricgroup__print(bool metrics, bool metricgroups, char *filter,
 			bool raw)
 {
-	struct pmu_events_map *map = perf_pmu__find_map();
+	struct pmu_events_map *map = perf_pmu__find_map(NULL);
 	struct pmu_event *pe;
 	int i;
 	struct rblist groups;
@@ -372,7 +372,7 @@ void metricgroup__print(bool metrics, bool metricgroups, char *filter,
 static int metricgroup__add_metric(const char *metric, struct strbuf *events,
 				   struct list_head *group_list)
 {
-	struct pmu_events_map *map = perf_pmu__find_map();
+	struct pmu_events_map *map = perf_pmu__find_map(NULL);
 	struct pmu_event *pe;
 	int ret = -EINVAL;
 	int i, j;
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 80fb1593913a..4e7dd3a0f123 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -542,12 +542,12 @@ static bool pmu_is_uncore(const char *name)
  * Each architecture should provide a more precise id string that
  * can be use to match the architecture's "mapfile".
  */
-char * __weak get_cpuid_str(void)
+char * __weak get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
 {
 	return NULL;
 }
 
-static char *perf_pmu__getcpuid(void)
+static char *perf_pmu__getcpuid(struct perf_pmu *pmu)
 {
 	char *cpuid;
 	static bool printed;
@@ -556,7 +556,7 @@ static char *perf_pmu__getcpuid(void)
 	if (cpuid)
 		cpuid = strdup(cpuid);
 	if (!cpuid)
-		cpuid = get_cpuid_str();
+		cpuid = get_cpuid_str(pmu);
 	if (!cpuid)
 		return NULL;
 
@@ -567,10 +567,10 @@ static char *perf_pmu__getcpuid(void)
 	return cpuid;
 }
 
-struct pmu_events_map *perf_pmu__find_map(void)
+struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu)
 {
 	struct pmu_events_map *map;
-	char *cpuid = perf_pmu__getcpuid();
+	char *cpuid = perf_pmu__getcpuid(pmu);
 	int i;
 
 	i = 0;
@@ -593,13 +593,14 @@ struct pmu_events_map *perf_pmu__find_map(void)
  * to the current running CPU. Then, add all PMU events from that table
  * as aliases.
  */
-static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
+static void pmu_add_cpu_aliases(struct list_head *head, struct perf_pmu *pmu)
 {
 	int i;
 	struct pmu_events_map *map;
 	struct pmu_event *pe;
+	const char *name = pmu->name;
 
-	map = perf_pmu__find_map();
+	map = perf_pmu__find_map(pmu);
 	if (!map)
 		return;
 
@@ -661,21 +662,20 @@ static struct perf_pmu *pmu_lookup(const char *name)
 	if (pmu_aliases(name, &aliases))
 		return NULL;
 
-	pmu_add_cpu_aliases(&aliases, name);
 	pmu = zalloc(sizeof(*pmu));
 	if (!pmu)
 		return NULL;
 
 	pmu->cpus = pmu_cpumask(name);
-
+	pmu->name = strdup(name);
+	pmu->type = type;
 	pmu->is_uncore = pmu_is_uncore(name);
+	pmu_add_cpu_aliases(&aliases, pmu);
 
 	INIT_LIST_HEAD(&pmu->format);
 	INIT_LIST_HEAD(&pmu->aliases);
 	list_splice(&format, &pmu->format);
 	list_splice(&aliases, &pmu->aliases);
-	pmu->name = strdup(name);
-	pmu->type = type;
 	list_add_tail(&pmu->list, &pmus);
 
 	pmu->default_config = perf_pmu__get_default_config(pmu);
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 27c75e635866..76fecec7b3f9 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -92,6 +92,6 @@ int perf_pmu__test(void);
 
 struct perf_event_attr *perf_pmu__get_default_config(struct perf_pmu *pmu);
 
-struct pmu_events_map *perf_pmu__find_map(void);
+struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu);
 
 #endif /* __PMU_H */
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 19/36] perf pmu: Pass pmu as a parameter to get_cpuid_str()
@ 2017-12-06 14:40   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: linux-arm-kernel

From: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>

The cpuid string will not be same on all CPUs on heterogeneous platforms
like ARM's big.LITTLE, adding provision(using pmu->cpus) to find cpuid
string from associated CPUs of PMU CORE device.

Also optimise arguments to function pmu_add_cpu_aliases.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: linux-arm-kernel at lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@cavium.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: http://lkml.kernel.org/r/20171016183222.25750-2-ganapatrao.kulkarni at cavium.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/powerpc/util/header.c |  2 +-
 tools/perf/arch/x86/util/header.c     |  2 +-
 tools/perf/util/header.h              |  3 ++-
 tools/perf/util/metricgroup.c         |  4 ++--
 tools/perf/util/pmu.c                 | 22 +++++++++++-----------
 tools/perf/util/pmu.h                 |  2 +-
 6 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/header.c b/tools/perf/arch/powerpc/util/header.c
index 7a4cf80c207a..0b242664f5ea 100644
--- a/tools/perf/arch/powerpc/util/header.c
+++ b/tools/perf/arch/powerpc/util/header.c
@@ -35,7 +35,7 @@ get_cpuid(char *buffer, size_t sz)
 }
 
 char *
-get_cpuid_str(void)
+get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
 {
 	char *bufp;
 
diff --git a/tools/perf/arch/x86/util/header.c b/tools/perf/arch/x86/util/header.c
index 33027c5e6f92..b626d2bad9f1 100644
--- a/tools/perf/arch/x86/util/header.c
+++ b/tools/perf/arch/x86/util/header.c
@@ -66,7 +66,7 @@ get_cpuid(char *buffer, size_t sz)
 }
 
 char *
-get_cpuid_str(void)
+get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
 {
 	char *buf = malloc(128);
 
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 91befc3b550d..317fb901e47f 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -9,6 +9,7 @@
 #include <linux/types.h>
 #include "event.h"
 #include "env.h"
+#include "pmu.h"
 
 enum {
 	HEADER_RESERVED		= 0,	/* always cleared */
@@ -171,5 +172,5 @@ int write_padded(struct feat_fd *fd, const void *bf,
  */
 int get_cpuid(char *buffer, size_t sz);
 
-char *get_cpuid_str(void);
+char *get_cpuid_str(struct perf_pmu *pmu __maybe_unused);
 #endif /* __PERF_HEADER_H */
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 6fd709017bbc..e48410c99b39 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -274,7 +274,7 @@ static void metricgroup__print_strlist(struct strlist *metrics, bool raw)
 void metricgroup__print(bool metrics, bool metricgroups, char *filter,
 			bool raw)
 {
-	struct pmu_events_map *map = perf_pmu__find_map();
+	struct pmu_events_map *map = perf_pmu__find_map(NULL);
 	struct pmu_event *pe;
 	int i;
 	struct rblist groups;
@@ -372,7 +372,7 @@ void metricgroup__print(bool metrics, bool metricgroups, char *filter,
 static int metricgroup__add_metric(const char *metric, struct strbuf *events,
 				   struct list_head *group_list)
 {
-	struct pmu_events_map *map = perf_pmu__find_map();
+	struct pmu_events_map *map = perf_pmu__find_map(NULL);
 	struct pmu_event *pe;
 	int ret = -EINVAL;
 	int i, j;
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 80fb1593913a..4e7dd3a0f123 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -542,12 +542,12 @@ static bool pmu_is_uncore(const char *name)
  * Each architecture should provide a more precise id string that
  * can be use to match the architecture's "mapfile".
  */
-char * __weak get_cpuid_str(void)
+char * __weak get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
 {
 	return NULL;
 }
 
-static char *perf_pmu__getcpuid(void)
+static char *perf_pmu__getcpuid(struct perf_pmu *pmu)
 {
 	char *cpuid;
 	static bool printed;
@@ -556,7 +556,7 @@ static char *perf_pmu__getcpuid(void)
 	if (cpuid)
 		cpuid = strdup(cpuid);
 	if (!cpuid)
-		cpuid = get_cpuid_str();
+		cpuid = get_cpuid_str(pmu);
 	if (!cpuid)
 		return NULL;
 
@@ -567,10 +567,10 @@ static char *perf_pmu__getcpuid(void)
 	return cpuid;
 }
 
-struct pmu_events_map *perf_pmu__find_map(void)
+struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu)
 {
 	struct pmu_events_map *map;
-	char *cpuid = perf_pmu__getcpuid();
+	char *cpuid = perf_pmu__getcpuid(pmu);
 	int i;
 
 	i = 0;
@@ -593,13 +593,14 @@ struct pmu_events_map *perf_pmu__find_map(void)
  * to the current running CPU. Then, add all PMU events from that table
  * as aliases.
  */
-static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
+static void pmu_add_cpu_aliases(struct list_head *head, struct perf_pmu *pmu)
 {
 	int i;
 	struct pmu_events_map *map;
 	struct pmu_event *pe;
+	const char *name = pmu->name;
 
-	map = perf_pmu__find_map();
+	map = perf_pmu__find_map(pmu);
 	if (!map)
 		return;
 
@@ -661,21 +662,20 @@ static struct perf_pmu *pmu_lookup(const char *name)
 	if (pmu_aliases(name, &aliases))
 		return NULL;
 
-	pmu_add_cpu_aliases(&aliases, name);
 	pmu = zalloc(sizeof(*pmu));
 	if (!pmu)
 		return NULL;
 
 	pmu->cpus = pmu_cpumask(name);
-
+	pmu->name = strdup(name);
+	pmu->type = type;
 	pmu->is_uncore = pmu_is_uncore(name);
+	pmu_add_cpu_aliases(&aliases, pmu);
 
 	INIT_LIST_HEAD(&pmu->format);
 	INIT_LIST_HEAD(&pmu->aliases);
 	list_splice(&format, &pmu->format);
 	list_splice(&aliases, &pmu->aliases);
-	pmu->name = strdup(name);
-	pmu->type = type;
 	list_add_tail(&pmu->list, &pmus);
 
 	pmu->default_config = perf_pmu__get_default_config(pmu);
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 27c75e635866..76fecec7b3f9 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -92,6 +92,6 @@ int perf_pmu__test(void);
 
 struct perf_event_attr *perf_pmu__get_default_config(struct perf_pmu *pmu);
 
-struct pmu_events_map *perf_pmu__find_map(void);
+struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu);
 
 #endif /* __PMU_H */
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 20/36] perf tools arm64: Add support for get_cpuid_str function.
       [not found] <20171206144115.15097-1-acme@kernel.org>
  2017-12-06 14:40 ` [PATCH 01/36] tools headers: Follow the upstream UAPI header version 100% differ from the kernel Arnaldo Carvalho de Melo
@ 2017-12-06 14:40   ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 03/36] perf record: Synthesize unit/scale/... in event update Arnaldo Carvalho de Melo
                     ` (33 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Ganapatrao Kulkarni,
	Alexander Shishkin, Catalin Marinas, Ganapatrao Kulkarni,
	Jayachandran C, Jonathan Cameron, Mark Rutland, Peter Zijlstra,
	Robert Richter, Shaokun Zhang, linux-arm-kernel,
	Arnaldo Carvalho de Melo

From: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>

The get_cpuid_str function returns the MIDR string of the first online
cpu from the range of cpus associated with the PMU CORE device.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ganapatrao Kulkarni <gklkml16@gmail.com>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@cavium.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20171016183222.25750-3-ganapatrao.kulkarni@cavium.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/arm64/util/Build    |  1 +
 tools/perf/arch/arm64/util/header.c | 65 +++++++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+)
 create mode 100644 tools/perf/arch/arm64/util/header.c

diff --git a/tools/perf/arch/arm64/util/Build b/tools/perf/arch/arm64/util/Build
index cef6fb38d17e..b1ab72d2a42e 100644
--- a/tools/perf/arch/arm64/util/Build
+++ b/tools/perf/arch/arm64/util/Build
@@ -1,3 +1,4 @@
+libperf-y += header.o
 libperf-$(CONFIG_DWARF)     += dwarf-regs.o
 libperf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
 
diff --git a/tools/perf/arch/arm64/util/header.c b/tools/perf/arch/arm64/util/header.c
new file mode 100644
index 000000000000..534cd2507d83
--- /dev/null
+++ b/tools/perf/arch/arm64/util/header.c
@@ -0,0 +1,65 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <api/fs/fs.h>
+#include "header.h"
+
+#define MIDR "/regs/identification/midr_el1"
+#define MIDR_SIZE 19
+#define MIDR_REVISION_MASK      0xf
+#define MIDR_VARIANT_SHIFT      20
+#define MIDR_VARIANT_MASK       (0xf << MIDR_VARIANT_SHIFT)
+
+char *get_cpuid_str(struct perf_pmu *pmu)
+{
+	char *buf = NULL;
+	char path[PATH_MAX];
+	const char *sysfs = sysfs__mountpoint();
+	int cpu;
+	u64 midr = 0;
+	struct cpu_map *cpus;
+	FILE *file;
+
+	if (!sysfs || !pmu || !pmu->cpus)
+		return NULL;
+
+	buf = malloc(MIDR_SIZE);
+	if (!buf)
+		return NULL;
+
+	/* read midr from list of cpus mapped to this pmu */
+	cpus = cpu_map__get(pmu->cpus);
+	for (cpu = 0; cpu < cpus->nr; cpu++) {
+		scnprintf(path, PATH_MAX, "%s/devices/system/cpu/cpu%d"MIDR,
+				sysfs, cpus->map[cpu]);
+
+		file = fopen(path, "r");
+		if (!file) {
+			pr_debug("fopen failed for file %s\n", path);
+			continue;
+		}
+
+		if (!fgets(buf, MIDR_SIZE, file)) {
+			fclose(file);
+			continue;
+		}
+		fclose(file);
+
+		/* Ignore/clear Variant[23:20] and
+		 * Revision[3:0] of MIDR
+		 */
+		midr = strtoul(buf, NULL, 16);
+		midr &= (~(MIDR_VARIANT_MASK | MIDR_REVISION_MASK));
+		scnprintf(buf, MIDR_SIZE, "0x%016lx", midr);
+		/* got midr break loop */
+		break;
+	}
+
+	if (!midr) {
+		pr_err("failed to get cpuid string for PMU %s\n", pmu->name);
+		free(buf);
+		buf = NULL;
+	}
+
+	cpu_map__put(cpus);
+	return buf;
+}
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 20/36] perf tools arm64: Add support for get_cpuid_str function.
@ 2017-12-06 14:40   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mark Rutland, Shaokun Zhang, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Catalin Marinas, Jonathan Cameron,
	linux-kernel, linux-perf-users, Peter Zijlstra,
	Ganapatrao Kulkarni, Jayachandran C, Ganapatrao Kulkarni,
	Robert Richter, linux-arm-kernel

From: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>

The get_cpuid_str function returns the MIDR string of the first online
cpu from the range of cpus associated with the PMU CORE device.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ganapatrao Kulkarni <gklkml16@gmail.com>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@cavium.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20171016183222.25750-3-ganapatrao.kulkarni@cavium.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/arm64/util/Build    |  1 +
 tools/perf/arch/arm64/util/header.c | 65 +++++++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+)
 create mode 100644 tools/perf/arch/arm64/util/header.c

diff --git a/tools/perf/arch/arm64/util/Build b/tools/perf/arch/arm64/util/Build
index cef6fb38d17e..b1ab72d2a42e 100644
--- a/tools/perf/arch/arm64/util/Build
+++ b/tools/perf/arch/arm64/util/Build
@@ -1,3 +1,4 @@
+libperf-y += header.o
 libperf-$(CONFIG_DWARF)     += dwarf-regs.o
 libperf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
 
diff --git a/tools/perf/arch/arm64/util/header.c b/tools/perf/arch/arm64/util/header.c
new file mode 100644
index 000000000000..534cd2507d83
--- /dev/null
+++ b/tools/perf/arch/arm64/util/header.c
@@ -0,0 +1,65 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <api/fs/fs.h>
+#include "header.h"
+
+#define MIDR "/regs/identification/midr_el1"
+#define MIDR_SIZE 19
+#define MIDR_REVISION_MASK      0xf
+#define MIDR_VARIANT_SHIFT      20
+#define MIDR_VARIANT_MASK       (0xf << MIDR_VARIANT_SHIFT)
+
+char *get_cpuid_str(struct perf_pmu *pmu)
+{
+	char *buf = NULL;
+	char path[PATH_MAX];
+	const char *sysfs = sysfs__mountpoint();
+	int cpu;
+	u64 midr = 0;
+	struct cpu_map *cpus;
+	FILE *file;
+
+	if (!sysfs || !pmu || !pmu->cpus)
+		return NULL;
+
+	buf = malloc(MIDR_SIZE);
+	if (!buf)
+		return NULL;
+
+	/* read midr from list of cpus mapped to this pmu */
+	cpus = cpu_map__get(pmu->cpus);
+	for (cpu = 0; cpu < cpus->nr; cpu++) {
+		scnprintf(path, PATH_MAX, "%s/devices/system/cpu/cpu%d"MIDR,
+				sysfs, cpus->map[cpu]);
+
+		file = fopen(path, "r");
+		if (!file) {
+			pr_debug("fopen failed for file %s\n", path);
+			continue;
+		}
+
+		if (!fgets(buf, MIDR_SIZE, file)) {
+			fclose(file);
+			continue;
+		}
+		fclose(file);
+
+		/* Ignore/clear Variant[23:20] and
+		 * Revision[3:0] of MIDR
+		 */
+		midr = strtoul(buf, NULL, 16);
+		midr &= (~(MIDR_VARIANT_MASK | MIDR_REVISION_MASK));
+		scnprintf(buf, MIDR_SIZE, "0x%016lx", midr);
+		/* got midr break loop */
+		break;
+	}
+
+	if (!midr) {
+		pr_err("failed to get cpuid string for PMU %s\n", pmu->name);
+		free(buf);
+		buf = NULL;
+	}
+
+	cpu_map__put(cpus);
+	return buf;
+}
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 20/36] perf tools arm64: Add support for get_cpuid_str function.
@ 2017-12-06 14:40   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:40 UTC (permalink / raw)
  To: linux-arm-kernel

From: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>

The get_cpuid_str function returns the MIDR string of the first online
cpu from the range of cpus associated with the PMU CORE device.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ganapatrao Kulkarni <gklkml16@gmail.com>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@cavium.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linux-arm-kernel at lists.infradead.org
Link: http://lkml.kernel.org/r/20171016183222.25750-3-ganapatrao.kulkarni at cavium.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/arm64/util/Build    |  1 +
 tools/perf/arch/arm64/util/header.c | 65 +++++++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+)
 create mode 100644 tools/perf/arch/arm64/util/header.c

diff --git a/tools/perf/arch/arm64/util/Build b/tools/perf/arch/arm64/util/Build
index cef6fb38d17e..b1ab72d2a42e 100644
--- a/tools/perf/arch/arm64/util/Build
+++ b/tools/perf/arch/arm64/util/Build
@@ -1,3 +1,4 @@
+libperf-y += header.o
 libperf-$(CONFIG_DWARF)     += dwarf-regs.o
 libperf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
 
diff --git a/tools/perf/arch/arm64/util/header.c b/tools/perf/arch/arm64/util/header.c
new file mode 100644
index 000000000000..534cd2507d83
--- /dev/null
+++ b/tools/perf/arch/arm64/util/header.c
@@ -0,0 +1,65 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <api/fs/fs.h>
+#include "header.h"
+
+#define MIDR "/regs/identification/midr_el1"
+#define MIDR_SIZE 19
+#define MIDR_REVISION_MASK      0xf
+#define MIDR_VARIANT_SHIFT      20
+#define MIDR_VARIANT_MASK       (0xf << MIDR_VARIANT_SHIFT)
+
+char *get_cpuid_str(struct perf_pmu *pmu)
+{
+	char *buf = NULL;
+	char path[PATH_MAX];
+	const char *sysfs = sysfs__mountpoint();
+	int cpu;
+	u64 midr = 0;
+	struct cpu_map *cpus;
+	FILE *file;
+
+	if (!sysfs || !pmu || !pmu->cpus)
+		return NULL;
+
+	buf = malloc(MIDR_SIZE);
+	if (!buf)
+		return NULL;
+
+	/* read midr from list of cpus mapped to this pmu */
+	cpus = cpu_map__get(pmu->cpus);
+	for (cpu = 0; cpu < cpus->nr; cpu++) {
+		scnprintf(path, PATH_MAX, "%s/devices/system/cpu/cpu%d"MIDR,
+				sysfs, cpus->map[cpu]);
+
+		file = fopen(path, "r");
+		if (!file) {
+			pr_debug("fopen failed for file %s\n", path);
+			continue;
+		}
+
+		if (!fgets(buf, MIDR_SIZE, file)) {
+			fclose(file);
+			continue;
+		}
+		fclose(file);
+
+		/* Ignore/clear Variant[23:20] and
+		 * Revision[3:0] of MIDR
+		 */
+		midr = strtoul(buf, NULL, 16);
+		midr &= (~(MIDR_VARIANT_MASK | MIDR_REVISION_MASK));
+		scnprintf(buf, MIDR_SIZE, "0x%016lx", midr);
+		/* got midr break loop */
+		break;
+	}
+
+	if (!midr) {
+		pr_err("failed to get cpuid string for PMU %s\n", pmu->name);
+		free(buf);
+		buf = NULL;
+	}
+
+	cpu_map__put(cpus);
+	return buf;
+}
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 21/36] perf pmu: Add helper function is_pmu_core to detect PMU CORE devices
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (19 preceding siblings ...)
  2017-12-06 14:40   ` Arnaldo Carvalho de Melo
@ 2017-12-06 14:41 ` Arnaldo Carvalho de Melo
  2017-12-06 14:41   ` Arnaldo Carvalho de Melo
                   ` (14 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Ganapatrao Kulkarni,
	Arnaldo Carvalho de Melo

From: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>

On some platforms, PMU core devices sysfs name is not cpu.
Adding function is_pmu_core to detect PMU core devices using
core device specific hints in sysfs.

For arm64 platforms, all core devices have file "cpus" in sysfs.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Tested-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Tested-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Link: https://lkml.kernel.org/n/tip-y1woxt1k2pqqwpprhonnft2s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/pmu.c | 39 +++++++++++++++++++++++++++++++++++----
 1 file changed, 35 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 4e7dd3a0f123..732ff579ec65 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -537,6 +537,34 @@ static bool pmu_is_uncore(const char *name)
 }
 
 /*
+ *  PMU CORE devices have different name other than cpu in sysfs on some
+ *  platforms. looking for possible sysfs files to identify as core device.
+ */
+static int is_pmu_core(const char *name)
+{
+	struct stat st;
+	char path[PATH_MAX];
+	const char *sysfs = sysfs__mountpoint();
+
+	if (!sysfs)
+		return 0;
+
+	/* Look for cpu sysfs (x86 and others) */
+	scnprintf(path, PATH_MAX, "%s/bus/event_source/devices/cpu", sysfs);
+	if ((stat(path, &st) == 0) &&
+			(strncmp(name, "cpu", strlen("cpu")) == 0))
+		return 1;
+
+	/* Look for cpu sysfs (specific to arm) */
+	scnprintf(path, PATH_MAX, "%s/bus/event_source/devices/%s/cpus",
+				sysfs, name);
+	if (stat(path, &st) == 0)
+		return 1;
+
+	return 0;
+}
+
+/*
  * Return the CPU id as a raw string.
  *
  * Each architecture should provide a more precise id string that
@@ -609,7 +637,6 @@ static void pmu_add_cpu_aliases(struct list_head *head, struct perf_pmu *pmu)
 	 */
 	i = 0;
 	while (1) {
-		const char *pname;
 
 		pe = &map->table[i++];
 		if (!pe->name) {
@@ -618,9 +645,13 @@ static void pmu_add_cpu_aliases(struct list_head *head, struct perf_pmu *pmu)
 			break;
 		}
 
-		pname = pe->pmu ? pe->pmu : "cpu";
-		if (strncmp(pname, name, strlen(pname)))
-			continue;
+		if (!is_pmu_core(name)) {
+			/* check for uncore devices */
+			if (pe->pmu == NULL)
+				continue;
+			if (strncmp(pe->pmu, name, strlen(pe->pmu)))
+				continue;
+		}
 
 		/* need type casts to override 'const' */
 		__perf_pmu__new_alias(head, NULL, (char *)pe->name,
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 22/36] perf vendor events arm64: Add ThunderX2 implementation defined pmu core events
       [not found] <20171206144115.15097-1-acme@kernel.org>
@ 2017-12-06 14:41   ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 02/36] perf test: Disable test cases 19 and 20 on s390x Arnaldo Carvalho de Melo
                     ` (34 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Ganapatrao Kulkarni,
	Alexander Shishkin, Catalin Marinas, Ganapatrao Kulkarni,
	Jayachandran C, Jonathan Cameron, Mark Rutland, Peter Zijlstra,
	Robert Richter, Shaokun Zhang, linux-arm-kernel,
	Arnaldo Carvalho de Melo

From: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>

This is not a full event list, but a short list of useful events.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ganapatrao Kulkarni <gklkml16@gmail.com>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@cavium.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20171016183222.25750-5-ganapatrao.kulkarni@cavium.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../arch/arm64/cavium/thunderx2-imp-def.json       | 62 ++++++++++++++++++++++
 tools/perf/pmu-events/arch/arm64/mapfile.csv       | 15 ++++++
 2 files changed, 77 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/arm64/cavium/thunderx2-imp-def.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/mapfile.csv

diff --git a/tools/perf/pmu-events/arch/arm64/cavium/thunderx2-imp-def.json b/tools/perf/pmu-events/arch/arm64/cavium/thunderx2-imp-def.json
new file mode 100644
index 000000000000..2db45c40ebc7
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/cavium/thunderx2-imp-def.json
@@ -0,0 +1,62 @@
+[
+    {
+        "PublicDescription": "Attributable Level 1 data cache access, read",
+        "EventCode": "0x40",
+        "EventName": "l1d_cache_rd",
+        "BriefDescription": "L1D cache read",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data cache access, write ",
+        "EventCode": "0x41",
+        "EventName": "l1d_cache_wr",
+        "BriefDescription": "L1D cache write",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data cache refill, read",
+        "EventCode": "0x42",
+        "EventName": "l1d_cache_refill_rd",
+        "BriefDescription": "L1D cache refill read",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data cache refill, write",
+        "EventCode": "0x43",
+        "EventName": "l1d_cache_refill_wr",
+        "BriefDescription": "L1D refill write",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data TLB refill, read",
+        "EventCode": "0x4C",
+        "EventName": "l1d_tlb_refill_rd",
+        "BriefDescription": "L1D tlb refill read",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data TLB refill, write",
+        "EventCode": "0x4D",
+        "EventName": "l1d_tlb_refill_wr",
+        "BriefDescription": "L1D tlb refill write",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data or unified TLB access, read",
+        "EventCode": "0x4E",
+        "EventName": "l1d_tlb_rd",
+        "BriefDescription": "L1D tlb read",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data or unified TLB access, write",
+        "EventCode": "0x4F",
+        "EventName": "l1d_tlb_wr",
+        "BriefDescription": "L1D tlb write",
+    },
+    {
+        "PublicDescription": "Bus access read",
+        "EventCode": "0x60",
+        "EventName": "bus_access_rd",
+        "BriefDescription": "Bus access read",
+   },
+   {
+        "PublicDescription": "Bus access write",
+        "EventCode": "0x61",
+        "EventName": "bus_access_wr",
+        "BriefDescription": "Bus access write",
+   }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/mapfile.csv b/tools/perf/pmu-events/arch/arm64/mapfile.csv
new file mode 100644
index 000000000000..219d6756134e
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/mapfile.csv
@@ -0,0 +1,15 @@
+# Format:
+#	MIDR,Version,JSON/file/pathname,Type
+#
+# where
+#	MIDR	Processor version
+#		Variant[23:20] and Revision [3:0] should be zero.
+#	Version could be used to track version of of JSON file
+#		but currently unused.
+#	JSON/file/pathname is the path to JSON file, relative
+#		to tools/perf/pmu-events/arch/arm64/.
+#	Type is core, uncore etc
+#
+#
+#Family-model,Version,Filename,EventType
+0x00000000420f5160,v1,cavium,core
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 22/36] perf vendor events arm64: Add ThunderX2 implementation defined pmu core events
@ 2017-12-06 14:41   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: linux-arm-kernel

From: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>

This is not a full event list, but a short list of useful events.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ganapatrao Kulkarni <gklkml16@gmail.com>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@cavium.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linux-arm-kernel at lists.infradead.org
Link: http://lkml.kernel.org/r/20171016183222.25750-5-ganapatrao.kulkarni at cavium.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../arch/arm64/cavium/thunderx2-imp-def.json       | 62 ++++++++++++++++++++++
 tools/perf/pmu-events/arch/arm64/mapfile.csv       | 15 ++++++
 2 files changed, 77 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/arm64/cavium/thunderx2-imp-def.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/mapfile.csv

diff --git a/tools/perf/pmu-events/arch/arm64/cavium/thunderx2-imp-def.json b/tools/perf/pmu-events/arch/arm64/cavium/thunderx2-imp-def.json
new file mode 100644
index 000000000000..2db45c40ebc7
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/cavium/thunderx2-imp-def.json
@@ -0,0 +1,62 @@
+[
+    {
+        "PublicDescription": "Attributable Level 1 data cache access, read",
+        "EventCode": "0x40",
+        "EventName": "l1d_cache_rd",
+        "BriefDescription": "L1D cache read",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data cache access, write ",
+        "EventCode": "0x41",
+        "EventName": "l1d_cache_wr",
+        "BriefDescription": "L1D cache write",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data cache refill, read",
+        "EventCode": "0x42",
+        "EventName": "l1d_cache_refill_rd",
+        "BriefDescription": "L1D cache refill read",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data cache refill, write",
+        "EventCode": "0x43",
+        "EventName": "l1d_cache_refill_wr",
+        "BriefDescription": "L1D refill write",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data TLB refill, read",
+        "EventCode": "0x4C",
+        "EventName": "l1d_tlb_refill_rd",
+        "BriefDescription": "L1D tlb refill read",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data TLB refill, write",
+        "EventCode": "0x4D",
+        "EventName": "l1d_tlb_refill_wr",
+        "BriefDescription": "L1D tlb refill write",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data or unified TLB access, read",
+        "EventCode": "0x4E",
+        "EventName": "l1d_tlb_rd",
+        "BriefDescription": "L1D tlb read",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data or unified TLB access, write",
+        "EventCode": "0x4F",
+        "EventName": "l1d_tlb_wr",
+        "BriefDescription": "L1D tlb write",
+    },
+    {
+        "PublicDescription": "Bus access read",
+        "EventCode": "0x60",
+        "EventName": "bus_access_rd",
+        "BriefDescription": "Bus access read",
+   },
+   {
+        "PublicDescription": "Bus access write",
+        "EventCode": "0x61",
+        "EventName": "bus_access_wr",
+        "BriefDescription": "Bus access write",
+   }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/mapfile.csv b/tools/perf/pmu-events/arch/arm64/mapfile.csv
new file mode 100644
index 000000000000..219d6756134e
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/mapfile.csv
@@ -0,0 +1,15 @@
+# Format:
+#	MIDR,Version,JSON/file/pathname,Type
+#
+# where
+#	MIDR	Processor version
+#		Variant[23:20] and Revision [3:0] should be zero.
+#	Version could be used to track version of of JSON file
+#		but currently unused.
+#	JSON/file/pathname is the path to JSON file, relative
+#		to tools/perf/pmu-events/arch/arm64/.
+#	Type is core, uncore etc
+#
+#
+#Family-model,Version,Filename,EventType
+0x00000000420f5160,v1,cavium,core
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 23/36] perf pmu: Add check for valid cpuid in perf_pmu__find_map()
       [not found] <20171206144115.15097-1-acme@kernel.org>
@ 2017-12-06 14:41   ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 02/36] perf test: Disable test cases 19 and 20 on s390x Arnaldo Carvalho de Melo
                     ` (34 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Ganapatrao Kulkarni,
	Alexander Shishkin, Catalin Marinas, Ganapatrao Kulkarni,
	Jayachandran C, Jonathan Cameron, Mark Rutland, Peter Zijlstra,
	Robert Richter, Shaokun Zhang, Will Deacon, linux-arm-kernel,
	Arnaldo Carvalho de Melo

From: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>

On some platforms(arm/arm64) which uses cpus map to get corresponding
cpuid string, cpuid can be NULL for PMUs other than CORE PMUs.  Adding
check for NULL cpuid in function perf_pmu__find_map to avoid
segmentation fault.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ganapatrao Kulkarni <gklkml16@gmail.com>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@cavium.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20171016183222.25750-6-ganapatrao.kulkarni@cavium.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/pmu.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 732ff579ec65..8b7c151579c0 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -601,6 +601,12 @@ struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu)
 	char *cpuid = perf_pmu__getcpuid(pmu);
 	int i;
 
+	/* on some platforms which uses cpus map, cpuid can be NULL for
+	 * PMUs other than CORE PMUs.
+	 */
+	if (!cpuid)
+		return NULL;
+
 	i = 0;
 	for (;;) {
 		map = &pmu_events_map[i++];
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 23/36] perf pmu: Add check for valid cpuid in perf_pmu__find_map()
@ 2017-12-06 14:41   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: linux-arm-kernel

From: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>

On some platforms(arm/arm64) which uses cpus map to get corresponding
cpuid string, cpuid can be NULL for PMUs other than CORE PMUs.  Adding
check for NULL cpuid in function perf_pmu__find_map to avoid
segmentation fault.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ganapatrao Kulkarni <gklkml16@gmail.com>
Cc: Jayachandran C <jnair@caviumnetworks.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@cavium.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel at lists.infradead.org
Link: http://lkml.kernel.org/r/20171016183222.25750-6-ganapatrao.kulkarni at cavium.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/pmu.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 732ff579ec65..8b7c151579c0 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -601,6 +601,12 @@ struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu)
 	char *cpuid = perf_pmu__getcpuid(pmu);
 	int i;
 
+	/* on some platforms which uses cpus map, cpuid can be NULL for
+	 * PMUs other than CORE PMUs.
+	 */
+	if (!cpuid)
+		return NULL;
+
 	i = 0;
 	for (;;) {
 		map = &pmu_events_map[i++];
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 24/36] perf tools: Fix up build in hardnened environments
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (22 preceding siblings ...)
  2017-12-06 14:41   ` Arnaldo Carvalho de Melo
@ 2017-12-06 14:41 ` Arnaldo Carvalho de Melo
  2017-12-06 14:41 ` [PATCH 25/36] perf evlist: Remove 'overwrite' parameter from perf_evlist__mmap Arnaldo Carvalho de Melo
                   ` (11 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Jiri Olsa, David Ahern,
	Namhyung Kim, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Jiri Olsa <jolsa@kernel.org>

On Fedora systems the perl and python CFLAGS/LDFLAGS include the
hardened specs from redhat-rpm-config package. We apply them only for
perl/python objects, which makes them not compatible with the rest of
the objects and the build fails with:

  /usr/bin/ld: perf-in.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -f
+PIC
  /usr/bin/ld: libperf.a(libperf-in.o): relocation R_X86_64_32S against `.text' can not be used when making a shared object; recompile w
+ith -fPIC
  /usr/bin/ld: final link failed: Nonrepresentable section on output
  collect2: error: ld returned 1 exit status
  make[2]: *** [Makefile.perf:507: perf] Error 1
  make[1]: *** [Makefile.perf:210: sub-make] Error 2
  make: *** [Makefile:69: all] Error 2

Mainly it's caused by perl/python objects being compiled with:

  -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1

which prevent the final link impossible, because it will check
for 'proper' objects with following option:

  -specs=/usr/lib/rpm/redhat/redhat-hardened-ld

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20171204082437.GC30564@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Makefile.config | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index bf86c09ca889..808066c823f7 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -185,9 +185,7 @@ ifdef PYTHON_CONFIG
   PYTHON_EMBED_LDFLAGS := $(call strip-libs,$(PYTHON_EMBED_LDOPTS))
   PYTHON_EMBED_LIBADD := $(call grep-libs,$(PYTHON_EMBED_LDOPTS)) -lutil
   PYTHON_EMBED_CCOPTS := $(shell $(PYTHON_CONFIG_SQ) --cflags 2>/dev/null)
-  ifeq ($(CC_NO_CLANG), 1)
-    PYTHON_EMBED_CCOPTS := $(filter-out -specs=%,$(PYTHON_EMBED_CCOPTS))
-  endif
+  PYTHON_EMBED_CCOPTS := $(filter-out -specs=%,$(PYTHON_EMBED_CCOPTS))
   FLAGS_PYTHON_EMBED := $(PYTHON_EMBED_CCOPTS) $(PYTHON_EMBED_LDOPTS)
 endif
 
@@ -577,7 +575,6 @@ ifndef NO_GTK2
   endif
 endif
 
-
 ifdef NO_LIBPERL
   CFLAGS += -DNO_LIBPERL
 else
@@ -585,6 +582,8 @@ else
   PERL_EMBED_LDFLAGS = $(call strip-libs,$(PERL_EMBED_LDOPTS))
   PERL_EMBED_LIBADD = $(call grep-libs,$(PERL_EMBED_LDOPTS))
   PERL_EMBED_CCOPTS = $(shell perl -MExtUtils::Embed -e ccopts 2>/dev/null)
+  PERL_EMBED_CCOPTS := $(filter-out -specs=%,$(PERL_EMBED_CCOPTS))
+  PERL_EMBED_LDOPTS := $(filter-out -specs=%,$(PERL_EMBED_LDOPTS))
   FLAGS_PERL_EMBED=$(PERL_EMBED_CCOPTS) $(PERL_EMBED_LDOPTS)
 
   ifneq ($(feature-libperl), 1)
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 25/36] perf evlist: Remove 'overwrite' parameter from perf_evlist__mmap
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (23 preceding siblings ...)
  2017-12-06 14:41 ` [PATCH 24/36] perf tools: Fix up build in hardnened environments Arnaldo Carvalho de Melo
@ 2017-12-06 14:41 ` Arnaldo Carvalho de Melo
  2017-12-06 14:41 ` [PATCH 26/36] perf evlist: Remove 'overwrite' parameter from perf_evlist__mmap_ex Arnaldo Carvalho de Melo
                   ` (10 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Wang Nan, Jiri Olsa, Kan Liang,
	Arnaldo Carvalho de Melo

From: Wang Nan <wangnan0@huawei.com>

Now all perf_evlist__mmap's users doesn't set 'overwrite'. Remove it
from arguments list.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/20171203020044.81680-2-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/x86/tests/perf-time-to-tsc.c | 2 +-
 tools/perf/builtin-kvm.c                     | 2 +-
 tools/perf/builtin-top.c                     | 2 +-
 tools/perf/builtin-trace.c                   | 2 +-
 tools/perf/tests/backward-ring-buffer.c      | 2 +-
 tools/perf/tests/bpf.c                       | 2 +-
 tools/perf/tests/code-reading.c              | 2 +-
 tools/perf/tests/keep-tracking.c             | 2 +-
 tools/perf/tests/mmap-basic.c                | 2 +-
 tools/perf/tests/openat-syscall-tp-fields.c  | 2 +-
 tools/perf/tests/perf-record.c               | 2 +-
 tools/perf/tests/sw-clock.c                  | 2 +-
 tools/perf/tests/switch-tracking.c           | 2 +-
 tools/perf/tests/task-exit.c                 | 2 +-
 tools/perf/util/evlist.c                     | 5 ++---
 tools/perf/util/evlist.h                     | 3 +--
 tools/perf/util/python.c                     | 2 +-
 17 files changed, 18 insertions(+), 20 deletions(-)

diff --git a/tools/perf/arch/x86/tests/perf-time-to-tsc.c b/tools/perf/arch/x86/tests/perf-time-to-tsc.c
index b59678e8c1e2..06abe8108b33 100644
--- a/tools/perf/arch/x86/tests/perf-time-to-tsc.c
+++ b/tools/perf/arch/x86/tests/perf-time-to-tsc.c
@@ -84,7 +84,7 @@ int test__perf_time_to_tsc(struct test *test __maybe_unused, int subtest __maybe
 
 	CHECK__(perf_evlist__open(evlist));
 
-	CHECK__(perf_evlist__mmap(evlist, UINT_MAX, false));
+	CHECK__(perf_evlist__mmap(evlist, UINT_MAX));
 
 	pc = evlist->mmap[0].base;
 	ret = perf_read_tsc_conversion(pc, &tc);
diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 597c7de9bec9..98853162eae9 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -1044,7 +1044,7 @@ static int kvm_live_open_events(struct perf_kvm_stat *kvm)
 		goto out;
 	}
 
-	if (perf_evlist__mmap(evlist, kvm->opts.mmap_pages, false) < 0) {
+	if (perf_evlist__mmap(evlist, kvm->opts.mmap_pages) < 0) {
 		ui__error("Failed to mmap the events: %s\n",
 			  str_error_r(errno, sbuf, sizeof(sbuf)));
 		perf_evlist__close(evlist);
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 0077724fb24f..540461f5e345 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -907,7 +907,7 @@ static int perf_top__start_counters(struct perf_top *top)
 		}
 	}
 
-	if (perf_evlist__mmap(evlist, opts->mmap_pages, false) < 0) {
+	if (perf_evlist__mmap(evlist, opts->mmap_pages) < 0) {
 		ui__error("Failed to mmap with %d (%s)\n",
 			    errno, str_error_r(errno, msg, sizeof(msg)));
 		goto out_err;
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 84debdbad327..7c57898095ea 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2437,7 +2437,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 	if (err < 0)
 		goto out_error_apply_filters;
 
-	err = perf_evlist__mmap(evlist, trace->opts.mmap_pages, false);
+	err = perf_evlist__mmap(evlist, trace->opts.mmap_pages);
 	if (err < 0)
 		goto out_error_mmap;
 
diff --git a/tools/perf/tests/backward-ring-buffer.c b/tools/perf/tests/backward-ring-buffer.c
index 43a8c6ac4070..cf37e43c42f3 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -59,7 +59,7 @@ static int do_test(struct perf_evlist *evlist, int mmap_pages,
 	int err;
 	char sbuf[STRERR_BUFSIZE];
 
-	err = perf_evlist__mmap(evlist, mmap_pages, false);
+	err = perf_evlist__mmap(evlist, mmap_pages);
 	if (err < 0) {
 		pr_debug("perf_evlist__mmap: %s\n",
 			 str_error_r(errno, sbuf, sizeof(sbuf)));
diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 34c22cdf4d5d..c433dd30975a 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -167,7 +167,7 @@ static int do_test(struct bpf_object *obj, int (*func)(void),
 		goto out_delete_evlist;
 	}
 
-	err = perf_evlist__mmap(evlist, opts.mmap_pages, false);
+	err = perf_evlist__mmap(evlist, opts.mmap_pages);
 	if (err < 0) {
 		pr_debug("perf_evlist__mmap: %s\n",
 			 str_error_r(errno, sbuf, sizeof(sbuf)));
diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c
index fcc8984bc329..3bf7b145b826 100644
--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -639,7 +639,7 @@ static int do_test_code_reading(bool try_kcore)
 		break;
 	}
 
-	ret = perf_evlist__mmap(evlist, UINT_MAX, false);
+	ret = perf_evlist__mmap(evlist, UINT_MAX);
 	if (ret < 0) {
 		pr_debug("perf_evlist__mmap failed\n");
 		goto out_put;
diff --git a/tools/perf/tests/keep-tracking.c b/tools/perf/tests/keep-tracking.c
index 842d33637a18..c46530918938 100644
--- a/tools/perf/tests/keep-tracking.c
+++ b/tools/perf/tests/keep-tracking.c
@@ -95,7 +95,7 @@ int test__keep_tracking(struct test *test __maybe_unused, int subtest __maybe_un
 		goto out_err;
 	}
 
-	CHECK__(perf_evlist__mmap(evlist, UINT_MAX, false));
+	CHECK__(perf_evlist__mmap(evlist, UINT_MAX));
 
 	/*
 	 * First, test that a 'comm' event can be found when the event is
diff --git a/tools/perf/tests/mmap-basic.c b/tools/perf/tests/mmap-basic.c
index 91f10d6d9ae2..c0e971da965c 100644
--- a/tools/perf/tests/mmap-basic.c
+++ b/tools/perf/tests/mmap-basic.c
@@ -94,7 +94,7 @@ int test__basic_mmap(struct test *test __maybe_unused, int subtest __maybe_unuse
 		expected_nr_events[i] = 1 + rand() % 127;
 	}
 
-	if (perf_evlist__mmap(evlist, 128, false) < 0) {
+	if (perf_evlist__mmap(evlist, 128) < 0) {
 		pr_debug("failed to mmap events: %d (%s)\n", errno,
 			 str_error_r(errno, sbuf, sizeof(sbuf)));
 		goto out_delete_evlist;
diff --git a/tools/perf/tests/openat-syscall-tp-fields.c b/tools/perf/tests/openat-syscall-tp-fields.c
index d9619d265314..97c9407d02a0 100644
--- a/tools/perf/tests/openat-syscall-tp-fields.c
+++ b/tools/perf/tests/openat-syscall-tp-fields.c
@@ -64,7 +64,7 @@ int test__syscall_openat_tp_fields(struct test *test __maybe_unused, int subtest
 		goto out_delete_evlist;
 	}
 
-	err = perf_evlist__mmap(evlist, UINT_MAX, false);
+	err = perf_evlist__mmap(evlist, UINT_MAX);
 	if (err < 0) {
 		pr_debug("perf_evlist__mmap: %s\n",
 			 str_error_r(errno, sbuf, sizeof(sbuf)));
diff --git a/tools/perf/tests/perf-record.c b/tools/perf/tests/perf-record.c
index c34904d37705..0afafab85238 100644
--- a/tools/perf/tests/perf-record.c
+++ b/tools/perf/tests/perf-record.c
@@ -141,7 +141,7 @@ int test__PERF_RECORD(struct test *test __maybe_unused, int subtest __maybe_unus
 	 * fds in the same CPU to be injected in the same mmap ring buffer
 	 * (using ioctl(PERF_EVENT_IOC_SET_OUTPUT)).
 	 */
-	err = perf_evlist__mmap(evlist, opts.mmap_pages, false);
+	err = perf_evlist__mmap(evlist, opts.mmap_pages);
 	if (err < 0) {
 		pr_debug("perf_evlist__mmap: %s\n",
 			 str_error_r(errno, sbuf, sizeof(sbuf)));
diff --git a/tools/perf/tests/sw-clock.c b/tools/perf/tests/sw-clock.c
index c6937ed12e6b..f6c72f915d48 100644
--- a/tools/perf/tests/sw-clock.c
+++ b/tools/perf/tests/sw-clock.c
@@ -78,7 +78,7 @@ static int __test__sw_clock_freq(enum perf_sw_ids clock_id)
 		goto out_delete_evlist;
 	}
 
-	err = perf_evlist__mmap(evlist, 128, false);
+	err = perf_evlist__mmap(evlist, 128);
 	if (err < 0) {
 		pr_debug("failed to mmap event: %d (%s)\n", errno,
 			 str_error_r(errno, sbuf, sizeof(sbuf)));
diff --git a/tools/perf/tests/switch-tracking.c b/tools/perf/tests/switch-tracking.c
index 7d3f4bf9534f..33e00295a972 100644
--- a/tools/perf/tests/switch-tracking.c
+++ b/tools/perf/tests/switch-tracking.c
@@ -449,7 +449,7 @@ int test__switch_tracking(struct test *test __maybe_unused, int subtest __maybe_
 		goto out;
 	}
 
-	err = perf_evlist__mmap(evlist, UINT_MAX, false);
+	err = perf_evlist__mmap(evlist, UINT_MAX);
 	if (err) {
 		pr_debug("perf_evlist__mmap failed!\n");
 		goto out_err;
diff --git a/tools/perf/tests/task-exit.c b/tools/perf/tests/task-exit.c
index 5d06ac81f7f1..01b62b81751b 100644
--- a/tools/perf/tests/task-exit.c
+++ b/tools/perf/tests/task-exit.c
@@ -101,7 +101,7 @@ int test__task_exit(struct test *test __maybe_unused, int subtest __maybe_unused
 		goto out_delete_evlist;
 	}
 
-	if (perf_evlist__mmap(evlist, 128, false) < 0) {
+	if (perf_evlist__mmap(evlist, 128) < 0) {
 		pr_debug("failed to mmap events: %d (%s)\n", errno,
 			 str_error_r(errno, sbuf, sizeof(sbuf)));
 		goto out_delete_evlist;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 199bb82efbcd..3c1778b500e0 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1091,10 +1091,9 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	return perf_evlist__mmap_per_cpu(evlist, &mp);
 }
 
-int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
-		      bool overwrite)
+int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages)
 {
-	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
+	return perf_evlist__mmap_ex(evlist, pages, false, 0, false);
 }
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 4e8131dacbd7..f0f2c8b2504b 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -171,8 +171,7 @@ unsigned long perf_event_mlock_kb_in_pages(void);
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 			 bool overwrite, unsigned int auxtrace_pages,
 			 bool auxtrace_overwrite);
-int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
-		      bool overwrite);
+int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages);
 void perf_evlist__munmap(struct perf_evlist *evlist);
 
 size_t perf_evlist__mmap_size(unsigned long pages);
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index 8e49d9cafcfc..b1e999bd21ef 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -864,7 +864,7 @@ static PyObject *pyrf_evlist__mmap(struct pyrf_evlist *pevlist,
 					 &pages, &overwrite))
 		return NULL;
 
-	if (perf_evlist__mmap(evlist, pages, overwrite) < 0) {
+	if (perf_evlist__mmap(evlist, pages) < 0) {
 		PyErr_SetFromErrno(PyExc_OSError);
 		return NULL;
 	}
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 26/36] perf evlist: Remove 'overwrite' parameter from perf_evlist__mmap_ex
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (24 preceding siblings ...)
  2017-12-06 14:41 ` [PATCH 25/36] perf evlist: Remove 'overwrite' parameter from perf_evlist__mmap Arnaldo Carvalho de Melo
@ 2017-12-06 14:41 ` Arnaldo Carvalho de Melo
  2017-12-06 14:41 ` [PATCH 27/36] perf evlist: Remove evlist->overwrite Arnaldo Carvalho de Melo
                   ` (9 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Wang Nan, Jiri Olsa, Kan Liang,
	Arnaldo Carvalho de Melo

From: Wang Nan <wangnan0@huawei.com>

All users of perf_evlist__mmap_ex set !overwrite. Remove it from its
arguments list.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/20171203020044.81680-3-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c | 2 +-
 tools/perf/util/evlist.c    | 8 ++++----
 tools/perf/util/evlist.h    | 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index e304bc47fe9b..08070f87d489 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -301,7 +301,7 @@ static int record__mmap_evlist(struct record *rec,
 	struct record_opts *opts = &rec->opts;
 	char msg[512];
 
-	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
+	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
 		if (errno == EPERM) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 3c1778b500e0..93272d932407 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1052,14 +1052,14 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  * Return: %0 on success, negative error code otherwise.
  */
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
+			 unsigned int auxtrace_pages,
 			 bool auxtrace_overwrite)
 {
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
 	struct mmap_params mp = {
-		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
+		.prot = PROT_READ | PROT_WRITE,
 	};
 
 	if (!evlist->mmap)
@@ -1070,7 +1070,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
-	evlist->overwrite = overwrite;
+	evlist->overwrite = false;
 	evlist->mmap_len = perf_evlist__mmap_size(pages);
 	pr_debug("mmap size %zuB\n", evlist->mmap_len);
 	mp.mask = evlist->mmap_len - page_size - 1;
@@ -1093,7 +1093,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages)
 {
-	return perf_evlist__mmap_ex(evlist, pages, false, 0, false);
+	return perf_evlist__mmap_ex(evlist, pages, 0, false);
 }
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index f0f2c8b2504b..424a3d6015af 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -169,7 +169,7 @@ int perf_evlist__parse_mmap_pages(const struct option *opt,
 unsigned long perf_event_mlock_kb_in_pages(void);
 
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
+			 unsigned int auxtrace_pages,
 			 bool auxtrace_overwrite);
 int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages);
 void perf_evlist__munmap(struct perf_evlist *evlist);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 27/36] perf evlist: Remove evlist->overwrite
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (25 preceding siblings ...)
  2017-12-06 14:41 ` [PATCH 26/36] perf evlist: Remove 'overwrite' parameter from perf_evlist__mmap_ex Arnaldo Carvalho de Melo
@ 2017-12-06 14:41 ` Arnaldo Carvalho de Melo
  2017-12-06 14:41 ` [PATCH 28/36] perf mmap: Remove overwrite from arguments list of perf_mmap__push Arnaldo Carvalho de Melo
                   ` (8 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Wang Nan, Jiri Olsa, Kan Liang,
	Arnaldo Carvalho de Melo

From: Wang Nan <wangnan0@huawei.com>

evlist->overwrite is set to false in all users. It can be removed.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/20171203020044.81680-4-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c | 2 +-
 tools/perf/util/evlist.c    | 5 ++---
 tools/perf/util/evlist.h    | 1 -
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 08070f87d489..3bc6ceeae1f9 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -500,7 +500,7 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 		struct auxtrace_mmap *mm = &maps[i].auxtrace_mmap;
 
 		if (maps[i].base) {
-			if (perf_mmap__push(&maps[i], evlist->overwrite, backward, rec, record__pushfn) != 0) {
+			if (perf_mmap__push(&maps[i], false, backward, rec, record__pushfn) != 0) {
 				rc = -1;
 				goto out;
 			}
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 93272d932407..a59134fb141f 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -711,7 +711,7 @@ union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, int
 	 * No need for read-write ring buffer: kernel stop outputting when
 	 * it hit md->prev (perf_mmap__consume()).
 	 */
-	return perf_mmap__read_forward(md, evlist->overwrite);
+	return perf_mmap__read_forward(md, false);
 }
 
 union perf_event *perf_evlist__mmap_read_backward(struct perf_evlist *evlist, int idx)
@@ -738,7 +738,7 @@ void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx)
 
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
 {
-	perf_mmap__consume(&evlist->mmap[idx], evlist->overwrite);
+	perf_mmap__consume(&evlist->mmap[idx], false);
 }
 
 static void perf_evlist__munmap_nofree(struct perf_evlist *evlist)
@@ -1070,7 +1070,6 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
-	evlist->overwrite = false;
 	evlist->mmap_len = perf_evlist__mmap_size(pages);
 	pr_debug("mmap size %zuB\n", evlist->mmap_len);
 	mp.mask = evlist->mmap_len - page_size - 1;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 424a3d6015af..eec33770b8e6 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -31,7 +31,6 @@ struct perf_evlist {
 	int		 nr_entries;
 	int		 nr_groups;
 	int		 nr_mmaps;
-	bool		 overwrite;
 	bool		 enabled;
 	bool		 has_user_cpus;
 	size_t		 mmap_len;
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 28/36] perf mmap: Remove overwrite from arguments list of perf_mmap__push
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (26 preceding siblings ...)
  2017-12-06 14:41 ` [PATCH 27/36] perf evlist: Remove evlist->overwrite Arnaldo Carvalho de Melo
@ 2017-12-06 14:41 ` Arnaldo Carvalho de Melo
  2017-12-06 14:41 ` [PATCH 29/36] perf mmap: Remove overwrite and check_messup from mmap read Arnaldo Carvalho de Melo
                   ` (7 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Wang Nan, Jiri Olsa, Kan Liang,
	Arnaldo Carvalho de Melo

From: Wang Nan <wangnan0@huawei.com>

'overwrite' argument is always 'false'. Remove it from arguments list of
perf_mmap__push().

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/20171203020044.81680-5-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c | 2 +-
 tools/perf/util/mmap.c      | 6 +++---
 tools/perf/util/mmap.h      | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 3bc6ceeae1f9..26b8571d0fdb 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -500,7 +500,7 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 		struct auxtrace_mmap *mm = &maps[i].auxtrace_mmap;
 
 		if (maps[i].base) {
-			if (perf_mmap__push(&maps[i], false, backward, rec, record__pushfn) != 0) {
+			if (perf_mmap__push(&maps[i], backward, rec, record__pushfn) != 0) {
 				rc = -1;
 				goto out;
 			}
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 9fe5f9c7d577..703ed41a9269 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -299,7 +299,7 @@ static int rb_find_range(void *data, int mask, u64 head, u64 old,
 	return backward_rb_find_range(data, mask, head, start, end);
 }
 
-int perf_mmap__push(struct perf_mmap *md, bool overwrite, bool backward,
+int perf_mmap__push(struct perf_mmap *md, bool backward,
 		    void *to, int push(void *to, void *buf, size_t size))
 {
 	u64 head = perf_mmap__read_head(md);
@@ -321,7 +321,7 @@ int perf_mmap__push(struct perf_mmap *md, bool overwrite, bool backward,
 		WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
 
 		md->prev = head;
-		perf_mmap__consume(md, overwrite || backward);
+		perf_mmap__consume(md, backward);
 		return 0;
 	}
 
@@ -346,7 +346,7 @@ int perf_mmap__push(struct perf_mmap *md, bool overwrite, bool backward,
 	}
 
 	md->prev = head;
-	perf_mmap__consume(md, overwrite || backward);
+	perf_mmap__consume(md, backward);
 out:
 	return rc;
 }
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index efd78b827b05..2c3d291785de 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -89,7 +89,7 @@ static inline void perf_mmap__write_tail(struct perf_mmap *md, u64 tail)
 union perf_event *perf_mmap__read_forward(struct perf_mmap *map, bool check_messup);
 union perf_event *perf_mmap__read_backward(struct perf_mmap *map);
 
-int perf_mmap__push(struct perf_mmap *md, bool overwrite, bool backward,
+int perf_mmap__push(struct perf_mmap *md, bool backward,
 		    void *to, int push(void *to, void *buf, size_t size));
 
 size_t perf_mmap__mmap_len(struct perf_mmap *map);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 29/36] perf mmap: Remove overwrite and check_messup from mmap read
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (27 preceding siblings ...)
  2017-12-06 14:41 ` [PATCH 28/36] perf mmap: Remove overwrite from arguments list of perf_mmap__push Arnaldo Carvalho de Melo
@ 2017-12-06 14:41 ` Arnaldo Carvalho de Melo
  2017-12-06 14:41 ` [PATCH 30/36] perf c2c: Add a tip about cacheline events Arnaldo Carvalho de Melo
                   ` (6 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Wang Nan, Jiri Olsa, Kan Liang,
	Arnaldo Carvalho de Melo

From: Wang Nan <wangnan0@huawei.com>

All perf_mmap__read_forward() read from read-write ring buffer, so no
need check_messup. Reading from backward ring buffer doesn't require
check_messup because it never mess up. Cleanup arguments lists.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/20171203020044.81680-6-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c |  2 +-
 tools/perf/util/mmap.c   | 28 ++++------------------------
 tools/perf/util/mmap.h   |  2 +-
 3 files changed, 6 insertions(+), 26 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index a59134fb141f..68c1f9546650 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -711,7 +711,7 @@ union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, int
 	 * No need for read-write ring buffer: kernel stop outputting when
 	 * it hit md->prev (perf_mmap__consume()).
 	 */
-	return perf_mmap__read_forward(md, false);
+	return perf_mmap__read_forward(md);
 }
 
 union perf_event *perf_evlist__mmap_read_backward(struct perf_evlist *evlist, int idx)
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 703ed41a9269..3f262e707a41 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -21,33 +21,13 @@ size_t perf_mmap__mmap_len(struct perf_mmap *map)
 }
 
 /* When check_messup is true, 'end' must points to a good entry */
-static union perf_event *perf_mmap__read(struct perf_mmap *map, bool check_messup,
+static union perf_event *perf_mmap__read(struct perf_mmap *map,
 					 u64 start, u64 end, u64 *prev)
 {
 	unsigned char *data = map->base + page_size;
 	union perf_event *event = NULL;
 	int diff = end - start;
 
-	if (check_messup) {
-		/*
-		 * If we're further behind than half the buffer, there's a chance
-		 * the writer will bite our tail and mess up the samples under us.
-		 *
-		 * If we somehow ended up ahead of the 'end', we got messed up.
-		 *
-		 * In either case, truncate and restart at 'end'.
-		 */
-		if (diff > map->mask / 2 || diff < 0) {
-			fprintf(stderr, "WARNING: failed to keep up with mmap data.\n");
-
-			/*
-			 * 'end' points to a known good entry, start there.
-			 */
-			start = end;
-			diff = 0;
-		}
-	}
-
 	if (diff >= (int)sizeof(event->header)) {
 		size_t size;
 
@@ -89,7 +69,7 @@ static union perf_event *perf_mmap__read(struct perf_mmap *map, bool check_messu
 	return event;
 }
 
-union perf_event *perf_mmap__read_forward(struct perf_mmap *map, bool check_messup)
+union perf_event *perf_mmap__read_forward(struct perf_mmap *map)
 {
 	u64 head;
 	u64 old = map->prev;
@@ -102,7 +82,7 @@ union perf_event *perf_mmap__read_forward(struct perf_mmap *map, bool check_mess
 
 	head = perf_mmap__read_head(map);
 
-	return perf_mmap__read(map, check_messup, old, head, &map->prev);
+	return perf_mmap__read(map, old, head, &map->prev);
 }
 
 union perf_event *perf_mmap__read_backward(struct perf_mmap *map)
@@ -138,7 +118,7 @@ union perf_event *perf_mmap__read_backward(struct perf_mmap *map)
 	else
 		end = head + map->mask + 1;
 
-	return perf_mmap__read(map, false, start, end, &map->prev);
+	return perf_mmap__read(map, start, end, &map->prev);
 }
 
 void perf_mmap__read_catchup(struct perf_mmap *map)
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index 2c3d291785de..d640273b7762 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -86,7 +86,7 @@ static inline void perf_mmap__write_tail(struct perf_mmap *md, u64 tail)
 	pc->data_tail = tail;
 }
 
-union perf_event *perf_mmap__read_forward(struct perf_mmap *map, bool check_messup);
+union perf_event *perf_mmap__read_forward(struct perf_mmap *map);
 union perf_event *perf_mmap__read_backward(struct perf_mmap *map);
 
 int perf_mmap__push(struct perf_mmap *md, bool backward,
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 30/36] perf c2c: Add a tip about cacheline events
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (28 preceding siblings ...)
  2017-12-06 14:41 ` [PATCH 29/36] perf mmap: Remove overwrite and check_messup from mmap read Arnaldo Carvalho de Melo
@ 2017-12-06 14:41 ` Arnaldo Carvalho de Melo
  2017-12-06 14:41 ` [PATCH 31/36] perf vendor events: Use more flexible pattern matching for CPU identification for mapfile.csv Arnaldo Carvalho de Melo
                   ` (5 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Sangwon Hong, Jiri Olsa,
	Namhyung Kim, Arnaldo Carvalho de Melo

From: Sangwon Hong <qpakzk@gmail.com>

Signed-off-by: Sangwon Hong <qpakzk@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1512188201-14109-1-git-send-email-qpakzk@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/tips.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/Documentation/tips.txt b/tools/perf/Documentation/tips.txt
index 3dd1dbe28407..849599f39c5e 100644
--- a/tools/perf/Documentation/tips.txt
+++ b/tools/perf/Documentation/tips.txt
@@ -33,3 +33,4 @@ System-wide collection from all CPUs: perf record -a
 Show current config key-value pairs: perf config --list
 Show user configuration overrides: perf config --user --list
 To add Node.js USDT(User-Level Statically Defined Tracing): perf buildid-cache --add `which node`
+To report cacheline events from previous recording: perf c2c report
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 31/36] perf vendor events: Use more flexible pattern matching for CPU identification for mapfile.csv
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (29 preceding siblings ...)
  2017-12-06 14:41 ` [PATCH 30/36] perf c2c: Add a tip about cacheline events Arnaldo Carvalho de Melo
@ 2017-12-06 14:41 ` Arnaldo Carvalho de Melo
  2017-12-06 14:41   ` Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, William Cohen,
	Alexander Shishkin, Jiri Olsa, Michael Petlan, Namhyung Kim,
	Peter Zijlstra, Shriya, Arnaldo Carvalho de Melo

From: William Cohen <wcohen@redhat.com>

The powerpc cpuid information includes chip revision information.
Changes between chip revisions are usually minor bug fixes and usually
do not affect the operation of the performance monitoring hardware.

The original mapfile.csv matching requires enumerating every possible
cpuid string.  When a new minor chip revision is produced a new entry
has to be added to the mapfile.csv and the code recompiled to allow perf
to have the implementation specific perf events for this new minor
revision.  For users of various distibutions of Linux having to wait for
a new release of the kernel's perf tool to be built with these trivial
patches is inconvenient.

Using regular expressions rather than exactly string matching of the
entire cpuid string allows developers to write mapfile.csv files that do
not require patches and recompiles for each of these minor version
changes.  If special cases need to be made for some particular versions,
they can be placed earlier in the mapfile.csv file before the more
general matches.

Signed-off-by: William Cohen <wcohen@redhat.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shriya <shriyak@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20171204145728.16792-1-wcohen@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/pmu-events/arch/powerpc/mapfile.csv | 12 ++------
 tools/perf/pmu-events/arch/x86/mapfile.csv     |  5 +---
 tools/perf/pmu-events/jevents.c                | 39 +++++++++++++++++++++++++-
 tools/perf/util/pmu.c                          | 20 ++++++++++++-
 4 files changed, 60 insertions(+), 16 deletions(-)

diff --git a/tools/perf/pmu-events/arch/powerpc/mapfile.csv b/tools/perf/pmu-events/arch/powerpc/mapfile.csv
index a0f3a11ca19f..229150e7ab7d 100644
--- a/tools/perf/pmu-events/arch/powerpc/mapfile.csv
+++ b/tools/perf/pmu-events/arch/powerpc/mapfile.csv
@@ -13,13 +13,5 @@
 #
 
 # Power8 entries
-004b0000,1,power8,core
-004b0201,1,power8,core
-004c0000,1,power8,core
-004d0000,1,power8,core
-004d0100,1,power8,core
-004d0200,1,power8,core
-004c0100,1,power8,core
-004e0100,1,power9,core
-004e0200,1,power9,core
-004e1200,1,power9,core
+004[bcd][[:xdigit:]]{4},1,power8,core
+004e[[:xdigit:]]{4},1,power9,core
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index fe1a2c47cabf..93656f2fd53a 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -23,10 +23,7 @@ GenuineIntel-6-1E,v2,nehalemep,core
 GenuineIntel-6-1F,v2,nehalemep,core
 GenuineIntel-6-1A,v2,nehalemep,core
 GenuineIntel-6-2E,v2,nehalemex,core
-GenuineIntel-6-4E,v24,skylake,core
-GenuineIntel-6-5E,v24,skylake,core
-GenuineIntel-6-8E,v24,skylake,core
-GenuineIntel-6-9E,v24,skylake,core
+GenuineIntel-6-[4589]E,v24,skylake,core
 GenuineIntel-6-37,v13,silvermont,core
 GenuineIntel-6-4D,v13,silvermont,core
 GenuineIntel-6-4C,v13,silvermont,core
diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index 9eb7047bafe4..b578aa26e375 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -116,6 +116,43 @@ static void fixdesc(char *s)
 		*e = 0;
 }
 
+/* Add escapes for '\' so they are proper C strings. */
+static char *fixregex(char *s)
+{
+	int len = 0;
+	int esc_count = 0;
+	char *fixed = NULL;
+	char *p, *q;
+
+	/* Count the number of '\' in string */
+	for (p = s; *p; p++) {
+		++len;
+		if (*p == '\\')
+			++esc_count;
+	}
+
+	if (esc_count == 0)
+		return s;
+
+	/* allocate space for a new string */
+	fixed = (char *) malloc(len + 1);
+	if (!fixed)
+		return NULL;
+
+	/* copy over the characters */
+	q = fixed;
+	for (p = s; *p; p++) {
+		if (*p == '\\') {
+			*q = '\\';
+			++q;
+		}
+		*q = *p;
+		++q;
+	}
+	*q = '\0';
+	return fixed;
+}
+
 static struct msrmap {
 	const char *num;
 	const char *pname;
@@ -648,7 +685,7 @@ static int process_mapfile(FILE *outfp, char *fpath)
 		}
 		line[strlen(line)-1] = '\0';
 
-		cpuid = strtok_r(p, ",", &save);
+		cpuid = fixregex(strtok_r(p, ",", &save));
 		version = strtok_r(NULL, ",", &save);
 		fname = strtok_r(NULL, ",", &save);
 		type = strtok_r(NULL, ",", &save);
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 8b7c151579c0..57e38fdf0b34 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -12,6 +12,7 @@
 #include <dirent.h>
 #include <api/fs/fs.h>
 #include <locale.h>
+#include <regex.h>
 #include "util.h"
 #include "pmu.h"
 #include "parse-events.h"
@@ -609,14 +610,31 @@ struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu)
 
 	i = 0;
 	for (;;) {
+		regex_t re;
+		regmatch_t pmatch[1];
+		int match;
+
 		map = &pmu_events_map[i++];
 		if (!map->table) {
 			map = NULL;
 			break;
 		}
 
-		if (!strcmp(map->cpuid, cpuid))
+		if (regcomp(&re, map->cpuid, REG_EXTENDED) != 0) {
+			/* Warn unable to generate match particular string. */
+			pr_info("Invalid regular expression %s\n", map->cpuid);
 			break;
+		}
+
+		match = !regexec(&re, cpuid, 1, pmatch, 0);
+		regfree(&re);
+		if (match) {
+			size_t match_len = (pmatch[0].rm_eo - pmatch[0].rm_so);
+
+			/* Verify the entire string matched. */
+			if (match_len == strlen(cpuid))
+				break;
+		}
 	}
 	free(cpuid);
 	return map;
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 32/36] x86/asm: Allow again using asm.h when building for the 'bpf' clang target
       [not found] <20171206144115.15097-1-acme@kernel.org>
@ 2017-12-06 14:41   ` Arnaldo Carvalho de Melo
  2017-12-06 14:40 ` [PATCH 02/36] perf test: Disable test cases 19 and 20 on s390x Arnaldo Carvalho de Melo
                     ` (34 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, Alexander Potapenko, Alexei Starovoitov,
	Andrey Ryabinin, Andy Lutomirski, Arnd Bergmann, Daniel Borkmann,
	David Ahern, Dmitriy Vyukov, Jiri Olsa, Josh Poimboeuf,
	Linus Torvalds, Matthias Kaehlcke, Miguel Bernal Marin,
	Namhyung Kim, Peter Zijlstra, Thomas Gleixner, Wang Nan,
	Yonghong Song

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Up to f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
we were able to use x86 headers to build to the 'bpf' clang target, as
done by the BPF code in tools/perf/.

With that commit, we ended up with following failure for 'perf test LLVM', this
is because "clang ... -target bpf ..." fails since 4.0 does not have bpf inline
asm support and 6.0 does not recognize the register 'esp', fix it by guarding
that part with an #ifndef __BPF__, that is defined by clang when building to
the "bpf" target.

  # perf test -v LLVM
  37: LLVM search and compile                               :
  37.1: Basic BPF llvm compile                              :
  --- start ---
  test child forked, pid 25526
  Kernel build dir is set to /lib/modules/4.14.0+/build
  set env: KBUILD_DIR=/lib/modules/4.14.0+/build
  unset env: KBUILD_OPTS
  include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
  set env: NR_CPUS=4
  set env: LINUX_VERSION_CODE=0x40e00
  set env: CLANG_EXEC=/usr/local/bin/clang
  set env: CLANG_OPTIONS=-xc
  set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
  set env: WORKING_DIR=/lib/modules/4.14.0+/build
  set env: CLANG_SOURCE=-
  llvm compiling command template: echo '/*
   * bpf-script-example.c
   * Test basic LLVM building
   */
  #ifndef LINUX_VERSION_CODE
  # error Need LINUX_VERSION_CODE
  # error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
  #endif
  #define BPF_ANY 0
  #define BPF_MAP_TYPE_ARRAY 2
  #define BPF_FUNC_map_lookup_elem 1
  #define BPF_FUNC_map_update_elem 2

  static void *(*bpf_map_lookup_elem)(void *map, void *key) =
	  (void *) BPF_FUNC_map_lookup_elem;
  static void *(*bpf_map_update_elem)(void *map, void *key, void *value, int flags) =
	  (void *) BPF_FUNC_map_update_elem;

  struct bpf_map_def {
	  unsigned int type;
	  unsigned int key_size;
	  unsigned int value_size;
	  unsigned int max_entries;
  };

  #define SEC(NAME) __attribute__((section(NAME), used))
  struct bpf_map_def SEC("maps") flip_table = {
	  .type = BPF_MAP_TYPE_ARRAY,
	  .key_size = sizeof(int),
	  .value_size = sizeof(int),
	  .max_entries = 1,
  };

  SEC("func=SyS_epoll_wait")
  int bpf_func__SyS_epoll_wait(void *ctx)
  {
	  int ind =0;
	  int *flag = bpf_map_lookup_elem(&flip_table, &ind);
	  int new_flag;
	  if (!flag)
		  return 0;
	  /* flip flag and store back */
	  new_flag = !*flag;
	  bpf_map_update_elem(&flip_table, &ind, &new_flag, BPF_ANY);
	  return new_flag;
  }
  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  ' | $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
  test child finished with 0
  ---- end ----
  LLVM search and compile subtest 0: Ok
  37.2: kbuild searching                                    :
  --- start ---
  test child forked, pid 25950
  Kernel build dir is set to /lib/modules/4.14.0+/build
  set env: KBUILD_DIR=/lib/modules/4.14.0+/build
  unset env: KBUILD_OPTS
  include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
  set env: NR_CPUS=4
  set env: LINUX_VERSION_CODE=0x40e00
  set env: CLANG_EXEC=/usr/local/bin/clang
  set env: CLANG_OPTIONS=-xc
  set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
  set env: WORKING_DIR=/lib/modules/4.14.0+/build
  set env: CLANG_SOURCE=-
  llvm compiling command template: echo '/*
   * bpf-script-test-kbuild.c
   * Test include from kernel header
   */
  #ifndef LINUX_VERSION_CODE
  # error Need LINUX_VERSION_CODE
  # error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
  #endif
  #define SEC(NAME) __attribute__((section(NAME), used))

  #include <uapi/linux/fs.h>
  #include <uapi/asm/ptrace.h>

  SEC("func=vfs_llseek")
  int bpf_func__vfs_llseek(void *ctx)
  {
	  return 0;
  }

  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  ' | $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
  In file included from <stdin>:12:
  In file included from /home/acme/git/linux/arch/x86/include/uapi/asm/ptrace.h:5:
  In file included from /home/acme/git/linux/include/linux/compiler.h:242:
  In file included from /home/acme/git/linux/arch/x86/include/asm/barrier.h:5:
  In file included from /home/acme/git/linux/arch/x86/include/asm/alternative.h:10:
  /home/acme/git/linux/arch/x86/include/asm/asm.h:145:50: error: unknown register name 'esp' in asm
  register unsigned long current_stack_pointer asm(_ASM_SP);
                                                   ^
  /home/acme/git/linux/arch/x86/include/asm/asm.h:44:18: note: expanded from macro '_ASM_SP'
  #define _ASM_SP         __ASM_REG(sp)
                          ^
  /home/acme/git/linux/arch/x86/include/asm/asm.h:27:32: note: expanded from macro '__ASM_REG'
  #define __ASM_REG(reg)         __ASM_SEL_RAW(e##reg, r##reg)
                                 ^
  /home/acme/git/linux/arch/x86/include/asm/asm.h:18:29: note: expanded from macro '__ASM_SEL_RAW'
  # define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(a)
                              ^
  /home/acme/git/linux/arch/x86/include/asm/asm.h:11:32: note: expanded from macro '__ASM_FORM_RAW'
  # define __ASM_FORM_RAW(x)     #x
                                 ^
  <scratch space>:4:1: note: expanded from here
  "esp"
  ^
  1 error generated.
  ERROR:	unable to compile -
  Hint:	Check error message shown above.
  Hint:	You can also pre-compile it into .o using:
     		  clang -target bpf -O2 -c -
     	  with proper -I and -D options.
  Failed to compile test case: 'kbuild searching'
  test child finished with -1
  ---- end ----
  LLVM search and compile subtest 1: FAILED!

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/r/20171128175948.GL3298@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 arch/x86/include/asm/asm.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 219faaec51df..386a6900e206 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -136,6 +136,7 @@
 #endif
 
 #ifndef __ASSEMBLY__
+#ifndef __BPF__
 /*
  * This output constraint should be used for any inline asm which has a "call"
  * instruction.  Otherwise the asm may be inserted before the frame pointer
@@ -145,5 +146,6 @@
 register unsigned long current_stack_pointer asm(_ASM_SP);
 #define ASM_CALL_CONSTRAINT "+r" (current_stack_pointer)
 #endif
+#endif
 
 #endif /* _ASM_X86_ASM_H */
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 32/36] x86/asm: Allow again using asm.h when building for the 'bpf' clang target
@ 2017-12-06 14:41   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, Alexander Potapenko, Alexei Starovoitov,
	Andrey Ryabinin, Andy Lutomirski, Arnd Bergmann, Daniel Borkmann,
	David Ahern, Dmitriy Vyukov, Jiri Olsa, Josh Poimboeuf,
	Linus Torvalds, Matthias Kaehlcke, Miguel Bernal Marin,
	Namhyung Kim, Peter Zijlstra, Thom

From: Arnaldo Carvalho de Melo <acme@redhat.com>

Up to f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
we were able to use x86 headers to build to the 'bpf' clang target, as
done by the BPF code in tools/perf/.

With that commit, we ended up with following failure for 'perf test LLVM', this
is because "clang ... -target bpf ..." fails since 4.0 does not have bpf inline
asm support and 6.0 does not recognize the register 'esp', fix it by guarding
that part with an #ifndef __BPF__, that is defined by clang when building to
the "bpf" target.

  # perf test -v LLVM
  37: LLVM search and compile                               :
  37.1: Basic BPF llvm compile                              :
  --- start ---
  test child forked, pid 25526
  Kernel build dir is set to /lib/modules/4.14.0+/build
  set env: KBUILD_DIR=/lib/modules/4.14.0+/build
  unset env: KBUILD_OPTS
  include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
  set env: NR_CPUS=4
  set env: LINUX_VERSION_CODE=0x40e00
  set env: CLANG_EXEC=/usr/local/bin/clang
  set env: CLANG_OPTIONS=-xc
  set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
  set env: WORKING_DIR=/lib/modules/4.14.0+/build
  set env: CLANG_SOURCE=-
  llvm compiling command template: echo '/*
   * bpf-script-example.c
   * Test basic LLVM building
   */
  #ifndef LINUX_VERSION_CODE
  # error Need LINUX_VERSION_CODE
  # error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
  #endif
  #define BPF_ANY 0
  #define BPF_MAP_TYPE_ARRAY 2
  #define BPF_FUNC_map_lookup_elem 1
  #define BPF_FUNC_map_update_elem 2

  static void *(*bpf_map_lookup_elem)(void *map, void *key) =
	  (void *) BPF_FUNC_map_lookup_elem;
  static void *(*bpf_map_update_elem)(void *map, void *key, void *value, int flags) =
	  (void *) BPF_FUNC_map_update_elem;

  struct bpf_map_def {
	  unsigned int type;
	  unsigned int key_size;
	  unsigned int value_size;
	  unsigned int max_entries;
  };

  #define SEC(NAME) __attribute__((section(NAME), used))
  struct bpf_map_def SEC("maps") flip_table = {
	  .type = BPF_MAP_TYPE_ARRAY,
	  .key_size = sizeof(int),
	  .value_size = sizeof(int),
	  .max_entries = 1,
  };

  SEC("func=SyS_epoll_wait")
  int bpf_func__SyS_epoll_wait(void *ctx)
  {
	  int ind =0;
	  int *flag = bpf_map_lookup_elem(&flip_table, &ind);
	  int new_flag;
	  if (!flag)
		  return 0;
	  /* flip flag and store back */
	  new_flag = !*flag;
	  bpf_map_update_elem(&flip_table, &ind, &new_flag, BPF_ANY);
	  return new_flag;
  }
  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  ' | $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
  test child finished with 0
  ---- end ----
  LLVM search and compile subtest 0: Ok
  37.2: kbuild searching                                    :
  --- start ---
  test child forked, pid 25950
  Kernel build dir is set to /lib/modules/4.14.0+/build
  set env: KBUILD_DIR=/lib/modules/4.14.0+/build
  unset env: KBUILD_OPTS
  include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
  set env: NR_CPUS=4
  set env: LINUX_VERSION_CODE=0x40e00
  set env: CLANG_EXEC=/usr/local/bin/clang
  set env: CLANG_OPTIONS=-xc
  set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
  set env: WORKING_DIR=/lib/modules/4.14.0+/build
  set env: CLANG_SOURCE=-
  llvm compiling command template: echo '/*
   * bpf-script-test-kbuild.c
   * Test include from kernel header
   */
  #ifndef LINUX_VERSION_CODE
  # error Need LINUX_VERSION_CODE
  # error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
  #endif
  #define SEC(NAME) __attribute__((section(NAME), used))

  #include <uapi/linux/fs.h>
  #include <uapi/asm/ptrace.h>

  SEC("func=vfs_llseek")
  int bpf_func__vfs_llseek(void *ctx)
  {
	  return 0;
  }

  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  ' | $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
  In file included from <stdin>:12:
  In file included from /home/acme/git/linux/arch/x86/include/uapi/asm/ptrace.h:5:
  In file included from /home/acme/git/linux/include/linux/compiler.h:242:
  In file included from /home/acme/git/linux/arch/x86/include/asm/barrier.h:5:
  In file included from /home/acme/git/linux/arch/x86/include/asm/alternative.h:10:
  /home/acme/git/linux/arch/x86/include/asm/asm.h:145:50: error: unknown register name 'esp' in asm
  register unsigned long current_stack_pointer asm(_ASM_SP);
                                                   ^
  /home/acme/git/linux/arch/x86/include/asm/asm.h:44:18: note: expanded from macro '_ASM_SP'
  #define _ASM_SP         __ASM_REG(sp)
                          ^
  /home/acme/git/linux/arch/x86/include/asm/asm.h:27:32: note: expanded from macro '__ASM_REG'
  #define __ASM_REG(reg)         __ASM_SEL_RAW(e##reg, r##reg)
                                 ^
  /home/acme/git/linux/arch/x86/include/asm/asm.h:18:29: note: expanded from macro '__ASM_SEL_RAW'
  # define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(a)
                              ^
  /home/acme/git/linux/arch/x86/include/asm/asm.h:11:32: note: expanded from macro '__ASM_FORM_RAW'
  # define __ASM_FORM_RAW(x)     #x
                                 ^
  <scratch space>:4:1: note: expanded from here
  "esp"
  ^
  1 error generated.
  ERROR:	unable to compile -
  Hint:	Check error message shown above.
  Hint:	You can also pre-compile it into .o using:
     		  clang -target bpf -O2 -c -
     	  with proper -I and -D options.
  Failed to compile test case: 'kbuild searching'
  test child finished with -1
  ---- end ----
  LLVM search and compile subtest 1: FAILED!

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/r/20171128175948.GL3298@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 arch/x86/include/asm/asm.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 219faaec51df..386a6900e206 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -136,6 +136,7 @@
 #endif
 
 #ifndef __ASSEMBLY__
+#ifndef __BPF__
 /*
  * This output constraint should be used for any inline asm which has a "call"
  * instruction.  Otherwise the asm may be inserted before the frame pointer
@@ -145,5 +146,6 @@
 register unsigned long current_stack_pointer asm(_ASM_SP);
 #define ASM_CALL_CONSTRAINT "+r" (current_stack_pointer)
 #endif
+#endif
 
 #endif /* _ASM_X86_ASM_H */
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 33/36] perf report: Set browser mode right before setup_browser()
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (31 preceding siblings ...)
  2017-12-06 14:41   ` Arnaldo Carvalho de Melo
@ 2017-12-06 14:41 ` Arnaldo Carvalho de Melo
  2017-12-06 14:41 ` [PATCH 34/36] perf mmap: Fix perf backward recording Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Seokho Song, Jiri Olsa,
	Park Ju Hyung, Arnaldo Carvalho de Melo

From: Seokho Song <0xdevssh@gmail.com>

There are codes that print messages to the screen between assignment of
the use_browser variable and setup_browser().

But since the GUI browser is not initialized during that period, all
messages fail to show if the user passed the --gtk option to perf as GTK
is not initialized yet.

Reorder the code to assign use_browser variable right before
setup_browser() is called.

Signed-off-by: Seokho Song <0xdevssh@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20171204160244.6332-1-0xdevssh@gmail.com
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-report.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index af5dd038195e..eb9ce6327e71 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -921,13 +921,6 @@ int cmd_report(int argc, const char **argv)
 		return -EINVAL;
 	}
 
-	if (report.use_stdio)
-		use_browser = 0;
-	else if (report.use_tui)
-		use_browser = 1;
-	else if (report.use_gtk)
-		use_browser = 2;
-
 	if (report.inverted_callchain)
 		callchain_param.order = ORDER_CALLER;
 	if (symbol_conf.cumulate_callchain && !callchain_param.order_set)
@@ -1014,6 +1007,13 @@ int cmd_report(int argc, const char **argv)
 		perf_hpp_list.need_collapse = true;
 	}
 
+	if (report.use_stdio)
+		use_browser = 0;
+	else if (report.use_tui)
+		use_browser = 1;
+	else if (report.use_gtk)
+		use_browser = 2;
+
 	/* Force tty output for header output and per-thread stat. */
 	if (report.header || report.header_only || report.show_threads)
 		use_browser = 0;
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 34/36] perf mmap: Fix perf backward recording
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (32 preceding siblings ...)
  2017-12-06 14:41 ` [PATCH 33/36] perf report: Set browser mode right before setup_browser() Arnaldo Carvalho de Melo
@ 2017-12-06 14:41 ` Arnaldo Carvalho de Melo
  2017-12-06 14:41 ` [PATCH 35/36] perf mmap: Don't discard prev in backward mode Arnaldo Carvalho de Melo
  2017-12-06 14:41 ` [PATCH 36/36] perf tools: Rename 'backward' to 'overwrite' in evlist, mmap and record Arnaldo Carvalho de Melo
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Wang Nan, Jiri Olsa, Kan Liang,
	Mengting Zhang, Arnaldo Carvalho de Melo

From: Wang Nan <wangnan0@huawei.com>

'perf record' backward recording doesn't work as we expected: it never
overwrites when ring buffer gets full.

Test:

Run a busy python printing task background like this:

 while True:
     print 123

send SIGUSR2 to perf to capture snapshot, then:

 # ./perf record --overwrite -e raw_syscalls:sys_enter -e raw_syscalls:sys_exit --exclude-perf -a --switch-output
 [ perf record: dump data: Woken up 1 times ]
 [ perf record: Dump perf.data.2017110101520743 ]
 [ perf record: dump data: Woken up 1 times ]
 [ perf record: Dump perf.data.2017110101521251 ]
 [ perf record: dump data: Woken up 1 times ]
 [ perf record: Dump perf.data.2017110101521692 ]
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Dump perf.data.2017110101521936 ]
 [ perf record: Captured and wrote 0.826 MB perf.data.<timestamp> ]

 # ./perf script -i ./perf.data.2017110101520743 | head -n3
             perf  2717 [000] 12449.310785: raw_syscalls:sys_enter: NR 16 (5, 2400, 0, 59, 100, 0)
             perf  2717 [000] 12449.310790: raw_syscalls:sys_enter: NR 7 (4112340, 2, ffffffff, 3df, 100, 0)
           python  2545 [000] 12449.310800:  raw_syscalls:sys_exit: NR 1 = 4
 # ./perf script -i ./perf.data.2017110101521251 | head -n3
             perf  2717 [000] 12449.310785: raw_syscalls:sys_enter: NR 16 (5, 2400, 0, 59, 100, 0)
             perf  2717 [000] 12449.310790: raw_syscalls:sys_enter: NR 7 (4112340, 2, ffffffff, 3df, 100, 0)
           python  2545 [000] 12449.310800:  raw_syscalls:sys_exit: NR 1 = 4
 # ./perf script -i ./perf.data.2017110101521692 | head -n3
             perf  2717 [000] 12449.310785: raw_syscalls:sys_enter: NR 16 (5, 2400, 0, 59, 100, 0)
             perf  2717 [000] 12449.310790: raw_syscalls:sys_enter: NR 7 (4112340, 2, ffffffff, 3df, 100, 0)
           python  2545 [000] 12449.310800:  raw_syscalls:sys_exit: NR 1 = 4

Timestamps never change, but my background task is a dead loop, can
easily overwhelm the ring buffer.

This patch fixes it by forcing unsetting PROT_WRITE for a backward ring
buffer, so all backward ring buffers become overwrite ring buffers.

Test result:

 # ./perf record --overwrite -e raw_syscalls:sys_enter -e raw_syscalls:sys_exit --exclude-perf -a --switch-output
 [ perf record: dump data: Woken up 1 times ]
 [ perf record: Dump perf.data.2017110101285323 ]
 [ perf record: dump data: Woken up 1 times ]
 [ perf record: Dump perf.data.2017110101290053 ]
 [ perf record: dump data: Woken up 1 times ]
 [ perf record: Dump perf.data.2017110101290446 ]
 ^C[ perf record: Woken up 1 times to write data ]
 [ perf record: Dump perf.data.2017110101290837 ]
 [ perf record: Captured and wrote 0.826 MB perf.data.<timestamp> ]
 # ./perf script -i ./perf.data.2017110101285323 | head -n3
           python  2545 [000] 11064.268083:  raw_syscalls:sys_exit: NR 1 = 4
           python  2545 [000] 11064.268084: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
           python  2545 [000] 11064.268086:  raw_syscalls:sys_exit: NR 1 = 4
 # ./perf script -i ./perf.data.2017110101290 | head -n3
 failed to open ./perf.data.2017110101290: No such file or directory
 # ./perf script -i ./perf.data.2017110101290053 | head -n3
           python  2545 [000] 11071.564062: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
           python  2545 [000] 11071.564064:  raw_syscalls:sys_exit: NR 1 = 4
           python  2545 [000] 11071.564066: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
 # ./perf script -i ./perf.data.2017110101290 | head -n3
 perf.data.2017110101290053  perf.data.2017110101290446  perf.data.2017110101290837
 # ./perf script -i ./perf.data.2017110101290446 | head -n3
             sshd  1321 [000] 11075.499473:  raw_syscalls:sys_exit: NR 14 = 0
             sshd  1321 [000] 11075.499474: raw_syscalls:sys_enter: NR 14 (2, 7ffe98899490, 0, 8, 0, 3000)
             sshd  1321 [000] 11075.499474:  raw_syscalls:sys_exit: NR 14 = 0
 # ./perf script -i ./perf.data.2017110101290837 | head -n3
           python  2545 [000] 11079.280844:  raw_syscalls:sys_exit: NR 1 = 4
           python  2545 [000] 11079.280847: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
           python  2545 [000] 11079.280850:  raw_syscalls:sys_exit: NR 1 = 4

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Mengting Zhang <zhangmengting@huawei.com>
Link: http://lkml.kernel.org/r/20171204165107.95327-2-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 68c1f9546650..b1cea711232b 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -812,6 +812,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		int fd;
 		int cpu;
 
+		mp->prot = PROT_READ | PROT_WRITE;
 		if (evsel->attr.write_backward) {
 			output = _output_backward;
 			maps = evlist->backward_mmap;
@@ -824,6 +825,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				if (evlist->bkw_mmap_state == BKW_MMAP_NOTREADY)
 					perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_RUNNING);
 			}
+			mp->prot &= ~PROT_WRITE;
 		}
 
 		if (evsel->system_wide && thread)
@@ -1058,9 +1060,12 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
-	struct mmap_params mp = {
-		.prot = PROT_READ | PROT_WRITE,
-	};
+	/*
+	 * Delay setting mp.prot: set it before calling perf_mmap__mmap.
+	 * Its value is decided by evsel's write_backward.
+	 * So &mp should not be passed through const pointer.
+	 */
+	struct mmap_params mp;
 
 	if (!evlist->mmap)
 		evlist->mmap = perf_evlist__alloc_mmap(evlist);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 35/36] perf mmap: Don't discard prev in backward mode
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (33 preceding siblings ...)
  2017-12-06 14:41 ` [PATCH 34/36] perf mmap: Fix perf backward recording Arnaldo Carvalho de Melo
@ 2017-12-06 14:41 ` Arnaldo Carvalho de Melo
  2017-12-06 14:41 ` [PATCH 36/36] perf tools: Rename 'backward' to 'overwrite' in evlist, mmap and record Arnaldo Carvalho de Melo
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Wang Nan, Jiri Olsa,
	Mengting Zhang, Arnaldo Carvalho de Melo

From: Wang Nan <wangnan0@huawei.com>

'perf record' can switch its output data file. The new output should
only store the data after switching. However, in overwrite backward
mode, the new output still can have data from before switching. That
also brings extra overhead.

At the end of mmap_read(), the position of the processed ring buffer is
saved in md->prev. Next mmap_read should be end in md->prev if it is not
overwriten. That avoids processing duplicate data.  However, md->prev is
discarded. So next the mmap_read() has to process whole valid ring
buffer, which probably includes old processed data.

Avoid calling backward_rb_find_range() when md->prev is still
available.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Kan Liang <kan.liang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mengting Zhang <zhangmengting@huawei.com>
Link: http://lkml.kernel.org/r/20171204165107.95327-3-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/mmap.c | 33 +++++++++++++++------------------
 1 file changed, 15 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 3f262e707a41..5f8cb1583e53 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -267,18 +267,6 @@ static int backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64
 	return -1;
 }
 
-static int rb_find_range(void *data, int mask, u64 head, u64 old,
-			 u64 *start, u64 *end, bool backward)
-{
-	if (!backward) {
-		*start = old;
-		*end = head;
-		return 0;
-	}
-
-	return backward_rb_find_range(data, mask, head, start, end);
-}
-
 int perf_mmap__push(struct perf_mmap *md, bool backward,
 		    void *to, int push(void *to, void *buf, size_t size))
 {
@@ -290,19 +278,28 @@ int perf_mmap__push(struct perf_mmap *md, bool backward,
 	void *buf;
 	int rc = 0;
 
-	if (rb_find_range(data, md->mask, head, old, &start, &end, backward))
-		return -1;
+	start = backward ? head : old;
+	end = backward ? old : head;
 
 	if (start == end)
 		return 0;
 
 	size = end - start;
 	if (size > (unsigned long)(md->mask) + 1) {
-		WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
+		if (!backward) {
+			WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
 
-		md->prev = head;
-		perf_mmap__consume(md, backward);
-		return 0;
+			md->prev = head;
+			perf_mmap__consume(md, backward);
+			return 0;
+		}
+
+		/*
+		 * Backward ring buffer is full. We still have a chance to read
+		 * most of data from it.
+		 */
+		if (backward_rb_find_range(data, md->mask, head, &start, &end))
+			return -1;
 	}
 
 	if ((start & md->mask) + size != (end & md->mask)) {
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 36/36] perf tools: Rename 'backward' to 'overwrite' in evlist, mmap and record
       [not found] <20171206144115.15097-1-acme@kernel.org>
                   ` (34 preceding siblings ...)
  2017-12-06 14:41 ` [PATCH 35/36] perf mmap: Don't discard prev in backward mode Arnaldo Carvalho de Melo
@ 2017-12-06 14:41 ` Arnaldo Carvalho de Melo
  35 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-06 14:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, linux-perf-users, Wang Nan, Jiri Olsa, Kan Liang,
	Mengting Zhang, Arnaldo Carvalho de Melo

From: Wang Nan <wangnan0@huawei.com>

Remove the backward/forward concept to make it uniform with user
interface (the '--overwrite' option).

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Mengting Zhang <zhangmengting@huawei.com>
Link: http://lkml.kernel.org/r/20171204165107.95327-4-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-record.c             | 14 +++++++-------
 tools/perf/tests/backward-ring-buffer.c |  4 ++--
 tools/perf/util/evlist.c                | 30 +++++++++++++++---------------
 tools/perf/util/evlist.h                |  2 +-
 tools/perf/util/mmap.c                  | 22 +++++++++++-----------
 5 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 26b8571d0fdb..0a5749ef8b94 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -479,7 +479,7 @@ static struct perf_event_header finished_round_event = {
 };
 
 static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evlist,
-				    bool backward)
+				    bool overwrite)
 {
 	u64 bytes_written = rec->bytes_written;
 	int i;
@@ -489,18 +489,18 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 	if (!evlist)
 		return 0;
 
-	maps = backward ? evlist->backward_mmap : evlist->mmap;
+	maps = overwrite ? evlist->overwrite_mmap : evlist->mmap;
 	if (!maps)
 		return 0;
 
-	if (backward && evlist->bkw_mmap_state != BKW_MMAP_DATA_PENDING)
+	if (overwrite && evlist->bkw_mmap_state != BKW_MMAP_DATA_PENDING)
 		return 0;
 
 	for (i = 0; i < evlist->nr_mmaps; i++) {
 		struct auxtrace_mmap *mm = &maps[i].auxtrace_mmap;
 
 		if (maps[i].base) {
-			if (perf_mmap__push(&maps[i], backward, rec, record__pushfn) != 0) {
+			if (perf_mmap__push(&maps[i], overwrite, rec, record__pushfn) != 0) {
 				rc = -1;
 				goto out;
 			}
@@ -520,7 +520,7 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 	if (bytes_written != rec->bytes_written)
 		rc = record__write(rec, &finished_round_event, sizeof(finished_round_event));
 
-	if (backward)
+	if (overwrite)
 		perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_EMPTY);
 out:
 	return rc;
@@ -692,8 +692,8 @@ perf_evlist__pick_pc(struct perf_evlist *evlist)
 	if (evlist) {
 		if (evlist->mmap && evlist->mmap[0].base)
 			return evlist->mmap[0].base;
-		if (evlist->backward_mmap && evlist->backward_mmap[0].base)
-			return evlist->backward_mmap[0].base;
+		if (evlist->overwrite_mmap && evlist->overwrite_mmap[0].base)
+			return evlist->overwrite_mmap[0].base;
 	}
 	return NULL;
 }
diff --git a/tools/perf/tests/backward-ring-buffer.c b/tools/perf/tests/backward-ring-buffer.c
index cf37e43c42f3..4035d43523c3 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -33,8 +33,8 @@ static int count_samples(struct perf_evlist *evlist, int *sample_count,
 	for (i = 0; i < evlist->nr_mmaps; i++) {
 		union perf_event *event;
 
-		perf_mmap__read_catchup(&evlist->backward_mmap[i]);
-		while ((event = perf_mmap__read_backward(&evlist->backward_mmap[i])) != NULL) {
+		perf_mmap__read_catchup(&evlist->overwrite_mmap[i]);
+		while ((event = perf_mmap__read_backward(&evlist->overwrite_mmap[i])) != NULL) {
 			const u32 type = event->header.type;
 
 			switch (type) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index b1cea711232b..3570355bcf39 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -125,7 +125,7 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
 void perf_evlist__exit(struct perf_evlist *evlist)
 {
 	zfree(&evlist->mmap);
-	zfree(&evlist->backward_mmap);
+	zfree(&evlist->overwrite_mmap);
 	fdarray__exit(&evlist->pollfd);
 }
 
@@ -675,11 +675,11 @@ static int perf_evlist__set_paused(struct perf_evlist *evlist, bool value)
 {
 	int i;
 
-	if (!evlist->backward_mmap)
+	if (!evlist->overwrite_mmap)
 		return 0;
 
 	for (i = 0; i < evlist->nr_mmaps; i++) {
-		int fd = evlist->backward_mmap[i].fd;
+		int fd = evlist->overwrite_mmap[i].fd;
 		int err;
 
 		if (fd < 0)
@@ -749,16 +749,16 @@ static void perf_evlist__munmap_nofree(struct perf_evlist *evlist)
 		for (i = 0; i < evlist->nr_mmaps; i++)
 			perf_mmap__munmap(&evlist->mmap[i]);
 
-	if (evlist->backward_mmap)
+	if (evlist->overwrite_mmap)
 		for (i = 0; i < evlist->nr_mmaps; i++)
-			perf_mmap__munmap(&evlist->backward_mmap[i]);
+			perf_mmap__munmap(&evlist->overwrite_mmap[i]);
 }
 
 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
 	perf_evlist__munmap_nofree(evlist);
 	zfree(&evlist->mmap);
-	zfree(&evlist->backward_mmap);
+	zfree(&evlist->overwrite_mmap);
 }
 
 static struct perf_mmap *perf_evlist__alloc_mmap(struct perf_evlist *evlist)
@@ -800,7 +800,7 @@ perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu_idx,
-				       int thread, int *_output, int *_output_backward)
+				       int thread, int *_output, int *_output_overwrite)
 {
 	struct perf_evsel *evsel;
 	int revent;
@@ -814,14 +814,14 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 
 		mp->prot = PROT_READ | PROT_WRITE;
 		if (evsel->attr.write_backward) {
-			output = _output_backward;
-			maps = evlist->backward_mmap;
+			output = _output_overwrite;
+			maps = evlist->overwrite_mmap;
 
 			if (!maps) {
 				maps = perf_evlist__alloc_mmap(evlist);
 				if (!maps)
 					return -1;
-				evlist->backward_mmap = maps;
+				evlist->overwrite_mmap = maps;
 				if (evlist->bkw_mmap_state == BKW_MMAP_NOTREADY)
 					perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_RUNNING);
 			}
@@ -886,14 +886,14 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
 		int output = -1;
-		int output_backward = -1;
+		int output_overwrite = -1;
 
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, cpu,
 					      true);
 
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output, &output_backward))
+							thread, &output, &output_overwrite))
 				goto out_unmap;
 		}
 	}
@@ -914,13 +914,13 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
 		int output = -1;
-		int output_backward = -1;
+		int output_overwrite = -1;
 
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, thread,
 					      false);
 
 		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output, &output_backward))
+						&output, &output_overwrite))
 			goto out_unmap;
 	}
 
@@ -1753,7 +1753,7 @@ void perf_evlist__toggle_bkw_mmap(struct perf_evlist *evlist,
 		RESUME,
 	} action = NONE;
 
-	if (!evlist->backward_mmap)
+	if (!evlist->overwrite_mmap)
 		return;
 
 	switch (old_state) {
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index eec33770b8e6..75160666d305 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -44,7 +44,7 @@ struct perf_evlist {
 	} workload;
 	struct fdarray	 pollfd;
 	struct perf_mmap *mmap;
-	struct perf_mmap *backward_mmap;
+	struct perf_mmap *overwrite_mmap;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
 	struct perf_evsel *selected;
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 5f8cb1583e53..05076e683938 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -234,18 +234,18 @@ int perf_mmap__mmap(struct perf_mmap *map, struct mmap_params *mp, int fd)
 	return 0;
 }
 
-static int backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64 *end)
+static int overwrite_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64 *end)
 {
 	struct perf_event_header *pheader;
 	u64 evt_head = head;
 	int size = mask + 1;
 
-	pr_debug2("backward_rb_find_range: buf=%p, head=%"PRIx64"\n", buf, head);
+	pr_debug2("overwrite_rb_find_range: buf=%p, head=%"PRIx64"\n", buf, head);
 	pheader = (struct perf_event_header *)(buf + (head & mask));
 	*start = head;
 	while (true) {
 		if (evt_head - head >= (unsigned int)size) {
-			pr_debug("Finished reading backward ring buffer: rewind\n");
+			pr_debug("Finished reading overwrite ring buffer: rewind\n");
 			if (evt_head - head > (unsigned int)size)
 				evt_head -= pheader->size;
 			*end = evt_head;
@@ -255,7 +255,7 @@ static int backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64
 		pheader = (struct perf_event_header *)(buf + (evt_head & mask));
 
 		if (pheader->size == 0) {
-			pr_debug("Finished reading backward ring buffer: get start\n");
+			pr_debug("Finished reading overwrite ring buffer: get start\n");
 			*end = evt_head;
 			return 0;
 		}
@@ -267,7 +267,7 @@ static int backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64
 	return -1;
 }
 
-int perf_mmap__push(struct perf_mmap *md, bool backward,
+int perf_mmap__push(struct perf_mmap *md, bool overwrite,
 		    void *to, int push(void *to, void *buf, size_t size))
 {
 	u64 head = perf_mmap__read_head(md);
@@ -278,19 +278,19 @@ int perf_mmap__push(struct perf_mmap *md, bool backward,
 	void *buf;
 	int rc = 0;
 
-	start = backward ? head : old;
-	end = backward ? old : head;
+	start = overwrite ? head : old;
+	end = overwrite ? old : head;
 
 	if (start == end)
 		return 0;
 
 	size = end - start;
 	if (size > (unsigned long)(md->mask) + 1) {
-		if (!backward) {
+		if (!overwrite) {
 			WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
 
 			md->prev = head;
-			perf_mmap__consume(md, backward);
+			perf_mmap__consume(md, overwrite);
 			return 0;
 		}
 
@@ -298,7 +298,7 @@ int perf_mmap__push(struct perf_mmap *md, bool backward,
 		 * Backward ring buffer is full. We still have a chance to read
 		 * most of data from it.
 		 */
-		if (backward_rb_find_range(data, md->mask, head, &start, &end))
+		if (overwrite_rb_find_range(data, md->mask, head, &start, &end))
 			return -1;
 	}
 
@@ -323,7 +323,7 @@ int perf_mmap__push(struct perf_mmap *md, bool backward,
 	}
 
 	md->prev = head;
-	perf_mmap__consume(md, backward);
+	perf_mmap__consume(md, overwrite);
 out:
 	return rc;
 }
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 11/36] tools build feature: Check if pthread_barrier_t is available
  2017-12-06 14:40 ` [PATCH 11/36] tools build feature: Check if pthread_barrier_t is available Arnaldo Carvalho de Melo
@ 2017-12-06 21:31   ` Philippe Ombredanne
  2017-12-07 11:24     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 46+ messages in thread
From: Philippe Ombredanne @ 2017-12-06 21:31 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, LKML, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, David Ahern, Davidlohr Bueso, James Yang,
	Jiri Olsa, Kim Phillips, Namhyung Kim, Wang Nan

On Wed, Dec 6, 2017 at 3:40 PM, Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
> From: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> As 'perf bench futex wake-parallel" will use this, which is not
> available in older systems such as versions of the android NDK used in
> my container build tests (r12b and r15c at the moment).
>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: James Yang <james.yang@arm.com
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Kim Phillips <kim.phillips@arm.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Wang Nan <wangnan0@huawei.com>
> Link: https://lkml.kernel.org/n/tip-1i7iv54in4wj08lwo55b0pzv@git.kernel.org
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
>  tools/build/Makefile.feature               |  1 +
>  tools/build/feature/Makefile               |  4 ++++
>  tools/build/feature/test-all.c             |  5 +++++
>  tools/build/feature/test-pthread-barrier.c | 12 ++++++++++++
>  tools/perf/Makefile.config                 |  4 ++++
>  5 files changed, 26 insertions(+)
>  create mode 100644 tools/build/feature/test-pthread-barrier.c
>
> diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
> index c71a05b9c984..e52fcefee379 100644
> --- a/tools/build/Makefile.feature
> +++ b/tools/build/Makefile.feature
> @@ -56,6 +56,7 @@ FEATURE_TESTS_BASIC :=                  \
>          libunwind-arm                   \
>          libunwind-aarch64               \
>          pthread-attr-setaffinity-np     \
> +        pthread-barrier                \
>          stackprotector-all              \
>          timerfd                         \
>          libdw-dwarf-unwind              \
> diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
> index 96982640fbf8..cff38f342283 100644
> --- a/tools/build/feature/Makefile
> +++ b/tools/build/feature/Makefile
> @@ -37,6 +37,7 @@ FILES=                                          \
>           test-libunwind-debug-frame-arm.bin     \
>           test-libunwind-debug-frame-aarch64.bin \
>           test-pthread-attr-setaffinity-np.bin   \
> +         test-pthread-barrier.bin              \
>           test-stackprotector-all.bin            \
>           test-timerfd.bin                       \
>           test-libdw-dwarf-unwind.bin            \
> @@ -79,6 +80,9 @@ $(OUTPUT)test-hello.bin:
>  $(OUTPUT)test-pthread-attr-setaffinity-np.bin:
>         $(BUILD) -D_GNU_SOURCE -lpthread
>
> +$(OUTPUT)test-pthread-barrier.bin:
> +       $(BUILD) -lpthread
> +
>  $(OUTPUT)test-stackprotector-all.bin:
>         $(BUILD) -fstack-protector-all
>
> diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c
> index 4112702e4aed..6fdf83263ab7 100644
> --- a/tools/build/feature/test-all.c
> +++ b/tools/build/feature/test-all.c
> @@ -118,6 +118,10 @@
>  # include "test-pthread-attr-setaffinity-np.c"
>  #undef main
>
> +#define main main_test_pthread_barrier
> +# include "test-pthread-barrier.c"
> +#undef main
> +
>  #define main main_test_sched_getcpu
>  # include "test-sched_getcpu.c"
>  #undef main
> @@ -187,6 +191,7 @@ int main(int argc, char *argv[])
>         main_test_sync_compare_and_swap(argc, argv);
>         main_test_zlib();
>         main_test_pthread_attr_setaffinity_np();
> +       main_test_pthread_barrier();
>         main_test_lzma();
>         main_test_get_cpuid();
>         main_test_bpf();
> diff --git a/tools/build/feature/test-pthread-barrier.c b/tools/build/feature/test-pthread-barrier.c
> new file mode 100644
> index 000000000000..0558d9334d97
> --- /dev/null
> +++ b/tools/build/feature/test-pthread-barrier.c
> @@ -0,0 +1,12 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <stdint.h>
> +#include <pthread.h>
> +
> +int main(void)
> +{
> +       pthread_barrier_t barrier;
> +
> +       pthread_barrier_init(&barrier, NULL, 1);
> +       pthread_barrier_wait(&barrier);
> +       return pthread_barrier_destroy(&barrier);
> +}
> diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
> index f6786fa2419f..2c437baf8364 100644
> --- a/tools/perf/Makefile.config
> +++ b/tools/perf/Makefile.config
> @@ -263,6 +263,10 @@ ifeq ($(feature-pthread-attr-setaffinity-np), 1)
>    CFLAGS += -DHAVE_PTHREAD_ATTR_SETAFFINITY_NP
>  endif
>
> +ifeq ($(feature-pthread-barrier), 1)
> +  CFLAGS += -DHAVE_PTHREAD_BARRIER
> +endif
> +
>  ifndef NO_BIONIC
>    $(call feature_check,bionic)
>    ifeq ($(feature-bionic), 1)
> --
> 2.13.6
>

Arnaldo: for the use of SPDX ids in the hwol patch series, thank you very much.
Acked-by: Philippe Ombredanne <pombredanne@nexb.com>

-- 
Cordially
Philippe Ombredanne

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 18/36] perf s390: Always build with -fPIC
  2017-12-06 14:40 ` [PATCH 18/36] perf s390: Always build with -fPIC Arnaldo Carvalho de Melo
@ 2017-12-07  8:09   ` Hendrik Brueckner
  2017-12-28 15:34     ` [tip:perf/core] " tip-bot for Hendrik Brueckner
  0 siblings, 1 reply; 46+ messages in thread
From: Hendrik Brueckner @ 2017-12-07  8:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, linux-kernel, linux-perf-users, Hendrik Brueckner,
	Heiko Carstens, Martin Schwidefsky, Thomas Richter,
	linux s390 list, Arnaldo Carvalho de Melo

Hi Arnaldo,

On Wed, Dec 06, 2017 at 11:40:57AM -0300, Arnaldo Carvalho de Melo wrote:
> From: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
> 
> On s390, object files must be compiled with position-indepedent code in
> order to be incrementally linked or linked to shared libraries.
> Therefore, add -fPIC to the CFLAGS for s390 to ensure each object file
> is built properly.
> 
> Reported-by: Jonathan Hermann <jonathan.hermann@de.ibm.com>
> Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
> Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
> Cc: linux s390 list <linux-s390@vger.kernel.org>
> LPU-Reference: 1512031765-9382-1-git-send-email-brueckner@linux.vnet.ibm.com
> Link: https://lkml.kernel.org/n/tip-a8wga8hrl0d0r84cal96fmgv@git.kernel.org
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
>  tools/perf/Makefile.config | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
> index 2c437baf8364..bf86c09ca889 100644
> --- a/tools/perf/Makefile.config
> +++ b/tools/perf/Makefile.config
> @@ -41,6 +41,7 @@ ifeq ($(SRCARCH),x86)
>      LIBUNWIND_LIBS = -lunwind-x86 -llzma -lunwind
>    endif
>    NO_PERF_REGS := 0
> +  CFLAGS += -fPIC
>  endif

I was just rebasing my syscall table work on top of your perf/core tree.
It looks like that there is a significant difference compared to linux
master tree.

With f704ef44602fbf403e6722c7ed13f62d17e8cb20 ("s390/perf: add support for
perf_regs and libdw"), Heiko introduced a change in the Makefile.config:

	--- a/tools/perf/Makefile.config
	+++ b/tools/perf/Makefile.config
	@@ -53,6 +53,10 @@ ifeq ($(SRCARCH),arm64)
	   LIBUNWIND_LIBS = -lunwind -lunwind-aarch64
	    endif
	     
	+ifeq ($(ARCH),s390)
	+  NO_PERF_REGS := 0
	+endif
	+
	ifeq ($(NO_PERF_REGS),0)

The CFLAGS actually should applied to the s390 block and not in the x86
block.  Somehow this got messed up with git cherryp-pick / am.  So actually,
this should go into the section above:

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index ed65e82..0833d9f 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -55,6 +55,7 @@ endif
 
 ifeq ($(ARCH),s390)
   NO_PERF_REGS := 0
+  CFLAGS += -fPIC
 endif
 
 ifeq ($(NO_PERF_REGS),0)

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 11/36] tools build feature: Check if pthread_barrier_t is available
  2017-12-06 21:31   ` Philippe Ombredanne
@ 2017-12-07 11:24     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 46+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-12-07 11:24 UTC (permalink / raw)
  To: Philippe Ombredanne
  Cc: Ingo Molnar, LKML, linux-perf-users, Arnaldo Carvalho de Melo,
	Adrian Hunter, David Ahern, Davidlohr Bueso, James Yang,
	Jiri Olsa, Kim Phillips, Namhyung Kim, Wang Nan

Em Wed, Dec 06, 2017 at 10:31:50PM +0100, Philippe Ombredanne escreveu:
> On Wed, Dec 6, 2017 at 3:40 PM, Arnaldo Carvalho de Melo
> > +++ b/tools/build/feature/test-pthread-barrier.c
> > @@ -0,0 +1,12 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <stdint.h>
> > +#include <pthread.h>

> Arnaldo: for the use of SPDX ids in the hwol patch series, thank you very much.
> Acked-by: Philippe Ombredanne <pombredanne@nexb.com>

Thanks!

- Arnaldo

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [tip:perf/core] perf s390: Always build with -fPIC
  2017-12-07  8:09   ` Hendrik Brueckner
@ 2017-12-28 15:34     ` tip-bot for Hendrik Brueckner
  0 siblings, 0 replies; 46+ messages in thread
From: tip-bot for Hendrik Brueckner @ 2017-12-28 15:34 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, hpa, tglx, linux-kernel, mingo, tmricht, jonathan.hermann,
	heiko.carstens, brueckner, linux-s390, schwidefsky

Commit-ID:  a9a3f1d18a6c9ccf89728e23474645aa91e2f4f1
Gitweb:     https://git.kernel.org/tip/a9a3f1d18a6c9ccf89728e23474645aa91e2f4f1
Author:     Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
AuthorDate: Wed, 13 Dec 2017 17:46:54 -0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 27 Dec 2017 12:15:57 -0300

perf s390: Always build with -fPIC

On s390, object files must be compiled with position-indepedent code in
order to be incrementally linked or linked to shared libraries.

Therefore, add -fPIC to the CFLAGS for s390 to ensure each object file
is built properly.

Reported-by: Jonathan Hermann <jonathan.hermann@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux s390 list <linux-s390@vger.kernel.org>
Link: https://lkml.kernel.org/r/20171207080951.GC4889@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Makefile.config | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index eb6bd99..f050f38 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -58,7 +58,7 @@ endif
 ifeq ($(ARCH),s390)
   NO_PERF_REGS := 0
   NO_SYSCALL_TABLE := 0
-  CFLAGS += -I$(OUTPUT)arch/s390/include/generated
+  CFLAGS += -fPIC -I$(OUTPUT)arch/s390/include/generated
 endif
 
 ifeq ($(NO_PERF_REGS),0)

^ permalink raw reply related	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2017-12-28 15:36 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20171206144115.15097-1-acme@kernel.org>
2017-12-06 14:40 ` [PATCH 01/36] tools headers: Follow the upstream UAPI header version 100% differ from the kernel Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 02/36] perf test: Disable test cases 19 and 20 on s390x Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 03/36] perf record: Synthesize unit/scale/... in event update Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 04/36] perf record: Synthesize thread map and cpu map Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 05/36] perf script: Allow computing 'perf stat' style metrics Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 06/36] perf buildid-cache: Document for Node.js USDT Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 07/36] perf report: Fix -D output for user metadata events Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 08/36] perf intel-pt: Improve build messages for files that differ from the kernel Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 09/36] Documentation: Add Arnaldo Melo to list of enforcement statement endorsers Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 10/36] perf bench futex: Use cpumaps Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 11/36] tools build feature: Check if pthread_barrier_t is available Arnaldo Carvalho de Melo
2017-12-06 21:31   ` Philippe Ombredanne
2017-12-07 11:24     ` Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 12/36] perf bench futex: Sync waker threads Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 13/36] perf annotate: Fix unnecessary memory allocation for s390x Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 14/36] perf annotate: Fix objdump comment parsing for Intel mov dissassembly Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 15/36] perf rblist: Create rblist__exit() function Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 16/36] perf stat: Add rbtree node_delete op Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 17/36] perf thread_map: Add method to map all threads in the system Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 18/36] perf s390: Always build with -fPIC Arnaldo Carvalho de Melo
2017-12-07  8:09   ` Hendrik Brueckner
2017-12-28 15:34     ` [tip:perf/core] " tip-bot for Hendrik Brueckner
2017-12-06 14:40 ` [PATCH 19/36] perf pmu: Pass pmu as a parameter to get_cpuid_str() Arnaldo Carvalho de Melo
2017-12-06 14:40   ` Arnaldo Carvalho de Melo
2017-12-06 14:40 ` [PATCH 20/36] perf tools arm64: Add support for get_cpuid_str function Arnaldo Carvalho de Melo
2017-12-06 14:40   ` Arnaldo Carvalho de Melo
2017-12-06 14:40   ` Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 21/36] perf pmu: Add helper function is_pmu_core to detect PMU CORE devices Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 22/36] perf vendor events arm64: Add ThunderX2 implementation defined pmu core events Arnaldo Carvalho de Melo
2017-12-06 14:41   ` Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 23/36] perf pmu: Add check for valid cpuid in perf_pmu__find_map() Arnaldo Carvalho de Melo
2017-12-06 14:41   ` Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 24/36] perf tools: Fix up build in hardnened environments Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 25/36] perf evlist: Remove 'overwrite' parameter from perf_evlist__mmap Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 26/36] perf evlist: Remove 'overwrite' parameter from perf_evlist__mmap_ex Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 27/36] perf evlist: Remove evlist->overwrite Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 28/36] perf mmap: Remove overwrite from arguments list of perf_mmap__push Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 29/36] perf mmap: Remove overwrite and check_messup from mmap read Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 30/36] perf c2c: Add a tip about cacheline events Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 31/36] perf vendor events: Use more flexible pattern matching for CPU identification for mapfile.csv Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 32/36] x86/asm: Allow again using asm.h when building for the 'bpf' clang target Arnaldo Carvalho de Melo
2017-12-06 14:41   ` Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 33/36] perf report: Set browser mode right before setup_browser() Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 34/36] perf mmap: Fix perf backward recording Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 35/36] perf mmap: Don't discard prev in backward mode Arnaldo Carvalho de Melo
2017-12-06 14:41 ` [PATCH 36/36] perf tools: Rename 'backward' to 'overwrite' in evlist, mmap and record Arnaldo Carvalho de Melo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.