linux-kernel.vger.kernel.org archive mirror
* [PATCH v5 00/18] perf: add support for sampling taken branches
From: Stephane Eranian @ 2012-02-02 12:54 UTC
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

This patchset adds an important and useful new feature to
perf_events: branch stack sampling. In other words, the
ability to capture taken branches into each sample.

Statistical sampling of taken branches should not be confused
with branch tracing: not all branches are necessarily captured.

Sampling taken branches is important for basic block profiling,
statistical call graphs, and function call counts. Many of these
measurements can help drive a compiler optimizer.

The branch stack is a software abstraction which sits on top
of the PMU hardware. As such, it is not available on all
processors. For now, the patch provides the generic interface
and the Intel X86 implementation where it leverages the Last
Branch Record (LBR) feature (from Core2 to SandyBridge).

Branch stack sampling is supported for both per-thread and
system-wide modes.

It is possible to filter the type and privilege level of branches
to sample. The target of the branch is used to determine
the privilege level.

For each branch, the source and destination are captured. On
some hardware platforms, it may also be possible to extract
whether the branch was predicted and, in that case, this
information is exposed to end users as well.

The branch stack can record a variable number of taken
branches per sample. Those branches are always consecutive
in time. The number of branches captured depends on the
filtering and the underlying hardware. On Intel Nehalem
and later, up to 16 consecutive branches can be captured
per sample.

Branch sampling is always coupled with an event. It can
be any PMU event but it can't be a SW or tracepoint event.

Branch sampling is requested by setting a new sample_type
flag called PERF_SAMPLE_BRANCH_STACK.

To support branch filtering, we introduce a new field
in the perf_event_attr struct: branch_sample_type. We chose
NOT to overload the config1 and config2 fields because those
are related to the event encoding. Branch stack is a
separate feature which is combined with the event.

The branch_sample_type is a bitmask of possible filters.
The following filters are defined (more can be added):
- PERF_SAMPLE_BRANCH_ANY       : any control flow change
- PERF_SAMPLE_BRANCH_USER      : branches whose target is at the user level
- PERF_SAMPLE_BRANCH_KERNEL    : branches whose target is at the kernel level
- PERF_SAMPLE_BRANCH_HV        : branches whose target is at the hypervisor level
- PERF_SAMPLE_BRANCH_ANY_CALL  : call branches (incl. syscalls)
- PERF_SAMPLE_BRANCH_ANY_RETURN: return branches (incl. syscall returns)
- PERF_SAMPLE_BRANCH_IND_CALL  : indirect calls

It is possible to combine filters, e.g., IND_CALL|USER|KERNEL.

When the privilege level is not specified, the branch stack
inherits that of the associated event.
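
To make this concrete, here is a minimal user-space sketch of
requesting branch stack sampling (it assumes headers updated by
this patchset; the event, period, and filter combination are just
examples):

#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/* open a cycles sampling event that also captures user-level calls */
int open_branch_event(pid_t pid)
{
  struct perf_event_attr attr;

  memset(&attr, 0, sizeof(attr));
  attr.size          = sizeof(attr);
  attr.type          = PERF_TYPE_HARDWARE;
  attr.config        = PERF_COUNT_HW_CPU_CYCLES;
  attr.sample_period = 100000;
  attr.sample_type   = PERF_SAMPLE_IP | PERF_SAMPLE_BRANCH_STACK;
  attr.branch_sample_type = PERF_SAMPLE_BRANCH_ANY_CALL
                          | PERF_SAMPLE_BRANCH_USER;

  /* monitor pid on any CPU, no group leader, no flags */
  return syscall(__NR_perf_event_open, &attr, pid, -1, -1, 0);
}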

Some processors may not offer hardware branch filtering, e.g., Intel
Atom. Some may have HW filtering bugs (e.g., Nehalem). The Intel
X86 implementation in this patchset also provides a SW branch filter
which works on a best effort basis. It can compensate for the lack
of LBR filtering. But first and foremost, it helps work around LBR
filtering errata. The goal is to only capture the type of branches
requested by the user.

It is possible to combine branch stack sampling with PEBS on Intel
X86 processors. Depending on the precise_sampling mode, there are
certain filtering restrictions. When precise_sampling = 1, there
are no filtering restrictions. When precise_sampling > 1, only the
ANY|USER|KERNEL filters can be used. This comes from the fact that
the kernel uses the LBR to compensate for the PEBS off-by-1 skid
on the instruction pointer.
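
For instance, assuming the perf tool changes in this series, the
following is accepted because it requests ANY branches at the priv
level of the event:

   $ perf record -b any,u -e instructions:upp ....

whereas restricting the capture to calls is rejected when
precise_sampling > 1:

   $ perf record -b any_call,u -e instructions:upp ....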

To demonstrate how the perf_event branch stack sampling interface
works, the patchset also modifies perf record to capture taken
branches. Similarly perf report is enhanced to display a histogram
of taken branches.

I would like to thank Roberto Vitillo @ LBL for his work on the perf
tool for this.

Enough talking, let's take a simple example. Our trivial test program
goes like this:

#define N 1000000UL /* iteration count; any large value works */

void f2(void)
{}
void f3(void)
{}
void f1(unsigned long n)
{
  if (n & 1UL)
    f2();
  else
    f3();
}
int main(void)
{
  unsigned long i;

  for (i = 0; i < N; i++)
    f1(i);
  return 0;
}

$ perf record -b any branchy
$ perf report -b
# Events: 23K cycles
#
# Overhead  Source Symbol     Target Symbol
# ........  ................  ................

    18.13%  [.] f1            [.] main                          
    18.10%  [.] main          [.] main                          
    18.01%  [.] main          [.] f1                            
    15.69%  [.] f1            [.] f1                            
     9.11%  [.] f3            [.] f1                            
     6.78%  [.] f1            [.] f3                            
     6.74%  [.] f1            [.] f2                            
     6.71%  [.] f2            [.] f1                            

Of the total number of branches captured, 18.13% were from f1() -> main().

Let's make this clearer by filtering the user call branches only:

$ perf record -b any_call -e cycles:u branchy
$ perf report -b
# Events: 19K cycles
#
# Overhead  Source Symbol              Target Symbol
# ........  .........................  .........................
#
    52.50%  [.] main                   [.] f1                   
    23.99%  [.] f1                     [.] f3                   
    23.48%  [.] f1                     [.] f2                   
     0.03%  [.] _IO_default_xsputn     [.] _IO_new_file_overflow
     0.01%  [k] _start                 [k] __libc_start_main    

Now it is more obvious: 52% of all the captured branches were calls from main() -> f1().
The rest is split roughly 50/50 between f1() -> f2() and f1() -> f3(), which is expected
given that f1() dispatches based on odd vs. even values of n, which increases monotonically.


Here is a kernel example, where we want to sample indirect calls:
$ perf record -a -C 1 -b ind_call -e r1c4:k sleep 10 
$ perf report -b
#
# Overhead  Source Symbol               Target Symbol
# ........  ..........................  ..........................
#
    36.36%  [k] __delay                 [k] delay_tsc             
     9.09%  [k] ktime_get               [k] read_tsc              
     9.09%  [k] getnstimeofday          [k] read_tsc              
     9.09%  [k] notifier_call_chain     [k] tick_notify           
     4.55%  [k] cpuidle_idle_call       [k] intel_idle            
     4.55%  [k] cpuidle_idle_call       [k] menu_reflect          
     2.27%  [k] handle_irq              [k] handle_edge_irq       
     2.27%  [k] ack_apic_edge           [k] native_apic_mem_write 
     2.27%  [k] hpet_interrupt_handler  [k] hrtimer_interrupt     
     2.27%  [k] __run_hrtimer           [k] watchdog_timer_fn     
     2.27%  [k] enqueue_task            [k] enqueue_task_rt       
     2.27%  [k] try_to_wake_up          [k] select_task_rq_rt     
     2.27%  [k] do_timer                [k] read_tsc              

Due to HW limitations, branch filtering may be approximate on
Core and Atom processors. It is more accurate on Nehalem and
Westmere, and best on Sandy Bridge.

In version 2, we've updated the patch to tip/master (commit 5734857)
and incorporated the feedback from v1 concerning the anonymous
bitfield struct for perf_branch_entry and the handling of i386 ABI
binaries on a 64-bit host in the instruction decoder for the LBR SW
filter.

In version 3, we've updated to 3.2.0-tip. The Atom revision
check has been put into its own patch. We fixed a browser
issue in perf report, and fixed all the style issues as well.

In version 4, we've modified the branch stack API to add a missing
priv level: hypervisor. There is a new PERF_SAMPLE_BRANCH_HV; it is
not used on Intel X86. Thanks to khandual@linux.vnet.ibm.com for
pointing this out. We also fixed a compilation error on ARM.

In version 4, we also extend the patchset to include the changes
necessary for the perf tool to read perf.data files produced by
older perf_event ABI revisions. This patchset extends the ABI with
a new field in struct perf_event_attr. That struct is saved as is
in the perf.data file. Therefore, older perf.data files contain a
smaller perf_event_attr struct, yet perf must process them
transparently. That is not the case today: perf dies with
'incompatible file format'.

The patch solves this problem and, at the same time, decouples
endianness detection from the size of perf_event_attr. Endianness is
now detected via the signature (the first 8 bytes of the file). We
introduce a new signature (PERFILE2) whose layout in the file depends
on the endianness of the host that wrote it. Therefore, we can
detect the endianness dynamically by simply reading the first 8
bytes. The size of the perf_event_attr struct can then be processed
according to that endianness. The ambiguity of the size field acting
as both endianness marker and actual size is gone: we can now
distinguish an older ABI by its size without confusing it with an
endianness mismatch.
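
A sketch of the detection logic (illustrative code, not the exact
perf tool implementation):

#include <byteswap.h>
#include <stdint.h>
#include <string.h>

/* new 8-byte signature written at offset 0 of perf.data */
static const char perf_magic[] = "PERFILE2";

/* 0 = native endianness, 1 = byte-swapped file, -1 = unknown */
static int check_magic_endian(uint64_t magic)
{
  uint64_t ref;

  memcpy(&ref, perf_magic, sizeof(ref));

  if (magic == ref)
    return 0;            /* written on a same-endian host */
  if (magic == bswap_64(ref))
    return 1;            /* written on an opposite-endian host */
  return -1;             /* older ABI signature or not a perf.data file */
}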

In version 5, we fix the PEBS+LBR vs. BRANCH_STACK check in x86_pmu_hw_config.
We also changed the handling of PERF_SAMPLE_BRANCH_HV on X86. It is now ignored
instead of triggering an error. That enables: perf record -b any -e cycles,
without having to force a priv level on the branch type. We also fix an
uninitialized variable bug in the perf tool reported by reviewers. Thanks
to Anshuman Khandual for his comments.

Signed-off-by: Stephane Eranian <eranian@google.com>


Roberto Agostino Vitillo (3):
  perf: add code to support PERF_SAMPLE_BRANCH_STACK
  perf: add support for sampling taken branch to perf record
  perf: add support for taken branch sampling to perf report

Stephane Eranian (15):
  perf: add generic taken branch sampling support
  perf: add Intel LBR MSR definitions
  perf: add Intel X86 LBR sharing logic
  perf: sync branch stack sampling with X86 precise_sampling
  perf: add Intel X86 LBR mappings for PERF_SAMPLE_BRANCH filters
  perf: disable LBR support for older Intel Atom processors
  perf: implement PERF_SAMPLE_BRANCH for Intel X86
  perf: add LBR software filter support for Intel X86
  perf: disable PERF_SAMPLE_BRANCH_* when not supported
  perf: add hook to flush branch_stack on context switch
  perf: fix endianness detection in perf.data
  perf: add ABI reference sizes
  perf: enable reading of perf.data files from different ABI rev
  perf: fix bug print_event_desc()
  perf: make perf able to read file from older ABIs

 arch/alpha/kernel/perf_event.c             |    4 +
 arch/arm/kernel/perf_event.c               |    4 +
 arch/mips/kernel/perf_event_mipsxx.c       |    4 +
 arch/powerpc/kernel/perf_event.c           |    4 +
 arch/sh/kernel/perf_event.c                |    4 +
 arch/sparc/kernel/perf_event.c             |    4 +
 arch/x86/include/asm/msr-index.h           |    7 +
 arch/x86/kernel/cpu/perf_event.c           |   85 ++++-
 arch/x86/kernel/cpu/perf_event.h           |   19 +
 arch/x86/kernel/cpu/perf_event_amd.c       |    3 +
 arch/x86/kernel/cpu/perf_event_intel.c     |  120 +++++--
 arch/x86/kernel/cpu/perf_event_intel_ds.c  |   22 +-
 arch/x86/kernel/cpu/perf_event_intel_lbr.c |  526 ++++++++++++++++++++++++++--
 include/linux/perf_event.h                 |   82 ++++-
 kernel/events/core.c                       |  177 ++++++++++
 kernel/events/hw_breakpoint.c              |    6 +
 tools/perf/Documentation/perf-record.txt   |   25 ++
 tools/perf/Documentation/perf-report.txt   |    7 +
 tools/perf/builtin-record.c                |   74 ++++
 tools/perf/builtin-report.c                |   98 +++++-
 tools/perf/perf.h                          |   18 +
 tools/perf/util/annotate.c                 |    2 +-
 tools/perf/util/event.h                    |    1 +
 tools/perf/util/evsel.c                    |   14 +
 tools/perf/util/header.c                   |  230 +++++++++++--
 tools/perf/util/hist.c                     |   93 ++++-
 tools/perf/util/hist.h                     |    7 +
 tools/perf/util/session.c                  |   72 ++++
 tools/perf/util/session.h                  |    4 +
 tools/perf/util/sort.c                     |  362 ++++++++++++++-----
 tools/perf/util/sort.h                     |    5 +
 tools/perf/util/symbol.h                   |   13 +
 32 files changed, 1866 insertions(+), 230 deletions(-)

-- 
1.7.4.1



* [PATCH v5 01/18] perf: add generic taken branch sampling support
From: Stephane Eranian @ 2012-02-02 12:54 UTC
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

This patch adds the ability to sample taken branches to the
perf_event interface.

The ability to capture taken branches is very useful for all
sorts of analyses, for instance basic block profiling, call
counts, and statistical call graphs.

This new capability requires hardware assist and as such may
not be available on all HW platforms. On Intel X86, it is
implemented on top of the Last Branch Record (LBR) facility.

To enable taken branch sampling, the PERF_SAMPLE_BRANCH_STACK
bit must be set in attr->sample_type.

Sampled taken branches may be filtered by type and/or priv
levels.

The patch adds a new field, called branch_sample_type, to the
perf_event_attr structure. It contains a bitmask of filters
to apply to the sampled taken branches.

Filters may be implemented in HW. If the HW filter does not exist
or is not good enough, some arch may also implement a SW filter.

The following generic filters are currently defined:
- PERF_SAMPLE_BRANCH_USER
  only branches whose targets are at the user level

- PERF_SAMPLE_BRANCH_KERNEL
  only branches whose targets are at the kernel level

- PERF_SAMPLE_BRANCH_HV
  only branches whose targets are at the hypervisor level

- PERF_SAMPLE_BRANCH_ANY
  any type of branches (subject to priv level filters)

- PERF_SAMPLE_BRANCH_ANY_CALL
  any call branches (may incl. syscalls on some arch)

- PERF_SAMPLE_BRANCH_ANY_RETURN
  any return branches (may incl. syscall returns on some arch)

- PERF_SAMPLE_BRANCH_IND_CALL
  indirect call branches

Obviously, filters may be combined. The priv level bits are
optional; if not provided, the priv level of the associated event
is used. It is possible to collect branches at a priv level
different from that of the associated event. Use of the kernel and
hv priv levels is subject to permission checks and availability (hv).

The number of taken branch records present in each sample may vary
based on HW, the type of sampled branches, and the executed code.
Therefore, each sample records how many taken branch entries it
contains.
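
Concretely, the PERF_SAMPLE_BRANCH_STACK portion of a sample is a
u64 count followed by that many entries. A sketch of how a consumer
could walk it (buffer management omitted; assumes the perf_event.h
from this patchset):

#include <stdio.h>
#include <linux/types.h>
#include <linux/perf_event.h>  /* struct perf_branch_entry */

/* p points at the PERF_SAMPLE_BRANCH_STACK portion of a sample */
static void walk_branch_stack(const void *p)
{
  const __u64 *nr = p;
  const struct perf_branch_entry *e = (const void *)(nr + 1);
  __u64 i;

  /* entries[0] is the most recent taken branch */
  for (i = 0; i < *nr; i++)
    printf("from=%#llx to=%#llx mispred=%u\n",
           (unsigned long long)e[i].from,
           (unsigned long long)e[i].to,
           (unsigned int)e[i].mispred);
}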

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 arch/x86/kernel/cpu/perf_event_intel_lbr.c |   21 +++++---
 include/linux/perf_event.h                 |   71 ++++++++++++++++++++++++++--
 kernel/events/core.c                       |   68 ++++++++++++++++++++++++++
 3 files changed, 148 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index 3fab3de..c3f8100 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -144,9 +144,11 @@ static void intel_pmu_lbr_read_32(struct cpu_hw_events *cpuc)
 
 		rdmsrl(x86_pmu.lbr_from + lbr_idx, msr_lastbranch.lbr);
 
-		cpuc->lbr_entries[i].from  = msr_lastbranch.from;
-		cpuc->lbr_entries[i].to    = msr_lastbranch.to;
-		cpuc->lbr_entries[i].flags = 0;
+		cpuc->lbr_entries[i].from	= msr_lastbranch.from;
+		cpuc->lbr_entries[i].to		= msr_lastbranch.to;
+		cpuc->lbr_entries[i].mispred	= 0;
+		cpuc->lbr_entries[i].predicted	= 0;
+		cpuc->lbr_entries[i].reserved	= 0;
 	}
 	cpuc->lbr_stack.nr = i;
 }
@@ -167,19 +169,22 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
 
 	for (i = 0; i < x86_pmu.lbr_nr; i++) {
 		unsigned long lbr_idx = (tos - i) & mask;
-		u64 from, to, flags = 0;
+		u64 from, to, mis = 0, pred = 0;
 
 		rdmsrl(x86_pmu.lbr_from + lbr_idx, from);
 		rdmsrl(x86_pmu.lbr_to   + lbr_idx, to);
 
 		if (lbr_format == LBR_FORMAT_EIP_FLAGS) {
-			flags = !!(from & LBR_FROM_FLAG_MISPRED);
+			mis = !!(from & LBR_FROM_FLAG_MISPRED);
+			pred = !mis;
 			from = (u64)((((s64)from) << 1) >> 1);
 		}
 
-		cpuc->lbr_entries[i].from  = from;
-		cpuc->lbr_entries[i].to    = to;
-		cpuc->lbr_entries[i].flags = flags;
+		cpuc->lbr_entries[i].from	= from;
+		cpuc->lbr_entries[i].to		= to;
+		cpuc->lbr_entries[i].mispred	= mis;
+		cpuc->lbr_entries[i].predicted	= pred;
+		cpuc->lbr_entries[i].reserved	= 0;
 	}
 	cpuc->lbr_stack.nr = i;
 }
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 412b790..71b0232 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -129,11 +129,40 @@ enum perf_event_sample_format {
 	PERF_SAMPLE_PERIOD			= 1U << 8,
 	PERF_SAMPLE_STREAM_ID			= 1U << 9,
 	PERF_SAMPLE_RAW				= 1U << 10,
+	PERF_SAMPLE_BRANCH_STACK		= 1U << 11,
 
-	PERF_SAMPLE_MAX = 1U << 11,		/* non-ABI */
+	PERF_SAMPLE_MAX = 1U << 12,		/* non-ABI */
 };
 
 /*
+ * values to program into branch_sample_type when PERF_SAMPLE_BRANCH is set
+ *
+ * If the user does not pass priv level information via branch_sample_type,
+ * the kernel uses the event's priv level. Branch and event priv levels do
+ * not have to match. Branch priv level is checked for permissions.
+ *
+ * The branch types can be combined, however BRANCH_ANY covers all types
+ * of branches and therefore it supersedes all the other types.
+ */
+enum perf_branch_sample_type {
+	PERF_SAMPLE_BRANCH_USER		= 1U << 0, /* user branches */
+	PERF_SAMPLE_BRANCH_KERNEL	= 1U << 1, /* kernel branches */
+	PERF_SAMPLE_BRANCH_HV		= 1U << 2, /* hypervisor branches */
+
+	PERF_SAMPLE_BRANCH_ANY		= 1U << 3, /* any branch types */
+	PERF_SAMPLE_BRANCH_ANY_CALL	= 1U << 4, /* any call branch */
+	PERF_SAMPLE_BRANCH_ANY_RETURN	= 1U << 5, /* any return branch */
+	PERF_SAMPLE_BRANCH_IND_CALL	= 1U << 6, /* indirect calls */
+
+	PERF_SAMPLE_BRANCH_MAX		= 1U << 7, /* non-ABI */
+};
+
+#define PERF_SAMPLE_BRANCH_PLM_ALL \
+	(PERF_SAMPLE_BRANCH_USER|\
+	 PERF_SAMPLE_BRANCH_KERNEL|\
+	 PERF_SAMPLE_BRANCH_HV)
+
+/*
  * The format of the data returned by read() on a perf event fd,
  * as specified by attr.read_format:
  *
@@ -240,6 +269,7 @@ struct perf_event_attr {
 		__u64		bp_len;
 		__u64		config2; /* extension of config1 */
 	};
+	__u64	branch_sample_type; /* enum branch_sample_type */
 };
 
 /*
@@ -458,6 +488,8 @@ enum perf_event_type {
 	 *
 	 *	{ u32			size;
 	 *	  char                  data[size];}&& PERF_SAMPLE_RAW
+	 *
+	 *	{ u64 from, to, flags } lbr[nr];} && PERF_SAMPLE_BRANCH_STACK
 	 * };
 	 */
 	PERF_RECORD_SAMPLE			= 9,
@@ -530,12 +562,34 @@ struct perf_raw_record {
 	void				*data;
 };
 
+/*
+ * single taken branch record layout:
+ *
+ *      from: source instruction (may not always be a branch insn)
+ *        to: branch target
+ *   mispred: branch target was mispredicted
+ * predicted: branch target was predicted
+ *
+ * support for mispred, predicted is optional. In case it
+ * is not supported mispred = predicted = 0.
+ */
 struct perf_branch_entry {
-	__u64				from;
-	__u64				to;
-	__u64				flags;
+	__u64	from;
+	__u64	to;
+	__u64	mispred:1,  /* target mispredicted */
+		predicted:1,/* target predicted */
+		reserved:62;
 };
 
+/*
+ * branch stack layout:
+ *  nr: number of taken branches stored in entries[]
+ *
+ * Note that nr can vary from sample to sample
+ * branches (to, from) are stored from most recent
+ * to least recent, i.e., entries[0] contains the most
+ * recent branch.
+ */
 struct perf_branch_stack {
 	__u64				nr;
 	struct perf_branch_entry	entries[0];
@@ -566,7 +620,9 @@ struct hw_perf_event {
 			unsigned long	event_base;
 			int		idx;
 			int		last_cpu;
+
 			struct hw_perf_event_extra extra_reg;
+			struct hw_perf_event_extra branch_reg;
 		};
 		struct { /* software */
 			struct hrtimer	hrtimer;
@@ -1004,12 +1060,14 @@ struct perf_sample_data {
 	u64				period;
 	struct perf_callchain_entry	*callchain;
 	struct perf_raw_record		*raw;
+	struct perf_branch_stack	*br_stack;
 };
 
 static inline void perf_sample_data_init(struct perf_sample_data *data, u64 addr)
 {
 	data->addr = addr;
 	data->raw  = NULL;
+	data->br_stack = NULL;
 }
 
 extern void perf_output_sample(struct perf_output_handle *handle,
@@ -1148,6 +1206,11 @@ extern void perf_bp_event(struct perf_event *event, void *data);
 # define perf_instruction_pointer(regs)	instruction_pointer(regs)
 #endif
 
+static inline bool has_branch_stack(struct perf_event *event)
+{
+	return event->attr.sample_type & PERF_SAMPLE_BRANCH_STACK;
+}
+
 extern int perf_output_begin(struct perf_output_handle *handle,
 			     struct perf_event *event, unsigned int size);
 extern void perf_output_end(struct perf_output_handle *handle);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7c3b9de..a268e45 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -118,6 +118,13 @@ static int cpu_function_call(int cpu, int (*func) (void *info), void *info)
 		       PERF_FLAG_FD_OUTPUT  |\
 		       PERF_FLAG_PID_CGROUP)
 
+/*
+ * branch priv levels that need permission checks
+ */
+#define PERF_SAMPLE_BRANCH_PERM_PLM \
+	(PERF_SAMPLE_BRANCH_KERNEL |\
+	 PERF_SAMPLE_BRANCH_HV)
+
 enum event_type_t {
 	EVENT_FLEXIBLE = 0x1,
 	EVENT_PINNED = 0x2,
@@ -3898,6 +3905,24 @@ void perf_output_sample(struct perf_output_handle *handle,
 			}
 		}
 	}
+
+	if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
+		if (data->br_stack) {
+			size_t size;
+
+			size = data->br_stack->nr
+			     * sizeof(struct perf_branch_entry);
+
+			perf_output_put(handle, data->br_stack->nr);
+			perf_output_copy(handle, data->br_stack->entries, size);
+		} else {
+			/*
+			 * we always store at least the value of nr
+			 */
+			u64 nr = 0;
+			perf_output_put(handle, nr);
+		}
+	}
 }
 
 void perf_prepare_sample(struct perf_event_header *header,
@@ -3940,6 +3965,15 @@ void perf_prepare_sample(struct perf_event_header *header,
 		WARN_ON_ONCE(size & (sizeof(u64)-1));
 		header->size += size;
 	}
+
+	if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
+		int size = sizeof(u64); /* nr */
+		if (data->br_stack) {
+			size += data->br_stack->nr
+			      * sizeof(struct perf_branch_entry);
+		}
+		header->size += size;
+	}
 }
 
 static void perf_event_output(struct perf_event *event,
@@ -5926,6 +5960,40 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
 	if (attr->read_format & ~(PERF_FORMAT_MAX-1))
 		return -EINVAL;
 
+	if (attr->sample_type & PERF_SAMPLE_BRANCH_STACK) {
+		u64 mask = attr->branch_sample_type;
+
+		/* only using defined bits */
+		if (mask & ~(PERF_SAMPLE_BRANCH_MAX-1))
+			return -EINVAL;
+
+		/* at least one branch bit must be set */
+		if (!(mask & ~PERF_SAMPLE_BRANCH_PLM_ALL))
+			return -EINVAL;
+
+		/* kernel level capture: check permissions */
+		if ((mask & PERF_SAMPLE_BRANCH_PERM_PLM)
+		    && perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))
+			return -EACCES;
+
+		/* propagate priv level, when not set for branch */
+		if (!(mask & PERF_SAMPLE_BRANCH_PLM_ALL)) {
+
+			/* exclude_kernel checked on syscall entry */
+			if (!attr->exclude_kernel)
+				mask |= PERF_SAMPLE_BRANCH_KERNEL;
+
+			if (!attr->exclude_user)
+				mask |= PERF_SAMPLE_BRANCH_USER;
+
+			if (!attr->exclude_hv)
+				mask |= PERF_SAMPLE_BRANCH_HV;
+			/*
+			 * adjust user setting (for HW filter setup)
+			 */
+			attr->branch_sample_type = mask;
+		}
+	}
 out:
 	return ret;
 
-- 
1.7.4.1



* [PATCH v5 02/18] perf: add Intel LBR MSR definitions
From: Stephane Eranian @ 2012-02-02 12:54 UTC
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

This patch adds the LBR definitions for NHM/WSM/SNB and Core.
It also adds the definitions for the architected LBR MSRs:
LBR_SELECT and LBR_TOS.
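
These MSRs form a small hardware ring buffer: LBR_TOS holds the
index of the most recent entry, and the FROM/TO registers are
addressed consecutively from their base. A simplified sketch of the
read loop (mirroring intel_pmu_lbr_read_64() from the previous
patch):

	u64 tos, from, to;
	unsigned long mask = x86_pmu.lbr_nr - 1; /* lbr_nr is a power of 2 */
	int i;

	rdmsrl(x86_pmu.lbr_tos, tos);                /* MSR_LBR_TOS */

	for (i = 0; i < x86_pmu.lbr_nr; i++) {
		unsigned long idx = (tos - i) & mask;    /* newest to oldest */

		rdmsrl(x86_pmu.lbr_from + idx, from);    /* e.g. MSR_LBR_NHM_FROM + idx */
		rdmsrl(x86_pmu.lbr_to   + idx, to);      /* e.g. MSR_LBR_NHM_TO   + idx */
	}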

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 arch/x86/include/asm/msr-index.h           |    7 +++++++
 arch/x86/kernel/cpu/perf_event_intel_lbr.c |   18 +++++++++---------
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index a6962d9..ccb8059 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -56,6 +56,13 @@
 #define MSR_OFFCORE_RSP_0		0x000001a6
 #define MSR_OFFCORE_RSP_1		0x000001a7
 
+#define MSR_LBR_SELECT			0x000001c8
+#define MSR_LBR_TOS			0x000001c9
+#define MSR_LBR_NHM_FROM		0x00000680
+#define MSR_LBR_NHM_TO			0x000006c0
+#define MSR_LBR_CORE_FROM		0x00000040
+#define MSR_LBR_CORE_TO			0x00000060
+
 #define MSR_IA32_PEBS_ENABLE		0x000003f1
 #define MSR_IA32_DS_AREA		0x00000600
 #define MSR_IA32_PERF_CAPABILITIES	0x00000345
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index c3f8100..e14431f 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -205,23 +205,23 @@ void intel_pmu_lbr_read(void)
 void intel_pmu_lbr_init_core(void)
 {
 	x86_pmu.lbr_nr     = 4;
-	x86_pmu.lbr_tos    = 0x01c9;
-	x86_pmu.lbr_from   = 0x40;
-	x86_pmu.lbr_to     = 0x60;
+	x86_pmu.lbr_tos    = MSR_LBR_TOS;
+	x86_pmu.lbr_from   = MSR_LBR_CORE_FROM;
+	x86_pmu.lbr_to     = MSR_LBR_CORE_TO;
 }
 
 void intel_pmu_lbr_init_nhm(void)
 {
 	x86_pmu.lbr_nr     = 16;
-	x86_pmu.lbr_tos    = 0x01c9;
-	x86_pmu.lbr_from   = 0x680;
-	x86_pmu.lbr_to     = 0x6c0;
+	x86_pmu.lbr_tos    = MSR_LBR_TOS;
+	x86_pmu.lbr_from   = MSR_LBR_NHM_FROM;
+	x86_pmu.lbr_to     = MSR_LBR_NHM_TO;
 }
 
 void intel_pmu_lbr_init_atom(void)
 {
 	x86_pmu.lbr_nr	   = 8;
-	x86_pmu.lbr_tos    = 0x01c9;
-	x86_pmu.lbr_from   = 0x40;
-	x86_pmu.lbr_to     = 0x60;
+	x86_pmu.lbr_tos    = MSR_LBR_TOS;
+	x86_pmu.lbr_from   = MSR_LBR_CORE_FROM;
+	x86_pmu.lbr_to     = MSR_LBR_CORE_TO;
 }
-- 
1.7.4.1



* [PATCH v5 03/18] perf: add Intel X86 LBR sharing logic
From: Stephane Eranian @ 2012-02-02 12:54 UTC
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

The Intel LBR on some recent processors is capable of filtering
branches by type. The filter is configurable via the LBR_SELECT MSR.

There are limitations on how this register can be used.

On Nehalem/Westmere, the LBR_SELECT is shared by the two HT threads
when HT is on. It is private to each core when HT is off.

On SandyBridge, the LBR_SELECT register is private to each thread
when HT is on. It is private to each core when HT is off.

The kernel must manage the sharing of LBR_SELECT. It allows
multiple users on the same logical CPU to use LBR_SELECT as
long as they program it with the same value. Across sibling
CPUs (HT threads), the same restriction applies on NHM/WSM.

This patch implements this sharing logic by leveraging the
mechanism put in place for managing the offcore_response
shared MSR.

We modify __intel_shared_reg_get_constraints() to cause
x86_get_event_constraints() to be called, because the LBR may be
associated with events that are counter-constrained.
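
The per-register sharing rule, simplified from
__intel_shared_reg_get_constraints(), is:

	era = &cpuc->shared_regs->regs[reg->idx];   /* e.g. the EXTRA_REG_LBR slot */

	raw_spin_lock_irqsave(&era->lock, flags);

	if (!atomic_read(&era->ref) || era->config == reg->config) {
		/* free, or already programmed with the same value: share it */
		era->config = reg->config;
		atomic_inc(&era->ref);
		reg->alloc = 1;
		c = NULL;   /* fall through to x86_get_event_constraints() */
	}
	/* otherwise the event cannot be scheduled: return emptyconstraint */

	raw_spin_unlock_irqrestore(&era->lock, flags);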

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 arch/x86/kernel/cpu/perf_event.c       |    4 ++
 arch/x86/kernel/cpu/perf_event.h       |    4 ++
 arch/x86/kernel/cpu/perf_event_intel.c |   70 ++++++++++++++++++++------------
 3 files changed, 52 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index f8bddb5..3779313 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -426,6 +426,10 @@ static int __x86_pmu_event_init(struct perf_event *event)
 	/* mark unused */
 	event->hw.extra_reg.idx = EXTRA_REG_NONE;
 
+	/* mark not used */
+	event->hw.extra_reg.idx = EXTRA_REG_NONE;
+	event->hw.branch_reg.idx = EXTRA_REG_NONE;
+
 	return x86_pmu.hw_config(event);
 }
 
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index 513d617..4535ada 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -33,6 +33,7 @@ enum extra_reg_type {
 
 	EXTRA_REG_RSP_0 = 0,	/* offcore_response_0 */
 	EXTRA_REG_RSP_1 = 1,	/* offcore_response_1 */
+	EXTRA_REG_LBR   = 2,	/* lbr_select */
 
 	EXTRA_REG_MAX		/* number of entries needed */
 };
@@ -130,6 +131,7 @@ struct cpu_hw_events {
 	void				*lbr_context;
 	struct perf_branch_stack	lbr_stack;
 	struct perf_branch_entry	lbr_entries[MAX_LBR_ENTRIES];
+	struct er_account		*lbr_sel;
 
 	/*
 	 * Intel host/guest exclude bits
@@ -340,6 +342,8 @@ struct x86_pmu {
 	 */
 	unsigned long	lbr_tos, lbr_from, lbr_to; /* MSR base regs       */
 	int		lbr_nr;			   /* hardware stack size */
+	u64		lbr_sel_mask;		   /* LBR_SELECT valid bits */
+	const int	*lbr_sel_map;		   /* lbr_select mappings */
 
 	/*
 	 * Extra registers for events
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 3bd37bd..97f7bb5 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1123,17 +1123,17 @@ static bool intel_try_alt_er(struct perf_event *event, int orig_idx)
  */
 static struct event_constraint *
 __intel_shared_reg_get_constraints(struct cpu_hw_events *cpuc,
-				   struct perf_event *event)
+				   struct perf_event *event,
+				   struct hw_perf_event_extra *reg)
 {
 	struct event_constraint *c = &emptyconstraint;
-	struct hw_perf_event_extra *reg = &event->hw.extra_reg;
 	struct er_account *era;
 	unsigned long flags;
 	int orig_idx = reg->idx;
 
 	/* already allocated shared msr */
 	if (reg->alloc)
-		return &unconstrained;
+		return NULL; /* call x86_get_event_constraint() */
 
 again:
 	era = &cpuc->shared_regs->regs[reg->idx];
@@ -1156,14 +1156,10 @@ __intel_shared_reg_get_constraints(struct cpu_hw_events *cpuc,
 		reg->alloc = 1;
 
 		/*
-		 * All events using extra_reg are unconstrained.
-		 * Avoids calling x86_get_event_constraints()
-		 *
-		 * Must revisit if extra_reg controlling events
-		 * ever have constraints. Worst case we go through
-		 * the regular event constraint table.
+		 * need to call x86_get_event_constraint()
+		 * to check if associated event has constraints
 		 */
-		c = &unconstrained;
+		c = NULL;
 	} else if (intel_try_alt_er(event, orig_idx)) {
 		raw_spin_unlock_irqrestore(&era->lock, flags);
 		goto again;
@@ -1200,11 +1196,23 @@ static struct event_constraint *
 intel_shared_regs_constraints(struct cpu_hw_events *cpuc,
 			      struct perf_event *event)
 {
-	struct event_constraint *c = NULL;
-
-	if (event->hw.extra_reg.idx != EXTRA_REG_NONE)
-		c = __intel_shared_reg_get_constraints(cpuc, event);
-
+	struct event_constraint *c = NULL, *d;
+	struct hw_perf_event_extra *xreg, *breg;
+
+	xreg = &event->hw.extra_reg;
+	if (xreg->idx != EXTRA_REG_NONE) {
+		c = __intel_shared_reg_get_constraints(cpuc, event, xreg);
+		if (c == &emptyconstraint)
+			return c;
+	}
+	breg = &event->hw.branch_reg;
+	if (breg->idx != EXTRA_REG_NONE) {
+		d = __intel_shared_reg_get_constraints(cpuc, event, breg);
+		if (d == &emptyconstraint) {
+			__intel_shared_reg_put_constraints(cpuc, xreg);
+			c = d;
+		}
+	}
 	return c;
 }
 
@@ -1252,6 +1260,10 @@ intel_put_shared_regs_event_constraints(struct cpu_hw_events *cpuc,
 	reg = &event->hw.extra_reg;
 	if (reg->idx != EXTRA_REG_NONE)
 		__intel_shared_reg_put_constraints(cpuc, reg);
+
+	reg = &event->hw.branch_reg;
+	if (reg->idx != EXTRA_REG_NONE)
+		__intel_shared_reg_put_constraints(cpuc, reg);
 }
 
 static void intel_put_event_constraints(struct cpu_hw_events *cpuc,
@@ -1431,7 +1443,7 @@ static int intel_pmu_cpu_prepare(int cpu)
 {
 	struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
 
-	if (!x86_pmu.extra_regs)
+	if (!(x86_pmu.extra_regs || x86_pmu.lbr_sel_map))
 		return NOTIFY_OK;
 
 	cpuc->shared_regs = allocate_shared_regs(cpu);
@@ -1453,22 +1465,28 @@ static void intel_pmu_cpu_starting(int cpu)
 	 */
 	intel_pmu_lbr_reset();
 
-	if (!cpuc->shared_regs || (x86_pmu.er_flags & ERF_NO_HT_SHARING))
+	cpuc->lbr_sel = NULL;
+
+	if (!cpuc->shared_regs)
 		return;
 
-	for_each_cpu(i, topology_thread_cpumask(cpu)) {
-		struct intel_shared_regs *pc;
+	if (!(x86_pmu.er_flags & ERF_NO_HT_SHARING)) {
+		for_each_cpu(i, topology_thread_cpumask(cpu)) {
+			struct intel_shared_regs *pc;
 
-		pc = per_cpu(cpu_hw_events, i).shared_regs;
-		if (pc && pc->core_id == core_id) {
-			cpuc->kfree_on_online = cpuc->shared_regs;
-			cpuc->shared_regs = pc;
-			break;
+			pc = per_cpu(cpu_hw_events, i).shared_regs;
+			if (pc && pc->core_id == core_id) {
+				cpuc->kfree_on_online = cpuc->shared_regs;
+				cpuc->shared_regs = pc;
+				break;
+			}
 		}
+		cpuc->shared_regs->core_id = core_id;
+		cpuc->shared_regs->refcnt++;
 	}
 
-	cpuc->shared_regs->core_id = core_id;
-	cpuc->shared_regs->refcnt++;
+	if (x86_pmu.lbr_sel_map)
+		cpuc->lbr_sel = &cpuc->shared_regs->regs[EXTRA_REG_LBR];
 }
 
 static void intel_pmu_cpu_dying(int cpu)
-- 
1.7.4.1



* [PATCH v5 04/18] perf: sync branch stack sampling with X86 precise_sampling
From: Stephane Eranian @ 2012-02-02 12:54 UTC
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

If precise sampling is enabled on Intel X86, then perf_event uses PEBS.
To correct for the off-by-1 skid of PEBS on the instruction pointer,
perf_event uses the LBR when precise_ip > 1.

On Intel X86, PERF_SAMPLE_BRANCH_STACK is implemented using LBR,
therefore both features must be coordinated as they may not
configure LBR the same way.

For PEBS, LBR needs to capture all branches at the priv level of
the associated event.

This patch checks that the branch type and priv level requested via
BRANCH_STACK are compatible with the PEBS LBR requirement, thereby
allowing:

   $ perf record -b any,u -e instructions:upp ....

but

   $ perf record -b any_call,u -e instructions:upp

is not possible.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 arch/x86/kernel/cpu/perf_event.c |   60 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 60 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 3779313..71e0264 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -353,6 +353,36 @@ int x86_setup_perfctr(struct perf_event *event)
 	return 0;
 }
 
+/*
+ * check that branch_sample_type is compatible with
+ * settings needed for precise_ip > 1 which implies
+ * using the LBR to capture ALL taken branches at the
+ * priv levels of the measurement
+ */
+static inline int precise_br_compat(struct perf_event *event)
+{
+	u64 m = event->attr.branch_sample_type;
+	u64 b = 0;
+
+	/* must capture all branches */
+	if (!(m & PERF_SAMPLE_BRANCH_ANY))
+		return 0;
+
+	m &= PERF_SAMPLE_BRANCH_KERNEL | PERF_SAMPLE_BRANCH_USER;
+
+	if (!event->attr.exclude_user)
+		b |= PERF_SAMPLE_BRANCH_USER;
+
+	if (!event->attr.exclude_kernel)
+		b |= PERF_SAMPLE_BRANCH_KERNEL;
+
+	/*
+	 * ignore PERF_SAMPLE_BRANCH_HV, not supported on X86
+	 */
+
+	return m == b;
+}
+
 int x86_pmu_hw_config(struct perf_event *event)
 {
 	if (event->attr.precise_ip) {
@@ -369,6 +399,36 @@ int x86_pmu_hw_config(struct perf_event *event)
 
 		if (event->attr.precise_ip > precise)
 			return -EOPNOTSUPP;
+		/*
+		 * check that PEBS LBR correction does not conflict with
+		 * whatever the user is asking with attr->branch_sample_type
+		 */
+		if (event->attr.precise_ip > 1) {
+			u64 *br_type = &event->attr.branch_sample_type;
+
+			if (has_branch_stack(event)) {
+				if (!precise_br_compat(event))
+					return -EOPNOTSUPP;
+
+				/* branch_sample_type is compatible */
+
+			} else {
+				/*
+				 * user did not specify  branch_sample_type
+				 *
+				 * For PEBS fixups, we capture all
+				 * the branches at the priv level of the
+				 * event.
+				 */
+				*br_type = PERF_SAMPLE_BRANCH_ANY;
+
+				if (!event->attr.exclude_user)
+					*br_type |= PERF_SAMPLE_BRANCH_USER;
+
+				if (!event->attr.exclude_kernel)
+					*br_type |= PERF_SAMPLE_BRANCH_KERNEL;
+			}
+		}
 	}
 
 	/*
-- 
1.7.4.1



* [PATCH v5 05/18] perf: add Intel X86 LBR mappings for PERF_SAMPLE_BRANCH filters
From: Stephane Eranian @ 2012-02-02 12:54 UTC
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

This patch adds the mappings from the generic PERF_SAMPLE_BRANCH_*
filters to the actual Intel X86 LBR filters, whenever they exist.
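
For example, on Nehalem/Westmere a PERF_SAMPLE_BRANCH_ANY_CALL
request must also include the jump bits to work around an erratum,
and since LBR_SELECT operates in suppress mode, the final register
value is the complement of the accumulated mask (a sketch based on
the tables below and the setup code added later in this series):

	/* from nhm_lbr_sel_map[PERF_SAMPLE_BRANCH_ANY_CALL] */
	u64 mask = LBR_REL_CALL | LBR_IND_CALL | LBR_REL_JMP
		 | LBR_IND_JMP  | LBR_FAR;

	/* LBR_SELECT bits *suppress* capture, so invert the mask */
	u64 lbr_select = ~mask & LBR_SEL_MASK; /* priv level bits handled likewise */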

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 arch/x86/kernel/cpu/perf_event.h           |    2 +
 arch/x86/kernel/cpu/perf_event_intel.c     |    2 +-
 arch/x86/kernel/cpu/perf_event_intel_lbr.c |  103 +++++++++++++++++++++++++++-
 3 files changed, 104 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index 4535ada..776fb5a 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -535,6 +535,8 @@ void intel_pmu_lbr_init_nhm(void);
 
 void intel_pmu_lbr_init_atom(void);
 
+void intel_pmu_lbr_init_snb(void);
+
 int p4_pmu_init(void);
 
 int p6_pmu_init(void);
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 97f7bb5..b0db016 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1757,7 +1757,7 @@ __init int intel_pmu_init(void)
 		memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
 		       sizeof(hw_cache_event_ids));
 
-		intel_pmu_lbr_init_nhm();
+		intel_pmu_lbr_init_snb();
 
 		x86_pmu.event_constraints = intel_snb_event_constraints;
 		x86_pmu.pebs_constraints = intel_snb_pebs_event_constraints;
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index e14431f..b4150dc 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -14,6 +14,49 @@ enum {
 };
 
 /*
+ * Intel LBR_SELECT bits
+ * Intel Vol3a, April 2011, Section 16.7 Table 16-10
+ *
+ * Hardware branch filter (not available on all CPUs)
+ */
+#define LBR_KERNEL_BIT		0 /* do not capture at ring0 */
+#define LBR_USER_BIT		1 /* do not capture at ring > 0 */
+#define LBR_JCC_BIT		2 /* do not capture conditional branches */
+#define LBR_REL_CALL_BIT	3 /* do not capture relative calls */
+#define LBR_IND_CALL_BIT	4 /* do not capture indirect calls */
+#define LBR_RETURN_BIT		5 /* do not capture near returns */
+#define LBR_IND_JMP_BIT		6 /* do not capture indirect jumps */
+#define LBR_REL_JMP_BIT		7 /* do not capture relative jumps */
+#define LBR_FAR_BIT		8 /* do not capture far branches */
+
+#define LBR_KERNEL	(1 << LBR_KERNEL_BIT)
+#define LBR_USER	(1 << LBR_USER_BIT)
+#define LBR_JCC		(1 << LBR_JCC_BIT)
+#define LBR_REL_CALL	(1 << LBR_REL_CALL_BIT)
+#define LBR_IND_CALL	(1 << LBR_IND_CALL_BIT)
+#define LBR_RETURN	(1 << LBR_RETURN_BIT)
+#define LBR_REL_JMP	(1 << LBR_REL_JMP_BIT)
+#define LBR_IND_JMP	(1 << LBR_IND_JMP_BIT)
+#define LBR_FAR		(1 << LBR_FAR_BIT)
+
+#define LBR_PLM (LBR_KERNEL | LBR_USER)
+
+#define LBR_SEL_MASK	0x1ff	/* valid bits in LBR_SELECT */
+#define LBR_NOT_SUPP	-1	/* LBR filter not supported */
+#define LBR_IGN		0	/* ignored */
+
+#define LBR_ANY		 \
+	(LBR_JCC	|\
+	 LBR_REL_CALL	|\
+	 LBR_IND_CALL	|\
+	 LBR_RETURN	|\
+	 LBR_REL_JMP	|\
+	 LBR_IND_JMP	|\
+	 LBR_FAR)
+
+#define LBR_FROM_FLAG_MISPRED  (1ULL << 63)
+
+/*
  * We only support LBR implementations that have FREEZE_LBRS_ON_PMI
  * otherwise it becomes near impossible to get a reliable stack.
  */
@@ -153,8 +196,6 @@ static void intel_pmu_lbr_read_32(struct cpu_hw_events *cpuc)
 	cpuc->lbr_stack.nr = i;
 }
 
-#define LBR_FROM_FLAG_MISPRED  (1ULL << 63)
-
 /*
  * Due to lack of segmentation in Linux the effective address (offset)
  * is the same as the linear address, allowing us to merge the LIP and EIP
@@ -202,26 +243,84 @@ void intel_pmu_lbr_read(void)
 		intel_pmu_lbr_read_64(cpuc);
 }
 
+/*
+ * Map interface branch filters onto LBR filters
+ */
+static const int nhm_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX] = {
+	[PERF_SAMPLE_BRANCH_ANY]	= LBR_ANY,
+	[PERF_SAMPLE_BRANCH_USER]	= LBR_USER,
+	[PERF_SAMPLE_BRANCH_KERNEL]	= LBR_KERNEL,
+	[PERF_SAMPLE_BRANCH_HV]		= LBR_IGN,
+	[PERF_SAMPLE_BRANCH_ANY_RETURN]	= LBR_RETURN | LBR_REL_JMP
+					| LBR_IND_JMP | LBR_FAR,
+	/*
+	 * NHM/WSM erratum: must include REL_JMP+IND_JMP to get CALL branches
+	 */
+	[PERF_SAMPLE_BRANCH_ANY_CALL] =
+	 LBR_REL_CALL | LBR_IND_CALL | LBR_REL_JMP | LBR_IND_JMP | LBR_FAR,
+	/*
+	 * NHM/WSM erratum: must include IND_JMP to capture IND_CALL
+	 */
+	[PERF_SAMPLE_BRANCH_IND_CALL] = LBR_IND_CALL | LBR_IND_JMP,
+};
+
+static const int snb_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX] = {
+	[PERF_SAMPLE_BRANCH_ANY]	= LBR_ANY,
+	[PERF_SAMPLE_BRANCH_USER]	= LBR_USER,
+	[PERF_SAMPLE_BRANCH_KERNEL]	= LBR_KERNEL,
+	[PERF_SAMPLE_BRANCH_HV]		= LBR_IGN,
+	[PERF_SAMPLE_BRANCH_ANY_RETURN]	= LBR_RETURN | LBR_FAR,
+	[PERF_SAMPLE_BRANCH_ANY_CALL]	= LBR_REL_CALL | LBR_IND_CALL
+					| LBR_FAR,
+	[PERF_SAMPLE_BRANCH_IND_CALL]	= LBR_IND_CALL,
+};
+
+/* core */
 void intel_pmu_lbr_init_core(void)
 {
 	x86_pmu.lbr_nr     = 4;
 	x86_pmu.lbr_tos    = MSR_LBR_TOS;
 	x86_pmu.lbr_from   = MSR_LBR_CORE_FROM;
 	x86_pmu.lbr_to     = MSR_LBR_CORE_TO;
+
+	pr_cont("4-deep LBR, ");
 }
 
+/* nehalem/westmere */
 void intel_pmu_lbr_init_nhm(void)
 {
 	x86_pmu.lbr_nr     = 16;
 	x86_pmu.lbr_tos    = MSR_LBR_TOS;
 	x86_pmu.lbr_from   = MSR_LBR_NHM_FROM;
 	x86_pmu.lbr_to     = MSR_LBR_NHM_TO;
+
+	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
+	x86_pmu.lbr_sel_map  = nhm_lbr_sel_map;
+
+	pr_cont("16-deep LBR, ");
 }
 
+/* sandy bridge */
+void intel_pmu_lbr_init_snb(void)
+{
+	x86_pmu.lbr_nr	 = 16;
+	x86_pmu.lbr_tos	 = MSR_LBR_TOS;
+	x86_pmu.lbr_from = MSR_LBR_NHM_FROM;
+	x86_pmu.lbr_to   = MSR_LBR_NHM_TO;
+
+	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
+	x86_pmu.lbr_sel_map  = snb_lbr_sel_map;
+
+	pr_cont("16-deep LBR, ");
+}
+
+/* atom */
 void intel_pmu_lbr_init_atom(void)
 {
 	x86_pmu.lbr_nr	   = 8;
 	x86_pmu.lbr_tos    = MSR_LBR_TOS;
 	x86_pmu.lbr_from   = MSR_LBR_CORE_FROM;
 	x86_pmu.lbr_to     = MSR_LBR_CORE_TO;
+
+	pr_cont("8-deep LBR, ");
 }
-- 
1.7.4.1



* [PATCH v5 06/18] perf: disable LBR support for older Intel Atom processors
From: Stephane Eranian @ 2012-02-02 12:54 UTC
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

The patch adds a restriction for Intel Atom LBR support. Only
steppings 10 (PineView) and more recent are supported. Older
steppings do not have a functional LBR: it does not freeze on PMU
interrupt, which makes the LBR unusable in the context of perf_events.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 arch/x86/kernel/cpu/perf_event_intel_lbr.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index b4150dc..5cf86b4 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -317,6 +317,16 @@ void intel_pmu_lbr_init_snb(void)
 /* atom */
 void intel_pmu_lbr_init_atom(void)
 {
+	/*
+	 * only models starting at stepping 10 seem
+	 * to have an operational LBR which can freeze
+	 * on PMU interrupt
+	 */
+	if (boot_cpu_data.x86_mask < 10) {
+		pr_cont("LBR disabled due to erratum");
+		return;
+	}
+
 	x86_pmu.lbr_nr	   = 8;
 	x86_pmu.lbr_tos    = MSR_LBR_TOS;
 	x86_pmu.lbr_from   = MSR_LBR_CORE_FROM;
-- 
1.7.4.1



* [PATCH v5 07/18] perf: implement PERF_SAMPLE_BRANCH for Intel X86
From: Stephane Eranian @ 2012-02-02 12:54 UTC
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

This patch implements PERF_SAMPLE_BRANCH support for Intel
X86 processors. It connects PERF_SAMPLE_BRANCH to the actual LBR.

The patch adds the hooks in the PMU irq handler to save the LBR
on counter overflow for both regular and PEBS modes.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 arch/x86/kernel/cpu/perf_event.h           |    2 +
 arch/x86/kernel/cpu/perf_event_intel.c     |   35 +++++++++++
 arch/x86/kernel/cpu/perf_event_intel_ds.c  |   10 +--
 arch/x86/kernel/cpu/perf_event_intel_lbr.c |   86 +++++++++++++++++++++++++++-
 4 files changed, 125 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index 776fb5a..adbe80a 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -537,6 +537,8 @@ void intel_pmu_lbr_init_atom(void);
 
 void intel_pmu_lbr_init_snb(void);
 
+int intel_pmu_setup_lbr_filter(struct perf_event *event);
+
 int p4_pmu_init(void);
 
 int p6_pmu_init(void);
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index b0db016..7cc1e2d 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -727,6 +727,19 @@ static __initconst const u64 atom_hw_cache_event_ids
  },
 };
 
+static inline bool intel_pmu_needs_lbr_smpl(struct perf_event *event)
+{
+	/* user explicitly requested branch sampling */
+	if (has_branch_stack(event))
+		return true;
+
+	/* implicit branch sampling to correct PEBS skid */
+	if (x86_pmu.intel_cap.pebs_trap && event->attr.precise_ip > 1)
+		return true;
+
+	return false;
+}
+
 static void intel_pmu_disable_all(void)
 {
 	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
@@ -881,6 +894,13 @@ static void intel_pmu_disable_event(struct perf_event *event)
 	cpuc->intel_ctrl_guest_mask &= ~(1ull << hwc->idx);
 	cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx);
 
+	/*
+	 * must disable before any actual event
+	 * because any event may be combined with LBR
+	 */
+	if (intel_pmu_needs_lbr_smpl(event))
+		intel_pmu_lbr_disable(event);
+
 	if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
 		intel_pmu_disable_fixed(hwc);
 		return;
@@ -935,6 +955,12 @@ static void intel_pmu_enable_event(struct perf_event *event)
 		intel_pmu_enable_bts(hwc->config);
 		return;
 	}
+	/*
+	 * must be enabled before any actual event
+	 * because any event may be combined with LBR
+	 */
+	if (intel_pmu_needs_lbr_smpl(event))
+		intel_pmu_lbr_enable(event);
 
 	if (event->attr.exclude_host)
 		cpuc->intel_ctrl_guest_mask |= (1ull << hwc->idx);
@@ -1057,6 +1083,9 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
 
 		data.period = event->hw.last_period;
 
+		if (has_branch_stack(event))
+			data.br_stack = &cpuc->lbr_stack;
+
 		if (perf_event_overflow(event, &data, regs))
 			x86_pmu_stop(event, 0);
 	}
@@ -1305,6 +1334,12 @@ static int intel_pmu_hw_config(struct perf_event *event)
 		event->hw.config = alt_config;
 	}
 
+	if (intel_pmu_needs_lbr_smpl(event)) {
+		ret = intel_pmu_setup_lbr_filter(event);
+		if (ret)
+			return ret;
+	}
+
 	if (event->attr.type != PERF_TYPE_RAW)
 		return 0;
 
diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 73da6b6..04c71ea 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -440,9 +440,6 @@ void intel_pmu_pebs_enable(struct perf_event *event)
 
 	cpuc->pebs_enabled |= 1ULL << hwc->idx;
 	WARN_ON_ONCE(cpuc->enabled);
-
-	if (x86_pmu.intel_cap.pebs_trap && event->attr.precise_ip > 1)
-		intel_pmu_lbr_enable(event);
 }
 
 void intel_pmu_pebs_disable(struct perf_event *event)
@@ -455,9 +452,6 @@ void intel_pmu_pebs_disable(struct perf_event *event)
 		wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
 
 	hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
-
-	if (x86_pmu.intel_cap.pebs_trap && event->attr.precise_ip > 1)
-		intel_pmu_lbr_disable(event);
 }
 
 void intel_pmu_pebs_enable_all(void)
@@ -573,6 +567,7 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
 	 * both formats and we don't use the other fields in this
 	 * routine.
 	 */
+	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
 	struct pebs_record_core *pebs = __pebs;
 	struct perf_sample_data data;
 	struct pt_regs regs;
@@ -603,6 +598,9 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
 	else
 		regs.flags &= ~PERF_EFLAGS_EXACT;
 
+	if (has_branch_stack(event))
+		data.br_stack = &cpuc->lbr_stack;
+
 	if (perf_event_overflow(event, &data, &regs))
 		x86_pmu_stop(event, 0);
 }
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index 5cf86b4..94ad84d 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -56,6 +56,10 @@ enum {
 
 #define LBR_FROM_FLAG_MISPRED  (1ULL << 63)
 
+#define for_each_branch_sample_type(x) \
+	for ((x) = PERF_SAMPLE_BRANCH_USER; \
+	     (x) < PERF_SAMPLE_BRANCH_MAX; (x) <<= 1)
+
 /*
  * We only support LBR implementations that have FREEZE_LBRS_ON_PMI
  * otherwise it becomes near impossible to get a reliable stack.
@@ -64,6 +68,10 @@ enum {
 static void __intel_pmu_lbr_enable(void)
 {
 	u64 debugctl;
+	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+
+	if (cpuc->lbr_sel)
+		wrmsrl(MSR_LBR_SELECT, cpuc->lbr_sel->config);
 
 	rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
 	debugctl |= (DEBUGCTLMSR_LBR | DEBUGCTLMSR_FREEZE_LBRS_ON_PMI);
@@ -121,7 +129,6 @@ void intel_pmu_lbr_enable(struct perf_event *event)
 	 * Reset the LBR stack if we changed task context to
 	 * avoid data leaks.
 	 */
-
 	if (event->ctx->task && cpuc->lbr_context != event->ctx) {
 		intel_pmu_lbr_reset();
 		cpuc->lbr_context = event->ctx;
@@ -140,8 +147,11 @@ void intel_pmu_lbr_disable(struct perf_event *event)
 	cpuc->lbr_users--;
 	WARN_ON_ONCE(cpuc->lbr_users < 0);
 
-	if (cpuc->enabled && !cpuc->lbr_users)
+	if (cpuc->enabled && !cpuc->lbr_users) {
 		__intel_pmu_lbr_disable();
+		/* avoid stale pointer */
+		cpuc->lbr_context = NULL;
+	}
 }
 
 void intel_pmu_lbr_enable_all(void)
@@ -160,6 +170,9 @@ void intel_pmu_lbr_disable_all(void)
 		__intel_pmu_lbr_disable();
 }
 
+/*
+ * TOS = most recently recorded branch
+ */
 static inline u64 intel_pmu_lbr_tos(void)
 {
 	u64 tos;
@@ -244,6 +257,75 @@ void intel_pmu_lbr_read(void)
 }
 
 /*
+ * setup the HW LBR filter
+ * Used only when available, may not be enough to disambiguate
+ * all branches, may need the help of the SW filter
+ */
+static int intel_pmu_setup_hw_lbr_filter(struct perf_event *event)
+{
+	struct hw_perf_event_extra *reg;
+	u64 br_type = event->attr.branch_sample_type;
+	u64 mask = 0, m;
+	u64 v;
+
+	for_each_branch_sample_type(m) {
+		if (!(br_type & m))
+			continue;
+
+		v = x86_pmu.lbr_sel_map[m];
+		if (v == LBR_NOT_SUPP)
+			return -EOPNOTSUPP;
+		mask |= v;
+
+		if (m == PERF_SAMPLE_BRANCH_ANY)
+			break;
+	}
+	reg = &event->hw.branch_reg;
+	reg->idx = EXTRA_REG_LBR;
+
+	/* LBR_SELECT operates in suppress mode so invert mask */
+	reg->config = ~mask & x86_pmu.lbr_sel_mask;
+
+	return 0;
+}
+
+/*
+ * all the bits supported on some flavor of X86 LBR
+ * we ignore BRANCH_HV because it is not supported
+ */
+#define PERF_SAMPLE_BRANCH_X86_ALL	\
+	(PERF_SAMPLE_BRANCH_ANY		|\
+	 PERF_SAMPLE_BRANCH_USER	|\
+	 PERF_SAMPLE_BRANCH_KERNEL)
+
+int intel_pmu_setup_lbr_filter(struct perf_event *event)
+{
+	u64 br_type = event->attr.branch_sample_type;
+
+	/*
+	 * no LBR on this PMU
+	 */
+	if (!x86_pmu.lbr_nr)
+		return -EOPNOTSUPP;
+
+	/*
+	 * if no LBR HW filter, users can only
+	 * capture all branches
+	 */
+	if (!x86_pmu.lbr_sel_map) {
+		if (br_type != PERF_SAMPLE_BRANCH_X86_ALL)
+			return -EOPNOTSUPP;
+		return 0;
+	}
+	/*
+	 * we ignore branch priv levels we do not
+	 * know about: BRANCH_HV
+	 */
+
+	return intel_pmu_setup_hw_lbr_filter(event);
+}
+
+/*
  * Map interface branch filters onto LBR filters
  */
 static const int nhm_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX] = {
-- 
1.7.4.1



* [PATCH v5 08/18] perf: add LBR software filter support for Intel X86
From: Stephane Eranian @ 2012-02-02 12:54 UTC
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

This patch adds an internal software filter to complement
the (optional) LBR hardware filter.

The software filter is necessary:
- as a substitute when there is no HW LBR filter (e.g., Atom, Core)
- to complement HW LBR filter in case of errata (e.g., Nehalem/Westmere)
- to provide finer-grained filtering (on all processors)

Sometimes, the LBR HW filter cannot distinguish between two types
of branches. For instance, to capture syscalls as CALLS, it is
necessary to enable the LBR_FAR filter, which also captures JMP
instructions. Thus, a second pass is necessary to filter those out;
this is what the SW filter does.

The SW filter is built on top of the internal x86 disassembler. It
is a best-effort filter, especially for user-level code, since it
depends on the program's text pages being present in memory.

The SW filter is enabled on all Intel X86 processors. It is bypassed
when the user is capturing all branches at all priv levels.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
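The heart of the second pass can be sketched as classify-then-compact.
A minimal sketch (branch_type(), X86_BR_NONE and the br_sel mask are
from the diff below; the real intel_pmu_lbr_filter() zeroes rejected
entries first and compacts in a separate loop):

	static void sw_filter_sketch(struct perf_branch_entry *entries,
				     u64 *nr, int br_sel)
	{
		u64 i, j = 0;
		int type;

		for (i = 0; i < *nr; i++) {
			/* decode insn at 'from'; priv level comes from 'to' */
			type = branch_type(entries[i].from, entries[i].to);
			/* keep only entries fully matching the user mask */
			if (type != X86_BR_NONE && (br_sel & type) == type)
				entries[j++] = entries[i];
		}
		*nr = j;
	}
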
 arch/x86/kernel/cpu/perf_event.h           |   10 +
 arch/x86/kernel/cpu/perf_event_intel_ds.c  |   12 +-
 arch/x86/kernel/cpu/perf_event_intel_lbr.c |  332 ++++++++++++++++++++++++++--
 3 files changed, 321 insertions(+), 33 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index adbe80a..a5281a4 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -132,6 +132,7 @@ struct cpu_hw_events {
 	struct perf_branch_stack	lbr_stack;
 	struct perf_branch_entry	lbr_entries[MAX_LBR_ENTRIES];
 	struct er_account		*lbr_sel;
+	u64				br_sel;
 
 	/*
 	 * Intel host/guest exclude bits
@@ -455,6 +456,15 @@ extern struct event_constraint emptyconstraint;
 
 extern struct event_constraint unconstrained;
 
+static inline bool kernel_ip(unsigned long ip)
+{
+#ifdef CONFIG_X86_32
+	return ip > PAGE_OFFSET;
+#else
+	return (long)ip < 0;
+#endif
+}
+
 #ifdef CONFIG_CPU_SUP_AMD
 
 int amd_pmu_init(void);
diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 04c71ea..db0aa19 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -3,6 +3,7 @@
 #include <linux/slab.h>
 
 #include <asm/perf_event.h>
+#include <asm/insn.h>
 
 #include "perf_event.h"
 
@@ -470,17 +471,6 @@ void intel_pmu_pebs_disable_all(void)
 		wrmsrl(MSR_IA32_PEBS_ENABLE, 0);
 }
 
-#include <asm/insn.h>
-
-static inline bool kernel_ip(unsigned long ip)
-{
-#ifdef CONFIG_X86_32
-	return ip > PAGE_OFFSET;
-#else
-	return (long)ip < 0;
-#endif
-}
-
 static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
 {
 	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index 94ad84d..960ca65 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -3,6 +3,7 @@
 
 #include <asm/perf_event.h>
 #include <asm/msr.h>
+#include <asm/insn.h>
 
 #include "perf_event.h"
 
@@ -61,6 +62,53 @@ enum {
 	     (x) < PERF_SAMPLE_BRANCH_MAX; (x) <<= 1)
 
 /*
+ * X86 control flow change classification
+ * X86 control flow changes include branches, interrupts, traps, faults
+ */
+enum {
+	X86_BR_NONE     = 0,      /* unknown */
+
+	X86_BR_USER     = 1 << 0, /* branch target is user */
+	X86_BR_KERNEL   = 1 << 1, /* branch target is kernel */
+
+	X86_BR_CALL     = 1 << 2, /* call */
+	X86_BR_RET      = 1 << 3, /* return */
+	X86_BR_SYSCALL  = 1 << 4, /* syscall */
+	X86_BR_SYSRET   = 1 << 5, /* syscall return */
+	X86_BR_INT      = 1 << 6, /* sw interrupt */
+	X86_BR_IRET     = 1 << 7, /* return from interrupt */
+	X86_BR_JCC      = 1 << 8, /* conditional */
+	X86_BR_JMP      = 1 << 9, /* jump */
+	X86_BR_IRQ      = 1 << 10,/* hw interrupt or trap or fault */
+	X86_BR_IND_CALL = 1 << 11,/* indirect calls */
+};
+
+#define X86_BR_PLM (X86_BR_USER | X86_BR_KERNEL)
+
+#define X86_BR_ANY       \
+	(X86_BR_CALL    |\
+	 X86_BR_RET     |\
+	 X86_BR_SYSCALL |\
+	 X86_BR_SYSRET  |\
+	 X86_BR_INT     |\
+	 X86_BR_IRET    |\
+	 X86_BR_JCC     |\
+	 X86_BR_JMP	 |\
+	 X86_BR_IRQ	 |\
+	 X86_BR_IND_CALL)
+
+#define X86_BR_ALL (X86_BR_PLM | X86_BR_ANY)
+
+#define X86_BR_ANY_CALL		 \
+	(X86_BR_CALL		|\
+	 X86_BR_IND_CALL	|\
+	 X86_BR_SYSCALL		|\
+	 X86_BR_IRQ		|\
+	 X86_BR_INT)
+
+static void intel_pmu_lbr_filter(struct cpu_hw_events *cpuc);
+
+/*
  * We only support LBR implementations that have FREEZE_LBRS_ON_PMI
  * otherwise it becomes near impossible to get a reliable stack.
  */
@@ -133,6 +181,7 @@ void intel_pmu_lbr_enable(struct perf_event *event)
 		intel_pmu_lbr_reset();
 		cpuc->lbr_context = event->ctx;
 	}
+	cpuc->br_sel = event->hw.branch_reg.reg;
 
 	cpuc->lbr_users++;
 }
@@ -254,6 +303,44 @@ void intel_pmu_lbr_read(void)
 		intel_pmu_lbr_read_32(cpuc);
 	else
 		intel_pmu_lbr_read_64(cpuc);
+
+	intel_pmu_lbr_filter(cpuc);
+}
+
+/*
+ * SW filter is used:
+ * - in case there is no HW filter
+ * - in case the HW filter has errata or limitations
+ */
+static void intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
+{
+	u64 br_type = event->attr.branch_sample_type;
+	int mask = 0;
+
+	if (br_type & PERF_SAMPLE_BRANCH_USER)
+		mask |= X86_BR_USER;
+
+	if (br_type & PERF_SAMPLE_BRANCH_KERNEL)
+		mask |= X86_BR_KERNEL;
+
+	/* we ignore BRANCH_HV here */
+
+	if (br_type & PERF_SAMPLE_BRANCH_ANY)
+		mask |= X86_BR_ANY;
+
+	if (br_type & PERF_SAMPLE_BRANCH_ANY_CALL)
+		mask |= X86_BR_ANY_CALL;
+
+	if (br_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+		mask |= X86_BR_RET | X86_BR_IRET | X86_BR_SYSRET;
+
+	if (br_type & PERF_SAMPLE_BRANCH_IND_CALL)
+		mask |= X86_BR_IND_CALL;
+	/*
+	 * stash the actual user request into reg; it may
+	 * be used by fixup code for some CPUs
+	 */
+	event->hw.branch_reg.reg = mask;
 }
 
 /*
@@ -275,10 +362,9 @@ static int intel_pmu_setup_hw_lbr_filter(struct perf_event *event)
 		v = x86_pmu.lbr_sel_map[m];
 		if (v == LBR_NOT_SUPP)
 			return -EOPNOTSUPP;
-		mask |= v;
 
-		if (m == PERF_SAMPLE_BRANCH_ANY)
-			break;
+		if (v != LBR_IGN)
+			mask |= v;
 	}
 	reg = &event->hw.branch_reg;
 	reg->idx = EXTRA_REG_LBR;
@@ -289,18 +375,9 @@ static int intel_pmu_setup_hw_lbr_filter(struct perf_event *event)
 	return 0;
 }
 
-/*
- * all the bits supported on some flavor of X86 LBR
- * we ignore BRANCH_HV because it is not supported
- */
-#define PERF_SAMPLE_BRANCH_X86_ALL	\
-	(PERF_SAMPLE_BRANCH_ANY		|\
-	 PERF_SAMPLE_BRANCH_USER	|\
-	 PERF_SAMPLE_BRANCH_KERNEL)
-
 int intel_pmu_setup_lbr_filter(struct perf_event *event)
 {
-	u64 br_type = event->attr.branch_sample_type;
+	int ret = 0;
 
 	/*
 	 * no LBR on this PMU
@@ -309,20 +386,210 @@ int intel_pmu_setup_lbr_filter(struct perf_event *event)
 		return -EOPNOTSUPP;
 
 	/*
-	 * if no LBR HW filter, users can only
-	 * capture all branches
+	 * setup SW LBR filter
 	 */
-	if (!x86_pmu.lbr_sel_map) {
-		if (br_type != PERF_SAMPLE_BRANCH_X86_ALL)
-			return -EOPNOTSUPP;
-		return 0;
+	intel_pmu_setup_sw_lbr_filter(event);
+
+	/*
+	 * setup HW LBR filter, if any
+	 */
+	if (x86_pmu.lbr_sel_map)
+		ret = intel_pmu_setup_hw_lbr_filter(event);
+
+	return ret;
+}
+
+/*
+ * return the type of control flow change at address "from".
+ * The instruction is not necessarily a branch (e.g., in case of
+ * an interrupt).
+ *
+ * The branch type returned also includes the priv level of the
+ * target of the control flow change (X86_BR_USER, X86_BR_KERNEL).
+ *
+ * If a branch type is unknown OR the instruction cannot be
+ * decoded (e.g., text page not present), then X86_BR_NONE is
+ * returned.
+ */
+static int branch_type(unsigned long from, unsigned long to)
+{
+	struct insn insn;
+	void *addr;
+	int bytes, size = MAX_INSN_SIZE;
+	int ret = X86_BR_NONE;
+	int ext, to_plm, from_plm;
+	u8 buf[MAX_INSN_SIZE];
+	int is64 = 0;
+
+	to_plm = kernel_ip(to) ? X86_BR_KERNEL : X86_BR_USER;
+	from_plm = kernel_ip(from) ? X86_BR_KERNEL : X86_BR_USER;
+
+	/*
+	 * may be zero if the LBR did not fill up after a reset by the
+	 * time we get a PMU interrupt
+	 */
+	if (from == 0 || to == 0)
+		return X86_BR_NONE;
+
+	if (from_plm == X86_BR_USER) {
+		/*
+		 * can happen if measuring at the user level only
+		 * and we interrupt in a kernel thread, e.g., idle.
+		 */
+		if (!current->mm)
+			return X86_BR_NONE;
+
+		/* may fail if text not present */
+		bytes = copy_from_user_nmi(buf, (void __user *)from, size);
+		if (bytes != size)
+			return X86_BR_NONE;
+
+		addr = buf;
+	} else
+		addr = (void *)from;
+
+	/*
+	 * decoder needs to know the ABI especially
+	 * on 64-bit systems running 32-bit apps
+	 */
+#ifdef CONFIG_X86_64
+	is64 = kernel_ip((unsigned long)addr) || !test_thread_flag(TIF_IA32);
+#endif
+	insn_init(&insn, addr, is64);
+	insn_get_opcode(&insn);
+
+	switch (insn.opcode.bytes[0]) {
+	case 0xf:
+		switch (insn.opcode.bytes[1]) {
+		case 0x05: /* syscall */
+		case 0x34: /* sysenter */
+			ret = X86_BR_SYSCALL;
+			break;
+		case 0x07: /* sysret */
+		case 0x35: /* sysexit */
+			ret = X86_BR_SYSRET;
+			break;
+		case 0x80 ... 0x8f: /* conditional */
+			ret = X86_BR_JCC;
+			break;
+		default:
+			ret = X86_BR_NONE;
+		}
+		break;
+	case 0x70 ... 0x7f: /* conditional */
+		ret = X86_BR_JCC;
+		break;
+	case 0xc2: /* near ret */
+	case 0xc3: /* near ret */
+	case 0xca: /* far ret */
+	case 0xcb: /* far ret */
+		ret = X86_BR_RET;
+		break;
+	case 0xcf: /* iret */
+		ret = X86_BR_IRET;
+		break;
+	case 0xcc ... 0xce: /* int */
+		ret = X86_BR_INT;
+		break;
+	case 0xe8: /* call near rel */
+	case 0x9a: /* call far absolute */
+		ret = X86_BR_CALL;
+		break;
+	case 0xe0 ... 0xe3: /* loop jmp */
+		ret = X86_BR_JCC;
+		break;
+	case 0xe9 ... 0xeb: /* jmp */
+		ret = X86_BR_JMP;
+		break;
+	case 0xff: /* call near absolute, call far absolute ind */
+		insn_get_modrm(&insn);
+		ext = (insn.modrm.bytes[0] >> 3) & 0x7;
+		switch (ext) {
+		case 2: /* near ind call */
+		case 3: /* far ind call */
+			ret = X86_BR_IND_CALL;
+			break;
+		case 4:
+		case 5:
+			ret = X86_BR_JMP;
+			break;
+		}
+		break;
+	default:
+		ret = X86_BR_NONE;
 	}
 	/*
-	 * we ignore branch priv levels we do not
-	 * know about: BRANCH_HV
+	 * interrupts, traps, faults (and thus ring transitions) may
+	 * occur on any instruction. Thus, to classify them correctly,
+	 * we need to first look at the from and to priv levels. If they
+	 * differ and the target is in the kernel, then it indicates
+	 * a ring transition. If the from instruction is not a ring
+	 * transition instr (syscall, sysenter, int), then it means
+	 * it was an irq, trap or fault.
+	 *
+	 * we have no way of detecting kernel-to-kernel faults.
+	 */
+	if (from_plm == X86_BR_USER && to_plm == X86_BR_KERNEL
+	    && ret != X86_BR_SYSCALL && ret != X86_BR_INT)
+		ret = X86_BR_IRQ;
+
+	/*
+	 * the branch priv level is determined by the target, as
+	 * is done by HW when LBR_SELECT is implemented
 	 */
+	if (ret != X86_BR_NONE)
+		ret |= to_plm;
 
-	return intel_pmu_setup_hw_lbr_filter(event);
+	return ret;
+}
+
+/*
+ * implement actual branch filter based on user demand.
+ * Hardware may not exactly satisfy that request, thus
+ * we need to inspect opcodes. Mismatched branches are
+ * discarded. Therefore, the number of branches returned
+ * in a PERF_SAMPLE_BRANCH_STACK sample may vary.
+ */
+static void
+intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
+{
+	u64 from, to;
+	int br_sel = cpuc->br_sel;
+	int i, j, type;
+	bool compress = false;
+
+	/* if sampling all branches, then nothing to filter */
+	if ((br_sel & X86_BR_ALL) == X86_BR_ALL)
+		return;
+
+	for (i = 0; i < cpuc->lbr_stack.nr; i++) {
+
+		from = cpuc->lbr_entries[i].from;
+		to = cpuc->lbr_entries[i].to;
+
+		type = branch_type(from, to);
+
+		/* if type does not correspond, then discard */
+		if (type == X86_BR_NONE || (br_sel & type) != type) {
+			cpuc->lbr_entries[i].from = 0;
+			compress = true;
+		}
+	}
+
+	if (!compress)
+		return;
+
+	/* remove all entries with from=0 */
+	for (i = 0; i < cpuc->lbr_stack.nr; ) {
+		if (!cpuc->lbr_entries[i].from) {
+			j = i;
+			while (++j < cpuc->lbr_stack.nr)
+				cpuc->lbr_entries[j-1] = cpuc->lbr_entries[j];
+			cpuc->lbr_stack.nr--;
+			if (!cpuc->lbr_entries[i].from)
+				continue;
+		}
+		i++;
+	}
 }
 
 /*
@@ -365,6 +632,10 @@ void intel_pmu_lbr_init_core(void)
 	x86_pmu.lbr_from   = MSR_LBR_CORE_FROM;
 	x86_pmu.lbr_to     = MSR_LBR_CORE_TO;
 
+	/*
+	 * SW branch filter usage:
+	 * - compensate for lack of HW filter
+	 */
 	pr_cont("4-deep LBR, ");
 }
 
@@ -379,6 +650,13 @@ void intel_pmu_lbr_init_nhm(void)
 	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
 	x86_pmu.lbr_sel_map  = nhm_lbr_sel_map;
 
+	/*
+	 * SW branch filter usage:
+	 * - work around LBR_SEL errata (see above)
+	 * - support syscall, sysret capture.
+	 *   That requires LBR_FAR, but that means far
+	 *   jmps need to be filtered out
+	 */
 	pr_cont("16-deep LBR, ");
 }
 
@@ -393,6 +671,12 @@ void intel_pmu_lbr_init_snb(void)
 	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
 	x86_pmu.lbr_sel_map  = snb_lbr_sel_map;
 
+	/*
+	 * SW branch filter usage:
+	 * - support syscall, sysret capture.
+	 *   That requires LBR_FAR, but that means far
+	 *   jmps need to be filtered out
+	 */
 	pr_cont("16-deep LBR, ");
 }
 
@@ -414,5 +698,9 @@ void intel_pmu_lbr_init_atom(void)
 	x86_pmu.lbr_from   = MSR_LBR_CORE_FROM;
 	x86_pmu.lbr_to     = MSR_LBR_CORE_TO;
 
+	/*
+	 * SW branch filter usage:
+	 * - compensate for lack of HW filter
+	 */
 	pr_cont("8-deep LBR, ");
 }
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 09/18] perf: disable PERF_SAMPLE_BRANCH_* when not supported
  2012-02-02 12:54 [PATCH v5 00/18] perf: add support for sampling taken branches Stephane Eranian
                   ` (7 preceding siblings ...)
  2012-02-02 12:54 ` [PATCH v5 08/18] perf: add LBR software filter support " Stephane Eranian
@ 2012-02-02 12:54 ` Stephane Eranian
  2012-02-06 19:23   ` Peter Zijlstra
  2012-02-02 12:54 ` [PATCH v5 10/18] perf: add hook to flush branch_stack on context switch Stephane Eranian
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 43+ messages in thread
From: Stephane Eranian @ 2012-02-02 12:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

PERF_SAMPLE_BRANCH_* is disabled for:
- SW events (sw counters, tracepoints)
- HW breakpoints
- all architectures but Intel X86
- AMD64 processors

Signed-off-by: Stephane Eranian <eranian@google.com>
---
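From user space, these rejections surface as EOPNOTSUPP from
perf_event_open(). A minimal probe (a sketch; it assumes the
perf_event.h from this series is installed, since
PERF_SAMPLE_BRANCH_STACK and branch_sample_type do not exist in
older headers):

	#include <linux/perf_event.h>
	#include <sys/syscall.h>
	#include <unistd.h>
	#include <string.h>
	#include <errno.h>
	#include <stdio.h>

	int main(void)
	{
		struct perf_event_attr attr;
		int fd;

		memset(&attr, 0, sizeof(attr));
		attr.size = sizeof(attr);
		/* SW event: must be rejected with this patch applied */
		attr.type = PERF_TYPE_SOFTWARE;
		attr.config = PERF_COUNT_SW_TASK_CLOCK;
		attr.sample_period = 100000;
		attr.sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_BRANCH_STACK;
		attr.branch_sample_type = PERF_SAMPLE_BRANCH_ANY;

		fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
		if (fd < 0 && errno == EOPNOTSUPP)
			printf("branch sampling rejected, as expected\n");
		else if (fd >= 0)
			close(fd);
		return 0;
	}
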
 arch/alpha/kernel/perf_event.c       |    4 ++++
 arch/arm/kernel/perf_event.c         |    4 ++++
 arch/mips/kernel/perf_event_mipsxx.c |    4 ++++
 arch/powerpc/kernel/perf_event.c     |    4 ++++
 arch/sh/kernel/perf_event.c          |    4 ++++
 arch/sparc/kernel/perf_event.c       |    4 ++++
 arch/x86/kernel/cpu/perf_event_amd.c |    3 +++
 kernel/events/core.c                 |   24 ++++++++++++++++++++++++
 kernel/events/hw_breakpoint.c        |    6 ++++++
 9 files changed, 57 insertions(+), 0 deletions(-)

diff --git a/arch/alpha/kernel/perf_event.c b/arch/alpha/kernel/perf_event.c
index 8143cd7..0dae252 100644
--- a/arch/alpha/kernel/perf_event.c
+++ b/arch/alpha/kernel/perf_event.c
@@ -685,6 +685,10 @@ static int alpha_pmu_event_init(struct perf_event *event)
 {
 	int err;
 
+	/* does not support taken branch sampling */
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
 	switch (event->attr.type) {
 	case PERF_TYPE_RAW:
 	case PERF_TYPE_HARDWARE:
diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 5bb91bf..68bb0ce 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -539,6 +539,10 @@ static int armpmu_event_init(struct perf_event *event)
 	int err = 0;
 	atomic_t *active_events = &armpmu->active_events;
 
+	/* does not support taken branch sampling */
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
 	if (armpmu->map_event(event) == -ENOENT)
 		return -ENOENT;
 
diff --git a/arch/mips/kernel/perf_event_mipsxx.c b/arch/mips/kernel/perf_event_mipsxx.c
index e3b897a..811084f 100644
--- a/arch/mips/kernel/perf_event_mipsxx.c
+++ b/arch/mips/kernel/perf_event_mipsxx.c
@@ -606,6 +606,10 @@ static int mipspmu_event_init(struct perf_event *event)
 {
 	int err = 0;
 
+	/* does not support taken branch sampling */
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
 	switch (event->attr.type) {
 	case PERF_TYPE_RAW:
 	case PERF_TYPE_HARDWARE:
diff --git a/arch/powerpc/kernel/perf_event.c b/arch/powerpc/kernel/perf_event.c
index d614ab5..4e0b265 100644
--- a/arch/powerpc/kernel/perf_event.c
+++ b/arch/powerpc/kernel/perf_event.c
@@ -1078,6 +1078,10 @@ static int power_pmu_event_init(struct perf_event *event)
 	if (!ppmu)
 		return -ENOENT;
 
+	/* does not support taken branch sampling */
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
 	switch (event->attr.type) {
 	case PERF_TYPE_HARDWARE:
 		ev = event->attr.config;
diff --git a/arch/sh/kernel/perf_event.c b/arch/sh/kernel/perf_event.c
index 10b14e3..068b8a2 100644
--- a/arch/sh/kernel/perf_event.c
+++ b/arch/sh/kernel/perf_event.c
@@ -310,6 +310,10 @@ static int sh_pmu_event_init(struct perf_event *event)
 {
 	int err;
 
+	/* does not support taken branch sampling */
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
 	switch (event->attr.type) {
 	case PERF_TYPE_RAW:
 	case PERF_TYPE_HW_CACHE:
diff --git a/arch/sparc/kernel/perf_event.c b/arch/sparc/kernel/perf_event.c
index 614da62..8e16a4a 100644
--- a/arch/sparc/kernel/perf_event.c
+++ b/arch/sparc/kernel/perf_event.c
@@ -1105,6 +1105,10 @@ static int sparc_pmu_event_init(struct perf_event *event)
 	if (atomic_read(&nmi_active) < 0)
 		return -ENODEV;
 
+	/* does not support taken branch sampling */
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
 	switch (attr->type) {
 	case PERF_TYPE_HARDWARE:
 		if (attr->config >= sparc_pmu->max_events)
diff --git a/arch/x86/kernel/cpu/perf_event_amd.c b/arch/x86/kernel/cpu/perf_event_amd.c
index 0397b23..0d8da03 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -138,6 +138,9 @@ static int amd_pmu_hw_config(struct perf_event *event)
 	if (ret)
 		return ret;
 
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
 	if (event->attr.exclude_host && event->attr.exclude_guest)
 		/*
 		 * When HO == GO == 1 the hardware treats that as GO == HO == 0
diff --git a/kernel/events/core.c b/kernel/events/core.c
index a268e45..5660ffa 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5035,6 +5035,12 @@ static int perf_swevent_init(struct perf_event *event)
 	if (event->attr.type != PERF_TYPE_SOFTWARE)
 		return -ENOENT;
 
+	/*
+	 * no branch sampling for software events
+	 */
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
 	switch (event_id) {
 	case PERF_COUNT_SW_CPU_CLOCK:
 	case PERF_COUNT_SW_TASK_CLOCK:
@@ -5145,6 +5151,12 @@ static int perf_tp_event_init(struct perf_event *event)
 	if (event->attr.type != PERF_TYPE_TRACEPOINT)
 		return -ENOENT;
 
+	/*
+	 * no branch sampling for tracepoint events
+	 */
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
 	err = perf_trace_init(event);
 	if (err)
 		return err;
@@ -5370,6 +5382,12 @@ static int cpu_clock_event_init(struct perf_event *event)
 	if (event->attr.config != PERF_COUNT_SW_CPU_CLOCK)
 		return -ENOENT;
 
+	/*
+	 * no branch sampling for software events
+	 */
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
 	perf_swevent_init_hrtimer(event);
 
 	return 0;
@@ -5444,6 +5462,12 @@ static int task_clock_event_init(struct perf_event *event)
 	if (event->attr.config != PERF_COUNT_SW_TASK_CLOCK)
 		return -ENOENT;
 
+	/*
+	 * no branch sampling for software events
+	 */
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
 	perf_swevent_init_hrtimer(event);
 
 	return 0;
diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c
index b0309f7..cee5423 100644
--- a/kernel/events/hw_breakpoint.c
+++ b/kernel/events/hw_breakpoint.c
@@ -581,6 +581,12 @@ static int hw_breakpoint_event_init(struct perf_event *bp)
 	if (bp->attr.type != PERF_TYPE_BREAKPOINT)
 		return -ENOENT;
 
+	/*
+	 * no branch sampling for breakpoint events
+	 */
+	if (has_branch_stack(bp))
+		return -EOPNOTSUPP;
+
 	err = register_perf_hw_breakpoint(bp);
 	if (err)
 		return err;
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 10/18] perf: add hook to flush branch_stack on context switch
  2012-02-02 12:54 [PATCH v5 00/18] perf: add support for sampling taken branches Stephane Eranian
                   ` (8 preceding siblings ...)
  2012-02-02 12:54 ` [PATCH v5 09/18] perf: disable PERF_SAMPLE_BRANCH_* when not supported Stephane Eranian
@ 2012-02-02 12:54 ` Stephane Eranian
  2012-02-02 12:54 ` [PATCH v5 11/18] perf: add code to support PERF_SAMPLE_BRANCH_STACK Stephane Eranian
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 43+ messages in thread
From: Stephane Eranian @ 2012-02-02 12:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

With branch stack sampling, it is possible to filter by priv levels.
In system-wide mode, that means it is possible to capture only user
level branches. The builtin SW LBR filter needs to disassemble code
based on LBR captured addresses. For that, it needs to know the task
the addresses are associated with. Because of context switches, the
content of the branch stack buffer may contain addresses from
different tasks.

We need a hook on context switch to either flush the branch stack
or save it. This patch adds a new hook in struct pmu which is called
during context switches. The hook is called only when necessary,
that is, when a system-wide context has at least one event which
uses PERF_SAMPLE_BRANCH_STACK. The hook is never called for per-thread
contexts.

In this version, the Intel X86 code simply flushes (resets) the LBR
on context switches (fills it with zeroes). Those zeroed branches are
then filtered out by the SW filter.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
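For illustration, this is the kind of run the flush makes reliable
(a sketch; the -b option used here is only added later in this
series by the perf record patch):

	$ perf record -a -b any,u -e cycles sleep 5

Without the flush, user-level 'from' addresses captured right after
a context switch could be disassembled against the wrong task's text.
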
 arch/x86/kernel/cpu/perf_event.c       |   21 +++++---
 arch/x86/kernel/cpu/perf_event.h       |    1 +
 arch/x86/kernel/cpu/perf_event_intel.c |   13 +++++
 include/linux/perf_event.h             |    9 +++-
 kernel/events/core.c                   |   85 ++++++++++++++++++++++++++++++++
 5 files changed, 121 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 71e0264..16c7d56 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1671,25 +1671,32 @@ static const struct attribute_group *x86_pmu_attr_groups[] = {
 	NULL,
 };
 
+static void x86_pmu_flush_branch_stack(void)
+{
+	if (x86_pmu.flush_branch_stack)
+		x86_pmu.flush_branch_stack();
+}
+
 static struct pmu pmu = {
-	.pmu_enable	= x86_pmu_enable,
-	.pmu_disable	= x86_pmu_disable,
+	.pmu_enable		= x86_pmu_enable,
+	.pmu_disable		= x86_pmu_disable,
 
 	.attr_groups	= x86_pmu_attr_groups,
 
 	.event_init	= x86_pmu_event_init,
 
-	.add		= x86_pmu_add,
-	.del		= x86_pmu_del,
-	.start		= x86_pmu_start,
-	.stop		= x86_pmu_stop,
-	.read		= x86_pmu_read,
+	.add			= x86_pmu_add,
+	.del			= x86_pmu_del,
+	.start			= x86_pmu_start,
+	.stop			= x86_pmu_stop,
+	.read			= x86_pmu_read,
 
 	.start_txn	= x86_pmu_start_txn,
 	.cancel_txn	= x86_pmu_cancel_txn,
 	.commit_txn	= x86_pmu_commit_txn,
 
 	.event_idx	= x86_pmu_event_idx,
+	.flush_branch_stack	= x86_pmu_flush_branch_stack,
 };
 
 void perf_update_user_clock(struct perf_event_mmap_page *userpg, u64 now)
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index a5281a4..1699d36 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -322,6 +322,7 @@ struct x86_pmu {
 	void		(*cpu_starting)(int cpu);
 	void		(*cpu_dying)(int cpu);
 	void		(*cpu_dead)(int cpu);
+	void		(*flush_branch_stack)(void);
 
 	/*
 	 * Intel Arch Perfmon v2+
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 7cc1e2d..6627089 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1539,6 +1539,18 @@ static void intel_pmu_cpu_dying(int cpu)
 	fini_debug_store_on_cpu(cpu);
 }
 
+static void intel_pmu_flush_branch_stack(void)
+{
+	/*
+	 * Intel LBR does not tag entries with the
+	 * PID of the current task, so we need to
+	 * flush it on ctxsw.
+	 * For now, we simply reset it.
+	 */
+	if (x86_pmu.lbr_nr)
+		intel_pmu_lbr_reset();
+}
+
 static __initconst const struct x86_pmu intel_pmu = {
 	.name			= "Intel",
 	.handle_irq		= intel_pmu_handle_irq,
@@ -1566,6 +1578,7 @@ static __initconst const struct x86_pmu intel_pmu = {
 	.cpu_starting		= intel_pmu_cpu_starting,
 	.cpu_dying		= intel_pmu_cpu_dying,
 	.guest_get_msrs		= intel_guest_get_msrs,
+	.flush_branch_stack	= intel_pmu_flush_branch_stack,
 };
 
 static __init void intel_clovertown_quirk(void)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 71b0232..366e2b4 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -746,6 +746,11 @@ struct pmu {
 	 * if no implementation is provided it will default to: event->hw.idx + 1.
 	 */
 	int (*event_idx)		(struct perf_event *event); /*optional */
+
+	/*
+	 * flush branch stack on context-switches (needed in cpu-wide mode)
+	 */
+	void (*flush_branch_stack)	(void);
 };
 
 /**
@@ -976,7 +981,8 @@ struct perf_event_context {
 	u64				parent_gen;
 	u64				generation;
 	int				pin_count;
-	int				nr_cgroups; /* cgroup events present */
+	int				nr_cgroups;	 /* cgroup evts */
+	int				nr_branch_stack; /* branch_stack evt */
 	struct rcu_head			rcu_head;
 };
 
@@ -1041,6 +1047,7 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr,
 extern u64 perf_event_read_value(struct perf_event *event,
 				 u64 *enabled, u64 *running);
 
+
 struct perf_sample_data {
 	u64				type;
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5660ffa..e91ce94 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -137,6 +137,7 @@ enum event_type_t {
  */
 struct jump_label_key_deferred perf_sched_events __read_mostly;
 static DEFINE_PER_CPU(atomic_t, perf_cgroup_events);
+static DEFINE_PER_CPU(atomic_t, perf_branch_stack_events);
 
 static atomic_t nr_mmap_events __read_mostly;
 static atomic_t nr_comm_events __read_mostly;
@@ -888,6 +889,9 @@ list_add_event(struct perf_event *event, struct perf_event_context *ctx)
 	if (is_cgroup_event(event))
 		ctx->nr_cgroups++;
 
+	if (has_branch_stack(event))
+		ctx->nr_branch_stack++;
+
 	list_add_rcu(&event->event_entry, &ctx->event_list);
 	if (!ctx->nr_events)
 		perf_pmu_rotate_start(ctx->pmu);
@@ -1027,6 +1031,9 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
 			cpuctx->cgrp = NULL;
 	}
 
+	if (has_branch_stack(event))
+		ctx->nr_branch_stack--;
+
 	ctx->nr_events--;
 	if (event->attr.inherit_stat)
 		ctx->nr_stat--;
@@ -2202,6 +2209,66 @@ static void perf_event_context_sched_in(struct perf_event_context *ctx,
 }
 
 /*
+ * When sampling the branch stack in system-wide mode, it may be necessary
+ * to flush the stack on context switch. This happens when the branch
+ * stack does not tag its entries with the pid of the current task.
+ * Otherwise it becomes impossible to associate a branch entry with a
+ * task. This ambiguity is more likely to appear when the branch stack
+ * supports priv level filtering and the user sets it to monitor only
+ * at the user level (which could be a useful measurement in system-wide
+ * mode). In that case, the risk is high of having a branch stack with
+ * branches from multiple tasks. Flushing may mean dropping the existing
+ * entries or stashing them somewhere in the PMU specific code layer.
+ *
+ * This function provides the context switch callback to the lower code
+ * layer. It is invoked ONLY when there is at least one system-wide context
+ * with at least one active event using taken branch sampling.
+ */
+static void perf_branch_stack_sched_in(struct task_struct *prev,
+				       struct task_struct *task)
+{
+	struct perf_cpu_context *cpuctx;
+	struct pmu *pmu;
+	unsigned long flags;
+
+	/* no need to flush branch stack if not changing task */
+	if (prev == task)
+		return;
+
+	local_irq_save(flags);
+
+	rcu_read_lock();
+
+	list_for_each_entry_rcu(pmu, &pmus, entry) {
+		cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
+
+		/*
+		 * check if the context has at least one
+		 * event using PERF_SAMPLE_BRANCH_STACK
+		 */
+		if (cpuctx->ctx.nr_branch_stack > 0
+		    && pmu->flush_branch_stack) {
+
+			pmu = cpuctx->ctx.pmu;
+
+			perf_ctx_lock(cpuctx, cpuctx->task_ctx);
+
+			perf_pmu_disable(pmu);
+
+			pmu->flush_branch_stack();
+
+			perf_pmu_enable(pmu);
+
+			perf_ctx_unlock(cpuctx, cpuctx->task_ctx);
+		}
+	}
+
+	rcu_read_unlock();
+
+	local_irq_restore(flags);
+}
+
+/*
  * Called from scheduler to add the events of the current task
  * with interrupts disabled.
  *
@@ -2232,6 +2299,10 @@ void __perf_event_task_sched_in(struct task_struct *prev,
 	 */
 	if (atomic_read(&__get_cpu_var(perf_cgroup_events)))
 		perf_cgroup_sched_in(prev, task);
+
+	/* check for system-wide branch_stack events */
+	if (atomic_read(&__get_cpu_var(perf_branch_stack_events)))
+		perf_branch_stack_sched_in(prev, task);
 }
 
 static u64 perf_calculate_period(struct perf_event *event, u64 nsec, u64 count)
@@ -2789,6 +2860,14 @@ static void free_event(struct perf_event *event)
 			atomic_dec(&per_cpu(perf_cgroup_events, event->cpu));
 			jump_label_dec_deferred(&perf_sched_events);
 		}
+
+		if (has_branch_stack(event)) {
+			jump_label_dec_deferred(&perf_sched_events);
+			/* is system-wide event */
+			if (!(event->attach_state & PERF_ATTACH_TASK))
+				atomic_dec(&per_cpu(perf_branch_stack_events,
+						    event->cpu));
+		}
 	}
 
 	if (event->rb) {
@@ -5915,6 +5994,12 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 				return ERR_PTR(err);
 			}
 		}
+		if (has_branch_stack(event)) {
+			jump_label_inc(&perf_sched_events.key);
+			if (!(event->attach_state & PERF_ATTACH_TASK))
+				atomic_inc(&per_cpu(perf_branch_stack_events,
+						    event->cpu));
+		}
 	}
 
 	return event;
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 11/18] perf: add code to support PERF_SAMPLE_BRANCH_STACK
  2012-02-02 12:54 [PATCH v5 00/18] perf: add support for sampling taken branches Stephane Eranian
                   ` (9 preceding siblings ...)
  2012-02-02 12:54 ` [PATCH v5 10/18] perf: add hook to flush branch_stack on context switch Stephane Eranian
@ 2012-02-02 12:54 ` Stephane Eranian
  2012-02-06 18:06   ` Arnaldo Carvalho de Melo
  2012-02-02 12:54 ` [PATCH v5 12/18] perf: add support for sampling taken branch to perf record Stephane Eranian
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 43+ messages in thread
From: Stephane Eranian @ 2012-02-02 12:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

From: Roberto Agostino Vitillo <ravitillo@lbl.gov>

This patch adds:
- the ability to parse samples with PERF_SAMPLE_BRANCH_STACK
- sorting on branches
- building histograms on branches

Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
Signed-off-by: Stephane Eranian <eranian@google.com>
---
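For reference, the branch stack section of a raw sample is a u64
count followed by nr {from, to, flags} triples, as parsed in the
evsel.c hunk below. A minimal standalone sketch of walking it (the
struct definitions mirror the perf.h hunk above; dump_branch_stack()
and its calling convention are illustrative only):

	#include <inttypes.h>
	#include <stdint.h>
	#include <stdio.h>

	/* mirrors the layout added to tools/perf/perf.h above */
	struct branch_flags { uint64_t mispred:1, predicted:1, reserved:62; };
	struct branch_entry { uint64_t from, to; struct branch_flags flags; };
	struct branch_stack { uint64_t nr; struct branch_entry entries[]; };

	/* returns the number of u64 words the section occupies */
	static uint64_t dump_branch_stack(const uint64_t *array)
	{
		const struct branch_stack *bs = (const void *)array;
		uint64_t i;

		for (i = 0; i < bs->nr; i++)
			printf("%016" PRIx64 " -> %016" PRIx64 " %s\n",
			       bs->entries[i].from, bs->entries[i].to,
			       bs->entries[i].flags.mispred ? "M" : "P");

		return 1 + bs->nr * sizeof(struct branch_entry) / sizeof(uint64_t);
	}
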
 tools/perf/perf.h          |   17 ++
 tools/perf/util/annotate.c |    2 +-
 tools/perf/util/event.h    |    1 +
 tools/perf/util/evsel.c    |   10 ++
 tools/perf/util/hist.c     |   93 +++++++++---
 tools/perf/util/hist.h     |    7 +
 tools/perf/util/session.c  |   72 +++++++++
 tools/perf/util/session.h  |    4 +
 tools/perf/util/sort.c     |  362 +++++++++++++++++++++++++++++++++-----------
 tools/perf/util/sort.h     |    5 +
 tools/perf/util/symbol.h   |   13 ++
 11 files changed, 475 insertions(+), 111 deletions(-)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 92af168..8b4d25d 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -180,6 +180,23 @@ struct ip_callchain {
 	u64 ips[0];
 };
 
+struct branch_flags {
+	u64 mispred:1;
+	u64 predicted:1;
+	u64 reserved:62;
+};
+
+struct branch_entry {
+	u64				from;
+	u64				to;
+	struct branch_flags flags;
+};
+
+struct branch_stack {
+	u64				nr;
+	struct branch_entry	entries[0];
+};
+
 extern bool perf_host, perf_guest;
 extern const char perf_version_string[];
 
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 011ed26..8248d80 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -64,7 +64,7 @@ int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
 
 	pr_debug3("%s: addr=%#" PRIx64 "\n", __func__, map->unmap_ip(map, addr));
 
-	if (addr >= sym->end)
+	if (addr >= sym->end || addr < sym->start)
 		return 0;
 
 	offset = addr - sym->start;
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index cbdeaad..1b19728 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -81,6 +81,7 @@ struct perf_sample {
 	u32 raw_size;
 	void *raw_data;
 	struct ip_callchain *callchain;
+	struct branch_stack *branch_stack;
 };
 
 #define BUILD_ID_SIZE 20
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index dcfefab..6b15cda 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -575,6 +575,16 @@ int perf_event__parse_sample(const union perf_event *event, u64 type,
 		data->raw_data = (void *) pdata;
 	}
 
+	if (type & PERF_SAMPLE_BRANCH_STACK) {
+		u64 sz;
+
+		data->branch_stack = (struct branch_stack *)array;
+		array++; /* nr */
+
+		sz = data->branch_stack->nr * sizeof(struct branch_entry);
+		sz /= sizeof(uint64_t);
+		array += sz;
+	}
 	return 0;
 }
 
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 6f505d1..66f9936 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -54,9 +54,11 @@ static void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
 {
 	u16 len;
 
-	if (h->ms.sym)
-		hists__new_col_len(hists, HISTC_SYMBOL, h->ms.sym->namelen);
-	else {
+	if (h->ms.sym) {
+		int n = (int)h->ms.sym->namelen + 4;
+		int symlen = max(n, BITS_PER_LONG / 4 + 6);
+		hists__new_col_len(hists, HISTC_SYMBOL, symlen);
+	} else {
 		const unsigned int unresolved_col_width = BITS_PER_LONG / 4;
 
 		if (hists__col_len(hists, HISTC_DSO) < unresolved_col_width &&
@@ -195,26 +197,14 @@ static u8 symbol__parent_filter(const struct symbol *parent)
 	return 0;
 }
 
-struct hist_entry *__hists__add_entry(struct hists *hists,
+static struct hist_entry *add_hist_entry(struct hists *hists,
+				      struct hist_entry *entry,
 				      struct addr_location *al,
-				      struct symbol *sym_parent, u64 period)
+				      u64 period)
 {
 	struct rb_node **p;
 	struct rb_node *parent = NULL;
 	struct hist_entry *he;
-	struct hist_entry entry = {
-		.thread	= al->thread,
-		.ms = {
-			.map	= al->map,
-			.sym	= al->sym,
-		},
-		.cpu	= al->cpu,
-		.ip	= al->addr,
-		.level	= al->level,
-		.period	= period,
-		.parent = sym_parent,
-		.filtered = symbol__parent_filter(sym_parent),
-	};
 	int cmp;
 
 	pthread_mutex_lock(&hists->lock);
@@ -225,7 +215,7 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 		parent = *p;
 		he = rb_entry(parent, struct hist_entry, rb_node_in);
 
-		cmp = hist_entry__cmp(&entry, he);
+		cmp = hist_entry__cmp(entry, he);
 
 		if (!cmp) {
 			he->period += period;
@@ -239,7 +229,7 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 			p = &(*p)->rb_right;
 	}
 
-	he = hist_entry__new(&entry);
+	he = hist_entry__new(entry);
 	if (!he)
 		goto out_unlock;
 
@@ -252,6 +242,69 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
 	return he;
 }
 
+struct hist_entry *__hists__add_branch_entry(struct hists *self,
+					     struct addr_location *al,
+					     struct symbol *sym_parent,
+					     struct branch_info *bi,
+					     u64 period)
+{
+	struct hist_entry entry = {
+		.thread	= al->thread,
+		.ms = {
+			.map	= bi->to.map,
+			.sym	= bi->to.sym,
+		},
+		.cpu	= al->cpu,
+		.ip	= bi->to.addr,
+		.level	= al->level,
+		.period	= period,
+		.parent = sym_parent,
+		.filtered = symbol__parent_filter(sym_parent),
+		.branch_info = bi,
+	};
+	struct hist_entry *he;
+
+	he = add_hist_entry(self, &entry, al, period);
+	if (!he)
+		return NULL;
+
+	/*
+	 * in branch mode, we do not display al->sym, al->addr
+	 * but instead what is in branch_info. The addresses and
+	 * symbols there may need wider columns, so make sure they
+	 * are taken into account.
+	 *
+	 * hists__calc_col_len() tracks the max column width, so
+	 * we need to call it for both the from and to addresses
+	 */
+	entry.ip     = bi->from.addr;
+	entry.ms.map = bi->from.map;
+	entry.ms.sym = bi->from.sym;
+	hists__calc_col_len(self, &entry);
+
+	return he;
+}
+
+struct hist_entry *__hists__add_entry(struct hists *self,
+				      struct addr_location *al,
+				      struct symbol *sym_parent, u64 period)
+{
+	struct hist_entry entry = {
+		.thread	= al->thread,
+		.ms = {
+			.map	= al->map,
+			.sym	= al->sym,
+		},
+		.cpu	= al->cpu,
+		.ip	= al->addr,
+		.level	= al->level,
+		.period	= period,
+		.parent = sym_parent,
+		.filtered = symbol__parent_filter(sym_parent),
+	};
+
+	return add_hist_entry(self, &entry, al, period);
+}
+
 int64_t
 hist_entry__cmp(struct hist_entry *left, struct hist_entry *right)
 {
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 0d48613..801a04e 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -41,6 +41,7 @@ enum hist_column {
 	HISTC_COMM,
 	HISTC_PARENT,
 	HISTC_CPU,
+	HISTC_MISPREDICT,
 	HISTC_NR_COLS, /* Last entry */
 };
 
@@ -73,6 +74,12 @@ int hist_entry__snprintf(struct hist_entry *self, char *bf, size_t size,
 			 struct hists *hists);
 void hist_entry__free(struct hist_entry *);
 
+struct hist_entry *__hists__add_branch_entry(struct hists *self,
+					     struct addr_location *al,
+					     struct symbol *sym_parent,
+					     struct branch_info *bi,
+					     u64 period);
+
 void hists__output_resort(struct hists *self);
 void hists__output_resort_threaded(struct hists *hists);
 void hists__collapse_resort(struct hists *self);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 552c1c5..5ce3f31 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -229,6 +229,63 @@ static bool symbol__match_parent_regex(struct symbol *sym)
 	return 0;
 }
 
+static const u8 cpumodes[] = {
+	PERF_RECORD_MISC_USER,
+	PERF_RECORD_MISC_KERNEL,
+	PERF_RECORD_MISC_GUEST_USER,
+	PERF_RECORD_MISC_GUEST_KERNEL
+};
+#define NCPUMODES (sizeof(cpumodes)/sizeof(u8))
+
+static void ip__resolve_ams(struct machine *self, struct thread *thread,
+			    struct addr_map_symbol *ams,
+			    u64 ip)
+{
+	struct addr_location al;
+	size_t i;
+	u8 m;
+
+	memset(&al, 0, sizeof(al));
+
+	for (i = 0; i < NCPUMODES; i++) {
+		m = cpumodes[i];
+		/*
+		 * we cannot use the header.misc hint to determine whether a
+		 * branch stack address is user, kernel, guest, hypervisor.
+		 * Branches may straddle the kernel/user/hypervisor boundaries.
+		 * Thus, we have to try them consecutively until we find a match
+		 * or else, the symbol is unknown
+		 */
+		thread__find_addr_location(thread, self, m, MAP__FUNCTION,
+				ip, &al, NULL);
+		if (al.sym)
+			goto found;
+	}
+found:
+	ams->addr = ip;
+	ams->sym = al.sym;
+	ams->map = al.map;
+}
+
+struct branch_info *perf_session__resolve_bstack(struct machine *self,
+						 struct thread *thr,
+						 struct branch_stack *bs)
+{
+	struct branch_info *bi;
+	unsigned int i;
+
+	bi = calloc(bs->nr, sizeof(struct branch_info));
+	if (!bi)
+		return NULL;
+
+	for (i = 0; i < bs->nr; i++) {
+		ip__resolve_ams(self, thr, &bi[i].to, bs->entries[i].to);
+		ip__resolve_ams(self, thr, &bi[i].from, bs->entries[i].from);
+		bi[i].flags = bs->entries[i].flags;
+	}
+	return bi;
+}
+
 int machine__resolve_callchain(struct machine *self, struct perf_evsel *evsel,
 			       struct thread *thread,
 			       struct ip_callchain *chain,
@@ -697,6 +754,18 @@ static void callchain__printf(struct perf_sample *sample)
 		       i, sample->callchain->ips[i]);
 }
 
+static void branch_stack__printf(struct perf_sample *sample)
+{
+	uint64_t i;
+
+	printf("... branch stack: nr:%" PRIu64 "\n", sample->branch_stack->nr);
+
+	for (i = 0; i < sample->branch_stack->nr; i++)
+		printf("..... %2"PRIu64": %016" PRIx64 " -> %016" PRIx64 "\n",
+			i, sample->branch_stack->entries[i].from,
+			sample->branch_stack->entries[i].to);
+}
+
 static void perf_session__print_tstamp(struct perf_session *session,
 				       union perf_event *event,
 				       struct perf_sample *sample)
@@ -744,6 +813,9 @@ static void dump_sample(struct perf_session *session, union perf_event *event,
 
 	if (session->sample_type & PERF_SAMPLE_CALLCHAIN)
 		callchain__printf(sample);
+
+	if (session->sample_type & PERF_SAMPLE_BRANCH_STACK)
+		branch_stack__printf(sample);
 }
 
 static struct machine *
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index c8d9017..accb5dc 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -73,6 +73,10 @@ int perf_session__resolve_callchain(struct perf_session *self, struct perf_evsel
 				    struct ip_callchain *chain,
 				    struct symbol **parent);
 
+struct branch_info *perf_session__resolve_bstack(struct machine *self,
+						 struct thread *thread,
+						 struct branch_stack *bs);
+
 bool perf_session__has_traces(struct perf_session *self, const char *msg);
 
 void mem_bswap_64(void *src, int byte_size);
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 16da30d..1531989 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -8,6 +8,7 @@ const char	default_sort_order[] = "comm,dso,symbol";
 const char	*sort_order = default_sort_order;
 int		sort__need_collapse = 0;
 int		sort__has_parent = 0;
+bool		sort__branch_mode;
 
 enum sort_type	sort__first_dimension;
 
@@ -94,6 +95,26 @@ static int hist_entry__comm_snprintf(struct hist_entry *self, char *bf,
 	return repsep_snprintf(bf, size, "%*s", width, self->thread->comm);
 }
 
+static int64_t _sort__dso_cmp(struct map *map_l, struct map *map_r)
+{
+	struct dso *dso_l = map_l ? map_l->dso : NULL;
+	struct dso *dso_r = map_r ? map_r->dso : NULL;
+	const char *dso_name_l, *dso_name_r;
+
+	if (!dso_l || !dso_r)
+		return cmp_null(dso_l, dso_r);
+
+	if (verbose) {
+		dso_name_l = dso_l->long_name;
+		dso_name_r = dso_r->long_name;
+	} else {
+		dso_name_l = dso_l->short_name;
+		dso_name_r = dso_r->short_name;
+	}
+
+	return strcmp(dso_name_l, dso_name_r);
+}
+
 struct sort_entry sort_comm = {
 	.se_header	= "Command",
 	.se_cmp		= sort__comm_cmp,
@@ -107,36 +128,74 @@ struct sort_entry sort_comm = {
 static int64_t
 sort__dso_cmp(struct hist_entry *left, struct hist_entry *right)
 {
-	struct dso *dso_l = left->ms.map ? left->ms.map->dso : NULL;
-	struct dso *dso_r = right->ms.map ? right->ms.map->dso : NULL;
-	const char *dso_name_l, *dso_name_r;
+	return _sort__dso_cmp(left->ms.map, right->ms.map);
+}
 
-	if (!dso_l || !dso_r)
-		return cmp_null(dso_l, dso_r);
 
-	if (verbose) {
-		dso_name_l = dso_l->long_name;
-		dso_name_r = dso_r->long_name;
-	} else {
-		dso_name_l = dso_l->short_name;
-		dso_name_r = dso_r->short_name;
+static int64_t _sort__sym_cmp(struct symbol *sym_l, struct symbol *sym_r,
+			      u64 ip_l, u64 ip_r)
+{
+	if (!sym_l || !sym_r)
+		return cmp_null(sym_l, sym_r);
+
+	if (sym_l == sym_r)
+		return 0;
+
+	if (sym_l)
+		ip_l = sym_l->start;
+	if (sym_r)
+		ip_r = sym_r->start;
+
+	return (int64_t)(ip_r - ip_l);
+}
+
+static int _hist_entry__dso_snprintf(struct map *map, char *bf,
+				     size_t size, unsigned int width)
+{
+	if (map && map->dso) {
+		const char *dso_name = !verbose ? map->dso->short_name :
+			map->dso->long_name;
+		return repsep_snprintf(bf, size, "%-*s", width, dso_name);
 	}
 
-	return strcmp(dso_name_l, dso_name_r);
+	return repsep_snprintf(bf, size, "%-*s", width, "[unknown]");
 }
 
 static int hist_entry__dso_snprintf(struct hist_entry *self, char *bf,
 				    size_t size, unsigned int width)
 {
-	if (self->ms.map && self->ms.map->dso) {
-		const char *dso_name = !verbose ? self->ms.map->dso->short_name :
-						  self->ms.map->dso->long_name;
-		return repsep_snprintf(bf, size, "%-*s", width, dso_name);
+	return _hist_entry__dso_snprintf(self->ms.map, bf, size, width);
+}
+
+static int _hist_entry__sym_snprintf(struct map *map, struct symbol *sym,
+				     u64 ip, char level, char *bf, size_t size,
+				     unsigned int width __used)
+{
+	size_t ret = 0;
+
+	if (verbose) {
+		char o = map ? dso__symtab_origin(map->dso) : '!';
+		ret += repsep_snprintf(bf, size, "%-#*llx %c ",
+				       BITS_PER_LONG / 4, ip, o);
 	}
 
-	return repsep_snprintf(bf, size, "%-*s", width, "[unknown]");
+	ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", level);
+	if (sym)
+		ret += repsep_snprintf(bf + ret, size - ret, "%-*s",
+				       width - ret,
+				       sym->name);
+	else {
+		size_t len = BITS_PER_LONG / 4;
+		ret += repsep_snprintf(bf + ret, size - ret, "%-#.*llx",
+				       len, ip);
+		ret += repsep_snprintf(bf + ret, size - ret, "%-*s",
+				       width - ret, "");
+	}
+
+	return ret;
 }
 
+
 struct sort_entry sort_dso = {
 	.se_header	= "Shared Object",
 	.se_cmp		= sort__dso_cmp,
@@ -144,8 +203,14 @@ struct sort_entry sort_dso = {
 	.se_width_idx	= HISTC_DSO,
 };
 
-/* --sort symbol */
+static int hist_entry__sym_snprintf(struct hist_entry *self, char *bf,
+				    size_t size, unsigned int width __used)
+{
+	return _hist_entry__sym_snprintf(self->ms.map, self->ms.sym, self->ip,
+					 self->level, bf, size, width);
+}
 
+/* --sort symbol */
 static int64_t
 sort__sym_cmp(struct hist_entry *left, struct hist_entry *right)
 {
@@ -154,40 +219,10 @@ sort__sym_cmp(struct hist_entry *left, struct hist_entry *right)
 	if (!left->ms.sym && !right->ms.sym)
 		return right->level - left->level;
 
-	if (!left->ms.sym || !right->ms.sym)
-		return cmp_null(left->ms.sym, right->ms.sym);
-
-	if (left->ms.sym == right->ms.sym)
-		return 0;
-
 	ip_l = left->ms.sym->start;
 	ip_r = right->ms.sym->start;
 
-	return (int64_t)(ip_r - ip_l);
-}
-
-static int hist_entry__sym_snprintf(struct hist_entry *self, char *bf,
-				    size_t size, unsigned int width __used)
-{
-	size_t ret = 0;
-
-	if (verbose) {
-		char o = self->ms.map ? dso__symtab_origin(self->ms.map->dso) : '!';
-		ret += repsep_snprintf(bf, size, "%-#*llx %c ",
-				       BITS_PER_LONG / 4, self->ip, o);
-	}
-
-	if (!sort_dso.elide)
-		ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", self->level);
-
-	if (self->ms.sym)
-		ret += repsep_snprintf(bf + ret, size - ret, "%s",
-				       self->ms.sym->name);
-	else
-		ret += repsep_snprintf(bf + ret, size - ret, "%-#*llx",
-				       BITS_PER_LONG / 4, self->ip);
-
-	return ret;
+	return _sort__sym_cmp(left->ms.sym, right->ms.sym, ip_l, ip_r);
 }
 
 struct sort_entry sort_sym = {
@@ -246,6 +281,135 @@ struct sort_entry sort_cpu = {
 	.se_width_idx	= HISTC_CPU,
 };
 
+static int64_t
+sort__dso_from_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	return _sort__dso_cmp(left->branch_info->from.map,
+			      right->branch_info->from.map);
+}
+
+static int hist_entry__dso_from_snprintf(struct hist_entry *self, char *bf,
+				    size_t size, unsigned int width)
+{
+	return _hist_entry__dso_snprintf(self->branch_info->from.map,
+					 bf, size, width);
+}
+
+struct sort_entry sort_dso_from = {
+	.se_header	= "Source Shared Object",
+	.se_cmp		= sort__dso_from_cmp,
+	.se_snprintf	= hist_entry__dso_from_snprintf,
+	.se_width_idx	= HISTC_DSO,
+};
+
+static int64_t
+sort__dso_to_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	return _sort__dso_cmp(left->branch_info->to.map,
+			      right->branch_info->to.map);
+}
+
+static int hist_entry__dso_to_snprintf(struct hist_entry *self, char *bf,
+				       size_t size, unsigned int width)
+{
+	return _hist_entry__dso_snprintf(self->branch_info->to.map,
+					 bf, size, width);
+}
+
+static int64_t
+sort__sym_from_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	struct addr_map_symbol *from_l = &left->branch_info->from;
+	struct addr_map_symbol *from_r = &right->branch_info->from;
+
+	if (!from_l->sym && !from_r->sym)
+		return right->level - left->level;
+
+	return _sort__sym_cmp(from_l->sym, from_r->sym, from_l->addr,
+			     from_r->addr);
+}
+
+static int64_t
+sort__sym_to_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	struct addr_map_symbol *to_l = &left->branch_info->to;
+	struct addr_map_symbol *to_r = &right->branch_info->to;
+
+	if (!to_l->sym && !to_r->sym)
+		return right->level - left->level;
+
+	return _sort__sym_cmp(to_l->sym, to_r->sym, to_l->addr, to_r->addr);
+}
+
+static int hist_entry__sym_from_snprintf(struct hist_entry *self, char *bf,
+				    size_t size, unsigned int width __used)
+{
+	struct addr_map_symbol *from = &self->branch_info->from;
+	return _hist_entry__sym_snprintf(from->map, from->sym, from->addr,
+					 self->level, bf, size, width);
+
+}
+
+static int hist_entry__sym_to_snprintf(struct hist_entry *self, char *bf,
+				    size_t size, unsigned int width __used)
+{
+	struct addr_map_symbol *to = &self->branch_info->to;
+	return _hist_entry__sym_snprintf(to->map, to->sym, to->addr,
+					 self->level, bf, size, width);
+
+}
+
+struct sort_entry sort_dso_to = {
+	.se_header	= "Target Shared Object",
+	.se_cmp		= sort__dso_to_cmp,
+	.se_snprintf	= hist_entry__dso_to_snprintf,
+	.se_width_idx	= HISTC_DSO,
+};
+
+struct sort_entry sort_sym_from = {
+	.se_header	= "Source Symbol",
+	.se_cmp		= sort__sym_from_cmp,
+	.se_snprintf	= hist_entry__sym_from_snprintf,
+	.se_width_idx	= HISTC_SYMBOL,
+};
+
+struct sort_entry sort_sym_to = {
+	.se_header	= "Target Symbol",
+	.se_cmp		= sort__sym_to_cmp,
+	.se_snprintf	= hist_entry__sym_to_snprintf,
+	.se_width_idx	= HISTC_SYMBOL,
+};
+
+static int64_t
+sort__mispredict_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	const unsigned char mp = left->branch_info->flags.mispred !=
+					right->branch_info->flags.mispred;
+	const unsigned char p = left->branch_info->flags.predicted !=
+					right->branch_info->flags.predicted;
+
+	return mp || p;
+}
+
+static int hist_entry__mispredict_snprintf(struct hist_entry *self, char *bf,
+				    size_t size, unsigned int width)
+{
+	const char *out = "N/A";
+
+	if (self->branch_info->flags.predicted)
+		out = "N";
+	else if (self->branch_info->flags.mispred)
+		out = "Y";
+
+	return repsep_snprintf(bf, size, "%-*s", width, out);
+}
+
+struct sort_entry sort_mispredict = {
+	.se_header	= "Branch Mispredicted",
+	.se_cmp		= sort__mispredict_cmp,
+	.se_snprintf	= hist_entry__mispredict_snprintf,
+	.se_width_idx	= HISTC_MISPREDICT,
+};
+
 struct sort_dimension {
 	const char		*name;
 	struct sort_entry	*entry;
@@ -253,14 +417,59 @@ struct sort_dimension {
 };
 
 static struct sort_dimension sort_dimensions[] = {
-	{ .name = "pid",	.entry = &sort_thread,	},
-	{ .name = "comm",	.entry = &sort_comm,	},
-	{ .name = "dso",	.entry = &sort_dso,	},
-	{ .name = "symbol",	.entry = &sort_sym,	},
-	{ .name = "parent",	.entry = &sort_parent,	},
-	{ .name = "cpu",	.entry = &sort_cpu,	},
+	{ .name = "pid",	.entry = &sort_thread,			},
+	{ .name = "comm",	.entry = &sort_comm,			},
+	{ .name = "dso",	.entry = &sort_dso,			},
+	{ .name = "dso_from",	.entry = &sort_dso_from, .taken = true	},
+	{ .name = "dso_to",	.entry = &sort_dso_to,	 .taken = true	},
+	{ .name = "symbol",	.entry = &sort_sym,			},
+	{ .name = "symbol_from",.entry = &sort_sym_from, .taken = true	},
+	{ .name = "symbol_to",	.entry = &sort_sym_to,	 .taken = true	},
+	{ .name = "parent",	.entry = &sort_parent,			},
+	{ .name = "cpu",	.entry = &sort_cpu,			},
+	{ .name = "mispredict", .entry = &sort_mispredict, },
 };
 
+static int _sort_dimension__add(struct sort_dimension *sd)
+{
+	if (sd->entry->se_collapse)
+		sort__need_collapse = 1;
+
+	if (sd->entry == &sort_parent) {
+		int ret = regcomp(&parent_regex, parent_pattern, REG_EXTENDED);
+		if (ret) {
+			char err[BUFSIZ];
+
+			regerror(ret, &parent_regex, err, sizeof(err));
+			pr_err("Invalid regex: %s\n%s", parent_pattern, err);
+			return -EINVAL;
+		}
+		sort__has_parent = 1;
+	}
+
+	if (list_empty(&hist_entry__sort_list)) {
+		if (!strcmp(sd->name, "pid"))
+			sort__first_dimension = SORT_PID;
+		else if (!strcmp(sd->name, "comm"))
+			sort__first_dimension = SORT_COMM;
+		else if (!strcmp(sd->name, "dso"))
+			sort__first_dimension = SORT_DSO;
+		else if (!strcmp(sd->name, "symbol"))
+			sort__first_dimension = SORT_SYM;
+		else if (!strcmp(sd->name, "parent"))
+			sort__first_dimension = SORT_PARENT;
+		else if (!strcmp(sd->name, "cpu"))
+			sort__first_dimension = SORT_CPU;
+		else if (!strcmp(sd->name, "mispredict"))
+			sort__first_dimension = SORT_MISPREDICTED;
+	}
+
+	list_add_tail(&sd->entry->list, &hist_entry__sort_list);
+	sd->taken = 1;
+
+	return 0;
+}
+
 int sort_dimension__add(const char *tok)
 {
 	unsigned int i;
@@ -271,48 +480,21 @@ int sort_dimension__add(const char *tok)
 		if (strncasecmp(tok, sd->name, strlen(tok)))
 			continue;
 
-		if (sd->entry == &sort_parent) {
-			int ret = regcomp(&parent_regex, parent_pattern, REG_EXTENDED);
-			if (ret) {
-				char err[BUFSIZ];
-
-				regerror(ret, &parent_regex, err, sizeof(err));
-				pr_err("Invalid regex: %s\n%s", parent_pattern, err);
-				return -EINVAL;
-			}
-			sort__has_parent = 1;
-		}
-
 		if (sd->taken)
 			return 0;
 
-		if (sd->entry->se_collapse)
-			sort__need_collapse = 1;
-
-		if (list_empty(&hist_entry__sort_list)) {
-			if (!strcmp(sd->name, "pid"))
-				sort__first_dimension = SORT_PID;
-			else if (!strcmp(sd->name, "comm"))
-				sort__first_dimension = SORT_COMM;
-			else if (!strcmp(sd->name, "dso"))
-				sort__first_dimension = SORT_DSO;
-			else if (!strcmp(sd->name, "symbol"))
-				sort__first_dimension = SORT_SYM;
-			else if (!strcmp(sd->name, "parent"))
-				sort__first_dimension = SORT_PARENT;
-			else if (!strcmp(sd->name, "cpu"))
-				sort__first_dimension = SORT_CPU;
-		}
-
-		list_add_tail(&sd->entry->list, &hist_entry__sort_list);
-		sd->taken = 1;
 
-		return 0;
+		if (sort__branch_mode && (sd->entry == &sort_dso ||
+					  sd->entry == &sort_sym)) {
+			int err = _sort_dimension__add(sd + 1);
+			return err ?: _sort_dimension__add(sd + 2);
+		} else if (sd->entry == &sort_mispredict && !sort__branch_mode) {
+			break;
+		} else {
+			return _sort_dimension__add(sd);
+		}
 	}
-
 	return -ESRCH;
 }
-
 void setup_sorting(const char * const usagestr[], const struct option *opts)
 {
 	char *tmp, *tok, *str = strdup(sort_order);
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 3f67ae3..effcae1 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -31,11 +31,14 @@ extern const char *parent_pattern;
 extern const char default_sort_order[];
 extern int sort__need_collapse;
 extern int sort__has_parent;
+extern bool sort__branch_mode;
 extern char *field_sep;
 extern struct sort_entry sort_comm;
 extern struct sort_entry sort_dso;
 extern struct sort_entry sort_sym;
 extern struct sort_entry sort_parent;
+extern struct sort_entry sort_dso_from;
+extern struct sort_entry sort_dso_to;
+extern struct sort_entry sort_sym_from;
+extern struct sort_entry sort_sym_to;
 extern enum sort_type sort__first_dimension;
 
 /**
@@ -72,6 +75,7 @@ struct hist_entry {
 		struct hist_entry *pair;
 		struct rb_root	  sorted_chain;
 	};
+	struct branch_info	*branch_info;
 	struct callchain_root	callchain[0];
 };
 
@@ -82,6 +86,7 @@ enum sort_type {
 	SORT_SYM,
 	SORT_PARENT,
 	SORT_CPU,
+	SORT_MISPREDICTED,
 };
 
 /*
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 2a683d4..5866ce6 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -5,6 +5,7 @@
 #include <stdbool.h>
 #include <stdint.h>
 #include "map.h"
+#include "../perf.h"
 #include <linux/list.h>
 #include <linux/rbtree.h>
 #include <stdio.h>
@@ -120,6 +121,18 @@ struct map_symbol {
 	bool	      has_children;
 };
 
+struct addr_map_symbol {
+	struct map    *map;
+	struct symbol *sym;
+	u64	      addr;
+};
+
+struct branch_info {
+	struct addr_map_symbol from;
+	struct addr_map_symbol to;
+	struct branch_flags flags;
+};
+
 struct addr_location {
 	struct thread *thread;
 	struct map    *map;
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 12/18] perf: add support for sampling taken branch to perf record
  2012-02-02 12:54 [PATCH v5 00/18] perf: add support for sampling taken branches Stephane Eranian
                   ` (10 preceding siblings ...)
  2012-02-02 12:54 ` [PATCH v5 11/18] perf: add code to support PERF_SAMPLE_BRANCH_STACK Stephane Eranian
@ 2012-02-02 12:54 ` Stephane Eranian
  2012-02-06 18:08   ` Arnaldo Carvalho de Melo
  2012-02-02 12:54 ` [PATCH v5 13/18] perf: add support for taken branch sampling to perf report Stephane Eranian
                   ` (5 subsequent siblings)
  17 siblings, 1 reply; 43+ messages in thread
From: Stephane Eranian @ 2012-02-02 12:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

From: Roberto Agostino Vitillo <ravitillo@lbl.gov>

This patch adds a new option to enable taken branch stack
sampling, i.e., to leverage the PERF_SAMPLE_BRANCH_STACK feature
of perf_events.

There is a new option to activate this mode: -b.
It is possible to pass a set of filters to select the type of
branches to sample.

The following filters are available:
- any     : any type of branches
- any_call: any function call or system call
- any_ret : any function return or system call return
- ind_call: any indirect branch
- u       : only when the branch target is at the user level
- k       : only when the branch target is in the kernel
- hv      : only when the branch target is in the hypervisor

Filters can be combined by passing a comma-separated list
to the option:

$ perf record -b any_call,u -e cycles:u branchy

Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/Documentation/perf-record.txt |   25 ++++++++++
 tools/perf/builtin-record.c              |   74 ++++++++++++++++++++++++++++++
 tools/perf/perf.h                        |    1 +
 tools/perf/util/evsel.c                  |    4 ++
 4 files changed, 104 insertions(+), 0 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index ff9a66e..288d429 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -152,6 +152,31 @@ an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must ha
 corresponding events, i.e., they always refer to events defined earlier on the command
 line.
 
+-b::
+--branch-stack::
+Enable taken branch stack sampling. Each sample captures a series of consecutive
+taken branches. The number of branches captured with each sample depends on the
+underlying hardware, the type of branches of interest, and the executed code.
+It is possible to select the types of branches captured by enabling filters. The
+following filters are defined:
+
+        - any     : any type of branches
+        - any_call: any function call or system call
+        - any_ret : any function return or system call return
+        - ind_call: any indirect branch
+        - u       : only when the branch target is at the user level
+        - k       : only when the branch target is in the kernel
+        - hv      : only when the branch target is at the hypervisor level
+
++
+At least one of any, any_call, any_ret, ind_call must be provided. The privilege levels may
+be omitted, in which case the privilege levels of the associated event are applied to the
+branch filter. Both kernel (k) and hypervisor (hv) privilege levels are subject to
+permissions. When sampling on multiple events, branch stack sampling is enabled for all
+the sampling events. The sampled branch type is the same for all events.
+Note that taken branch sampling may not be available on all processors.
+The various filters must be specified as a comma-separated list: -b any_ret,u,k
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 32870ee..6565164 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -637,6 +637,77 @@ static int __cmd_record(struct perf_record *rec, int argc, const char **argv)
 	return err;
 }
 
+#define BRANCH_OPT(n, m) \
+	{ .name = n, .mode = (m) }
+
+#define BRANCH_END { .name = NULL }
+
+struct branch_mode {
+	const char *name;
+	int mode;
+};
+
+static const struct branch_mode branch_modes[] = {
+	BRANCH_OPT("u", PERF_SAMPLE_BRANCH_USER),
+	BRANCH_OPT("k", PERF_SAMPLE_BRANCH_KERNEL),
+	BRANCH_OPT("hv", PERF_SAMPLE_BRANCH_HV),
+	BRANCH_OPT("any", PERF_SAMPLE_BRANCH_ANY),
+	BRANCH_OPT("any_call", PERF_SAMPLE_BRANCH_ANY_CALL),
+	BRANCH_OPT("any_ret", PERF_SAMPLE_BRANCH_ANY_RETURN),
+	BRANCH_OPT("ind_call", PERF_SAMPLE_BRANCH_IND_CALL),
+	BRANCH_END
+};
+
+static int
+parse_branch_stack(const struct option *opt, const char *str, int unset __used)
+{
+#define ONLY_PLM \
+	(PERF_SAMPLE_BRANCH_USER	|\
+	 PERF_SAMPLE_BRANCH_KERNEL	|\
+	 PERF_SAMPLE_BRANCH_HV)
+
+	uint64_t *mode = (uint64_t *)opt->value;
+	const struct branch_mode *br;
+	char *s, *os, *p;
+	int ret = -1;
+
+	*mode = 0;
+
+	/* because str is read-only */
+	s = os = strdup(str);
+	if (!s)
+		return -1;
+
+	for (;;) {
+		p = strchr(s, ',');
+		if (p)
+			*p = '\0';
+
+		for (br = branch_modes; br->name; br++) {
+			if (!strcasecmp(s, br->name))
+				break;
+		}
+		if (!br->name)
+			goto error;
+
+		*mode |= br->mode;
+
+		if (!p)
+			break;
+
+		s = p + 1;
+	}
+	ret = 0;
+
+	if ((*mode & ~ONLY_PLM) == 0) {
+		error("need at least one branch type with -b\n");
+		ret = -1;
+	}
+error:
+	free(os);
+	return ret;
+}
+
 static const char * const record_usage[] = {
 	"perf record [<options>] [<command>]",
 	"perf record [<options>] -- <command> [<options>]",
@@ -729,6 +800,9 @@ const struct option record_options[] = {
 		     "monitor event in cgroup name only",
 		     parse_cgroups),
 	OPT_STRING('u', "uid", &record.uid_str, "user", "user to profile"),
+	OPT_CALLBACK('b', "branch-stack", &record.opts.branch_stack,
+		     "branch mode mask", "branch stack sampling modes",
+		     parse_branch_stack),
 	OPT_END()
 };
 
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 8b4d25d..7f8fbab 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -222,6 +222,7 @@ struct perf_record_opts {
 	unsigned int freq;
 	unsigned int mmap_pages;
 	unsigned int user_freq;
+	int	     branch_stack;
 	u64	     default_interval;
 	u64	     user_interval;
 	const char   *cpu_list;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 6b15cda..63a6a16 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -126,6 +126,10 @@ void perf_evsel__config(struct perf_evsel *evsel, struct perf_record_opts *opts)
 		attr->watermark = 0;
 		attr->wakeup_events = 1;
 	}
+	if (opts->branch_stack) {
+		attr->sample_type	|= PERF_SAMPLE_BRANCH_STACK;
+		attr->branch_sample_type = opts->branch_stack;
+	}
 
 	attr->mmap = track;
 	attr->comm = track;
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 13/18] perf: add support for taken branch sampling to perf report
  2012-02-02 12:54 [PATCH v5 00/18] perf: add support for sampling taken branches Stephane Eranian
                   ` (11 preceding siblings ...)
  2012-02-02 12:54 ` [PATCH v5 12/18] perf: add support for sampling taken branch to perf record Stephane Eranian
@ 2012-02-02 12:54 ` Stephane Eranian
  2012-02-06 18:14   ` Arnaldo Carvalho de Melo
  2012-02-02 12:54 ` [PATCH v5 14/18] perf: fix endianness detection in perf.data Stephane Eranian
                   ` (4 subsequent siblings)
  17 siblings, 1 reply; 43+ messages in thread
From: Stephane Eranian @ 2012-02-02 12:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

From: Roberto Agostino Vitillo <ravitillo@lbl.gov>

This patch adds support for taken branch sampling, i.e., the
PERF_SAMPLE_BRANCH_STACK feature, to perf report. In other
words, it can display histograms based on taken branches rather
than executed instruction addresses.

The new option is called -b and it takes no argument. To
generate meaningful output, the perf.data must have been
obtained using perf record -b xxx ... where xxx is a branch
filter option.

The output shows source and target symbols and modules, sorted
by 'who branches where' most often. The percentages reported in
the first column refer to the total number of branches captured,
not the usual number of samples.

Here is a quick example, where branchy is a simple test program
which looks as follows:

void f2(void)
{}
void f3(void)
{}
void f1(unsigned long n)
{
  if (n & 1UL)
    f2();
  else
    f3();
}
int main(void)
{
  unsigned long i;

  for (i=0; i < N; i++)
   f1(i);
  return 0;
}

Here is the output captured on Nehalem when we are only
interested in user-level function calls.

$ perf record -b any_call,u -e cycles:u branchy

$ perf report -b --sort=symbol
    52.34%  [.] main                   [.] f1
    24.04%  [.] f1                     [.] f3
    23.60%  [.] f1                     [.] f2
     0.01%  [k] _IO_new_file_xsputn    [k] _IO_file_overflow
     0.01%  [k] _IO_vfprintf_internal  [k] _IO_new_file_xsputn
     0.01%  [k] _IO_vfprintf_internal  [k] strchrnul
     0.01%  [k] __printf               [k] _IO_vfprintf_internal
     0.01%  [k] main                   [k] __printf

About half (52%) of the call branches captured are from main() -> f1().
The other half (24%+23%) is split in two roughly equal shares between
f1() -> f2() and f1() -> f3(). This is expected given the code: each
loop iteration takes one main() -> f1() call plus one f1() -> f2() or
f1() -> f3() call, alternating with the parity of i.

It should be noted that using -b in perf record does not eliminate
information in the perf.data file. Consequently, a typical profile
can also be obtained by perf report by simply not using its -b option.

Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/Documentation/perf-report.txt |    7 ++
 tools/perf/builtin-report.c              |   98 +++++++++++++++++++++++++++---
 2 files changed, 96 insertions(+), 9 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 9b430e9..19b9092 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -153,6 +153,13 @@ OPTIONS
 	information which may be very large and thus may clutter the display.
 	It currently includes: cpu and numa topology of the host system.
 
+-b::
+--branch-stack::
+	Use the addresses of sampled taken branches instead of the instruction
+	address to build the histograms. To generate meaningful output, the
+	perf.data file must have been obtained using perf record -b xxx where
+	xxx is a branch filter option.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-annotate[1]
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 25d34d4..8a8d2f9 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -53,6 +53,50 @@ struct perf_report {
 	DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
 };
 
+static int perf_session__add_branch_hist_entry(struct perf_tool *tool,
+					struct addr_location *al,
+					struct perf_sample *sample,
+					struct perf_evsel *evsel,
+				      struct machine *machine)
+{
+	struct perf_report *rep = container_of(tool, struct perf_report, tool);
+	struct symbol *parent = NULL;
+	int err = 0;
+	unsigned i;
+	struct hist_entry *he;
+	struct branch_info *bi;
+
+	if ((sort__has_parent || symbol_conf.use_callchain)
+	    && sample->callchain) {
+		err = machine__resolve_callchain(machine, evsel, al->thread,
+						 sample->callchain, &parent);
+		if (err)
+			return err;
+	}
+
+	bi = perf_session__resolve_bstack(machine, al->thread,
+					  sample->branch_stack);
+	if (!bi)
+		return -ENOMEM;
+
+	for (i = 0; i < sample->branch_stack->nr; i++) {
+		if (rep->hide_unresolved && !(bi[i].from.sym && bi[i].to.sym))
+			continue;
+		/*
+		 * The report shows the percentage of total branches captured
+		 * and not events sampled. Thus we use a pseudo period of 1.
+		 */
+		he = __hists__add_branch_entry(&evsel->hists, al, parent,
+					       &bi[i], 1);
+		if (he) {
+			evsel->hists.stats.total_period += 1;
+			hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
+		} else
+			return -ENOMEM;
+	}
+	return err;
+}
+
 static int perf_evsel__add_hist_entry(struct perf_evsel *evsel,
 				      struct addr_location *al,
 				      struct perf_sample *sample,
@@ -126,14 +170,21 @@ static int process_sample_event(struct perf_tool *tool,
 	if (rep->cpu_list && !test_bit(sample->cpu, rep->cpu_bitmap))
 		return 0;
 
-	if (al.map != NULL)
-		al.map->dso->hit = 1;
+	if (sort__branch_mode) {
+		if (perf_session__add_branch_hist_entry(tool, &al, sample,
+						    evsel, machine)) {
+			pr_debug("problem adding lbr entry, skipping event\n");
+			return -1;
+		}
+	} else {
+		if (al.map != NULL)
+			al.map->dso->hit = 1;
 
-	if (perf_evsel__add_hist_entry(evsel, &al, sample, machine)) {
-		pr_debug("problem incrementing symbol period, skipping event\n");
-		return -1;
+		if (perf_evsel__add_hist_entry(evsel, &al, sample, machine)) {
+			pr_debug("problem incrementing symbol period, skipping event\n");
+			return -1;
+		}
 	}
-
 	return 0;
 }
 
@@ -188,6 +239,15 @@ static int perf_report__setup_sample_type(struct perf_report *rep)
 			}
 	}
 
+	if (sort__branch_mode) {
+		if (!(self->sample_type & PERF_SAMPLE_BRANCH_STACK)) {
+			fprintf(stderr, "selected -b but no branch data."
+					" Did you call perf record without"
+					" -b?\n");
+			return -1;
+		}
+	}
+
 	return 0;
 }
 
@@ -477,7 +537,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __used)
 	OPT_BOOLEAN(0, "stdio", &report.use_stdio,
 		    "Use the stdio interface"),
 	OPT_STRING('s', "sort", &sort_order, "key[,key2...]",
-		   "sort by key(s): pid, comm, dso, symbol, parent"),
+		   "sort by key(s): pid, comm, dso, symbol, parent, dso_to,"
+		   " dso_from, symbol_to, symbol_from, mispredict"),
 	OPT_BOOLEAN(0, "showcpuutilization", &symbol_conf.show_cpu_utilization,
 		    "Show sample percentage for different cpu modes"),
 	OPT_STRING('p', "parent", &parent_pattern, "regex",
@@ -517,6 +578,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __used)
 		   "Specify disassembler style (e.g. -M intel for intel syntax)"),
 	OPT_BOOLEAN(0, "show-total-period", &symbol_conf.show_total_period,
 		    "Show a column with the sum of periods"),
+	OPT_BOOLEAN('b', "branch-stack", &sort__branch_mode,
+		    "use branch records for histogram filling"),
 	OPT_END()
 	};
 
@@ -537,10 +600,27 @@ int cmd_report(int argc, const char **argv, const char *prefix __used)
 			report.input_name = "perf.data";
 	}
 
-	if (strcmp(report.input_name, "-") != 0)
+	if (sort__branch_mode) {
+		if (use_browser)
+			fprintf(stderr, "Warning: TUI interface not supported"
+					" in branch mode\n");
+		if (symbol_conf.dso_list_str != NULL)
+			fprintf(stderr, "Warning: dso filtering not supported"
+					" in branch mode\n");
+		if (symbol_conf.sym_list_str != NULL)
+			fprintf(stderr, "Warning: symbol filtering not"
+					" supported in branch mode\n");
+
+		report.use_stdio = true;
+		use_browser = 0;
 		setup_browser(true);
-	else
+		symbol_conf.dso_list_str = NULL;
+		symbol_conf.sym_list_str = NULL;
+	} else if (strcmp(report.input_name, "-") != 0) {
+		setup_browser(true);
+	} else {
 		use_browser = 0;
+	}
 
 	/*
 	 * Only in the newt browser we are doing integrated annotation,
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 14/18] perf: fix endianness detection in perf.data
  2012-02-02 12:54 [PATCH v5 00/18] perf: add support for sampling taken branches Stephane Eranian
                   ` (12 preceding siblings ...)
  2012-02-02 12:54 ` [PATCH v5 13/18] perf: add support for taken branch sampling to perf report Stephane Eranian
@ 2012-02-02 12:54 ` Stephane Eranian
  2012-02-06 18:17   ` Arnaldo Carvalho de Melo
  2012-02-17  9:42   ` [tip:perf/core] perf tools: " tip-bot for Stephane Eranian
  2012-02-02 12:54 ` [PATCH v5 15/18] perf: add ABI reference sizes Stephane Eranian
                   ` (3 subsequent siblings)
  17 siblings, 2 replies; 43+ messages in thread
From: Stephane Eranian @ 2012-02-02 12:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

The current version of perf detects whether or not
the perf.data file is written in a different endianness
using the attr_size field in the header of the file. This
field represents sizeof(struct perf_event_attr) as known
to perf record. If the sizes do not match, then perf tries
the byte-swapped version; if that matches, the tool assumes
the file was written with a different endianness.

The issue with the approach is that it assumes the size of
perf_event_attr always has to match between perf record and
perf report. However, the kernel perf_event ABI is extensible.
New fields can be added to struct perf_event_attr. Consequently,
it is not possible to use attr_size to detect endianness.

This patch takes another approach by using the magic number
written at the beginning of the perf.data file to detect
endianness. The magic number is an eight-byte signature.
Its primary purpose is to identify (as a signature) a perf.data
file. But it can also be used to encode the endianness.

The patch introduces a new value for this signature. The key
difference is that the signature is written differently in
the file depending on the endianness. Thus, by comparing the
signature from the file with the tool's own signature it is
possible to detect endianness. The new signature is "PERFILE2".

Backward compatibility with existing perf.data files is
ensured.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/util/header.c |   77 ++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 64 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index ecd7f4d..6f4187d 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -63,9 +63,20 @@ char *perf_header__find_event(u64 id)
 	return NULL;
 }
 
-static const char *__perf_magic = "PERFFILE";
+/*
+ * magic2 = "PERFILE2"
+ * must be a numerical value to let the endianness
+ * determine the memory layout. That way we are able
+ * to detect endianness when reading the perf.data file
+ * back.
+ *
+ * we check for legacy (PERFFILE) format.
+ */
+static const char *__perf_magic1 = "PERFFILE";
+static const u64 __perf_magic2    = 0x32454c4946524550ULL;
+static const u64 __perf_magic2_sw = 0x50455246494c4532ULL;
 
-#define PERF_MAGIC	(*(u64 *)__perf_magic)
+#define PERF_MAGIC	__perf_magic2
 
 struct perf_file_attr {
 	struct perf_event_attr	attr;
@@ -1620,24 +1631,59 @@ int perf_header__process_sections(struct perf_header *header, int fd,
 	return err;
 }
 
+static int check_magic_endian(u64 *magic, struct perf_file_header *header,
+			      struct perf_header *ph)
+{
+	int ret;
+
+	/* check for legacy format */
+	ret = memcmp(magic, __perf_magic1, sizeof(*magic));
+	if (ret == 0) {
+		pr_debug("legacy perf.data format\n");
+		if (!header)
+			return -1;
+
+		if (header->attr_size != sizeof(struct perf_file_attr)) {
+			u64 attr_size = bswap_64(header->attr_size);
+
+			if (attr_size != sizeof(struct perf_file_attr))
+				return -1;
+
+			ph->needs_swap = true;
+		}
+		return 0;
+	}
+
+	/* check magic number with same endianness */
+	if (*magic == __perf_magic2)
+		return 0;
+
+	/* check magic number but opposite endianness */
+	if (*magic != __perf_magic2_sw)
+		return -1;
+
+	ph->needs_swap = true;
+
+	return 0;
+}
+
 int perf_file_header__read(struct perf_file_header *header,
 			   struct perf_header *ph, int fd)
 {
+	int ret;
+
 	lseek(fd, 0, SEEK_SET);
 
-	if (readn(fd, header, sizeof(*header)) <= 0 ||
-	    memcmp(&header->magic, __perf_magic, sizeof(header->magic)))
+	ret = readn(fd, header, sizeof(*header));
+	if (ret <= 0)
 		return -1;
 
-	if (header->attr_size != sizeof(struct perf_file_attr)) {
-		u64 attr_size = bswap_64(header->attr_size);
-
-		if (attr_size != sizeof(struct perf_file_attr))
-			return -1;
+	if (check_magic_endian(&header->magic, header, ph) < 0)
+		return -1;
 
+	if (ph->needs_swap) {
 		mem_bswap_64(header, offsetof(struct perf_file_header,
-					    adds_features));
-		ph->needs_swap = true;
+			     adds_features));
 	}
 
 	if (header->size != sizeof(*header)) {
@@ -1873,8 +1919,13 @@ static int perf_file_header__read_pipe(struct perf_pipe_file_header *header,
 				       struct perf_header *ph, int fd,
 				       bool repipe)
 {
-	if (readn(fd, header, sizeof(*header)) <= 0 ||
-	    memcmp(&header->magic, __perf_magic, sizeof(header->magic)))
+	int ret;
+
+	ret = readn(fd, header, sizeof(*header));
+	if (ret <= 0)
+		return -1;
+
+	 if (check_magic_endian(&header->magic, NULL, ph) < 0)
 		return -1;
 
 	if (repipe && do_write(STDOUT_FILENO, header, sizeof(*header)) < 0)
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 15/18] perf: add ABI reference sizes
  2012-02-02 12:54 [PATCH v5 00/18] perf: add support for sampling taken branches Stephane Eranian
                   ` (13 preceding siblings ...)
  2012-02-02 12:54 ` [PATCH v5 14/18] perf: fix endianness detection in perf.data Stephane Eranian
@ 2012-02-02 12:54 ` Stephane Eranian
  2012-02-02 12:54 ` [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev Stephane Eranian
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 43+ messages in thread
From: Stephane Eranian @ 2012-02-02 12:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

This patch adds reference sizes for revisions 1
and 2 of the perf_event ABI, i.e., the size of
the perf_event_attr struct.

With Rev1: config2 was added = +8 bytes
With Rev2: branch_sample_type was added = +8 bytes

It adds the definitions for Rev1 and Rev2.

This is useful for tools trying to decode the revision
number based on the size of the struct.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 include/linux/perf_event.h |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 366e2b4..a1f983b 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -192,6 +192,8 @@ enum perf_event_read_format {
 };
 
 #define PERF_ATTR_SIZE_VER0	64	/* sizeof first published struct */
+#define PERF_ATTR_SIZE_VER1	72	/* add: config2 */
+#define PERF_ATTR_SIZE_VER2	80	/* add: branch_sample_type */
 
 /*
  * Hardware event_id to monitor via a performance monitoring event:
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev
  2012-02-02 12:54 [PATCH v5 00/18] perf: add support for sampling taken branches Stephane Eranian
                   ` (14 preceding siblings ...)
  2012-02-02 12:54 ` [PATCH v5 15/18] perf: add ABI reference sizes Stephane Eranian
@ 2012-02-02 12:54 ` Stephane Eranian
  2012-02-06 18:19   ` Arnaldo Carvalho de Melo
                     ` (2 more replies)
  2012-02-02 12:54 ` [PATCH v5 17/18] perf: fix bug print_event_desc() Stephane Eranian
  2012-02-02 12:54 ` [PATCH v5 18/18] perf: make perf able to read file from older ABIs Stephane Eranian
  17 siblings, 3 replies; 43+ messages in thread
From: Stephane Eranian @ 2012-02-02 12:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

This patch allows perf to process perf.data files generated
using an ABI that has a different perf_event_attr struct size, i.e.,
a different ABI version.

The perf_event_attr can be extended, yet perf needs to cope with
older perf.data files. Similarly, perf must be able to cope with
a perf.data file which is using a newer version of the ABI than
what it knows about.

This patch adds read_attr(), a routine that reads a perf_event_attr
struct from a file incrementally based on its advertised size. If
the on-file struct is smaller than what perf knows, then the extra
fields are zeroed. If the on-file struct is bigger, then perf only
uses what it knows about, the rest is skipped.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/util/header.c |   49 ++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 6f4187d..8d6c18d 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1959,6 +1959,51 @@ static int perf_header__read_pipe(struct perf_session *session, int fd)
 	return 0;
 }
 
+static int read_attr(int fd, struct perf_header *ph,
+		     struct perf_file_attr *f_attr)
+{
+	struct perf_event_attr *attr = &f_attr->attr;
+	size_t sz, left, skip = 0;
+	size_t our_sz = sizeof(f_attr->attr);
+	int ret;
+
+	memset(f_attr, 0, sizeof(*f_attr));
+
+	/* read minimal guaranteed structure */
+	ret = readn(fd, attr, PERF_ATTR_SIZE_VER0);
+	if (ret <= 0)
+		return -1;
+
+	/* on file perf_event_attr size */
+	sz = attr->size;
+	if (ph->needs_swap)
+		sz = bswap_32(sz);
+
+	if (sz == 0) {
+		/* assume ABI0 */
+		sz = PERF_ATTR_SIZE_VER0;
+	} else if (sz > our_sz) {
+		/* bigger than what we know about: note how much to skip */
+		skip = sz - our_sz;
+		sz = our_sz;
+	}
+	/* what we have not yet read and that we know about */
+	left = sz - PERF_ATTR_SIZE_VER0;
+	if (left) {
+		void *ptr = attr;
+		ptr += PERF_ATTR_SIZE_VER0;
+
+		ret = readn(fd, ptr, left);
+		if (ret <= 0)
+			return -1;
+	}
+	/* skip what we do not know about */
+	if (skip)
+		lseek(fd, skip, SEEK_CUR);
+	/* read the ids */
+	ret = readn(fd, &f_attr->ids, sizeof(struct perf_file_section));
+	return ret <= 0 ? -1 : 0;
+}
+
 int perf_session__read_header(struct perf_session *session, int fd)
 {
 	struct perf_header *header = &session->header;
@@ -1979,14 +2024,14 @@ int perf_session__read_header(struct perf_session *session, int fd)
 		return -EINVAL;
 	}
 
-	nr_attrs = f_header.attrs.size / sizeof(f_attr);
+	nr_attrs = f_header.attrs.size / f_header.attr_size;
 	lseek(fd, f_header.attrs.offset, SEEK_SET);
 
 	for (i = 0; i < nr_attrs; i++) {
 		struct perf_evsel *evsel;
 		off_t tmp;
 
-		if (readn(fd, &f_attr, sizeof(f_attr)) <= 0)
+		if (read_attr(fd, header, &f_attr) < 0)
 			goto out_errno;
 
 		if (header->needs_swap)
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 17/18] perf: fix bug print_event_desc()
  2012-02-02 12:54 [PATCH v5 00/18] perf: add support for sampling taken branches Stephane Eranian
                   ` (15 preceding siblings ...)
  2012-02-02 12:54 ` [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev Stephane Eranian
@ 2012-02-02 12:54 ` Stephane Eranian
  2012-02-02 12:54 ` [PATCH v5 18/18] perf: make perf able to read file from older ABIs Stephane Eranian
  17 siblings, 0 replies; 43+ messages in thread
From: Stephane Eranian @ 2012-02-02 12:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

This patch cleans up the local variable types for msz and ret.
They need to be size_t and ssize_t, respectively.

It also fixes a bug whereby perf would not read an attr struct
whose size differs from the one it knows about.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/util/header.c |   19 +++++++++----------
 1 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 8d6c18d..1fb365d 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1144,8 +1144,9 @@ static void print_event_desc(struct perf_header *ph, int fd, FILE *fp)
 	uint64_t id;
 	void *buf = NULL;
 	char *str;
-	u32 nre, sz, nr, i, j, msz;
-	int ret;
+	u32 nre, sz, nr, i, j;
+	ssize_t ret;
+	size_t msz;
 
 	/* number of events */
 	ret = read(fd, &nre, sizeof(nre));
@@ -1162,25 +1163,23 @@ static void print_event_desc(struct perf_header *ph, int fd, FILE *fp)
 	if (ph->needs_swap)
 		sz = bswap_32(sz);
 
-	/*
-	 * ensure it is at least to our ABI rev
-	 */
-	if (sz < (u32)sizeof(attr))
-		goto error;
-
 	memset(&attr, 0, sizeof(attr));
 
-	/* read entire region to sync up to next field */
+	/* buffer to hold on file attr struct */
 	buf = malloc(sz);
 	if (!buf)
 		goto error;
 
 	msz = sizeof(attr);
-	if (sz < msz)
+	if (sz < (ssize_t)msz)
 		msz = sz;
 
 	for (i = 0 ; i < nre; i++) {
 
+		/*
+		 * must read entire on-file attr struct to
+		 * sync up with layout.
+		 */
 		ret = read(fd, buf, sz);
 		if (ret != (ssize_t)sz)
 			goto error;
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 18/18] perf: make perf able to read file from older ABIs
  2012-02-02 12:54 [PATCH v5 00/18] perf: add support for sampling taken branches Stephane Eranian
                   ` (16 preceding siblings ...)
  2012-02-02 12:54 ` [PATCH v5 17/18] perf: fix bug print_event_desc() Stephane Eranian
@ 2012-02-02 12:54 ` Stephane Eranian
  17 siblings, 0 replies; 43+ messages in thread
From: Stephane Eranian @ 2012-02-02 12:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, mingo, acme, robert.richter, ming.m.lin, andi, asharma,
	ravitillo, vweaver1, khandual, dsahern

This patch provides a way to handle legacy perf.data
files. Legacy files are those using the older PERFFILE
signature.

For those, it is still necessary to detect endianness but
without comparing their header->attr_size with the
tool's own version as it may be different. Instead, we use
a reference table for all known sizes from the legacy era.

We try all combinations of size and endianness. If we find
a match, we proceed; otherwise we return "incompatible file format".
This is also done for the pipe-mode file format.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/util/header.c |  125 +++++++++++++++++++++++++++++++++++----------
 1 files changed, 97 insertions(+), 28 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 1fb365d..487605a 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1630,35 +1630,101 @@ int perf_header__process_sections(struct perf_header *header, int fd,
 	return err;
 }
 
-static int check_magic_endian(u64 *magic, struct perf_file_header *header,
-			      struct perf_header *ph)
+static const int attr_file_abi_sizes[] = {
+	[0] = PERF_ATTR_SIZE_VER0,
+	[1] = PERF_ATTR_SIZE_VER1,
+	0,
+};
+
+/*
+ * In the legacy file format, the magic number is not used to encode
+ * endianness; hdr_sz was used instead. But given that hdr_sz can vary
+ * based on the ABI revision, we need to try all known sizes in both
+ * endianness to detect the endianness of the file.
+ */
+static int try_all_file_abis(uint64_t hdr_sz, struct perf_header *ph)
 {
-	int ret;
+	uint64_t ref_size, attr_size;
+	int i;
 
-	/* check for legacy format */
-	ret = memcmp(magic, __perf_magic1, sizeof(*magic));
-	if (ret == 0) {
-		pr_debug("legacy perf.data format\n");
-		if (!header)
-			return -1;
+	for (i = 0 ; attr_file_abi_sizes[i]; i++) {
+		ref_size = attr_file_abi_sizes[i]
+			 + sizeof(struct perf_file_section);
+		if (hdr_sz != ref_size) {
+			attr_size = bswap_64(hdr_sz);
+			if (attr_size != ref_size)
+				continue;
 
-		if (header->attr_size != sizeof(struct perf_file_attr)) {
-			u64 attr_size = bswap_64(header->attr_size);
+			ph->needs_swap = true;
+		}
+		pr_debug("ABI%d perf.data file detected, need_swap=%d\n",
+			 i,
+			 ph->needs_swap);
+		return 0;
+	}
+	/* could not determine endianness */
+	return -1;
+}
 
-			if (attr_size != sizeof(struct perf_file_attr))
-				return -1;
+#define PERF_PIPE_HDR_VER0	16
+
+static const size_t attr_pipe_abi_sizes[] = {
+	[0] = PERF_PIPE_HDR_VER0,
+	0,
+};
+
+/*
+ * In the legacy pipe format, there is an implicit assumption that the
+ * endianness between the host recording the samples and the host parsing
+ * the samples is the same. This is not always the case given that the
+ * pipe output may be redirected into a file and analyzed on a different
+ * machine with a possibly different endianness and perf_event ABI
+ * revision in the perf tool itself.
+ */
+static int try_all_pipe_abis(uint64_t hdr_sz, struct perf_header *ph)
+{
+	u64 attr_size;
+	int i;
+
+	for (i = 0 ; attr_pipe_abi_sizes[i]; i++) {
+		if (hdr_sz != attr_pipe_abi_sizes[i]) {
+			attr_size = bswap_64(hdr_sz);
+			if (attr_size != hdr_sz)
+				continue;
 
 			ph->needs_swap = true;
 		}
+		pr_debug("Pipe ABI%d perf.data file detected\n", i);
 		return 0;
 	}
+	return -1;
+}
+
+static int check_magic_endian(u64 magic, uint64_t hdr_sz,
+			      bool is_pipe, struct perf_header *ph)
+{
+	int ret;
+
+	/* check for legacy format */
+	ret = memcmp(&magic, __perf_magic1, sizeof(magic));
+	if (ret == 0) {
+		pr_debug("legacy perf.data format\n");
+		if (is_pipe)
+			return try_all_pipe_abis(hdr_sz, ph);
+
+		return try_all_file_abis(hdr_sz, ph);
+	}
+	/*
+	 * the new magic number serves two purposes:
+	 * - unique number to identify actual perf.data files
+	 * - encode endianness of file
+	 */
 
-	/* check magic number with same endianness */
-	if (*magic == __perf_magic2)
+	/* check magic number with one endianness */
+	if (magic == __perf_magic2)
 		return 0;
 
-	/* check magic number but opposite endianness */
-	if (*magic != __perf_magic2_sw)
+	/* check magic number with opposite endianness */
+	if (magic != __perf_magic2_sw)
 		return -1;
 
 	ph->needs_swap = true;
@@ -1677,8 +1743,11 @@ int perf_file_header__read(struct perf_file_header *header,
 	if (ret <= 0)
 		return -1;
 
-	if (check_magic_endian(&header->magic, header, ph) < 0)
+	if (check_magic_endian(header->magic,
+			       header->attr_size, false, ph) < 0) {
+		pr_debug("magic/endian check failed\n");
 		return -1;
+	}
 
 	if (ph->needs_swap) {
 		mem_bswap_64(header, offsetof(struct perf_file_header,
@@ -1924,21 +1993,17 @@ static int perf_file_header__read_pipe(struct perf_pipe_file_header *header,
 	if (ret <= 0)
 		return -1;
 
-	 if (check_magic_endian(&header->magic, NULL, ph) < 0)
+	if (check_magic_endian(header->magic, header->size, true, ph) < 0) {
+		pr_debug("endian/magic failed\n");
 		return -1;
+	}
+
+	if (ph->needs_swap)
+		header->size = bswap_64(header->size);
 
 	if (repipe && do_write(STDOUT_FILENO, header, sizeof(*header)) < 0)
 		return -1;
 
-	if (header->size != sizeof(*header)) {
-		u64 size = bswap_64(header->size);
-
-		if (size != sizeof(*header))
-			return -1;
-
-		ph->needs_swap = true;
-	}
-
 	return 0;
 }
 
@@ -1975,6 +2040,10 @@ static int read_attr(int fd, struct perf_header *ph,
 
 	/* on file perf_event_attr size */
 	sz = attr->size;
 	if (ph->needs_swap)
 		sz = bswap_32(sz);
+
+	if (sz != our_sz)
+		pr_debug("on file attr=%zu vs. %zu bytes,"
+			 " ignoring extra fields\n", sz, our_sz);
 
-- 
1.7.4.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 11/18] perf: add code to support PERF_SAMPLE_BRANCH_STACK
  2012-02-02 12:54 ` [PATCH v5 11/18] perf: add code to support PERF_SAMPLE_BRANCH_STACK Stephane Eranian
@ 2012-02-06 18:06   ` Arnaldo Carvalho de Melo
  2012-02-07 14:11     ` Stephane Eranian
  0 siblings, 1 reply; 43+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-02-06 18:06 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, peterz, mingo, robert.richter, ming.m.lin, andi,
	asharma, ravitillo, vweaver1, khandual, dsahern

Em Thu, Feb 02, 2012 at 01:54:41PM +0100, Stephane Eranian escreveu:
> From: Roberto Agostino Vitillo <ravitillo@lbl.gov>
> 
> This patch adds:
> - ability to parse samples with PERF_SAMPLE_BRANCH_STACK
> - sort on branches
> - build histograms on branches

Some comments below, mostly minor stuff; looks like great work, thanks!

- Arnaldo
 
> Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  tools/perf/perf.h          |   17 ++
>  tools/perf/util/annotate.c |    2 +-
>  tools/perf/util/event.h    |    1 +
>  tools/perf/util/evsel.c    |   10 ++
>  tools/perf/util/hist.c     |   93 +++++++++---
>  tools/perf/util/hist.h     |    7 +
>  tools/perf/util/session.c  |   72 +++++++++
>  tools/perf/util/session.h  |    4 +
>  tools/perf/util/sort.c     |  362 +++++++++++++++++++++++++++++++++-----------
>  tools/perf/util/sort.h     |    5 +
>  tools/perf/util/symbol.h   |   13 ++
>  11 files changed, 475 insertions(+), 111 deletions(-)
> 
> diff --git a/tools/perf/perf.h b/tools/perf/perf.h
> index 92af168..8b4d25d 100644
> --- a/tools/perf/perf.h
> +++ b/tools/perf/perf.h
> @@ -180,6 +180,23 @@ struct ip_callchain {
>  	u64 ips[0];
>  };
>  
> +struct branch_flags {
> +	u64 mispred:1;
> +	u64 predicted:1;
> +	u64 reserved:62;
> +};
> +
> +struct branch_entry {
> +	u64				from;
> +	u64				to;
> +	struct branch_flags flags;
> +};
> +
> +struct branch_stack {
> +	u64				nr;
> +	struct branch_entry	entries[0];
> +};
> +
>  extern bool perf_host, perf_guest;
>  extern const char perf_version_string[];
>  
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index 011ed26..8248d80 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -64,7 +64,7 @@ int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
>  
>  	pr_debug3("%s: addr=%#" PRIx64 "\n", __func__, map->unmap_ip(map, addr));
>  
> -	if (addr >= sym->end)
> +	if (addr >= sym->end || addr < sym->start)

This is not related to this patch; it would be better to come as a
separate patch with a proper explanation.

>  		return 0;
>  
>  	offset = addr - sym->start;
> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
> index cbdeaad..1b19728 100644
> --- a/tools/perf/util/event.h
> +++ b/tools/perf/util/event.h
> @@ -81,6 +81,7 @@ struct perf_sample {
>  	u32 raw_size;
>  	void *raw_data;
>  	struct ip_callchain *callchain;
> +	struct branch_stack *branch_stack;
>  };
>  
>  #define BUILD_ID_SIZE 20
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index dcfefab..6b15cda 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -575,6 +575,16 @@ int perf_event__parse_sample(const union perf_event *event, u64 type,
>  		data->raw_data = (void *) pdata;
>  	}
>  
> +	if (type & PERF_SAMPLE_BRANCH_STACK) {
> +		u64 sz;
> +
> +		data->branch_stack = (struct branch_stack *)array;
> +		array++; /* nr */
> +
> +		sz = data->branch_stack->nr * sizeof(struct branch_entry);
> +		sz /= sizeof(uint64_t);

Consistency here: use sizeof(u64), or better yet: sizeof(sz);

> +		array += sz;
> +	}
>  	return 0;
>  }
>  
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 6f505d1..66f9936 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -54,9 +54,11 @@ static void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
>  {
>  	u16 len;
>  
> -	if (h->ms.sym)
> -		hists__new_col_len(hists, HISTC_SYMBOL, h->ms.sym->namelen);
> -	else {
> +	if (h->ms.sym) {
> +		int n = (int)h->ms.sym->namelen + 4;
> +		int symlen = max(n, BITS_PER_LONG / 4 + 6);

What is the rationale here? Adding a comment will help

> +		hists__new_col_len(hists, HISTC_SYMBOL, symlen);
> +	} else {
>  		const unsigned int unresolved_col_width = BITS_PER_LONG / 4;
>  
>  		if (hists__col_len(hists, HISTC_DSO) < unresolved_col_width &&
> @@ -195,26 +197,14 @@ static u8 symbol__parent_filter(const struct symbol *parent)
>  	return 0;
>  }
>  
> -struct hist_entry *__hists__add_entry(struct hists *hists,
> +static struct hist_entry *add_hist_entry(struct hists *hists,
> +				      struct hist_entry *entry,
>  				      struct addr_location *al,
> -				      struct symbol *sym_parent, u64 period)
> +				      u64 period)
>  {
>  	struct rb_node **p;
>  	struct rb_node *parent = NULL;
>  	struct hist_entry *he;
> -	struct hist_entry entry = {
> -		.thread	= al->thread,
> -		.ms = {
> -			.map	= al->map,
> -			.sym	= al->sym,
> -		},
> -		.cpu	= al->cpu,
> -		.ip	= al->addr,
> -		.level	= al->level,
> -		.period	= period,
> -		.parent = sym_parent,
> -		.filtered = symbol__parent_filter(sym_parent),
> -	};
>  	int cmp;
>  
>  	pthread_mutex_lock(&hists->lock);
> @@ -225,7 +215,7 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
>  		parent = *p;
>  		he = rb_entry(parent, struct hist_entry, rb_node_in);
>  
> -		cmp = hist_entry__cmp(&entry, he);
> +		cmp = hist_entry__cmp(entry, he);
>  
>  		if (!cmp) {
>  			he->period += period;
> @@ -239,7 +229,7 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
>  			p = &(*p)->rb_right;
>  	}
>  
> -	he = hist_entry__new(&entry);
> +	he = hist_entry__new(entry);
>  	if (!he)
>  		goto out_unlock;
>  
> @@ -252,6 +242,69 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
>  	return he;
>  }
>  
> +struct hist_entry *__hists__add_branch_entry(struct hists *self,
> +					     struct addr_location *al,
> +					     struct symbol *sym_parent,
> +					     struct branch_info *bi,
> +					     u64 period){
> +	struct hist_entry entry = {
> +		.thread	= al->thread,
> +		.ms = {
> +			.map	= bi->to.map,
> +			.sym	= bi->to.sym,
> +		},
> +		.cpu	= al->cpu,
> +		.ip	= bi->to.addr,
> +		.level	= al->level,
> +		.period	= period,
> +		.parent = sym_parent,
> +		.filtered = symbol__parent_filter(sym_parent),
> +		.branch_info = bi,
> +	};
> +	struct hist_entry *he;
> +
> +	he = add_hist_entry(self, &entry, al, period);
> +	if (!he)
> +		return NULL;
> +
> +	/*
> +	 * in branch mode, we do not display al->sym, al->addr

Really minor nit, but start with:  "In branch mode"

> +	 * but instead what is in branch_info. The addresses and
> +	 * symbols there may need wider columns, so make sure they
> +	 * are taken into account.
> +	 *
> +	 * hists__calc_col_len() tracks the max column width, so
> +	 * we need to call it for both the from and to addresses
> +	 */
> +	entry.ip     = bi->from.addr;
> +	entry.ms.map = bi->from.map;
> +	entry.ms.sym = bi->from.sym;
> +	hists__calc_col_len(self, &entry);
> +
> +	return he;
> +}
> +
> +struct hist_entry *__hists__add_entry(struct hists *self,
> +				      struct addr_location *al,
> +				      struct symbol *sym_parent, u64 period)
> +{
> +	struct hist_entry entry = {
> +		.thread	= al->thread,
> +		.ms = {
> +			.map	= al->map,
> +			.sym	= al->sym,
> +		},
> +		.cpu	= al->cpu,
> +		.ip	= al->addr,
> +		.level	= al->level,
> +		.period	= period,
> +		.parent = sym_parent,
> +		.filtered = symbol__parent_filter(sym_parent),
> +	};
> +
> +	return add_hist_entry(self, &entry, al, period);
> +}
> +
>  int64_t
>  hist_entry__cmp(struct hist_entry *left, struct hist_entry *right)
>  {
> diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
> index 0d48613..801a04e 100644
> --- a/tools/perf/util/hist.h
> +++ b/tools/perf/util/hist.h
> @@ -41,6 +41,7 @@ enum hist_column {
>  	HISTC_COMM,
>  	HISTC_PARENT,
>  	HISTC_CPU,
> +	HISTC_MISPREDICT,
>  	HISTC_NR_COLS, /* Last entry */
>  };
>  
> @@ -73,6 +74,12 @@ int hist_entry__snprintf(struct hist_entry *self, char *bf, size_t size,
>  			 struct hists *hists);
>  void hist_entry__free(struct hist_entry *);
>  
> +struct hist_entry *__hists__add_branch_entry(struct hists *self,
> +					     struct addr_location *al,
> +					     struct symbol *sym_parent,
> +					     struct branch_info *bi,
> +					     u64 period);
> +
>  void hists__output_resort(struct hists *self);
>  void hists__output_resort_threaded(struct hists *hists);
>  void hists__collapse_resort(struct hists *self);
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 552c1c5..5ce3f31 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -229,6 +229,63 @@ static bool symbol__match_parent_regex(struct symbol *sym)
>  	return 0;
>  }
>  
> +static const u8 cpumodes[] = {
> +	PERF_RECORD_MISC_USER,
> +	PERF_RECORD_MISC_KERNEL,
> +	PERF_RECORD_MISC_GUEST_USER,
> +	PERF_RECORD_MISC_GUEST_KERNEL
> +};
> +#define NCPUMODES (sizeof(cpumodes)/sizeof(u8))
> +
> +static void ip__resolve_ams(struct machine *self, struct thread *thread,
> +			    struct addr_map_symbol *ams,
> +			    u64 ip)
> +{
> +	struct addr_location al;
> +	size_t i;
> +	u8 m;
> +
> +	memset(&al, 0, sizeof(al));
> +
> +	for (i = 0; i < NCPUMODES; i++) {
> +		m = cpumodes[i];
> +		/*
> +		 * we cannot use the header.misc hint to determine whether a

ditto

> +		 * branch stack address is user, kernel, guest, hypervisor.
> +		 * Branches may straddle the kernel/user/hypervisor boundaries.
> +		 * Thus, we have to try * consecutively until we find a match

                                        ^ comment reflow artifact?

> +		 * or else, the symbol is unknown
> +		 */
> +		thread__find_addr_location(thread, self, m, MAP__FUNCTION,
> +				ip, &al, NULL);
> +		if (al.sym)
> +			goto found;
> +	}
> +found:
> +	ams->addr = ip;
> +	ams->sym = al.sym;
> +	ams->map = al.map;
> +}
> +
> +struct branch_info *perf_session__resolve_bstack(struct machine *self,
> +						 struct thread *thr,
> +						 struct branch_stack *bs)
> +{
> +	struct branch_info *bi;
> +	unsigned int i;
> +
> +	bi = calloc(bs->nr, sizeof(struct branch_info));
> +	if (!bi)
> +		return NULL;
> +
> +	for (i = 0; i < bs->nr; i++) {
> +		ip__resolve_ams(self, thr, &bi[i].to, bs->entries[i].to);
> +		ip__resolve_ams(self, thr, &bi[i].from, bs->entries[i].from);
> +		bi[i].flags = bs->entries[i].flags;
> +	}
> +	return bi;
> +}
> +
>  int machine__resolve_callchain(struct machine *self, struct perf_evsel *evsel,
>  			       struct thread *thread,
>  			       struct ip_callchain *chain,
> @@ -697,6 +754,18 @@ static void callchain__printf(struct perf_sample *sample)
>  		       i, sample->callchain->ips[i]);
>  }
>  
> +static void branch_stack__printf(struct perf_sample *sample)
> +{
> +	uint64_t i;
> +
> +	printf("... branch stack: nr:%" PRIu64 "\n", sample->branch_stack->nr);
> +
> +	for (i = 0; i < sample->branch_stack->nr; i++)
> +		printf("..... %2"PRIu64": %016" PRIx64 " -> %016" PRIx64 "\n",
> +			i, sample->branch_stack->entries[i].from,
> +			sample->branch_stack->entries[i].to);
> +}
> +
>  static void perf_session__print_tstamp(struct perf_session *session,
>  				       union perf_event *event,
>  				       struct perf_sample *sample)
> @@ -744,6 +813,9 @@ static void dump_sample(struct perf_session *session, union perf_event *event,
>  
>  	if (session->sample_type & PERF_SAMPLE_CALLCHAIN)
>  		callchain__printf(sample);
> +
> +	if (session->sample_type & PERF_SAMPLE_BRANCH_STACK)
> +		branch_stack__printf(sample);
>  }
>  
>  static struct machine *
> diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
> index c8d9017..accb5dc 100644
> --- a/tools/perf/util/session.h
> +++ b/tools/perf/util/session.h
> @@ -73,6 +73,10 @@ int perf_session__resolve_callchain(struct perf_session *self, struct perf_evsel
>  				    struct ip_callchain *chain,
>  				    struct symbol **parent);
>  
> +struct branch_info *perf_session__resolve_bstack(struct machine *self,
> +						 struct thread *thread,
> +						 struct branch_stack *bs);
> +
>  bool perf_session__has_traces(struct perf_session *self, const char *msg);
>  
>  void mem_bswap_64(void *src, int byte_size);
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index 16da30d..1531989 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -8,6 +8,7 @@ const char	default_sort_order[] = "comm,dso,symbol";
>  const char	*sort_order = default_sort_order;
>  int		sort__need_collapse = 0;
>  int		sort__has_parent = 0;
> +bool		sort__branch_mode;
>  
>  enum sort_type	sort__first_dimension;
>  
> @@ -94,6 +95,26 @@ static int hist_entry__comm_snprintf(struct hist_entry *self, char *bf,
>  	return repsep_snprintf(bf, size, "%*s", width, self->thread->comm);
>  }
>  
> +static int64_t _sort__dso_cmp(struct map *map_l, struct map *map_r)
> +{
> +	struct dso *dso_l = map_l ? map_l->dso : NULL;
> +	struct dso *dso_r = map_r ? map_r->dso : NULL;
> +	const char *dso_name_l, *dso_name_r;
> +
> +	if (!dso_l || !dso_r)
> +		return cmp_null(dso_l, dso_r);
> +
> +	if (verbose) {
> +		dso_name_l = dso_l->long_name;
> +		dso_name_r = dso_r->long_name;
> +	} else {
> +		dso_name_l = dso_l->short_name;
> +		dso_name_r = dso_r->short_name;
> +	}
> +
> +	return strcmp(dso_name_l, dso_name_r);
> +}
> +
>  struct sort_entry sort_comm = {
>  	.se_header	= "Command",
>  	.se_cmp		= sort__comm_cmp,
> @@ -107,36 +128,74 @@ struct sort_entry sort_comm = {
>  static int64_t
>  sort__dso_cmp(struct hist_entry *left, struct hist_entry *right)
>  {
> -	struct dso *dso_l = left->ms.map ? left->ms.map->dso : NULL;
> -	struct dso *dso_r = right->ms.map ? right->ms.map->dso : NULL;
> -	const char *dso_name_l, *dso_name_r;
> +	return _sort__dso_cmp(left->ms.map, right->ms.map);
> +}
>  
> -	if (!dso_l || !dso_r)
> -		return cmp_null(dso_l, dso_r);
>  
> -	if (verbose) {
> -		dso_name_l = dso_l->long_name;
> -		dso_name_r = dso_r->long_name;
> -	} else {
> -		dso_name_l = dso_l->short_name;
> -		dso_name_r = dso_r->short_name;
> +static int64_t _sort__sym_cmp(struct symbol *sym_l, struct symbol *sym_r,

We use double _ on the front with the same rationale as in the kernel,
i.e. we do a little bit less than what the non __ prefixed function
does (locking, etc).

> +			      u64 ip_l, u64 ip_r)
> +{
> +	if (!sym_l || !sym_r)
> +		return cmp_null(sym_l, sym_r);
> +
> +	if (sym_l == sym_r)
> +		return 0;
> +
> +	if (sym_l)
> +		ip_l = sym_l->start;
> +	if (sym_r)
> +		ip_r = sym_r->start;
> +
> +	return (int64_t)(ip_r - ip_l);
> +}
> +
> +static int _hist_entry__dso_snprintf(struct map *map, char *bf,
> +				     size_t size, unsigned int width)
> +{
> +	if (map && map->dso) {
> +		const char *dso_name = !verbose ? map->dso->short_name :
> +			map->dso->long_name;
> +		return repsep_snprintf(bf, size, "%-*s", width, dso_name);
>  	}
>  
> -	return strcmp(dso_name_l, dso_name_r);
> +	return repsep_snprintf(bf, size, "%-*s", width, "[unknown]");
>  }
>  
>  static int hist_entry__dso_snprintf(struct hist_entry *self, char *bf,
>  				    size_t size, unsigned int width)
>  {
> -	if (self->ms.map && self->ms.map->dso) {
> -		const char *dso_name = !verbose ? self->ms.map->dso->short_name :
> -						  self->ms.map->dso->long_name;
> -		return repsep_snprintf(bf, size, "%-*s", width, dso_name);
> +	return _hist_entry__dso_snprintf(self->ms.map, bf, size, width);
> +}
> +
> +static int _hist_entry__sym_snprintf(struct map *map, struct symbol *sym,
> +				     u64 ip, char level, char *bf, size_t size,
> +				     unsigned int width __used)
> +{
> +	size_t ret = 0;
> +
> +	if (verbose) {
> +		char o = map ? dso__symtab_origin(map->dso) : '!';
> +		ret += repsep_snprintf(bf, size, "%-#*llx %c ",
> +				       BITS_PER_LONG / 4, ip, o);
>  	}
>  
> -	return repsep_snprintf(bf, size, "%-*s", width, "[unknown]");
> +	ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", level);
> +	if (sym)
> +		ret += repsep_snprintf(bf + ret, size - ret, "%-*s",
> +				       width - ret,
> +				       sym->name);
> +	else {
> +		size_t len = BITS_PER_LONG / 4;
> +		ret += repsep_snprintf(bf + ret, size - ret, "%-#.*llx",
> +				       len, ip);
> +		ret += repsep_snprintf(bf + ret, size - ret, "%-*s",
> +				       width - ret, "");
> +	}
> +
> +	return ret;
>  }
>  
> +
>  struct sort_entry sort_dso = {
>  	.se_header	= "Shared Object",
>  	.se_cmp		= sort__dso_cmp,
> @@ -144,8 +203,14 @@ struct sort_entry sort_dso = {
>  	.se_width_idx	= HISTC_DSO,
>  };
>  
> -/* --sort symbol */
> +static int hist_entry__sym_snprintf(struct hist_entry *self, char *bf,
> +				    size_t size, unsigned int width __used)
> +{
> +	return _hist_entry__sym_snprintf(self->ms.map, self->ms.sym, self->ip,
> +					 self->level, bf, size, width);
> +}
>  
> +/* --sort symbol */
>  static int64_t
>  sort__sym_cmp(struct hist_entry *left, struct hist_entry *right)
>  {
> @@ -154,40 +219,10 @@ sort__sym_cmp(struct hist_entry *left, struct hist_entry *right)
>  	if (!left->ms.sym && !right->ms.sym)
>  		return right->level - left->level;
>  
> -	if (!left->ms.sym || !right->ms.sym)
> -		return cmp_null(left->ms.sym, right->ms.sym);
> -
> -	if (left->ms.sym == right->ms.sym)
> -		return 0;
> -
>  	ip_l = left->ms.sym->start;
>  	ip_r = right->ms.sym->start;
>  
> -	return (int64_t)(ip_r - ip_l);
> -}
> -
> -static int hist_entry__sym_snprintf(struct hist_entry *self, char *bf,
> -				    size_t size, unsigned int width __used)
> -{
> -	size_t ret = 0;
> -
> -	if (verbose) {
> -		char o = self->ms.map ? dso__symtab_origin(self->ms.map->dso) : '!';
> -		ret += repsep_snprintf(bf, size, "%-#*llx %c ",
> -				       BITS_PER_LONG / 4, self->ip, o);
> -	}
> -
> -	if (!sort_dso.elide)
> -		ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", self->level);
> -
> -	if (self->ms.sym)
> -		ret += repsep_snprintf(bf + ret, size - ret, "%s",
> -				       self->ms.sym->name);
> -	else
> -		ret += repsep_snprintf(bf + ret, size - ret, "%-#*llx",
> -				       BITS_PER_LONG / 4, self->ip);
> -
> -	return ret;
> +	return _sort__sym_cmp(left->ms.sym, right->ms.sym, ip_l, ip_r);
>  }
>  
>  struct sort_entry sort_sym = {
> @@ -246,6 +281,135 @@ struct sort_entry sort_cpu = {
>  	.se_width_idx	= HISTC_CPU,
>  };
>  
> +static int64_t
> +sort__dso_from_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> +	return _sort__dso_cmp(left->branch_info->from.map,
> +			      right->branch_info->from.map);
> +}
> +
> +static int hist_entry__dso_from_snprintf(struct hist_entry *self, char *bf,
> +				    size_t size, unsigned int width)
> +{
> +	return _hist_entry__dso_snprintf(self->branch_info->from.map,
> +					 bf, size, width);
> +}
> +
> +struct sort_entry sort_dso_from = {
> +	.se_header	= "Source Shared Object",
> +	.se_cmp		= sort__dso_from_cmp,
> +	.se_snprintf	= hist_entry__dso_from_snprintf,
> +	.se_width_idx	= HISTC_DSO,
> +};
> +
> +static int64_t
> +sort__dso_to_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> +	return _sort__dso_cmp(left->branch_info->to.map,
> +			      right->branch_info->to.map);
> +}
> +
> +static int hist_entry__dso_to_snprintf(struct hist_entry *self, char *bf,
> +				       size_t size, unsigned int width)
> +{
> +	return _hist_entry__dso_snprintf(self->branch_info->to.map,
> +					 bf, size, width);
> +}
> +
> +static int64_t
> +sort__sym_from_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> +	struct addr_map_symbol *from_l = &left->branch_info->from;
> +	struct addr_map_symbol *from_r = &right->branch_info->from;
> +
> +	if (!from_l->sym && !from_r->sym)
> +		return right->level - left->level;
> +
> +	return _sort__sym_cmp(from_l->sym, from_r->sym, from_l->addr,
> +			     from_r->addr);
> +}
> +
> +static int64_t
> +sort__sym_to_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> +	struct addr_map_symbol *to_l = &left->branch_info->to;
> +	struct addr_map_symbol *to_r = &right->branch_info->to;
> +
> +	if (!to_l->sym && !to_r->sym)
> +		return right->level - left->level;
> +
> +	return _sort__sym_cmp(to_l->sym, to_r->sym, to_l->addr, to_r->addr);
> +}
> +
> +static int hist_entry__sym_from_snprintf(struct hist_entry *self, char *bf,
> +				    size_t size, unsigned int width __used)
> +{
> +	struct addr_map_symbol *from = &self->branch_info->from;
> +	return _hist_entry__sym_snprintf(from->map, from->sym, from->addr,
> +					 self->level, bf, size, width);
> +
> +}
> +
> +static int hist_entry__sym_to_snprintf(struct hist_entry *self, char *bf,
> +				    size_t size, unsigned int width __used)
> +{
> +	struct addr_map_symbol *to = &self->branch_info->to;
> +	return _hist_entry__sym_snprintf(to->map, to->sym, to->addr,
> +					 self->level, bf, size, width);
> +
> +}
> +
> +struct sort_entry sort_dso_to = {
> +	.se_header	= "Target Shared Object",
> +	.se_cmp		= sort__dso_to_cmp,
> +	.se_snprintf	= hist_entry__dso_to_snprintf,
> +	.se_width_idx	= HISTC_DSO,
> +};
> +
> +struct sort_entry sort_sym_from = {
> +	.se_header	= "Source Symbol",
> +	.se_cmp		= sort__sym_from_cmp,
> +	.se_snprintf	= hist_entry__sym_from_snprintf,
> +	.se_width_idx	= HISTC_SYMBOL,
> +};
> +
> +struct sort_entry sort_sym_to = {
> +	.se_header	= "Target Symbol",
> +	.se_cmp		= sort__sym_to_cmp,
> +	.se_snprintf	= hist_entry__sym_to_snprintf,
> +	.se_width_idx	= HISTC_SYMBOL,
> +};
> +
> +static int64_t
> +sort__mispredict_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> +	const unsigned char mp = left->branch_info->flags.mispred !=
> +					right->branch_info->flags.mispred;
> +	const unsigned char p = left->branch_info->flags.predicted !=
> +					right->branch_info->flags.predicted;
> +
> +	return mp || p;
> +}
> +
> +static int hist_entry__mispredict_snprintf(struct hist_entry *self, char *bf,
> +				    size_t size, unsigned int width){
> +	static const char *out = "N/A";
> +
> +	if (self->branch_info->flags.predicted)
> +		out = "N";
> +	else if (self->branch_info->flags.mispred)
> +		out = "Y";
> +
> +	return repsep_snprintf(bf, size, "%-*s", width, out);
> +}
> +
> +struct sort_entry sort_mispredict = {
> +	.se_header	= "Branch Mispredicted",
> +	.se_cmp		= sort__mispredict_cmp,
> +	.se_snprintf	= hist_entry__mispredict_snprintf,
> +	.se_width_idx	= HISTC_MISPREDICT,
> +};
> +
>  struct sort_dimension {
>  	const char		*name;
>  	struct sort_entry	*entry;
> @@ -253,14 +417,59 @@ struct sort_dimension {
>  };
>  
>  static struct sort_dimension sort_dimensions[] = {
> -	{ .name = "pid",	.entry = &sort_thread,	},
> -	{ .name = "comm",	.entry = &sort_comm,	},
> -	{ .name = "dso",	.entry = &sort_dso,	},
> -	{ .name = "symbol",	.entry = &sort_sym,	},
> -	{ .name = "parent",	.entry = &sort_parent,	},
> -	{ .name = "cpu",	.entry = &sort_cpu,	},
> +	{ .name = "pid",	.entry = &sort_thread,			},
> +	{ .name = "comm",	.entry = &sort_comm,			},
> +	{ .name = "dso",	.entry = &sort_dso,			},
> +	{ .name = "dso_from",	.entry = &sort_dso_from, .taken = true	},
> +	{ .name = "dso_to",	.entry = &sort_dso_to,	 .taken = true	},
> +	{ .name = "symbol",	.entry = &sort_sym,			},
> +	{ .name = "symbol_from",.entry = &sort_sym_from, .taken = true	},
> +	{ .name = "symbol_to",	.entry = &sort_sym_to,	 .taken = true	},
> +	{ .name = "parent",	.entry = &sort_parent,			},
> +	{ .name = "cpu",	.entry = &sort_cpu,			},
> +	{ .name = "mispredict", .entry = &sort_mispredict, },
>  };
>  
> +static int _sort_dimension__add(struct sort_dimension *sd)
> +{
> +	if (sd->entry->se_collapse)
> +		sort__need_collapse = 1;
> +
> +	if (sd->entry == &sort_parent) {
> +		int ret = regcomp(&parent_regex, parent_pattern, REG_EXTENDED);
> +		if (ret) {
> +			char err[BUFSIZ];
> +
> +			regerror(ret, &parent_regex, err, sizeof(err));
> +			pr_err("Invalid regex: %s\n%s", parent_pattern, err);
> +			return -EINVAL;
> +		}
> +		sort__has_parent = 1;
> +	}
> +
> +	if (list_empty(&hist_entry__sort_list)) {
> +		if (!strcmp(sd->name, "pid"))
> +			sort__first_dimension = SORT_PID;
> +		else if (!strcmp(sd->name, "comm"))
> +			sort__first_dimension = SORT_COMM;
> +		else if (!strcmp(sd->name, "dso"))
> +			sort__first_dimension = SORT_DSO;
> +		else if (!strcmp(sd->name, "symbol"))
> +			sort__first_dimension = SORT_SYM;
> +		else if (!strcmp(sd->name, "parent"))
> +			sort__first_dimension = SORT_PARENT;
> +		else if (!strcmp(sd->name, "cpu"))
> +			sort__first_dimension = SORT_CPU;
> +		else if (!strcmp(sd->name, "mispredict"))
> +			sort__first_dimension = SORT_MISPREDICTED;
> +	}
> +
> +	list_add_tail(&sd->entry->list, &hist_entry__sort_list);
> +	sd->taken = 1;
> +
> +	return 0;
> +}
> +
>  int sort_dimension__add(const char *tok)
>  {
>  	unsigned int i;
> @@ -271,48 +480,21 @@ int sort_dimension__add(const char *tok)
>  		if (strncasecmp(tok, sd->name, strlen(tok)))
>  			continue;
>  
> -		if (sd->entry == &sort_parent) {
> -			int ret = regcomp(&parent_regex, parent_pattern, REG_EXTENDED);
> -			if (ret) {
> -				char err[BUFSIZ];
> -
> -				regerror(ret, &parent_regex, err, sizeof(err));
> -				pr_err("Invalid regex: %s\n%s", parent_pattern, err);
> -				return -EINVAL;
> -			}
> -			sort__has_parent = 1;
> -		}
> -
>  		if (sd->taken)
>  			return 0;
>  
> -		if (sd->entry->se_collapse)
> -			sort__need_collapse = 1;
> -
> -		if (list_empty(&hist_entry__sort_list)) {
> -			if (!strcmp(sd->name, "pid"))
> -				sort__first_dimension = SORT_PID;
> -			else if (!strcmp(sd->name, "comm"))
> -				sort__first_dimension = SORT_COMM;
> -			else if (!strcmp(sd->name, "dso"))
> -				sort__first_dimension = SORT_DSO;
> -			else if (!strcmp(sd->name, "symbol"))
> -				sort__first_dimension = SORT_SYM;
> -			else if (!strcmp(sd->name, "parent"))
> -				sort__first_dimension = SORT_PARENT;
> -			else if (!strcmp(sd->name, "cpu"))
> -				sort__first_dimension = SORT_CPU;
> -		}
> -
> -		list_add_tail(&sd->entry->list, &hist_entry__sort_list);
> -		sd->taken = 1;
>  
> -		return 0;
> +		if (sort__branch_mode && (sd->entry == &sort_dso ||
> +					  sd->entry == &sort_sym)) {
> +			int err = _sort_dimension__add(sd + 1);
> +			return err ?: _sort_dimension__add(sd + 2);
> +		} else if (sd->entry == &sort_mispredict && !sort__branch_mode)
> +			break;
> +		else
> +			return _sort_dimension__add(sd);
>  	}
> -
>  	return -ESRCH;
>  }
> -
>  void setup_sorting(const char * const usagestr[], const struct option *opts)
>  {
>  	char *tmp, *tok, *str = strdup(sort_order);
> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> index 3f67ae3..effcae1 100644
> --- a/tools/perf/util/sort.h
> +++ b/tools/perf/util/sort.h
> @@ -31,11 +31,14 @@ extern const char *parent_pattern;
>  extern const char default_sort_order[];
>  extern int sort__need_collapse;
>  extern int sort__has_parent;
> +extern bool sort__branch_mode;
>  extern char *field_sep;
>  extern struct sort_entry sort_comm;
>  extern struct sort_entry sort_dso;
>  extern struct sort_entry sort_sym;
>  extern struct sort_entry sort_parent;
> +extern struct sort_entry sort_dso_from;
> +extern struct sort_entry sort_dso_to;
> +extern struct sort_entry sort_sym_from;
> +extern struct sort_entry sort_sym_to;
>  extern enum sort_type sort__first_dimension;
>  
>  /**
> @@ -72,6 +75,7 @@ struct hist_entry {
>  		struct hist_entry *pair;
>  		struct rb_root	  sorted_chain;
>  	};
> +	struct branch_info	*branch_info;
>  	struct callchain_root	callchain[0];
>  };
>  
> @@ -82,6 +86,7 @@ enum sort_type {
>  	SORT_SYM,
>  	SORT_PARENT,
>  	SORT_CPU,
> +	SORT_MISPREDICTED,
>  };
>  
>  /*
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index 2a683d4..5866ce6 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -5,6 +5,7 @@
>  #include <stdbool.h>
>  #include <stdint.h>
>  #include "map.h"
> +#include "../perf.h"
>  #include <linux/list.h>
>  #include <linux/rbtree.h>
>  #include <stdio.h>
> @@ -120,6 +121,18 @@ struct map_symbol {
>  	bool	      has_children;
>  };
>  
> +struct addr_map_symbol {
> +	struct map    *map;
> +	struct symbol *sym;
> +	u64	      addr;
> +};
> +
> +struct branch_info {
> +	struct addr_map_symbol from;
> +	struct addr_map_symbol to;
> +	struct branch_flags flags;
> +};
> +
>  struct addr_location {
>  	struct thread *thread;
>  	struct map    *map;
> -- 
> 1.7.4.1


* Re: [PATCH v5 12/18] perf: add support for sampling taken branch to perf record
  2012-02-02 12:54 ` [PATCH v5 12/18] perf: add support for sampling taken branch to perf record Stephane Eranian
@ 2012-02-06 18:08   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 43+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-02-06 18:08 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, peterz, mingo, robert.richter, ming.m.lin, andi,
	asharma, ravitillo, vweaver1, khandual, dsahern

On Thu, Feb 02, 2012 at 01:54:42PM +0100, Stephane Eranian wrote:
> From: Roberto Agostino Vitillo <ravitillo@lbl.gov>
> 
> This patch adds a new option to enable taken branch stack
> sampling, i.e., leverage the PERF_SAMPLE_BRANCH_STACK feature
> of perf_events.
> 
> There is a new option to activate this mode: -b.
> It is possible to pass a set of filters to select the type of
> branches to sample.
> 
> The following filters are available:
> - any: any type of branches
> - any_call: any function call or system call
> - any_ret: any function return or system call return
> - ind_call: any indirect branch
> - u: only when the branch target is at the user level
> - k: only when the branch target is in the kernel
> - hv: only when the branch target is in the hypervisor
> 
> Filters can be combined by passing a comma separated list
> to the option:
> 
> $ perf record -b any_call,u -e cycles:u branchy
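> 
> As an illustrative sketch (not part of the patch), the command above
> translates into roughly the following perf_event_attr setup, with the
> cycles:u event supplying the sampled event and its privilege level:
> 
>   struct perf_event_attr attr;
> 
>   memset(&attr, 0, sizeof(attr));
>   attr.type = PERF_TYPE_HARDWARE;
>   attr.config = PERF_COUNT_HW_CPU_CYCLES;
>   attr.exclude_kernel = 1;	/* the :u modifier */
>   attr.exclude_hv = 1;
>   /* -b any_call,u */
>   attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
>   attr.branch_sample_type = PERF_SAMPLE_BRANCH_ANY_CALL |
> 			    PERF_SAMPLE_BRANCH_USER;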
> 
> Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>

Looks ok,

Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>

> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  tools/perf/Documentation/perf-record.txt |   25 ++++++++++
>  tools/perf/builtin-record.c              |   74 ++++++++++++++++++++++++++++++
>  tools/perf/perf.h                        |    1 +
>  tools/perf/util/evsel.c                  |    4 ++
>  4 files changed, 104 insertions(+), 0 deletions(-)
> 
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index ff9a66e..288d429 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -152,6 +152,31 @@ an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must ha
>  corresponding events, i.e., they always refer to events defined earlier on the command
>  line.
>  
> +-b::
> +--branch-stack::
> +Enable taken branch stack sampling. Each sample captures a series of consecutive
> +taken branches. The number of branches captured with each sample depends on the
> +underlying hardware, the type of branches of interest, and the executed code.
> +It is possible to select the types of branches captured by enabling filters. The
> +following filters are defined:
> +
> +        - any: any type of branches
> +        - any_call: any function call or system call
> +        - any_ret: any function return or system call return
> +        - ind_call: any indirect branch
> +        - u: only when the branch target is at the user level
> +        - k: only when the branch target is in the kernel
> +        - hv: only when the branch target is at the hypervisor level
> +
> ++
> +At least one of any, any_call, any_ret, ind_call must be provided. The privilege levels may
> +be omitted, in which case the privilege levels of the associated event are applied to the
> +branch filter. Both kernel (k) and hypervisor (hv) privilege levels are subject to
> +permissions. When sampling on multiple events, branch stack sampling is enabled for all
> +the sampling events. The sampled branch type is the same for all events.
> +Note that taken branch sampling may not be available on all processors.
> +The various filters must be specified as a comma-separated list: -b any_ret,u,k
> +
>  SEE ALSO
>  --------
>  linkperf:perf-stat[1], linkperf:perf-list[1]
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 32870ee..6565164 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -637,6 +637,77 @@ static int __cmd_record(struct perf_record *rec, int argc, const char **argv)
>  	return err;
>  }
>  
> +#define BRANCH_OPT(n, m) \
> +	{ .name = n, .mode = (m) }
> +
> +#define BRANCH_END { .name = NULL }
> +
> +struct branch_mode {
> +	const char *name;
> +	int mode;
> +};
> +
> +static const struct branch_mode branch_modes[] = {
> +	BRANCH_OPT("u", PERF_SAMPLE_BRANCH_USER),
> +	BRANCH_OPT("k", PERF_SAMPLE_BRANCH_KERNEL),
> +	BRANCH_OPT("hv", PERF_SAMPLE_BRANCH_HV),
> +	BRANCH_OPT("any", PERF_SAMPLE_BRANCH_ANY),
> +	BRANCH_OPT("any_call", PERF_SAMPLE_BRANCH_ANY_CALL),
> +	BRANCH_OPT("any_ret", PERF_SAMPLE_BRANCH_ANY_RETURN),
> +	BRANCH_OPT("ind_call", PERF_SAMPLE_BRANCH_IND_CALL),
> +	BRANCH_END
> +};
> +
> +static int
> +parse_branch_stack(const struct option *opt, const char *str, int unset __used)
> +{
> +#define ONLY_PLM \
> +	(PERF_SAMPLE_BRANCH_USER	|\
> +	 PERF_SAMPLE_BRANCH_KERNEL	|\
> +	 PERF_SAMPLE_BRANCH_HV)
> +
> +	uint64_t *mode = (uint64_t *)opt->value;
> +	const struct branch_mode *br;
> +	char *s, *os, *p;
> +	int ret = -1;
> +
> +	*mode = 0;
> +
> +	/* because str is read-only */
> +	s = os = strdup(str);
> +	if (!s)
> +		return -1;
> +
> +	for (;;) {
> +		p = strchr(s, ',');
> +		if (p)
> +			*p = '\0';
> +
> +		for (br = branch_modes; br->name; br++) {
> +			if (!strcasecmp(s, br->name))
> +				break;
> +		}
> +		if (!br->name)
> +			goto error;
> +
> +		*mode |= br->mode;
> +
> +		if (!p)
> +			break;
> +
> +		s = p + 1;
> +	}
> +	ret = 0;
> +
> +	if ((*mode & ~ONLY_PLM) == 0) {
> +		error("need at least one branch type with -b\n");
> +		ret = -1;
> +	}
> +error:
> +	free(os);
> +	return ret;
> +}
> +
>  static const char * const record_usage[] = {
>  	"perf record [<options>] [<command>]",
>  	"perf record [<options>] -- <command> [<options>]",
> @@ -729,6 +800,9 @@ const struct option record_options[] = {
>  		     "monitor event in cgroup name only",
>  		     parse_cgroups),
>  	OPT_STRING('u', "uid", &record.uid_str, "user", "user to profile"),
> +	OPT_CALLBACK('b', "branch-stack", &record.opts.branch_stack,
> +		     "branch mode mask", "branch stack sampling modes",
> +		     parse_branch_stack),
>  	OPT_END()
>  };
>  
> diff --git a/tools/perf/perf.h b/tools/perf/perf.h
> index 8b4d25d..7f8fbab 100644
> --- a/tools/perf/perf.h
> +++ b/tools/perf/perf.h
> @@ -222,6 +222,7 @@ struct perf_record_opts {
>  	unsigned int freq;
>  	unsigned int mmap_pages;
>  	unsigned int user_freq;
> +	u64	     branch_stack;
>  	u64	     default_interval;
>  	u64	     user_interval;
>  	const char   *cpu_list;
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 6b15cda..63a6a16 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -126,6 +126,10 @@ void perf_evsel__config(struct perf_evsel *evsel, struct perf_record_opts *opts)
>  		attr->watermark = 0;
>  		attr->wakeup_events = 1;
>  	}
> +	if (opts->branch_stack) {
> +		attr->sample_type	|= PERF_SAMPLE_BRANCH_STACK;
> +		attr->branch_sample_type = opts->branch_stack;
> +	}
>  
>  	attr->mmap = track;
>  	attr->comm = track;
> -- 
> 1.7.4.1


* Re: [PATCH v5 13/18] perf: add support for taken branch sampling to perf report
  2012-02-02 12:54 ` [PATCH v5 13/18] perf: add support for taken branch sampling to perf report Stephane Eranian
@ 2012-02-06 18:14   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 43+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-02-06 18:14 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, peterz, mingo, robert.richter, ming.m.lin, andi,
	asharma, ravitillo, vweaver1, khandual, dsahern

On Thu, Feb 02, 2012 at 01:54:43PM +0100, Stephane Eranian wrote:
> From: Roberto Agostino Vitillo <ravitillo@lbl.gov>
> 
> This patch adds support for taken branch sampling, i.e., the
> PERF_SAMPLE_BRANCH_STACK feature, to perf report. In other
> words, to display histograms based on taken branches rather
> than executed instruction addresses.
> 
> The new option is called -b and it takes no argument. To
> generate meaningful output, the perf.data file must have been
> obtained using perf record -b xxx ... where xxx is a branch
> filter option.
> 
> The output shows symbols and modules, sorted by 'who branches
> where' most often. The percentages reported in the first
> column refer to the total number of branches captured and
> not the usual number of samples.
> 
> Here is a quick example.
> Here branchy is simple test program which looks as follows:
> 
> void f2(void)
> {}
> void f3(void)
> {}
> void f1(unsigned long n)
> {
>   if (n & 1UL)
>     f2();
>   else
>     f3();
> }
> int main(void)
> {
>   unsigned long i;
> 
>   for (i=0; i < N; i++)
>    f1(i);
>   return 0;
> }
> 
> Here is the output captured on Nehalem when we are
> only interested in user level function calls.
> 
> $ perf record -b any_call,u -e cycles:u branchy
> 
> $ perf report -b --sort=symbol
>     52.34%  [.] main                   [.] f1
>     24.04%  [.] f1                     [.] f3
>     23.60%  [.] f1                     [.] f2
>      0.01%  [k] _IO_new_file_xsputn    [k] _IO_file_overflow
>      0.01%  [k] _IO_vfprintf_internal  [k] _IO_new_file_xsputn
>      0.01%  [k] _IO_vfprintf_internal  [k] strchrnul
>      0.01%  [k] __printf               [k] _IO_vfprintf_internal
>      0.01%  [k] main                   [k] __printf
> 
> About half (52%) of the call branches captured are from main() -> f1().
> The other half (24%+23%) is split into two equal shares between
> f1() -> f2() and f1() -> f3(). The output is as expected given the code.

It would be great to have a 'perf test' entry for the above test case
that would try to use this feature and, if the kernel didn't bail out
(meaning the hardware supports it), validate the results as is done in
other perf test cases.

Just some minor comments about method naming/class ownership below.
 
> It should be noted that using -b in perf record does not eliminate
> information in the perf.data file. Consequently, a typical profile
> can also be obtained by perf report by simply not using its -b option.
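> 
> For example (a sketch reusing the perf.data recorded above):
> 
> $ perf report -b --sort=symbol_from,symbol_to    # branch view
> $ perf report --sort=symbol                      # regular profile, same file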
> 
> Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  tools/perf/Documentation/perf-report.txt |    7 ++
>  tools/perf/builtin-report.c              |   98 +++++++++++++++++++++++++++---
>  2 files changed, 96 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index 9b430e9..19b9092 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -153,6 +153,13 @@ OPTIONS
>  	information which may be very large and thus may clutter the display.
>  	It currently includes: cpu and numa topology of the host system.
>  
> +-b::
> +--branch-stack::
> +	Use the addresses of sampled taken branches instead of the instruction
> +	address to build the histograms. To generate meaningful output, the
> +	perf.data file must have been obtained using perf record -b xxx where
> +	xxx is a branch filter option.
> +
>  SEE ALSO
>  --------
>  linkperf:perf-stat[1], linkperf:perf-annotate[1]
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index 25d34d4..8a8d2f9 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -53,6 +53,50 @@ struct perf_report {
>  	DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
>  };
>  
> +static int perf_session__add_branch_hist_entry(struct perf_tool *tool,
> +					struct addr_location *al,
> +					struct perf_sample *sample,
> +					struct perf_evsel *evsel,
> +				      struct machine *machine)

The naming here should be:

static int perf_report__add_branch_hist_entry, as this is just a
'struct perf_report' specific method, perf_report being a
specialization of a 'perf_tool', etc.

> +{
> +	struct perf_report *rep = container_of(tool, struct perf_report, tool);
> +	struct symbol *parent = NULL;
> +	int err = 0;
> +	unsigned i;
> +	struct hist_entry *he;
> +	struct branch_info *bi;
> +
> +	if ((sort__has_parent || symbol_conf.use_callchain)
> +	    && sample->callchain) {
> +		err = machine__resolve_callchain(machine, evsel, al->thread,
> +						 sample->callchain, &parent);
> +		if (err)
> +			return err;
> +	}
> +
> +	bi = perf_session__resolve_bstack(machine, al->thread,
> +					  sample->branch_stack);

This one then is just a 'struct machine' method, hence:

	bi = machine__resolve_bstack(machine, al->thread, sample->branch_stack);

> +	if (!bi)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < sample->branch_stack->nr; i++) {
> +		if (rep->hide_unresolved && !(bi[i].from.sym && bi[i].to.sym))
> +			continue;
> +		/*
> +		 * The report shows the percentage of total branches captured
> +		 * and not events sampled. Thus we use a pseudo period of 1.
> +		 */
> +		he = __hists__add_branch_entry(&evsel->hists, al, parent,
> +					       &bi[i], 1);
> +		if (he) {
> +			evsel->hists.stats.total_period += 1;
> +			hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
> +		} else
> +			return -ENOMEM;
> +	}
> +	return err;
> +}
> +
>  static int perf_evsel__add_hist_entry(struct perf_evsel *evsel,
>  				      struct addr_location *al,
>  				      struct perf_sample *sample,
> @@ -126,14 +170,21 @@ static int process_sample_event(struct perf_tool *tool,
>  	if (rep->cpu_list && !test_bit(sample->cpu, rep->cpu_bitmap))
>  		return 0;
>  
> -	if (al.map != NULL)
> -		al.map->dso->hit = 1;
> +	if (sort__branch_mode) {
> +		if (perf_session__add_branch_hist_entry(tool, &al, sample,
> +						    evsel, machine)) {
> +			pr_debug("problem adding lbr entry, skipping event\n");
> +			return -1;
> +		}
> +	} else {
> +		if (al.map != NULL)
> +			al.map->dso->hit = 1;
>  
> -	if (perf_evsel__add_hist_entry(evsel, &al, sample, machine)) {
> -		pr_debug("problem incrementing symbol period, skipping event\n");
> -		return -1;
> +		if (perf_evsel__add_hist_entry(evsel, &al, sample, machine)) {
> +			pr_debug("problem incrementing symbol period, skipping event\n");
> +			return -1;
> +		}
>  	}
> -
>  	return 0;
>  }
>  
> @@ -188,6 +239,15 @@ static int perf_report__setup_sample_type(struct perf_report *rep)
>  			}
>  	}
>  
> +	if (sort__branch_mode) {
> +		if (!(self->sample_type & PERF_SAMPLE_BRANCH_STACK)) {
> +			fprintf(stderr, "selected -b but no branch data."
> +					" Did you call perf record without"
> +					" -b?\n");
> +			return -1;
> +		}
> +	}
> +
>  	return 0;
>  }
>  
> @@ -477,7 +537,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __used)
>  	OPT_BOOLEAN(0, "stdio", &report.use_stdio,
>  		    "Use the stdio interface"),
>  	OPT_STRING('s', "sort", &sort_order, "key[,key2...]",
> -		   "sort by key(s): pid, comm, dso, symbol, parent"),
> +		   "sort by key(s): pid, comm, dso, symbol, parent, dso_to,"
> +		   " dso_from, symbol_to, symbol_from, mispredict"),
>  	OPT_BOOLEAN(0, "showcpuutilization", &symbol_conf.show_cpu_utilization,
>  		    "Show sample percentage for different cpu modes"),
>  	OPT_STRING('p', "parent", &parent_pattern, "regex",
> @@ -517,6 +578,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __used)
>  		   "Specify disassembler style (e.g. -M intel for intel syntax)"),
>  	OPT_BOOLEAN(0, "show-total-period", &symbol_conf.show_total_period,
>  		    "Show a column with the sum of periods"),
> +	OPT_BOOLEAN('b', "branch-stack", &sort__branch_mode,
> +		    "use branch records for histogram filling"),
>  	OPT_END()
>  	};
>  
> @@ -537,10 +600,27 @@ int cmd_report(int argc, const char **argv, const char *prefix __used)
>  			report.input_name = "perf.data";
>  	}
>  
> -	if (strcmp(report.input_name, "-") != 0)
> +	if (sort__branch_mode) {
> +		if (use_browser)
> +			fprintf(stderr, "Warning: TUI interface not supported"
> +					" in branch mode\n");

I'll put this on my TODO list :-)

> +		if (symbol_conf.dso_list_str != NULL)
> +			fprintf(stderr, "Warning: dso filtering not supported"
> +					" in branch mode\n");
> +		if (symbol_conf.sym_list_str != NULL)
> +			fprintf(stderr, "Warning: symbol filtering not"
> +					" supported in branch mode\n");
> +
> +		report.use_stdio = true;
> +		use_browser = 0;
>  		setup_browser(true);
> -	else
> +		symbol_conf.dso_list_str = NULL;
> +		symbol_conf.sym_list_str = NULL;
> +	} else if (strcmp(report.input_name, "-") != 0) {
> +		setup_browser(true);
> +	} else {
>  		use_browser = 0;
> +	}
>  
>  	/*
>  	 * Only in the newt browser we are doing integrated annotation,
> -- 
> 1.7.4.1


* Re: [PATCH v5 14/18] perf: fix endianness detection in perf.data
  2012-02-02 12:54 ` [PATCH v5 14/18] perf: fix endianness detection in perf.data Stephane Eranian
@ 2012-02-06 18:17   ` Arnaldo Carvalho de Melo
  2012-02-06 18:18     ` Stephane Eranian
  2012-02-06 21:47     ` David Ahern
  2012-02-17  9:42   ` [tip:perf/core] perf tools: " tip-bot for Stephane Eranian
  1 sibling, 2 replies; 43+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-02-06 18:17 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, peterz, mingo, robert.richter, ming.m.lin, andi,
	asharma, ravitillo, vweaver1, khandual, dsahern

On Thu, Feb 02, 2012 at 01:54:44PM +0100, Stephane Eranian wrote:
> The current version of perf detects whether or not
> the perf.data file is written in a different endianness
> using the attr_size field in the header of the file. This
> field represents sizeof(struct perf_event_attr) as known
> to perf record. If the sizes do not match, then perf tries
> the byte-swapped version. If the swapped sizes match, then the tool assumes
> a different endianness.
> 
> The issue with the approach is that it assumes the size of
> perf_event_attr always has to match between perf record and
> perf report. However, the kernel perf_event ABI is extensible.
> New fields can be added to struct perf_event_attr. Consequently,
> it is not possible to use attr_size to detect endianness.
> 
> This patch takes another approach by using the magic number
> written at the beginning of the perf.data file to detect
> endianness. The magic number is an eight-byte signature.
> Its primary purpose is to identify a perf.data
> file. But it could also be used to encode the endianness.
> 
> The patch introduces a new value for this signature. The key
> difference is that the signature is written differently in
> the file depending on the endianness. Thus, by comparing the
> signature from the file with the tool's own signature it is
> possible to detect endianness. The new signature is "PERFILE2".
> 
> Backward compatibility with existing perf.data files is
> ensured.
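> 
> A minimal sketch of the detection logic (using the constants added
> by the patch below):
> 
>   u64 magic = header->magic;
> 
>   if (magic == __perf_magic2)		/* same endianness */
>   	ph->needs_swap = false;
>   else if (magic == __perf_magic2_sw)	/* opposite endianness */
>   	ph->needs_swap = true;
>   else
>   	return -1;			/* not a new-style perf.data */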

Looks ok, but IIRC David Ahern interacted with you on this specific
patch in the past, having his Acked-by and/or Tested-by would be great,
David?

- Arnaldo
 
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  tools/perf/util/header.c |   77 ++++++++++++++++++++++++++++++++++++++--------
>  1 files changed, 64 insertions(+), 13 deletions(-)
> 
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index ecd7f4d..6f4187d 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -63,9 +63,20 @@ char *perf_header__find_event(u64 id)
>  	return NULL;
>  }
>  
> -static const char *__perf_magic = "PERFFILE";
> +/*
> + * magic2 = "PERFILE2"
> + * must be a numerical value to let the endianness
> + * determine the memory layout. That way we are able
> + * to detect endianness when reading the perf.data file
> + * back.
> + *
> + * we check for legacy (PERFFILE) format.
> + */
> +static const char *__perf_magic1 = "PERFFILE";
> +static const u64 __perf_magic2    = 0x32454c4946524550ULL;
> +static const u64 __perf_magic2_sw = 0x50455246494c4532ULL;
>  
> -#define PERF_MAGIC	(*(u64 *)__perf_magic)
> +#define PERF_MAGIC	__perf_magic2
>  
>  struct perf_file_attr {
>  	struct perf_event_attr	attr;
> @@ -1620,24 +1631,59 @@ int perf_header__process_sections(struct perf_header *header, int fd,
>  	return err;
>  }
>  
> +static int check_magic_endian(u64 *magic, struct perf_file_header *header,
> +			      struct perf_header *ph)
> +{
> +	int ret;
> +
> +	/* check for legacy format */
> +	ret = memcmp(magic, __perf_magic1, sizeof(*magic));
> +	if (ret == 0) {
> +		pr_debug("legacy perf.data format\n");
> +		if (!header)
> +			return -1;
> +
> +		if (header->attr_size != sizeof(struct perf_file_attr)) {
> +			u64 attr_size = bswap_64(header->attr_size);
> +
> +			if (attr_size != sizeof(struct perf_file_attr))
> +				return -1;
> +
> +			ph->needs_swap = true;
> +		}
> +		return 0;
> +	}
> +
> +	/* check magic number with same endianness */
> +	if (*magic == __perf_magic2)
> +		return 0;
> +
> +	/* check magic number but opposite endianness */
> +	if (*magic != __perf_magic2_sw)
> +		return -1;
> +
> +	ph->needs_swap = true;
> +
> +	return 0;
> +}
> +
>  int perf_file_header__read(struct perf_file_header *header,
>  			   struct perf_header *ph, int fd)
>  {
> +	int ret;
> +
>  	lseek(fd, 0, SEEK_SET);
>  
> -	if (readn(fd, header, sizeof(*header)) <= 0 ||
> -	    memcmp(&header->magic, __perf_magic, sizeof(header->magic)))
> +	ret = readn(fd, header, sizeof(*header));
> +	if (ret <= 0)
>  		return -1;
>  
> -	if (header->attr_size != sizeof(struct perf_file_attr)) {
> -		u64 attr_size = bswap_64(header->attr_size);
> -
> -		if (attr_size != sizeof(struct perf_file_attr))
> -			return -1;
> +	if (check_magic_endian(&header->magic, header, ph) < 0)
> +		return -1;
>  
> +	if (ph->needs_swap) {
>  		mem_bswap_64(header, offsetof(struct perf_file_header,
> -					    adds_features));
> -		ph->needs_swap = true;
> +			     adds_features));
>  	}
>  
>  	if (header->size != sizeof(*header)) {
> @@ -1873,8 +1919,13 @@ static int perf_file_header__read_pipe(struct perf_pipe_file_header *header,
>  				       struct perf_header *ph, int fd,
>  				       bool repipe)
>  {
> -	if (readn(fd, header, sizeof(*header)) <= 0 ||
> -	    memcmp(&header->magic, __perf_magic, sizeof(header->magic)))
> +	int ret;
> +
> +	ret = readn(fd, header, sizeof(*header));
> +	if (ret <= 0)
> +		return -1;
> +
> +	if (check_magic_endian(&header->magic, NULL, ph) < 0)
>  		return -1;
>  
>  	if (repipe && do_write(STDOUT_FILENO, header, sizeof(*header)) < 0)
> -- 
> 1.7.4.1


* Re: [PATCH v5 14/18] perf: fix endianness detection in perf.data
  2012-02-06 18:17   ` Arnaldo Carvalho de Melo
@ 2012-02-06 18:18     ` Stephane Eranian
  2012-02-06 21:47     ` David Ahern
  1 sibling, 0 replies; 43+ messages in thread
From: Stephane Eranian @ 2012-02-06 18:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, peterz, mingo, robert.richter, ming.m.lin, andi,
	asharma, ravitillo, vweaver1, khandual, dsahern

On Mon, Feb 6, 2012 at 7:17 PM, Arnaldo Carvalho de Melo
<acme@redhat.com> wrote:
> On Thu, Feb 02, 2012 at 01:54:44PM +0100, Stephane Eranian wrote:
>> The current version of perf detects whether or not
>> the perf.data file is written in a different endianness
>> using the attr_size field in the header of the file. This
>> field represents sizeof(struct perf_event_attr) as known
>> to perf record. If the sizes do not match, then perf tries
>> the byte-swapped version. If the swapped sizes match, then the tool assumes
>> a different endianness.
>>
>> The issue with the approach is that it assumes the size of
>> perf_event_attr always has to match between perf record and
>> perf report. However, the kernel perf_event ABI is extensible.
>> New fields can be added to struct perf_event_attr. Consequently,
>> it is not possible to use attr_size to detect endianness.
>>
>> This patch takes another approach by using the magic number
>> written at the beginning of the perf.data file to detect
>> endianness. The magic number is an eight-byte signature.
>> Its primary purpose is to identify a perf.data
>> file. But it could also be used to encode the endianness.
>>
>> The patch introduces a new value for this signature. The key
>> difference is that the signature is written differently in
>> the file depending on the endianness. Thus, by comparing the
>> signature from the file with the tool's own signature it is
>> possible to detect endianness. The new signature is "PERFILE2".
>>
>> Backward compatibility with existing perf.data files is
>> ensured.
>
> Looks ok, but IIRC David Ahern interacted with you on this specific
> patch in the past, having his Acked-by and/or Tested-by would be great,
> David?
>
I agree. I am still waiting for the results of his tests on big-endian systems;
I don't have any such systems myself, unfortunately.

> - Arnaldo
>
>> Signed-off-by: Stephane Eranian <eranian@google.com>
>> ---
>>  tools/perf/util/header.c |   77 ++++++++++++++++++++++++++++++++++++++--------
>>  1 files changed, 64 insertions(+), 13 deletions(-)
>>
>> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
>> index ecd7f4d..6f4187d 100644
>> --- a/tools/perf/util/header.c
>> +++ b/tools/perf/util/header.c
>> @@ -63,9 +63,20 @@ char *perf_header__find_event(u64 id)
>>       return NULL;
>>  }
>>
>> -static const char *__perf_magic = "PERFFILE";
>> +/*
>> + * magic2 = "PERFILE2"
>> + * must be a numerical value to let the endianness
>> + * determine the memory layout. That way we are able
>> + * to detect endianness when reading the perf.data file
>> + * back.
>> + *
>> + * we check for legacy (PERFFILE) format.
>> + */
>> +static const char *__perf_magic1 = "PERFFILE";
>> +static const u64 __perf_magic2    = 0x32454c4946524550ULL;
>> +static const u64 __perf_magic2_sw = 0x50455246494c4532ULL;
>>
>> -#define PERF_MAGIC   (*(u64 *)__perf_magic)
>> +#define PERF_MAGIC   __perf_magic2
>>
>>  struct perf_file_attr {
>>       struct perf_event_attr  attr;
>> @@ -1620,24 +1631,59 @@ int perf_header__process_sections(struct perf_header *header, int fd,
>>       return err;
>>  }
>>
>> +static int check_magic_endian(u64 *magic, struct perf_file_header *header,
>> +                           struct perf_header *ph)
>> +{
>> +     int ret;
>> +
>> +     /* check for legacy format */
>> +     ret = memcmp(magic, __perf_magic1, sizeof(*magic));
>> +     if (ret == 0) {
>> +             pr_debug("legacy perf.data format\n");
>> +             if (!header)
>> +                     return -1;
>> +
>> +             if (header->attr_size != sizeof(struct perf_file_attr)) {
>> +                     u64 attr_size = bswap_64(header->attr_size);
>> +
>> +                     if (attr_size != sizeof(struct perf_file_attr))
>> +                             return -1;
>> +
>> +                     ph->needs_swap = true;
>> +             }
>> +             return 0;
>> +     }
>> +
>> +     /* check magic number with same endianness */
>> +     if (*magic == __perf_magic2)
>> +             return 0;
>> +
>> +     /* check magic number but opposite endianness */
>> +     if (*magic != __perf_magic2_sw)
>> +             return -1;
>> +
>> +     ph->needs_swap = true;
>> +
>> +     return 0;
>> +}
>> +
>>  int perf_file_header__read(struct perf_file_header *header,
>>                          struct perf_header *ph, int fd)
>>  {
>> +     int ret;
>> +
>>       lseek(fd, 0, SEEK_SET);
>>
>> -     if (readn(fd, header, sizeof(*header)) <= 0 ||
>> -         memcmp(&header->magic, __perf_magic, sizeof(header->magic)))
>> +     ret = readn(fd, header, sizeof(*header));
>> +     if (ret <= 0)
>>               return -1;
>>
>> -     if (header->attr_size != sizeof(struct perf_file_attr)) {
>> -             u64 attr_size = bswap_64(header->attr_size);
>> -
>> -             if (attr_size != sizeof(struct perf_file_attr))
>> -                     return -1;
>> +     if (check_magic_endian(&header->magic, header, ph) < 0)
>> +             return -1;
>>
>> +     if (ph->needs_swap) {
>>               mem_bswap_64(header, offsetof(struct perf_file_header,
>> -                                         adds_features));
>> -             ph->needs_swap = true;
>> +                          adds_features));
>>       }
>>
>>       if (header->size != sizeof(*header)) {
>> @@ -1873,8 +1919,13 @@ static int perf_file_header__read_pipe(struct perf_pipe_file_header *header,
>>                                      struct perf_header *ph, int fd,
>>                                      bool repipe)
>>  {
>> -     if (readn(fd, header, sizeof(*header)) <= 0 ||
>> -         memcmp(&header->magic, __perf_magic, sizeof(header->magic)))
>> +     int ret;
>> +
>> +     ret = readn(fd, header, sizeof(*header));
>> +     if (ret <= 0)
>> +             return -1;
>> +
>> +      if (check_magic_endian(&header->magic, NULL, ph) < 0)
>>               return -1;
>>
>>       if (repipe && do_write(STDOUT_FILENO, header, sizeof(*header)) < 0)
>> --
>> 1.7.4.1


* Re: [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev
  2012-02-02 12:54 ` [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev Stephane Eranian
@ 2012-02-06 18:19   ` Arnaldo Carvalho de Melo
  2012-02-06 18:22   ` Arnaldo Carvalho de Melo
  2012-02-06 22:19   ` David Ahern
  2 siblings, 0 replies; 43+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-02-06 18:19 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, peterz, mingo, robert.richter, ming.m.lin, andi,
	asharma, ravitillo, vweaver1, khandual, dsahern

On Thu, Feb 02, 2012 at 01:54:46PM +0100, Stephane Eranian wrote:
> This patch allows perf to process perf.data files generated
> using an ABI that has a different perf_event_attr struct size, i.e.,
> a different ABI version.
> 
> The perf_event_attr can be extended, yet perf needs to cope with
> older perf.data files. Similarly, perf must be able to cope with
> a perf.data file which is using a newer version of the ABI than
> what it knows about.
> 
> This patch adds read_attr(), a routine that reads a perf_event_attr
> struct from a file incrementally based on its advertised size. If
> the on-file struct is smaller than what perf knows, then the extra
> fields are zeroed. If the on-file struct is bigger, then perf only
> uses what it knows about, the rest is skipped.
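> 
> A minimal sketch of the idea (sz is the advertised on-file size,
> already byte-swapped if needed, our_sz is sizeof(struct perf_event_attr)):
> 
>   size_t to_read = sz < our_sz ? sz : our_sz;
> 
>   memset(attr, 0, our_sz);	/* older ABI: extra fields stay zero */
>   if (readn(fd, attr, to_read) <= 0)
>   	return -1;
>   if (sz > our_sz)		/* newer ABI: skip the unknown tail */
>   	lseek(fd, sz - our_sz, SEEK_CUR);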

Ditto on this one to get Acked/Tested-by David,

- Arnaldo
 
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  tools/perf/util/header.c |   49 ++++++++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 47 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index 6f4187d..8d6c18d 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -1959,6 +1959,51 @@ static int perf_header__read_pipe(struct perf_session *session, int fd)
>  	return 0;
>  }
>  
> +static int read_attr(int fd, struct perf_header *ph,
> +		     struct perf_file_attr *f_attr)
> +{
> +	struct perf_event_attr *attr = &f_attr->attr;
> +	size_t sz, left;
> +	size_t our_sz = sizeof(f_attr->attr);
> +	int ret;
> +
> +	memset(f_attr, 0, sizeof(*f_attr));
> +
> +	/* read minimal guaranteed structure */
> +	ret = readn(fd, attr, PERF_ATTR_SIZE_VER0);
> +	if (ret <= 0)
> +		return -1;
> +
> +	/* on file perf_event_attr size */
> +	sz = attr->size;
> +	if (ph->needs_swap)
> +		sz = bswap_32(sz);
> +
> +	if (sz == 0) {
> +		/* assume ABI0 */
> +		sz = PERF_ATTR_SIZE_VER0;
> +	} else if (sz > our_sz) {
> +		/* bigger than what we know about: skip the unknown tail */
> +		lseek(fd, sz - our_sz, SEEK_CUR);
> +		sz = our_sz;
> +	}
> +	/* what we have not yet read and that we know about */
> +	left = sz - PERF_ATTR_SIZE_VER0;
> +	if (left) {
> +		void *ptr = attr;
> +		ptr += PERF_ATTR_SIZE_VER0;
> +
> +		ret = readn(fd, ptr, left);
> +		if (ret <= 0)
> +			return -1;
> +	}
> +	/* read the ids */
> +	ret = readn(fd, &f_attr->ids, sizeof(struct perf_file_section));
> +	return ret <= 0 ? -1 : 0;
> +}
> +
>  int perf_session__read_header(struct perf_session *session, int fd)
>  {
>  	struct perf_header *header = &session->header;
> @@ -1979,14 +2024,14 @@ int perf_session__read_header(struct perf_session *session, int fd)
>  		return -EINVAL;
>  	}
>  
> -	nr_attrs = f_header.attrs.size / sizeof(f_attr);
> +	nr_attrs = f_header.attrs.size / f_header.attr_size;
>  	lseek(fd, f_header.attrs.offset, SEEK_SET);
>  
>  	for (i = 0; i < nr_attrs; i++) {
>  		struct perf_evsel *evsel;
>  		off_t tmp;
>  
> -		if (readn(fd, &f_attr, sizeof(f_attr)) <= 0)
> +		if (read_attr(fd, header, &f_attr) < 0)
>  			goto out_errno;
>  
>  		if (header->needs_swap)
> -- 
> 1.7.4.1


* Re: [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev
  2012-02-02 12:54 ` [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev Stephane Eranian
  2012-02-06 18:19   ` Arnaldo Carvalho de Melo
@ 2012-02-06 18:22   ` Arnaldo Carvalho de Melo
  2012-02-07  7:03     ` Anshuman Khandual
  2012-02-06 22:19   ` David Ahern
  2 siblings, 1 reply; 43+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-02-06 18:22 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, peterz, mingo, robert.richter, ming.m.lin, andi,
	asharma, ravitillo, vweaver1, khandual, dsahern

On Thu, Feb 02, 2012 at 01:54:46PM +0100, Stephane Eranian wrote:
> This patch allows perf to process perf.data files generated
> using an ABI that has a different perf_event_attr struct size, i.e.,
> a different ABI version.
> 
> The perf_event_attr can be extended, yet perf needs to cope with
> older perf.data files. Similarly, perf must be able to cope with
> a perf.data file which is using a newer version of the ABI than
> what it knows about.
> 
> This patch adds read_attr(), a routine that reads a perf_event_attr
> struct from a file incrementally based on its advertised size. If
> the on-file struct is smaller than what perf knows, then the extra
> fields are zeroed. If the on-file struct is bigger, then perf only
> uses what it knows about, the rest is skipped.

Anshuman, can I have your Acked-by or Reviewed-by since you spotted
problems in this and your suggestions were taken into account? Is this
OK now?

- Arnaldo
 
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  tools/perf/util/header.c |   49 ++++++++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 47 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index 6f4187d..8d6c18d 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -1959,6 +1959,51 @@ static int perf_header__read_pipe(struct perf_session *session, int fd)
>  	return 0;
>  }
>  
> +static int read_attr(int fd, struct perf_header *ph,
> +		     struct perf_file_attr *f_attr)
> +{
> +	struct perf_event_attr *attr = &f_attr->attr;
> +	size_t sz, left;
> +	size_t our_sz = sizeof(f_attr->attr);
> +	int ret;
> +
> +	memset(f_attr, 0, sizeof(*f_attr));
> +
> +	/* read minimal guaranteed structure */
> +	ret = readn(fd, attr, PERF_ATTR_SIZE_VER0);
> +	if (ret <= 0)
> +		return -1;
> +
> +	/* on file perf_event_attr size */
> +	sz = attr->size;
> +	if (ph->needs_swap)
> +		sz = bswap_32(sz);
> +
> +	if (sz == 0) {
> +		/* assume ABI0 */
> +		sz = PERF_ATTR_SIZE_VER0;
> +	} else if (sz > our_sz) {
> +		/* bigger than what we know about: skip the unknown tail */
> +		lseek(fd, sz - our_sz, SEEK_CUR);
> +		sz = our_sz;
> +	}
> +	/* what we have not yet read and that we know about */
> +	left = sz - PERF_ATTR_SIZE_VER0;
> +	if (left) {
> +		void *ptr = attr;
> +		ptr += PERF_ATTR_SIZE_VER0;
> +
> +		ret = readn(fd, ptr, left);
> +		if (ret <= 0)
> +			return -1;
> +	}
> +	/* read the ids */
> +	ret = readn(fd, &f_attr->ids, sizeof(struct perf_file_section));
> +	return ret <= 0 ? -1 : 0;
> +}
> +
>  int perf_session__read_header(struct perf_session *session, int fd)
>  {
>  	struct perf_header *header = &session->header;
> @@ -1979,14 +2024,14 @@ int perf_session__read_header(struct perf_session *session, int fd)
>  		return -EINVAL;
>  	}
>  
> -	nr_attrs = f_header.attrs.size / sizeof(f_attr);
> +	nr_attrs = f_header.attrs.size / f_header.attr_size;
>  	lseek(fd, f_header.attrs.offset, SEEK_SET);
>  
>  	for (i = 0; i < nr_attrs; i++) {
>  		struct perf_evsel *evsel;
>  		off_t tmp;
>  
> -		if (readn(fd, &f_attr, sizeof(f_attr)) <= 0)
> +		if (read_attr(fd, header, &f_attr) < 0)
>  			goto out_errno;
>  
>  		if (header->needs_swap)
> -- 
> 1.7.4.1


* Re: [PATCH v5 09/18] perf: disable PERF_SAMPLE_BRANCH_* when not supported
  2012-02-02 12:54 ` [PATCH v5 09/18] perf: disable PERF_SAMPLE_BRANCH_* when not supported Stephane Eranian
@ 2012-02-06 19:23   ` Peter Zijlstra
  2012-02-06 19:59     ` Stephane Eranian
  0 siblings, 1 reply; 43+ messages in thread
From: Peter Zijlstra @ 2012-02-06 19:23 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, mingo, acme, robert.richter, ming.m.lin, andi,
	asharma, ravitillo, vweaver1, khandual, dsahern

On Thu, 2012-02-02 at 13:54 +0100, Stephane Eranian wrote:
> @@ -539,6 +539,10 @@ static int armpmu_event_init(struct perf_event *event)
>         int err = 0;
>         atomic_t *active_events = &armpmu->active_events;
>  
> +       /* does not support taken branch sampling */
> +       if (has_branch_smpl(event))
> +               return -EOPNOTSUPP;
> +
>         if (armpmu->map_event(event) == -ENOENT) 

I'll make that has_branch_stack(), ok? :-)



* Re: [PATCH v5 09/18] perf: disable PERF_SAMPLE_BRANCH_* when not supported
  2012-02-06 19:23   ` Peter Zijlstra
@ 2012-02-06 19:59     ` Stephane Eranian
  0 siblings, 0 replies; 43+ messages in thread
From: Stephane Eranian @ 2012-02-06 19:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, mingo, acme, robert.richter, ming.m.lin, andi,
	asharma, ravitillo, vweaver1, khandual, dsahern

On Mon, Feb 6, 2012 at 8:23 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, 2012-02-02 at 13:54 +0100, Stephane Eranian wrote:
>> @@ -539,6 +539,10 @@ static int armpmu_event_init(struct perf_event *event)
>>         int err = 0;
>>         atomic_t *active_events = &armpmu->active_events;
>>
>> +       /* does not support taken branch sampling */
>> +       if (has_branch_smpl(event))
>> +               return -EOPNOTSUPP;
>> +
>>         if (armpmu->map_event(event) == -ENOENT)
>
> I'll make that has_branch_stack(), ok? :-)
>
Supposedly, I fixed that in v5. Is it not there?
If not, then something went wrong but go ahead, that's the right fix anyway.


* Re: [PATCH v5 14/18] perf: fix endianness detection in perf.data
  2012-02-06 18:17   ` Arnaldo Carvalho de Melo
  2012-02-06 18:18     ` Stephane Eranian
@ 2012-02-06 21:47     ` David Ahern
  2012-02-06 22:06       ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 43+ messages in thread
From: David Ahern @ 2012-02-06 21:47 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Stephane Eranian, linux-kernel, peterz, mingo, robert.richter,
	ming.m.lin, andi, asharma, ravitillo, vweaver1, khandual



On 02/06/2012 11:17 AM, Arnaldo Carvalho de Melo wrote:
> On Thu, Feb 02, 2012 at 01:54:44PM +0100, Stephane Eranian wrote:
>> The current version of perf detects whether or not
>> the perf.data file is written in a different endianness
>> using the attr_size field in the header of the file. This
>> field represents sizeof(struct perf_event_attr) as known
>> to perf record. If the sizes do not match, then perf tries
>> the byte-swapped version. If the swapped sizes match, then the tool assumes
>> a different endianness.
>>
>> The issue with the approach is that it assumes the size of
>> perf_event_attr always has to match between perf record and
>> perf report. However, the kernel perf_event ABI is extensible.
>> New fields can be added to struct perf_event_attr. Consequently,
>> it is not possible to use attr_size to detect endianness.
>>
>> This patch takes another approach by using the magic number
>> written at the beginning of the perf.data file to detect
>> endianness. The magic number is an eight-byte signature.
>> Its primary purpose is to identify a perf.data
>> file. But it could also be used to encode the endianness.
>>
>> The patch introduces a new value for this signature. The key
>> difference is that the signature is written differently in
>> the file depending on the endianness. Thus, by comparing the
>> signature from the file with the tool's own signature it is
>> possible to detect endianness. The new signature is "PERFILE2".
>>
>> Backward compatibility with existing perf.data files is
>> ensured.
> 
> Looks ok, but IIRC David Ahern interacted with you on this specific
> patch in the past, having his Acked-by and/or Tested-by would be great,
> David?

I don't recall anything changing since version 4. I scanned over the
change and it all looks familiar; testing worked fine as well -- I was
able to analyze a PPC file on x86 and vice versa.

Acked-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>




* Re: [PATCH v5 14/18] perf: fix endianness detection in perf.data
  2012-02-06 21:47     ` David Ahern
@ 2012-02-06 22:06       ` Arnaldo Carvalho de Melo
  2012-02-06 22:29         ` David Ahern
  0 siblings, 1 reply; 43+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-02-06 22:06 UTC (permalink / raw)
  To: David Ahern
  Cc: Stephane Eranian, linux-kernel, peterz, mingo, robert.richter,
	ming.m.lin, andi, asharma, ravitillo, vweaver1, khandual

On Mon, Feb 06, 2012 at 02:47:18PM -0700, David Ahern wrote:
> On 02/06/2012 11:17 AM, Arnaldo Carvalho de Melo wrote:
> > Looks ok, but IIRC David Ahern interacted with you on this specific
> > patch in the past, having his Acked-by and/or Tested-by would be great,
> > David?
> 
> I don't recall anything changing since version 4. I scanned over the
> change and it all looks familiar; testing worked fine as well -- I was
> able to analyze a PPC file on x86 and vice versa.
> 
> Acked-by: David Ahern <dsahern@gmail.com>
> Tested-by: David Ahern <dsahern@gmail.com>
> 

Stephane,

	What are your plans? A v6 patch series with this new round of
comments and Acked-by et al stamps?

- Arnaldo


* Re: [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev
  2012-02-02 12:54 ` [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev Stephane Eranian
  2012-02-06 18:19   ` Arnaldo Carvalho de Melo
  2012-02-06 18:22   ` Arnaldo Carvalho de Melo
@ 2012-02-06 22:19   ` David Ahern
  2012-02-07 15:50     ` Stephane Eranian
  2 siblings, 1 reply; 43+ messages in thread
From: David Ahern @ 2012-02-06 22:19 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, peterz, mingo, acme, robert.richter, ming.m.lin,
	andi, asharma, ravitillo, vweaver1, khandual



On 02/02/2012 05:54 AM, Stephane Eranian wrote:
> This patch allows perf to process perf.data files generated
> using an ABI that has a different perf_event_attr struct size, i.e.,
> a different ABI version.
> 
> The perf_event_attr can be extended, yet perf needs to cope with
> older perf.data files. Similarly, perf must be able to cope with
> a perf.data file which is using a newer version of the ABI than
> what it knows about.
> 
> This patch adds read_attr(), a routine that reads a perf_event_attr
> struct from a file incrementally based on its advertised size. If
> the on-file struct is smaller than what perf knows, then the extra
> fields are zeroed. If the on-file struct is bigger, then perf only
> uses what it knows about, the rest is skipped.
> 
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  tools/perf/util/header.c |   49 ++++++++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 47 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index 6f4187d..8d6c18d 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -1959,6 +1959,51 @@ static int perf_header__read_pipe(struct perf_session *session, int fd)
>  	return 0;
>  }
>  
> +static int read_attr(int fd, struct perf_header *ph,
> +		     struct perf_file_attr *f_attr)
> +{
> +	struct perf_event_attr *attr = &f_attr->attr;
> +	size_t sz, left;
> +	size_t our_sz = sizeof(f_attr->attr);
> +	int ret;
> +
> +	memset(f_attr, 0, sizeof(*f_attr));
> +
> +	/* read minimal guaranteed structure */
> +	ret = readn(fd, attr, PERF_ATTR_SIZE_VER0);
> +	if (ret <= 0)
> +		return -1;

As I recall, the first bump in that structure happened in 2.6.32. Why add
backward compatibility for it now? I.e., why not just expect VER1?

> +
> +	/* on file perf_event_attr size */
> +	sz = attr->size;
> +	if (ph->needs_swap)
> +		sz = bswap_32(sz);
> +
> +	if (sz == 0) {
> +		/* assume ABI0 */
> +		sz = PERF_ATTR_SIZE_VER0;

Shouldn't this be a failure? I.e., a problem with the file (or the
swapping), since size can't be 0?

And then, for the following, why not restrict sz to known, expected sizes,
using the PERF_ATTR_SIZE_VER defines introduced in patch 15?
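
Something like this, say (just a sketch, assuming patch 15 also defines
a PERF_ATTR_SIZE_VER1 next to VER0):

	if (sz != PERF_ATTR_SIZE_VER0 &&
	    sz != PERF_ATTR_SIZE_VER1 &&
	    sz != sizeof(struct perf_event_attr))
		return -1;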

> +	} else if (sz > our_sz) {
> +		/* bigger than what we know about: skip the unknown tail */
> +		lseek(fd, sz - our_sz, SEEK_CUR);
> +		sz = our_sz;
> +	}
> +	/* what we have not yet read and that we know about */
> +	left = sz - PERF_ATTR_SIZE_VER0;
> +	if (left) {
> +		void *ptr = attr;
> +		ptr += PERF_ATTR_SIZE_VER0;
> +
> +		ret = readn(fd, ptr, left);
> +		if (ret <= 0)
> +			return -1;
> +	}
> +	/* read the ids */
> +	ret = readn(fd, &f_attr->ids, sizeof(struct perf_file_section));

I'm confused by the above: it is not done in the old code, so why read the
ids here? I scanned the other patches, but I don't see other code movement
in this file.

David

> +	return ret <= 0 ? -1 : 0;
> +}
> +
>  int perf_session__read_header(struct perf_session *session, int fd)
>  {
>  	struct perf_header *header = &session->header;
> @@ -1979,14 +2024,14 @@ int perf_session__read_header(struct perf_session *session, int fd)
>  		return -EINVAL;
>  	}
>  
> -	nr_attrs = f_header.attrs.size / sizeof(f_attr);
> +	nr_attrs = f_header.attrs.size / f_header.attr_size;
>  	lseek(fd, f_header.attrs.offset, SEEK_SET);
>  
>  	for (i = 0; i < nr_attrs; i++) {
>  		struct perf_evsel *evsel;
>  		off_t tmp;
>  
> -		if (readn(fd, &f_attr, sizeof(f_attr)) <= 0)
> +		if (read_attr(fd, header, &f_attr) < 0)
>  			goto out_errno;
>  
>  		if (header->needs_swap)


* Re: [PATCH v5 14/18] perf: fix endianness detection in perf.data
  2012-02-06 22:06       ` Arnaldo Carvalho de Melo
@ 2012-02-06 22:29         ` David Ahern
  2012-02-07 14:13           ` Stephane Eranian
  0 siblings, 1 reply; 43+ messages in thread
From: David Ahern @ 2012-02-06 22:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Stephane Eranian, linux-kernel, peterz, mingo, robert.richter,
	ming.m.lin, andi, asharma, ravitillo, vweaver1, khandual



On 02/06/2012 03:06 PM, Arnaldo Carvalho de Melo wrote:
> On Mon, Feb 06, 2012 at 02:47:18PM -0700, David Ahern wrote:
>> On 02/06/2012 11:17 AM, Arnaldo Carvalho de Melo wrote:
>>> Looks ok, but IIRC David Ahern interacted with you on this specific
>>> patch in the past, having his Acked-by and/or Tested-by would be great,
>>> David?
>>
>> I don't recall anything changing since version 4. I scanned over the
>> change and it all looks familiar; testing worked fine as well -- I was
>> able to analyze a PPC file on x86 and vice versa.
>>
>> Acked-by: David Ahern <dsahern@gmail.com>
>> Tested-by: David Ahern <dsahern@gmail.com>
>>
> 
> Stephane,
> 
> 	What are your plans? A v6 patch series with this new round of
> comments and Acked-by et al stamps?
> 
> - Arnaldo

I thought this one was standalone, so it could be picked up now.

David


* Re: [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev
  2012-02-06 18:22   ` Arnaldo Carvalho de Melo
@ 2012-02-07  7:03     ` Anshuman Khandual
  2012-02-07 14:52       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 43+ messages in thread
From: Anshuman Khandual @ 2012-02-07  7:03 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Stephane Eranian, linux-kernel, peterz, mingo, robert.richter,
	ming.m.lin, andi, asharma, ravitillo, vweaver1, dsahern

On Monday 06 February 2012 11:52 PM, Arnaldo Carvalho de Melo wrote:
> On Thu, Feb 02, 2012 at 01:54:46PM +0100, Stephane Eranian wrote:
>> This patch allows perf to process perf.data files generated
>> using an ABI that has a different perf_event_attr struct size, i.e.,
>> a different ABI version.
>>
>> The perf_event_attr can be extended, yet perf needs to cope with
>> older perf.data files. Similarly, perf must be able to cope with
>> a perf.data file which is using a newer version of the ABI than
>> what it knows about.
>>
>> This patch adds read_attr(), a routine that reads a perf_event_attr
>> struct from a file incrementally based on its advertised size. If
>> the on-file struct is smaller than what perf knows, then the extra
>> fields are zeroed. If the on-file struct is bigger, then perf only
>> uses what it knows about, the rest is skipped.
> 
> Anshuman, can I have your Acked-by or Reviewed-by since you spotted
> problems in this and your suggestions were taken into account? Is this
> OK now?
> 

(1) The PERF_SAMPLE_BRANCH_HV and related privilege level problems have been fixed.
    Verified various combinations of <branch_type>,[u,k,hv] <event>:[u,k,h].
    Works well in all privilege level permutations.

(2) As Peter has mentioned, the 'has_branch_smpl()' bug in the ARM code will be
    taken care of.

(3) The 'ref_size' problem in try_all_pipe_abis() has been fixed. All the patches
    compile successfully on their own.

(4) I went through the entire patchset; it looks good to me.

Acked-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Tested-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
-- 
Anshuman Khandual
Linux Technology Centre
IBM Systems and Technology Group



* Re: [PATCH v5 11/18] perf: add code to support PERF_SAMPLE_BRANCH_STACK
  2012-02-06 18:06   ` Arnaldo Carvalho de Melo
@ 2012-02-07 14:11     ` Stephane Eranian
  2012-02-07 15:21       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 43+ messages in thread
From: Stephane Eranian @ 2012-02-07 14:11 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, peterz, mingo, robert.richter, ming.m.lin, andi,
	asharma, ravitillo, vweaver1, khandual, dsahern

On Mon, Feb 6, 2012 at 7:06 PM, Arnaldo Carvalho de Melo
<acme@redhat.com> wrote:
> Em Thu, Feb 02, 2012 at 01:54:41PM +0100, Stephane Eranian escreveu:
>> From: Roberto Agostino Vitillo <ravitillo@lbl.gov>
>>
>> This patch adds:
>> - ability to parse samples with PERF_SAMPLE_BRANCH_STACK
>> - sort on branches
>> - build histograms on branches
>
> Some comments below, mostly minor stuff, looks great work, thanks!
>
> - Arnaldo
>
>> Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
>> Signed-off-by: Stephane Eranian <eranian@google.com>
>> ---
>>  tools/perf/perf.h          |   17 ++
>>  tools/perf/util/annotate.c |    2 +-
>>  tools/perf/util/event.h    |    1 +
>>  tools/perf/util/evsel.c    |   10 ++
>>  tools/perf/util/hist.c     |   93 +++++++++---
>>  tools/perf/util/hist.h     |    7 +
>>  tools/perf/util/session.c  |   72 +++++++++
>>  tools/perf/util/session.h  |    4 +
>>  tools/perf/util/sort.c     |  362 +++++++++++++++++++++++++++++++++-----------
>>  tools/perf/util/sort.h     |    5 +
>>  tools/perf/util/symbol.h   |   13 ++
>>  11 files changed, 475 insertions(+), 111 deletions(-)
>>
>> diff --git a/tools/perf/perf.h b/tools/perf/perf.h
>> index 92af168..8b4d25d 100644
>> --- a/tools/perf/perf.h
>> +++ b/tools/perf/perf.h
>> @@ -180,6 +180,23 @@ struct ip_callchain {
>>       u64 ips[0];
>>  };
>>
>> +struct branch_flags {
>> +     u64 mispred:1;
>> +     u64 predicted:1;
>> +     u64 reserved:62;
>> +};
>> +
>> +struct branch_entry {
>> +     u64                             from;
>> +     u64                             to;
>> +     struct branch_flags flags;
>> +};
>> +
>> +struct branch_stack {
>> +     u64                             nr;
>> +     struct branch_entry     entries[0];
>> +};
>> +
>>  extern bool perf_host, perf_guest;
>>  extern const char perf_version_string[];
>>
>> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
>> index 011ed26..8248d80 100644
>> --- a/tools/perf/util/annotate.c
>> +++ b/tools/perf/util/annotate.c
>> @@ -64,7 +64,7 @@ int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
>>
>>       pr_debug3("%s: addr=%#" PRIx64 "\n", __func__, map->unmap_ip(map, addr));
>>
>> -     if (addr >= sym->end)
>> +     if (addr >= sym->end || addr < sym->start)
>
> This is not related to this, would be better to come in a separate patch
> with a proper explanation.
>
You mean in this patchset or separately?

>>               return 0;
>>
>>       offset = addr - sym->start;
>> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
>> index cbdeaad..1b19728 100644
>> --- a/tools/perf/util/event.h
>> +++ b/tools/perf/util/event.h
>> @@ -81,6 +81,7 @@ struct perf_sample {
>>       u32 raw_size;
>>       void *raw_data;
>>       struct ip_callchain *callchain;
>> +     struct branch_stack *branch_stack;
>>  };
>>
>>  #define BUILD_ID_SIZE 20
>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>> index dcfefab..6b15cda 100644
>> --- a/tools/perf/util/evsel.c
>> +++ b/tools/perf/util/evsel.c
>> @@ -575,6 +575,16 @@ int perf_event__parse_sample(const union perf_event *event, u64 type,
>>               data->raw_data = (void *) pdata;
>>       }
>>
>> +     if (type & PERF_SAMPLE_BRANCH_STACK) {
>> +             u64 sz;
>> +
>> +             data->branch_stack = (struct branch_stack *)array;
>> +             array++; /* nr */
>> +
>> +             sz = data->branch_stack->nr * sizeof(struct branch_entry);
>> +             sz /= sizeof(uint64_t);
>
> Consistency here: use sizeof(u64), or better yet: sizeof(sz);
>
>> +             array += sz;
>> +     }
>>       return 0;
>>  }
>>
>> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
>> index 6f505d1..66f9936 100644
>> --- a/tools/perf/util/hist.c
>> +++ b/tools/perf/util/hist.c
>> @@ -54,9 +54,11 @@ static void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
>>  {
>>       u16 len;
>>
>> -     if (h->ms.sym)
>> -             hists__new_col_len(hists, HISTC_SYMBOL, h->ms.sym->namelen);
>> -     else {
>> +     if (h->ms.sym) {
>> +             int n = (int)h->ms.sym->namelen + 4;
>> +             int symlen = max(n, BITS_PER_LONG / 4 + 6);
>
> What is the rationale here? Adding a comment will help
>
Will do.

>> +             hists__new_col_len(hists, HISTC_SYMBOL, symlen);
>> +     } else {
>>               const unsigned int unresolved_col_width = BITS_PER_LONG / 4;
>>
>>               if (hists__col_len(hists, HISTC_DSO) < unresolved_col_width &&
>> @@ -195,26 +197,14 @@ static u8 symbol__parent_filter(const struct symbol *parent)
>>       return 0;
>>  }
>>
>> -struct hist_entry *__hists__add_entry(struct hists *hists,
>> +static struct hist_entry *add_hist_entry(struct hists *hists,
>> +                                   struct hist_entry *entry,
>>                                     struct addr_location *al,
>> -                                   struct symbol *sym_parent, u64 period)
>> +                                   u64 period)
>>  {
>>       struct rb_node **p;
>>       struct rb_node *parent = NULL;
>>       struct hist_entry *he;
>> -     struct hist_entry entry = {
>> -             .thread = al->thread,
>> -             .ms = {
>> -                     .map    = al->map,
>> -                     .sym    = al->sym,
>> -             },
>> -             .cpu    = al->cpu,
>> -             .ip     = al->addr,
>> -             .level  = al->level,
>> -             .period = period,
>> -             .parent = sym_parent,
>> -             .filtered = symbol__parent_filter(sym_parent),
>> -     };
>>       int cmp;
>>
>>       pthread_mutex_lock(&hists->lock);
>> @@ -225,7 +215,7 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
>>               parent = *p;
>>               he = rb_entry(parent, struct hist_entry, rb_node_in);
>>
>> -             cmp = hist_entry__cmp(&entry, he);
>> +             cmp = hist_entry__cmp(entry, he);
>>
>>               if (!cmp) {
>>                       he->period += period;
>> @@ -239,7 +229,7 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
>>                       p = &(*p)->rb_right;
>>       }
>>
>> -     he = hist_entry__new(&entry);
>> +     he = hist_entry__new(entry);
>>       if (!he)
>>               goto out_unlock;
>>
>> @@ -252,6 +242,69 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
>>       return he;
>>  }
>>
>> +struct hist_entry *__hists__add_branch_entry(struct hists *self,
>> +                                          struct addr_location *al,
>> +                                          struct symbol *sym_parent,
>> +                                          struct branch_info *bi,
>> +                                          u64 period){
>> +     struct hist_entry entry = {
>> +             .thread = al->thread,
>> +             .ms = {
>> +                     .map    = bi->to.map,
>> +                     .sym    = bi->to.sym,
>> +             },
>> +             .cpu    = al->cpu,
>> +             .ip     = bi->to.addr,
>> +             .level  = al->level,
>> +             .period = period,
>> +             .parent = sym_parent,
>> +             .filtered = symbol__parent_filter(sym_parent),
>> +             .branch_info = bi,
>> +     };
>> +     struct hist_entry *he;
>> +
>> +     he = add_hist_entry(self, &entry, al, period);
>> +     if (!he)
>> +             return NULL;
>> +
>> +     /*
>> +      * in branch mode, we do not display al->sym, al->addr
>
> Really minor nit, but start with:  "In branch mode"
>
>> +      * but instead what is in branch_info. The addresses and
>> +      * symbols there may need wider columns, so make sure they
>> +      * are taken into account.
>> +      *
>> +      * hists__calc_col_len() tracks the max column width, so
>> +      * we need to call it for both the from and to addresses
>> +      */
>> +     entry.ip     = bi->from.addr;
>> +     entry.ms.map = bi->from.map;
>> +     entry.ms.sym = bi->from.sym;
>> +     hists__calc_col_len(self, &entry);
>> +
>> +     return he;
>> +}
>> +
>> +struct hist_entry *__hists__add_entry(struct hists *self,
>> +                                   struct addr_location *al,
>> +                                   struct symbol *sym_parent, u64 period)
>> +{
>> +     struct hist_entry entry = {
>> +             .thread = al->thread,
>> +             .ms = {
>> +                     .map    = al->map,
>> +                     .sym    = al->sym,
>> +             },
>> +             .cpu    = al->cpu,
>> +             .ip     = al->addr,
>> +             .level  = al->level,
>> +             .period = period,
>> +             .parent = sym_parent,
>> +             .filtered = symbol__parent_filter(sym_parent),
>> +     };
>> +
>> +     return add_hist_entry(self, &entry, al, period);
>> +}
>> +
>>  int64_t
>>  hist_entry__cmp(struct hist_entry *left, struct hist_entry *right)
>>  {
>> diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
>> index 0d48613..801a04e 100644
>> --- a/tools/perf/util/hist.h
>> +++ b/tools/perf/util/hist.h
>> @@ -41,6 +41,7 @@ enum hist_column {
>>       HISTC_COMM,
>>       HISTC_PARENT,
>>       HISTC_CPU,
>> +     HISTC_MISPREDICT,
>>       HISTC_NR_COLS, /* Last entry */
>>  };
>>
>> @@ -73,6 +74,12 @@ int hist_entry__snprintf(struct hist_entry *self, char *bf, size_t size,
>>                        struct hists *hists);
>>  void hist_entry__free(struct hist_entry *);
>>
>> +struct hist_entry *__hists__add_branch_entry(struct hists *self,
>> +                                          struct addr_location *al,
>> +                                          struct symbol *sym_parent,
>> +                                          struct branch_info *bi,
>> +                                          u64 period);
>> +
>>  void hists__output_resort(struct hists *self);
>>  void hists__output_resort_threaded(struct hists *hists);
>>  void hists__collapse_resort(struct hists *self);
>> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
>> index 552c1c5..5ce3f31 100644
>> --- a/tools/perf/util/session.c
>> +++ b/tools/perf/util/session.c
>> @@ -229,6 +229,63 @@ static bool symbol__match_parent_regex(struct symbol *sym)
>>       return 0;
>>  }
>>
>> +static const u8 cpumodes[] = {
>> +     PERF_RECORD_MISC_USER,
>> +     PERF_RECORD_MISC_KERNEL,
>> +     PERF_RECORD_MISC_GUEST_USER,
>> +     PERF_RECORD_MISC_GUEST_KERNEL
>> +};
>> +#define NCPUMODES (sizeof(cpumodes)/sizeof(u8))
>> +
>> +static void ip__resolve_ams(struct machine *self, struct thread *thread,
>> +                         struct addr_map_symbol *ams,
>> +                         u64 ip)
>> +{
>> +     struct addr_location al;
>> +     size_t i;
>> +     u8 m;
>> +
>> +     memset(&al, 0, sizeof(al));
>> +
>> +     for (i = 0; i < NCPUMODES; i++) {
>> +             m = cpumodes[i];
>> +             /*
>> +              * we cannot use the header.misc hint to determine whether a
>
> ditto
>
>> +              * branch stack address is user, kernel, guest, hypervisor.
>> +              * Branches may straddle the kernel/user/hypervisor boundaries.
>> +              * Thus, we have to try * consecutively until we find a match
>
>                                        ^ comment reflow artifact?
>
>> +              * or else, the symbol is unknown
>> +              */
>> +             thread__find_addr_location(thread, self, m, MAP__FUNCTION,
>> +                             ip, &al, NULL);
>> +             if (al.sym)
>> +                     goto found;
>> +     }
>> +found:
>> +     ams->addr = ip;
>> +     ams->sym = al.sym;
>> +     ams->map = al.map;
>> +}
>> +
>> +struct branch_info *perf_session__resolve_bstack(struct machine *self,
>> +                                              struct thread *thr,
>> +                                              struct branch_stack *bs)
>> +{
>> +     struct branch_info *bi;
>> +     unsigned int i;
>> +
>> +     bi = calloc(bs->nr, sizeof(struct branch_info));
>> +     if (!bi)
>> +             return NULL;
>> +
>> +     for (i = 0; i < bs->nr; i++) {
>> +             ip__resolve_ams(self, thr, &bi[i].to, bs->entries[i].to);
>> +             ip__resolve_ams(self, thr, &bi[i].from, bs->entries[i].from);
>> +             bi[i].flags = bs->entries[i].flags;
>> +     }
>> +     return bi;
>> +}
>> +
>>  int machine__resolve_callchain(struct machine *self, struct perf_evsel *evsel,
>>                              struct thread *thread,
>>                              struct ip_callchain *chain,
>> @@ -697,6 +754,18 @@ static void callchain__printf(struct perf_sample *sample)
>>                      i, sample->callchain->ips[i]);
>>  }
>>
>> +static void branch_stack__printf(struct perf_sample *sample)
>> +{
>> +     uint64_t i;
>> +
>> +     printf("... branch stack: nr:%" PRIu64 "\n", sample->branch_stack->nr);
>> +
>> +     for (i = 0; i < sample->branch_stack->nr; i++)
>> +             printf("..... %2"PRIu64": %016" PRIx64 " -> %016" PRIx64 "\n",
>> +                     i, sample->branch_stack->entries[i].from,
>> +                     sample->branch_stack->entries[i].to);
>> +}
>> +
>>  static void perf_session__print_tstamp(struct perf_session *session,
>>                                      union perf_event *event,
>>                                      struct perf_sample *sample)
>> @@ -744,6 +813,9 @@ static void dump_sample(struct perf_session *session, union perf_event *event,
>>
>>       if (session->sample_type & PERF_SAMPLE_CALLCHAIN)
>>               callchain__printf(sample);
>> +
>> +     if (session->sample_type & PERF_SAMPLE_BRANCH_STACK)
>> +             branch_stack__printf(sample);
>>  }
>>
>>  static struct machine *
>> diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
>> index c8d9017..accb5dc 100644
>> --- a/tools/perf/util/session.h
>> +++ b/tools/perf/util/session.h
>> @@ -73,6 +73,10 @@ int perf_session__resolve_callchain(struct perf_session *self, struct perf_evsel
>>                                   struct ip_callchain *chain,
>>                                   struct symbol **parent);
>>
>> +struct branch_info *perf_session__resolve_bstack(struct machine *self,
>> +                                              struct thread *thread,
>> +                                              struct branch_stack *bs);
>> +
>>  bool perf_session__has_traces(struct perf_session *self, const char *msg);
>>
>>  void mem_bswap_64(void *src, int byte_size);
>> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
>> index 16da30d..1531989 100644
>> --- a/tools/perf/util/sort.c
>> +++ b/tools/perf/util/sort.c
>> @@ -8,6 +8,7 @@ const char    default_sort_order[] = "comm,dso,symbol";
>>  const char   *sort_order = default_sort_order;
>>  int          sort__need_collapse = 0;
>>  int          sort__has_parent = 0;
>> +bool         sort__branch_mode;
>>
>>  enum sort_type       sort__first_dimension;
>>
>> @@ -94,6 +95,26 @@ static int hist_entry__comm_snprintf(struct hist_entry *self, char *bf,
>>       return repsep_snprintf(bf, size, "%*s", width, self->thread->comm);
>>  }
>>
>> +static int64_t _sort__dso_cmp(struct map *map_l, struct map *map_r)
>> +{
>> +     struct dso *dso_l = map_l ? map_l->dso : NULL;
>> +     struct dso *dso_r = map_r ? map_r->dso : NULL;
>> +     const char *dso_name_l, *dso_name_r;
>> +
>> +     if (!dso_l || !dso_r)
>> +             return cmp_null(dso_l, dso_r);
>> +
>> +     if (verbose) {
>> +             dso_name_l = dso_l->long_name;
>> +             dso_name_r = dso_r->long_name;
>> +     } else {
>> +             dso_name_l = dso_l->short_name;
>> +             dso_name_r = dso_r->short_name;
>> +     }
>> +
>> +     return strcmp(dso_name_l, dso_name_r);
>> +}
>> +
>>  struct sort_entry sort_comm = {
>>       .se_header      = "Command",
>>       .se_cmp         = sort__comm_cmp,
>> @@ -107,36 +128,74 @@ struct sort_entry sort_comm = {
>>  static int64_t
>>  sort__dso_cmp(struct hist_entry *left, struct hist_entry *right)
>>  {
>> -     struct dso *dso_l = left->ms.map ? left->ms.map->dso : NULL;
>> -     struct dso *dso_r = right->ms.map ? right->ms.map->dso : NULL;
>> -     const char *dso_name_l, *dso_name_r;
>> +     return _sort__dso_cmp(left->ms.map, right->ms.map);
>> +}
>>
>> -     if (!dso_l || !dso_r)
>> -             return cmp_null(dso_l, dso_r);
>>
>> -     if (verbose) {
>> -             dso_name_l = dso_l->long_name;
>> -             dso_name_r = dso_r->long_name;
>> -     } else {
>> -             dso_name_l = dso_l->short_name;
>> -             dso_name_r = dso_r->short_name;
>> +static int64_t _sort__sym_cmp(struct symbol *sym_l, struct symbol *sym_r,
>
> We use double _ on the front with the same rationale as in the kernel,
> i.e. we do a little bit less than what the non __ prefixed function
> does (locking, etc).
>
The function was extracted to be called from different contexts.
The old function kept the same name to avoid modifying many lines of code.
The _sort__sym_cmp() does the actual work, i.e., the common code.
So I don't know how to apply your comment.
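
To make that convention concrete, here is a hypothetical sketch (not code
from this patchset): the __ variant does a little less, e.g. it assumes the
caller already holds the lock, while the unprefixed wrapper takes it:

#include <pthread.h>

struct counter {
	pthread_mutex_t lock;
	int value;
};

/* __ variant: does the core work, caller must hold c->lock */
static int __counter_add(struct counter *c, int delta)
{
	c->value += delta;
	return c->value;
}

/* unprefixed wrapper: adds the locking around the __ variant */
static int counter_add(struct counter *c, int delta)
{
	int ret;

	pthread_mutex_lock(&c->lock);
	ret = __counter_add(c, delta);
	pthread_mutex_unlock(&c->lock);
	return ret;
}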

>> +                           u64 ip_l, u64 ip_r)
>> +{
>> +     if (!sym_l || !sym_r)
>> +             return cmp_null(sym_l, sym_r);
>> +
>> +     if (sym_l == sym_r)
>> +             return 0;
>> +
>> +     if (sym_l)
>> +             ip_l = sym_l->start;
>> +     if (sym_r)
>> +             ip_r = sym_r->start;
>> +
>> +     return (int64_t)(ip_r - ip_l);
>> +}
>> +
>> +static int _hist_entry__dso_snprintf(struct map *map, char *bf,
>> +                                  size_t size, unsigned int width)
>> +{
>> +     if (map && map->dso) {
>> +             const char *dso_name = !verbose ? map->dso->short_name :
>> +                     map->dso->long_name;
>> +             return repsep_snprintf(bf, size, "%-*s", width, dso_name);
>>       }
>>
>> -     return strcmp(dso_name_l, dso_name_r);
>> +     return repsep_snprintf(bf, size, "%-*s", width, "[unknown]");
>>  }
>>
>>  static int hist_entry__dso_snprintf(struct hist_entry *self, char *bf,
>>                                   size_t size, unsigned int width)
>>  {
>> -     if (self->ms.map && self->ms.map->dso) {
>> -             const char *dso_name = !verbose ? self->ms.map->dso->short_name :
>> -                                               self->ms.map->dso->long_name;
>> -             return repsep_snprintf(bf, size, "%-*s", width, dso_name);
>> +     return _hist_entry__dso_snprintf(self->ms.map, bf, size, width);
>> +}
>> +
>> +static int _hist_entry__sym_snprintf(struct map *map, struct symbol *sym,
>> +                                  u64 ip, char level, char *bf, size_t size,
>> +                                  unsigned int width __used)
>> +{
>> +     size_t ret = 0;
>> +
>> +     if (verbose) {
>> +             char o = map ? dso__symtab_origin(map->dso) : '!';
>> +             ret += repsep_snprintf(bf, size, "%-#*llx %c ",
>> +                                    BITS_PER_LONG / 4, ip, o);
>>       }
>>
>> -     return repsep_snprintf(bf, size, "%-*s", width, "[unknown]");
>> +     ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", level);
>> +     if (sym)
>> +             ret += repsep_snprintf(bf + ret, size - ret, "%-*s",
>> +                                    width - ret,
>> +                                    sym->name);
>> +     else {
>> +             size_t len = BITS_PER_LONG / 4;
>> +             ret += repsep_snprintf(bf + ret, size - ret, "%-#.*llx",
>> +                                    len, ip);
>> +             ret += repsep_snprintf(bf + ret, size - ret, "%-*s",
>> +                                    width - ret, "");
>> +     }
>> +
>> +     return ret;
>>  }
>>
>> +
>>  struct sort_entry sort_dso = {
>>       .se_header      = "Shared Object",
>>       .se_cmp         = sort__dso_cmp,
>> @@ -144,8 +203,14 @@ struct sort_entry sort_dso = {
>>       .se_width_idx   = HISTC_DSO,
>>  };
>>
>> -/* --sort symbol */
>> +static int hist_entry__sym_snprintf(struct hist_entry *self, char *bf,
>> +                                 size_t size, unsigned int width __used)
>> +{
>> +     return _hist_entry__sym_snprintf(self->ms.map, self->ms.sym, self->ip,
>> +                                      self->level, bf, size, width);
>> +}
>>
>> +/* --sort symbol */
>>  static int64_t
>>  sort__sym_cmp(struct hist_entry *left, struct hist_entry *right)
>>  {
>> @@ -154,40 +219,10 @@ sort__sym_cmp(struct hist_entry *left, struct hist_entry *right)
>>       if (!left->ms.sym && !right->ms.sym)
>>               return right->level - left->level;
>>
>> -     if (!left->ms.sym || !right->ms.sym)
>> -             return cmp_null(left->ms.sym, right->ms.sym);
>> -
>> -     if (left->ms.sym == right->ms.sym)
>> -             return 0;
>> -
>>       ip_l = left->ms.sym->start;
>>       ip_r = right->ms.sym->start;
>>
>> -     return (int64_t)(ip_r - ip_l);
>> -}
>> -
>> -static int hist_entry__sym_snprintf(struct hist_entry *self, char *bf,
>> -                                 size_t size, unsigned int width __used)
>> -{
>> -     size_t ret = 0;
>> -
>> -     if (verbose) {
>> -             char o = self->ms.map ? dso__symtab_origin(self->ms.map->dso) : '!';
>> -             ret += repsep_snprintf(bf, size, "%-#*llx %c ",
>> -                                    BITS_PER_LONG / 4, self->ip, o);
>> -     }
>> -
>> -     if (!sort_dso.elide)
>> -             ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", self->level);
>> -
>> -     if (self->ms.sym)
>> -             ret += repsep_snprintf(bf + ret, size - ret, "%s",
>> -                                    self->ms.sym->name);
>> -     else
>> -             ret += repsep_snprintf(bf + ret, size - ret, "%-#*llx",
>> -                                    BITS_PER_LONG / 4, self->ip);
>> -
>> -     return ret;
>> +     return _sort__sym_cmp(left->ms.sym, right->ms.sym, ip_l, ip_r);
>>  }
>>
>>  struct sort_entry sort_sym = {
>> @@ -246,6 +281,135 @@ struct sort_entry sort_cpu = {
>>       .se_width_idx   = HISTC_CPU,
>>  };
>>
>> +static int64_t
>> +sort__dso_from_cmp(struct hist_entry *left, struct hist_entry *right)
>> +{
>> +     return _sort__dso_cmp(left->branch_info->from.map,
>> +                           right->branch_info->from.map);
>> +}
>> +
>> +static int hist_entry__dso_from_snprintf(struct hist_entry *self, char *bf,
>> +                                 size_t size, unsigned int width)
>> +{
>> +     return _hist_entry__dso_snprintf(self->branch_info->from.map,
>> +                                      bf, size, width);
>> +}
>> +
>> +struct sort_entry sort_dso_from = {
>> +     .se_header      = "Source Shared Object",
>> +     .se_cmp         = sort__dso_from_cmp,
>> +     .se_snprintf    = hist_entry__dso_from_snprintf,
>> +     .se_width_idx   = HISTC_DSO,
>> +};
>> +
>> +static int64_t
>> +sort__dso_to_cmp(struct hist_entry *left, struct hist_entry *right)
>> +{
>> +     return _sort__dso_cmp(left->branch_info->to.map,
>> +                           right->branch_info->to.map);
>> +}
>> +
>> +static int hist_entry__dso_to_snprintf(struct hist_entry *self, char *bf,
>> +                                    size_t size, unsigned int width)
>> +{
>> +     return _hist_entry__dso_snprintf(self->branch_info->to.map,
>> +                                      bf, size, width);
>> +}
>> +
>> +static int64_t
>> +sort__sym_from_cmp(struct hist_entry *left, struct hist_entry *right)
>> +{
>> +     struct addr_map_symbol *from_l = &left->branch_info->from;
>> +     struct addr_map_symbol *from_r = &right->branch_info->from;
>> +
>> +     if (!from_l->sym && !from_r->sym)
>> +             return right->level - left->level;
>> +
>> +     return _sort__sym_cmp(from_l->sym, from_r->sym, from_l->addr,
>> +                          from_r->addr);
>> +}
>> +
>> +static int64_t
>> +sort__sym_to_cmp(struct hist_entry *left, struct hist_entry *right)
>> +{
>> +     struct addr_map_symbol *to_l = &left->branch_info->to;
>> +     struct addr_map_symbol *to_r = &right->branch_info->to;
>> +
>> +     if (!to_l->sym && !to_r->sym)
>> +             return right->level - left->level;
>> +
>> +     return _sort__sym_cmp(to_l->sym, to_r->sym, to_l->addr, to_r->addr);
>> +}
>> +
>> +static int hist_entry__sym_from_snprintf(struct hist_entry *self, char *bf,
>> +                                 size_t size, unsigned int width __used)
>> +{
>> +     struct addr_map_symbol *from = &self->branch_info->from;
>> +     return _hist_entry__sym_snprintf(from->map, from->sym, from->addr,
>> +                                      self->level, bf, size, width);
>> +
>> +}
>> +
>> +static int hist_entry__sym_to_snprintf(struct hist_entry *self, char *bf,
>> +                                 size_t size, unsigned int width __used)
>> +{
>> +     struct addr_map_symbol *to = &self->branch_info->to;
>> +     return _hist_entry__sym_snprintf(to->map, to->sym, to->addr,
>> +                                      self->level, bf, size, width);
>> +
>> +}
>> +
>> +struct sort_entry sort_dso_to = {
>> +     .se_header      = "Target Shared Object",
>> +     .se_cmp         = sort__dso_to_cmp,
>> +     .se_snprintf    = hist_entry__dso_to_snprintf,
>> +     .se_width_idx   = HISTC_DSO,
>> +};
>> +
>> +struct sort_entry sort_sym_from = {
>> +     .se_header      = "Source Symbol",
>> +     .se_cmp         = sort__sym_from_cmp,
>> +     .se_snprintf    = hist_entry__sym_from_snprintf,
>> +     .se_width_idx   = HISTC_SYMBOL,
>> +};
>> +
>> +struct sort_entry sort_sym_to = {
>> +     .se_header      = "Target Symbol",
>> +     .se_cmp         = sort__sym_to_cmp,
>> +     .se_snprintf    = hist_entry__sym_to_snprintf,
>> +     .se_width_idx   = HISTC_SYMBOL,
>> +};
>> +
>> +static int64_t
>> +sort__mispredict_cmp(struct hist_entry *left, struct hist_entry *right)
>> +{
>> +     const unsigned char mp = left->branch_info->flags.mispred !=
>> +                                     right->branch_info->flags.mispred;
>> +     const unsigned char p = left->branch_info->flags.predicted !=
>> +                                     right->branch_info->flags.predicted;
>> +
>> +     return mp || p;
>> +}
>> +
>> +static int hist_entry__mispredict_snprintf(struct hist_entry *self, char *bf,
>> +                                 size_t size, unsigned int width){
>> +     static const char *out = "N/A";
>> +
>> +     if (self->branch_info->flags.predicted)
>> +             out = "N";
>> +     else if (self->branch_info->flags.mispred)
>> +             out = "Y";
>> +
>> +     return repsep_snprintf(bf, size, "%-*s", width, out);
>> +}
>> +
>> +struct sort_entry sort_mispredict = {
>> +     .se_header      = "Branch Mispredicted",
>> +     .se_cmp         = sort__mispredict_cmp,
>> +     .se_snprintf    = hist_entry__mispredict_snprintf,
>> +     .se_width_idx   = HISTC_MISPREDICT,
>> +};
>> +
>>  struct sort_dimension {
>>       const char              *name;
>>       struct sort_entry       *entry;
>> @@ -253,14 +417,59 @@ struct sort_dimension {
>>  };
>>
>>  static struct sort_dimension sort_dimensions[] = {
>> -     { .name = "pid",        .entry = &sort_thread,  },
>> -     { .name = "comm",       .entry = &sort_comm,    },
>> -     { .name = "dso",        .entry = &sort_dso,     },
>> -     { .name = "symbol",     .entry = &sort_sym,     },
>> -     { .name = "parent",     .entry = &sort_parent,  },
>> -     { .name = "cpu",        .entry = &sort_cpu,     },
>> +     { .name = "pid",        .entry = &sort_thread,                  },
>> +     { .name = "comm",       .entry = &sort_comm,                    },
>> +     { .name = "dso",        .entry = &sort_dso,                     },
>> +     { .name = "dso_from",   .entry = &sort_dso_from, .taken = true  },
>> +     { .name = "dso_to",     .entry = &sort_dso_to,   .taken = true  },
>> +     { .name = "symbol",     .entry = &sort_sym,                     },
>> +     { .name = "symbol_from",.entry = &sort_sym_from, .taken = true  },
>> +     { .name = "symbol_to",  .entry = &sort_sym_to,   .taken = true  },
>> +     { .name = "parent",     .entry = &sort_parent,                  },
>> +     { .name = "cpu",        .entry = &sort_cpu,                     },
>> +     { .name = "mispredict", .entry = &sort_mispredict, },
>>  };
>>
>> +static int _sort_dimension__add(struct sort_dimension *sd)
>> +{
>> +     if (sd->entry->se_collapse)
>> +             sort__need_collapse = 1;
>> +
>> +     if (sd->entry == &sort_parent) {
>> +             int ret = regcomp(&parent_regex, parent_pattern, REG_EXTENDED);
>> +             if (ret) {
>> +                     char err[BUFSIZ];
>> +
>> +                     regerror(ret, &parent_regex, err, sizeof(err));
>> +                     pr_err("Invalid regex: %s\n%s", parent_pattern, err);
>> +                     return -EINVAL;
>> +             }
>> +             sort__has_parent = 1;
>> +     }
>> +
>> +     if (list_empty(&hist_entry__sort_list)) {
>> +             if (!strcmp(sd->name, "pid"))
>> +                     sort__first_dimension = SORT_PID;
>> +             else if (!strcmp(sd->name, "comm"))
>> +                     sort__first_dimension = SORT_COMM;
>> +             else if (!strcmp(sd->name, "dso"))
>> +                     sort__first_dimension = SORT_DSO;
>> +             else if (!strcmp(sd->name, "symbol"))
>> +                     sort__first_dimension = SORT_SYM;
>> +             else if (!strcmp(sd->name, "parent"))
>> +                     sort__first_dimension = SORT_PARENT;
>> +             else if (!strcmp(sd->name, "cpu"))
>> +                     sort__first_dimension = SORT_CPU;
>> +             else if (!strcmp(sd->name, "mispredict"))
>> +                     sort__first_dimension = SORT_MISPREDICTED;
>> +     }
>> +
>> +     list_add_tail(&sd->entry->list, &hist_entry__sort_list);
>> +     sd->taken = 1;
>> +
>> +     return 0;
>> +}
>> +
>>  int sort_dimension__add(const char *tok)
>>  {
>>       unsigned int i;
>> @@ -271,48 +480,21 @@ int sort_dimension__add(const char *tok)
>>               if (strncasecmp(tok, sd->name, strlen(tok)))
>>                       continue;
>>
>> -             if (sd->entry == &sort_parent) {
>> -                     int ret = regcomp(&parent_regex, parent_pattern, REG_EXTENDED);
>> -                     if (ret) {
>> -                             char err[BUFSIZ];
>> -
>> -                             regerror(ret, &parent_regex, err, sizeof(err));
>> -                             pr_err("Invalid regex: %s\n%s", parent_pattern, err);
>> -                             return -EINVAL;
>> -                     }
>> -                     sort__has_parent = 1;
>> -             }
>> -
>>               if (sd->taken)
>>                       return 0;
>>
>> -             if (sd->entry->se_collapse)
>> -                     sort__need_collapse = 1;
>> -
>> -             if (list_empty(&hist_entry__sort_list)) {
>> -                     if (!strcmp(sd->name, "pid"))
>> -                             sort__first_dimension = SORT_PID;
>> -                     else if (!strcmp(sd->name, "comm"))
>> -                             sort__first_dimension = SORT_COMM;
>> -                     else if (!strcmp(sd->name, "dso"))
>> -                             sort__first_dimension = SORT_DSO;
>> -                     else if (!strcmp(sd->name, "symbol"))
>> -                             sort__first_dimension = SORT_SYM;
>> -                     else if (!strcmp(sd->name, "parent"))
>> -                             sort__first_dimension = SORT_PARENT;
>> -                     else if (!strcmp(sd->name, "cpu"))
>> -                             sort__first_dimension = SORT_CPU;
>> -             }
>> -
>> -             list_add_tail(&sd->entry->list, &hist_entry__sort_list);
>> -             sd->taken = 1;
>>
>> -             return 0;
>> +             if (sort__branch_mode && (sd->entry == &sort_dso ||
>> +                                     sd->entry == &sort_sym)){
>> +                     int err = _sort_dimension__add(sd + 1);
>> +                     return err ?: _sort_dimension__add(sd + 2);
>> +             } else if (sd->entry == &sort_mispredict && !sort__branch_mode)
>> +                     break;
>> +             else
>> +                     return _sort_dimension__add(sd);
>>       }
>> -
>>       return -ESRCH;
>>  }
>> -
>>  void setup_sorting(const char * const usagestr[], const struct option *opts)
>>  {
>>       char *tmp, *tok, *str = strdup(sort_order);
>> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
>> index 3f67ae3..effcae1 100644
>> --- a/tools/perf/util/sort.h
>> +++ b/tools/perf/util/sort.h
>> @@ -31,11 +31,14 @@ extern const char *parent_pattern;
>>  extern const char default_sort_order[];
>>  extern int sort__need_collapse;
>>  extern int sort__has_parent;
>> +extern bool sort__branch_mode;
>>  extern char *field_sep;
>>  extern struct sort_entry sort_comm;
>>  extern struct sort_entry sort_dso;
>>  extern struct sort_entry sort_sym;
>>  extern struct sort_entry sort_parent;
>> +extern struct sort_entry sort_lbr_dso;
>> +extern struct sort_entry sort_lbr_sym;
>>  extern enum sort_type sort__first_dimension;
>>
>>  /**
>> @@ -72,6 +75,7 @@ struct hist_entry {
>>               struct hist_entry *pair;
>>               struct rb_root    sorted_chain;
>>       };
>> +     struct branch_info      *branch_info;
>>       struct callchain_root   callchain[0];
>>  };
>>
>> @@ -82,6 +86,7 @@ enum sort_type {
>>       SORT_SYM,
>>       SORT_PARENT,
>>       SORT_CPU,
>> +     SORT_MISPREDICTED,
>>  };
>>
>>  /*
>> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
>> index 2a683d4..5866ce6 100644
>> --- a/tools/perf/util/symbol.h
>> +++ b/tools/perf/util/symbol.h
>> @@ -5,6 +5,7 @@
>>  #include <stdbool.h>
>>  #include <stdint.h>
>>  #include "map.h"
>> +#include "../perf.h"
>>  #include <linux/list.h>
>>  #include <linux/rbtree.h>
>>  #include <stdio.h>
>> @@ -120,6 +121,18 @@ struct map_symbol {
>>       bool          has_children;
>>  };
>>
>> +struct addr_map_symbol {
>> +     struct map    *map;
>> +     struct symbol *sym;
>> +     u64           addr;
>> +};
>> +
>> +struct branch_info {
>> +     struct addr_map_symbol from;
>> +     struct addr_map_symbol to;
>> +     struct branch_flags flags;
>> +};
>> +
>>  struct addr_location {
>>       struct thread *thread;
>>       struct map    *map;
>> --
>> 1.7.4.1
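
For illustration, a minimal, self-contained sketch of how a consumer could
walk the PERF_SAMPLE_BRANCH_STACK payload described by the structs in the
patch above; parse_branch_stack() is a hypothetical name, not part of the
patch:

#include <stdio.h>
#include <inttypes.h>

typedef uint64_t u64;

/* layout as defined in the patch above */
struct branch_flags { u64 mispred:1; u64 predicted:1; u64 reserved:62; };
struct branch_entry { u64 from; u64 to; struct branch_flags flags; };
struct branch_stack { u64 nr; struct branch_entry entries[0]; };

/* 'array' points at the PERF_SAMPLE_BRANCH_STACK part of a sample:
 * a u64 count followed by nr branch_entry records.
 */
static const u64 *parse_branch_stack(const u64 *array)
{
	const struct branch_stack *bs = (const struct branch_stack *)array;
	u64 i;

	for (i = 0; i < bs->nr; i++)
		printf("%2" PRIu64 ": %016" PRIx64 " -> %016" PRIx64 " %s\n",
		       i, bs->entries[i].from, bs->entries[i].to,
		       bs->entries[i].flags.mispred ? "mispredicted" : "predicted");

	/* advance past nr plus the entries, counted in u64 units */
	return array + 1 + bs->nr * (sizeof(struct branch_entry) / sizeof(u64));
}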

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 14/18] perf: fix endianness detection in perf.data
  2012-02-06 22:29         ` David Ahern
@ 2012-02-07 14:13           ` Stephane Eranian
  2012-02-07 14:38             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 43+ messages in thread
From: Stephane Eranian @ 2012-02-07 14:13 UTC (permalink / raw)
  To: David Ahern
  Cc: Arnaldo Carvalho de Melo, linux-kernel, peterz, mingo,
	robert.richter, ming.m.lin, andi, asharma, ravitillo, vweaver1,
	khandual

On Mon, Feb 6, 2012 at 11:29 PM, David Ahern <dsahern@gmail.com> wrote:
>
>
> On 02/06/2012 03:06 PM, Arnaldo Carvalho de Melo wrote:
>> Em Mon, Feb 06, 2012 at 02:47:18PM -0700, David Ahern escreveu:
>>> On 02/06/2012 11:17 AM, Arnaldo Carvalho de Melo wrote:
>>>> Looks ok, but IIRC David Ahern interacted with you on this specific
>>>> patch in the past, having his Acked-by and/or Tested-by would be great,
>>>> David?
>>>
>>> I don't recall anything changing since version 4. I scanned over the
>>> change and it all looks familiar; testing worked fine as well -- I was
>>> able to analyze a PPC file on x86 and vice versa.
>>>
>>> Acked-by: David Ahern <dsahern@gmail.com>
>>> Tested-by: David Ahern <dsahern@gmail.com>
>>>
>>
>> Stephane,
>>
>>       What are your plans? A v6 patch series with this new round of
>> comments and Acked-by et al stamps?
>>
I can do that. That'll be easier, I think.

>> - Arnaldo
>
> I thought this one was standalone, so it could be picked up now.
>
> David

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 14/18] perf: fix endianness detection in perf.data
  2012-02-07 14:13           ` Stephane Eranian
@ 2012-02-07 14:38             ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 43+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-02-07 14:38 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: David Ahern, linux-kernel, peterz, mingo, robert.richter,
	ming.m.lin, andi, asharma, ravitillo, vweaver1, khandual

Em Tue, Feb 07, 2012 at 03:13:54PM +0100, Stephane Eranian escreveu:
> On Mon, Feb 6, 2012 at 11:29 PM, David Ahern <dsahern@gmail.com> wrote:
> > On 02/06/2012 03:06 PM, Arnaldo Carvalho de Melo wrote:
> >> Em Mon, Feb 06, 2012 at 02:47:18PM -0700, David Ahern escreveu:
> >>> On 02/06/2012 11:17 AM, Arnaldo Carvalho de Melo wrote:
> >>>> Looks ok, but IIRC David Ahern interacted with you on this specific
> >>>> patch in the past, having his Acked-by and/or Tested-by would be great,
> >>>> David?
> >>>
> >>> I don't recall anything changing since version 4. I scanned over the
> >>> change and it all looks familiar; testing worked fine as well -- I was
> >>> able to analyze a PPC file on x86 and vice versa.
> >>>
> >>> Acked-by: David Ahern <dsahern@gmail.com>
> >>> Tested-by: David Ahern <dsahern@gmail.com>
> >>
> >> Stephane,
> >>
> >>       What are your plans? A v6 patch series with this new round of
> >> comments and Acked-by et al stamps?
> >>
> I can do that. That'll be easier I think.

Never mind; as David pointed out, it is standalone, so I've applied it
together with David's stamps.
 
> > I thought this one was standalone, so it could be picked up now.

- Arnaldo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev
  2012-02-07  7:03     ` Anshuman Khandual
@ 2012-02-07 14:52       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 43+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-02-07 14:52 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Stephane Eranian, linux-kernel, peterz, mingo, robert.richter,
	ming.m.lin, andi, asharma, ravitillo, vweaver1, dsahern

Em Tue, Feb 07, 2012 at 12:33:02PM +0530, Anshuman Khandual escreveu:
> On Monday 06 February 2012 11:52 PM, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Feb 02, 2012 at 01:54:46PM +0100, Stephane Eranian escreveu:
> >> This patch allows perf to process perf.data files generated
> >> using an ABI that has a different perf_event_attr struct size, i.e.,
> >> a different ABI version.

> > Anshuman, can I have your Acked-by or Reviewed-by since you spotted
> > problems in this and your suggestions were taken into account? Is this
> > OK now?

> (1) The PERF_SAMPLE_BRANCH_HV and related privilege level problems have been fixed.
>     Verified various combinations of <branch_type>,[u,k,hv] <event>:[u,k,h].
>     Works well in all privilege level permutations.
> 
> (2) As Peter mentioned, the 'has_branch_smpl()' bug in the ARM code will be taken
>     care of.
> 
> (3) The 'ref_size' problem in try_all_pipe_abis() has been fixed. All the patches
>     compile successfully on their own.
> 
> (4) Went through the entire patchset; looks good to me.
> 
> Acked-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> Tested-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>

Thanks! But we'll now have to wait till Stephane addresses the points
made by David :-)

- Arnaldo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 11/18] perf: add code to support PERF_SAMPLE_BRANCH_STACK
  2012-02-07 14:11     ` Stephane Eranian
@ 2012-02-07 15:21       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 43+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-02-07 15:21 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, peterz, mingo, robert.richter, ming.m.lin, andi,
	asharma, ravitillo, vweaver1, khandual, dsahern

Em Tue, Feb 07, 2012 at 03:11:56PM +0100, Stephane Eranian escreveu:
> On Mon, Feb 6, 2012 at 7:06 PM, Arnaldo Carvalho de Melo <acme@redhat.com> wrote:
> >> +++ b/tools/perf/util/annotate.c
> >> @@ -64,7 +64,7 @@ int symbol__inc_addr_samples(struct symbol *sym, struct map *map,

> >>       pr_debug3("%s: addr=%#" PRIx64 "\n", __func__, map->unmap_ip(map, addr));

> >> -     if (addr >= sym->end)
> >> +     if (addr >= sym->end || addr < sym->start)

> > This is not related to this, would be better to come in a separate patch
> > with a proper explanation.

> You mean in this patchset or separately?

Either way it would be standalone and I'd pick it up, but please write a
commit message explaining why it is needed.

Multiple people have submitted this already, but without a good commit
message, and I think it may be papering over a bug.

- Arnaldo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev
  2012-02-06 22:19   ` David Ahern
@ 2012-02-07 15:50     ` Stephane Eranian
  2012-02-07 16:41       ` David Ahern
  0 siblings, 1 reply; 43+ messages in thread
From: Stephane Eranian @ 2012-02-07 15:50 UTC (permalink / raw)
  To: David Ahern
  Cc: linux-kernel, peterz, mingo, acme, robert.richter, ming.m.lin,
	andi, asharma, ravitillo, vweaver1, khandual

On Mon, Feb 6, 2012 at 11:19 PM, David Ahern <dsahern@gmail.com> wrote:
>
>
> On 02/02/2012 05:54 AM, Stephane Eranian wrote:
>> This patch allows perf to process perf.data files generated
>> using an ABI that has a different perf_event_attr struct size, i.e.,
>> a different ABI version.
>>
>> The perf_event_attr can be extended, yet perf needs to cope with
>> older perf.data files. Similarly, perf must be able to cope with
>> a perf.data file which is using a newer version of the ABI than
>> what it knows about.
>>
>> This patch adds read_attr(), a routine that reads a perf_event_attr
>> struct from a file incrementally based on its advertised size. If
>> the on-file struct is smaller than what perf knows, then the extra
>> fields are zeroed. If the on-file struct is bigger, then perf only
>> uses what it knows about, the rest is skipped.
>>
>> Signed-off-by: Stephane Eranian <eranian@google.com>
>> ---
>>  tools/perf/util/header.c |   49 ++++++++++++++++++++++++++++++++++++++++++++-
>>  1 files changed, 47 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
>> index 6f4187d..8d6c18d 100644
>> --- a/tools/perf/util/header.c
>> +++ b/tools/perf/util/header.c
>> @@ -1959,6 +1959,51 @@ static int perf_header__read_pipe(struct perf_session *session, int fd)
>>       return 0;
>>  }
>>
>> +static int read_attr(int fd, struct perf_header *ph,
>> +                  struct perf_file_attr *f_attr)
>> +{
>> +     struct perf_event_attr *attr = &f_attr->attr;
>> +     size_t sz, left;
>> +     size_t our_sz = sizeof(f_attr->attr);
>> +     int ret;
>> +
>> +     memset(f_attr, 0, sizeof(*f_attr));
>> +
>> +     /* read minimal guaranteed structure */
>> +     ret = readn(fd, attr, PERF_ATTR_SIZE_VER0);
>> +     if (ret <= 0)
>> +             return -1;
>
> As I recall, the first bump in that structure happened in 2.6.32. Why add
> backward compatibility for it now? I.e., why not just expect VER1?
>
>> +
>> +     /* on file perf_event_attr size */
>> +     sz = attr->size;
>> +     if (ph->needs_swap)
>> +             sz = bswap_32(sz);
>> +
>> +     if (sz == 0) {
>> +             /* assume ABI0 */
>> +             sz =  PERF_ATTR_SIZE_VER0;
>
> Shouldn't this be a failure? ie., problem with the file (or the
> swapping) since size can't be 0
>
size can be zero. In which case, it means ABI0 version.
See kernel/event/core.c:perf_copy_attr().


> And then for the following why not restrict sz to known, expected sizes
> -- using the PERF_ATTR_SIZE_VER defines introduced in patch 15?
>
Well, the current code solves the problem once and for all. Old tools
can still read new files and vice-versa. If you think that's a problem I
can simply bail out if sz > our_sz.

>> +     } else if (sz > our_sz) {
>> +             /* bigger than what we know about */
>> +             sz = our_sz;
>> +
>> +             /* skip what we do not know about */
>> +             lseek(fd, attr->size - our_sz, SEEK_CUR);
>> +     }
>> +     /* what we have not yet read and that we know about */
>> +     left = sz - PERF_ATTR_SIZE_VER0;
>> +     if (left) {
>> +             void *ptr = attr;
>> +             ptr += PERF_ATTR_SIZE_VER0;
>> +
>> +             ret = readn(fd, ptr, left);
>> +             if (ret <= 0)
>> +                     return -1;
>> +     }
>> +     /* read the ids */
>> +     ret = readn(fd, &f_attr->ids, sizeof(struct perf_file_section));
>
> Confused by the above? It is not done in the old code, so why read the
> ids here? I scanned the other patches, but don't see other code movement
> on this file.
>
Good catch. It is most likely leftover code from debugging. The ids are read
later, in perf_session__read_header().

> David
>
>> +     return ret <= 0 ? -1 : 0;
>> +}
>> +
>>  int perf_session__read_header(struct perf_session *session, int fd)
>>  {
>>       struct perf_header *header = &session->header;
>> @@ -1979,14 +2024,14 @@ int perf_session__read_header(struct perf_session *session, int fd)
>>               return -EINVAL;
>>       }
>>
>> -     nr_attrs = f_header.attrs.size / sizeof(f_attr);
>> +     nr_attrs = f_header.attrs.size / f_header.attr_size;
>>       lseek(fd, f_header.attrs.offset, SEEK_SET);
>>
>>       for (i = 0; i < nr_attrs; i++) {
>>               struct perf_evsel *evsel;
>>               off_t tmp;
>>
>> -             if (readn(fd, &f_attr, sizeof(f_attr)) <= 0)
>> +             if (read_attr(fd, header, &f_attr) < 0)
>>                       goto out_errno;
>>
>>               if (header->needs_swap)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev
  2012-02-07 15:50     ` Stephane Eranian
@ 2012-02-07 16:41       ` David Ahern
  2012-02-07 17:42         ` Stephane Eranian
  0 siblings, 1 reply; 43+ messages in thread
From: David Ahern @ 2012-02-07 16:41 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, peterz, mingo, acme, robert.richter, ming.m.lin,
	andi, asharma, ravitillo, vweaver1, khandual

On 02/07/2012 08:50 AM, Stephane Eranian wrote:
>>> +     if (sz == 0) {
>>> +             /* assume ABI0 */
>>> +             sz =  PERF_ATTR_SIZE_VER0;
>>
>> Shouldn't this be a failure? ie., problem with the file (or the
>> swapping) since size can't be 0
>>
> size can be zero. In which case, it means ABI0 version.
> See kernel/event/core.c:perf_copy_attr().

ok

> 
> 
>> And then for the following why not restrict sz to known, expected sizes
>> -- using the PERF_ATTR_SIZE_VER defines introduced in patch 15?
>>
> Well, the current code solves the problem once and for all. Old tools
> can still read new files and vice-versa. If you think that's a problem I
> can simply bail out if sz > our_sz.

My sensitivity on this is that when endianness handling is broken, it is
a nightmare to find. You end up lacing the code with printfs trying to
find which size field is going off the charts and making the parsing of
the file fail - or, worse, the sizes are slightly off and you get
nonsense out.

New commands should be able to read old files; old commands reading new
files is a bit of a stretch in that the code has to be future-proofed.
It seems like a reasonable requirement that data files be examined by a
command of the same vintage as, or newer than, the one that wrote the file.

David

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev
  2012-02-07 16:41       ` David Ahern
@ 2012-02-07 17:42         ` Stephane Eranian
  2012-02-07 17:57           ` David Ahern
  0 siblings, 1 reply; 43+ messages in thread
From: Stephane Eranian @ 2012-02-07 17:42 UTC (permalink / raw)
  To: David Ahern
  Cc: linux-kernel, peterz, mingo, acme, robert.richter, ming.m.lin,
	andi, asharma, ravitillo, vweaver1, khandual

On Tue, Feb 7, 2012 at 5:41 PM, David Ahern <dsahern@gmail.com> wrote:
> On 02/07/2012 08:50 AM, Stephane Eranian wrote:
>>>> +     if (sz == 0) {
>>>> +             /* assume ABI0 */
>>>> +             sz =  PERF_ATTR_SIZE_VER0;
>>>
>>> Shouldn't this be a failure? ie., problem with the file (or the
>>> swapping) since size can't be 0
>>>
>> size can be zero. In which case, it means ABI0 version.
>> See kernel/event/core.c:perf_copy_attr().
>
> ok
>
>>
>>
>>> And then for the following why not restrict sz to known, expected sizes
>>> -- using the PERF_ATTR_SIZE_VER defines introduced in patch 15?
>>>
>> Well, the current code solves the problem once and for all. Old tools
>> can still read new files and vice-versa. If you think that's a problem I
>> can simply bail out if sz > our_sz.
>
> My sensitivity on this is that when endianness handling is broken, it is
> a nightmare to find. You end up lacing the code with printfs trying to
> find which size field is going off the charts and making the parsing of
> the file fail - or, worse, the sizes are slightly off and you get
> nonsense out.
>
> New commands should be able to read old files; old commands reading new
> files is a bit of a stretch in that the code has to be future-proofed.
> It seems like a reasonable requirement that data files be examined by a
> command of the same vintage as, or newer than, the one that wrote the file.
>
Fine then, I can simply strip that part of the code and return an
error if sz > our_sz.
How about that?

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev
  2012-02-07 17:42         ` Stephane Eranian
@ 2012-02-07 17:57           ` David Ahern
  0 siblings, 0 replies; 43+ messages in thread
From: David Ahern @ 2012-02-07 17:57 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, peterz, mingo, acme, robert.richter, ming.m.lin,
	andi, asharma, ravitillo, vweaver1, khandual



On 02/07/2012 10:42 AM, Stephane Eranian wrote:
>>> Well, the current code solves the problem once and for all. Old tools
>>> can still read new files and vice-versa. If you think that's a problem I
>>> can simply bail out if sz > our_sz.
>>
>> My sensitivity on this is when endianness is broken it is a nightmare to
>> find. You end up lacing the code with printfs trying to find which size
>> field is going off the charts making the parsing of the file fail - or
>> worse the sizes are slightly off and you get non-sense out.
>>
>> New commands should be able to read old files; old commands reading new
>> files is a bit of a stretch in that the code has to be future-proofed.
>> It seems like a reasonable requirement for data files to be examined by
>> a command of the same vintage or newer as the command that wrote the file.
>>
> Fine then, I can simply strip that part of the code and return an
> error if sz > our_sz.
> How about that?


That's my preference - with a nice grep-able error message about an
unexpected attr size.

David
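
A minimal sketch of that stricter variant, assuming readn(), pr_err(),
bswap_32() and PERF_ATTR_SIZE_VER0 behave as in the patches quoted above;
read_attr_strict() is a hypothetical name, not the code that was merged:

static int read_attr_strict(int fd, struct perf_header *ph,
			    struct perf_file_attr *f_attr)
{
	struct perf_event_attr *attr = &f_attr->attr;
	size_t our_sz = sizeof(f_attr->attr);
	size_t sz, left;
	void *ptr;

	memset(f_attr, 0, sizeof(*f_attr));

	/* read the minimal guaranteed part of the struct */
	if (readn(fd, attr, PERF_ATTR_SIZE_VER0) <= 0)
		return -1;

	/* on-file perf_event_attr size */
	sz = attr->size;
	if (ph->needs_swap)
		sz = bswap_32(sz);

	if (sz == 0)		/* ABI0 convention: size 0 means VER0 */
		sz = PERF_ATTR_SIZE_VER0;

	if (sz > our_sz) {
		/* grep-able message: refuse newer, bigger attrs */
		pr_err("read_attr: unexpected attr size %zu, max %zu\n",
		       sz, our_sz);
		return -1;
	}

	/* read the remainder we know about, if any */
	left = sz - PERF_ATTR_SIZE_VER0;
	if (left) {
		ptr = attr;
		ptr += PERF_ATTR_SIZE_VER0;
		if (readn(fd, ptr, left) <= 0)
			return -1;
	}
	return 0;
}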

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [tip:perf/core] perf tools: fix endianness detection in perf.data
  2012-02-02 12:54 ` [PATCH v5 14/18] perf: fix endianness detection in perf.data Stephane Eranian
  2012-02-06 18:17   ` Arnaldo Carvalho de Melo
@ 2012-02-17  9:42   ` tip-bot for Stephane Eranian
  1 sibling, 0 replies; 43+ messages in thread
From: tip-bot for Stephane Eranian @ 2012-02-17  9:42 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, eranian, mingo, peterz, robert.richter, ravitillo,
	ming.m.lin, dsahern, tglx, vweaver1, asharma, hpa, linux-kernel,
	andi, khandual, mingo

Commit-ID:  73323f541fe5f55a3b8a5c3d565bfc1efd64abf6
Gitweb:     http://git.kernel.org/tip/73323f541fe5f55a3b8a5c3d565bfc1efd64abf6
Author:     Stephane Eranian <eranian@google.com>
AuthorDate: Thu, 2 Feb 2012 13:54:44 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 9 Feb 2012 12:28:10 -0200

perf tools: fix endianness detection in perf.data

The current version of perf detects whether or not the perf.data file is
written in a different endianness using the attr_size field in the
header of the file. This field represents sizeof(struct perf_event_attr)
as known to perf record. If the sizes do not match, then perf tries the
byte-swapped version. If they match, then the tool assumes a different
endianness.

The issue with the approach is that it assumes the size of
perf_event_attr always has to match between perf record and perf report.
However, the kernel perf_event ABI is extensible.  New fields can be
added to struct perf_event_attr. Consequently, it is not possible to use
attr_size to detect endianness.

This patch takes another approach by using the magic number written at
the beginning of the perf.data file to detect endianness. The magic
number is an eight-byte signature. Its primary purpose is to identify
(signature) a perf.data file, but it can also be used to encode the
endianness.

The patch introduces a new value for this signature. The key difference
is that the signature is written differently in the file depending on
the endianness. Thus, by comparing the signature from the file with the
tool's own signature it is possible to detect endianness. The new
signature is "PERFILE2".

Backward compatibility with existing perf.data files is ensured.
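
To make the mechanism concrete, a standalone sketch: the two constants are
the ones the patch below introduces; main() and the rest are illustrative
only, not the perf code itself:

#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* "PERFILE2" interpreted as a little-endian u64, and its byte-swapped
 * image, as introduced by the patch below.
 */
static const uint64_t magic2    = 0x32454c4946524550ULL;
static const uint64_t magic2_sw = 0x50455246494c4532ULL;

int main(void)
{
	uint64_t seen;

	/* simulate reading back the eight raw bytes that a little-endian
	 * writer produces when it stores magic2
	 */
	memcpy(&seen, "PERFILE2", sizeof(seen));

	if (seen == magic2)
		printf("same endianness as the writer, no swap needed\n");
	else if (seen == magic2_sw)
		printf("opposite endianness, set needs_swap\n");
	else
		printf("not a perf.data v2 file\n");
	return 0;
}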

Tested-by: David Ahern <dsahern@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roberto Agostino Vitillo <ravitillo@lbl.gov>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Vince Weaver <vweaver1@eecs.utk.edu>
Link: http://lkml.kernel.org/r/1328187288-24395-15-git-send-email-eranian@google.com
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/header.c |   77 ++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 64 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index ecd7f4d..6f4187d 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -63,9 +63,20 @@ char *perf_header__find_event(u64 id)
 	return NULL;
 }
 
-static const char *__perf_magic = "PERFFILE";
+/*
+ * magic2 = "PERFILE2"
+ * must be a numerical value to let the endianness
+ * determine the memory layout. That way we are able
+ * to detect endianness when reading the perf.data file
+ * back.
+ *
+ * we check for legacy (PERFFILE) format.
+ */
+static const char *__perf_magic1 = "PERFFILE";
+static const u64 __perf_magic2    = 0x32454c4946524550ULL;
+static const u64 __perf_magic2_sw = 0x50455246494c4532ULL;
 
-#define PERF_MAGIC	(*(u64 *)__perf_magic)
+#define PERF_MAGIC	__perf_magic2
 
 struct perf_file_attr {
 	struct perf_event_attr	attr;
@@ -1620,24 +1631,59 @@ out_free:
 	return err;
 }
 
+static int check_magic_endian(u64 *magic, struct perf_file_header *header,
+			      struct perf_header *ph)
+{
+	int ret;
+
+	/* check for legacy format */
+	ret = memcmp(magic, __perf_magic1, sizeof(*magic));
+	if (ret == 0) {
+		pr_debug("legacy perf.data format\n");
+		if (!header)
+			return -1;
+
+		if (header->attr_size != sizeof(struct perf_file_attr)) {
+			u64 attr_size = bswap_64(header->attr_size);
+
+			if (attr_size != sizeof(struct perf_file_attr))
+				return -1;
+
+			ph->needs_swap = true;
+		}
+		return 0;
+	}
+
+	/* check magic number with same endianness */
+	if (*magic == __perf_magic2)
+		return 0;
+
+	/* check magic number but opposite endianness */
+	if (*magic != __perf_magic2_sw)
+		return -1;
+
+	ph->needs_swap = true;
+
+	return 0;
+}
+
 int perf_file_header__read(struct perf_file_header *header,
 			   struct perf_header *ph, int fd)
 {
+	int ret;
+
 	lseek(fd, 0, SEEK_SET);
 
-	if (readn(fd, header, sizeof(*header)) <= 0 ||
-	    memcmp(&header->magic, __perf_magic, sizeof(header->magic)))
+	ret = readn(fd, header, sizeof(*header));
+	if (ret <= 0)
 		return -1;
 
-	if (header->attr_size != sizeof(struct perf_file_attr)) {
-		u64 attr_size = bswap_64(header->attr_size);
-
-		if (attr_size != sizeof(struct perf_file_attr))
-			return -1;
+	if (check_magic_endian(&header->magic, header, ph) < 0)
+		return -1;
 
+	if (ph->needs_swap) {
 		mem_bswap_64(header, offsetof(struct perf_file_header,
-					    adds_features));
-		ph->needs_swap = true;
+			     adds_features));
 	}
 
 	if (header->size != sizeof(*header)) {
@@ -1873,8 +1919,13 @@ static int perf_file_header__read_pipe(struct perf_pipe_file_header *header,
 				       struct perf_header *ph, int fd,
 				       bool repipe)
 {
-	if (readn(fd, header, sizeof(*header)) <= 0 ||
-	    memcmp(&header->magic, __perf_magic, sizeof(header->magic)))
+	int ret;
+
+	ret = readn(fd, header, sizeof(*header));
+	if (ret <= 0)
+		return -1;
+
+	if (check_magic_endian(&header->magic, NULL, ph) < 0)
 		return -1;
 
 	if (repipe && do_write(STDOUT_FILENO, header, sizeof(*header)) < 0)

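For files carrying the legacy "PERFFILE" magic, which is a plain byte
string and therefore reads the same in either byte order, endianness
still has to be inferred from attr_size, as in the original scheme.
Below is a hypothetical standalone sketch of that fallback (not the
perf source; EXPECTED_ATTR_SIZE is an illustrative stand-in for
sizeof(struct perf_file_attr)):

#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>
#include <byteswap.h>	/* bswap_64(), glibc */

/* illustrative stand-in for sizeof(struct perf_file_attr) */
#define EXPECTED_ATTR_SIZE 112ULL

/*
 * Returns 0 if attr_size is plausible in either byte order, -1 otherwise.
 * *needs_swap is set when the file was written by a host of the opposite
 * endianness.
 */
static int legacy_check(uint64_t attr_size, bool *needs_swap)
{
	*needs_swap = false;
	if (attr_size == EXPECTED_ATTR_SIZE)
		return 0;
	if (bswap_64(attr_size) == EXPECTED_ATTR_SIZE) {
		*needs_swap = true;
		return 0;
	}
	return -1;	/* newer ABI revision or a corrupt file */
}

int main(void)
{
	bool swap;

	/* e.g. attr_size read from a same-endian legacy file */
	if (legacy_check(EXPECTED_ATTR_SIZE, &swap) == 0)
		printf("legacy file accepted, needs_swap=%d\n", swap);
	return 0;
}

Note how this fallback exhibits the ambiguity described in the
changelog: a newer writer with a larger attr_size fails both
comparisons even when the byte order matches, which is why the
numerical magic is the more robust signal for the new format.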

Thread overview: 43+ messages
2012-02-02 12:54 [PATCH v5 00/18] perf: add support for sampling taken branches Stephane Eranian
2012-02-02 12:54 ` [PATCH v5 01/18] perf: add generic taken branch sampling support Stephane Eranian
2012-02-02 12:54 ` [PATCH v5 02/18] perf: add Intel LBR MSR definitions Stephane Eranian
2012-02-02 12:54 ` [PATCH v5 03/18] perf: add Intel X86 LBR sharing logic Stephane Eranian
2012-02-02 12:54 ` [PATCH v5 04/18] perf: sync branch stack sampling with X86 precise_sampling Stephane Eranian
2012-02-02 12:54 ` [PATCH v5 05/18] perf: add Intel X86 LBR mappings for PERF_SAMPLE_BRANCH filters Stephane Eranian
2012-02-02 12:54 ` [PATCH v5 06/18] perf: disable LBR support for older Intel Atom processors Stephane Eranian
2012-02-02 12:54 ` [PATCH v5 07/18] perf: implement PERF_SAMPLE_BRANCH for Intel X86 Stephane Eranian
2012-02-02 12:54 ` [PATCH v5 08/18] perf: add LBR software filter support " Stephane Eranian
2012-02-02 12:54 ` [PATCH v5 09/18] perf: disable PERF_SAMPLE_BRANCH_* when not supported Stephane Eranian
2012-02-06 19:23   ` Peter Zijlstra
2012-02-06 19:59     ` Stephane Eranian
2012-02-02 12:54 ` [PATCH v5 10/18] perf: add hook to flush branch_stack on context switch Stephane Eranian
2012-02-02 12:54 ` [PATCH v5 11/18] perf: add code to support PERF_SAMPLE_BRANCH_STACK Stephane Eranian
2012-02-06 18:06   ` Arnaldo Carvalho de Melo
2012-02-07 14:11     ` Stephane Eranian
2012-02-07 15:21       ` Arnaldo Carvalho de Melo
2012-02-02 12:54 ` [PATCH v5 12/18] perf: add support for sampling taken branch to perf record Stephane Eranian
2012-02-06 18:08   ` Arnaldo Carvalho de Melo
2012-02-02 12:54 ` [PATCH v5 13/18] perf: add support for taken branch sampling to perf report Stephane Eranian
2012-02-06 18:14   ` Arnaldo Carvalho de Melo
2012-02-02 12:54 ` [PATCH v5 14/18] perf: fix endianness detection in perf.data Stephane Eranian
2012-02-06 18:17   ` Arnaldo Carvalho de Melo
2012-02-06 18:18     ` Stephane Eranian
2012-02-06 21:47     ` David Ahern
2012-02-06 22:06       ` Arnaldo Carvalho de Melo
2012-02-06 22:29         ` David Ahern
2012-02-07 14:13           ` Stephane Eranian
2012-02-07 14:38             ` Arnaldo Carvalho de Melo
2012-02-17  9:42   ` [tip:perf/core] perf tools: " tip-bot for Stephane Eranian
2012-02-02 12:54 ` [PATCH v5 15/18] perf: add ABI reference sizes Stephane Eranian
2012-02-02 12:54 ` [PATCH v5 16/18] perf: enable reading of perf.data files from different ABI rev Stephane Eranian
2012-02-06 18:19   ` Arnaldo Carvalho de Melo
2012-02-06 18:22   ` Arnaldo Carvalho de Melo
2012-02-07  7:03     ` Anshuman Khandual
2012-02-07 14:52       ` Arnaldo Carvalho de Melo
2012-02-06 22:19   ` David Ahern
2012-02-07 15:50     ` Stephane Eranian
2012-02-07 16:41       ` David Ahern
2012-02-07 17:42         ` Stephane Eranian
2012-02-07 17:57           ` David Ahern
2012-02-02 12:54 ` [PATCH v5 17/18] perf: fix bug print_event_desc() Stephane Eranian
2012-02-02 12:54 ` [PATCH v5 18/18] perf: make perf able to read file from older ABIs Stephane Eranian
