* [PATCH 0/4] perf: add ability to sample direct call branches
@ 2015-10-13  7:09 Stephane Eranian
  2015-10-13  7:09 ` [PATCH 1/4] perf: add PERF_SAMPLE_BRANCH_CALL Stephane Eranian
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Stephane Eranian @ 2015-10-13  7:09 UTC (permalink / raw)
  To: linux-kernel; +Cc: acme, peterz, mingo, ak, jolsa, namhyung, khandual

This short patch series improves the perf_events interface by providing
a new branch_sample_type bit to sample only direct call branches. Up
until now, you could specify PERF_SAMPLE_BRANCH_ANY_CALL (any calls) or
PERF_SAMPLE_BRANCH_IND_CALL (indirect calls). But there was no way to 
sample only direct calls. This series adds PERF_SAMPLE_BRANCH_CALL.

This covers direct function calls (including zero-length calls) but not
syscalls. It is meant for those who want to analyze direct calls only.

The series includes the generic kernel code changes, the x86 support based
on the LBR hardware filter (or the software filter), and the PPC check.

The series also includes the changes to perf record to support the new filter:

    $ perf record -j call -e cycles ......
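
For readers who call perf_event_open(2) directly instead of going through
perf record, the request might look roughly like the sketch below. This is
only an illustration under stated assumptions: the helper name
init_direct_call_attr, the sampling period, and the locally defined
BRANCH_CALL_BIT (mirroring the 1U << 13 value added in patch 1, for
installed headers that predate the series) are not part of the patches.

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/* Value from patch 1 (1U << 13), defined locally in case the installed
 * uapi header predates PERF_SAMPLE_BRANCH_CALL. */
#define BRANCH_CALL_BIT (1ULL << 13)

/* Hypothetical helper: sample cycles and record only user-level direct
 * calls in the branch stack. */
static void init_direct_call_attr(struct perf_event_attr *attr)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->sample_period = 100000;	/* arbitrary, for illustration only */
	attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_BRANCH_STACK;
	attr->branch_sample_type = BRANCH_CALL_BIT | PERF_SAMPLE_BRANCH_USER;
}
```

The filled structure would then be handed to the perf_event_open system
call. On a kernel or CPU without support for the new filter, that call is
expected to fail rather than silently fall back to another branch type.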

Patch is relative to tip.git @ commit e6f195f Merge branch 'ras/core'

Stephane Eranian (4):
  perf: add PERF_SAMPLE_BRANCH_CALL
  perf/x86: add support for PERF_SAMPLE_BRANCH_CALL
  perf/powerpc: add support for PERF_SAMPLE_BRANCH_CALL
  perf record: add ability to sample call branches

 arch/powerpc/perf/power8-pmu.c             | 3 +++
 arch/x86/kernel/cpu/perf_event_intel_lbr.c | 4 ++++
 include/uapi/linux/perf_event.h            | 2 ++
 tools/perf/Documentation/perf-record.txt   | 1 +
 tools/perf/util/parse-branch-options.c     | 1 +
 5 files changed, 11 insertions(+)

-- 
1.9.1



* [PATCH 1/4] perf: add PERF_SAMPLE_BRANCH_CALL
  2015-10-13  7:09 [PATCH 0/4] perf: add ability to sample direct call branches Stephane Eranian
@ 2015-10-13  7:09 ` Stephane Eranian
  2015-10-20  9:35   ` [tip:perf/core] perf: Add PERF_SAMPLE_BRANCH_CALL tip-bot for Stephane Eranian
  2015-10-13  7:09 ` [PATCH 2/4] perf/x86: add support for PERF_SAMPLE_BRANCH_CALL Stephane Eranian
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Stephane Eranian @ 2015-10-13  7:09 UTC (permalink / raw)
  To: linux-kernel; +Cc: acme, peterz, mingo, ak, jolsa, namhyung, khandual

Add a new branch sample type to cover only call branches (function calls).
The existing ANY_CALL type covers direct calls, indirect calls, and far jumps.

We want to be able to differentiate indirect from direct calls. Therefore
we introduce PERF_SAMPLE_BRANCH_CALL. The implementation is up to each
architecture.
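
As a rough illustration of how the shift and mask enums fit together, here
is a shortened, standalone mirror of the two enums touched by the diff
below. All SKETCH_ names and the sketch_mask_is_valid() helper are
hypothetical. The range check is only an assumption about how a consumer
might reject bits at or above the MAX value.

```c
#include <assert.h>

/* Hypothetical, shortened mirror of the uapi shift enum. */
enum sketch_branch_shift {
	SKETCH_CALL_STACK_SHIFT	= 11,	/* call/ret stack */
	SKETCH_IND_JUMP_SHIFT	= 12,	/* indirect jumps */
	SKETCH_CALL_SHIFT	= 13,	/* direct call (new) */

	SKETCH_MAX_SHIFT		/* implicitly one past the last shift */
};

/* Adding a new shift before MAX_SHIFT bumps MAX_SHIFT automatically. */
static_assert(SKETCH_MAX_SHIFT == 14, "MAX_SHIFT follows the last shift");

enum sketch_branch_type {
	SKETCH_IND_JUMP	= 1U << SKETCH_IND_JUMP_SHIFT,
	SKETCH_CALL	= 1U << SKETCH_CALL_SHIFT,
	SKETCH_MAX	= 1U << SKETCH_MAX_SHIFT,
};

/* Return nonzero if the mask only uses bits below SKETCH_MAX. */
static int sketch_mask_is_valid(unsigned long long mask)
{
	return (mask & ~((unsigned long long)SKETCH_MAX - 1)) == 0;
}
```

Because MAX_SHIFT has no explicit value, the new entry needs no change to
the MAX definitions themselves.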

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 include/uapi/linux/perf_event.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 2881145..e6c1b47 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -168,6 +168,7 @@ enum perf_branch_sample_type_shift {
 
 	PERF_SAMPLE_BRANCH_CALL_STACK_SHIFT	= 11, /* call/ret stack */
 	PERF_SAMPLE_BRANCH_IND_JUMP_SHIFT	= 12, /* indirect jumps */
+	PERF_SAMPLE_BRANCH_CALL_SHIFT		= 13, /* direct call */
 
 	PERF_SAMPLE_BRANCH_MAX_SHIFT		/* non-ABI */
 };
@@ -188,6 +189,7 @@ enum perf_branch_sample_type {
 
 	PERF_SAMPLE_BRANCH_CALL_STACK	= 1U << PERF_SAMPLE_BRANCH_CALL_STACK_SHIFT,
 	PERF_SAMPLE_BRANCH_IND_JUMP	= 1U << PERF_SAMPLE_BRANCH_IND_JUMP_SHIFT,
+	PERF_SAMPLE_BRANCH_CALL		= 1U << PERF_SAMPLE_BRANCH_CALL_SHIFT,
 
 	PERF_SAMPLE_BRANCH_MAX		= 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
 };
-- 
1.9.1



* [PATCH 2/4] perf/x86: add support for PERF_SAMPLE_BRANCH_CALL
  2015-10-13  7:09 [PATCH 0/4] perf: add ability to sample direct call branches Stephane Eranian
  2015-10-13  7:09 ` [PATCH 1/4] perf: add PERF_SAMPLE_BRANCH_CALL Stephane Eranian
@ 2015-10-13  7:09 ` Stephane Eranian
  2015-10-13 13:40   ` Ingo Molnar
  2015-10-20  9:36   ` [tip:perf/core] perf/x86: Add " tip-bot for Stephane Eranian
  2015-10-13  7:09 ` [PATCH 3/4] perf/powerpc: add " Stephane Eranian
  2015-10-13  7:09 ` [PATCH 4/4] perf record: add ability to sample call branches Stephane Eranian
  3 siblings, 2 replies; 12+ messages in thread
From: Stephane Eranian @ 2015-10-13  7:09 UTC (permalink / raw)
  To: linux-kernel; +Cc: acme, peterz, mingo, ak, jolsa, namhyung, khandual

This patch enables support for PERF_SAMPLE_BRANCH_CALL on Intel x86
processors. When the processor supports LBR filtering, the selection
is done in hardware. Otherwise, the filter is applied by software.
Note that we chose to include zero-length calls because they also
represent calls.
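
The two paths can be sketched in standalone C as follows. This is a
simplified model, not the driver code: every SK_ name is hypothetical, the
numeric values chosen for the LBR_ and X86_BR_ style flags are
placeholders, and the handling of unsupported entries and of the actual
LBR select register encoding is left out. The shift values 6 (indirect
call) and 13 (direct call) follow the uapi header as I understand it.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder flag values; the real constants live in the LBR driver. */
enum {
	SK_LBR_REL_CALL		= 1 << 0,
	SK_LBR_IND_CALL		= 1 << 1,
	SK_X86_BR_CALL		= 1 << 0,
	SK_X86_BR_IND_CALL	= 1 << 1,
	SK_X86_BR_ZERO_CALL	= 1 << 2,
};

#define SK_IND_CALL_SHIFT	6
#define SK_CALL_SHIFT		13
#define SK_MAX_SHIFT		14

/* Hardware path: one selector entry per branch_sample_type bit. */
static const int sk_lbr_sel_map[SK_MAX_SHIFT] = {
	[SK_IND_CALL_SHIFT]	= SK_LBR_IND_CALL,
	[SK_CALL_SHIFT]		= SK_LBR_REL_CALL,
};

/* OR together the selector bits of every requested branch type. */
static int sk_hw_filter(uint64_t br_type)
{
	int sel = 0;

	assert(br_type < (1ULL << SK_MAX_SHIFT));
	for (int i = 0; i < SK_MAX_SHIFT; i++)
		if (br_type & (1ULL << i))
			sel |= sk_lbr_sel_map[i];
	return sel;
}

/* Software path: build the mask later matched against decoded branches. */
static int sk_sw_filter(uint64_t br_type)
{
	int mask = 0;

	if (br_type & (1ULL << SK_IND_CALL_SHIFT))
		mask |= SK_X86_BR_IND_CALL;
	if (br_type & (1ULL << SK_CALL_SHIFT))
		mask |= SK_X86_BR_CALL | SK_X86_BR_ZERO_CALL;
	return mask;
}
```

In this model only the software mask distinguishes zero-length calls. The
hardware selector has a single relative-call bit for the new type.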

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 arch/x86/kernel/cpu/perf_event_intel_lbr.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index ad0b8b0..bfd0b71 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -555,6 +555,8 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
 	if (br_type & PERF_SAMPLE_BRANCH_IND_JUMP)
 		mask |= X86_BR_IND_JMP;
 
+	if (br_type & PERF_SAMPLE_BRANCH_CALL)
+		mask |= X86_BR_CALL | X86_BR_ZERO_CALL;
 	/*
 	 * stash actual user request into reg, it may
 	 * be used by fixup code for some CPU
@@ -890,6 +892,7 @@ static const int snb_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX_SHIFT] = {
 	[PERF_SAMPLE_BRANCH_IND_CALL_SHIFT]	= LBR_IND_CALL,
 	[PERF_SAMPLE_BRANCH_COND_SHIFT]		= LBR_JCC,
 	[PERF_SAMPLE_BRANCH_IND_JUMP_SHIFT]	= LBR_IND_JMP,
+	[PERF_SAMPLE_BRANCH_CALL_SHIFT]		= LBR_REL_CALL,
 };
 
 static const int hsw_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX_SHIFT] = {
@@ -905,6 +908,7 @@ static const int hsw_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX_SHIFT] = {
 	[PERF_SAMPLE_BRANCH_CALL_STACK_SHIFT]	= LBR_REL_CALL | LBR_IND_CALL
 						| LBR_RETURN | LBR_CALL_STACK,
 	[PERF_SAMPLE_BRANCH_IND_JUMP_SHIFT]	= LBR_IND_JMP,
+	[PERF_SAMPLE_BRANCH_CALL_SHIFT]		= LBR_REL_CALL,
 };
 
 /* core */
-- 
1.9.1



* [PATCH 3/4] perf/powerpc: add support for PERF_SAMPLE_BRANCH_CALL
  2015-10-13  7:09 [PATCH 0/4] perf: add ability to sample direct call branches Stephane Eranian
  2015-10-13  7:09 ` [PATCH 1/4] perf: add PERF_SAMPLE_BRANCH_CALL Stephane Eranian
  2015-10-13  7:09 ` [PATCH 2/4] perf/x86: add support for PERF_SAMPLE_BRANCH_CALL Stephane Eranian
@ 2015-10-13  7:09 ` Stephane Eranian
  2015-10-20  9:36   ` [tip:perf/core] perf/powerpc: Add " tip-bot for Stephane Eranian
  2015-10-13  7:09 ` [PATCH 4/4] perf record: add ability to sample call branches Stephane Eranian
  3 siblings, 1 reply; 12+ messages in thread
From: Stephane Eranian @ 2015-10-13  7:09 UTC (permalink / raw)
  To: linux-kernel; +Cc: acme, peterz, mingo, ak, jolsa, namhyung, khandual

This patch makes the POWER8 BHRB filter map reject PERF_SAMPLE_BRANCH_CALL
(by returning -1), because it is not clear whether the hardware actually
supports it.
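
A standalone sketch of the rejection logic is shown below. The SK_ names
are hypothetical, the filter bit value is a placeholder, and only the
three checks visible in the diff are modeled. The caller that turns the
-1 return value into -EOPNOTSUPP is an assumption about how such a result
would typically be consumed, not something this patch adds.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SK_BR_ANY_CALL		(1ULL << 4)
#define SK_BR_IND_CALL		(1ULL << 6)
#define SK_BR_CALL		(1ULL << 13)
#define SK_FILTER_ANY_CALL	0x1ULL		/* placeholder filter bits */
#define SK_FILTER_UNSUPPORTED	((uint64_t)-1)	/* -1 seen through a u64 */

/* A u64 return value of -1 is simply the all-ones pattern. */
static_assert(SK_FILTER_UNSUPPORTED == UINT64_MAX, "-1 maps to all ones");

/* Map a branch_sample_type to a filter value, or to "unsupported". */
static uint64_t sk_bhrb_filter_map(uint64_t branch_sample_type)
{
	if (branch_sample_type & SK_BR_IND_CALL)
		return SK_FILTER_UNSUPPORTED;
	if (branch_sample_type & SK_BR_CALL)
		return SK_FILTER_UNSUPPORTED;
	if (branch_sample_type & SK_BR_ANY_CALL)
		return SK_FILTER_ANY_CALL;
	return SK_FILTER_UNSUPPORTED;
}

/* Hypothetical caller: refuse the event when no filter is available. */
static int sk_event_init(uint64_t branch_sample_type)
{
	uint64_t filter = sk_bhrb_filter_map(branch_sample_type);

	return filter == SK_FILTER_UNSUPPORTED ? -EOPNOTSUPP : 0;
}
```

Since the direct-call check comes before the any-call check, a request
combining both bits is also rejected in this model.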

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 arch/powerpc/perf/power8-pmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index 396351d..7d5e295 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -676,6 +676,9 @@ static u64 power8_bhrb_filter_map(u64 branch_sample_type)
 	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
 		return -1;
 
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_CALL)
+		return -1;
+
 	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
 		pmu_bhrb_filter |= POWER8_MMCRA_IFM1;
 		return pmu_bhrb_filter;
-- 
1.9.1



* [PATCH 4/4] perf record: add ability to sample call branches
  2015-10-13  7:09 [PATCH 0/4] perf: add ability to sample direct call branches Stephane Eranian
                   ` (2 preceding siblings ...)
  2015-10-13  7:09 ` [PATCH 3/4] perf/powerpc: add " Stephane Eranian
@ 2015-10-13  7:09 ` Stephane Eranian
  2015-10-20  9:36   ` [tip:perf/core] perf record: Add " tip-bot for Stephane Eranian
  3 siblings, 1 reply; 12+ messages in thread
From: Stephane Eranian @ 2015-10-13  7:09 UTC (permalink / raw)
  To: linux-kernel; +Cc: acme, peterz, mingo, ak, jolsa, namhyung, khandual

This patch adds a new branch type sampling filter to perf record.
It is named 'call' and maps to PERF_SAMPLE_BRANCH_CALL. It samples
direct call branches only, unlike 'any_call' which includes indirect
calls as well.

 $ perf record -j call -e cycles .....

The man page is updated accordingly.
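
To illustrate how a -j argument such as "call,u" could be turned into a
branch_sample_type mask, here is a simplified standalone sketch. It is not
the perf tool's parser: the sk_ names are hypothetical, only a handful of
filter names are listed, matching is case-sensitive, and no default
privilege levels are added.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct sk_branch_mode {
	const char *name;
	uint64_t mode;
};

/* Subset of filter names with their uapi bit positions. */
static const struct sk_branch_mode sk_branch_modes[] = {
	{ "u",        1ULL << 0 },
	{ "k",        1ULL << 1 },
	{ "any_call", 1ULL << 4 },
	{ "ind_call", 1ULL << 6 },
	{ "call",     1ULL << 13 },	/* new entry from this patch */
	{ NULL, 0 }
};

/* Parse a comma-separated list; return 0 on success, -1 on unknown name. */
static int sk_parse_branch_str(const char *str, uint64_t *mask)
{
	assert(str != NULL && mask != NULL);
	*mask = 0;

	while (*str) {
		size_t len = strcspn(str, ",");
		const struct sk_branch_mode *br;

		for (br = sk_branch_modes; br->name; br++)
			if (strlen(br->name) == len &&
			    strncmp(str, br->name, len) == 0)
				break;
		if (!br->name)
			return -1;	/* unknown or empty filter name */
		*mask |= br->mode;

		str += len;
		if (*str == ',')
			str++;
	}
	return 0;
}
```

With this table, "call" and "any_call" resolve to different bits, which is
the distinction the new filter name exposes on the command line.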

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/Documentation/perf-record.txt | 1 +
 tools/perf/util/parse-branch-options.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 2e9ce77..b027d28 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -236,6 +236,7 @@ It is possible to select the types of branches captured by enabling filters. The
         - any_call: any function call or system call
         - any_ret: any function return or system call return
         - ind_call: any indirect branch
+        - call: direct calls, including far (to/from kernel) calls
         - u:  only when the branch target is at the user level
         - k: only when the branch target is in the kernel
         - hv: only when the target is at the hypervisor level
diff --git a/tools/perf/util/parse-branch-options.c b/tools/perf/util/parse-branch-options.c
index a3b1e13..355eecf 100644
--- a/tools/perf/util/parse-branch-options.c
+++ b/tools/perf/util/parse-branch-options.c
@@ -27,6 +27,7 @@ static const struct branch_mode branch_modes[] = {
 	BRANCH_OPT("no_tx", PERF_SAMPLE_BRANCH_NO_TX),
 	BRANCH_OPT("cond", PERF_SAMPLE_BRANCH_COND),
 	BRANCH_OPT("ind_jmp", PERF_SAMPLE_BRANCH_IND_JUMP),
+	BRANCH_OPT("call", PERF_SAMPLE_BRANCH_CALL),
 	BRANCH_END
 };
 
-- 
1.9.1



* Re: [PATCH 2/4] perf/x86: add support for PERF_SAMPLE_BRANCH_CALL
  2015-10-13  7:09 ` [PATCH 2/4] perf/x86: add support for PERF_SAMPLE_BRANCH_CALL Stephane Eranian
@ 2015-10-13 13:40   ` Ingo Molnar
  2015-10-13 15:40     ` Andi Kleen
  2015-10-14  0:39     ` Stephane Eranian
  2015-10-20  9:36   ` [tip:perf/core] perf/x86: Add " tip-bot for Stephane Eranian
  1 sibling, 2 replies; 12+ messages in thread
From: Ingo Molnar @ 2015-10-13 13:40 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: linux-kernel, acme, peterz, mingo, ak, jolsa, namhyung, khandual


* Stephane Eranian <eranian@google.com> wrote:

> This patch enables support for PERF_SAMPLE_BRANCH_CALL on Intel x86
> processors. When the processor supports LBR filtering, the selection
> is done in hardware. Otherwise, the filter is applied by software.
> Note that we chose to include zero-length calls because they also
> represent calls.
> 
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  arch/x86/kernel/cpu/perf_event_intel_lbr.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> index ad0b8b0..bfd0b71 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> @@ -555,6 +555,8 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
>  	if (br_type & PERF_SAMPLE_BRANCH_IND_JUMP)
>  		mask |= X86_BR_IND_JMP;
>  
> +	if (br_type & PERF_SAMPLE_BRANCH_CALL)
> +		mask |= X86_BR_CALL | X86_BR_ZERO_CALL;

I'm wondering how frequent zero-length calls are. If they still occur in typical 
user-space, would it make sense to also have a separate branch sampling type for 
zero length calls?

Intel documents zero length calls as ones that (ab-)use the call instruction to 
push the current IP on the stack:

	call next_addr
next_addr:
	pop %reg

which can take over 10 cycles on certain microarchitectures (and it unbalances 
whatever call stack tracking/caching the CPU does as well).

So it might make sense to analyze them separately. I guess that's the reason why 
Intel added a separate flag for them in the PMU.

Thanks,

	Ingo


* Re: [PATCH 2/4] perf/x86: add support for PERF_SAMPLE_BRANCH_CALL
  2015-10-13 13:40   ` Ingo Molnar
@ 2015-10-13 15:40     ` Andi Kleen
  2015-10-14  0:39     ` Stephane Eranian
  1 sibling, 0 replies; 12+ messages in thread
From: Andi Kleen @ 2015-10-13 15:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Stephane Eranian, linux-kernel, acme, peterz, mingo, jolsa,
	namhyung, khandual

> I'm wondering how frequent zero-length calls are. If they still occur in typical 
> user-space, would it make sense to also have a separate branch sampling type for 
> zero length calls?

Apparently 32-bit PIC binaries compiled with not-too-old versions of icc
still contain them. For gcc this was fixed much longer ago.

But I'm not sure it's that interesting to sample by itself.

> push the current IP on the stack:
> 
> 	call next_addr
> next_addr:
> 	pop %reg
> 
> which can take over 10 cycles on certain microarchitectures (and it unbalances 
> whatever call stack tracking/caching the CPU does as well).
> 
> So it might make sense to analyze them separately. I guess that's the reason why 
> Intel added a separate flag for them in the PMU.

X86_BR_ZERO_CALL is only a software filter. There's no direct support for it
in the Intel hardware. It was added to make the LBR call stack more reliable,
which otherwise gets messed up by the zero length calls.
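
A software check for this pattern can be sketched as below. It is a toy
classifier under stated assumptions: the sk_ names are hypothetical, and
only the 5-byte "call rel32" encoding (opcode 0xe8) is recognized, with no
prefix handling. A real filter would use a full instruction decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum sk_br_kind { SK_BR_NONE, SK_BR_CALL, SK_BR_ZERO_CALL };

/* Classify the instruction starting at code[0], given len valid bytes. */
static enum sk_br_kind sk_classify_call(const uint8_t *code, size_t len)
{
	uint32_t rel;

	assert(code != NULL);
	if (len < 5 || code[0] != 0xe8)
		return SK_BR_NONE;

	/* rel32 displacement, stored little-endian after the opcode */
	rel = (uint32_t)code[1] |
	      ((uint32_t)code[2] << 8) |
	      ((uint32_t)code[3] << 16) |
	      ((uint32_t)code[4] << 24);

	/* A zero displacement targets the next instruction. */
	return rel == 0 ? SK_BR_ZERO_CALL : SK_BR_CALL;
}
```

A displacement of zero means the call lands on the instruction that
follows it, which is exactly the "call next_addr; pop %reg" idiom quoted
above.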

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only


* Re: [PATCH 2/4] perf/x86: add support for PERF_SAMPLE_BRANCH_CALL
  2015-10-13 13:40   ` Ingo Molnar
  2015-10-13 15:40     ` Andi Kleen
@ 2015-10-14  0:39     ` Stephane Eranian
  1 sibling, 0 replies; 12+ messages in thread
From: Stephane Eranian @ 2015-10-14  0:39 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Arnaldo Carvalho de Melo, Peter Zijlstra, mingo, ak,
	Jiri Olsa, Namhyung Kim, Anshuman Khandual

On Tue, Oct 13, 2015 at 6:40 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Stephane Eranian <eranian@google.com> wrote:
>
> > This patch enables support for PERF_SAMPLE_BRANCH_CALL on Intel x86
> > processors. When the processor supports LBR filtering, the selection
> > is done in hardware. Otherwise, the filter is applied by software.
> > Note that we chose to include zero-length calls because they also
> > represent calls.
> >
> > Signed-off-by: Stephane Eranian <eranian@google.com>
> > ---
> >  arch/x86/kernel/cpu/perf_event_intel_lbr.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> > index ad0b8b0..bfd0b71 100644
> > --- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> > +++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> > @@ -555,6 +555,8 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
> >       if (br_type & PERF_SAMPLE_BRANCH_IND_JUMP)
> >               mask |= X86_BR_IND_JMP;
> >
> > +     if (br_type & PERF_SAMPLE_BRANCH_CALL)
> > +             mask |= X86_BR_CALL | X86_BR_ZERO_CALL;
>
> I'm wondering how frequent zero-length calls are. If they still occur in typical
> user-space, would it make sense to also have a separate branch sampling type for
> zero length calls?
>
We could add that. It would rely on the sw filter to catch only the zero
calls, as Andi mentioned. But I am wondering about the data quality, because
we would catch zero calls without being able to determine how many we sampled
vs. how many have occurred. There is no PMU event counting zero call branches.

> Intel documents zero length calls as ones that (ab-)use the call instruction to
> push the current IP on the stack:
>
>         call next_addr
> next_addr:
>         pop %reg
>
> which can take over 10 cycles on certain microarchitectures (and it unbalances
> whatever call stack tracking/caching the CPU does as well).
>
> So it might make sense to analyze them separately. I guess that's the reason why
> Intel added a separate flag for them in the PMU.
>
> Thanks,
>
>         Ingo


* [tip:perf/core] perf: Add PERF_SAMPLE_BRANCH_CALL
  2015-10-13  7:09 ` [PATCH 1/4] perf: add PERF_SAMPLE_BRANCH_CALL Stephane Eranian
@ 2015-10-20  9:35   ` tip-bot for Stephane Eranian
  0 siblings, 0 replies; 12+ messages in thread
From: tip-bot for Stephane Eranian @ 2015-10-20  9:35 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, torvalds, acme, vincent.weaver, linux-kernel, tglx,
	dsahern, jolsa, mingo, namhyung, eranian, hpa

Commit-ID:  c229bf9dc179d2023e185c0f705bdf68484c1e73
Gitweb:     http://git.kernel.org/tip/c229bf9dc179d2023e185c0f705bdf68484c1e73
Author:     Stephane Eranian <eranian@google.com>
AuthorDate: Tue, 13 Oct 2015 09:09:08 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 20 Oct 2015 10:30:53 +0200

perf: Add PERF_SAMPLE_BRANCH_CALL

Add a new branch sample type to cover only call branches (function calls).
The existing ANY_CALL type covers direct calls, indirect calls, and far jumps.

We want to be able to differentiate indirect from direct calls. Therefore
we introduce PERF_SAMPLE_BRANCH_CALL. The implementation is up to each
architecture.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: khandual@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1444720151-10275-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/uapi/linux/perf_event.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 6c72e72..6512213 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -168,6 +168,7 @@ enum perf_branch_sample_type_shift {
 
 	PERF_SAMPLE_BRANCH_CALL_STACK_SHIFT	= 11, /* call/ret stack */
 	PERF_SAMPLE_BRANCH_IND_JUMP_SHIFT	= 12, /* indirect jumps */
+	PERF_SAMPLE_BRANCH_CALL_SHIFT		= 13, /* direct call */
 
 	PERF_SAMPLE_BRANCH_MAX_SHIFT		/* non-ABI */
 };
@@ -188,6 +189,7 @@ enum perf_branch_sample_type {
 
 	PERF_SAMPLE_BRANCH_CALL_STACK	= 1U << PERF_SAMPLE_BRANCH_CALL_STACK_SHIFT,
 	PERF_SAMPLE_BRANCH_IND_JUMP	= 1U << PERF_SAMPLE_BRANCH_IND_JUMP_SHIFT,
+	PERF_SAMPLE_BRANCH_CALL		= 1U << PERF_SAMPLE_BRANCH_CALL_SHIFT,
 
 	PERF_SAMPLE_BRANCH_MAX		= 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
 };


* [tip:perf/core] perf/x86: Add support for PERF_SAMPLE_BRANCH_CALL
  2015-10-13  7:09 ` [PATCH 2/4] perf/x86: add support for PERF_SAMPLE_BRANCH_CALL Stephane Eranian
  2015-10-13 13:40   ` Ingo Molnar
@ 2015-10-20  9:36   ` tip-bot for Stephane Eranian
  1 sibling, 0 replies; 12+ messages in thread
From: tip-bot for Stephane Eranian @ 2015-10-20  9:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: torvalds, vincent.weaver, jolsa, namhyung, dsahern, linux-kernel,
	eranian, mingo, tglx, hpa, peterz, acme

Commit-ID:  d892819faa6860d469aae71d70c336b391c25505
Gitweb:     http://git.kernel.org/tip/d892819faa6860d469aae71d70c336b391c25505
Author:     Stephane Eranian <eranian@google.com>
AuthorDate: Tue, 13 Oct 2015 09:09:09 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 20 Oct 2015 10:30:53 +0200

perf/x86: Add support for PERF_SAMPLE_BRANCH_CALL

This patch enables support for PERF_SAMPLE_BRANCH_CALL on Intel x86
processors. When the processor supports LBR filtering, the selection
is done in hardware. Otherwise, the filter is applied by software.
Note that we chose to include zero-length calls because they also
represent calls.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: khandual@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1444720151-10275-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/cpu/perf_event_intel_lbr.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index ad0b8b0..bfd0b71 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -555,6 +555,8 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
 	if (br_type & PERF_SAMPLE_BRANCH_IND_JUMP)
 		mask |= X86_BR_IND_JMP;
 
+	if (br_type & PERF_SAMPLE_BRANCH_CALL)
+		mask |= X86_BR_CALL | X86_BR_ZERO_CALL;
 	/*
 	 * stash actual user request into reg, it may
 	 * be used by fixup code for some CPU
@@ -890,6 +892,7 @@ static const int snb_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX_SHIFT] = {
 	[PERF_SAMPLE_BRANCH_IND_CALL_SHIFT]	= LBR_IND_CALL,
 	[PERF_SAMPLE_BRANCH_COND_SHIFT]		= LBR_JCC,
 	[PERF_SAMPLE_BRANCH_IND_JUMP_SHIFT]	= LBR_IND_JMP,
+	[PERF_SAMPLE_BRANCH_CALL_SHIFT]		= LBR_REL_CALL,
 };
 
 static const int hsw_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX_SHIFT] = {
@@ -905,6 +908,7 @@ static const int hsw_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX_SHIFT] = {
 	[PERF_SAMPLE_BRANCH_CALL_STACK_SHIFT]	= LBR_REL_CALL | LBR_IND_CALL
 						| LBR_RETURN | LBR_CALL_STACK,
 	[PERF_SAMPLE_BRANCH_IND_JUMP_SHIFT]	= LBR_IND_JMP,
+	[PERF_SAMPLE_BRANCH_CALL_SHIFT]		= LBR_REL_CALL,
 };
 
 /* core */


* [tip:perf/core] perf/powerpc: Add support for PERF_SAMPLE_BRANCH_CALL
  2015-10-13  7:09 ` [PATCH 3/4] perf/powerpc: add " Stephane Eranian
@ 2015-10-20  9:36   ` tip-bot for Stephane Eranian
  0 siblings, 0 replies; 12+ messages in thread
From: tip-bot for Stephane Eranian @ 2015-10-20  9:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: eranian, torvalds, tglx, hpa, linux-kernel, mingo, dsahern, acme,
	peterz, jolsa, namhyung, vincent.weaver

Commit-ID:  24f1a79a5fc10858e05ee0bf651ec99abfc0319b
Gitweb:     http://git.kernel.org/tip/24f1a79a5fc10858e05ee0bf651ec99abfc0319b
Author:     Stephane Eranian <eranian@google.com>
AuthorDate: Tue, 13 Oct 2015 09:09:10 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 20 Oct 2015 10:30:54 +0200

perf/powerpc: Add support for PERF_SAMPLE_BRANCH_CALL

This patch makes the POWER8 BHRB filter map reject PERF_SAMPLE_BRANCH_CALL
(by returning -1), because it is not clear whether the hardware actually
supports it.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: khandual@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1444720151-10275-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/powerpc/perf/power8-pmu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index 396351d..7d5e295 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -676,6 +676,9 @@ static u64 power8_bhrb_filter_map(u64 branch_sample_type)
 	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
 		return -1;
 
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_CALL)
+		return -1;
+
 	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
 		pmu_bhrb_filter |= POWER8_MMCRA_IFM1;
 		return pmu_bhrb_filter;


* [tip:perf/core] perf record: Add ability to sample call branches
  2015-10-13  7:09 ` [PATCH 4/4] perf record: add ability to sample call branches Stephane Eranian
@ 2015-10-20  9:36   ` tip-bot for Stephane Eranian
  0 siblings, 0 replies; 12+ messages in thread
From: tip-bot for Stephane Eranian @ 2015-10-20  9:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, dsahern, hpa, acme, mingo, eranian, linux-kernel, jolsa,
	torvalds, vincent.weaver, namhyung, tglx

Commit-ID:  43e41adc9e8c36545888d78fed2ef8d102a938dc
Gitweb:     http://git.kernel.org/tip/43e41adc9e8c36545888d78fed2ef8d102a938dc
Author:     Stephane Eranian <eranian@google.com>
AuthorDate: Tue, 13 Oct 2015 09:09:11 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 20 Oct 2015 10:30:55 +0200

perf record: Add ability to sample call branches

This patch adds a new branch type sampling filter to perf record.
It is named 'call' and maps to PERF_SAMPLE_BRANCH_CALL. It samples
direct call branches only, unlike 'any_call' which includes indirect
calls as well.

 $ perf record -j call -e cycles .....

The man page is updated accordingly.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: khandual@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1444720151-10275-5-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/perf/Documentation/perf-record.txt | 1 +
 tools/perf/util/parse-branch-options.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 2e9ce77..b027d28 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -236,6 +236,7 @@ following filters are defined:
         - any_call: any function call or system call
         - any_ret: any function return or system call return
         - ind_call: any indirect branch
+        - call: direct calls, including far (to/from kernel) calls
         - u:  only when the branch target is at the user level
         - k: only when the branch target is in the kernel
         - hv: only when the target is at the hypervisor level
diff --git a/tools/perf/util/parse-branch-options.c b/tools/perf/util/parse-branch-options.c
index a3b1e13..355eecf 100644
--- a/tools/perf/util/parse-branch-options.c
+++ b/tools/perf/util/parse-branch-options.c
@@ -27,6 +27,7 @@ static const struct branch_mode branch_modes[] = {
 	BRANCH_OPT("no_tx", PERF_SAMPLE_BRANCH_NO_TX),
 	BRANCH_OPT("cond", PERF_SAMPLE_BRANCH_COND),
 	BRANCH_OPT("ind_jmp", PERF_SAMPLE_BRANCH_IND_JUMP),
+	BRANCH_OPT("call", PERF_SAMPLE_BRANCH_CALL),
 	BRANCH_END
 };
 


