linux-kernel.vger.kernel.org archive mirror
* [RESEND PATCH V5 0/2] Stitch LBR call stack (kernel)
@ 2020-01-16 15:57 kan.liang
  2020-01-16 15:57 ` [RESEND PATCH V5 1/2] perf/core: Add new branch sample type for HW index of raw branch records kan.liang
  2020-01-16 15:57 ` [RESEND PATCH V5 2/2] perf/x86/intel: Output LBR TOS information kan.liang
  0 siblings, 2 replies; 9+ messages in thread
From: kan.liang @ 2020-01-16 15:57 UTC (permalink / raw)
  To: peterz, eranian, acme, mingo, mpe, linux-kernel
  Cc: jolsa, namhyung, vitaly.slobodskoy, pavel.gerasimov, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

Changes since V4
- Only include the kernel patches
- Abstract TOS to HW index, which can be used across hw platforms.
  If we don't know the order of raw branch records, the hw_idx should be
  -1ULL. Set hw_idx to -1ULL for IBM Power for now.
- Move the new branch sample type back to bit 17

Changes since V3
- Add the new branch sample type at the end of enum
  perf_branch_sample_type.
- Rebase the user space patch on top of acme's perf/core branch

Changes since V2
- Move tos into struct perf_branch_stack

Changes since V1
- Add a new branch sample type for LBR TOS. Drop the sample type in V1.
- Add check in perf header to detect unknown input bits in event attr
- Save and use the LBR cursor nodes from previous sample to avoid
  duplicate calculation of cursor nodes.
- Add fast path for duplicate entries check. It benefits all call stack
  parsing, not just stitching LBR call stacks. It can be merged
  independently.

Starting from Haswell, Linux perf can utilize the existing Last Branch
Record (LBR) facility to record call stacks. However, the depth of the
reconstructed LBR call stack is limited to the number of LBR registers.
E.g. on Skylake, the depth of the reconstructed LBR call stack is <= 32.
That's because HW overwrites the oldest LBR registers when the LBR stack
is full.

However, the overwritten LBRs may still be retrieved from the previous
sample. At that moment, HW hasn't overwritten the LBR registers yet.
Perf tools can stitch those overwritten LBRs onto the current call stack
to get a more complete call stack.

To determine if LBRs can be stitched, the physical index of LBR
registers is required. A new branch sample type is introduced to
dump the LBR Top-of-Stack (TOS) information for perf tools.

The stitching approach is based on LBR call stack technology. The known
limitations of LBR call stack technology still apply to the approach,
e.g. exception handling such as setjmp/longjmp leaves calls/returns
unmatched.
This approach is not foolproof. There can be cases where it creates
incorrect call stacks from incorrect matches. There is no attempt
to validate any matches in another way, so it is not enabled by default.
However, in many common cases with call stack overflows it can recreate
better call stacks than the default LBR call stack output. So if there
are problems with LBR overflows, this is a possible workaround.

Regression:
Users may collect LBR call stacks on a machine with a new perf tool and
a new kernel (with LBR TOS support). However, they may parse the
perf.data with an old perf tool (without LBR TOS support). The old tool
doesn't check attr.branch_sample_type, so users will probably get
incorrect information without any warning.

Kan Liang (2):
  perf/core: Add new branch sample type for HW index of raw branch
    records
  perf/x86/intel: Output LBR TOS information

 arch/powerpc/perf/core-book3s.c |  1 +
 arch/x86/events/intel/lbr.c     |  9 +++++++++
 include/linux/perf_event.h      | 12 ++++++++++++
 include/uapi/linux/perf_event.h | 10 +++++++++-
 kernel/events/core.c            | 11 +++++++++++
 5 files changed, 42 insertions(+), 1 deletion(-)

-- 
2.17.1



* [RESEND PATCH V5 1/2] perf/core: Add new branch sample type for HW index of raw branch records
  2020-01-16 15:57 [RESEND PATCH V5 0/2] Stitch LBR call stack (kernel) kan.liang
@ 2020-01-16 15:57 ` kan.liang
  2020-01-20  9:23   ` Peter Zijlstra
  2020-01-21  9:32   ` Stephane Eranian
  2020-01-16 15:57 ` [RESEND PATCH V5 2/2] perf/x86/intel: Output LBR TOS information kan.liang
  1 sibling, 2 replies; 9+ messages in thread
From: kan.liang @ 2020-01-16 15:57 UTC (permalink / raw)
  To: peterz, eranian, acme, mingo, mpe, linux-kernel
  Cc: jolsa, namhyung, vitaly.slobodskoy, pavel.gerasimov, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

The low level index of raw branch records is very useful for
reconstructing the call stack. For example, in Intel LBR call stack mode,
the depth of the reconstructed LBR call stack is limited to the number of
LBR registers. With the HW index information, the perf tool may stitch
the stacks of two samples, so the reconstructed LBR call stack can exceed
the HW limitation.

Add a new branch sample type to retrieve the low level index of raw
branch records. Only the index of the most recent branch, aka
entries[0], needs to be saved. The others can be calculated later in the
perf tool from the maximum number of branch records the HW supports.
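
As an illustration of that calculation, here is a minimal user-space
sketch. It is not part of this patch. The helper name and the
HW_IDX_UNKNOWN macro are hypothetical, and the ring layout (each older
branch one slot below the newer one, wrapping at slot 0) is an
assumption modelled on how the Intel LBR read loops walk back from TOS.

```c
#include <stdint.h>

/* Sentinel used by the patch when the hardware order is unknown. */
#define HW_IDX_UNKNOWN ((uint64_t)-1)

/*
 * Hypothetical helper: return the hardware slot that holds entries[i],
 * or HW_IDX_UNKNOWN if it cannot be derived.
 *
 * hw_idx: slot of the most recent branch (entries[0])
 * nr_hw:  number of branch records the hardware supports
 *
 * Assumes a ring of nr_hw slots in which each older branch sits one
 * slot below the newer one, wrapping around at slot 0.
 */
static uint64_t branch_entry_hw_slot(uint64_t hw_idx, uint64_t i,
				     uint64_t nr_hw)
{
	if (hw_idx == HW_IDX_UNKNOWN || nr_hw == 0 ||
	    hw_idx >= nr_hw || i >= nr_hw)
		return HW_IDX_UNKNOWN;

	/* Add nr_hw before subtracting so the unsigned value never wraps. */
	return (hw_idx + nr_hw - i) % nr_hw;
}
```

With 32 records and hw_idx == 3, entries[3] maps to slot 0 and
entries[4] wraps around to slot 31.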

If we don't know the order of raw branch records, the hw_idx should be
-1ULL. This patch sets -1ULL for all architectures for now. It can be
changed later if needed.

The HW index information is dumped into the PERF_SAMPLE_BRANCH_STACK
output only when the new branch sample type is set.
The perf tool should check attr.branch_sample_type and apply the
corresponding format for PERF_SAMPLE_BRANCH_STACK samples.
Otherwise, some use cases may break. For example, users may parse a
perf.data file that includes the new branch sample type with an old
version of the perf tool (without the check). Users will probably get
incorrect information without any warning.
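
To make the format dependency concrete, a tool could compute the size of
the branch stack field roughly as sketched here. This is illustrative
user-space code, not kernel code. branch_stack_bytes() and struct
branch_entry are hypothetical names, the entry is simplified to three
u64 words (from, to, flags), and the sketch assumes a branch stack was
actually captured for the sample.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit 17, as defined by the uapi hunk of this patch. */
#define PERF_SAMPLE_BRANCH_HW_INDEX (1U << 17)

/* Simplified stand-in for struct perf_branch_entry. */
struct branch_entry {
	uint64_t from;
	uint64_t to;
	uint64_t flags;
};

/*
 * Hypothetical helper: number of bytes one PERF_SAMPLE_BRANCH_STACK
 * field occupies in a sample, given attr.branch_sample_type and nr.
 */
static size_t branch_stack_bytes(uint64_t branch_sample_type, uint64_t nr)
{
	size_t size = sizeof(uint64_t);				/* nr */

	size += (size_t)nr * sizeof(struct branch_entry);	/* entries[nr] */
	if (branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX)
		size += sizeof(uint64_t);			/* hw_idx */

	return size;
}
```

A tool that ignores the flag under-counts the field by 8 bytes whenever
the flag is set.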

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/powerpc/perf/core-book3s.c |  1 +
 arch/x86/events/intel/lbr.c     |  3 +++
 include/linux/perf_event.h      | 12 ++++++++++++
 include/uapi/linux/perf_event.h | 10 +++++++++-
 kernel/events/core.c            | 11 +++++++++++
 5 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 48604625ab31..fe7de222229a 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -524,6 +524,7 @@ static void power_pmu_bhrb_read(struct perf_event *event, struct cpu_hw_events *
 		}
 	}
 	cpuhw->bhrb_stack.nr = u_index;
+	cpuhw->bhrb_stack.hw_idx = -1ULL;
 	return;
 }
 
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 534c76606049..7639e2097101 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -585,6 +585,7 @@ static void intel_pmu_lbr_read_32(struct cpu_hw_events *cpuc)
 		cpuc->lbr_entries[i].reserved	= 0;
 	}
 	cpuc->lbr_stack.nr = i;
+	cpuc->lbr_stack.hw_idx = -1ULL;
 }
 
 /*
@@ -680,6 +681,7 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
 		out++;
 	}
 	cpuc->lbr_stack.nr = out;
+	cpuc->lbr_stack.hw_idx = -1ULL;
 }
 
 void intel_pmu_lbr_read(void)
@@ -1120,6 +1122,7 @@ void intel_pmu_store_pebs_lbrs(struct pebs_lbr *lbr)
 	int i;
 
 	cpuc->lbr_stack.nr = x86_pmu.lbr_nr;
+	cpuc->lbr_stack.hw_idx = -1ULL;
 	for (i = 0; i < x86_pmu.lbr_nr; i++) {
 		u64 info = lbr->lbr[i].info;
 		struct perf_branch_entry *e = &cpuc->lbr_entries[i];
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 011dcbdbccc2..554621d99864 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -93,14 +93,26 @@ struct perf_raw_record {
 /*
  * branch stack layout:
  *  nr: number of taken branches stored in entries[]
+ *  hw_idx: The low level index of raw branch records
+ *          for the most recent branch.
+ *          -1ULL means invalid.
  *
  * Note that nr can vary from sample to sample
  * branches (to, from) are stored from most recent
  * to least recent, i.e., entries[0] contains the most
  * recent branch.
+ * The entries[] is an abstraction of raw branch records,
+ * which may not be stored in age order in HW, e.g. Intel LBR.
+ * The hw_idx is to expose the low level index of raw
+ * branch record for the most recent branch aka entries[0].
+ * For the architectures whose raw branch records are
+ * already stored in age order, the hw_idx should be 0.
+ * If we don't know the order of raw branch records,
+ * the hw_idx should be -1ULL.
  */
 struct perf_branch_stack {
 	__u64				nr;
+	__u64				hw_idx;
 	struct perf_branch_entry	entries[0];
 };
 
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index bb7b271397a6..14110837b130 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -180,6 +180,8 @@ enum perf_branch_sample_type_shift {
 
 	PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT	= 16, /* save branch type */
 
+	PERF_SAMPLE_BRANCH_HW_INDEX_SHIFT	= 17, /* save low level index of raw branch records */
+
 	PERF_SAMPLE_BRANCH_MAX_SHIFT		/* non-ABI */
 };
 
@@ -207,6 +209,8 @@ enum perf_branch_sample_type {
 	PERF_SAMPLE_BRANCH_TYPE_SAVE	=
 		1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT,
 
+	PERF_SAMPLE_BRANCH_HW_INDEX	= 1U << PERF_SAMPLE_BRANCH_HW_INDEX_SHIFT,
+
 	PERF_SAMPLE_BRANCH_MAX		= 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
 };
 
@@ -849,7 +853,11 @@ enum perf_event_type {
 	 *	  char                  data[size];}&& PERF_SAMPLE_RAW
 	 *
 	 *	{ u64                   nr;
-	 *        { u64 from, to, flags } lbr[nr];} && PERF_SAMPLE_BRANCH_STACK
+	 *        { u64 from, to, flags } lbr[nr];
+	 *
+	 *        # only available if PERF_SAMPLE_BRANCH_HW_INDEX is set
+	 *        u64			hw_idx;
+	 *      } && PERF_SAMPLE_BRANCH_STACK
 	 *
 	 * 	{ u64			abi; # enum perf_sample_regs_abi
 	 * 	  u64			regs[weight(mask)]; } && PERF_SAMPLE_REGS_USER
diff --git a/kernel/events/core.c b/kernel/events/core.c
index cfd89b4a02d8..4be3ba12333f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6391,6 +6391,11 @@ static void perf_output_read(struct perf_output_handle *handle,
 		perf_output_read_one(handle, event, enabled, running);
 }
 
+static inline bool perf_sample_save_hw_index(struct perf_event *event)
+{
+	return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX;
+}
+
 void perf_output_sample(struct perf_output_handle *handle,
 			struct perf_event_header *header,
 			struct perf_sample_data *data,
@@ -6480,6 +6485,8 @@ void perf_output_sample(struct perf_output_handle *handle,
 
 			perf_output_put(handle, data->br_stack->nr);
 			perf_output_copy(handle, data->br_stack->entries, size);
+			if (perf_sample_save_hw_index(event))
+				perf_output_put(handle, data->br_stack->hw_idx);
 		} else {
 			/*
 			 * we always store at least the value of nr
@@ -6667,7 +6674,11 @@ void perf_prepare_sample(struct perf_event_header *header,
 		if (data->br_stack) {
 			size += data->br_stack->nr
 			      * sizeof(struct perf_branch_entry);
+
+			if (perf_sample_save_hw_index(event))
+				size += sizeof(u64);
 		}
+
 		header->size += size;
 	}
 
-- 
2.17.1



* [RESEND PATCH V5 2/2] perf/x86/intel: Output LBR TOS information
  2020-01-16 15:57 [RESEND PATCH V5 0/2] Stitch LBR call stack (kernel) kan.liang
  2020-01-16 15:57 ` [RESEND PATCH V5 1/2] perf/core: Add new branch sample type for HW index of raw branch records kan.liang
@ 2020-01-16 15:57 ` kan.liang
  1 sibling, 0 replies; 9+ messages in thread
From: kan.liang @ 2020-01-16 15:57 UTC (permalink / raw)
  To: peterz, eranian, acme, mingo, mpe, linux-kernel
  Cc: jolsa, namhyung, vitaly.slobodskoy, pavel.gerasimov, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

For Intel LBR, the LBR Top-of-Stack (TOS) information is the HW index of
the raw branch record for the most recent branch.

For non-adaptive PEBS and non-PEBS, the TOS information can be directly
retrieved from the TOS MSR read in intel_pmu_lbr_read().

For adaptive PEBS, the LBR information stored in the PEBS record doesn't
include the TOS information. For single PEBS, TOS can be read directly
from the MSR, because the PMI is triggered immediately after the PEBS
record is written, so the TOS MSR is still unchanged.
For large PEBS, the TOS MSR holds a stale value. Set hw_idx to -1ULL to
indicate that the TOS information is not available.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/intel/lbr.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 7639e2097101..65113b16804a 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -585,7 +585,7 @@ static void intel_pmu_lbr_read_32(struct cpu_hw_events *cpuc)
 		cpuc->lbr_entries[i].reserved	= 0;
 	}
 	cpuc->lbr_stack.nr = i;
-	cpuc->lbr_stack.hw_idx = -1ULL;
+	cpuc->lbr_stack.hw_idx = tos;
 }
 
 /*
@@ -681,7 +681,7 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
 		out++;
 	}
 	cpuc->lbr_stack.nr = out;
-	cpuc->lbr_stack.hw_idx = -1ULL;
+	cpuc->lbr_stack.hw_idx = tos;
 }
 
 void intel_pmu_lbr_read(void)
@@ -1122,7 +1122,13 @@ void intel_pmu_store_pebs_lbrs(struct pebs_lbr *lbr)
 	int i;
 
 	cpuc->lbr_stack.nr = x86_pmu.lbr_nr;
-	cpuc->lbr_stack.hw_idx = -1ULL;
+
+	/* Cannot get TOS for large PEBS */
+	if (cpuc->n_pebs == cpuc->n_large_pebs)
+		cpuc->lbr_stack.hw_idx = -1ULL;
+	else
+		cpuc->lbr_stack.hw_idx = intel_pmu_lbr_tos();
+
 	for (i = 0; i < x86_pmu.lbr_nr; i++) {
 		u64 info = lbr->lbr[i].info;
 		struct perf_branch_entry *e = &cpuc->lbr_entries[i];
-- 
2.17.1



* Re: [RESEND PATCH V5 1/2] perf/core: Add new branch sample type for HW index of raw branch records
  2020-01-16 15:57 ` [RESEND PATCH V5 1/2] perf/core: Add new branch sample type for HW index of raw branch records kan.liang
@ 2020-01-20  9:23   ` Peter Zijlstra
  2020-01-20 16:50     ` Liang, Kan
  2020-01-21  9:32   ` Stephane Eranian
  1 sibling, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2020-01-20  9:23 UTC (permalink / raw)
  To: kan.liang
  Cc: eranian, acme, mingo, mpe, linux-kernel, jolsa, namhyung,
	vitaly.slobodskoy, pavel.gerasimov, ak

On Thu, Jan 16, 2020 at 07:57:56AM -0800, kan.liang@linux.intel.com wrote:

>  struct perf_branch_stack {
>  	__u64				nr;
> +	__u64				hw_idx;
>  	struct perf_branch_entry	entries[0];
>  };

The above and below order doesn't match.

> @@ -849,7 +853,11 @@ enum perf_event_type {
>  	 *	  char                  data[size];}&& PERF_SAMPLE_RAW
>  	 *
>  	 *	{ u64                   nr;
> -	 *        { u64 from, to, flags } lbr[nr];} && PERF_SAMPLE_BRANCH_STACK
> +	 *        { u64 from, to, flags } lbr[nr];
> +	 *
> +	 *        # only available if PERF_SAMPLE_BRANCH_HW_INDEX is set
> +	 *        u64			hw_idx;
> +	 *      } && PERF_SAMPLE_BRANCH_STACK

That wants to be written as:

		{ u64			nr;
		  { u64 from, to, flags; } entries[nr];
		  { u64	hw_idx; } && PERF_SAMPLE_BRANCH_HW_INDEX
		} && PERF_SAMPLE_BRANCH_STACK

But the big question is; why isn't it:

		{ u64			nr;
		  { u64	hw_idx; } && PERF_SAMPLE_BRANCH_HW_INDEX
		  { u64 from, to, flags; } entries[nr];
		} && PERF_SAMPLE_BRANCH_STACK

to match the struct perf_branch_stack order. Having that variable sized
entry in the middle just seems weird.

>  	 *
>  	 * 	{ u64			abi; # enum perf_sample_regs_abi
>  	 * 	  u64			regs[weight(mask)]; } && PERF_SAMPLE_REGS_USER


* Re: [RESEND PATCH V5 1/2] perf/core: Add new branch sample type for HW index of raw branch records
  2020-01-20  9:23   ` Peter Zijlstra
@ 2020-01-20 16:50     ` Liang, Kan
  2020-01-20 20:24       ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Liang, Kan @ 2020-01-20 16:50 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: eranian, acme, mingo, mpe, linux-kernel, jolsa, namhyung,
	vitaly.slobodskoy, pavel.gerasimov, ak



On 1/20/2020 4:23 AM, Peter Zijlstra wrote:
> On Thu, Jan 16, 2020 at 07:57:56AM -0800, kan.liang@linux.intel.com wrote:
> 
>>   struct perf_branch_stack {
>>   	__u64				nr;
>> +	__u64				hw_idx;
>>   	struct perf_branch_entry	entries[0];
>>   };
> 
> The above and below order doesn't match.
> 
>> @@ -849,7 +853,11 @@ enum perf_event_type {
>>   	 *	  char                  data[size];}&& PERF_SAMPLE_RAW
>>   	 *
>>   	 *	{ u64                   nr;
>> -	 *        { u64 from, to, flags } lbr[nr];} && PERF_SAMPLE_BRANCH_STACK
>> +	 *        { u64 from, to, flags } lbr[nr];
>> +	 *
>> +	 *        # only available if PERF_SAMPLE_BRANCH_HW_INDEX is set
>> +	 *        u64			hw_idx;
>> +	 *      } && PERF_SAMPLE_BRANCH_STACK
> 
> That wants to be written as:
> 
> 		{ u64			nr;
> 		  { u64 from, to, flags; } entries[nr];
> 		  { u64	hw_idx; } && PERF_SAMPLE_BRANCH_HW_INDEX
> 		} && PERF_SAMPLE_BRANCH_STACK
> 
> But the big question is; why isn't it:
> 
> 		{ u64			nr;
> 		  { u64	hw_idx; } && PERF_SAMPLE_BRANCH_HW_INDEX
> 		  { u64 from, to, flags; } entries[nr];
> 		} && PERF_SAMPLE_BRANCH_STACK
> 
> to match the struct perf_branch_stack order. Having that variable sized
> entry in the middle just seems weird.


Usually, new data should be output at the end of a sample.
The comments and code all follow that convention.
However, entries[0] is a variable-sized array, so in the C struct I have
to put hw_idx before entries. That causes the inconsistency. Sorry for
the confusion.

I will fix it in V6.

Thanks,
Kan

> 
>>   	 *
>>   	 * 	{ u64			abi; # enum perf_sample_regs_abi
>>   	 * 	  u64			regs[weight(mask)]; } && PERF_SAMPLE_REGS_USER


* Re: [RESEND PATCH V5 1/2] perf/core: Add new branch sample type for HW index of raw branch records
  2020-01-20 16:50     ` Liang, Kan
@ 2020-01-20 20:24       ` Peter Zijlstra
  2020-01-20 20:47         ` Liang, Kan
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2020-01-20 20:24 UTC (permalink / raw)
  To: Liang, Kan
  Cc: eranian, acme, mingo, mpe, linux-kernel, jolsa, namhyung,
	vitaly.slobodskoy, pavel.gerasimov, ak

On Mon, Jan 20, 2020 at 11:50:59AM -0500, Liang, Kan wrote:
> 
> 
> On 1/20/2020 4:23 AM, Peter Zijlstra wrote:
> > On Thu, Jan 16, 2020 at 07:57:56AM -0800, kan.liang@linux.intel.com wrote:
> > 
> > >   struct perf_branch_stack {
> > >   	__u64				nr;
> > > +	__u64				hw_idx;
> > >   	struct perf_branch_entry	entries[0];
> > >   };
> > 
> > The above and below order doesn't match.
> > 
> > > @@ -849,7 +853,11 @@ enum perf_event_type {
> > >   	 *	  char                  data[size];}&& PERF_SAMPLE_RAW
> > >   	 *
> > >   	 *	{ u64                   nr;
> > > -	 *        { u64 from, to, flags } lbr[nr];} && PERF_SAMPLE_BRANCH_STACK
> > > +	 *        { u64 from, to, flags } lbr[nr];
> > > +	 *
> > > +	 *        # only available if PERF_SAMPLE_BRANCH_HW_INDEX is set
> > > +	 *        u64			hw_idx;
> > > +	 *      } && PERF_SAMPLE_BRANCH_STACK
> > 
> > That wants to be written as:
> > 
> > 		{ u64			nr;
> > 		  { u64 from, to, flags; } entries[nr];
> > 		  { u64	hw_idx; } && PERF_SAMPLE_BRANCH_HW_INDEX
> > 		} && PERF_SAMPLE_BRANCH_STACK
> > 
> > But the big question is; why isn't it:
> > 
> > 		{ u64			nr;
> > 		  { u64	hw_idx; } && PERF_SAMPLE_BRANCH_HW_INDEX
> > 		  { u64 from, to, flags; } entries[nr];
> > 		} && PERF_SAMPLE_BRANCH_STACK
> > 
> > to match the struct perf_branch_stack order. Having that variable sized
> > entry in the middle just seems weird.
> 
> 
> Usually, new data should be output to the end of a sample.

Because.... you want old tools to read new output?

> However, the entries[0] is sized entry, so I have to put the hw_idx before

entries[0] is only in the C thing, and in C you indeed have to put
hw_idx before.

> entry. It makes the inconsistency. Sorry for the confusion caused.

n/p it's clear now I think.


* Re: [RESEND PATCH V5 1/2] perf/core: Add new branch sample type for HW index of raw branch records
  2020-01-20 20:24       ` Peter Zijlstra
@ 2020-01-20 20:47         ` Liang, Kan
  0 siblings, 0 replies; 9+ messages in thread
From: Liang, Kan @ 2020-01-20 20:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: eranian, acme, mingo, mpe, linux-kernel, jolsa, namhyung,
	vitaly.slobodskoy, pavel.gerasimov, ak



On 1/20/2020 3:24 PM, Peter Zijlstra wrote:
> On Mon, Jan 20, 2020 at 11:50:59AM -0500, Liang, Kan wrote:
>>
>>
>> On 1/20/2020 4:23 AM, Peter Zijlstra wrote:
>>> On Thu, Jan 16, 2020 at 07:57:56AM -0800, kan.liang@linux.intel.com wrote:
>>>
>>>>    struct perf_branch_stack {
>>>>    	__u64				nr;
>>>> +	__u64				hw_idx;
>>>>    	struct perf_branch_entry	entries[0];
>>>>    };
>>>
>>> The above and below order doesn't match.
>>>
>>>> @@ -849,7 +853,11 @@ enum perf_event_type {
>>>>    	 *	  char                  data[size];}&& PERF_SAMPLE_RAW
>>>>    	 *
>>>>    	 *	{ u64                   nr;
>>>> -	 *        { u64 from, to, flags } lbr[nr];} && PERF_SAMPLE_BRANCH_STACK
>>>> +	 *        { u64 from, to, flags } lbr[nr];
>>>> +	 *
>>>> +	 *        # only available if PERF_SAMPLE_BRANCH_HW_INDEX is set
>>>> +	 *        u64			hw_idx;
>>>> +	 *      } && PERF_SAMPLE_BRANCH_STACK
>>>
>>> That wants to be written as:
>>>
>>> 		{ u64			nr;
>>> 		  { u64 from, to, flags; } entries[nr];
>>> 		  { u64	hw_idx; } && PERF_SAMPLE_BRANCH_HW_INDEX
>>> 		} && PERF_SAMPLE_BRANCH_STACK
>>>
>>> But the big question is; why isn't it:
>>>
>>> 		{ u64			nr;
>>> 		  { u64	hw_idx; } && PERF_SAMPLE_BRANCH_HW_INDEX
>>> 		  { u64 from, to, flags; } entries[nr];
>>> 		} && PERF_SAMPLE_BRANCH_STACK
>>>
>>> to match the struct perf_branch_stack order. Having that variable sized
>>> entry in the middle just seems weird.
>>
>>
>> Usually, new data should be output to the end of a sample.
> 
> Because.... you want old tools to read new output?
>

Yes, in some cases it helps.
If no other sample types are output after PERF_SAMPLE_BRANCH_STACK,
an old perf tool will simply ignore the hw_idx.
But if other sample types that are output after
PERF_SAMPLE_BRANCH_STACK, e.g. PERF_SAMPLE_DATA_SRC or
PERF_SAMPLE_PHYS_ADDR, are also requested, the hw_idx will corrupt them
and an old perf tool will no longer work.
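
A small sketch of that failure mode, assuming the V5 layout (hw_idx
written after the entries) and treating the sample as an array of u64
words. Both reader functions are hypothetical and only model how a tool
locates the field that follows the branch stack.

```c
#include <stddef.h>
#include <stdint.h>

#define WORDS_PER_ENTRY 3	/* from, to, flags: three u64 words */

/*
 * Hypothetical old-style reader: it knows nothing about hw_idx, so it
 * assumes the next sample field starts right after entries[nr].
 * 'p' points at the nr word of the branch stack.
 */
static uint64_t old_tool_next_field(const uint64_t *p)
{
	uint64_t nr = p[0];

	return p[1 + nr * WORDS_PER_ENTRY];
}

/* Reader aware of the V5 layout: skip one more word if hw_idx is present. */
static uint64_t new_tool_next_field(const uint64_t *p, int has_hw_idx)
{
	uint64_t nr = p[0];
	size_t off = 1 + nr * WORDS_PER_ENTRY + (has_hw_idx ? 1 : 0);

	return p[off];
}
```

For a sample holding one entry, hw_idx == 7 and a following field of
0xabcd, the old reader returns 7 as if it were the following field,
while the aware reader returns 0xabcd.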


>> However, the entries[0] is sized entry, so I have to put the hw_idx before
> 
> entries[0] is only in the C thing, and in C you indeed have to put
> hw_idx before.
> 
>> entry. It makes the inconsistency. Sorry for the confusion caused.
> 
> n/p it's clear now I think.

Should I send a V6 patch that moves hw_idx before the entries, as below?

@@ -853,7 +857,9 @@ enum perf_event_type {
          *        char                  data[size];}&& PERF_SAMPLE_RAW
          *
          *      { u64                   nr;
-        *        { u64 from, to, flags } lbr[nr];} && PERF_SAMPLE_BRANCH_STACK
+        *        { u64 hw_idx; } && PERF_SAMPLE_BRANCH_HW_INDEX
+        *        { u64 from, to, flags } lbr[nr];
+        *      } && PERF_SAMPLE_BRANCH_STACK
          *
          *      { u64                   abi; # enum perf_sample_regs_abi
          *        u64                   regs[weight(mask)]; } && PERF_SAMPLE_REGS_USER

@@ -6634,6 +6639,8 @@ void perf_output_sample(struct perf_output_handle *handle,
                              * sizeof(struct perf_branch_entry);

                         perf_output_put(handle, data->br_stack->nr);
+                       if (perf_sample_save_hw_index(event))
+                               perf_output_put(handle, data->br_stack->hw_idx);
                         perf_output_copy(handle, data->br_stack->entries, size);
                 } else {
                         /*
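
For reference, a reader for the layout proposed here (hw_idx placed
between nr and the entries) might look like the following sketch. It is
only an illustration of the proposed ordering. read_branch_stack(),
struct branch_stack_view and HW_IDX_UNKNOWN are hypothetical names, and
the entry is simplified to three u64 words.

```c
#include <stdint.h>

#define PERF_SAMPLE_BRANCH_HW_INDEX (1U << 17)
#define HW_IDX_UNKNOWN ((uint64_t)-1)

/* Simplified stand-in for struct perf_branch_entry. */
struct branch_entry {
	uint64_t from;
	uint64_t to;
	uint64_t flags;
};

/* Hypothetical decoded view of one PERF_SAMPLE_BRANCH_STACK field. */
struct branch_stack_view {
	uint64_t nr;
	uint64_t hw_idx;
	const struct branch_entry *entries;
};

/*
 * Decode a branch stack laid out as: nr, optional hw_idx, entries[nr].
 * Returns a pointer to the first word after the branch stack.
 */
static const uint64_t *read_branch_stack(const uint64_t *p,
					 uint64_t branch_sample_type,
					 struct branch_stack_view *out)
{
	out->nr = *p++;

	/* hw_idx is present only if the event asked for it. */
	out->hw_idx = HW_IDX_UNKNOWN;
	if (branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX)
		out->hw_idx = *p++;

	out->entries = (const struct branch_entry *)p;

	/* Each entry is three u64 words. */
	return p + out->nr * 3;
}
```

With this ordering the reader learns whether hw_idx is present before it
reaches the variable-sized part, which matches the field order of
struct perf_branch_stack.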



Thanks,
Kan


* Re: [RESEND PATCH V5 1/2] perf/core: Add new branch sample type for HW index of raw branch records
  2020-01-16 15:57 ` [RESEND PATCH V5 1/2] perf/core: Add new branch sample type for HW index of raw branch records kan.liang
  2020-01-20  9:23   ` Peter Zijlstra
@ 2020-01-21  9:32   ` Stephane Eranian
  2020-01-21 15:02     ` Liang, Kan
  1 sibling, 1 reply; 9+ messages in thread
From: Stephane Eranian @ 2020-01-21  9:32 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	Michael Ellerman, LKML, Jiri Olsa, Namhyung Kim,
	vitaly.slobodskoy, pavel.gerasimov, Andi Kleen

On Thu, Jan 16, 2020 at 7:59 AM <kan.liang@linux.intel.com> wrote:
>
> From: Kan Liang <kan.liang@linux.intel.com>
>
> The low level index of raw branch records is very useful for
> reconstructing the call stack. For example, in Intel LBR call stack mode,
> the depth of reconstructed LBR call stack limits to the number of LBR
> registers. With the HW index information, perf tool may stitch the
> stacks of two samples. The reconstructed LBR call stack can break the HW
> limitation.
>
> Add a new branch sample type to retrieve the low level index of raw
> branch records. Only need to save the index for the most recent branch
> aka entries[0]. Others can be calculated by the max number of HW
> supported branch records later in perf tool.
>
You need to define what the low level index is about w.r.t. the branch_entries[]
abstraction. You need to say it is a value between -1 (unknown) and max depth
(which can be retrieved in /sys/devices/cpu/caps/branches). It returns the index
in the underlying hardware buffer of the most recently captured taken branch
which is always saved in branch_entries[0]. As such, it is not necessary to
process branches in order. It may be used in certain modes to stitch
together multiple BRANCH_STACK records in call stack mode.

>
> If we don't know the order of raw branch records, the hw_idx should be
> -1ULL. This patch sets -1ULL for all architectures for now. It can be
> changed later if needed.
>
> Only when the new branch sample type is set, the HW index information is
> dumped into the PERF_SAMPLE_BRANCH_STACK output.
> Perf tool should check the attr.branch_sample_type, and apply the
> corresponding format for PERF_SAMPLE_BRANCH_STACK samples.
> Otherwise, some user case may be broken. For example, users may parse a
> perf.data, which include the new branch sample type, with an old version
> perf tool (without the check). Users probably get incorrect information
> without any warning.
>
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> ---
>  arch/powerpc/perf/core-book3s.c |  1 +
>  arch/x86/events/intel/lbr.c     |  3 +++
>  include/linux/perf_event.h      | 12 ++++++++++++
>  include/uapi/linux/perf_event.h | 10 +++++++++-
>  kernel/events/core.c            | 11 +++++++++++
>  5 files changed, 36 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 48604625ab31..fe7de222229a 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -524,6 +524,7 @@ static void power_pmu_bhrb_read(struct perf_event *event, struct cpu_hw_events *
>                 }
>         }
>         cpuhw->bhrb_stack.nr = u_index;
> +       cpuhw->bhrb_stack.hw_idx = -1ULL;
>         return;
>  }
>
> diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
> index 534c76606049..7639e2097101 100644
> --- a/arch/x86/events/intel/lbr.c
> +++ b/arch/x86/events/intel/lbr.c
> @@ -585,6 +585,7 @@ static void intel_pmu_lbr_read_32(struct cpu_hw_events *cpuc)
>                 cpuc->lbr_entries[i].reserved   = 0;
>         }
>         cpuc->lbr_stack.nr = i;
> +       cpuc->lbr_stack.hw_idx = -1ULL;
>  }
>
>  /*
> @@ -680,6 +681,7 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
>                 out++;
>         }
>         cpuc->lbr_stack.nr = out;
> +       cpuc->lbr_stack.hw_idx = -1ULL;
>  }
>
>  void intel_pmu_lbr_read(void)
> @@ -1120,6 +1122,7 @@ void intel_pmu_store_pebs_lbrs(struct pebs_lbr *lbr)
>         int i;
>
>         cpuc->lbr_stack.nr = x86_pmu.lbr_nr;
> +       cpuc->lbr_stack.hw_idx = -1ULL;
>         for (i = 0; i < x86_pmu.lbr_nr; i++) {
>                 u64 info = lbr->lbr[i].info;
>                 struct perf_branch_entry *e = &cpuc->lbr_entries[i];
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 011dcbdbccc2..554621d99864 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -93,14 +93,26 @@ struct perf_raw_record {
>  /*
>   * branch stack layout:
>   *  nr: number of taken branches stored in entries[]
> + *  hw_idx: The low level index of raw branch records
> + *          for the most recent branch.
> + *          -1ULL means invalid.
>   *
>   * Note that nr can vary from sample to sample
>   * branches (to, from) are stored from most recent
>   * to least recent, i.e., entries[0] contains the most
>   * recent branch.
> + * The entries[] is an abstraction of raw branch records,
> + * which may not be stored in age order in HW, e.g. Intel LBR.
> + * The hw_idx is to expose the low level index of raw
> + * branch record for the most recent branch aka entries[0].
> + * For the architectures whose raw branch records are
> + * already stored in age order, the hw_idx should be 0.
> + * If we don't know the order of raw branch records,
> + * the hw_idx should be -1ULL.
>   */
>  struct perf_branch_stack {
>         __u64                           nr;
> +       __u64                           hw_idx;
>         struct perf_branch_entry        entries[0];
>  };
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index bb7b271397a6..14110837b130 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -180,6 +180,8 @@ enum perf_branch_sample_type_shift {
>
>         PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT      = 16, /* save branch type */
>
> +       PERF_SAMPLE_BRANCH_HW_INDEX_SHIFT       = 17, /* save low level index of raw branch records */
> +
>         PERF_SAMPLE_BRANCH_MAX_SHIFT            /* non-ABI */
>  };
>
> @@ -207,6 +209,8 @@ enum perf_branch_sample_type {
>         PERF_SAMPLE_BRANCH_TYPE_SAVE    =
>                 1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT,
>
> +       PERF_SAMPLE_BRANCH_HW_INDEX     = 1U << PERF_SAMPLE_BRANCH_HW_INDEX_SHIFT,
> +
>         PERF_SAMPLE_BRANCH_MAX          = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
>  };
>
> @@ -849,7 +853,11 @@ enum perf_event_type {
>          *        char                  data[size];}&& PERF_SAMPLE_RAW
>          *
>          *      { u64                   nr;
> -        *        { u64 from, to, flags } lbr[nr];} && PERF_SAMPLE_BRANCH_STACK
> +        *        { u64 from, to, flags } lbr[nr];
> +        *
> +        *        # only available if PERF_SAMPLE_BRANCH_HW_INDEX is set
> +        *        u64                   hw_idx;
> +        *      } && PERF_SAMPLE_BRANCH_STACK
>          *
>          *      { u64                   abi; # enum perf_sample_regs_abi
>          *        u64                   regs[weight(mask)]; } && PERF_SAMPLE_REGS_USER
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index cfd89b4a02d8..4be3ba12333f 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -6391,6 +6391,11 @@ static void perf_output_read(struct perf_output_handle *handle,
>                 perf_output_read_one(handle, event, enabled, running);
>  }
>
> +static inline bool perf_sample_save_hw_index(struct perf_event *event)
> +{
> +       return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX;
> +}
> +
>  void perf_output_sample(struct perf_output_handle *handle,
>                         struct perf_event_header *header,
>                         struct perf_sample_data *data,
> @@ -6480,6 +6485,8 @@ void perf_output_sample(struct perf_output_handle *handle,
>
>                         perf_output_put(handle, data->br_stack->nr);
>                         perf_output_copy(handle, data->br_stack->entries, size);
> +                       if (perf_sample_save_hw_index(event))
> +                               perf_output_put(handle, data->br_stack->hw_idx);
>                 } else {
>                         /*
>                          * we always store at least the value of nr
> @@ -6667,7 +6674,11 @@ void perf_prepare_sample(struct perf_event_header *header,
>                 if (data->br_stack) {
>                         size += data->br_stack->nr
>                               * sizeof(struct perf_branch_entry);
> +
> +                       if (perf_sample_save_hw_index(event))
> +                               size += sizeof(u64);
>                 }
> +
>                 header->size += size;
>         }
>
> --
> 2.17.1
>


* Re: [RESEND PATCH V5 1/2] perf/core: Add new branch sample type for HW index of raw branch records
  2020-01-21  9:32   ` Stephane Eranian
@ 2020-01-21 15:02     ` Liang, Kan
  0 siblings, 0 replies; 9+ messages in thread
From: Liang, Kan @ 2020-01-21 15:02 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
	Michael Ellerman, LKML, Jiri Olsa, Namhyung Kim,
	vitaly.slobodskoy, pavel.gerasimov, Andi Kleen



On 1/21/2020 4:32 AM, Stephane Eranian wrote:
> On Thu, Jan 16, 2020 at 7:59 AM <kan.liang@linux.intel.com> wrote:
>>
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> The low level index of raw branch records is very useful for
>> reconstructing the call stack. For example, in Intel LBR call stack mode,
>> the depth of the reconstructed LBR call stack is limited to the number of
>> LBR registers. With the HW index information, the perf tool may stitch the
>> stacks of two samples, so the reconstructed LBR call stack can exceed the
>> HW limitation.
>>
>> Add a new branch sample type to retrieve the low level index of raw
>> branch records. Only the index for the most recent branch, aka
>> entries[0], needs to be saved. The others can be calculated later in the
>> perf tool from the max number of HW supported branch records.
>>
> You need to define what the low level index is about w.r.t. the branch_entries[]
> abstraction. You need to say it is a value between -1 (unknown) and max depth
> (which can be retrieved in /sys/devices/cpu/caps/branches). It returns the index
> in the underlying hardware buffer of the most recently captured taken branch
> which is always saved in branch_entries[0]. As such, it is not necessary to
> process branches in order. It may be used in certain modes to stitch
> together multiple BRANCH_STACK records in call stack mode.

Thanks Stephane. I will change the description based on your suggestion.

Add a new branch sample type to retrieve the low level index of raw branch
records. The low level index is the index in the underlying hardware
buffer of the most recently captured taken branch, which is always saved
in branch_entries[0]. It is a value between -1 (unknown) and the max
depth, which can be retrieved from /sys/devices/cpu/caps/branches.
It may be used in certain modes to stitch together multiple BRANCH_STACK
records in call stack mode.
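
To illustrate how a tool might derive the hardware slot of the other
entries from the single saved index, here is a minimal sketch. It is not
taken from the perf tool. The helper name entry_hw_slot() is hypothetical,
and it assumes that the hardware buffer is a ring of max_depth slots
numbered 0..max_depth-1 in which older branches sit at decreasing indices
(with wrap-around), as the Intel LBR TOS scheme does.

```c
#include <stdint.h>

/* -1ULL, the "unknown order" marker described above. */
#define HW_IDX_UNKNOWN UINT64_MAX

/*
 * Hypothetical helper: return the hardware slot that holds entries[i],
 * given the hw_idx reported for entries[0] and the buffer depth.
 * Returns -1 when the slot cannot be derived.
 *
 * Assumption: slots are 0..max_depth-1 and older branches are stored at
 * decreasing slot numbers, wrapping around at 0.
 */
static int64_t entry_hw_slot(uint64_t hw_idx, uint64_t i, uint64_t max_depth)
{
	if (hw_idx == HW_IDX_UNKNOWN || max_depth == 0 || hw_idx >= max_depth)
		return -1;

	/* Step back i slots from the most recent one, modulo the depth. */
	return (int64_t)((hw_idx + max_depth - (i % max_depth)) % max_depth);
}
```

With a depth of 32 and hw_idx 3, entries[0] maps to slot 3 and entries[5]
maps to slot 30. A tool would read the depth from
/sys/devices/cpu/caps/branches rather than hard-coding it.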


Thanks,
Kan

> 
>>
>> If we don't know the order of raw branch records, the hw_idx should be
>> -1ULL. This patch sets -1ULL for all architectures for now. It can be
>> changed later if needed.
>>
>> The HW index information is dumped into the PERF_SAMPLE_BRANCH_STACK
>> output only when the new branch sample type is set.
>> The perf tool should check attr.branch_sample_type and apply the
>> corresponding format for PERF_SAMPLE_BRANCH_STACK samples.
>> Otherwise, some use cases may break. For example, users may parse a
>> perf.data file, which includes the new branch sample type, with an old
>> version of the perf tool (without the check). They would probably get
>> incorrect information without any warning.
>>
>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
>> ---
>>   arch/powerpc/perf/core-book3s.c |  1 +
>>   arch/x86/events/intel/lbr.c     |  3 +++
>>   include/linux/perf_event.h      | 12 ++++++++++++
>>   include/uapi/linux/perf_event.h | 10 +++++++++-
>>   kernel/events/core.c            | 11 +++++++++++
>>   5 files changed, 36 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
>> index 48604625ab31..fe7de222229a 100644
>> --- a/arch/powerpc/perf/core-book3s.c
>> +++ b/arch/powerpc/perf/core-book3s.c
>> @@ -524,6 +524,7 @@ static void power_pmu_bhrb_read(struct perf_event *event, struct cpu_hw_events *
>>                  }
>>          }
>>          cpuhw->bhrb_stack.nr = u_index;
>> +       cpuhw->bhrb_stack.hw_idx = -1ULL;
>>          return;
>>   }
>>
>> diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
>> index 534c76606049..7639e2097101 100644
>> --- a/arch/x86/events/intel/lbr.c
>> +++ b/arch/x86/events/intel/lbr.c
>> @@ -585,6 +585,7 @@ static void intel_pmu_lbr_read_32(struct cpu_hw_events *cpuc)
>>                  cpuc->lbr_entries[i].reserved   = 0;
>>          }
>>          cpuc->lbr_stack.nr = i;
>> +       cpuc->lbr_stack.hw_idx = -1ULL;
>>   }
>>
>>   /*
>> @@ -680,6 +681,7 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
>>                  out++;
>>          }
>>          cpuc->lbr_stack.nr = out;
>> +       cpuc->lbr_stack.hw_idx = -1ULL;
>>   }
>>
>>   void intel_pmu_lbr_read(void)
>> @@ -1120,6 +1122,7 @@ void intel_pmu_store_pebs_lbrs(struct pebs_lbr *lbr)
>>          int i;
>>
>>          cpuc->lbr_stack.nr = x86_pmu.lbr_nr;
>> +       cpuc->lbr_stack.hw_idx = -1ULL;
>>          for (i = 0; i < x86_pmu.lbr_nr; i++) {
>>                  u64 info = lbr->lbr[i].info;
>>                  struct perf_branch_entry *e = &cpuc->lbr_entries[i];
>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>> index 011dcbdbccc2..554621d99864 100644
>> --- a/include/linux/perf_event.h
>> +++ b/include/linux/perf_event.h
>> @@ -93,14 +93,26 @@ struct perf_raw_record {
>>   /*
>>    * branch stack layout:
>>    *  nr: number of taken branches stored in entries[]
>> + *  hw_idx: The low level index of raw branch records
>> + *          for the most recent branch.
>> + *          -1ULL means invalid.
>>    *
>>    * Note that nr can vary from sample to sample
>>    * branches (to, from) are stored from most recent
>>    * to least recent, i.e., entries[0] contains the most
>>    * recent branch.
>> + * The entries[] is an abstraction of raw branch records,
>> + * which may not be stored in age order in HW, e.g. Intel LBR.
>> + * The hw_idx is to expose the low level index of raw
>> + * branch record for the most recent branch aka entries[0].
>> + * For the architectures whose raw branch records are
>> + * already stored in age order, the hw_idx should be 0.
>> + * If we don't know the order of raw branch records,
>> + * the hw_idx should be -1ULL.
>>    */
>>   struct perf_branch_stack {
>>          __u64                           nr;
>> +       __u64                           hw_idx;
>>          struct perf_branch_entry        entries[0];
>>   };
>>
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index bb7b271397a6..14110837b130 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -180,6 +180,8 @@ enum perf_branch_sample_type_shift {
>>
>>          PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT      = 16, /* save branch type */
>>
>> +       PERF_SAMPLE_BRANCH_HW_INDEX_SHIFT       = 17, /* save low level index of raw branch records */
>> +
>>          PERF_SAMPLE_BRANCH_MAX_SHIFT            /* non-ABI */
>>   };
>>
>> @@ -207,6 +209,8 @@ enum perf_branch_sample_type {
>>          PERF_SAMPLE_BRANCH_TYPE_SAVE    =
>>                  1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT,
>>
>> +       PERF_SAMPLE_BRANCH_HW_INDEX     = 1U << PERF_SAMPLE_BRANCH_HW_INDEX_SHIFT,
>> +
>>          PERF_SAMPLE_BRANCH_MAX          = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
>>   };
>>
>> @@ -849,7 +853,11 @@ enum perf_event_type {
>>           *        char                  data[size];}&& PERF_SAMPLE_RAW
>>           *
>>           *      { u64                   nr;
>> -        *        { u64 from, to, flags } lbr[nr];} && PERF_SAMPLE_BRANCH_STACK
>> +        *        { u64 from, to, flags } lbr[nr];
>> +        *
>> +        *        # only available if PERF_SAMPLE_BRANCH_HW_INDEX is set
>> +        *        u64                   hw_idx;
>> +        *      } && PERF_SAMPLE_BRANCH_STACK
>>           *
>>           *      { u64                   abi; # enum perf_sample_regs_abi
>>           *        u64                   regs[weight(mask)]; } && PERF_SAMPLE_REGS_USER
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index cfd89b4a02d8..4be3ba12333f 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -6391,6 +6391,11 @@ static void perf_output_read(struct perf_output_handle *handle,
>>                  perf_output_read_one(handle, event, enabled, running);
>>   }
>>
>> +static inline bool perf_sample_save_hw_index(struct perf_event *event)
>> +{
>> +       return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX;
>> +}
>> +
>>   void perf_output_sample(struct perf_output_handle *handle,
>>                          struct perf_event_header *header,
>>                          struct perf_sample_data *data,
>> @@ -6480,6 +6485,8 @@ void perf_output_sample(struct perf_output_handle *handle,
>>
>>                          perf_output_put(handle, data->br_stack->nr);
>>                          perf_output_copy(handle, data->br_stack->entries, size);
>> +                       if (perf_sample_save_hw_index(event))
>> +                               perf_output_put(handle, data->br_stack->hw_idx);
>>                  } else {
>>                          /*
>>                           * we always store at least the value of nr
>> @@ -6667,7 +6674,11 @@ void perf_prepare_sample(struct perf_event_header *header,
>>                  if (data->br_stack) {
>>                          size += data->br_stack->nr
>>                                * sizeof(struct perf_branch_entry);
>> +
>> +                       if (perf_sample_save_hw_index(event))
>> +                               size += sizeof(u64);
>>                  }
>> +
>>                  header->size += size;
>>          }
>>
>> --
>> 2.17.1
>>
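
For readers following the format change, the record layout that this
version of the patch writes (nr, then the entries, then hw_idx only when
PERF_SAMPLE_BRANCH_HW_INDEX is set) could be read back roughly as below.
This is a sketch under stated assumptions, not the perf tool's parser.
The names parse_branch_stack(), struct branch_stack_view and
BRANCH_HW_INDEX_BIT are hypothetical, and the buffer is assumed to be in
host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors bit 17 (PERF_SAMPLE_BRANCH_HW_INDEX) as defined in this patch. */
#define BRANCH_HW_INDEX_BIT (1ULL << 17)

/* Same three u64 fields as the { from, to, flags } layout in the ABI comment. */
struct branch_entry {
	uint64_t from, to, flags;
};

/* Hypothetical view onto one parsed PERF_SAMPLE_BRANCH_STACK payload. */
struct branch_stack_view {
	uint64_t nr;                  /* number of entries */
	const unsigned char *entries; /* nr * sizeof(struct branch_entry) bytes */
	uint64_t hw_idx;              /* UINT64_MAX (-1ULL) if absent or unknown */
};

/*
 * Parse the branch stack payload at buf. Returns the number of bytes
 * consumed, or 0 if the buffer is too short for the declared contents.
 */
static size_t parse_branch_stack(const unsigned char *buf, size_t len,
				 uint64_t branch_sample_type,
				 struct branch_stack_view *out)
{
	size_t off = 0;
	uint64_t nr;

	if (len < sizeof(nr))
		return 0;
	memcpy(&nr, buf, sizeof(nr));
	off += sizeof(nr);

	/* Reject nr values that would run past the end of the buffer. */
	if (nr > (len - off) / sizeof(struct branch_entry))
		return 0;
	out->nr = nr;
	out->entries = buf + off;
	off += (size_t)nr * sizeof(struct branch_entry);

	/* hw_idx is only present when the new sample type bit was requested. */
	out->hw_idx = UINT64_MAX;
	if (branch_sample_type & BRANCH_HW_INDEX_BIT) {
		if (len - off < sizeof(uint64_t))
			return 0;
		memcpy(&out->hw_idx, buf + off, sizeof(uint64_t));
		off += sizeof(uint64_t);
	}
	return off;
}
```

A parser that ignores attr.branch_sample_type would consume 8 bytes too
few when the bit is set and misread every field that follows, which is
the failure mode the commit message warns about.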


end of thread, other threads:[~2020-01-21 15:03 UTC | newest]

Thread overview: 9+ messages
2020-01-16 15:57 [RESEND PATCH V5 0/2] Stitch LBR call stack (kernel) kan.liang
2020-01-16 15:57 ` [RESEND PATCH V5 1/2] perf/core: Add new branch sample type for HW index of raw branch records kan.liang
2020-01-20  9:23   ` Peter Zijlstra
2020-01-20 16:50     ` Liang, Kan
2020-01-20 20:24       ` Peter Zijlstra
2020-01-20 20:47         ` Liang, Kan
2020-01-21  9:32   ` Stephane Eranian
2020-01-21 15:02     ` Liang, Kan
2020-01-16 15:57 ` [RESEND PATCH V5 2/2] perf/x86/intel: Output LBR TOS information kan.liang
