linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] BHRB fixes, improvements and cleanups
@ 2015-06-30  8:20 Anshuman Khandual
  2015-06-30  8:20 ` [PATCH 1/5] powerpc/perf: Drop the branch sample when 'from' cannot be fetched Anshuman Khandual
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Anshuman Khandual @ 2015-06-30  8:20 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, sukadev, dja, mpe

	The five generic patches from BHRB SW branch filter enablement
series have been separated and grouped together in this series. These
patches are required to be applied before using the upcoming V10 of the
BHRB SW branch filter patch series.

Anshuman Khandual (5):
  powerpc/perf: Drop the branch sample when 'from' cannot be fetched
  powerpc/perf: Change type of the bhrb_users variable
  powerpc/perf: Replace last usage of get_cpu_var with this_cpu_ptr
  powerpc/perf: Change name & type of 'pred' in power_pmu_bhrb_read
  powerpc/perf: Re organize PMU branch filter processing on POWER8

 arch/powerpc/perf/core-book3s.c | 29 +++++++++++++++--------------
 arch/powerpc/perf/power8-pmu.c  | 22 +++++++---------------
 2 files changed, 22 insertions(+), 29 deletions(-)

-- 
2.1.0

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/5] powerpc/perf: Drop the branch sample when 'from' cannot be fetched
  2015-06-30  8:20 [PATCH 0/5] BHRB fixes, improvements and cleanups Anshuman Khandual
@ 2015-06-30  8:20 ` Anshuman Khandual
  2015-07-27  4:19   ` [1/5] " Michael Ellerman
  2015-06-30  8:20 ` [PATCH 2/5] powerpc/perf: Change type of the bhrb_users variable Anshuman Khandual
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Anshuman Khandual @ 2015-06-30  8:20 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, sukadev, dja, mpe

BHRB (Branch History Rolling Buffer) is a rolling buffer. Hence we
might end up in a situation where we have read one target address
but when we try to read the next entry indicating the from address
of the target address, the buffer just overflows. In this case, the
captured from address will be zero which indicates the end of the
buffer.

	This patch drops the entire branch record which would have
otherwise confused the user space tools.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/perf/core-book3s.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index d90893b..b0c2d53 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -461,7 +461,6 @@ static void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw)
 			 *    In this case we need to read the instruction from
 			 *    memory to determine the target/to address.
 			 */
-
 			if (val & BHRB_TARGET) {
 				/* Target branches use two entries
 				 * (ie. computed gotos/XL form)
@@ -472,6 +471,8 @@ static void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw)
 
 				/* Get from address in next entry */
 				val = read_bhrb(r_index++);
+				if (!val)
+					break;
 				addr = val & BHRB_EA;
 				if (val & BHRB_TARGET) {
 					/* Shouldn't have two targets in a
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/5] powerpc/perf: Change type of the bhrb_users variable
  2015-06-30  8:20 [PATCH 0/5] BHRB fixes, improvements and cleanups Anshuman Khandual
  2015-06-30  8:20 ` [PATCH 1/5] powerpc/perf: Drop the branch sample when 'from' cannot be fetched Anshuman Khandual
@ 2015-06-30  8:20 ` Anshuman Khandual
  2015-08-03  1:35   ` [2/5] " Michael Ellerman
  2015-06-30  8:20 ` [PATCH 3/5] powerpc/perf: Replace last usage of get_cpu_var with this_cpu_ptr Anshuman Khandual
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Anshuman Khandual @ 2015-06-30  8:20 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, sukadev, dja, mpe

This patch just changes data type of bhrb_users variable from
int to unsigned int because it never contains a negative value.

Reported-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/perf/core-book3s.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index b0c2d53..f9ecd93 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -53,7 +53,7 @@ struct cpu_hw_events {
 
 	/* BHRB bits */
 	u64				bhrb_filter;	/* BHRB HW branch filter */
-	int				bhrb_users;
+	unsigned int			bhrb_users;
 	void				*bhrb_context;
 	struct	perf_branch_stack	bhrb_stack;
 	struct	perf_branch_entry	bhrb_entries[BHRB_MAX_ENTRIES];
@@ -369,8 +369,8 @@ static void power_pmu_bhrb_disable(struct perf_event *event)
 	if (!ppmu->bhrb_nr)
 		return;
 
+	WARN_ON_ONCE(!cpuhw->bhrb_users);
 	cpuhw->bhrb_users--;
-	WARN_ON_ONCE(cpuhw->bhrb_users < 0);
 	perf_sched_cb_dec(event->ctx->pmu);
 
 	if (!cpuhw->disabled && !cpuhw->bhrb_users) {
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/5] powerpc/perf: Replace last usage of get_cpu_var with this_cpu_ptr
  2015-06-30  8:20 [PATCH 0/5] BHRB fixes, improvements and cleanups Anshuman Khandual
  2015-06-30  8:20 ` [PATCH 1/5] powerpc/perf: Drop the branch sample when 'from' cannot be fetched Anshuman Khandual
  2015-06-30  8:20 ` [PATCH 2/5] powerpc/perf: Change type of the bhrb_users variable Anshuman Khandual
@ 2015-06-30  8:20 ` Anshuman Khandual
  2015-07-27  5:15   ` [3/5] " Michael Ellerman
  2015-06-30  8:20 ` [PATCH 4/5] powerpc/perf: Change name & type of 'pred' in power_pmu_bhrb_read Anshuman Khandual
  2015-06-30  8:20 ` [PATCH 5/5] powerpc/perf: Re organize PMU branch filter processing on POWER8 Anshuman Khandual
  4 siblings, 1 reply; 15+ messages in thread
From: Anshuman Khandual @ 2015-06-30  8:20 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, sukadev, dja, mpe

The commit 69111bac42f5ce ("powerpc: Replace __get_cpu_var uses")
replaced all usage of get_cpu_var with this_cpu_ptr inside core
perf event handling on powerpc. But it skipped one of them which
is being replaced with this patch.

Reported-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/perf/core-book3s.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index f9ecd93..57f2c78 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1840,20 +1840,17 @@ static int power_pmu_event_init(struct perf_event *event)
 	if (check_excludes(ctrs, cflags, n, 1))
 		return -EINVAL;
 
-	cpuhw = &get_cpu_var(cpu_hw_events);
+	cpuhw = this_cpu_ptr(&cpu_hw_events);
 	err = power_check_constraints(cpuhw, events, cflags, n + 1);
 
 	if (has_branch_stack(event)) {
 		cpuhw->bhrb_filter = ppmu->bhrb_filter_map(
 					event->attr.branch_sample_type);
 
-		if (cpuhw->bhrb_filter == -1) {
-			put_cpu_var(cpu_hw_events);
+		if (cpuhw->bhrb_filter == -1)
 			return -EOPNOTSUPP;
-		}
 	}
 
-	put_cpu_var(cpu_hw_events);
 	if (err)
 		return -EINVAL;
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/5] powerpc/perf: Change name & type of 'pred' in power_pmu_bhrb_read
  2015-06-30  8:20 [PATCH 0/5] BHRB fixes, improvements and cleanups Anshuman Khandual
                   ` (2 preceding siblings ...)
  2015-06-30  8:20 ` [PATCH 3/5] powerpc/perf: Replace last usage of get_cpu_var with this_cpu_ptr Anshuman Khandual
@ 2015-06-30  8:20 ` Anshuman Khandual
  2015-07-29  3:25   ` [4/5] " Michael Ellerman
  2015-06-30  8:20 ` [PATCH 5/5] powerpc/perf: Re organize PMU branch filter processing on POWER8 Anshuman Khandual
  4 siblings, 1 reply; 15+ messages in thread
From: Anshuman Khandual @ 2015-06-30  8:20 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, sukadev, dja, mpe

Branch record attributes 'mispred' and 'predicted' are single bit
fields as defined in the perf ABI. Hence the data type of the field
'pred' used during BHRB processing should be changed from integer
to bool. This patch also changes the name of the variable from 'pred'
to 'mispred' making the logical inversion process more meaningful
and readable.

Reported-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/perf/core-book3s.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 57f2c78..ddc0424 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -426,7 +426,8 @@ static void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw)
 {
 	u64 val;
 	u64 addr;
-	int r_index, u_index, pred;
+	int r_index, u_index;
+	bool mispred;
 
 	r_index = 0;
 	u_index = 0;
@@ -438,7 +439,7 @@ static void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw)
 			break;
 		else {
 			addr = val & BHRB_EA;
-			pred = val & BHRB_PREDICTION;
+			mispred = val & BHRB_PREDICTION;
 
 			if (!addr)
 				/* invalid entry */
@@ -466,8 +467,9 @@ static void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw)
 				 * (ie. computed gotos/XL form)
 				 */
 				cpuhw->bhrb_entries[u_index].to = addr;
-				cpuhw->bhrb_entries[u_index].mispred = pred;
-				cpuhw->bhrb_entries[u_index].predicted = ~pred;
+				cpuhw->bhrb_entries[u_index].mispred = mispred;
+				cpuhw->bhrb_entries[u_index].predicted =
+								~mispred;
 
 				/* Get from address in next entry */
 				val = read_bhrb(r_index++);
@@ -487,8 +489,9 @@ static void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw)
 				cpuhw->bhrb_entries[u_index].from = addr;
 				cpuhw->bhrb_entries[u_index].to =
 					power_pmu_bhrb_to(addr);
-				cpuhw->bhrb_entries[u_index].mispred = pred;
-				cpuhw->bhrb_entries[u_index].predicted = ~pred;
+				cpuhw->bhrb_entries[u_index].mispred = mispred;
+				cpuhw->bhrb_entries[u_index].predicted =
+								~mispred;
 			}
 			u_index++;
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/5] powerpc/perf: Re organize PMU branch filter processing on POWER8
  2015-06-30  8:20 [PATCH 0/5] BHRB fixes, improvements and cleanups Anshuman Khandual
                   ` (3 preceding siblings ...)
  2015-06-30  8:20 ` [PATCH 4/5] powerpc/perf: Change name & type of 'pred' in power_pmu_bhrb_read Anshuman Khandual
@ 2015-06-30  8:20 ` Anshuman Khandual
  4 siblings, 0 replies; 15+ messages in thread
From: Anshuman Khandual @ 2015-06-30  8:20 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, sukadev, dja, mpe

This patch does some code re-arrangements to make it clear that kernel
ignores any separate privilege level branch filter request and does not
support any combinations of HW PMU branch filters.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/perf/power8-pmu.c | 22 +++++++---------------
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index 396351d..a6c6a2c 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -656,8 +656,6 @@ static int power8_generic_events[] = {
 
 static u64 power8_bhrb_filter_map(u64 branch_sample_type)
 {
-	u64 pmu_bhrb_filter = 0;
-
 	/* BHRB and regular PMU events share the same privilege state
 	 * filter configuration. BHRB is always recorded along with a
 	 * regular PMU event. As the privilege state filter is handled
@@ -665,21 +663,15 @@ static u64 power8_bhrb_filter_map(u64 branch_sample_type)
 	 * PMU event, we ignore any separate BHRB specific request.
 	 */
 
-	/* No branch filter requested */
-	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY)
-		return pmu_bhrb_filter;
-
-	/* Invalid branch filter options - HW does not support */
-	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
-		return -1;
+	/* Ignore user, kernel, hv bits */
+	branch_sample_type &= ~PERF_SAMPLE_BRANCH_PLM_ALL;
 
-	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
-		return -1;
+	/* No branch filter requested */
+	if (branch_sample_type == PERF_SAMPLE_BRANCH_ANY)
+		return 0;
 
-	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
-		pmu_bhrb_filter |= POWER8_MMCRA_IFM1;
-		return pmu_bhrb_filter;
-	}
+	if (branch_sample_type == PERF_SAMPLE_BRANCH_ANY_CALL)
+		return POWER8_MMCRA_IFM1;
 
 	/* Every thing else is unsupported */
 	return -1;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [1/5] powerpc/perf: Drop the branch sample when 'from' cannot be fetched
  2015-06-30  8:20 ` [PATCH 1/5] powerpc/perf: Drop the branch sample when 'from' cannot be fetched Anshuman Khandual
@ 2015-07-27  4:19   ` Michael Ellerman
  2015-07-28  3:08     ` Anshuman Khandual
  0 siblings, 1 reply; 15+ messages in thread
From: Michael Ellerman @ 2015-07-27  4:19 UTC (permalink / raw)
  To: Anshuman Khandual, linuxppc-dev; +Cc: mikey, sukadev, dja

On Tue, 2015-30-06 at 08:20:27 UTC, Anshuman Khandual wrote:
> BHRB (Branch History Rolling Buffer) is a rolling buffer. Hence we
> might end up in a situation where we have read one target address
> but when we try to read the next entry indicating the from address
> of the target address, the buffer just overflows. In this case, the
> captured from address will be zero which indicates the end of the
> buffer.

Right. But with SMT8 the size of the buffer is very small, so we will actually
hit this case somewhat often. When we originally wrote this we decided it was
better to get some information, ie. the from address, than no information at
all.

> 	This patch drops the entire branch record which would have
> otherwise confused the user space tools.

Does it confuse the tools? Can you show me before/after output from perf?

I'm not opposed to changing this but we need to be 100% sure it's the best
option.

cheers

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [3/5] powerpc/perf: Replace last usage of get_cpu_var with this_cpu_ptr
  2015-06-30  8:20 ` [PATCH 3/5] powerpc/perf: Replace last usage of get_cpu_var with this_cpu_ptr Anshuman Khandual
@ 2015-07-27  5:15   ` Michael Ellerman
  2015-07-28  3:37     ` Anshuman Khandual
  0 siblings, 1 reply; 15+ messages in thread
From: Michael Ellerman @ 2015-07-27  5:15 UTC (permalink / raw)
  To: Anshuman Khandual, linuxppc-dev; +Cc: mikey, sukadev, dja

On Tue, 2015-30-06 at 08:20:29 UTC, Anshuman Khandual wrote:
> The commit 69111bac42f5ce ("powerpc: Replace __get_cpu_var uses")
> replaced all usage of get_cpu_var with this_cpu_ptr inside core
> perf event handling on powerpc. But it skipped one of them which
> is being replaced with this patch.

No it replaced all uses of __get_cpu_var(), not get_cpu_var(). The difference
is important.

get_cpu_var() disables preemption for you, so it's only safe to switch to
this_cpu_ptr() if preemption is already disabled. Is it?

cheers

> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index f9ecd93..57f2c78 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -1840,20 +1840,17 @@ static int power_pmu_event_init(struct perf_event *event)
>  	if (check_excludes(ctrs, cflags, n, 1))
>  		return -EINVAL;
>  
> -	cpuhw = &get_cpu_var(cpu_hw_events);
> +	cpuhw = this_cpu_ptr(&cpu_hw_events);
>  	err = power_check_constraints(cpuhw, events, cflags, n + 1);
>  
>  	if (has_branch_stack(event)) {
>  		cpuhw->bhrb_filter = ppmu->bhrb_filter_map(
>  					event->attr.branch_sample_type);
>  
> -		if (cpuhw->bhrb_filter == -1) {
> -			put_cpu_var(cpu_hw_events);
> +		if (cpuhw->bhrb_filter == -1)
>  			return -EOPNOTSUPP;
> -		}
>  	}
>  
> -	put_cpu_var(cpu_hw_events);
>  	if (err)
>  		return -EINVAL;

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [1/5] powerpc/perf: Drop the branch sample when 'from' cannot be fetched
  2015-07-27  4:19   ` [1/5] " Michael Ellerman
@ 2015-07-28  3:08     ` Anshuman Khandual
  2015-09-30  9:03       ` Anshuman Khandual
  0 siblings, 1 reply; 15+ messages in thread
From: Anshuman Khandual @ 2015-07-28  3:08 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: mikey, sukadev, dja

On 07/27/2015 09:49 AM, Michael Ellerman wrote:
> On Tue, 2015-30-06 at 08:20:27 UTC, Anshuman Khandual wrote:
>> BHRB (Branch History Rolling Buffer) is a rolling buffer. Hence we
>> might end up in a situation where we have read one target address
>> but when we try to read the next entry indicating the from address
>> of the target address, the buffer just overflows. In this case, the
>> captured from address will be zero which indicates the end of the
>> buffer.
> 
> Right. But with SMT8 the size of the buffer is very small, so we will actually
> hit this case somewhat often. When we originally wrote this we decided it was
> better to get some information, ie. the from address, than no information at
> all.

You are right. But practically as of now we are not using this kind of
(from, 0) branch entries any where as a special case. More over for
certain kind of workloads which has a small code and a few branches,
the chances of getting this kind of branch (from, 0) increases a lot
making them probably one of the highest percentage entries in the final
perf report. Now with this change of code, the workload session might
have overall less number of branch entries, but in my opinion represents
more accurate branch profile of the given workload in percentage wise.

> 
>> 	This patch drops the entire branch record which would have
>> otherwise confused the user space tools.
> 
> Does it confuse the tools? Can you show me before/after output from perf?

The word 'confuse' might be little misleading. But the point as
explained above that the relative branch percentage profile of
certain workloads might be distorted and that I believe is true.
Also branch entries like "from ----> 0" in the perf report might
be confusing to users who dont expect to see this kind of entries
in the final perf report and will never get into "perf report -D"
to figure out what really happened.

> 
> I'm not opposed to changing this but we need to be 100% sure it's the best
> option.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [3/5] powerpc/perf: Replace last usage of get_cpu_var with this_cpu_ptr
  2015-07-27  5:15   ` [3/5] " Michael Ellerman
@ 2015-07-28  3:37     ` Anshuman Khandual
  0 siblings, 0 replies; 15+ messages in thread
From: Anshuman Khandual @ 2015-07-28  3:37 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: mikey, sukadev, dja

On 07/27/2015 10:45 AM, Michael Ellerman wrote:
> On Tue, 2015-30-06 at 08:20:29 UTC, Anshuman Khandual wrote:
>> > The commit 69111bac42f5ce ("powerpc: Replace __get_cpu_var uses")
>> > replaced all usage of get_cpu_var with this_cpu_ptr inside core
>> > perf event handling on powerpc. But it skipped one of them which
>> > is being replaced with this patch.
> No it replaced all uses of __get_cpu_var(), not get_cpu_var(). The difference
> is important.

Hmm, I see. Was not aware about it. Daniel suggested on this and I
thought it made sense. Hence proposed the change.

> 
> get_cpu_var() disables preemption for you, so it's only safe to switch to
> this_cpu_ptr() if preemption is already disabled. Is it?

We dont disable preemption inside power_pmu_event_init neither inside
perf_try_init_event where it gets called from, I guess the answer is NO.
Will drop this patch next time around.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [4/5] powerpc/perf: Change name & type of 'pred' in power_pmu_bhrb_read
  2015-06-30  8:20 ` [PATCH 4/5] powerpc/perf: Change name & type of 'pred' in power_pmu_bhrb_read Anshuman Khandual
@ 2015-07-29  3:25   ` Michael Ellerman
  2015-07-29  8:13     ` Anshuman Khandual
  0 siblings, 1 reply; 15+ messages in thread
From: Michael Ellerman @ 2015-07-29  3:25 UTC (permalink / raw)
  To: Anshuman Khandual, linuxppc-dev; +Cc: mikey, sukadev, dja

On Tue, 2015-30-06 at 08:20:30 UTC, Anshuman Khandual wrote:
> Branch record attributes 'mispred' and 'predicted' are single bit
> fields as defined in the perf ABI. Hence the data type of the field
> 'pred' used during BHRB processing should be changed from integer
> to bool. This patch also changes the name of the variable from 'pred'
> to 'mispred' making the logical inversion process more meaningful
> and readable.

This whole function is a mess.

There's no good reason why we're doing the assignment to pred/mispred in two
places to begin with, so if that was eliminated we wouldn't need a local for
mispred to begin with.

Then there's the type juggling, all of which probably works but is fishy and
horrible.

You take a u64, bitwise and it with a mask, assign that to a boolean, then take
the boolean, *bitwise* negate that and assign the result to a single bit
bitfield.

> +	bool mispred;
> +			mispred = val & BHRB_PREDICTION;
> +				cpuhw->bhrb_entries[u_index].predicted = ~mispred;


cheers

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [4/5] powerpc/perf: Change name & type of 'pred' in power_pmu_bhrb_read
  2015-07-29  3:25   ` [4/5] " Michael Ellerman
@ 2015-07-29  8:13     ` Anshuman Khandual
  0 siblings, 0 replies; 15+ messages in thread
From: Anshuman Khandual @ 2015-07-29  8:13 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: mikey, sukadev, dja

On 07/29/2015 08:55 AM, Michael Ellerman wrote:
> On Tue, 2015-30-06 at 08:20:30 UTC, Anshuman Khandual wrote:
>> > Branch record attributes 'mispred' and 'predicted' are single bit
>> > fields as defined in the perf ABI. Hence the data type of the field
>> > 'pred' used during BHRB processing should be changed from integer
>> > to bool. This patch also changes the name of the variable from 'pred'
>> > to 'mispred' making the logical inversion process more meaningful
>> > and readable.
> This whole function is a mess.
> 
> There's no good reason why we're doing the assignment to pred/mispred in two
> places to begin with, so if that was eliminated we wouldn't need a local for
> mispred to begin with.

Not sure whether I got this right. We are assigning mispred once with
the value (val & BHRB_PREDICTION) and then assigning mispred and it's
inversion to two different fields of the branch entry as required.

> 
> Then there's the type juggling, all of which probably works but is fishy and
> horrible.

With this patch and one more (2nd patch of the BHRB SW filter series)
patch, we are trying to make it better.

> 
> You take a u64, bitwise and it with a mask, assign that to a boolean, then take

So that any residual positive value after the "AND" operation will
become logical TRUE for the boolean. We dont use any shifting here
as BHRB_PREDICTION checks for the right most (least significant) bit
in the sequence.

> the boolean, *bitwise* negate that and assign the result to a single bit
> bitfield.

This is getting fixed with a subsequent patch (2nd patch of the BHRB
SW filter series) in a new function called insert_branch.

+static inline void insert_branch(struct cpu_hw_events *cpuhw,
+                       int index, u64 from, u64 to, bool mispred)
+{
+       cpuhw->bhrb_entries[index].from = from;
+       cpuhw->bhrb_entries[index].to = to;
+       cpuhw->bhrb_entries[index].mispred = mispred;
+       cpuhw->bhrb_entries[index].predicted = !mispred;
+}

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [2/5] powerpc/perf: Change type of the bhrb_users variable
  2015-06-30  8:20 ` [PATCH 2/5] powerpc/perf: Change type of the bhrb_users variable Anshuman Khandual
@ 2015-08-03  1:35   ` Michael Ellerman
  0 siblings, 0 replies; 15+ messages in thread
From: Michael Ellerman @ 2015-08-03  1:35 UTC (permalink / raw)
  To: Anshuman Khandual, linuxppc-dev; +Cc: mikey, sukadev, dja

On Tue, 2015-30-06 at 08:20:28 UTC, Anshuman Khandual wrote:
> This patch just changes data type of bhrb_users variable from
> int to unsigned int because it never contains a negative value.
> 
> Reported-by: Daniel Axtens <dja@axtens.net>
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/f0322f7f1e2165fbf835

cheers

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [1/5] powerpc/perf: Drop the branch sample when 'from' cannot be fetched
  2015-07-28  3:08     ` Anshuman Khandual
@ 2015-09-30  9:03       ` Anshuman Khandual
  2015-09-30 10:46         ` Michael Ellerman
  0 siblings, 1 reply; 15+ messages in thread
From: Anshuman Khandual @ 2015-09-30  9:03 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: mikey, sukadev, dja

On 07/28/2015 08:38 AM, Anshuman Khandual wrote:
> On 07/27/2015 09:49 AM, Michael Ellerman wrote:
>> > On Tue, 2015-30-06 at 08:20:27 UTC, Anshuman Khandual wrote:
>>> >> BHRB (Branch History Rolling Buffer) is a rolling buffer. Hence we
>>> >> might end up in a situation where we have read one target address
>>> >> but when we try to read the next entry indicating the from address
>>> >> of the target address, the buffer just overflows. In this case, the
>>> >> captured from address will be zero which indicates the end of the
>>> >> buffer.
>> > 
>> > Right. But with SMT8 the size of the buffer is very small, so we will actually
>> > hit this case somewhat often. When we originally wrote this we decided it was
>> > better to get some information, ie. the from address, than no information at
>> > all.
> You are right. But practically as of now we are not using this kind of
> (from, 0) branch entries any where as a special case. More over for
> certain kind of workloads which has a small code and a few branches,
> the chances of getting this kind of branch (from, 0) increases a lot
> making them probably one of the highest percentage entries in the final
> perf report. Now with this change of code, the workload session might
> have overall less number of branch entries, but in my opinion represents
> more accurate branch profile of the given workload in percentage wise.
> 
>> > 
>>> >> 	This patch drops the entire branch record which would have
>>> >> otherwise confused the user space tools.
>> > 
>> > Does it confuse the tools? Can you show me before/after output from perf?
> The word 'confuse' might be little misleading. But the point as
> explained above that the relative branch percentage profile of
> certain workloads might be distorted and that I believe is true.
> Also branch entries like "from ----> 0" in the perf report might
> be confusing to users who dont expect to see this kind of entries
> in the final perf report and will never get into "perf report -D"
> to figure out what really happened.

Hey Michael,

As I had explained earlier, is not it a good idea to drop these
kind of branch records from the final output ? I will request
consideration of this patch along with others in the series.

I have dropped the following patch as you had pointed out.

[3/5] powerpc/perf: Replace last usage of get_cpu_var with this_cpu_ptr

Also did not receive any comments or thoughts on the V10 of the
BHRB SW branch filter patch series posted couple of months back.
Does it look good ?

http://comments.gmane.org/gmane.linux.ports.ppc.embedded/83206

After dropping the above patch and excluding the one which
had already merged mainline, rebased the entire series and
it still works fine on LE and BE kernel as of today. I will
be sending them soon.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [1/5] powerpc/perf: Drop the branch sample when 'from' cannot be fetched
  2015-09-30  9:03       ` Anshuman Khandual
@ 2015-09-30 10:46         ` Michael Ellerman
  0 siblings, 0 replies; 15+ messages in thread
From: Michael Ellerman @ 2015-09-30 10:46 UTC (permalink / raw)
  To: Anshuman Khandual; +Cc: linuxppc-dev, mikey, sukadev, dja

On Wed, 2015-09-30 at 14:33 +0530, Anshuman Khandual wrote:
> On 07/28/2015 08:38 AM, Anshuman Khandual wrote:
> > On 07/27/2015 09:49 AM, Michael Ellerman wrote:
> >> > On Tue, 2015-30-06 at 08:20:27 UTC, Anshuman Khandual wrote:
> >>> >> BHRB (Branch History Rolling Buffer) is a rolling buffer. Hence we
> >>> >> might end up in a situation where we have read one target address
> >>> >> but when we try to read the next entry indicating the from address
> >>> >> of the target address, the buffer just overflows. In this case, the
> >>> >> captured from address will be zero which indicates the end of the
> >>> >> buffer.
> >> > 
> >> > Right. But with SMT8 the size of the buffer is very small, so we will actually
> >> > hit this case somewhat often. When we originally wrote this we decided it was
> >> > better to get some information, ie. the from address, than no information at
> >> > all.
> > You are right. But practically as of now we are not using this kind of
> > (from, 0) branch entries any where as a special case. More over for
> > certain kind of workloads which has a small code and a few branches,
> > the chances of getting this kind of branch (from, 0) increases a lot
> > making them probably one of the highest percentage entries in the final
> > perf report. Now with this change of code, the workload session might
> > have overall less number of branch entries, but in my opinion represents
> > more accurate branch profile of the given workload in percentage wise.
> > 
> >> > 
> >>> >> 	This patch drops the entire branch record which would have
> >>> >> otherwise confused the user space tools.
> >> > 
> >> > Does it confuse the tools? Can you show me before/after output from perf?
> > The word 'confuse' might be little misleading. But the point as
> > explained above that the relative branch percentage profile of
> > certain workloads might be distorted and that I believe is true.
> > Also branch entries like "from ----> 0" in the perf report might
> > be confusing to users who dont expect to see this kind of entries
> > in the final perf report and will never get into "perf report -D"
> > to figure out what really happened.
> 
> Hey Michael,
> 
> As I had explained earlier, is not it a good idea to drop these
> kind of branch records from the final output ? I will request
> consideration of this patch along with others in the series.

I think it's too late to change it. ie. we've shipped several kernel versions
that did emit them, so we just have to stick with that. If it was obviously a
bug fix we could change it, but it's not obviously correct either way.

So please drop that patch and repost the series.

cheers

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2015-09-30 10:46 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-30  8:20 [PATCH 0/5] BHRB fixes, improvements and cleanups Anshuman Khandual
2015-06-30  8:20 ` [PATCH 1/5] powerpc/perf: Drop the branch sample when 'from' cannot be fetched Anshuman Khandual
2015-07-27  4:19   ` [1/5] " Michael Ellerman
2015-07-28  3:08     ` Anshuman Khandual
2015-09-30  9:03       ` Anshuman Khandual
2015-09-30 10:46         ` Michael Ellerman
2015-06-30  8:20 ` [PATCH 2/5] powerpc/perf: Change type of the bhrb_users variable Anshuman Khandual
2015-08-03  1:35   ` [2/5] " Michael Ellerman
2015-06-30  8:20 ` [PATCH 3/5] powerpc/perf: Replace last usage of get_cpu_var with this_cpu_ptr Anshuman Khandual
2015-07-27  5:15   ` [3/5] " Michael Ellerman
2015-07-28  3:37     ` Anshuman Khandual
2015-06-30  8:20 ` [PATCH 4/5] powerpc/perf: Change name & type of 'pred' in power_pmu_bhrb_read Anshuman Khandual
2015-07-29  3:25   ` [4/5] " Michael Ellerman
2015-07-29  8:13     ` Anshuman Khandual
2015-06-30  8:20 ` [PATCH 5/5] powerpc/perf: Re organize PMU branch filter processing on POWER8 Anshuman Khandual

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).