linux-kernel.vger.kernel.org archive mirror
* [PATCH] arm: perf: Prevent wraparound during overflow
@ 2014-11-19 15:52 Daniel Thompson
  2014-11-19 18:11 ` Will Deacon
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Daniel Thompson @ 2014-11-19 15:52 UTC (permalink / raw)
  To: Will Deacon, Russell King
  Cc: Daniel Thompson, linux-arm-kernel, linux-kernel, Peter Zijlstra,
	Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, patches,
	linaro-kernel, John Stultz, Sumit Semwal

If the overflow threshold for a counter is set above or near the
0xffffffff boundary then the kernel may lose track of the overflow
causing only events that occur *after* the overflow to be recorded.
Specifically the problem occurs when the value of the performance counter
overtakes its original programmed value due to wrap around.

Typical solutions to this problem are either to avoid programming in
values likely to be overtaken or to treat the overflow bit as the 33rd
bit of the counter.

It's somewhat fiddly to refactor the code to correctly handle the 33rd bit
during irqsave sections (context switches for example) so instead we take
the simpler approach of avoiding values likely to be overtaken.

We set the limit to half of max_period because this matches the limit
imposed in __hw_perf_event_init(). This causes a doubling of the interrupt
rate for large threshold values; however, even with a very fast counter
ticking at 4GHz the interrupt rate would only be ~1Hz.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---

Notes:
    There is similar code in the arm64 tree which retains the assumptions of
    the original arm code regarding 32-bit wide performance counters. If
    this patch doesn't get beaten up during review I'll also share a similar
    patch for arm64.
    

 arch/arm/kernel/perf_event.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 266cba46db3e..b50a770f8c99 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -115,8 +115,14 @@ int armpmu_event_set_period(struct perf_event *event)
 		ret = 1;
 	}

-	if (left > (s64)armpmu->max_period)
-		left = armpmu->max_period;
+	/*
+	 * Limit the maximum period to prevent the counter value
+	 * from overtaking the one we are about to program. In
+	 * effect we are reducing max_period to account for
+	 * interrupt latency (and we are being very conservative).
+	 */
+	if (left > (s64)(armpmu->max_period >> 1))
+		left = armpmu->max_period >> 1;

 	local64_set(&hwc->prev_count, (u64)-left);

--
1.9.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm: perf: Prevent wraparound during overflow
  2014-11-19 15:52 [PATCH] arm: perf: Prevent wraparound during overflow Daniel Thompson
@ 2014-11-19 18:11 ` Will Deacon
  2014-11-20 12:14   ` Daniel Thompson
  2014-11-21 16:24 ` [PATCH v2 0/2] arm+arm64: " Daniel Thompson
  2014-12-22  9:39 ` [PATCH 3.19-rc1 v3] arm: " Daniel Thompson
  2 siblings, 1 reply; 13+ messages in thread
From: Will Deacon @ 2014-11-19 18:11 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Russell King, linux-arm-kernel, linux-kernel, Peter Zijlstra,
	Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, patches,
	linaro-kernel, John Stultz, Sumit Semwal

On Wed, Nov 19, 2014 at 03:52:26PM +0000, Daniel Thompson wrote:
> If the overflow threshold for a counter is set above or near the
> 0xffffffff boundary then the kernel may lose track of the overflow
> causing only events that occur *after* the overflow to be recorded.
> Specifically the problem occurs when the value of the performance counter
> overtakes its original programmed value due to wrap around.
> 
> Typical solutions to this problem are either to avoid programming in
> values likely to be overtaken or to treat the overflow bit as the 33rd
> bit of the counter.
> 
> It's somewhat fiddly to refactor the code to correctly handle the 33rd bit
> during irqsave sections (context switches for example) so instead we take
> the simpler approach of avoiding values likely to be overtaken.
> 
> We set the limit to half of max_period because this matches the limit
> imposed in __hw_perf_event_init(). This causes a doubling of the interrupt
> rate for large threshold values; however, even with a very fast counter
> ticking at 4GHz the interrupt rate would only be ~1Hz.
> 
> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
> ---
> 
> Notes:
>     There is similar code in the arm64 tree which retains the assumptions of
>     the original arm code regarding 32-bit wide performance counters. If
>     this patch doesn't get beaten up during review I'll also share a similar
>     patch for arm64.
>     
> 
>  arch/arm/kernel/perf_event.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> index 266cba46db3e..b50a770f8c99 100644
> --- a/arch/arm/kernel/perf_event.c
> +++ b/arch/arm/kernel/perf_event.c
> @@ -115,8 +115,14 @@ int armpmu_event_set_period(struct perf_event *event)
>  		ret = 1;
>  	}
> 
> -	if (left > (s64)armpmu->max_period)
> -		left = armpmu->max_period;
> +	/*
> +	 * Limit the maximum period to prevent the counter value
> +	 * from overtaking the one we are about to program. In
> +	 * effect we are reducing max_period to account for
> +	 * interrupt latency (and we are being very conservative).
> +	 */
> +	if (left > (s64)(armpmu->max_period >> 1))
> +		left = armpmu->max_period >> 1;

The s64 cast looks off here, can we just drop it entirely?

Will

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] arm: perf: Prevent wraparound during overflow
  2014-11-19 18:11 ` Will Deacon
@ 2014-11-20 12:14   ` Daniel Thompson
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Thompson @ 2014-11-20 12:14 UTC (permalink / raw)
  To: Will Deacon
  Cc: Russell King, linux-arm-kernel, linux-kernel, Peter Zijlstra,
	Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, patches,
	linaro-kernel, John Stultz, Sumit Semwal

On 19/11/14 18:11, Will Deacon wrote:
> On Wed, Nov 19, 2014 at 03:52:26PM +0000, Daniel Thompson wrote:
>> If the overflow threshold for a counter is set above or near the
>> 0xffffffff boundary then the kernel may lose track of the overflow
>> causing only events that occur *after* the overflow to be recorded.
>> Specifically the problem occurs when the value of the performance counter
>> overtakes its original programmed value due to wrap around.
>>
>> Typical solutions to this problem are either to avoid programming in
>> values likely to be overtaken or to treat the overflow bit as the 33rd
>> bit of the counter.
>>
>> It's somewhat fiddly to refactor the code to correctly handle the 33rd bit
>> during irqsave sections (context switches for example) so instead we take
>> the simpler approach of avoiding values likely to be overtaken.
>>
>> We set the limit to half of max_period because this matches the limit
>> imposed in __hw_perf_event_init(). This causes a doubling of the interrupt
>> rate for large threshold values; however, even with a very fast counter
>> ticking at 4GHz the interrupt rate would only be ~1Hz.
>>
>> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
>> ---
>>
>> Notes:
>>     There is similar code in the arm64 tree which retains the assumptions of
>>     the original arm code regarding 32-bit wide performance counters. If
>>     this patch doesn't get beaten up during review I'll also share a similar
>>     patch for arm64.
>>     
>>
>>  arch/arm/kernel/perf_event.c | 10 ++++++++--
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
>> index 266cba46db3e..b50a770f8c99 100644
>> --- a/arch/arm/kernel/perf_event.c
>> +++ b/arch/arm/kernel/perf_event.c
>> @@ -115,8 +115,14 @@ int armpmu_event_set_period(struct perf_event *event)
>>  		ret = 1;
>>  	}
>>
>> -	if (left > (s64)armpmu->max_period)
>> -		left = armpmu->max_period;
>> +	/*
>> +	 * Limit the maximum period to prevent the counter value
>> +	 * from overtaking the one we are about to program. In
>> +	 * effect we are reducing max_period to account for
>> +	 * interrupt latency (and we are being very conservative).
>> +	 */
>> +	if (left > (s64)(armpmu->max_period >> 1))
>> +		left = armpmu->max_period >> 1;
> 
> The s64 cast looks off here, can we just drop it entirely?

Yes.

left will always be positive at this point in the code and therefore can
be safely promoted within this expression (and generated no extra
warnings for me).

I'll change this (although I might just keep the redundant parentheses
because > and >> are composed of the same characters, making the
expression hard to read without them).





^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 0/2] arm+arm64: perf: Prevent wraparound during overflow
  2014-11-19 15:52 [PATCH] arm: perf: Prevent wraparound during overflow Daniel Thompson
  2014-11-19 18:11 ` Will Deacon
@ 2014-11-21 16:24 ` Daniel Thompson
  2014-11-21 16:24   ` [PATCH v2 1/2] arm: " Daniel Thompson
  2014-11-21 16:24   ` [PATCH v2 2/2] arm64: " Daniel Thompson
  2014-12-22  9:39 ` [PATCH 3.19-rc1 v3] arm: " Daniel Thompson
  2 siblings, 2 replies; 13+ messages in thread
From: Daniel Thompson @ 2014-11-21 16:24 UTC (permalink / raw)
  To: Russell King, Will Deacon, Catalin Marinas
  Cc: Daniel Thompson, linux-arm-kernel, linux-kernel, Peter Zijlstra,
	Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, patches,
	linaro-kernel, John Stultz, Sumit Semwal

This patchset fixes problems on arm and arm64 when the PMU counters wrap
around and become larger than the value originally programmed into them.

The problem was observed and fixed on arm but the perf code is,
rather to my surprise, sufficiently similar on arm64 that the fix still
makes sense there too.

v2:

* Remove the redundant cast to s64 (Will Deacon).


Daniel Thompson (2):
  arm: perf: Prevent wraparound during overflow
  arm64: perf: Prevent wraparound during overflow

 arch/arm/kernel/perf_event.c   | 10 ++++++++--
 arch/arm64/kernel/perf_event.c | 10 ++++++++--
 2 files changed, 16 insertions(+), 4 deletions(-)

--
1.9.3


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/2] arm: perf: Prevent wraparound during overflow
  2014-11-21 16:24 ` [PATCH v2 0/2] arm+arm64: " Daniel Thompson
@ 2014-11-21 16:24   ` Daniel Thompson
  2014-12-04 10:26     ` Will Deacon
  2015-01-05 14:57     ` Peter Zijlstra
  2014-11-21 16:24   ` [PATCH v2 2/2] arm64: " Daniel Thompson
  1 sibling, 2 replies; 13+ messages in thread
From: Daniel Thompson @ 2014-11-21 16:24 UTC (permalink / raw)
  To: Russell King, Will Deacon, Catalin Marinas
  Cc: Daniel Thompson, linux-arm-kernel, linux-kernel, Peter Zijlstra,
	Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, patches,
	linaro-kernel, John Stultz, Sumit Semwal

If the overflow threshold for a counter is set above or near the
0xffffffff boundary then the kernel may lose track of the overflow
causing only events that occur *after* the overflow to be recorded.
Specifically the problem occurs when the value of the performance counter
overtakes its original programmed value due to wrap around.

Typical solutions to this problem are either to avoid programming in
values likely to be overtaken or to treat the overflow bit as the 33rd
bit of the counter.

It's somewhat fiddly to refactor the code to correctly handle the 33rd bit
during irqsave sections (context switches for example) so instead we take
the simpler approach of avoiding values likely to be overtaken.

We set the limit to half of max_period because this matches the limit
imposed in __hw_perf_event_init(). This causes a doubling of the interrupt
rate for large threshold values; however, even with a very fast counter
ticking at 4GHz the interrupt rate would only be ~1Hz.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---
 arch/arm/kernel/perf_event.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 266cba46db3e..ab68833c1e31 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -115,8 +115,14 @@ int armpmu_event_set_period(struct perf_event *event)
 		ret = 1;
 	}
 
-	if (left > (s64)armpmu->max_period)
-		left = armpmu->max_period;
+	/*
+	 * Limit the maximum period to prevent the counter value
+	 * from overtaking the one we are about to program. In
+	 * effect we are reducing max_period to account for
+	 * interrupt latency (and we are being very conservative).
+	 */
+	if (left > (armpmu->max_period >> 1))
+		left = armpmu->max_period >> 1;
 
 	local64_set(&hwc->prev_count, (u64)-left);
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 2/2] arm64: perf: Prevent wraparound during overflow
  2014-11-21 16:24 ` [PATCH v2 0/2] arm+arm64: " Daniel Thompson
  2014-11-21 16:24   ` [PATCH v2 1/2] arm: " Daniel Thompson
@ 2014-11-21 16:24   ` Daniel Thompson
  2014-12-04 10:27     ` Will Deacon
  1 sibling, 1 reply; 13+ messages in thread
From: Daniel Thompson @ 2014-11-21 16:24 UTC (permalink / raw)
  To: Russell King, Will Deacon, Catalin Marinas
  Cc: Daniel Thompson, linux-arm-kernel, linux-kernel, Peter Zijlstra,
	Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, patches,
	linaro-kernel, John Stultz, Sumit Semwal

If the overflow threshold for a counter is set above or near the
0xffffffff boundary then the kernel may lose track of the overflow
causing only events that occur *after* the overflow to be recorded.
Specifically the problem occurs when the value of the performance counter
overtakes its original programmed value due to wrap around.

Typical solutions to this problem are either to avoid programming in
values likely to be overtaken or to treat the overflow bit as the 33rd
bit of the counter.

It's somewhat fiddly to refactor the code to correctly handle the 33rd bit
during irqsave sections (context switches for example) so instead we take
the simpler approach of avoiding values likely to be overtaken.

We set the limit to half of max_period because this matches the limit
imposed in __hw_perf_event_init(). This causes a doubling of the interrupt
rate for large threshold values; however, even with a very fast counter
ticking at 4GHz the interrupt rate would only be ~1Hz.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---
 arch/arm64/kernel/perf_event.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index aa29ecb4f800..25a5308744b1 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -169,8 +169,14 @@ armpmu_event_set_period(struct perf_event *event,
 		ret = 1;
 	}
 
-	if (left > (s64)armpmu->max_period)
-		left = armpmu->max_period;
+	/*
+	 * Limit the maximum period to prevent the counter value
+	 * from overtaking the one we are about to program. In
+	 * effect we are reducing max_period to account for
+	 * interrupt latency (and we are being very conservative).
+	 */
+	if (left > (armpmu->max_period >> 1))
+		left = armpmu->max_period >> 1;
 
 	local64_set(&hwc->prev_count, (u64)-left);
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/2] arm: perf: Prevent wraparound during overflow
  2014-11-21 16:24   ` [PATCH v2 1/2] arm: " Daniel Thompson
@ 2014-12-04 10:26     ` Will Deacon
  2014-12-04 13:58       ` Daniel Thompson
  2015-01-05 14:57     ` Peter Zijlstra
  1 sibling, 1 reply; 13+ messages in thread
From: Will Deacon @ 2014-12-04 10:26 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Russell King, Catalin Marinas, linux-arm-kernel, linux-kernel,
	Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, patches, linaro-kernel, John Stultz,
	Sumit Semwal

On Fri, Nov 21, 2014 at 04:24:26PM +0000, Daniel Thompson wrote:
> If the overflow threshold for a counter is set above or near the
> 0xffffffff boundary then the kernel may lose track of the overflow
> causing only events that occur *after* the overflow to be recorded.
> Specifically the problem occurs when the value of the performance counter
> overtakes its original programmed value due to wrap around.
> 
> Typical solutions to this problem are either to avoid programming in
> values likely to be overtaken or to treat the overflow bit as the 33rd
> bit of the counter.
> 
> It's somewhat fiddly to refactor the code to correctly handle the 33rd bit
> during irqsave sections (context switches for example) so instead we take
> the simpler approach of avoiding values likely to be overtaken.
> 
> We set the limit to half of max_period because this matches the limit
> imposed in __hw_perf_event_init(). This causes a doubling of the interrupt
> rate for large threshold values; however, even with a very fast counter
> ticking at 4GHz the interrupt rate would only be ~1Hz.
> 
> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>

  Acked-by: Will Deacon <will.deacon@arm.com>

You'll probably need to refresh this at -rc1 as there are a bunch of
changes queued for this file already. Then you can stick it into rmk's
patch system.

Cheers,

Will

> ---
>  arch/arm/kernel/perf_event.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> index 266cba46db3e..ab68833c1e31 100644
> --- a/arch/arm/kernel/perf_event.c
> +++ b/arch/arm/kernel/perf_event.c
> @@ -115,8 +115,14 @@ int armpmu_event_set_period(struct perf_event *event)
>  		ret = 1;
>  	}
>  
> -	if (left > (s64)armpmu->max_period)
> -		left = armpmu->max_period;
> +	/*
> +	 * Limit the maximum period to prevent the counter value
> +	 * from overtaking the one we are about to program. In
> +	 * effect we are reducing max_period to account for
> +	 * interrupt latency (and we are being very conservative).
> +	 */
> +	if (left > (armpmu->max_period >> 1))
> +		left = armpmu->max_period >> 1;
>  
>  	local64_set(&hwc->prev_count, (u64)-left);
>  
> -- 
> 1.9.3
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/2] arm64: perf: Prevent wraparound during overflow
  2014-11-21 16:24   ` [PATCH v2 2/2] arm64: " Daniel Thompson
@ 2014-12-04 10:27     ` Will Deacon
  0 siblings, 0 replies; 13+ messages in thread
From: Will Deacon @ 2014-12-04 10:27 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Russell King, Catalin Marinas, linux-arm-kernel, linux-kernel,
	Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, patches, linaro-kernel, John Stultz,
	Sumit Semwal

On Fri, Nov 21, 2014 at 04:24:27PM +0000, Daniel Thompson wrote:
> If the overflow threshold for a counter is set above or near the
> 0xffffffff boundary then the kernel may lose track of the overflow
> causing only events that occur *after* the overflow to be recorded.
> Specifically the problem occurs when the value of the performance counter
> overtakes its original programmed value due to wrap around.
> 
> Typical solutions to this problem are either to avoid programming in
> values likely to be overtaken or to treat the overflow bit as the 33rd
> bit of the counter.
> 
> It's somewhat fiddly to refactor the code to correctly handle the 33rd bit
> during irqsave sections (context switches for example) so instead we take
> the simpler approach of avoiding values likely to be overtaken.
> 
> We set the limit to half of max_period because this matches the limit
> imposed in __hw_perf_event_init(). This causes a doubling of the interrupt
> rate for large threshold values; however, even with a very fast counter
> ticking at 4GHz the interrupt rate would only be ~1Hz.
> 
> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
> ---
>  arch/arm64/kernel/perf_event.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)

Thanks, applied.

Will

> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index aa29ecb4f800..25a5308744b1 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -169,8 +169,14 @@ armpmu_event_set_period(struct perf_event *event,
>  		ret = 1;
>  	}
>  
> -	if (left > (s64)armpmu->max_period)
> -		left = armpmu->max_period;
> +	/*
> +	 * Limit the maximum period to prevent the counter value
> +	 * from overtaking the one we are about to program. In
> +	 * effect we are reducing max_period to account for
> +	 * interrupt latency (and we are being very conservative).
> +	 */
> +	if (left > (armpmu->max_period >> 1))
> +		left = armpmu->max_period >> 1;
>  
>  	local64_set(&hwc->prev_count, (u64)-left);
>  
> -- 
> 1.9.3
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/2] arm: perf: Prevent wraparound during overflow
  2014-12-04 10:26     ` Will Deacon
@ 2014-12-04 13:58       ` Daniel Thompson
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Thompson @ 2014-12-04 13:58 UTC (permalink / raw)
  To: Will Deacon
  Cc: Russell King, Catalin Marinas, linux-arm-kernel, linux-kernel,
	Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, patches, linaro-kernel, John Stultz,
	Sumit Semwal

On 04/12/14 10:26, Will Deacon wrote:
> On Fri, Nov 21, 2014 at 04:24:26PM +0000, Daniel Thompson wrote:
>> If the overflow threshold for a counter is set above or near the
>> 0xffffffff boundary then the kernel may lose track of the overflow
>> causing only events that occur *after* the overflow to be recorded.
>> Specifically the problem occurs when the value of the performance counter
>> overtakes its original programmed value due to wrap around.
>>
>> Typical solutions to this problem are either to avoid programming in
>> values likely to be overtaken or to treat the overflow bit as the 33rd
>> bit of the counter.
>>
>> It's somewhat fiddly to refactor the code to correctly handle the 33rd bit
>> during irqsave sections (context switches for example) so instead we take
>> the simpler approach of avoiding values likely to be overtaken.
>>
>> We set the limit to half of max_period because this matches the limit
>> imposed in __hw_perf_event_init(). This causes a doubling of the interrupt
>> rate for large threshold values; however, even with a very fast counter
>> ticking at 4GHz the interrupt rate would only be ~1Hz.
>>
>> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
> 
>   Acked-by: Will Deacon <will.deacon@arm.com>
> 
> You'll probably need to refresh this at -rc1 as there are a bunch of
> changes queued for this file already. Then you can stick it into rmk's
> patch system.

I'll do that. Thanks.


> 
> Cheers,
> 
> Will
> 
>> ---
>>  arch/arm/kernel/perf_event.c | 10 ++++++++--
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
>> index 266cba46db3e..ab68833c1e31 100644
>> --- a/arch/arm/kernel/perf_event.c
>> +++ b/arch/arm/kernel/perf_event.c
>> @@ -115,8 +115,14 @@ int armpmu_event_set_period(struct perf_event *event)
>>  		ret = 1;
>>  	}
>>  
>> -	if (left > (s64)armpmu->max_period)
>> -		left = armpmu->max_period;
>> +	/*
>> +	 * Limit the maximum period to prevent the counter value
>> +	 * from overtaking the one we are about to program. In
>> +	 * effect we are reducing max_period to account for
>> +	 * interrupt latency (and we are being very conservative).
>> +	 */
>> +	if (left > (armpmu->max_period >> 1))
>> +		left = armpmu->max_period >> 1;
>>  
>>  	local64_set(&hwc->prev_count, (u64)-left);
>>  
>> -- 
>> 1.9.3
>>
>>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3.19-rc1 v3] arm: perf: Prevent wraparound during overflow
  2014-11-19 15:52 [PATCH] arm: perf: Prevent wraparound during overflow Daniel Thompson
  2014-11-19 18:11 ` Will Deacon
  2014-11-21 16:24 ` [PATCH v2 0/2] arm+arm64: " Daniel Thompson
@ 2014-12-22  9:39 ` Daniel Thompson
  2 siblings, 0 replies; 13+ messages in thread
From: Daniel Thompson @ 2014-12-22  9:39 UTC (permalink / raw)
  To: Russell King, Will Deacon
  Cc: Daniel Thompson, linux-arm-kernel, linux-kernel, Peter Zijlstra,
	Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, patches,
	linaro-kernel, John Stultz, Sumit Semwal

If the overflow threshold for a counter is set above or near the
0xffffffff boundary then the kernel may lose track of the overflow
causing only events that occur *after* the overflow to be recorded.
Specifically the problem occurs when the value of the performance counter
overtakes its original programmed value due to wrap around.

Typical solutions to this problem are either to avoid programming in
values likely to be overtaken or to treat the overflow bit as the 33rd
bit of the counter.

It's somewhat fiddly to refactor the code to correctly handle the 33rd bit
during irqsave sections (context switches for example) so instead we take
the simpler approach of avoiding values likely to be overtaken.

We set the limit to half of max_period because this matches the limit
imposed in __hw_perf_event_init(). This causes a doubling of the interrupt
rate for large threshold values; however, even with a very fast counter
ticking at 4GHz the interrupt rate would only be ~1Hz.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
---

Notes:
    v3:
    
    * Rebased on 3.19-rc1 and dropped the arm64 patches (which are
      already upstream).
    
    v2:
    
    * Remove the redundant cast to s64 (Will Deacon).
    

 arch/arm/kernel/perf_event.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index f7c65adaa428..557e128e4df0 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -116,8 +116,14 @@ int armpmu_event_set_period(struct perf_event *event)
 		ret = 1;
 	}

-	if (left > (s64)armpmu->max_period)
-		left = armpmu->max_period;
+	/*
+	 * Limit the maximum period to prevent the counter value
+	 * from overtaking the one we are about to program. In
+	 * effect we are reducing max_period to account for
+	 * interrupt latency (and we are being very conservative).
+	 */
+	if (left > (armpmu->max_period >> 1))
+		left = armpmu->max_period >> 1;

 	local64_set(&hwc->prev_count, (u64)-left);

--
1.9.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/2] arm: perf: Prevent wraparound during overflow
  2014-11-21 16:24   ` [PATCH v2 1/2] arm: " Daniel Thompson
  2014-12-04 10:26     ` Will Deacon
@ 2015-01-05 14:57     ` Peter Zijlstra
  2015-01-05 19:31       ` Daniel Thompson
  1 sibling, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2015-01-05 14:57 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Russell King, Will Deacon, Catalin Marinas, linux-arm-kernel,
	linux-kernel, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, patches, linaro-kernel, John Stultz,
	Sumit Semwal

On Fri, Nov 21, 2014 at 04:24:26PM +0000, Daniel Thompson wrote:
> If the overflow threshold for a counter is set above or near the
> 0xffffffff boundary then the kernel may lose track of the overflow
> causing only events that occur *after* the overflow to be recorded.
> Specifically the problem occurs when the value of the performance counter
> overtakes its original programmed value due to wrap around.
> 
> Typical solutions to this problem are either to avoid programming in
> values likely to be overtaken or to treat the overflow bit as the 33rd
> bit of the counter.
> 
> It's somewhat fiddly to refactor the code to correctly handle the 33rd bit
> during irqsave sections (context switches for example) so instead we take
> the simpler approach of avoiding values likely to be overtaken.
> 
> We set the limit to half of max_period because this matches the limit
> imposed in __hw_perf_event_init(). This causes a doubling of the interrupt
> rate for large threshold values; however, even with a very fast counter
> ticking at 4GHz the interrupt rate would only be ~1Hz.
> 
> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
> ---
>  arch/arm/kernel/perf_event.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> index 266cba46db3e..ab68833c1e31 100644
> --- a/arch/arm/kernel/perf_event.c
> +++ b/arch/arm/kernel/perf_event.c
> @@ -115,8 +115,14 @@ int armpmu_event_set_period(struct perf_event *event)
>  		ret = 1;
>  	}
>  
> -	if (left > (s64)armpmu->max_period)
> -		left = armpmu->max_period;
> +	/*
> +	 * Limit the maximum period to prevent the counter value
> +	 * from overtaking the one we are about to program. In
> +	 * effect we are reducing max_period to account for
> +	 * interrupt latency (and we are being very conservative).
> +	 */
> +	if (left > (armpmu->max_period >> 1))
> +		left = armpmu->max_period >> 1;

On x86 we simply half max_period, why did you choose to do differently?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/2] arm: perf: Prevent wraparound during overflow
  2015-01-05 14:57     ` Peter Zijlstra
@ 2015-01-05 19:31       ` Daniel Thompson
  2015-01-06 19:46         ` Will Deacon
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel Thompson @ 2015-01-05 19:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Russell King, Will Deacon, Catalin Marinas, linux-arm-kernel,
	linux-kernel, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, patches, linaro-kernel, John Stultz,
	Sumit Semwal

On Mon, Jan 05, 2015 at 03:57:39PM +0100, Peter Zijlstra wrote:
> On Fri, Nov 21, 2014 at 04:24:26PM +0000, Daniel Thompson wrote:
> > diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> > index 266cba46db3e..ab68833c1e31 100644
> > --- a/arch/arm/kernel/perf_event.c
> > +++ b/arch/arm/kernel/perf_event.c
> > @@ -115,8 +115,14 @@ int armpmu_event_set_period(struct perf_event *event)
> >  		ret = 1;
> >  	}
> >  
> > -	if (left > (s64)armpmu->max_period)
> > -		left = armpmu->max_period;
> > +	/*
> > +	 * Limit the maximum period to prevent the counter value
> > +	 * from overtaking the one we are about to program. In
> > +	 * effect we are reducing max_period to account for
> > +	 * interrupt latency (and we are being very conservative).
> > +	 */
> > +	if (left > (armpmu->max_period >> 1))
> > +		left = armpmu->max_period >> 1;
> 
> On x86 we simply half max_period, why did you choose to do differently?

In truth because I didn't look at the x86 code... there is an existing
halving of max_period in the arm code and that was enough to satisfy me
that halving max_period was reasonable.

Predividing max_period looks to me like it would work for ARM too although I
don't think we could blame hardware insanity for doing so ;-).

Will: Do you want me to update this?

-- 
Daniel Thompson (STMicroelectronics) <daniel.thompson@st.com>
1000 Aztec West, Almondsbury, Bristol, BS32 4SQ. 01454 462659

If a car is a horseless carriage then is a motorcycle a horseless horse?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/2] arm: perf: Prevent wraparound during overflow
  2015-01-05 19:31       ` Daniel Thompson
@ 2015-01-06 19:46         ` Will Deacon
  0 siblings, 0 replies; 13+ messages in thread
From: Will Deacon @ 2015-01-06 19:46 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Peter Zijlstra, Russell King, Catalin Marinas, linux-arm-kernel,
	linux-kernel, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, patches, linaro-kernel, John Stultz,
	Sumit Semwal

On Mon, Jan 05, 2015 at 07:31:20PM +0000, Daniel Thompson wrote:
> On Mon, Jan 05, 2015 at 03:57:39PM +0100, Peter Zijlstra wrote:
> > On Fri, Nov 21, 2014 at 04:24:26PM +0000, Daniel Thompson wrote:
> > > diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> > > index 266cba46db3e..ab68833c1e31 100644
> > > --- a/arch/arm/kernel/perf_event.c
> > > +++ b/arch/arm/kernel/perf_event.c
> > > @@ -115,8 +115,14 @@ int armpmu_event_set_period(struct perf_event *event)
> > >  		ret = 1;
> > >  	}
> > >  
> > > -	if (left > (s64)armpmu->max_period)
> > > -		left = armpmu->max_period;
> > > +	/*
> > > +	 * Limit the maximum period to prevent the counter value
> > > +	 * from overtaking the one we are about to program. In
> > > +	 * effect we are reducing max_period to account for
> > > +	 * interrupt latency (and we are being very conservative).
> > > +	 */
> > > +	if (left > (armpmu->max_period >> 1))
> > > +		left = armpmu->max_period >> 1;
> > 
> > On x86 we simply half max_period, why did you choose to do differently?
> 
> In truth because I didn't look at the x86 code... there is an existing
> halving of max_period in the arm code and that was enough to satisfy me
> that halving max_period was reasonable.
> 
> Predividing max_period looks to me like it would work for ARM too although I
> don't think we could blame hardware insanity for doing so ;-).
> 
> Will: Do you want me to update this?

Whichever you prefer. The ARM perf code used to be used by some drivers
and so we tried to keep the implementation details hidden from them, but
that didn't work out so well and it's now only used by the CPU PMUs.

Will

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-01-06 19:46 UTC | newest]

Thread overview: 13+ messages
2014-11-19 15:52 [PATCH] arm: perf: Prevent wraparound during overflow Daniel Thompson
2014-11-19 18:11 ` Will Deacon
2014-11-20 12:14   ` Daniel Thompson
2014-11-21 16:24 ` [PATCH v2 0/2] arm+arm64: " Daniel Thompson
2014-11-21 16:24   ` [PATCH v2 1/2] arm: " Daniel Thompson
2014-12-04 10:26     ` Will Deacon
2014-12-04 13:58       ` Daniel Thompson
2015-01-05 14:57     ` Peter Zijlstra
2015-01-05 19:31       ` Daniel Thompson
2015-01-06 19:46         ` Will Deacon
2014-11-21 16:24   ` [PATCH v2 2/2] arm64: " Daniel Thompson
2014-12-04 10:27     ` Will Deacon
2014-12-22  9:39 ` [PATCH 3.19-rc1 v3] arm: " Daniel Thompson
