All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH stable 0/1] x86: aperfmperf: bug report
@ 2022-10-21  2:17 Yipeng Zou
  2022-10-21  2:17 ` [PATCH stable 1/1] x86: aperfmperf: fix overflow problem in the concurrency scenario Yipeng Zou
  2022-11-04  2:26 ` [PATCH stable 0/1] x86: aperfmperf: bug report Yipeng Zou
  0 siblings, 2 replies; 3+ messages in thread
From: Yipeng Zou @ 2022-10-21  2:17 UTC (permalink / raw)
  To: tglx, bp, mingo, x86, hpa, rafael.j.wysocki, linux-kernel; +Cc: Yipeng Zou

Recently i was doing some work about calculating CPU frequency on x86 on
stable branch linux-5.10.y, and there is a problem which i descrip in
the commit message.

I've noticed that it has been abandoned on the mainline. On the mainline
it becomes to update [a,m]cnt in timer code with HZ frequency, and it is
actually calculated at the time of reading the cpu freqency. This solves
the problem above.

So, Are there other reasons why the stable branch doesn't have these
changes?, this patch is just to ask how we plan to fix it.

Yipeng Zou (1):
  x86: aperfmperf: fix overflow problem in the concurrency scenario

 arch/x86/kernel/cpu/aperfmperf.c | 4 ----
 1 file changed, 4 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH stable 1/1] x86: aperfmperf: fix overflow problem in the concurrency scenario
  2022-10-21  2:17 [PATCH stable 0/1] x86: aperfmperf: bug report Yipeng Zou
@ 2022-10-21  2:17 ` Yipeng Zou
  2022-11-04  2:26 ` [PATCH stable 0/1] x86: aperfmperf: bug report Yipeng Zou
  1 sibling, 0 replies; 3+ messages in thread
From: Yipeng Zou @ 2022-10-21  2:17 UTC (permalink / raw)
  To: tglx, bp, mingo, x86, hpa, rafael.j.wysocki, linux-kernel; +Cc: Yipeng Zou

Recently i was doing some work about calculating CPU frequency on x86 on
stable branch linux-5.10.y, and there are the details of the problem.
I was test on platform Intel Xeon CPU @ 2.10Ghz. In most cases the cpu
freq field cat from /proc/cpuinfo was 2099.800 ~ 2100.300 MHz. At a very
low probability, the values of cpu freq was very small, such as 105.000
MHz.

After debug I found that there is an integer overflow problem in the
concurrency scenario during the calculation of cpu freq. The key code
was the function aperfmperf_snapshot_khz(),

In function aperfmperf_snapshot_khz(), the aperf_delta may be large(a
long time since the last execution). This has the potential to cause
integer overflow when multiplying with it.

And Then to avoid this the stable branch will calculate the cpu freq
twice and sleep 10ms when it is found that this update is a long time
since the last time to ensure calculated the cpu freq correctly.

Consider the following:

Task 0                   Task 1

arch_freq_perpare_all    ....

sleep 10ms               ....

....                     arch_freq_perpare_all    // Within 10ms

....                     aperfmperf_snapshot_cpu  // Within 10ms

aperfmperf_snapshot_cpu  ....

step 0: task0 : arch_freq_perpare_all go through all cpus and update
their time and freq. If this time is a long time since the last update,
then the saved CPU freq is abnormal(integer overflow).

step 1: Then task1 has also come to get the same CPU freq. But since
within 10ms of last task 0 calculation, it cannot update the current
CPU freq, which exception value it will gets.

I've noticed that it has been abandoned on the mainline. On the mainline
it becomes to update [a,m]cnt in timer code with HZ frequency, and it is
actually calculated at the time of reading the cpu freqency. This solves
the problem above.

Fixes: 7d5905dc14a8 ("x86 / CPU: Always show current CPU frequency in /proc/cpuinfo")
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
---
 arch/x86/kernel/cpu/aperfmperf.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/aperfmperf.c b/arch/x86/kernel/cpu/aperfmperf.c
index e2f319dc992d..d3f417c06d5f 100644
--- a/arch/x86/kernel/cpu/aperfmperf.c
+++ b/arch/x86/kernel/cpu/aperfmperf.c
@@ -68,10 +68,6 @@ static bool aperfmperf_snapshot_cpu(int cpu, ktime_t now, bool wait)
 {
 	s64 time_delta = ktime_ms_delta(now, per_cpu(samples.time, cpu));
 
-	/* Don't bother re-computing within the cache threshold time. */
-	if (time_delta < APERFMPERF_CACHE_THRESHOLD_MS)
-		return true;
-
 	smp_call_function_single(cpu, aperfmperf_snapshot_khz, NULL, wait);
 
 	/* Return false if the previous iteration was too long ago. */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH stable 0/1] x86: aperfmperf: bug report
  2022-10-21  2:17 [PATCH stable 0/1] x86: aperfmperf: bug report Yipeng Zou
  2022-10-21  2:17 ` [PATCH stable 1/1] x86: aperfmperf: fix overflow problem in the concurrency scenario Yipeng Zou
@ 2022-11-04  2:26 ` Yipeng Zou
  1 sibling, 0 replies; 3+ messages in thread
From: Yipeng Zou @ 2022-11-04  2:26 UTC (permalink / raw)
  To: tglx, bp, mingo, x86, hpa, rafael.j.wysocki, linux-kernel

Ping~

在 2022/10/21 10:17, Yipeng Zou 写道:
> Recently i was doing some work about calculating CPU frequency on x86 on
> stable branch linux-5.10.y, and there is a problem which i descrip in
> the commit message.
>
> I've noticed that it has been abandoned on the mainline. On the mainline
> it becomes to update [a,m]cnt in timer code with HZ frequency, and it is
> actually calculated at the time of reading the cpu freqency. This solves
> the problem above.
>
> So, Are there other reasons why the stable branch doesn't have these
> changes?, this patch is just to ask how we plan to fix it.
>
> Yipeng Zou (1):
>    x86: aperfmperf: fix overflow problem in the concurrency scenario
>
>   arch/x86/kernel/cpu/aperfmperf.c | 4 ----
>   1 file changed, 4 deletions(-)
>
-- 
Regards,
Yipeng Zou


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-11-04  2:26 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-21  2:17 [PATCH stable 0/1] x86: aperfmperf: bug report Yipeng Zou
2022-10-21  2:17 ` [PATCH stable 1/1] x86: aperfmperf: fix overflow problem in the concurrency scenario Yipeng Zou
2022-11-04  2:26 ` [PATCH stable 0/1] x86: aperfmperf: bug report Yipeng Zou

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.