* [PATCH 4.19,5.4,5.10,5.15] perf/core: Fix reentry problem in perf_output_read_group()
@ 2024-03-07 17:50 Thadeu Lima de Souza Cascardo
2024-03-29 13:08 ` Greg KH
0 siblings, 1 reply; 2+ messages in thread
From: Thadeu Lima de Souza Cascardo @ 2024-03-07 17:50 UTC (permalink / raw)
To: stable; +Cc: Yang Jihong, Peter Zijlstra, kernel-dev
From: Yang Jihong <yangjihong1@huawei.com>
commit 6b959ba22d34ca793ffdb15b5715457c78e38b1a upstream.
perf_output_read_group may respond to IPI request of other cores and invoke
__perf_install_in_context function. As a result, hwc configuration is modified.
causing inconsistency and unexpected consequences.
Interrupts are not disabled when perf_output_read_group reads PMU counter.
In this case, IPI request may be received from other cores.
As a result, PMU configuration is modified and an error occurs when
reading PMU counter:
CPU0 CPU1
__se_sys_perf_event_open
perf_install_in_context
perf_output_read_group smp_call_function_single
for_each_sibling_event(sub, leader) { generic_exec_single
if ((sub != event) && remote_function
(sub->state == PERF_EVENT_STATE_ACTIVE)) |
<enter IPI handler: __perf_install_in_context> <----RAISE IPI-----+
__perf_install_in_context
ctx_resched
event_sched_out
armpmu_del
...
hwc->idx = -1; // event->hwc.idx is set to -1
...
<exit IPI>
sub->pmu->read(sub);
armpmu_read
armv8pmu_read_counter
armv8pmu_read_hw_counter
int idx = event->hw.idx; // idx = -1
u64 val = armv8pmu_read_evcntr(idx);
u32 counter = ARMV8_IDX_TO_COUNTER(idx); // invalid counter = 30
read_pmevcntrn(counter) // undefined instruction
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220902082918.179248-1-yangjihong1@huawei.com
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
---
This race may also lead to observed behavior like RCU stalls, hang tasks,
OOM. Likely due to list corruption or a similar root cause.
---
kernel/events/core.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4e5a73c7db12..e79cd0fd1d2b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7119,9 +7119,16 @@ static void perf_output_read_group(struct perf_output_handle *handle,
{
struct perf_event *leader = event->group_leader, *sub;
u64 read_format = event->attr.read_format;
+ unsigned long flags;
u64 values[6];
int n = 0;
+ /*
+ * Disabling interrupts avoids all counter scheduling
+ * (context switches, timer based rotation and IPIs).
+ */
+ local_irq_save(flags);
+
values[n++] = 1 + leader->nr_siblings;
if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
@@ -7157,6 +7164,8 @@ static void perf_output_read_group(struct perf_output_handle *handle,
__output_copy(handle, values, n * sizeof(u64));
}
+
+ local_irq_restore(flags);
}
#define PERF_FORMAT_TOTAL_TIMES (PERF_FORMAT_TOTAL_TIME_ENABLED|\
--
2.34.1
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH 4.19,5.4,5.10,5.15] perf/core: Fix reentry problem in perf_output_read_group()
2024-03-07 17:50 [PATCH 4.19,5.4,5.10,5.15] perf/core: Fix reentry problem in perf_output_read_group() Thadeu Lima de Souza Cascardo
@ 2024-03-29 13:08 ` Greg KH
0 siblings, 0 replies; 2+ messages in thread
From: Greg KH @ 2024-03-29 13:08 UTC (permalink / raw)
To: Thadeu Lima de Souza Cascardo
Cc: stable, Yang Jihong, Peter Zijlstra, kernel-dev
On Thu, Mar 07, 2024 at 02:50:15PM -0300, Thadeu Lima de Souza Cascardo wrote:
> From: Yang Jihong <yangjihong1@huawei.com>
>
> commit 6b959ba22d34ca793ffdb15b5715457c78e38b1a upstream.
>
> perf_output_read_group may respond to IPI request of other cores and invoke
> __perf_install_in_context function. As a result, hwc configuration is modified.
> causing inconsistency and unexpected consequences.
>
> Interrupts are not disabled when perf_output_read_group reads PMU counter.
> In this case, IPI request may be received from other cores.
> As a result, PMU configuration is modified and an error occurs when
> reading PMU counter:
>
> CPU0 CPU1
> __se_sys_perf_event_open
> perf_install_in_context
> perf_output_read_group smp_call_function_single
> for_each_sibling_event(sub, leader) { generic_exec_single
> if ((sub != event) && remote_function
> (sub->state == PERF_EVENT_STATE_ACTIVE)) |
> <enter IPI handler: __perf_install_in_context> <----RAISE IPI-----+
> __perf_install_in_context
> ctx_resched
> event_sched_out
> armpmu_del
> ...
> hwc->idx = -1; // event->hwc.idx is set to -1
> ...
> <exit IPI>
> sub->pmu->read(sub);
> armpmu_read
> armv8pmu_read_counter
> armv8pmu_read_hw_counter
> int idx = event->hw.idx; // idx = -1
> u64 val = armv8pmu_read_evcntr(idx);
> u32 counter = ARMV8_IDX_TO_COUNTER(idx); // invalid counter = 30
> read_pmevcntrn(counter) // undefined instruction
>
> Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Link: https://lkml.kernel.org/r/20220902082918.179248-1-yangjihong1@huawei.com
> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
> ---
>
> This race may also lead to observed behavior like RCU stalls, hang tasks,
> OOM. Likely due to list corruption or a similar root cause.
Now queued up, thanks.
greg k-h
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2024-03-29 13:08 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-07 17:50 [PATCH 4.19,5.4,5.10,5.15] perf/core: Fix reentry problem in perf_output_read_group() Thadeu Lima de Souza Cascardo
2024-03-29 13:08 ` Greg KH
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.