* [PATCH 0/5] perf/x86: Some cleanups
@ 2022-05-11 14:20 Peter Zijlstra
2022-05-11 14:20 ` [PATCH 1/5] perf/x86/amd: Fix AMD BRS period adjustment Peter Zijlstra
` (6 more replies)
0 siblings, 7 replies; 10+ messages in thread
From: Peter Zijlstra @ 2022-05-11 14:20 UTC (permalink / raw)
To: x86, kan.liang, eranian
Cc: linux-kernel, peterz, acme, mark.rutland, alexander.shishkin,
jolsa, namhyung
While staring at the code recently I collected some cleanups.
* [PATCH 1/5] perf/x86/amd: Fix AMD BRS period adjustment
2022-05-11 14:20 [PATCH 0/5] perf/x86: Some cleanups Peter Zijlstra
@ 2022-05-11 14:20 ` Peter Zijlstra
2022-05-11 14:20 ` [PATCH 2/5] perf/x86: Add two more x86_pmu methods Peter Zijlstra
` (5 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2022-05-11 14:20 UTC (permalink / raw)
To: x86, kan.liang, eranian
Cc: linux-kernel, peterz, acme, mark.rutland, alexander.shishkin,
jolsa, namhyung
There are two problems with the current amd_brs_adjust_period() code:
- it isn't in fact AMD specific and will always adjust the period;
- it adjusts the period, while it should only adjust the event count,
resulting in reporting a short period.
Fix this by using x86_pmu.limit_period; this makes it specific to the
AMD BRS case and ensures only the event count is adjusted while the
reported period is left unmodified.
Fixes: ba2fe7500845 ("perf/x86/amd: Add AMD branch sampling period adjustment")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/events/amd/core.c | 13 +++++++++++++
arch/x86/events/core.c | 7 -------
arch/x86/events/perf_event.h | 18 ------------------
3 files changed, 13 insertions(+), 25 deletions(-)
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -1258,6 +1258,18 @@ static void amd_pmu_sched_task(struct pe
amd_pmu_brs_sched_task(ctx, sched_in);
}
+static u64 amd_pmu_limit_period(struct perf_event *event, u64 left)
+{
+ /*
+ * Decrease period by the depth of the BRS feature to get the last N
+ * taken branches and approximate the desired period
+ */
+ if (has_branch_stack(event) && left > x86_pmu.lbr_nr)
+ left -= x86_pmu.lbr_nr;
+
+ return left;
+}
+
static __initconst const struct x86_pmu amd_pmu = {
.name = "AMD",
.handle_irq = amd_pmu_handle_irq,
@@ -1418,6 +1430,7 @@ static int __init amd_core_pmu_init(void
if (boot_cpu_data.x86 >= 0x19 && !amd_brs_init()) {
x86_pmu.get_event_constraints = amd_get_event_constraints_f19h;
x86_pmu.sched_task = amd_pmu_sched_task;
+ x86_pmu.limit_period = amd_pmu_limit_period;
/*
* put_event_constraints callback same as Fam17h, set above
*/
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1375,13 +1375,6 @@ int x86_perf_event_set_period(struct per
return x86_pmu.set_topdown_event_period(event);
/*
- * decrease period by the depth of the BRS feature to get
- * the last N taken branches and approximate the desired period
- */
- if (has_branch_stack(event))
- period = amd_brs_adjust_period(period);
-
- /*
* If we are way outside a reasonable range then just skip forward:
*/
if (unlikely(left <= -period)) {
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1254,14 +1254,6 @@ static inline void amd_pmu_brs_del(struc
}
void amd_pmu_brs_sched_task(struct perf_event_context *ctx, bool sched_in);
-
-static inline s64 amd_brs_adjust_period(s64 period)
-{
- if (period > x86_pmu.lbr_nr)
- return period - x86_pmu.lbr_nr;
-
- return period;
-}
#else
static inline int amd_brs_init(void)
{
@@ -1290,11 +1282,6 @@ static inline void amd_pmu_brs_sched_tas
{
}
-static inline s64 amd_brs_adjust_period(s64 period)
-{
- return period;
-}
-
static inline void amd_brs_enable_all(void)
{
}
@@ -1324,11 +1311,6 @@ static inline void amd_brs_enable_all(vo
static inline void amd_brs_disable_all(void)
{
}
-
-static inline s64 amd_brs_adjust_period(s64 period)
-{
- return period;
-}
#endif /* CONFIG_CPU_SUP_AMD */
static inline int is_pebs_pt(struct perf_event *event)
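As an aside (not part of the patch): a heavily simplified sketch of where the
new hook lands in the generic x86_perf_event_set_period() path, with the
wrap-around bookkeeping elided, to make the "count adjusted, reported period
untouched" point concrete:

	struct hw_perf_event *hwc = &event->hw;
	s64 left = local64_read(&hwc->period_left);

	/*
	 * (elided) the wrap-around cases above this point (re)set
	 * hwc->last_period from the full hwc->sample_period; that is
	 * what ends up being reported with the sample.
	 */

	if (left > x86_pmu.max_period)
		left = x86_pmu.max_period;

	/*
	 * The BRS adjustment now only shortens the value programmed
	 * into the counter; it no longer touches the reported period.
	 */
	if (x86_pmu.limit_period)
		left = x86_pmu.limit_period(event, left);

	local64_set(&hwc->prev_count, (u64)-left);
	wrmsrl(hwc->event_base, (u64)(-left) & x86_pmu.cntval_mask);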
* [PATCH 2/5] perf/x86: Add two more x86_pmu methods
2022-05-11 14:20 [PATCH 0/5] perf/x86: Some cleanups Peter Zijlstra
2022-05-11 14:20 ` [PATCH 1/5] perf/x86/amd: Fix AMD BRS period adjustment Peter Zijlstra
@ 2022-05-11 14:20 ` Peter Zijlstra
2022-05-11 14:20 ` [PATCH 3/5] perf/x86/intel: Move the topdown stuff into the intel driver Peter Zijlstra
` (4 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2022-05-11 14:20 UTC (permalink / raw)
To: x86, kan.liang, eranian
Cc: linux-kernel, peterz, acme, mark.rutland, alexander.shishkin,
jolsa, namhyung
In order to clean up x86_perf_event_{set_period,update}(), start by
adding them as x86_pmu methods.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/events/core.c | 22 +++++++++++++++++-----
arch/x86/events/perf_event.h | 5 +++++
2 files changed, 22 insertions(+), 5 deletions(-)
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -72,6 +72,9 @@ DEFINE_STATIC_CALL_NULL(x86_pmu_add, *x
DEFINE_STATIC_CALL_NULL(x86_pmu_del, *x86_pmu.del);
DEFINE_STATIC_CALL_NULL(x86_pmu_read, *x86_pmu.read);
+DEFINE_STATIC_CALL_NULL(x86_pmu_set_period, *x86_pmu.set_period);
+DEFINE_STATIC_CALL_NULL(x86_pmu_update, *x86_pmu.update);
+
DEFINE_STATIC_CALL_NULL(x86_pmu_schedule_events, *x86_pmu.schedule_events);
DEFINE_STATIC_CALL_NULL(x86_pmu_get_event_constraints, *x86_pmu.get_event_constraints);
DEFINE_STATIC_CALL_NULL(x86_pmu_put_event_constraints, *x86_pmu.put_event_constraints);
@@ -1518,7 +1521,7 @@ static void x86_pmu_start(struct perf_ev
if (flags & PERF_EF_RELOAD) {
WARN_ON_ONCE(!(event->hw.state & PERF_HES_UPTODATE));
- x86_perf_event_set_period(event);
+ static_call(x86_pmu_set_period)(event);
}
event->hw.state = 0;
@@ -1610,7 +1613,7 @@ void x86_pmu_stop(struct perf_event *eve
* Drain the remaining delta count out of a event
* that we are disabling:
*/
- x86_perf_event_update(event);
+ static_call(x86_pmu_update)(event);
hwc->state |= PERF_HES_UPTODATE;
}
}
@@ -1700,7 +1703,7 @@ int x86_pmu_handle_irq(struct pt_regs *r
event = cpuc->events[idx];
- val = x86_perf_event_update(event);
+ val = static_call(x86_pmu_update)(event);
if (val & (1ULL << (x86_pmu.cntval_bits - 1)))
continue;
@@ -1709,7 +1712,7 @@ int x86_pmu_handle_irq(struct pt_regs *r
*/
handled++;
- if (!x86_perf_event_set_period(event))
+ if (!static_call(x86_pmu_set_period)(event))
continue;
perf_sample_data_init(&data, 0, event->hw.last_period);
@@ -2023,6 +2026,9 @@ static void x86_pmu_static_call_update(v
static_call_update(x86_pmu_del, x86_pmu.del);
static_call_update(x86_pmu_read, x86_pmu.read);
+ static_call_update(x86_pmu_set_period, x86_pmu.set_period);
+ static_call_update(x86_pmu_update, x86_pmu.update);
+
static_call_update(x86_pmu_schedule_events, x86_pmu.schedule_events);
static_call_update(x86_pmu_get_event_constraints, x86_pmu.get_event_constraints);
static_call_update(x86_pmu_put_event_constraints, x86_pmu.put_event_constraints);
@@ -2042,7 +2048,7 @@ static void x86_pmu_static_call_update(v
static void _x86_pmu_read(struct perf_event *event)
{
- x86_perf_event_update(event);
+ static_call(x86_pmu_update)(event);
}
void x86_pmu_show_pmu_cap(int num_counters, int num_counters_fixed,
@@ -2148,6 +2154,12 @@ static int __init init_hw_perf_events(vo
if (!x86_pmu.guest_get_msrs)
x86_pmu.guest_get_msrs = (void *)&__static_call_return0;
+ if (!x86_pmu.set_period)
+ x86_pmu.set_period = x86_perf_event_set_period;
+
+ if (!x86_pmu.update)
+ x86_pmu.update = x86_perf_event_update;
+
x86_pmu_static_call_update();
/*
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -735,6 +735,8 @@ struct x86_pmu {
void (*add)(struct perf_event *);
void (*del)(struct perf_event *);
void (*read)(struct perf_event *event);
+ int (*set_period)(struct perf_event *event);
+ u64 (*update)(struct perf_event *event);
int (*hw_config)(struct perf_event *event);
int (*schedule_events)(struct cpu_hw_events *cpuc, int n, int *assign);
unsigned eventsel;
@@ -1031,6 +1033,9 @@ static struct perf_pmu_format_hybrid_att
struct pmu *x86_get_pmu(unsigned int cpu);
extern struct x86_pmu x86_pmu __read_mostly;
+DECLARE_STATIC_CALL(x86_pmu_set_period, *x86_pmu.set_period);
+DECLARE_STATIC_CALL(x86_pmu_update, *x86_pmu.update);
+
static __always_inline struct x86_perf_task_context_opt *task_context_opt(void *ctx)
{
if (static_cpu_has(X86_FEATURE_ARCH_LBR))
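For context (not part of the patch): the new methods are invoked through the
static_call() machinery, which on x86 patches each call site into a direct
call instead of an indirect call through a function pointer. A minimal,
self-contained illustration of the pattern, using hypothetical names
(my_pmu_update, generic_update, vendor_update):

#include <linux/static_call.h>
#include <linux/perf_event.h>

static u64 generic_update(struct perf_event *event)
{
	return 0;	/* stand-in for x86_perf_event_update() */
}

static u64 vendor_update(struct perf_event *event)
{
	return 1;	/* stand-in for a vendor-specific variant */
}

/* the key starts out targeting the generic implementation */
DEFINE_STATIC_CALL(my_pmu_update, generic_update);

static void my_pmu_init(void)
{
	/* retarget every call site to call vendor_update() directly */
	static_call_update(my_pmu_update, &vendor_update);
}

static u64 my_pmu_read(struct perf_event *event)
{
	/* compiles to a direct call: no pointer load, no retpoline */
	return static_call(my_pmu_update)(event);
}

DEFINE_STATIC_CALL_NULL(), which the patch uses, starts out without a target;
that is why init_hw_perf_events() installs x86_perf_event_{set_period,update}()
as defaults before x86_pmu_static_call_update() copies x86_pmu.set_period and
x86_pmu.update into the static calls.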
* [PATCH 3/5] perf/x86/intel: Move the topdown stuff into the intel driver
2022-05-11 14:20 [PATCH 0/5] perf/x86: Some cleanups Peter Zijlstra
2022-05-11 14:20 ` [PATCH 1/5] perf/x86/amd: Fix AMD BRS period adjustment Peter Zijlstra
2022-05-11 14:20 ` [PATCH 2/5] perf/x86: Add two more x86_pmu methods Peter Zijlstra
@ 2022-05-11 14:20 ` Peter Zijlstra
2022-05-11 14:20 ` [PATCH 4/5] perf/x86: Change x86_pmu::limit_period signature Peter Zijlstra
` (3 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2022-05-11 14:20 UTC (permalink / raw)
To: x86, kan.liang, eranian
Cc: linux-kernel, peterz, acme, mark.rutland, alexander.shishkin,
jolsa, namhyung
Use the new x86_pmu::{set_period,update}() methods to push the topdown
stuff into the Intel driver, where it belongs.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/events/core.c | 7 -------
arch/x86/events/intel/core.c | 28 +++++++++++++++++++++++++---
2 files changed, 25 insertions(+), 10 deletions(-)
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -119,9 +119,6 @@ u64 x86_perf_event_update(struct perf_ev
if (unlikely(!hwc->event_base))
return 0;
- if (unlikely(is_topdown_count(event)) && x86_pmu.update_topdown_event)
- return x86_pmu.update_topdown_event(event);
-
/*
* Careful: an NMI might modify the previous event value.
*
@@ -1373,10 +1370,6 @@ int x86_perf_event_set_period(struct per
if (unlikely(!hwc->event_base))
return 0;
- if (unlikely(is_topdown_count(event)) &&
- x86_pmu.set_topdown_event_period)
- return x86_pmu.set_topdown_event_period(event);
-
/*
* If we are way outside a reasonable range then just skip forward:
*/
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2301,7 +2301,7 @@ static void intel_pmu_nhm_workaround(voi
for (i = 0; i < 4; i++) {
event = cpuc->events[i];
if (event)
- x86_perf_event_update(event);
+ static_call(x86_pmu_update)(event);
}
for (i = 0; i < 4; i++) {
@@ -2316,7 +2316,7 @@ static void intel_pmu_nhm_workaround(voi
event = cpuc->events[i];
if (event) {
- x86_perf_event_set_period(event);
+ static_call(x86_pmu_set_period)(event);
__x86_pmu_enable_event(&event->hw,
ARCH_PERFMON_EVENTSEL_ENABLE);
} else
@@ -2793,7 +2793,7 @@ static void intel_pmu_add_event(struct p
*/
int intel_pmu_save_and_restart(struct perf_event *event)
{
- x86_perf_event_update(event);
+ static_call(x86_pmu_update)(event);
/*
* For a checkpointed counter always reset back to 0. This
* avoids a situation where the counter overflows, aborts the
@@ -2805,9 +2805,27 @@ int intel_pmu_save_and_restart(struct pe
wrmsrl(event->hw.event_base, 0);
local64_set(&event->hw.prev_count, 0);
}
+ return static_call(x86_pmu_set_period)(event);
+}
+
+static int intel_pmu_set_period(struct perf_event *event)
+{
+ if (unlikely(is_topdown_count(event)) &&
+ x86_pmu.set_topdown_event_period)
+ return x86_pmu.set_topdown_event_period(event);
+
return x86_perf_event_set_period(event);
}
+static u64 intel_pmu_update(struct perf_event *event)
+{
+ if (unlikely(is_topdown_count(event)) &&
+ x86_pmu.update_topdown_event)
+ return x86_pmu.update_topdown_event(event);
+
+ return x86_perf_event_update(event);
+}
+
static void intel_pmu_reset(void)
{
struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
@@ -4635,6 +4653,10 @@ static __initconst const struct x86_pmu
.enable_all = core_pmu_enable_all,
.enable = core_pmu_enable_event,
.disable = x86_pmu_disable_event,
+
+ .set_period = intel_pmu_set_period,
+ .update = intel_pmu_update,
+
.hw_config = core_pmu_hw_config,
.schedule_events = x86_schedule_events,
.eventsel = MSR_ARCH_PERFMON_EVENTSEL0,
* [PATCH 4/5] perf/x86: Change x86_pmu::limit_period signature
2022-05-11 14:20 [PATCH 0/5] perf/x86: Some cleanups Peter Zijlstra
` (2 preceding siblings ...)
2022-05-11 14:20 ` [PATCH 3/5] perf/x86/intel: Move the topdown stuff into the intel driver Peter Zijlstra
@ 2022-05-11 14:20 ` Peter Zijlstra
2022-05-11 17:47 ` Liang, Kan
2022-05-11 14:20 ` [PATCH 5/5] perf/x86: Add a x86_pmu::limit_period static_call Peter Zijlstra
` (2 subsequent siblings)
6 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2022-05-11 14:20 UTC (permalink / raw)
To: x86, kan.liang, eranian
Cc: linux-kernel, peterz, acme, mark.rutland, alexander.shishkin,
jolsa, namhyung
In preparation for making it a static_call, change the signature.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/events/amd/core.c | 8 +++-----
arch/x86/events/core.c | 18 ++++++++----------
arch/x86/events/intel/core.c | 19 ++++++++-----------
arch/x86/events/perf_event.h | 2 +-
4 files changed, 20 insertions(+), 27 deletions(-)
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -1258,16 +1258,14 @@ static void amd_pmu_sched_task(struct pe
amd_pmu_brs_sched_task(ctx, sched_in);
}
-static u64 amd_pmu_limit_period(struct perf_event *event, u64 left)
+static void amd_pmu_limit_period(struct perf_event *event, s64 *left)
{
/*
* Decrease period by the depth of the BRS feature to get the last N
* taken branches and approximate the desired period
*/
- if (has_branch_stack(event) && left > x86_pmu.lbr_nr)
- left -= x86_pmu.lbr_nr;
-
- return left;
+ if (has_branch_stack(event) && *left > x86_pmu.lbr_nr)
+ *left -= x86_pmu.lbr_nr;
}
static __initconst const struct x86_pmu amd_pmu = {
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -621,8 +621,9 @@ int x86_pmu_hw_config(struct perf_event
event->hw.config |= event->attr.config & X86_RAW_EVENT_MASK;
if (event->attr.sample_period && x86_pmu.limit_period) {
- if (x86_pmu.limit_period(event, event->attr.sample_period) >
- event->attr.sample_period)
+ s64 left = event->attr.sample_period;
+ x86_pmu.limit_period(event, &left);
+ if (left > event->attr.sample_period)
return -EINVAL;
}
@@ -1386,19 +1387,14 @@ int x86_perf_event_set_period(struct per
hwc->last_period = period;
ret = 1;
}
- /*
- * Quirk: certain CPUs dont like it if just 1 hw_event is left:
- */
- if (unlikely(left < 2))
- left = 2;
if (left > x86_pmu.max_period)
left = x86_pmu.max_period;
if (x86_pmu.limit_period)
- left = x86_pmu.limit_period(event, left);
+ x86_pmu.limit_period(event, &left);
- per_cpu(pmc_prev_left[idx], smp_processor_id()) = left;
+ this_cpu_write(pmc_prev_left[idx], left);
/*
* The hw event starts counting from this event offset,
@@ -2672,7 +2668,9 @@ static int x86_pmu_check_period(struct p
return -EINVAL;
if (value && x86_pmu.limit_period) {
- if (x86_pmu.limit_period(event, value) > value)
+ s64 left = value;
+ x86_pmu.limit_period(event, &left);
+ if (left > value)
return -EINVAL;
}
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4244,28 +4244,25 @@ static u8 adl_get_hybrid_cpu_type(void)
* Therefore the effective (average) period matches the requested period,
* despite coarser hardware granularity.
*/
-static u64 bdw_limit_period(struct perf_event *event, u64 left)
+static void bdw_limit_period(struct perf_event *event, s64 *left)
{
if ((event->hw.config & INTEL_ARCH_EVENT_MASK) ==
X86_CONFIG(.event=0xc0, .umask=0x01)) {
- if (left < 128)
- left = 128;
- left &= ~0x3fULL;
+ if (*left < 128)
+ *left = 128;
+ *left &= ~0x3fULL;
}
- return left;
}
-static u64 nhm_limit_period(struct perf_event *event, u64 left)
+static void nhm_limit_period(struct perf_event *event, s64 *left)
{
- return max(left, 32ULL);
+ *left = max(*left, 32LL);
}
-static u64 spr_limit_period(struct perf_event *event, u64 left)
+static void spr_limit_period(struct perf_event *event, s64 *left)
{
if (event->attr.precise_ip == 3)
- return max(left, 128ULL);
-
- return left;
+ *left = max(*left, 128LL);
}
PMU_FORMAT_ATTR(event, "config:0-7" );
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -773,7 +773,7 @@ struct x86_pmu {
struct event_constraint *event_constraints;
struct x86_pmu_quirk *quirks;
int perfctr_second_write;
- u64 (*limit_period)(struct perf_event *event, u64 l);
+ void (*limit_period)(struct perf_event *event, s64 *l);
/* PMI handler bits */
unsigned int late_ack :1,
* [PATCH 5/5] perf/x86: Add a x86_pmu::limit_period static_call
2022-05-11 14:20 [PATCH 0/5] perf/x86: Some cleanups Peter Zijlstra
` (3 preceding siblings ...)
2022-05-11 14:20 ` [PATCH 4/5] perf/x86: Change x86_pmu::limit_period signature Peter Zijlstra
@ 2022-05-11 14:20 ` Peter Zijlstra
2022-05-11 15:34 ` [RFC][PATCH 6/5] perf/x86/intel: Remove x86_pmu::set_topdown_event_period Peter Zijlstra
2022-05-11 15:35 ` [RFC][PATCH 7/5] perf/x86/intel: Remove x86_pmu::update_topdown_event Peter Zijlstra
6 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2022-05-11 14:20 UTC (permalink / raw)
To: x86, kan.liang, eranian
Cc: linux-kernel, peterz, acme, mark.rutland, alexander.shishkin,
jolsa, namhyung
Avoid a branch and indirect call.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/events/core.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -72,8 +72,9 @@ DEFINE_STATIC_CALL_NULL(x86_pmu_add, *x
DEFINE_STATIC_CALL_NULL(x86_pmu_del, *x86_pmu.del);
DEFINE_STATIC_CALL_NULL(x86_pmu_read, *x86_pmu.read);
-DEFINE_STATIC_CALL_NULL(x86_pmu_set_period, *x86_pmu.set_period);
-DEFINE_STATIC_CALL_NULL(x86_pmu_update, *x86_pmu.update);
+DEFINE_STATIC_CALL_NULL(x86_pmu_set_period, *x86_pmu.set_period);
+DEFINE_STATIC_CALL_NULL(x86_pmu_update, *x86_pmu.update);
+DEFINE_STATIC_CALL_NULL(x86_pmu_limit_period, *x86_pmu.limit_period);
DEFINE_STATIC_CALL_NULL(x86_pmu_schedule_events, *x86_pmu.schedule_events);
DEFINE_STATIC_CALL_NULL(x86_pmu_get_event_constraints, *x86_pmu.get_event_constraints);
@@ -1391,8 +1392,7 @@ int x86_perf_event_set_period(struct per
if (left > x86_pmu.max_period)
left = x86_pmu.max_period;
- if (x86_pmu.limit_period)
- x86_pmu.limit_period(event, &left);
+ static_call_cond(x86_pmu_limit_period)(event, &left);
this_cpu_write(pmc_prev_left[idx], left);
@@ -2017,6 +2017,7 @@ static void x86_pmu_static_call_update(v
static_call_update(x86_pmu_set_period, x86_pmu.set_period);
static_call_update(x86_pmu_update, x86_pmu.update);
+ static_call_update(x86_pmu_limit_period, x86_pmu.limit_period);
static_call_update(x86_pmu_schedule_events, x86_pmu.schedule_events);
static_call_update(x86_pmu_get_event_constraints, x86_pmu.get_event_constraints);
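Two details worth spelling out (not stated in the changelog): the signature
change in patch 4 is what makes this possible, since static_call_cond() is
only usable with functions returning void (a NULL target degrades to doing
nothing, so there is no return value to synthesize); and the saving is on the
x86_perf_event_set_period() hot path, roughly:

	/* before: test the pointer, then an indirect call */
	if (x86_pmu.limit_period)
		x86_pmu.limit_period(event, &left);

	/* after: one call site, patched at boot to a direct call,
	 * or to (effectively) nothing when no handler is installed */
	static_call_cond(x86_pmu_limit_period)(event, &left);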
* [RFC][PATCH 6/5] perf/x86/intel: Remove x86_pmu::set_topdown_event_period
2022-05-11 14:20 [PATCH 0/5] perf/x86: Some cleanups Peter Zijlstra
` (4 preceding siblings ...)
2022-05-11 14:20 ` [PATCH 5/5] perf/x86: Add a x86_pmu::limit_period static_call Peter Zijlstra
@ 2022-05-11 15:34 ` Peter Zijlstra
2022-05-11 15:35 ` [RFC][PATCH 7/5] perf/x86/intel: Remove x86_pmu::update_topdown_event Peter Zijlstra
6 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2022-05-11 15:34 UTC (permalink / raw)
To: x86, kan.liang, eranian
Cc: linux-kernel, acme, mark.rutland, alexander.shishkin, jolsa, namhyung
Subject: perf/x86/intel: Remove x86_pmu::set_topdown_event_period
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed May 11 16:41:25 CEST 2022
Now that it is all internal to the intel driver, remove
x86_pmu::set_topdown_event_period.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/events/intel/core.c | 16 ++++++++++------
arch/x86/events/perf_event.h | 1 -
2 files changed, 10 insertions(+), 7 deletions(-)
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2520,6 +2520,8 @@ static int adl_set_topdown_event_period(
return icl_set_topdown_event_period(event);
}
+DEFINE_STATIC_CALL(intel_pmu_set_topdown_event_period, x86_perf_event_set_period);
+
static inline u64 icl_get_metrics_event_value(u64 metric, u64 slots, int idx)
{
u32 val;
@@ -2810,9 +2812,8 @@ int intel_pmu_save_and_restart(struct pe
static int intel_pmu_set_period(struct perf_event *event)
{
- if (unlikely(is_topdown_count(event)) &&
- x86_pmu.set_topdown_event_period)
- return x86_pmu.set_topdown_event_period(event);
+ if (unlikely(is_topdown_count(event)))
+ return static_call(intel_pmu_set_topdown_event_period)(event);
return x86_perf_event_set_period(event);
}
@@ -6191,7 +6192,8 @@ __init int intel_pmu_init(void)
intel_pmu_pebs_data_source_skl(pmem);
x86_pmu.num_topdown_events = 4;
x86_pmu.update_topdown_event = icl_update_topdown_event;
- x86_pmu.set_topdown_event_period = icl_set_topdown_event_period;
+ static_call_update(intel_pmu_set_topdown_event_period,
+ &icl_set_topdown_event_period);
pr_cont("Icelake events, ");
name = "icelake";
break;
@@ -6228,7 +6230,8 @@ __init int intel_pmu_init(void)
intel_pmu_pebs_data_source_skl(pmem);
x86_pmu.num_topdown_events = 8;
x86_pmu.update_topdown_event = icl_update_topdown_event;
- x86_pmu.set_topdown_event_period = icl_set_topdown_event_period;
+ static_call_update(intel_pmu_set_topdown_event_period,
+ &icl_set_topdown_event_period);
pr_cont("Sapphire Rapids events, ");
name = "sapphire_rapids";
break;
@@ -6261,7 +6264,8 @@ __init int intel_pmu_init(void)
intel_pmu_pebs_data_source_skl(false);
x86_pmu.num_topdown_events = 8;
x86_pmu.update_topdown_event = adl_update_topdown_event;
- x86_pmu.set_topdown_event_period = adl_set_topdown_event_period;
+ static_call_update(intel_pmu_set_topdown_event_period,
+ &adl_set_topdown_event_period);
x86_pmu.filter_match = intel_pmu_filter_match;
x86_pmu.get_event_constraints = adl_get_event_constraints;
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -879,7 +879,6 @@ struct x86_pmu {
*/
int num_topdown_events;
u64 (*update_topdown_event)(struct perf_event *event);
- int (*set_topdown_event_period)(struct perf_event *event);
/*
* perf task context (i.e. struct perf_event_context::task_ctx_data)
* [RFC][PATCH 7/5] perf/x86/intel: Remove x86_pmu::update_topdown_event
2022-05-11 14:20 [PATCH 0/5] perf/x86: Some cleanups Peter Zijlstra
` (5 preceding siblings ...)
2022-05-11 15:34 ` [RFC][PATCH 6/5] perf/x86/intel: Remove x86_pmu::set_topdown_event_period Peter Zijlstra
@ 2022-05-11 15:35 ` Peter Zijlstra
6 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2022-05-11 15:35 UTC (permalink / raw)
To: x86, kan.liang, eranian
Cc: linux-kernel, acme, mark.rutland, alexander.shishkin, jolsa, namhyung
Subject: perf/x86/intel: Remove x86_pmu::update_topdown_event
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed May 11 17:02:05 CEST 2022
Now that it is all internal to the intel driver, remove
x86_pmu::update_topdown_event.
Assumes that is_topdown_count(event) can only be true when the
hardware has topdown stuff and the function is set.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/events/intel/core.c | 22 ++++++++++++----------
arch/x86/events/perf_event.h | 1 -
2 files changed, 12 insertions(+), 11 deletions(-)
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2672,6 +2672,7 @@ static u64 adl_update_topdown_event(stru
return icl_update_topdown_event(event);
}
+DEFINE_STATIC_CALL(intel_pmu_update_topdown_event, x86_perf_event_update);
static void intel_pmu_read_topdown_event(struct perf_event *event)
{
@@ -2683,7 +2684,7 @@ static void intel_pmu_read_topdown_event
return;
perf_pmu_disable(event->pmu);
- x86_pmu.update_topdown_event(event);
+ static_call(intel_pmu_update_topdown_event)(event);
perf_pmu_enable(event->pmu);
}
@@ -2691,7 +2692,7 @@ static void intel_pmu_read_event(struct
{
if (event->hw.flags & PERF_X86_EVENT_AUTO_RELOAD)
intel_pmu_auto_reload_read(event);
- else if (is_topdown_count(event) && x86_pmu.update_topdown_event)
+ else if (is_topdown_count(event))
intel_pmu_read_topdown_event(event);
else
x86_perf_event_update(event);
@@ -2820,9 +2821,8 @@ static int intel_pmu_set_period(struct p
static u64 intel_pmu_update(struct perf_event *event)
{
- if (unlikely(is_topdown_count(event)) &&
- x86_pmu.update_topdown_event)
- return x86_pmu.update_topdown_event(event);
+ if (unlikely(is_topdown_count(event)))
+ return static_call(intel_pmu_update_topdown_event)(event);
return x86_perf_event_update(event);
}
@@ -2950,8 +2950,7 @@ static int handle_pmi_common(struct pt_r
*/
if (__test_and_clear_bit(GLOBAL_STATUS_PERF_METRICS_OVF_BIT, (unsigned long *)&status)) {
handled++;
- if (x86_pmu.update_topdown_event)
- x86_pmu.update_topdown_event(NULL);
+ static_call(intel_pmu_update_topdown_event)(NULL);
}
/*
@@ -6191,7 +6190,8 @@ __init int intel_pmu_init(void)
x86_pmu.lbr_pt_coexist = true;
intel_pmu_pebs_data_source_skl(pmem);
x86_pmu.num_topdown_events = 4;
- x86_pmu.update_topdown_event = icl_update_topdown_event;
+ static_call_update(intel_pmu_update_topdown_event,
+ &icl_update_topdown_event);
static_call_update(intel_pmu_set_topdown_event_period,
&icl_set_topdown_event_period);
pr_cont("Icelake events, ");
@@ -6229,7 +6229,8 @@ __init int intel_pmu_init(void)
x86_pmu.lbr_pt_coexist = true;
intel_pmu_pebs_data_source_skl(pmem);
x86_pmu.num_topdown_events = 8;
- x86_pmu.update_topdown_event = icl_update_topdown_event;
+ static_call_update(intel_pmu_update_topdown_event,
+ &icl_update_topdown_event);
static_call_update(intel_pmu_set_topdown_event_period,
&icl_set_topdown_event_period);
pr_cont("Sapphire Rapids events, ");
@@ -6263,7 +6264,8 @@ __init int intel_pmu_init(void)
x86_pmu.lbr_pt_coexist = true;
intel_pmu_pebs_data_source_skl(false);
x86_pmu.num_topdown_events = 8;
- x86_pmu.update_topdown_event = adl_update_topdown_event;
+ static_call_update(intel_pmu_update_topdown_event,
+ &adl_update_topdown_event);
static_call_update(intel_pmu_set_topdown_event_period,
&adl_set_topdown_event_period);
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -878,7 +878,6 @@ struct x86_pmu {
* Intel perf metrics
*/
int num_topdown_events;
- u64 (*update_topdown_event)(struct perf_event *event);
/*
* perf task context (i.e. struct perf_event_context::task_ctx_data)
* Re: [PATCH 4/5] perf/x86: Change x86_pmu::limit_period signature
2022-05-11 14:20 ` [PATCH 4/5] perf/x86: Change x86_pmu::limit_period signature Peter Zijlstra
@ 2022-05-11 17:47 ` Liang, Kan
2022-05-11 20:06 ` Peter Zijlstra
0 siblings, 1 reply; 10+ messages in thread
From: Liang, Kan @ 2022-05-11 17:47 UTC (permalink / raw)
To: Peter Zijlstra, x86, eranian
Cc: linux-kernel, acme, mark.rutland, alexander.shishkin, jolsa, namhyung
On 5/11/2022 10:20 AM, Peter Zijlstra wrote:
> In preparation for making it a static_call, change the signature.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
> arch/x86/events/amd/core.c | 8 +++-----
> arch/x86/events/core.c | 18 ++++++++----------
> arch/x86/events/intel/core.c | 19 ++++++++-----------
> arch/x86/events/perf_event.h | 2 +-
> 4 files changed, 20 insertions(+), 27 deletions(-)
>
> --- a/arch/x86/events/amd/core.c
> +++ b/arch/x86/events/amd/core.c
> @@ -1258,16 +1258,14 @@ static void amd_pmu_sched_task(struct pe
> amd_pmu_brs_sched_task(ctx, sched_in);
> }
>
> -static u64 amd_pmu_limit_period(struct perf_event *event, u64 left)
> +static void amd_pmu_limit_period(struct perf_event *event, s64 *left)
> {
> /*
> * Decrease period by the depth of the BRS feature to get the last N
> * taken branches and approximate the desired period
> */
> - if (has_branch_stack(event) && left > x86_pmu.lbr_nr)
> - left -= x86_pmu.lbr_nr;
> -
> - return left;
> + if (has_branch_stack(event) && *left > x86_pmu.lbr_nr)
> + *left -= x86_pmu.lbr_nr;
> }
>
> static __initconst const struct x86_pmu amd_pmu = {
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -621,8 +621,9 @@ int x86_pmu_hw_config(struct perf_event
> event->hw.config |= event->attr.config & X86_RAW_EVENT_MASK;
>
> if (event->attr.sample_period && x86_pmu.limit_period) {
> - if (x86_pmu.limit_period(event, event->attr.sample_period) >
> - event->attr.sample_period)
> + s64 left = event->attr.sample_period;
> + x86_pmu.limit_period(event, &left);
> + if (left > event->attr.sample_period)
> return -EINVAL;
> }
>
> @@ -1386,19 +1387,14 @@ int x86_perf_event_set_period(struct per
> hwc->last_period = period;
> ret = 1;
> }
> - /*
> - * Quirk: certain CPUs dont like it if just 1 hw_event is left:
> - */
> - if (unlikely(left < 2))
> - left = 2;
>
Was the quirk accidentally deleted?
We still need the quirk for certain CPUs.
Thanks,
Kan
> if (left > x86_pmu.max_period)
> left = x86_pmu.max_period;
>
> if (x86_pmu.limit_period)
> - left = x86_pmu.limit_period(event, left);
> + x86_pmu.limit_period(event, &left);
>
> - per_cpu(pmc_prev_left[idx], smp_processor_id()) = left;
> + this_cpu_write(pmc_prev_left[idx], left);
>
> /*
> * The hw event starts counting from this event offset,
> @@ -2672,7 +2668,9 @@ static int x86_pmu_check_period(struct p
> return -EINVAL;
>
> if (value && x86_pmu.limit_period) {
> - if (x86_pmu.limit_period(event, value) > value)
> + s64 left = value;
> + x86_pmu.limit_period(event, &left);
> + if (left > value)
> return -EINVAL;
> }
>
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -4244,28 +4244,25 @@ static u8 adl_get_hybrid_cpu_type(void)
> * Therefore the effective (average) period matches the requested period,
> * despite coarser hardware granularity.
> */
> -static u64 bdw_limit_period(struct perf_event *event, u64 left)
> +static void bdw_limit_period(struct perf_event *event, s64 *left)
> {
> if ((event->hw.config & INTEL_ARCH_EVENT_MASK) ==
> X86_CONFIG(.event=0xc0, .umask=0x01)) {
> - if (left < 128)
> - left = 128;
> - left &= ~0x3fULL;
> + if (*left < 128)
> + *left = 128;
> + *left &= ~0x3fULL;
> }
> - return left;
> }
>
> -static u64 nhm_limit_period(struct perf_event *event, u64 left)
> +static void nhm_limit_period(struct perf_event *event, s64 *left)
> {
> - return max(left, 32ULL);
> + *left = max(*left, 32LL);
> }
>
> -static u64 spr_limit_period(struct perf_event *event, u64 left)
> +static void spr_limit_period(struct perf_event *event, s64 *left)
> {
> if (event->attr.precise_ip == 3)
> - return max(left, 128ULL);
> -
> - return left;
> + *left = max(*left, 128LL);
> }
>
> PMU_FORMAT_ATTR(event, "config:0-7" );
> --- a/arch/x86/events/perf_event.h
> +++ b/arch/x86/events/perf_event.h
> @@ -773,7 +773,7 @@ struct x86_pmu {
> struct event_constraint *event_constraints;
> struct x86_pmu_quirk *quirks;
> int perfctr_second_write;
> - u64 (*limit_period)(struct perf_event *event, u64 l);
> + void (*limit_period)(struct perf_event *event, s64 *l);
>
> /* PMI handler bits */
> unsigned int late_ack :1,
>
>
* Re: [PATCH 4/5] perf/x86: Change x86_pmu::limit_period signature
2022-05-11 17:47 ` Liang, Kan
@ 2022-05-11 20:06 ` Peter Zijlstra
0 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2022-05-11 20:06 UTC (permalink / raw)
To: Liang, Kan
Cc: x86, eranian, linux-kernel, acme, mark.rutland,
alexander.shishkin, jolsa, namhyung
On Wed, May 11, 2022 at 01:47:06PM -0400, Liang, Kan wrote:
> > @@ -1386,19 +1387,14 @@ int x86_perf_event_set_period(struct per
> > hwc->last_period = period;
> > ret = 1;
> > }
> > - /*
> > - * Quirk: certain CPUs dont like it if just 1 hw_event is left:
> > - */
> > - if (unlikely(left < 2))
> > - left = 2;
> >
>
> Is the quirk accidentally deleted?
> We should still need the quirk for certain CPUs.
No, but I did forget to write about it in the Changelog :/
IIRC it was Nehalem that triggered that and that should now be covered
by nhm_limit_period().
Or are you aware of more machines that need this?
Anyway, perhaps this should be its own separate commit.
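Should the clamp turn out to be needed on more than Nehalem, the separate
commit suggested above would presumably just reinstate the removed quirk next
to the max_period clamp; a hypothetical sketch against the post-series code
(not something posted in this thread):

	if (left > x86_pmu.max_period)
		left = x86_pmu.max_period;

	/*
	 * Quirk: certain CPUs don't like it if just 1 hw_event is left:
	 */
	if (unlikely(left < 2))
		left = 2;

	static_call_cond(x86_pmu_limit_period)(event, &left);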