* [PATCH 0/3] perf/x86/intel: force reschedule on TFA changes
From: Stephane Eranian @ 2019-04-04 18:32 UTC
  To: linux-kernel
  Cc: peterz, tglx, ak, kan.liang, mingo, nelson.dsouza, jolsa, tonyj

This short patch series improves the TFA patch series by adding a
guarantee to users each time the allow_tsx_force_abort (TFA) sysctl
control knob is modified.

The current TFA support in perf_events operates as follows:
 - TFA=1
   The PMU has priority over TSX: if PMC3 is needed, TSX transactions are
   forced to abort. The PMU has access to PMC3 and can schedule events on it.

 - TFA=0
   TSX has priority over the PMU. If PMC3 is needed for an event, the event
   must be scheduled on another counter; PMC3 is not available (see the
   constraint sketch below).
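
For reference, the TFA=0 behavior comes from the constraint hook installed
by the original TFA series, tfa_get_event_constraints() (visible in the
context lines of patch 3 below). A rough sketch of that logic, for
illustration only (this helper is part of the earlier TFA series, not of
this one):

static struct event_constraint *
tfa_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
			  struct perf_event *event)
{
	struct event_constraint *c = hsw_get_event_constraints(cpuc, idx, event);

	/*
	 * Without TFA we must not use PMC3: mask it out of the event's
	 * counter constraint so the scheduler picks another counter.
	 */
	if (!allow_tsx_force_abort && test_bit(3, c->idxmsk)) {
		c = dyn_constraint(cpuc, c, idx);
		c->idxmsk64 &= ~(1ULL << 3);
		c->weight--;
	}

	return c;
}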

When a sysadmin modifies TFA, the current code base changes neither the
events being measured at the time nor the actual MSR controlling TFA. If the
kernel transitions from TFA=1 to TFA=0, nothing happens until the events are
descheduled by a context switch, multiplexing, or termination of the
measurement. Until then, TSX transactions keep failing, and there is no easy
way to evaluate how long that can take.

This patch series addresses the issue by rescheduling the events as part of
the sysctl change. That way, it is guaranteed that no perf_events event is
running on PMC3 by the time the write() syscall (from the echo) returns, and
TSX transactions may succeed from then on. Similarly, when transitioning from
TFA=0 to TFA=1, the events are rescheduled and can use PMC3 immediately if
needed, and TSX transactions systematically abort by the time the write()
syscall returns.

To make this work, the series reuses an existing reschedule function in the
generic code, making it visible outside the generic code along with the
context locking helpers, all to avoid code duplication. Given there is no
good way to find the struct pmu if you do not already have it, the series
adds an x86_get_pmu() accessor, which is less than ideal, but I am open to
suggestions here.

Signed-off-by: Stephane Eranian <eranian@google.com>

Stephane Eranian (3):
  perf/core: make perf_ctx_*lock() global inline functions
  perf/core: make ctx_resched() a global function
  perf/x86/intel: force resched when TFA sysctl is modified

 arch/x86/events/core.c       |  4 +++
 arch/x86/events/intel/core.c | 60 ++++++++++++++++++++++++++++++++++--
 arch/x86/events/perf_event.h |  1 +
 include/linux/perf_event.h   | 28 +++++++++++++++++
 kernel/events/core.c         | 37 ++++------------------
 5 files changed, 97 insertions(+), 33 deletions(-)

-- 
2.21.0.392.gf8f6787159e-goog



* [PATCH 1/3] perf/core: make perf_ctx_*lock() global inline functions
From: Stephane Eranian @ 2019-04-04 18:32 UTC
  To: linux-kernel
  Cc: peterz, tglx, ak, kan.liang, mingo, nelson.dsouza, jolsa, tonyj

This patch makes the perf_ctx_lock()/perf_ctx_unlock() inline functions
available throughout the perf_events code, not just in kernel/events/core.c.
This will help with the next patch.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 include/linux/perf_event.h | 16 ++++++++++++++++
 kernel/events/core.c       | 16 ----------------
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 2a1405e907ec..514de997696b 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1283,6 +1283,22 @@ perf_event_addr_filters(struct perf_event *event)
 	return ifh;
 }
 
+static inline void perf_ctx_lock(struct perf_cpu_context *cpuctx,
+			  struct perf_event_context *ctx)
+{
+	raw_spin_lock(&cpuctx->ctx.lock);
+	if (ctx)
+		raw_spin_lock(&ctx->lock);
+}
+
+static inline void perf_ctx_unlock(struct perf_cpu_context *cpuctx,
+			    struct perf_event_context *ctx)
+{
+	if (ctx)
+		raw_spin_unlock(&ctx->lock);
+	raw_spin_unlock(&cpuctx->ctx.lock);
+}
+
 extern void perf_event_addr_filters_sync(struct perf_event *event);
 
 extern int perf_output_begin(struct perf_output_handle *handle,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 833f1bccf25a..429bf6d8be95 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -148,22 +148,6 @@ __get_cpu_context(struct perf_event_context *ctx)
 	return this_cpu_ptr(ctx->pmu->pmu_cpu_context);
 }
 
-static void perf_ctx_lock(struct perf_cpu_context *cpuctx,
-			  struct perf_event_context *ctx)
-{
-	raw_spin_lock(&cpuctx->ctx.lock);
-	if (ctx)
-		raw_spin_lock(&ctx->lock);
-}
-
-static void perf_ctx_unlock(struct perf_cpu_context *cpuctx,
-			    struct perf_event_context *ctx)
-{
-	if (ctx)
-		raw_spin_unlock(&ctx->lock);
-	raw_spin_unlock(&cpuctx->ctx.lock);
-}
-
 #define TASK_TOMBSTONE ((void *)-1L)
 
 static bool is_kernel_event(struct perf_event *event)
-- 
2.21.0.392.gf8f6787159e-goog



* [PATCH 2/3] perf/core: make ctx_resched() a global function
From: Stephane Eranian @ 2019-04-04 18:32 UTC
  To: linux-kernel
  Cc: peterz, tglx, ak, kan.liang, mingo, nelson.dsouza, jolsa, tonyj

This patch renames ctx_resched() to perf_ctx_resched() and makes the
function globally accessible. This prepares for the next patch, which
needs to call this function from arch-specific code.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 include/linux/perf_event.h | 12 ++++++++++++
 kernel/events/core.c       | 21 ++++++---------------
 2 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 514de997696b..150cfd493ad2 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -829,6 +829,15 @@ struct bpf_perf_event_data_kern {
 	struct perf_event *event;
 };
 
+enum event_type_t {
+	EVENT_FLEXIBLE = 0x1,
+	EVENT_PINNED = 0x2,
+	EVENT_TIME = 0x4,
+	/* see ctx_resched() for details */
+	EVENT_CPU = 0x8,
+	EVENT_ALL = EVENT_FLEXIBLE | EVENT_PINNED,
+};
+
 #ifdef CONFIG_CGROUP_PERF
 
 /*
@@ -895,6 +904,9 @@ extern void perf_sched_cb_dec(struct pmu *pmu);
 extern void perf_sched_cb_inc(struct pmu *pmu);
 extern int perf_event_task_disable(void);
 extern int perf_event_task_enable(void);
+extern void perf_ctx_resched(struct perf_cpu_context *cpuctx,
+			struct perf_event_context *task_ctx,
+			enum event_type_t event_type);
 extern int perf_event_refresh(struct perf_event *event, int refresh);
 extern void perf_event_update_userpage(struct perf_event *event);
 extern int perf_event_release_kernel(struct perf_event *event);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 429bf6d8be95..48b955a2b7f1 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -338,15 +338,6 @@ static void event_function_local(struct perf_event *event, event_f func, void *d
 	(PERF_SAMPLE_BRANCH_KERNEL |\
 	 PERF_SAMPLE_BRANCH_HV)
 
-enum event_type_t {
-	EVENT_FLEXIBLE = 0x1,
-	EVENT_PINNED = 0x2,
-	EVENT_TIME = 0x4,
-	/* see ctx_resched() for details */
-	EVENT_CPU = 0x8,
-	EVENT_ALL = EVENT_FLEXIBLE | EVENT_PINNED,
-};
-
 /*
  * perf_sched_events : >0 events exist
  * perf_cgroup_events: >0 per-cpu cgroup events exist on this cpu
@@ -2430,9 +2421,9 @@ static void perf_event_sched_in(struct perf_cpu_context *cpuctx,
  * event_type is a bit mask of the types of events involved. For CPU events,
  * event_type is only either EVENT_PINNED or EVENT_FLEXIBLE.
  */
-static void ctx_resched(struct perf_cpu_context *cpuctx,
-			struct perf_event_context *task_ctx,
-			enum event_type_t event_type)
+void perf_ctx_resched(struct perf_cpu_context *cpuctx,
+		      struct perf_event_context *task_ctx,
+		      enum event_type_t event_type)
 {
 	enum event_type_t ctx_event_type;
 	bool cpu_event = !!(event_type & EVENT_CPU);
@@ -2520,7 +2511,7 @@ static int  __perf_install_in_context(void *info)
 	if (reprogram) {
 		ctx_sched_out(ctx, cpuctx, EVENT_TIME);
 		add_event_to_ctx(event, ctx);
-		ctx_resched(cpuctx, task_ctx, get_event_type(event));
+		perf_ctx_resched(cpuctx, task_ctx, get_event_type(event));
 	} else {
 		add_event_to_ctx(event, ctx);
 	}
@@ -2664,7 +2655,7 @@ static void __perf_event_enable(struct perf_event *event,
 	if (ctx->task)
 		WARN_ON_ONCE(task_ctx != ctx);
 
-	ctx_resched(cpuctx, task_ctx, get_event_type(event));
+	perf_ctx_resched(cpuctx, task_ctx, get_event_type(event));
 }
 
 /*
@@ -3782,7 +3773,7 @@ static void perf_event_enable_on_exec(int ctxn)
 	 */
 	if (enabled) {
 		clone_ctx = unclone_ctx(ctx);
-		ctx_resched(cpuctx, ctx, event_type);
+		perf_ctx_resched(cpuctx, ctx, event_type);
 	} else {
 		ctx_sched_in(ctx, cpuctx, EVENT_TIME, current);
 	}
-- 
2.21.0.392.gf8f6787159e-goog



* [PATCH 3/3] perf/x86/intel: force resched when TFA sysctl is modified
From: Stephane Eranian @ 2019-04-04 18:32 UTC
  To: linux-kernel
  Cc: peterz, tglx, ak, kan.liang, mingo, nelson.dsouza, jolsa, tonyj

This patch provides a guarantee to the sysadmin that, when TFA is disabled,
no PMU event is using PMC3 by the time the echo command returns. Conversely,
when TFA is enabled, the PMU can use PMC3 immediately (to eliminate possible
multiplexing).

$ perf stat -a -I 1000 --no-merge -e branches,branches,branches,branches
     1.000123979    125,768,725,208      branches
     1.000562520    125,631,000,456      branches
     1.000942898    125,487,114,291      branches
     1.001333316    125,323,363,620      branches
     2.004721306    125,514,968,546      branches
     2.005114560    125,511,110,861      branches
     2.005482722    125,510,132,724      branches
     2.005851245    125,508,967,086      branches
     3.006323475    125,166,570,648      branches
     3.006709247    125,165,650,056      branches
     3.007086605    125,164,639,142      branches
     3.007459298    125,164,402,912      branches
     4.007922698    125,045,577,140      branches
     4.008310775    125,046,804,324      branches
     4.008670814    125,048,265,111      branches
     4.009039251    125,048,677,611      branches
     5.009503373    125,122,240,217      branches
     5.009897067    125,122,450,517      branches

Then, on another connection, the sysadmin does:
$ echo 0 >/sys/devices/cpu/allow_tsx_force_abort

Then perf stat adjusts the events immediately:

     5.010286029    125,121,393,483      branches
     5.010646308    125,120,556,786      branches
     6.011113588    124,963,351,832      branches
     6.011510331    124,964,267,566      branches
     6.011889913    124,964,829,130      branches
     6.012262996    124,965,841,156      branches
     7.012708299    124,419,832,234      branches [79.69%]
     7.012847908    124,416,363,853      branches [79.73%]
     7.013225462    124,400,723,712      branches [79.73%]
     7.013598191    124,376,154,434      branches [79.70%]
     8.014089834    124,250,862,693      branches [74.98%]
     8.014481363    124,267,539,139      branches [74.94%]
     8.014856006    124,259,519,786      branches [74.98%]
     8.014980848    124,225,457,969      branches [75.04%]
     9.015464576    124,204,235,423      branches [75.03%]
     9.015858587    124,204,988,490      branches [75.04%]
     9.016243680    124,220,092,486      branches [74.99%]
     9.016620104    124,231,260,146      branches [74.94%]

And vice versa, if the sysadmin does:
$ echo 1 >/sys/devices/cpu/allow_tsx_force_abort

Events are again spread over the 4 counters:

    10.017096277    124,276,230,565      branches [74.96%]
    10.017237209    124,228,062,171      branches [75.03%]
    10.017478637    124,178,780,626      branches [75.03%]
    10.017853402    124,198,316,177      branches [75.03%]
    11.018334423    124,602,418,933      branches [85.40%]
    11.018722584    124,602,921,320      branches [85.42%]
    11.019095621    124,603,956,093      branches [85.42%]
    11.019467742    124,595,273,783      branches [85.42%]
    12.019945736    125,110,114,864      branches
    12.020330764    125,109,334,472      branches
    12.020688740    125,109,818,865      branches
    12.021054020    125,108,594,014      branches
    13.021516774    125,109,164,018      branches
    13.021903640    125,108,794,510      branches
    13.022270770    125,107,756,978      branches
    13.022630819    125,109,380,471      branches
    14.023114989    125,133,140,817      branches
    14.023501880    125,133,785,858      branches
    14.023868339    125,133,852,700      branches

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 arch/x86/events/core.c       |  4 +++
 arch/x86/events/intel/core.c | 60 ++++++++++++++++++++++++++++++++++--
 arch/x86/events/perf_event.h |  1 +
 3 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 12d7d591843e..314173f89cc8 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -677,6 +677,10 @@ static inline int is_x86_event(struct perf_event *event)
 	return event->pmu == &pmu;
 }
 
+struct pmu *x86_get_pmu(void)
+{
+	return &pmu;
+}
 /*
  * Event scheduler state:
  *
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index a4b7711ef0ee..8d356c2096bc 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4483,6 +4483,60 @@ static ssize_t freeze_on_smi_store(struct device *cdev,
 	return count;
 }
 
+static void update_tfa_sched(void *ignored)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct pmu *pmu = x86_get_pmu();
+	struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
+	struct perf_event_context *task_ctx = cpuctx->task_ctx;
+
+	/* prevent any changes to the two contexts */
+	perf_ctx_lock(cpuctx, task_ctx);
+
+	/*
+	 * check if PMC3 is used
+	 * and if so force schedule out for all event types all contexts
+	 */
+	if (test_bit(3, cpuc->active_mask))
+		perf_ctx_resched(cpuctx, task_ctx, EVENT_ALL|EVENT_CPU);
+
+	perf_ctx_unlock(cpuctx, task_ctx);
+}
+
+static ssize_t show_sysctl_tfa(struct device *cdev,
+			      struct device_attribute *attr,
+			      char *buf)
+{
+	return snprintf(buf, 40, "%d\n", allow_tsx_force_abort);
+}
+
+static ssize_t set_sysctl_tfa(struct device *cdev,
+			      struct device_attribute *attr,
+			      const char *buf, size_t count)
+{
+	unsigned long val;
+	ssize_t ret;
+
+	ret = kstrtoul(buf, 0, &val);
+	if (ret)
+		return ret;
+
+	/* looking for boolean value */
+	if (val > 2)
+		return -EINVAL;
+
+	/* no change */
+	if (val == allow_tsx_force_abort)
+		return count;
+
+	allow_tsx_force_abort ^= 1;
+
+	on_each_cpu(update_tfa_sched, NULL, 1);
+
+	return count;
+}
+
+
 static DEVICE_ATTR_RW(freeze_on_smi);
 
 static ssize_t branches_show(struct device *cdev,
@@ -4515,7 +4569,9 @@ static struct attribute *intel_pmu_caps_attrs[] = {
        NULL
 };
 
-static DEVICE_BOOL_ATTR(allow_tsx_force_abort, 0644, allow_tsx_force_abort);
+static DEVICE_ATTR(allow_tsx_force_abort, 0644,
+		   show_sysctl_tfa,
+		   set_sysctl_tfa);
 
 static struct attribute *intel_pmu_attrs[] = {
 	&dev_attr_freeze_on_smi.attr,
@@ -5026,7 +5082,7 @@ __init int intel_pmu_init(void)
 			x86_pmu.get_event_constraints = tfa_get_event_constraints;
 			x86_pmu.enable_all = intel_tfa_pmu_enable_all;
 			x86_pmu.commit_scheduling = intel_tfa_commit_scheduling;
-			intel_pmu_attrs[1] = &dev_attr_allow_tsx_force_abort.attr.attr;
+			intel_pmu_attrs[1] = &dev_attr_allow_tsx_force_abort.attr;
 		}
 
 		pr_cont("Skylake events, ");
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index fff8868f92a8..437255603810 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -785,6 +785,7 @@ static struct perf_pmu_events_ht_attr event_attr_##v = {		\
 	.event_str_ht	= ht,						\
 }
 
+struct pmu *x86_get_pmu(void);
 extern struct x86_pmu x86_pmu __read_mostly;
 
 static inline bool x86_pmu_has_lbr_callstack(void)
-- 
2.21.0.392.gf8f6787159e-goog



* Re: [PATCH 3/3] perf/x86/intel: force resched when TFA sysctl is modified
From: Peter Zijlstra @ 2019-04-05  7:13 UTC
  To: Stephane Eranian
  Cc: linux-kernel, tglx, ak, kan.liang, mingo, nelson.dsouza, jolsa, tonyj

On Thu, Apr 04, 2019 at 11:32:19AM -0700, Stephane Eranian wrote:
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index a4b7711ef0ee..8d356c2096bc 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -4483,6 +4483,60 @@ static ssize_t freeze_on_smi_store(struct device *cdev,
>  	return count;
>  }
>  
> +static void update_tfa_sched(void *ignored)
> +{
> +	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +	struct pmu *pmu = x86_get_pmu();
> +	struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
> +	struct perf_event_context *task_ctx = cpuctx->task_ctx;
> +
> +	/* prevent any changes to the two contexts */
> +	perf_ctx_lock(cpuctx, task_ctx);
> +
> +	/*
> +	 * check if PMC3 is used
> +	 * and if so force schedule out for all event types all contexts
> +	 */
> +	if (test_bit(3, cpuc->active_mask))
> +		perf_ctx_resched(cpuctx, task_ctx, EVENT_ALL|EVENT_CPU);
> +
> +	perf_ctx_unlock(cpuctx, task_ctx);

I'm not particularly happy with exporting all that. Can't we create this
new perf_ctx_resched() to include the locking and everything. Then the
above reduces to:

	if (test_bit(3, cpuc->active_mask))
		perf_ctx_resched(cpuctx);

And we don't get to export the tricky bits.
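
Something along these lines — a sketch only, with the EVENT_ALL|EVENT_CPU
mask assumed hard-coded inside the core:

void perf_ctx_resched(struct perf_cpu_context *cpuctx)
{
	struct perf_event_context *task_ctx = cpuctx->task_ctx;

	/* take both contexts; reschedule all event types on this CPU */
	perf_ctx_lock(cpuctx, task_ctx);
	ctx_resched(cpuctx, task_ctx, EVENT_ALL|EVENT_CPU);
	perf_ctx_unlock(cpuctx, task_ctx);
}

Then perf_ctx_lock()/perf_ctx_unlock() and ctx_resched() can stay private
to kernel/events/core.c.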

> +}
> +
> +static ssize_t show_sysctl_tfa(struct device *cdev,
> +			      struct device_attribute *attr,
> +			      char *buf)
> +{
> +	return snprintf(buf, 40, "%d\n", allow_tsx_force_abort);
> +}
> +
> +static ssize_t set_sysctl_tfa(struct device *cdev,
> +			      struct device_attribute *attr,
> +			      const char *buf, size_t count)
> +{
> +	unsigned long val;
> +	ssize_t ret;
> +
> +	ret = kstrtoul(buf, 0, &val);

You want kstrtobool()

> +	if (ret)
> +		return ret;
> +
> +	/* looking for boolean value */
> +	if (val > 2)
> +		return -EINVAL;
> +
> +	/* no change */
> +	if (val == allow_tsx_force_abort)
> +		return count;
> +
> +	allow_tsx_force_abort ^= 1;

	allow_tsx_force_abort = val;

is simpler

> +
> +	on_each_cpu(update_tfa_sched, NULL, 1);
> +
> +	return count;
> +}


* Re: [PATCH 3/3] perf/x86/intel: force resched when TFA sysctl is modified
From: Stephane Eranian @ 2019-04-05 17:00 UTC
  To: Peter Zijlstra
  Cc: LKML, Thomas Gleixner, Andi Kleen, Liang, Kan, mingo,
	nelson.dsouza, Jiri Olsa, tonyj

On Fri, Apr 5, 2019 at 12:13 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Apr 04, 2019 at 11:32:19AM -0700, Stephane Eranian wrote:
> > diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> > index a4b7711ef0ee..8d356c2096bc 100644
> > --- a/arch/x86/events/intel/core.c
> > +++ b/arch/x86/events/intel/core.c
> > @@ -4483,6 +4483,60 @@ static ssize_t freeze_on_smi_store(struct device *cdev,
> >       return count;
> >  }
> >
> > +static void update_tfa_sched(void *ignored)
> > +{
> > +     struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> > +     struct pmu *pmu = x86_get_pmu();
> > +     struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
> > +     struct perf_event_context *task_ctx = cpuctx->task_ctx;
> > +
> > +     /* prevent any changes to the two contexts */
> > +     perf_ctx_lock(cpuctx, task_ctx);
> > +
> > +     /*
> > +      * check if PMC3 is used
> > +      * and if so force schedule out for all event types all contexts
> > +      */
> > +     if (test_bit(3, cpuc->active_mask))
> > +             perf_ctx_resched(cpuctx, task_ctx, EVENT_ALL|EVENT_CPU);
> > +
> > +     perf_ctx_unlock(cpuctx, task_ctx);
>
> I'm not particularly happy with exporting all that. Can't we create this
> new perf_ctx_resched() to include the locking and everything. Then the
> above reduces to:
>
>         if (test_bit(3, cpuc->active_mask))
>                 perf_ctx_resched(cpuctx);
>
> And we don't get to export the tricky bits.
>
The only reason I exported the locking is to protect cpuc->active_mask. But
if you think there is no race, then sure, we can just export a new
perf_ctx_resched() that does the locking and invokes the ctx_resched()
function.

> > +}
> > +
> > +static ssize_t show_sysctl_tfa(struct device *cdev,
> > +                           struct device_attribute *attr,
> > +                           char *buf)
> > +{
> > +     return snprintf(buf, 40, "%d\n", allow_tsx_force_abort);
> > +}
> > +
> > +static ssize_t set_sysctl_tfa(struct device *cdev,
> > +                           struct device_attribute *attr,
> > +                           const char *buf, size_t count)
> > +{
> > +     unsigned long val;
> > +     ssize_t ret;
> > +
> > +     ret = kstrtoul(buf, 0, &val);
>
> You want kstrtobool()
>
ok.

> > +     if (ret)
> > +             return ret;
> > +
> > +     /* looking for boolean value */
> > +     if (val > 2)
> > +             return -EINVAL;
> > +
> > +     /* no change */
> > +     if (val == allow_tsx_force_abort)
> > +             return count;
> > +
> > +     allow_tsx_force_abort ^= 1;
>
>         allow_tsx_force_abort = val;
>
> is simpler
>
ok.

> > +
> > +     on_each_cpu(update_tfa_sched, NULL, 1);
> > +
> > +     return count;
> > +}


* Re: [PATCH 3/3] perf/x86/intel: force resched when TFA sysctl is modified
From: Peter Zijlstra @ 2019-04-05 20:26 UTC
  To: Stephane Eranian
  Cc: LKML, Thomas Gleixner, Andi Kleen, Liang, Kan, mingo,
	nelson.dsouza, Jiri Olsa, tonyj

On Fri, Apr 05, 2019 at 10:00:03AM -0700, Stephane Eranian wrote:

> > > +static void update_tfa_sched(void *ignored)
> > > +{
> > > +     struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> > > +     struct pmu *pmu = x86_get_pmu();
> > > +     struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
> > > +     struct perf_event_context *task_ctx = cpuctx->task_ctx;
> > > +
> > > +     /* prevent any changes to the two contexts */
> > > +     perf_ctx_lock(cpuctx, task_ctx);
> > > +
> > > +     /*
> > > +      * check if PMC3 is used
> > > +      * and if so force schedule out for all event types all contexts
> > > +      */
> > > +     if (test_bit(3, cpuc->active_mask))
> > > +             perf_ctx_resched(cpuctx, task_ctx, EVENT_ALL|EVENT_CPU);
> > > +
> > > +     perf_ctx_unlock(cpuctx, task_ctx);
> >
> > I'm not particularly happy with exporting all that. Can't we create this
> > new perf_ctx_resched() to include the locking and everything. Then the
> > above reduces to:
> >
> >         if (test_bit(3, cpuc->active_mask))
> >                 perf_ctx_resched(cpuctx);
> >
> > And we don't get to export the tricky bits.
> >
> The only reason I exported the locking is to protect
> cpuc->active_mask. But if you
> think there is no race, then sure,  we can just export a new
> perf_ctx_resched() that
> does the locking and invokes the ctx_resched() function.

It doesn't matter if it races: if it was used and isn't anymore, it's
a pointless reschedule; if it isn't used and we don't reschedule, it
cannot be used, because we've already set the flag.


* Re: [PATCH 3/3] perf/x86/intel: force resched when TFA sysctl is modified
From: Stephane Eranian @ 2019-04-06  0:49 UTC
  To: Peter Zijlstra
  Cc: LKML, Thomas Gleixner, Andi Kleen, Liang, Kan, mingo,
	nelson.dsouza, Jiri Olsa, tonyj

On Fri, Apr 5, 2019 at 1:26 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Apr 05, 2019 at 10:00:03AM -0700, Stephane Eranian wrote:
>
> > > > +static void update_tfa_sched(void *ignored)
> > > > +{
> > > > +     struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> > > > +     struct pmu *pmu = x86_get_pmu();
> > > > +     struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
> > > > +     struct perf_event_context *task_ctx = cpuctx->task_ctx;
> > > > +
> > > > +     /* prevent any changes to the two contexts */
> > > > +     perf_ctx_lock(cpuctx, task_ctx);
> > > > +
> > > > +     /*
> > > > +      * check if PMC3 is used
> > > > +      * and if so force schedule out for all event types all contexts
> > > > +      */
> > > > +     if (test_bit(3, cpuc->active_mask))
> > > > +             perf_ctx_resched(cpuctx, task_ctx, EVENT_ALL|EVENT_CPU);
> > > > +
> > > > +     perf_ctx_unlock(cpuctx, task_ctx);
> > >
> > > I'm not particularly happy with exporting all that. Can't we create this
> > > new perf_ctx_resched() to include the locking and everything. Then the
> > > above reduces to:
> > >
> > >         if (test_bit(3, cpuc->active_mask))
> > >                 perf_ctx_resched(cpuctx);
> > >
> > > And we don't get to export the tricky bits.
> > >
> > The only reason I exported the locking is to protect
> > cpuc->active_mask. But if you
> > think there is no race, then sure,  we can just export a new
> > perf_ctx_resched() that
> > does the locking and invokes the ctx_resched() function.
>
> It doesn't matter if it races: if it was used and isn't anymore, it's
> a pointless reschedule; if it isn't used and we don't reschedule, it
> cannot be used, because we've already set the flag.

True. I will post V2 shortly.

