linux-arm-kernel.lists.infradead.org archive mirror
* Re: [PATCH] perf: Rewrite core context handling
       [not found] <20220829113347.295-1-ravi.bangoria@amd.com>
@ 2022-08-29 12:04 ` Peter Zijlstra
  2022-08-29 14:40   ` Peter Zijlstra
  2022-09-06  5:50 ` Ravi Bangoria
  2022-09-07 11:28 ` Ravi Bangoria
  2 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2022-08-29 12:04 UTC (permalink / raw)
  To: Ravi Bangoria
  Cc: acme, alexander.shishkin, jolsa, namhyung, songliubraving,
	eranian, ak, mark.rutland, frederic, maddy, irogers, will, robh,
	mingo, catalin.marinas, linux-arm-kernel, linux-perf-users,
	linuxppc-dev, linux-s390, linux-kernel, sandipan.das,
	ananth.narayan, kim.phillips, santosh.shukla

On Mon, Aug 29, 2022 at 05:03:47PM +0530, Ravi Bangoria wrote:
> @@ -12598,6 +12590,7 @@ EXPORT_SYMBOL_GPL(perf_event_create_kernel_counter);
>  
>  void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
>  {
> +#if 0 // XXX buggered - cpu hotplug, who cares
>  	struct perf_event_context *src_ctx;
>  	struct perf_event_context *dst_ctx;
>  	struct perf_event *event, *tmp;
> @@ -12658,6 +12651,7 @@ void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
>  	}
>  	mutex_unlock(&dst_ctx->mutex);
>  	mutex_unlock(&src_ctx->mutex);
> +#endif
>  }
>  EXPORT_SYMBOL_GPL(perf_pmu_migrate_context);
>  

Note to self; fix this :-) I'll see if I have time for that later today.


* Re: [PATCH] perf: Rewrite core context handling
  2022-08-29 12:04 ` [PATCH] perf: Rewrite core context handling Peter Zijlstra
@ 2022-08-29 14:40   ` Peter Zijlstra
  2022-09-01 10:35     ` Ravi Bangoria
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2022-08-29 14:40 UTC (permalink / raw)
  To: Ravi Bangoria
  Cc: acme, alexander.shishkin, jolsa, namhyung, songliubraving,
	eranian, ak, mark.rutland, frederic, maddy, irogers, will, robh,
	mingo, catalin.marinas, linux-arm-kernel, linux-perf-users,
	linuxppc-dev, linux-s390, linux-kernel, sandipan.das,
	ananth.narayan, kim.phillips, santosh.shukla

On Mon, Aug 29, 2022 at 02:04:33PM +0200, Peter Zijlstra wrote:
> On Mon, Aug 29, 2022 at 05:03:47PM +0530, Ravi Bangoria wrote:
> > @@ -12598,6 +12590,7 @@ EXPORT_SYMBOL_GPL(perf_event_create_kernel_counter);
> >  
> >  void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
> >  {
> > +#if 0 // XXX buggered - cpu hotplug, who cares
> >  	struct perf_event_context *src_ctx;
> >  	struct perf_event_context *dst_ctx;
> >  	struct perf_event *event, *tmp;
> > @@ -12658,6 +12651,7 @@ void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
> >  	}
> >  	mutex_unlock(&dst_ctx->mutex);
> >  	mutex_unlock(&src_ctx->mutex);
> > +#endif
> >  }
> >  EXPORT_SYMBOL_GPL(perf_pmu_migrate_context);
> >  
> 
> Note to self; fix this :-) I'll see if I have time for that later today.

Urgh, while going through that it appears the whole refcounting thing
isn't fully done either.


* Re: [PATCH] perf: Rewrite core context handling
  2022-08-29 14:40   ` Peter Zijlstra
@ 2022-09-01 10:35     ` Ravi Bangoria
  2022-09-01 11:29       ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: Ravi Bangoria @ 2022-09-01 10:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: acme, alexander.shishkin, jolsa, namhyung, songliubraving,
	eranian, ak, mark.rutland, frederic, maddy, irogers, will, robh,
	mingo, catalin.marinas, linux-arm-kernel, linux-perf-users,
	linuxppc-dev, linux-s390, linux-kernel, sandipan.das,
	ananth.narayan, kim.phillips, santosh.shukla, ravi.bangoria

On 29-Aug-22 8:10 PM, Peter Zijlstra wrote:
> On Mon, Aug 29, 2022 at 02:04:33PM +0200, Peter Zijlstra wrote:
>> On Mon, Aug 29, 2022 at 05:03:47PM +0530, Ravi Bangoria wrote:
>>> @@ -12598,6 +12590,7 @@ EXPORT_SYMBOL_GPL(perf_event_create_kernel_counter);
>>>  
>>>  void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
>>>  {
>>> +#if 0 // XXX buggered - cpu hotplug, who cares
>>>  	struct perf_event_context *src_ctx;
>>>  	struct perf_event_context *dst_ctx;
>>>  	struct perf_event *event, *tmp;
>>> @@ -12658,6 +12651,7 @@ void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
>>>  	}
>>>  	mutex_unlock(&dst_ctx->mutex);
>>>  	mutex_unlock(&src_ctx->mutex);
>>> +#endif
>>>  }
>>>  EXPORT_SYMBOL_GPL(perf_pmu_migrate_context);
>>>  
>>
>> Note to self; fix this :-) I'll see if I have time for that later today.
> 
> Urgh, while going through that it appears the whole refcounting thing
> isn't fully done either.

Not sure I follow. Can you please elaborate?

Thanks,
Ravi


* Re: [PATCH] perf: Rewrite core context handling
  2022-09-01 10:35     ` Ravi Bangoria
@ 2022-09-01 11:29       ` Peter Zijlstra
  2022-09-05  4:40         ` Ravi Bangoria
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2022-09-01 11:29 UTC (permalink / raw)
  To: Ravi Bangoria
  Cc: acme, alexander.shishkin, jolsa, namhyung, songliubraving,
	eranian, ak, mark.rutland, frederic, maddy, irogers, will, robh,
	mingo, catalin.marinas, linux-arm-kernel, linux-perf-users,
	linuxppc-dev, linux-s390, linux-kernel, sandipan.das,
	ananth.narayan, kim.phillips, santosh.shukla

On Thu, Sep 01, 2022 at 04:05:53PM +0530, Ravi Bangoria wrote:
> On 29-Aug-22 8:10 PM, Peter Zijlstra wrote:
> > On Mon, Aug 29, 2022 at 02:04:33PM +0200, Peter Zijlstra wrote:
> >> On Mon, Aug 29, 2022 at 05:03:47PM +0530, Ravi Bangoria wrote:
> >>> @@ -12598,6 +12590,7 @@ EXPORT_SYMBOL_GPL(perf_event_create_kernel_counter);
> >>>  
> >>>  void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
> >>>  {
> >>> +#if 0 // XXX buggered - cpu hotplug, who cares
> >>>  	struct perf_event_context *src_ctx;
> >>>  	struct perf_event_context *dst_ctx;
> >>>  	struct perf_event *event, *tmp;
> >>> @@ -12658,6 +12651,7 @@ void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
> >>>  	}
> >>>  	mutex_unlock(&dst_ctx->mutex);
> >>>  	mutex_unlock(&src_ctx->mutex);
> >>> +#endif
> >>>  }
> >>>  EXPORT_SYMBOL_GPL(perf_pmu_migrate_context);
> >>>  
> >>
> >> Note to self; fix this :-) I'll see if I have time for that later today.
> > 
> > Urgh, while going through that it appears the whole refcounting thing
> > isn't fully done either.
> 
> Not sure I follow. Can you please elaborate?

Sure, and sorry, I've not been feeling too well the past few days; I
meant to post something, but I just couldn't find it in me to finish it,
let alone make it work.

So the basic issue I mentioned is that:


/*
 *           ,------------------------[1:n]---------------------.
 *           V                                                  V
 * perf_event_context <-[1:n]-> perf_event_pmu_context <--- perf_event
 *           ^                      ^     |                     |
 *           `--------[1:n]---------'     `-[n:1]-> pmu <-[1:n]-'
 *
 *
 * XXX destroy epc when empty
 *   refcount, !rcu
 *
 * XXX epc locking
 *
 *   event->pmu_ctx            ctx->mutex && inactive
 *   ctx->pmu_ctx_list         ctx->mutex && ctx->lock
 *
 */
struct perf_event_pmu_context {
	...
        atomic_t                        refcount; /* event <-> epc */
	...
}

But that then also suggests that:

struct perf_event_context {
	...
        refcount_t                      refcount;
	...
}

should be: ctx <-> epc, but that is not so; the ctx::refcount still
counts the number of events.

Now this might not be bad, but it is confusing.

Anyway; below is what I have, but it is entirely unfinished, untested,
might eat your granny, etc.


---
Index: linux-2.6/include/linux/perf_event.h
===================================================================
--- linux-2.6.orig/include/linux/perf_event.h
+++ linux-2.6/include/linux/perf_event.h
@@ -883,7 +883,7 @@ struct perf_event_context {
 	int				nr_freq;
 	int				rotate_disable;
 
-	refcount_t			refcount;
+	refcount_t			refcount; /* ctx <-> epc */
 	struct task_struct		*task;
 
 	/*
Index: linux-2.6/kernel/events/core.c
===================================================================
--- linux-2.6.orig/kernel/events/core.c
+++ linux-2.6/kernel/events/core.c
@@ -2750,7 +2750,7 @@ perf_install_in_context(struct perf_even
 	WARN_ON_ONCE(!exclusive_event_installable(event, ctx));
 
 	if (event->cpu != -1)
-		event->cpu = cpu;
+		WARN_ON_ONCE(event->cpu != cpu);
 
 	/*
 	 * Ensures that if we can observe event->ctx, both the event and ctx
@@ -4764,6 +4764,7 @@ find_get_pmu_context(struct pmu *pmu, st
 			list_add(&epc->pmu_ctx_entry, &ctx->pmu_ctx_list);
 			epc->ctx = ctx;
 			raw_spin_unlock_irq(&ctx->lock);
+			// XXX get_ctx(ctx);
 		} else {
 			WARN_ON_ONCE(epc->ctx != ctx);
 			atomic_inc(&epc->refcount);
@@ -4800,6 +4801,7 @@ find_get_pmu_context(struct pmu *pmu, st
 
 	list_add(&epc->pmu_ctx_entry, &ctx->pmu_ctx_list);
 	epc->ctx = ctx;
+	// XXX get_ctx(ctx)
 
 found_epc:
 	if (task_ctx_data && !epc->task_ctx_data) {
@@ -4837,6 +4839,8 @@ static void put_pmu_ctx(struct perf_even
 		list_del_init(&epc->pmu_ctx_entry);
 		epc->ctx = NULL;
 		raw_spin_unlock_irqrestore(&ctx->lock, flags);
+		
+		// XXX put_ctx(ctx)
 	}
 
 	WARN_ON_ONCE(!list_empty(&epc->pinned_active));
@@ -12444,6 +12448,8 @@ SYSCALL_DEFINE5(perf_event_open,
 	perf_unpin_context(ctx);
 
 	mutex_unlock(&ctx->mutex);
+	
+	// XXX put_ctx(ctx);
 
 	if (task) {
 		up_read(&task->signal->exec_update_lock);
@@ -12588,34 +12594,44 @@ err:
 }
 EXPORT_SYMBOL_GPL(perf_event_create_kernel_counter);
 
-void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
+static void __perf_pmu_remove(struct perf_event_context *ctx,
+			      int cpu, struct pmu *pmu,
+			      struct perf_event_groups *groups,
+			      struct list_head *events)
 {
-#if 0 // XXX buggered - cpu hotplug, who cares
-	struct perf_event_context *src_ctx;
-	struct perf_event_context *dst_ctx;
-	struct perf_event *event, *tmp;
-	LIST_HEAD(events);
+	struct perf_event *event;
 
-	src_ctx = &per_cpu_ptr(pmu->pmu_cpu_context, src_cpu)->ctx;
-	dst_ctx = &per_cpu_ptr(pmu->pmu_cpu_context, dst_cpu)->ctx;
+	for (event = perf_event_groups_first(groups, cpu, pmu, NULL);
+	     event;
+	     event = perf_event_groups_next(event, pmu)) {
 
-	/*
-	 * See perf_event_ctx_lock() for comments on the details
-	 * of swizzling perf_event::ctx.
-	 */
-	mutex_lock_double(&src_ctx->mutex, &dst_ctx->mutex);
-	list_for_each_entry_safe(event, tmp, &src_ctx->event_list,
-				 event_entry) {
 		perf_remove_from_context(event, 0);
-		unaccount_event_cpu(event, src_cpu);
-		put_ctx(src_ctx);
-		list_add(&event->migrate_entry, &events);
+		unaccount_event_cpu(event, cpu);
+		put_pmu_ctx(event->pmu_ctx);
+		list_add(&event->migrate_entry, events);
 	}
+}
 
-	/*
-	 * Wait for the events to quiesce before re-instating them.
-	 */
-	synchronize_rcu();
+static void __perf_pmu_install_event(struct pmu *pmu,
+				     struct perf_event_context *ctx,
+				     int cpu, struct perf_event *event)
+{
+	struct perf_event_pmu_context *epc;
+
+	event->cpu = cpu;
+	epc = find_get_pmu_context(pmu, ctx, event);
+	event->pmu_ctx = epc;
+
+	if (event->state >= PERF_EVENT_STATE_OFF)
+		event->state = PERF_EVENT_STATE_INACTIVE;
+	account_event_cpu(event, cpu);
+	perf_install_in_context(ctx, event, cpu);
+}
+
+static void __perf_pmu_install(struct perf_event_context *ctx,
+			       int cpu, struct pmu *pmu, struct list_head *events)
+{
+	struct perf_event *event, *tmp;
 
 	/*
 	 * Re-instate events in 2 passes.
@@ -12625,33 +12641,50 @@ void perf_pmu_migrate_context(struct pmu
 	 * leader will enable its siblings, even if those are still on the old
 	 * context.
 	 */
-	list_for_each_entry_safe(event, tmp, &events, migrate_entry) {
+	list_for_each_entry_safe(event, tmp, events, migrate_entry) {
 		if (event->group_leader == event)
 			continue;
 
 		list_del(&event->migrate_entry);
-		if (event->state >= PERF_EVENT_STATE_OFF)
-			event->state = PERF_EVENT_STATE_INACTIVE;
-		account_event_cpu(event, dst_cpu);
-		perf_install_in_context(dst_ctx, event, dst_cpu);
-		get_ctx(dst_ctx);
+		__perf_pmu_install_event(pmu, ctx, cpu, event);
 	}
 
 	/*
 	 * Once all the siblings are setup properly, install the group leaders
 	 * to make it go.
 	 */
-	list_for_each_entry_safe(event, tmp, &events, migrate_entry) {
+	list_for_each_entry_safe(event, tmp, events, migrate_entry) {
 		list_del(&event->migrate_entry);
-		if (event->state >= PERF_EVENT_STATE_OFF)
-			event->state = PERF_EVENT_STATE_INACTIVE;
-		account_event_cpu(event, dst_cpu);
-		perf_install_in_context(dst_ctx, event, dst_cpu);
-		get_ctx(dst_ctx);
+		__perf_pmu_install_event(pmu, ctx, cpu, event);
 	}
+}
+
+void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
+{
+	struct perf_event_context *src_ctx, *dst_ctx;
+	LIST_HEAD(events);
+
+	src_ctx = &per_cpu_ptr(&cpu_context, src_cpu)->ctx;
+	dst_ctx = &per_cpu_ptr(&cpu_context, dst_cpu)->ctx;
+
+	/*
+	 * See perf_event_ctx_lock() for comments on the details
+	 * of swizzling perf_event::ctx.
+	 */
+	mutex_lock_double(&src_ctx->mutex, &dst_ctx->mutex);
+
+	__perf_pmu_remove(src_ctx, src_cpu, pmu, &src_ctx->pinned_groups, &events);
+	__perf_pmu_remove(src_ctx, src_cpu, pmu, &src_ctx->flexible_groups, &events);
+
+	/*
+	 * Wait for the events to quiesce before re-instating them.
+	 */
+	synchronize_rcu();
+
+	__perf_pmu_install(dst_ctx, dst_cpu, pmu, &events);
+
 	mutex_unlock(&dst_ctx->mutex);
 	mutex_unlock(&src_ctx->mutex);
-#endif
 }
 EXPORT_SYMBOL_GPL(perf_pmu_migrate_context);
 


* Re: [PATCH] perf: Rewrite core context handling
  2022-09-01 11:29       ` Peter Zijlstra
@ 2022-09-05  4:40         ` Ravi Bangoria
  0 siblings, 0 replies; 11+ messages in thread
From: Ravi Bangoria @ 2022-09-05  4:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: acme, alexander.shishkin, jolsa, namhyung, songliubraving,
	eranian, ak, mark.rutland, frederic, maddy, irogers, will, robh,
	mingo, catalin.marinas, linux-arm-kernel, linux-perf-users,
	linuxppc-dev, linux-s390, linux-kernel, sandipan.das,
	ananth.narayan, kim.phillips, santosh.shukla, ravi.bangoria

> So the basic issue I mentioned is that:
> 
> 
> /*
>  *           ,------------------------[1:n]---------------------.
>  *           V                                                  V
>  * perf_event_context <-[1:n]-> perf_event_pmu_context <--- perf_event
>  *           ^                      ^     |                     |
>  *           `--------[1:n]---------'     `-[n:1]-> pmu <-[1:n]-'
>  *
>  *
>  * XXX destroy epc when empty
>  *   refcount, !rcu
>  *
>  * XXX epc locking
>  *
>  *   event->pmu_ctx            ctx->mutex && inactive
>  *   ctx->pmu_ctx_list         ctx->mutex && ctx->lock
>  *
>  */
> struct perf_event_pmu_context {
> 	...
>         atomic_t                        refcount; /* event <-> epc */
> 	...
> }
> 
> But that then also suggests that:
> 
> struct perf_event_context {
> 	...
>         refcount_t                      refcount;
> 	...
> }
> 
> should be: ctx <-> epc, but that is not so; the ctx::refcount still
> counts the number of events.
> 
> Now this might not be bad, but it is confusing.

I don't have a strong opinion, but we store events in ctx, not in pmu_ctx.
So I think it makes sense to keep the refcount as ctx <-> event?

[...]

> +void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu)
> +{
> +	struct perf_event_context *src_ctx, *dst_ctx;
> +	LIST_HEAD(events);
> +
> +	src_ctx = &per_cpu_ptr(&cpu_context, src_cpu)->ctx;
> +	dst_ctx = &per_cpu_ptr(&cpu_context, dst_cpu)->ctx;
> +
> +	/*
> +	 * See perf_event_ctx_lock() for comments on the details
> +	 * of swizzling perf_event::ctx.
> +	 */
> +	mutex_lock_double(&src_ctx->mutex, &dst_ctx->mutex);
> +
> +	__perf_pmu_remove(src_ctx, src_cpu, pmu, &src_ctx->pinned_groups, &events);
> +	__perf_pmu_remove(src_ctx, src_cpu, pmu, &src_ctx->flexible_groups, &events);

The rbtrees do not contain sibling events. Shouldn't we continue using
event_list here?
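
Or, if we do stick with the group rbtrees, would something along these
lines work? (Rough sketch, completely untested; it only extends the
__perf_pmu_remove() from your diff with a for_each_sibling_event() walk
to pick up the siblings the rbtree doesn't cover.)

static void __perf_pmu_remove(struct perf_event_context *ctx,
			      int cpu, struct pmu *pmu,
			      struct perf_event_groups *groups,
			      struct list_head *events)
{
	struct perf_event *event, *sibling;

	for (event = perf_event_groups_first(groups, cpu, pmu, NULL);
	     event;
	     event = perf_event_groups_next(event, pmu)) {

		/* detach the group leader ... */
		perf_remove_from_context(event, 0);
		unaccount_event_cpu(event, cpu);
		put_pmu_ctx(event->pmu_ctx);
		list_add(&event->migrate_entry, events);

		/* ... and its siblings, which are not in the rbtree */
		for_each_sibling_event(sibling, event) {
			perf_remove_from_context(sibling, 0);
			unaccount_event_cpu(sibling, cpu);
			put_pmu_ctx(sibling->pmu_ctx);
			list_add(&sibling->migrate_entry, events);
		}
	}
}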

Thanks,
Ravi


* Re: [PATCH] perf: Rewrite core context handling
       [not found] <20220829113347.295-1-ravi.bangoria@amd.com>
  2022-08-29 12:04 ` [PATCH] perf: Rewrite core context handling Peter Zijlstra
@ 2022-09-06  5:50 ` Ravi Bangoria
  2022-10-10 10:23   ` Peter Zijlstra
  2022-09-07 11:28 ` Ravi Bangoria
  2 siblings, 1 reply; 11+ messages in thread
From: Ravi Bangoria @ 2022-09-06  5:50 UTC (permalink / raw)
  To: peterz
  Cc: acme, alexander.shishkin, jolsa, namhyung, songliubraving,
	eranian, ak, mark.rutland, frederic, maddy, irogers, will, robh,
	mingo, catalin.marinas, linux-arm-kernel, linux-perf-users,
	linuxppc-dev, linux-s390, linux-kernel, sandipan.das,
	ananth.narayan, kim.phillips, santosh.shukla, ravi.bangoria

> @@ -9752,10 +9889,13 @@ void perf_tp_event(u16 event_type, u64 count, void *record, int entry_size,
>  		struct trace_entry *entry = record;
>  
>  		rcu_read_lock();
> -		ctx = rcu_dereference(task->perf_event_ctxp[perf_sw_context]);
> +		ctx = rcu_dereference(task->perf_event_ctxp);
>  		if (!ctx)
>  			goto unlock;
>  
> +		// XXX iterate groups instead, we should be able to
> +		// find the subtree for the perf_tracepoint pmu and CPU.
> +
>  		list_for_each_entry_rcu(event, &ctx->event_list, event_entry) {
>  			if (event->cpu != smp_processor_id())
>  				continue;

This one was simple enough, so I prepared a patch for it. Let
me know if you see any issues with the diff below.

---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 820c56c66b26..e0232e0bb74e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9807,6 +9807,44 @@ static struct pmu perf_swevent = {
 
 #ifdef CONFIG_EVENT_TRACING
 
+static void tp_perf_event_destroy(struct perf_event *event)
+{
+	perf_trace_destroy(event);
+}
+
+static int perf_tp_event_init(struct perf_event *event)
+{
+	int err;
+
+	if (event->attr.type != PERF_TYPE_TRACEPOINT)
+		return -ENOENT;
+
+	/*
+	 * no branch sampling for tracepoint events
+	 */
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
+	err = perf_trace_init(event);
+	if (err)
+		return err;
+
+	event->destroy = tp_perf_event_destroy;
+
+	return 0;
+}
+
+static struct pmu perf_tracepoint = {
+	.task_ctx_nr	= perf_sw_context,
+
+	.event_init	= perf_tp_event_init,
+	.add		= perf_trace_add,
+	.del		= perf_trace_del,
+	.start		= perf_swevent_start,
+	.stop		= perf_swevent_stop,
+	.read		= perf_swevent_read,
+};
+
 static int perf_tp_filter_match(struct perf_event *event,
 				struct perf_sample_data *data)
 {
@@ -9856,6 +9894,49 @@ void perf_trace_run_bpf_submit(void *raw_data, int size, int rctx,
 }
 EXPORT_SYMBOL_GPL(perf_trace_run_bpf_submit);
 
+
+static void __perf_tp_event_target_task(u64 count, void *record,
+					struct pt_regs *regs,
+					struct perf_sample_data *data,
+					struct perf_event *event)
+{
+	struct trace_entry *entry = record;
+
+	if (event->attr.config != entry->type)
+		return;
+	/* Cannot deliver synchronous signal to other task. */
+	if (event->attr.sigtrap)
+		return;
+	if (perf_tp_event_match(event, data, regs))
+		perf_swevent_event(event, count, data, regs);
+}
+
+static void perf_tp_event_target_task(u64 count, void *record,
+				      struct pt_regs *regs,
+				      struct perf_sample_data *data,
+				      struct perf_event_context *ctx)
+{
+	struct perf_event *event, *sibling;
+
+	event = perf_event_groups_first(&ctx->pinned_groups, smp_processor_id(),
+					&perf_tracepoint, NULL);
+	for (; event; event = perf_event_groups_next(event, &perf_tracepoint)) {
+		__perf_tp_event_target_task(count, record, regs, data, event);
+		for_each_sibling_event(sibling, event) {
+			__perf_tp_event_target_task(count, record, regs, data, sibling);
+		}
+	}
+
+	event = perf_event_groups_first(&ctx->flexible_groups, smp_processor_id(),
+					&perf_tracepoint, NULL);
+	for (; event; event = perf_event_groups_next(event, &perf_tracepoint)) {
+		__perf_tp_event_target_task(count, record, regs, data, event);
+		for_each_sibling_event(sibling, event) {
+			__perf_tp_event_target_task(count, record, regs, data, sibling);
+		}
+	}
+}
+
 void perf_tp_event(u16 event_type, u64 count, void *record, int entry_size,
 		   struct pt_regs *regs, struct hlist_head *head, int rctx,
 		   struct task_struct *task)
@@ -9886,29 +9967,15 @@ void perf_tp_event(u16 event_type, u64 count, void *record, int entry_size,
 	 */
 	if (task && task != current) {
 		struct perf_event_context *ctx;
-		struct trace_entry *entry = record;
 
 		rcu_read_lock();
 		ctx = rcu_dereference(task->perf_event_ctxp);
 		if (!ctx)
 			goto unlock;
 
-		// XXX iterate groups instead, we should be able to
-		// find the subtree for the perf_tracepoint pmu and CPU.
-
-		list_for_each_entry_rcu(event, &ctx->event_list, event_entry) {
-			if (event->cpu != smp_processor_id())
-				continue;
-			if (event->attr.type != PERF_TYPE_TRACEPOINT)
-				continue;
-			if (event->attr.config != entry->type)
-				continue;
-			/* Cannot deliver synchronous signal to other task. */
-			if (event->attr.sigtrap)
-				continue;
-			if (perf_tp_event_match(event, &data, regs))
-				perf_swevent_event(event, count, &data, regs);
-		}
+		raw_spin_lock(&ctx->lock);
+		perf_tp_event_target_task(count, record, regs, &data, ctx);
+		raw_spin_unlock(&ctx->lock);
 unlock:
 		rcu_read_unlock();
 	}
@@ -9917,44 +9984,6 @@ void perf_tp_event(u16 event_type, u64 count, void *record, int entry_size,
 }
 EXPORT_SYMBOL_GPL(perf_tp_event);
 
-static void tp_perf_event_destroy(struct perf_event *event)
-{
-	perf_trace_destroy(event);
-}
-
-static int perf_tp_event_init(struct perf_event *event)
-{
-	int err;
-
-	if (event->attr.type != PERF_TYPE_TRACEPOINT)
-		return -ENOENT;
-
-	/*
-	 * no branch sampling for tracepoint events
-	 */
-	if (has_branch_stack(event))
-		return -EOPNOTSUPP;
-
-	err = perf_trace_init(event);
-	if (err)
-		return err;
-
-	event->destroy = tp_perf_event_destroy;
-
-	return 0;
-}
-
-static struct pmu perf_tracepoint = {
-	.task_ctx_nr	= perf_sw_context,
-
-	.event_init	= perf_tp_event_init,
-	.add		= perf_trace_add,
-	.del		= perf_trace_del,
-	.start		= perf_swevent_start,
-	.stop		= perf_swevent_stop,
-	.read		= perf_swevent_read,
-};
-
 #if defined(CONFIG_KPROBE_EVENTS) || defined(CONFIG_UPROBE_EVENTS)
 /*
  * Flags in config, used by dynamic PMU kprobe and uprobe

---


* Re: [PATCH] perf: Rewrite core context handling
       [not found] <20220829113347.295-1-ravi.bangoria@amd.com>
  2022-08-29 12:04 ` [PATCH] perf: Rewrite core context handling Peter Zijlstra
  2022-09-06  5:50 ` Ravi Bangoria
@ 2022-09-07 11:28 ` Ravi Bangoria
  2022-10-10 10:14   ` Peter Zijlstra
  2 siblings, 1 reply; 11+ messages in thread
From: Ravi Bangoria @ 2022-09-07 11:28 UTC (permalink / raw)
  To: peterz
  Cc: acme, alexander.shishkin, jolsa, namhyung, songliubraving,
	eranian, ak, mark.rutland, frederic, maddy, irogers, will, robh,
	mingo, catalin.marinas, linux-arm-kernel, linux-perf-users,
	linuxppc-dev, linux-s390, linux-kernel, sandipan.das,
	ananth.narayan, kim.phillips, santosh.shukla, ravi.bangoria

> -static void
> -ctx_flexible_sched_in(struct perf_event_context *ctx,
> -		      struct perf_cpu_context *cpuctx)
> +/* XXX .busy thingy from Peter's patch */
> +static void ctx_flexible_sched_in(struct perf_event_context *ctx, struct pmu *pmu)

This one turned out to be very easy. Given that we iterate over each
pmu, we can just return an error if we fail to schedule any flexible event.
(It wouldn't be this straightforward if we needed to implement the
pmu=NULL optimization.)

---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index e0232e0bb74e..923656af73fe 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3751,6 +3751,7 @@ static int merge_sched_in(struct perf_event *event, void *data)
 			cpc = this_cpu_ptr(event->pmu_ctx->pmu->cpu_pmu_context);
 			perf_mux_hrtimer_restart(cpc);
 			group_update_userpage(event);
+			return -EBUSY;
 		}
 	}
 
@@ -3776,7 +3777,6 @@ static void ctx_pinned_sched_in(struct perf_event_context *ctx, struct pmu *pmu)
 	}
 }
 
-/* XXX .busy thingy from Peter's patch */
 static void ctx_flexible_sched_in(struct perf_event_context *ctx, struct pmu *pmu)
 {
 	struct perf_event_pmu_context *pmu_ctx;
---
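
(For context, the reason returning -EBUSY from merge_sched_in() is
sufficient: visit_groups_merge() already aborts the walk as soon as the
callback returns non-zero, roughly

	ret = func(*evt, data);		/* i.e. merge_sched_in() */
	if (ret)
		return ret;

so no further flexible groups are attempted for that pmu.)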

Thanks,
Ravi


* Re: [PATCH] perf: Rewrite core context handling
  2022-09-07 11:28 ` Ravi Bangoria
@ 2022-10-10 10:14   ` Peter Zijlstra
  2022-10-10 10:59     ` Ravi Bangoria
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2022-10-10 10:14 UTC (permalink / raw)
  To: Ravi Bangoria
  Cc: acme, alexander.shishkin, jolsa, namhyung, songliubraving,
	eranian, ak, mark.rutland, frederic, maddy, irogers, will, robh,
	mingo, catalin.marinas, linux-arm-kernel, linux-perf-users,
	linuxppc-dev, linux-s390, linux-kernel, sandipan.das,
	ananth.narayan, kim.phillips, santosh.shukla

On Wed, Sep 07, 2022 at 04:58:49PM +0530, Ravi Bangoria wrote:
> > -static void
> > -ctx_flexible_sched_in(struct perf_event_context *ctx,
> > -		      struct perf_cpu_context *cpuctx)
> > +/* XXX .busy thingy from Peter's patch */
> > +static void ctx_flexible_sched_in(struct perf_event_context *ctx, struct pmu *pmu)
> 
> This one turned out to be very easy. Given that we iterate over each
> pmu, we can just return an error if we fail to schedule any flexible event.
> (It wouldn't be this straightforward if we needed to implement the
> pmu=NULL optimization.)
> 
> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index e0232e0bb74e..923656af73fe 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -3751,6 +3751,7 @@ static int merge_sched_in(struct perf_event *event, void *data)
>  			cpc = this_cpu_ptr(event->pmu_ctx->pmu->cpu_pmu_context);
>  			perf_mux_hrtimer_restart(cpc);
>  			group_update_userpage(event);
> +			return -EBUSY;
>  		}
>  	}
>  

I'm afraid this breaks things; consider:

  f79256532682 ("perf/core: fix userpage->time_enabled of inactive events")

I totally hate this -- because it means we *HAVE* to iterate the
inactive events, but alas.
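
IOW, merge_sched_in() has to keep walking and only note the saturation;
roughly like the below (sketch only, untested; assumes @data still
carries the can_add_hw flag and the rest of the function is as in the
rewrite):

static int merge_sched_in(struct perf_event *event, void *data)
{
	struct perf_event_context *ctx = event->ctx;
	int *can_add_hw = data;

	if (event->state <= PERF_EVENT_STATE_OFF)
		return 0;

	if (!event_filter_match(event))
		return 0;

	if (group_can_go_on(event, *can_add_hw)) {
		if (!group_sched_in(event, ctx))
			list_add_tail(&event->active_list, get_event_list(event));
	}

	if (event->state == PERF_EVENT_STATE_INACTIVE) {
		/* remember the PMU is full, but don't stop the iteration */
		*can_add_hw = 0;
		if (event->attr.pinned) {
			perf_cgroup_event_disable(event, ctx);
			perf_event_set_state(event, PERF_EVENT_STATE_ERROR);
		} else {
			struct perf_cpu_pmu_context *cpc;

			cpc = this_cpu_ptr(event->pmu_ctx->pmu->cpu_pmu_context);
			perf_mux_hrtimer_restart(cpc);
			/* inactive events still get their userpage time updated */
			group_update_userpage(event);
		}
	}

	/* no -EBUSY; keep visiting the remaining (inactive) events */
	return 0;
}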


* Re: [PATCH] perf: Rewrite core context handling
  2022-09-06  5:50 ` Ravi Bangoria
@ 2022-10-10 10:23   ` Peter Zijlstra
  2022-10-10 11:01     ` Ravi Bangoria
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2022-10-10 10:23 UTC (permalink / raw)
  To: Ravi Bangoria
  Cc: acme, alexander.shishkin, jolsa, namhyung, songliubraving,
	eranian, ak, mark.rutland, frederic, maddy, irogers, will, robh,
	mingo, catalin.marinas, linux-arm-kernel, linux-perf-users,
	linuxppc-dev, linux-s390, linux-kernel, sandipan.das,
	ananth.narayan, kim.phillips, santosh.shukla

On Tue, Sep 06, 2022 at 11:20:53AM +0530, Ravi Bangoria wrote:

> This one was simple enough, so I prepared a patch for it. Let
> me know if you see any issues with the diff below.

I've extracted this as a separate patch since it's not strictly required
for correctness and the patch is quite large enough already.

> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 820c56c66b26..e0232e0bb74e 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -9807,6 +9807,44 @@ static struct pmu perf_swevent = {
>  
>  #ifdef CONFIG_EVENT_TRACING
>  
> +static void tp_perf_event_destroy(struct perf_event *event)
> +{
> +	perf_trace_destroy(event);
> +}
> +
> +static int perf_tp_event_init(struct perf_event *event)
> +{
> +	int err;
> +
> +	if (event->attr.type != PERF_TYPE_TRACEPOINT)
> +		return -ENOENT;
> +
> +	/*
> +	 * no branch sampling for tracepoint events
> +	 */
> +	if (has_branch_stack(event))
> +		return -EOPNOTSUPP;
> +
> +	err = perf_trace_init(event);
> +	if (err)
> +		return err;
> +
> +	event->destroy = tp_perf_event_destroy;
> +
> +	return 0;
> +}
> +
> +static struct pmu perf_tracepoint = {
> +	.task_ctx_nr	= perf_sw_context,
> +
> +	.event_init	= perf_tp_event_init,
> +	.add		= perf_trace_add,
> +	.del		= perf_trace_del,
> +	.start		= perf_swevent_start,
> +	.stop		= perf_swevent_stop,
> +	.read		= perf_swevent_read,
> +};
> +
>  static int perf_tp_filter_match(struct perf_event *event,
>  				struct perf_sample_data *data)
>  {
> @@ -9856,6 +9894,49 @@ void perf_trace_run_bpf_submit(void *raw_data, int size, int rctx,
>  }
>  EXPORT_SYMBOL_GPL(perf_trace_run_bpf_submit);
>  
> +
> +static void __perf_tp_event_target_task(u64 count, void *record,
> +					struct pt_regs *regs,
> +					struct perf_sample_data *data,
> +					struct perf_event *event)
> +{
> +	struct trace_entry *entry = record;
> +
> +	if (event->attr.config != entry->type)
> +		return;
> +	/* Cannot deliver synchronous signal to other task. */
> +	if (event->attr.sigtrap)
> +		return;
> +	if (perf_tp_event_match(event, data, regs))
> +		perf_swevent_event(event, count, data, regs);
> +}
> +
> +static void perf_tp_event_target_task(u64 count, void *record,
> +				      struct pt_regs *regs,
> +				      struct perf_sample_data *data,
> +				      struct perf_event_context *ctx)
> +{
> +	struct perf_event *event, *sibling;
> +
> +	event = perf_event_groups_first(&ctx->pinned_groups, smp_processor_id(),
> +					&perf_tracepoint, NULL);
> +	for (; event; event = perf_event_groups_next(event, &perf_tracepoint)) {
> +		__perf_tp_event_target_task(count, record, regs, data, event);
> +		for_each_sibling_event(sibling, event) {
> +			__perf_tp_event_target_task(count, record, regs, data, sibling);
> +		}
> +	}
> +
> +	event = perf_event_groups_first(&ctx->flexible_groups, smp_processor_id(),
> +					&perf_tracepoint, NULL);
> +	for (; event; event = perf_event_groups_next(event, &perf_tracepoint)) {
> +		__perf_tp_event_target_task(count, record, regs, data, event);
> +		for_each_sibling_event(sibling, event) {
> +			__perf_tp_event_target_task(count, record, regs, data, sibling);
> +		}
> +	}
> +}
> +
>  void perf_tp_event(u16 event_type, u64 count, void *record, int entry_size,
>  		   struct pt_regs *regs, struct hlist_head *head, int rctx,
>  		   struct task_struct *task)
> @@ -9886,29 +9967,15 @@ void perf_tp_event(u16 event_type, u64 count, void *record, int entry_size,
>  	 */
>  	if (task && task != current) {
>  		struct perf_event_context *ctx;
> -		struct trace_entry *entry = record;
>  
>  		rcu_read_lock();
>  		ctx = rcu_dereference(task->perf_event_ctxp);
>  		if (!ctx)
>  			goto unlock;
>  
> -		// XXX iterate groups instead, we should be able to
> -		// find the subtree for the perf_tracepoint pmu and CPU.
> -
> -		list_for_each_entry_rcu(event, &ctx->event_list, event_entry) {
> -			if (event->cpu != smp_processor_id())
> -				continue;
> -			if (event->attr.type != PERF_TYPE_TRACEPOINT)
> -				continue;
> -			if (event->attr.config != entry->type)
> -				continue;
> -			/* Cannot deliver synchronous signal to other task. */
> -			if (event->attr.sigtrap)
> -				continue;
> -			if (perf_tp_event_match(event, &data, regs))
> -				perf_swevent_event(event, count, &data, regs);
> -		}
> +		raw_spin_lock(&ctx->lock);
> +		perf_tp_event_target_task(count, record, regs, &data, ctx);
> +		raw_spin_unlock(&ctx->lock);
>  unlock:
>  		rcu_read_unlock();
>  	}
> @@ -9917,44 +9984,6 @@ void perf_tp_event(u16 event_type, u64 count, void *record, int entry_size,
>  }
>  EXPORT_SYMBOL_GPL(perf_tp_event);
>  
> -static void tp_perf_event_destroy(struct perf_event *event)
> -{
> -	perf_trace_destroy(event);
> -}
> -
> -static int perf_tp_event_init(struct perf_event *event)
> -{
> -	int err;
> -
> -	if (event->attr.type != PERF_TYPE_TRACEPOINT)
> -		return -ENOENT;
> -
> -	/*
> -	 * no branch sampling for tracepoint events
> -	 */
> -	if (has_branch_stack(event))
> -		return -EOPNOTSUPP;
> -
> -	err = perf_trace_init(event);
> -	if (err)
> -		return err;
> -
> -	event->destroy = tp_perf_event_destroy;
> -
> -	return 0;
> -}
> -
> -static struct pmu perf_tracepoint = {
> -	.task_ctx_nr	= perf_sw_context,
> -
> -	.event_init	= perf_tp_event_init,
> -	.add		= perf_trace_add,
> -	.del		= perf_trace_del,
> -	.start		= perf_swevent_start,
> -	.stop		= perf_swevent_stop,
> -	.read		= perf_swevent_read,
> -};
> -
>  #if defined(CONFIG_KPROBE_EVENTS) || defined(CONFIG_UPROBE_EVENTS)
>  /*
>   * Flags in config, used by dynamic PMU kprobe and uprobe
> 
> ---


* Re: [PATCH] perf: Rewrite core context handling
  2022-10-10 10:14   ` Peter Zijlstra
@ 2022-10-10 10:59     ` Ravi Bangoria
  0 siblings, 0 replies; 11+ messages in thread
From: Ravi Bangoria @ 2022-10-10 10:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: acme, alexander.shishkin, jolsa, namhyung, songliubraving,
	eranian, ak, mark.rutland, frederic, maddy, irogers, will, robh,
	mingo, catalin.marinas, linux-arm-kernel, linux-perf-users,
	linuxppc-dev, linux-s390, linux-kernel, sandipan.das,
	ananth.narayan, kim.phillips, santosh.shukla, Ravi Bangoria

On 10-Oct-22 3:44 PM, Peter Zijlstra wrote:
> On Wed, Sep 07, 2022 at 04:58:49PM +0530, Ravi Bangoria wrote:
>>> -static void
>>> -ctx_flexible_sched_in(struct perf_event_context *ctx,
>>> -		      struct perf_cpu_context *cpuctx)
>>> +/* XXX .busy thingy from Peter's patch */
>>> +static void ctx_flexible_sched_in(struct perf_event_context *ctx, struct pmu *pmu)
>>
>> This one turned out to be very easy. Given that we iterate over each
>> pmu, we can just return an error if we fail to schedule any flexible event.
>> (It wouldn't be this straightforward if we needed to implement the
>> pmu=NULL optimization.)
>>
>> ---
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index e0232e0bb74e..923656af73fe 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -3751,6 +3751,7 @@ static int merge_sched_in(struct perf_event *event, void *data)
>>  			cpc = this_cpu_ptr(event->pmu_ctx->pmu->cpu_pmu_context);
>>  			perf_mux_hrtimer_restart(cpc);
>>  			group_update_userpage(event);
>> +			return -EBUSY;
>>  		}
>>  	}
>>  
> 
> I'm afraid this breaks things; consider:
> 
>   f79256532682 ("perf/core: fix userpage->time_enabled of inactive events")
> 
> I totally hate this -- because it means we *HAVE* to iterate the
> inactive events, but alas.

Sure. Will drop this.

Thanks,
Ravi


* Re: [PATCH] perf: Rewrite core context handling
  2022-10-10 10:23   ` Peter Zijlstra
@ 2022-10-10 11:01     ` Ravi Bangoria
  0 siblings, 0 replies; 11+ messages in thread
From: Ravi Bangoria @ 2022-10-10 11:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: acme, alexander.shishkin, jolsa, namhyung, songliubraving,
	eranian, ak, mark.rutland, frederic, maddy, irogers, will, robh,
	mingo, catalin.marinas, linux-arm-kernel, linux-perf-users,
	linuxppc-dev, linux-s390, linux-kernel, sandipan.das,
	ananth.narayan, kim.phillips, santosh.shukla, Ravi Bangoria

On 10-Oct-22 3:53 PM, Peter Zijlstra wrote:
> On Tue, Sep 06, 2022 at 11:20:53AM +0530, Ravi Bangoria wrote:
> 
>> This one was simple enough, so I prepared a patch for it. Let
>> me know if you see any issues with the diff below.
> 
> I've extracted this as a separate patch since it's not strictly required
> for correctness and the patch is quite large enough already.

Sure. I'll keep it separate.

Thanks,
Ravi


Thread overview: 11+ messages
     [not found] <20220829113347.295-1-ravi.bangoria@amd.com>
2022-08-29 12:04 ` [PATCH] perf: Rewrite core context handling Peter Zijlstra
2022-08-29 14:40   ` Peter Zijlstra
2022-09-01 10:35     ` Ravi Bangoria
2022-09-01 11:29       ` Peter Zijlstra
2022-09-05  4:40         ` Ravi Bangoria
2022-09-06  5:50 ` Ravi Bangoria
2022-10-10 10:23   ` Peter Zijlstra
2022-10-10 11:01     ` Ravi Bangoria
2022-09-07 11:28 ` Ravi Bangoria
2022-10-10 10:14   ` Peter Zijlstra
2022-10-10 10:59     ` Ravi Bangoria
