All of lore.kernel.org
 help / color / mirror / Atom feed
From: peterz@infradead.org
To: kan.liang@linux.intel.com
Cc: acme@redhat.com, mingo@kernel.org, linux-kernel@vger.kernel.org,
	jolsa@kernel.org, eranian@google.com,
	alexander.shishkin@linux.intel.com, ak@linux.intel.com,
	like.xu@linux.intel.com
Subject: Re: [PATCH V7 07/14] perf/core: Add a new PERF_EV_CAP_COEXIST event capability
Date: Fri, 24 Jul 2020 13:46:28 +0200	[thread overview]
Message-ID: <20200724114628.GJ43129@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20200724105543.GV119549@hirez.programming.kicks-ass.net>

On Fri, Jul 24, 2020 at 12:55:43PM +0200, peterz@infradead.org wrote:
> > +	event_sched_out(event, cpuctx, ctx);
> > +	perf_event_set_state(event, PERF_EVENT_STATE_ERROR);
> > +}
> 
> Ah, so the problem here is that ERROR is actually recoverable using
> IOC_ENABLE. We don't want that either. Let me try and figure out of EXIT
> would work.

EXIT is difficult too.. Also, that COEXIST thing hurt my brian, can't we
do a simpler SIBLING thing that simply requires the event to be a group
sibling?

The consequence is that SLOTS must be the leader, is that really a
problem? You keep providing the {cycles, slots, metric-things} example,
but afaict {slots, metric-thing, cycles} should work just fine too.
PERF_SAMPLE_READ with PERF_FORMAT_GROUP doesn't need to the the leader.

The thing is, with COEXIST, if you have:

	{cycles, slots, metric1, metric2}

and you say that COEXIST events need to co-exist with at least one other
COEXIST event you end up with a problem when: close(fd_slots) happens,
because in the above case, there's two more COEXIST events, so the
requirement holds, even through this is very much not what we want.

You 'fixed' this by saying closing any COEXIST event will tear the group
up, but that's 'weird' and unexpected if you were to do:
close(fd_metric2).


So something like this.

---
Subject: perf/core: Add a new PERF_EV_CAP_SIBLING event capability

Current perf assumes that events in a group are independent. Close an
event doesn't impact the value of the other events in the same group.
If the closed event is a member, after the event closure, other events
are still running like a group. If the closed event is a leader, other
events are running as singleton events.

Add PERF_EV_CAP_SIBLING to allow events to indicate they require being
part of a group, and when the leader dies they cannot exist
independently.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/perf_event.h |    4 ++++
 kernel/events/core.c       |   38 +++++++++++++++++++++++++++++++++-----
 2 files changed, 37 insertions(+), 5 deletions(-)

--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -576,9 +576,13 @@ typedef void (*perf_overflow_handler_t)(
  * PERF_EV_CAP_SOFTWARE: Is a software event.
  * PERF_EV_CAP_READ_ACTIVE_PKG: A CPU event (or cgroup event) that can be read
  * from any CPU in the package where it is active.
+ * PERF_EV_CAP_SIBLING: An event with this flag must be a group sibling and
+ * cannot be a group leader. If an event with this flag is detached from the
+ * group it is scheduled out and moved into an unrecoverable ERROR state.
  */
 #define PERF_EV_CAP_SOFTWARE		BIT(0)
 #define PERF_EV_CAP_READ_ACTIVE_PKG	BIT(1)
+#define PERF_EV_CAP_SIBLING		BIT(2)
 
 #define SWEVENT_HLIST_BITS		8
 #define SWEVENT_HLIST_SIZE		(1 << SWEVENT_HLIST_BITS)
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2133,8 +2133,24 @@ static inline struct list_head *get_even
 	return event->attr.pinned ? &ctx->pinned_active : &ctx->flexible_active;
 }
 
+/*
+ * Events that have PERF_EV_CAP_SIBLING require being part of a group and
+ * cannot exist on their own, schedule them out and move them into the ERROR
+ * state. Also see _perf_event_enable(), it will not be able to recover
+ * this ERROR state.
+ */
+static inline void perf_remove_sibling_event(struct perf_event *event)
+{
+	struct perf_event_context *ctx = event->ctx;
+	struct perf_cpu_context *cpuctx = __get_cpu_context(ctx);
+
+	event_sched_out(event, cpuctx, ctx);
+	perf_event_set_state(event, PERF_EVENT_STATE_ERROR);
+}
+
 static void perf_group_detach(struct perf_event *event)
 {
+	struct perf_event *leader = event->group_leader;
 	struct perf_event *sibling, *tmp;
 	struct perf_event_context *ctx = event->ctx;
 
@@ -2153,7 +2169,7 @@ static void perf_group_detach(struct per
 	/*
 	 * If this is a sibling, remove it from its group.
 	 */
-	if (event->group_leader != event) {
+	if (leader != event) {
 		list_del_init(&event->sibling_list);
 		event->group_leader->nr_siblings--;
 		goto out;
@@ -2166,6 +2182,9 @@ static void perf_group_detach(struct per
 	 */
 	list_for_each_entry_safe(sibling, tmp, &event->sibling_list, sibling_list) {
 
+		if (sibling->event_caps & PERF_EV_CAP_SIBLING)
+			perf_remove_sibling_event(sibling);
+
 		sibling->group_leader = sibling;
 		list_del_init(&sibling->sibling_list);
 
@@ -2183,10 +2202,10 @@ static void perf_group_detach(struct per
 	}
 
 out:
-	perf_event__header_size(event->group_leader);
-
-	for_each_sibling_event(tmp, event->group_leader)
+	for_each_sibling_event(tmp, leader)
 		perf_event__header_size(tmp);
+
+	perf_event__header_size(leader);
 }
 
 static bool is_orphaned_event(struct perf_event *event)
@@ -2979,6 +2998,7 @@ static void _perf_event_enable(struct pe
 	raw_spin_lock_irq(&ctx->lock);
 	if (event->state >= PERF_EVENT_STATE_INACTIVE ||
 	    event->state <  PERF_EVENT_STATE_ERROR) {
+out:
 		raw_spin_unlock_irq(&ctx->lock);
 		return;
 	}
@@ -2990,8 +3010,16 @@ static void _perf_event_enable(struct pe
 	 * has gone back into error state, as distinct from the task having
 	 * been scheduled away before the cross-call arrived.
 	 */
-	if (event->state == PERF_EVENT_STATE_ERROR)
+	if (event->state == PERF_EVENT_STATE_ERROR) {
+		/*
+		 * Detached SIBLING events cannot leave ERROR state.
+		 */
+		if (event->event_caps & PERF_EV_CAP_SIBLING &&
+		    event->group_leader == event)
+			goto out;
+
 		event->state = PERF_EVENT_STATE_OFF;
+	}
 	raw_spin_unlock_irq(&ctx->lock);
 
 	event_function_call(event, __perf_event_enable, NULL);


  reply	other threads:[~2020-07-24 11:46 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-23 17:11 [PATCH V7 00/14] TopDown metrics support for Icelake kan.liang
2020-07-23 17:11 ` [PATCH V7 01/14] perf/x86: Use event_base_rdpmc for the RDPMC userspace support kan.liang
2020-08-19  8:52   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-23 17:11 ` [PATCH V7 02/14] perf/x86/intel: Name the global status bit in NMI handler kan.liang
2020-08-19  8:52   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-23 17:11 ` [PATCH V7 03/14] perf/x86/intel: Introduce the fourth fixed counter kan.liang
2020-08-19  8:52   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-23 17:11 ` [PATCH V7 04/14] perf/x86/intel: Move BTS index to 47 kan.liang
2020-08-19  8:52   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-23 17:11 ` [PATCH V7 05/14] perf/x86/intel: Fix the name of perf METRICS kan.liang
2020-08-19  8:52   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-23 17:11 ` [PATCH V7 06/14] perf/x86/intel: Use switch in intel_pmu_disable/enable_event kan.liang
2020-08-19  8:52   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-23 17:11 ` [PATCH V7 07/14] perf/core: Add a new PERF_EV_CAP_COEXIST event capability kan.liang
2020-07-24 10:55   ` peterz
2020-07-24 11:46     ` peterz [this message]
2020-07-24 13:43       ` Liang, Kan
2020-07-24 13:54         ` Peter Zijlstra
2020-07-24 14:19           ` Liang, Kan
2020-07-24 14:32             ` Peter Zijlstra
2020-07-24 14:46               ` Andi Kleen
2020-07-24 14:59                 ` Peter Zijlstra
2020-07-24 16:43                   ` peterz
2020-07-24 17:00                     ` Liang, Kan
2020-07-24 14:39       ` Andi Kleen
2020-07-24 14:51         ` Peter Zijlstra
2020-08-19  8:52   ` [tip: perf/core] perf/core: Add a new PERF_EV_CAP_SIBLING " tip-bot2 for Kan Liang
2020-07-23 17:11 ` [PATCH V7 08/14] perf/x86/intel: Generic support for hardware TopDown metrics kan.liang
2020-07-24 13:19   ` peterz
2020-07-24 15:27     ` peterz
2020-07-24 16:07       ` Liang, Kan
2020-07-24 19:10         ` Liang, Kan
2020-07-28 12:32           ` Peter Zijlstra
2020-07-28 13:09           ` Peter Zijlstra
2020-07-28 13:28             ` Liang, Kan
2020-07-28 13:44               ` peterz
2020-07-28 14:01                 ` Liang, Kan
2020-08-19  8:52   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-23 17:11 ` [PATCH V7 09/14] perf/x86: Add a macro for RDPMC offset of fixed counters kan.liang
2020-08-19  8:52   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-23 17:11 ` [PATCH V7 10/14] perf/x86/intel: Support TopDown metrics on Ice Lake kan.liang
2020-08-19  8:52   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-23 17:11 ` [PATCH V7 11/14] perf/x86/intel: Support per-thread RDPMC TopDown metrics kan.liang
2020-08-19  8:52   ` [tip: perf/core] " tip-bot2 for Kan Liang
2020-07-23 17:11 ` [PATCH V7 12/14] perf, tools, stat: Support new per thread " kan.liang
2020-07-24  3:29   ` Andi Kleen
2020-07-23 17:11 ` [PATCH V7 13/14] perf, tools, stat: Check Topdown Metric group kan.liang
2020-07-23 17:11 ` [PATCH V7 14/14] perf, tools: Add documentation for topdown metrics kan.liang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200724114628.GJ43129@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=acme@redhat.com \
    --cc=ak@linux.intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=eranian@google.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=like.xu@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.