linux-kernel.vger.kernel.org archive mirror
From: David Carrillo-Cisneros <davidcc@google.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Ingo Molnar <mingo@redhat.com>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>,
	Matt Fleming <matt.fleming@intel.com>,
	Tony Luck <tony.luck@intel.com>,
	Stephane Eranian <eranian@google.com>,
	Paul Turner <pjt@google.com>,
	David Carrillo-Cisneros <davidcc@google.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 23/32] perf/core: introduce PERF_INACTIVE_*_READ_* flags
Date: Thu, 28 Apr 2016 21:43:29 -0700
Message-ID: <1461905018-86355-24-git-send-email-davidcc@google.com>
In-Reply-To: <1461905018-86355-1-git-send-email-davidcc@google.com>

Some offcore and uncore events, such as the new intel_cqm/llc_occupancy,
can be read even if the event is not active on its CPU (or on any CPU).
In those cases, a freshly read value is more recent (and therefore
preferable) than the last value stored at event sched out.

This patch covers two cases that allow Intel's CQM (and potentially
other per-package events) to obtain updated values regardless of whether
the event is scheduled in on a particular CPU. Each case is covered by a
new bit in event::pmu_event_flags:
	1) PERF_INACTIVE_CPU_READ_PKG: An event attached to a CPU that can
	be read on any CPU in its event::cpu's package, even if inactive.
	2) PERF_INACTIVE_EV_READ_ANY_CPU: An event that can be read on any
	CPU in any package in the system, even if inactive.
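
As a rough illustration of the two cases, here is a userspace model of
the predicate and of the CPU selection done in perf_event_read(). It is
a sketch only: struct model_event, model_can_read_inactive() and
model_cpu_to_read() are hypothetical names, and this_cpu stands in for
smp_processor_id(). The flag values are the ones defined in this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values from this patch; bit 0 is PERF_CGROUP_NO_RECURSION. */
#define PERF_INACTIVE_CPU_READ_PKG	(1 << 1)
#define PERF_INACTIVE_EV_READ_ANY_CPU	(1 << 2)

/* Hypothetical, trimmed-down stand-in for struct perf_event. */
struct model_event {
	int pmu_event_flags;
	int cpu;	/* -1 when the event is not bound to a CPU */
	int oncpu;	/* CPU the event is active on, -1 if inactive */
	bool active;	/* models state == PERF_EVENT_STATE_ACTIVE */
};

/* Same test as __perf_can_read_inactive() in the patch. */
bool model_can_read_inactive(const struct model_event *ev)
{
	return (ev->pmu_event_flags & PERF_INACTIVE_EV_READ_ANY_CPU) ||
		((ev->pmu_event_flags & PERF_INACTIVE_CPU_READ_PKG) &&
		 ev->cpu != -1);
}

/*
 * Pick the CPU to read on: oncpu when active, the local CPU for
 * read-anywhere events, event::cpu otherwise. The caller must already
 * have checked that the event is active or readable while inactive.
 */
int model_cpu_to_read(const struct model_event *ev, int this_cpu)
{
	assert(ev->active || model_can_read_inactive(ev));

	if (ev->active)
		return ev->oncpu;
	if (ev->pmu_event_flags & PERF_INACTIVE_EV_READ_ANY_CPU)
		return this_cpu;
	return ev->cpu;
}
```

Note that a task event (cpu == -1) with only PERF_INACTIVE_CPU_READ_PKG
set is not readable while inactive, since there is no package to pick.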

A consequence of reading a new value from hw on each call to
perf_event_read() is that reading and saving the event value at sched out
can be avoided, since the saved value will never be used. Therefore, a PMU
that sets any of the PERF_INACTIVE_*_READ_* flags can choose not to read
at context switch, at the cost of inherit_stat not working properly.
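
A minimal sketch of that optimization, as a userspace model rather than
real PMU code: struct sketch_event, PERF_INACTIVE_READ_MASK and
sketch_sched_out() are hypothetical names, and hw_value stands in for
whatever the hardware read would return at sched out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values from this patch. */
#define PERF_INACTIVE_CPU_READ_PKG	(1 << 1)
#define PERF_INACTIVE_EV_READ_ANY_CPU	(1 << 2)

/* Hypothetical helper mask covering both new flags. */
#define PERF_INACTIVE_READ_MASK \
	(PERF_INACTIVE_CPU_READ_PKG | PERF_INACTIVE_EV_READ_ANY_CPU)

/* Hypothetical, trimmed-down stand-in for struct perf_event. */
struct sketch_event {
	int pmu_event_flags;
	uint64_t count;		/* last value saved at sched out */
};

/*
 * Model of a PMU's sched-out path. Returns true if the counter value
 * was saved, false if the save was skipped because a later
 * perf_event_read() will fetch a fresh value from hw anyway.
 */
bool sketch_sched_out(struct sketch_event *ev, uint64_t hw_value)
{
	if (ev->pmu_event_flags & PERF_INACTIVE_READ_MASK)
		return false;

	ev->count = hw_value;
	return true;
}
```

Skipping the save leaves event->count stale at context switch, which is
why inherit_stat, which relies on the saved counts, stops working.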

Reviewed-by: Stephane Eranian <eranian@google.com>
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
---
 include/linux/perf_event.h | 15 ++++++++++++
 kernel/events/core.c       | 59 +++++++++++++++++++++++++++++++++++-----------
 2 files changed, 60 insertions(+), 14 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e4c58b0..054d7f4 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -607,6 +607,21 @@ struct perf_event {
 /* Do not enable cgroup events in descendant cgroups. */
 #define PERF_CGROUP_NO_RECURSION		(1 << 0)
 
+/* CPU event can be read from event::cpu's package even if not in
+ * PERF_EVENT_STATE_ACTIVE; event::cpu must be a valid CPU.
+ */
+#define PERF_INACTIVE_CPU_READ_PKG		(1 << 1)
+
+/* Event can read from any package even if not in PERF_EVENT_STATE_ACTIVE. */
+#define PERF_INACTIVE_EV_READ_ANY_CPU		(1 << 2)
+
+static inline bool __perf_can_read_inactive(struct perf_event *event)
+{
+	return (event->pmu_event_flags & PERF_INACTIVE_EV_READ_ANY_CPU) ||
+		((event->pmu_event_flags & PERF_INACTIVE_CPU_READ_PKG) &&
+		(event->cpu != -1));
+}
+
 /**
  * struct perf_event_context - event context structure
  *
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 33961ec..28d1b51 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3266,15 +3266,28 @@ static void __perf_event_read(void *info)
 	struct perf_event_context *ctx = event->ctx;
 	struct perf_cpu_context *cpuctx = __get_cpu_context(ctx);
 	struct pmu *pmu = event->pmu;
+	bool read_inactive = __perf_can_read_inactive(event);
+
+	WARN_ON_ONCE(event->cpu == -1 &&
+		(event->pmu_event_flags & PERF_INACTIVE_CPU_READ_PKG));
+
+	/* If inactive, we should be reading in the event's package. */
+	WARN_ON_ONCE(
+		event->state != PERF_EVENT_STATE_ACTIVE &&
+		(event->pmu_event_flags & PERF_INACTIVE_CPU_READ_PKG) &&
+		(topology_physical_package_id(event->cpu) !=
+			topology_physical_package_id(smp_processor_id())));
 
 	/*
 	 * If this is a task context, we need to check whether it is
-	 * the current task context of this cpu.  If not it has been
+	 * the current task context of this cpu or if the event
+	 * can be read while inactive.  If it cannot be read while
+	 * inactive and is not on the current cpu, the event has been
 	 * scheduled out before the smp call arrived.  In that case
 	 * event->count would have been updated to a recent sample
 	 * when the event was scheduled out.
 	 */
-	if (ctx->task && cpuctx->task_ctx != ctx)
+	if (ctx->task && cpuctx->task_ctx != ctx && !read_inactive)
 		return;
 
 	raw_spin_lock(&ctx->lock);
@@ -3284,9 +3297,11 @@ static void __perf_event_read(void *info)
 	}
 
 	update_event_times(event);
-	if (event->state != PERF_EVENT_STATE_ACTIVE)
+
+	if (event->state != PERF_EVENT_STATE_ACTIVE && !read_inactive)
 		goto unlock;
 
+
 	if (!data->group) {
 		pmu->read(event);
 		data->ret = 0;
@@ -3299,7 +3314,8 @@ static void __perf_event_read(void *info)
 
 	list_for_each_entry(sub, &event->sibling_list, group_entry) {
 		update_event_times(sub);
-		if (sub->state == PERF_EVENT_STATE_ACTIVE) {
+		if (sub->state == PERF_EVENT_STATE_ACTIVE ||
+		    __perf_can_read_inactive(sub)) {
 			/*
 			 * Use sibling's PMU rather than @event's since
 			 * sibling could be on different (eg: software) PMU.
@@ -3368,19 +3384,34 @@ u64 perf_event_read_local(struct perf_event *event)
 static int perf_event_read(struct perf_event *event, bool group)
 {
 	int ret = 0;
+	bool active = event->state == PERF_EVENT_STATE_ACTIVE;
 
 	/*
-	 * If event is enabled and currently active on a CPU, update the
-	 * value in the event structure:
+	 * Read inactive event if PMU allows it. Otherwise, if event is
+	 * enabled and currently active on a CPU, update the value in the
+	 * event structure:
 	 */
-	if (event->state == PERF_EVENT_STATE_ACTIVE) {
+
+	if (active || __perf_can_read_inactive(event)) {
 		struct perf_read_data data = {
 			.event = event,
 			.group = group,
 			.ret = 0,
 		};
-		smp_call_function_single(event->oncpu,
-					 __perf_event_read, &data, 1);
+		int cpu_to_read = event->oncpu;
+
+		if (!active) {
+			cpu_to_read =
+				/* if __perf_can_read_inactive is true, it
+				 * either is a CPU/cgroup event or can be
+				 * read for any CPU.
+				 */
+				(event->pmu_event_flags &
+				       PERF_INACTIVE_EV_READ_ANY_CPU) ?
+				smp_processor_id() : event->cpu;
+		}
+		smp_call_function_single(
+			cpu_to_read, __perf_event_read, &data, 1);
 		ret = data.ret;
 	} else if (event->state == PERF_EVENT_STATE_INACTIVE) {
 		struct perf_event_context *ctx = event->ctx;
@@ -8199,11 +8230,11 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 	mutex_init(&event->mmap_mutex);
 
 	atomic_long_set(&event->refcount, 1);
-	event->cpu		= cpu;
-	event->attr		= *attr;
-	event->group_leader	= group_leader;
-	event->pmu		= NULL;
-	event->oncpu		= -1;
+	event->cpu			= cpu;
+	event->attr			= *attr;
+	event->group_leader		= group_leader;
+	event->pmu			= NULL;
+	event->oncpu			= -1;
 
 	event->parent		= parent_event;
 
-- 
2.8.0.rc3.226.g39d4020
