* [RFC v3 00/11] i915 PMU and engine busy stats
@ 2017-09-11 15:25 Tvrtko Ursulin
  2017-09-11 15:25 ` [RFC 01/11] drm/i915: Convert intel_rc6_residency_us to ns Tvrtko Ursulin
                   ` (16 more replies)
  0 siblings, 17 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-11 15:25 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Peter Zijlstra

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Third spin of the i915 PMU series.

In this version I have integrated some of the changes Dmitry made after
clarifying the perf API usage questions with Peter Zijlstra. Some other changes
I integrated in spirit but with a different implementation, and finally there
are some new fixups and tidies.

Patches 1-3 are small refactors to make the following work easier.

Patch 4 is the main bit.

Patch 5 is a small optimisation on top, which runs the sampling timer only when
it is required (depending on GPU awake status and the active PMU counters).

Patches 6-7 add software engine busyness tracking, which is then used from the
PMU in patch 9. This allows a more efficient and more accurate engine busyness
metric.

Patch 8 exposes the engine busyness in debugfs, should anyone care to access it
that way.

Patch 10 was requested by Ben, to make the metrics available for use outside
i915.

And finally patch 11 is an additional optimisation which hides the whole cost
of the software engine busyness tracking behind a single nop instruction in
the off case.

Strictly speaking, the series provides a fully functional PMU by patch 4. The
rest are optimisations, improvements and API exports.

I think what should follow is:

 1. A final discussion on the ABI - which counters do we want, etc.
 2. Another pass on the PMU implementation from the perf API point of view.

Assuming there are no big surprises after the two items above, the patches are
probably quite close to being ready for review and losing the RFC status.

Tvrtko Ursulin (11):
  drm/i915: Convert intel_rc6_residency_us to ns
  drm/i915: Add intel_energy_uJ
  drm/i915: Extract intel_get_cagf
  drm/i915/pmu: Expose a PMU interface for perf queries
  drm/i915/pmu: Suspend sampling when GPU is idle
  drm/i915: Wrap context schedule notification
  drm/i915: Engine busy time tracking
  drm/i915: Export engine busy stats in debugfs
  drm/i915/pmu: Wire up engine busy stats to PMU
  drm/i915: Export engine stats API to other users
  drm/i915: Gate engine stats collection with a static key

 drivers/gpu/drm/i915/Makefile           |   1 +
 drivers/gpu/drm/i915/i915_debugfs.c     | 117 ++++-
 drivers/gpu/drm/i915/i915_drv.c         |   2 +
 drivers/gpu/drm/i915/i915_drv.h         |  95 +++-
 drivers/gpu/drm/i915/i915_gem.c         |   1 +
 drivers/gpu/drm/i915/i915_gem_request.c |   1 +
 drivers/gpu/drm/i915/i915_pmu.c         | 806 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h         |   3 +
 drivers/gpu/drm/i915/intel_engine_cs.c  | 171 +++++++
 drivers/gpu/drm/i915/intel_lrc.c        |  19 +-
 drivers/gpu/drm/i915/intel_pm.c         |  62 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.c |  25 +
 drivers/gpu/drm/i915/intel_ringbuffer.h | 132 ++++++
 include/uapi/drm/i915_drm.h             |  58 +++
 14 files changed, 1452 insertions(+), 41 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_pmu.c

-- 
2.9.5
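For anyone wanting to experiment with the series, below is a rough and purely
illustrative sketch of how a userspace consumer could read one of the new
counters via perf_event_open(2). The PMU type id is a placeholder which would
be read from /sys/bus/event_source/devices/i915/type at runtime, the config
macros are the ones added to the uAPI header in patch 4, and error handling is
mostly omitted; it is not part of the series.

/*
 * Illustrative only: read the rcs0 busy counter via perf_event_open(2).
 * The type id below is a placeholder - read the real value from
 * /sys/bus/event_source/devices/i915/type. Config encoding comes from the
 * macros added to i915_drm.h in patch 4.
 */
#include <drm/i915_drm.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
	struct perf_event_attr attr;
	uint64_t busy_ns;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = 20; /* placeholder, read from the sysfs "type" file */
	attr.config = I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_RENDER, 0);

	/* Uncore-style PMU: system-wide event, pinned to the CPU advertised
	 * in the PMU cpumask attribute (typically CPU0). */
	fd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
	if (fd < 0)
		return 1;

	sleep(1);
	if (read(fd, &busy_ns, sizeof(busy_ns)) == sizeof(busy_ns))
		printf("rcs0 busy: %llu ns (sampled)\n",
		       (unsigned long long)busy_ns);

	close(fd);
	return 0;
}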


* [RFC 01/11] drm/i915: Convert intel_rc6_residency_us to ns
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
@ 2017-09-11 15:25 ` Tvrtko Ursulin
  2017-09-14 19:48   ` Chris Wilson
  2017-09-11 15:25 ` [RFC 02/11] drm/i915: Add intel_energy_uJ Tvrtko Ursulin
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-11 15:25 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Peter Zijlstra

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Will be used for exposing the PMU counters.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |  8 +++++++-
 drivers/gpu/drm/i915/intel_pm.c | 23 +++++++++--------------
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d07d1109e784..dbd054e88ca2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -4114,9 +4114,15 @@ void vlv_phy_reset_lanes(struct intel_encoder *encoder);
 
 int intel_gpu_freq(struct drm_i915_private *dev_priv, int val);
 int intel_freq_opcode(struct drm_i915_private *dev_priv, int val);
-u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
+u64 intel_rc6_residency_ns(struct drm_i915_private *dev_priv,
 			   const i915_reg_t reg);
 
+static inline u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
+					 const i915_reg_t reg)
+{
+	return DIV_ROUND_UP_ULL(intel_rc6_residency_ns(dev_priv, reg), 1000);
+}
+
 #define I915_READ8(reg)		dev_priv->uncore.funcs.mmio_readb(dev_priv, (reg), true)
 #define I915_WRITE8(reg, val)	dev_priv->uncore.funcs.mmio_writeb(dev_priv, (reg), (val), true)
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index fa9055a4f790..60461f49936b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -9343,10 +9343,10 @@ static u64 vlv_residency_raw(struct drm_i915_private *dev_priv,
 	return lower | (u64)upper << 8;
 }
 
-u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
+u64 intel_rc6_residency_ns(struct drm_i915_private *dev_priv,
 			   const i915_reg_t reg)
 {
-	u64 time_hw, units, div;
+	u64 res;
 
 	if (!intel_enable_rc6())
 		return 0;
@@ -9355,22 +9355,17 @@ u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
 
 	/* On VLV and CHV, residency time is in CZ units rather than 1.28us */
 	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
-		units = 1000;
-		div = dev_priv->czclk_freq;
+		res = vlv_residency_raw(dev_priv, reg);
+		res = DIV_ROUND_UP_ULL(res * 1000000, dev_priv->czclk_freq);
 
-		time_hw = vlv_residency_raw(dev_priv, reg);
-	} else if (IS_GEN9_LP(dev_priv)) {
-		units = 1000;
-		div = 1200;		/* 833.33ns */
-
-		time_hw = I915_READ(reg);
 	} else {
-		units = 128000; /* 1.28us */
-		div = 100000;
+		/* 833.33ns units on Gen9LP, 1.28us elsewhere. */
+		unsigned int unit = IS_GEN9_LP(dev_priv) ? 833 : 1280;
 
-		time_hw = I915_READ(reg);
+		res = (u64)I915_READ(reg) * unit;
 	}
 
 	intel_runtime_pm_put(dev_priv);
-	return DIV_ROUND_UP_ULL(time_hw * units, div);
+
+	return res;
 }
-- 
2.9.5
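As a sanity check on the unit handling above, a small illustrative calculation
(not part of the patch) showing what the new ns-based helper and the inline
microsecond wrapper return on the plain Gen6+ path, where the hardware counts
in 1.28 us ticks:

/* Illustrative only: mirrors the Gen6+ (non-VLV/CHV, non-Gen9LP) branch of
 * intel_rc6_residency_ns() and the intel_rc6_residency_us() wrapper above.
 */
#include <stdint.h>

static uint64_t rc6_residency_ns(uint32_t raw)
{
	return (uint64_t)raw * 1280;			/* 1.28 us per tick */
}

static uint64_t rc6_residency_us(uint32_t raw)
{
	return (rc6_residency_ns(raw) + 999) / 1000;	/* DIV_ROUND_UP */
}

/* A raw register value of 1,000,000 ticks therefore reads back as
 * 1,280,000,000 ns from the new helper and 1,280,000 us from the wrapper,
 * matching what the old microsecond-only helper returned.
 */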


* [RFC 02/11] drm/i915: Add intel_energy_uJ
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
  2017-09-11 15:25 ` [RFC 01/11] drm/i915: Convert intel_rc6_residency_us to ns Tvrtko Ursulin
@ 2017-09-11 15:25 ` Tvrtko Ursulin
  2017-09-14 19:49   ` Chris Wilson
  2017-09-14 20:36   ` Ville Syrjälä
  2017-09-11 15:25 ` [RFC 03/11] drm/i915: Extract intel_get_cagf Tvrtko Ursulin
                   ` (14 subsequent siblings)
  16 siblings, 2 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-11 15:25 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Peter Zijlstra

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Extract code from i915_energy_uJ (debugfs) so it can be used by
other callers in future patches.

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 17 +----------------
 drivers/gpu/drm/i915/i915_drv.h     |  2 ++
 drivers/gpu/drm/i915/intel_pm.c     | 25 +++++++++++++++++++++++++
 3 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 6338018f655d..b3a4a66bf7c4 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2780,26 +2780,11 @@ static int i915_sink_crc(struct seq_file *m, void *data)
 static int i915_energy_uJ(struct seq_file *m, void *data)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
-	unsigned long long power;
-	u32 units;
 
 	if (INTEL_GEN(dev_priv) < 6)
 		return -ENODEV;
 
-	intel_runtime_pm_get(dev_priv);
-
-	if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power)) {
-		intel_runtime_pm_put(dev_priv);
-		return -ENODEV;
-	}
-
-	units = (power & 0x1f00) >> 8;
-	power = I915_READ(MCH_SECP_NRG_STTS);
-	power = (1000000 * power) >> units; /* convert to uJ */
-
-	intel_runtime_pm_put(dev_priv);
-
-	seq_printf(m, "%llu", power);
+	seq_printf(m, "%llu", intel_energy_uJ(dev_priv));
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index dbd054e88ca2..826c74970ce9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -4123,6 +4123,8 @@ static inline u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
 	return DIV_ROUND_UP_ULL(intel_rc6_residency_ns(dev_priv, reg), 1000);
 }
 
+u64 intel_energy_uJ(struct drm_i915_private *dev_priv);
+
 #define I915_READ8(reg)		dev_priv->uncore.funcs.mmio_readb(dev_priv, (reg), true)
 #define I915_WRITE8(reg, val)	dev_priv->uncore.funcs.mmio_writeb(dev_priv, (reg), (val), true)
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 60461f49936b..ff67df8d99fa 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -9369,3 +9369,28 @@ u64 intel_rc6_residency_ns(struct drm_i915_private *dev_priv,
 
 	return res;
 }
+
+unsigned long long intel_energy_uJ(struct drm_i915_private *dev_priv)
+{
+	unsigned long long power;
+	unsigned long units;
+
+	if (GEM_WARN_ON(INTEL_GEN(dev_priv) < 6))
+		return 0;
+
+	intel_runtime_pm_get(dev_priv);
+
+	if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power)) {
+		power = 0;
+		goto out;
+	}
+
+	units = (power >> 8) & 0x1f;
+	power = I915_READ(MCH_SECP_NRG_STTS);
+	power = (1000000 * power) >> units; /* convert to uJ */
+
+out:
+	intel_runtime_pm_put(dev_priv);
+
+	return power;
+}
-- 
2.9.5
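For reference, the arithmetic in the new helper boils down to the following
illustrative calculation (the numbers are made up; the unit exponent is read
from MSR_RAPL_POWER_UNIT at runtime, with 14 being a common value):

/* Worked example of the conversion in intel_energy_uJ() above; illustrative
 * only, not part of the patch.
 */
#include <stdint.h>

static uint64_t mch_energy_to_uJ(uint64_t raw, unsigned int units)
{
	/* raw is in units of 2^-units Joules; scale to micro-Joules. */
	return (1000000 * raw) >> units;
}

/* With an energy-status-unit exponent of 14 (1/16384 J per tick), a raw
 * counter of 65536 works out to (65536 * 1000000) >> 14 == 4000000 uJ,
 * i.e. 4 Joules accumulated since the counter last wrapped.
 */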


* [RFC 03/11] drm/i915: Extract intel_get_cagf
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
  2017-09-11 15:25 ` [RFC 01/11] drm/i915: Convert intel_rc6_residency_us to ns Tvrtko Ursulin
  2017-09-11 15:25 ` [RFC 02/11] drm/i915: Add intel_energy_uJ Tvrtko Ursulin
@ 2017-09-11 15:25 ` Tvrtko Ursulin
  2017-09-14 19:51   ` Chris Wilson
  2017-09-11 15:25 ` [RFC 04/11] drm/i915/pmu: Expose a PMU interface for perf queries Tvrtko Ursulin
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-11 15:25 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Peter Zijlstra

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Code to be shared between debugfs and the PMU implementation.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c |  8 +-------
 drivers/gpu/drm/i915/i915_drv.h     |  1 +
 drivers/gpu/drm/i915/intel_pm.c     | 14 ++++++++++++++
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index b3a4a66bf7c4..1fd777fb5e9e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1112,13 +1112,7 @@ static int i915_frequency_info(struct seq_file *m, void *unused)
 		rpdownei = I915_READ(GEN6_RP_CUR_DOWN_EI) & GEN6_CURIAVG_MASK;
 		rpcurdown = I915_READ(GEN6_RP_CUR_DOWN) & GEN6_CURBSYTAVG_MASK;
 		rpprevdown = I915_READ(GEN6_RP_PREV_DOWN) & GEN6_CURBSYTAVG_MASK;
-		if (INTEL_GEN(dev_priv) >= 9)
-			cagf = (rpstat & GEN9_CAGF_MASK) >> GEN9_CAGF_SHIFT;
-		else if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
-			cagf = (rpstat & HSW_CAGF_MASK) >> HSW_CAGF_SHIFT;
-		else
-			cagf = (rpstat & GEN6_CAGF_MASK) >> GEN6_CAGF_SHIFT;
-		cagf = intel_gpu_freq(dev_priv, cagf);
+		cagf = intel_gpu_freq(dev_priv, intel_get_cagf(dev_priv, rpstat));
 
 		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 826c74970ce9..48daf9552163 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -4124,6 +4124,7 @@ static inline u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
 }
 
 u64 intel_energy_uJ(struct drm_i915_private *dev_priv);
+u32 intel_get_cagf(struct drm_i915_private *dev_priv, u32 rpstat1);
 
 #define I915_READ8(reg)		dev_priv->uncore.funcs.mmio_readb(dev_priv, (reg), true)
 #define I915_WRITE8(reg, val)	dev_priv->uncore.funcs.mmio_writeb(dev_priv, (reg), (val), true)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index ff67df8d99fa..9759c99b72bf 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -9394,3 +9394,17 @@ unsigned long long intel_energy_uJ(struct drm_i915_private *dev_priv)
 
 	return power;
 }
+
+u32 intel_get_cagf(struct drm_i915_private *dev_priv, u32 rpstat)
+{
+	u32 cagf;
+
+	if (INTEL_GEN(dev_priv) >= 9)
+		cagf = (rpstat & GEN9_CAGF_MASK) >> GEN9_CAGF_SHIFT;
+	else if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
+		cagf = (rpstat & HSW_CAGF_MASK) >> HSW_CAGF_SHIFT;
+	else
+		cagf = (rpstat & GEN6_CAGF_MASK) >> GEN6_CAGF_SHIFT;
+
+	return  cagf;
+}
-- 
2.9.5


* [RFC 04/11] drm/i915/pmu: Expose a PMU interface for perf queries
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2017-09-11 15:25 ` [RFC 03/11] drm/i915: Extract intel_get_cagf Tvrtko Ursulin
@ 2017-09-11 15:25 ` Tvrtko Ursulin
  2017-09-12  2:06   ` Rogozhkin, Dmitry V
  2017-09-14 19:46   ` [RFC " Chris Wilson
  2017-09-11 15:25 ` [RFC 05/11] drm/i915/pmu: Suspend sampling when GPU is idle Tvrtko Ursulin
                   ` (12 subsequent siblings)
  16 siblings, 2 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-11 15:25 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Peter Zijlstra

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

From: Chris Wilson <chris@chris-wilson.co.uk>
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

The first goal is to be able to measure GPU (and individual ring) busyness
without having to poll registers from userspace, which not only means
holding the forcewake lock indefinitely, perturbing the system, but also
runs the risk of hanging the machine. As an alternative we can use the
perf event counter interface to sample the ring registers periodically
and send those results to userspace.

To be able to do so, we need to export the two symbols from
kernel/events/core.c to register and unregister a PMU device.

v1-v2 (Chris Wilson):

v2: Use a common timer for the ring sampling.

v3: (Tvrtko Ursulin)
 * Decouple uAPI from i915 engine ids.
 * Complete uAPI defines.
 * Refactor some code to helpers for clarity.
 * Skip sampling disabled engines.
 * Expose counters in sysfs.
 * Pass in fake regs to avoid null ptr deref in perf core.
 * Convert to class/instance uAPI.
 * Use shared driver code for rc6 residency, power and frequency.

v4: (Dmitry Rogozhkin)
 * Register PMU with .task_ctx_nr=perf_invalid_context
 * Expose cpumask for the PMU with the single CPU in the mask
 * Properly support pmu->stop(): it should call pmu->read()
 * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE)
 * Make pmu.busy_stats a refcounter to avoid busy stats going away
   with some deleted event.
 * Expose a cpumask for the i915 PMU to avoid creation of multiple events of
   the same type followed by counter aggregation by perf-stat.
 * Track CPUs going online/offline in order to migrate the perf context. Since
   the cpumask will (most likely) initially contain CPU0,
   CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be needed to see the effect of the CPU
   status tracking.
 * End result is that only global events are supported and perf stat
   works correctly.
 * Deny perf driver level sampling - it is prohibited for uncore PMU.

v5: (Tvrtko Ursulin)

 * Don't hardcode number of engine samplers.
 * Rewrite event ref-counting for correctness and simplicity.
 * Store initial counter value when starting already enabled events
   to correctly report values to all listeners.
 * Fix RC6 residency readout.
 * Comments, GPL header.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 drivers/gpu/drm/i915/Makefile           |   1 +
 drivers/gpu/drm/i915/i915_drv.c         |   2 +
 drivers/gpu/drm/i915/i915_drv.h         |  76 ++++
 drivers/gpu/drm/i915/i915_pmu.c         | 686 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h         |   3 +
 drivers/gpu/drm/i915/intel_engine_cs.c  |  10 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |  25 ++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  25 ++
 include/uapi/drm/i915_drm.h             |  58 +++
 9 files changed, 886 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_pmu.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 1cb8059a3a16..7b3a0eca62b6 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -26,6 +26,7 @@ i915-y := i915_drv.o \
 
 i915-$(CONFIG_COMPAT)   += i915_ioc32.o
 i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o intel_pipe_crc.o
+i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
 
 # GEM code
 i915-y += i915_cmd_parser.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 5c111ea96e80..b1f96eb1be16 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1196,6 +1196,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
 	struct drm_device *dev = &dev_priv->drm;
 
 	i915_gem_shrinker_init(dev_priv);
+	i915_pmu_register(dev_priv);
 
 	/*
 	 * Notify a valid surface after modesetting,
@@ -1250,6 +1251,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
 	intel_opregion_unregister(dev_priv);
 
 	i915_perf_unregister(dev_priv);
+	i915_pmu_unregister(dev_priv);
 
 	i915_teardown_sysfs(dev_priv);
 	i915_guc_log_unregister(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 48daf9552163..62646b8dfb7a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -40,6 +40,7 @@
 #include <linux/hash.h>
 #include <linux/intel-iommu.h>
 #include <linux/kref.h>
+#include <linux/perf_event.h>
 #include <linux/pm_qos.h>
 #include <linux/reservation.h>
 #include <linux/shmem_fs.h>
@@ -2190,6 +2191,69 @@ struct intel_cdclk_state {
 	unsigned int cdclk, vco, ref;
 };
 
+enum {
+	__I915_SAMPLE_FREQ_ACT = 0,
+	__I915_SAMPLE_FREQ_REQ,
+	__I915_NUM_PMU_SAMPLERS
+};
+
+/**
+ * How many different events we track in the global PMU mask.
+ *
+ * It is also used to know to needed number of event reference counters.
+ */
+#define I915_PMU_MASK_BITS \
+	(1 << I915_PMU_SAMPLE_BITS) + (I915_PMU_LAST + 1 - __I915_PMU_OTHER(0))
+
+struct i915_pmu {
+	/**
+	 * @node: List node for CPU hotplug handling.
+	 */
+	struct hlist_node node;
+	/**
+	 * @base: PMU base.
+	 */
+	struct pmu base;
+	/**
+	 * @lock: Lock protecting enable mask and ref count handling.
+	 */
+	spinlock_t lock;
+	/**
+	 * @timer: Timer for internal i915 PMU sampling.
+	 */
+	struct hrtimer timer;
+	/**
+	 * @enable: Bitmask of all currently enabled events.
+	 *
+	 * Bits are derived from uAPI event numbers in a way that low 16 bits
+	 * correspond to engine event _sample_ _type_ (I915_SAMPLE_QUEUED is
+	 * bit 0), and higher bits correspond to other events (for instance
+	 * I915_PMU_ACTUAL_FREQUENCY is bit 16 etc).
+	 *
+	 * In other words, low 16 bits are not per engine but per engine
+	 * sampler type, while the upper bits are directly mapped to other
+	 * event types.
+	 */
+	u64 enable;
+	/**
+	 * @enable_count: Reference counts for the enabled events.
+	 *
+	 * Array indices are mapped in the same way as bits in the @enable field
+	 * and they are used to control sampling on/off when multiple clients
+	 * are using the PMU API.
+	 */
+	unsigned int enable_count[I915_PMU_MASK_BITS];
+	/**
+	 * @sample: Current counter value for i915 events which need sampling.
+	 *
+	 * These counters are updated from the i915 PMU sampling timer.
+	 *
+	 * Only global counters are held here, while the per-engine ones are in
+	 * struct intel_engine_cs.
+	 */
+	u64 sample[__I915_NUM_PMU_SAMPLERS];
+};
+
 struct drm_i915_private {
 	struct drm_device drm;
 
@@ -2238,6 +2302,7 @@ struct drm_i915_private {
 	struct pci_dev *bridge_dev;
 	struct i915_gem_context *kernel_context;
 	struct intel_engine_cs *engine[I915_NUM_ENGINES];
+	struct intel_engine_cs *engine_class[MAX_ENGINE_CLASS + 1][MAX_ENGINE_INSTANCE + 1];
 	struct i915_vma *semaphore;
 
 	struct drm_dma_handle *status_page_dmah;
@@ -2698,6 +2763,8 @@ struct drm_i915_private {
 		int	irq;
 	} lpe_audio;
 
+	struct i915_pmu pmu;
+
 	/*
 	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
 	 * will be rejected. Instead look for a better place.
@@ -3918,6 +3985,15 @@ extern void i915_perf_fini(struct drm_i915_private *dev_priv);
 extern void i915_perf_register(struct drm_i915_private *dev_priv);
 extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
 
+/* i915_pmu.c */
+#ifdef CONFIG_PERF_EVENTS
+extern void i915_pmu_register(struct drm_i915_private *i915);
+extern void i915_pmu_unregister(struct drm_i915_private *i915);
+#else
+static inline void i915_pmu_register(struct drm_i915_private *i915) {}
+static inline void i915_pmu_unregister(struct drm_i915_private *i915) {}
+#endif
+
 /* i915_suspend.c */
 extern int i915_save_state(struct drm_i915_private *dev_priv);
 extern int i915_restore_state(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
new file mode 100644
index 000000000000..2ec892e57143
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -0,0 +1,686 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include <linux/perf_event.h>
+#include <linux/pm_runtime.h>
+
+#include "i915_drv.h"
+#include "intel_ringbuffer.h"
+
+/* Frequency for the sampling timer for events which need it. */
+#define FREQUENCY 200
+#define PERIOD max_t(u64, 10000, NSEC_PER_SEC / FREQUENCY)
+
+#define ENGINE_SAMPLE_MASK \
+	(BIT(I915_SAMPLE_QUEUED) | \
+	 BIT(I915_SAMPLE_BUSY) | \
+	 BIT(I915_SAMPLE_WAIT) | \
+	 BIT(I915_SAMPLE_SEMA))
+
+#define ENGINE_SAMPLE_BITS (1 << I915_PMU_SAMPLE_BITS)
+
+static cpumask_t i915_pmu_cpumask = CPU_MASK_NONE;
+
+static u8 engine_config_sample(u64 config)
+{
+	return config & I915_PMU_SAMPLE_MASK;
+}
+
+static u8 engine_event_sample(struct perf_event *event)
+{
+	return engine_config_sample(event->attr.config);
+}
+
+static u8 engine_event_class(struct perf_event *event)
+{
+	return (event->attr.config >> I915_PMU_CLASS_SHIFT) & 0xff;
+}
+
+static u8 engine_event_instance(struct perf_event *event)
+{
+	return (event->attr.config >> I915_PMU_SAMPLE_BITS) & 0xff;
+}
+
+static bool is_engine_config(u64 config)
+{
+	return config < __I915_PMU_OTHER(0);
+}
+
+static unsigned int config_enabled_bit(u64 config)
+{
+	if (is_engine_config(config))
+		return engine_config_sample(config);
+	else
+		return ENGINE_SAMPLE_BITS + (config - __I915_PMU_OTHER(0));
+}
+
+static u64 config_enabled_mask(u64 config)
+{
+	return BIT_ULL(config_enabled_bit(config));
+}
+
+static bool is_engine_event(struct perf_event *event)
+{
+	return is_engine_config(event->attr.config);
+}
+
+static unsigned int event_enabled_bit(struct perf_event *event)
+{
+	return config_enabled_bit(event->attr.config);
+}
+
+static bool grab_forcewake(struct drm_i915_private *i915, bool fw)
+{
+	if (!fw)
+		intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
+
+	return true;
+}
+
+static void engines_sample(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	bool fw = false;
+
+	if ((dev_priv->pmu.enable & ENGINE_SAMPLE_MASK) == 0)
+		return;
+
+	if (!dev_priv->gt.awake)
+		return;
+
+	if (!intel_runtime_pm_get_if_in_use(dev_priv))
+		return;
+
+	for_each_engine(engine, dev_priv, id) {
+		u32 enable = engine->pmu.enable;
+
+		if (i915_seqno_passed(intel_engine_get_seqno(engine),
+				      intel_engine_last_submit(engine)))
+			continue;
+
+		if (enable & BIT(I915_SAMPLE_QUEUED))
+			engine->pmu.sample[I915_SAMPLE_QUEUED] += PERIOD;
+
+		if (enable & BIT(I915_SAMPLE_BUSY)) {
+			u32 val;
+
+			fw = grab_forcewake(dev_priv, fw);
+			val = I915_READ_FW(RING_MI_MODE(engine->mmio_base));
+			if (!(val & MODE_IDLE))
+				engine->pmu.sample[I915_SAMPLE_BUSY] += PERIOD;
+		}
+
+		if (enable & (BIT(I915_SAMPLE_WAIT) | BIT(I915_SAMPLE_SEMA))) {
+			u32 val;
+
+			fw = grab_forcewake(dev_priv, fw);
+			val = I915_READ_FW(RING_CTL(engine->mmio_base));
+			if ((enable & BIT(I915_SAMPLE_WAIT)) &&
+			    (val & RING_WAIT))
+				engine->pmu.sample[I915_SAMPLE_WAIT] += PERIOD;
+			if ((enable & BIT(I915_SAMPLE_SEMA)) &&
+			    (val & RING_WAIT_SEMAPHORE))
+				engine->pmu.sample[I915_SAMPLE_SEMA] += PERIOD;
+		}
+	}
+
+	if (fw)
+		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+	intel_runtime_pm_put(dev_priv);
+}
+
+static void frequency_sample(struct drm_i915_private *dev_priv)
+{
+	if (dev_priv->pmu.enable &
+	    config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY)) {
+		u64 val;
+
+		val = dev_priv->rps.cur_freq;
+		if (dev_priv->gt.awake &&
+		    intel_runtime_pm_get_if_in_use(dev_priv)) {
+			val = intel_get_cagf(dev_priv,
+					     I915_READ_NOTRACE(GEN6_RPSTAT1));
+			intel_runtime_pm_put(dev_priv);
+		}
+		val = intel_gpu_freq(dev_priv, val);
+		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_ACT] += val * PERIOD;
+	}
+
+	if (dev_priv->pmu.enable &
+	    config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY)) {
+		u64 val = intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq);
+		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_REQ] += val * PERIOD;
+	}
+}
+
+static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
+{
+	struct drm_i915_private *i915 =
+		container_of(hrtimer, struct drm_i915_private, pmu.timer);
+
+	if (i915->pmu.enable == 0)
+		return HRTIMER_NORESTART;
+
+	engines_sample(i915);
+	frequency_sample(i915);
+
+	hrtimer_forward_now(hrtimer, ns_to_ktime(PERIOD));
+	return HRTIMER_RESTART;
+}
+
+static u64 count_interrupts(struct drm_i915_private *i915)
+{
+	/* open-coded kstat_irqs() */
+	struct irq_desc *desc = irq_to_desc(i915->drm.pdev->irq);
+	u64 sum = 0;
+	int cpu;
+
+	if (!desc || !desc->kstat_irqs)
+		return 0;
+
+	for_each_possible_cpu(cpu)
+		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
+
+	return sum;
+}
+
+static void i915_pmu_event_destroy(struct perf_event *event)
+{
+	WARN_ON(event->parent);
+}
+
+static int engine_event_init(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+
+	if (!intel_engine_lookup_user(i915, engine_event_class(event),
+				      engine_event_instance(event)))
+		return -ENODEV;
+
+	switch (engine_event_sample(event)) {
+	case I915_SAMPLE_QUEUED:
+	case I915_SAMPLE_BUSY:
+	case I915_SAMPLE_WAIT:
+		break;
+	case I915_SAMPLE_SEMA:
+		if (INTEL_GEN(i915) < 6)
+			return -ENODEV;
+		break;
+	default:
+		return -ENOENT;
+	}
+
+	return 0;
+}
+
+static int i915_pmu_event_init(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	int cpu, ret;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/* unsupported modes and filters */
+	if (event->attr.sample_period) /* no sampling */
+		return -EINVAL;
+
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
+	if (event->cpu < 0)
+		return -EINVAL;
+
+	cpu = cpumask_any_and(&i915_pmu_cpumask,
+			      topology_sibling_cpumask(event->cpu));
+	if (cpu >= nr_cpu_ids)
+		return -ENODEV;
+
+	ret = 0;
+	if (is_engine_event(event)) {
+		ret = engine_event_init(event);
+	} else switch (event->attr.config) {
+	case I915_PMU_ACTUAL_FREQUENCY:
+		if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
+			ret = -ENODEV; /* requires a mutex for sampling! */
+	case I915_PMU_REQUESTED_FREQUENCY:
+	case I915_PMU_ENERGY:
+	case I915_PMU_RC6_RESIDENCY:
+	case I915_PMU_RC6p_RESIDENCY:
+	case I915_PMU_RC6pp_RESIDENCY:
+		if (INTEL_GEN(i915) < 6)
+			ret = -ENODEV;
+		break;
+	}
+	if (ret)
+		return ret;
+
+	event->cpu = cpu;
+	if (!event->parent)
+		event->destroy = i915_pmu_event_destroy;
+
+	return 0;
+}
+
+static u64 __i915_pmu_event_read(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	u64 val = 0;
+
+	if (is_engine_event(event)) {
+		u8 sample = engine_event_sample(event);
+		struct intel_engine_cs *engine;
+
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+
+		if (WARN_ON_ONCE(!engine)) {
+			/* Do nothing */
+		} else {
+			val = engine->pmu.sample[sample];
+		}
+	} else switch (event->attr.config) {
+	case I915_PMU_ACTUAL_FREQUENCY:
+		val = i915->pmu.sample[__I915_SAMPLE_FREQ_ACT];
+		break;
+	case I915_PMU_REQUESTED_FREQUENCY:
+		val = i915->pmu.sample[__I915_SAMPLE_FREQ_REQ];
+		break;
+	case I915_PMU_ENERGY:
+		val = intel_energy_uJ(i915);
+		break;
+	case I915_PMU_INTERRUPTS:
+		val = count_interrupts(i915);
+		break;
+	case I915_PMU_RC6_RESIDENCY:
+		val = intel_rc6_residency_ns(i915,
+					     IS_VALLEYVIEW(i915) ?
+					     VLV_GT_RENDER_RC6 :
+					     GEN6_GT_GFX_RC6);
+		break;
+	case I915_PMU_RC6p_RESIDENCY:
+		if (!IS_VALLEYVIEW(i915))
+			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
+		break;
+	case I915_PMU_RC6pp_RESIDENCY:
+		if (!IS_VALLEYVIEW(i915))
+			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
+		break;
+	}
+
+	return val;
+}
+
+static void i915_pmu_event_read(struct perf_event *event)
+{
+
+	local64_set(&event->count,
+		    __i915_pmu_event_read(event) -
+		    local64_read(&event->hw.prev_count));
+}
+
+static void i915_pmu_enable(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	unsigned int bit = event_enabled_bit(event);
+	unsigned long flags;
+
+	spin_lock_irqsave(&i915->pmu.lock, flags);
+
+	/*
+	 * Start the sampling timer when enabling the first event.
+	 */
+	if (i915->pmu.enable == 0)
+		hrtimer_start_range_ns(&i915->pmu.timer,
+				       ns_to_ktime(PERIOD), 0,
+				       HRTIMER_MODE_REL_PINNED);
+
+	/*
+	 * Update the bitmask of enabled events and increment
+	 * the event reference counter.
+	 */
+	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
+	GEM_BUG_ON(i915->pmu.enable_count[bit] == ~0);
+	i915->pmu.enable |= BIT_ULL(bit);
+	i915->pmu.enable_count[bit]++;
+
+	/*
+	 * For per-engine events the bitmask and reference counting
+	 * is stored per engine.
+	 */
+	if (is_engine_event(event)) {
+		u8 sample = engine_event_sample(event);
+		struct intel_engine_cs *engine;
+
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+		GEM_BUG_ON(!engine);
+		engine->pmu.enable |= BIT(sample);
+
+		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
+		GEM_BUG_ON(engine->pmu.enable_count[sample] == ~0);
+		engine->pmu.enable_count[sample]++;
+	}
+
+	/*
+	 * Store the current counter value so we can report the correct delta
+	 * for all listeners. Even when the event was already enabled and has
+	 * an existing non-zero value.
+	 */
+	local64_set(&event->hw.prev_count, __i915_pmu_event_read(event));
+
+	spin_unlock_irqrestore(&i915->pmu.lock, flags);
+}
+
+static void i915_pmu_disable(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	unsigned int bit = event_enabled_bit(event);
+	unsigned long flags;
+
+	spin_lock_irqsave(&i915->pmu.lock, flags);
+
+	if (is_engine_event(event)) {
+		u8 sample = engine_event_sample(event);
+		struct intel_engine_cs *engine;
+
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+		GEM_BUG_ON(!engine);
+		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
+		GEM_BUG_ON(engine->pmu.enable_count[sample] == 0);
+		/*
+		 * Decrement the reference count and clear the enabled
+		 * bitmask when the last listener on an event goes away.
+		 */
+		if (--engine->pmu.enable_count[sample] == 0)
+			engine->pmu.enable &= ~BIT(sample);
+	}
+
+	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
+	GEM_BUG_ON(i915->pmu.enable_count[bit] == 0);
+	/*
+	 * Decrement the reference count and clear the enabled
+	 * bitmask when the last listener on an event goes away.
+	 */
+	if (--i915->pmu.enable_count[bit] == 0)
+		i915->pmu.enable &= ~BIT_ULL(bit);
+
+	spin_unlock_irqrestore(&i915->pmu.lock, flags);
+}
+
+static void i915_pmu_event_start(struct perf_event *event, int flags)
+{
+	i915_pmu_enable(event);
+	event->hw.state = 0;
+}
+
+static void i915_pmu_event_stop(struct perf_event *event, int flags)
+{
+	if (flags & PERF_EF_UPDATE)
+		i915_pmu_event_read(event);
+	i915_pmu_disable(event);
+	event->hw.state = PERF_HES_STOPPED;
+}
+
+static int i915_pmu_event_add(struct perf_event *event, int flags)
+{
+	if (flags & PERF_EF_START)
+		i915_pmu_event_start(event, flags);
+
+	return 0;
+}
+
+static void i915_pmu_event_del(struct perf_event *event, int flags)
+{
+	i915_pmu_event_stop(event, PERF_EF_UPDATE);
+}
+
+static int i915_pmu_event_event_idx(struct perf_event *event)
+{
+	return 0;
+}
+
+static ssize_t i915_pmu_format_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+        struct dev_ext_attribute *eattr;
+
+        eattr = container_of(attr, struct dev_ext_attribute, attr);
+        return sprintf(buf, "%s\n", (char *) eattr->var);
+}
+
+#define I915_PMU_FORMAT_ATTR(_name, _config)           \
+        (&((struct dev_ext_attribute[]) {               \
+                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_format_show, NULL), \
+                  .var = (void *) _config, }            \
+        })[0].attr.attr)
+
+static struct attribute *i915_pmu_format_attrs[] = {
+        I915_PMU_FORMAT_ATTR(i915_eventid, "config:0-42"),
+        NULL,
+};
+
+static const struct attribute_group i915_pmu_format_attr_group = {
+        .name = "format",
+        .attrs = i915_pmu_format_attrs,
+};
+
+static ssize_t i915_pmu_event_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+        struct dev_ext_attribute *eattr;
+
+        eattr = container_of(attr, struct dev_ext_attribute, attr);
+        return sprintf(buf, "config=0x%lx\n", (unsigned long) eattr->var);
+}
+
+#define I915_PMU_EVENT_ATTR(_name, _config)            \
+        (&((struct dev_ext_attribute[]) {               \
+                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_event_show, NULL), \
+                  .var = (void *) _config, }            \
+         })[0].attr.attr)
+
+static struct attribute *i915_pmu_events_attrs[] = {
+	I915_PMU_EVENT_ATTR(rcs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_RENDER, 0)),
+	I915_PMU_EVENT_ATTR(rcs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_RENDER, 0)),
+	I915_PMU_EVENT_ATTR(rcs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_RENDER, 0)),
+	I915_PMU_EVENT_ATTR(rcs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_RENDER, 0)),
+
+	I915_PMU_EVENT_ATTR(bcs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_COPY, 0)),
+	I915_PMU_EVENT_ATTR(bcs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_COPY, 0)),
+	I915_PMU_EVENT_ATTR(bcs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_COPY, 0)),
+	I915_PMU_EVENT_ATTR(bcs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_COPY, 0)),
+
+	I915_PMU_EVENT_ATTR(vcs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 0)),
+	I915_PMU_EVENT_ATTR(vcs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 0)),
+	I915_PMU_EVENT_ATTR(vcs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 0)),
+	I915_PMU_EVENT_ATTR(vcs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 0)),
+
+	I915_PMU_EVENT_ATTR(vcs1-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 1)),
+	I915_PMU_EVENT_ATTR(vcs1-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 1)),
+	I915_PMU_EVENT_ATTR(vcs1-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 1)),
+	I915_PMU_EVENT_ATTR(vcs1-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 1)),
+
+	I915_PMU_EVENT_ATTR(vecs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+	I915_PMU_EVENT_ATTR(vecs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+	I915_PMU_EVENT_ATTR(vecs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+	I915_PMU_EVENT_ATTR(vecs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+
+        I915_PMU_EVENT_ATTR(actual-frequency,	 I915_PMU_ACTUAL_FREQUENCY),
+        I915_PMU_EVENT_ATTR(requested-frequency, I915_PMU_REQUESTED_FREQUENCY),
+        I915_PMU_EVENT_ATTR(energy,		 I915_PMU_ENERGY),
+        I915_PMU_EVENT_ATTR(interrupts,		 I915_PMU_INTERRUPTS),
+        I915_PMU_EVENT_ATTR(rc6-residency,	 I915_PMU_RC6_RESIDENCY),
+        I915_PMU_EVENT_ATTR(rc6p-residency,	 I915_PMU_RC6p_RESIDENCY),
+        I915_PMU_EVENT_ATTR(rc6pp-residency,	 I915_PMU_RC6pp_RESIDENCY),
+
+        NULL,
+};
+
+static const struct attribute_group i915_pmu_events_attr_group = {
+        .name = "events",
+        .attrs = i915_pmu_events_attrs,
+};
+
+static ssize_t
+i915_pmu_get_attr_cpumask(struct device *dev,
+			  struct device_attribute *attr,
+			  char *buf)
+{
+	return cpumap_print_to_pagebuf(true, buf, &i915_pmu_cpumask);
+}
+
+static DEVICE_ATTR(cpumask, S_IRUGO, i915_pmu_get_attr_cpumask, NULL);
+
+static struct attribute *i915_cpumask_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL,
+};
+
+static struct attribute_group i915_pmu_cpumask_attr_group = {
+	.attrs = i915_cpumask_attrs,
+};
+
+static const struct attribute_group *i915_pmu_attr_groups[] = {
+        &i915_pmu_format_attr_group,
+        &i915_pmu_events_attr_group,
+	&i915_pmu_cpumask_attr_group,
+        NULL
+};
+
+static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
+{
+	unsigned int target;
+
+	target = cpumask_any_and(&i915_pmu_cpumask, &i915_pmu_cpumask);
+	/* Select the first online CPU as a designated reader. */
+	if (target >= nr_cpu_ids)
+		cpumask_set_cpu(cpu, &i915_pmu_cpumask);
+
+	return 0;
+}
+
+static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
+{
+	struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), node);
+	unsigned int target;
+
+	if (cpumask_test_and_clear_cpu(cpu, &i915_pmu_cpumask)) {
+		target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
+		/* Migrate events if there is a valid target */
+		if (target < nr_cpu_ids) {
+			cpumask_set_cpu(target, &i915_pmu_cpumask);
+			perf_pmu_migrate_context(&pmu->base, cpu, target);
+		}
+	}
+
+	return 0;
+}
+
+void i915_pmu_register(struct drm_i915_private *i915)
+{
+	int ret = ENOTSUPP;
+
+	if (INTEL_GEN(i915) <= 2)
+		goto err;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
+				      "perf/x86/intel/i915:online",
+				      i915_pmu_cpu_online,
+			              i915_pmu_cpu_offline);
+	if (ret)
+		goto err;
+
+	ret = cpuhp_state_add_instance(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
+				       &i915->pmu.node);
+	if (ret)
+		goto err;
+
+	i915->pmu.base.attr_groups	= i915_pmu_attr_groups;
+	i915->pmu.base.task_ctx_nr	= perf_invalid_context;
+	i915->pmu.base.event_init	= i915_pmu_event_init;
+	i915->pmu.base.add		= i915_pmu_event_add;
+	i915->pmu.base.del		= i915_pmu_event_del;
+	i915->pmu.base.start		= i915_pmu_event_start;
+	i915->pmu.base.stop		= i915_pmu_event_stop;
+	i915->pmu.base.read		= i915_pmu_event_read;
+	i915->pmu.base.event_idx	= i915_pmu_event_event_idx;
+
+	spin_lock_init(&i915->pmu.lock);
+	hrtimer_init(&i915->pmu.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	i915->pmu.timer.function = i915_sample;
+	i915->pmu.enable = 0;
+
+	if (perf_pmu_register(&i915->pmu.base, "i915", -1))
+		i915->pmu.base.event_init = NULL;
+
+err:
+	DRM_INFO("Failed to register PMU (err=%d)\n", ret);
+}
+
+void i915_pmu_unregister(struct drm_i915_private *i915)
+{
+	if (!i915->pmu.base.event_init)
+		return;
+
+	i915->pmu.enable = 0;
+
+	perf_pmu_unregister(&i915->pmu.base);
+	i915->pmu.base.event_init = NULL;
+
+	hrtimer_cancel(&i915->pmu.timer);
+
+	cpuhp_state_remove_instance(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
+				    &i915->pmu.node);
+}
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 0b03260a3967..8c362e0451c1 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -186,6 +186,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define VIDEO_ENHANCEMENT_CLASS	2
 #define COPY_ENGINE_CLASS	3
 #define OTHER_CLASS		4
+#define MAX_ENGINE_CLASS	4
+
+#define MAX_ENGINE_INSTANCE    1
 
 /* PCI config space */
 
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 3ae89a9d6241..dbc7abd65f33 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -198,6 +198,15 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
 	GEM_BUG_ON(info->class >= ARRAY_SIZE(intel_engine_classes));
 	class_info = &intel_engine_classes[info->class];
 
+	if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
+		return -EINVAL;
+
+	if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
+		return -EINVAL;
+
+	if (GEM_WARN_ON(dev_priv->engine_class[info->class][info->instance]))
+		return -EINVAL;
+
 	GEM_BUG_ON(dev_priv->engine[id]);
 	engine = kzalloc(sizeof(*engine), GFP_KERNEL);
 	if (!engine)
@@ -225,6 +234,7 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
 
 	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
 
+	dev_priv->engine_class[info->class][info->instance] = engine;
 	dev_priv->engine[id] = engine;
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 268342433a8e..7db4c572ef76 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2283,3 +2283,28 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
 
 	return intel_init_ring_buffer(engine);
 }
+
+static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
+	[I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
+	[I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
+	[I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS,
+	[I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS,
+	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS,
+};
+
+struct intel_engine_cs *
+intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance)
+{
+	if (class >= ARRAY_SIZE(user_class_map))
+		return NULL;
+
+	class = user_class_map[class];
+
+	if (WARN_ON_ONCE(class > MAX_ENGINE_CLASS))
+		return NULL;
+
+	if (instance > MAX_ENGINE_INSTANCE)
+		return NULL;
+
+	return i915->engine_class[class][instance];
+}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 79c0021f3700..cf095b9386f4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -245,6 +245,28 @@ struct intel_engine_cs {
 		I915_SELFTEST_DECLARE(bool mock : 1);
 	} breadcrumbs;
 
+	struct {
+		/**
+		 * @enable: Bitmask of enable sample events on this engine.
+		 *
+		 * Bits correspond to sample event types, for instance
+		 * I915_SAMPLE_QUEUED is bit 0 etc.
+		 */
+		u32 enable;
+		/**
+		 * @enable_count: Reference count for the enabled samplers.
+		 *
+		 * Index number corresponds to the bit number from @enable.
+		 */
+		unsigned int enable_count[I915_PMU_SAMPLE_BITS];
+		/**
+		 * @sample: Counter value for sampling events.
+		 *
+		 * Our internal timer stores the current counter in this field.
+		 */
+		u64 sample[I915_ENGINE_SAMPLE_MAX];
+	} pmu;
+
 	/*
 	 * A pool of objects to use as shadow copies of client batch buffers
 	 * when the command parser is enabled. Prevents the client from
@@ -737,4 +759,7 @@ void intel_engines_reset_default_submission(struct drm_i915_private *i915);
 
 bool intel_engine_can_store_dword(struct intel_engine_cs *engine);
 
+struct intel_engine_cs *
+intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index d8d10d932759..6dc0d6fd4e4c 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -86,6 +86,64 @@ enum i915_mocs_table_index {
 	I915_MOCS_CACHED,
 };
 
+enum drm_i915_gem_engine_class {
+	I915_ENGINE_CLASS_OTHER = 0,
+	I915_ENGINE_CLASS_RENDER = 1,
+	I915_ENGINE_CLASS_COPY = 2,
+	I915_ENGINE_CLASS_VIDEO = 3,
+	I915_ENGINE_CLASS_VIDEO_ENHANCE = 4,
+	I915_ENGINE_CLASS_MAX /* non-ABI */
+};
+
+/**
+ * DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
+ *
+ */
+
+enum drm_i915_pmu_engine_sample {
+	I915_SAMPLE_QUEUED = 0,
+	I915_SAMPLE_BUSY = 1,
+	I915_SAMPLE_WAIT = 2,
+	I915_SAMPLE_SEMA = 3,
+	I915_ENGINE_SAMPLE_MAX /* non-ABI */
+};
+
+#define I915_PMU_SAMPLE_BITS (4)
+#define I915_PMU_SAMPLE_MASK (0xf)
+#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
+#define I915_PMU_CLASS_SHIFT \
+	(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)
+
+#define __I915_PMU_ENGINE(class, instance, sample) \
+	((class) << I915_PMU_CLASS_SHIFT | \
+	(instance) << I915_PMU_SAMPLE_BITS | \
+	(sample))
+
+#define I915_PMU_ENGINE_QUEUED(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
+
+#define I915_PMU_ENGINE_BUSY(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_BUSY)
+
+#define I915_PMU_ENGINE_WAIT(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_WAIT)
+
+#define I915_PMU_ENGINE_SEMA(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
+
+#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
+
+#define I915_PMU_ACTUAL_FREQUENCY 	__I915_PMU_OTHER(0)
+#define I915_PMU_REQUESTED_FREQUENCY	__I915_PMU_OTHER(1)
+#define I915_PMU_ENERGY			__I915_PMU_OTHER(2)
+#define I915_PMU_INTERRUPTS		__I915_PMU_OTHER(3)
+
+#define I915_PMU_RC6_RESIDENCY		__I915_PMU_OTHER(4)
+#define I915_PMU_RC6p_RESIDENCY		__I915_PMU_OTHER(5)
+#define I915_PMU_RC6pp_RESIDENCY	__I915_PMU_OTHER(6)
+
+#define I915_PMU_LAST I915_PMU_RC6pp_RESIDENCY
+
 /* Each region is a minimum of 16k, and there are at most 255 of them.
  */
 #define I915_NR_TEX_REGIONS 255	/* table size 2k - maximum due to use
-- 
2.9.5
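To make the event namespace introduced above easier to discuss, here is an
illustrative breakdown of the config encoding. It is derived from the
I915_PMU_* macros in the uAPI hunk and adds no new API:

/* Engine event config layout (from I915_PMU_CLASS_SHIFT == 12,
 * I915_PMU_SAMPLE_BITS == 4):
 *
 *   [ class : 8 ][ instance : 8 ][ sample : 4 ]
 *     bits 19:12    bits 11:4       bits 3:0
 */
#include <stdint.h>

static uint64_t engine_config(uint8_t class, uint8_t instance, uint8_t sample)
{
	return ((uint64_t)class << 12) |	/* I915_PMU_CLASS_SHIFT */
	       ((uint64_t)instance << 4) |	/* I915_PMU_SAMPLE_BITS */
	       sample;
}

/* engine_config(3, 1, 1) == 0x3011, i.e.
 * I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 1) - the "vcs1-busy" event.
 *
 * Anything >= 0x100000 (__I915_PMU_OTHER(0)) is a non-engine event, e.g.
 * I915_PMU_ACTUAL_FREQUENCY == 0x100000 and I915_PMU_ENERGY == 0x100002.
 */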


* [RFC 05/11] drm/i915/pmu: Suspend sampling when GPU is idle
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2017-09-11 15:25 ` [RFC 04/11] drm/i915/pmu: Expose a PMU interface for perf queries Tvrtko Ursulin
@ 2017-09-11 15:25 ` Tvrtko Ursulin
  2017-09-13 10:34   ` [RFC v5 " Tvrtko Ursulin
  2017-09-11 15:25 ` [RFC 06/11] drm/i915: Wrap context schedule notification Tvrtko Ursulin
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-11 15:25 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Peter Zijlstra

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

If only a subset of events is enabled, we can afford to suspend
the sampling timer when the GPU is idle and so save some cycles
and power.

v2: Rebase and limit timer even more.
v3: Rebase.
v4: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  8 +++++
 drivers/gpu/drm/i915/i915_gem.c         |  1 +
 drivers/gpu/drm/i915/i915_gem_request.c |  1 +
 drivers/gpu/drm/i915/i915_pmu.c         | 64 +++++++++++++++++++++++++++------
 4 files changed, 64 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 62646b8dfb7a..70be8c5d9a65 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2244,6 +2244,10 @@ struct i915_pmu {
 	 */
 	unsigned int enable_count[I915_PMU_MASK_BITS];
 	/**
+	 * @timer_enabled: Should the internal sampling timer be running.
+	 */
+	bool timer_enabled;
+	/**
 	 * @sample: Current counter value for i915 events which need sampling.
 	 *
 	 * These counters are updated from the i915 PMU sampling timer.
@@ -3989,9 +3993,13 @@ extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
 #ifdef CONFIG_PERF_EVENTS
 extern void i915_pmu_register(struct drm_i915_private *i915);
 extern void i915_pmu_unregister(struct drm_i915_private *i915);
+extern void i915_pmu_gt_idle(struct drm_i915_private *i915);
+extern void i915_pmu_gt_active(struct drm_i915_private *i915);
 #else
 static inline void i915_pmu_register(struct drm_i915_private *i915) {}
 static inline void i915_pmu_unregister(struct drm_i915_private *i915) {}
+static inline void i915_pmu_gt_idle(struct drm_i915_private *i915) {}
+static inline void i915_pmu_gt_active(struct drm_i915_private *i915) {}
 #endif
 
 /* i915_suspend.c */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f445587c1a4b..201b09eda93b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3227,6 +3227,7 @@ i915_gem_idle_work_handler(struct work_struct *work)
 
 	intel_engines_mark_idle(dev_priv);
 	i915_gem_timelines_mark_idle(dev_priv);
+	i915_pmu_gt_idle(dev_priv);
 
 	GEM_BUG_ON(!dev_priv->gt.awake);
 	dev_priv->gt.awake = false;
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 813a3b546d6e..18a1e379253e 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -258,6 +258,7 @@ static void mark_busy(struct drm_i915_private *i915)
 	i915_update_gfx_val(i915);
 	if (INTEL_GEN(i915) >= 6)
 		gen6_rps_busy(i915);
+	i915_pmu_gt_active(i915);
 
 	queue_delayed_work(i915->wq,
 			   &i915->gt.retire_work,
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 2ec892e57143..26e735f27282 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -90,6 +90,46 @@ static unsigned int event_enabled_bit(struct perf_event *event)
 	return config_enabled_bit(event->attr.config);
 }
 
+static bool pmu_needs_timer(struct drm_i915_private *i915, bool gpu_active)
+{
+	u64 enable = i915->pmu.enable;
+
+	enable &= config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY) |
+		  config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY) |
+		  ENGINE_SAMPLE_MASK;
+
+	if (!gpu_active)
+		enable &= ~ENGINE_SAMPLE_MASK;
+
+	return enable;
+}
+
+void i915_pmu_gt_idle(struct drm_i915_private *i915)
+{
+	spin_lock_irq(&i915->pmu.lock);
+	/*
+	 * Signal sampling timer to stop if only engine events are enabled and
+	 * GPU went idle.
+	 */
+	i915->pmu.timer_enabled = pmu_needs_timer(i915, false);
+	spin_unlock_irq(&i915->pmu.lock);
+}
+
+void i915_pmu_gt_active(struct drm_i915_private *i915)
+{
+	spin_lock_irq(&i915->pmu.lock);
+	/*
+	 * Re-enable sampling timer when GPU goes active.
+	 */
+	if (!i915->pmu.timer_enabled && pmu_needs_timer(i915, true)) {
+		i915->pmu.timer_enabled = true;
+		hrtimer_start_range_ns(&i915->pmu.timer,
+				       ns_to_ktime(PERIOD), 0,
+				       HRTIMER_MODE_REL_PINNED);
+	}
+	spin_unlock_irq(&i915->pmu.lock);
+}
+
 static bool grab_forcewake(struct drm_i915_private *i915, bool fw)
 {
 	if (!fw)
@@ -180,7 +220,7 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
 	struct drm_i915_private *i915 =
 		container_of(hrtimer, struct drm_i915_private, pmu.timer);
 
-	if (i915->pmu.enable == 0)
+	if (!READ_ONCE(i915->pmu.timer_enabled))
 		return HRTIMER_NORESTART;
 
 	engines_sample(i915);
@@ -355,14 +395,6 @@ static void i915_pmu_enable(struct perf_event *event)
 	spin_lock_irqsave(&i915->pmu.lock, flags);
 
 	/*
-	 * Start the sampling timer when enabling the first event.
-	 */
-	if (i915->pmu.enable == 0)
-		hrtimer_start_range_ns(&i915->pmu.timer,
-				       ns_to_ktime(PERIOD), 0,
-				       HRTIMER_MODE_REL_PINNED);
-
-	/*
 	 * Update the bitmask of enabled events and increment
 	 * the event reference counter.
 	 */
@@ -372,6 +404,16 @@ static void i915_pmu_enable(struct perf_event *event)
 	i915->pmu.enable_count[bit]++;
 
 	/*
+	 * Start the sampling timer if needed and not already enabled.
+	 */
+	if (pmu_needs_timer(i915, true) && !i915->pmu.timer_enabled) {
+		i915->pmu.timer_enabled = true;
+		hrtimer_start_range_ns(&i915->pmu.timer,
+				       ns_to_ktime(PERIOD), 0,
+				       HRTIMER_MODE_REL_PINNED);
+	}
+
+	/*
 	 * For per-engine events the bitmask and reference counting
 	 * is stored per engine.
 	 */
@@ -433,8 +475,10 @@ static void i915_pmu_disable(struct perf_event *event)
 	 * Decrement the reference count and clear the enabled
 	 * bitmask when the last listener on an event goes away.
 	 */
-	if (--i915->pmu.enable_count[bit] == 0)
+	if (--i915->pmu.enable_count[bit] == 0) {
 		i915->pmu.enable &= ~BIT_ULL(bit);
+		i915->pmu.timer_enabled &= pmu_needs_timer(i915, true);
+	}
 
 	spin_unlock_irqrestore(&i915->pmu.lock, flags);
 }
-- 
2.9.5


* [RFC 06/11] drm/i915: Wrap context schedule notification
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (4 preceding siblings ...)
  2017-09-11 15:25 ` [RFC 05/11] drm/i915/pmu: Suspend sampling when GPU is idle Tvrtko Ursulin
@ 2017-09-11 15:25 ` Tvrtko Ursulin
  2017-09-11 15:25 ` [RFC 07/11] drm/i915: Engine busy time tracking Tvrtko Ursulin
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-11 15:25 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Peter Zijlstra

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

No functional change, just something which will be handy in the
following patch.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d89e1b8e1cc5..b61fb09024c3 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -307,6 +307,18 @@ execlists_context_status_change(struct drm_i915_gem_request *rq,
 				   status, rq);
 }
 
+static inline void
+execlists_context_schedule_in(struct drm_i915_gem_request *rq)
+{
+	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
+}
+
+static inline void
+execlists_context_schedule_out(struct drm_i915_gem_request *rq)
+{
+	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
+}
+
 static void
 execlists_update_context_pdps(struct i915_hw_ppgtt *ppgtt, u32 *reg_state)
 {
@@ -352,7 +364,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 		if (rq) {
 			GEM_BUG_ON(count > !n);
 			if (!count++)
-				execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
+				execlists_context_schedule_in(rq);
 			port_set(&port[n], port_pack(rq, count));
 			desc = execlists_update_context(rq);
 			GEM_DEBUG_EXEC(port[n].context_id = upper_32_bits(desc));
@@ -603,8 +615,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 			if (--count == 0) {
 				GEM_BUG_ON(status & GEN8_CTX_STATUS_PREEMPTED);
 				GEM_BUG_ON(!i915_gem_request_completed(rq));
-				execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
-
+				execlists_context_schedule_out(rq);
 				trace_i915_gem_request_out(rq);
 				i915_gem_request_put(rq);
 
-- 
2.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC 07/11] drm/i915: Engine busy time tracking
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (5 preceding siblings ...)
  2017-09-11 15:25 ` [RFC 06/11] drm/i915: Wrap context schedule notification Tvrtko Ursulin
@ 2017-09-11 15:25 ` Tvrtko Ursulin
  2017-09-14 20:16   ` Chris Wilson
  2017-09-11 15:25 ` [RFC 08/11] drm/i915: Export engine busy stats in debugfs Tvrtko Ursulin
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-11 15:25 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Peter Zijlstra

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Track total time requests have been executing on the hardware.

We add a new kernel API to allow software tracking of the time GPU
engines spend executing requests.

Both per-engine and global APIs are added, with the latter also
being exported for use by external users.
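
A hypothetical consumer sketch (not part of this patch; the helper
name is made up and it assumes linux/delay.h and linux/math64.h are
available) showing how two samples of intel_engine_get_busy_time()
could be turned into a busy percentage over a one second window:

static u64 example_engine_busy_pct(struct intel_engine_cs *engine)
{
	ktime_t busy_start = intel_engine_get_busy_time(engine);
	ktime_t wall_start = ktime_get();
	ktime_t busy, wall;

	msleep(1000);

	busy = ktime_sub(intel_engine_get_busy_time(engine), busy_start);
	wall = ktime_sub(ktime_get(), wall_start);

	/* Busy fraction of the sampling window, in percent. */
	return div64_u64(100 * ktime_to_ns(busy), ktime_to_ns(wall));
}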

v2:
 * Squashed with the internal API.
 * Dropped static key.
 * Made per-engine.
 * Store time in monotonic ktime.

v3: Moved stats clearing to disable.

v4:
 * Comments.
 * Don't export the API just yet.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_engine_cs.c  | 141 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c        |   2 +
 drivers/gpu/drm/i915/intel_ringbuffer.h |  81 ++++++++++++++++++
 3 files changed, 224 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index dbc7abd65f33..f7dba176989c 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -232,6 +232,8 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
 	/* Nothing to do here, execute in order of dependencies */
 	engine->schedule = NULL;
 
+	spin_lock_init(&engine->stats.lock);
+
 	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
 
 	dev_priv->engine_class[info->class][info->instance] = engine;
@@ -1417,6 +1419,145 @@ bool intel_engine_can_store_dword(struct intel_engine_cs *engine)
 	}
 }
 
+/**
+ * intel_enable_engine_stats() - Enable engine busy tracking on engine
+ * @engine: engine to enable stats collection
+ *
+ * Start collecting the engine busyness data for @engine.
+ *
+ * Returns 0 on success or a negative error code.
+ */
+int intel_enable_engine_stats(struct intel_engine_cs *engine)
+{
+	unsigned long flags;
+
+	if (!i915.enable_execlists)
+		return -ENODEV;
+
+	spin_lock_irqsave(&engine->stats.lock, flags);
+	if (engine->stats.enabled == ~0)
+		goto busy;
+	engine->stats.enabled++;
+	spin_unlock_irqrestore(&engine->stats.lock, flags);
+
+	return 0;
+
+busy:
+	spin_unlock_irqrestore(&engine->stats.lock, flags);
+
+	return -EBUSY;
+}
+
+/**
+ * intel_disable_engine_stats() - Disable engine busy tracking on engine
+ * @engine: engine to disable stats collection
+ *
+ * Stops collecting the engine busyness data for @engine.
+ */
+void intel_disable_engine_stats(struct intel_engine_cs *engine)
+{
+	unsigned long flags;
+
+	if (!i915.enable_execlists)
+		return;
+
+	spin_lock_irqsave(&engine->stats.lock, flags);
+	WARN_ON_ONCE(engine->stats.enabled == 0);
+	if (--engine->stats.enabled == 0) {
+		engine->stats.ref = 0;
+		engine->stats.start = engine->stats.total = 0;
+	}
+	spin_unlock_irqrestore(&engine->stats.lock, flags);
+}
+
+/**
+ * intel_enable_engines_stats() - Enable engine busy tracking on all engines
+ * @dev_priv: i915 device private
+ *
+ * Start collecting the engine busyness data for all engines.
+ *
+ * Returns 0 on success or a negative error code.
+ */
+int intel_enable_engines_stats(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int ret = 0;
+
+	if (!i915.enable_execlists)
+		return -ENODEV;
+
+	for_each_engine(engine, dev_priv, id) {
+		ret = intel_enable_engine_stats(engine);
+		if (WARN_ON_ONCE(ret))
+			break;
+	}
+
+	return ret;
+}
+
+/**
+ * intel_disable_engines_stats() - Disable engine busy tracking on all engines
+ * @dev_priv: i915 device private
+ *
+ * Stops collecting the engine busyness data for all engines.
+ */
+void intel_disable_engines_stats(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	for_each_engine(engine, dev_priv, id)
+		intel_disable_engine_stats(engine);
+}
+
+/**
+ * intel_engine_get_busy_time() - Return current accumulated engine busyness
+ * @engine: engine to report on
+ *
+ * Returns accumulated time @engine was busy since engine stats were enabled.
+ */
+ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine)
+{
+	ktime_t total;
+	unsigned long flags;
+
+	spin_lock_irqsave(&engine->stats.lock, flags);
+
+	total = engine->stats.total;
+
+	/*
+	 * If the engine is executing something at the moment
+	 * add it to the total.
+	 */
+	if (engine->stats.ref)
+		total = ktime_add(total,
+				  ktime_sub(ktime_get(), engine->stats.start));
+
+	spin_unlock_irqrestore(&engine->stats.lock, flags);
+
+	return total;
+}
+
+/**
+ * intel_engines_get_busy_time() - Return current accumulated overall engine busyness
+ * @dev_priv: i915 device private
+ *
+ * Returns accumulated time all engines were busy since engine stats were
+ * enabled.
+ */
+ktime_t intel_engines_get_busy_time(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	ktime_t total = 0;
+
+	for_each_engine(engine, dev_priv, id)
+		total = ktime_add(total, intel_engine_get_busy_time(engine));
+
+	return total;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/mock_engine.c"
 #endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b61fb09024c3..00fcbde998fc 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -310,12 +310,14 @@ execlists_context_status_change(struct drm_i915_gem_request *rq,
 static inline void
 execlists_context_schedule_in(struct drm_i915_gem_request *rq)
 {
+	intel_engine_context_in(rq->engine);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
 }
 
 static inline void
 execlists_context_schedule_out(struct drm_i915_gem_request *rq)
 {
+	intel_engine_context_out(rq->engine);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index cf095b9386f4..f618c5f98edf 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -463,6 +463,34 @@ struct intel_engine_cs {
 	 * certain bits to encode the command length in the header).
 	 */
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
+
+	struct {
+		/**
+		 * @lock: Lock protecting the below fields.
+		 */
+		spinlock_t lock;
+		/**
+		 * @enabled: Reference count indicating number of listeners.
+		 */
+		unsigned int enabled;
+		/**
+		 * @ref: Number of contexts currently scheduled in.
+		 */
+		unsigned int ref;
+		/**
+		 * @start: Timestamp of the last idle to active transition.
+		 *
+		 * Idle is defined as ref == 0, active is ref > 0.
+		 */
+		ktime_t start;
+		/**
+		 * @total: Total time this engine was busy.
+		 *
+		 * Accumulated time not counting the most recent block in cases
+		 * where engine is currently busy (ref > 0).
+		 */
+		ktime_t total;
+	} stats;
 };
 
 static inline unsigned int
@@ -762,4 +790,57 @@ bool intel_engine_can_store_dword(struct intel_engine_cs *engine);
 struct intel_engine_cs *
 intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
 
+static inline void intel_engine_context_in(struct intel_engine_cs *engine)
+{
+	unsigned long flags;
+
+	if (READ_ONCE(engine->stats.enabled) == 0)
+		return;
+
+	spin_lock_irqsave(&engine->stats.lock, flags);
+
+	if (engine->stats.enabled > 0) {
+		if (engine->stats.ref++ == 0)
+			engine->stats.start = ktime_get();
+		GEM_BUG_ON(engine->stats.ref == 0);
+	}
+
+	spin_unlock_irqrestore(&engine->stats.lock, flags);
+}
+
+static inline void intel_engine_context_out(struct intel_engine_cs *engine)
+{
+	unsigned long flags;
+
+	if (READ_ONCE(engine->stats.enabled) == 0)
+		return;
+
+	spin_lock_irqsave(&engine->stats.lock, flags);
+
+	if (engine->stats.enabled > 0) {
+		/*
+		 * After turning on engine stats, context out might be the
+		 * first event which then needs to be ignored (ref == 0).
+		 */
+		if (engine->stats.ref && --engine->stats.ref == 0) {
+			ktime_t last = ktime_sub(ktime_get(),
+						 engine->stats.start);
+
+			engine->stats.total = ktime_add(engine->stats.total,
+							last);
+		}
+	}
+
+	spin_unlock_irqrestore(&engine->stats.lock, flags);
+}
+
+int intel_enable_engine_stats(struct intel_engine_cs *engine);
+void intel_disable_engine_stats(struct intel_engine_cs *engine);
+
+int intel_enable_engines_stats(struct drm_i915_private *dev_priv);
+void intel_disable_engines_stats(struct drm_i915_private *dev_priv);
+
+ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine);
+ktime_t intel_engines_get_busy_time(struct drm_i915_private *dev_priv);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
-- 
2.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC 08/11] drm/i915: Export engine busy stats in debugfs
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (6 preceding siblings ...)
  2017-09-11 15:25 ` [RFC 07/11] drm/i915: Engine busy time tracking Tvrtko Ursulin
@ 2017-09-11 15:25 ` Tvrtko Ursulin
  2017-09-14 20:17   ` Chris Wilson
  2017-09-11 15:25 ` [RFC 09/11] drm/i915/pmu: Wire up engine busy stats to PMU Tvrtko Ursulin
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-11 15:25 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Peter Zijlstra

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Export the stats added in the previous patch in debugfs.

The number of active clients reading this data is tracked and
stats collection is only enabled whilst there are some.

Userspace is intended to keep the file descriptor open, seeking
to the beginning of the file periodically, and re-reading the
stats.

This is because the underlying implementation is costly on every
first open/last close transition, and also because the exported
stats mostly make sense when considered relative to the previous
sample.

The file lists the number of nanoseconds each engine was active
since tracking started.
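
A hypothetical userspace reader along those lines (illustration only,
not part of the patch; the debugfs path assumes DRM minor 0):

#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
	char buf[4096];
	int fd = open("/sys/kernel/debug/dri/0/i915_engine_stats", O_RDONLY);

	if (fd < 0)
		return 1;

	for (;;) {
		ssize_t len;

		lseek(fd, 0, SEEK_SET);	/* rewind instead of re-opening */
		len = read(fd, buf, sizeof(buf) - 1);
		if (len <= 0)
			break;
		buf[len] = '\0';
		fwrite(buf, 1, (size_t)len, stdout);
		sleep(1);
	}

	close(fd);
	return 0;
}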

v2: Rebase.
v3: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 92 +++++++++++++++++++++++++++++++++++++
 1 file changed, 92 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 1fd777fb5e9e..4bb3970204a1 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4811,6 +4811,92 @@ static const struct file_operations i915_hpd_storm_ctl_fops = {
 	.write = i915_hpd_storm_ctl_write
 };
 
+struct i915_engine_stats_buf {
+	unsigned int len;
+	size_t available;
+	char buf[0];
+};
+
+static int i915_engine_stats_open(struct inode *inode, struct file *file)
+{
+	struct drm_i915_private *i915 = file->f_inode->i_private;
+	const unsigned int engine_name_len =
+		sizeof(((struct intel_engine_cs *)0)->name);
+	struct i915_engine_stats_buf *buf;
+	const unsigned int buf_size =
+		(engine_name_len + 2 + 19 + 1) * I915_NUM_ENGINES + 1 +
+		sizeof(*buf);
+	int ret;
+
+	buf = kzalloc(buf_size, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	ret = intel_enable_engines_stats(i915);
+	if (ret) {
+		kfree(buf);
+		return ret;
+	}
+
+	buf->len = buf_size;
+	file->private_data = buf;
+
+	return 0;
+}
+
+static int i915_engine_stats_release(struct inode *inode, struct file *file)
+{
+	struct drm_i915_private *i915 = file->f_inode->i_private;
+
+	intel_disable_engines_stats(i915);
+	kfree(file->private_data);
+
+	return 0;
+}
+
+static ssize_t i915_engine_stats_read(struct file *file, char __user *ubuf,
+				      size_t count, loff_t *pos)
+{
+	struct i915_engine_stats_buf *buf =
+		(struct i915_engine_stats_buf *)file->private_data;
+
+	if (*pos == 0) {
+		struct drm_i915_private *dev_priv = file->f_inode->i_private;
+		char *ptr = &buf->buf[0];
+		int left = buf->len;
+		struct intel_engine_cs *engine;
+		enum intel_engine_id id;
+
+		buf->available = 0;
+
+		for_each_engine(engine, dev_priv, id) {
+			u64 total =
+			       ktime_to_ns(intel_engine_get_busy_time(engine));
+			int len;
+
+			len = snprintf(ptr, left, "%s: %llu\n",
+				       engine->name, total);
+			if (len >= left)
+				return -EFBIG;
+
+			buf->available += len;
+			left -= len;
+			ptr += len;
+		}
+	}
+
+	return simple_read_from_buffer(ubuf, count, pos, &buf->buf[0],
+				       buf->available);
+}
+
+static const struct file_operations i915_engine_stats_fops = {
+	.owner = THIS_MODULE,
+	.open = i915_engine_stats_open,
+	.release = i915_engine_stats_release,
+	.read = i915_engine_stats_read,
+	.llseek = default_llseek,
+};
+
 static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_capabilities", i915_capabilities, 0},
 	{"i915_gem_objects", i915_gem_object_info, 0},
@@ -4901,6 +4987,12 @@ int i915_debugfs_register(struct drm_i915_private *dev_priv)
 	struct dentry *ent;
 	int ret, i;
 
+	ent = debugfs_create_file("i915_engine_stats", S_IRUGO,
+				  minor->debugfs_root, to_i915(minor->dev),
+				  &i915_engine_stats_fops);
+	if (!ent)
+		return -ENOMEM;
+
 	ent = debugfs_create_file("i915_forcewake_user", S_IRUSR,
 				  minor->debugfs_root, to_i915(minor->dev),
 				  &i915_forcewake_fops);
-- 
2.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC 09/11] drm/i915/pmu: Wire up engine busy stats to PMU
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (7 preceding siblings ...)
  2017-09-11 15:25 ` [RFC 08/11] drm/i915: Export engine busy stats in debugfs Tvrtko Ursulin
@ 2017-09-11 15:25 ` Tvrtko Ursulin
  2017-09-11 15:25 ` [RFC 10/11] drm/i915: Export engine stats API to other users Tvrtko Ursulin
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-11 15:25 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Peter Zijlstra

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We can use engine busy stats instead of the MMIO sampling timer
for better efficiency.

At a minimum this saves num_engines MMIO reads per sampling period,
and in the better case, when only engine busy samplers are active,
it enables us to not kick off the sampling timer at all.
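
For scale (assuming the default 200Hz sampling frequency and a part
with five engines - both assumptions, the exact numbers depend on the
platform), that is roughly 200 * 5 = 1000 MMIO reads per second
avoided once busy stats are used instead.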

v2: Rebase.
v3:
 * Rebase, comments.
 * Leave engine busyness controls out of workers.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_pmu.c         | 36 ++++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 ++++
 2 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 26e735f27282..f8a6195c17f1 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -90,6 +90,11 @@ static unsigned int event_enabled_bit(struct perf_event *event)
 	return config_enabled_bit(event->attr.config);
 }
 
+static bool supports_busy_stats(void)
+{
+	return i915.enable_execlists;
+}
+
 static bool pmu_needs_timer(struct drm_i915_private *i915, bool gpu_active)
 {
 	u64 enable = i915->pmu.enable;
@@ -100,6 +105,8 @@ static bool pmu_needs_timer(struct drm_i915_private *i915, bool gpu_active)
 
 	if (!gpu_active)
 		enable &= ~ENGINE_SAMPLE_MASK;
+	else if (supports_busy_stats())
+		enable &= ~BIT(I915_SAMPLE_BUSY);
 
 	return enable;
 }
@@ -163,7 +170,8 @@ static void engines_sample(struct drm_i915_private *dev_priv)
 		if (enable & BIT(I915_SAMPLE_QUEUED))
 			engine->pmu.sample[I915_SAMPLE_QUEUED] += PERIOD;
 
-		if (enable & BIT(I915_SAMPLE_BUSY)) {
+		if ((enable & BIT(I915_SAMPLE_BUSY)) &&
+		    !engine->pmu.busy_stats) {
 			u32 val;
 
 			fw = grab_forcewake(dev_priv, fw);
@@ -342,6 +350,9 @@ static u64 __i915_pmu_event_read(struct perf_event *event)
 
 		if (WARN_ON_ONCE(!engine)) {
 			/* Do nothing */
+		} else if (sample == I915_SAMPLE_BUSY &&
+			   engine->pmu.busy_stats) {
+			val = ktime_to_ns(intel_engine_get_busy_time(engine));
 		} else {
 			val = engine->pmu.sample[sample];
 		}
@@ -385,6 +396,12 @@ static void i915_pmu_event_read(struct perf_event *event)
 		    local64_read(&event->hw.prev_count));
 }
 
+static bool engine_needs_busy_stats(struct intel_engine_cs *engine)
+{
+	return supports_busy_stats() &&
+	       (engine->pmu.enable & BIT(I915_SAMPLE_BUSY));
+}
+
 static void i915_pmu_enable(struct perf_event *event)
 {
 	struct drm_i915_private *i915 =
@@ -429,7 +446,14 @@ static void i915_pmu_enable(struct perf_event *event)
 
 		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
 		GEM_BUG_ON(engine->pmu.enable_count[sample] == ~0);
-		engine->pmu.enable_count[sample]++;
+		if (engine->pmu.enable_count[sample]++ == 0) {
+			if (engine_needs_busy_stats(engine) &&
+			    !engine->pmu.busy_stats) {
+				engine->pmu.busy_stats =
+					intel_enable_engine_stats(engine) == 0;
+				WARN_ON_ONCE(!engine->pmu.busy_stats);
+			}
+		}
 	}
 
 	/*
@@ -465,8 +489,14 @@ static void i915_pmu_disable(struct perf_event *event)
 		 * Decrement the reference count and clear the enabled
 		 * bitmask when the last listener on an event goes away.
 		 */
-		if (--engine->pmu.enable_count[sample] == 0)
+		if (--engine->pmu.enable_count[sample] == 0) {
 			engine->pmu.enable &= ~BIT(sample);
+			if (!engine_needs_busy_stats(engine) &&
+			    engine->pmu.busy_stats) {
+				engine->pmu.busy_stats = false;
+				intel_disable_engine_stats(engine);
+			}
+		}
 	}
 
 	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index f618c5f98edf..fe554fc76867 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -265,6 +265,10 @@ struct intel_engine_cs {
 		 * Our internal timer stores the current counter in this field.
 		 */
 		u64 sample[I915_ENGINE_SAMPLE_MAX];
+		/**
+		 * @busy_stats: Has enablement of engine stats tracking been requested.
+		 */
+		bool busy_stats;
 	} pmu;
 
 	/*
-- 
2.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC 10/11] drm/i915: Export engine stats API to other users
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (8 preceding siblings ...)
  2017-09-11 15:25 ` [RFC 09/11] drm/i915/pmu: Wire up engine busy stats to PMU Tvrtko Ursulin
@ 2017-09-11 15:25 ` Tvrtko Ursulin
  2017-09-12 18:35   ` Ben Widawsky
                     ` (2 more replies)
  2017-09-11 15:25 ` [RFC 11/11] drm/i915: Gate engine stats collection with a static key Tvrtko Ursulin
                   ` (6 subsequent siblings)
  16 siblings, 3 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-11 15:25 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Ben Widawsky, Peter Zijlstra, Ben Widawsky

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Other kernel users might want to look at total GPU busyness
in order to implement things like package power distribution
algorithms more efficiently.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/intel_engine_cs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index f7dba176989c..e2152dd21b4a 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1495,6 +1495,7 @@ int intel_enable_engines_stats(struct drm_i915_private *dev_priv)
 
 	return ret;
 }
+EXPORT_SYMBOL(intel_enable_engines_stats);
 
 /**
  * intel_disable_engines_stats() - Disable engine busy tracking on all engines
@@ -1510,6 +1511,7 @@ void intel_disable_engines_stats(struct drm_i915_private *dev_priv)
 	for_each_engine(engine, dev_priv, id)
 		intel_disable_engine_stats(engine);
 }
+EXPORT_SYMBOL(intel_disable_engines_stats);
 
 /**
  * intel_engine_get_busy_time() - Return current accumulated engine busyness
@@ -1557,6 +1559,7 @@ ktime_t intel_engines_get_busy_time(struct drm_i915_private *dev_priv)
 
 	return total;
 }
+EXPORT_SYMBOL(intel_engines_get_busy_time);
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/mock_engine.c"
-- 
2.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC 11/11] drm/i915: Gate engine stats collection with a static key
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (9 preceding siblings ...)
  2017-09-11 15:25 ` [RFC 10/11] drm/i915: Export engine stats API to other users Tvrtko Ursulin
@ 2017-09-11 15:25 ` Tvrtko Ursulin
  2017-09-13 12:18   ` [RFC v3 " Tvrtko Ursulin
  2017-09-11 15:50 ` ✗ Fi.CI.BAT: warning for i915 PMU and engine busy stats (rev3) Patchwork
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-11 15:25 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Peter Zijlstra

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

This reduces the cost of the software engine busyness tracking
to a single no-op instruction when there are no listeners.
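
A minimal sketch of the mechanism (not the patch itself; all names
below are made up for illustration): the branch site is patched to a
nop until the first listener enables the key.

#include <linux/jump_label.h>

static DEFINE_STATIC_KEY_FALSE(example_stats_key);

static void example_expensive_accounting(void)
{
	/* Stand-in for the real stats collection. */
}

static inline void example_hook(void)
{
	/* Compiles down to a single nop while the key is disabled. */
	if (static_branch_unlikely(&example_stats_key))
		example_expensive_accounting();
}

/* The first listener enables the key, the last one disables it. */
static void example_first_listener(void)
{
	static_branch_enable(&example_stats_key);
}

static void example_last_listener(void)
{
	static_branch_disable(&example_stats_key);
}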

v2: Rebase and some comments.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_pmu.c         | 54 +++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_engine_cs.c  | 17 ++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.h | 70 ++++++++++++++++++++++-----------
 3 files changed, 113 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index f8a6195c17f1..bce4951c6065 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -447,11 +447,17 @@ static void i915_pmu_enable(struct perf_event *event)
 		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
 		GEM_BUG_ON(engine->pmu.enable_count[sample] == ~0);
 		if (engine->pmu.enable_count[sample]++ == 0) {
+			/*
+			 * Enable engine busy stats tracking if needed or
+			 * alternatively cancel the scheduled disabling of the
+			 * same.
+			 */
 			if (engine_needs_busy_stats(engine) &&
 			    !engine->pmu.busy_stats) {
-				engine->pmu.busy_stats =
-					intel_enable_engine_stats(engine) == 0;
-				WARN_ON_ONCE(!engine->pmu.busy_stats);
+				engine->pmu.busy_stats = true;
+				if (!cancel_delayed_work(&engine->pmu.disable_busy_stats))
+					queue_work(i915->wq,
+						&engine->pmu.enable_busy_stats);
 			}
 		}
 	}
@@ -494,7 +500,15 @@ static void i915_pmu_disable(struct perf_event *event)
 			if (!engine_needs_busy_stats(engine) &&
 			    engine->pmu.busy_stats) {
 				engine->pmu.busy_stats = false;
-				intel_disable_engine_stats(engine);
+				/*
+				 * We request a delayed disable to handle the
+				 * rapid on/off cycles on events which can
+				 * happen when tools like perf stat start in a
+				 * nicer way.
+				 */
+				queue_delayed_work(i915->wq,
+						   &engine->pmu.disable_busy_stats,
+						   round_jiffies_up_relative(HZ));
 			}
 		}
 	}
@@ -702,9 +716,27 @@ static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
 	return 0;
 }
 
+static void __enable_busy_stats(struct work_struct *work)
+{
+	struct intel_engine_cs *engine =
+		container_of(work, typeof(*engine), pmu.enable_busy_stats);
+
+	WARN_ON_ONCE(intel_enable_engine_stats(engine));
+}
+
+static void __disable_busy_stats(struct work_struct *work)
+{
+	struct intel_engine_cs *engine =
+	       container_of(work, typeof(*engine), pmu.disable_busy_stats.work);
+
+	intel_disable_engine_stats(engine);
+}
+
 void i915_pmu_register(struct drm_i915_private *i915)
 {
 	int ret = ENOTSUPP;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
 
 	if (INTEL_GEN(i915) <= 2)
 		goto err;
@@ -736,6 +768,12 @@ void i915_pmu_register(struct drm_i915_private *i915)
 	i915->pmu.timer.function = i915_sample;
 	i915->pmu.enable = 0;
 
+	for_each_engine(engine, i915, id) {
+		INIT_WORK(&engine->pmu.enable_busy_stats, __enable_busy_stats);
+		INIT_DELAYED_WORK(&engine->pmu.disable_busy_stats,
+				  __disable_busy_stats);
+	}
+
 	if (perf_pmu_register(&i915->pmu.base, "i915", -1))
 		i915->pmu.base.event_init = NULL;
 
@@ -745,6 +783,9 @@ void i915_pmu_register(struct drm_i915_private *i915)
 
 void i915_pmu_unregister(struct drm_i915_private *i915)
 {
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
 	if (!i915->pmu.base.event_init)
 		return;
 
@@ -755,6 +796,11 @@ void i915_pmu_unregister(struct drm_i915_private *i915)
 
 	hrtimer_cancel(&i915->pmu.timer);
 
+	for_each_engine(engine, i915, id) {
+		flush_work(&engine->pmu.enable_busy_stats);
+		flush_delayed_work(&engine->pmu.disable_busy_stats);
+	}
+
 	cpuhp_state_remove_instance(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
 				    &i915->pmu.node);
 }
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index e2152dd21b4a..e4a8eb83a018 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -21,6 +21,7 @@
  * IN THE SOFTWARE.
  *
  */
+#include <linux/static_key.h>
 
 #include "i915_drv.h"
 #include "intel_ringbuffer.h"
@@ -1419,6 +1420,10 @@ bool intel_engine_can_store_dword(struct intel_engine_cs *engine)
 	}
 }
 
+DEFINE_STATIC_KEY_FALSE(i915_engine_stats_key);
+static DEFINE_MUTEX(i915_engine_stats_mutex);
+static int i915_engine_stats_ref;
+
 /**
  * intel_enable_engine_stats() - Enable engine busy tracking on engine
  * @engine: engine to enable stats collection
@@ -1434,16 +1439,24 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
 	if (!i915.enable_execlists)
 		return -ENODEV;
 
+	mutex_lock(&i915_engine_stats_mutex);
+
 	spin_lock_irqsave(&engine->stats.lock, flags);
 	if (engine->stats.enabled == ~0)
 		goto busy;
 	engine->stats.enabled++;
 	spin_unlock_irqrestore(&engine->stats.lock, flags);
 
+	if (i915_engine_stats_ref++ == 0)
+		static_branch_enable(&i915_engine_stats_key);
+
+	mutex_unlock(&i915_engine_stats_mutex);
+
 	return 0;
 
 busy:
 	spin_unlock_irqrestore(&engine->stats.lock, flags);
+	mutex_unlock(&i915_engine_stats_mutex);
 
 	return -EBUSY;
 }
@@ -1461,6 +1474,7 @@ void intel_disable_engine_stats(struct intel_engine_cs *engine)
 	if (!i915.enable_execlists)
 		return;
 
+	mutex_lock(&i915_engine_stats_mutex);
 	spin_lock_irqsave(&engine->stats.lock, flags);
 	WARN_ON_ONCE(engine->stats.enabled == 0);
 	if (--engine->stats.enabled == 0) {
@@ -1468,6 +1482,9 @@ void intel_disable_engine_stats(struct intel_engine_cs *engine)
 		engine->stats.start = engine->stats.total = 0;
 	}
 	spin_unlock_irqrestore(&engine->stats.lock, flags);
+	if (--i915_engine_stats_ref == 0)
+		static_branch_disable(&i915_engine_stats_key);
+	mutex_unlock(&i915_engine_stats_mutex);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index fe554fc76867..65dea686fc7c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -269,6 +269,22 @@ struct intel_engine_cs {
 		 * @busy_stats: Has enablement of engine stats tracking been requested.
 		 */
 		bool busy_stats;
+		/**
+		 * @enable_busy_stats: Work item for engine busy stats enabling.
+		 *
+		 * Since the action can sleep it needs to be decoupled from the
+		 * perf API callback.
+		 */
+		struct work_struct enable_busy_stats;
+		/**
+		 * @disable_busy_stats: Work item for busy stats disabling.
+		 *
+		 * Same as with @enable_busy_stats action, with the difference
+		 * that we delay it in case there are rapid enable-disable
+		 * actions, which can happen during tool startup (like perf
+		 * stat).
+		 */
+		struct delayed_work disable_busy_stats;
 	} pmu;
 
 	/*
@@ -794,48 +810,54 @@ bool intel_engine_can_store_dword(struct intel_engine_cs *engine);
 struct intel_engine_cs *
 intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
 
+DECLARE_STATIC_KEY_FALSE(i915_engine_stats_key);
+
 static inline void intel_engine_context_in(struct intel_engine_cs *engine)
 {
 	unsigned long flags;
 
-	if (READ_ONCE(engine->stats.enabled) == 0)
-		return;
+	if (static_branch_unlikely(&i915_engine_stats_key)) {
+		if (READ_ONCE(engine->stats.enabled) == 0)
+			return;
 
-	spin_lock_irqsave(&engine->stats.lock, flags);
+		spin_lock_irqsave(&engine->stats.lock, flags);
 
-	if (engine->stats.enabled > 0) {
-		if (engine->stats.ref++ == 0)
-			engine->stats.start = ktime_get();
-		GEM_BUG_ON(engine->stats.ref == 0);
-	}
+		if (engine->stats.enabled > 0) {
+			if (engine->stats.ref++ == 0)
+				engine->stats.start = ktime_get();
+			GEM_BUG_ON(engine->stats.ref == 0);
+		}
 
-	spin_unlock_irqrestore(&engine->stats.lock, flags);
+		spin_unlock_irqrestore(&engine->stats.lock, flags);
+	}
 }
 
 static inline void intel_engine_context_out(struct intel_engine_cs *engine)
 {
 	unsigned long flags;
 
-	if (READ_ONCE(engine->stats.enabled) == 0)
-		return;
+	if (static_branch_unlikely(&i915_engine_stats_key)) {
+		if (READ_ONCE(engine->stats.enabled) == 0)
+			return;
 
-	spin_lock_irqsave(&engine->stats.lock, flags);
+		spin_lock_irqsave(&engine->stats.lock, flags);
 
-	if (engine->stats.enabled > 0) {
-		/*
-		 * After turning on engine stats, context out might be the
-		 * first event which then needs to be ignored (ref == 0).
-		 */
-		if (engine->stats.ref && --engine->stats.ref == 0) {
-			ktime_t last = ktime_sub(ktime_get(),
-						 engine->stats.start);
+		if (engine->stats.enabled > 0) {
+			/*
+			 * After turning on engine stats, context out might be
+			 * the first event which then needs to be ignored.
+			 */
+			if (engine->stats.ref && --engine->stats.ref == 0) {
+				ktime_t last = ktime_sub(ktime_get(),
+							 engine->stats.start);
 
-			engine->stats.total = ktime_add(engine->stats.total,
-							last);
+				engine->stats.total =
+					ktime_add(engine->stats.total, last);
+			}
 		}
-	}
 
-	spin_unlock_irqrestore(&engine->stats.lock, flags);
+		spin_unlock_irqrestore(&engine->stats.lock, flags);
+	}
 }
 
 int intel_enable_engine_stats(struct intel_engine_cs *engine);
-- 
2.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* ✗ Fi.CI.BAT: warning for i915 PMU and engine busy stats (rev3)
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (10 preceding siblings ...)
  2017-09-11 15:25 ` [RFC 11/11] drm/i915: Gate engine stats collection with a static key Tvrtko Ursulin
@ 2017-09-11 15:50 ` Patchwork
  2017-09-12  2:03 ` [RFC v3 00/11] i915 PMU and engine busy stats Rogozhkin, Dmitry V
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 56+ messages in thread
From: Patchwork @ 2017-09-11 15:50 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: i915 PMU and engine busy stats (rev3)
URL   : https://patchwork.freedesktop.org/series/27488/
State : warning

== Summary ==

Series 27488v3 i915 PMU and engine busy stats
https://patchwork.freedesktop.org/api/1.0/series/27488/revisions/3/mbox/

Test chamelium:
        Subgroup dp-crc-fast:
                fail       -> PASS       (fi-kbl-7500u) fdo#102514
Test kms_cursor_legacy:
        Subgroup basic-busy-flip-before-cursor-atomic:
                pass       -> FAIL       (fi-snb-2600) fdo#100215
Test drv_module_reload:
        Subgroup basic-reload:
                pass       -> DMESG-WARN (fi-blb-e6850)
                pass       -> DMESG-WARN (fi-pnv-d510)
                pass       -> DMESG-WARN (fi-elk-e7500)
                pass       -> DMESG-WARN (fi-byt-j1900)
                pass       -> DMESG-WARN (fi-byt-n2820)
                pass       -> DMESG-WARN (fi-bdw-gvtdvm)
                pass       -> DMESG-WARN (fi-bsw-n3050)
                pass       -> DMESG-WARN (fi-skl-gvtdvm)
                pass       -> DMESG-WARN (fi-bxt-j4205)
                pass       -> DMESG-WARN (fi-glk-2a)

fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
fdo#100215 https://bugs.freedesktop.org/show_bug.cgi?id=100215

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:433s
fi-bdw-gvtdvm    total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:455s
fi-blb-e6850     total:289  pass:223  dwarn:2   dfail:0   fail:0   skip:64  time:391s
fi-bsw-n3050     total:289  pass:242  dwarn:1   dfail:0   fail:0   skip:46  time:519s
fi-bwr-2160      total:289  pass:184  dwarn:0   dfail:0   fail:0   skip:105 time:269s
fi-bxt-j4205     total:289  pass:259  dwarn:1   dfail:0   fail:0   skip:29  time:516s
fi-byt-j1900     total:289  pass:253  dwarn:2   dfail:0   fail:0   skip:34  time:500s
fi-byt-n2820     total:289  pass:249  dwarn:2   dfail:0   fail:0   skip:38  time:494s
fi-cfl-s         total:289  pass:250  dwarn:4   dfail:0   fail:0   skip:35  time:438s
fi-elk-e7500     total:289  pass:229  dwarn:1   dfail:0   fail:0   skip:59  time:455s
fi-glk-2a        total:289  pass:259  dwarn:1   dfail:0   fail:0   skip:29  time:600s
fi-hsw-4770      total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:418s
fi-hsw-4770r     total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:403s
fi-ilk-650       total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:421s
fi-ivb-3520m     total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:470s
fi-ivb-3770      total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:450s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:472s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:563s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:569s
fi-pnv-d510      total:289  pass:222  dwarn:2   dfail:0   fail:0   skip:65  time:554s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:433s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:775s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:480s
fi-skl-gvtdvm    total:289  pass:265  dwarn:1   dfail:0   fail:0   skip:23  time:461s
fi-skl-x1585l    total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:463s
fi-snb-2520m     total:289  pass:251  dwarn:0   dfail:0   fail:0   skip:38  time:559s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:1   skip:39  time:415s

a06ff73a7522af231e77776d7948e4d8099558c6 drm-tip: 2017y-09m-11d-13h-31m-34s UTC integration manifest
eb6b9b46121e drm/i915: Gate engine stats collection with a static key
778c05f4eec0 drm/i915: Export engine stats API to other users
343cf1aab715 drm/i915/pmu: Wire up engine busy stats to PMU
969f6a0a25f4 drm/i915: Export engine busy stats in debugfs
78bcff128697 drm/i915: Engine busy time tracking
fb07b848e437 drm/i915: Wrap context schedule notification
1f45a032a12f drm/i915/pmu: Suspend sampling when GPU is idle
435f23ab7d32 drm/i915/pmu: Expose a PMU interface for perf queries
88d9fe7089ff drm/i915: Extract intel_get_cagf
86b319679791 drm/i915: Add intel_energy_uJ
52be2d634841 drm/i915: Convert intel_rc6_residency_us to ns

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5646/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC v3 00/11] i915 PMU and engine busy stats
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (11 preceding siblings ...)
  2017-09-11 15:50 ` ✗ Fi.CI.BAT: warning for i915 PMU and engine busy stats (rev3) Patchwork
@ 2017-09-12  2:03 ` Rogozhkin, Dmitry V
  2017-09-12 14:54   ` Tvrtko Ursulin
  2017-09-13  9:34 ` ✗ Fi.CI.BAT: warning for i915 PMU and engine busy stats (rev4) Patchwork
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 56+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-09-12  2:03 UTC (permalink / raw)
  To: tursulin; +Cc: peterz, Intel-gfx

Hi,

Just tried the v3 series. perf-stat works fine. Of the IGT tests which I wrote for i915 PMU
(https://patchwork.freedesktop.org/series/29313/) all pass (assuming pmu.enabled will be exposed
in debugfs) except the cpu_online subtest. And this is pretty interesting - see details below.

Ok, be prepared for the long mail:)...

So, the cpu_online subtest loads the RCS0 engine to 100% and starts to put CPUs offline one by one
starting from CPU0 (don't forget to have CONFIG_BOOTPARAM_HOTPLUG_CPU0=y in .config). The test
expectation is that the i915 PMU will report almost 100% (with 2% tolerance) busyness of RCS0. The
test requests the metric just twice: before running on RCS0 and right after. This fails as follows:

Executed on rcs0 for 32004262184us
  i915/rcs0-busy/: 2225723999us
(perf_pmu:6325) CRITICAL: Test assertion failure function test_cpu_online, file perf_pmu.c:719:
(perf_pmu:6325) CRITICAL: Failed assertion: perf_elapsed(&pm.metrics[0]) > (1-USAGE_TOLERANCE) * elapsed_ns(&start, &now)
Stack trace:
  #0 [__igt_fail_assert+0xf1]
  #1 [__real_main773+0xff1]
  #2 [main+0x35]
  #3 [__libc_start_main+0xf5]
  #4 [_start+0x29]
  #5 [<unknown>+0x29]
Subtest cpu_online failed.
**** DEBUG ****
(perf_pmu:6325) DEBUG: Test requirement passed: is_hotplug_cpu0()
(perf_pmu:6325) INFO: perf_init: enabled 1 metrics from 1 requested
(perf_pmu:6325) ioctl-wrappers-DEBUG: Test requirement passed: gem_has_ring(fd, ring)
(perf_pmu:6325) INFO: Executed on rcs0 for 32004262184us
(perf_pmu:6325) INFO:   i915/rcs0-busy/: 2225723999us
(perf_pmu:6325) CRITICAL: Test assertion failure function test_cpu_online, file perf_pmu.c:719:
(perf_pmu:6325) CRITICAL: Failed assertion: perf_elapsed(&pm.metrics[0]) > (1-USAGE_TOLERANCE) * elapsed_ns(&start, &now)

Now, it looks like PMU context migration by itself works. For example, if you comment out
"perf_pmu_migrate_context(&pmu->base, cpu, target)" you will get:

    Executed on rcs0 for 32004434918us
      i915/rcs0-busy/:     76623707us

Compare with previous:
    Executed on rcs0 for 32004262184us
      i915/rcs0-busy/:    2225723999us

This test passed on the previous set of patches, I mean Tvrtko's v2 series + my patches.

So, it seems we are losing counter values somehow. I saw in the patches that this code was indeed modified - you
have added subtraction of the initial counter value:
static void i915_pmu_event_read(struct perf_event *event)
{

	local64_set(&event->count,
		    __i915_pmu_event_read(event) -
		    local64_read(&event->hw.prev_count));
}
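
For comparison, a sketch of the delta-accumulating read most uncore
PMU drivers use (the function name here is made up; it is only an
illustration written against __i915_pmu_event_read()): every sample
is folded into event->count instead of relying on a prev_count
snapshot taken once at enable time.

static void i915_pmu_event_read_delta(struct perf_event *event)
{
	u64 prev, val;

	do {
		prev = local64_read(&event->hw.prev_count);
		val = __i915_pmu_event_read(event);
	} while (local64_cmpxchg(&event->hw.prev_count, prev, val) != prev);

	local64_add(val - prev, &event->count);
}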

But it looks like the problem is that with the PMU context migration we get a sequence of start/stop events (or maybe
add/del) which eventually call our i915_pmu_enable/disable. Here is the dmesg log with an obvious printk added:

[  153.971096] [IGT] perf_pmu: starting subtest cpu_online
[  153.971151] i915_pmu_enable: event->hw.prev_count=0
[  154.036015] i915_pmu_disable: event->hw.prev_count=0
[  154.048027] i915_pmu_enable: event->hw.prev_count=0
[  154.049343] smpboot: CPU 0 is now offline
[  155.059028] smpboot: Booting Node 0 Processor 0 APIC 0x0
[  155.155078] smpboot: CPU 1 is now offline
[  156.161026] x86: Booting SMP configuration:
[  156.161027] smpboot: Booting Node 0 Processor 1 APIC 0x2
[  156.197065] IRQ 122: no longer affine to CPU2
[  156.198087] smpboot: CPU 2 is now offline
[  157.208028] smpboot: Booting Node 0 Processor 2 APIC 0x4
[  157.263093] smpboot: CPU 3 is now offline
[  158.273026] smpboot: Booting Node 0 Processor 3 APIC 0x6
[  158.310026] i915_pmu_disable: event->hw.prev_count=76648307
[  158.319020] i915_pmu_enable: event->hw.prev_count=76648307
[  158.319098] IRQ 124: no longer affine to CPU4
[  158.320368] smpboot: CPU 4 is now offline
[  159.326030] smpboot: Booting Node 0 Processor 4 APIC 0x1
[  159.365306] smpboot: CPU 5 is now offline
[  160.371030] smpboot: Booting Node 0 Processor 5 APIC 0x3
[  160.421077] IRQ 125: no longer affine to CPU6
[  160.422093] smpboot: CPU 6 is now offline
[  161.429030] smpboot: Booting Node 0 Processor 6 APIC 0x5
[  161.467091] smpboot: CPU 7 is now offline
[  162.473027] smpboot: Booting Node 0 Processor 7 APIC 0x7
[  162.527019] i915_pmu_disable: event->hw.prev_count=4347548222
[  162.546017] i915_pmu_enable: event->hw.prev_count=4347548222
[  162.547317] smpboot: CPU 0 is now offline
[  163.553028] smpboot: Booting Node 0 Processor 0 APIC 0x0
[  163.621089] smpboot: CPU 1 is now offline
[  164.627028] x86: Booting SMP configuration:
[  164.627029] smpboot: Booting Node 0 Processor 1 APIC 0x2
[  164.669308] smpboot: CPU 2 is now offline
[  165.679025] smpboot: Booting Node 0 Processor 2 APIC 0x4
[  165.717089] smpboot: CPU 3 is now offline
[  166.723025] smpboot: Booting Node 0 Processor 3 APIC 0x6
[  166.775016] i915_pmu_disable: event->hw.prev_count=8574197312
[  166.787016] i915_pmu_enable: event->hw.prev_count=8574197312
[  166.788309] smpboot: CPU 4 is now offline
[  167.794025] smpboot: Booting Node 0 Processor 4 APIC 0x1
[  167.837114] smpboot: CPU 5 is now offline
[  168.847025] smpboot: Booting Node 0 Processor 5 APIC 0x3
[  168.889312] smpboot: CPU 6 is now offline
[  169.899030] smpboot: Booting Node 0 Processor 6 APIC 0x5
[  169.944104] smpboot: CPU 7 is now offline
[  170.954032] smpboot: Booting Node 0 Processor 7 APIC 0x7
[  171.000016] i915_pmu_disable: event->hw.prev_count=12815138319
[  171.008017] i915_pmu_enable: event->hw.prev_count=12815138319
[  171.009304] smpboot: CPU 0 is now offline
[  172.017028] smpboot: Booting Node 0 Processor 0 APIC 0x0
[  172.096104] smpboot: CPU 1 is now offline
[  173.106025] x86: Booting SMP configuration:
[  173.106026] smpboot: Booting Node 0 Processor 1 APIC 0x2
[  173.147078] smpboot: CPU 2 is now offline
[  174.153025] smpboot: Booting Node 0 Processor 2 APIC 0x4
[  174.192093] smpboot: CPU 3 is now offline
[  175.198028] smpboot: Booting Node 0 Processor 3 APIC 0x6
[  175.229042] i915_pmu_disable: event->hw.prev_count=17035889079
[  175.242030] i915_pmu_enable: event->hw.prev_count=17035889079
[  175.242163] IRQ fixup: irq 120 move in progress, old vector 131
[  175.242165] IRQ fixup: irq 121 move in progress, old vector 147
[  175.242171] IRQ 124: no longer affine to CPU4
[  175.243435] smpboot: CPU 4 is now offline
[  176.248040] smpboot: Booting Node 0 Processor 4 APIC 0x1
[  176.285328] smpboot: CPU 5 is now offline
[  177.296039] smpboot: Booting Node 0 Processor 5 APIC 0x3
[  177.325067] IRQ 125: no longer affine to CPU6
[  177.326087] smpboot: CPU 6 is now offline
[  178.335036] smpboot: Booting Node 0 Processor 6 APIC 0x5
[  178.377063] IRQ 122: no longer affine to CPU7
[  178.378086] smpboot: CPU 7 is now offline
[  179.388028] smpboot: Booting Node 0 Processor 7 APIC 0x7
[  179.454030] i915_pmu_disable: event->hw.prev_count=21269856967
[  179.470026] i915_pmu_enable: event->hw.prev_count=21269856967
[  179.471110] smpboot: CPU 0 is now offline
[  180.481028] smpboot: Booting Node 0 Processor 0 APIC 0x0
[  180.551075] smpboot: CPU 1 is now offline
[  181.558029] x86: Booting SMP configuration:
[  181.558030] smpboot: Booting Node 0 Processor 1 APIC 0x2
[  181.595096] smpboot: CPU 2 is now offline
[  182.605029] smpboot: Booting Node 0 Processor 2 APIC 0x4
[  182.657084] smpboot: CPU 3 is now offline
[  183.668030] smpboot: Booting Node 0 Processor 3 APIC 0x6
[  183.709017] i915_pmu_disable: event->hw.prev_count=25497358644
[  183.727016] i915_pmu_enable: event->hw.prev_count=25497358644
[  183.728305] smpboot: CPU 4 is now offline
[  184.734027] smpboot: Booting Node 0 Processor 4 APIC 0x1
[  184.767090] smpboot: CPU 5 is now offline
[  185.777036] smpboot: Booting Node 0 Processor 5 APIC 0x3
[  185.823096] smpboot: CPU 6 is now offline
[  186.829051] smpboot: Booting Node 0 Processor 6 APIC 0x5
[  186.856350] smpboot: CPU 7 is now offline
[  187.862051] smpboot: Booting Node 0 Processor 7 APIC 0x7
[  187.871216] [IGT] perf_pmu: exiting, ret=99
[  187.889199] Console: switching to colour frame buffer device 240x67
[  187.889583] i915_pmu_disable: event->hw.prev_count=29754080941

And the result which I got in userspace for this run were
    Executed on rcs0 for 32003587981us
      i915/rcs0-busy/: 2247436461us

After that I decided to roll back the counter accounting change which I mentioned before, i.e.:
static void i915_pmu_event_read(struct perf_event *event)
{

	local64_set(&event->count,
		    __i915_pmu_event_read(event) /*-
		    local64_read(&event->hw.prev_count)*/);
}

And I got test PASSED :):
    Executed on rcs0 for 32002282603us
      i915/rcs0-busy/: 31998855052us
    Subtest cpu_online: SUCCESS (33.950s)

At this point I need to go home :). Maybe you will have time to look into this issue? If not, I will continue
tomorrow.

Regards,
Dmitry.


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 04/11] drm/i915/pmu: Expose a PMU interface for perf queries
  2017-09-11 15:25 ` [RFC 04/11] drm/i915/pmu: Expose a PMU interface for perf queries Tvrtko Ursulin
@ 2017-09-12  2:06   ` Rogozhkin, Dmitry V
  2017-09-12 14:59     ` Tvrtko Ursulin
  2017-09-14 19:46   ` [RFC " Chris Wilson
  1 sibling, 1 reply; 56+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-09-12  2:06 UTC (permalink / raw)
  To: tursulin; +Cc: peterz, Intel-gfx

On Mon, 2017-09-11 at 16:25 +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> From: Chris Wilson <chris@chris-wilson.co.uk>
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> 
> The first goal is to be able to measure GPU (and individual ring) busyness
> without having to poll registers from userspace. (Which not only incurs
> holding the forcewake lock indefinitely, perturbing the system, but also
> runs the risk of hanging the machine.) As an alternative we can use the
> perf event counter interface to sample the ring registers periodically
> and send those results to userspace.
> 
> To be able to do so, we need to export the two symbols from
> kernel/events/core.c to register and unregister a PMU device.
> 
> v1-v2 (Chris Wilson):
> 
> v2: Use a common timer for the ring sampling.
> 
> v3: (Tvrtko Ursulin)
>  * Decouple uAPI from i915 engine ids.
>  * Complete uAPI defines.
>  * Refactor some code to helpers for clarity.
>  * Skip sampling disabled engines.
>  * Expose counters in sysfs.
>  * Pass in fake regs to avoid null ptr deref in perf core.
>  * Convert to class/instance uAPI.
>  * Use shared driver code for rc6 residency, power and frequency.
> 
> v4: (Dmitry Rogozhkin)
>  * Register PMU with .task_ctx_nr=perf_invalid_context
>  * Expose cpumask for the PMU with the single CPU in the mask
>  * Properly support pmu->stop(): it should call pmu->read()
>  * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE)
>  * Make pmu.busy_stats a refcounter to avoid busy stats going away
>    with some deleted event.

busy_stats appears later in the patch series. And in your final version
busy_stats remains a bool while we rely on event refcounting. So, this
item is misleading. Could you please change it, giving credit to the
general ref-counting of events which you rewrote for v5 of this
patch?

>  * Expose cpumask for i915 PMU to avoid multiple events creation of
>    the same type followed by counter aggregation by perf-stat.
>  * Track CPUs getting online/offline to migrate perf context. If (likely)
>    cpumask will initially set CPU0, CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be
>    needed to see effect of CPU status tracking.
>  * End result is that only global events are supported and perf stat
>    works correctly.
>  * Deny perf driver level sampling - it is prohibited for uncore PMU.
> 
> v5: (Tvrtko Ursulin)
> 
>  * Don't hardcode number of engine samplers.
>  * Rewrite event ref-counting for correctness and simplicity.
>  * Store initial counter value when starting already enabled events
>    to correctly report values to all listeners.
>  * Fix RC6 residency readout.
>  * Comments, GPL header.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> ---
>  drivers/gpu/drm/i915/Makefile           |   1 +
>  drivers/gpu/drm/i915/i915_drv.c         |   2 +
>  drivers/gpu/drm/i915/i915_drv.h         |  76 ++++
>  drivers/gpu/drm/i915/i915_pmu.c         | 686 ++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_reg.h         |   3 +
>  drivers/gpu/drm/i915/intel_engine_cs.c  |  10 +
>  drivers/gpu/drm/i915/intel_ringbuffer.c |  25 ++
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  25 ++
>  include/uapi/drm/i915_drm.h             |  58 +++
>  9 files changed, 886 insertions(+)
>  create mode 100644 drivers/gpu/drm/i915/i915_pmu.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 1cb8059a3a16..7b3a0eca62b6 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -26,6 +26,7 @@ i915-y := i915_drv.o \
>  
>  i915-$(CONFIG_COMPAT)   += i915_ioc32.o
>  i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o intel_pipe_crc.o
> +i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
>  
>  # GEM code
>  i915-y += i915_cmd_parser.o \
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 5c111ea96e80..b1f96eb1be16 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1196,6 +1196,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
>  	struct drm_device *dev = &dev_priv->drm;
>  
>  	i915_gem_shrinker_init(dev_priv);
> +	i915_pmu_register(dev_priv);
>  
>  	/*
>  	 * Notify a valid surface after modesetting,
> @@ -1250,6 +1251,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
>  	intel_opregion_unregister(dev_priv);
>  
>  	i915_perf_unregister(dev_priv);
> +	i915_pmu_unregister(dev_priv);
>  
>  	i915_teardown_sysfs(dev_priv);
>  	i915_guc_log_unregister(dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 48daf9552163..62646b8dfb7a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -40,6 +40,7 @@
>  #include <linux/hash.h>
>  #include <linux/intel-iommu.h>
>  #include <linux/kref.h>
> +#include <linux/perf_event.h>
>  #include <linux/pm_qos.h>
>  #include <linux/reservation.h>
>  #include <linux/shmem_fs.h>
> @@ -2190,6 +2191,69 @@ struct intel_cdclk_state {
>  	unsigned int cdclk, vco, ref;
>  };
>  
> +enum {
> +	__I915_SAMPLE_FREQ_ACT = 0,
> +	__I915_SAMPLE_FREQ_REQ,
> +	__I915_NUM_PMU_SAMPLERS
> +};
> +
> +/**
> + * How many different events we track in the global PMU mask.
> + *
> + * It is also used to know to needed number of event reference counters.
> + */
> +#define I915_PMU_MASK_BITS \
> +	(1 << I915_PMU_SAMPLE_BITS) + (I915_PMU_LAST + 1 - __I915_PMU_OTHER(0))
> +
> +struct i915_pmu {
> +	/**
> +	 * @node: List node for CPU hotplug handling.
> +	 */
> +	struct hlist_node node;
> +	/**
> +	 * @base: PMU base.
> +	 */
> +	struct pmu base;
> +	/**
> +	 * @lock: Lock protecting enable mask and ref count handling.
> +	 */
> +	spinlock_t lock;
> +	/**
> +	 * @timer: Timer for internal i915 PMU sampling.
> +	 */
> +	struct hrtimer timer;
> +	/**
> +	 * @enable: Bitmask of all currently enabled events.
> +	 *
> +	 * Bits are derived from uAPI event numbers in a way that low 16 bits
> +	 * correspond to engine event _sample_ _type_ (I915_SAMPLE_QUEUED is
> +	 * bit 0), and higher bits correspond to other events (for instance
> +	 * I915_PMU_ACTUAL_FREQUENCY is bit 16 etc).
> +	 *
> +	 * In other words, low 16 bits are not per engine but per engine
> +	 * sampler type, while the upper bits are directly mapped to other
> +	 * event types.
> +	 */
> +	u64 enable;
> +	/**
> +	 * @enable_count: Reference counts for the enabled events.
> +	 *
> +	 * Array indices are mapped in the same way as bits in the @enable field
> +	 * and they are used to control sampling on/off when multiple clients
> +	 * are using the PMU API.
> +	 */
> +	unsigned int enable_count[I915_PMU_MASK_BITS];
> +	/**
> +	 * @sample: Current counter value for i915 events which need sampling.
> +	 *
> +	 * These counters are updated from the i915 PMU sampling timer.
> +	 *
> +	 * Only global counters are held here, while the per-engine ones are in
> +	 * struct intel_engine_cs.
> +	 */
> +	u64 sample[__I915_NUM_PMU_SAMPLERS];
> +};
> +
>  struct drm_i915_private {
>  	struct drm_device drm;
>  
> @@ -2238,6 +2302,7 @@ struct drm_i915_private {
>  	struct pci_dev *bridge_dev;
>  	struct i915_gem_context *kernel_context;
>  	struct intel_engine_cs *engine[I915_NUM_ENGINES];
> +	struct intel_engine_cs *engine_class[MAX_ENGINE_CLASS + 1][MAX_ENGINE_INSTANCE + 1];
>  	struct i915_vma *semaphore;
>  
>  	struct drm_dma_handle *status_page_dmah;
> @@ -2698,6 +2763,8 @@ struct drm_i915_private {
>  		int	irq;
>  	} lpe_audio;
>  
> +	struct i915_pmu pmu;
> +
>  	/*
>  	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
>  	 * will be rejected. Instead look for a better place.
> @@ -3918,6 +3985,15 @@ extern void i915_perf_fini(struct drm_i915_private *dev_priv);
>  extern void i915_perf_register(struct drm_i915_private *dev_priv);
>  extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
>  
> +/* i915_pmu.c */
> +#ifdef CONFIG_PERF_EVENTS
> +extern void i915_pmu_register(struct drm_i915_private *i915);
> +extern void i915_pmu_unregister(struct drm_i915_private *i915);
> +#else
> +static inline void i915_pmu_register(struct drm_i915_private *i915) {}
> +static inline void i915_pmu_unregister(struct drm_i915_private *i915) {}
> +#endif
> +
>  /* i915_suspend.c */
>  extern int i915_save_state(struct drm_i915_private *dev_priv);
>  extern int i915_restore_state(struct drm_i915_private *dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> new file mode 100644
> index 000000000000..2ec892e57143
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -0,0 +1,686 @@
> +/*
> + * Copyright © 2017 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +
> +#include <linux/perf_event.h>
> +#include <linux/pm_runtime.h>
> +
> +#include "i915_drv.h"
> +#include "intel_ringbuffer.h"
> +
> +/* Frequency for the sampling timer for events which need it. */
> +#define FREQUENCY 200
> +#define PERIOD max_t(u64, 10000, NSEC_PER_SEC / FREQUENCY)
> +
> +#define ENGINE_SAMPLE_MASK \
> +	(BIT(I915_SAMPLE_QUEUED) | \
> +	 BIT(I915_SAMPLE_BUSY) | \
> +	 BIT(I915_SAMPLE_WAIT) | \
> +	 BIT(I915_SAMPLE_SEMA))
> +
> +#define ENGINE_SAMPLE_BITS (1 << I915_PMU_SAMPLE_BITS)
> +
> +static cpumask_t i915_pmu_cpumask = CPU_MASK_NONE;
> +
> +static u8 engine_config_sample(u64 config)
> +{
> +	return config & I915_PMU_SAMPLE_MASK;
> +}
> +
> +static u8 engine_event_sample(struct perf_event *event)
> +{
> +	return engine_config_sample(event->attr.config);
> +}
> +
> +static u8 engine_event_class(struct perf_event *event)
> +{
> +	return (event->attr.config >> I915_PMU_CLASS_SHIFT) & 0xff;
> +}
> +
> +static u8 engine_event_instance(struct perf_event *event)
> +{
> +	return (event->attr.config >> I915_PMU_SAMPLE_BITS) & 0xff;
> +}
> +
> +static bool is_engine_config(u64 config)
> +{
> +	return config < __I915_PMU_OTHER(0);
> +}
> +
> +static unsigned int config_enabled_bit(u64 config)
> +{
> +	if (is_engine_config(config))
> +		return engine_config_sample(config);
> +	else
> +		return ENGINE_SAMPLE_BITS + (config - __I915_PMU_OTHER(0));
> +}
> +
> +static u64 config_enabled_mask(u64 config)
> +{
> +	return BIT_ULL(config_enabled_bit(config));
> +}
> +
> +static bool is_engine_event(struct perf_event *event)
> +{
> +	return is_engine_config(event->attr.config);
> +}
> +
> +static unsigned int event_enabled_bit(struct perf_event *event)
> +{
> +	return config_enabled_bit(event->attr.config);
> +}
> +
> +static bool grab_forcewake(struct drm_i915_private *i915, bool fw)
> +{
> +	if (!fw)
> +		intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
> +
> +	return true;
> +}
> +
> +static void engines_sample(struct drm_i915_private *dev_priv)
> +{
> +	struct intel_engine_cs *engine;
> +	enum intel_engine_id id;
> +	bool fw = false;
> +
> +	if ((dev_priv->pmu.enable & ENGINE_SAMPLE_MASK) == 0)
> +		return;
> +
> +	if (!dev_priv->gt.awake)
> +		return;
> +
> +	if (!intel_runtime_pm_get_if_in_use(dev_priv))
> +		return;
> +
> +	for_each_engine(engine, dev_priv, id) {
> +		u32 enable = engine->pmu.enable;
> +
> +		if (i915_seqno_passed(intel_engine_get_seqno(engine),
> +				      intel_engine_last_submit(engine)))
> +			continue;
> +
> +		if (enable & BIT(I915_SAMPLE_QUEUED))
> +			engine->pmu.sample[I915_SAMPLE_QUEUED] += PERIOD;
> +
> +		if (enable & BIT(I915_SAMPLE_BUSY)) {
> +			u32 val;
> +
> +			fw = grab_forcewake(dev_priv, fw);
> +			val = I915_READ_FW(RING_MI_MODE(engine->mmio_base));
> +			if (!(val & MODE_IDLE))
> +				engine->pmu.sample[I915_SAMPLE_BUSY] += PERIOD;
> +		}
> +
> +		if (enable & (BIT(I915_SAMPLE_WAIT) | BIT(I915_SAMPLE_SEMA))) {
> +			u32 val;
> +
> +			fw = grab_forcewake(dev_priv, fw);
> +			val = I915_READ_FW(RING_CTL(engine->mmio_base));
> +			if ((enable & BIT(I915_SAMPLE_WAIT)) &&
> +			    (val & RING_WAIT))
> +				engine->pmu.sample[I915_SAMPLE_WAIT] += PERIOD;
> +			if ((enable & BIT(I915_SAMPLE_SEMA)) &&
> +			    (val & RING_WAIT_SEMAPHORE))
> +				engine->pmu.sample[I915_SAMPLE_SEMA] += PERIOD;
> +		}
> +	}
> +
> +	if (fw)
> +		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> +	intel_runtime_pm_put(dev_priv);
> +}
> +
> +static void frequency_sample(struct drm_i915_private *dev_priv)
> +{
> +	if (dev_priv->pmu.enable &
> +	    config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY)) {
> +		u64 val;
> +
> +		val = dev_priv->rps.cur_freq;
> +		if (dev_priv->gt.awake &&
> +		    intel_runtime_pm_get_if_in_use(dev_priv)) {
> +			val = intel_get_cagf(dev_priv,
> +					     I915_READ_NOTRACE(GEN6_RPSTAT1));
> +			intel_runtime_pm_put(dev_priv);
> +		}
> +		val = intel_gpu_freq(dev_priv, val);
> +		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_ACT] += val * PERIOD;
> +	}
> +
> +	if (dev_priv->pmu.enable &
> +	    config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY)) {
> +		u64 val = intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq);
> +		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_REQ] += val * PERIOD;
> +	}
> +}
> +
> +static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(hrtimer, struct drm_i915_private, pmu.timer);
> +
> +	if (i915->pmu.enable == 0)
> +		return HRTIMER_NORESTART;
> +
> +	engines_sample(i915);
> +	frequency_sample(i915);
> +
> +	hrtimer_forward_now(hrtimer, ns_to_ktime(PERIOD));
> +	return HRTIMER_RESTART;
> +}
> +
> +static u64 count_interrupts(struct drm_i915_private *i915)
> +{
> +	/* open-coded kstat_irqs() */
> +	struct irq_desc *desc = irq_to_desc(i915->drm.pdev->irq);
> +	u64 sum = 0;
> +	int cpu;
> +
> +	if (!desc || !desc->kstat_irqs)
> +		return 0;
> +
> +	for_each_possible_cpu(cpu)
> +		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
> +
> +	return sum;
> +}
> +
> +static void i915_pmu_event_destroy(struct perf_event *event)
> +{
> +	WARN_ON(event->parent);
> +}
> +
> +static int engine_event_init(struct perf_event *event)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(event->pmu, typeof(*i915), pmu.base);
> +
> +	if (!intel_engine_lookup_user(i915, engine_event_class(event),
> +				      engine_event_instance(event)))
> +		return -ENODEV;
> +
> +	switch (engine_event_sample(event)) {
> +	case I915_SAMPLE_QUEUED:
> +	case I915_SAMPLE_BUSY:
> +	case I915_SAMPLE_WAIT:
> +		break;
> +	case I915_SAMPLE_SEMA:
> +		if (INTEL_GEN(i915) < 6)
> +			return -ENODEV;
> +		break;
> +	default:
> +		return -ENOENT;
> +	}
> +
> +	return 0;
> +}
> +
> +static int i915_pmu_event_init(struct perf_event *event)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(event->pmu, typeof(*i915), pmu.base);
> +	int cpu, ret;
> +
> +	if (event->attr.type != event->pmu->type)
> +		return -ENOENT;
> +
> +	/* unsupported modes and filters */
> +	if (event->attr.sample_period) /* no sampling */
> +		return -EINVAL;
> +
> +	if (has_branch_stack(event))
> +		return -EOPNOTSUPP;
> +
> +	if (event->cpu < 0)
> +		return -EINVAL;
> +
> +	cpu = cpumask_any_and(&i915_pmu_cpumask,
> +			      topology_sibling_cpumask(event->cpu));
> +	if (cpu >= nr_cpu_ids)
> +		return -ENODEV;
> +
> +	ret = 0;
> +	if (is_engine_event(event)) {
> +		ret = engine_event_init(event);
> +	} else switch (event->attr.config) {
> +	case I915_PMU_ACTUAL_FREQUENCY:
> +		if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
> +			ret = -ENODEV; /* requires a mutex for sampling! */
> +	case I915_PMU_REQUESTED_FREQUENCY:
> +	case I915_PMU_ENERGY:
> +	case I915_PMU_RC6_RESIDENCY:
> +	case I915_PMU_RC6p_RESIDENCY:
> +	case I915_PMU_RC6pp_RESIDENCY:
> +		if (INTEL_GEN(i915) < 6)
> +			ret = -ENODEV;
> +		break;
> +	}
> +	if (ret)
> +		return ret;
> +
> +	event->cpu = cpu;
> +	if (!event->parent)
> +		event->destroy = i915_pmu_event_destroy;
> +
> +	return 0;
> +}
> +
> +static u64 __i915_pmu_event_read(struct perf_event *event)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(event->pmu, typeof(*i915), pmu.base);
> +	u64 val = 0;
> +
> +	if (is_engine_event(event)) {
> +		u8 sample = engine_event_sample(event);
> +		struct intel_engine_cs *engine;
> +
> +		engine = intel_engine_lookup_user(i915,
> +						  engine_event_class(event),
> +						  engine_event_instance(event));
> +
> +		if (WARN_ON_ONCE(!engine)) {
> +			/* Do nothing */
> +		} else {
> +			val = engine->pmu.sample[sample];
> +		}
> +	} else switch (event->attr.config) {
> +	case I915_PMU_ACTUAL_FREQUENCY:
> +		val = i915->pmu.sample[__I915_SAMPLE_FREQ_ACT];
> +		break;
> +	case I915_PMU_REQUESTED_FREQUENCY:
> +		val = i915->pmu.sample[__I915_SAMPLE_FREQ_REQ];
> +		break;
> +	case I915_PMU_ENERGY:
> +		val = intel_energy_uJ(i915);
> +		break;
> +	case I915_PMU_INTERRUPTS:
> +		val = count_interrupts(i915);
> +		break;
> +	case I915_PMU_RC6_RESIDENCY:
> +		val = intel_rc6_residency_ns(i915,
> +					     IS_VALLEYVIEW(i915) ?
> +					     VLV_GT_RENDER_RC6 :
> +					     GEN6_GT_GFX_RC6);
> +		break;
> +	case I915_PMU_RC6p_RESIDENCY:
> +		if (!IS_VALLEYVIEW(i915))
> +			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
> +		break;
> +	case I915_PMU_RC6pp_RESIDENCY:
> +		if (!IS_VALLEYVIEW(i915))
> +			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
> +		break;
> +	}
> +
> +	return val;
> +}
> +
> +static void i915_pmu_event_read(struct perf_event *event)
> +{
> +
> +	local64_set(&event->count,
> +		    __i915_pmu_event_read(event) -
> +		    local64_read(&event->hw.prev_count));
> +}
> +
> +static void i915_pmu_enable(struct perf_event *event)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(event->pmu, typeof(*i915), pmu.base);
> +	unsigned int bit = event_enabled_bit(event);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&i915->pmu.lock, flags);
> +
> +	/*
> +	 * Start the sampling timer when enabling the first event.
> +	 */
> +	if (i915->pmu.enable == 0)
> +		hrtimer_start_range_ns(&i915->pmu.timer,
> +				       ns_to_ktime(PERIOD), 0,
> +				       HRTIMER_MODE_REL_PINNED);
> +
> +	/*
> +	 * Update the bitmask of enabled events and increment
> +	 * the event reference counter.
> +	 */
> +	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
> +	GEM_BUG_ON(i915->pmu.enable_count[bit] == ~0);
> +	i915->pmu.enable |= BIT_ULL(bit);
> +	i915->pmu.enable_count[bit]++;
> +
> +	/*
> +	 * For per-engine events the bitmask and reference counting
> +	 * is stored per engine.
> +	 */
> +	if (is_engine_event(event)) {
> +		u8 sample = engine_event_sample(event);
> +		struct intel_engine_cs *engine;
> +
> +		engine = intel_engine_lookup_user(i915,
> +						  engine_event_class(event),
> +						  engine_event_instance(event));
> +		GEM_BUG_ON(!engine);
> +		engine->pmu.enable |= BIT(sample);
> +
> +		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
> +		GEM_BUG_ON(engine->pmu.enable_count[sample] == ~0);
> +		engine->pmu.enable_count[sample]++;
> +	}
> +
> +	/*
> +	 * Store the current counter value so we can report the correct delta
> +	 * for all listeners. Even when the event was already enabled and has
> +	 * an existing non-zero value.
> +	 */
> +	local64_set(&event->hw.prev_count, __i915_pmu_event_read(event));
> +
> +	spin_unlock_irqrestore(&i915->pmu.lock, flags);
> +}
> +
> +static void i915_pmu_disable(struct perf_event *event)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(event->pmu, typeof(*i915), pmu.base);
> +	unsigned int bit = event_enabled_bit(event);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&i915->pmu.lock, flags);
> +
> +	if (is_engine_event(event)) {
> +		u8 sample = engine_event_sample(event);
> +		struct intel_engine_cs *engine;
> +
> +		engine = intel_engine_lookup_user(i915,
> +						  engine_event_class(event),
> +						  engine_event_instance(event));
> +		GEM_BUG_ON(!engine);
> +		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
> +		GEM_BUG_ON(engine->pmu.enable_count[sample] == 0);
> +		/*
> +		 * Decrement the reference count and clear the enabled
> +		 * bitmask when the last listener on an event goes away.
> +		 */
> +		if (--engine->pmu.enable_count[sample] == 0)
> +			engine->pmu.enable &= ~BIT(sample);
> +	}
> +
> +	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
> +	GEM_BUG_ON(i915->pmu.enable_count[bit] == 0);
> +	/*
> +	 * Decrement the reference count and clear the enabled
> +	 * bitmask when the last listener on an event goes away.
> +	 */
> +	if (--i915->pmu.enable_count[bit] == 0)
> +		i915->pmu.enable &= ~BIT_ULL(bit);
> +
> +	spin_unlock_irqrestore(&i915->pmu.lock, flags);
> +}
> +
> +static void i915_pmu_event_start(struct perf_event *event, int flags)
> +{
> +	i915_pmu_enable(event);
> +	event->hw.state = 0;
> +}
> +
> +static void i915_pmu_event_stop(struct perf_event *event, int flags)
> +{
> +	if (flags & PERF_EF_UPDATE)
> +		i915_pmu_event_read(event);
> +	i915_pmu_disable(event);
> +	event->hw.state = PERF_HES_STOPPED;
> +}
> +
> +static int i915_pmu_event_add(struct perf_event *event, int flags)
> +{
> +	if (flags & PERF_EF_START)
> +		i915_pmu_event_start(event, flags);
> +
> +	return 0;
> +}
> +
> +static void i915_pmu_event_del(struct perf_event *event, int flags)
> +{
> +	i915_pmu_event_stop(event, PERF_EF_UPDATE);
> +}
> +
> +static int i915_pmu_event_event_idx(struct perf_event *event)
> +{
> +	return 0;
> +}
> +
> +static ssize_t i915_pmu_format_show(struct device *dev,
> +				    struct device_attribute *attr, char *buf)
> +{
> +        struct dev_ext_attribute *eattr;
> +
> +        eattr = container_of(attr, struct dev_ext_attribute, attr);
> +        return sprintf(buf, "%s\n", (char *) eattr->var);
> +}
> +
> +#define I915_PMU_FORMAT_ATTR(_name, _config)           \
> +        (&((struct dev_ext_attribute[]) {               \
> +                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_format_show, NULL), \
> +                  .var = (void *) _config, }            \
> +        })[0].attr.attr)
> +
> +static struct attribute *i915_pmu_format_attrs[] = {
> +        I915_PMU_FORMAT_ATTR(i915_eventid, "config:0-42"),
> +        NULL,
> +};
> +
> +static const struct attribute_group i915_pmu_format_attr_group = {
> +        .name = "format",
> +        .attrs = i915_pmu_format_attrs,
> +};
> +
> +static ssize_t i915_pmu_event_show(struct device *dev,
> +				   struct device_attribute *attr, char *buf)
> +{
> +        struct dev_ext_attribute *eattr;
> +
> +        eattr = container_of(attr, struct dev_ext_attribute, attr);
> +        return sprintf(buf, "config=0x%lx\n", (unsigned long) eattr->var);
> +}
> +
> +#define I915_PMU_EVENT_ATTR(_name, _config)            \
> +        (&((struct dev_ext_attribute[]) {               \
> +                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_event_show, NULL), \
> +                  .var = (void *) _config, }            \
> +         })[0].attr.attr)
> +
> +static struct attribute *i915_pmu_events_attrs[] = {
> +	I915_PMU_EVENT_ATTR(rcs0-queued,
> +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_RENDER, 0)),
> +	I915_PMU_EVENT_ATTR(rcs0-busy,
> +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_RENDER, 0)),
> +	I915_PMU_EVENT_ATTR(rcs0-wait,
> +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_RENDER, 0)),
> +	I915_PMU_EVENT_ATTR(rcs0-sema,
> +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_RENDER, 0)),
> +
> +	I915_PMU_EVENT_ATTR(bcs0-queued,
> +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_COPY, 0)),
> +	I915_PMU_EVENT_ATTR(bcs0-busy,
> +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_COPY, 0)),
> +	I915_PMU_EVENT_ATTR(bcs0-wait,
> +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_COPY, 0)),
> +	I915_PMU_EVENT_ATTR(bcs0-sema,
> +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_COPY, 0)),
> +
> +	I915_PMU_EVENT_ATTR(vcs0-queued,
> +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 0)),
> +	I915_PMU_EVENT_ATTR(vcs0-busy,
> +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 0)),
> +	I915_PMU_EVENT_ATTR(vcs0-wait,
> +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 0)),
> +	I915_PMU_EVENT_ATTR(vcs0-sema,
> +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 0)),
> +
> +	I915_PMU_EVENT_ATTR(vcs1-queued,
> +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 1)),
> +	I915_PMU_EVENT_ATTR(vcs1-busy,
> +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 1)),
> +	I915_PMU_EVENT_ATTR(vcs1-wait,
> +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 1)),
> +	I915_PMU_EVENT_ATTR(vcs1-sema,
> +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 1)),
> +
> +	I915_PMU_EVENT_ATTR(vecs0-queued,
> +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
> +	I915_PMU_EVENT_ATTR(vecs0-busy,
> +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
> +	I915_PMU_EVENT_ATTR(vecs0-wait,
> +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
> +	I915_PMU_EVENT_ATTR(vecs0-sema,
> +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
> +
> +        I915_PMU_EVENT_ATTR(actual-frequency,	 I915_PMU_ACTUAL_FREQUENCY),
> +        I915_PMU_EVENT_ATTR(requested-frequency, I915_PMU_REQUESTED_FREQUENCY),
> +        I915_PMU_EVENT_ATTR(energy,		 I915_PMU_ENERGY),
> +        I915_PMU_EVENT_ATTR(interrupts,		 I915_PMU_INTERRUPTS),
> +        I915_PMU_EVENT_ATTR(rc6-residency,	 I915_PMU_RC6_RESIDENCY),
> +        I915_PMU_EVENT_ATTR(rc6p-residency,	 I915_PMU_RC6p_RESIDENCY),
> +        I915_PMU_EVENT_ATTR(rc6pp-residency,	 I915_PMU_RC6pp_RESIDENCY),
> +
> +        NULL,
> +};
> +
> +static const struct attribute_group i915_pmu_events_attr_group = {
> +        .name = "events",
> +        .attrs = i915_pmu_events_attrs,
> +};
> +
> +static ssize_t
> +i915_pmu_get_attr_cpumask(struct device *dev,
> +			  struct device_attribute *attr,
> +			  char *buf)
> +{
> +	return cpumap_print_to_pagebuf(true, buf, &i915_pmu_cpumask);
> +}
> +
> +static DEVICE_ATTR(cpumask, S_IRUGO, i915_pmu_get_attr_cpumask, NULL);
> +
> +static struct attribute *i915_cpumask_attrs[] = {
> +	&dev_attr_cpumask.attr,
> +	NULL,
> +};
> +
> +static struct attribute_group i915_pmu_cpumask_attr_group = {
> +	.attrs = i915_cpumask_attrs,
> +};
> +
> +static const struct attribute_group *i915_pmu_attr_groups[] = {
> +        &i915_pmu_format_attr_group,
> +        &i915_pmu_events_attr_group,
> +	&i915_pmu_cpumask_attr_group,
> +        NULL
> +};
> +
> +static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
> +{
> +	unsigned int target;
> +
> +	target = cpumask_any_and(&i915_pmu_cpumask, &i915_pmu_cpumask);
> +	/* Select the first online CPU as a designated reader. */
> +	if (target >= nr_cpu_ids)
> +		cpumask_set_cpu(cpu, &i915_pmu_cpumask);
> +
> +	return 0;
> +}
> +
> +static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
> +{
> +	struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), node);
> +	unsigned int target;
> +
> +	if (cpumask_test_and_clear_cpu(cpu, &i915_pmu_cpumask)) {
> +		target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
> +		/* Migrate events if there is a valid target */
> +		if (target < nr_cpu_ids) {
> +			cpumask_set_cpu(target, &i915_pmu_cpumask);
> +			perf_pmu_migrate_context(&pmu->base, cpu, target);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +void i915_pmu_register(struct drm_i915_private *i915)
> +{
> +	int ret = -ENOTSUPP;
> +
> +	if (INTEL_GEN(i915) <= 2)
> +		goto err;
> +
> +	ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
> +				      "perf/x86/intel/i915:online",
> +				      i915_pmu_cpu_online,
> +			              i915_pmu_cpu_offline);
> +	if (ret)
> +		goto err;
> +
> +	ret = cpuhp_state_add_instance(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
> +				       &i915->pmu.node);
> +	if (ret)
> +		goto err;
> +
> +	i915->pmu.base.attr_groups	= i915_pmu_attr_groups;
> +	i915->pmu.base.task_ctx_nr	= perf_invalid_context;
> +	i915->pmu.base.event_init	= i915_pmu_event_init;
> +	i915->pmu.base.add		= i915_pmu_event_add;
> +	i915->pmu.base.del		= i915_pmu_event_del;
> +	i915->pmu.base.start		= i915_pmu_event_start;
> +	i915->pmu.base.stop		= i915_pmu_event_stop;
> +	i915->pmu.base.read		= i915_pmu_event_read;
> +	i915->pmu.base.event_idx	= i915_pmu_event_event_idx;
> +
> +	spin_lock_init(&i915->pmu.lock);
> +	hrtimer_init(&i915->pmu.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +	i915->pmu.timer.function = i915_sample;
> +	i915->pmu.enable = 0;
> +
> +	if (perf_pmu_register(&i915->pmu.base, "i915", -1) == 0)
> +		return;
> +	i915->pmu.base.event_init = NULL;
> +err:
> +	DRM_INFO("Failed to register PMU (err=%d)\n", ret);
> +}
> +
> +void i915_pmu_unregister(struct drm_i915_private *i915)
> +{
> +	if (!i915->pmu.base.event_init)
> +		return;
> +
> +	i915->pmu.enable = 0;
> +
> +	perf_pmu_unregister(&i915->pmu.base);
> +	i915->pmu.base.event_init = NULL;
> +
> +	hrtimer_cancel(&i915->pmu.timer);
> +
> +	cpuhp_state_remove_instance(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
> +				    &i915->pmu.node);
> +}
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 0b03260a3967..8c362e0451c1 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -186,6 +186,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>  #define VIDEO_ENHANCEMENT_CLASS	2
>  #define COPY_ENGINE_CLASS	3
>  #define OTHER_CLASS		4
> +#define MAX_ENGINE_CLASS	4
> +
> +#define MAX_ENGINE_INSTANCE    1
>  
>  /* PCI config space */
>  
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 3ae89a9d6241..dbc7abd65f33 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -198,6 +198,15 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
>  	GEM_BUG_ON(info->class >= ARRAY_SIZE(intel_engine_classes));
>  	class_info = &intel_engine_classes[info->class];
>  
> +	if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
> +		return -EINVAL;
> +
> +	if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
> +		return -EINVAL;
> +
> +	if (GEM_WARN_ON(dev_priv->engine_class[info->class][info->instance]))
> +		return -EINVAL;
> +
>  	GEM_BUG_ON(dev_priv->engine[id]);
>  	engine = kzalloc(sizeof(*engine), GFP_KERNEL);
>  	if (!engine)
> @@ -225,6 +234,7 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
>  
>  	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
>  
> +	dev_priv->engine_class[info->class][info->instance] = engine;
>  	dev_priv->engine[id] = engine;
>  	return 0;
>  }
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 268342433a8e..7db4c572ef76 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2283,3 +2283,28 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
>  
>  	return intel_init_ring_buffer(engine);
>  }
> +
> +static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
> +	[I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
> +	[I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
> +	[I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS,
> +	[I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS,
> +	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS,
> +};
> +
> +struct intel_engine_cs *
> +intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance)
> +{
> +	if (class >= ARRAY_SIZE(user_class_map))
> +		return NULL;
> +
> +	class = user_class_map[class];
> +
> +	if (WARN_ON_ONCE(class > MAX_ENGINE_CLASS))
> +		return NULL;
> +
> +	if (instance > MAX_ENGINE_INSTANCE)
> +		return NULL;
> +
> +	return i915->engine_class[class][instance];
> +}
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 79c0021f3700..cf095b9386f4 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -245,6 +245,28 @@ struct intel_engine_cs {
>  		I915_SELFTEST_DECLARE(bool mock : 1);
>  	} breadcrumbs;
>  
> +	struct {
> +		/**
> +		 * @enable: Bitmask of enabled sample events on this engine.
> +		 *
> +		 * Bits correspond to sample event types, for instance
> +		 * I915_SAMPLE_QUEUED is bit 0 etc.
> +		 */
> +		u32 enable;
> +		/**
> +		 * @enable_count: Reference count for the enabled samplers.
> +		 *
> +		 * Index number corresponds to the bit number from @enable.
> +		 */
> +		unsigned int enable_count[I915_PMU_SAMPLE_BITS];
> +		/**
> +		 * @sample: Counter value for sampling events.
> +		 *
> +		 * Our internal timer stores the current counter in this field.
> +		 */
> +		u64 sample[I915_ENGINE_SAMPLE_MAX];
> +	} pmu;
> +
>  	/*
>  	 * A pool of objects to use as shadow copies of client batch buffers
>  	 * when the command parser is enabled. Prevents the client from
> @@ -737,4 +759,7 @@ void intel_engines_reset_default_submission(struct drm_i915_private *i915);
>  
>  bool intel_engine_can_store_dword(struct intel_engine_cs *engine);
>  
> +struct intel_engine_cs *
> +intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
> +
>  #endif /* _INTEL_RINGBUFFER_H_ */
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index d8d10d932759..6dc0d6fd4e4c 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -86,6 +86,64 @@ enum i915_mocs_table_index {
>  	I915_MOCS_CACHED,
>  };
>  
> +enum drm_i915_gem_engine_class {
> +	I915_ENGINE_CLASS_OTHER = 0,
> +	I915_ENGINE_CLASS_RENDER = 1,
> +	I915_ENGINE_CLASS_COPY = 2,
> +	I915_ENGINE_CLASS_VIDEO = 3,
> +	I915_ENGINE_CLASS_VIDEO_ENHANCE = 4,
> +	I915_ENGINE_CLASS_MAX /* non-ABI */
> +};
> +
> +/**
> + * DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
> + *
> + */
> +
> +enum drm_i915_pmu_engine_sample {
> +	I915_SAMPLE_QUEUED = 0,
> +	I915_SAMPLE_BUSY = 1,
> +	I915_SAMPLE_WAIT = 2,
> +	I915_SAMPLE_SEMA = 3,
> +	I915_ENGINE_SAMPLE_MAX /* non-ABI */
> +};
> +
> +#define I915_PMU_SAMPLE_BITS (4)
> +#define I915_PMU_SAMPLE_MASK (0xf)
> +#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
> +#define I915_PMU_CLASS_SHIFT \
> +	(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)
> +
> +#define __I915_PMU_ENGINE(class, instance, sample) \
> +	((class) << I915_PMU_CLASS_SHIFT | \
> +	(instance) << I915_PMU_SAMPLE_BITS | \
> +	(sample))
> +
> +#define I915_PMU_ENGINE_QUEUED(class, instance) \
> +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
> +
> +#define I915_PMU_ENGINE_BUSY(class, instance) \
> +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_BUSY)
> +
> +#define I915_PMU_ENGINE_WAIT(class, instance) \
> +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_WAIT)
> +
> +#define I915_PMU_ENGINE_SEMA(class, instance) \
> +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
> +
> +#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
> +
> +#define I915_PMU_ACTUAL_FREQUENCY 	__I915_PMU_OTHER(0)
> +#define I915_PMU_REQUESTED_FREQUENCY	__I915_PMU_OTHER(1)
> +#define I915_PMU_ENERGY			__I915_PMU_OTHER(2)
> +#define I915_PMU_INTERRUPTS		__I915_PMU_OTHER(3)
> +
> +#define I915_PMU_RC6_RESIDENCY		__I915_PMU_OTHER(4)
> +#define I915_PMU_RC6p_RESIDENCY		__I915_PMU_OTHER(5)
> +#define I915_PMU_RC6pp_RESIDENCY	__I915_PMU_OTHER(6)
> +
> +#define I915_PMU_LAST I915_PMU_RC6pp_RESIDENCY
> +
>  /* Each region is a minimum of 16k, and there are at most 255 of them.
>   */
>  #define I915_NR_TEX_REGIONS 255	/* table size 2k - maximum due to use
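
Purely for illustration and not part of the patch: under the encoding
above, a hypothetical userspace consumer could open one of the engine
counters roughly as sketched below. The sysfs location of the dynamic
PMU type id is an assumption based on how perf PMUs are normally
exposed, and the config value is just the macro from this patch
expanded by hand.

/* Hypothetical sketch - not part of the patch. */
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdint.h>
#include <stdio.h>

/*
 * vcs1 busy time: I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 1)
 *   = (3 << 12) | (1 << 4) | 1 = 0x3011
 */
static int open_vcs1_busy(unsigned int i915_pmu_type)
{
	struct perf_event_attr attr = { };

	/* Type id assumed read from /sys/bus/event_source/devices/i915/type. */
	attr.type = i915_pmu_type;
	attr.size = sizeof(attr);
	attr.config = 0x3011;

	/* Uncore-style PMU: system-wide (pid == -1) on a single online CPU. */
	return syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
}

int main(void)
{
	int fd = open_vcs1_busy(0 /* placeholder, read from sysfs */);
	uint64_t val = 0;

	if (fd < 0 || read(fd, &val, sizeof(val)) != sizeof(val))
		return 1;

	printf("vcs1 busy: %llu ns\n", (unsigned long long)val);
	return 0;
}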

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC v3 00/11] i915 PMU and engine busy stats
  2017-09-12  2:03 ` [RFC v3 00/11] i915 PMU and engine busy stats Rogozhkin, Dmitry V
@ 2017-09-12 14:54   ` Tvrtko Ursulin
  2017-09-12 22:01     ` Rogozhkin, Dmitry V
  0 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-12 14:54 UTC (permalink / raw)
  To: Rogozhkin, Dmitry V, tursulin; +Cc: peterz, Intel-gfx


On 12/09/2017 03:03, Rogozhkin, Dmitry V wrote:
> Hi,
> 
> Just tried the v3 series. perf-stat works fine. Of the IGT tests which I wrote for i915 PMU
> (https://patchwork.freedesktop.org/series/29313/) all pass (assuming pmu.enabled will be exposed
> in debugfs) except the cpu_online subtest. And this is pretty interesting - see details below.
> 
> Ok, be prepared for the long mail:)...
> 
> So, the cpu_online subtest loads the RCS0 engine to 100% and starts to put CPUs offline one by one,
> starting from CPU0 (don't forget to have CONFIG_BOOTPARAM_HOTPLUG_CPU0=y in .config). The test
> expectation is that the i915 PMU will report almost 100% (with 2% tolerance) busyness of RCS0. The
> test requests the metric just twice: before running on RCS0 and right after. This fails as follows:
> 
> Executed on rcs0 for 32004262184us
>    i915/rcs0-busy/: 2225723999us
> (perf_pmu:6325) CRITICAL: Test assertion failure function test_cpu_online, file perf_pmu.c:719:
> (perf_pmu:6325) CRITICAL: Failed assertion: perf_elapsed(&pm.metrics[0]) > (1-USAGE_TOLERANCE) * elapsed_ns(&start, &now)
> Stack trace:
>    #0 [__igt_fail_assert+0xf1]
>    #1 [__real_main773+0xff1]
>    #2 [main+0x35]
>    #3 [__libc_start_main+0xf5]
>    #4 [_start+0x29]
>    #5 [<unknown>+0x29]
> Subtest cpu_online failed.
> **** DEBUG ****
> (perf_pmu:6325) DEBUG: Test requirement passed: is_hotplug_cpu0()
> (perf_pmu:6325) INFO: perf_init: enabled 1 metrics from 1 requested
> (perf_pmu:6325) ioctl-wrappers-DEBUG: Test requirement passed: gem_has_ring(fd, ring)
> (perf_pmu:6325) INFO: Executed on rcs0 for 32004262184us
> (perf_pmu:6325) INFO:   i915/rcs0-busy/: 2225723999us
> (perf_pmu:6325) CRITICAL: Test assertion failure function test_cpu_online, file perf_pmu.c:719:
> (perf_pmu:6325) CRITICAL: Failed assertion: perf_elapsed(&pm.metrics[0]) > (1-USAGE_TOLERANCE) * elapsed_ns(&start, &now)
> 
> Now, it looks like PMU context migration by itself works. For example, if you comment out
> "perf_pmu_migrate_context(&pmu->base, cpu, target)" you get:
> 
>      Executed on rcs0 for 32004434918us
>        i915/rcs0-busy/:     76623707us
> 
> Compare with previous:
>      Executed on rcs0 for 32004262184us
>        i915/rcs0-busy/:    2225723999us
> 
> This test passed on the previous set of patches, I mean Tvrtko's v2 series + my patches.
> 
> So, it seems we are losing counter values somehow. I saw in the patches that this place was indeed modified - you
> have added a subtraction of the initial counter value:
> static void i915_pmu_event_read(struct perf_event *event)
> {
> 
> 	local64_set(&event->count,
> 		    __i915_pmu_event_read(event) -
> 		    local64_read(&event->hw.prev_count));
> }
> 
> But it looks like the problem is that with the PMU context migration we get a sequence of start/stop (or maybe
> add/del) events which eventually call our i915_pmu_enable/disable. Here is the dmesg log with the obvious printk:
> 
> [  153.971096] [IGT] perf_pmu: starting subtest cpu_online
> [  153.971151] i915_pmu_enable: event->hw.prev_count=0
> [  154.036015] i915_pmu_disable: event->hw.prev_count=0
> [  154.048027] i915_pmu_enable: event->hw.prev_count=0
> [  154.049343] smpboot: CPU 0 is now offline
> [  155.059028] smpboot: Booting Node 0 Processor 0 APIC 0x0
> [  155.155078] smpboot: CPU 1 is now offline
> [  156.161026] x86: Booting SMP configuration:
> [  156.161027] smpboot: Booting Node 0 Processor 1 APIC 0x2
> [  156.197065] IRQ 122: no longer affine to CPU2
> [  156.198087] smpboot: CPU 2 is now offline
> [  157.208028] smpboot: Booting Node 0 Processor 2 APIC 0x4
> [  157.263093] smpboot: CPU 3 is now offline
> [  158.273026] smpboot: Booting Node 0 Processor 3 APIC 0x6
> [  158.310026] i915_pmu_disable: event->hw.prev_count=76648307
> [  158.319020] i915_pmu_enable: event->hw.prev_count=76648307
> [  158.319098] IRQ 124: no longer affine to CPU4
> [  158.320368] smpboot: CPU 4 is now offline
> [  159.326030] smpboot: Booting Node 0 Processor 4 APIC 0x1
> [  159.365306] smpboot: CPU 5 is now offline
> [  160.371030] smpboot: Booting Node 0 Processor 5 APIC 0x3
> [  160.421077] IRQ 125: no longer affine to CPU6
> [  160.422093] smpboot: CPU 6 is now offline
> [  161.429030] smpboot: Booting Node 0 Processor 6 APIC 0x5
> [  161.467091] smpboot: CPU 7 is now offline
> [  162.473027] smpboot: Booting Node 0 Processor 7 APIC 0x7
> [  162.527019] i915_pmu_disable: event->hw.prev_count=4347548222
> [  162.546017] i915_pmu_enable: event->hw.prev_count=4347548222
> [  162.547317] smpboot: CPU 0 is now offline
> [  163.553028] smpboot: Booting Node 0 Processor 0 APIC 0x0
> [  163.621089] smpboot: CPU 1 is now offline
> [  164.627028] x86: Booting SMP configuration:
> [  164.627029] smpboot: Booting Node 0 Processor 1 APIC 0x2
> [  164.669308] smpboot: CPU 2 is now offline
> [  165.679025] smpboot: Booting Node 0 Processor 2 APIC 0x4
> [  165.717089] smpboot: CPU 3 is now offline
> [  166.723025] smpboot: Booting Node 0 Processor 3 APIC 0x6
> [  166.775016] i915_pmu_disable: event->hw.prev_count=8574197312
> [  166.787016] i915_pmu_enable: event->hw.prev_count=8574197312
> [  166.788309] smpboot: CPU 4 is now offline
> [  167.794025] smpboot: Booting Node 0 Processor 4 APIC 0x1
> [  167.837114] smpboot: CPU 5 is now offline
> [  168.847025] smpboot: Booting Node 0 Processor 5 APIC 0x3
> [  168.889312] smpboot: CPU 6 is now offline
> [  169.899030] smpboot: Booting Node 0 Processor 6 APIC 0x5
> [  169.944104] smpboot: CPU 7 is now offline
> [  170.954032] smpboot: Booting Node 0 Processor 7 APIC 0x7
> [  171.000016] i915_pmu_disable: event->hw.prev_count=12815138319
> [  171.008017] i915_pmu_enable: event->hw.prev_count=12815138319
> [  171.009304] smpboot: CPU 0 is now offline
> [  172.017028] smpboot: Booting Node 0 Processor 0 APIC 0x0
> [  172.096104] smpboot: CPU 1 is now offline
> [  173.106025] x86: Booting SMP configuration:
> [  173.106026] smpboot: Booting Node 0 Processor 1 APIC 0x2
> [  173.147078] smpboot: CPU 2 is now offline
> [  174.153025] smpboot: Booting Node 0 Processor 2 APIC 0x4
> [  174.192093] smpboot: CPU 3 is now offline
> [  175.198028] smpboot: Booting Node 0 Processor 3 APIC 0x6
> [  175.229042] i915_pmu_disable: event->hw.prev_count=17035889079
> [  175.242030] i915_pmu_enable: event->hw.prev_count=17035889079
> [  175.242163] IRQ fixup: irq 120 move in progress, old vector 131
> [  175.242165] IRQ fixup: irq 121 move in progress, old vector 147
> [  175.242171] IRQ 124: no longer affine to CPU4
> [  175.243435] smpboot: CPU 4 is now offline
> [  176.248040] smpboot: Booting Node 0 Processor 4 APIC 0x1
> [  176.285328] smpboot: CPU 5 is now offline
> [  177.296039] smpboot: Booting Node 0 Processor 5 APIC 0x3
> [  177.325067] IRQ 125: no longer affine to CPU6
> [  177.326087] smpboot: CPU 6 is now offline
> [  178.335036] smpboot: Booting Node 0 Processor 6 APIC 0x5
> [  178.377063] IRQ 122: no longer affine to CPU7
> [  178.378086] smpboot: CPU 7 is now offline
> [  179.388028] smpboot: Booting Node 0 Processor 7 APIC 0x7
> [  179.454030] i915_pmu_disable: event->hw.prev_count=21269856967
> [  179.470026] i915_pmu_enable: event->hw.prev_count=21269856967
> [  179.471110] smpboot: CPU 0 is now offline
> [  180.481028] smpboot: Booting Node 0 Processor 0 APIC 0x0
> [  180.551075] smpboot: CPU 1 is now offline
> [  181.558029] x86: Booting SMP configuration:
> [  181.558030] smpboot: Booting Node 0 Processor 1 APIC 0x2
> [  181.595096] smpboot: CPU 2 is now offline
> [  182.605029] smpboot: Booting Node 0 Processor 2 APIC 0x4
> [  182.657084] smpboot: CPU 3 is now offline
> [  183.668030] smpboot: Booting Node 0 Processor 3 APIC 0x6
> [  183.709017] i915_pmu_disable: event->hw.prev_count=25497358644
> [  183.727016] i915_pmu_enable: event->hw.prev_count=25497358644
> [  183.728305] smpboot: CPU 4 is now offline
> [  184.734027] smpboot: Booting Node 0 Processor 4 APIC 0x1
> [  184.767090] smpboot: CPU 5 is now offline
> [  185.777036] smpboot: Booting Node 0 Processor 5 APIC 0x3
> [  185.823096] smpboot: CPU 6 is now offline
> [  186.829051] smpboot: Booting Node 0 Processor 6 APIC 0x5
> [  186.856350] smpboot: CPU 7 is now offline
> [  187.862051] smpboot: Booting Node 0 Processor 7 APIC 0x7
> [  187.871216] [IGT] perf_pmu: exiting, ret=99
> [  187.889199] Console: switching to colour frame buffer device 240x67
> [  187.889583] i915_pmu_disable: event->hw.prev_count=29754080941
> 
> And the result which I got in userspace for this run were
>      Executed on rcs0 for 32003587981us
>        i915/rcs0-busy/: 2247436461us
> 
> After that I decided to roll back the change to the counter value calculation which I mentioned before, i.e.:
> static void i915_pmu_event_read(struct perf_event *event)
> {
> 
> 	local64_set(&event->count,
> 		    __i915_pmu_event_read(event) /*-
> 		    local64_read(&event->hw.prev_count)*/);
> }
> 
> And I got test PASSED :):
>      Executed on rcs0 for 32002282603us
>        i915/rcs0-busy/: 31998855052us
>      Subtest cpu_online: SUCCESS (33.950s)
> 
> At this point I need to go home :). Maybe you will have time to look into this issue? If not, I will continue
> tomorrow.

I forgot to run this test since I did not have the kernel feature 
enabled. But yes, now that I tried it, it is failing.

What is happening is that event del (so counter stop as well) is getting 
called when the CPU goes offline, followed by add->start, and the 
initial counter value then gets reloaded.

I don't see a way for i915 to distinguish between userspace 
starting/stopping the event, and perf core doing the same in the CPU 
migration process. Perhaps Peter could help here?

I am storing the initial counter value when the counter is started so 
that I can report its relative value - in other words, the change from 
event start to stop. Perhaps that is not correct and should be left to 
userspace to handle?

Otherwise we have counters like energy use, and even engine busyness, 
which can already be at some large value before PMU monitoring starts, 
which makes things like "perf stat -a -I <command>", or even just 
plain "perf stat <command>", attribute all previous usage (from before 
profiling of the command started) to the reported stats.
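
One option, thinking out loud, would be the delta-accumulation pattern 
other uncore-style PMUs use - add the delta into event->count on every 
read and refresh prev_count at the same time, instead of subtracting a 
value snapshotted only at event start. The accumulated count would 
then survive the stop/start cycle perf core does around CPU hotplug, 
while the initial snapshot taken in i915_pmu_enable() still hides the 
pre-existing counter value. A rough, untested sketch of what I mean:

static void i915_pmu_event_read(struct perf_event *event)
{
	u64 prev, new;

again:
	prev = local64_read(&event->hw.prev_count);
	new = __i915_pmu_event_read(event);

	/* Retry if someone else moved prev_count under us. */
	if (local64_cmpxchg(&event->hw.prev_count, prev, new) != prev)
		goto again;

	local64_add(new - prev, &event->count);
}

But I am not sure yet whether that is the expected use of 
hw.prev_count for an uncore PMU, so input welcome.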

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 04/11] drm/i915/pmu: Expose a PMU interface for perf queries
  2017-09-12  2:06   ` Rogozhkin, Dmitry V
@ 2017-09-12 14:59     ` Tvrtko Ursulin
  2017-09-13  8:57       ` [RFC v6 " Tvrtko Ursulin
  0 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-12 14:59 UTC (permalink / raw)
  To: Rogozhkin, Dmitry V, tursulin; +Cc: peterz, Intel-gfx


On 12/09/2017 03:06, Rogozhkin, Dmitry V wrote:
> On Mon, 2017-09-11 at 16:25 +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> From: Chris Wilson <chris@chris-wilson.co.uk>
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
>>
>> The first goal is to be able to measure GPU (and individual ring) busyness
>> without having to poll registers from userspace. (Which not only incurs
>> holding the forcewake lock indefinitely, perturbing the system, but also
>> runs the risk of hanging the machine.) As an alternative we can use the
>> perf event counter interface to sample the ring registers periodically
>> and send those results to userspace.
>>
>> To be able to do so, we need to export the two symbols from
>> kernel/events/core.c to register and unregister a PMU device.
>>
>> v1-v2 (Chris Wilson):
>>
>> v2: Use a common timer for the ring sampling.
>>
>> v3: (Tvrtko Ursulin)
>>   * Decouple uAPI from i915 engine ids.
>>   * Complete uAPI defines.
>>   * Refactor some code to helpers for clarity.
>>   * Skip sampling disabled engines.
>>   * Expose counters in sysfs.
>>   * Pass in fake regs to avoid null ptr deref in perf core.
>>   * Convert to class/instance uAPI.
>>   * Use shared driver code for rc6 residency, power and frequency.
>>
>> v4: (Dmitry Rogozhkin)
>>   * Register PMU with .task_ctx_nr=perf_invalid_context
>>   * Expose cpumask for the PMU with the single CPU in the mask
>>   * Properly support pmu->stop(): it should call pmu->read()
>>   * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE)
>>   * Make pmu.busy_stats a refcounter to avoid busy stats going away
>>     with some deleted event.
> 
> busy_stats appear later in the patch series. And in your final version
> busy_stats remains a bool while we rely on event refcounting. So, this
> item is misleading. Could you please change it, giving credit to the
> general ref-counting of events which you rewrote for v5 of this
> patch?

Oh so I dropped the reference to "drm/i915/pmu: introduce refcounting of 
event subscriptions". My apologies. I will restore it in the next 
version. It wasn't deliberate, just an omission while squashing and 
re-ordering things.

Regards,

Tvrtko

>>   * Expose cpumask for i915 PMU to avoid multiple events creation of
>>     the same type followed by counter aggregation by perf-stat.
>>   * Track CPUs getting online/offline to migrate perf context. If (likely)
>>     cpumask will initially set CPU0, CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be
>>     needed to see effect of CPU status tracking.
>>   * End result is that only global events are supported and perf stat
>>     works correctly.
>>   * Deny perf driver level sampling - it is prohibited for uncore PMU.
>>
>> v5: (Tvrtko Ursulin)
>>
>>   * Don't hardcode number of engine samplers.
>>   * Rewrite event ref-counting for correctness and simplicity.
>>   * Store initial counter value when starting already enabled events
>>     to correctly report values to all listeners.
>>   * Fix RC6 residency readout.
>>   * Comments, GPL header.
>>
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> ---
>>   drivers/gpu/drm/i915/Makefile           |   1 +
>>   drivers/gpu/drm/i915/i915_drv.c         |   2 +
>>   drivers/gpu/drm/i915/i915_drv.h         |  76 ++++
>>   drivers/gpu/drm/i915/i915_pmu.c         | 686 ++++++++++++++++++++++++++++++++
>>   drivers/gpu/drm/i915/i915_reg.h         |   3 +
>>   drivers/gpu/drm/i915/intel_engine_cs.c  |  10 +
>>   drivers/gpu/drm/i915/intel_ringbuffer.c |  25 ++
>>   drivers/gpu/drm/i915/intel_ringbuffer.h |  25 ++
>>   include/uapi/drm/i915_drm.h             |  58 +++
>>   9 files changed, 886 insertions(+)
>>   create mode 100644 drivers/gpu/drm/i915/i915_pmu.c
>>
>> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
>> index 1cb8059a3a16..7b3a0eca62b6 100644
>> --- a/drivers/gpu/drm/i915/Makefile
>> +++ b/drivers/gpu/drm/i915/Makefile
>> @@ -26,6 +26,7 @@ i915-y := i915_drv.o \
>>   
>>   i915-$(CONFIG_COMPAT)   += i915_ioc32.o
>>   i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o intel_pipe_crc.o
>> +i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
>>   
>>   # GEM code
>>   i915-y += i915_cmd_parser.o \
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>> index 5c111ea96e80..b1f96eb1be16 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -1196,6 +1196,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
>>   	struct drm_device *dev = &dev_priv->drm;
>>   
>>   	i915_gem_shrinker_init(dev_priv);
>> +	i915_pmu_register(dev_priv);
>>   
>>   	/*
>>   	 * Notify a valid surface after modesetting,
>> @@ -1250,6 +1251,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
>>   	intel_opregion_unregister(dev_priv);
>>   
>>   	i915_perf_unregister(dev_priv);
>> +	i915_pmu_unregister(dev_priv);
>>   
>>   	i915_teardown_sysfs(dev_priv);
>>   	i915_guc_log_unregister(dev_priv);
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 48daf9552163..62646b8dfb7a 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -40,6 +40,7 @@
>>   #include <linux/hash.h>
>>   #include <linux/intel-iommu.h>
>>   #include <linux/kref.h>
>> +#include <linux/perf_event.h>
>>   #include <linux/pm_qos.h>
>>   #include <linux/reservation.h>
>>   #include <linux/shmem_fs.h>
>> @@ -2190,6 +2191,69 @@ struct intel_cdclk_state {
>>   	unsigned int cdclk, vco, ref;
>>   };
>>   
>> +enum {
>> +	__I915_SAMPLE_FREQ_ACT = 0,
>> +	__I915_SAMPLE_FREQ_REQ,
>> +	__I915_NUM_PMU_SAMPLERS
>> +};
>> +
>> +/**
>> + * How many different events we track in the global PMU mask.
>> + *
>> + * It is also used to know the needed number of event reference counters.
>> + */
>> +#define I915_PMU_MASK_BITS \
>> +	(1 << I915_PMU_SAMPLE_BITS) + (I915_PMU_LAST + 1 - __I915_PMU_OTHER(0))
>> +
>> +struct i915_pmu {
>> +	/**
>> +	 * @node: List node for CPU hotplug handling.
>> +	 */
>> +	struct hlist_node node;
>> +	/**
>> +	 * @base: PMU base.
>> +	 */
>> +	struct pmu base;
>> +	/**
>> +	 * @lock: Lock protecting enable mask and ref count handling.
>> +	 */
>> +	spinlock_t lock;
>> +	/**
>> +	 * @timer: Timer for internal i915 PMU sampling.
>> +	 */
>> +	struct hrtimer timer;
>> +	/**
>> +	 * @enable: Bitmask of all currently enabled events.
>> +	 *
>> +	 * Bits are derived from uAPI event numbers in a way that low 16 bits
>> +	 * correspond to engine event _sample_ _type_ (I915_SAMPLE_QUEUED is
>> +	 * bit 0), and higher bits correspond to other events (for instance
>> +	 * I915_PMU_ACTUAL_FREQUENCY is bit 16 etc).
>> +	 *
>> +	 * In other words, low 16 bits are not per engine but per engine
>> +	 * sampler type, while the upper bits are directly mapped to other
>> +	 * event types.
>> +	 */
>> +	u64 enable;
>> +	/**
>> +	 * @enable_count: Reference counts for the enabled events.
>> +	 *
>> +	 * Array indices are mapped in the same way as bits in the @enable field
>> +	 * and they are used to control sampling on/off when multiple clients
>> +	 * are using the PMU API.
>> +	 */
>> +	unsigned int enable_count[I915_PMU_MASK_BITS];
>> +	/**
>> +	 * @sample: Current counter value for i915 events which need sampling.
>> +	 *
>> +	 * These counters are updated from the i915 PMU sampling timer.
>> +	 *
>> +	 * Only global counters are held here, while the per-engine ones are in
>> +	 * struct intel_engine_cs.
>> +	 */
>> +	u64 sample[__I915_NUM_PMU_SAMPLERS];
>> +};
>> +
>>   struct drm_i915_private {
>>   	struct drm_device drm;
>>   
>> @@ -2238,6 +2302,7 @@ struct drm_i915_private {
>>   	struct pci_dev *bridge_dev;
>>   	struct i915_gem_context *kernel_context;
>>   	struct intel_engine_cs *engine[I915_NUM_ENGINES];
>> +	struct intel_engine_cs *engine_class[MAX_ENGINE_CLASS + 1][MAX_ENGINE_INSTANCE + 1];
>>   	struct i915_vma *semaphore;
>>   
>>   	struct drm_dma_handle *status_page_dmah;
>> @@ -2698,6 +2763,8 @@ struct drm_i915_private {
>>   		int	irq;
>>   	} lpe_audio;
>>   
>> +	struct i915_pmu pmu;
>> +
>>   	/*
>>   	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
>>   	 * will be rejected. Instead look for a better place.
>> @@ -3918,6 +3985,15 @@ extern void i915_perf_fini(struct drm_i915_private *dev_priv);
>>   extern void i915_perf_register(struct drm_i915_private *dev_priv);
>>   extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
>>   
>> +/* i915_pmu.c */
>> +#ifdef CONFIG_PERF_EVENTS
>> +extern void i915_pmu_register(struct drm_i915_private *i915);
>> +extern void i915_pmu_unregister(struct drm_i915_private *i915);
>> +#else
>> +static inline void i915_pmu_register(struct drm_i915_private *i915) {}
>> +static inline void i915_pmu_unregister(struct drm_i915_private *i915) {}
>> +#endif
>> +
>>   /* i915_suspend.c */
>>   extern int i915_save_state(struct drm_i915_private *dev_priv);
>>   extern int i915_restore_state(struct drm_i915_private *dev_priv);
>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
>> new file mode 100644
>> index 000000000000..2ec892e57143
>> --- /dev/null
>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>> @@ -0,0 +1,686 @@
>> +/*
>> + * Copyright © 2017 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the next
>> + * paragraph) shall be included in all copies or substantial portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>> + * IN THE SOFTWARE.
>> + *
>> + */
>> +
>> +#include <linux/perf_event.h>
>> +#include <linux/pm_runtime.h>
>> +
>> +#include "i915_drv.h"
>> +#include "intel_ringbuffer.h"
>> +
>> +/* Frequency for the sampling timer for events which need it. */
>> +#define FREQUENCY 200
>> +#define PERIOD max_t(u64, 10000, NSEC_PER_SEC / FREQUENCY)
>> +
>> +#define ENGINE_SAMPLE_MASK \
>> +	(BIT(I915_SAMPLE_QUEUED) | \
>> +	 BIT(I915_SAMPLE_BUSY) | \
>> +	 BIT(I915_SAMPLE_WAIT) | \
>> +	 BIT(I915_SAMPLE_SEMA))
>> +
>> +#define ENGINE_SAMPLE_BITS (1 << I915_PMU_SAMPLE_BITS)
>> +
>> +static cpumask_t i915_pmu_cpumask = CPU_MASK_NONE;
>> +
>> +static u8 engine_config_sample(u64 config)
>> +{
>> +	return config & I915_PMU_SAMPLE_MASK;
>> +}
>> +
>> +static u8 engine_event_sample(struct perf_event *event)
>> +{
>> +	return engine_config_sample(event->attr.config);
>> +}
>> +
>> +static u8 engine_event_class(struct perf_event *event)
>> +{
>> +	return (event->attr.config >> I915_PMU_CLASS_SHIFT) & 0xff;
>> +}
>> +
>> +static u8 engine_event_instance(struct perf_event *event)
>> +{
>> +	return (event->attr.config >> I915_PMU_SAMPLE_BITS) & 0xff;
>> +}
>> +
>> +static bool is_engine_config(u64 config)
>> +{
>> +	return config < __I915_PMU_OTHER(0);
>> +}
>> +
>> +static unsigned int config_enabled_bit(u64 config)
>> +{
>> +	if (is_engine_config(config))
>> +		return engine_config_sample(config);
>> +	else
>> +		return ENGINE_SAMPLE_BITS + (config - __I915_PMU_OTHER(0));
>> +}
>> +
>> +static u64 config_enabled_mask(u64 config)
>> +{
>> +	return BIT_ULL(config_enabled_bit(config));
>> +}
>> +
>> +static bool is_engine_event(struct perf_event *event)
>> +{
>> +	return is_engine_config(event->attr.config);
>> +}
>> +
>> +static unsigned int event_enabled_bit(struct perf_event *event)
>> +{
>> +	return config_enabled_bit(event->attr.config);
>> +}
>> +
>> +static bool grab_forcewake(struct drm_i915_private *i915, bool fw)
>> +{
>> +	if (!fw)
>> +		intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
>> +
>> +	return true;
>> +}
>> +
>> +static void engines_sample(struct drm_i915_private *dev_priv)
>> +{
>> +	struct intel_engine_cs *engine;
>> +	enum intel_engine_id id;
>> +	bool fw = false;
>> +
>> +	if ((dev_priv->pmu.enable & ENGINE_SAMPLE_MASK) == 0)
>> +		return;
>> +
>> +	if (!dev_priv->gt.awake)
>> +		return;
>> +
>> +	if (!intel_runtime_pm_get_if_in_use(dev_priv))
>> +		return;
>> +
>> +	for_each_engine(engine, dev_priv, id) {
>> +		u32 enable = engine->pmu.enable;
>> +
>> +		if (i915_seqno_passed(intel_engine_get_seqno(engine),
>> +				      intel_engine_last_submit(engine)))
>> +			continue;
>> +
>> +		if (enable & BIT(I915_SAMPLE_QUEUED))
>> +			engine->pmu.sample[I915_SAMPLE_QUEUED] += PERIOD;
>> +
>> +		if (enable & BIT(I915_SAMPLE_BUSY)) {
>> +			u32 val;
>> +
>> +			fw = grab_forcewake(dev_priv, fw);
>> +			val = I915_READ_FW(RING_MI_MODE(engine->mmio_base));
>> +			if (!(val & MODE_IDLE))
>> +				engine->pmu.sample[I915_SAMPLE_BUSY] += PERIOD;
>> +		}
>> +
>> +		if (enable & (BIT(I915_SAMPLE_WAIT) | BIT(I915_SAMPLE_SEMA))) {
>> +			u32 val;
>> +
>> +			fw = grab_forcewake(dev_priv, fw);
>> +			val = I915_READ_FW(RING_CTL(engine->mmio_base));
>> +			if ((enable & BIT(I915_SAMPLE_WAIT)) &&
>> +			    (val & RING_WAIT))
>> +				engine->pmu.sample[I915_SAMPLE_WAIT] += PERIOD;
>> +			if ((enable & BIT(I915_SAMPLE_SEMA)) &&
>> +			    (val & RING_WAIT_SEMAPHORE))
>> +				engine->pmu.sample[I915_SAMPLE_SEMA] += PERIOD;
>> +		}
>> +	}
>> +
>> +	if (fw)
>> +		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>> +	intel_runtime_pm_put(dev_priv);
>> +}
>> +
>> +static void frequency_sample(struct drm_i915_private *dev_priv)
>> +{
>> +	if (dev_priv->pmu.enable &
>> +	    config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY)) {
>> +		u64 val;
>> +
>> +		val = dev_priv->rps.cur_freq;
>> +		if (dev_priv->gt.awake &&
>> +		    intel_runtime_pm_get_if_in_use(dev_priv)) {
>> +			val = intel_get_cagf(dev_priv,
>> +					     I915_READ_NOTRACE(GEN6_RPSTAT1));
>> +			intel_runtime_pm_put(dev_priv);
>> +		}
>> +		val = intel_gpu_freq(dev_priv, val);
>> +		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_ACT] += val * PERIOD;
>> +	}
>> +
>> +	if (dev_priv->pmu.enable &
>> +	    config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY)) {
>> +		u64 val = intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq);
>> +		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_REQ] += val * PERIOD;
>> +	}
>> +}
>> +
>> +static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
>> +{
>> +	struct drm_i915_private *i915 =
>> +		container_of(hrtimer, struct drm_i915_private, pmu.timer);
>> +
>> +	if (i915->pmu.enable == 0)
>> +		return HRTIMER_NORESTART;
>> +
>> +	engines_sample(i915);
>> +	frequency_sample(i915);
>> +
>> +	hrtimer_forward_now(hrtimer, ns_to_ktime(PERIOD));
>> +	return HRTIMER_RESTART;
>> +}
>> +
>> +static u64 count_interrupts(struct drm_i915_private *i915)
>> +{
>> +	/* open-coded kstat_irqs() */
>> +	struct irq_desc *desc = irq_to_desc(i915->drm.pdev->irq);
>> +	u64 sum = 0;
>> +	int cpu;
>> +
>> +	if (!desc || !desc->kstat_irqs)
>> +		return 0;
>> +
>> +	for_each_possible_cpu(cpu)
>> +		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
>> +
>> +	return sum;
>> +}
>> +
>> +static void i915_pmu_event_destroy(struct perf_event *event)
>> +{
>> +	WARN_ON(event->parent);
>> +}
>> +
>> +static int engine_event_init(struct perf_event *event)
>> +{
>> +	struct drm_i915_private *i915 =
>> +		container_of(event->pmu, typeof(*i915), pmu.base);
>> +
>> +	if (!intel_engine_lookup_user(i915, engine_event_class(event),
>> +				      engine_event_instance(event)))
>> +		return -ENODEV;
>> +
>> +	switch (engine_event_sample(event)) {
>> +	case I915_SAMPLE_QUEUED:
>> +	case I915_SAMPLE_BUSY:
>> +	case I915_SAMPLE_WAIT:
>> +		break;
>> +	case I915_SAMPLE_SEMA:
>> +		if (INTEL_GEN(i915) < 6)
>> +			return -ENODEV;
>> +		break;
>> +	default:
>> +		return -ENOENT;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int i915_pmu_event_init(struct perf_event *event)
>> +{
>> +	struct drm_i915_private *i915 =
>> +		container_of(event->pmu, typeof(*i915), pmu.base);
>> +	int cpu, ret;
>> +
>> +	if (event->attr.type != event->pmu->type)
>> +		return -ENOENT;
>> +
>> +	/* unsupported modes and filters */
>> +	if (event->attr.sample_period) /* no sampling */
>> +		return -EINVAL;
>> +
>> +	if (has_branch_stack(event))
>> +		return -EOPNOTSUPP;
>> +
>> +	if (event->cpu < 0)
>> +		return -EINVAL;
>> +
>> +	cpu = cpumask_any_and(&i915_pmu_cpumask,
>> +			      topology_sibling_cpumask(event->cpu));
>> +	if (cpu >= nr_cpu_ids)
>> +		return -ENODEV;
>> +
>> +	ret = 0;
>> +	if (is_engine_event(event)) {
>> +		ret = engine_event_init(event);
>> +	} else switch (event->attr.config) {
>> +	case I915_PMU_ACTUAL_FREQUENCY:
>> +		if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
>> +			ret = -ENODEV; /* requires a mutex for sampling! */
>> +	case I915_PMU_REQUESTED_FREQUENCY:
>> +	case I915_PMU_ENERGY:
>> +	case I915_PMU_RC6_RESIDENCY:
>> +	case I915_PMU_RC6p_RESIDENCY:
>> +	case I915_PMU_RC6pp_RESIDENCY:
>> +		if (INTEL_GEN(i915) < 6)
>> +			ret = -ENODEV;
>> +		break;
>> +	}
>> +	if (ret)
>> +		return ret;
>> +
>> +	event->cpu = cpu;
>> +	if (!event->parent)
>> +		event->destroy = i915_pmu_event_destroy;
>> +
>> +	return 0;
>> +}
>> +
>> +static u64 __i915_pmu_event_read(struct perf_event *event)
>> +{
>> +	struct drm_i915_private *i915 =
>> +		container_of(event->pmu, typeof(*i915), pmu.base);
>> +	u64 val = 0;
>> +
>> +	if (is_engine_event(event)) {
>> +		u8 sample = engine_event_sample(event);
>> +		struct intel_engine_cs *engine;
>> +
>> +		engine = intel_engine_lookup_user(i915,
>> +						  engine_event_class(event),
>> +						  engine_event_instance(event));
>> +
>> +		if (WARN_ON_ONCE(!engine)) {
>> +			/* Do nothing */
>> +		} else {
>> +			val = engine->pmu.sample[sample];
>> +		}
>> +	} else switch (event->attr.config) {
>> +	case I915_PMU_ACTUAL_FREQUENCY:
>> +		val = i915->pmu.sample[__I915_SAMPLE_FREQ_ACT];
>> +		break;
>> +	case I915_PMU_REQUESTED_FREQUENCY:
>> +		val = i915->pmu.sample[__I915_SAMPLE_FREQ_REQ];
>> +		break;
>> +	case I915_PMU_ENERGY:
>> +		val = intel_energy_uJ(i915);
>> +		break;
>> +	case I915_PMU_INTERRUPTS:
>> +		val = count_interrupts(i915);
>> +		break;
>> +	case I915_PMU_RC6_RESIDENCY:
>> +		val = intel_rc6_residency_ns(i915,
>> +					     IS_VALLEYVIEW(i915) ?
>> +					     VLV_GT_RENDER_RC6 :
>> +					     GEN6_GT_GFX_RC6);
>> +		break;
>> +	case I915_PMU_RC6p_RESIDENCY:
>> +		if (!IS_VALLEYVIEW(i915))
>> +			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
>> +		break;
>> +	case I915_PMU_RC6pp_RESIDENCY:
>> +		if (!IS_VALLEYVIEW(i915))
>> +			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
>> +		break;
>> +	}
>> +
>> +	return val;
>> +}
>> +
>> +static void i915_pmu_event_read(struct perf_event *event)
>> +{
>> +
>> +	local64_set(&event->count,
>> +		    __i915_pmu_event_read(event) -
>> +		    local64_read(&event->hw.prev_count));
>> +}
>> +
>> +static void i915_pmu_enable(struct perf_event *event)
>> +{
>> +	struct drm_i915_private *i915 =
>> +		container_of(event->pmu, typeof(*i915), pmu.base);
>> +	unsigned int bit = event_enabled_bit(event);
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&i915->pmu.lock, flags);
>> +
>> +	/*
>> +	 * Start the sampling timer when enabling the first event.
>> +	 */
>> +	if (i915->pmu.enable == 0)
>> +		hrtimer_start_range_ns(&i915->pmu.timer,
>> +				       ns_to_ktime(PERIOD), 0,
>> +				       HRTIMER_MODE_REL_PINNED);
>> +
>> +	/*
>> +	 * Update the bitmask of enabled events and increment
>> +	 * the event reference counter.
>> +	 */
>> +	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
>> +	GEM_BUG_ON(i915->pmu.enable_count[bit] == ~0);
>> +	i915->pmu.enable |= BIT_ULL(bit);
>> +	i915->pmu.enable_count[bit]++;
>> +
>> +	/*
>> +	 * For per-engine events the bitmask and reference counting
>> +	 * is stored per engine.
>> +	 */
>> +	if (is_engine_event(event)) {
>> +		u8 sample = engine_event_sample(event);
>> +		struct intel_engine_cs *engine;
>> +
>> +		engine = intel_engine_lookup_user(i915,
>> +						  engine_event_class(event),
>> +						  engine_event_instance(event));
>> +		GEM_BUG_ON(!engine);
>> +		engine->pmu.enable |= BIT(sample);
>> +
>> +		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
>> +		GEM_BUG_ON(engine->pmu.enable_count[sample] == ~0);
>> +		engine->pmu.enable_count[sample]++;
>> +	}
>> +
>> +	/*
>> +	 * Store the current counter value so we can report the correct delta
>> +	 * for all listeners. Even when the event was already enabled and has
>> +	 * an existing non-zero value.
>> +	 */
>> +	local64_set(&event->hw.prev_count, __i915_pmu_event_read(event));
>> +
>> +	spin_unlock_irqrestore(&i915->pmu.lock, flags);
>> +}
>> +
>> +static void i915_pmu_disable(struct perf_event *event)
>> +{
>> +	struct drm_i915_private *i915 =
>> +		container_of(event->pmu, typeof(*i915), pmu.base);
>> +	unsigned int bit = event_enabled_bit(event);
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&i915->pmu.lock, flags);
>> +
>> +	if (is_engine_event(event)) {
>> +		u8 sample = engine_event_sample(event);
>> +		struct intel_engine_cs *engine;
>> +
>> +		engine = intel_engine_lookup_user(i915,
>> +						  engine_event_class(event),
>> +						  engine_event_instance(event));
>> +		GEM_BUG_ON(!engine);
>> +		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
>> +		GEM_BUG_ON(engine->pmu.enable_count[sample] == 0);
>> +		/*
>> +		 * Decrement the reference count and clear the enabled
>> +		 * bitmask when the last listener on an event goes away.
>> +		 */
>> +		if (--engine->pmu.enable_count[sample] == 0)
>> +			engine->pmu.enable &= ~BIT(sample);
>> +	}
>> +
>> +	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
>> +	GEM_BUG_ON(i915->pmu.enable_count[bit] == 0);
>> +	/*
>> +	 * Decrement the reference count and clear the enabled
>> +	 * bitmask when the last listener on an event goes away.
>> +	 */
>> +	if (--i915->pmu.enable_count[bit] == 0)
>> +		i915->pmu.enable &= ~BIT_ULL(bit);
>> +
>> +	spin_unlock_irqrestore(&i915->pmu.lock, flags);
>> +}
>> +
>> +static void i915_pmu_event_start(struct perf_event *event, int flags)
>> +{
>> +	i915_pmu_enable(event);
>> +	event->hw.state = 0;
>> +}
>> +
>> +static void i915_pmu_event_stop(struct perf_event *event, int flags)
>> +{
>> +	if (flags & PERF_EF_UPDATE)
>> +		i915_pmu_event_read(event);
>> +	i915_pmu_disable(event);
>> +	event->hw.state = PERF_HES_STOPPED;
>> +}
>> +
>> +static int i915_pmu_event_add(struct perf_event *event, int flags)
>> +{
>> +	if (flags & PERF_EF_START)
>> +		i915_pmu_event_start(event, flags);
>> +
>> +	return 0;
>> +}
>> +
>> +static void i915_pmu_event_del(struct perf_event *event, int flags)
>> +{
>> +	i915_pmu_event_stop(event, PERF_EF_UPDATE);
>> +}
>> +
>> +static int i915_pmu_event_event_idx(struct perf_event *event)
>> +{
>> +	return 0;
>> +}
>> +
>> +static ssize_t i915_pmu_format_show(struct device *dev,
>> +				    struct device_attribute *attr, char *buf)
>> +{
>> +        struct dev_ext_attribute *eattr;
>> +
>> +        eattr = container_of(attr, struct dev_ext_attribute, attr);
>> +        return sprintf(buf, "%s\n", (char *) eattr->var);
>> +}
>> +
>> +#define I915_PMU_FORMAT_ATTR(_name, _config)           \
>> +        (&((struct dev_ext_attribute[]) {               \
>> +                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_format_show, NULL), \
>> +                  .var = (void *) _config, }            \
>> +        })[0].attr.attr)
>> +
>> +static struct attribute *i915_pmu_format_attrs[] = {
>> +        I915_PMU_FORMAT_ATTR(i915_eventid, "config:0-42"),
>> +        NULL,
>> +};
>> +
>> +static const struct attribute_group i915_pmu_format_attr_group = {
>> +        .name = "format",
>> +        .attrs = i915_pmu_format_attrs,
>> +};
>> +
>> +static ssize_t i915_pmu_event_show(struct device *dev,
>> +				   struct device_attribute *attr, char *buf)
>> +{
>> +        struct dev_ext_attribute *eattr;
>> +
>> +        eattr = container_of(attr, struct dev_ext_attribute, attr);
>> +        return sprintf(buf, "config=0x%lx\n", (unsigned long) eattr->var);
>> +}
>> +
>> +#define I915_PMU_EVENT_ATTR(_name, _config)            \
>> +        (&((struct dev_ext_attribute[]) {               \
>> +                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_event_show, NULL), \
>> +                  .var = (void *) _config, }            \
>> +         })[0].attr.attr)
>> +
>> +static struct attribute *i915_pmu_events_attrs[] = {
>> +	I915_PMU_EVENT_ATTR(rcs0-queued,
>> +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_RENDER, 0)),
>> +	I915_PMU_EVENT_ATTR(rcs0-busy,
>> +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_RENDER, 0)),
>> +	I915_PMU_EVENT_ATTR(rcs0-wait,
>> +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_RENDER, 0)),
>> +	I915_PMU_EVENT_ATTR(rcs0-sema,
>> +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_RENDER, 0)),
>> +
>> +	I915_PMU_EVENT_ATTR(bcs0-queued,
>> +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_COPY, 0)),
>> +	I915_PMU_EVENT_ATTR(bcs0-busy,
>> +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_COPY, 0)),
>> +	I915_PMU_EVENT_ATTR(bcs0-wait,
>> +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_COPY, 0)),
>> +	I915_PMU_EVENT_ATTR(bcs0-sema,
>> +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_COPY, 0)),
>> +
>> +	I915_PMU_EVENT_ATTR(vcs0-queued,
>> +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 0)),
>> +	I915_PMU_EVENT_ATTR(vcs0-busy,
>> +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 0)),
>> +	I915_PMU_EVENT_ATTR(vcs0-wait,
>> +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 0)),
>> +	I915_PMU_EVENT_ATTR(vcs0-sema,
>> +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 0)),
>> +
>> +	I915_PMU_EVENT_ATTR(vcs1-queued,
>> +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 1)),
>> +	I915_PMU_EVENT_ATTR(vcs1-busy,
>> +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 1)),
>> +	I915_PMU_EVENT_ATTR(vcs1-wait,
>> +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 1)),
>> +	I915_PMU_EVENT_ATTR(vcs1-sema,
>> +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 1)),
>> +
>> +	I915_PMU_EVENT_ATTR(vecs0-queued,
>> +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
>> +	I915_PMU_EVENT_ATTR(vecs0-busy,
>> +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
>> +	I915_PMU_EVENT_ATTR(vecs0-wait,
>> +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
>> +	I915_PMU_EVENT_ATTR(vecs0-sema,
>> +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
>> +
>> +        I915_PMU_EVENT_ATTR(actual-frequency,	 I915_PMU_ACTUAL_FREQUENCY),
>> +        I915_PMU_EVENT_ATTR(requested-frequency, I915_PMU_REQUESTED_FREQUENCY),
>> +        I915_PMU_EVENT_ATTR(energy,		 I915_PMU_ENERGY),
>> +        I915_PMU_EVENT_ATTR(interrupts,		 I915_PMU_INTERRUPTS),
>> +        I915_PMU_EVENT_ATTR(rc6-residency,	 I915_PMU_RC6_RESIDENCY),
>> +        I915_PMU_EVENT_ATTR(rc6p-residency,	 I915_PMU_RC6p_RESIDENCY),
>> +        I915_PMU_EVENT_ATTR(rc6pp-residency,	 I915_PMU_RC6pp_RESIDENCY),
>> +
>> +        NULL,
>> +};
>> +
>> +static const struct attribute_group i915_pmu_events_attr_group = {
>> +        .name = "events",
>> +        .attrs = i915_pmu_events_attrs,
>> +};
>> +
>> +static ssize_t
>> +i915_pmu_get_attr_cpumask(struct device *dev,
>> +			  struct device_attribute *attr,
>> +			  char *buf)
>> +{
>> +	return cpumap_print_to_pagebuf(true, buf, &i915_pmu_cpumask);
>> +}
>> +
>> +static DEVICE_ATTR(cpumask, S_IRUGO, i915_pmu_get_attr_cpumask, NULL);
>> +
>> +static struct attribute *i915_cpumask_attrs[] = {
>> +	&dev_attr_cpumask.attr,
>> +	NULL,
>> +};
>> +
>> +static struct attribute_group i915_pmu_cpumask_attr_group = {
>> +	.attrs = i915_cpumask_attrs,
>> +};
>> +
>> +static const struct attribute_group *i915_pmu_attr_groups[] = {
>> +        &i915_pmu_format_attr_group,
>> +        &i915_pmu_events_attr_group,
>> +	&i915_pmu_cpumask_attr_group,
>> +        NULL
>> +};
>> +
>> +static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
>> +{
>> +	unsigned int target;
>> +
>> +	target = cpumask_any_and(&i915_pmu_cpumask, &i915_pmu_cpumask);
>> +	/* Select the first online CPU as a designated reader. */
>> +	if (target >= nr_cpu_ids)
>> +		cpumask_set_cpu(cpu, &i915_pmu_cpumask);
>> +
>> +	return 0;
>> +}
>> +
>> +static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
>> +{
>> +	struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), node);
>> +	unsigned int target;
>> +
>> +	if (cpumask_test_and_clear_cpu(cpu, &i915_pmu_cpumask)) {
>> +		target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
>> +		/* Migrate events if there is a valid target */
>> +		if (target < nr_cpu_ids) {
>> +			cpumask_set_cpu(target, &i915_pmu_cpumask);
>> +			perf_pmu_migrate_context(&pmu->base, cpu, target);
>> +		}
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +void i915_pmu_register(struct drm_i915_private *i915)
>> +{
>> +	int ret = -ENOTSUPP;
>> +
>> +	if (INTEL_GEN(i915) <= 2)
>> +		goto err;
>> +
>> +	ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
>> +				      "perf/x86/intel/i915:online",
>> +				      i915_pmu_cpu_online,
>> +			              i915_pmu_cpu_offline);
>> +	if (ret)
>> +		goto err;
>> +
>> +	ret = cpuhp_state_add_instance(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
>> +				       &i915->pmu.node);
>> +	if (ret)
>> +		goto err;
>> +
>> +	i915->pmu.base.attr_groups	= i915_pmu_attr_groups;
>> +	i915->pmu.base.task_ctx_nr	= perf_invalid_context;
>> +	i915->pmu.base.event_init	= i915_pmu_event_init;
>> +	i915->pmu.base.add		= i915_pmu_event_add;
>> +	i915->pmu.base.del		= i915_pmu_event_del;
>> +	i915->pmu.base.start		= i915_pmu_event_start;
>> +	i915->pmu.base.stop		= i915_pmu_event_stop;
>> +	i915->pmu.base.read		= i915_pmu_event_read;
>> +	i915->pmu.base.event_idx	= i915_pmu_event_event_idx;
>> +
>> +	spin_lock_init(&i915->pmu.lock);
>> +	hrtimer_init(&i915->pmu.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
>> +	i915->pmu.timer.function = i915_sample;
>> +	i915->pmu.enable = 0;
>> +
>> +	ret = perf_pmu_register(&i915->pmu.base, "i915", -1);
>> +	if (ret)
>> +		goto err;
>> +
>> +	return;
>> +
>> +err:
>> +	i915->pmu.base.event_init = NULL;
>> +	DRM_INFO("Failed to register PMU (err=%d)\n", ret);
>> +}
>> +
>> +void i915_pmu_unregister(struct drm_i915_private *i915)
>> +{
>> +	if (!i915->pmu.base.event_init)
>> +		return;
>> +
>> +	i915->pmu.enable = 0;
>> +
>> +	perf_pmu_unregister(&i915->pmu.base);
>> +	i915->pmu.base.event_init = NULL;
>> +
>> +	hrtimer_cancel(&i915->pmu.timer);
>> +
>> +	cpuhp_state_remove_instance(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
>> +				    &i915->pmu.node);
>> +}
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index 0b03260a3967..8c362e0451c1 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -186,6 +186,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>>   #define VIDEO_ENHANCEMENT_CLASS	2
>>   #define COPY_ENGINE_CLASS	3
>>   #define OTHER_CLASS		4
>> +#define MAX_ENGINE_CLASS	4
>> +
>> +#define MAX_ENGINE_INSTANCE    1
>>   
>>   /* PCI config space */
>>   
>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
>> index 3ae89a9d6241..dbc7abd65f33 100644
>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>> @@ -198,6 +198,15 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
>>   	GEM_BUG_ON(info->class >= ARRAY_SIZE(intel_engine_classes));
>>   	class_info = &intel_engine_classes[info->class];
>>   
>> +	if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
>> +		return -EINVAL;
>> +
>> +	if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
>> +		return -EINVAL;
>> +
>> +	if (GEM_WARN_ON(dev_priv->engine_class[info->class][info->instance]))
>> +		return -EINVAL;
>> +
>>   	GEM_BUG_ON(dev_priv->engine[id]);
>>   	engine = kzalloc(sizeof(*engine), GFP_KERNEL);
>>   	if (!engine)
>> @@ -225,6 +234,7 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
>>   
>>   	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
>>   
>> +	dev_priv->engine_class[info->class][info->instance] = engine;
>>   	dev_priv->engine[id] = engine;
>>   	return 0;
>>   }
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index 268342433a8e..7db4c572ef76 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -2283,3 +2283,28 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
>>   
>>   	return intel_init_ring_buffer(engine);
>>   }
>> +
>> +static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
>> +	[I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
>> +	[I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
>> +	[I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS,
>> +	[I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS,
>> +	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS,
>> +};
>> +
>> +struct intel_engine_cs *
>> +intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance)
>> +{
>> +	if (class >= ARRAY_SIZE(user_class_map))
>> +		return NULL;
>> +
>> +	class = user_class_map[class];
>> +
>> +	if (WARN_ON_ONCE(class > MAX_ENGINE_CLASS))
>> +		return NULL;
>> +
>> +	if (instance > MAX_ENGINE_INSTANCE)
>> +		return NULL;
>> +
>> +	return i915->engine_class[class][instance];
>> +}
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> index 79c0021f3700..cf095b9386f4 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> @@ -245,6 +245,28 @@ struct intel_engine_cs {
>>   		I915_SELFTEST_DECLARE(bool mock : 1);
>>   	} breadcrumbs;
>>   
>> +	struct {
>> +		/**
>> +		 * @enable: Bitmask of enable sample events on this engine.
>> +		 *
>> +		 * Bits correspond to sample event types, for instance
>> +		 * I915_SAMPLE_QUEUED is bit 0 etc.
>> +		 */
>> +		u32 enable;
>> +		/**
>> +		 * @enable_count: Reference count for the enabled samplers.
>> +		 *
>> +		 * Index number corresponds to the bit number from @enable.
>> +		 */
>> +		unsigned int enable_count[I915_PMU_SAMPLE_BITS];
>> +		/**
>> +		 * @sample: Counter value for sampling events.
>> +		 *
>> +		 * Our internal timer stores the current counter in this field.
>> +		 */
>> +		u64 sample[I915_ENGINE_SAMPLE_MAX];
>> +	} pmu;
>> +
>>   	/*
>>   	 * A pool of objects to use as shadow copies of client batch buffers
>>   	 * when the command parser is enabled. Prevents the client from
>> @@ -737,4 +759,7 @@ void intel_engines_reset_default_submission(struct drm_i915_private *i915);
>>   
>>   bool intel_engine_can_store_dword(struct intel_engine_cs *engine);
>>   
>> +struct intel_engine_cs *
>> +intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
>> +
>>   #endif /* _INTEL_RINGBUFFER_H_ */
>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>> index d8d10d932759..6dc0d6fd4e4c 100644
>> --- a/include/uapi/drm/i915_drm.h
>> +++ b/include/uapi/drm/i915_drm.h
>> @@ -86,6 +86,64 @@ enum i915_mocs_table_index {
>>   	I915_MOCS_CACHED,
>>   };
>>   
>> +enum drm_i915_gem_engine_class {
>> +	I915_ENGINE_CLASS_OTHER = 0,
>> +	I915_ENGINE_CLASS_RENDER = 1,
>> +	I915_ENGINE_CLASS_COPY = 2,
>> +	I915_ENGINE_CLASS_VIDEO = 3,
>> +	I915_ENGINE_CLASS_VIDEO_ENHANCE = 4,
>> +	I915_ENGINE_CLASS_MAX /* non-ABI */
>> +};
>> +
>> +/**
>> + * DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
>> + *
>> + */
>> +
>> +enum drm_i915_pmu_engine_sample {
>> +	I915_SAMPLE_QUEUED = 0,
>> +	I915_SAMPLE_BUSY = 1,
>> +	I915_SAMPLE_WAIT = 2,
>> +	I915_SAMPLE_SEMA = 3,
>> +	I915_ENGINE_SAMPLE_MAX /* non-ABI */
>> +};
>> +
>> +#define I915_PMU_SAMPLE_BITS (4)
>> +#define I915_PMU_SAMPLE_MASK (0xf)
>> +#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
>> +#define I915_PMU_CLASS_SHIFT \
>> +	(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)
>> +
>> +#define __I915_PMU_ENGINE(class, instance, sample) \
>> +	((class) << I915_PMU_CLASS_SHIFT | \
>> +	(instance) << I915_PMU_SAMPLE_BITS | \
>> +	(sample))
>> +
>> +#define I915_PMU_ENGINE_QUEUED(class, instance) \
>> +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
>> +
>> +#define I915_PMU_ENGINE_BUSY(class, instance) \
>> +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_BUSY)
>> +
>> +#define I915_PMU_ENGINE_WAIT(class, instance) \
>> +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_WAIT)
>> +
>> +#define I915_PMU_ENGINE_SEMA(class, instance) \
>> +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
>> +
>> +#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
>> +
>> +#define I915_PMU_ACTUAL_FREQUENCY 	__I915_PMU_OTHER(0)
>> +#define I915_PMU_REQUESTED_FREQUENCY	__I915_PMU_OTHER(1)
>> +#define I915_PMU_ENERGY			__I915_PMU_OTHER(2)
>> +#define I915_PMU_INTERRUPTS		__I915_PMU_OTHER(3)
>> +
>> +#define I915_PMU_RC6_RESIDENCY		__I915_PMU_OTHER(4)
>> +#define I915_PMU_RC6p_RESIDENCY		__I915_PMU_OTHER(5)
>> +#define I915_PMU_RC6pp_RESIDENCY	__I915_PMU_OTHER(6)
>> +
>> +#define I915_PMU_LAST I915_PMU_RC6pp_RESIDENCY
>> +
>>   /* Each region is a minimum of 16k, and there are at most 255 of them.
>>    */
>>   #define I915_NR_TEX_REGIONS 255	/* table size 2k - maximum due to use
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 10/11] drm/i915: Export engine stats API to other users
  2017-09-11 15:25 ` [RFC 10/11] drm/i915: Export engine stats API to other users Tvrtko Ursulin
@ 2017-09-12 18:35   ` Ben Widawsky
  2017-09-14 20:26   ` Chris Wilson
  2017-09-29 10:59   ` Joonas Lahtinen
  2 siblings, 0 replies; 56+ messages in thread
From: Ben Widawsky @ 2017-09-12 18:35 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Ben Widawsky, Peter Zijlstra, Intel-gfx

On 17-09-11 16:25:58, Tvrtko Ursulin wrote:
>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
>Other kernel users might want to look at total GPU busyness
>in order to implement things like package power distribution
>algorithms more efficiently.
>
>Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>Cc: Ben Widawsky <benjamin.widawsky@intel.com>
>Cc: Ben Widawsky <ben@bwidawsk.net>

Acked-by: Ben Widawsky <benjamin.widawsky@intel.com>
Acked-by: Ben Widawsky <ben@bwidawsk.net>

>---
> drivers/gpu/drm/i915/intel_engine_cs.c | 3 +++
> 1 file changed, 3 insertions(+)
>
>diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
>index f7dba176989c..e2152dd21b4a 100644
>--- a/drivers/gpu/drm/i915/intel_engine_cs.c
>+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>@@ -1495,6 +1495,7 @@ int intel_enable_engines_stats(struct drm_i915_private *dev_priv)
>
> 	return ret;
> }
>+EXPORT_SYMBOL(intel_enable_engines_stats);
>
> /**
>  * intel_disable_engines_stats() - Disable engine busy tracking on all engines
>@@ -1510,6 +1511,7 @@ void intel_disable_engines_stats(struct drm_i915_private *dev_priv)
> 	for_each_engine(engine, dev_priv, id)
> 		intel_disable_engine_stats(engine);
> }
>+EXPORT_SYMBOL(intel_disable_engines_stats);
>
> /**
>  * intel_engine_get_busy_time() - Return current accumulated engine busyness
>@@ -1557,6 +1559,7 @@ ktime_t intel_engines_get_busy_time(struct drm_i915_private *dev_priv)
>
> 	return total;
> }
>+EXPORT_SYMBOL(intel_engines_get_busy_time);
>
> #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> #include "selftests/mock_engine.c"
>-- 
>2.9.5
>

-- 
Ben Widawsky, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC v3 00/11] i915 PMU and engine busy stats
  2017-09-12 14:54   ` Tvrtko Ursulin
@ 2017-09-12 22:01     ` Rogozhkin, Dmitry V
  2017-09-13  8:54       ` [RFC v6 04/11] drm/i915/pmu: Expose a PMU interface for perf queries Tvrtko Ursulin
  2017-09-13  9:01       ` [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
  0 siblings, 2 replies; 56+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-09-12 22:01 UTC (permalink / raw)
  To: tvrtko.ursulin; +Cc: peterz, Intel-gfx

On Tue, 2017-09-12 at 15:54 +0100, Tvrtko Ursulin wrote:
> On 12/09/2017 03:03, Rogozhkin, Dmitry V wrote:
> > Hi,
> > 
> > I just tried the v3 series. perf-stat works fine. Of the IGT tests which I wrote for the i915 PMU
> > (https://patchwork.freedesktop.org/series/29313/) all pass (assuming pmu.enabled will be exposed
> > in debugfs) except the cpu_online subtest. And this is pretty interesting - see details below.
> > 
> > Ok, be prepared for the long mail:)...
> > 
> > So, the cpu_online subtest loads the RCS0 engine to 100% and starts putting CPUs offline one by one,
> > starting from CPU0 (don't forget to have CONFIG_BOOTPARAM_HOTPLUG_CPU0=y in .config). The test
> > expectation is that the i915 PMU will report almost 100% (within 2% tolerance) busyness on RCS0. The
> > test reads the metric just twice: before running on RCS0 and right after. This fails as follows:
> > 
> > Executed on rcs0 for 32004262184us
> >    i915/rcs0-busy/: 2225723999us
> > (perf_pmu:6325) CRITICAL: Test assertion failure function test_cpu_online, file perf_pmu.c:719:
> > (perf_pmu:6325) CRITICAL: Failed assertion: perf_elapsed(&pm.metrics[0]) > (1-USAGE_TOLERANCE) * elapsed_ns(&start, &now)
> > Stack trace:
> >    #0 [__igt_fail_assert+0xf1]
> >    #1 [__real_main773+0xff1]
> >    #2 [main+0x35]
> >    #3 [__libc_start_main+0xf5]
> >    #4 [_start+0x29]
> >    #5 [<unknown>+0x29]
> > Subtest cpu_online failed.
> > **** DEBUG ****
> > (perf_pmu:6325) DEBUG: Test requirement passed: is_hotplug_cpu0()
> > (perf_pmu:6325) INFO: perf_init: enabled 1 metrics from 1 requested
> > (perf_pmu:6325) ioctl-wrappers-DEBUG: Test requirement passed: gem_has_ring(fd, ring)
> > (perf_pmu:6325) INFO: Executed on rcs0 for 32004262184us
> > (perf_pmu:6325) INFO:   i915/rcs0-busy/: 2225723999us
> > (perf_pmu:6325) CRITICAL: Test assertion failure function test_cpu_online, file perf_pmu.c:719:
> > (perf_pmu:6325) CRITICAL: Failed assertion: perf_elapsed(&pm.metrics[0]) > (1-USAGE_TOLERANCE) * elapsed_ns(&start, &now)
> > 
> > Now. Looks like that by itself PMU context migration works. For example, if you will comment out
> > "perf_pmu_migrate_context(&pmu->base, cpu, target)" you will get:
> > 
> >      Executed on rcs0 for 32004434918us
> >        i915/rcs0-busy/:     76623707us
> > 
> > Compare with previous:
> >      Executed on rcs0 for 32004262184us
> >        i915/rcs0-busy/:    2225723999us
> > 
> > This test passed on the previous set of patches, I mean Tvrtko's v2 series + my patches.
> > 
> > So, it seems we are losing counter values somehow. I saw in the patches that this place was indeed
> > modified - you have added subtraction of the initial counter value:
> > static void i915_pmu_event_read(struct perf_event *event)
> > {
> > 
> > 	local64_set(&event->count,
> > 		    __i915_pmu_event_read(event) -
> > 		    local64_read(&event->hw.prev_count));
> > }
> > 
> > But it looks like the problem is that with the PMU context migration we get a sequence of start/stop
> > (or maybe add/del) events which eventually call our i915_pmu_enable/disable. Here is the dmesg log
> > with the obvious printks added:
> > 
> > [  153.971096] [IGT] perf_pmu: starting subtest cpu_online
> > [  153.971151] i915_pmu_enable: event->hw.prev_count=0
> > [  154.036015] i915_pmu_disable: event->hw.prev_count=0
> > [  154.048027] i915_pmu_enable: event->hw.prev_count=0
> > [  154.049343] smpboot: CPU 0 is now offline
> > [  155.059028] smpboot: Booting Node 0 Processor 0 APIC 0x0
> > [  155.155078] smpboot: CPU 1 is now offline
> > [  156.161026] x86: Booting SMP configuration:
> > [  156.161027] smpboot: Booting Node 0 Processor 1 APIC 0x2
> > [  156.197065] IRQ 122: no longer affine to CPU2
> > [  156.198087] smpboot: CPU 2 is now offline
> > [  157.208028] smpboot: Booting Node 0 Processor 2 APIC 0x4
> > [  157.263093] smpboot: CPU 3 is now offline
> > [  158.273026] smpboot: Booting Node 0 Processor 3 APIC 0x6
> > [  158.310026] i915_pmu_disable: event->hw.prev_count=76648307
> > [  158.319020] i915_pmu_enable: event->hw.prev_count=76648307
> > [  158.319098] IRQ 124: no longer affine to CPU4
> > [  158.320368] smpboot: CPU 4 is now offline
> > [  159.326030] smpboot: Booting Node 0 Processor 4 APIC 0x1
> > [  159.365306] smpboot: CPU 5 is now offline
> > [  160.371030] smpboot: Booting Node 0 Processor 5 APIC 0x3
> > [  160.421077] IRQ 125: no longer affine to CPU6
> > [  160.422093] smpboot: CPU 6 is now offline
> > [  161.429030] smpboot: Booting Node 0 Processor 6 APIC 0x5
> > [  161.467091] smpboot: CPU 7 is now offline
> > [  162.473027] smpboot: Booting Node 0 Processor 7 APIC 0x7
> > [  162.527019] i915_pmu_disable: event->hw.prev_count=4347548222
> > [  162.546017] i915_pmu_enable: event->hw.prev_count=4347548222
> > [  162.547317] smpboot: CPU 0 is now offline
> > [  163.553028] smpboot: Booting Node 0 Processor 0 APIC 0x0
> > [  163.621089] smpboot: CPU 1 is now offline
> > [  164.627028] x86: Booting SMP configuration:
> > [  164.627029] smpboot: Booting Node 0 Processor 1 APIC 0x2
> > [  164.669308] smpboot: CPU 2 is now offline
> > [  165.679025] smpboot: Booting Node 0 Processor 2 APIC 0x4
> > [  165.717089] smpboot: CPU 3 is now offline
> > [  166.723025] smpboot: Booting Node 0 Processor 3 APIC 0x6
> > [  166.775016] i915_pmu_disable: event->hw.prev_count=8574197312
> > [  166.787016] i915_pmu_enable: event->hw.prev_count=8574197312
> > [  166.788309] smpboot: CPU 4 is now offline
> > [  167.794025] smpboot: Booting Node 0 Processor 4 APIC 0x1
> > [  167.837114] smpboot: CPU 5 is now offline
> > [  168.847025] smpboot: Booting Node 0 Processor 5 APIC 0x3
> > [  168.889312] smpboot: CPU 6 is now offline
> > [  169.899030] smpboot: Booting Node 0 Processor 6 APIC 0x5
> > [  169.944104] smpboot: CPU 7 is now offline
> > [  170.954032] smpboot: Booting Node 0 Processor 7 APIC 0x7
> > [  171.000016] i915_pmu_disable: event->hw.prev_count=12815138319
> > [  171.008017] i915_pmu_enable: event->hw.prev_count=12815138319
> > [  171.009304] smpboot: CPU 0 is now offline
> > [  172.017028] smpboot: Booting Node 0 Processor 0 APIC 0x0
> > [  172.096104] smpboot: CPU 1 is now offline
> > [  173.106025] x86: Booting SMP configuration:
> > [  173.106026] smpboot: Booting Node 0 Processor 1 APIC 0x2
> > [  173.147078] smpboot: CPU 2 is now offline
> > [  174.153025] smpboot: Booting Node 0 Processor 2 APIC 0x4
> > [  174.192093] smpboot: CPU 3 is now offline
> > [  175.198028] smpboot: Booting Node 0 Processor 3 APIC 0x6
> > [  175.229042] i915_pmu_disable: event->hw.prev_count=17035889079
> > [  175.242030] i915_pmu_enable: event->hw.prev_count=17035889079
> > [  175.242163] IRQ fixup: irq 120 move in progress, old vector 131
> > [  175.242165] IRQ fixup: irq 121 move in progress, old vector 147
> > [  175.242171] IRQ 124: no longer affine to CPU4
> > [  175.243435] smpboot: CPU 4 is now offline
> > [  176.248040] smpboot: Booting Node 0 Processor 4 APIC 0x1
> > [  176.285328] smpboot: CPU 5 is now offline
> > [  177.296039] smpboot: Booting Node 0 Processor 5 APIC 0x3
> > [  177.325067] IRQ 125: no longer affine to CPU6
> > [  177.326087] smpboot: CPU 6 is now offline
> > [  178.335036] smpboot: Booting Node 0 Processor 6 APIC 0x5
> > [  178.377063] IRQ 122: no longer affine to CPU7
> > [  178.378086] smpboot: CPU 7 is now offline
> > [  179.388028] smpboot: Booting Node 0 Processor 7 APIC 0x7
> > [  179.454030] i915_pmu_disable: event->hw.prev_count=21269856967
> > [  179.470026] i915_pmu_enable: event->hw.prev_count=21269856967
> > [  179.471110] smpboot: CPU 0 is now offline
> > [  180.481028] smpboot: Booting Node 0 Processor 0 APIC 0x0
> > [  180.551075] smpboot: CPU 1 is now offline
> > [  181.558029] x86: Booting SMP configuration:
> > [  181.558030] smpboot: Booting Node 0 Processor 1 APIC 0x2
> > [  181.595096] smpboot: CPU 2 is now offline
> > [  182.605029] smpboot: Booting Node 0 Processor 2 APIC 0x4
> > [  182.657084] smpboot: CPU 3 is now offline
> > [  183.668030] smpboot: Booting Node 0 Processor 3 APIC 0x6
> > [  183.709017] i915_pmu_disable: event->hw.prev_count=25497358644
> > [  183.727016] i915_pmu_enable: event->hw.prev_count=25497358644
> > [  183.728305] smpboot: CPU 4 is now offline
> > [  184.734027] smpboot: Booting Node 0 Processor 4 APIC 0x1
> > [  184.767090] smpboot: CPU 5 is now offline
> > [  185.777036] smpboot: Booting Node 0 Processor 5 APIC 0x3
> > [  185.823096] smpboot: CPU 6 is now offline
> > [  186.829051] smpboot: Booting Node 0 Processor 6 APIC 0x5
> > [  186.856350] smpboot: CPU 7 is now offline
> > [  187.862051] smpboot: Booting Node 0 Processor 7 APIC 0x7
> > [  187.871216] [IGT] perf_pmu: exiting, ret=99
> > [  187.889199] Console: switching to colour frame buffer device 240x67
> > [  187.889583] i915_pmu_disable: event->hw.prev_count=29754080941
> > 
> > And the results which I got in userspace for this run were:
> >      Executed on rcs0 for 32003587981us
> >        i915/rcs0-busy/: 2247436461us
> > 
> > After that I decided to roll back the delta counting change which I mentioned before, i.e.:
> > static void i915_pmu_event_read(struct perf_event *event)
> > {
> > 
> > 	local64_set(&event->count,
> > 		    __i915_pmu_event_read(event) /*-
> > 		    local64_read(&event->hw.prev_count)*/);
> > }
> > 
> > And with that the test PASSED :):
> >      Executed on rcs0 for 32002282603us
> >        i915/rcs0-busy/: 31998855052us
> >      Subtest cpu_online: SUCCESS (33.950s)
> > 
> > At this point I need to go home :). Maybe you will have time to look into this issue? If not, I will continue
> > tomorrow.
> 
> I forgot to run this test since I did not have the kernel feature 
> enabled. But yes, now that I tried it, it is failing.
> 
> What is happening is that event del (so counter stop as well) is getting 
> called when the CPU goes offline, followed by add->start, and the 
> initial counter value then gets reloaded.
> 
> I don't see a way for i915 to distinguish between userspace 
> starting/stopping the event, and perf core doing the same in the CPU 
> migration process. Perhaps Peter could help here?
> 
> I am storing the initial counter value when the counter is started so 
> that I can report its relative value - in other words, the change from 
> event start to stop. Perhaps that is not correct and should be left to 
> userspace to handle?
> 
> Otherwise we have counters like energy use, and even engine busyness, 
> which can already be at some large value before PMU monitoring starts, 
> which makes things like "perf stat -a -I <command>", or even just 
> normal "perf stat <command>", attribute all previous usage (from 
> before the command profiling started) to the reported stats.
> 

Actually, that's pretty easy to fix. The following patch does that:

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index bce4951..277098d 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -390,10 +390,18 @@ static u64 __i915_pmu_event_read(struct perf_event *event)
 
 static void i915_pmu_event_read(struct perf_event *event)
 {
+       struct hw_perf_event *hwc = &event->hw;
+       u64 prev_raw_count, new_raw_count;
 
-       local64_set(&event->count,
-                   __i915_pmu_event_read(event) -
-                   local64_read(&event->hw.prev_count));
+again:
+       prev_raw_count = local64_read(&hwc->prev_count);
+       new_raw_count = __i915_pmu_event_read(event);
+
+       if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
+                           new_raw_count) != prev_raw_count)
+               goto again;
+
+       local64_add(new_raw_count - prev_raw_count, &event->count);
 }


I believe you need to squash it to the major i915 PMU enabling one.

So, the idea is:
1. event->count contains the current counter value; it is ever growing for
the particular event _instance_, i.e. even if the event is stopped/started
or added/deleted, it holds an ever growing value for as long as the event
exists (is not destroyed).
2. Since it is ever growing, in read() we always add a _delta_ to the
event count, where the starting point is when the event got enabled
(started).
3. On PMU context migration to another CPU we will be issued a call to
del(PERF_EF_UPDATE). Thus, here is the trick:
3.1. The first thing we do is _update_ the event count, adding to it
everything we gathered on the previous CPU.
3.2. The second thing - we update event->hw.prev_count to the new value.
Next time we will compute the delta starting from it.

With this, all the tests I have for the PMU passed. The code above is
taken from arch/x86/events/intel/cstate.c as is, so that's really how it
should be.

And... I should say that was the last technical problem I saw for the
i915 PMU implementation. With it the puzzle looks complete to me. :)
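
For completeness, this is roughly how the counters end up being consumed
from userspace once the above is in place - a minimal sketch rather than
the exact IGT code, with the config value spelled out from the uAPI
macros in the patch and the sysfs path as I understand the perf core
exports it:

/* Open and read the rcs0 busy counter via perf_event_open(2). */
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	struct perf_event_attr attr;
	uint64_t before, after;
	unsigned int type;
	FILE *f;
	int fd;

	/* Dynamic PMU type id which perf core assigns to the registered "i915" PMU. */
	f = fopen("/sys/bus/event_source/devices/i915/type", "r");
	if (!f || fscanf(f, "%u", &type) != 1)
		return 1;
	fclose(f);

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = type;
	/* I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_RENDER, 0):
	 * class 1 << 12 | instance 0 << 4 | I915_SAMPLE_BUSY (1). */
	attr.config = (1 << 12) | (0 << 4) | 1;

	/* Uncore PMU: no task context, so pid is -1 and a CPU from the
	 * exported cpumask (CPU0 with this series) is passed. */
	fd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
	if (fd < 0)
		return 1;

	read(fd, &before, sizeof(before));
	/* ... submit work to rcs0 and/or take CPUs offline here ... */
	sleep(1);
	read(fd, &after, sizeof(after));

	printf("rcs0-busy delta: %llu\n", (unsigned long long)(after - before));
	close(fd);
	return 0;
}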

> Regards,
> 
> Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC v6 04/11] drm/i915/pmu: Expose a PMU interface for perf queries
  2017-09-12 22:01     ` Rogozhkin, Dmitry V
@ 2017-09-13  8:54       ` Tvrtko Ursulin
  2017-09-13  9:01       ` [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
  1 sibling, 0 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-13  8:54 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Peter Zijlstra

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

From: Chris Wilson <chris@chris-wilson.co.uk>
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

The first goal is to be able to measure GPU (and individual ring) busyness
without having to poll registers from userspace. (Which not only incurs
holding the forcewake lock indefinitely, perturbing the system, but also
runs the risk of hanging the machine.) As an alternative we can use the
perf event counter interface to sample the ring registers periodically
and send those results to userspace.

To be able to do so, we need to export the two symbols from
kernel/events/core.c to register and unregister a PMU device.
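
In outline, the registration side boils down to filling in a struct pmu
with the i915 callbacks and handing it to those (now exported) helpers -
a simplified sketch of what i915_pmu_register() in this patch does, with
the attribute groups, locking and sampling timer setup omitted:

	i915->pmu.base.task_ctx_nr = perf_invalid_context; /* uncore, no per-task events */
	i915->pmu.base.event_init  = i915_pmu_event_init;
	i915->pmu.base.add         = i915_pmu_event_add;
	i915->pmu.base.del         = i915_pmu_event_del;
	i915->pmu.base.start       = i915_pmu_event_start;
	i915->pmu.base.stop        = i915_pmu_event_stop;
	i915->pmu.base.read        = i915_pmu_event_read;

	perf_pmu_register(&i915->pmu.base, "i915", -1);   /* exported from kernel/events/core.c */
	/* ... and on driver unload: */
	perf_pmu_unregister(&i915->pmu.base);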

v1-v2 (Chris Wilson):

v2: Use a common timer for the ring sampling.

v3: (Tvrtko Ursulin)
 * Decouple uAPI from i915 engine ids.
 * Complete uAPI defines.
 * Refactor some code to helpers for clarity.
 * Skip sampling disabled engines.
 * Expose counters in sysfs.
 * Pass in fake regs to avoid null ptr deref in perf core.
 * Convert to class/instance uAPI.
 * Use shared driver code for rc6 residency, power and frequency.

v4: (Dmitry Rogozhkin)
 * Register PMU with .task_ctx_nr=perf_invalid_context
 * Expose cpumask for the PMU with the single CPU in the mask
 * Properly support pmu->stop(): it should call pmu->read()
 * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE)
 * Introduce refcounting of event subscriptions.
 * Make pmu.busy_stats a refcounter to avoid busy stats going away
   with some deleted event.
 * Expose cpumask for i915 PMU to avoid multiple events creation of
   the same type followed by counter aggregation by perf-stat.
 * Track CPUs getting online/offline to migrate perf context. If (likely)
   cpumask will initially set CPU0, CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be
   needed to see effect of CPU status tracking.
 * End result is that only global events are supported and perf stat
   works correctly.
 * Deny perf driver level sampling - it is prohibited for uncore PMU.

v5: (Tvrtko Ursulin)

 * Don't hardcode number of engine samplers.
 * Rewrite event ref-counting for correctness and simplicity.
 * Store initial counter value when starting already enabled events
   to correctly report values to all listeners.
 * Fix RC6 residency readout.
 * Comments, GPL header.

v6:
 * Add missing entry to v4 changelog.
 * Fix accounting in CPU hotplug case by copying the approach from
   arch/x86/events/intel/cstate.c. (Dmitry Rogozhkin)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 drivers/gpu/drm/i915/Makefile           |   1 +
 drivers/gpu/drm/i915/i915_drv.c         |   2 +
 drivers/gpu/drm/i915/i915_drv.h         |  76 ++++
 drivers/gpu/drm/i915/i915_pmu.c         | 693 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h         |   3 +
 drivers/gpu/drm/i915/intel_engine_cs.c  |  10 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |  25 ++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  25 ++
 include/uapi/drm/i915_drm.h             |  58 +++
 9 files changed, 893 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_pmu.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 1cb8059a3a16..7b3a0eca62b6 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -26,6 +26,7 @@ i915-y := i915_drv.o \
 
 i915-$(CONFIG_COMPAT)   += i915_ioc32.o
 i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o intel_pipe_crc.o
+i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
 
 # GEM code
 i915-y += i915_cmd_parser.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 5c111ea96e80..b1f96eb1be16 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1196,6 +1196,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
 	struct drm_device *dev = &dev_priv->drm;
 
 	i915_gem_shrinker_init(dev_priv);
+	i915_pmu_register(dev_priv);
 
 	/*
 	 * Notify a valid surface after modesetting,
@@ -1250,6 +1251,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
 	intel_opregion_unregister(dev_priv);
 
 	i915_perf_unregister(dev_priv);
+	i915_pmu_unregister(dev_priv);
 
 	i915_teardown_sysfs(dev_priv);
 	i915_guc_log_unregister(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 48daf9552163..62646b8dfb7a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -40,6 +40,7 @@
 #include <linux/hash.h>
 #include <linux/intel-iommu.h>
 #include <linux/kref.h>
+#include <linux/perf_event.h>
 #include <linux/pm_qos.h>
 #include <linux/reservation.h>
 #include <linux/shmem_fs.h>
@@ -2190,6 +2191,69 @@ struct intel_cdclk_state {
 	unsigned int cdclk, vco, ref;
 };
 
+enum {
+	__I915_SAMPLE_FREQ_ACT = 0,
+	__I915_SAMPLE_FREQ_REQ,
+	__I915_NUM_PMU_SAMPLERS
+};
+
+/**
+ * How many different events we track in the global PMU mask.
+ *
+ * It is also used to know the needed number of event reference counters.
+ */
+#define I915_PMU_MASK_BITS \
+	(1 << I915_PMU_SAMPLE_BITS) + (I915_PMU_LAST + 1 - __I915_PMU_OTHER(0))
+
+struct i915_pmu {
+	/**
+	 * @node: List node for CPU hotplug handling.
+	 */
+	struct hlist_node node;
+	/**
+	 * @base: PMU base.
+	 */
+	struct pmu base;
+	/**
+	 * @lock: Lock protecting enable mask and ref count handling.
+	 */
+	spinlock_t lock;
+	/**
+	 * @timer: Timer for internal i915 PMU sampling.
+	 */
+	struct hrtimer timer;
+	/**
+	 * @enable: Bitmask of all currently enabled events.
+	 *
+	 * Bits are derived from uAPI event numbers in a way that low 16 bits
+	 * correspond to engine event _sample_ _type_ (I915_SAMPLE_QUEUED is
+	 * bit 0), and higher bits correspond to other events (for instance
+	 * I915_PMU_ACTUAL_FREQUENCY is bit 16 etc).
+	 *
+	 * In other words, low 16 bits are not per engine but per engine
+	 * sampler type, while the upper bits are directly mapped to other
+	 * event types.
+	 */
+	u64 enable;
+	/**
+	 * @enable_count: Reference counts for the enabled events.
+	 *
+	 * Array indices are mapped in the same way as bits in the @enable field
+	 * and they are used to control sampling on/off when multiple clients
+	 * are using the PMU API.
+	 */
+	unsigned int enable_count[I915_PMU_MASK_BITS];
+	/**
+	 * @sample: Current counter value for i915 events which need sampling.
+	 *
+	 * These counters are updated from the i915 PMU sampling timer.
+	 *
+	 * Only global counters are held here, while the per-engine ones are in
+	 * struct intel_engine_cs.
+	 */
+	u64 sample[__I915_NUM_PMU_SAMPLERS];
+};
+
 struct drm_i915_private {
 	struct drm_device drm;
 
@@ -2238,6 +2302,7 @@ struct drm_i915_private {
 	struct pci_dev *bridge_dev;
 	struct i915_gem_context *kernel_context;
 	struct intel_engine_cs *engine[I915_NUM_ENGINES];
+	struct intel_engine_cs *engine_class[MAX_ENGINE_CLASS + 1][MAX_ENGINE_INSTANCE + 1];
 	struct i915_vma *semaphore;
 
 	struct drm_dma_handle *status_page_dmah;
@@ -2698,6 +2763,8 @@ struct drm_i915_private {
 		int	irq;
 	} lpe_audio;
 
+	struct i915_pmu pmu;
+
 	/*
 	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
 	 * will be rejected. Instead look for a better place.
@@ -3918,6 +3985,15 @@ extern void i915_perf_fini(struct drm_i915_private *dev_priv);
 extern void i915_perf_register(struct drm_i915_private *dev_priv);
 extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
 
+/* i915_pmu.c */
+#ifdef CONFIG_PERF_EVENTS
+extern void i915_pmu_register(struct drm_i915_private *i915);
+extern void i915_pmu_unregister(struct drm_i915_private *i915);
+#else
+static inline void i915_pmu_register(struct drm_i915_private *i915) {}
+static inline void i915_pmu_unregister(struct drm_i915_private *i915) {}
+#endif
+
 /* i915_suspend.c */
 extern int i915_save_state(struct drm_i915_private *dev_priv);
 extern int i915_restore_state(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
new file mode 100644
index 000000000000..020be5833ee6
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -0,0 +1,693 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include <linux/perf_event.h>
+#include <linux/pm_runtime.h>
+
+#include "i915_drv.h"
+#include "intel_ringbuffer.h"
+
+/* Frequency for the sampling timer for events which need it. */
+#define FREQUENCY 200
+#define PERIOD max_t(u64, 10000, NSEC_PER_SEC / FREQUENCY)
+
+#define ENGINE_SAMPLE_MASK \
+	(BIT(I915_SAMPLE_QUEUED) | \
+	 BIT(I915_SAMPLE_BUSY) | \
+	 BIT(I915_SAMPLE_WAIT) | \
+	 BIT(I915_SAMPLE_SEMA))
+
+#define ENGINE_SAMPLE_BITS (1 << I915_PMU_SAMPLE_BITS)
+
+static cpumask_t i915_pmu_cpumask = CPU_MASK_NONE;
+
+static u8 engine_config_sample(u64 config)
+{
+	return config & I915_PMU_SAMPLE_MASK;
+}
+
+static u8 engine_event_sample(struct perf_event *event)
+{
+	return engine_config_sample(event->attr.config);
+}
+
+static u8 engine_event_class(struct perf_event *event)
+{
+	return (event->attr.config >> I915_PMU_CLASS_SHIFT) & 0xff;
+}
+
+static u8 engine_event_instance(struct perf_event *event)
+{
+	return (event->attr.config >> I915_PMU_SAMPLE_BITS) & 0xff;
+}
+
+static bool is_engine_config(u64 config)
+{
+	return config < __I915_PMU_OTHER(0);
+}
+
+static unsigned int config_enabled_bit(u64 config)
+{
+	if (is_engine_config(config))
+		return engine_config_sample(config);
+	else
+		return ENGINE_SAMPLE_BITS + (config - __I915_PMU_OTHER(0));
+}
+
+static u64 config_enabled_mask(u64 config)
+{
+	return BIT_ULL(config_enabled_bit(config));
+}
+
+static bool is_engine_event(struct perf_event *event)
+{
+	return is_engine_config(event->attr.config);
+}
+
+static unsigned int event_enabled_bit(struct perf_event *event)
+{
+	return config_enabled_bit(event->attr.config);
+}
+
+static bool grab_forcewake(struct drm_i915_private *i915, bool fw)
+{
+	if (!fw)
+		intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
+
+	return true;
+}
+
+static void engines_sample(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	bool fw = false;
+
+	if ((dev_priv->pmu.enable & ENGINE_SAMPLE_MASK) == 0)
+		return;
+
+	if (!dev_priv->gt.awake)
+		return;
+
+	if (!intel_runtime_pm_get_if_in_use(dev_priv))
+		return;
+
+	for_each_engine(engine, dev_priv, id) {
+		u32 enable = engine->pmu.enable;
+
+		if (i915_seqno_passed(intel_engine_get_seqno(engine),
+				      intel_engine_last_submit(engine)))
+			continue;
+
+		if (enable & BIT(I915_SAMPLE_QUEUED))
+			engine->pmu.sample[I915_SAMPLE_QUEUED] += PERIOD;
+
+		if (enable & BIT(I915_SAMPLE_BUSY)) {
+			u32 val;
+
+			fw = grab_forcewake(dev_priv, fw);
+			val = I915_READ_FW(RING_MI_MODE(engine->mmio_base));
+			if (!(val & MODE_IDLE))
+				engine->pmu.sample[I915_SAMPLE_BUSY] += PERIOD;
+		}
+
+		if (enable & (BIT(I915_SAMPLE_WAIT) | BIT(I915_SAMPLE_SEMA))) {
+			u32 val;
+
+			fw = grab_forcewake(dev_priv, fw);
+			val = I915_READ_FW(RING_CTL(engine->mmio_base));
+			if ((enable & BIT(I915_SAMPLE_WAIT)) &&
+			    (val & RING_WAIT))
+				engine->pmu.sample[I915_SAMPLE_WAIT] += PERIOD;
+			if ((enable & BIT(I915_SAMPLE_SEMA)) &&
+			    (val & RING_WAIT_SEMAPHORE))
+				engine->pmu.sample[I915_SAMPLE_SEMA] += PERIOD;
+		}
+	}
+
+	if (fw)
+		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+	intel_runtime_pm_put(dev_priv);
+}
+
+static void frequency_sample(struct drm_i915_private *dev_priv)
+{
+	if (dev_priv->pmu.enable &
+	    config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY)) {
+		u64 val;
+
+		val = dev_priv->rps.cur_freq;
+		if (dev_priv->gt.awake &&
+		    intel_runtime_pm_get_if_in_use(dev_priv)) {
+			val = intel_get_cagf(dev_priv,
+					     I915_READ_NOTRACE(GEN6_RPSTAT1));
+			intel_runtime_pm_put(dev_priv);
+		}
+		val = intel_gpu_freq(dev_priv, val);
+		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_ACT] += val * PERIOD;
+	}
+
+	if (dev_priv->pmu.enable &
+	    config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY)) {
+		u64 val = intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq);
+		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_REQ] += val * PERIOD;
+	}
+}
+
+static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
+{
+	struct drm_i915_private *i915 =
+		container_of(hrtimer, struct drm_i915_private, pmu.timer);
+
+	if (i915->pmu.enable == 0)
+		return HRTIMER_NORESTART;
+
+	engines_sample(i915);
+	frequency_sample(i915);
+
+	hrtimer_forward_now(hrtimer, ns_to_ktime(PERIOD));
+	return HRTIMER_RESTART;
+}
+
+static u64 count_interrupts(struct drm_i915_private *i915)
+{
+	/* open-coded kstat_irqs() */
+	struct irq_desc *desc = irq_to_desc(i915->drm.pdev->irq);
+	u64 sum = 0;
+	int cpu;
+
+	if (!desc || !desc->kstat_irqs)
+		return 0;
+
+	for_each_possible_cpu(cpu)
+		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
+
+	return sum;
+}
+
+static void i915_pmu_event_destroy(struct perf_event *event)
+{
+	WARN_ON(event->parent);
+}
+
+static int engine_event_init(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+
+	if (!intel_engine_lookup_user(i915, engine_event_class(event),
+				      engine_event_instance(event)))
+		return -ENODEV;
+
+	switch (engine_event_sample(event)) {
+	case I915_SAMPLE_QUEUED:
+	case I915_SAMPLE_BUSY:
+	case I915_SAMPLE_WAIT:
+		break;
+	case I915_SAMPLE_SEMA:
+		if (INTEL_GEN(i915) < 6)
+			return -ENODEV;
+		break;
+	default:
+		return -ENOENT;
+	}
+
+	return 0;
+}
+
+static int i915_pmu_event_init(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	int cpu, ret;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/* unsupported modes and filters */
+	if (event->attr.sample_period) /* no sampling */
+		return -EINVAL;
+
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
+	if (event->cpu < 0)
+		return -EINVAL;
+
+	cpu = cpumask_any_and(&i915_pmu_cpumask,
+			      topology_sibling_cpumask(event->cpu));
+	if (cpu >= nr_cpu_ids)
+		return -ENODEV;
+
+	ret = 0;
+	if (is_engine_event(event)) {
+		ret = engine_event_init(event);
+	} else switch (event->attr.config) {
+	case I915_PMU_ACTUAL_FREQUENCY:
+		if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
+			ret = -ENODEV; /* requires a mutex for sampling! */
+	case I915_PMU_REQUESTED_FREQUENCY:
+	case I915_PMU_ENERGY:
+	case I915_PMU_RC6_RESIDENCY:
+	case I915_PMU_RC6p_RESIDENCY:
+	case I915_PMU_RC6pp_RESIDENCY:
+		if (INTEL_GEN(i915) < 6)
+			ret = -ENODEV;
+		break;
+	}
+	if (ret)
+		return ret;
+
+	event->cpu = cpu;
+	if (!event->parent)
+		event->destroy = i915_pmu_event_destroy;
+
+	return 0;
+}
+
+static u64 __i915_pmu_event_read(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	u64 val = 0;
+
+	if (is_engine_event(event)) {
+		u8 sample = engine_event_sample(event);
+		struct intel_engine_cs *engine;
+
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+
+		if (WARN_ON_ONCE(!engine)) {
+			/* Do nothing */
+		} else {
+			val = engine->pmu.sample[sample];
+		}
+	} else switch (event->attr.config) {
+	case I915_PMU_ACTUAL_FREQUENCY:
+		val = i915->pmu.sample[__I915_SAMPLE_FREQ_ACT];
+		break;
+	case I915_PMU_REQUESTED_FREQUENCY:
+		val = i915->pmu.sample[__I915_SAMPLE_FREQ_REQ];
+		break;
+	case I915_PMU_ENERGY:
+		val = intel_energy_uJ(i915);
+		break;
+	case I915_PMU_INTERRUPTS:
+		val = count_interrupts(i915);
+		break;
+	case I915_PMU_RC6_RESIDENCY:
+		val = intel_rc6_residency_ns(i915,
+					     IS_VALLEYVIEW(i915) ?
+					     VLV_GT_RENDER_RC6 :
+					     GEN6_GT_GFX_RC6);
+		break;
+	case I915_PMU_RC6p_RESIDENCY:
+		if (!IS_VALLEYVIEW(i915))
+			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
+		break;
+	case I915_PMU_RC6pp_RESIDENCY:
+		if (!IS_VALLEYVIEW(i915))
+			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
+		break;
+	}
+
+	return val;
+}
+
+static void i915_pmu_event_read(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev, new;
+
+again:
+	prev = local64_read(&hwc->prev_count);
+	new = __i915_pmu_event_read(event);
+
+	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
+		goto again;
+
+	local64_add(new - prev, &event->count);
+}
+
+static void i915_pmu_enable(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	unsigned int bit = event_enabled_bit(event);
+	unsigned long flags;
+
+	spin_lock_irqsave(&i915->pmu.lock, flags);
+
+	/*
+	 * Start the sampling timer when enabling the first event.
+	 */
+	if (i915->pmu.enable == 0)
+		hrtimer_start_range_ns(&i915->pmu.timer,
+				       ns_to_ktime(PERIOD), 0,
+				       HRTIMER_MODE_REL_PINNED);
+
+	/*
+	 * Update the bitmask of enabled events and increment
+	 * the event reference counter.
+	 */
+	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
+	GEM_BUG_ON(i915->pmu.enable_count[bit] == ~0);
+	i915->pmu.enable |= BIT_ULL(bit);
+	i915->pmu.enable_count[bit]++;
+
+	/*
+	 * For per-engine events the bitmask and reference counting
+	 * is stored per engine.
+	 */
+	if (is_engine_event(event)) {
+		u8 sample = engine_event_sample(event);
+		struct intel_engine_cs *engine;
+
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+		GEM_BUG_ON(!engine);
+		engine->pmu.enable |= BIT(sample);
+
+		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
+		GEM_BUG_ON(engine->pmu.enable_count[sample] == ~0);
+		engine->pmu.enable_count[sample]++;
+	}
+
+	/*
+	 * Store the current counter value so we can report the correct delta
+	 * for all listeners. Even when the event was already enabled and has
+	 * an existing non-zero value.
+	 */
+	local64_set(&event->hw.prev_count, __i915_pmu_event_read(event));
+
+	spin_unlock_irqrestore(&i915->pmu.lock, flags);
+}
+
+static void i915_pmu_disable(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	unsigned int bit = event_enabled_bit(event);
+	unsigned long flags;
+
+	spin_lock_irqsave(&i915->pmu.lock, flags);
+
+	if (is_engine_event(event)) {
+		u8 sample = engine_event_sample(event);
+		struct intel_engine_cs *engine;
+
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+		GEM_BUG_ON(!engine);
+		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
+		GEM_BUG_ON(engine->pmu.enable_count[sample] == 0);
+		/*
+		 * Decrement the reference count and clear the enabled
+		 * bitmask when the last listener on an event goes away.
+		 */
+		if (--engine->pmu.enable_count[sample] == 0)
+			engine->pmu.enable &= ~BIT(sample);
+	}
+
+	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
+	GEM_BUG_ON(i915->pmu.enable_count[bit] == 0);
+	/*
+	 * Decrement the reference count and clear the enabled
+	 * bitmask when the last listener on an event goes away.
+	 */
+	if (--i915->pmu.enable_count[bit] == 0)
+		i915->pmu.enable &= ~BIT_ULL(bit);
+
+	spin_unlock_irqrestore(&i915->pmu.lock, flags);
+}
+
+static void i915_pmu_event_start(struct perf_event *event, int flags)
+{
+	i915_pmu_enable(event);
+	event->hw.state = 0;
+}
+
+static void i915_pmu_event_stop(struct perf_event *event, int flags)
+{
+	if (flags & PERF_EF_UPDATE)
+		i915_pmu_event_read(event);
+	i915_pmu_disable(event);
+	event->hw.state = PERF_HES_STOPPED;
+}
+
+static int i915_pmu_event_add(struct perf_event *event, int flags)
+{
+	if (flags & PERF_EF_START)
+		i915_pmu_event_start(event, flags);
+
+	return 0;
+}
+
+static void i915_pmu_event_del(struct perf_event *event, int flags)
+{
+	i915_pmu_event_stop(event, PERF_EF_UPDATE);
+}
+
+static int i915_pmu_event_event_idx(struct perf_event *event)
+{
+	return 0;
+}
+
+static ssize_t i915_pmu_format_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+        struct dev_ext_attribute *eattr;
+
+        eattr = container_of(attr, struct dev_ext_attribute, attr);
+        return sprintf(buf, "%s\n", (char *) eattr->var);
+}
+
+#define I915_PMU_FORMAT_ATTR(_name, _config)           \
+        (&((struct dev_ext_attribute[]) {               \
+                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_format_show, NULL), \
+                  .var = (void *) _config, }            \
+        })[0].attr.attr)
+
+static struct attribute *i915_pmu_format_attrs[] = {
+        I915_PMU_FORMAT_ATTR(i915_eventid, "config:0-42"),
+        NULL,
+};
+
+static const struct attribute_group i915_pmu_format_attr_group = {
+        .name = "format",
+        .attrs = i915_pmu_format_attrs,
+};
+
+static ssize_t i915_pmu_event_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+        struct dev_ext_attribute *eattr;
+
+        eattr = container_of(attr, struct dev_ext_attribute, attr);
+        return sprintf(buf, "config=0x%lx\n", (unsigned long) eattr->var);
+}
+
+#define I915_PMU_EVENT_ATTR(_name, _config)            \
+        (&((struct dev_ext_attribute[]) {               \
+                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_event_show, NULL), \
+                  .var = (void *) _config, }            \
+         })[0].attr.attr)
+
+static struct attribute *i915_pmu_events_attrs[] = {
+	I915_PMU_EVENT_ATTR(rcs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_RENDER, 0)),
+	I915_PMU_EVENT_ATTR(rcs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_RENDER, 0)),
+	I915_PMU_EVENT_ATTR(rcs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_RENDER, 0)),
+	I915_PMU_EVENT_ATTR(rcs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_RENDER, 0)),
+
+	I915_PMU_EVENT_ATTR(bcs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_COPY, 0)),
+	I915_PMU_EVENT_ATTR(bcs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_COPY, 0)),
+	I915_PMU_EVENT_ATTR(bcs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_COPY, 0)),
+	I915_PMU_EVENT_ATTR(bcs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_COPY, 0)),
+
+	I915_PMU_EVENT_ATTR(vcs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 0)),
+	I915_PMU_EVENT_ATTR(vcs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 0)),
+	I915_PMU_EVENT_ATTR(vcs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 0)),
+	I915_PMU_EVENT_ATTR(vcs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 0)),
+
+	I915_PMU_EVENT_ATTR(vcs1-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 1)),
+	I915_PMU_EVENT_ATTR(vcs1-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 1)),
+	I915_PMU_EVENT_ATTR(vcs1-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 1)),
+	I915_PMU_EVENT_ATTR(vcs1-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 1)),
+
+	I915_PMU_EVENT_ATTR(vecs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+	I915_PMU_EVENT_ATTR(vecs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+	I915_PMU_EVENT_ATTR(vecs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+	I915_PMU_EVENT_ATTR(vecs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+
+        I915_PMU_EVENT_ATTR(actual-frequency,	 I915_PMU_ACTUAL_FREQUENCY),
+        I915_PMU_EVENT_ATTR(requested-frequency, I915_PMU_REQUESTED_FREQUENCY),
+        I915_PMU_EVENT_ATTR(energy,		 I915_PMU_ENERGY),
+        I915_PMU_EVENT_ATTR(interrupts,		 I915_PMU_INTERRUPTS),
+        I915_PMU_EVENT_ATTR(rc6-residency,	 I915_PMU_RC6_RESIDENCY),
+        I915_PMU_EVENT_ATTR(rc6p-residency,	 I915_PMU_RC6p_RESIDENCY),
+        I915_PMU_EVENT_ATTR(rc6pp-residency,	 I915_PMU_RC6pp_RESIDENCY),
+
+        NULL,
+};
+
+static const struct attribute_group i915_pmu_events_attr_group = {
+        .name = "events",
+        .attrs = i915_pmu_events_attrs,
+};
+
+static ssize_t
+i915_pmu_get_attr_cpumask(struct device *dev,
+			  struct device_attribute *attr,
+			  char *buf)
+{
+	return cpumap_print_to_pagebuf(true, buf, &i915_pmu_cpumask);
+}
+
+static DEVICE_ATTR(cpumask, S_IRUGO, i915_pmu_get_attr_cpumask, NULL);
+
+static struct attribute *i915_cpumask_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL,
+};
+
+static struct attribute_group i915_pmu_cpumask_attr_group = {
+	.attrs = i915_cpumask_attrs,
+};
+
+static const struct attribute_group *i915_pmu_attr_groups[] = {
+        &i915_pmu_format_attr_group,
+        &i915_pmu_events_attr_group,
+	&i915_pmu_cpumask_attr_group,
+        NULL
+};
+
+static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
+{
+	unsigned int target;
+
+	target = cpumask_any_and(&i915_pmu_cpumask, &i915_pmu_cpumask);
+	/* Select the first online CPU as a designated reader. */
+	if (target >= nr_cpu_ids)
+		cpumask_set_cpu(cpu, &i915_pmu_cpumask);
+
+	return 0;
+}
+
+static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
+{
+	struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), node);
+	unsigned int target;
+
+	if (cpumask_test_and_clear_cpu(cpu, &i915_pmu_cpumask)) {
+		target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
+		/* Migrate events if there is a valid target */
+		if (target < nr_cpu_ids) {
+			cpumask_set_cpu(target, &i915_pmu_cpumask);
+			perf_pmu_migrate_context(&pmu->base, cpu, target);
+		}
+	}
+
+	return 0;
+}
+
+void i915_pmu_register(struct drm_i915_private *i915)
+{
+	int ret = -ENOTSUPP;
+
+	if (INTEL_GEN(i915) <= 2)
+		goto err;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
+				      "perf/x86/intel/i915:online",
+				      i915_pmu_cpu_online,
+			              i915_pmu_cpu_offline);
+	if (ret)
+		goto err;
+
+	ret = cpuhp_state_add_instance(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
+				       &i915->pmu.node);
+	if (ret)
+		goto err;
+
+	i915->pmu.base.attr_groups	= i915_pmu_attr_groups;
+	i915->pmu.base.task_ctx_nr	= perf_invalid_context;
+	i915->pmu.base.event_init	= i915_pmu_event_init;
+	i915->pmu.base.add		= i915_pmu_event_add;
+	i915->pmu.base.del		= i915_pmu_event_del;
+	i915->pmu.base.start		= i915_pmu_event_start;
+	i915->pmu.base.stop		= i915_pmu_event_stop;
+	i915->pmu.base.read		= i915_pmu_event_read;
+	i915->pmu.base.event_idx	= i915_pmu_event_event_idx;
+
+	spin_lock_init(&i915->pmu.lock);
+	hrtimer_init(&i915->pmu.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	i915->pmu.timer.function = i915_sample;
+	i915->pmu.enable = 0;
+
+	ret = perf_pmu_register(&i915->pmu.base, "i915", -1);
+	if (ret == 0)
+		return;
+
+	i915->pmu.base.event_init = NULL;
+err:
+	DRM_INFO("Failed to register PMU (err=%d)\n", ret);
+}
+
+void i915_pmu_unregister(struct drm_i915_private *i915)
+{
+	if (!i915->pmu.base.event_init)
+		return;
+
+	i915->pmu.enable = 0;
+
+	perf_pmu_unregister(&i915->pmu.base);
+	i915->pmu.base.event_init = NULL;
+
+	hrtimer_cancel(&i915->pmu.timer);
+
+	cpuhp_state_remove_instance(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
+				    &i915->pmu.node);
+}
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 0b03260a3967..8c362e0451c1 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -186,6 +186,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define VIDEO_ENHANCEMENT_CLASS	2
 #define COPY_ENGINE_CLASS	3
 #define OTHER_CLASS		4
+#define MAX_ENGINE_CLASS	4
+
+#define MAX_ENGINE_INSTANCE    1
 
 /* PCI config space */
 
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 3ae89a9d6241..dbc7abd65f33 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -198,6 +198,15 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
 	GEM_BUG_ON(info->class >= ARRAY_SIZE(intel_engine_classes));
 	class_info = &intel_engine_classes[info->class];
 
+	if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
+		return -EINVAL;
+
+	if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
+		return -EINVAL;
+
+	if (GEM_WARN_ON(dev_priv->engine_class[info->class][info->instance]))
+		return -EINVAL;
+
 	GEM_BUG_ON(dev_priv->engine[id]);
 	engine = kzalloc(sizeof(*engine), GFP_KERNEL);
 	if (!engine)
@@ -225,6 +234,7 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
 
 	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
 
+	dev_priv->engine_class[info->class][info->instance] = engine;
 	dev_priv->engine[id] = engine;
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 268342433a8e..7db4c572ef76 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2283,3 +2283,28 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
 
 	return intel_init_ring_buffer(engine);
 }
+
+static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
+	[I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
+	[I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
+	[I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS,
+	[I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS,
+	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS,
+};
+
+struct intel_engine_cs *
+intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance)
+{
+	if (class >= ARRAY_SIZE(user_class_map))
+		return NULL;
+
+	class = user_class_map[class];
+
+	if (WARN_ON_ONCE(class > MAX_ENGINE_CLASS))
+		return NULL;
+
+	if (instance > MAX_ENGINE_INSTANCE)
+		return NULL;
+
+	return i915->engine_class[class][instance];
+}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 79c0021f3700..cf095b9386f4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -245,6 +245,28 @@ struct intel_engine_cs {
 		I915_SELFTEST_DECLARE(bool mock : 1);
 	} breadcrumbs;
 
+	struct {
+		/**
+		 * @enable: Bitmask of enabled sample events on this engine.
+		 *
+		 * Bits correspond to sample event types, for instance
+		 * I915_SAMPLE_QUEUED is bit 0 etc.
+		 */
+		u32 enable;
+		/**
+		 * @enable_count: Reference count for the enabled samplers.
+		 *
+		 * Index number corresponds to the bit number from @enable.
+		 */
+		unsigned int enable_count[I915_PMU_SAMPLE_BITS];
+		/**
+		 * @sample: Counter value for sampling events.
+		 *
+		 * Our internal timer stores the current counter in this field.
+		 */
+		u64 sample[I915_ENGINE_SAMPLE_MAX];
+	} pmu;
+
 	/*
 	 * A pool of objects to use as shadow copies of client batch buffers
 	 * when the command parser is enabled. Prevents the client from
@@ -737,4 +759,7 @@ void intel_engines_reset_default_submission(struct drm_i915_private *i915);
 
 bool intel_engine_can_store_dword(struct intel_engine_cs *engine);
 
+struct intel_engine_cs *
+intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index d8d10d932759..6dc0d6fd4e4c 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -86,6 +86,64 @@ enum i915_mocs_table_index {
 	I915_MOCS_CACHED,
 };
 
+enum drm_i915_gem_engine_class {
+	I915_ENGINE_CLASS_OTHER = 0,
+	I915_ENGINE_CLASS_RENDER = 1,
+	I915_ENGINE_CLASS_COPY = 2,
+	I915_ENGINE_CLASS_VIDEO = 3,
+	I915_ENGINE_CLASS_VIDEO_ENHANCE = 4,
+	I915_ENGINE_CLASS_MAX /* non-ABI */
+};
+
+/**
+ * DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
+ *
+ */
+
+enum drm_i915_pmu_engine_sample {
+	I915_SAMPLE_QUEUED = 0,
+	I915_SAMPLE_BUSY = 1,
+	I915_SAMPLE_WAIT = 2,
+	I915_SAMPLE_SEMA = 3,
+	I915_ENGINE_SAMPLE_MAX /* non-ABI */
+};
+
+#define I915_PMU_SAMPLE_BITS (4)
+#define I915_PMU_SAMPLE_MASK (0xf)
+#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
+#define I915_PMU_CLASS_SHIFT \
+	(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)
+
+#define __I915_PMU_ENGINE(class, instance, sample) \
+	((class) << I915_PMU_CLASS_SHIFT | \
+	(instance) << I915_PMU_SAMPLE_BITS | \
+	(sample))
+
+#define I915_PMU_ENGINE_QUEUED(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
+
+#define I915_PMU_ENGINE_BUSY(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_BUSY)
+
+#define I915_PMU_ENGINE_WAIT(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_WAIT)
+
+#define I915_PMU_ENGINE_SEMA(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
+
+#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
+
+#define I915_PMU_ACTUAL_FREQUENCY 	__I915_PMU_OTHER(0)
+#define I915_PMU_REQUESTED_FREQUENCY	__I915_PMU_OTHER(1)
+#define I915_PMU_ENERGY			__I915_PMU_OTHER(2)
+#define I915_PMU_INTERRUPTS		__I915_PMU_OTHER(3)
+
+#define I915_PMU_RC6_RESIDENCY		__I915_PMU_OTHER(4)
+#define I915_PMU_RC6p_RESIDENCY		__I915_PMU_OTHER(5)
+#define I915_PMU_RC6pp_RESIDENCY	__I915_PMU_OTHER(6)
+
+#define I915_PMU_LAST I915_PMU_RC6pp_RESIDENCY
+
 /* Each region is a minimum of 16k, and there are at most 255 of them.
  */
 #define I915_NR_TEX_REGIONS 255	/* table size 2k - maximum due to use
-- 
2.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC v6 04/11] drm/i915/pmu: Expose a PMU interface for perf queries
  2017-09-12 14:59     ` Tvrtko Ursulin
@ 2017-09-13  8:57       ` Tvrtko Ursulin
  2017-09-13 10:34         ` [RFC v7 " Tvrtko Ursulin
  0 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-13  8:57 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Peter Zijlstra

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

From: Chris Wilson <chris@chris-wilson.co.uk>
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

The first goal is to be able to measure GPU (and individual ring) busyness
without having to poll registers from userspace. (Which not only incurs
holding the forcewake lock indefinitely, perturbing the system, but also
runs the risk of hanging the machine.) As an alternative we can use the
perf event counter interface to sample the ring registers periodically
and send those results to userspace.

To be able to do so, we need to export the two symbols from
kernel/events/core.c to register and unregister a PMU device.
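
As a rough illustration of the intended usage, a userspace consumer could read
the rcs0 busy counter along the lines of the sketch below. This is not part of
the patch; the sysfs path, the hard-coded config encoding and the choice of
CPU 0 are assumptions mirroring the uAPI and registration code added further
down.

/*
 * Hypothetical sketch, not part of this patch: sample the rcs0 busy
 * counter over roughly one second. Assumes the PMU is registered as
 * "i915" (its dynamic type id then appears in
 * /sys/bus/event_source/devices/i915/type) and that the config value
 * follows the uAPI encoding added below:
 * class << 12 | instance << 4 | sample.
 */
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
			    int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main(void)
{
	struct perf_event_attr attr;
	uint64_t before, after;
	FILE *file;
	int type, fd;

	/* The perf core assigns the PMU type id at registration time. */
	file = fopen("/sys/bus/event_source/devices/i915/type", "r");
	if (!file || fscanf(file, "%d", &type) != 1)
		return 1;
	fclose(file);

	memset(&attr, 0, sizeof(attr));
	attr.type = type;
	attr.size = sizeof(attr);
	/* I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_RENDER, 0) */
	attr.config = (1 << 12) | (0 << 4) | 1;

	/* Uncore-style PMU: pid == -1 plus a CPU from the exposed cpumask. */
	fd = perf_event_open(&attr, -1, 0, -1, 0);
	if (fd < 0)
		return 1;

	if (read(fd, &before, sizeof(before)) != sizeof(before))
		return 1;
	sleep(1);
	if (read(fd, &after, sizeof(after)) != sizeof(after))
		return 1;

	printf("rcs0 busy: ~%" PRIu64 " ns over ~1s\n", after - before);
	close(fd);
	return 0;
}

Opening a system-wide event like this normally needs sufficient privilege
(root, or a suitably relaxed perf_event_paranoid).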

v1-v2 (Chris Wilson):

v2: Use a common timer for the ring sampling.

v3: (Tvrtko Ursulin)
 * Decouple uAPI from i915 engine ids.
 * Complete uAPI defines.
 * Refactor some code to helpers for clarity.
 * Skip sampling disabled engines.
 * Expose counters in sysfs.
 * Pass in fake regs to avoid null ptr deref in perf core.
 * Convert to class/instance uAPI.
 * Use shared driver code for rc6 residency, power and frequency.

v4: (Dmitry Rogozhkin)
 * Register PMU with .task_ctx_nr=perf_invalid_context
 * Expose cpumask for the PMU with the single CPU in the mask
 * Properly support pmu->stop(): it should call pmu->read()
 * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE)
 * Introduce refcounting of event subscriptions.
 * Make pmu.busy_stats a refcounter to avoid busy stats going away
   with some deleted event.
 * Expose cpumask for i915 PMU to avoid creation of multiple events of
   the same type followed by counter aggregation by perf-stat (see the
   sketch after this list).
 * Track CPUs going online/offline to migrate the perf context. Since the
   cpumask will most likely initially contain only CPU0,
   CONFIG_BOOTPARAM_HOTPLUG_CPU0 is needed to see the effect of the CPU
   status tracking.
 * End result is that only global events are supported and perf stat
   works correctly.
 * Deny perf driver level sampling - it is prohibited for uncore PMU.
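
To illustrate the cpumask item above (again a hypothetical tool-side sketch
under the same assumptions, not part of the patch), a consumer would pick the
designated reader CPU from sysfs before opening its events:

#include <stdio.h>

/*
 * Hypothetical helper: return the single CPU published in the i915 PMU
 * cpumask, so only one event per counter is opened and perf-stat style
 * cross-CPU aggregation is avoided.
 */
static int i915_pmu_designated_cpu(void)
{
	int cpu = -1;
	FILE *file;

	file = fopen("/sys/bus/event_source/devices/i915/cpumask", "r");
	if (!file)
		return -1;

	/* The mask holds one CPU (e.g. "0"), so the first entry suffices. */
	if (fscanf(file, "%d", &cpu) != 1)
		cpu = -1;

	fclose(file);
	return cpu;
}

The returned value would then be passed as the cpu argument to
perf_event_open() instead of the hard-coded 0 in the earlier sketch.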

v5: (Tvrtko Ursulin)

 * Don't hardcode number of engine samplers.
 * Rewrite event ref-counting for correctness and simplicity.
 * Store initial counter value when starting already enabled events
   to correctly report values to all listeners.
 * Fix RC6 residency readout.
 * Comments, GPL header.

v6:
 * Add missing entry to v4 changelog.
 * Fix accounting in CPU hotplug case by copying the approach from
   arch/x86/events/intel/cstate.c. (Dmitry Rogozhkin)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 drivers/gpu/drm/i915/Makefile           |   1 +
 drivers/gpu/drm/i915/i915_drv.c         |   2 +
 drivers/gpu/drm/i915/i915_drv.h         |  76 ++++
 drivers/gpu/drm/i915/i915_pmu.c         | 693 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h         |   3 +
 drivers/gpu/drm/i915/intel_engine_cs.c  |  10 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |  25 ++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  25 ++
 include/uapi/drm/i915_drm.h             |  58 +++
 9 files changed, 893 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_pmu.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 1cb8059a3a16..7b3a0eca62b6 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -26,6 +26,7 @@ i915-y := i915_drv.o \
 
 i915-$(CONFIG_COMPAT)   += i915_ioc32.o
 i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o intel_pipe_crc.o
+i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
 
 # GEM code
 i915-y += i915_cmd_parser.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 5c111ea96e80..b1f96eb1be16 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1196,6 +1196,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
 	struct drm_device *dev = &dev_priv->drm;
 
 	i915_gem_shrinker_init(dev_priv);
+	i915_pmu_register(dev_priv);
 
 	/*
 	 * Notify a valid surface after modesetting,
@@ -1250,6 +1251,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
 	intel_opregion_unregister(dev_priv);
 
 	i915_perf_unregister(dev_priv);
+	i915_pmu_unregister(dev_priv);
 
 	i915_teardown_sysfs(dev_priv);
 	i915_guc_log_unregister(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 48daf9552163..62646b8dfb7a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -40,6 +40,7 @@
 #include <linux/hash.h>
 #include <linux/intel-iommu.h>
 #include <linux/kref.h>
+#include <linux/perf_event.h>
 #include <linux/pm_qos.h>
 #include <linux/reservation.h>
 #include <linux/shmem_fs.h>
@@ -2190,6 +2191,69 @@ struct intel_cdclk_state {
 	unsigned int cdclk, vco, ref;
 };
 
+enum {
+	__I915_SAMPLE_FREQ_ACT = 0,
+	__I915_SAMPLE_FREQ_REQ,
+	__I915_NUM_PMU_SAMPLERS
+};
+
+/**
+ * How many different events we track in the global PMU mask.
+ *
+ * It is also used to know the needed number of event reference counters.
+ */
+#define I915_PMU_MASK_BITS \
+	((1 << I915_PMU_SAMPLE_BITS) + (I915_PMU_LAST + 1 - __I915_PMU_OTHER(0)))
+
+struct i915_pmu {
+	/**
+	 * @node: List node for CPU hotplug handling.
+	 */
+	struct hlist_node node;
+	/**
+	 * @base: PMU base.
+	 */
+	struct pmu base;
+	/**
+	 * @lock: Lock protecting enable mask and ref count handling.
+	 */
+	spinlock_t lock;
+	/**
+	 * @timer: Timer for internal i915 PMU sampling.
+	 */
+	struct hrtimer timer;
+	/**
+	 * @enable: Bitmask of all currently enabled events.
+	 *
+	 * Bits are derived from uAPI event numbers in a way that low 16 bits
+	 * correspond to engine event _sample_ _type_ (I915_SAMPLE_QUEUED is
+	 * bit 0), and higher bits correspond to other events (for instance
+	 * I915_PMU_ACTUAL_FREQUENCY is bit 16 etc).
+	 *
+	 * In other words, low 16 bits are not per engine but per engine
+	 * sampler type, while the upper bits are directly mapped to other
+	 * event types.
+	 */
+	u64 enable;
+	/**
+	 * @enable_count: Reference counts for the enabled events.
+	 *
+	 * Array indices are mapped in the same way as bits in the @enable field
+	 * and they are used to control sampling on/off when multiple clients
+	 * are using the PMU API.
+	 */
+	unsigned int enable_count[I915_PMU_MASK_BITS];
+	/**
+	 * @sample: Current counter value for i915 events which need sampling.
+	 *
+	 * These counters are updated from the i915 PMU sampling timer.
+	 *
+	 * Only global counters are held here, while the per-engine ones are in
+	 * struct intel_engine_cs.
+	 */
+	u64 sample[__I915_NUM_PMU_SAMPLERS];
+};
+
 struct drm_i915_private {
 	struct drm_device drm;
 
@@ -2238,6 +2302,7 @@ struct drm_i915_private {
 	struct pci_dev *bridge_dev;
 	struct i915_gem_context *kernel_context;
 	struct intel_engine_cs *engine[I915_NUM_ENGINES];
+	struct intel_engine_cs *engine_class[MAX_ENGINE_CLASS + 1][MAX_ENGINE_INSTANCE + 1];
 	struct i915_vma *semaphore;
 
 	struct drm_dma_handle *status_page_dmah;
@@ -2698,6 +2763,8 @@ struct drm_i915_private {
 		int	irq;
 	} lpe_audio;
 
+	struct i915_pmu pmu;
+
 	/*
 	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
 	 * will be rejected. Instead look for a better place.
@@ -3918,6 +3985,15 @@ extern void i915_perf_fini(struct drm_i915_private *dev_priv);
 extern void i915_perf_register(struct drm_i915_private *dev_priv);
 extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
 
+/* i915_pmu.c */
+#ifdef CONFIG_PERF_EVENTS
+extern void i915_pmu_register(struct drm_i915_private *i915);
+extern void i915_pmu_unregister(struct drm_i915_private *i915);
+#else
+static inline void i915_pmu_register(struct drm_i915_private *i915) {}
+static inline void i915_pmu_unregister(struct drm_i915_private *i915) {}
+#endif
+
 /* i915_suspend.c */
 extern int i915_save_state(struct drm_i915_private *dev_priv);
 extern int i915_restore_state(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
new file mode 100644
index 000000000000..020be5833ee6
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -0,0 +1,693 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include <linux/perf_event.h>
+#include <linux/pm_runtime.h>
+
+#include "i915_drv.h"
+#include "intel_ringbuffer.h"
+
+/* Frequency for the sampling timer for events which need it. */
+#define FREQUENCY 200
+#define PERIOD max_t(u64, 10000, NSEC_PER_SEC / FREQUENCY)
+
+#define ENGINE_SAMPLE_MASK \
+	(BIT(I915_SAMPLE_QUEUED) | \
+	 BIT(I915_SAMPLE_BUSY) | \
+	 BIT(I915_SAMPLE_WAIT) | \
+	 BIT(I915_SAMPLE_SEMA))
+
+#define ENGINE_SAMPLE_BITS (1 << I915_PMU_SAMPLE_BITS)
+
+static cpumask_t i915_pmu_cpumask = CPU_MASK_NONE;
+
+static u8 engine_config_sample(u64 config)
+{
+	return config & I915_PMU_SAMPLE_MASK;
+}
+
+static u8 engine_event_sample(struct perf_event *event)
+{
+	return engine_config_sample(event->attr.config);
+}
+
+static u8 engine_event_class(struct perf_event *event)
+{
+	return (event->attr.config >> I915_PMU_CLASS_SHIFT) & 0xff;
+}
+
+static u8 engine_event_instance(struct perf_event *event)
+{
+	return (event->attr.config >> I915_PMU_SAMPLE_BITS) & 0xff;
+}
+
+static bool is_engine_config(u64 config)
+{
+	return config < __I915_PMU_OTHER(0);
+}
+
+static unsigned int config_enabled_bit(u64 config)
+{
+	if (is_engine_config(config))
+		return engine_config_sample(config);
+	else
+		return ENGINE_SAMPLE_BITS + (config - __I915_PMU_OTHER(0));
+}
+
+static u64 config_enabled_mask(u64 config)
+{
+	return BIT_ULL(config_enabled_bit(config));
+}
+
+static bool is_engine_event(struct perf_event *event)
+{
+	return is_engine_config(event->attr.config);
+}
+
+static unsigned int event_enabled_bit(struct perf_event *event)
+{
+	return config_enabled_bit(event->attr.config);
+}
+
+static bool grab_forcewake(struct drm_i915_private *i915, bool fw)
+{
+	if (!fw)
+		intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
+
+	return true;
+}
+
+static void engines_sample(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	bool fw = false;
+
+	if ((dev_priv->pmu.enable & ENGINE_SAMPLE_MASK) == 0)
+		return;
+
+	if (!dev_priv->gt.awake)
+		return;
+
+	if (!intel_runtime_pm_get_if_in_use(dev_priv))
+		return;
+
+	for_each_engine(engine, dev_priv, id) {
+		u32 enable = engine->pmu.enable;
+
+		if (i915_seqno_passed(intel_engine_get_seqno(engine),
+				      intel_engine_last_submit(engine)))
+			continue;
+
+		if (enable & BIT(I915_SAMPLE_QUEUED))
+			engine->pmu.sample[I915_SAMPLE_QUEUED] += PERIOD;
+
+		if (enable & BIT(I915_SAMPLE_BUSY)) {
+			u32 val;
+
+			fw = grab_forcewake(dev_priv, fw);
+			val = I915_READ_FW(RING_MI_MODE(engine->mmio_base));
+			if (!(val & MODE_IDLE))
+				engine->pmu.sample[I915_SAMPLE_BUSY] += PERIOD;
+		}
+
+		if (enable & (BIT(I915_SAMPLE_WAIT) | BIT(I915_SAMPLE_SEMA))) {
+			u32 val;
+
+			fw = grab_forcewake(dev_priv, fw);
+			val = I915_READ_FW(RING_CTL(engine->mmio_base));
+			if ((enable & BIT(I915_SAMPLE_WAIT)) &&
+			    (val & RING_WAIT))
+				engine->pmu.sample[I915_SAMPLE_WAIT] += PERIOD;
+			if ((enable & BIT(I915_SAMPLE_SEMA)) &&
+			    (val & RING_WAIT_SEMAPHORE))
+				engine->pmu.sample[I915_SAMPLE_SEMA] += PERIOD;
+		}
+	}
+
+	if (fw)
+		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+	intel_runtime_pm_put(dev_priv);
+}
+
+static void frequency_sample(struct drm_i915_private *dev_priv)
+{
+	if (dev_priv->pmu.enable &
+	    config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY)) {
+		u64 val;
+
+		val = dev_priv->rps.cur_freq;
+		if (dev_priv->gt.awake &&
+		    intel_runtime_pm_get_if_in_use(dev_priv)) {
+			val = intel_get_cagf(dev_priv,
+					     I915_READ_NOTRACE(GEN6_RPSTAT1));
+			intel_runtime_pm_put(dev_priv);
+		}
+		val = intel_gpu_freq(dev_priv, val);
+		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_ACT] += val * PERIOD;
+	}
+
+	if (dev_priv->pmu.enable &
+	    config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY)) {
+		u64 val = intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq);
+		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_REQ] += val * PERIOD;
+	}
+}
+
+static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
+{
+	struct drm_i915_private *i915 =
+		container_of(hrtimer, struct drm_i915_private, pmu.timer);
+
+	if (i915->pmu.enable == 0)
+		return HRTIMER_NORESTART;
+
+	engines_sample(i915);
+	frequency_sample(i915);
+
+	hrtimer_forward_now(hrtimer, ns_to_ktime(PERIOD));
+	return HRTIMER_RESTART;
+}
+
+static u64 count_interrupts(struct drm_i915_private *i915)
+{
+	/* open-coded kstat_irqs() */
+	struct irq_desc *desc = irq_to_desc(i915->drm.pdev->irq);
+	u64 sum = 0;
+	int cpu;
+
+	if (!desc || !desc->kstat_irqs)
+		return 0;
+
+	for_each_possible_cpu(cpu)
+		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
+
+	return sum;
+}
+
+static void i915_pmu_event_destroy(struct perf_event *event)
+{
+	WARN_ON(event->parent);
+}
+
+static int engine_event_init(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+
+	if (!intel_engine_lookup_user(i915, engine_event_class(event),
+				      engine_event_instance(event)))
+		return -ENODEV;
+
+	switch (engine_event_sample(event)) {
+	case I915_SAMPLE_QUEUED:
+	case I915_SAMPLE_BUSY:
+	case I915_SAMPLE_WAIT:
+		break;
+	case I915_SAMPLE_SEMA:
+		if (INTEL_GEN(i915) < 6)
+			return -ENODEV;
+		break;
+	default:
+		return -ENOENT;
+	}
+
+	return 0;
+}
+
+static int i915_pmu_event_init(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	int cpu, ret;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/* unsupported modes and filters */
+	if (event->attr.sample_period) /* no sampling */
+		return -EINVAL;
+
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
+	if (event->cpu < 0)
+		return -EINVAL;
+
+	cpu = cpumask_any_and(&i915_pmu_cpumask,
+			      topology_sibling_cpumask(event->cpu));
+	if (cpu >= nr_cpu_ids)
+		return -ENODEV;
+
+	ret = 0;
+	if (is_engine_event(event)) {
+		ret = engine_event_init(event);
+	} else switch (event->attr.config) {
+	case I915_PMU_ACTUAL_FREQUENCY:
+		if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
+			ret = -ENODEV; /* requires a mutex for sampling! */
+		/* Fall through - the gen check below applies here as well. */
+	case I915_PMU_REQUESTED_FREQUENCY:
+	case I915_PMU_ENERGY:
+	case I915_PMU_RC6_RESIDENCY:
+	case I915_PMU_RC6p_RESIDENCY:
+	case I915_PMU_RC6pp_RESIDENCY:
+		if (INTEL_GEN(i915) < 6)
+			ret = -ENODEV;
+		break;
+	}
+	if (ret)
+		return ret;
+
+	event->cpu = cpu;
+	if (!event->parent)
+		event->destroy = i915_pmu_event_destroy;
+
+	return 0;
+}
+
+static u64 __i915_pmu_event_read(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	u64 val = 0;
+
+	if (is_engine_event(event)) {
+		u8 sample = engine_event_sample(event);
+		struct intel_engine_cs *engine;
+
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+
+		if (WARN_ON_ONCE(!engine)) {
+			/* Do nothing */
+		} else {
+			val = engine->pmu.sample[sample];
+		}
+	} else switch (event->attr.config) {
+	case I915_PMU_ACTUAL_FREQUENCY:
+		val = i915->pmu.sample[__I915_SAMPLE_FREQ_ACT];
+		break;
+	case I915_PMU_REQUESTED_FREQUENCY:
+		val = i915->pmu.sample[__I915_SAMPLE_FREQ_REQ];
+		break;
+	case I915_PMU_ENERGY:
+		val = intel_energy_uJ(i915);
+		break;
+	case I915_PMU_INTERRUPTS:
+		val = count_interrupts(i915);
+		break;
+	case I915_PMU_RC6_RESIDENCY:
+		val = intel_rc6_residency_ns(i915,
+					     IS_VALLEYVIEW(i915) ?
+					     VLV_GT_RENDER_RC6 :
+					     GEN6_GT_GFX_RC6);
+		break;
+	case I915_PMU_RC6p_RESIDENCY:
+		if (!IS_VALLEYVIEW(i915))
+			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
+		break;
+	case I915_PMU_RC6pp_RESIDENCY:
+		if (!IS_VALLEYVIEW(i915))
+			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
+		break;
+	}
+
+	return val;
+}
+
+static void i915_pmu_event_read(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev, new;
+
+again:
+	prev = local64_read(&hwc->prev_count);
+	new = __i915_pmu_event_read(event);
+
+	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
+		goto again;
+
+	local64_add(new - prev, &event->count);
+}
+
+static void i915_pmu_enable(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	unsigned int bit = event_enabled_bit(event);
+	unsigned long flags;
+
+	spin_lock_irqsave(&i915->pmu.lock, flags);
+
+	/*
+	 * Start the sampling timer when enabling the first event.
+	 */
+	if (i915->pmu.enable == 0)
+		hrtimer_start_range_ns(&i915->pmu.timer,
+				       ns_to_ktime(PERIOD), 0,
+				       HRTIMER_MODE_REL_PINNED);
+
+	/*
+	 * Update the bitmask of enabled events and increment
+	 * the event reference counter.
+	 */
+	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
+	GEM_BUG_ON(i915->pmu.enable_count[bit] == ~0);
+	i915->pmu.enable |= BIT_ULL(bit);
+	i915->pmu.enable_count[bit]++;
+
+	/*
+	 * For per-engine events the bitmask and reference counting
+	 * is stored per engine.
+	 */
+	if (is_engine_event(event)) {
+		u8 sample = engine_event_sample(event);
+		struct intel_engine_cs *engine;
+
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+		GEM_BUG_ON(!engine);
+		engine->pmu.enable |= BIT(sample);
+
+		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
+		GEM_BUG_ON(engine->pmu.enable_count[sample] == ~0);
+		engine->pmu.enable_count[sample]++;
+	}
+
+	/*
+	 * Store the current counter value so we can report the correct delta
+	 * for all listeners. Even when the event was already enabled and has
+	 * an existing non-zero value.
+	 */
+	local64_set(&event->hw.prev_count, __i915_pmu_event_read(event));
+
+	spin_unlock_irqrestore(&i915->pmu.lock, flags);
+}
+
+static void i915_pmu_disable(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	unsigned int bit = event_enabled_bit(event);
+	unsigned long flags;
+
+	spin_lock_irqsave(&i915->pmu.lock, flags);
+
+	if (is_engine_event(event)) {
+		u8 sample = engine_event_sample(event);
+		struct intel_engine_cs *engine;
+
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+		GEM_BUG_ON(!engine);
+		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
+		GEM_BUG_ON(engine->pmu.enable_count[sample] == 0);
+		/*
+		 * Decrement the reference count and clear the enabled
+		 * bitmask when the last listener on an event goes away.
+		 */
+		if (--engine->pmu.enable_count[sample] == 0)
+			engine->pmu.enable &= ~BIT(sample);
+	}
+
+	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
+	GEM_BUG_ON(i915->pmu.enable_count[bit] == 0);
+	/*
+	 * Decrement the reference count and clear the enabled
+	 * bitmask when the last listener on an event goes away.
+	 */
+	if (--i915->pmu.enable_count[bit] == 0)
+		i915->pmu.enable &= ~BIT_ULL(bit);
+
+	spin_unlock_irqrestore(&i915->pmu.lock, flags);
+}
+
+static void i915_pmu_event_start(struct perf_event *event, int flags)
+{
+	i915_pmu_enable(event);
+	event->hw.state = 0;
+}
+
+static void i915_pmu_event_stop(struct perf_event *event, int flags)
+{
+	if (flags & PERF_EF_UPDATE)
+		i915_pmu_event_read(event);
+	i915_pmu_disable(event);
+	event->hw.state = PERF_HES_STOPPED;
+}
+
+static int i915_pmu_event_add(struct perf_event *event, int flags)
+{
+	if (flags & PERF_EF_START)
+		i915_pmu_event_start(event, flags);
+
+	return 0;
+}
+
+static void i915_pmu_event_del(struct perf_event *event, int flags)
+{
+	i915_pmu_event_stop(event, PERF_EF_UPDATE);
+}
+
+static int i915_pmu_event_event_idx(struct perf_event *event)
+{
+	return 0;
+}
+
+static ssize_t i915_pmu_format_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+        struct dev_ext_attribute *eattr;
+
+        eattr = container_of(attr, struct dev_ext_attribute, attr);
+        return sprintf(buf, "%s\n", (char *) eattr->var);
+}
+
+#define I915_PMU_FORMAT_ATTR(_name, _config)           \
+        (&((struct dev_ext_attribute[]) {               \
+                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_format_show, NULL), \
+                  .var = (void *) _config, }            \
+        })[0].attr.attr)
+
+static struct attribute *i915_pmu_format_attrs[] = {
+        I915_PMU_FORMAT_ATTR(i915_eventid, "config:0-42"),
+        NULL,
+};
+
+static const struct attribute_group i915_pmu_format_attr_group = {
+        .name = "format",
+        .attrs = i915_pmu_format_attrs,
+};
+
+static ssize_t i915_pmu_event_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+        struct dev_ext_attribute *eattr;
+
+        eattr = container_of(attr, struct dev_ext_attribute, attr);
+        return sprintf(buf, "config=0x%lx\n", (unsigned long) eattr->var);
+}
+
+#define I915_PMU_EVENT_ATTR(_name, _config)            \
+        (&((struct dev_ext_attribute[]) {               \
+                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_event_show, NULL), \
+                  .var = (void *) _config, }            \
+         })[0].attr.attr)
+
+static struct attribute *i915_pmu_events_attrs[] = {
+	I915_PMU_EVENT_ATTR(rcs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_RENDER, 0)),
+	I915_PMU_EVENT_ATTR(rcs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_RENDER, 0)),
+	I915_PMU_EVENT_ATTR(rcs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_RENDER, 0)),
+	I915_PMU_EVENT_ATTR(rcs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_RENDER, 0)),
+
+	I915_PMU_EVENT_ATTR(bcs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_COPY, 0)),
+	I915_PMU_EVENT_ATTR(bcs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_COPY, 0)),
+	I915_PMU_EVENT_ATTR(bcs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_COPY, 0)),
+	I915_PMU_EVENT_ATTR(bcs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_COPY, 0)),
+
+	I915_PMU_EVENT_ATTR(vcs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 0)),
+	I915_PMU_EVENT_ATTR(vcs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 0)),
+	I915_PMU_EVENT_ATTR(vcs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 0)),
+	I915_PMU_EVENT_ATTR(vcs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 0)),
+
+	I915_PMU_EVENT_ATTR(vcs1-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 1)),
+	I915_PMU_EVENT_ATTR(vcs1-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 1)),
+	I915_PMU_EVENT_ATTR(vcs1-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 1)),
+	I915_PMU_EVENT_ATTR(vcs1-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 1)),
+
+	I915_PMU_EVENT_ATTR(vecs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+	I915_PMU_EVENT_ATTR(vecs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+	I915_PMU_EVENT_ATTR(vecs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+	I915_PMU_EVENT_ATTR(vecs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+
+        I915_PMU_EVENT_ATTR(actual-frequency,	 I915_PMU_ACTUAL_FREQUENCY),
+        I915_PMU_EVENT_ATTR(requested-frequency, I915_PMU_REQUESTED_FREQUENCY),
+        I915_PMU_EVENT_ATTR(energy,		 I915_PMU_ENERGY),
+        I915_PMU_EVENT_ATTR(interrupts,		 I915_PMU_INTERRUPTS),
+        I915_PMU_EVENT_ATTR(rc6-residency,	 I915_PMU_RC6_RESIDENCY),
+        I915_PMU_EVENT_ATTR(rc6p-residency,	 I915_PMU_RC6p_RESIDENCY),
+        I915_PMU_EVENT_ATTR(rc6pp-residency,	 I915_PMU_RC6pp_RESIDENCY),
+
+        NULL,
+};
+
+static const struct attribute_group i915_pmu_events_attr_group = {
+        .name = "events",
+        .attrs = i915_pmu_events_attrs,
+};
+
+static ssize_t
+i915_pmu_get_attr_cpumask(struct device *dev,
+			  struct device_attribute *attr,
+			  char *buf)
+{
+	return cpumap_print_to_pagebuf(true, buf, &i915_pmu_cpumask);
+}
+
+static DEVICE_ATTR(cpumask, S_IRUGO, i915_pmu_get_attr_cpumask, NULL);
+
+static struct attribute *i915_cpumask_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL,
+};
+
+static struct attribute_group i915_pmu_cpumask_attr_group = {
+	.attrs = i915_cpumask_attrs,
+};
+
+static const struct attribute_group *i915_pmu_attr_groups[] = {
+        &i915_pmu_format_attr_group,
+        &i915_pmu_events_attr_group,
+	&i915_pmu_cpumask_attr_group,
+        NULL
+};
+
+static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
+{
+	unsigned int target;
+
+	target = cpumask_any_and(&i915_pmu_cpumask, &i915_pmu_cpumask);
+	/* Select the first online CPU as a designated reader. */
+	if (target >= nr_cpu_ids)
+		cpumask_set_cpu(cpu, &i915_pmu_cpumask);
+
+	return 0;
+}
+
+static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
+{
+	struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), node);
+	unsigned int target;
+
+	if (cpumask_test_and_clear_cpu(cpu, &i915_pmu_cpumask)) {
+		target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
+		/* Migrate events if there is a valid target */
+		if (target < nr_cpu_ids) {
+			cpumask_set_cpu(target, &i915_pmu_cpumask);
+			perf_pmu_migrate_context(&pmu->base, cpu, target);
+		}
+	}
+
+	return 0;
+}
+
+void i915_pmu_register(struct drm_i915_private *i915)
+{
+	int ret = -ENOTSUPP;
+
+	if (INTEL_GEN(i915) <= 2)
+		goto err;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
+				      "perf/x86/intel/i915:online",
+				      i915_pmu_cpu_online,
+			              i915_pmu_cpu_offline);
+	if (ret)
+		goto err;
+
+	ret = cpuhp_state_add_instance(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
+				       &i915->pmu.node);
+	if (ret)
+		goto err;
+
+	i915->pmu.base.attr_groups	= i915_pmu_attr_groups;
+	i915->pmu.base.task_ctx_nr	= perf_invalid_context;
+	i915->pmu.base.event_init	= i915_pmu_event_init;
+	i915->pmu.base.add		= i915_pmu_event_add;
+	i915->pmu.base.del		= i915_pmu_event_del;
+	i915->pmu.base.start		= i915_pmu_event_start;
+	i915->pmu.base.stop		= i915_pmu_event_stop;
+	i915->pmu.base.read		= i915_pmu_event_read;
+	i915->pmu.base.event_idx	= i915_pmu_event_event_idx;
+
+	spin_lock_init(&i915->pmu.lock);
+	hrtimer_init(&i915->pmu.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	i915->pmu.timer.function = i915_sample;
+	i915->pmu.enable = 0;
+
+	ret = perf_pmu_register(&i915->pmu.base, "i915", -1);
+	if (ret == 0)
+		return;
+
+	i915->pmu.base.event_init = NULL;
+err:
+	DRM_INFO("Failed to register PMU (err=%d)\n", ret);
+}
+
+void i915_pmu_unregister(struct drm_i915_private *i915)
+{
+	if (!i915->pmu.base.event_init)
+		return;
+
+	i915->pmu.enable = 0;
+
+	perf_pmu_unregister(&i915->pmu.base);
+	i915->pmu.base.event_init = NULL;
+
+	hrtimer_cancel(&i915->pmu.timer);
+
+	cpuhp_state_remove_instance(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
+				    &i915->pmu.node);
+}
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 0b03260a3967..8c362e0451c1 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -186,6 +186,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define VIDEO_ENHANCEMENT_CLASS	2
 #define COPY_ENGINE_CLASS	3
 #define OTHER_CLASS		4
+#define MAX_ENGINE_CLASS	4
+
+#define MAX_ENGINE_INSTANCE    1
 
 /* PCI config space */
 
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 3ae89a9d6241..dbc7abd65f33 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -198,6 +198,15 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
 	GEM_BUG_ON(info->class >= ARRAY_SIZE(intel_engine_classes));
 	class_info = &intel_engine_classes[info->class];
 
+	if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
+		return -EINVAL;
+
+	if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
+		return -EINVAL;
+
+	if (GEM_WARN_ON(dev_priv->engine_class[info->class][info->instance]))
+		return -EINVAL;
+
 	GEM_BUG_ON(dev_priv->engine[id]);
 	engine = kzalloc(sizeof(*engine), GFP_KERNEL);
 	if (!engine)
@@ -225,6 +234,7 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
 
 	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
 
+	dev_priv->engine_class[info->class][info->instance] = engine;
 	dev_priv->engine[id] = engine;
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 268342433a8e..7db4c572ef76 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2283,3 +2283,28 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
 
 	return intel_init_ring_buffer(engine);
 }
+
+static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
+	[I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
+	[I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
+	[I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS,
+	[I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS,
+	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS,
+};
+
+struct intel_engine_cs *
+intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance)
+{
+	if (class >= ARRAY_SIZE(user_class_map))
+		return NULL;
+
+	class = user_class_map[class];
+
+	if (WARN_ON_ONCE(class > MAX_ENGINE_CLASS))
+		return NULL;
+
+	if (instance > MAX_ENGINE_INSTANCE)
+		return NULL;
+
+	return i915->engine_class[class][instance];
+}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 79c0021f3700..cf095b9386f4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -245,6 +245,28 @@ struct intel_engine_cs {
 		I915_SELFTEST_DECLARE(bool mock : 1);
 	} breadcrumbs;
 
+	struct {
+		/**
+		 * @enable: Bitmask of enabled sample events on this engine.
+		 *
+		 * Bits correspond to sample event types, for instance
+		 * I915_SAMPLE_QUEUED is bit 0 etc.
+		 */
+		u32 enable;
+		/**
+		 * @enable_count: Reference count for the enabled samplers.
+		 *
+		 * Index number corresponds to the bit number from @enable.
+		 */
+		unsigned int enable_count[I915_PMU_SAMPLE_BITS];
+		/**
+		 * @sample: Counter value for sampling events.
+		 *
+		 * Our internal timer stores the current counter in this field.
+		 */
+		u64 sample[I915_ENGINE_SAMPLE_MAX];
+	} pmu;
+
 	/*
 	 * A pool of objects to use as shadow copies of client batch buffers
 	 * when the command parser is enabled. Prevents the client from
@@ -737,4 +759,7 @@ void intel_engines_reset_default_submission(struct drm_i915_private *i915);
 
 bool intel_engine_can_store_dword(struct intel_engine_cs *engine);
 
+struct intel_engine_cs *
+intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index d8d10d932759..6dc0d6fd4e4c 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -86,6 +86,64 @@ enum i915_mocs_table_index {
 	I915_MOCS_CACHED,
 };
 
+enum drm_i915_gem_engine_class {
+	I915_ENGINE_CLASS_OTHER = 0,
+	I915_ENGINE_CLASS_RENDER = 1,
+	I915_ENGINE_CLASS_COPY = 2,
+	I915_ENGINE_CLASS_VIDEO = 3,
+	I915_ENGINE_CLASS_VIDEO_ENHANCE = 4,
+	I915_ENGINE_CLASS_MAX /* non-ABI */
+};
+
+/**
+ * DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
+ *
+ */
+
+enum drm_i915_pmu_engine_sample {
+	I915_SAMPLE_QUEUED = 0,
+	I915_SAMPLE_BUSY = 1,
+	I915_SAMPLE_WAIT = 2,
+	I915_SAMPLE_SEMA = 3,
+	I915_ENGINE_SAMPLE_MAX /* non-ABI */
+};
+
+#define I915_PMU_SAMPLE_BITS (4)
+#define I915_PMU_SAMPLE_MASK (0xf)
+#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
+#define I915_PMU_CLASS_SHIFT \
+	(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)
+
+#define __I915_PMU_ENGINE(class, instance, sample) \
+	((class) << I915_PMU_CLASS_SHIFT | \
+	(instance) << I915_PMU_SAMPLE_BITS | \
+	(sample))
+
+#define I915_PMU_ENGINE_QUEUED(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
+
+#define I915_PMU_ENGINE_BUSY(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_BUSY)
+
+#define I915_PMU_ENGINE_WAIT(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_WAIT)
+
+#define I915_PMU_ENGINE_SEMA(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
+
+#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
+
+#define I915_PMU_ACTUAL_FREQUENCY 	__I915_PMU_OTHER(0)
+#define I915_PMU_REQUESTED_FREQUENCY	__I915_PMU_OTHER(1)
+#define I915_PMU_ENERGY			__I915_PMU_OTHER(2)
+#define I915_PMU_INTERRUPTS		__I915_PMU_OTHER(3)
+
+#define I915_PMU_RC6_RESIDENCY		__I915_PMU_OTHER(4)
+#define I915_PMU_RC6p_RESIDENCY		__I915_PMU_OTHER(5)
+#define I915_PMU_RC6pp_RESIDENCY	__I915_PMU_OTHER(6)
+
+#define I915_PMU_LAST I915_PMU_RC6pp_RESIDENCY
+
 /* Each region is a minimum of 16k, and there are at most 255 of them.
  */
 #define I915_NR_TEX_REGIONS 255	/* table size 2k - maximum due to use
-- 
2.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [RFC v3 00/11] i915 PMU and engine busy stats
  2017-09-12 22:01     ` Rogozhkin, Dmitry V
  2017-09-13  8:54       ` [RFC v6 04/11] drm/i915/pmu: Expose a PMU interface for perf queries Tvrtko Ursulin
@ 2017-09-13  9:01       ` Tvrtko Ursulin
  1 sibling, 0 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-13  9:01 UTC (permalink / raw)
  To: Rogozhkin, Dmitry V; +Cc: peterz, Intel-gfx


On 12/09/2017 23:01, Rogozhkin, Dmitry V wrote:
> On Tue, 2017-09-12 at 15:54 +0100, Tvrtko Ursulin wrote:
>> On 12/09/2017 03:03, Rogozhkin, Dmitry V wrote:
>>> Hi,
>>>
>>> Just tried v3 series. perf-stat works fine. From the IGT tests which I wrote for i915 PMU
>>> (https://patchwork.freedesktop.org/series/29313/) all pass (assuming pmu.enabled will be exposed
>>> in debugfs) except cpu_online subtest. And this is pretty interesting - see details below.
>>>
>>> Ok, be prepared for the long mail:)...
>>>
>>> So, cpu_online subtest loads RCS0 engine 100% and starts to put CPUs offline one by one starting
>>> from CPU0 (don't forget to have CONFIG_BOOTPARAM_HOTPLUG_CPU0=y in .config). Test expectation is
>>> that i915 PMU will report almost 100% (with 2% tolerance) busyness of RCS0. Test requests metric
>>> just twice: before running on RCS0 and right after. This fails as follows:
>>>
>>> Executed on rcs0 for 32004262184us
>>>     i915/rcs0-busy/: 2225723999us
>>> (perf_pmu:6325) CRITICAL: Test assertion failure function test_cpu_online, file perf_pmu.c:719:
>>> (perf_pmu:6325) CRITICAL: Failed assertion: perf_elapsed(&pm.metrics[0]) > (1-USAGE_TOLERANCE) * elapsed_ns(&start, &now)
>>> Stack trace:
>>>     #0 [__igt_fail_assert+0xf1]
>>>     #1 [__real_main773+0xff1]
>>>     #2 [main+0x35]
>>>     #3 [__libc_start_main+0xf5]
>>>     #4 [_start+0x29]
>>>     #5 [<unknown>+0x29]
>>> Subtest cpu_online failed.
>>> **** DEBUG ****
>>> (perf_pmu:6325) DEBUG: Test requirement passed: is_hotplug_cpu0()
>>> (perf_pmu:6325) INFO: perf_init: enabled 1 metrics from 1 requested
>>> (perf_pmu:6325) ioctl-wrappers-DEBUG: Test requirement passed: gem_has_ring(fd, ring)
>>> (perf_pmu:6325) INFO: Executed on rcs0 for 32004262184us
>>> (perf_pmu:6325) INFO:   i915/rcs0-busy/: 2225723999us
>>> (perf_pmu:6325) CRITICAL: Test assertion failure function test_cpu_online, file perf_pmu.c:719:
>>> (perf_pmu:6325) CRITICAL: Failed assertion: perf_elapsed(&pm.metrics[0]) > (1-USAGE_TOLERANCE) * elapsed_ns(&start, &now)
>>>
>>> Now. Looks like that by itself PMU context migration works. For example, if you will comment out
>>> "perf_pmu_migrate_context(&pmu->base, cpu, target)" you will get:
>>>
>>>       Executed on rcs0 for 32004434918us
>>>         i915/rcs0-busy/:     76623707us
>>>
>>> Compare with previous:
>>>       Executed on rcs0 for 32004262184us
>>>         i915/rcs0-busy/:    2225723999us
>>>
>>> This test passed on the previous set of patches, I mean Tvrtko's v2 series + my patches.
>>>
>>> So, it seems we are losing counter values somehow. I saw in the patches that this place really was modified - you
>>> have added subtraction from initial counter value:
>>> static void i915_pmu_event_read(struct perf_event *event)
>>> {
>>>
>>> 	local64_set(&event->count,
>>> 		    __i915_pmu_event_read(event) -
>>> 		    local64_read(&event->hw.prev_count));
>>> }
>>>
>>> But looks like the problem is that with the PMU context migration we get sequence of events start/stop (or maybe
>>> add/del) which eventually call our i915_pmu_enable/disable. Here is the dmesg log with the obvious printk:
>>>
>>> [  153.971096] [IGT] perf_pmu: starting subtest cpu_online
>>> [  153.971151] i915_pmu_enable: event->hw.prev_count=0
>>> [  154.036015] i915_pmu_disable: event->hw.prev_count=0
>>> [  154.048027] i915_pmu_enable: event->hw.prev_count=0
>>> [  154.049343] smpboot: CPU 0 is now offline
>>> [  155.059028] smpboot: Booting Node 0 Processor 0 APIC 0x0
>>> [  155.155078] smpboot: CPU 1 is now offline
>>> [  156.161026] x86: Booting SMP configuration:
>>> [  156.161027] smpboot: Booting Node 0 Processor 1 APIC 0x2
>>> [  156.197065] IRQ 122: no longer affine to CPU2
>>> [  156.198087] smpboot: CPU 2 is now offline
>>> [  157.208028] smpboot: Booting Node 0 Processor 2 APIC 0x4
>>> [  157.263093] smpboot: CPU 3 is now offline
>>> [  158.273026] smpboot: Booting Node 0 Processor 3 APIC 0x6
>>> [  158.310026] i915_pmu_disable: event->hw.prev_count=76648307
>>> [  158.319020] i915_pmu_enable: event->hw.prev_count=76648307
>>> [  158.319098] IRQ 124: no longer affine to CPU4
>>> [  158.320368] smpboot: CPU 4 is now offline
>>> [  159.326030] smpboot: Booting Node 0 Processor 4 APIC 0x1
>>> [  159.365306] smpboot: CPU 5 is now offline
>>> [  160.371030] smpboot: Booting Node 0 Processor 5 APIC 0x3
>>> [  160.421077] IRQ 125: no longer affine to CPU6
>>> [  160.422093] smpboot: CPU 6 is now offline
>>> [  161.429030] smpboot: Booting Node 0 Processor 6 APIC 0x5
>>> [  161.467091] smpboot: CPU 7 is now offline
>>> [  162.473027] smpboot: Booting Node 0 Processor 7 APIC 0x7
>>> [  162.527019] i915_pmu_disable: event->hw.prev_count=4347548222
>>> [  162.546017] i915_pmu_enable: event->hw.prev_count=4347548222
>>> [  162.547317] smpboot: CPU 0 is now offline
>>> [  163.553028] smpboot: Booting Node 0 Processor 0 APIC 0x0
>>> [  163.621089] smpboot: CPU 1 is now offline
>>> [  164.627028] x86: Booting SMP configuration:
>>> [  164.627029] smpboot: Booting Node 0 Processor 1 APIC 0x2
>>> [  164.669308] smpboot: CPU 2 is now offline
>>> [  165.679025] smpboot: Booting Node 0 Processor 2 APIC 0x4
>>> [  165.717089] smpboot: CPU 3 is now offline
>>> [  166.723025] smpboot: Booting Node 0 Processor 3 APIC 0x6
>>> [  166.775016] i915_pmu_disable: event->hw.prev_count=8574197312
>>> [  166.787016] i915_pmu_enable: event->hw.prev_count=8574197312
>>> [  166.788309] smpboot: CPU 4 is now offline
>>> [  167.794025] smpboot: Booting Node 0 Processor 4 APIC 0x1
>>> [  167.837114] smpboot: CPU 5 is now offline
>>> [  168.847025] smpboot: Booting Node 0 Processor 5 APIC 0x3
>>> [  168.889312] smpboot: CPU 6 is now offline
>>> [  169.899030] smpboot: Booting Node 0 Processor 6 APIC 0x5
>>> [  169.944104] smpboot: CPU 7 is now offline
>>> [  170.954032] smpboot: Booting Node 0 Processor 7 APIC 0x7
>>> [  171.000016] i915_pmu_disable: event->hw.prev_count=12815138319
>>> [  171.008017] i915_pmu_enable: event->hw.prev_count=12815138319
>>> [  171.009304] smpboot: CPU 0 is now offline
>>> [  172.017028] smpboot: Booting Node 0 Processor 0 APIC 0x0
>>> [  172.096104] smpboot: CPU 1 is now offline
>>> [  173.106025] x86: Booting SMP configuration:
>>> [  173.106026] smpboot: Booting Node 0 Processor 1 APIC 0x2
>>> [  173.147078] smpboot: CPU 2 is now offline
>>> [  174.153025] smpboot: Booting Node 0 Processor 2 APIC 0x4
>>> [  174.192093] smpboot: CPU 3 is now offline
>>> [  175.198028] smpboot: Booting Node 0 Processor 3 APIC 0x6
>>> [  175.229042] i915_pmu_disable: event->hw.prev_count=17035889079
>>> [  175.242030] i915_pmu_enable: event->hw.prev_count=17035889079
>>> [  175.242163] IRQ fixup: irq 120 move in progress, old vector 131
>>> [  175.242165] IRQ fixup: irq 121 move in progress, old vector 147
>>> [  175.242171] IRQ 124: no longer affine to CPU4
>>> [  175.243435] smpboot: CPU 4 is now offline
>>> [  176.248040] smpboot: Booting Node 0 Processor 4 APIC 0x1
>>> [  176.285328] smpboot: CPU 5 is now offline
>>> [  177.296039] smpboot: Booting Node 0 Processor 5 APIC 0x3
>>> [  177.325067] IRQ 125: no longer affine to CPU6
>>> [  177.326087] smpboot: CPU 6 is now offline
>>> [  178.335036] smpboot: Booting Node 0 Processor 6 APIC 0x5
>>> [  178.377063] IRQ 122: no longer affine to CPU7
>>> [  178.378086] smpboot: CPU 7 is now offline
>>> [  179.388028] smpboot: Booting Node 0 Processor 7 APIC 0x7
>>> [  179.454030] i915_pmu_disable: event->hw.prev_count=21269856967
>>> [  179.470026] i915_pmu_enable: event->hw.prev_count=21269856967
>>> [  179.471110] smpboot: CPU 0 is now offline
>>> [  180.481028] smpboot: Booting Node 0 Processor 0 APIC 0x0
>>> [  180.551075] smpboot: CPU 1 is now offline
>>> [  181.558029] x86: Booting SMP configuration:
>>> [  181.558030] smpboot: Booting Node 0 Processor 1 APIC 0x2
>>> [  181.595096] smpboot: CPU 2 is now offline
>>> [  182.605029] smpboot: Booting Node 0 Processor 2 APIC 0x4
>>> [  182.657084] smpboot: CPU 3 is now offline
>>> [  183.668030] smpboot: Booting Node 0 Processor 3 APIC 0x6
>>> [  183.709017] i915_pmu_disable: event->hw.prev_count=25497358644
>>> [  183.727016] i915_pmu_enable: event->hw.prev_count=25497358644
>>> [  183.728305] smpboot: CPU 4 is now offline
>>> [  184.734027] smpboot: Booting Node 0 Processor 4 APIC 0x1
>>> [  184.767090] smpboot: CPU 5 is now offline
>>> [  185.777036] smpboot: Booting Node 0 Processor 5 APIC 0x3
>>> [  185.823096] smpboot: CPU 6 is now offline
>>> [  186.829051] smpboot: Booting Node 0 Processor 6 APIC 0x5
>>> [  186.856350] smpboot: CPU 7 is now offline
>>> [  187.862051] smpboot: Booting Node 0 Processor 7 APIC 0x7
>>> [  187.871216] [IGT] perf_pmu: exiting, ret=99
>>> [  187.889199] Console: switching to colour frame buffer device 240x67
>>> [  187.889583] i915_pmu_disable: event->hw.prev_count=29754080941
>>>
>>> And the results which I got in userspace for this run were:
>>>       Executed on rcs0 for 32003587981us
>>>         i915/rcs0-busy/: 2247436461us
>>>
>>> After that I decided to roll back the counting change which I mentioned before, i.e.:
>>> static void i915_pmu_event_read(struct perf_event *event)
>>> {
>>>
>>> 	local64_set(&event->count,
>>> 		    __i915_pmu_event_read(event) /*-
>>> 		    local64_read(&event->hw.prev_count)*/);
>>> }
>>>
>>> And I got test PASSED :):
>>>       Executed on rcs0 for 32002282603us
>>>         i915/rcs0-busy/: 31998855052us
>>>       Subtest cpu_online: SUCCESS (33.950s)
>>>
>>> At this point I need to go home :). Maybe you will have time to look into this issue? If not, I will continue
>>> tomorrow.
>>
>> I forgot to run this test since I did not have the kernel feature
>> enabled. But yes, now that I tried it, it is failing.
>>
>> What is happening is that event del (so counter stop as well) is getting
>> called when the CPU goes offline, followed by add->start, and the
>> initial counter value then gets reloaded.
>>
>> I don't see a way for i915 to distinguish between userspace
>> starting/stopping the event, and perf core doing the same in the CPU
>> migration process. Perhaps Peter could help here?
>>
>> I am storing the initial counter value when the counter is started so
>> that I can report its relative value, in other words the change from
>> event start to stop. Perhaps that is not correct and should be left to
>> userspace to handle?
>>
>> Otherwise we have counters like energy use, and even engine busyness,
>> which can already be at some large value before PMU monitoring starts,
>> which makes things like "perf stat -a -I <command>", or even just normal
>> "perf stat <command>", attribute all previous usage (from before the
>> command profiling started) to the reported stats.
>>
> 
> Actually that's pretty easy to fix. The following patch is doing that:
> 
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index bce4951..277098d 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -390,10 +390,18 @@ static u64 __i915_pmu_event_read(struct perf_event *event)
>   
>   static void i915_pmu_event_read(struct perf_event *event)
>   {
> +       struct hw_perf_event *hwc = &event->hw;
> +       u64 prev_raw_count, new_raw_count;
>   
> -       local64_set(&event->count,
> -                   __i915_pmu_event_read(event) -
> -                   local64_read(&event->hw.prev_count));
> +again:
> +       prev_raw_count = local64_read(&hwc->prev_count);
> +       new_raw_count = __i915_pmu_event_read(event);
> +
> +       if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
> +                           new_raw_count) != prev_raw_count)
> +               goto again;
> +
> +       local64_add(new_raw_count - prev_raw_count, &event->count);
>   }
> 
> 
> I believe you need to squash it to the major i915 PMU enabling one.
> 
> So, the idea is:
> 1. event->count contains the current counter value; it is ever growing for
> the particular event _instance_, i.e. even if the event is stopped/started
> or added/deleted, it holds an ever growing value for as long as the event
> exists (is not destroyed)
> 2. Since it is ever growing, in read() we always add a _delta_ to the
> event count, where the start point is when the event got enabled (started)
> 3. On PMU context migration to another CPU we will be issued a call to
> del(PERF_EF_UPDATE). Thus, here is the trick:
> 3.1. The first thing we do is _update_ the event count, adding to it
> everything we gathered on the previous CPU
> 3.2. The second thing: we update event->hw.prev_count to the new value.
> Next time we will count the delta from it
> 
> With this all tests I have for PMU passed. The code above is taken from
> arch/x86/events/intel/cstate.c as is. So, that's really how it should
> be.

Makes sense, simple solution which I overlooked. :I

I've sent a v6 with this fix and the updated v4 changelog. I initially 
managed to send it against the wrong message in the thread, though; 
hopefully patchwork is not confused beyond repair now.

> And... I should say that was the last technical problem I saw for the i915
> PMU implementation. With it the puzzle looks complete to me. :)

So time to poke Chris I guess.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* ✗ Fi.CI.BAT: warning for i915 PMU and engine busy stats (rev4)
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (12 preceding siblings ...)
  2017-09-12  2:03 ` [RFC v3 00/11] i915 PMU and engine busy stats Rogozhkin, Dmitry V
@ 2017-09-13  9:34 ` Patchwork
  2017-09-13 10:46 ` ✗ Fi.CI.BAT: failure for i915 PMU and engine busy stats (rev6) Patchwork
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 56+ messages in thread
From: Patchwork @ 2017-09-13  9:34 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: i915 PMU and engine busy stats (rev4)
URL   : https://patchwork.freedesktop.org/series/27488/
State : warning

== Summary ==

Series 27488v4 i915 PMU and engine busy stats
https://patchwork.freedesktop.org/api/1.0/series/27488/revisions/4/mbox/

Test kms_flip:
        Subgroup basic-flip-vs-modeset:
                skip       -> PASS       (fi-skl-x1585l) fdo#101781
Test pm_rpm:
        Subgroup basic-rte:
                dmesg-warn -> PASS       (fi-cfl-s) fdo#102294
Test drv_module_reload:
        Subgroup basic-reload:
                pass       -> DMESG-WARN (fi-blb-e6850)
                pass       -> DMESG-WARN (fi-elk-e7500)
                pass       -> DMESG-WARN (fi-byt-j1900)
                pass       -> DMESG-WARN (fi-byt-n2820)
                pass       -> DMESG-WARN (fi-bdw-gvtdvm)
                pass       -> DMESG-WARN (fi-bsw-n3050)
                pass       -> DMESG-WARN (fi-skl-gvtdvm)
                pass       -> DMESG-WARN (fi-bxt-j4205)
                pass       -> DMESG-WARN (fi-glk-2a)

fdo#101781 https://bugs.freedesktop.org/show_bug.cgi?id=101781
fdo#102294 https://bugs.freedesktop.org/show_bug.cgi?id=102294

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:441s
fi-bdw-gvtdvm    total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:449s
fi-blb-e6850     total:289  pass:223  dwarn:2   dfail:0   fail:0   skip:64  time:386s
fi-bsw-n3050     total:289  pass:242  dwarn:1   dfail:0   fail:0   skip:46  time:527s
fi-bwr-2160      total:289  pass:184  dwarn:0   dfail:0   fail:0   skip:105 time:268s
fi-bxt-j4205     total:289  pass:259  dwarn:1   dfail:0   fail:0   skip:29  time:505s
fi-byt-j1900     total:289  pass:253  dwarn:2   dfail:0   fail:0   skip:34  time:503s
fi-byt-n2820     total:289  pass:249  dwarn:2   dfail:0   fail:0   skip:38  time:495s
fi-cfl-s         total:289  pass:223  dwarn:34  dfail:0   fail:0   skip:32  time:535s
fi-elk-e7500     total:289  pass:229  dwarn:1   dfail:0   fail:0   skip:59  time:450s
fi-glk-2a        total:289  pass:259  dwarn:1   dfail:0   fail:0   skip:29  time:592s
fi-hsw-4770      total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:425s
fi-hsw-4770r     total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:399s
fi-ilk-650       total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:425s
fi-ivb-3520m     total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:475s
fi-ivb-3770      total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:448s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:481s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:567s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:567s
fi-pnv-d510      total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:557s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:440s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:774s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:475s
fi-skl-gvtdvm    total:289  pass:265  dwarn:1   dfail:0   fail:0   skip:23  time:468s
fi-skl-x1585l    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:489s
fi-snb-2520m     total:289  pass:251  dwarn:0   dfail:0   fail:0   skip:38  time:560s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:1   skip:39  time:413s

3a1a0bacb6e4e1dceb233c927d859e33b62dabaa drm-tip: 2017y-09m-13d-08h-04m-49s UTC integration manifest
3ff9f34b62bb drm/i915: Gate engine stats collection with a static key
7945871d7f04 drm/i915: Export engine stats API to other users
9e0727aac506 drm/i915/pmu: Wire up engine busy stats to PMU
6c349e39b665 drm/i915: Export engine busy stats in debugfs
c28e8e9776fb drm/i915: Engine busy time tracking
677b6449cd5a drm/i915: Wrap context schedule notification
5f3606778122 drm/i915/pmu: Suspend sampling when GPU is idle
7f6fabd84477 drm/i915/pmu: Expose a PMU interface for perf queries
58170eef2ba4 drm/i915: Extract intel_get_cagf
88d053926748 drm/i915: Add intel_energy_uJ
ce424e2c1ef0 drm/i915: Convert intel_rc6_residency_us to ns

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5675/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [RFC v7 04/11] drm/i915/pmu: Expose a PMU interface for perf queries
  2017-09-13  8:57       ` [RFC v6 " Tvrtko Ursulin
@ 2017-09-13 10:34         ` Tvrtko Ursulin
  2017-09-15  0:00           ` Rogozhkin, Dmitry V
  0 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-13 10:34 UTC (permalink / raw)
  To: Intel-gfx; +Cc: Peter Zijlstra

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

From: Chris Wilson <chris@chris-wilson.co.uk>
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

The first goal is to be able to measure GPU (and individual ring) busyness
without having to poll registers from userspace. (Which not only incurs
holding the forcewake lock indefinitely, perturbing the system, but also
runs the risk of hanging the machine.) As an alternative we can use the
perf event counter interface to sample the ring registers periodically
and send those results to userspace.

To be able to do so, we need to export the two symbols from
kernel/events/core.c to register and unregister a PMU device.
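
For illustration only (not part of the patch), a minimal userspace sketch
of how such a counter could then be consumed: it assumes the uAPI additions
to i915_drm.h below, the usual /sys/bus/event_source/devices/i915/type sysfs
node created by perf core at registration, and that CPU0 is in the cpumask
the PMU exposes.

#include <inttypes.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>
#include <drm/i915_drm.h>	/* assumed to carry the additions below */

static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
			    int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main(void)
{
	struct perf_event_attr attr = { .size = sizeof(attr) };
	uint64_t count;
	unsigned int type;
	FILE *f;
	int fd;

	/* The dynamic PMU type is assigned by perf core at registration. */
	f = fopen("/sys/bus/event_source/devices/i915/type", "r");
	if (!f || fscanf(f, "%u", &type) != 1)
		return 1;
	fclose(f);

	attr.type = type;
	attr.config = I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_RENDER, 0);

	/* Uncore-style PMU: system-wide counting, pid == -1, cpu >= 0. */
	fd = perf_event_open(&attr, -1, 0, -1, 0);
	if (fd < 0)
		return 1;

	sleep(1);
	if (read(fd, &count, sizeof(count)) == sizeof(count))
		printf("rcs0 busy: ~%" PRIu64 " ns (sampled)\n", count);

	close(fd);
	return 0;
}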

v1-v2 (Chris Wilson):

v2: Use a common timer for the ring sampling.

v3: (Tvrtko Ursulin)
 * Decouple uAPI from i915 engine ids.
 * Complete uAPI defines.
 * Refactor some code to helpers for clarity.
 * Skip sampling disabled engines.
 * Expose counters in sysfs.
 * Pass in fake regs to avoid null ptr deref in perf core.
 * Convert to class/instance uAPI.
 * Use shared driver code for rc6 residency, power and frequency.

v4: (Dmitry Rogozhkin)
 * Register PMU with .task_ctx_nr=perf_invalid_context
 * Expose cpumask for the PMU with the single CPU in the mask
 * Properly support pmu->stop(): it should call pmu->read()
 * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE)
 * Introduce refcounting of event subscriptions.
 * Make pmu.busy_stats a refcounter to avoid busy stats going away
   with some deleted event.
 * Expose cpumask for i915 PMU to avoid creation of multiple events of
   the same type followed by counter aggregation by perf-stat.
 * Track CPUs going online/offline to migrate the perf context. Since the
   cpumask will most likely initially contain CPU0,
   CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be needed to see the effect of CPU
   status tracking.
 * End result is that only global events are supported and perf stat
   works correctly.
 * Deny perf driver level sampling - it is prohibited for uncore PMU.

v5: (Tvrtko Ursulin)

 * Don't hardcode number of engine samplers.
 * Rewrite event ref-counting for correctness and simplicity.
 * Store initial counter value when starting already enabled events
   to correctly report values to all listeners.
 * Fix RC6 residency readout.
 * Comments, GPL header.

v6:
 * Add missing entry to v4 changelog.
 * Fix accounting in CPU hotplug case by copying the approach from
   arch/x86/events/intel/cstate.c. (Dmitry Rogozhkin)

v7:
 * Log failure message only on failure.
 * Remove CPU hotplug notification state on unregister.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 drivers/gpu/drm/i915/Makefile           |   1 +
 drivers/gpu/drm/i915/i915_drv.c         |   2 +
 drivers/gpu/drm/i915/i915_drv.h         |  76 ++++
 drivers/gpu/drm/i915/i915_pmu.c         | 697 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h         |   3 +
 drivers/gpu/drm/i915/intel_engine_cs.c  |  10 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |  25 ++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  25 ++
 include/uapi/drm/i915_drm.h             |  58 +++
 9 files changed, 897 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_pmu.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 1cb8059a3a16..7b3a0eca62b6 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -26,6 +26,7 @@ i915-y := i915_drv.o \
 
 i915-$(CONFIG_COMPAT)   += i915_ioc32.o
 i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o intel_pipe_crc.o
+i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
 
 # GEM code
 i915-y += i915_cmd_parser.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 5c111ea96e80..b1f96eb1be16 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1196,6 +1196,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
 	struct drm_device *dev = &dev_priv->drm;
 
 	i915_gem_shrinker_init(dev_priv);
+	i915_pmu_register(dev_priv);
 
 	/*
 	 * Notify a valid surface after modesetting,
@@ -1250,6 +1251,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
 	intel_opregion_unregister(dev_priv);
 
 	i915_perf_unregister(dev_priv);
+	i915_pmu_unregister(dev_priv);
 
 	i915_teardown_sysfs(dev_priv);
 	i915_guc_log_unregister(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 48daf9552163..62646b8dfb7a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -40,6 +40,7 @@
 #include <linux/hash.h>
 #include <linux/intel-iommu.h>
 #include <linux/kref.h>
+#include <linux/perf_event.h>
 #include <linux/pm_qos.h>
 #include <linux/reservation.h>
 #include <linux/shmem_fs.h>
@@ -2190,6 +2191,69 @@ struct intel_cdclk_state {
 	unsigned int cdclk, vco, ref;
 };
 
+enum {
+	__I915_SAMPLE_FREQ_ACT = 0,
+	__I915_SAMPLE_FREQ_REQ,
+	__I915_NUM_PMU_SAMPLERS
+};
+
+/**
+ * How many different events we track in the global PMU mask.
+ *
+ * It is also used to know the needed number of event reference counters.
+ */
+#define I915_PMU_MASK_BITS \
+	(1 << I915_PMU_SAMPLE_BITS) + (I915_PMU_LAST + 1 - __I915_PMU_OTHER(0))
+
+struct i915_pmu {
+	/**
+	 * @node: List node for CPU hotplug handling.
+	 */
+	struct hlist_node node;
+	/**
+	 * @base: PMU base.
+	 */
+	struct pmu base;
+	/**
+	 * @lock: Lock protecting enable mask and ref count handling.
+	 */
+	spinlock_t lock;
+	/**
+	 * @timer: Timer for internal i915 PMU sampling.
+	 */
+	struct hrtimer timer;
+	/**
+	 * @enable: Bitmask of all currently enabled events.
+	 *
+	 * Bits are derived from uAPI event numbers in a way that low 16 bits
+	 * correspond to engine event _sample_ _type_ (I915_SAMPLE_QUEUED is
+	 * bit 0), and higher bits correspond to other events (for instance
+	 * I915_PMU_ACTUAL_FREQUENCY is bit 16 etc).
+	 *
+	 * In other words, low 16 bits are not per engine but per engine
+	 * sampler type, while the upper bits are directly mapped to other
+	 * event types.
+	 */
+	u64 enable;
+	/**
+	 * @enable_count: Reference counts for the enabled events.
+	 *
+	 * Array indices are mapped in the same way as bits in the @enable field
+	 * and they are used to control sampling on/off when multiple clients
+	 * are using the PMU API.
+	 */
+	unsigned int enable_count[I915_PMU_MASK_BITS];
+	/**
+	 * @sample: Current counter value for i915 events which need sampling.
+	 *
+	 * These counters are updated from the i915 PMU sampling timer.
+	 *
+	 * Only global counters are held here, while the per-engine ones are in
+	 * struct intel_engine_cs.
+	 */
+	u64 sample[__I915_NUM_PMU_SAMPLERS];
+};
+
 struct drm_i915_private {
 	struct drm_device drm;
 
@@ -2238,6 +2302,7 @@ struct drm_i915_private {
 	struct pci_dev *bridge_dev;
 	struct i915_gem_context *kernel_context;
 	struct intel_engine_cs *engine[I915_NUM_ENGINES];
+	struct intel_engine_cs *engine_class[MAX_ENGINE_CLASS + 1][MAX_ENGINE_INSTANCE + 1];
 	struct i915_vma *semaphore;
 
 	struct drm_dma_handle *status_page_dmah;
@@ -2698,6 +2763,8 @@ struct drm_i915_private {
 		int	irq;
 	} lpe_audio;
 
+	struct i915_pmu pmu;
+
 	/*
 	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
 	 * will be rejected. Instead look for a better place.
@@ -3918,6 +3985,15 @@ extern void i915_perf_fini(struct drm_i915_private *dev_priv);
 extern void i915_perf_register(struct drm_i915_private *dev_priv);
 extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
 
+/* i915_pmu.c */
+#ifdef CONFIG_PERF_EVENTS
+extern void i915_pmu_register(struct drm_i915_private *i915);
+extern void i915_pmu_unregister(struct drm_i915_private *i915);
+#else
+static inline void i915_pmu_register(struct drm_i915_private *i915) {}
+static inline void i915_pmu_unregister(struct drm_i915_private *i915) {}
+#endif
+
 /* i915_suspend.c */
 extern int i915_save_state(struct drm_i915_private *dev_priv);
 extern int i915_restore_state(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
new file mode 100644
index 000000000000..e82648e6635b
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -0,0 +1,697 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include <linux/perf_event.h>
+#include <linux/pm_runtime.h>
+
+#include "i915_drv.h"
+#include "intel_ringbuffer.h"
+
+/* Frequency for the sampling timer for events which need it. */
+#define FREQUENCY 200
+#define PERIOD max_t(u64, 10000, NSEC_PER_SEC / FREQUENCY)
+
+#define ENGINE_SAMPLE_MASK \
+	(BIT(I915_SAMPLE_QUEUED) | \
+	 BIT(I915_SAMPLE_BUSY) | \
+	 BIT(I915_SAMPLE_WAIT) | \
+	 BIT(I915_SAMPLE_SEMA))
+
+#define ENGINE_SAMPLE_BITS (1 << I915_PMU_SAMPLE_BITS)
+
+static cpumask_t i915_pmu_cpumask = CPU_MASK_NONE;
+
+static u8 engine_config_sample(u64 config)
+{
+	return config & I915_PMU_SAMPLE_MASK;
+}
+
+static u8 engine_event_sample(struct perf_event *event)
+{
+	return engine_config_sample(event->attr.config);
+}
+
+static u8 engine_event_class(struct perf_event *event)
+{
+	return (event->attr.config >> I915_PMU_CLASS_SHIFT) & 0xff;
+}
+
+static u8 engine_event_instance(struct perf_event *event)
+{
+	return (event->attr.config >> I915_PMU_SAMPLE_BITS) & 0xff;
+}
+
+static bool is_engine_config(u64 config)
+{
+	return config < __I915_PMU_OTHER(0);
+}
+
+static unsigned int config_enabled_bit(u64 config)
+{
+	if (is_engine_config(config))
+		return engine_config_sample(config);
+	else
+		return ENGINE_SAMPLE_BITS + (config - __I915_PMU_OTHER(0));
+}
+
+static u64 config_enabled_mask(u64 config)
+{
+	return BIT_ULL(config_enabled_bit(config));
+}
+
+static bool is_engine_event(struct perf_event *event)
+{
+	return is_engine_config(event->attr.config);
+}
+
+static unsigned int event_enabled_bit(struct perf_event *event)
+{
+	return config_enabled_bit(event->attr.config);
+}
+
+static bool grab_forcewake(struct drm_i915_private *i915, bool fw)
+{
+	if (!fw)
+		intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
+
+	return true;
+}
+
+static void engines_sample(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	bool fw = false;
+
+	if ((dev_priv->pmu.enable & ENGINE_SAMPLE_MASK) == 0)
+		return;
+
+	if (!dev_priv->gt.awake)
+		return;
+
+	if (!intel_runtime_pm_get_if_in_use(dev_priv))
+		return;
+
+	for_each_engine(engine, dev_priv, id) {
+		u32 enable = engine->pmu.enable;
+
+		if (i915_seqno_passed(intel_engine_get_seqno(engine),
+				      intel_engine_last_submit(engine)))
+			continue;
+
+		if (enable & BIT(I915_SAMPLE_QUEUED))
+			engine->pmu.sample[I915_SAMPLE_QUEUED] += PERIOD;
+
+		if (enable & BIT(I915_SAMPLE_BUSY)) {
+			u32 val;
+
+			fw = grab_forcewake(dev_priv, fw);
+			val = I915_READ_FW(RING_MI_MODE(engine->mmio_base));
+			if (!(val & MODE_IDLE))
+				engine->pmu.sample[I915_SAMPLE_BUSY] += PERIOD;
+		}
+
+		if (enable & (BIT(I915_SAMPLE_WAIT) | BIT(I915_SAMPLE_SEMA))) {
+			u32 val;
+
+			fw = grab_forcewake(dev_priv, fw);
+			val = I915_READ_FW(RING_CTL(engine->mmio_base));
+			if ((enable & BIT(I915_SAMPLE_WAIT)) &&
+			    (val & RING_WAIT))
+				engine->pmu.sample[I915_SAMPLE_WAIT] += PERIOD;
+			if ((enable & BIT(I915_SAMPLE_SEMA)) &&
+			    (val & RING_WAIT_SEMAPHORE))
+				engine->pmu.sample[I915_SAMPLE_SEMA] += PERIOD;
+		}
+	}
+
+	if (fw)
+		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+	intel_runtime_pm_put(dev_priv);
+}
+
+static void frequency_sample(struct drm_i915_private *dev_priv)
+{
+	if (dev_priv->pmu.enable &
+	    config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY)) {
+		u64 val;
+
+		val = dev_priv->rps.cur_freq;
+		if (dev_priv->gt.awake &&
+		    intel_runtime_pm_get_if_in_use(dev_priv)) {
+			val = intel_get_cagf(dev_priv,
+					     I915_READ_NOTRACE(GEN6_RPSTAT1));
+			intel_runtime_pm_put(dev_priv);
+		}
+		val = intel_gpu_freq(dev_priv, val);
+		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_ACT] += val * PERIOD;
+	}
+
+	if (dev_priv->pmu.enable &
+	    config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY)) {
+		u64 val = intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq);
+		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_REQ] += val * PERIOD;
+	}
+}
+
+static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
+{
+	struct drm_i915_private *i915 =
+		container_of(hrtimer, struct drm_i915_private, pmu.timer);
+
+	if (i915->pmu.enable == 0)
+		return HRTIMER_NORESTART;
+
+	engines_sample(i915);
+	frequency_sample(i915);
+
+	hrtimer_forward_now(hrtimer, ns_to_ktime(PERIOD));
+	return HRTIMER_RESTART;
+}
+
+static u64 count_interrupts(struct drm_i915_private *i915)
+{
+	/* open-coded kstat_irqs() */
+	struct irq_desc *desc = irq_to_desc(i915->drm.pdev->irq);
+	u64 sum = 0;
+	int cpu;
+
+	if (!desc || !desc->kstat_irqs)
+		return 0;
+
+	for_each_possible_cpu(cpu)
+		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
+
+	return sum;
+}
+
+static void i915_pmu_event_destroy(struct perf_event *event)
+{
+	WARN_ON(event->parent);
+}
+
+static int engine_event_init(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+
+	if (!intel_engine_lookup_user(i915, engine_event_class(event),
+				      engine_event_instance(event)))
+		return -ENODEV;
+
+	switch (engine_event_sample(event)) {
+	case I915_SAMPLE_QUEUED:
+	case I915_SAMPLE_BUSY:
+	case I915_SAMPLE_WAIT:
+		break;
+	case I915_SAMPLE_SEMA:
+		if (INTEL_GEN(i915) < 6)
+			return -ENODEV;
+		break;
+	default:
+		return -ENOENT;
+	}
+
+	return 0;
+}
+
+static int i915_pmu_event_init(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	int cpu, ret;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/* unsupported modes and filters */
+	if (event->attr.sample_period) /* no sampling */
+		return -EINVAL;
+
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
+	if (event->cpu < 0)
+		return -EINVAL;
+
+	cpu = cpumask_any_and(&i915_pmu_cpumask,
+			      topology_sibling_cpumask(event->cpu));
+	if (cpu >= nr_cpu_ids)
+		return -ENODEV;
+
+	ret = 0;
+	if (is_engine_event(event)) {
+		ret = engine_event_init(event);
+	} else switch (event->attr.config) {
+	case I915_PMU_ACTUAL_FREQUENCY:
+		if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
+			ret = -ENODEV; /* requires a mutex for sampling! */
+	case I915_PMU_REQUESTED_FREQUENCY:
+	case I915_PMU_ENERGY:
+	case I915_PMU_RC6_RESIDENCY:
+	case I915_PMU_RC6p_RESIDENCY:
+	case I915_PMU_RC6pp_RESIDENCY:
+		if (INTEL_GEN(i915) < 6)
+			ret = -ENODEV;
+		break;
+	}
+	if (ret)
+		return ret;
+
+	event->cpu = cpu;
+	if (!event->parent)
+		event->destroy = i915_pmu_event_destroy;
+
+	return 0;
+}
+
+static u64 __i915_pmu_event_read(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	u64 val = 0;
+
+	if (is_engine_event(event)) {
+		u8 sample = engine_event_sample(event);
+		struct intel_engine_cs *engine;
+
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+
+		if (WARN_ON_ONCE(!engine)) {
+			/* Do nothing */
+		} else {
+			val = engine->pmu.sample[sample];
+		}
+	} else switch (event->attr.config) {
+	case I915_PMU_ACTUAL_FREQUENCY:
+		val = i915->pmu.sample[__I915_SAMPLE_FREQ_ACT];
+		break;
+	case I915_PMU_REQUESTED_FREQUENCY:
+		val = i915->pmu.sample[__I915_SAMPLE_FREQ_REQ];
+		break;
+	case I915_PMU_ENERGY:
+		val = intel_energy_uJ(i915);
+		break;
+	case I915_PMU_INTERRUPTS:
+		val = count_interrupts(i915);
+		break;
+	case I915_PMU_RC6_RESIDENCY:
+		val = intel_rc6_residency_ns(i915,
+					     IS_VALLEYVIEW(i915) ?
+					     VLV_GT_RENDER_RC6 :
+					     GEN6_GT_GFX_RC6);
+		break;
+	case I915_PMU_RC6p_RESIDENCY:
+		if (!IS_VALLEYVIEW(i915))
+			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
+		break;
+	case I915_PMU_RC6pp_RESIDENCY:
+		if (!IS_VALLEYVIEW(i915))
+			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
+		break;
+	}
+
+	return val;
+}
+
+static void i915_pmu_event_read(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev, new;
+
+again:
+	prev = local64_read(&hwc->prev_count);
+	new = __i915_pmu_event_read(event);
+
+	if (local64_cmpxchg(&hwc->prev_count, prev, new) != prev)
+		goto again;
+
+	local64_add(new - prev, &event->count);
+}
+
+static void i915_pmu_enable(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	unsigned int bit = event_enabled_bit(event);
+	unsigned long flags;
+
+	spin_lock_irqsave(&i915->pmu.lock, flags);
+
+	/*
+	 * Start the sampling timer when enabling the first event.
+	 */
+	if (i915->pmu.enable == 0)
+		hrtimer_start_range_ns(&i915->pmu.timer,
+				       ns_to_ktime(PERIOD), 0,
+				       HRTIMER_MODE_REL_PINNED);
+
+	/*
+	 * Update the bitmask of enabled events and increment
+	 * the event reference counter.
+	 */
+	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
+	GEM_BUG_ON(i915->pmu.enable_count[bit] == ~0);
+	i915->pmu.enable |= BIT_ULL(bit);
+	i915->pmu.enable_count[bit]++;
+
+	/*
+	 * For per-engine events the bitmask and reference counting
+	 * is stored per engine.
+	 */
+	if (is_engine_event(event)) {
+		u8 sample = engine_event_sample(event);
+		struct intel_engine_cs *engine;
+
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+		GEM_BUG_ON(!engine);
+		engine->pmu.enable |= BIT(sample);
+
+		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
+		GEM_BUG_ON(engine->pmu.enable_count[sample] == ~0);
+		engine->pmu.enable_count[sample]++;
+	}
+
+	/*
+	 * Store the current counter value so we can report the correct delta
+	 * for all listeners. Even when the event was already enabled and has
+	 * an existing non-zero value.
+	 */
+	local64_set(&event->hw.prev_count, __i915_pmu_event_read(event));
+
+	spin_unlock_irqrestore(&i915->pmu.lock, flags);
+}
+
+static void i915_pmu_disable(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	unsigned int bit = event_enabled_bit(event);
+	unsigned long flags;
+
+	spin_lock_irqsave(&i915->pmu.lock, flags);
+
+	if (is_engine_event(event)) {
+		u8 sample = engine_event_sample(event);
+		struct intel_engine_cs *engine;
+
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+		GEM_BUG_ON(!engine);
+		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
+		GEM_BUG_ON(engine->pmu.enable_count[sample] == 0);
+		/*
+		 * Decrement the reference count and clear the enabled
+		 * bitmask when the last listener on an event goes away.
+		 */
+		if (--engine->pmu.enable_count[sample] == 0)
+			engine->pmu.enable &= ~BIT(sample);
+	}
+
+	GEM_BUG_ON(bit >= I915_PMU_MASK_BITS);
+	GEM_BUG_ON(i915->pmu.enable_count[bit] == 0);
+	/*
+	 * Decrement the reference count and clear the enabled
+	 * bitmask when the last listener on an event goes away.
+	 */
+	if (--i915->pmu.enable_count[bit] == 0)
+		i915->pmu.enable &= ~BIT_ULL(bit);
+
+	spin_unlock_irqrestore(&i915->pmu.lock, flags);
+}
+
+static void i915_pmu_event_start(struct perf_event *event, int flags)
+{
+	i915_pmu_enable(event);
+	event->hw.state = 0;
+}
+
+static void i915_pmu_event_stop(struct perf_event *event, int flags)
+{
+	if (flags & PERF_EF_UPDATE)
+		i915_pmu_event_read(event);
+	i915_pmu_disable(event);
+	event->hw.state = PERF_HES_STOPPED;
+}
+
+static int i915_pmu_event_add(struct perf_event *event, int flags)
+{
+	if (flags & PERF_EF_START)
+		i915_pmu_event_start(event, flags);
+
+	return 0;
+}
+
+static void i915_pmu_event_del(struct perf_event *event, int flags)
+{
+	i915_pmu_event_stop(event, PERF_EF_UPDATE);
+}
+
+static int i915_pmu_event_event_idx(struct perf_event *event)
+{
+	return 0;
+}
+
+static ssize_t i915_pmu_format_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+        struct dev_ext_attribute *eattr;
+
+        eattr = container_of(attr, struct dev_ext_attribute, attr);
+        return sprintf(buf, "%s\n", (char *) eattr->var);
+}
+
+#define I915_PMU_FORMAT_ATTR(_name, _config)           \
+        (&((struct dev_ext_attribute[]) {               \
+                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_format_show, NULL), \
+                  .var = (void *) _config, }            \
+        })[0].attr.attr)
+
+static struct attribute *i915_pmu_format_attrs[] = {
+        I915_PMU_FORMAT_ATTR(i915_eventid, "config:0-42"),
+        NULL,
+};
+
+static const struct attribute_group i915_pmu_format_attr_group = {
+        .name = "format",
+        .attrs = i915_pmu_format_attrs,
+};
+
+static ssize_t i915_pmu_event_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+        struct dev_ext_attribute *eattr;
+
+        eattr = container_of(attr, struct dev_ext_attribute, attr);
+        return sprintf(buf, "config=0x%lx\n", (unsigned long) eattr->var);
+}
+
+#define I915_PMU_EVENT_ATTR(_name, _config)            \
+        (&((struct dev_ext_attribute[]) {               \
+                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_event_show, NULL), \
+                  .var = (void *) _config, }            \
+         })[0].attr.attr)
+
+static struct attribute *i915_pmu_events_attrs[] = {
+	I915_PMU_EVENT_ATTR(rcs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_RENDER, 0)),
+	I915_PMU_EVENT_ATTR(rcs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_RENDER, 0)),
+	I915_PMU_EVENT_ATTR(rcs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_RENDER, 0)),
+	I915_PMU_EVENT_ATTR(rcs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_RENDER, 0)),
+
+	I915_PMU_EVENT_ATTR(bcs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_COPY, 0)),
+	I915_PMU_EVENT_ATTR(bcs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_COPY, 0)),
+	I915_PMU_EVENT_ATTR(bcs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_COPY, 0)),
+	I915_PMU_EVENT_ATTR(bcs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_COPY, 0)),
+
+	I915_PMU_EVENT_ATTR(vcs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 0)),
+	I915_PMU_EVENT_ATTR(vcs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 0)),
+	I915_PMU_EVENT_ATTR(vcs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 0)),
+	I915_PMU_EVENT_ATTR(vcs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 0)),
+
+	I915_PMU_EVENT_ATTR(vcs1-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 1)),
+	I915_PMU_EVENT_ATTR(vcs1-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 1)),
+	I915_PMU_EVENT_ATTR(vcs1-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 1)),
+	I915_PMU_EVENT_ATTR(vcs1-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 1)),
+
+	I915_PMU_EVENT_ATTR(vecs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+	I915_PMU_EVENT_ATTR(vecs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+	I915_PMU_EVENT_ATTR(vecs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+	I915_PMU_EVENT_ATTR(vecs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+
+        I915_PMU_EVENT_ATTR(actual-frequency,	 I915_PMU_ACTUAL_FREQUENCY),
+        I915_PMU_EVENT_ATTR(requested-frequency, I915_PMU_REQUESTED_FREQUENCY),
+        I915_PMU_EVENT_ATTR(energy,		 I915_PMU_ENERGY),
+        I915_PMU_EVENT_ATTR(interrupts,		 I915_PMU_INTERRUPTS),
+        I915_PMU_EVENT_ATTR(rc6-residency,	 I915_PMU_RC6_RESIDENCY),
+        I915_PMU_EVENT_ATTR(rc6p-residency,	 I915_PMU_RC6p_RESIDENCY),
+        I915_PMU_EVENT_ATTR(rc6pp-residency,	 I915_PMU_RC6pp_RESIDENCY),
+
+        NULL,
+};
+
+static const struct attribute_group i915_pmu_events_attr_group = {
+        .name = "events",
+        .attrs = i915_pmu_events_attrs,
+};
+
+static ssize_t
+i915_pmu_get_attr_cpumask(struct device *dev,
+			  struct device_attribute *attr,
+			  char *buf)
+{
+	return cpumap_print_to_pagebuf(true, buf, &i915_pmu_cpumask);
+}
+
+static DEVICE_ATTR(cpumask, S_IRUGO, i915_pmu_get_attr_cpumask, NULL);
+
+static struct attribute *i915_cpumask_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL,
+};
+
+static struct attribute_group i915_pmu_cpumask_attr_group = {
+	.attrs = i915_cpumask_attrs,
+};
+
+static const struct attribute_group *i915_pmu_attr_groups[] = {
+        &i915_pmu_format_attr_group,
+        &i915_pmu_events_attr_group,
+	&i915_pmu_cpumask_attr_group,
+        NULL
+};
+
+static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
+{
+	unsigned int target;
+
+	target = cpumask_any_and(&i915_pmu_cpumask, &i915_pmu_cpumask);
+	/* Select the first online CPU as a designated reader. */
+	if (target >= nr_cpu_ids)
+		cpumask_set_cpu(cpu, &i915_pmu_cpumask);
+
+	return 0;
+}
+
+static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
+{
+	struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), node);
+	unsigned int target;
+
+	if (cpumask_test_and_clear_cpu(cpu, &i915_pmu_cpumask)) {
+		target = cpumask_any_but(topology_sibling_cpumask(cpu), cpu);
+		/* Migrate events if there is a valid target */
+		if (target < nr_cpu_ids) {
+			cpumask_set_cpu(target, &i915_pmu_cpumask);
+			perf_pmu_migrate_context(&pmu->base, cpu, target);
+		}
+	}
+
+	return 0;
+}
+
+void i915_pmu_register(struct drm_i915_private *i915)
+{
+	int ret;
+
+	if (INTEL_GEN(i915) <= 2) {
+		ret = -ENOTSUPP;
+		goto err;
+	}
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
+				      "perf/x86/intel/i915:online",
+				      i915_pmu_cpu_online,
+			              i915_pmu_cpu_offline);
+	if (ret)
+		goto err;
+
+	ret = cpuhp_state_add_instance(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
+				       &i915->pmu.node);
+	if (ret)
+		goto err;
+
+	i915->pmu.base.attr_groups	= i915_pmu_attr_groups;
+	i915->pmu.base.task_ctx_nr	= perf_invalid_context;
+	i915->pmu.base.event_init	= i915_pmu_event_init;
+	i915->pmu.base.add		= i915_pmu_event_add;
+	i915->pmu.base.del		= i915_pmu_event_del;
+	i915->pmu.base.start		= i915_pmu_event_start;
+	i915->pmu.base.stop		= i915_pmu_event_stop;
+	i915->pmu.base.read		= i915_pmu_event_read;
+	i915->pmu.base.event_idx	= i915_pmu_event_event_idx;
+
+	spin_lock_init(&i915->pmu.lock);
+	hrtimer_init(&i915->pmu.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	i915->pmu.timer.function = i915_sample;
+	i915->pmu.enable = 0;
+
+	if (perf_pmu_register(&i915->pmu.base, "i915", -1))
+		i915->pmu.base.event_init = NULL;
+
+err:
+	if (ret)
+		DRM_NOTE("Failed to register PMU (err=%d)\n", ret);
+}
+
+void i915_pmu_unregister(struct drm_i915_private *i915)
+{
+	if (!i915->pmu.base.event_init)
+		return;
+
+	WARN_ON(cpuhp_state_remove_instance(CPUHP_AP_PERF_X86_UNCORE_ONLINE,
+					    &i915->pmu.node));
+	cpuhp_remove_multi_state(CPUHP_AP_PERF_X86_UNCORE_ONLINE);
+
+	i915->pmu.enable = 0;
+
+	hrtimer_cancel(&i915->pmu.timer);
+
+	perf_pmu_unregister(&i915->pmu.base);
+	i915->pmu.base.event_init = NULL;
+}
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 0b03260a3967..8c362e0451c1 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -186,6 +186,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define VIDEO_ENHANCEMENT_CLASS	2
 #define COPY_ENGINE_CLASS	3
 #define OTHER_CLASS		4
+#define MAX_ENGINE_CLASS	4
+
+#define MAX_ENGINE_INSTANCE    1
 
 /* PCI config space */
 
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 3ae89a9d6241..dbc7abd65f33 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -198,6 +198,15 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
 	GEM_BUG_ON(info->class >= ARRAY_SIZE(intel_engine_classes));
 	class_info = &intel_engine_classes[info->class];
 
+	if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
+		return -EINVAL;
+
+	if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
+		return -EINVAL;
+
+	if (GEM_WARN_ON(dev_priv->engine_class[info->class][info->instance]))
+		return -EINVAL;
+
 	GEM_BUG_ON(dev_priv->engine[id]);
 	engine = kzalloc(sizeof(*engine), GFP_KERNEL);
 	if (!engine)
@@ -225,6 +234,7 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
 
 	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
 
+	dev_priv->engine_class[info->class][info->instance] = engine;
 	dev_priv->engine[id] = engine;
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 268342433a8e..7db4c572ef76 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2283,3 +2283,28 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
 
 	return intel_init_ring_buffer(engine);
 }
+
+static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
+	[I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
+	[I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
+	[I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS,
+	[I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS,
+	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS,
+};
+
+struct intel_engine_cs *
+intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance)
+{
+	if (class >= ARRAY_SIZE(user_class_map))
+		return NULL;
+
+	class = user_class_map[class];
+
+	if (WARN_ON_ONCE(class > MAX_ENGINE_CLASS))
+		return NULL;
+
+	if (instance > MAX_ENGINE_INSTANCE)
+		return NULL;
+
+	return i915->engine_class[class][instance];
+}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 79c0021f3700..cf095b9386f4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -245,6 +245,28 @@ struct intel_engine_cs {
 		I915_SELFTEST_DECLARE(bool mock : 1);
 	} breadcrumbs;
 
+	struct {
+		/**
+		 * @enable: Bitmask of enabled sample events on this engine.
+		 *
+		 * Bits correspond to sample event types, for instance
+		 * I915_SAMPLE_QUEUED is bit 0 etc.
+		 */
+		u32 enable;
+		/**
+		 * @enable_count: Reference count for the enabled samplers.
+		 *
+		 * Index number corresponds to the bit number from @enable.
+		 */
+		unsigned int enable_count[I915_PMU_SAMPLE_BITS];
+		/**
+		 * @sample: Counter value for sampling events.
+		 *
+		 * Our internal timer stores the current counter in this field.
+		 */
+		u64 sample[I915_ENGINE_SAMPLE_MAX];
+	} pmu;
+
 	/*
 	 * A pool of objects to use as shadow copies of client batch buffers
 	 * when the command parser is enabled. Prevents the client from
@@ -737,4 +759,7 @@ void intel_engines_reset_default_submission(struct drm_i915_private *i915);
 
 bool intel_engine_can_store_dword(struct intel_engine_cs *engine);
 
+struct intel_engine_cs *
+intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index d8d10d932759..6dc0d6fd4e4c 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -86,6 +86,64 @@ enum i915_mocs_table_index {
 	I915_MOCS_CACHED,
 };
 
+enum drm_i915_gem_engine_class {
+	I915_ENGINE_CLASS_OTHER = 0,
+	I915_ENGINE_CLASS_RENDER = 1,
+	I915_ENGINE_CLASS_COPY = 2,
+	I915_ENGINE_CLASS_VIDEO = 3,
+	I915_ENGINE_CLASS_VIDEO_ENHANCE = 4,
+	I915_ENGINE_CLASS_MAX /* non-ABI */
+};
+
+/**
+ * DOC: perf_events exposed by i915 through /sys/bus/event_source/devices/i915
+ *
+ */
+
+enum drm_i915_pmu_engine_sample {
+	I915_SAMPLE_QUEUED = 0,
+	I915_SAMPLE_BUSY = 1,
+	I915_SAMPLE_WAIT = 2,
+	I915_SAMPLE_SEMA = 3,
+	I915_ENGINE_SAMPLE_MAX /* non-ABI */
+};
+
+#define I915_PMU_SAMPLE_BITS (4)
+#define I915_PMU_SAMPLE_MASK (0xf)
+#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
+#define I915_PMU_CLASS_SHIFT \
+	(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)
+
+#define __I915_PMU_ENGINE(class, instance, sample) \
+	((class) << I915_PMU_CLASS_SHIFT | \
+	(instance) << I915_PMU_SAMPLE_BITS | \
+	(sample))
+
+#define I915_PMU_ENGINE_QUEUED(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
+
+#define I915_PMU_ENGINE_BUSY(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_BUSY)
+
+#define I915_PMU_ENGINE_WAIT(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_WAIT)
+
+#define I915_PMU_ENGINE_SEMA(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
+
+#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
+
+#define I915_PMU_ACTUAL_FREQUENCY 	__I915_PMU_OTHER(0)
+#define I915_PMU_REQUESTED_FREQUENCY	__I915_PMU_OTHER(1)
+#define I915_PMU_ENERGY			__I915_PMU_OTHER(2)
+#define I915_PMU_INTERRUPTS		__I915_PMU_OTHER(3)
+
+#define I915_PMU_RC6_RESIDENCY		__I915_PMU_OTHER(4)
+#define I915_PMU_RC6p_RESIDENCY		__I915_PMU_OTHER(5)
+#define I915_PMU_RC6pp_RESIDENCY	__I915_PMU_OTHER(6)
+
+#define I915_PMU_LAST I915_PMU_RC6pp_RESIDENCY
+
 /* Each region is a minimum of 16k, and there are at most 255 of them.
  */
 #define I915_NR_TEX_REGIONS 255	/* table size 2k - maximum due to use
-- 
2.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [RFC v5 05/11] drm/i915/pmu: Suspend sampling when GPU is idle
  2017-09-11 15:25 ` [RFC 05/11] drm/i915/pmu: Suspend sampling when GPU is idle Tvrtko Ursulin
@ 2017-09-13 10:34   ` Tvrtko Ursulin
  2017-09-14 19:57     ` Chris Wilson
  0 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-13 10:34 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

If only a subset of events is enabled we can afford to suspend
the sampling timer when the GPU is idle and so save some cycles
and power.
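
In essence the timer only needs to run while something can usefully be
sampled; a condensed sketch of the predicate this patch introduces
(illustrative naming, not the exact driver function):

static bool sampling_timer_needed(u64 enabled, bool gpu_active)
{
	/* Frequency counters need the timer whenever they are enabled... */
	bool freq = enabled &
		    (config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY) |
		     config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY));
	/* ...engine samplers only while the GPU is awake. */
	bool engines = enabled & ENGINE_SAMPLE_MASK;

	return freq || (engines && gpu_active);
}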

v2: Rebase and limit timer even more.
v3: Rebase.
v4: Rebase.
v5: Skip action if perf PMU failed to register.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  8 ++++
 drivers/gpu/drm/i915/i915_gem.c         |  1 +
 drivers/gpu/drm/i915/i915_gem_request.c |  1 +
 drivers/gpu/drm/i915/i915_pmu.c         | 70 ++++++++++++++++++++++++++++-----
 4 files changed, 70 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 62646b8dfb7a..70be8c5d9a65 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2244,6 +2244,10 @@ struct i915_pmu {
 	 */
 	unsigned int enable_count[I915_PMU_MASK_BITS];
 	/**
+	 * @timer_enabled: Should the internal sampling timer be running.
+	 */
+	bool timer_enabled;
+	/**
 	 * @sample: Current counter value for i915 events which need sampling.
 	 *
 	 * These counters are updated from the i915 PMU sampling timer.
@@ -3989,9 +3993,13 @@ extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
 #ifdef CONFIG_PERF_EVENTS
 extern void i915_pmu_register(struct drm_i915_private *i915);
 extern void i915_pmu_unregister(struct drm_i915_private *i915);
+extern void i915_pmu_gt_idle(struct drm_i915_private *i915);
+extern void i915_pmu_gt_active(struct drm_i915_private *i915);
 #else
 static inline void i915_pmu_register(struct drm_i915_private *i915) {}
 static inline void i915_pmu_unregister(struct drm_i915_private *i915) {}
+static inline void i915_pmu_gt_idle(struct drm_i915_private *i915) {}
+static inline void i915_pmu_gt_active(struct drm_i915_private *i915) {}
 #endif
 
 /* i915_suspend.c */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f445587c1a4b..201b09eda93b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3227,6 +3227,7 @@ i915_gem_idle_work_handler(struct work_struct *work)
 
 	intel_engines_mark_idle(dev_priv);
 	i915_gem_timelines_mark_idle(dev_priv);
+	i915_pmu_gt_idle(dev_priv);
 
 	GEM_BUG_ON(!dev_priv->gt.awake);
 	dev_priv->gt.awake = false;
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 813a3b546d6e..18a1e379253e 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -258,6 +258,7 @@ static void mark_busy(struct drm_i915_private *i915)
 	i915_update_gfx_val(i915);
 	if (INTEL_GEN(i915) >= 6)
 		gen6_rps_busy(i915);
+	i915_pmu_gt_active(i915);
 
 	queue_delayed_work(i915->wq,
 			   &i915->gt.retire_work,
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index e82648e6635b..22246918757c 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -90,6 +90,52 @@ static unsigned int event_enabled_bit(struct perf_event *event)
 	return config_enabled_bit(event->attr.config);
 }
 
+static bool pmu_needs_timer(struct drm_i915_private *i915, bool gpu_active)
+{
+	u64 enable = i915->pmu.enable;
+
+	enable &= config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY) |
+		  config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY) |
+		  ENGINE_SAMPLE_MASK;
+
+	if (!gpu_active)
+		enable &= ~ENGINE_SAMPLE_MASK;
+
+	return enable;
+}
+
+void i915_pmu_gt_idle(struct drm_i915_private *i915)
+{
+	if (!i915->pmu.base.event_init)
+		return;
+
+	spin_lock_irq(&i915->pmu.lock);
+	/*
+	 * Signal sampling timer to stop if only engine events are enabled and
+	 * GPU went idle.
+	 */
+	i915->pmu.timer_enabled = pmu_needs_timer(i915, false);
+	spin_unlock_irq(&i915->pmu.lock);
+}
+
+void i915_pmu_gt_active(struct drm_i915_private *i915)
+{
+	if (!i915->pmu.base.event_init)
+		return;
+
+	spin_lock_irq(&i915->pmu.lock);
+	/*
+	 * Re-enable sampling timer when GPU goes active.
+	 */
+	if (!i915->pmu.timer_enabled && pmu_needs_timer(i915, true)) {
+		i915->pmu.timer_enabled = true;
+		hrtimer_start_range_ns(&i915->pmu.timer,
+				       ns_to_ktime(PERIOD), 0,
+				       HRTIMER_MODE_REL_PINNED);
+	}
+	spin_unlock_irq(&i915->pmu.lock);
+}
+
 static bool grab_forcewake(struct drm_i915_private *i915, bool fw)
 {
 	if (!fw)
@@ -180,7 +226,7 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
 	struct drm_i915_private *i915 =
 		container_of(hrtimer, struct drm_i915_private, pmu.timer);
 
-	if (i915->pmu.enable == 0)
+	if (!READ_ONCE(i915->pmu.timer_enabled))
 		return HRTIMER_NORESTART;
 
 	engines_sample(i915);
@@ -362,14 +408,6 @@ static void i915_pmu_enable(struct perf_event *event)
 	spin_lock_irqsave(&i915->pmu.lock, flags);
 
 	/*
-	 * Start the sampling timer when enabling the first event.
-	 */
-	if (i915->pmu.enable == 0)
-		hrtimer_start_range_ns(&i915->pmu.timer,
-				       ns_to_ktime(PERIOD), 0,
-				       HRTIMER_MODE_REL_PINNED);
-
-	/*
 	 * Update the bitmask of enabled events and increment
 	 * the event reference counter.
 	 */
@@ -379,6 +417,16 @@ static void i915_pmu_enable(struct perf_event *event)
 	i915->pmu.enable_count[bit]++;
 
 	/*
+	 * Start the sampling timer if needed and not already enabled.
+	 */
+	if (pmu_needs_timer(i915, true) && !i915->pmu.timer_enabled) {
+		i915->pmu.timer_enabled = true;
+		hrtimer_start_range_ns(&i915->pmu.timer,
+				       ns_to_ktime(PERIOD), 0,
+				       HRTIMER_MODE_REL_PINNED);
+	}
+
+	/*
 	 * For per-engine events the bitmask and reference counting
 	 * is stored per engine.
 	 */
@@ -440,8 +488,10 @@ static void i915_pmu_disable(struct perf_event *event)
 	 * Decrement the reference count and clear the enabled
 	 * bitmask when the last listener on an event goes away.
 	 */
-	if (--i915->pmu.enable_count[bit] == 0)
+	if (--i915->pmu.enable_count[bit] == 0) {
 		i915->pmu.enable &= ~BIT_ULL(bit);
+		i915->pmu.timer_enabled &= pmu_needs_timer(i915, true);
+	}
 
 	spin_unlock_irqrestore(&i915->pmu.lock, flags);
 }
-- 
2.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* ✗ Fi.CI.BAT: failure for i915 PMU and engine busy stats (rev6)
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (13 preceding siblings ...)
  2017-09-13  9:34 ` ✗ Fi.CI.BAT: warning for i915 PMU and engine busy stats (rev4) Patchwork
@ 2017-09-13 10:46 ` Patchwork
  2017-09-13 13:27 ` ✓ Fi.CI.BAT: success for i915 PMU and engine busy stats (rev7) Patchwork
  2017-09-13 21:24 ` ✓ Fi.CI.IGT: " Patchwork
  16 siblings, 0 replies; 56+ messages in thread
From: Patchwork @ 2017-09-13 10:46 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: i915 PMU and engine busy stats (rev6)
URL   : https://patchwork.freedesktop.org/series/27488/
State : failure

== Summary ==

  CHK     include/config/kernel.release
  CHK     include/generated/uapi/linux/version.h
  CHK     include/generated/utsrelease.h
  CHK     include/generated/bounds.h
  CHK     include/generated/timeconst.h
  CHK     include/generated/asm-offsets.h
  CALL    scripts/checksyscalls.sh
  CHK     scripts/mod/devicetable-offsets.h
  CHK     include/generated/compile.h
  CHK     kernel/config_data.h
  CC [M]  drivers/gpu/drm/i915/i915_pmu.o
drivers/gpu/drm/i915/i915_pmu.c: In function ‘i915_pmu_register’:
drivers/gpu/drm/i915/i915_pmu.c:750:1: error: version control conflict marker in file
 <<<<<<< HEAD
 ^~~~~~~
drivers/gpu/drm/i915/i915_pmu.c:752:1: error: version control conflict marker in file
 =======
 ^~~~~~~
drivers/gpu/drm/i915/i915_pmu.c:754:2: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
  struct intel_engine_cs *engine;
  ^~~~~~
drivers/gpu/drm/i915/i915_pmu.c:756:1: error: version control conflict marker in file
 >>>>>>> drm/i915: Gate engine stats collection with a static key
 ^~~~~~~
drivers/gpu/drm/i915/i915_pmu.c:767:6: error: ‘ret’ undeclared (first use in this function)
  if (ret)
      ^~~
drivers/gpu/drm/i915/i915_pmu.c:767:6: note: each undeclared identifier is reported only once for each function it appears in
drivers/gpu/drm/i915/i915_pmu.c: In function ‘i915_pmu_unregister’:
drivers/gpu/drm/i915/i915_pmu.c:820:1: error: version control conflict marker in file
 <<<<<<< HEAD
 ^~~~~~~
drivers/gpu/drm/i915/i915_pmu.c:823:1: error: version control conflict marker in file
 =======
 ^~~~~~~
drivers/gpu/drm/i915/i915_pmu.c:831:1: error: version control conflict marker in file
 >>>>>>> drm/i915: Gate engine stats collection with a static key
 ^~~~~~~
drivers/gpu/drm/i915/i915_pmu.c:807:23: error: unused variable ‘id’ [-Werror=unused-variable]
  enum intel_engine_id id;
                       ^~
drivers/gpu/drm/i915/i915_pmu.c:806:26: error: unused variable ‘engine’ [-Werror=unused-variable]
  struct intel_engine_cs *engine;
                          ^~~~~~
At top level:
drivers/gpu/drm/i915/i915_pmu.c:715:12: error: ‘i915_pmu_cpu_offline’ defined but not used [-Werror=unused-function]
 static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
            ^~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/i915_pmu.c:703:12: error: ‘i915_pmu_cpu_online’ defined but not used [-Werror=unused-function]
 static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
            ^~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
scripts/Makefile.build:302: recipe for target 'drivers/gpu/drm/i915/i915_pmu.o' failed
make[4]: *** [drivers/gpu/drm/i915/i915_pmu.o] Error 1
scripts/Makefile.build:561: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:561: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:561: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1019: recipe for target 'drivers' failed
make: *** [drivers] Error 2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [RFC v3 11/11] drm/i915: Gate engine stats collection with a static key
  2017-09-11 15:25 ` [RFC 11/11] drm/i915: Gate engine stats collection with a static key Tvrtko Ursulin
@ 2017-09-13 12:18   ` Tvrtko Ursulin
  2017-09-14 20:22     ` Chris Wilson
  0 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-13 12:18 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

This reduces the cost of the software engine busyness tracking
to a single no-op instruction when there are no listeners.

v2: Rebase and some comments.
v3: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_pmu.c         | 54 +++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_engine_cs.c  | 17 ++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.h | 70 ++++++++++++++++++++++-----------
 3 files changed, 113 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 3c0c5d0549b3..d734879e67ee 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -460,11 +460,17 @@ static void i915_pmu_enable(struct perf_event *event)
 		GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
 		GEM_BUG_ON(engine->pmu.enable_count[sample] == ~0);
 		if (engine->pmu.enable_count[sample]++ == 0) {
+			/*
+			 * Enable engine busy stats tracking if needed or
+			 * alternatively cancel the scheduled disabling of the
+			 * same.
+			 */
 			if (engine_needs_busy_stats(engine) &&
 			    !engine->pmu.busy_stats) {
-				engine->pmu.busy_stats =
-					intel_enable_engine_stats(engine) == 0;
-				WARN_ON_ONCE(!engine->pmu.busy_stats);
+				engine->pmu.busy_stats = true;
+				if (!cancel_delayed_work(&engine->pmu.disable_busy_stats))
+					queue_work(i915->wq,
+						&engine->pmu.enable_busy_stats);
 			}
 		}
 	}
@@ -507,7 +513,15 @@ static void i915_pmu_disable(struct perf_event *event)
 			if (!engine_needs_busy_stats(engine) &&
 			    engine->pmu.busy_stats) {
 				engine->pmu.busy_stats = false;
-				intel_disable_engine_stats(engine);
+				/*
+				 * We request a delayed disable to handle the
+				 * rapid on/off cycles on events which can
+				 * happen when tools like perf stat start in a
+				 * nicer way.
+				 */
+				queue_delayed_work(i915->wq,
+						   &engine->pmu.disable_busy_stats,
+						   round_jiffies_up_relative(HZ));
 			}
 		}
 	}
@@ -715,9 +729,27 @@ static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
 	return 0;
 }
 
+static void __enable_busy_stats(struct work_struct *work)
+{
+	struct intel_engine_cs *engine =
+		container_of(work, typeof(*engine), pmu.enable_busy_stats);
+
+	WARN_ON_ONCE(intel_enable_engine_stats(engine));
+}
+
+static void __disable_busy_stats(struct work_struct *work)
+{
+	struct intel_engine_cs *engine =
+	       container_of(work, typeof(*engine), pmu.disable_busy_stats.work);
+
+	intel_disable_engine_stats(engine);
+}
+
 void i915_pmu_register(struct drm_i915_private *i915)
 {
 	int ret;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
 
 	if (INTEL_GEN(i915) <= 2) {
 		ret = -ENOTSUPP;
@@ -751,6 +783,12 @@ void i915_pmu_register(struct drm_i915_private *i915)
 	i915->pmu.timer.function = i915_sample;
 	i915->pmu.enable = 0;
 
+	for_each_engine(engine, i915, id) {
+		INIT_WORK(&engine->pmu.enable_busy_stats, __enable_busy_stats);
+		INIT_DELAYED_WORK(&engine->pmu.disable_busy_stats,
+				  __disable_busy_stats);
+	}
+
 	if (perf_pmu_register(&i915->pmu.base, "i915", -1))
 		i915->pmu.base.event_init = NULL;
 
@@ -761,6 +799,9 @@ void i915_pmu_register(struct drm_i915_private *i915)
 
 void i915_pmu_unregister(struct drm_i915_private *i915)
 {
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
 	if (!i915->pmu.base.event_init)
 		return;
 
@@ -772,6 +813,11 @@ void i915_pmu_unregister(struct drm_i915_private *i915)
 
 	hrtimer_cancel(&i915->pmu.timer);
 
+	for_each_engine(engine, i915, id) {
+		flush_work(&engine->pmu.enable_busy_stats);
+		flush_delayed_work(&engine->pmu.disable_busy_stats);
+	}
+
 	perf_pmu_unregister(&i915->pmu.base);
 	i915->pmu.base.event_init = NULL;
 }
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index e2152dd21b4a..e4a8eb83a018 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -21,6 +21,7 @@
  * IN THE SOFTWARE.
  *
  */
+#include <linux/static_key.h>
 
 #include "i915_drv.h"
 #include "intel_ringbuffer.h"
@@ -1419,6 +1420,10 @@ bool intel_engine_can_store_dword(struct intel_engine_cs *engine)
 	}
 }
 
+DEFINE_STATIC_KEY_FALSE(i915_engine_stats_key);
+static DEFINE_MUTEX(i915_engine_stats_mutex);
+static int i915_engine_stats_ref;
+
 /**
  * intel_enable_engine_stats() - Enable engine busy tracking on engine
  * @engine: engine to enable stats collection
@@ -1434,16 +1439,24 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
 	if (!i915.enable_execlists)
 		return -ENODEV;
 
+	mutex_lock(&i915_engine_stats_mutex);
+
 	spin_lock_irqsave(&engine->stats.lock, flags);
 	if (engine->stats.enabled == ~0)
 		goto busy;
 	engine->stats.enabled++;
 	spin_unlock_irqrestore(&engine->stats.lock, flags);
 
+	if (i915_engine_stats_ref++ == 0)
+		static_branch_enable(&i915_engine_stats_key);
+
+	mutex_unlock(&i915_engine_stats_mutex);
+
 	return 0;
 
 busy:
 	spin_unlock_irqrestore(&engine->stats.lock, flags);
+	mutex_unlock(&i915_engine_stats_mutex);
 
 	return -EBUSY;
 }
@@ -1461,6 +1474,7 @@ void intel_disable_engine_stats(struct intel_engine_cs *engine)
 	if (!i915.enable_execlists)
 		return;
 
+	mutex_lock(&i915_engine_stats_mutex);
 	spin_lock_irqsave(&engine->stats.lock, flags);
 	WARN_ON_ONCE(engine->stats.enabled == 0);
 	if (--engine->stats.enabled == 0) {
@@ -1468,6 +1482,9 @@ void intel_disable_engine_stats(struct intel_engine_cs *engine)
 		engine->stats.start = engine->stats.total = 0;
 	}
 	spin_unlock_irqrestore(&engine->stats.lock, flags);
+	if (--i915_engine_stats_ref == 0)
+		static_branch_disable(&i915_engine_stats_key);
+	mutex_unlock(&i915_engine_stats_mutex);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index fe554fc76867..65dea686fc7c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -269,6 +269,22 @@ struct intel_engine_cs {
 		 * @busy_stats: Has enablement of engine stats tracking been requested.
 		 */
 		bool busy_stats;
+		/**
+		 * @enable_busy_stats: Work item for engine busy stats enabling.
+		 *
+		 * Since the action can sleep it needs to be decoupled from the
+		 * perf API callback.
+		 */
+		struct work_struct enable_busy_stats;
+		/**
+		 * @disable_busy_stats: Work item for busy stats disabling.
+		 *
+		 * Same as with @enable_busy_stats action, with the difference
+		 * that we delay it in case there are rapid enable-disable
+		 * actions, which can happen during tool startup (like perf
+		 * stat).
+		 */
+		struct delayed_work disable_busy_stats;
 	} pmu;
 
 	/*
@@ -794,48 +810,54 @@ bool intel_engine_can_store_dword(struct intel_engine_cs *engine);
 struct intel_engine_cs *
 intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
 
+DECLARE_STATIC_KEY_FALSE(i915_engine_stats_key);
+
 static inline void intel_engine_context_in(struct intel_engine_cs *engine)
 {
 	unsigned long flags;
 
-	if (READ_ONCE(engine->stats.enabled) == 0)
-		return;
+	if (static_branch_unlikely(&i915_engine_stats_key)) {
+		if (READ_ONCE(engine->stats.enabled) == 0)
+			return;
 
-	spin_lock_irqsave(&engine->stats.lock, flags);
+		spin_lock_irqsave(&engine->stats.lock, flags);
 
-	if (engine->stats.enabled > 0) {
-		if (engine->stats.ref++ == 0)
-			engine->stats.start = ktime_get();
-		GEM_BUG_ON(engine->stats.ref == 0);
-	}
+		if (engine->stats.enabled > 0) {
+			if (engine->stats.ref++ == 0)
+				engine->stats.start = ktime_get();
+			GEM_BUG_ON(engine->stats.ref == 0);
+		}
 
-	spin_unlock_irqrestore(&engine->stats.lock, flags);
+		spin_unlock_irqrestore(&engine->stats.lock, flags);
+	}
 }
 
 static inline void intel_engine_context_out(struct intel_engine_cs *engine)
 {
 	unsigned long flags;
 
-	if (READ_ONCE(engine->stats.enabled) == 0)
-		return;
+	if (static_branch_unlikely(&i915_engine_stats_key)) {
+		if (READ_ONCE(engine->stats.enabled) == 0)
+			return;
 
-	spin_lock_irqsave(&engine->stats.lock, flags);
+		spin_lock_irqsave(&engine->stats.lock, flags);
 
-	if (engine->stats.enabled > 0) {
-		/*
-		 * After turning on engine stats, context out might be the
-		 * first event which then needs to be ignored (ref == 0).
-		 */
-		if (engine->stats.ref && --engine->stats.ref == 0) {
-			ktime_t last = ktime_sub(ktime_get(),
-						 engine->stats.start);
+		if (engine->stats.enabled > 0) {
+			/*
+			 * After turning on engine stats, context out might be
+			 * the first event which then needs to be ignored.
+			 */
+			if (engine->stats.ref && --engine->stats.ref == 0) {
+				ktime_t last = ktime_sub(ktime_get(),
+							 engine->stats.start);
 
-			engine->stats.total = ktime_add(engine->stats.total,
-							last);
+				engine->stats.total =
+					ktime_add(engine->stats.total, last);
+			}
 		}
-	}
 
-	spin_unlock_irqrestore(&engine->stats.lock, flags);
+		spin_unlock_irqrestore(&engine->stats.lock, flags);
+	}
 }
 
 int intel_enable_engine_stats(struct intel_engine_cs *engine);
-- 
2.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 56+ messages in thread
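
The mechanism the patch above leans on is the kernel's static key
facility: while the key is disabled the guarded branch costs a single nop
in the fast path, and enabling it live-patches that nop into a jump. A
minimal sketch of the pattern, independent of i915 (names here are
illustrative, not from the series):

#include <linux/static_key.h>

/* The key starts disabled: the branch body sits behind a single nop. */
DEFINE_STATIC_KEY_FALSE(example_stats_key);

static void example_fast_path(void)
{
	/* Costs one nop while the key is disabled. */
	if (static_branch_unlikely(&example_stats_key)) {
		/* rarely enabled accounting work goes here */
	}
}

static void example_stats_enable(void)
{
	/* Rewrites the nop into a jump; this can sleep. */
	static_branch_enable(&example_stats_key);
}

static void example_stats_disable(void)
{
	static_branch_disable(&example_stats_key);
}

Because flipping the key can sleep, the patch defers the actual
enable/disable to work items rather than calling it directly from the
perf event callbacks.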

* ✓ Fi.CI.BAT: success for i915 PMU and engine busy stats (rev7)
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (14 preceding siblings ...)
  2017-09-13 10:46 ` ✗ Fi.CI.BAT: failure for i915 PMU and engine busy stats (rev6) Patchwork
@ 2017-09-13 13:27 ` Patchwork
  2017-09-13 21:24 ` ✓ Fi.CI.IGT: " Patchwork
  16 siblings, 0 replies; 56+ messages in thread
From: Patchwork @ 2017-09-13 13:27 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: i915 PMU and engine busy stats (rev7)
URL   : https://patchwork.freedesktop.org/series/27488/
State : success

== Summary ==

Series 27488v7 i915 PMU and engine busy stats
https://patchwork.freedesktop.org/api/1.0/series/27488/revisions/7/mbox/

Test chamelium:
        Subgroup dp-crc-fast:
                pass       -> FAIL       (fi-kbl-7500u) fdo#102514
Test kms_cursor_legacy:
        Subgroup basic-flip-before-cursor-atomic:
                incomplete -> PASS       (fi-bxt-j4205) fdo#102705
Test pm_rpm:
        Subgroup basic-rte:
                pass       -> DMESG-WARN (fi-cfl-s) fdo#102294

fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
fdo#102705 https://bugs.freedesktop.org/show_bug.cgi?id=102705
fdo#102294 https://bugs.freedesktop.org/show_bug.cgi?id=102294

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:446s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:453s
fi-blb-e6850     total:289  pass:224  dwarn:1   dfail:0   fail:0   skip:64  time:378s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:543s
fi-bwr-2160      total:289  pass:184  dwarn:0   dfail:0   fail:0   skip:105 time:269s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:512s
fi-byt-j1900     total:289  pass:254  dwarn:1   dfail:0   fail:0   skip:34  time:510s
fi-byt-n2820     total:289  pass:250  dwarn:1   dfail:0   fail:0   skip:38  time:500s
fi-cfl-s         total:289  pass:222  dwarn:35  dfail:0   fail:0   skip:32  time:561s
fi-elk-e7500     total:289  pass:230  dwarn:0   dfail:0   fail:0   skip:59  time:453s
fi-glk-2a        total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:600s
fi-hsw-4770      total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:429s
fi-hsw-4770r     total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:413s
fi-ilk-650       total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:443s
fi-ivb-3520m     total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:493s
fi-ivb-3770      total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:474s
fi-kbl-7500u     total:289  pass:263  dwarn:1   dfail:0   fail:1   skip:24  time:485s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:580s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:586s
fi-pnv-d510      total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:550s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:462s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:528s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:513s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:458s
fi-skl-x1585l    total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:476s
fi-snb-2520m     total:289  pass:251  dwarn:0   dfail:0   fail:0   skip:38  time:584s
fi-snb-2600      total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:425s

76f9b11f445f4381eff873a62138ed0b00d08e80 drm-tip: 2017y-09m-13d-12h-28m-54s UTC integration manifest
1c047656cd75 drm/i915: Gate engine stats collection with a static key
1a4be037c2ba drm/i915: Export engine stats API to other users
4f215e795c7d drm/i915/pmu: Wire up engine busy stats to PMU
655532dd323b drm/i915: Export engine busy stats in debugfs
ec9d88cf8fb8 drm/i915: Engine busy time tracking
c19f405cb074 drm/i915: Wrap context schedule notification
5d8030688248 drm/i915/pmu: Suspend sampling when GPU is idle
98ca2c439317 drm/i915/pmu: Expose a PMU interface for perf queries
c03f860d5c4b drm/i915: Extract intel_get_cagf
e79304ec9862 drm/i915: Add intel_energy_uJ
0269ac1577e4 drm/i915: Convert intel_rc6_residency_us to ns

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5683/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* ✓ Fi.CI.IGT: success for i915 PMU and engine busy stats (rev7)
  2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (15 preceding siblings ...)
  2017-09-13 13:27 ` ✓ Fi.CI.BAT: success for i915 PMU and engine busy stats (rev7) Patchwork
@ 2017-09-13 21:24 ` Patchwork
  16 siblings, 0 replies; 56+ messages in thread
From: Patchwork @ 2017-09-13 21:24 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: i915 PMU and engine busy stats (rev7)
URL   : https://patchwork.freedesktop.org/series/27488/
State : success

== Summary ==

Test perf:
        Subgroup polling:
                fail       -> PASS       (shard-hsw) fdo#102252
Test kms_setmode:
        Subgroup basic:
                pass       -> FAIL       (shard-hsw) fdo#99912

fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912

shard-hsw        total:2313 pass:1245 dwarn:0   dfail:0   fail:13  skip:1055 time:9450s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5683/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 04/11] drm/i915/pmu: Expose a PMU interface for perf queries
  2017-09-11 15:25 ` [RFC 04/11] drm/i915/pmu: Expose a PMU interface for perf queries Tvrtko Ursulin
  2017-09-12  2:06   ` Rogozhkin, Dmitry V
@ 2017-09-14 19:46   ` Chris Wilson
  1 sibling, 0 replies; 56+ messages in thread
From: Chris Wilson @ 2017-09-14 19:46 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx; +Cc: Peter Zijlstra

Quoting Tvrtko Ursulin (2017-09-11 16:25:52)
> +       case I915_PMU_RC6_RESIDENCY:
> +               val = intel_rc6_residency_ns(i915,
> +                                            IS_VALLEYVIEW(i915) ?
> +                                            VLV_GT_RENDER_RC6 :
> +                                            GEN6_GT_GFX_RC6);
> +               break;
> +       case I915_PMU_RC6p_RESIDENCY:
> +               if (!IS_VALLEYVIEW(i915))

HAS_RC6p and HAS_RC6pp?
> +                       val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
> +               break;
> +       case I915_PMU_RC6pp_RESIDENCY:
> +               if (!IS_VALLEYVIEW(i915))
> +                       val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
> +               break;
> +       }
> +
> +       return val;
> +}
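
i.e. roughly this shape (a sketch only; it assumes the HAS_RC6p() and
HAS_RC6pp() feature macros, and the helper name is made up):

static u64 __sample_rc6_residency(struct drm_i915_private *i915, u64 config)
{
	u64 val = 0;

	switch (config) {
	case I915_PMU_RC6_RESIDENCY:
		val = intel_rc6_residency_ns(i915,
					     IS_VALLEYVIEW(i915) ?
					     VLV_GT_RENDER_RC6 :
					     GEN6_GT_GFX_RC6);
		break;
	case I915_PMU_RC6p_RESIDENCY:
		if (HAS_RC6p(i915))
			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
		break;
	case I915_PMU_RC6pp_RESIDENCY:
		if (HAS_RC6pp(i915))
			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
		break;
	}

	/* Unsupported counters simply read back as zero. */
	return val;
}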
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 01/11] drm/i915: Convert intel_rc6_residency_us to ns
  2017-09-11 15:25 ` [RFC 01/11] drm/i915: Convert intel_rc6_residency_us to ns Tvrtko Ursulin
@ 2017-09-14 19:48   ` Chris Wilson
  0 siblings, 0 replies; 56+ messages in thread
From: Chris Wilson @ 2017-09-14 19:48 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx; +Cc: Peter Zijlstra

Quoting Tvrtko Ursulin (2017-09-11 16:25:49)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Will be used for exposing the PMU counters.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h |  8 +++++++-
>  drivers/gpu/drm/i915/intel_pm.c | 23 +++++++++--------------
>  2 files changed, 16 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index d07d1109e784..dbd054e88ca2 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -4114,9 +4114,15 @@ void vlv_phy_reset_lanes(struct intel_encoder *encoder);
>  
>  int intel_gpu_freq(struct drm_i915_private *dev_priv, int val);
>  int intel_freq_opcode(struct drm_i915_private *dev_priv, int val);
> -u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
> +u64 intel_rc6_residency_ns(struct drm_i915_private *dev_priv,
>                            const i915_reg_t reg);
>  
> +static inline u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
> +                                        const i915_reg_t reg)
> +{
> +       return DIV_ROUND_UP_ULL(intel_rc6_residency_ns(dev_priv, reg), 1000);
> +}
> +
>  #define I915_READ8(reg)                dev_priv->uncore.funcs.mmio_readb(dev_priv, (reg), true)
>  #define I915_WRITE8(reg, val)  dev_priv->uncore.funcs.mmio_writeb(dev_priv, (reg), (val), true)
>  
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index fa9055a4f790..60461f49936b 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -9343,10 +9343,10 @@ static u64 vlv_residency_raw(struct drm_i915_private *dev_priv,
>         return lower | (u64)upper << 8;
>  }
>  
> -u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
> +u64 intel_rc6_residency_ns(struct drm_i915_private *dev_priv,
>                            const i915_reg_t reg)
>  {
> -       u64 time_hw, units, div;
> +       u64 res;
>  
>         if (!intel_enable_rc6())
>                 return 0;
> @@ -9355,22 +9355,17 @@ u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
>  
>         /* On VLV and CHV, residency time is in CZ units rather than 1.28us */
>         if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
> -               units = 1000;
> -               div = dev_priv->czclk_freq;
> +               res = vlv_residency_raw(dev_priv, reg);
> +               res = DIV_ROUND_UP_ULL(res * 1000000, dev_priv->czclk_freq);
>  
> -               time_hw = vlv_residency_raw(dev_priv, reg);
> -       } else if (IS_GEN9_LP(dev_priv)) {
> -               units = 1000;
> -               div = 1200;             /* 833.33ns */
> -
> -               time_hw = I915_READ(reg);
>         } else {
> -               units = 128000; /* 1.28us */
> -               div = 100000;
> +               /* 833.33ns units on Gen9LP, 1.28us elsewhere. */
> +               unsigned int unit = IS_GEN9_LP(dev_priv) ? 833 : 1280;
>  
> -               time_hw = I915_READ(reg);
> +               res = (u64)I915_READ(reg) * unit;
>         }
>  
>         intel_runtime_pm_put(dev_priv);

I was worried that you were going to assume that we took a wakeref in
the PMU, but I see that it keeps the if (intel_rpm_get_if_in_use()) {} check.

Should we push the rpm wakeref to the caller? intel_runtime_pm_get()
isn't as simple as a single atomic op you would like it to be...
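
Something along these lines, perhaps (a rough sketch; the "_locked" split
and the names are illustrative):

/* Raw variant: the caller guarantees a runtime PM wakeref is held. */
u64 intel_rc6_residency_ns_locked(struct drm_i915_private *dev_priv,
				  const i915_reg_t reg);

static u64 pmu_sample_rc6(struct drm_i915_private *dev_priv,
			  const i915_reg_t reg)
{
	u64 val = 0;

	/* Only sample while already awake; never force a wakeup here. */
	if (intel_runtime_pm_get_if_in_use(dev_priv)) {
		val = intel_rc6_residency_ns_locked(dev_priv, reg);
		intel_runtime_pm_put(dev_priv);
	}

	return val;
}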
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 02/11] drm/i915: Add intel_energy_uJ
  2017-09-11 15:25 ` [RFC 02/11] drm/i915: Add intel_energy_uJ Tvrtko Ursulin
@ 2017-09-14 19:49   ` Chris Wilson
  2017-09-15  9:18     ` Tvrtko Ursulin
  2017-09-14 20:36   ` Ville Syrjälä
  1 sibling, 1 reply; 56+ messages in thread
From: Chris Wilson @ 2017-09-14 19:49 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx; +Cc: Peter Zijlstra

Quoting Tvrtko Ursulin (2017-09-11 16:25:50)
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 60461f49936b..ff67df8d99fa 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -9369,3 +9369,28 @@ u64 intel_rc6_residency_ns(struct drm_i915_private *dev_priv,
>  
>         return res;
>  }
> +
> +unsigned long long intel_energy_uJ(struct drm_i915_private *dev_priv)
> +{
> +       unsigned long long power;
> +       unsigned long units;
> +
> +       if (GEM_WARN_ON(INTEL_GEN(dev_priv) < 6))
> +               return 0;
> +
> +       intel_runtime_pm_get(dev_priv);

Same question on lifting the rpm_get to the caller?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 03/11] drm/i915: Extract intel_get_cagf
  2017-09-11 15:25 ` [RFC 03/11] drm/i915: Extract intel_get_cagf Tvrtko Ursulin
@ 2017-09-14 19:51   ` Chris Wilson
  0 siblings, 0 replies; 56+ messages in thread
From: Chris Wilson @ 2017-09-14 19:51 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx; +Cc: Peter Zijlstra

Quoting Tvrtko Ursulin (2017-09-11 16:25:51)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Code to be shared between debugfs and the PMU implementation.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC v5 05/11] drm/i915/pmu: Suspend sampling when GPU is idle
  2017-09-13 10:34   ` [RFC v5 " Tvrtko Ursulin
@ 2017-09-14 19:57     ` Chris Wilson
  2017-09-15  9:22       ` Tvrtko Ursulin
  0 siblings, 1 reply; 56+ messages in thread
From: Chris Wilson @ 2017-09-14 19:57 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx

Quoting Tvrtko Ursulin (2017-09-13 11:34:17)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> If only a subset of events is enabled we can afford to suspend
> the sampling timer when the GPU is idle and so save some cycles
> and power.
> 
> v2: Rebase and limit timer even more.
> v3: Rebase.
> v4: Rebase.
> v5: Skip action if perf PMU failed to register.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |  8 ++++
>  drivers/gpu/drm/i915/i915_gem.c         |  1 +
>  drivers/gpu/drm/i915/i915_gem_request.c |  1 +
>  drivers/gpu/drm/i915/i915_pmu.c         | 70 ++++++++++++++++++++++++++++-----
>  4 files changed, 70 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 62646b8dfb7a..70be8c5d9a65 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2244,6 +2244,10 @@ struct i915_pmu {
>          */
>         unsigned int enable_count[I915_PMU_MASK_BITS];
>         /**
> +        * @timer_enabled: Should the internal sampling timer be running.
> +        */
> +       bool timer_enabled;
> +       /**
>          * @sample: Current counter value for i915 events which need sampling.
>          *
>          * These counters are updated from the i915 PMU sampling timer.
> @@ -3989,9 +3993,13 @@ extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
>  #ifdef CONFIG_PERF_EVENTS
>  extern void i915_pmu_register(struct drm_i915_private *i915);
>  extern void i915_pmu_unregister(struct drm_i915_private *i915);
> +extern void i915_pmu_gt_idle(struct drm_i915_private *i915);
> +extern void i915_pmu_gt_active(struct drm_i915_private *i915);
>  #else
>  static inline void i915_pmu_register(struct drm_i915_private *i915) {}
>  static inline void i915_pmu_unregister(struct drm_i915_private *i915) {}
> +static inline void i915_pmu_gt_idle(struct drm_i915_private *i915) {}
> +static inline void i915_pmu_gt_active(struct drm_i915_private *i915) {}
>  #endif
>  
>  /* i915_suspend.c */
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index f445587c1a4b..201b09eda93b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3227,6 +3227,7 @@ i915_gem_idle_work_handler(struct work_struct *work)
>  
>         intel_engines_mark_idle(dev_priv);
>         i915_gem_timelines_mark_idle(dev_priv);
> +       i915_pmu_gt_idle(dev_priv);
>  
>         GEM_BUG_ON(!dev_priv->gt.awake);
>         dev_priv->gt.awake = false;
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index 813a3b546d6e..18a1e379253e 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -258,6 +258,7 @@ static void mark_busy(struct drm_i915_private *i915)
>         i915_update_gfx_val(i915);
>         if (INTEL_GEN(i915) >= 6)
>                 gen6_rps_busy(i915);
> +       i915_pmu_gt_active(i915);
>  
>         queue_delayed_work(i915->wq,
>                            &i915->gt.retire_work,
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index e82648e6635b..22246918757c 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -90,6 +90,52 @@ static unsigned int event_enabled_bit(struct perf_event *event)
>         return config_enabled_bit(event->attr.config);
>  }
>  
> +static bool pmu_needs_timer(struct drm_i915_private *i915, bool gpu_active)
> +{
> +       u64 enable = i915->pmu.enable;
> +
> +       enable &= config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY) |
> +                 config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY) |
> +                 ENGINE_SAMPLE_MASK;
> +
> +       if (!gpu_active)
> +               enable &= ~ENGINE_SAMPLE_MASK;

Ok.

> +
> +       return enable;
> +}
> +
> +void i915_pmu_gt_idle(struct drm_i915_private *i915)
> +{
> +       if (!i915->pmu.base.event_init)
> +               return;
> +
> +       spin_lock_irq(&i915->pmu.lock);
> +       /*
> +        * Signal sampling timer to stop if only engine events are enabled and
> +        * GPU went idle.
> +        */
> +       i915->pmu.timer_enabled = pmu_needs_timer(i915, false);
> +       spin_unlock_irq(&i915->pmu.lock);
> +}
> +
> +void i915_pmu_gt_active(struct drm_i915_private *i915)
> +{
> +       if (!i915->pmu.base.event_init)
> +               return;
> +
> +       spin_lock_irq(&i915->pmu.lock);
> +       /*
> +        * Re-enable sampling timer when GPU goes active.
> +        */
> +       if (!i915->pmu.timer_enabled && pmu_needs_timer(i915, true)) {
> +               i915->pmu.timer_enabled = true;
> +               hrtimer_start_range_ns(&i915->pmu.timer,
> +                                      ns_to_ktime(PERIOD), 0,
> +                                      HRTIMER_MODE_REL_PINNED);
> +       }

There is duplication here; extract a __i915_pmu_gt_active() helper for the
critical section. That also helps allay fears about whether we hold the
spinlock, which is not clear from the diff context below.
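
Roughly like so (a sketch only):

static void __i915_pmu_gt_active(struct drm_i915_private *i915)
{
	lockdep_assert_held(&i915->pmu.lock);

	if (!i915->pmu.timer_enabled && pmu_needs_timer(i915, true)) {
		i915->pmu.timer_enabled = true;
		hrtimer_start_range_ns(&i915->pmu.timer,
				       ns_to_ktime(PERIOD), 0,
				       HRTIMER_MODE_REL_PINNED);
	}
}

void i915_pmu_gt_active(struct drm_i915_private *i915)
{
	if (!i915->pmu.base.event_init)
		return;

	spin_lock_irq(&i915->pmu.lock);
	__i915_pmu_gt_active(i915);
	spin_unlock_irq(&i915->pmu.lock);
}

with whichever other path needs to (re)start the timer calling the same
helper under the lock.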
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 07/11] drm/i915: Engine busy time tracking
  2017-09-11 15:25 ` [RFC 07/11] drm/i915: Engine busy time tracking Tvrtko Ursulin
@ 2017-09-14 20:16   ` Chris Wilson
  2017-09-15  9:45     ` Tvrtko Ursulin
  0 siblings, 1 reply; 56+ messages in thread
From: Chris Wilson @ 2017-09-14 20:16 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx; +Cc: Peter Zijlstra

Quoting Tvrtko Ursulin (2017-09-11 16:25:55)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Track total time requests have been executing on the hardware.
> 
> We add new kernel API to allow software tracking of time GPU
> engines are spending executing requests.
> 
> Both per-engine and global API is added with the latter also
> being exported for use by external users.
> 
> v2:
>  * Squashed with the internal API.
>  * Dropped static key.
>  * Made per-engine.
>  * Store time in monotonic ktime.
> 
> v3: Moved stats clearing to disable.
> 
> v4:
>  * Comments.
>  * Don't export the API just yet.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_engine_cs.c  | 141 ++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_lrc.c        |   2 +
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  81 ++++++++++++++++++
>  3 files changed, 224 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index dbc7abd65f33..f7dba176989c 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -232,6 +232,8 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
>         /* Nothing to do here, execute in order of dependencies */
>         engine->schedule = NULL;
>  
> +       spin_lock_init(&engine->stats.lock);
> +
>         ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
>  
>         dev_priv->engine_class[info->class][info->instance] = engine;
> @@ -1417,6 +1419,145 @@ bool intel_engine_can_store_dword(struct intel_engine_cs *engine)
>         }
>  }
>  
> +/**
> + * intel_enable_engine_stats() - Enable engine busy tracking on engine
> + * @engine: engine to enable stats collection
> + *
> + * Start collecting the engine busyness data for @engine.
> + *
> + * Returns 0 on success or a negative error code.
> + */
> +int intel_enable_engine_stats(struct intel_engine_cs *engine)
> +{
> +       unsigned long flags;
> +
> +       if (!i915.enable_execlists)
> +               return -ENODEV;
> +
> +       spin_lock_irqsave(&engine->stats.lock, flags);
> +       if (engine->stats.enabled == ~0)
> +               goto busy;
> +       engine->stats.enabled++;
> +       spin_unlock_irqrestore(&engine->stats.lock, flags);
> +
> +       return 0;
> +
> +busy:
> +       spin_unlock_irqrestore(&engine->stats.lock, flags);
> +
> +       return -EBUSY;
> +}
> +
> +/**
> + * intel_disable_engine_stats() - Disable engine busy tracking on engine
> + * @engine: engine to disable stats collection
> + *
> + * Stops collecting the engine busyness data for @engine.
> + */
> +void intel_disable_engine_stats(struct intel_engine_cs *engine)
> +{
> +       unsigned long flags;
> +
> +       if (!i915.enable_execlists)
> +               return;
> +
> +       spin_lock_irqsave(&engine->stats.lock, flags);
> +       WARN_ON_ONCE(engine->stats.enabled == 0);
> +       if (--engine->stats.enabled == 0) {

Saturation protection on inc, but not on dec?

You might as well just use refcount_t.

> +               engine->stats.ref = 0;
> +               engine->stats.start = engine->stats.total = 0;
> +       }
> +       spin_unlock_irqrestore(&engine->stats.lock, flags);
> +}
> +
> +/**
> + * intel_enable_engines_stats() - Enable engine busy tracking on all engines
> + * @dev_priv: i915 device private
> + *
> + * Start collecting the engine busyness data for all engines.
> + *
> + * Returns 0 on success or a negative error code.
> + */
> +int intel_enable_engines_stats(struct drm_i915_private *dev_priv)
> +{
> +       struct intel_engine_cs *engine;
> +       enum intel_engine_id id;
> +       int ret = 0;
> +
> +       if (!i915.enable_execlists)
> +               return -ENODEV;
> +
> +       for_each_engine(engine, dev_priv, id) {
> +               ret = intel_enable_engine_stats(engine);
> +               if (WARN_ON_ONCE(ret))
> +                       break;

Doesn't the failure here only lead to more failure? The only failure is
counter saturation, and by not handling that failure you leak the
earlier refs.
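
i.e. on failure the references already taken need dropping again,
something like (sketch only):

int intel_enable_engines_stats(struct drm_i915_private *dev_priv)
{
	struct intel_engine_cs *engine;
	enum intel_engine_id id, tmp;
	int ret = 0;

	if (!i915.enable_execlists)
		return -ENODEV;

	for_each_engine(engine, dev_priv, id) {
		ret = intel_enable_engine_stats(engine);
		if (ret)
			goto undo;
	}

	return 0;

undo:
	/* Drop the references taken on the engines that did succeed. */
	for_each_engine(engine, dev_priv, tmp) {
		if (tmp == id)
			break;
		intel_disable_engine_stats(engine);
	}

	return ret;
}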

> +       }
> +
> +       return ret;
> +}
> +
> +/**
> + * intel_disable_engines_stats() - Disable engine busy tracking on all engines
> + * @dev_priv: i915 device private
> + *
> + * Stops collecting the engine busyness data for all engines.
> + */
> +void intel_disable_engines_stats(struct drm_i915_private *dev_priv)
> +{
> +       struct intel_engine_cs *engine;
> +       enum intel_engine_id id;
> +
> +       for_each_engine(engine, dev_priv, id)
> +               intel_disable_engine_stats(engine);
> +}
> +
> +/**
> + * intel_engine_get_busy_time() - Return current accumulated engine busyness
> + * @engine: engine to report on
> + *
> + * Returns accumulated time @engine was busy since engine stats were enabled.
> + */
> +ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine)
> +{
> +       ktime_t total;
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&engine->stats.lock, flags);
> +
> +       total = engine->stats.total;
> +
> +       /*
> +        * If the engine is executing something at the moment
> +        * add it to the total.
> +        */
> +       if (engine->stats.ref)
> +               total = ktime_add(total,
> +                                 ktime_sub(ktime_get(), engine->stats.start));
> +
> +       spin_unlock_irqrestore(&engine->stats.lock, flags);
> +
> +       return total;
> +}
> +
> +/**
> + * intel_engines_get_busy_time() - Return current accumulated overall engine busyness
> + * @dev_priv: i915 device private
> + *
> + * Returns accumulated time all engines were busy since engine stats were
> + * enabled.
> + */
> +ktime_t intel_engines_get_busy_time(struct drm_i915_private *dev_priv)
> +{
> +       struct intel_engine_cs *engine;
> +       enum intel_engine_id id;
> +       ktime_t total = 0;
> +
> +       for_each_engine(engine, dev_priv, id)
> +               total = ktime_add(total, intel_engine_get_busy_time(engine));
> +
> +       return total;
> +}
> +
>  #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
>  #include "selftests/mock_engine.c"
>  #endif
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index b61fb09024c3..00fcbde998fc 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -310,12 +310,14 @@ execlists_context_status_change(struct drm_i915_gem_request *rq,
>  static inline void
>  execlists_context_schedule_in(struct drm_i915_gem_request *rq)
>  {
> +       intel_engine_context_in(rq->engine);
>         execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
>  }
>  
>  static inline void
>  execlists_context_schedule_out(struct drm_i915_gem_request *rq)
>  {
> +       intel_engine_context_out(rq->engine);
>         execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index cf095b9386f4..f618c5f98edf 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -463,6 +463,34 @@ struct intel_engine_cs {
>          * certain bits to encode the command length in the header).
>          */
>         u32 (*get_cmd_length_mask)(u32 cmd_header);
> +
> +       struct {
> +              /**
> +               * @lock: Lock protecting the below fields.
> +               */
> +               spinlock_t lock;
> +               /**
> +                * @enabled: Reference count indicating number of listeners.
> +                */
> +               unsigned int enabled;
> +               /**
> +                * @ref: Number of contexts currently scheduled in.
> +                */
> +               unsigned int ref;

active?

> +               /**
> +                * @start: Timestamp of the last idle to active transition.
> +                *
> +                * Idle is defined as ref == 0, active is ref > 0.
> +                */
> +               ktime_t start;
> +               /**
> +                * @total: Total time this engine was busy.
> +                *
> +                * Accumulated time not counting the most recent block in cases
> +                * where engine is currently busy (ref > 0).
> +                */
> +               ktime_t total;
> +       } stats;
>  };
>  
>  static inline unsigned int
> @@ -762,4 +790,57 @@ bool intel_engine_can_store_dword(struct intel_engine_cs *engine);
>  struct intel_engine_cs *
>  intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
>  
> +static inline void intel_engine_context_in(struct intel_engine_cs *engine)
> +{
> +       unsigned long flags;
> +
> +       if (READ_ONCE(engine->stats.enabled) == 0)
> +               return;
> +
> +       spin_lock_irqsave(&engine->stats.lock, flags);
> +
> +       if (engine->stats.enabled > 0) {
> +               if (engine->stats.ref++ == 0)
> +                       engine->stats.start = ktime_get();
> +               GEM_BUG_ON(engine->stats.ref == 0);
> +       }
> +
> +       spin_unlock_irqrestore(&engine->stats.lock, flags);
> +}
> +
> +static inline void intel_engine_context_out(struct intel_engine_cs *engine)
> +{
> +       unsigned long flags;
> +
> +       if (READ_ONCE(engine->stats.enabled) == 0)
> +               return;
> +
> +       spin_lock_irqsave(&engine->stats.lock, flags);
> +
> +       if (engine->stats.enabled > 0) {
> +               /*
> +                * After turning on engine stats, context out might be the
> +                * first event which then needs to be ignored (ref == 0).
> +                */
> +               if (engine->stats.ref && --engine->stats.ref == 0) {
> +                       ktime_t last = ktime_sub(ktime_get(),
> +                                                engine->stats.start);

s/last/this/ ? You are adding in the time elapsed for the current activity.

> +
> +                       engine->stats.total = ktime_add(engine->stats.total,
> +                                                       last);
> +               }
> +       }
> +
> +       spin_unlock_irqrestore(&engine->stats.lock, flags);

The only slight annoyance is that we do the context-out before we process
the context-in, so if we only ever fill slot0, we end up with a pair of
ktime_get()s we didn't need.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 08/11] drm/i915: Export engine busy stats in debugfs
  2017-09-11 15:25 ` [RFC 08/11] drm/i915: Export engine busy stats in debugfs Tvrtko Ursulin
@ 2017-09-14 20:17   ` Chris Wilson
  2017-09-15  9:46     ` Tvrtko Ursulin
  0 siblings, 1 reply; 56+ messages in thread
From: Chris Wilson @ 2017-09-14 20:17 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx; +Cc: Peter Zijlstra

Quoting Tvrtko Ursulin (2017-09-11 16:25:56)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Export the stats added in the previous patch in debugfs.
> 
> Number of active clients reading this data is tracked and the
> static key is only enabled whilst there are some.
> 
> Userspace is intended to keep the file descriptor open, seeking
> to the beginning of the file periodically, and re-reading the
> stats.
> 
> This is because the underlying implementation is costly on every
> first open/last close transition, and also, because the stats
> exported mostly make sense when they are considered relative to
> the previous sample.
> 
> File lists nanoseconds each engine was active since the tracking
> has started.

Purpose? It's a debug API, so what are we debugging and by whom?
-Chris
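
For reference, the consumption pattern the commit message describes, as a
minimal userspace sketch (the debugfs file name below is a placeholder,
not taken from the patch):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	char buf[4096];
	int fd = open("/sys/kernel/debug/dri/0/i915_engine_stats", O_RDONLY);

	if (fd < 0)
		return 1;

	for (;;) {
		/* Keep the fd open and re-read from offset 0 each period. */
		ssize_t len = pread(fd, buf, sizeof(buf) - 1, 0);

		if (len <= 0)
			break;
		buf[len] = '\0';
		fputs(buf, stdout);
		sleep(1);
	}

	close(fd);
	return 0;
}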
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC v3 11/11] drm/i915: Gate engine stats collection with a static key
  2017-09-13 12:18   ` [RFC v3 " Tvrtko Ursulin
@ 2017-09-14 20:22     ` Chris Wilson
  2017-09-15  9:51       ` Tvrtko Ursulin
  0 siblings, 1 reply; 56+ messages in thread
From: Chris Wilson @ 2017-09-14 20:22 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx

Quoting Tvrtko Ursulin (2017-09-13 13:18:19)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> This reduces the cost of the software engine busyness tracking
> to a single no-op instruction when there are no listeners.
> 
> v2: Rebase and some comments.
> v3: Rebase.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_pmu.c         | 54 +++++++++++++++++++++++--
>  drivers/gpu/drm/i915/intel_engine_cs.c  | 17 ++++++++
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 70 ++++++++++++++++++++++-----------
>  3 files changed, 113 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> index 3c0c5d0549b3..d734879e67ee 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -460,11 +460,17 @@ static void i915_pmu_enable(struct perf_event *event)
>                 GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
>                 GEM_BUG_ON(engine->pmu.enable_count[sample] == ~0);
>                 if (engine->pmu.enable_count[sample]++ == 0) {
> +                       /*
> +                        * Enable engine busy stats tracking if needed or
> +                        * alternatively cancel the scheduled disabling of the
> +                        * same.
> +                        */
>                         if (engine_needs_busy_stats(engine) &&
>                             !engine->pmu.busy_stats) {
> -                               engine->pmu.busy_stats =
> -                                       intel_enable_engine_stats(engine) == 0;
> -                               WARN_ON_ONCE(!engine->pmu.busy_stats);
> +                               engine->pmu.busy_stats = true;
> +                               if (!cancel_delayed_work(&engine->pmu.disable_busy_stats))
> +                                       queue_work(i915->wq,
> +                                               &engine->pmu.enable_busy_stats);

The only users of i915->wq are supposed to be dependent on struct_mutex,
it is limited to a single job under the presumption that each worker
would be serialised by that mutex.
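
If these work items never take struct_mutex, one way around that is to
keep them off i915->wq entirely (a sketch; the helper names are made up):

static void busy_stats_start(struct intel_engine_cs *engine)
{
	/* Cancel a pending delayed disable, or schedule the enable. */
	if (!cancel_delayed_work(&engine->pmu.disable_busy_stats))
		schedule_work(&engine->pmu.enable_busy_stats);
}

static void busy_stats_stop(struct intel_engine_cs *engine)
{
	/* Delay the disable to ride out rapid enable/disable cycles. */
	schedule_delayed_work(&engine->pmu.disable_busy_stats,
			      round_jiffies_up_relative(HZ));
}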
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 10/11] drm/i915: Export engine stats API to other users
  2017-09-11 15:25 ` [RFC 10/11] drm/i915: Export engine stats API to other users Tvrtko Ursulin
  2017-09-12 18:35   ` Ben Widawsky
@ 2017-09-14 20:26   ` Chris Wilson
  2017-09-15  9:49     ` Tvrtko Ursulin
  2017-09-29 10:59   ` Joonas Lahtinen
  2 siblings, 1 reply; 56+ messages in thread
From: Chris Wilson @ 2017-09-14 20:26 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx; +Cc: Peter Zijlstra, Ben Widawsky, Ben Widawsky

Quoting Tvrtko Ursulin (2017-09-11 16:25:58)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Other kernel users might want to look at total GPU busyness
> in order to implement things like package power distribution
> algorithms more efficiently.

Who are we exporting these symbols to? Will you not need all the module
ref handling and load ordering like around ips and audio?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 02/11] drm/i915: Add intel_energy_uJ
  2017-09-11 15:25 ` [RFC 02/11] drm/i915: Add intel_energy_uJ Tvrtko Ursulin
  2017-09-14 19:49   ` Chris Wilson
@ 2017-09-14 20:36   ` Ville Syrjälä
  2017-09-15  6:56     ` Tvrtko Ursulin
  1 sibling, 1 reply; 56+ messages in thread
From: Ville Syrjälä @ 2017-09-14 20:36 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Peter Zijlstra, Intel-gfx

On Mon, Sep 11, 2017 at 04:25:50PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Extract code from i915_energy_uJ (debugfs) so it can be used by
> other callers in future patches.
> 
> v2: Rebase.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 17 +----------------
>  drivers/gpu/drm/i915/i915_drv.h     |  2 ++
>  drivers/gpu/drm/i915/intel_pm.c     | 25 +++++++++++++++++++++++++
>  3 files changed, 28 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 6338018f655d..b3a4a66bf7c4 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2780,26 +2780,11 @@ static int i915_sink_crc(struct seq_file *m, void *data)
>  static int i915_energy_uJ(struct seq_file *m, void *data)
>  {
>  	struct drm_i915_private *dev_priv = node_to_i915(m->private);
> -	unsigned long long power;
> -	u32 units;
>  
>  	if (INTEL_GEN(dev_priv) < 6)
>  		return -ENODEV;
>  
> -	intel_runtime_pm_get(dev_priv);
> -
> -	if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power)) {
> -		intel_runtime_pm_put(dev_priv);
> -		return -ENODEV;
> -	}
> -
> -	units = (power & 0x1f00) >> 8;
> -	power = I915_READ(MCH_SECP_NRG_STTS);
> -	power = (1000000 * power) >> units; /* convert to uJ */
> -
> -	intel_runtime_pm_put(dev_priv);
> -
> -	seq_printf(m, "%llu", power);
> +	seq_printf(m, "%llu", intel_energy_uJ(dev_priv));

Isn't this the same thing as the package energy you get from rapl? Can't
we just nuke this private implementation entirely and rely on whatever
rapl gives us?

>  
>  	return 0;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index dbd054e88ca2..826c74970ce9 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -4123,6 +4123,8 @@ static inline u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
>  	return DIV_ROUND_UP_ULL(intel_rc6_residency_ns(dev_priv, reg), 1000);
>  }
>  
> +u64 intel_energy_uJ(struct drm_i915_private *dev_priv);
> +
>  #define I915_READ8(reg)		dev_priv->uncore.funcs.mmio_readb(dev_priv, (reg), true)
>  #define I915_WRITE8(reg, val)	dev_priv->uncore.funcs.mmio_writeb(dev_priv, (reg), (val), true)
>  
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 60461f49936b..ff67df8d99fa 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -9369,3 +9369,28 @@ u64 intel_rc6_residency_ns(struct drm_i915_private *dev_priv,
>  
>  	return res;
>  }
> +
> +unsigned long long intel_energy_uJ(struct drm_i915_private *dev_priv)
> +{
> +	unsigned long long power;
> +	unsigned long units;
> +
> +	if (GEM_WARN_ON(INTEL_GEN(dev_priv) < 6))
> +		return 0;
> +
> +	intel_runtime_pm_get(dev_priv);
> +
> +	if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power)) {
> +		power = 0;
> +		goto out;
> +	}
> +
> +	units = (power >> 8) & 0x1f;
> +	power = I915_READ(MCH_SECP_NRG_STTS);
> +	power = (1000000 * power) >> units; /* convert to uJ */
> +
> +out:
> +	intel_runtime_pm_put(dev_priv);
> +
> +	return power;
> +}
> -- 
> 2.9.5
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC v7 04/11] drm/i915/pmu: Expose a PMU interface for perf queries
  2017-09-13 10:34         ` [RFC v7 " Tvrtko Ursulin
@ 2017-09-15  0:00           ` Rogozhkin, Dmitry V
  2017-09-15  7:57             ` Tvrtko Ursulin
  0 siblings, 1 reply; 56+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-09-15  0:00 UTC (permalink / raw)
  To: tursulin; +Cc: peterz, Intel-gfx

On Wed, 2017-09-13 at 11:34 +0100, Tvrtko Ursulin wrote:
> +static int i915_pmu_event_init(struct perf_event *event)
> +{
> +       struct drm_i915_private *i915 =
> +               container_of(event->pmu, typeof(*i915), pmu.base);
> +       int cpu, ret;
> +
> +       if (event->attr.type != event->pmu->type)
> +               return -ENOENT;
> +
> +       /* unsupported modes and filters */
> +       if (event->attr.sample_period) /* no sampling */
> +               return -EINVAL;
> +
> +       if (has_branch_stack(event))
> +               return -EOPNOTSUPP;
> +
> +       if (event->cpu < 0)
> +               return -EINVAL;
> +
> +       cpu = cpumask_any_and(&i915_pmu_cpumask,
> +                             topology_sibling_cpumask(event->cpu));
> +       if (cpu >= nr_cpu_ids)
> +               return -ENODEV;
> +
> +       ret = 0;
> +       if (is_engine_event(event)) {
> +               ret = engine_event_init(event);
> +       } else switch (event->attr.config) {
> +       case I915_PMU_ACTUAL_FREQUENCY:
> +               if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
> +                       ret = -ENODEV; /* requires a mutex for
> sampling! */
> +       case I915_PMU_REQUESTED_FREQUENCY:
> +       case I915_PMU_ENERGY:
> +       case I915_PMU_RC6_RESIDENCY:
> +       case I915_PMU_RC6p_RESIDENCY:
> +       case I915_PMU_RC6pp_RESIDENCY:
> +               if (INTEL_GEN(i915) < 6)
> +                       ret = -ENODEV;
> +               break;
> +       }
> +       if (ret)
> +               return ret;

The switch for non-engine events should error out by default:

diff --git a/drivers/gpu/drm/i915/i915_pmu.c
b/drivers/gpu/drm/i915/i915_pmu.c
index d734879..3145e9a 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -329,6 +329,9 @@ static int i915_pmu_event_init(struct perf_event
*event)
                if (INTEL_GEN(i915) < 6)
                        ret = -ENODEV;
                break;
+       default:
+               ret = -ENOENT;
+               break;
        }
        if (ret)
                return ret;


Otherwise a user may try to enable a non-existent metric (> I915_PMU_LAST)
and will eventually hit a kernel panic in i915_pmu_enable/disable during
the refcount operations. We also need an IGT test to check for that.
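
For completeness, the shape such a check could take from plain
perf_event_open() (a sketch only; a real test would live in IGT and use
its helpers, and the sysfs lookup below is an assumption about how to find
the dynamic PMU type):

#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <linux/perf_event.h>
#include <sys/syscall.h>

static int i915_pmu_type(void)
{
	int type = -1;
	FILE *f = fopen("/sys/bus/event_source/devices/i915/type", "r");

	if (f) {
		if (fscanf(f, "%d", &type) != 1)
			type = -1;
		fclose(f);
	}

	return type;
}

int main(void)
{
	struct perf_event_attr attr;
	int type = i915_pmu_type();
	long fd;

	if (type < 0)
		return 0; /* no i915 PMU registered, nothing to test */

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = type;
	attr.config = ~0ULL; /* well past any valid i915 config */

	fd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);

	/* Opening a non-existent counter must fail cleanly, not oops later. */
	assert(fd < 0);

	return 0;
}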

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [RFC 02/11] drm/i915: Add intel_energy_uJ
  2017-09-14 20:36   ` Ville Syrjälä
@ 2017-09-15  6:56     ` Tvrtko Ursulin
  2017-09-15  8:51       ` Chris Wilson
  0 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-15  6:56 UTC (permalink / raw)
  To: Ville Syrjälä, Tvrtko Ursulin; +Cc: Peter Zijlstra, Intel-gfx


On 14/09/2017 21:36, Ville Syrjälä wrote:
> On Mon, Sep 11, 2017 at 04:25:50PM +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Extract code from i915_energy_uJ (debugfs) so it can be used by
>> other callers in future patches.
>>
>> v2: Rebase.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_debugfs.c | 17 +----------------
>>   drivers/gpu/drm/i915/i915_drv.h     |  2 ++
>>   drivers/gpu/drm/i915/intel_pm.c     | 25 +++++++++++++++++++++++++
>>   3 files changed, 28 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>> index 6338018f655d..b3a4a66bf7c4 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -2780,26 +2780,11 @@ static int i915_sink_crc(struct seq_file *m, void *data)
>>   static int i915_energy_uJ(struct seq_file *m, void *data)
>>   {
>>   	struct drm_i915_private *dev_priv = node_to_i915(m->private);
>> -	unsigned long long power;
>> -	u32 units;
>>   
>>   	if (INTEL_GEN(dev_priv) < 6)
>>   		return -ENODEV;
>>   
>> -	intel_runtime_pm_get(dev_priv);
>> -
>> -	if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power)) {
>> -		intel_runtime_pm_put(dev_priv);
>> -		return -ENODEV;
>> -	}
>> -
>> -	units = (power & 0x1f00) >> 8;
>> -	power = I915_READ(MCH_SECP_NRG_STTS);
>> -	power = (1000000 * power) >> units; /* convert to uJ */
>> -
>> -	intel_runtime_pm_put(dev_priv);
>> -
>> -	seq_printf(m, "%llu", power);
>> +	seq_printf(m, "%llu", intel_energy_uJ(dev_priv));
> 
> Isn't this the same thing as the package energy you get from rapl? Can't
> we just nuke this private implementation entirely and rely on whatever
> rapl gives us?

If so, I think we could leave it out of the i915 PMU. I tried looking for
MCH_SECP_NRG_STTS in bspec but could not find it, though. Is your bspec fu
perhaps better? Could you have a look?

I can also compare the outputs of the two and see if they are one and 
the same that way.
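
Something along these lines would do for a quick look (a sketch; both
paths are assumptions about this particular machine, and the powercap
node is the package domain):

#include <stdio.h>

static unsigned long long read_ull(const char *path)
{
	unsigned long long val = 0;
	FILE *f = fopen(path, "r");

	if (f) {
		if (fscanf(f, "%llu", &val) != 1)
			val = 0;
		fclose(f);
	}

	return val;
}

int main(void)
{
	/* Both values are in microjoules. */
	unsigned long long i915_uj =
		read_ull("/sys/kernel/debug/dri/0/i915_energy_uJ");
	unsigned long long rapl_uj =
		read_ull("/sys/class/powercap/intel-rapl:0/energy_uj");

	printf("i915: %llu uJ, RAPL pkg: %llu uJ\n", i915_uj, rapl_uj);

	return 0;
}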

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC v7 04/11] drm/i915/pmu: Expose a PMU interface for perf queries
  2017-09-15  0:00           ` Rogozhkin, Dmitry V
@ 2017-09-15  7:57             ` Tvrtko Ursulin
  0 siblings, 0 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-15  7:57 UTC (permalink / raw)
  To: Rogozhkin, Dmitry V, tursulin; +Cc: peterz, Intel-gfx


On 15/09/2017 01:00, Rogozhkin, Dmitry V wrote:
> On Wed, 2017-09-13 at 11:34 +0100, Tvrtko Ursulin wrote:
>> +static int i915_pmu_event_init(struct perf_event *event)
>> +{
>> +       struct drm_i915_private *i915 =
>> +               container_of(event->pmu, typeof(*i915), pmu.base);
>> +       int cpu, ret;
>> +
>> +       if (event->attr.type != event->pmu->type)
>> +               return -ENOENT;
>> +
>> +       /* unsupported modes and filters */
>> +       if (event->attr.sample_period) /* no sampling */
>> +               return -EINVAL;
>> +
>> +       if (has_branch_stack(event))
>> +               return -EOPNOTSUPP;
>> +
>> +       if (event->cpu < 0)
>> +               return -EINVAL;
>> +
>> +       cpu = cpumask_any_and(&i915_pmu_cpumask,
>> +                             topology_sibling_cpumask(event->cpu));
>> +       if (cpu >= nr_cpu_ids)
>> +               return -ENODEV;
>> +
>> +       ret = 0;
>> +       if (is_engine_event(event)) {
>> +               ret = engine_event_init(event);
>> +       } else switch (event->attr.config) {
>> +       case I915_PMU_ACTUAL_FREQUENCY:
>> +               if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
>> +                       ret = -ENODEV; /* requires a mutex for sampling! */
>> +       case I915_PMU_REQUESTED_FREQUENCY:
>> +       case I915_PMU_ENERGY:
>> +       case I915_PMU_RC6_RESIDENCY:
>> +       case I915_PMU_RC6p_RESIDENCY:
>> +       case I915_PMU_RC6pp_RESIDENCY:
>> +               if (INTEL_GEN(i915) < 6)
>> +                       ret = -ENODEV;
>> +               break;
>> +       }
>> +       if (ret)
>> +               return ret;
> 
> The switch for non-engine events should error out by default:
> 
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c
> b/drivers/gpu/drm/i915/i915_pmu.c
> index d734879..3145e9a 100644
> --- a/drivers/gpu/drm/i915/i915_pmu.c
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -329,6 +329,9 @@ static int i915_pmu_event_init(struct perf_event
> *event)
>                  if (INTEL_GEN(i915) < 6)
>                          ret = -ENODEV;
>                  break;
> +       default:
> +               ret = -ENOENT;
> +               break;
>          }
>          if (ret)
>                  return ret;
> 
> 
> Otherwise user may try to enable non-existing metric (> I915_PMU_LAST)
> and eventually will be subject to kernel panic on
> i915_pmu_enable/disable during refcount operations. And we need to have
> an IGT test to check that.

Well spotted, I'll add the test for this. I am working on tests at the 
moment anyway. It is taking a bit longer than expected due to finding 
issues, not limited to this one.
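
Roughly, the new check would just poke the i915 PMU with a config past
anything we define and expect the open to be cleanly rejected (a sketch
only, not actual IGT code; the PMU type would normally be read from
/sys/bus/event_source/devices/i915/type):

#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

static int perf_open(struct perf_event_attr *attr)
{
        return syscall(__NR_perf_event_open, attr, -1, 0, -1, 0);
}

static void test_invalid_config(int i915_pmu_type)
{
        struct perf_event_attr attr;
        int fd;

        memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        attr.type = i915_pmu_type;
        attr.config = ~0ULL; /* well past any defined counter */

        fd = perf_open(&attr);
        assert(fd < 0); /* must be rejected at open time, not oops later */
}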

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 02/11] drm/i915: Add intel_energy_uJ
  2017-09-15  6:56     ` Tvrtko Ursulin
@ 2017-09-15  8:51       ` Chris Wilson
  2017-09-15 10:07         ` Tvrtko Ursulin
  0 siblings, 1 reply; 56+ messages in thread
From: Chris Wilson @ 2017-09-15  8:51 UTC (permalink / raw)
  To: Tvrtko Ursulin, Ville Syrjälä, Tvrtko Ursulin
  Cc: Peter Zijlstra, Intel-gfx

Quoting Tvrtko Ursulin (2017-09-15 07:56:00)
> 
> On 14/09/2017 21:36, Ville Syrjälä wrote:
> > On Mon, Sep 11, 2017 at 04:25:50PM +0100, Tvrtko Ursulin wrote:
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >> Extract code from i915_energy_uJ (debugfs) so it can be used by
> >> other callers in future patches.
> >>
> >> v2: Rebase.
> >>
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> ---
> >>   drivers/gpu/drm/i915/i915_debugfs.c | 17 +----------------
> >>   drivers/gpu/drm/i915/i915_drv.h     |  2 ++
> >>   drivers/gpu/drm/i915/intel_pm.c     | 25 +++++++++++++++++++++++++
> >>   3 files changed, 28 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> >> index 6338018f655d..b3a4a66bf7c4 100644
> >> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> >> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> >> @@ -2780,26 +2780,11 @@ static int i915_sink_crc(struct seq_file *m, void *data)
> >>   static int i915_energy_uJ(struct seq_file *m, void *data)
> >>   {
> >>      struct drm_i915_private *dev_priv = node_to_i915(m->private);
> >> -    unsigned long long power;
> >> -    u32 units;
> >>   
> >>      if (INTEL_GEN(dev_priv) < 6)
> >>              return -ENODEV;
> >>   
> >> -    intel_runtime_pm_get(dev_priv);
> >> -
> >> -    if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power)) {
> >> -            intel_runtime_pm_put(dev_priv);
> >> -            return -ENODEV;
> >> -    }
> >> -
> >> -    units = (power & 0x1f00) >> 8;
> >> -    power = I915_READ(MCH_SECP_NRG_STTS);
> >> -    power = (1000000 * power) >> units; /* convert to uJ */
> >> -
> >> -    intel_runtime_pm_put(dev_priv);
> >> -
> >> -    seq_printf(m, "%llu", power);
> >> +    seq_printf(m, "%llu", intel_energy_uJ(dev_priv));
> > 
> > Isn't this the same thing as the package energy you get from rapl? Can't
> > we just nuke this private implementation entirely and rely on whatever
> > rapl gives us?
> 
> If so I think we could leave it out of i915 PMU. I tried looking for 
> MCH_SECP_NRG_STTS in bspec but did not find it though? Is your bspec fu 
> perhaps better and you could have a look?

I had it there for the convenience of grabbing everything through the
one interface (which I still think has merit). I don't think I ever compiled
in the rapl user interface...
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 02/11] drm/i915: Add intel_energy_uJ
  2017-09-14 19:49   ` Chris Wilson
@ 2017-09-15  9:18     ` Tvrtko Ursulin
  0 siblings, 0 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-15  9:18 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx; +Cc: Peter Zijlstra


On 14/09/2017 20:49, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-09-11 16:25:50)
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index 60461f49936b..ff67df8d99fa 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -9369,3 +9369,28 @@ u64 intel_rc6_residency_ns(struct drm_i915_private *dev_priv,
>>   
>>          return res;
>>   }
>> +
>> +unsigned long long intel_energy_uJ(struct drm_i915_private *dev_priv)
>> +{
>> +       unsigned long long power;
>> +       unsigned long units;
>> +
>> +       if (GEM_WARN_ON(INTEL_GEN(dev_priv) < 6))
>> +               return 0;
>> +
>> +       intel_runtime_pm_get(dev_priv);
> 
> Same question on lifting the rpm_get to the caller?

Makes sense, will do both.
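
Just to illustrate the direction (a sketch, not the actual respin - the
callers would then be responsible for holding the runtime PM wakeref
around the call):

unsigned long long intel_energy_uJ(struct drm_i915_private *dev_priv)
{
        unsigned long long power;
        unsigned long units;

        if (GEM_WARN_ON(INTEL_GEN(dev_priv) < 6))
                return 0;

        /* Caller is expected to hold a runtime PM reference. */
        if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power))
                return 0;

        units = (power & 0x1f00) >> 8;
        power = I915_READ(MCH_SECP_NRG_STTS);

        return (1000000 * power) >> units; /* convert to uJ */
}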

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC v5 05/11] drm/i915/pmu: Suspend sampling when GPU is idle
  2017-09-14 19:57     ` Chris Wilson
@ 2017-09-15  9:22       ` Tvrtko Ursulin
  0 siblings, 0 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-15  9:22 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx


On 14/09/2017 20:57, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-09-13 11:34:17)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> If only a subset of events is enabled we can afford to suspend
>> the sampling timer when the GPU is idle and so save some cycles
>> and power.
>>
>> v2: Rebase and limit timer even more.
>> v3: Rebase.
>> v4: Rebase.
>> v5: Skip action if perf PMU failed to register.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h         |  8 ++++
>>   drivers/gpu/drm/i915/i915_gem.c         |  1 +
>>   drivers/gpu/drm/i915/i915_gem_request.c |  1 +
>>   drivers/gpu/drm/i915/i915_pmu.c         | 70 ++++++++++++++++++++++++++++-----
>>   4 files changed, 70 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 62646b8dfb7a..70be8c5d9a65 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2244,6 +2244,10 @@ struct i915_pmu {
>>           */
>>          unsigned int enable_count[I915_PMU_MASK_BITS];
>>          /**
>> +        * @timer_enabled: Should the internal sampling timer be running.
>> +        */
>> +       bool timer_enabled;
>> +       /**
>>           * @sample: Current counter value for i915 events which need sampling.
>>           *
>>           * These counters are updated from the i915 PMU sampling timer.
>> @@ -3989,9 +3993,13 @@ extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
>>   #ifdef CONFIG_PERF_EVENTS
>>   extern void i915_pmu_register(struct drm_i915_private *i915);
>>   extern void i915_pmu_unregister(struct drm_i915_private *i915);
>> +extern void i915_pmu_gt_idle(struct drm_i915_private *i915);
>> +extern void i915_pmu_gt_active(struct drm_i915_private *i915);
>>   #else
>>   static inline void i915_pmu_register(struct drm_i915_private *i915) {}
>>   static inline void i915_pmu_unregister(struct drm_i915_private *i915) {}
>> +static inline void i915_pmu_gt_idle(struct drm_i915_private *i915) {}
>> +static inline void i915_pmu_gt_active(struct drm_i915_private *i915) {}
>>   #endif
>>   
>>   /* i915_suspend.c */
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index f445587c1a4b..201b09eda93b 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -3227,6 +3227,7 @@ i915_gem_idle_work_handler(struct work_struct *work)
>>   
>>          intel_engines_mark_idle(dev_priv);
>>          i915_gem_timelines_mark_idle(dev_priv);
>> +       i915_pmu_gt_idle(dev_priv);
>>   
>>          GEM_BUG_ON(!dev_priv->gt.awake);
>>          dev_priv->gt.awake = false;
>> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
>> index 813a3b546d6e..18a1e379253e 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_request.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
>> @@ -258,6 +258,7 @@ static void mark_busy(struct drm_i915_private *i915)
>>          i915_update_gfx_val(i915);
>>          if (INTEL_GEN(i915) >= 6)
>>                  gen6_rps_busy(i915);
>> +       i915_pmu_gt_active(i915);
>>   
>>          queue_delayed_work(i915->wq,
>>                             &i915->gt.retire_work,
>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
>> index e82648e6635b..22246918757c 100644
>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>> @@ -90,6 +90,52 @@ static unsigned int event_enabled_bit(struct perf_event *event)
>>          return config_enabled_bit(event->attr.config);
>>   }
>>   
>> +static bool pmu_needs_timer(struct drm_i915_private *i915, bool gpu_active)
>> +{
>> +       u64 enable = i915->pmu.enable;
>> +
>> +       enable &= config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY) |
>> +                 config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY) |
>> +                 ENGINE_SAMPLE_MASK;
>> +
>> +       if (!gpu_active)
>> +               enable &= ~ENGINE_SAMPLE_MASK;
> 
> Ok.

I'll add a comment here. It gets even more advanced when busy stats are 
added.
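
Something along these lines perhaps (wording of the comments is not final):

static bool pmu_needs_timer(struct drm_i915_private *i915, bool gpu_active)
{
        u64 enable = i915->pmu.enable;

        /*
         * Only the frequency samplers and the engine samplers need the
         * timer at all, so mask out everything else.
         */
        enable &= config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY) |
                  config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY) |
                  ENGINE_SAMPLE_MASK;

        /*
         * When the GPU is idle there is nothing for the engine samplers
         * to sample, so they do not need the timer running.
         */
        if (!gpu_active)
                enable &= ~ENGINE_SAMPLE_MASK;

        /* If some bits remain, the sampling timer needs to run. */
        return enable;
}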

>> +
>> +       return enable;
>> +}
>> +
>> +void i915_pmu_gt_idle(struct drm_i915_private *i915)
>> +{
>> +       if (!i915->pmu.base.event_init)
>> +               return;
>> +
>> +       spin_lock_irq(&i915->pmu.lock);
>> +       /*
>> +        * Signal sampling timer to stop if only engine events are enabled and
>> +        * GPU went idle.
>> +        */
>> +       i915->pmu.timer_enabled = pmu_needs_timer(i915, false);
>> +       spin_unlock_irq(&i915->pmu.lock);
>> +}
>> +
>> +void i915_pmu_gt_active(struct drm_i915_private *i915)
>> +{
>> +       if (!i915->pmu.base.event_init)
>> +               return;
>> +
>> +       spin_lock_irq(&i915->pmu.lock);
>> +       /*
>> +        * Re-enable sampling timer when GPU goes active.
>> +        */
>> +       if (!i915->pmu.timer_enabled && pmu_needs_timer(i915, true)) {
>> +               i915->pmu.timer_enabled = true;
>> +               hrtimer_start_range_ns(&i915->pmu.timer,
>> +                                      ns_to_ktime(PERIOD), 0,
>> +                                      HRTIMER_MODE_REL_PINNED);
>> +       }
> 
> Duplication here, __i915_pmu_gt_active() for the critical section. Then
> also helps allay fears that we hold the spinlock which is not clear from
> the diff context below.

Agreed, good cleanup suggestion.
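
I.e. roughly (a sketch; the helper name here is made up):

static void __i915_pmu_maybe_start_timer(struct drm_i915_private *i915)
{
        /* Caller must hold i915->pmu.lock. */
        if (!i915->pmu.timer_enabled && pmu_needs_timer(i915, true)) {
                i915->pmu.timer_enabled = true;
                hrtimer_start_range_ns(&i915->pmu.timer,
                                       ns_to_ktime(PERIOD), 0,
                                       HRTIMER_MODE_REL_PINNED);
        }
}

void i915_pmu_gt_active(struct drm_i915_private *i915)
{
        if (!i915->pmu.base.event_init)
                return;

        spin_lock_irq(&i915->pmu.lock);
        __i915_pmu_maybe_start_timer(i915);
        spin_unlock_irq(&i915->pmu.lock);
}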

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 07/11] drm/i915: Engine busy time tracking
  2017-09-14 20:16   ` Chris Wilson
@ 2017-09-15  9:45     ` Tvrtko Ursulin
  0 siblings, 0 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-15  9:45 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx; +Cc: Peter Zijlstra


On 14/09/2017 21:16, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-09-11 16:25:55)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Track total time requests have been executing on the hardware.
>>
>> We add new kernel API to allow software tracking of time GPU
>> engines are spending executing requests.
>>
>> Both per-engine and global API is added with the latter also
>> being exported for use by external users.
>>
>> v2:
>>   * Squashed with the internal API.
>>   * Dropped static key.
>>   * Made per-engine.
>>   * Store time in monotonic ktime.
>>
>> v3: Moved stats clearing to disable.
>>
>> v4:
>>   * Comments.
>>   * Don't export the API just yet.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_engine_cs.c  | 141 ++++++++++++++++++++++++++++++++
>>   drivers/gpu/drm/i915/intel_lrc.c        |   2 +
>>   drivers/gpu/drm/i915/intel_ringbuffer.h |  81 ++++++++++++++++++
>>   3 files changed, 224 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
>> index dbc7abd65f33..f7dba176989c 100644
>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>> @@ -232,6 +232,8 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
>>          /* Nothing to do here, execute in order of dependencies */
>>          engine->schedule = NULL;
>>   
>> +       spin_lock_init(&engine->stats.lock);
>> +
>>          ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
>>   
>>          dev_priv->engine_class[info->class][info->instance] = engine;
>> @@ -1417,6 +1419,145 @@ bool intel_engine_can_store_dword(struct intel_engine_cs *engine)
>>          }
>>   }
>>   
>> +/**
>> + * intel_enable_engine_stats() - Enable engine busy tracking on engine
>> + * @engine: engine to enable stats collection
>> + *
>> + * Start collecting the engine busyness data for @engine.
>> + *
>> + * Returns 0 on success or a negative error code.
>> + */
>> +int intel_enable_engine_stats(struct intel_engine_cs *engine)
>> +{
>> +       unsigned long flags;
>> +
>> +       if (!i915.enable_execlists)
>> +               return -ENODEV;
>> +
>> +       spin_lock_irqsave(&engine->stats.lock, flags);
>> +       if (engine->stats.enabled == ~0)
>> +               goto busy;
>> +       engine->stats.enabled++;
>> +       spin_unlock_irqrestore(&engine->stats.lock, flags);
>> +
>> +       return 0;
>> +
>> +busy:
>> +       spin_unlock_irqrestore(&engine->stats.lock, flags);
>> +
>> +       return -EBUSY;
>> +}
>> +
>> +/**
>> + * intel_disable_engine_stats() - Disable engine busy tracking on engine
>> + * @engine: engine to disable stats collection
>> + *
>> + * Stops collecting the engine busyness data for @engine.
>> + */
>> +void intel_disable_engine_stats(struct intel_engine_cs *engine)
>> +{
>> +       unsigned long flags;
>> +
>> +       if (!i915.enable_execlists)
>> +               return;
>> +
>> +       spin_lock_irqsave(&engine->stats.lock, flags);
>> +       WARN_ON_ONCE(engine->stats.enabled == 0);
>> +       if (--engine->stats.enabled == 0) {
> 
> Saturation protection on inc, but not on dec?

On inc yes, since it can conceivably be triggered from the outside, even 
from userspace via the PMU API. Dec only has a WARN_ON_ONCE, on the 
thinking that it can only be a programming error to hit that one.

> You might as well just use refcount_t.

I dislike needless atomics inside atomics, since I need the spinlock anyway.

>> +               engine->stats.ref = 0;
>> +               engine->stats.start = engine->stats.total = 0;
>> +       }
>> +       spin_unlock_irqrestore(&engine->stats.lock, flags);
>> +}
>> +
>> +/**
>> + * intel_enable_engines_stats() - Enable engine busy tracking on all engines
>> + * @dev_priv: i915 device private
>> + *
>> + * Start collecting the engine busyness data for all engines.
>> + *
>> + * Returns 0 on success or a negative error code.
>> + */
>> +int intel_enable_engines_stats(struct drm_i915_private *dev_priv)
>> +{
>> +       struct intel_engine_cs *engine;
>> +       enum intel_engine_id id;
>> +       int ret = 0;
>> +
>> +       if (!i915.enable_execlists)
>> +               return -ENODEV;
>> +
>> +       for_each_engine(engine, dev_priv, id) {
>> +               ret = intel_enable_engine_stats(engine);
>> +               if (WARN_ON_ONCE(ret))
>> +                       break;
> 
> Doesn't the failure here only lead to more failure? The only failure is
> counter saturation, and by not handling that failure you leak the
> earlier refs.

Oopsie, this is a bug.
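
Will fix it by unwinding the references taken before the failing engine,
roughly (sketch only):

int intel_enable_engines_stats(struct drm_i915_private *dev_priv)
{
        struct intel_engine_cs *engine, *e;
        enum intel_engine_id id, tmp;
        int ret;

        if (!i915.enable_execlists)
                return -ENODEV;

        for_each_engine(engine, dev_priv, id) {
                ret = intel_enable_engine_stats(engine);
                if (ret)
                        goto unwind;
        }

        return 0;

unwind:
        /* Drop the references already taken so nothing is leaked. */
        for_each_engine(e, dev_priv, tmp) {
                if (e == engine)
                        break;
                intel_disable_engine_stats(e);
        }

        return ret;
}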

> 
>> +       }
>> +
>> +       return ret;
>> +}
>> +
>> +/**
>> + * intel_disable_engines_stats() - Disable engine busy tracking on all engines
>> + * @dev_priv: i915 device private
>> + *
>> + * Stops collecting the engine busyness data for all engines.
>> + */
>> +void intel_disable_engines_stats(struct drm_i915_private *dev_priv)
>> +{
>> +       struct intel_engine_cs *engine;
>> +       enum intel_engine_id id;
>> +
>> +       for_each_engine(engine, dev_priv, id)
>> +               intel_disable_engine_stats(engine);
>> +}
>> +
>> +/**
>> + * intel_engine_get_busy_time() - Return current accumulated engine busyness
>> + * @engine: engine to report on
>> + *
>> + * Returns accumulated time @engine was busy since engine stats were enabled.
>> + */
>> +ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine)
>> +{
>> +       ktime_t total;
>> +       unsigned long flags;
>> +
>> +       spin_lock_irqsave(&engine->stats.lock, flags);
>> +
>> +       total = engine->stats.total;
>> +
>> +       /*
>> +        * If the engine is executing something at the moment
>> +        * add it to the total.
>> +        */
>> +       if (engine->stats.ref)
>> +               total = ktime_add(total,
>> +                                 ktime_sub(ktime_get(), engine->stats.start));
>> +
>> +       spin_unlock_irqrestore(&engine->stats.lock, flags);
>> +
>> +       return total;
>> +}
>> +
>> +/**
>> + * intel_engines_get_busy_time() - Return current accumulated overall engine busyness
>> + * @dev_priv: i915 device private
>> + *
>> + * Returns accumulated time all engines were busy since engine stats were
>> + * enabled.
>> + */
>> +ktime_t intel_engines_get_busy_time(struct drm_i915_private *dev_priv)
>> +{
>> +       struct intel_engine_cs *engine;
>> +       enum intel_engine_id id;
>> +       ktime_t total = 0;
>> +
>> +       for_each_engine(engine, dev_priv, id)
>> +               total = ktime_add(total, intel_engine_get_busy_time(engine));
>> +
>> +       return total;
>> +}
>> +
>>   #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
>>   #include "selftests/mock_engine.c"
>>   #endif
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index b61fb09024c3..00fcbde998fc 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -310,12 +310,14 @@ execlists_context_status_change(struct drm_i915_gem_request *rq,
>>   static inline void
>>   execlists_context_schedule_in(struct drm_i915_gem_request *rq)
>>   {
>> +       intel_engine_context_in(rq->engine);
>>          execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
>>   }
>>   
>>   static inline void
>>   execlists_context_schedule_out(struct drm_i915_gem_request *rq)
>>   {
>> +       intel_engine_context_out(rq->engine);
>>          execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
>>   }
>>   
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> index cf095b9386f4..f618c5f98edf 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> @@ -463,6 +463,34 @@ struct intel_engine_cs {
>>           * certain bits to encode the command length in the header).
>>           */
>>          u32 (*get_cmd_length_mask)(u32 cmd_header);
>> +
>> +       struct {
>> +              /**
>> +               * @lock: Lock protecting the below fields.
>> +               */
>> +               spinlock_t lock;
>> +               /**
>> +                * @enabled: Reference count indicating number of listeners.
>> +                */
>> +               unsigned int enabled;
>> +               /**
>> +                * @ref: Number of contexts currently scheduled in.
>> +                */
>> +               unsigned int ref;
> 
> active?

I disliked the name as well, yes, active is much better.
> 
>> +               /**
>> +                * @start: Timestamp of the last idle to active transition.
>> +                *
>> +                * Idle is defined as ref == 0, active is ref > 0.
>> +                */
>> +               ktime_t start;
>> +               /**
>> +                * @total: Total time this engine was busy.
>> +                *
>> +                * Accumulated time not counting the most recent block in cases
>> +                * where engine is currently busy (ref > 0).
>> +                */
>> +               ktime_t total;
>> +       } stats;
>>   };
>>   
>>   static inline unsigned int
>> @@ -762,4 +790,57 @@ bool intel_engine_can_store_dword(struct intel_engine_cs *engine);
>>   struct intel_engine_cs *
>>   intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
>>   
>> +static inline void intel_engine_context_in(struct intel_engine_cs *engine)
>> +{
>> +       unsigned long flags;
>> +
>> +       if (READ_ONCE(engine->stats.enabled) == 0)
>> +               return;
>> +
>> +       spin_lock_irqsave(&engine->stats.lock, flags);
>> +
>> +       if (engine->stats.enabled > 0) {
>> +               if (engine->stats.ref++ == 0)
>> +                       engine->stats.start = ktime_get();
>> +               GEM_BUG_ON(engine->stats.ref == 0);
>> +       }
>> +
>> +       spin_unlock_irqrestore(&engine->stats.lock, flags);
>> +}
>> +
>> +static inline void intel_engine_context_out(struct intel_engine_cs *engine)
>> +{
>> +       unsigned long flags;
>> +
>> +       if (READ_ONCE(engine->stats.enabled) == 0)
>> +               return;
>> +
>> +       spin_lock_irqsave(&engine->stats.lock, flags);
>> +
>> +       if (engine->stats.enabled > 0) {
>> +               /*
>> +                * After turning on engine stats, context out might be the
>> +                * first event which then needs to be ignored (ref == 0).
>> +                */
>> +               if (engine->stats.ref && --engine->stats.ref == 0) {
>> +                       ktime_t last = ktime_sub(ktime_get(),
>> +                                                engine->stats.start);
> 
> s/last/this/ ? You adding in the time elapsed for the current activity.

Hm, I don't know. It is kind of not "this" at this point, since the context 
is out and finished. "Last" still sounds better to me.

> 
>> +
>> +                       engine->stats.total = ktime_add(engine->stats.total,
>> +                                                       last);
>> +               }
>> +       }
>> +
>> +       spin_unlock_irqrestore(&engine->stats.lock, flags);
> 
> Only slight annoyance is that we do out before we process in, so if we
> only fill slot0 every time, we end up with a pair of ktime_get()s we
> didn't need.

You mean a continuous stream of requests, where all the times we would 
really need are the start of the first and the end of the last?

I don't have immediate ideas on how to implement this. However, with that 
approach we would also need to accept accounting the time spent inside 
the tasklet, and the tasklet scheduling latency, as GPU busyness, however 
small this time might be.

Either way I'd prefer to leave thinking about this for some future time.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 08/11] drm/i915: Export engine busy stats in debugfs
  2017-09-14 20:17   ` Chris Wilson
@ 2017-09-15  9:46     ` Tvrtko Ursulin
  0 siblings, 0 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-15  9:46 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx; +Cc: Peter Zijlstra


On 14/09/2017 21:17, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-09-11 16:25:56)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Export the stats added in the previous patch in debugfs.
>>
>> Number of active clients reading this data is tracked and the
>> static key is only enabled whilst there are some.
>>
>> Userspace is intended to keep the file descriptor open, seeking
>> to the beginning of the file periodically, and re-reading the
>> stats.
>>
>> This is because the underlying implementation is costly on every
>> first open/last close transition, and also, because the stats
>> exported mostly make sense when they are considered relative to
>> the previous sample.
>>
>> File lists nanoseconds each engine was active since the tracking
>> has started.
> 
> Purpose? It's a debug API, so what are we debugging and by whom?

Now that perf stat works it has very little value indeed. Probably 
serves as a reason to have the intel_enable_engines_stats API in. I 
could drop it and move that API to the same patch which exports it?
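
For reference, the usage pattern from the commit message boils down to
something like this (a sketch; the debugfs file name below is
illustrative):

#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>

int main(void)
{
        /* Illustrative path - the real name comes from the debugfs patch. */
        int fd = open("/sys/kernel/debug/dri/0/i915_engine_stats", O_RDONLY);
        char buf[4096];
        ssize_t len;

        if (fd < 0)
                return 1;

        for (;;) {
                /* Rewind and re-read - first open/last close is the costly part. */
                if (lseek(fd, 0, SEEK_SET) < 0)
                        break;
                len = read(fd, buf, sizeof(buf) - 1);
                if (len <= 0)
                        break;
                buf[len] = '\0';
                /* parse per-engine busy nanoseconds from buf here */
                sleep(1);
        }

        close(fd);
        return 0;
}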

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 10/11] drm/i915: Export engine stats API to other users
  2017-09-14 20:26   ` Chris Wilson
@ 2017-09-15  9:49     ` Tvrtko Ursulin
  2017-09-19 19:50       ` Ben Widawsky
  0 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-15  9:49 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx
  Cc: Peter Zijlstra, Ben Widawsky, Ben Widawsky


On 14/09/2017 21:26, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-09-11 16:25:58)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Other kernel users might want to look at total GPU busyness
>> in order to implement things like package power distribution
>> algorithms more efficiently.
> 
> Who are we exporting these symbols to? Will you not need all the module
> ref handling and load ordering like around ips and audio?

Hm yes indeed, I forgot about that.

Perhaps Ben could comment on who is the user. If it is purely for 
internal explorations, I'll stick the patch at the end of the series as 
it is. If it has a more serious user I would need to implement a proper 
solution.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC v3 11/11] drm/i915: Gate engine stats collection with a static key
  2017-09-14 20:22     ` Chris Wilson
@ 2017-09-15  9:51       ` Tvrtko Ursulin
  0 siblings, 0 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-15  9:51 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx


On 14/09/2017 21:22, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-09-13 13:18:19)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> This reduces the cost of the software engine busyness tracking
>> to a single no-op instruction when there are no listeners.
>>
>> v2: Rebase and some comments.
>> v3: Rebase.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_pmu.c         | 54 +++++++++++++++++++++++--
>>   drivers/gpu/drm/i915/intel_engine_cs.c  | 17 ++++++++
>>   drivers/gpu/drm/i915/intel_ringbuffer.h | 70 ++++++++++++++++++++++-----------
>>   3 files changed, 113 insertions(+), 28 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
>> index 3c0c5d0549b3..d734879e67ee 100644
>> --- a/drivers/gpu/drm/i915/i915_pmu.c
>> +++ b/drivers/gpu/drm/i915/i915_pmu.c
>> @@ -460,11 +460,17 @@ static void i915_pmu_enable(struct perf_event *event)
>>                  GEM_BUG_ON(sample >= I915_PMU_SAMPLE_BITS);
>>                  GEM_BUG_ON(engine->pmu.enable_count[sample] == ~0);
>>                  if (engine->pmu.enable_count[sample]++ == 0) {
>> +                       /*
>> +                        * Enable engine busy stats tracking if needed or
>> +                        * alternatively cancel the scheduled disabling of the
>> +                        * same.
>> +                        */
>>                          if (engine_needs_busy_stats(engine) &&
>>                              !engine->pmu.busy_stats) {
>> -                               engine->pmu.busy_stats =
>> -                                       intel_enable_engine_stats(engine) == 0;
>> -                               WARN_ON_ONCE(!engine->pmu.busy_stats);
>> +                               engine->pmu.busy_stats = true;
>> +                               if (!cancel_delayed_work(&engine->pmu.disable_busy_stats))
>> +                                       queue_work(i915->wq,
>> +                                               &engine->pmu.enable_busy_stats);
> 
> The only users of i915->wq are supposed to be dependent on struct_mutex,
> it is limited to a single job under the presumption that each worker
> would be serialised by that mutex.

Good point, so I need a dedicated wq. Might still want to make it 
ordered since they themselves will be serialised by the global static 
key mutex.
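
I.e. something like this at registration time (a sketch; the field name
is made up):

        /*
         * A dedicated ordered workqueue just for the busy stats
         * enable/disable workers, so they neither depend on i915->wq
         * nor race with each other.
         */
        i915->pmu.wq = alloc_ordered_workqueue("i915-pmu", 0);
        if (!i915->pmu.wq)
                return;

        /* The enable path then queues onto it instead of i915->wq: */
        queue_work(i915->pmu.wq, &engine->pmu.enable_busy_stats);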

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 02/11] drm/i915: Add intel_energy_uJ
  2017-09-15  8:51       ` Chris Wilson
@ 2017-09-15 10:07         ` Tvrtko Ursulin
  2017-09-15 10:34           ` Ville Syrjälä
  0 siblings, 1 reply; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-15 10:07 UTC (permalink / raw)
  To: Chris Wilson, Ville Syrjälä, Tvrtko Ursulin
  Cc: Peter Zijlstra, Intel-gfx


On 15/09/2017 09:51, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-09-15 07:56:00)
>>
>> On 14/09/2017 21:36, Ville Syrjälä wrote:
>>> On Mon, Sep 11, 2017 at 04:25:50PM +0100, Tvrtko Ursulin wrote:
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> Extract code from i915_energy_uJ (debugfs) so it can be used by
>>>> other callers in future patches.
>>>>
>>>> v2: Rebase.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> ---
>>>>    drivers/gpu/drm/i915/i915_debugfs.c | 17 +----------------
>>>>    drivers/gpu/drm/i915/i915_drv.h     |  2 ++
>>>>    drivers/gpu/drm/i915/intel_pm.c     | 25 +++++++++++++++++++++++++
>>>>    3 files changed, 28 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>>>> index 6338018f655d..b3a4a66bf7c4 100644
>>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>>> @@ -2780,26 +2780,11 @@ static int i915_sink_crc(struct seq_file *m, void *data)
>>>>    static int i915_energy_uJ(struct seq_file *m, void *data)
>>>>    {
>>>>       struct drm_i915_private *dev_priv = node_to_i915(m->private);
>>>> -    unsigned long long power;
>>>> -    u32 units;
>>>>    
>>>>       if (INTEL_GEN(dev_priv) < 6)
>>>>               return -ENODEV;
>>>>    
>>>> -    intel_runtime_pm_get(dev_priv);
>>>> -
>>>> -    if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power)) {
>>>> -            intel_runtime_pm_put(dev_priv);
>>>> -            return -ENODEV;
>>>> -    }
>>>> -
>>>> -    units = (power & 0x1f00) >> 8;
>>>> -    power = I915_READ(MCH_SECP_NRG_STTS);
>>>> -    power = (1000000 * power) >> units; /* convert to uJ */
>>>> -
>>>> -    intel_runtime_pm_put(dev_priv);
>>>> -
>>>> -    seq_printf(m, "%llu", power);
>>>> +    seq_printf(m, "%llu", intel_energy_uJ(dev_priv));
>>>
>>> Isn't this the same thing as the package energy you get from rapl? Can't
>>> we just nuke this private implementation entirely and rely on whatever
>>> rapl gives us?
>>
>> If so I think we could leave it out of i915 PMU. I tried looking for
>> MCH_SECP_NRG_STTS in bspec but did not find it though? Is your bspec fu
>> perhaps better and you could have a look?
> 
> I had it there for the convenience of grabbing everything through the
> one interface (which I still think has merit). I don't think I ever compiled
> in the rapl user interface...

You mean the RAPL PMU, PERF_EVENTS_INTEL_RAPL?

If it provides the same thing, and in fact even more counters than this 
RFC does, then I think it is not that much more difficult to use two PMUs 
from things like intel-gpu-overlay or something.
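
From the tool side it is really just two perf_event_open() calls with
different type values (a sketch; the config values below are placeholders
which would come from the respective sysfs events directories):

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

static int read_pmu_type(const char *path)
{
        int type = -1;
        FILE *f = fopen(path, "r");

        if (f) {
                if (fscanf(f, "%d", &type) != 1)
                        type = -1;
                fclose(f);
        }

        return type;
}

static int open_counter(int type, unsigned long long config)
{
        struct perf_event_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        attr.type = type;
        attr.config = config;

        return syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
}

int main(void)
{
        int i915 = read_pmu_type("/sys/bus/event_source/devices/i915/type");
        int rapl = read_pmu_type("/sys/bus/event_source/devices/power/type");
        /* Placeholder configs - real values come from sysfs, not this sketch. */
        int busy_fd = open_counter(i915, 0);
        int energy_fd = open_counter(rapl, 0);
        unsigned long long v;

        if (read(busy_fd, &v, sizeof(v)) == sizeof(v))
                printf("i915 counter: %llu\n", v);
        if (read(energy_fd, &v, sizeof(v)) == sizeof(v))
                printf("rapl counter: %llu\n", v);

        return 0;
}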

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 02/11] drm/i915: Add intel_energy_uJ
  2017-09-15 10:07         ` Tvrtko Ursulin
@ 2017-09-15 10:34           ` Ville Syrjälä
  2017-09-15 10:38             ` Chris Wilson
  0 siblings, 1 reply; 56+ messages in thread
From: Ville Syrjälä @ 2017-09-15 10:34 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Peter Zijlstra, Intel-gfx

On Fri, Sep 15, 2017 at 11:07:21AM +0100, Tvrtko Ursulin wrote:
> 
> On 15/09/2017 09:51, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2017-09-15 07:56:00)
> >>
> >> On 14/09/2017 21:36, Ville Syrjälä wrote:
> >>> On Mon, Sep 11, 2017 at 04:25:50PM +0100, Tvrtko Ursulin wrote:
> >>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>>>
> >>>> Extract code from i915_energy_uJ (debugfs) so it can be used by
> >>>> other callers in future patches.
> >>>>
> >>>> v2: Rebase.
> >>>>
> >>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>>> ---
> >>>>    drivers/gpu/drm/i915/i915_debugfs.c | 17 +----------------
> >>>>    drivers/gpu/drm/i915/i915_drv.h     |  2 ++
> >>>>    drivers/gpu/drm/i915/intel_pm.c     | 25 +++++++++++++++++++++++++
> >>>>    3 files changed, 28 insertions(+), 16 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> >>>> index 6338018f655d..b3a4a66bf7c4 100644
> >>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> >>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> >>>> @@ -2780,26 +2780,11 @@ static int i915_sink_crc(struct seq_file *m, void *data)
> >>>>    static int i915_energy_uJ(struct seq_file *m, void *data)
> >>>>    {
> >>>>       struct drm_i915_private *dev_priv = node_to_i915(m->private);
> >>>> -    unsigned long long power;
> >>>> -    u32 units;
> >>>>    
> >>>>       if (INTEL_GEN(dev_priv) < 6)
> >>>>               return -ENODEV;
> >>>>    
> >>>> -    intel_runtime_pm_get(dev_priv);
> >>>> -
> >>>> -    if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power)) {
> >>>> -            intel_runtime_pm_put(dev_priv);
> >>>> -            return -ENODEV;
> >>>> -    }
> >>>> -
> >>>> -    units = (power & 0x1f00) >> 8;
> >>>> -    power = I915_READ(MCH_SECP_NRG_STTS);
> >>>> -    power = (1000000 * power) >> units; /* convert to uJ */
> >>>> -
> >>>> -    intel_runtime_pm_put(dev_priv);
> >>>> -
> >>>> -    seq_printf(m, "%llu", power);
> >>>> +    seq_printf(m, "%llu", intel_energy_uJ(dev_priv));
> >>>
> >>> Isn't this the same thing as the package energy you get from rapl? Can't
> >>> we just nuke this private implementation entirely and rely on whatever
> >>> rapl gives us?
> >>
> >> If so I think we could leave it out of i915 PMU. I tried looking for
> >> MCH_SECP_NRG_STTS in bspec but did not find it though? Is your bspec fu
> >> perhaps better and you could have a look?
> > 
> > I had it there for the convenience of grabbing everything through the
> > one interface (which I still think has merit). I don't think I ever compiled
> > in the rapl user interface...
> 
> You mean the RAPL PMU, PERF_EVENTS_INTEL_RAPL?

I suspect he means sysfs. IIRC this perf+rapl marriage is somewhat
recent (or at least I still remember having to fix it when it was
introduced because it broke the sysfs interface).

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 02/11] drm/i915: Add intel_energy_uJ
  2017-09-15 10:34           ` Ville Syrjälä
@ 2017-09-15 10:38             ` Chris Wilson
  2017-09-15 11:16               ` Tvrtko Ursulin
  0 siblings, 1 reply; 56+ messages in thread
From: Chris Wilson @ 2017-09-15 10:38 UTC (permalink / raw)
  To: Ville Syrjälä, Tvrtko Ursulin; +Cc: Peter Zijlstra, Intel-gfx

Quoting Ville Syrjälä (2017-09-15 11:34:03)
> On Fri, Sep 15, 2017 at 11:07:21AM +0100, Tvrtko Ursulin wrote:
> > 
> > On 15/09/2017 09:51, Chris Wilson wrote:
> > > Quoting Tvrtko Ursulin (2017-09-15 07:56:00)
> > >>
> > >> On 14/09/2017 21:36, Ville Syrjälä wrote:
> > >>> On Mon, Sep 11, 2017 at 04:25:50PM +0100, Tvrtko Ursulin wrote:
> > >>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > >>>>
> > >>>> Extract code from i915_energy_uJ (debugfs) so it can be used by
> > >>>> other callers in future patches.
> > >>>>
> > >>>> v2: Rebase.
> > >>>>
> > >>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > >>>> ---
> > >>>>    drivers/gpu/drm/i915/i915_debugfs.c | 17 +----------------
> > >>>>    drivers/gpu/drm/i915/i915_drv.h     |  2 ++
> > >>>>    drivers/gpu/drm/i915/intel_pm.c     | 25 +++++++++++++++++++++++++
> > >>>>    3 files changed, 28 insertions(+), 16 deletions(-)
> > >>>>
> > >>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > >>>> index 6338018f655d..b3a4a66bf7c4 100644
> > >>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > >>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > >>>> @@ -2780,26 +2780,11 @@ static int i915_sink_crc(struct seq_file *m, void *data)
> > >>>>    static int i915_energy_uJ(struct seq_file *m, void *data)
> > >>>>    {
> > >>>>       struct drm_i915_private *dev_priv = node_to_i915(m->private);
> > >>>> -    unsigned long long power;
> > >>>> -    u32 units;
> > >>>>    
> > >>>>       if (INTEL_GEN(dev_priv) < 6)
> > >>>>               return -ENODEV;
> > >>>>    
> > >>>> -    intel_runtime_pm_get(dev_priv);
> > >>>> -
> > >>>> -    if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power)) {
> > >>>> -            intel_runtime_pm_put(dev_priv);
> > >>>> -            return -ENODEV;
> > >>>> -    }
> > >>>> -
> > >>>> -    units = (power & 0x1f00) >> 8;
> > >>>> -    power = I915_READ(MCH_SECP_NRG_STTS);
> > >>>> -    power = (1000000 * power) >> units; /* convert to uJ */
> > >>>> -
> > >>>> -    intel_runtime_pm_put(dev_priv);
> > >>>> -
> > >>>> -    seq_printf(m, "%llu", power);
> > >>>> +    seq_printf(m, "%llu", intel_energy_uJ(dev_priv));
> > >>>
> > >>> Isn't this the same thing as the package energy you get from rapl? Can't
> > >>> we just nuke this private implementation entirely and rely on whatever
> > >>> rapl gives us?
> > >>
> > >> If so I think we could leave it out of i915 PMU. I tried looking for
> > >> MCH_SECP_NRG_STTS in bspec but did not find it though? Is your bspec fu
> > >> perhaps better and you could have a look?
> > > 
> > > I had it there for the convenience of grabbing everything through the
> > > one interface (which I still think has merit). I don't think I ever compiled
> > > in the rapl user interface...
> > 
> > You mean the RAPL PMU, PERF_EVENTS_INTEL_RAPL?
> 
> I suspect he means sysfs. IIRC this perf+rapl marriage is somewhat
> recent (or at least I still remember having to fix it when it was
> introduced because it broke the sysfs interface).

I hadn't even heard of the perf interface. Yes, if it is available
through perf, that's just what we want.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 02/11] drm/i915: Add intel_energy_uJ
  2017-09-15 10:38             ` Chris Wilson
@ 2017-09-15 11:16               ` Tvrtko Ursulin
  0 siblings, 0 replies; 56+ messages in thread
From: Tvrtko Ursulin @ 2017-09-15 11:16 UTC (permalink / raw)
  To: Chris Wilson, Ville Syrjälä; +Cc: Peter Zijlstra, Intel-gfx


On 15/09/2017 11:38, Chris Wilson wrote:
> Quoting Ville Syrjälä (2017-09-15 11:34:03)
>> On Fri, Sep 15, 2017 at 11:07:21AM +0100, Tvrtko Ursulin wrote:
>>>
>>> On 15/09/2017 09:51, Chris Wilson wrote:
>>>> Quoting Tvrtko Ursulin (2017-09-15 07:56:00)
>>>>>
>>>>> On 14/09/2017 21:36, Ville Syrjälä wrote:
>>>>>> On Mon, Sep 11, 2017 at 04:25:50PM +0100, Tvrtko Ursulin wrote:
>>>>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>>>>
>>>>>>> Extract code from i915_energy_uJ (debugfs) so it can be used by
>>>>>>> other callers in future patches.
>>>>>>>
>>>>>>> v2: Rebase.
>>>>>>>
>>>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>>>> ---
>>>>>>>     drivers/gpu/drm/i915/i915_debugfs.c | 17 +----------------
>>>>>>>     drivers/gpu/drm/i915/i915_drv.h     |  2 ++
>>>>>>>     drivers/gpu/drm/i915/intel_pm.c     | 25 +++++++++++++++++++++++++
>>>>>>>     3 files changed, 28 insertions(+), 16 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>>>>>>> index 6338018f655d..b3a4a66bf7c4 100644
>>>>>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>>>>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>>>>>> @@ -2780,26 +2780,11 @@ static int i915_sink_crc(struct seq_file *m, void *data)
>>>>>>>     static int i915_energy_uJ(struct seq_file *m, void *data)
>>>>>>>     {
>>>>>>>        struct drm_i915_private *dev_priv = node_to_i915(m->private);
>>>>>>> -    unsigned long long power;
>>>>>>> -    u32 units;
>>>>>>>     
>>>>>>>        if (INTEL_GEN(dev_priv) < 6)
>>>>>>>                return -ENODEV;
>>>>>>>     
>>>>>>> -    intel_runtime_pm_get(dev_priv);
>>>>>>> -
>>>>>>> -    if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power)) {
>>>>>>> -            intel_runtime_pm_put(dev_priv);
>>>>>>> -            return -ENODEV;
>>>>>>> -    }
>>>>>>> -
>>>>>>> -    units = (power & 0x1f00) >> 8;
>>>>>>> -    power = I915_READ(MCH_SECP_NRG_STTS);
>>>>>>> -    power = (1000000 * power) >> units; /* convert to uJ */
>>>>>>> -
>>>>>>> -    intel_runtime_pm_put(dev_priv);
>>>>>>> -
>>>>>>> -    seq_printf(m, "%llu", power);
>>>>>>> +    seq_printf(m, "%llu", intel_energy_uJ(dev_priv));
>>>>>>
>>>>>> Isn't this the same thing as the package energy you get from rapl? Can't
>>>>>> we just nuke this private implementation entirely and rely on whatever
>>>>>> rapl gives us?
>>>>>
>>>>> If so I think we could leave it out of i915 PMU. I tried looking for
>>>>> MCH_SECP_NRG_STTS in bspec but did not find it though? Is your bspec fu
>>>>> perhaps better and you could have a look?
>>>>
>>>> I had it there for the convenience of grabbing everything through the
>>>> one interface (which I still think has merit). I don't think I ever compiled
>>>> in the rapl user interface...
>>>
>>> You mean the RAPL PMU, PERF_EVENTS_INTEL_RAPL?
>>
>> I suspect he means sysfs. IIRC this perf+rapl marriage is somewhat
>> recent (or at least I still remember having to fix it when it was
>> introduced because it broke the sysfs interface).
> 
> I hadn't even heard of the perf interface. Yes, if it is available
> through perf, that's just what we want.

Dropping it then. And I've checked in the meantime that it is reporting 
identical numbers to the power/energy-gpu/ counter provided by the 
intel_rapl_perf module.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 10/11] drm/i915: Export engine stats API to other users
  2017-09-15  9:49     ` Tvrtko Ursulin
@ 2017-09-19 19:50       ` Ben Widawsky
  2017-09-19 20:11         ` Rogozhkin, Dmitry V
  0 siblings, 1 reply; 56+ messages in thread
From: Ben Widawsky @ 2017-09-19 19:50 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Ben Widawsky, Peter Zijlstra, Intel-gfx, Pandruvada, Srinivas

On 17-09-15 10:49:56, Tvrtko Ursulin wrote:
>
>On 14/09/2017 21:26, Chris Wilson wrote:
>>Quoting Tvrtko Ursulin (2017-09-11 16:25:58)
>>>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>
>>>Other kernel users might want to look at total GPU busyness
>>>in order to implement things like package power distribution
>>>algorithms more efficiently.
>>
>>Who are we exporting these symbols to? Will you not need all the module
>>ref handling and load ordering like around ips and audio?
>
>Hm yes indeed, I forgot about that.
>
>Perhaps Ben could comment on who is the user. If it is purely for 
>internal explorations, I'll stick the patch at the end of the series 
>as it is. If it has a more serious user I would need to implement a 
>proper solution.
>
>Regards,
>
>Tvrtko

The P-state driver was looking to use this as a way to make determinations 
about how much to limit CPU frequency. Srinivas was privy to the original 
discussion.

-- 
Ben Widawsky, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 10/11] drm/i915: Export engine stats API to other users
  2017-09-19 19:50       ` Ben Widawsky
@ 2017-09-19 20:11         ` Rogozhkin, Dmitry V
  0 siblings, 0 replies; 56+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-09-19 20:11 UTC (permalink / raw)
  To: Widawsky, Benjamin; +Cc: peterz, ben, Intel-gfx, Pandruvada, Srinivas

On Tue, 2017-09-19 at 12:50 -0700, Ben Widawsky wrote:
> On 17-09-15 10:49:56, Tvrtko Ursulin wrote:
> >
> >On 14/09/2017 21:26, Chris Wilson wrote:
> >>Quoting Tvrtko Ursulin (2017-09-11 16:25:58)
> >>>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>>
> >>>Other kernel users might want to look at total GPU busyness
> >>>in order to implement things like package power distribution
> >>>algorithms more efficiently.
> >>
> >>Who are we exporting these symbols to? Will you not need all the module
> >>ref handling and load ordering like around ips and audio?
> >
> >Hm yes indeed, I forgot about that.
> >
> >Perhaps Ben could comment on who is the user. If it is purely for 
> >internal explorations, I'll stick the patch at the end of the series 
> >as it is. If it has a more serious user I would need to implement a 
> >proper solution.
> >
> >Regards,
> >
> >Tvrtko
> 
> P-state driver was looking to use this as a way to make determinations about how
> much to limit CPU frequency. Srinivas was privy to the original discussion
> 

I personally was surprised to see a private API exposed rather than reuse 
of the PMU API. Do they really need a private path?
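
For instance, an in-kernel user could in principle consume the PMU via the
kernel perf API rather than a dedicated export (just a sketch; resolving
the dynamic i915 PMU type and choosing the config are hand-waved here):

        struct perf_event_attr attr = {
                .type   = i915_pmu_type,        /* would need to be looked up */
                .config = 0,                    /* placeholder counter config */
                .size   = sizeof(attr),
        };
        struct perf_event *event;
        u64 count, enabled, running;

        event = perf_event_create_kernel_counter(&attr, 0, NULL, NULL, NULL);
        if (IS_ERR(event))
                return PTR_ERR(event);

        count = perf_event_read_value(event, &enabled, &running);
        /* use 'count' for the power/frequency decision */
        perf_event_release_kernel(event);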

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC 10/11] drm/i915: Export engine stats API to other users
  2017-09-11 15:25 ` [RFC 10/11] drm/i915: Export engine stats API to other users Tvrtko Ursulin
  2017-09-12 18:35   ` Ben Widawsky
  2017-09-14 20:26   ` Chris Wilson
@ 2017-09-29 10:59   ` Joonas Lahtinen
  2 siblings, 0 replies; 56+ messages in thread
From: Joonas Lahtinen @ 2017-09-29 10:59 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx; +Cc: Peter Zijlstra, Ben Widawsky, Ben Widawsky

On Mon, 2017-09-11 at 16:25 +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Other kernel users might want to look at total GPU busyness
> in order to implement things like package power distribution
> algorithms more efficiently.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Ben Widawsky <benjamin.widawsky@intel.com>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/intel_engine_cs.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index f7dba176989c..e2152dd21b4a 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1495,6 +1495,7 @@ int intel_enable_engines_stats(struct drm_i915_private *dev_priv)
>  
>  	return ret;
>  }
> +EXPORT_SYMBOL(intel_enable_engines_stats);

EXPORT_SYMBOL_GPL you mean? x3

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2017-09-29 10:59 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-11 15:25 [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
2017-09-11 15:25 ` [RFC 01/11] drm/i915: Convert intel_rc6_residency_us to ns Tvrtko Ursulin
2017-09-14 19:48   ` Chris Wilson
2017-09-11 15:25 ` [RFC 02/11] drm/i915: Add intel_energy_uJ Tvrtko Ursulin
2017-09-14 19:49   ` Chris Wilson
2017-09-15  9:18     ` Tvrtko Ursulin
2017-09-14 20:36   ` Ville Syrjälä
2017-09-15  6:56     ` Tvrtko Ursulin
2017-09-15  8:51       ` Chris Wilson
2017-09-15 10:07         ` Tvrtko Ursulin
2017-09-15 10:34           ` Ville Syrjälä
2017-09-15 10:38             ` Chris Wilson
2017-09-15 11:16               ` Tvrtko Ursulin
2017-09-11 15:25 ` [RFC 03/11] drm/i915: Extract intel_get_cagf Tvrtko Ursulin
2017-09-14 19:51   ` Chris Wilson
2017-09-11 15:25 ` [RFC 04/11] drm/i915/pmu: Expose a PMU interface for perf queries Tvrtko Ursulin
2017-09-12  2:06   ` Rogozhkin, Dmitry V
2017-09-12 14:59     ` Tvrtko Ursulin
2017-09-13  8:57       ` [RFC v6 " Tvrtko Ursulin
2017-09-13 10:34         ` [RFC v7 " Tvrtko Ursulin
2017-09-15  0:00           ` Rogozhkin, Dmitry V
2017-09-15  7:57             ` Tvrtko Ursulin
2017-09-14 19:46   ` [RFC " Chris Wilson
2017-09-11 15:25 ` [RFC 05/11] drm/i915/pmu: Suspend sampling when GPU is idle Tvrtko Ursulin
2017-09-13 10:34   ` [RFC v5 " Tvrtko Ursulin
2017-09-14 19:57     ` Chris Wilson
2017-09-15  9:22       ` Tvrtko Ursulin
2017-09-11 15:25 ` [RFC 06/11] drm/i915: Wrap context schedule notification Tvrtko Ursulin
2017-09-11 15:25 ` [RFC 07/11] drm/i915: Engine busy time tracking Tvrtko Ursulin
2017-09-14 20:16   ` Chris Wilson
2017-09-15  9:45     ` Tvrtko Ursulin
2017-09-11 15:25 ` [RFC 08/11] drm/i915: Export engine busy stats in debugfs Tvrtko Ursulin
2017-09-14 20:17   ` Chris Wilson
2017-09-15  9:46     ` Tvrtko Ursulin
2017-09-11 15:25 ` [RFC 09/11] drm/i915/pmu: Wire up engine busy stats to PMU Tvrtko Ursulin
2017-09-11 15:25 ` [RFC 10/11] drm/i915: Export engine stats API to other users Tvrtko Ursulin
2017-09-12 18:35   ` Ben Widawsky
2017-09-14 20:26   ` Chris Wilson
2017-09-15  9:49     ` Tvrtko Ursulin
2017-09-19 19:50       ` Ben Widawsky
2017-09-19 20:11         ` Rogozhkin, Dmitry V
2017-09-29 10:59   ` Joonas Lahtinen
2017-09-11 15:25 ` [RFC 11/11] drm/i915: Gate engine stats collection with a static key Tvrtko Ursulin
2017-09-13 12:18   ` [RFC v3 " Tvrtko Ursulin
2017-09-14 20:22     ` Chris Wilson
2017-09-15  9:51       ` Tvrtko Ursulin
2017-09-11 15:50 ` ✗ Fi.CI.BAT: warning for i915 PMU and engine busy stats (rev3) Patchwork
2017-09-12  2:03 ` [RFC v3 00/11] i915 PMU and engine busy stats Rogozhkin, Dmitry V
2017-09-12 14:54   ` Tvrtko Ursulin
2017-09-12 22:01     ` Rogozhkin, Dmitry V
2017-09-13  8:54       ` [RFC v6 04/11] drm/i915/pmu: Expose a PMU interface for perf queries Tvrtko Ursulin
2017-09-13  9:01       ` [RFC v3 00/11] i915 PMU and engine busy stats Tvrtko Ursulin
2017-09-13  9:34 ` ✗ Fi.CI.BAT: warning for i915 PMU and engine busy stats (rev4) Patchwork
2017-09-13 10:46 ` ✗ Fi.CI.BAT: failure for i915 PMU and engine busy stats (rev6) Patchwork
2017-09-13 13:27 ` ✓ Fi.CI.BAT: success for i915 PMU and engine busy stats (rev7) Patchwork
2017-09-13 21:24 ` ✓ Fi.CI.IGT: " Patchwork
