* [RFC v2 00/10] i915 PMU and engine busy stats
@ 2017-08-02 12:32 Tvrtko Ursulin
  2017-08-02 12:32 ` [RFC 01/10] drm/i915: Convert intel_rc6_residency_us to ns Tvrtko Ursulin
                   ` (10 more replies)
  0 siblings, 11 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-08-02 12:32 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Second spin of the i915 PMU series.

This version squashes the fixups I was making against Chris' original PMU
prototype into patch 4 itself. Main changes in the core patch are:

 * Convert to class/instance uAPI.
 * Refactor some code to helpers for clarity.
 * Skip sampling disabled engines.
 * Expose counters in sysfs.
 * Pass in fake regs to avoid null ptr deref in perf core.
 * Use shared driver code for rc6 residency, power and frequency.

The series otherwise begins with three patches which expose some existing code
so it can be used from the PMU implementation.

Software engine busyness patches follow (6-9), which enable a cheaper and more
accurate busyness metric.

Finally, I have made the static key patch the last in the series so any
improvements it brings can be easily looked at separately.

Patch 8 also exports engine busyness in debugfs.

Still pending is a discussion with the perf folks on whether our PMU
usage/implementation is correct. Open questions there are:

 * Are the metrics we are exposing meaningful/appropriate for a software PMU?
 * Do we need / can we get anything useful by enumerating our events in sysfs?
 * Can we have "perf stat" or similar existing tools usefully parse these?
 * I had to create fake pt_regs to pass to perf_event_overflow, is that OK?
 * Can we control sampling frequency as requested by the perf API? We mostly
   do not need rapid polling on these counters.
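For anyone wanting to poke at the class/instance uAPI from userspace, the
event config encoding can be sketched in plain C. The shift and mask values
below are placeholders for illustration only, not the real constants from the
i915_drm.h uAPI header added by this series; the decoder shapes mirror the
engine_event_class/instance/sample helpers in i915_pmu.c:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shift/mask values -- the real ones live in the
 * i915_drm.h uAPI header added by this series. */
#define DEMO_PMU_SAMPLE_MASK  0xff
#define DEMO_PMU_SAMPLE_BITS  8
#define DEMO_PMU_CLASS_SHIFT  16

/* Pack engine class, instance and sample type into a perf config value. */
static uint64_t engine_config(uint8_t klass, uint8_t instance, uint8_t sample)
{
	return ((uint64_t)klass << DEMO_PMU_CLASS_SHIFT) |
	       ((uint64_t)instance << DEMO_PMU_SAMPLE_BITS) |
	       sample;
}

/* Decoders mirroring the helpers in i915_pmu.c. */
static uint8_t config_sample(uint64_t config)
{
	return config & DEMO_PMU_SAMPLE_MASK;
}

static uint8_t config_instance(uint64_t config)
{
	return (config >> DEMO_PMU_SAMPLE_BITS) & 0xff;
}

static uint8_t config_class(uint64_t config)
{
	return (config >> DEMO_PMU_CLASS_SHIFT) & 0xff;
}
```

A tool like gpu-top would put such a config value into perf_event_attr.config
when calling perf_event_open(2) against the i915 PMU type.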

My own high level TODOs:

 * Can we use OA instead of software tracking? On what platforms, with what
   limitations etc? Current thinking is to have this work completely separate.
 * See how to integrate with gpu-top.

I will also send IGT patches for people who are curious to exercise this.

Chris Wilson (1):
  drm/i915: Expose a PMU interface for perf queries

Tvrtko Ursulin (9):
  drm/i915: Convert intel_rc6_residency_us to ns
  drm/i915: Add intel_energy_uJ
  drm/i915: Extract intel_get_cagf
  drm/i915/pmu: Suspend sampling when GPU is idle
  drm/i915: Wrap context schedule notification
  drm/i915: Engine busy time tracking
  drm/i915: Export engine busy stats in debugfs
  drm/i915/pmu: Wire up engine busy stats to PMU
  drm/i915: Gate engine stats collection with a static key

 drivers/gpu/drm/i915/Makefile           |   1 +
 drivers/gpu/drm/i915/i915_debugfs.c     | 114 ++++-
 drivers/gpu/drm/i915/i915_drv.c         |   2 +
 drivers/gpu/drm/i915/i915_drv.h         |  41 +-
 drivers/gpu/drm/i915/i915_gem.c         |   1 +
 drivers/gpu/drm/i915/i915_gem_request.c |   1 +
 drivers/gpu/drm/i915/i915_pmu.c         | 734 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h         |   3 +
 drivers/gpu/drm/i915/intel_engine_cs.c  | 130 ++++++
 drivers/gpu/drm/i915/intel_lrc.c        |  19 +-
 drivers/gpu/drm/i915/intel_pm.c         |  57 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.c |  25 ++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  78 ++++
 include/uapi/drm/i915_drm.h             |  55 +++
 kernel/events/core.c                    |   1 +
 15 files changed, 1224 insertions(+), 38 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_pmu.c

-- 
2.9.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [RFC 01/10] drm/i915: Convert intel_rc6_residency_us to ns
  2017-08-02 12:32 [RFC v2 00/10] i915 PMU and engine busy stats Tvrtko Ursulin
@ 2017-08-02 12:32 ` Tvrtko Ursulin
  2017-08-02 12:32 ` [RFC 02/10] drm/i915: Add intel_energy_uJ Tvrtko Ursulin
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-08-02 12:32 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Will be used for exposing the PMU counters.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |  8 +++++++-
 drivers/gpu/drm/i915/intel_pm.c | 23 +++++++++--------------
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6383940e8d55..d47215e73405 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -4009,9 +4009,15 @@ void vlv_phy_reset_lanes(struct intel_encoder *encoder);
 
 int intel_gpu_freq(struct drm_i915_private *dev_priv, int val);
 int intel_freq_opcode(struct drm_i915_private *dev_priv, int val);
-u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
+u64 intel_rc6_residency_ns(struct drm_i915_private *dev_priv,
 			   const i915_reg_t reg);
 
+static inline u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
+					 const i915_reg_t reg)
+{
+	return DIV_ROUND_UP_ULL(intel_rc6_residency_ns(dev_priv, reg), 1000);
+}
+
 #define I915_READ8(reg)		dev_priv->uncore.funcs.mmio_readb(dev_priv, (reg), true)
 #define I915_WRITE8(reg, val)	dev_priv->uncore.funcs.mmio_writeb(dev_priv, (reg), (val), true)
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 48785ef75d33..bfbce74a4013 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -9179,10 +9179,10 @@ static u64 vlv_residency_raw(struct drm_i915_private *dev_priv,
 	return lower | (u64)upper << 8;
 }
 
-u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
+u64 intel_rc6_residency_ns(struct drm_i915_private *dev_priv,
 			   const i915_reg_t reg)
 {
-	u64 time_hw, units, div;
+	u64 res;
 
 	if (!intel_enable_rc6())
 		return 0;
@@ -9191,22 +9191,17 @@ u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
 
 	/* On VLV and CHV, residency time is in CZ units rather than 1.28us */
 	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
-		units = 1000;
-		div = dev_priv->czclk_freq;
+		res = vlv_residency_raw(dev_priv, reg);
+		res = DIV_ROUND_UP_ULL(res * 1000000, dev_priv->czclk_freq);
 
-		time_hw = vlv_residency_raw(dev_priv, reg);
-	} else if (IS_GEN9_LP(dev_priv)) {
-		units = 1000;
-		div = 1200;		/* 833.33ns */
-
-		time_hw = I915_READ(reg);
 	} else {
-		units = 128000; /* 1.28us */
-		div = 100000;
+		/* 833.33ns units on Gen9LP, 1.28us elsewhere. */
+		unsigned int unit = IS_GEN9_LP(dev_priv) ? 833 : 1280;
 
-		time_hw = I915_READ(reg);
+		res = (u64)I915_READ(reg) * unit;
 	}
 
 	intel_runtime_pm_put(dev_priv);
-	return DIV_ROUND_UP_ULL(time_hw * units, div);
+
+	return res;
 }
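For clarity, the unit conversions this patch performs can be sketched in
user-space C. This assumes czclk_freq is in kHz (so ns = ticks * 1e6 / kHz),
and uses the 833ns / 1280ns fixed tick sizes from the patch; function names
are illustrative, not kernel API:

```c
#include <assert.h>
#include <stdint.h>

#define DIV_ROUND_UP_ULL(n, d) (((n) + (d) - 1) / (d))

/* VLV/CHV: the residency counter ticks at the CZ clock, so with the
 * clock in kHz, ns = ticks * 1000000 / czclk_freq, rounded up. */
static uint64_t vlv_residency_ns(uint64_t raw, uint64_t czclk_freq_khz)
{
	return DIV_ROUND_UP_ULL(raw * 1000000, czclk_freq_khz);
}

/* Other platforms: fixed tick of 833ns (Gen9 LP) or 1.28us elsewhere. */
static uint64_t fixed_unit_residency_ns(uint32_t raw, int is_gen9_lp)
{
	unsigned int unit = is_gen9_lp ? 833 : 1280;

	return (uint64_t)raw * unit;
}

/* The retained intel_rc6_residency_us() wrapper just divides by 1000. */
static uint64_t residency_us(uint64_t ns)
{
	return DIV_ROUND_UP_ULL(ns, 1000);
}
```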
-- 
2.9.4


* [RFC 02/10] drm/i915: Add intel_energy_uJ
  2017-08-02 12:32 [RFC v2 00/10] i915 PMU and engine busy stats Tvrtko Ursulin
  2017-08-02 12:32 ` [RFC 01/10] drm/i915: Convert intel_rc6_residency_us to ns Tvrtko Ursulin
@ 2017-08-02 12:32 ` Tvrtko Ursulin
  2017-08-02 12:32 ` [RFC 03/10] drm/i915: Extract intel_get_cagf Tvrtko Ursulin
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-08-02 12:32 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Extract code from i915_energy_uJ (debugfs) so it can be used by
other callers in future patches.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 14 +-------------
 drivers/gpu/drm/i915/i915_drv.h     |  2 ++
 drivers/gpu/drm/i915/intel_pm.c     | 20 ++++++++++++++++++++
 3 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index ea50c4a1efae..67287ecc2ed1 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2783,23 +2783,11 @@ static int i915_sink_crc(struct seq_file *m, void *data)
 static int i915_energy_uJ(struct seq_file *m, void *data)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
-	u64 power;
-	u32 units;
 
 	if (INTEL_GEN(dev_priv) < 6)
 		return -ENODEV;
 
-	intel_runtime_pm_get(dev_priv);
-
-	rdmsrl(MSR_RAPL_POWER_UNIT, power);
-	power = (power & 0x1f00) >> 8;
-	units = 1000000 / (1 << power); /* convert to uJ */
-	power = I915_READ(MCH_SECP_NRG_STTS);
-	power *= units;
-
-	intel_runtime_pm_put(dev_priv);
-
-	seq_printf(m, "%llu", (long long unsigned)power);
+	seq_printf(m, "%llu", (long long unsigned)intel_energy_uJ(dev_priv));
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d47215e73405..124cbfc9353c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -4018,6 +4018,8 @@ static inline u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
 	return DIV_ROUND_UP_ULL(intel_rc6_residency_ns(dev_priv, reg), 1000);
 }
 
+u64 intel_energy_uJ(struct drm_i915_private *dev_priv);
+
 #define I915_READ8(reg)		dev_priv->uncore.funcs.mmio_readb(dev_priv, (reg), true)
 #define I915_WRITE8(reg, val)	dev_priv->uncore.funcs.mmio_writeb(dev_priv, (reg), (val), true)
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index bfbce74a4013..ad49932b445c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -9205,3 +9205,23 @@ u64 intel_rc6_residency_ns(struct drm_i915_private *dev_priv,
 
 	return res;
 }
+
+u64 intel_energy_uJ(struct drm_i915_private *dev_priv)
+{
+	u64 power;
+	u32 units, val;
+
+	if (GEM_WARN_ON(INTEL_GEN(dev_priv) < 6))
+		return 0;
+
+	intel_runtime_pm_get(dev_priv);
+
+	rdmsrl(MSR_RAPL_POWER_UNIT, power);
+	val = (power >> 8) & 0x1f;
+	units = 1000000 >> val; /* convert to uJ */
+	power = I915_READ(MCH_SECP_NRG_STTS) * units;
+
+	intel_runtime_pm_put(dev_priv);
+
+	return power;
+}
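The RAPL arithmetic above can be sketched outside the kernel as follows. The
Energy Status Units field (bits 12:8 of MSR_RAPL_POWER_UNIT) is an exponent,
one counter tick being 1/2^ESU joules; like the patch, this truncates to whole
microjoules per tick. Function names are illustrative only:

```c
#include <assert.h>
#include <stdint.h>

/* Microjoules per energy-counter tick, derived from the ESU exponent
 * in bits 12:8 of MSR_RAPL_POWER_UNIT (tick = 1/2^ESU joules). */
static uint32_t rapl_uj_per_tick(uint64_t rapl_power_unit_msr)
{
	uint32_t esu = (rapl_power_unit_msr >> 8) & 0x1f;

	return 1000000u >> esu; /* truncates, as in the patch */
}

/* Scale the raw energy-status counter to microjoules. */
static uint64_t energy_uj(uint64_t rapl_power_unit_msr, uint32_t nrg_stts)
{
	return (uint64_t)nrg_stts * rapl_uj_per_tick(rapl_power_unit_msr);
}
```

With the common ESU of 14, one tick is ~61uJ, so the integer shift loses a
little precision per tick; the in-kernel code accepts the same rounding.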
-- 
2.9.4


* [RFC 03/10] drm/i915: Extract intel_get_cagf
  2017-08-02 12:32 [RFC v2 00/10] i915 PMU and engine busy stats Tvrtko Ursulin
  2017-08-02 12:32 ` [RFC 01/10] drm/i915: Convert intel_rc6_residency_us to ns Tvrtko Ursulin
  2017-08-02 12:32 ` [RFC 02/10] drm/i915: Add intel_energy_uJ Tvrtko Ursulin
@ 2017-08-02 12:32 ` Tvrtko Ursulin
  2017-08-02 12:32 ` [RFC 04/10] drm/i915: Expose a PMU interface for perf queries Tvrtko Ursulin
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-08-02 12:32 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Code to be shared between debugfs and the PMU implementation.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c |  8 +-------
 drivers/gpu/drm/i915/i915_drv.h     |  1 +
 drivers/gpu/drm/i915/intel_pm.c     | 14 ++++++++++++++
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 67287ecc2ed1..2811229f6fb7 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1112,13 +1112,7 @@ static int i915_frequency_info(struct seq_file *m, void *unused)
 		rpdownei = I915_READ(GEN6_RP_CUR_DOWN_EI) & GEN6_CURIAVG_MASK;
 		rpcurdown = I915_READ(GEN6_RP_CUR_DOWN) & GEN6_CURBSYTAVG_MASK;
 		rpprevdown = I915_READ(GEN6_RP_PREV_DOWN) & GEN6_CURBSYTAVG_MASK;
-		if (INTEL_GEN(dev_priv) >= 9)
-			cagf = (rpstat & GEN9_CAGF_MASK) >> GEN9_CAGF_SHIFT;
-		else if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
-			cagf = (rpstat & HSW_CAGF_MASK) >> HSW_CAGF_SHIFT;
-		else
-			cagf = (rpstat & GEN6_CAGF_MASK) >> GEN6_CAGF_SHIFT;
-		cagf = intel_gpu_freq(dev_priv, cagf);
+		cagf = intel_gpu_freq(dev_priv, intel_get_cagf(dev_priv, rpstat));
 
 		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 124cbfc9353c..0c8bd1cdcbbd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -4019,6 +4019,7 @@ static inline u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
 }
 
 u64 intel_energy_uJ(struct drm_i915_private *dev_priv);
+u32 intel_get_cagf(struct drm_i915_private *dev_priv, u32 rpstat1);
 
 #define I915_READ8(reg)		dev_priv->uncore.funcs.mmio_readb(dev_priv, (reg), true)
 #define I915_WRITE8(reg, val)	dev_priv->uncore.funcs.mmio_writeb(dev_priv, (reg), (val), true)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index ad49932b445c..1533b6996c85 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -9225,3 +9225,17 @@ u64 intel_energy_uJ(struct drm_i915_private *dev_priv)
 
 	return power;
 }
+
+u32 intel_get_cagf(struct drm_i915_private *dev_priv, u32 rpstat)
+{
+	u32 cagf;
+
+	if (INTEL_GEN(dev_priv) >= 9)
+		cagf = (rpstat & GEN9_CAGF_MASK) >> GEN9_CAGF_SHIFT;
+	else if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
+		cagf = (rpstat & HSW_CAGF_MASK) >> HSW_CAGF_SHIFT;
+	else
+		cagf = (rpstat & GEN6_CAGF_MASK) >> GEN6_CAGF_SHIFT;
+
+	return  cagf;
+}
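The extraction being shared here is a plain mask-and-shift of the CAGF field
out of RPSTAT1; a minimal sketch, using placeholder mask/shift values rather
than the real per-gen definitions from i915_reg.h:

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder field definition for illustration; the real GEN6/HSW/GEN9
 * CAGF masks and shifts are defined in i915_reg.h. */
#define DEMO_CAGF_SHIFT 8
#define DEMO_CAGF_MASK  (0x7fu << DEMO_CAGF_SHIFT)

/* Mirrors the shape of intel_get_cagf(): isolate the current actual
 * graphics frequency field from an RPSTAT1 register value. */
static uint32_t demo_get_cagf(uint32_t rpstat)
{
	return (rpstat & DEMO_CAGF_MASK) >> DEMO_CAGF_SHIFT;
}
```

The caller then feeds the result through intel_gpu_freq() to convert the
encoded value into MHz, as i915_frequency_info() does above.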
-- 
2.9.4


* [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-02 12:32 [RFC v2 00/10] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2017-08-02 12:32 ` [RFC 03/10] drm/i915: Extract intel_get_cagf Tvrtko Ursulin
@ 2017-08-02 12:32 ` Tvrtko Ursulin
  2017-08-02 13:00   ` Tvrtko Ursulin
  2017-08-02 12:32 ` [RFC 05/10] drm/i915/pmu: Suspend sampling when GPU is idle Tvrtko Ursulin
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-08-02 12:32 UTC (permalink / raw)
  To: Intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

The first goal is to be able to measure GPU (and individual ring) busyness
without having to poll registers from userspace. (Polling not only requires
holding the forcewake lock indefinitely, perturbing the system, but also
runs the risk of hanging the machine.) As an alternative we can use the
perf event counter interface to sample the ring registers periodically
and send those results to userspace.

To be able to do so, we need to export the two symbols from
kernel/events/core.c to register and unregister a PMU device.

v2: Use a common timer for the ring sampling.

v3:
 * Decouple uAPI from i915 engine ids.
 * Complete uAPI defines.
 * Refactor some code to helpers for clarity.
 * Skip sampling disabled engines.
 * Expose counters in sysfs.
 * Pass in fake regs to avoid null ptr deref in perf core.
 * Convert to class/instance uAPI.
 * Use shared driver code for rc6 residency, power and frequency.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/Makefile           |   1 +
 drivers/gpu/drm/i915/i915_drv.c         |   2 +
 drivers/gpu/drm/i915/i915_drv.h         |  25 ++
 drivers/gpu/drm/i915/i915_pmu.c         | 629 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h         |   3 +
 drivers/gpu/drm/i915/intel_engine_cs.c  |  10 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |  25 ++
 drivers/gpu/drm/i915/intel_ringbuffer.h |   8 +
 include/uapi/drm/i915_drm.h             |  55 +++
 kernel/events/core.c                    |   1 +
 10 files changed, 759 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_pmu.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index f8227318dcaf..1c720013dc42 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -26,6 +26,7 @@ i915-y := i915_drv.o \
 
 i915-$(CONFIG_COMPAT)   += i915_ioc32.o
 i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o intel_pipe_crc.o
+i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
 
 # GEM code
 i915-y += i915_cmd_parser.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 214555e813f1..d75c2d790875 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1194,6 +1194,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
 	struct drm_device *dev = &dev_priv->drm;
 
 	i915_gem_shrinker_init(dev_priv);
+	i915_pmu_register(dev_priv);
 
 	/*
 	 * Notify a valid surface after modesetting,
@@ -1248,6 +1249,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
 	intel_opregion_unregister(dev_priv);
 
 	i915_perf_unregister(dev_priv);
+	i915_pmu_unregister(dev_priv);
 
 	i915_teardown_sysfs(dev_priv);
 	i915_guc_log_unregister(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0c8bd1cdcbbd..142826742b86 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -40,6 +40,7 @@
 #include <linux/hash.h>
 #include <linux/intel-iommu.h>
 #include <linux/kref.h>
+#include <linux/perf_event.h>
 #include <linux/pm_qos.h>
 #include <linux/reservation.h>
 #include <linux/shmem_fs.h>
@@ -2111,6 +2112,12 @@ struct intel_cdclk_state {
 	unsigned int cdclk, vco, ref;
 };
 
+enum {
+	__I915_SAMPLE_FREQ_ACT = 0,
+	__I915_SAMPLE_FREQ_REQ,
+	__I915_NUM_PMU_SAMPLERS
+};
+
 struct drm_i915_private {
 	struct drm_device drm;
 
@@ -2158,6 +2165,7 @@ struct drm_i915_private {
 	struct pci_dev *bridge_dev;
 	struct i915_gem_context *kernel_context;
 	struct intel_engine_cs *engine[I915_NUM_ENGINES];
+	struct intel_engine_cs *engine_class[MAX_ENGINE_CLASS + 1][MAX_ENGINE_INSTANCE + 1];
 	struct i915_vma *semaphore;
 
 	struct drm_dma_handle *status_page_dmah;
@@ -2605,6 +2613,14 @@ struct drm_i915_private {
 		int	irq;
 	} lpe_audio;
 
+	struct {
+		struct pmu base;
+		spinlock_t lock;
+		struct hrtimer timer;
+		u64 enable;
+		u64 sample[__I915_NUM_PMU_SAMPLERS];
+	} pmu;
+
 	/*
 	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
 	 * will be rejected. Instead look for a better place.
@@ -3813,6 +3829,15 @@ extern void i915_perf_fini(struct drm_i915_private *dev_priv);
 extern void i915_perf_register(struct drm_i915_private *dev_priv);
 extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
 
+/* i915_pmu.c */
+#ifdef CONFIG_PERF_EVENTS
+extern void i915_pmu_register(struct drm_i915_private *i915);
+extern void i915_pmu_unregister(struct drm_i915_private *i915);
+#else
+static inline void i915_pmu_register(struct drm_i915_private *i915) {}
+static inline void i915_pmu_unregister(struct drm_i915_private *i915) {}
+#endif
+
 /* i915_suspend.c */
 extern int i915_save_state(struct drm_i915_private *dev_priv);
 extern int i915_restore_state(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
new file mode 100644
index 000000000000..62c527c12641
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -0,0 +1,629 @@
+#include <linux/perf_event.h>
+#include <linux/pm_runtime.h>
+
+#include "i915_drv.h"
+#include "intel_ringbuffer.h"
+
+#define FREQUENCY 200
+#define PERIOD max_t(u64, 10000, NSEC_PER_SEC / FREQUENCY)
+
+#define ENGINE_SAMPLE_MASK \
+	(BIT(I915_SAMPLE_QUEUED) | \
+	 BIT(I915_SAMPLE_BUSY) | \
+	 BIT(I915_SAMPLE_WAIT) | \
+	 BIT(I915_SAMPLE_SEMA))
+
+#define ENGINE_SAMPLE_BITS (16)
+
+static u8 engine_config_sample(u64 config)
+{
+	return config & I915_PMU_SAMPLE_MASK;
+}
+
+static u8 engine_event_sample(struct perf_event *event)
+{
+	return engine_config_sample(event->attr.config);
+}
+
+static u8 engine_event_class(struct perf_event *event)
+{
+	return (event->attr.config >> I915_PMU_CLASS_SHIFT) & 0xff;
+}
+
+static u8 engine_event_instance(struct perf_event *event)
+{
+	return (event->attr.config >> I915_PMU_SAMPLE_BITS) & 0xff;
+}
+
+static bool is_engine_config(u64 config)
+{
+	return config < __I915_PMU_OTHER(0);
+}
+
+static u64 config_enabled_mask(u64 config)
+{
+	if (is_engine_config(config))
+		return BIT_ULL(engine_config_sample(config));
+	else
+		return BIT_ULL(config - __I915_PMU_OTHER(0)) <<
+		       ENGINE_SAMPLE_BITS;
+}
+
+static bool is_engine_event(struct perf_event *event)
+{
+	return is_engine_config(event->attr.config);
+}
+
+static u64 event_enabled_mask(struct perf_event *event)
+{
+	return config_enabled_mask(event->attr.config);
+}
+
+static bool grab_forcewake(struct drm_i915_private *i915, bool fw)
+{
+	if (!fw)
+		intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
+
+	return true;
+}
+
+static void engines_sample(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	bool fw = false;
+
+	if ((dev_priv->pmu.enable & ENGINE_SAMPLE_MASK) == 0)
+		return;
+
+	if (!dev_priv->gt.awake)
+		return;
+
+	if (!intel_runtime_pm_get_if_in_use(dev_priv))
+		return;
+
+	for_each_engine(engine, dev_priv, id) {
+		u32 enable = engine->pmu.enable;
+
+		if (i915_seqno_passed(intel_engine_get_seqno(engine),
+				      intel_engine_last_submit(engine)))
+			continue;
+
+		if (enable & BIT(I915_SAMPLE_QUEUED))
+			engine->pmu.sample[I915_SAMPLE_QUEUED] += PERIOD;
+
+		if (enable & BIT(I915_SAMPLE_BUSY)) {
+			u32 val;
+
+			fw = grab_forcewake(dev_priv, fw);
+			val = I915_READ_FW(RING_MI_MODE(engine->mmio_base));
+			if (!(val & MODE_IDLE))
+				engine->pmu.sample[I915_SAMPLE_BUSY] += PERIOD;
+		}
+
+		if (enable & (BIT(I915_SAMPLE_WAIT) | BIT(I915_SAMPLE_SEMA))) {
+			u32 val;
+
+			fw = grab_forcewake(dev_priv, fw);
+			val = I915_READ_FW(RING_CTL(engine->mmio_base));
+			if ((enable & BIT(I915_SAMPLE_WAIT)) &&
+			    (val & RING_WAIT))
+				engine->pmu.sample[I915_SAMPLE_WAIT] += PERIOD;
+			if ((enable & BIT(I915_SAMPLE_SEMA)) &&
+			    (val & RING_WAIT_SEMAPHORE))
+				engine->pmu.sample[I915_SAMPLE_SEMA] += PERIOD;
+		}
+	}
+
+	if (fw)
+		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+	intel_runtime_pm_put(dev_priv);
+}
+
+static void frequency_sample(struct drm_i915_private *dev_priv)
+{
+	if (dev_priv->pmu.enable &
+	    config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY)) {
+		u64 val;
+
+		val = dev_priv->rps.cur_freq;
+		if (dev_priv->gt.awake &&
+		    intel_runtime_pm_get_if_in_use(dev_priv)) {
+			val = intel_get_cagf(dev_priv,
+					     I915_READ_NOTRACE(GEN6_RPSTAT1));
+			intel_runtime_pm_put(dev_priv);
+		}
+		val = intel_gpu_freq(dev_priv, val);
+		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_ACT] += val * PERIOD;
+	}
+
+	if (dev_priv->pmu.enable &
+	    config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY)) {
+		u64 val = intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq);
+		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_REQ] += val * PERIOD;
+	}
+}
+
+static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
+{
+	struct drm_i915_private *i915 =
+		container_of(hrtimer, struct drm_i915_private, pmu.timer);
+
+	if (i915->pmu.enable == 0)
+		return HRTIMER_NORESTART;
+
+	engines_sample(i915);
+	frequency_sample(i915);
+
+	hrtimer_forward_now(hrtimer, ns_to_ktime(PERIOD));
+	return HRTIMER_RESTART;
+}
+
+static void i915_pmu_event_destroy(struct perf_event *event)
+{
+	WARN_ON(event->parent);
+}
+
+static int engine_event_init(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+
+	if (!intel_engine_lookup_user(i915, engine_event_class(event),
+				      engine_event_instance(event)))
+		return -ENODEV;
+
+	switch (engine_event_sample(event)) {
+	case I915_SAMPLE_QUEUED:
+	case I915_SAMPLE_BUSY:
+	case I915_SAMPLE_WAIT:
+		break;
+	case I915_SAMPLE_SEMA:
+		if (INTEL_GEN(i915) < 6)
+			return -ENODEV;
+		break;
+	default:
+		return -ENOENT;
+	}
+
+	return 0;
+}
+
+static DEFINE_PER_CPU(struct pt_regs, i915_pmu_pt_regs);
+
+static enum hrtimer_restart hrtimer_sample(struct hrtimer *hrtimer)
+{
+	struct pt_regs *regs = this_cpu_ptr(&i915_pmu_pt_regs);
+	struct perf_sample_data data;
+	struct perf_event *event;
+	u64 period;
+
+	event = container_of(hrtimer, struct perf_event, hw.hrtimer);
+	if (event->state != PERF_EVENT_STATE_ACTIVE)
+		return HRTIMER_NORESTART;
+
+	event->pmu->read(event);
+
+	perf_sample_data_init(&data, 0, event->hw.last_period);
+	perf_event_overflow(event, &data, regs);
+
+	period = max_t(u64, 10000, event->hw.sample_period);
+	hrtimer_forward_now(hrtimer, ns_to_ktime(period));
+	return HRTIMER_RESTART;
+}
+
+static void init_hrtimer(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (!is_sampling_event(event))
+		return;
+
+	hrtimer_init(&hwc->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	hwc->hrtimer.function = hrtimer_sample;
+
+	if (event->attr.freq) {
+		long freq = event->attr.sample_freq;
+
+		event->attr.sample_period = NSEC_PER_SEC / freq;
+		hwc->sample_period = event->attr.sample_period;
+		local64_set(&hwc->period_left, hwc->sample_period);
+		hwc->last_period = hwc->sample_period;
+		event->attr.freq = 0;
+	}
+}
+
+static int i915_pmu_event_init(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	int ret;
+
+	/* XXX ideally only want pid == -1 && cpu == -1 */
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
+	ret = 0;
+	if (is_engine_event(event)) {
+		ret = engine_event_init(event);
+	} else switch (event->attr.config) {
+	case I915_PMU_ACTUAL_FREQUENCY:
+		if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
+			ret = -ENODEV; /* requires a mutex for sampling! */
+	case I915_PMU_REQUESTED_FREQUENCY:
+	case I915_PMU_ENERGY:
+	case I915_PMU_RC6_RESIDENCY:
+	case I915_PMU_RC6p_RESIDENCY:
+	case I915_PMU_RC6pp_RESIDENCY:
+		if (INTEL_GEN(i915) < 6)
+			ret = -ENODEV;
+		break;
+	}
+	if (ret)
+		return ret;
+
+	if (!event->parent)
+		event->destroy = i915_pmu_event_destroy;
+
+	init_hrtimer(event);
+
+	return 0;
+}
+
+static void i915_pmu_timer_start(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	s64 period;
+
+	if (!is_sampling_event(event))
+		return;
+
+	period = local64_read(&hwc->period_left);
+	if (period) {
+		if (period < 0)
+			period = 10000;
+
+		local64_set(&hwc->period_left, 0);
+	} else {
+		period = max_t(u64, 10000, hwc->sample_period);
+	}
+
+	hrtimer_start_range_ns(&hwc->hrtimer,
+			       ns_to_ktime(period), 0,
+			       HRTIMER_MODE_REL_PINNED);
+}
+
+static void i915_pmu_timer_cancel(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (!is_sampling_event(event))
+		return;
+
+	local64_set(&hwc->period_left,
+		    ktime_to_ns(hrtimer_get_remaining(&hwc->hrtimer)));
+	hrtimer_cancel(&hwc->hrtimer);
+}
+
+static void i915_pmu_enable(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	unsigned long flags;
+
+	spin_lock_irqsave(&i915->pmu.lock, flags);
+
+	if (i915->pmu.enable == 0)
+		hrtimer_start_range_ns(&i915->pmu.timer,
+				       ns_to_ktime(PERIOD), 0,
+				       HRTIMER_MODE_REL_PINNED);
+
+	i915->pmu.enable |= event_enabled_mask(event);
+
+	if (is_engine_event(event)) {
+		struct intel_engine_cs *engine;
+
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+		GEM_BUG_ON(!engine);
+		engine->pmu.enable |= BIT(engine_event_sample(event));
+	}
+
+	spin_unlock_irqrestore(&i915->pmu.lock, flags);
+
+	i915_pmu_timer_start(event);
+}
+
+static void i915_pmu_disable(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	unsigned long flags;
+	u64 mask;
+
+	spin_lock_irqsave(&i915->pmu.lock, flags);
+
+	if (is_engine_event(event)) {
+		struct intel_engine_cs *engine;
+		enum intel_engine_id id;
+
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+		GEM_BUG_ON(!engine);
+		engine->pmu.enable &= ~BIT(engine_event_sample(event));
+		mask = 0;
+		for_each_engine(engine, i915, id)
+			mask |= engine->pmu.enable;
+		mask = ~mask;
+	} else {
+		mask = event_enabled_mask(event);
+	}
+
+	i915->pmu.enable &= ~mask;
+
+	spin_unlock_irqrestore(&i915->pmu.lock, flags);
+
+	i915_pmu_timer_cancel(event);
+}
+
+static int i915_pmu_event_add(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (flags & PERF_EF_START)
+		i915_pmu_enable(event);
+
+	hwc->state = !(flags & PERF_EF_START);
+
+	return 0;
+}
+
+static void i915_pmu_event_del(struct perf_event *event, int flags)
+{
+	i915_pmu_disable(event);
+}
+
+static void i915_pmu_event_start(struct perf_event *event, int flags)
+{
+	i915_pmu_enable(event);
+}
+
+static void i915_pmu_event_stop(struct perf_event *event, int flags)
+{
+	i915_pmu_disable(event);
+}
+
+static u64 count_interrupts(struct drm_i915_private *i915)
+{
+	/* open-coded kstat_irqs() */
+	struct irq_desc *desc = irq_to_desc(i915->drm.pdev->irq);
+	u64 sum = 0;
+	int cpu;
+
+	if (!desc || !desc->kstat_irqs)
+		return 0;
+
+	for_each_possible_cpu(cpu)
+		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
+
+	return sum;
+}
+
+static void i915_pmu_event_read(struct perf_event *event)
+{
+	struct drm_i915_private *i915 =
+		container_of(event->pmu, typeof(*i915), pmu.base);
+	u64 val = 0;
+
+	if (is_engine_event(event)) {
+		u8 sample = engine_event_sample(event);
+		struct intel_engine_cs *engine;
+
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+
+		if (WARN_ON_ONCE(!engine)) {
+			/* Do nothing */
+		} else {
+			val = engine->pmu.sample[sample];
+		}
+	} else switch (event->attr.config) {
+	case I915_PMU_ACTUAL_FREQUENCY:
+		val = i915->pmu.sample[__I915_SAMPLE_FREQ_ACT];
+		break;
+	case I915_PMU_REQUESTED_FREQUENCY:
+		val = i915->pmu.sample[__I915_SAMPLE_FREQ_REQ];
+		break;
+	case I915_PMU_ENERGY:
+		val = intel_energy_uJ(i915);
+		break;
+	case I915_PMU_INTERRUPTS:
+		val = count_interrupts(i915);
+		break;
+
+	case I915_PMU_RC6_RESIDENCY:
+		if (!i915->gt.awake)
+			return;
+
+		val = intel_rc6_residency_ns(i915,
+					     IS_VALLEYVIEW(i915) ?
+					     VLV_GT_RENDER_RC6 :
+					     GEN6_GT_GFX_RC6);
+		break;
+
+	case I915_PMU_RC6p_RESIDENCY:
+		if (!i915->gt.awake)
+			return;
+
+		if (!IS_VALLEYVIEW(i915))
+			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
+		break;
+
+	case I915_PMU_RC6pp_RESIDENCY:
+		if (!i915->gt.awake)
+			return;
+
+		if (!IS_VALLEYVIEW(i915))
+			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
+		break;
+	}
+
+	local64_set(&event->count, val);
+}
+
+static int i915_pmu_event_event_idx(struct perf_event *event)
+{
+	return 0;
+}
+
+static ssize_t i915_pmu_format_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+	return sprintf(buf, "%s\n", (char *)eattr->var);
+}
+
+#define I915_PMU_FORMAT_ATTR(_name, _config) \
+	(&((struct dev_ext_attribute[]) { \
+		{ .attr = __ATTR(_name, S_IRUGO, i915_pmu_format_show, NULL), \
+		  .var = (void *)_config, } \
+	})[0].attr.attr)
+
+static struct attribute *i915_pmu_format_attrs[] = {
+	I915_PMU_FORMAT_ATTR(i915_eventid, "config:0-42"),
+	NULL,
+};
+
+static const struct attribute_group i915_pmu_format_attr_group = {
+	.name = "format",
+	.attrs = i915_pmu_format_attrs,
+};
+
+static ssize_t i915_pmu_event_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	struct dev_ext_attribute *eattr;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+	return sprintf(buf, "config=0x%lx\n", (unsigned long)eattr->var);
+}
+
+#define I915_PMU_EVENT_ATTR(_name, _config) \
+	(&((struct dev_ext_attribute[]) { \
+		{ .attr = __ATTR(_name, S_IRUGO, i915_pmu_event_show, NULL), \
+		  .var = (void *)_config, } \
+	})[0].attr.attr)
+
+static struct attribute *i915_pmu_events_attrs[] = {
+	I915_PMU_EVENT_ATTR(rcs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_RENDER, 0)),
+	I915_PMU_EVENT_ATTR(rcs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_RENDER, 0)),
+	I915_PMU_EVENT_ATTR(rcs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_RENDER, 0)),
+	I915_PMU_EVENT_ATTR(rcs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_RENDER, 0)),
+
+	I915_PMU_EVENT_ATTR(bcs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_COPY, 0)),
+	I915_PMU_EVENT_ATTR(bcs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_COPY, 0)),
+	I915_PMU_EVENT_ATTR(bcs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_COPY, 0)),
+	I915_PMU_EVENT_ATTR(bcs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_COPY, 0)),
+
+	I915_PMU_EVENT_ATTR(vcs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 0)),
+	I915_PMU_EVENT_ATTR(vcs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 0)),
+	I915_PMU_EVENT_ATTR(vcs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 0)),
+	I915_PMU_EVENT_ATTR(vcs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 0)),
+
+	I915_PMU_EVENT_ATTR(vcs1-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 1)),
+	I915_PMU_EVENT_ATTR(vcs1-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 1)),
+	I915_PMU_EVENT_ATTR(vcs1-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 1)),
+	I915_PMU_EVENT_ATTR(vcs1-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 1)),
+
+	I915_PMU_EVENT_ATTR(vecs0-queued,
+			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+	I915_PMU_EVENT_ATTR(vecs0-busy,
+			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+	I915_PMU_EVENT_ATTR(vecs0-wait,
+			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+	I915_PMU_EVENT_ATTR(vecs0-sema,
+			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
+
+	I915_PMU_EVENT_ATTR(actual-frequency,	 I915_PMU_ACTUAL_FREQUENCY),
+	I915_PMU_EVENT_ATTR(requested-frequency, I915_PMU_REQUESTED_FREQUENCY),
+	I915_PMU_EVENT_ATTR(energy,		 I915_PMU_ENERGY),
+	I915_PMU_EVENT_ATTR(interrupts,		 I915_PMU_INTERRUPTS),
+	I915_PMU_EVENT_ATTR(rc6-residency,	 I915_PMU_RC6_RESIDENCY),
+	I915_PMU_EVENT_ATTR(rc6p-residency,	 I915_PMU_RC6p_RESIDENCY),
+	I915_PMU_EVENT_ATTR(rc6pp-residency,	 I915_PMU_RC6pp_RESIDENCY),
+
+	NULL,
+};
+
+static const struct attribute_group i915_pmu_events_attr_group = {
+	.name = "events",
+	.attrs = i915_pmu_events_attrs,
+};
+
+static const struct attribute_group *i915_pmu_attr_groups[] = {
+	&i915_pmu_format_attr_group,
+	&i915_pmu_events_attr_group,
+	NULL
+};
+
+void i915_pmu_register(struct drm_i915_private *i915)
+{
+	if (INTEL_GEN(i915) <= 2)
+		return;
+
+	i915->pmu.base.attr_groups	= i915_pmu_attr_groups;
+	i915->pmu.base.task_ctx_nr	= perf_sw_context;
+	i915->pmu.base.event_init	= i915_pmu_event_init;
+	i915->pmu.base.add		= i915_pmu_event_add;
+	i915->pmu.base.del		= i915_pmu_event_del;
+	i915->pmu.base.start		= i915_pmu_event_start;
+	i915->pmu.base.stop		= i915_pmu_event_stop;
+	i915->pmu.base.read		= i915_pmu_event_read;
+	i915->pmu.base.event_idx	= i915_pmu_event_event_idx;
+
+	spin_lock_init(&i915->pmu.lock);
+	hrtimer_init(&i915->pmu.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	i915->pmu.timer.function = i915_sample;
+	i915->pmu.enable = 0;
+
+	if (perf_pmu_register(&i915->pmu.base, "i915", -1))
+		i915->pmu.base.event_init = NULL;
+}
+
+void i915_pmu_unregister(struct drm_i915_private *i915)
+{
+	if (!i915->pmu.base.event_init)
+		return;
+
+	i915->pmu.enable = 0;
+
+	perf_pmu_unregister(&i915->pmu.base);
+	i915->pmu.base.event_init = NULL;
+
+	hrtimer_cancel(&i915->pmu.timer);
+}
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 1dc7e7a2a23b..26bce766ab51 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -95,6 +95,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define VIDEO_ENHANCEMENT_CLASS	2
 #define COPY_ENGINE_CLASS	3
 #define OTHER_CLASS		4
+#define MAX_ENGINE_CLASS	4
+
+#define MAX_ENGINE_INSTANCE    1
 
 /* PCI config space */
 
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 9ab596941372..14630612325b 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -198,6 +198,15 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
 	GEM_BUG_ON(info->class >= ARRAY_SIZE(intel_engine_classes));
 	class_info = &intel_engine_classes[info->class];
 
+	if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
+		return -EINVAL;
+
+	if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
+		return -EINVAL;
+
+	if (GEM_WARN_ON(dev_priv->engine_class[info->class][info->instance]))
+		return -EINVAL;
+
 	GEM_BUG_ON(dev_priv->engine[id]);
 	engine = kzalloc(sizeof(*engine), GFP_KERNEL);
 	if (!engine)
@@ -225,6 +234,7 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
 
 	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
 
+	dev_priv->engine_class[info->class][info->instance] = engine;
 	dev_priv->engine[id] = engine;
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index cdf084ef5aae..af8d85794c44 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2282,3 +2282,28 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
 
 	return intel_init_ring_buffer(engine);
 }
+
+static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
+	[I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
+	[I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
+	[I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS,
+	[I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS,
+	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS,
+};
+
+struct intel_engine_cs *
+intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance)
+{
+	if (class >= ARRAY_SIZE(user_class_map))
+		return NULL;
+
+	class = user_class_map[class];
+
+	if (WARN_ON_ONCE(class > MAX_ENGINE_CLASS))
+		return NULL;
+
+	if (instance > MAX_ENGINE_INSTANCE)
+		return NULL;
+
+	return i915->engine_class[class][instance];
+}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index d33c93444c0d..9fdf0cdf6220 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -245,6 +245,11 @@ struct intel_engine_cs {
 		I915_SELFTEST_DECLARE(bool mock : 1);
 	} breadcrumbs;
 
+	struct {
+		u32 enable;
+		u64 sample[4];
+	} pmu;
+
 	/*
 	 * A pool of objects to use as shadow copies of client batch buffers
 	 * when the command parser is enabled. Prevents the client from
@@ -735,4 +740,7 @@ bool intel_engines_are_idle(struct drm_i915_private *dev_priv);
 void intel_engines_mark_idle(struct drm_i915_private *i915);
 void intel_engines_reset_default_submission(struct drm_i915_private *i915);
 
+struct intel_engine_cs *
+intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 7ccbd6a2bbe0..103874476a6d 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -86,6 +86,61 @@ enum i915_mocs_table_index {
 	I915_MOCS_CACHED,
 };
 
+enum drm_i915_gem_engine_class {
+	I915_ENGINE_CLASS_OTHER = 0,
+	I915_ENGINE_CLASS_RENDER = 1,
+	I915_ENGINE_CLASS_COPY = 2,
+	I915_ENGINE_CLASS_VIDEO = 3,
+	I915_ENGINE_CLASS_VIDEO_ENHANCE = 4,
+	I915_ENGINE_CLASS_MAX /* non-ABI */
+};
+
+/**
+ * DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
+ *
+ */
+
+enum drm_i915_pmu_engine_sample {
+	I915_SAMPLE_QUEUED = 0,
+	I915_SAMPLE_BUSY = 1,
+	I915_SAMPLE_WAIT = 2,
+	I915_SAMPLE_SEMA = 3
+};
+
+#define I915_PMU_SAMPLE_BITS (4)
+#define I915_PMU_SAMPLE_MASK (0xf)
+#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
+#define I915_PMU_CLASS_SHIFT \
+	(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)
+
+#define __I915_PMU_ENGINE(class, instance, sample) \
+	((class) << I915_PMU_CLASS_SHIFT | \
+	(instance) << I915_PMU_SAMPLE_BITS | \
+	(sample))
+
+#define I915_PMU_ENGINE_QUEUED(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
+
+#define I915_PMU_ENGINE_BUSY(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_BUSY)
+
+#define I915_PMU_ENGINE_WAIT(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_WAIT)
+
+#define I915_PMU_ENGINE_SEMA(class, instance) \
+	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
+
+#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
+
+#define I915_PMU_ACTUAL_FREQUENCY	__I915_PMU_OTHER(0)
+#define I915_PMU_REQUESTED_FREQUENCY	__I915_PMU_OTHER(1)
+#define I915_PMU_ENERGY			__I915_PMU_OTHER(2)
+#define I915_PMU_INTERRUPTS		__I915_PMU_OTHER(3)
+
+#define I915_PMU_RC6_RESIDENCY		__I915_PMU_OTHER(4)
+#define I915_PMU_RC6p_RESIDENCY		__I915_PMU_OTHER(5)
+#define I915_PMU_RC6pp_RESIDENCY	__I915_PMU_OTHER(6)
+
 /* Each region is a minimum of 16k, and there are at most 255 of them.
  */
 #define I915_NR_TEX_REGIONS 255	/* table size 2k - maximum due to use
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7d34bc16ca1c..3b6eb0131204 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7382,6 +7382,7 @@ int perf_event_overflow(struct perf_event *event,
 {
 	return __perf_event_overflow(event, 1, data, regs);
 }
+EXPORT_SYMBOL_GPL(perf_event_overflow);
 
 /*
  * Generic software event infrastructure
-- 
2.9.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [RFC 05/10] drm/i915/pmu: Suspend sampling when GPU is idle
  2017-08-02 12:32 [RFC v2 00/10] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2017-08-02 12:32 ` [RFC 04/10] drm/i915: Expose a PMU interface for perf queries Tvrtko Ursulin
@ 2017-08-02 12:32 ` Tvrtko Ursulin
  2017-08-02 12:32 ` [RFC 06/10] drm/i915: Wrap context schedule notification Tvrtko Ursulin
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-08-02 12:32 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

If only a subset of events is enabled, we can afford to suspend
the sampling timer when the GPU is idle, and so save some cycles
and power.

v2: Rebase and limit timer even more.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  5 ++++
 drivers/gpu/drm/i915/i915_gem.c         |  1 +
 drivers/gpu/drm/i915/i915_gem_request.c |  1 +
 drivers/gpu/drm/i915/i915_pmu.c         | 51 ++++++++++++++++++++++++++++++---
 4 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 142826742b86..d5ebc524d30d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2617,6 +2617,7 @@ struct drm_i915_private {
 		struct pmu base;
 		spinlock_t lock;
 		struct hrtimer timer;
+		bool timer_enabled;
 		u64 enable;
 		u64 sample[__I915_NUM_PMU_SAMPLERS];
 	} pmu;
@@ -3833,9 +3834,13 @@ extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
 #ifdef CONFIG_PERF_EVENTS
 extern void i915_pmu_register(struct drm_i915_private *i915);
 extern void i915_pmu_unregister(struct drm_i915_private *i915);
+extern void i915_pmu_gt_idle(struct drm_i915_private *i915);
+extern void i915_pmu_gt_active(struct drm_i915_private *i915);
 #else
 static inline void i915_pmu_register(struct drm_i915_private *i915) {}
 static inline void i915_pmu_unregister(struct drm_i915_private *i915) {}
+static inline void i915_pmu_gt_idle(struct drm_i915_private *i915) {}
+static inline void i915_pmu_gt_active(struct drm_i915_private *i915) {}
 #endif
 
 /* i915_suspend.c */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a60885d6231b..1a2156f43d74 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3256,6 +3256,7 @@ i915_gem_idle_work_handler(struct work_struct *work)
 
 	intel_engines_mark_idle(dev_priv);
 	i915_gem_timelines_mark_idle(dev_priv);
+	i915_pmu_gt_idle(dev_priv);
 
 	GEM_BUG_ON(!dev_priv->gt.awake);
 	dev_priv->gt.awake = false;
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 9eedd33eb524..781f41461a71 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -879,6 +879,7 @@ static void i915_gem_mark_busy(const struct intel_engine_cs *engine)
 	i915_update_gfx_val(dev_priv);
 	if (INTEL_GEN(dev_priv) >= 6)
 		gen6_rps_busy(dev_priv);
+	i915_pmu_gt_active(dev_priv);
 
 	queue_delayed_work(dev_priv->wq,
 			   &dev_priv->gt.retire_work,
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 62c527c12641..0d9c0d07a432 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -59,6 +59,46 @@ static u64 event_enabled_mask(struct perf_event *event)
 	return config_enabled_mask(event->attr.config);
 }
 
+static bool pmu_needs_timer(struct drm_i915_private *i915, bool gpu_active)
+{
+	u64 enable = i915->pmu.enable;
+
+	enable &= config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY) |
+		  config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY) |
+		  ENGINE_SAMPLE_MASK;
+
+	if (!gpu_active)
+		enable &= ~ENGINE_SAMPLE_MASK;
+
+	return enable;
+}
+
+void i915_pmu_gt_idle(struct drm_i915_private *i915)
+{
+	spin_lock_irq(&i915->pmu.lock);
+	/*
+	 * Signal sampling timer to stop if only engine events are enabled and
+	 * GPU went idle.
+	 */
+	i915->pmu.timer_enabled = pmu_needs_timer(i915, false);
+	spin_unlock_irq(&i915->pmu.lock);
+}
+
+void i915_pmu_gt_active(struct drm_i915_private *i915)
+{
+	spin_lock_irq(&i915->pmu.lock);
+	/*
+	 * Re-enable sampling timer when GPU goes active.
+	 */
+	if (!i915->pmu.timer_enabled && pmu_needs_timer(i915, true)) {
+		hrtimer_start_range_ns(&i915->pmu.timer,
+				       ns_to_ktime(PERIOD), 0,
+				       HRTIMER_MODE_REL_PINNED);
+		i915->pmu.timer_enabled = true;
+	}
+	spin_unlock_irq(&i915->pmu.lock);
+}
+
 static bool grab_forcewake(struct drm_i915_private *i915, bool fw)
 {
 	if (!fw)
@@ -149,7 +189,7 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
 	struct drm_i915_private *i915 =
 		container_of(hrtimer, struct drm_i915_private, pmu.timer);
 
-	if (i915->pmu.enable == 0)
+	if (!READ_ONCE(i915->pmu.timer_enabled))
 		return HRTIMER_NORESTART;
 
 	engines_sample(i915);
@@ -317,12 +357,14 @@ static void i915_pmu_enable(struct perf_event *event)
 
 	spin_lock_irqsave(&i915->pmu.lock, flags);
 
-	if (i915->pmu.enable == 0)
+	i915->pmu.enable |= event_enabled_mask(event);
+
+	if (pmu_needs_timer(i915, true) && !i915->pmu.timer_enabled) {
 		hrtimer_start_range_ns(&i915->pmu.timer,
 				       ns_to_ktime(PERIOD), 0,
 				       HRTIMER_MODE_REL_PINNED);
-
-	i915->pmu.enable |= event_enabled_mask(event);
+		i915->pmu.timer_enabled = true;
+	}
 
 	if (is_engine_event(event)) {
 		struct intel_engine_cs *engine;
@@ -366,6 +408,7 @@ static void i915_pmu_disable(struct perf_event *event)
 	}
 
 	i915->pmu.enable &= ~mask;
+	i915->pmu.timer_enabled &= pmu_needs_timer(i915, true);
 
 	spin_unlock_irqrestore(&i915->pmu.lock, flags);
 
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [RFC 06/10] drm/i915: Wrap context schedule notification
  2017-08-02 12:32 [RFC v2 00/10] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (4 preceding siblings ...)
  2017-08-02 12:32 ` [RFC 05/10] drm/i915/pmu: Suspend sampling when GPU is idle Tvrtko Ursulin
@ 2017-08-02 12:32 ` Tvrtko Ursulin
  2017-08-02 12:32 ` [RFC 07/10] drm/i915: Engine busy time tracking Tvrtko Ursulin
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-08-02 12:32 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

No functional change, just something which will be handy in the
following patch.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b0738d2b2a7f..4c2cb07c39e2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -307,6 +307,18 @@ execlists_context_status_change(struct drm_i915_gem_request *rq,
 				   status, rq);
 }
 
+static inline void
+execlists_context_schedule_in(struct drm_i915_gem_request *rq)
+{
+	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
+}
+
+static inline void
+execlists_context_schedule_out(struct drm_i915_gem_request *rq)
+{
+	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
+}
+
 static void
 execlists_update_context_pdps(struct i915_hw_ppgtt *ppgtt, u32 *reg_state)
 {
@@ -352,7 +364,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 		if (rq) {
 			GEM_BUG_ON(count > !n);
 			if (!count++)
-				execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
+				execlists_context_schedule_in(rq);
 			port_set(&port[n], port_pack(rq, count));
 			desc = execlists_update_context(rq);
 			GEM_DEBUG_EXEC(port[n].context_id = upper_32_bits(desc));
@@ -603,8 +615,7 @@ static void intel_lrc_irq_handler(unsigned long data)
 			if (--count == 0) {
 				GEM_BUG_ON(status & GEN8_CTX_STATUS_PREEMPTED);
 				GEM_BUG_ON(!i915_gem_request_completed(rq));
-				execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
-
+				execlists_context_schedule_out(rq);
 				trace_i915_gem_request_out(rq);
 				i915_gem_request_put(rq);
 
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [RFC 07/10] drm/i915: Engine busy time tracking
  2017-08-02 12:32 [RFC v2 00/10] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (5 preceding siblings ...)
  2017-08-02 12:32 ` [RFC 06/10] drm/i915: Wrap context schedule notification Tvrtko Ursulin
@ 2017-08-02 12:32 ` Tvrtko Ursulin
  2017-08-02 12:32 ` [RFC 08/10] drm/i915: Export engine busy stats in debugfs Tvrtko Ursulin
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-08-02 12:32 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Track the total time requests have been executing on the hardware.

We add a new kernel API to allow software tracking of the time GPU
engines spend executing requests.

Both per-engine and global APIs are added, with the latter also
being exported for use by external users.

v2:
 * Squashed with the internal API.
 * Dropped static key.
 * Made per-engine.
 * Store time in monotonic ktime.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_engine_cs.c  | 103 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c        |   2 +
 drivers/gpu/drm/i915/intel_ringbuffer.h |  61 +++++++++++++++++++
 3 files changed, 166 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 14630612325b..e709f4edef90 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -232,6 +232,8 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
 	/* Nothing to do here, execute in order of dependencies */
 	engine->schedule = NULL;
 
+	spin_lock_init(&engine->stats.lock);
+
 	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
 
 	dev_priv->engine_class[info->class][info->instance] = engine;
@@ -1345,6 +1347,107 @@ void intel_engines_mark_idle(struct drm_i915_private *i915)
 	}
 }
 
+int intel_enable_engine_stats(struct intel_engine_cs *engine)
+{
+	unsigned long flags;
+
+	if (!i915.enable_execlists)
+		return -ENODEV;
+
+	spin_lock_irqsave(&engine->stats.lock, flags);
+	if (engine->stats.enabled++ == 0) {
+		engine->stats.ref = 0;
+		engine->stats.start = engine->stats.total = 0;
+	} else if (engine->stats.enabled == ~0) {
+		goto busy;
+	}
+	spin_unlock_irqrestore(&engine->stats.lock, flags);
+
+	return 0;
+
+busy:
+	spin_unlock_irqrestore(&engine->stats.lock, flags);
+
+	return -EBUSY;
+}
+
+void intel_disable_engine_stats(struct intel_engine_cs *engine)
+{
+	unsigned long flags;
+
+	if (!i915.enable_execlists)
+		return;
+
+	spin_lock_irqsave(&engine->stats.lock, flags);
+	WARN_ON_ONCE(engine->stats.enabled == 0);
+	engine->stats.enabled--;
+	spin_unlock_irqrestore(&engine->stats.lock, flags);
+}
+
+int intel_enable_engines_stats(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int ret = 0;
+
+	if (!i915.enable_execlists)
+		return -ENODEV;
+
+	for_each_engine(engine, dev_priv, id) {
+		ret = intel_enable_engine_stats(engine);
+		if (WARN_ON_ONCE(ret))
+			break;
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL(intel_enable_engines_stats);
+
+void intel_disable_engines_stats(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	for_each_engine(engine, dev_priv, id)
+		intel_disable_engine_stats(engine);
+}
+EXPORT_SYMBOL(intel_disable_engines_stats);
+
+ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine)
+{
+	ktime_t total;
+	unsigned long flags;
+
+	spin_lock_irqsave(&engine->stats.lock, flags);
+
+	total = engine->stats.total;
+
+	/*
+	 * If the engine is executing something at the moment
+	 * add it to the total.
+	 */
+	if (engine->stats.ref)
+		total = ktime_add(total,
+				  ktime_sub(ktime_get(), engine->stats.start));
+
+	spin_unlock_irqrestore(&engine->stats.lock, flags);
+
+	return total;
+}
+
+ktime_t intel_engines_get_busy_time(struct drm_i915_private *dev_priv)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	ktime_t total = 0;
+
+	for_each_engine(engine, dev_priv, id)
+		total = ktime_add(total, intel_engine_get_busy_time(engine));
+
+	return total;
+}
+EXPORT_SYMBOL(intel_engines_get_busy_time);
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/mock_engine.c"
 #endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4c2cb07c39e2..5de2fdb86e61 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -310,12 +310,14 @@ execlists_context_status_change(struct drm_i915_gem_request *rq,
 static inline void
 execlists_context_schedule_in(struct drm_i915_gem_request *rq)
 {
+	intel_engine_context_in(rq->engine);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
 }
 
 static inline void
 execlists_context_schedule_out(struct drm_i915_gem_request *rq)
 {
+	intel_engine_context_out(rq->engine);
 	execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 9fdf0cdf6220..68f50ec72be6 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -446,6 +446,14 @@ struct intel_engine_cs {
 	 * certain bits to encode the command length in the header).
 	 */
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
+
+	struct {
+		spinlock_t lock;
+		unsigned int enabled;
+		unsigned int ref;
+		ktime_t start; /* Timestamp of the last idle to active transition. */
+		ktime_t total; /* Total time the engine was busy. */
+	} stats;
 };
 
 static inline unsigned int
@@ -743,4 +751,57 @@ void intel_engines_reset_default_submission(struct drm_i915_private *i915);
 struct intel_engine_cs *
 intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
 
+static inline void intel_engine_context_in(struct intel_engine_cs *engine)
+{
+	unsigned long flags;
+
+	if (READ_ONCE(engine->stats.enabled) == 0)
+		return;
+
+	spin_lock_irqsave(&engine->stats.lock, flags);
+
+	if (engine->stats.enabled > 0) {
+		if (engine->stats.ref++ == 0)
+			engine->stats.start = ktime_get();
+		GEM_BUG_ON(engine->stats.ref == 0);
+	}
+
+	spin_unlock_irqrestore(&engine->stats.lock, flags);
+}
+
+static inline void intel_engine_context_out(struct intel_engine_cs *engine)
+{
+	unsigned long flags;
+
+	if (READ_ONCE(engine->stats.enabled) == 0)
+		return;
+
+	spin_lock_irqsave(&engine->stats.lock, flags);
+
+	if (engine->stats.enabled > 0) {
+		/*
+		 * After turning on engine stats, context out might be the
+		 * first event which then needs to be ignored (ref == 0).
+		 */
+		if (engine->stats.ref && --engine->stats.ref == 0) {
+			ktime_t last = ktime_sub(ktime_get(),
+						 engine->stats.start);
+
+			engine->stats.total = ktime_add(engine->stats.total,
+							last);
+		}
+	}
+
+	spin_unlock_irqrestore(&engine->stats.lock, flags);
+}
+
+int intel_enable_engine_stats(struct intel_engine_cs *engine);
+void intel_disable_engine_stats(struct intel_engine_cs *engine);
+
+int intel_enable_engines_stats(struct drm_i915_private *dev_priv);
+void intel_disable_engines_stats(struct drm_i915_private *dev_priv);
+
+ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine);
+ktime_t intel_engines_get_busy_time(struct drm_i915_private *dev_priv);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [RFC 08/10] drm/i915: Export engine busy stats in debugfs
  2017-08-02 12:32 [RFC v2 00/10] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (6 preceding siblings ...)
  2017-08-02 12:32 ` [RFC 07/10] drm/i915: Engine busy time tracking Tvrtko Ursulin
@ 2017-08-02 12:32 ` Tvrtko Ursulin
  2017-08-02 12:32 ` [RFC 09/10] drm/i915/pmu: Wire up engine busy stats to PMU Tvrtko Ursulin
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-08-02 12:32 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Export the stats added in the previous patch in debugfs.

The number of active clients reading this data is tracked, and engine
stats collection is only enabled whilst there are some.

Userspace is intended to keep the file descriptor open, seeking
to the beginning of the file periodically, and re-reading the
stats.

This is because the underlying implementation is costly on every
first open/last close transition, and also because the exported
stats mostly make sense when considered relative to the previous
sample.

The file lists, for each engine, the number of nanoseconds it has
been active since tracking started.

v2: Rebase.
v3: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 92 +++++++++++++++++++++++++++++++++++++
 1 file changed, 92 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2811229f6fb7..ad1a59776eca 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4763,6 +4763,92 @@ static const struct file_operations i915_hpd_storm_ctl_fops = {
 	.write = i915_hpd_storm_ctl_write
 };
 
+struct i915_engine_stats_buf {
+	unsigned int len;
+	size_t available;
+	char buf[0];
+};
+
+static int i915_engine_stats_open(struct inode *inode, struct file *file)
+{
+	struct drm_i915_private *i915 = file->f_inode->i_private;
+	const unsigned int engine_name_len =
+		sizeof(((struct intel_engine_cs *)0)->name);
+	struct i915_engine_stats_buf *buf;
+	const unsigned int buf_size =
+		(engine_name_len + 2 + 19 + 1) * I915_NUM_ENGINES + 1 +
+		sizeof(*buf);
+	int ret;
+
+	buf = kzalloc(buf_size, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	ret = intel_enable_engines_stats(i915);
+	if (ret) {
+		kfree(buf);
+		return ret;
+	}
+
+	buf->len = buf_size;
+	file->private_data = buf;
+
+	return 0;
+}
+
+static int i915_engine_stats_release(struct inode *inode, struct file *file)
+{
+	struct drm_i915_private *i915 = file->f_inode->i_private;
+
+	intel_disable_engines_stats(i915);
+	kfree(file->private_data);
+
+	return 0;
+}
+
+static ssize_t i915_engine_stats_read(struct file *file, char __user *ubuf,
+				      size_t count, loff_t *pos)
+{
+	struct i915_engine_stats_buf *buf =
+		(struct i915_engine_stats_buf *)file->private_data;
+
+	if (*pos == 0) {
+		struct drm_i915_private *dev_priv = file->f_inode->i_private;
+		char *ptr = &buf->buf[0];
+		int left = buf->len;
+		struct intel_engine_cs *engine;
+		enum intel_engine_id id;
+
+		buf->available = 0;
+
+		for_each_engine(engine, dev_priv, id) {
+			u64 total =
+			       ktime_to_ns(intel_engine_get_busy_time(engine));
+			int len;
+
+			len = snprintf(ptr, left, "%s: %llu\n",
+				       engine->name, total);
+			buf->available += len;
+			left -= len;
+			ptr += len;
+
+			if (len == 0)
+				return -EFBIG;
+		}
+	}
+
+	return simple_read_from_buffer(ubuf, count, pos, &buf->buf[0],
+				       buf->available);
+}
+
+static const struct file_operations i915_engine_stats_fops = {
+	.owner = THIS_MODULE,
+	.open = i915_engine_stats_open,
+	.release = i915_engine_stats_release,
+	.read = i915_engine_stats_read,
+	.llseek = default_llseek,
+};
+
 static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_capabilities", i915_capabilities, 0},
 	{"i915_gem_objects", i915_gem_object_info, 0},
@@ -4852,6 +4938,12 @@ int i915_debugfs_register(struct drm_i915_private *dev_priv)
 	struct dentry *ent;
 	int ret, i;
 
+	ent = debugfs_create_file("i915_engine_stats", S_IRUGO,
+				  minor->debugfs_root, to_i915(minor->dev),
+				  &i915_engine_stats_fops);
+	if (!ent)
+		return -ENOMEM;
+
 	ent = debugfs_create_file("i915_forcewake_user", S_IRUSR,
 				  minor->debugfs_root, to_i915(minor->dev),
 				  &i915_forcewake_fops);
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [RFC 09/10] drm/i915/pmu: Wire up engine busy stats to PMU
  2017-08-02 12:32 [RFC v2 00/10] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (7 preceding siblings ...)
  2017-08-02 12:32 ` [RFC 08/10] drm/i915: Export engine busy stats in debugfs Tvrtko Ursulin
@ 2017-08-02 12:32 ` Tvrtko Ursulin
  2017-08-02 12:39 ` [RFC 10/10] drm/i915: Gate engine stats collection with a static key Tvrtko Ursulin
  2017-08-02 12:56 ` ✗ Fi.CI.BAT: warning for i915 PMU and engine busy stats (rev2) Patchwork
  10 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-08-02 12:32 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We can use engine busy stats instead of the MMIO sampling timer
for better efficiency.

At a minimum this saves period * num_engines / sec MMIO reads,
and in the best case, when only engine busy samplers are active,
it allows us to not start the sampling timer at all.

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_pmu.c         | 86 ++++++++++++++++++++++++++++-----
 drivers/gpu/drm/i915/intel_ringbuffer.h |  3 ++
 2 files changed, 77 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 0d9c0d07a432..3272ec0763bf 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -59,6 +59,11 @@ static u64 event_enabled_mask(struct perf_event *event)
 	return config_enabled_mask(event->attr.config);
 }
 
+static bool supports_busy_stats(void)
+{
+	return i915.enable_execlists;
+}
+
 static bool pmu_needs_timer(struct drm_i915_private *i915, bool gpu_active)
 {
 	u64 enable = i915->pmu.enable;
@@ -69,6 +74,8 @@ static bool pmu_needs_timer(struct drm_i915_private *i915, bool gpu_active)
 
 	if (!gpu_active)
 		enable &= ~ENGINE_SAMPLE_MASK;
+	else if (supports_busy_stats())
+		enable &= ~BIT(I915_SAMPLE_BUSY);
 
 	return enable;
 }
@@ -132,7 +139,8 @@ static void engines_sample(struct drm_i915_private *dev_priv)
 		if (enable & BIT(I915_SAMPLE_QUEUED))
 			engine->pmu.sample[I915_SAMPLE_QUEUED] += PERIOD;
 
-		if (enable & BIT(I915_SAMPLE_BUSY)) {
+		if ((enable & BIT(I915_SAMPLE_BUSY)) &&
+		    !engine->pmu.busy_stats) {
 			u32 val;
 
 			fw = grab_forcewake(dev_priv, fw);
@@ -349,14 +357,29 @@ static void i915_pmu_timer_cancel(struct perf_event *event)
 	hrtimer_cancel(&hwc->hrtimer);
 }
 
+static bool engine_needs_busy_stats(struct intel_engine_cs *engine)
+{
+	return supports_busy_stats() &&
+	       (engine->pmu.enable & BIT(I915_SAMPLE_BUSY));
+}
+
 static void i915_pmu_enable(struct perf_event *event)
 {
 	struct drm_i915_private *i915 =
 		container_of(event->pmu, typeof(*i915), pmu.base);
+	struct intel_engine_cs *engine = NULL;
 	unsigned long flags;
 
 	spin_lock_irqsave(&i915->pmu.lock, flags);
 
+	if (is_engine_event(event)) {
+		engine = intel_engine_lookup_user(i915,
+						  engine_event_class(event),
+						  engine_event_instance(event));
+		GEM_BUG_ON(!engine);
+		engine->pmu.enable |= BIT(engine_event_sample(event));
+	}
+
 	i915->pmu.enable |= event_enabled_mask(event);
 
 	if (pmu_needs_timer(i915, true) && !i915->pmu.timer_enabled) {
@@ -364,16 +387,11 @@ static void i915_pmu_enable(struct perf_event *event)
 				       ns_to_ktime(PERIOD), 0,
 				       HRTIMER_MODE_REL_PINNED);
 		i915->pmu.timer_enabled = true;
-	}
-
-	if (is_engine_event(event)) {
-		struct intel_engine_cs *engine;
-
-		engine = intel_engine_lookup_user(i915,
-						  engine_event_class(event),
-						  engine_event_instance(event));
-		GEM_BUG_ON(!engine);
-		engine->pmu.enable |= BIT(engine_event_sample(event));
+	} else if (is_engine_event(event) && engine_needs_busy_stats(engine) &&
+		   !engine->pmu.busy_stats) {
+		engine->pmu.busy_stats = true;
+		if (!cancel_delayed_work(&engine->pmu.disable_busy_stats))
+			queue_work(i915->wq, &engine->pmu.enable_busy_stats);
 	}
 
 	spin_unlock_irqrestore(&i915->pmu.lock, flags);
@@ -399,10 +417,17 @@ static void i915_pmu_disable(struct perf_event *event)
 						  engine_event_instance(event));
 		GEM_BUG_ON(!engine);
 		engine->pmu.enable &= ~BIT(engine_event_sample(event));
+		if (engine->pmu.busy_stats &&
+		    !engine_needs_busy_stats(engine)) {
+			engine->pmu.busy_stats = false;
+			queue_delayed_work(i915->wq,
+					   &engine->pmu.disable_busy_stats,
+					   round_jiffies_up_relative(2 * HZ));
+		}
 		mask = 0;
 		for_each_engine(engine, i915, id)
 			mask |= engine->pmu.enable;
-		mask = ~mask;
+		mask = (~mask) & ENGINE_SAMPLE_MASK;
 	} else {
 		mask = event_enabled_mask(event);
 	}
@@ -474,6 +499,9 @@ static void i915_pmu_event_read(struct perf_event *event)
 
 		if (WARN_ON_ONCE(!engine)) {
 			/* Do nothing */
+		} else if (sample == I915_SAMPLE_BUSY &&
+			   engine->pmu.busy_stats) {
+			val = ktime_to_ns(intel_engine_get_busy_time(engine));
 		} else {
 			val = engine->pmu.sample[sample];
 		}
@@ -634,8 +662,27 @@ static const struct attribute_group *i915_pmu_attr_groups[] = {
         NULL
 };
 
+static void __enable_busy_stats(struct work_struct *work)
+{
+	struct intel_engine_cs *engine =
+		container_of(work, typeof(*engine), pmu.enable_busy_stats);
+
+	WARN_ON_ONCE(intel_enable_engine_stats(engine));
+}
+
+static void __disable_busy_stats(struct work_struct *work)
+{
+	struct intel_engine_cs *engine =
+	       container_of(work, typeof(*engine), pmu.disable_busy_stats.work);
+
+	intel_disable_engine_stats(engine);
+}
+
 void i915_pmu_register(struct drm_i915_private *i915)
 {
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
 	if (INTEL_GEN(i915) <= 2)
 		return;
 
@@ -651,6 +698,13 @@ void i915_pmu_register(struct drm_i915_private *i915)
 
 	spin_lock_init(&i915->pmu.lock);
 	hrtimer_init(&i915->pmu.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+
+	for_each_engine(engine, i915, id) {
+		INIT_WORK(&engine->pmu.enable_busy_stats, __enable_busy_stats);
+		INIT_DELAYED_WORK(&engine->pmu.disable_busy_stats,
+			  __disable_busy_stats);
+	}
+
 	i915->pmu.timer.function = i915_sample;
 	i915->pmu.enable = 0;
 
@@ -660,6 +714,9 @@ void i915_pmu_register(struct drm_i915_private *i915)
 
 void i915_pmu_unregister(struct drm_i915_private *i915)
 {
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
 	if (!i915->pmu.base.event_init)
 		return;
 
@@ -669,4 +726,9 @@ void i915_pmu_unregister(struct drm_i915_private *i915)
 	i915->pmu.base.event_init = NULL;
 
 	hrtimer_cancel(&i915->pmu.timer);
+
+	for_each_engine(engine, i915, id) {
+		flush_work(&engine->pmu.enable_busy_stats);
+		flush_delayed_work(&engine->pmu.disable_busy_stats);
+	}
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 68f50ec72be6..fd5d838ca7b5 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -248,6 +248,9 @@ struct intel_engine_cs {
 	struct {
 		u32 enable;
 		u64 sample[4];
+		bool busy_stats;
+		struct work_struct enable_busy_stats;
+		struct delayed_work disable_busy_stats;
 	} pmu;
 
 	/*
-- 
2.9.4


* [RFC 10/10] drm/i915: Gate engine stats collection with a static key
  2017-08-02 12:32 [RFC v2 00/10] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (8 preceding siblings ...)
  2017-08-02 12:32 ` [RFC 09/10] drm/i915/pmu: Wire up engine busy stats to PMU Tvrtko Ursulin
@ 2017-08-02 12:39 ` Tvrtko Ursulin
  2017-08-02 12:56 ` ✗ Fi.CI.BAT: warning for i915 PMU and engine busy stats (rev2) Patchwork
  10 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-08-02 12:39 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

This reduces the cost of the software engine busyness tracking
to a single no-op instruction when there are no listeners.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_engine_cs.c  | 17 +++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.h | 54 ++++++++++++++++++---------------
 2 files changed, 47 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index e709f4edef90..fb61df3f770c 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -21,6 +21,7 @@
  * IN THE SOFTWARE.
  *
  */
+#include <linux/static_key.h>
 
 #include "i915_drv.h"
 #include "intel_ringbuffer.h"
@@ -1347,6 +1348,10 @@ void intel_engines_mark_idle(struct drm_i915_private *i915)
 	}
 }
 
+DEFINE_STATIC_KEY_FALSE(i915_engine_stats_key);
+static DEFINE_MUTEX(i915_engine_stats_mutex);
+static int i915_engine_stats_ref;
+
 int intel_enable_engine_stats(struct intel_engine_cs *engine)
 {
 	unsigned long flags;
@@ -1354,6 +1359,8 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
 	if (!i915.enable_execlists)
 		return -ENODEV;
 
+	mutex_lock(&i915_engine_stats_mutex);
+
 	spin_lock_irqsave(&engine->stats.lock, flags);
 	if (engine->stats.enabled++ == 0) {
 		engine->stats.ref = 0;
@@ -1363,10 +1370,16 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
 	}
 	spin_unlock_irqrestore(&engine->stats.lock, flags);
 
+	if (i915_engine_stats_ref++ == 0)
+		static_branch_enable(&i915_engine_stats_key);
+
+	mutex_unlock(&i915_engine_stats_mutex);
+
 	return 0;
 
 busy:
 	spin_unlock_irqrestore(&engine->stats.lock, flags);
+	mutex_unlock(&i915_engine_stats_mutex);
 
 	return -EBUSY;
 }
@@ -1378,10 +1391,14 @@ void intel_disable_engine_stats(struct intel_engine_cs *engine)
 	if (!i915.enable_execlists)
 		return;
 
+	mutex_lock(&i915_engine_stats_mutex);
 	spin_lock_irqsave(&engine->stats.lock, flags);
 	WARN_ON_ONCE(engine->stats.enabled == 0);
 	engine->stats.enabled--;
 	spin_unlock_irqrestore(&engine->stats.lock, flags);
+	if (--i915_engine_stats_ref == 0)
+		static_branch_disable(&i915_engine_stats_key);
+	mutex_unlock(&i915_engine_stats_mutex);
 }
 
 int intel_enable_engines_stats(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index fd5d838ca7b5..74add0a18e7d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -754,48 +754,54 @@ void intel_engines_reset_default_submission(struct drm_i915_private *i915);
 struct intel_engine_cs *
 intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
 
+DECLARE_STATIC_KEY_FALSE(i915_engine_stats_key);
+
 static inline void intel_engine_context_in(struct intel_engine_cs *engine)
 {
 	unsigned long flags;
 
-	if (READ_ONCE(engine->stats.enabled) == 0)
-		return;
+	if (static_branch_unlikely(&i915_engine_stats_key)) {
+		if (READ_ONCE(engine->stats.enabled) == 0)
+			return;
 
-	spin_lock_irqsave(&engine->stats.lock, flags);
+		spin_lock_irqsave(&engine->stats.lock, flags);
 
-	if (engine->stats.enabled > 0) {
-		if (engine->stats.ref++ == 0)
-			engine->stats.start = ktime_get();
-		GEM_BUG_ON(engine->stats.ref == 0);
-	}
+		if (engine->stats.enabled > 0) {
+			if (engine->stats.ref++ == 0)
+				engine->stats.start = ktime_get();
+			GEM_BUG_ON(engine->stats.ref == 0);
+		}
 
-	spin_unlock_irqrestore(&engine->stats.lock, flags);
+		spin_unlock_irqrestore(&engine->stats.lock, flags);
+	}
 }
 
 static inline void intel_engine_context_out(struct intel_engine_cs *engine)
 {
 	unsigned long flags;
 
-	if (READ_ONCE(engine->stats.enabled) == 0)
-		return;
+	if (static_branch_unlikely(&i915_engine_stats_key)) {
+		if (READ_ONCE(engine->stats.enabled) == 0)
+			return;
 
-	spin_lock_irqsave(&engine->stats.lock, flags);
+		spin_lock_irqsave(&engine->stats.lock, flags);
 
-	if (engine->stats.enabled > 0) {
-		/*
-		 * After turning on engine stats, context out might be the
-		 * first event which then needs to be ignored (ref == 0).
-		 */
-		if (engine->stats.ref && --engine->stats.ref == 0) {
-			ktime_t last = ktime_sub(ktime_get(),
-						 engine->stats.start);
+		if (engine->stats.enabled > 0) {
+			/*
+			 * After turning on engine stats, context out might be
+			 * the first event which then needs to be ignored.
+			 */
+			if (engine->stats.ref && --engine->stats.ref == 0) {
+				ktime_t last = ktime_sub(ktime_get(),
+							 engine->stats.start);
 
-			engine->stats.total = ktime_add(engine->stats.total,
-							last);
+				engine->stats.total =
+					ktime_add(engine->stats.total, last);
+			}
 		}
-	}
 
-	spin_unlock_irqrestore(&engine->stats.lock, flags);
+		spin_unlock_irqrestore(&engine->stats.lock, flags);
+	}
 }
 
 int intel_enable_engine_stats(struct intel_engine_cs *engine);
-- 
2.9.4


* ✗ Fi.CI.BAT: warning for i915 PMU and engine busy stats (rev2)
  2017-08-02 12:32 [RFC v2 00/10] i915 PMU and engine busy stats Tvrtko Ursulin
                   ` (9 preceding siblings ...)
  2017-08-02 12:39 ` [RFC 10/10] drm/i915: Gate engine stats collection with a static key Tvrtko Ursulin
@ 2017-08-02 12:56 ` Patchwork
  10 siblings, 0 replies; 32+ messages in thread
From: Patchwork @ 2017-08-02 12:56 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: i915 PMU and engine busy stats (rev2)
URL   : https://patchwork.freedesktop.org/series/27488/
State : warning

== Summary ==

Series 27488v2 i915 PMU and engine busy stats
https://patchwork.freedesktop.org/api/1.0/series/27488/revisions/2/mbox/

Test kms_busy:
        Subgroup basic-flip-default-a:
                pass       -> DMESG-WARN (fi-skl-6700hq) fdo#101144
        Subgroup basic-flip-default-b:
                pass       -> FAIL       (fi-skl-6700hq) fdo#101518 +31
Test kms_flip:
        Subgroup basic-flip-vs-modeset:
                pass       -> SKIP       (fi-skl-6700hq)
        Subgroup basic-flip-vs-wf_vblank:
                pass       -> SKIP       (fi-skl-6700hq) fdo#99739
        Subgroup basic-plain-flip:
                pass       -> SKIP       (fi-skl-6700hq)
Test kms_frontbuffer_tracking:
        Subgroup basic:
                pass       -> SKIP       (fi-skl-6700hq)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                dmesg-warn -> PASS       (fi-byt-n2820) fdo#101705
Test pm_rpm:
        Subgroup basic-pci-d3-state:
                pass       -> SKIP       (fi-skl-6700hq)
        Subgroup basic-rte:
                pass       -> SKIP       (fi-skl-6700hq)

fdo#101144 https://bugs.freedesktop.org/show_bug.cgi?id=101144
fdo#101518 https://bugs.freedesktop.org/show_bug.cgi?id=101518
fdo#99739 https://bugs.freedesktop.org/show_bug.cgi?id=99739
fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705

fi-bdw-5557u     total:280  pass:269  dwarn:0   dfail:0   fail:0   skip:11  time:454s
fi-bdw-gvtdvm    total:280  pass:266  dwarn:0   dfail:0   fail:0   skip:14  time:430s
fi-blb-e6850     total:280  pass:225  dwarn:1   dfail:0   fail:0   skip:54  time:355s
fi-bsw-n3050     total:280  pass:244  dwarn:0   dfail:0   fail:0   skip:36  time:542s
fi-bxt-j4205     total:280  pass:261  dwarn:0   dfail:0   fail:0   skip:19  time:510s
fi-byt-n2820     total:280  pass:252  dwarn:0   dfail:0   fail:0   skip:28  time:506s
fi-glk-2a        total:280  pass:261  dwarn:0   dfail:0   fail:0   skip:19  time:610s
fi-hsw-4770      total:280  pass:264  dwarn:0   dfail:0   fail:0   skip:16  time:434s
fi-hsw-4770r     total:280  pass:264  dwarn:0   dfail:0   fail:0   skip:16  time:419s
fi-ilk-650       total:280  pass:230  dwarn:0   dfail:0   fail:0   skip:50  time:412s
fi-ivb-3520m     total:280  pass:262  dwarn:0   dfail:0   fail:0   skip:18  time:490s
fi-ivb-3770      total:280  pass:262  dwarn:0   dfail:0   fail:0   skip:18  time:477s
fi-kbl-7500u     total:280  pass:262  dwarn:0   dfail:0   fail:0   skip:18  time:476s
fi-kbl-7560u     total:280  pass:270  dwarn:0   dfail:0   fail:0   skip:10  time:574s
fi-kbl-r         total:280  pass:262  dwarn:0   dfail:0   fail:0   skip:18  time:591s
fi-pnv-d510      total:280  pass:224  dwarn:1   dfail:0   fail:0   skip:55  time:575s
fi-skl-6260u     total:280  pass:270  dwarn:0   dfail:0   fail:0   skip:10  time:457s
fi-skl-6700hq    total:280  pass:224  dwarn:1   dfail:0   fail:30  skip:24  time:321s
fi-skl-6700k     total:280  pass:262  dwarn:0   dfail:0   fail:0   skip:18  time:472s
fi-skl-6770hq    total:280  pass:270  dwarn:0   dfail:0   fail:0   skip:10  time:480s
fi-skl-gvtdvm    total:280  pass:267  dwarn:0   dfail:0   fail:0   skip:13  time:435s
fi-skl-x1585l    total:280  pass:269  dwarn:0   dfail:0   fail:0   skip:11  time:474s
fi-snb-2520m     total:280  pass:252  dwarn:0   dfail:0   fail:0   skip:28  time:548s
fi-snb-2600      total:280  pass:251  dwarn:0   dfail:0   fail:0   skip:29  time:411s

fcb630a80579faf6d12ee62cb49bd7a4acff41e6 drm-tip: 2017y-08m-01d-17h-14m-57s UTC integration manifest
c45ea97dde7f drm/i915: Convert intel_rc6_residency_us to ns

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5311/

* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-02 12:32 ` [RFC 04/10] drm/i915: Expose a PMU interface for perf queries Tvrtko Ursulin
@ 2017-08-02 13:00   ` Tvrtko Ursulin
  2017-08-12  2:15     ` Rogozhkin, Dmitry V
  0 siblings, 1 reply; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-08-02 13:00 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx, Peter Zijlstra


Hi Peter,

We are trying to expose some software PMU events from the i915 driver. 
Things like GPU engine busyness, power usage and similar.

There are quite a few opens regarding usage of the perf API and so on. 
Would you be able to snoop around the below patch and help with 
answering some questions?

Current opens would be something like:

Are the metrics we are exposing meaningful/appropriate as software PMU?

Do we need / can we get anything useful by enumerating our events in 
sysfs? For example can we have "perf stat" or similar do something 
useful with them today? Or should we just not expose them in sysfs and 
only use them from tools which will understand them?

Also, once I exported the events for enumeration and tried accessing 
them with perf, I had to create fake pt_regs to pass to 
perf_event_overflow. Is that acceptable?

Can we control sampling frequency as requested by the perf API? We 
mostly do not need rapid polling on these counters.

How is our usage of the perf API in general?

Regards,

Tvrtko

On 02/08/2017 13:32, Tvrtko Ursulin wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> The first goal is to be able to measure GPU (and individual ring) busyness
> without having to poll registers from userspace. (Which not only incurs
> holding the forcewake lock indefinitely, perturbing the system, but also
> runs the risk of hanging the machine.) As an alternative we can use the
> perf event counter interface to sample the ring registers periodically
> and send those results to userspace.
> 
> To be able to do so, we need to export the two symbols from
> kernel/events/core.c to register and unregister a PMU device.
> 
> v2: Use a common timer for the ring sampling.
> 
> v3:
>   * Decouple uAPI from i915 engine ids.
>   * Complete uAPI defines.
>   * Refactor some code to helpers for clarity.
>   * Skip sampling disabled engines.
>   * Expose counters in sysfs.
>   * Pass in fake regs to avoid null ptr deref in perf core.
>   * Convert to class/instance uAPI.
>   * Use shared driver code for rc6 residency, power and frequency.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/Makefile           |   1 +
>   drivers/gpu/drm/i915/i915_drv.c         |   2 +
>   drivers/gpu/drm/i915/i915_drv.h         |  25 ++
>   drivers/gpu/drm/i915/i915_pmu.c         | 629 ++++++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/i915_reg.h         |   3 +
>   drivers/gpu/drm/i915/intel_engine_cs.c  |  10 +
>   drivers/gpu/drm/i915/intel_ringbuffer.c |  25 ++
>   drivers/gpu/drm/i915/intel_ringbuffer.h |   8 +
>   include/uapi/drm/i915_drm.h             |  55 +++
>   kernel/events/core.c                    |   1 +
>   10 files changed, 759 insertions(+)
>   create mode 100644 drivers/gpu/drm/i915/i915_pmu.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index f8227318dcaf..1c720013dc42 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -26,6 +26,7 @@ i915-y := i915_drv.o \
>   
>   i915-$(CONFIG_COMPAT)   += i915_ioc32.o
>   i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o intel_pipe_crc.o
> +i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
>   
>   # GEM code
>   i915-y += i915_cmd_parser.o \
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 214555e813f1..d75c2d790875 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1194,6 +1194,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
>   	struct drm_device *dev = &dev_priv->drm;
>   
>   	i915_gem_shrinker_init(dev_priv);
> +	i915_pmu_register(dev_priv);
>   
>   	/*
>   	 * Notify a valid surface after modesetting,
> @@ -1248,6 +1249,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
>   	intel_opregion_unregister(dev_priv);
>   
>   	i915_perf_unregister(dev_priv);
> +	i915_pmu_unregister(dev_priv);
>   
>   	i915_teardown_sysfs(dev_priv);
>   	i915_guc_log_unregister(dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 0c8bd1cdcbbd..142826742b86 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -40,6 +40,7 @@
>   #include <linux/hash.h>
>   #include <linux/intel-iommu.h>
>   #include <linux/kref.h>
> +#include <linux/perf_event.h>
>   #include <linux/pm_qos.h>
>   #include <linux/reservation.h>
>   #include <linux/shmem_fs.h>
> @@ -2111,6 +2112,12 @@ struct intel_cdclk_state {
>   	unsigned int cdclk, vco, ref;
>   };
>   
> +enum {
> +	__I915_SAMPLE_FREQ_ACT = 0,
> +	__I915_SAMPLE_FREQ_REQ,
> +	__I915_NUM_PMU_SAMPLERS
> +};
> +
>   struct drm_i915_private {
>   	struct drm_device drm;
>   
> @@ -2158,6 +2165,7 @@ struct drm_i915_private {
>   	struct pci_dev *bridge_dev;
>   	struct i915_gem_context *kernel_context;
>   	struct intel_engine_cs *engine[I915_NUM_ENGINES];
> +	struct intel_engine_cs *engine_class[MAX_ENGINE_CLASS + 1][MAX_ENGINE_INSTANCE + 1];
>   	struct i915_vma *semaphore;
>   
>   	struct drm_dma_handle *status_page_dmah;
> @@ -2605,6 +2613,14 @@ struct drm_i915_private {
>   		int	irq;
>   	} lpe_audio;
>   
> +	struct {
> +		struct pmu base;
> +		spinlock_t lock;
> +		struct hrtimer timer;
> +		u64 enable;
> +		u64 sample[__I915_NUM_PMU_SAMPLERS];
> +	} pmu;
> +
>   	/*
>   	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
>   	 * will be rejected. Instead look for a better place.
> @@ -3813,6 +3829,15 @@ extern void i915_perf_fini(struct drm_i915_private *dev_priv);
>   extern void i915_perf_register(struct drm_i915_private *dev_priv);
>   extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
>   
> +/* i915_pmu.c */
> +#ifdef CONFIG_PERF_EVENTS
> +extern void i915_pmu_register(struct drm_i915_private *i915);
> +extern void i915_pmu_unregister(struct drm_i915_private *i915);
> +#else
> +static inline void i915_pmu_register(struct drm_i915_private *i915) {}
> +static inline void i915_pmu_unregister(struct drm_i915_private *i915) {}
> +#endif
> +
>   /* i915_suspend.c */
>   extern int i915_save_state(struct drm_i915_private *dev_priv);
>   extern int i915_restore_state(struct drm_i915_private *dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> new file mode 100644
> index 000000000000..62c527c12641
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_pmu.c
> @@ -0,0 +1,629 @@
> +#include <linux/perf_event.h>
> +#include <linux/pm_runtime.h>
> +
> +#include "i915_drv.h"
> +#include "intel_ringbuffer.h"
> +
> +#define FREQUENCY 200
> +#define PERIOD max_t(u64, 10000, NSEC_PER_SEC / FREQUENCY)
> +
> +#define ENGINE_SAMPLE_MASK \
> +	(BIT(I915_SAMPLE_QUEUED) | \
> +	 BIT(I915_SAMPLE_BUSY) | \
> +	 BIT(I915_SAMPLE_WAIT) | \
> +	 BIT(I915_SAMPLE_SEMA))
> +
> +#define ENGINE_SAMPLE_BITS (16)
> +
> +static u8 engine_config_sample(u64 config)
> +{
> +	return config & I915_PMU_SAMPLE_MASK;
> +}
> +
> +static u8 engine_event_sample(struct perf_event *event)
> +{
> +	return engine_config_sample(event->attr.config);
> +}
> +
> +static u8 engine_event_class(struct perf_event *event)
> +{
> +	return (event->attr.config >> I915_PMU_CLASS_SHIFT) & 0xff;
> +}
> +
> +static u8 engine_event_instance(struct perf_event *event)
> +{
> +	return (event->attr.config >> I915_PMU_SAMPLE_BITS) & 0xff;
> +}
> +
> +static bool is_engine_config(u64 config)
> +{
> +	return config < __I915_PMU_OTHER(0);
> +}
> +
> +static u64 config_enabled_mask(u64 config)
> +{
> +	if (is_engine_config(config))
> +		return BIT_ULL(engine_config_sample(config));
> +	else
> +		return BIT_ULL(config - __I915_PMU_OTHER(0)) <<
> +		       ENGINE_SAMPLE_BITS;
> +}
> +
> +static bool is_engine_event(struct perf_event *event)
> +{
> +	return is_engine_config(event->attr.config);
> +}
> +
> +static u64 event_enabled_mask(struct perf_event *event)
> +{
> +	return config_enabled_mask(event->attr.config);
> +}
> +
> +static bool grab_forcewake(struct drm_i915_private *i915, bool fw)
> +{
> +	if (!fw)
> +		intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
> +
> +	return true;
> +}
> +
> +static void engines_sample(struct drm_i915_private *dev_priv)
> +{
> +	struct intel_engine_cs *engine;
> +	enum intel_engine_id id;
> +	bool fw = false;
> +
> +	if ((dev_priv->pmu.enable & ENGINE_SAMPLE_MASK) == 0)
> +		return;
> +
> +	if (!dev_priv->gt.awake)
> +		return;
> +
> +	if (!intel_runtime_pm_get_if_in_use(dev_priv))
> +		return;
> +
> +	for_each_engine(engine, dev_priv, id) {
> +		u32 enable = engine->pmu.enable;
> +
> +		if (i915_seqno_passed(intel_engine_get_seqno(engine),
> +				      intel_engine_last_submit(engine)))
> +			continue;
> +
> +		if (enable & BIT(I915_SAMPLE_QUEUED))
> +			engine->pmu.sample[I915_SAMPLE_QUEUED] += PERIOD;
> +
> +		if (enable & BIT(I915_SAMPLE_BUSY)) {
> +			u32 val;
> +
> +			fw = grab_forcewake(dev_priv, fw);
> +			val = I915_READ_FW(RING_MI_MODE(engine->mmio_base));
> +			if (!(val & MODE_IDLE))
> +				engine->pmu.sample[I915_SAMPLE_BUSY] += PERIOD;
> +		}
> +
> +		if (enable & (BIT(I915_SAMPLE_WAIT) | BIT(I915_SAMPLE_SEMA))) {
> +			u32 val;
> +
> +			fw = grab_forcewake(dev_priv, fw);
> +			val = I915_READ_FW(RING_CTL(engine->mmio_base));
> +			if ((enable & BIT(I915_SAMPLE_WAIT)) &&
> +			    (val & RING_WAIT))
> +				engine->pmu.sample[I915_SAMPLE_WAIT] += PERIOD;
> +			if ((enable & BIT(I915_SAMPLE_SEMA)) &&
> +			    (val & RING_WAIT_SEMAPHORE))
> +				engine->pmu.sample[I915_SAMPLE_SEMA] += PERIOD;
> +		}
> +	}
> +
> +	if (fw)
> +		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> +	intel_runtime_pm_put(dev_priv);
> +}
> +
> +static void frequency_sample(struct drm_i915_private *dev_priv)
> +{
> +	if (dev_priv->pmu.enable &
> +	    config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY)) {
> +		u64 val;
> +
> +		val = dev_priv->rps.cur_freq;
> +		if (dev_priv->gt.awake &&
> +		    intel_runtime_pm_get_if_in_use(dev_priv)) {
> +			val = intel_get_cagf(dev_priv,
> +					     I915_READ_NOTRACE(GEN6_RPSTAT1));
> +			intel_runtime_pm_put(dev_priv);
> +		}
> +		val = intel_gpu_freq(dev_priv, val);
> +		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_ACT] += val * PERIOD;
> +	}
> +
> +	if (dev_priv->pmu.enable &
> +	    config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY)) {
> +		u64 val = intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq);
> +		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_REQ] += val * PERIOD;
> +	}
> +}
> +
> +static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(hrtimer, struct drm_i915_private, pmu.timer);
> +
> +	if (i915->pmu.enable == 0)
> +		return HRTIMER_NORESTART;
> +
> +	engines_sample(i915);
> +	frequency_sample(i915);
> +
> +	hrtimer_forward_now(hrtimer, ns_to_ktime(PERIOD));
> +	return HRTIMER_RESTART;
> +}
> +
> +static void i915_pmu_event_destroy(struct perf_event *event)
> +{
> +	WARN_ON(event->parent);
> +}
> +
> +static int engine_event_init(struct perf_event *event)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(event->pmu, typeof(*i915), pmu.base);
> +
> +	if (!intel_engine_lookup_user(i915, engine_event_class(event),
> +				      engine_event_instance(event)))
> +		return -ENODEV;
> +
> +	switch (engine_event_sample(event)) {
> +	case I915_SAMPLE_QUEUED:
> +	case I915_SAMPLE_BUSY:
> +	case I915_SAMPLE_WAIT:
> +		break;
> +	case I915_SAMPLE_SEMA:
> +		if (INTEL_GEN(i915) < 6)
> +			return -ENODEV;
> +		break;
> +	default:
> +		return -ENOENT;
> +	}
> +
> +	return 0;
> +}
> +
> +static DEFINE_PER_CPU(struct pt_regs, i915_pmu_pt_regs);
> +
> +static enum hrtimer_restart hrtimer_sample(struct hrtimer *hrtimer)
> +{
> +	struct pt_regs *regs = this_cpu_ptr(&i915_pmu_pt_regs);
> +	struct perf_sample_data data;
> +	struct perf_event *event;
> +	u64 period;
> +
> +	event = container_of(hrtimer, struct perf_event, hw.hrtimer);
> +	if (event->state != PERF_EVENT_STATE_ACTIVE)
> +		return HRTIMER_NORESTART;
> +
> +	event->pmu->read(event);
> +
> +	perf_sample_data_init(&data, 0, event->hw.last_period);
> +	perf_event_overflow(event, &data, regs);
> +
> +	period = max_t(u64, 10000, event->hw.sample_period);
> +	hrtimer_forward_now(hrtimer, ns_to_ktime(period));
> +	return HRTIMER_RESTART;
> +}
> +
> +static void init_hrtimer(struct perf_event *event)
> +{
> +	struct hw_perf_event *hwc = &event->hw;
> +
> +	if (!is_sampling_event(event))
> +		return;
> +
> +	hrtimer_init(&hwc->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +	hwc->hrtimer.function = hrtimer_sample;
> +
> +	if (event->attr.freq) {
> +		long freq = event->attr.sample_freq;
> +
> +		event->attr.sample_period = NSEC_PER_SEC / freq;
> +		hwc->sample_period = event->attr.sample_period;
> +		local64_set(&hwc->period_left, hwc->sample_period);
> +		hwc->last_period = hwc->sample_period;
> +		event->attr.freq = 0;
> +	}
> +}
> +
> +static int i915_pmu_event_init(struct perf_event *event)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(event->pmu, typeof(*i915), pmu.base);
> +	int ret;
> +
> +	/* XXX ideally only want pid == -1 && cpu == -1 */
> +
> +	if (event->attr.type != event->pmu->type)
> +		return -ENOENT;
> +
> +	if (has_branch_stack(event))
> +		return -EOPNOTSUPP;
> +
> +	ret = 0;
> +	if (is_engine_event(event)) {
> +		ret = engine_event_init(event);
> +	} else switch (event->attr.config) {
> +	case I915_PMU_ACTUAL_FREQUENCY:
> +		if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
> +			ret = -ENODEV; /* requires a mutex for sampling! */
> +	case I915_PMU_REQUESTED_FREQUENCY:
> +	case I915_PMU_ENERGY:
> +	case I915_PMU_RC6_RESIDENCY:
> +	case I915_PMU_RC6p_RESIDENCY:
> +	case I915_PMU_RC6pp_RESIDENCY:
> +		if (INTEL_GEN(i915) < 6)
> +			ret = -ENODEV;
> +		break;
> +	}
> +	if (ret)
> +		return ret;
> +
> +	if (!event->parent)
> +		event->destroy = i915_pmu_event_destroy;
> +
> +	init_hrtimer(event);
> +
> +	return 0;
> +}
> +
> +static void i915_pmu_timer_start(struct perf_event *event)
> +{
> +	struct hw_perf_event *hwc = &event->hw;
> +	s64 period;
> +
> +	if (!is_sampling_event(event))
> +		return;
> +
> +	period = local64_read(&hwc->period_left);
> +	if (period) {
> +		if (period < 0)
> +			period = 10000;
> +
> +		local64_set(&hwc->period_left, 0);
> +	} else {
> +		period = max_t(u64, 10000, hwc->sample_period);
> +	}
> +
> +	hrtimer_start_range_ns(&hwc->hrtimer,
> +			       ns_to_ktime(period), 0,
> +			       HRTIMER_MODE_REL_PINNED);
> +}
> +
> +static void i915_pmu_timer_cancel(struct perf_event *event)
> +{
> +	struct hw_perf_event *hwc = &event->hw;
> +
> +	if (!is_sampling_event(event))
> +		return;
> +
> +	local64_set(&hwc->period_left,
> +		    ktime_to_ns(hrtimer_get_remaining(&hwc->hrtimer)));
> +	hrtimer_cancel(&hwc->hrtimer);
> +}
> +
> +static void i915_pmu_enable(struct perf_event *event)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(event->pmu, typeof(*i915), pmu.base);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&i915->pmu.lock, flags);
> +
> +	if (i915->pmu.enable == 0)
> +		hrtimer_start_range_ns(&i915->pmu.timer,
> +				       ns_to_ktime(PERIOD), 0,
> +				       HRTIMER_MODE_REL_PINNED);
> +
> +	i915->pmu.enable |= event_enabled_mask(event);
> +
> +	if (is_engine_event(event)) {
> +		struct intel_engine_cs *engine;
> +
> +		engine = intel_engine_lookup_user(i915,
> +						  engine_event_class(event),
> +						  engine_event_instance(event));
> +		GEM_BUG_ON(!engine);
> +		engine->pmu.enable |= BIT(engine_event_sample(event));
> +	}
> +
> +	spin_unlock_irqrestore(&i915->pmu.lock, flags);
> +
> +	i915_pmu_timer_start(event);
> +}
> +
> +static void i915_pmu_disable(struct perf_event *event)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(event->pmu, typeof(*i915), pmu.base);
> +	unsigned long flags;
> +	u64 mask;
> +
> +	spin_lock_irqsave(&i915->pmu.lock, flags);
> +
> +	if (is_engine_event(event)) {
> +		struct intel_engine_cs *engine;
> +		enum intel_engine_id id;
> +
> +		engine = intel_engine_lookup_user(i915,
> +						  engine_event_class(event),
> +						  engine_event_instance(event));
> +		GEM_BUG_ON(!engine);
> +		engine->pmu.enable &= ~BIT(engine_event_sample(event));
> +		mask = 0;
> +		for_each_engine(engine, i915, id)
> +			mask |= engine->pmu.enable;
> +		mask = ~mask;
> +	} else {
> +		mask = event_enabled_mask(event);
> +	}
> +
> +	i915->pmu.enable &= ~mask;
> +
> +	spin_unlock_irqrestore(&i915->pmu.lock, flags);
> +
> +	i915_pmu_timer_cancel(event);
> +}
> +
> +static int i915_pmu_event_add(struct perf_event *event, int flags)
> +{
> +	struct hw_perf_event *hwc = &event->hw;
> +
> +	if (flags & PERF_EF_START)
> +		i915_pmu_enable(event);
> +
> +	hwc->state = !(flags & PERF_EF_START);
> +
> +	return 0;
> +}
> +
> +static void i915_pmu_event_del(struct perf_event *event, int flags)
> +{
> +	i915_pmu_disable(event);
> +}
> +
> +static void i915_pmu_event_start(struct perf_event *event, int flags)
> +{
> +	i915_pmu_enable(event);
> +}
> +
> +static void i915_pmu_event_stop(struct perf_event *event, int flags)
> +{
> +	i915_pmu_disable(event);
> +}
> +
> +static u64 count_interrupts(struct drm_i915_private *i915)
> +{
> +	/* open-coded kstat_irqs() */
> +	struct irq_desc *desc = irq_to_desc(i915->drm.pdev->irq);
> +	u64 sum = 0;
> +	int cpu;
> +
> +	if (!desc || !desc->kstat_irqs)
> +		return 0;
> +
> +	for_each_possible_cpu(cpu)
> +		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
> +
> +	return sum;
> +}
> +
> +static void i915_pmu_event_read(struct perf_event *event)
> +{
> +	struct drm_i915_private *i915 =
> +		container_of(event->pmu, typeof(*i915), pmu.base);
> +	u64 val = 0;
> +
> +	if (is_engine_event(event)) {
> +		u8 sample = engine_event_sample(event);
> +		struct intel_engine_cs *engine;
> +
> +		engine = intel_engine_lookup_user(i915,
> +						  engine_event_class(event),
> +						  engine_event_instance(event));
> +
> +		if (WARN_ON_ONCE(!engine)) {
> +			/* Do nothing */
> +		} else {
> +			val = engine->pmu.sample[sample];
> +		}
> +	} else switch (event->attr.config) {
> +	case I915_PMU_ACTUAL_FREQUENCY:
> +		val = i915->pmu.sample[__I915_SAMPLE_FREQ_ACT];
> +		break;
> +	case I915_PMU_REQUESTED_FREQUENCY:
> +		val = i915->pmu.sample[__I915_SAMPLE_FREQ_REQ];
> +		break;
> +	case I915_PMU_ENERGY:
> +		val = intel_energy_uJ(i915);
> +		break;
> +	case I915_PMU_INTERRUPTS:
> +		val = count_interrupts(i915);
> +		break;
> +
> +	case I915_PMU_RC6_RESIDENCY:
> +		if (!i915->gt.awake)
> +			return;
> +
> +		val = intel_rc6_residency_ns(i915,
> +					     IS_VALLEYVIEW(i915) ?
> +					     VLV_GT_RENDER_RC6 :
> +					     GEN6_GT_GFX_RC6);
> +		break;
> +
> +	case I915_PMU_RC6p_RESIDENCY:
> +		if (!i915->gt.awake)
> +			return;
> +
> +		if (!IS_VALLEYVIEW(i915))
> +			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
> +		break;
> +
> +	case I915_PMU_RC6pp_RESIDENCY:
> +		if (!i915->gt.awake)
> +			return;
> +
> +		if (!IS_VALLEYVIEW(i915))
> +			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
> +		break;
> +	}
> +
> +	local64_set(&event->count, val);
> +}
> +
> +static int i915_pmu_event_event_idx(struct perf_event *event)
> +{
> +	return 0;
> +}
> +
> +static ssize_t i915_pmu_format_show(struct device *dev,
> +				    struct device_attribute *attr, char *buf)
> +{
> +        struct dev_ext_attribute *eattr;
> +
> +        eattr = container_of(attr, struct dev_ext_attribute, attr);
> +        return sprintf(buf, "%s\n", (char *) eattr->var);
> +}
> +
> +#define I915_PMU_FORMAT_ATTR(_name, _config)           \
> +        (&((struct dev_ext_attribute[]) {               \
> +                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_format_show, NULL), \
> +                  .var = (void *) _config, }            \
> +        })[0].attr.attr)
> +
> +static struct attribute *i915_pmu_format_attrs[] = {
> +        I915_PMU_FORMAT_ATTR(i915_eventid, "config:0-42"),
> +        NULL,
> +};
> +
> +static const struct attribute_group i915_pmu_format_attr_group = {
> +        .name = "format",
> +        .attrs = i915_pmu_format_attrs,
> +};
> +
> +static ssize_t i915_pmu_event_show(struct device *dev,
> +				   struct device_attribute *attr, char *buf)
> +{
> +        struct dev_ext_attribute *eattr;
> +
> +        eattr = container_of(attr, struct dev_ext_attribute, attr);
> +        return sprintf(buf, "config=0x%lx\n", (unsigned long) eattr->var);
> +}
> +
> +#define I915_PMU_EVENT_ATTR(_name, _config)            \
> +        (&((struct dev_ext_attribute[]) {               \
> +                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_event_show, NULL), \
> +                  .var = (void *) _config, }            \
> +         })[0].attr.attr)
> +
> +static struct attribute *i915_pmu_events_attrs[] = {
> +	I915_PMU_EVENT_ATTR(rcs0-queued,
> +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_RENDER, 0)),
> +	I915_PMU_EVENT_ATTR(rcs0-busy,
> +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_RENDER, 0)),
> +	I915_PMU_EVENT_ATTR(rcs0-wait,
> +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_RENDER, 0)),
> +	I915_PMU_EVENT_ATTR(rcs0-sema,
> +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_RENDER, 0)),
> +
> +	I915_PMU_EVENT_ATTR(bcs0-queued,
> +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_COPY, 0)),
> +	I915_PMU_EVENT_ATTR(bcs0-busy,
> +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_COPY, 0)),
> +	I915_PMU_EVENT_ATTR(bcs0-wait,
> +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_COPY, 0)),
> +	I915_PMU_EVENT_ATTR(bcs0-sema,
> +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_COPY, 0)),
> +
> +	I915_PMU_EVENT_ATTR(vcs0-queued,
> +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 0)),
> +	I915_PMU_EVENT_ATTR(vcs0-busy,
> +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 0)),
> +	I915_PMU_EVENT_ATTR(vcs0-wait,
> +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 0)),
> +	I915_PMU_EVENT_ATTR(vcs0-sema,
> +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 0)),
> +
> +	I915_PMU_EVENT_ATTR(vcs1-queued,
> +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 1)),
> +	I915_PMU_EVENT_ATTR(vcs1-busy,
> +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 1)),
> +	I915_PMU_EVENT_ATTR(vcs1-wait,
> +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 1)),
> +	I915_PMU_EVENT_ATTR(vcs1-sema,
> +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 1)),
> +
> +	I915_PMU_EVENT_ATTR(vecs0-queued,
> +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
> +	I915_PMU_EVENT_ATTR(vecs0-busy,
> +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
> +	I915_PMU_EVENT_ATTR(vecs0-wait,
> +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
> +	I915_PMU_EVENT_ATTR(vecs0-sema,
> +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
> +
> +        I915_PMU_EVENT_ATTR(actual-frequency,	 I915_PMU_ACTUAL_FREQUENCY),
> +        I915_PMU_EVENT_ATTR(requested-frequency, I915_PMU_REQUESTED_FREQUENCY),
> +        I915_PMU_EVENT_ATTR(energy,		 I915_PMU_ENERGY),
> +        I915_PMU_EVENT_ATTR(interrupts,		 I915_PMU_INTERRUPTS),
> +        I915_PMU_EVENT_ATTR(rc6-residency,	 I915_PMU_RC6_RESIDENCY),
> +        I915_PMU_EVENT_ATTR(rc6p-residency,	 I915_PMU_RC6p_RESIDENCY),
> +        I915_PMU_EVENT_ATTR(rc6pp-residency,	 I915_PMU_RC6pp_RESIDENCY),
> +
> +        NULL,
> +};
> +
> +static const struct attribute_group i915_pmu_events_attr_group = {
> +        .name = "events",
> +        .attrs = i915_pmu_events_attrs,
> +};
> +
> +static const struct attribute_group *i915_pmu_attr_groups[] = {
> +        &i915_pmu_format_attr_group,
> +        &i915_pmu_events_attr_group,
> +        NULL
> +};
> +
> +void i915_pmu_register(struct drm_i915_private *i915)
> +{
> +	if (INTEL_GEN(i915) <= 2)
> +		return;
> +
> +	i915->pmu.base.attr_groups	= i915_pmu_attr_groups;
> +	i915->pmu.base.task_ctx_nr	= perf_sw_context;
> +	i915->pmu.base.event_init	= i915_pmu_event_init;
> +	i915->pmu.base.add		= i915_pmu_event_add;
> +	i915->pmu.base.del		= i915_pmu_event_del;
> +	i915->pmu.base.start		= i915_pmu_event_start;
> +	i915->pmu.base.stop		= i915_pmu_event_stop;
> +	i915->pmu.base.read		= i915_pmu_event_read;
> +	i915->pmu.base.event_idx	= i915_pmu_event_event_idx;
> +
> +	spin_lock_init(&i915->pmu.lock);
> +	hrtimer_init(&i915->pmu.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +	i915->pmu.timer.function = i915_sample;
> +	i915->pmu.enable = 0;
> +
> +	if (perf_pmu_register(&i915->pmu.base, "i915", -1))
> +		i915->pmu.base.event_init = NULL;
> +}
> +
> +void i915_pmu_unregister(struct drm_i915_private *i915)
> +{
> +	if (!i915->pmu.base.event_init)
> +		return;
> +
> +	i915->pmu.enable = 0;
> +
> +	perf_pmu_unregister(&i915->pmu.base);
> +	i915->pmu.base.event_init = NULL;
> +
> +	hrtimer_cancel(&i915->pmu.timer);
> +}
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 1dc7e7a2a23b..26bce766ab51 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -95,6 +95,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>   #define VIDEO_ENHANCEMENT_CLASS	2
>   #define COPY_ENGINE_CLASS	3
>   #define OTHER_CLASS		4
> +#define MAX_ENGINE_CLASS	4
> +
> +#define MAX_ENGINE_INSTANCE    1
>   
>   /* PCI config space */
>   
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 9ab596941372..14630612325b 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -198,6 +198,15 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
>   	GEM_BUG_ON(info->class >= ARRAY_SIZE(intel_engine_classes));
>   	class_info = &intel_engine_classes[info->class];
>   
> +	if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
> +		return -EINVAL;
> +
> +	if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
> +		return -EINVAL;
> +
> +	if (GEM_WARN_ON(dev_priv->engine_class[info->class][info->instance]))
> +		return -EINVAL;
> +
>   	GEM_BUG_ON(dev_priv->engine[id]);
>   	engine = kzalloc(sizeof(*engine), GFP_KERNEL);
>   	if (!engine)
> @@ -225,6 +234,7 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
>   
>   	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
>   
> +	dev_priv->engine_class[info->class][info->instance] = engine;
>   	dev_priv->engine[id] = engine;
>   	return 0;
>   }
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index cdf084ef5aae..af8d85794c44 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2282,3 +2282,28 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
>   
>   	return intel_init_ring_buffer(engine);
>   }
> +
> +static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
> +	[I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
> +	[I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
> +	[I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS,
> +	[I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS,
> +	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS,
> +};
> +
> +struct intel_engine_cs *
> +intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance)
> +{
> +	if (class >= ARRAY_SIZE(user_class_map))
> +		return NULL;
> +
> +	class = user_class_map[class];
> +
> +	if (WARN_ON_ONCE(class > MAX_ENGINE_CLASS))
> +		return NULL;
> +
> +	if (instance > MAX_ENGINE_INSTANCE)
> +		return NULL;
> +
> +	return i915->engine_class[class][instance];
> +}
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index d33c93444c0d..9fdf0cdf6220 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -245,6 +245,11 @@ struct intel_engine_cs {
>   		I915_SELFTEST_DECLARE(bool mock : 1);
>   	} breadcrumbs;
>   
> +	struct {
> +		u32 enable;
> +		u64 sample[4];
> +	} pmu;
> +
>   	/*
>   	 * A pool of objects to use as shadow copies of client batch buffers
>   	 * when the command parser is enabled. Prevents the client from
> @@ -735,4 +740,7 @@ bool intel_engines_are_idle(struct drm_i915_private *dev_priv);
>   void intel_engines_mark_idle(struct drm_i915_private *i915);
>   void intel_engines_reset_default_submission(struct drm_i915_private *i915);
>   
> +struct intel_engine_cs *
> +intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
> +
>   #endif /* _INTEL_RINGBUFFER_H_ */
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 7ccbd6a2bbe0..103874476a6d 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -86,6 +86,61 @@ enum i915_mocs_table_index {
>   	I915_MOCS_CACHED,
>   };
>   
> +enum drm_i915_gem_engine_class {
> +	I915_ENGINE_CLASS_OTHER = 0,
> +	I915_ENGINE_CLASS_RENDER = 1,
> +	I915_ENGINE_CLASS_COPY = 2,
> +	I915_ENGINE_CLASS_VIDEO = 3,
> +	I915_ENGINE_CLASS_VIDEO_ENHANCE = 4,
> +	I915_ENGINE_CLASS_MAX /* non-ABI */
> +};
> +
> +/**
> + * DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
> + *
> + */
> +
> +enum drm_i915_pmu_engine_sample {
> +	I915_SAMPLE_QUEUED = 0,
> +	I915_SAMPLE_BUSY = 1,
> +	I915_SAMPLE_WAIT = 2,
> +	I915_SAMPLE_SEMA = 3
> +};
> +
> +#define I915_PMU_SAMPLE_BITS (4)
> +#define I915_PMU_SAMPLE_MASK (0xf)
> +#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
> +#define I915_PMU_CLASS_SHIFT \
> +	(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)
> +
> +#define __I915_PMU_ENGINE(class, instance, sample) \
> +	((class) << I915_PMU_CLASS_SHIFT | \
> +	(instance) << I915_PMU_SAMPLE_BITS | \
> +	(sample))
> +
> +#define I915_PMU_ENGINE_QUEUED(class, instance) \
> +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
> +
> +#define I915_PMU_ENGINE_BUSY(class, instance) \
> +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_BUSY)
> +
> +#define I915_PMU_ENGINE_WAIT(class, instance) \
> +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_WAIT)
> +
> +#define I915_PMU_ENGINE_SEMA(class, instance) \
> +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
> +
> +#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
> +
> +#define I915_PMU_ACTUAL_FREQUENCY 	__I915_PMU_OTHER(0)
> +#define I915_PMU_REQUESTED_FREQUENCY	__I915_PMU_OTHER(1)
> +#define I915_PMU_ENERGY			__I915_PMU_OTHER(2)
> +#define I915_PMU_INTERRUPTS		__I915_PMU_OTHER(3)
> +
> +#define I915_PMU_RC6_RESIDENCY		__I915_PMU_OTHER(4)
> +#define I915_PMU_RC6p_RESIDENCY		__I915_PMU_OTHER(5)
> +#define I915_PMU_RC6pp_RESIDENCY	__I915_PMU_OTHER(6)
> +
>   /* Each region is a minimum of 16k, and there are at most 255 of them.
>    */
>   #define I915_NR_TEX_REGIONS 255	/* table size 2k - maximum due to use
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 7d34bc16ca1c..3b6eb0131204 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7382,6 +7382,7 @@ int perf_event_overflow(struct perf_event *event,
>   {
>   	return __perf_event_overflow(event, 1, data, regs);
>   }
> +EXPORT_SYMBOL_GPL(perf_event_overflow);
>   
>   /*
>    * Generic software event infrastructure
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-02 13:00   ` Tvrtko Ursulin
@ 2017-08-12  2:15     ` Rogozhkin, Dmitry V
  2017-08-22 18:17       ` Peter Zijlstra
  0 siblings, 1 reply; 32+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-08-12  2:15 UTC (permalink / raw)
  To: peterz, tvrtko.ursulin; +Cc: Intel-gfx

Hi,

Peter, I would like to take this chance to remind you that we need your
help clarifying whether we use the perf API correctly.

I am trying the i915 PMU as implemented in these RFC patches and would
like to share my observations. I have not tried it in all possible ways
yet, but I still think the following is worth mentioning.

'perf' sees the events exported via sysfs. For example,
$ perf list
  i915/actual-frequency/                             [Kernel PMU event]
  i915/bcs0-busy/                                    [Kernel PMU event]
  i915/bcs0-queued/                                  [Kernel PMU event]
  i915/bcs0-sema/                                    [Kernel PMU event]
  i915/bcs0-wait/                                    [Kernel PMU event]
  i915/energy/                                       [Kernel PMU event]
  i915/interrupts/                                   [Kernel PMU event]
  i915/rc6-residency/                                [Kernel PMU event]
  i915/rc6p-residency/                               [Kernel PMU event]
  i915/rc6pp-residency/                              [Kernel PMU event]
  i915/rcs0-busy/                                    [Kernel PMU event]
  i915/rcs0-queued/                                  [Kernel PMU event]
  i915/rcs0-sema/                                    [Kernel PMU event]
  i915/rcs0-wait/                                    [Kernel PMU event]
  i915/requested-frequency/                          [Kernel PMU event]
  i915/vcs0-busy/                                    [Kernel PMU event]
  i915/vcs0-queued/                                  [Kernel PMU event]
  i915/vcs0-sema/                                    [Kernel PMU event]
  i915/vcs0-wait/                                    [Kernel PMU event]
  i915/vcs1-busy/                                    [Kernel PMU event]
  i915/vcs1-queued/                                  [Kernel PMU event]
  i915/vcs1-sema/                                    [Kernel PMU event]
  i915/vcs1-wait/                                    [Kernel PMU event]
  i915/vecs0-busy/                                   [Kernel PMU event]
  i915/vecs0-queued/                                 [Kernel PMU event]
  i915/vecs0-sema/                                   [Kernel PMU event]
  i915/vecs0-wait/                                   [Kernel PMU event]

I have tried to use some of the events with 'perf stat' and here I
failed. The result was something like:

$ perf stat -e instructions,i915/rcs0-busy/ workload.sh
<... workload.sh output ...>

Performance counter stats for 'workload.sh':
     1,204,616,268      instructions
                 0      i915/rcs0-busy/

       1.869169153 seconds time elapsed

As you can see, the instructions event works pretty well, while
i915/rcs0-busy/ doesn't.

I am afraid that our current understanding of how a PMU should work is
not fully correct. I think so because the way the PMU entry points
init(), add(), del(), start(), stop() and read() are implemented does
not correlate with how many times they are called. I counted the calls
and here is the result:
init()=19, add()=44310, del()=43900, start()=44534, stop()=0, read()=0

This means that we regularly attempt to start/stop the timer and/or the
busy stats calculations. Another thing that draws attention is that
read() was not called at all. How is perf supposed to get the counter
value? Yet another question: where are we supposed to initialize our
internal state? The numbers above are from a single run, and even
init() is called multiple times. And where are we supposed to
de-initialize it: each time in del()? That hardly makes sense.

I should note that if perf is invoked with the -I 10 option, then
read() is called: init_c()=265, add_c()=132726, del_c()=131482,
start_c()=133412, stop()=0, read()=71. However, the i915 counter is
still 0. I tried printing counter values from within read() and they
are non-zero. Actually, read() returns a sequence of <non_zero>, 0, 0,
0, ..., <non_zero> because with our add()/del() code we regularly
start/stop our counter, and execution in read() follows different
branches.

Thus, I think that right now we do not implement the PMU correctly and
do not meet perf's expectations of a PMU. Unfortunately, right now I
have no idea what those expectations are.

Regards,
Dmitry.


On Wed, 2017-08-02 at 14:00 +0100, Tvrtko Ursulin wrote:
> Hi Peter,
> 
> We are trying to expose some software PMU events from the i915 driver. 
> Things like GPU engine busyness, power usage and similar.
> 
> There are quite a few opens regarding usage of the perf API and so on. 
> Would you be able to snoop around the below patch and help with 
> answering some questions?
> 
> Current opens would be something like:
> 
> Are the metrics we are exposing meaningful/appropriate as software PMU?
> 
> Do we need / can we get anything useful by enumerating our events in 
> sysfs? For example can we have "perf stat" or similar do something 
> useful with them today? Or should we just not expose them in sysfs
> and only use them from tools which understand them?
> 
> Also, once I exported the events for enumeration and tried accessing 
> them with perf, I had to create fake pt_regs to pass to 
> perf_event_overflow.
> 
> Can we control sampling frequency as requested by the perf API? We 
> mostly do not need rapid polling on these counters.
> 
> How is our usage of the perf API in general?
> 
> Regards,
> 
> Tvrtko
> 
> On 02/08/2017 13:32, Tvrtko Ursulin wrote:
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> > 
> > The first goal is to be able to measure GPU (and individual ring) busyness
> > without having to poll registers from userspace. (Which not only incurs
> > holding the forcewake lock indefinitely, perturbing the system, but also
> > runs the risk of hanging the machine.) As an alternative we can use the
> > perf event counter interface to sample the ring registers periodically
> > and send those results to userspace.
> > 
> > To be able to do so, we need to export the two symbols from
> > kernel/events/core.c to register and unregister a PMU device.
> > 
> > v2: Use a common timer for the ring sampling.
> > 
> > v3:
> >   * Decouple uAPI from i915 engine ids.
> >   * Complete uAPI defines.
> >   * Refactor some code to helpers for clarity.
> >   * Skip sampling disabled engines.
> >   * Expose counters in sysfs.
> >   * Pass in fake regs to avoid null ptr deref in perf core.
> >   * Convert to class/instance uAPI.
> >   * Use shared driver code for rc6 residency, power and frequency.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   drivers/gpu/drm/i915/Makefile           |   1 +
> >   drivers/gpu/drm/i915/i915_drv.c         |   2 +
> >   drivers/gpu/drm/i915/i915_drv.h         |  25 ++
> >   drivers/gpu/drm/i915/i915_pmu.c         | 629 ++++++++++++++++++++++++++++++++
> >   drivers/gpu/drm/i915/i915_reg.h         |   3 +
> >   drivers/gpu/drm/i915/intel_engine_cs.c  |  10 +
> >   drivers/gpu/drm/i915/intel_ringbuffer.c |  25 ++
> >   drivers/gpu/drm/i915/intel_ringbuffer.h |   8 +
> >   include/uapi/drm/i915_drm.h             |  55 +++
> >   kernel/events/core.c                    |   1 +
> >   10 files changed, 759 insertions(+)
> >   create mode 100644 drivers/gpu/drm/i915/i915_pmu.c
> > 
> > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> > index f8227318dcaf..1c720013dc42 100644
> > --- a/drivers/gpu/drm/i915/Makefile
> > +++ b/drivers/gpu/drm/i915/Makefile
> > @@ -26,6 +26,7 @@ i915-y := i915_drv.o \
> >   
> >   i915-$(CONFIG_COMPAT)   += i915_ioc32.o
> >   i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o intel_pipe_crc.o
> > +i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
> >   
> >   # GEM code
> >   i915-y += i915_cmd_parser.o \
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 214555e813f1..d75c2d790875 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1194,6 +1194,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
> >   	struct drm_device *dev = &dev_priv->drm;
> >   
> >   	i915_gem_shrinker_init(dev_priv);
> > +	i915_pmu_register(dev_priv);
> >   
> >   	/*
> >   	 * Notify a valid surface after modesetting,
> > @@ -1248,6 +1249,7 @@ static void i915_driver_unregister(struct drm_i915_private *dev_priv)
> >   	intel_opregion_unregister(dev_priv);
> >   
> >   	i915_perf_unregister(dev_priv);
> > +	i915_pmu_unregister(dev_priv);
> >   
> >   	i915_teardown_sysfs(dev_priv);
> >   	i915_guc_log_unregister(dev_priv);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 0c8bd1cdcbbd..142826742b86 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -40,6 +40,7 @@
> >   #include <linux/hash.h>
> >   #include <linux/intel-iommu.h>
> >   #include <linux/kref.h>
> > +#include <linux/perf_event.h>
> >   #include <linux/pm_qos.h>
> >   #include <linux/reservation.h>
> >   #include <linux/shmem_fs.h>
> > @@ -2111,6 +2112,12 @@ struct intel_cdclk_state {
> >   	unsigned int cdclk, vco, ref;
> >   };
> >   
> > +enum {
> > +	__I915_SAMPLE_FREQ_ACT = 0,
> > +	__I915_SAMPLE_FREQ_REQ,
> > +	__I915_NUM_PMU_SAMPLERS
> > +};
> > +
> >   struct drm_i915_private {
> >   	struct drm_device drm;
> >   
> > @@ -2158,6 +2165,7 @@ struct drm_i915_private {
> >   	struct pci_dev *bridge_dev;
> >   	struct i915_gem_context *kernel_context;
> >   	struct intel_engine_cs *engine[I915_NUM_ENGINES];
> > +	struct intel_engine_cs *engine_class[MAX_ENGINE_CLASS + 1][MAX_ENGINE_INSTANCE + 1];
> >   	struct i915_vma *semaphore;
> >   
> >   	struct drm_dma_handle *status_page_dmah;
> > @@ -2605,6 +2613,14 @@ struct drm_i915_private {
> >   		int	irq;
> >   	} lpe_audio;
> >   
> > +	struct {
> > +		struct pmu base;
> > +		spinlock_t lock;
> > +		struct hrtimer timer;
> > +		u64 enable;
> > +		u64 sample[__I915_NUM_PMU_SAMPLERS];
> > +	} pmu;
> > +
> >   	/*
> >   	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
> >   	 * will be rejected. Instead look for a better place.
> > @@ -3813,6 +3829,15 @@ extern void i915_perf_fini(struct drm_i915_private *dev_priv);
> >   extern void i915_perf_register(struct drm_i915_private *dev_priv);
> >   extern void i915_perf_unregister(struct drm_i915_private *dev_priv);
> >   
> > +/* i915_pmu.c */
> > +#ifdef CONFIG_PERF_EVENTS
> > +extern void i915_pmu_register(struct drm_i915_private *i915);
> > +extern void i915_pmu_unregister(struct drm_i915_private *i915);
> > +#else
> > +static inline void i915_pmu_register(struct drm_i915_private *i915) {}
> > +static inline void i915_pmu_unregister(struct drm_i915_private *i915) {}
> > +#endif
> > +
> >   /* i915_suspend.c */
> >   extern int i915_save_state(struct drm_i915_private *dev_priv);
> >   extern int i915_restore_state(struct drm_i915_private *dev_priv);
> > diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
> > new file mode 100644
> > index 000000000000..62c527c12641
> > --- /dev/null
> > +++ b/drivers/gpu/drm/i915/i915_pmu.c
> > @@ -0,0 +1,629 @@
> > +#include <linux/perf_event.h>
> > +#include <linux/pm_runtime.h>
> > +
> > +#include "i915_drv.h"
> > +#include "intel_ringbuffer.h"
> > +
> > +#define FREQUENCY 200
> > +#define PERIOD max_t(u64, 10000, NSEC_PER_SEC / FREQUENCY)
> > +
> > +#define ENGINE_SAMPLE_MASK \
> > +	(BIT(I915_SAMPLE_QUEUED) | \
> > +	 BIT(I915_SAMPLE_BUSY) | \
> > +	 BIT(I915_SAMPLE_WAIT) | \
> > +	 BIT(I915_SAMPLE_SEMA))
> > +
> > +#define ENGINE_SAMPLE_BITS (16)
> > +
> > +static u8 engine_config_sample(u64 config)
> > +{
> > +	return config & I915_PMU_SAMPLE_MASK;
> > +}
> > +
> > +static u8 engine_event_sample(struct perf_event *event)
> > +{
> > +	return engine_config_sample(event->attr.config);
> > +}
> > +
> > +static u8 engine_event_class(struct perf_event *event)
> > +{
> > +	return (event->attr.config >> I915_PMU_CLASS_SHIFT) & 0xff;
> > +}
> > +
> > +static u8 engine_event_instance(struct perf_event *event)
> > +{
> > +	return (event->attr.config >> I915_PMU_SAMPLE_BITS) & 0xff;
> > +}
> > +
> > +static bool is_engine_config(u64 config)
> > +{
> > +	return config < __I915_PMU_OTHER(0);
> > +}
> > +
> > +static u64 config_enabled_mask(u64 config)
> > +{
> > +	if (is_engine_config(config))
> > +		return BIT_ULL(engine_config_sample(config));
> > +	else
> > +		return BIT_ULL(config - __I915_PMU_OTHER(0)) <<
> > +		       ENGINE_SAMPLE_BITS;
> > +}
> > +
> > +static bool is_engine_event(struct perf_event *event)
> > +{
> > +	return is_engine_config(event->attr.config);
> > +}
> > +
> > +static u64 event_enabled_mask(struct perf_event *event)
> > +{
> > +	return config_enabled_mask(event->attr.config);
> > +}
> > +
> > +static bool grab_forcewake(struct drm_i915_private *i915, bool fw)
> > +{
> > +	if (!fw)
> > +		intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
> > +
> > +	return true;
> > +}
> > +
> > +static void engines_sample(struct drm_i915_private *dev_priv)
> > +{
> > +	struct intel_engine_cs *engine;
> > +	enum intel_engine_id id;
> > +	bool fw = false;
> > +
> > +	if ((dev_priv->pmu.enable & ENGINE_SAMPLE_MASK) == 0)
> > +		return;
> > +
> > +	if (!dev_priv->gt.awake)
> > +		return;
> > +
> > +	if (!intel_runtime_pm_get_if_in_use(dev_priv))
> > +		return;
> > +
> > +	for_each_engine(engine, dev_priv, id) {
> > +		u32 enable = engine->pmu.enable;
> > +
> > +		if (i915_seqno_passed(intel_engine_get_seqno(engine),
> > +				      intel_engine_last_submit(engine)))
> > +			continue;
> > +
> > +		if (enable & BIT(I915_SAMPLE_QUEUED))
> > +			engine->pmu.sample[I915_SAMPLE_QUEUED] += PERIOD;
> > +
> > +		if (enable & BIT(I915_SAMPLE_BUSY)) {
> > +			u32 val;
> > +
> > +			fw = grab_forcewake(dev_priv, fw);
> > +			val = I915_READ_FW(RING_MI_MODE(engine->mmio_base));
> > +			if (!(val & MODE_IDLE))
> > +				engine->pmu.sample[I915_SAMPLE_BUSY] += PERIOD;
> > +		}
> > +
> > +		if (enable & (BIT(I915_SAMPLE_WAIT) | BIT(I915_SAMPLE_SEMA))) {
> > +			u32 val;
> > +
> > +			fw = grab_forcewake(dev_priv, fw);
> > +			val = I915_READ_FW(RING_CTL(engine->mmio_base));
> > +			if ((enable & BIT(I915_SAMPLE_WAIT)) &&
> > +			    (val & RING_WAIT))
> > +				engine->pmu.sample[I915_SAMPLE_WAIT] += PERIOD;
> > +			if ((enable & BIT(I915_SAMPLE_SEMA)) &&
> > +			    (val & RING_WAIT_SEMAPHORE))
> > +				engine->pmu.sample[I915_SAMPLE_SEMA] += PERIOD;
> > +		}
> > +	}
> > +
> > +	if (fw)
> > +		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> > +	intel_runtime_pm_put(dev_priv);
> > +}
> > +
> > +static void frequency_sample(struct drm_i915_private *dev_priv)
> > +{
> > +	if (dev_priv->pmu.enable &
> > +	    config_enabled_mask(I915_PMU_ACTUAL_FREQUENCY)) {
> > +		u64 val;
> > +
> > +		val = dev_priv->rps.cur_freq;
> > +		if (dev_priv->gt.awake &&
> > +		    intel_runtime_pm_get_if_in_use(dev_priv)) {
> > +			val = intel_get_cagf(dev_priv,
> > +					     I915_READ_NOTRACE(GEN6_RPSTAT1));
> > +			intel_runtime_pm_put(dev_priv);
> > +		}
> > +		val = intel_gpu_freq(dev_priv, val);
> > +		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_ACT] += val * PERIOD;
> > +	}
> > +
> > +	if (dev_priv->pmu.enable &
> > +	    config_enabled_mask(I915_PMU_REQUESTED_FREQUENCY)) {
> > +		u64 val = intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq);
> > +		dev_priv->pmu.sample[__I915_SAMPLE_FREQ_REQ] += val * PERIOD;
> > +	}
> > +}
> > +
> > +static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer)
> > +{
> > +	struct drm_i915_private *i915 =
> > +		container_of(hrtimer, struct drm_i915_private, pmu.timer);
> > +
> > +	if (i915->pmu.enable == 0)
> > +		return HRTIMER_NORESTART;
> > +
> > +	engines_sample(i915);
> > +	frequency_sample(i915);
> > +
> > +	hrtimer_forward_now(hrtimer, ns_to_ktime(PERIOD));
> > +	return HRTIMER_RESTART;
> > +}
> > +
> > +static void i915_pmu_event_destroy(struct perf_event *event)
> > +{
> > +	WARN_ON(event->parent);
> > +}
> > +
> > +static int engine_event_init(struct perf_event *event)
> > +{
> > +	struct drm_i915_private *i915 =
> > +		container_of(event->pmu, typeof(*i915), pmu.base);
> > +
> > +	if (!intel_engine_lookup_user(i915, engine_event_class(event),
> > +				      engine_event_instance(event)))
> > +		return -ENODEV;
> > +
> > +	switch (engine_event_sample(event)) {
> > +	case I915_SAMPLE_QUEUED:
> > +	case I915_SAMPLE_BUSY:
> > +	case I915_SAMPLE_WAIT:
> > +		break;
> > +	case I915_SAMPLE_SEMA:
> > +		if (INTEL_GEN(i915) < 6)
> > +			return -ENODEV;
> > +		break;
> > +	default:
> > +		return -ENOENT;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static DEFINE_PER_CPU(struct pt_regs, i915_pmu_pt_regs);
> > +
> > +static enum hrtimer_restart hrtimer_sample(struct hrtimer *hrtimer)
> > +{
> > +	struct pt_regs *regs = this_cpu_ptr(&i915_pmu_pt_regs);
> > +	struct perf_sample_data data;
> > +	struct perf_event *event;
> > +	u64 period;
> > +
> > +	event = container_of(hrtimer, struct perf_event, hw.hrtimer);
> > +	if (event->state != PERF_EVENT_STATE_ACTIVE)
> > +		return HRTIMER_NORESTART;
> > +
> > +	event->pmu->read(event);
> > +
> > +	perf_sample_data_init(&data, 0, event->hw.last_period);
> > +	perf_event_overflow(event, &data, regs);
> > +
> > +	period = max_t(u64, 10000, event->hw.sample_period);
> > +	hrtimer_forward_now(hrtimer, ns_to_ktime(period));
> > +	return HRTIMER_RESTART;
> > +}
> > +
> > +static void init_hrtimer(struct perf_event *event)
> > +{
> > +	struct hw_perf_event *hwc = &event->hw;
> > +
> > +	if (!is_sampling_event(event))
> > +		return;
> > +
> > +	hrtimer_init(&hwc->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> > +	hwc->hrtimer.function = hrtimer_sample;
> > +
> > +	if (event->attr.freq) {
> > +		long freq = event->attr.sample_freq;
> > +
> > +		event->attr.sample_period = NSEC_PER_SEC / freq;
> > +		hwc->sample_period = event->attr.sample_period;
> > +		local64_set(&hwc->period_left, hwc->sample_period);
> > +		hwc->last_period = hwc->sample_period;
> > +		event->attr.freq = 0;
> > +	}
> > +}
> > +
> > +static int i915_pmu_event_init(struct perf_event *event)
> > +{
> > +	struct drm_i915_private *i915 =
> > +		container_of(event->pmu, typeof(*i915), pmu.base);
> > +	int ret;
> > +
> > +	/* XXX ideally only want pid == -1 && cpu == -1 */
> > +
> > +	if (event->attr.type != event->pmu->type)
> > +		return -ENOENT;
> > +
> > +	if (has_branch_stack(event))
> > +		return -EOPNOTSUPP;
> > +
> > +	ret = 0;
> > +	if (is_engine_event(event)) {
> > +		ret = engine_event_init(event);
> > +	} else switch (event->attr.config) {
> > +	case I915_PMU_ACTUAL_FREQUENCY:
> > +		if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915))
> > +			ret = -ENODEV; /* requires a mutex for sampling! */
> > +	case I915_PMU_REQUESTED_FREQUENCY:
> > +	case I915_PMU_ENERGY:
> > +	case I915_PMU_RC6_RESIDENCY:
> > +	case I915_PMU_RC6p_RESIDENCY:
> > +	case I915_PMU_RC6pp_RESIDENCY:
> > +		if (INTEL_GEN(i915) < 6)
> > +			ret = -ENODEV;
> > +		break;
> > +	}
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (!event->parent)
> > +		event->destroy = i915_pmu_event_destroy;
> > +
> > +	init_hrtimer(event);
> > +
> > +	return 0;
> > +}
> > +
> > +static void i915_pmu_timer_start(struct perf_event *event)
> > +{
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	s64 period;
> > +
> > +	if (!is_sampling_event(event))
> > +		return;
> > +
> > +	period = local64_read(&hwc->period_left);
> > +	if (period) {
> > +		if (period < 0)
> > +			period = 10000;
> > +
> > +		local64_set(&hwc->period_left, 0);
> > +	} else {
> > +		period = max_t(u64, 10000, hwc->sample_period);
> > +	}
> > +
> > +	hrtimer_start_range_ns(&hwc->hrtimer,
> > +			       ns_to_ktime(period), 0,
> > +			       HRTIMER_MODE_REL_PINNED);
> > +}
> > +
> > +static void i915_pmu_timer_cancel(struct perf_event *event)
> > +{
> > +	struct hw_perf_event *hwc = &event->hw;
> > +
> > +	if (!is_sampling_event(event))
> > +		return;
> > +
> > +	local64_set(&hwc->period_left,
> > +		    ktime_to_ns(hrtimer_get_remaining(&hwc->hrtimer)));
> > +	hrtimer_cancel(&hwc->hrtimer);
> > +}
> > +
> > +static void i915_pmu_enable(struct perf_event *event)
> > +{
> > +	struct drm_i915_private *i915 =
> > +		container_of(event->pmu, typeof(*i915), pmu.base);
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&i915->pmu.lock, flags);
> > +
> > +	if (i915->pmu.enable == 0)
> > +		hrtimer_start_range_ns(&i915->pmu.timer,
> > +				       ns_to_ktime(PERIOD), 0,
> > +				       HRTIMER_MODE_REL_PINNED);
> > +
> > +	i915->pmu.enable |= event_enabled_mask(event);
> > +
> > +	if (is_engine_event(event)) {
> > +		struct intel_engine_cs *engine;
> > +
> > +		engine = intel_engine_lookup_user(i915,
> > +						  engine_event_class(event),
> > +						  engine_event_instance(event));
> > +		GEM_BUG_ON(!engine);
> > +		engine->pmu.enable |= BIT(engine_event_sample(event));
> > +	}
> > +
> > +	spin_unlock_irqrestore(&i915->pmu.lock, flags);
> > +
> > +	i915_pmu_timer_start(event);
> > +}
> > +
> > +static void i915_pmu_disable(struct perf_event *event)
> > +{
> > +	struct drm_i915_private *i915 =
> > +		container_of(event->pmu, typeof(*i915), pmu.base);
> > +	unsigned long flags;
> > +	u64 mask;
> > +
> > +	spin_lock_irqsave(&i915->pmu.lock, flags);
> > +
> > +	if (is_engine_event(event)) {
> > +		struct intel_engine_cs *engine;
> > +		enum intel_engine_id id;
> > +
> > +		engine = intel_engine_lookup_user(i915,
> > +						  engine_event_class(event),
> > +						  engine_event_instance(event));
> > +		GEM_BUG_ON(!engine);
> > +		engine->pmu.enable &= ~BIT(engine_event_sample(event));
> > +		mask = 0;
> > +		for_each_engine(engine, i915, id)
> > +			mask |= engine->pmu.enable;
> > +		mask = ~mask;
> > +	} else {
> > +		mask = event_enabled_mask(event);
> > +	}
> > +
> > +	i915->pmu.enable &= ~mask;
> > +
> > +	spin_unlock_irqrestore(&i915->pmu.lock, flags);
> > +
> > +	i915_pmu_timer_cancel(event);
> > +}
> > +
> > +static int i915_pmu_event_add(struct perf_event *event, int flags)
> > +{
> > +	struct hw_perf_event *hwc = &event->hw;
> > +
> > +	if (flags & PERF_EF_START)
> > +		i915_pmu_enable(event);
> > +
> > +	hwc->state = !(flags & PERF_EF_START);
> > +
> > +	return 0;
> > +}
> > +
> > +static void i915_pmu_event_del(struct perf_event *event, int flags)
> > +{
> > +	i915_pmu_disable(event);
> > +}
> > +
> > +static void i915_pmu_event_start(struct perf_event *event, int flags)
> > +{
> > +	i915_pmu_enable(event);
> > +}
> > +
> > +static void i915_pmu_event_stop(struct perf_event *event, int flags)
> > +{
> > +	i915_pmu_disable(event);
> > +}
> > +
> > +static u64 count_interrupts(struct drm_i915_private *i915)
> > +{
> > +	/* open-coded kstat_irqs() */
> > +	struct irq_desc *desc = irq_to_desc(i915->drm.pdev->irq);
> > +	u64 sum = 0;
> > +	int cpu;
> > +
> > +	if (!desc || !desc->kstat_irqs)
> > +		return 0;
> > +
> > +	for_each_possible_cpu(cpu)
> > +		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
> > +
> > +	return sum;
> > +}
> > +
> > +static void i915_pmu_event_read(struct perf_event *event)
> > +{
> > +	struct drm_i915_private *i915 =
> > +		container_of(event->pmu, typeof(*i915), pmu.base);
> > +	u64 val = 0;
> > +
> > +	if (is_engine_event(event)) {
> > +		u8 sample = engine_event_sample(event);
> > +		struct intel_engine_cs *engine;
> > +
> > +		engine = intel_engine_lookup_user(i915,
> > +						  engine_event_class(event),
> > +						  engine_event_instance(event));
> > +
> > +		if (WARN_ON_ONCE(!engine)) {
> > +			/* Do nothing */
> > +		} else {
> > +			val = engine->pmu.sample[sample];
> > +		}
> > +	} else switch (event->attr.config) {
> > +	case I915_PMU_ACTUAL_FREQUENCY:
> > +		val = i915->pmu.sample[__I915_SAMPLE_FREQ_ACT];
> > +		break;
> > +	case I915_PMU_REQUESTED_FREQUENCY:
> > +		val = i915->pmu.sample[__I915_SAMPLE_FREQ_REQ];
> > +		break;
> > +	case I915_PMU_ENERGY:
> > +		val = intel_energy_uJ(i915);
> > +		break;
> > +	case I915_PMU_INTERRUPTS:
> > +		val = count_interrupts(i915);
> > +		break;
> > +
> > +	case I915_PMU_RC6_RESIDENCY:
> > +		if (!i915->gt.awake)
> > +			return;
> > +
> > +		val = intel_rc6_residency_ns(i915,
> > +					     IS_VALLEYVIEW(i915) ?
> > +					     VLV_GT_RENDER_RC6 :
> > +					     GEN6_GT_GFX_RC6);
> > +		break;
> > +
> > +	case I915_PMU_RC6p_RESIDENCY:
> > +		if (!i915->gt.awake)
> > +			return;
> > +
> > +		if (!IS_VALLEYVIEW(i915))
> > +			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6p);
> > +		break;
> > +
> > +	case I915_PMU_RC6pp_RESIDENCY:
> > +		if (!i915->gt.awake)
> > +			return;
> > +
> > +		if (!IS_VALLEYVIEW(i915))
> > +			val = intel_rc6_residency_ns(i915, GEN6_GT_GFX_RC6pp);
> > +		break;
> > +	}
> > +
> > +	local64_set(&event->count, val);
> > +}
> > +
> > +static int i915_pmu_event_event_idx(struct perf_event *event)
> > +{
> > +	return 0;
> > +}
> > +
> > +static ssize_t i915_pmu_format_show(struct device *dev,
> > +				    struct device_attribute *attr, char *buf)
> > +{
> > +        struct dev_ext_attribute *eattr;
> > +
> > +        eattr = container_of(attr, struct dev_ext_attribute, attr);
> > +        return sprintf(buf, "%s\n", (char *) eattr->var);
> > +}
> > +
> > +#define I915_PMU_FORMAT_ATTR(_name, _config)           \
> > +        (&((struct dev_ext_attribute[]) {               \
> > +                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_format_show, NULL), \
> > +                  .var = (void *) _config, }            \
> > +        })[0].attr.attr)
> > +
> > +static struct attribute *i915_pmu_format_attrs[] = {
> > +        I915_PMU_FORMAT_ATTR(i915_eventid, "config:0-42"),
> > +        NULL,
> > +};
> > +
> > +static const struct attribute_group i915_pmu_format_attr_group = {
> > +        .name = "format",
> > +        .attrs = i915_pmu_format_attrs,
> > +};
> > +
> > +static ssize_t i915_pmu_event_show(struct device *dev,
> > +				   struct device_attribute *attr, char *buf)
> > +{
> > +        struct dev_ext_attribute *eattr;
> > +
> > +        eattr = container_of(attr, struct dev_ext_attribute, attr);
> > +        return sprintf(buf, "config=0x%lx\n", (unsigned long) eattr->var);
> > +}
> > +
> > +#define I915_PMU_EVENT_ATTR(_name, _config)            \
> > +        (&((struct dev_ext_attribute[]) {               \
> > +                { .attr = __ATTR(_name, S_IRUGO, i915_pmu_event_show, NULL), \
> > +                  .var = (void *) _config, }            \
> > +         })[0].attr.attr)
> > +
> > +static struct attribute *i915_pmu_events_attrs[] = {
> > +	I915_PMU_EVENT_ATTR(rcs0-queued,
> > +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_RENDER, 0)),
> > +	I915_PMU_EVENT_ATTR(rcs0-busy,
> > +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_RENDER, 0)),
> > +	I915_PMU_EVENT_ATTR(rcs0-wait,
> > +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_RENDER, 0)),
> > +	I915_PMU_EVENT_ATTR(rcs0-sema,
> > +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_RENDER, 0)),
> > +
> > +	I915_PMU_EVENT_ATTR(bcs0-queued,
> > +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_COPY, 0)),
> > +	I915_PMU_EVENT_ATTR(bcs0-busy,
> > +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_COPY, 0)),
> > +	I915_PMU_EVENT_ATTR(bcs0-wait,
> > +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_COPY, 0)),
> > +	I915_PMU_EVENT_ATTR(bcs0-sema,
> > +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_COPY, 0)),
> > +
> > +	I915_PMU_EVENT_ATTR(vcs0-queued,
> > +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 0)),
> > +	I915_PMU_EVENT_ATTR(vcs0-busy,
> > +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 0)),
> > +	I915_PMU_EVENT_ATTR(vcs0-wait,
> > +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 0)),
> > +	I915_PMU_EVENT_ATTR(vcs0-sema,
> > +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 0)),
> > +
> > +	I915_PMU_EVENT_ATTR(vcs1-queued,
> > +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO, 1)),
> > +	I915_PMU_EVENT_ATTR(vcs1-busy,
> > +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO, 1)),
> > +	I915_PMU_EVENT_ATTR(vcs1-wait,
> > +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO, 1)),
> > +	I915_PMU_EVENT_ATTR(vcs1-sema,
> > +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO, 1)),
> > +
> > +	I915_PMU_EVENT_ATTR(vecs0-queued,
> > +			    I915_PMU_ENGINE_QUEUED(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
> > +	I915_PMU_EVENT_ATTR(vecs0-busy,
> > +			    I915_PMU_ENGINE_BUSY(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
> > +	I915_PMU_EVENT_ATTR(vecs0-wait,
> > +			    I915_PMU_ENGINE_WAIT(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
> > +	I915_PMU_EVENT_ATTR(vecs0-sema,
> > +			    I915_PMU_ENGINE_SEMA(I915_ENGINE_CLASS_VIDEO_ENHANCE, 0)),
> > +
> > +        I915_PMU_EVENT_ATTR(actual-frequency,	 I915_PMU_ACTUAL_FREQUENCY),
> > +        I915_PMU_EVENT_ATTR(requested-frequency, I915_PMU_REQUESTED_FREQUENCY),
> > +        I915_PMU_EVENT_ATTR(energy,		 I915_PMU_ENERGY),
> > +        I915_PMU_EVENT_ATTR(interrupts,		 I915_PMU_INTERRUPTS),
> > +        I915_PMU_EVENT_ATTR(rc6-residency,	 I915_PMU_RC6_RESIDENCY),
> > +        I915_PMU_EVENT_ATTR(rc6p-residency,	 I915_PMU_RC6p_RESIDENCY),
> > +        I915_PMU_EVENT_ATTR(rc6pp-residency,	 I915_PMU_RC6pp_RESIDENCY),
> > +
> > +        NULL,
> > +};
> > +
> > +static const struct attribute_group i915_pmu_events_attr_group = {
> > +        .name = "events",
> > +        .attrs = i915_pmu_events_attrs,
> > +};
> > +
> > +static const struct attribute_group *i915_pmu_attr_groups[] = {
> > +        &i915_pmu_format_attr_group,
> > +        &i915_pmu_events_attr_group,
> > +        NULL
> > +};
> > +
> > +void i915_pmu_register(struct drm_i915_private *i915)
> > +{
> > +	if (INTEL_GEN(i915) <= 2)
> > +		return;
> > +
> > +	i915->pmu.base.attr_groups	= i915_pmu_attr_groups;
> > +	i915->pmu.base.task_ctx_nr	= perf_sw_context;
> > +	i915->pmu.base.event_init	= i915_pmu_event_init;
> > +	i915->pmu.base.add		= i915_pmu_event_add;
> > +	i915->pmu.base.del		= i915_pmu_event_del;
> > +	i915->pmu.base.start		= i915_pmu_event_start;
> > +	i915->pmu.base.stop		= i915_pmu_event_stop;
> > +	i915->pmu.base.read		= i915_pmu_event_read;
> > +	i915->pmu.base.event_idx	= i915_pmu_event_event_idx;
> > +
> > +	spin_lock_init(&i915->pmu.lock);
> > +	hrtimer_init(&i915->pmu.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> > +	i915->pmu.timer.function = i915_sample;
> > +	i915->pmu.enable = 0;
> > +
> > +	if (perf_pmu_register(&i915->pmu.base, "i915", -1))
> > +		i915->pmu.base.event_init = NULL;
> > +}
> > +
> > +void i915_pmu_unregister(struct drm_i915_private *i915)
> > +{
> > +	if (!i915->pmu.base.event_init)
> > +		return;
> > +
> > +	i915->pmu.enable = 0;
> > +
> > +	perf_pmu_unregister(&i915->pmu.base);
> > +	i915->pmu.base.event_init = NULL;
> > +
> > +	hrtimer_cancel(&i915->pmu.timer);
> > +}
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index 1dc7e7a2a23b..26bce766ab51 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -95,6 +95,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
> >   #define VIDEO_ENHANCEMENT_CLASS	2
> >   #define COPY_ENGINE_CLASS	3
> >   #define OTHER_CLASS		4
> > +#define MAX_ENGINE_CLASS	4
> > +
> > +#define MAX_ENGINE_INSTANCE    1
> >   
> >   /* PCI config space */
> >   
> > diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> > index 9ab596941372..14630612325b 100644
> > --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> > @@ -198,6 +198,15 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
> >   	GEM_BUG_ON(info->class >= ARRAY_SIZE(intel_engine_classes));
> >   	class_info = &intel_engine_classes[info->class];
> >   
> > +	if (GEM_WARN_ON(info->class > MAX_ENGINE_CLASS))
> > +		return -EINVAL;
> > +
> > +	if (GEM_WARN_ON(info->instance > MAX_ENGINE_INSTANCE))
> > +		return -EINVAL;
> > +
> > +	if (GEM_WARN_ON(dev_priv->engine_class[info->class][info->instance]))
> > +		return -EINVAL;
> > +
> >   	GEM_BUG_ON(dev_priv->engine[id]);
> >   	engine = kzalloc(sizeof(*engine), GFP_KERNEL);
> >   	if (!engine)
> > @@ -225,6 +234,7 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
> >   
> >   	ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
> >   
> > +	dev_priv->engine_class[info->class][info->instance] = engine;
> >   	dev_priv->engine[id] = engine;
> >   	return 0;
> >   }
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index cdf084ef5aae..af8d85794c44 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -2282,3 +2282,28 @@ int intel_init_vebox_ring_buffer(struct intel_engine_cs *engine)
> >   
> >   	return intel_init_ring_buffer(engine);
> >   }
> > +
> > +static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
> > +	[I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
> > +	[I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
> > +	[I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS,
> > +	[I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS,
> > +	[I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS,
> > +};
> > +
> > +struct intel_engine_cs *
> > +intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance)
> > +{
> > +	if (class >= ARRAY_SIZE(user_class_map))
> > +		return NULL;
> > +
> > +	class = user_class_map[class];
> > +
> > +	if (WARN_ON_ONCE(class > MAX_ENGINE_CLASS))
> > +		return NULL;
> > +
> > +	if (instance > MAX_ENGINE_INSTANCE)
> > +		return NULL;
> > +
> > +	return i915->engine_class[class][instance];
> > +}
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> > index d33c93444c0d..9fdf0cdf6220 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> > @@ -245,6 +245,11 @@ struct intel_engine_cs {
> >   		I915_SELFTEST_DECLARE(bool mock : 1);
> >   	} breadcrumbs;
> >   
> > +	struct {
> > +		u32 enable;
> > +		u64 sample[4];
> > +	} pmu;
> > +
> >   	/*
> >   	 * A pool of objects to use as shadow copies of client batch buffers
> >   	 * when the command parser is enabled. Prevents the client from
> > @@ -735,4 +740,7 @@ bool intel_engines_are_idle(struct drm_i915_private *dev_priv);
> >   void intel_engines_mark_idle(struct drm_i915_private *i915);
> >   void intel_engines_reset_default_submission(struct drm_i915_private *i915);
> >   
> > +struct intel_engine_cs *
> > +intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance);
> > +
> >   #endif /* _INTEL_RINGBUFFER_H_ */
> > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > index 7ccbd6a2bbe0..103874476a6d 100644
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
> > @@ -86,6 +86,61 @@ enum i915_mocs_table_index {
> >   	I915_MOCS_CACHED,
> >   };
> >   
> > +enum drm_i915_gem_engine_class {
> > +	I915_ENGINE_CLASS_OTHER = 0,
> > +	I915_ENGINE_CLASS_RENDER = 1,
> > +	I915_ENGINE_CLASS_COPY = 2,
> > +	I915_ENGINE_CLASS_VIDEO = 3,
> > +	I915_ENGINE_CLASS_VIDEO_ENHANCE = 4,
> > +	I915_ENGINE_CLASS_MAX /* non-ABI */
> > +};
> > +
> > +/**
> > + * DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
> > + *
> > + */
> > +
> > +enum drm_i915_pmu_engine_sample {
> > +	I915_SAMPLE_QUEUED = 0,
> > +	I915_SAMPLE_BUSY = 1,
> > +	I915_SAMPLE_WAIT = 2,
> > +	I915_SAMPLE_SEMA = 3
> > +};
> > +
> > +#define I915_PMU_SAMPLE_BITS (4)
> > +#define I915_PMU_SAMPLE_MASK (0xf)
> > +#define I915_PMU_SAMPLE_INSTANCE_BITS (8)
> > +#define I915_PMU_CLASS_SHIFT \
> > +	(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)
> > +
> > +#define __I915_PMU_ENGINE(class, instance, sample) \
> > +	((class) << I915_PMU_CLASS_SHIFT | \
> > +	(instance) << I915_PMU_SAMPLE_BITS | \
> > +	(sample))
> > +
> > +#define I915_PMU_ENGINE_QUEUED(class, instance) \
> > +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_QUEUED)
> > +
> > +#define I915_PMU_ENGINE_BUSY(class, instance) \
> > +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_BUSY)
> > +
> > +#define I915_PMU_ENGINE_WAIT(class, instance) \
> > +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_WAIT)
> > +
> > +#define I915_PMU_ENGINE_SEMA(class, instance) \
> > +	__I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA)
> > +
> > +#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x))
> > +
> > +#define I915_PMU_ACTUAL_FREQUENCY 	__I915_PMU_OTHER(0)
> > +#define I915_PMU_REQUESTED_FREQUENCY	__I915_PMU_OTHER(1)
> > +#define I915_PMU_ENERGY			__I915_PMU_OTHER(2)
> > +#define I915_PMU_INTERRUPTS		__I915_PMU_OTHER(3)
> > +
> > +#define I915_PMU_RC6_RESIDENCY		__I915_PMU_OTHER(4)
> > +#define I915_PMU_RC6p_RESIDENCY		__I915_PMU_OTHER(5)
> > +#define I915_PMU_RC6pp_RESIDENCY	__I915_PMU_OTHER(6)
> > +
> >   /* Each region is a minimum of 16k, and there are at most 255 of them.
> >    */
> >   #define I915_NR_TEX_REGIONS 255	/* table size 2k - maximum due to use
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 7d34bc16ca1c..3b6eb0131204 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -7382,6 +7382,7 @@ int perf_event_overflow(struct perf_event *event,
> >   {
> >   	return __perf_event_overflow(event, 1, data, regs);
> >   }
> > +EXPORT_SYMBOL_GPL(perf_event_overflow);
> >   
> >   /*
> >    * Generic software event infrastructure
> > 

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-12  2:15     ` Rogozhkin, Dmitry V
@ 2017-08-22 18:17       ` Peter Zijlstra
  2017-08-23 17:51         ` Rogozhkin, Dmitry V
  0 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2017-08-22 18:17 UTC (permalink / raw)
  To: Rogozhkin, Dmitry V; +Cc: Intel-gfx

On Sat, Aug 12, 2017 at 02:15:13AM +0000, Rogozhkin, Dmitry V wrote:
> $ perf stat -e instructions,i915/rcs0-busy/ workload.sh
> <... workload.sh output ...>
> 
> Performance counter stats for 'workload.sh':
>      1,204,616,268      instructions
>                  0      i915/rcs0-busy/
> 
>        1.869169153 seconds time elapsed
> 
> As you can see instructions event works pretty well, i915/rcs0-busy/
> doesn't.
> 
> I'm afraid that our current understanding of how a PMU should work is
> not fully correct.

Can we start off by explaining to me how this i915 stuff works? Because
all I have is ~750 lines of patch without comments, which sort of leaves
me confused.

The above command tries to add an event 'i915/rcs0-busy/' to a task. How
are i915 resources associated with any one particular task?

Is there a unique i915 resource for each task? If not, I don't see how
per-task event can ever work as expected.

> I think so, because the way the PMU entry points init(), add(), del(),
> start(), stop() and read() are implemented does not correlate with how
> many times they are called. I have counted them and here is the result:
> init()=19, add()=44310, del()=43900, start()=44534, stop()=0, read()=0
> 
> Which means that we regularly attempt to start/stop the timer and/or
> busy stats calculations. Another thing which draws attention is that
> read() was not called at all. How is perf supposed to get the counter
> value?

Both stop() and del() are supposed to update event->count. Only if we do
sys_read() while the event is active (something perf-stat never does
IIRC) will it issue pmu::read() to get an up-to-date number.
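That is, the expected flow is roughly the following userspace model (a
sketch of the accounting only, not kernel code; all names here are made
up):

```c
#include <assert.h>
#include <stdint.h>

/* Model of the perf accounting contract: stop()/del() fold the running
 * delta into event->count, and pmu::read() only refreshes the count
 * while the event is active. */
struct fake_event {
	uint64_t count;      /* what userspace sees via sys_read() */
	uint64_t hw_counter; /* free-running hardware value */
	uint64_t prev;       /* snapshot taken at start() */
	int active;
};

static void fake_update(struct fake_event *e)
{
	e->count += e->hw_counter - e->prev;
	e->prev = e->hw_counter;
}

static void fake_start(struct fake_event *e)
{
	e->prev = e->hw_counter; /* don't count ticks while inactive */
	e->active = 1;
}

static void fake_stop(struct fake_event *e)
{
	fake_update(e); /* fold in the delta before going inactive */
	e->active = 0;
}

static uint64_t fake_read(struct fake_event *e)
{
	if (e->active)
		fake_update(e); /* the pmu::read() path */
	return e->count;
}
```

The point being that ticks which happen while the event is inactive must
not leak into event->count, and read() only has to fold in the live
delta.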

> Yet another thing: where are we supposed to initialize our internal
> stuff? The numbers above are from a single run and even init is called
> multiple times. And where are we supposed to de-init our stuff - each
> time on del()? This hardly makes sense.

init happens in pmu::event_init(), which can set an optional
event->destroy() function for de-init.

init() is called once for each event created; the above creates an
inherited per-task event (I think, I lost track of what the perf tool
does) and 19 seems to suggest you did some 18 fork()/clone() calls after
that, resulting in your 1 parent event with 18 children.

> I should note that if perf is issued with the -I 10 option, then read()
> is called: init_c()=265, add_c()=132726, del_c()=131482,
> start_c()=133412, stop()=0, read()=71. However, the i915 counter is
> still 0. I have tried to print counter values from within read() and
> these values are non-zero. Actually read() returns a sequence of
> <non_zero>, 0, 0, 0, ..., <non_zero> because with our add(), del() code
> we regularly start/stop our counter and execution in read() follows
> different branches.
> 
> Thus, I think that right now we do not implement the PMU correctly and
> do not meet perf's expectations of a PMU. Unfortunately, right now I
> have no idea what these expectations are.

Please clarify how i915 works; right now I have no idea where to go.


* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-22 18:17       ` Peter Zijlstra
@ 2017-08-23 17:51         ` Rogozhkin, Dmitry V
  2017-08-23 18:01           ` Peter Zijlstra
                             ` (3 more replies)
  0 siblings, 4 replies; 32+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-08-23 17:51 UTC (permalink / raw)
  To: peterz; +Cc: Intel-gfx

On Tue, 2017-08-22 at 20:17 +0200, Peter Zijlstra wrote:
> On Sat, Aug 12, 2017 at 02:15:13AM +0000, Rogozhkin, Dmitry V wrote:
> > $ perf stat -e instructions,i915/rcs0-busy/ workload.sh
> > <... workload.sh output ...>
> > 
> > Performance counter stats for 'workload.sh':
> >      1,204,616,268      instructions
> >                  0      i915/rcs0-busy/
> > 
> >        1.869169153 seconds time elapsed
> > 
> > As you can see instructions event works pretty well, i915/rcs0-busy/
> > doesn't.
> > 
> > I'm afraid that our current understanding of how a PMU should work is
> > not fully correct.
> 
> Can we start off by explaining to me how this i915 stuff works? Because
> all I have is ~750 lines of patch without comments, which sort of leaves
> me confused.

i915 PMU tries to expose a number of SW metrics to characterize i915
performance: i915 interrupt counts, the time (counted in microseconds)
the GPU spent in specific states like RC6, the time (counted in
microseconds) GPU engines were busy executing tasks or were idle, etc.
For now these metrics are i915-global and can't be attached to some
particular task, although we may consider such metrics in the future. I
think a global PMU is a fine first step. I had a talk with Andi Kleen,
who pointed out that this is similar to how the uncore PMU behaves and
suggested I should use 'perf stat -e <event> -a -C0' and probably adjust
the code to register the i915 PMU with perf_invalid_context. I did that
in a patch which should probably be squashed into the patch from Tvrtko
(I did not want to step on his work). The change can be found here:
https://patchwork.freedesktop.org/patch/171953/. It makes 'perf stat -e
i915/rcs0-busy/' error out and supports 'perf stat -e i915/rcs0-busy/ -a
-C0'. I still think I am missing something, since 'perf stat -e
i915/rcs0-busy/ -a' will give metrics multiplied by the number of active
CPUs, but that looks closer to what is needed.

Anyhow, returning to the metrics i915 exposes. Some metrics are just an
exposure of counters already maintained inside i915 which do not require
any special sampling: at any given moment you can request the counter
value (these are the interrupt counts and i915 power consumption). Other
metrics are similar to the ones I just described, but they require
activation for i915 to start counting them - this is done on event
initialization (these are the engine busy stats). Finally, there is a
third group which requires sampled counting: they need to be
initialized, and the i915 PMU then starts an internal timer to
accumulate these values (these are some engine characteristics
referenced in the code as QUEUED, SEMA, WAIT).
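For completeness, the engine events above are addressed through the
class/instance/sample bit layout from the quoted i915_drm.h hunk. A
small userspace sketch of the encode/decode (the layout values are
copied from that hunk, the decode helper names are made up):

```c
#include <assert.h>

/* Layout from the quoted uAPI hunk: 4 sample bits at the bottom,
 * 8 instance bits above them, the engine class above that. */
#define I915_PMU_SAMPLE_BITS		4
#define I915_PMU_SAMPLE_MASK		0xf
#define I915_PMU_SAMPLE_INSTANCE_BITS	8
#define I915_PMU_CLASS_SHIFT \
	(I915_PMU_SAMPLE_BITS + I915_PMU_SAMPLE_INSTANCE_BITS)

#define __I915_PMU_ENGINE(class, instance, sample) \
	((class) << I915_PMU_CLASS_SHIFT | \
	 (instance) << I915_PMU_SAMPLE_BITS | \
	 (sample))

/* Hypothetical decode helpers mirroring what engine_event_class() and
 * friends in the posted PMU code would have to do. */
static unsigned int decode_sample(unsigned long config)
{
	return config & I915_PMU_SAMPLE_MASK;
}

static unsigned int decode_instance(unsigned long config)
{
	return (config >> I915_PMU_SAMPLE_BITS) &
	       ((1 << I915_PMU_SAMPLE_INSTANCE_BITS) - 1);
}

static unsigned int decode_class(unsigned long config)
{
	return config >> I915_PMU_CLASS_SHIFT;
}
```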

I hope this clarifies things, and that I have correctly covered the high
level description of the i915 PMU. I hope Chris will correct me if I was
wrong somewhere. Mind that the other author of the i915 PMU, Tvrtko, is
on vacation for one more week or so.


> 
> The above command tries to add an event 'i915/rcs0-busy/' to a task.
> How are i915 resources associated with any one particular task?

Currently they are not associated with any particular task; the metrics
are global.

> 
> Is there a unique i915 resource for each task? If not, I don't see how
> per-task event can ever work as expected.

This depends on what you mean by "expected". I see 2 variants:
1. Either this command line should return correct metric values, or
2. This command line should error out.
Right now the i915 PMU produces global metrics, thus I think the 2nd
variant is true, which I think is achievable if the PMU is registered
with perf_invalid_context.

> 
> > I think so, because the way the PMU entry points init(), add(), del(),
> > start(), stop() and read() are implemented does not correlate with how
> > many times they are called. I have counted them and here is the result:
> > init()=19, add()=44310, del()=43900, start()=44534, stop()=0, read()=0
> > 
> > Which means that we regularly attempt to start/stop the timer and/or
> > busy stats calculations. Another thing which draws attention is that
> > read() was not called at all. How is perf supposed to get the counter
> > value?
> 
> Both stop() and del() are supposed to update event->count. Only if we do
> sys_read() while the event is active (something perf-stat never does
> IIRC) will it issue pmu::read() to get an up-to-date number.
> 
> > Yet another thing: where are we supposed to initialize our internal
> > stuff? The numbers above are from a single run and even init is called
> > multiple times. And where are we supposed to de-init our stuff - each
> > time on del()? This hardly makes sense.
> 
> init happens in pmu::event_init(), which can set an optional
> event->destroy() function for de-init.
> 
> init() is called once for each event created; the above creates an
> inherited per-task event (I think, I lost track of what the perf tool
> does) and 19 seems to suggest you did some 18 fork()/clone() calls after
> that, resulting in your 1 parent event with 18 children.

This behavior changed significantly once I registered the i915 PMU with
perf_invalid_context, notifying the perf subsystem that our PMU produces
global metrics. Right now what I have is:
1. If I call "perf stat -e <event> -a -C0", the event is initialized
once.
2. If I call "perf stat -e <event> -a", the event is initialized as many
times as there are logical CPUs. On my system there were 8.

And there is another problem associated with "perf stat -e <event> -a":
since 8 events are initialized, I get metric values multiplied by 8. How
am I supposed to fix that? I suspect this is somehow related to the
cpumask attribute exposed by some other PMUs, for example uncore. They
are doing something like:

static cpumask_t uncore_cpu_mask;

static DEVICE_ATTR(cpumask, S_IRUGO, uncore_get_attr_cpumask, NULL);

static struct attribute *uncore_pmu_attrs[] = {
	&dev_attr_cpumask.attr,
	NULL,
};

static struct attribute_group uncore_pmu_attr_group = {
	.attrs = uncore_pmu_attrs,
};

Should i915 follow this approach as well?

> 
> > I should note that if perf will be issued with -I 10 option, then read()
> > is being called: init_c()=265, add_c()=132726, del_c()=131482,
> > start_c()=133412, stop()=0, read()=71. However, i915 counter is still 0.
> > I have tried to print counter values from within read() and these values
> > are non 0. Actually read() returns sequence of <non_zero>, 0, 0, 0, ...,
> > <no_zero> because with our add(), del() code we regularly start/stop our
> > counter and execution in read() follows different branches.
> > 
> > Thus, I think that right now we do not implement PMU correctly and do
> > not meet perf expectations from the PMU. Unfortunately, right now I have
> > no idea what are these expectations.
> 
> Please help clarify how i915 works, I have no idea where to go.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-23 17:51         ` Rogozhkin, Dmitry V
@ 2017-08-23 18:01           ` Peter Zijlstra
  2017-08-23 18:40             ` Rogozhkin, Dmitry V
  2017-08-23 18:04           ` Peter Zijlstra
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2017-08-23 18:01 UTC (permalink / raw)
  To: Rogozhkin, Dmitry V; +Cc: Intel-gfx

On Wed, Aug 23, 2017 at 05:51:38PM +0000, Rogozhkin, Dmitry V wrote:

> https://patchwork.freedesktop.org/patch/171953/. This patch makes 'perf
> stat -e i915/rcs0-busy/' to error out and supports 'perf stat -e
> i915/rcs0-busy/ -a -C0'. I still think I miss something since 'perf stat
> -e i915/rcs0-busy/ -a' will give metrics multiplied by number of active
> CPUs, but that's look closer to what is needed.

IIRC an uncore PMU should expose a cpumask through sysfs, and then perf
tools will read that mask and auto-magically limit the number of CPUs it
instantiates the counter on.

See for instance arch/x86/events/intel/cstate.c, that is a fairly
trivial uncore PMU, look for "DEVICE_ATTR(cpumask".

For example, on my 2 socket 10 core ivb-ep:

root@ivb-ep:~# cat /sys/bus/event_source/devices/cstate_pkg/cpumask
0,10

* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-23 17:51         ` Rogozhkin, Dmitry V
  2017-08-23 18:01           ` Peter Zijlstra
@ 2017-08-23 18:04           ` Peter Zijlstra
  2017-08-23 18:38             ` Rogozhkin, Dmitry V
  2017-08-23 18:05           ` Peter Zijlstra
  2017-08-23 18:22           ` Peter Zijlstra
  3 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2017-08-23 18:04 UTC (permalink / raw)
  To: Rogozhkin, Dmitry V; +Cc: Intel-gfx

On Wed, Aug 23, 2017 at 05:51:38PM +0000, Rogozhkin, Dmitry V wrote:

> > The above command tries to add an event 'i915/rcs0-busy/' to a task. How
> > are i915 resource associated to any one particular task?
> 
> Currently in no way, they are global.

Right. So no per DRM context things. Can you have multiple DRM contexts
per task? If so that would make things slightly tricky when you get away
from !global counters.

> > Is there a unique i915 resource for each task? If not, I don't see how
> > per-task event can ever work as expected.
> 
> This depends on what you mean under "expected"? I see 2 variants:
> 1. Either this command line should return correct metric values
> 2. This command line should error out

> Right now i915 PMU produces global metric, thus, I think 2nd variant is
> true. Which I think is achievable if PMU is registered with
> perf_invalid_context.

Agreed, and yes, perf_invalid_context is right for uncore stuff and will
result in refusal to create per-task counters.

* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-23 17:51         ` Rogozhkin, Dmitry V
  2017-08-23 18:01           ` Peter Zijlstra
  2017-08-23 18:04           ` Peter Zijlstra
@ 2017-08-23 18:05           ` Peter Zijlstra
  2017-08-23 18:22           ` Peter Zijlstra
  3 siblings, 0 replies; 32+ messages in thread
From: Peter Zijlstra @ 2017-08-23 18:05 UTC (permalink / raw)
  To: Rogozhkin, Dmitry V; +Cc: Intel-gfx

On Wed, Aug 23, 2017 at 05:51:38PM +0000, Rogozhkin, Dmitry V wrote:
> And there is another problem associated with the "perf stat -e <event>
> -a": since 8 events are initialized, I get metric values multiplied by
> 8. How I am supposed to fix that? I suspect that this is somehow related
> to cpumask attribute exposed by some other PMUs, for example uncore.
> They are doing something like:
> 
> static cpumask_t uncore_cpu_mask;
> 
> static DEVICE_ATTR(cpumask, S_IRUGO, uncore_get_attr_cpumask, NULL);
> 
> static struct attribute *uncore_pmu_attrs[] = {
> 	&dev_attr_cpumask.attr,
> 	NULL,
> };
> 
> static struct attribute_group uncore_pmu_attr_group = {
> 	.attrs = uncore_pmu_attrs,
> };
> 
> Should i915 follow this way as well?

Ah, so much for reading emails in parts, yes! that is what userspace
needs to know in order to DTRT.

* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-23 17:51         ` Rogozhkin, Dmitry V
                             ` (2 preceding siblings ...)
  2017-08-23 18:05           ` Peter Zijlstra
@ 2017-08-23 18:22           ` Peter Zijlstra
  2017-08-23 19:00             ` Rogozhkin, Dmitry V
  2017-08-28 22:57             ` Rogozhkin, Dmitry V
  3 siblings, 2 replies; 32+ messages in thread
From: Peter Zijlstra @ 2017-08-23 18:22 UTC (permalink / raw)
  To: Rogozhkin, Dmitry V; +Cc: Intel-gfx

On Wed, Aug 23, 2017 at 05:51:38PM +0000, Rogozhkin, Dmitry V wrote:

> Anyhow, returning to the metrics i915 exposes. Some metrics are just
> exposure of some counters supported already inside i915 PMU which do not
> require any special sampling: at any given moment you can request the
> counter value (these are interrupts counts, i915 power consumption).

> Other metrics are similar to the ever-existing which I just described,
> but they require activation for i915 to start to count them - this is
> done on the event initialization (these are engine busy stats).

Right, so depending on how expensive this activation is and if it can be
done without scheduling, there are two options:

 1) activate/deactivate from pmu::start()/pmu::stop()
 2) activate/deactivate from pmu::event_init()/event->destroy() and
    disregard all counting between pmu::stop() and pmu::start().

> Finally, there is a third group which require sampling counting: they
> are needed to be initialized and i915 pmu starts an internal timer to
> count these values (these are some engines characteristics referenced
> in the code as QUEUED, SEMA, WAIT).

So uncore PMUs can't really do sampling. That is, perf defines sampling
as interrupting the relevant task and then providing things like the
%RIP value at interrupt time. Since uncore activity cannot be associated
with any one task, no sampling allowed.

Now, I'm thinking that what i915 does is slightly different, it doesn't
provide registers to read out the counter state, but instead
periodically writes state snapshots into some memory buffer, right?

That's a bit tricky, maybe the best fit would be what PPC HV 24x7 does.
They create an event-group, that is a set of counters that are
co-scheduled, matching the set of counters they get from the HV
interface (or a subset) and then sys_read() will use a TXN_READ to
group-read the entire thing at once. In your case it could consume the
last state snapshot instead of request one (or wait for the next,
whatever works best).

Would that work?

* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-23 18:04           ` Peter Zijlstra
@ 2017-08-23 18:38             ` Rogozhkin, Dmitry V
  2017-08-23 19:28               ` Peter Zijlstra
  0 siblings, 1 reply; 32+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-08-23 18:38 UTC (permalink / raw)
  To: peterz; +Cc: Intel-gfx

On Wed, 2017-08-23 at 20:04 +0200, Peter Zijlstra wrote:
> On Wed, Aug 23, 2017 at 05:51:38PM +0000, Rogozhkin, Dmitry V wrote:
> 
> > > The above command tries to add an event 'i915/rcs0-busy/' to a task. How
> > > are i915 resource associated to any one particular task?
> > 
> > Currently in no way, they are global.
> 
> Right. So no per DRM context things. Can you have multiple DRM contexts
> per task? If so that would make things slightly tricky when you get away
> from !global counters.

Just to be sure we are on the same page: by "task" we mean the process
being monitored with perf, as in "perf stat -e <ev> task.sh"? Yes, each
process may have a few DRM contexts, so we are in a tricky space :).

If we decide to support per-task counters in the future, would it be
possible to preserve the global mode, giving the end user a choice:
monitor per-process or globally?

> 
> > > Is there a unique i915 resource for each task? If not, I don't see how
> > > per-task event can ever work as expected.
> > 
> > This depends on what you mean under "expected"? I see 2 variants:
> > 1. Either this command line should return correct metric values
> > 2. This command line should error out
> 
> > Right now i915 PMU produces global metric, thus, I think 2nd variant is
> > true. Which I think is achievable if PMU is registered with
> > perf_invalid_context.
> 
> Agreed, and yes, perf_invalid_context is right for uncore stuff and will
> result in refusal to create per-task counters.

* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-23 18:01           ` Peter Zijlstra
@ 2017-08-23 18:40             ` Rogozhkin, Dmitry V
  0 siblings, 0 replies; 32+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-08-23 18:40 UTC (permalink / raw)
  To: peterz; +Cc: Intel-gfx

On Wed, 2017-08-23 at 20:01 +0200, Peter Zijlstra wrote:
> On Wed, Aug 23, 2017 at 05:51:38PM +0000, Rogozhkin, Dmitry V wrote:
> 
> > https://patchwork.freedesktop.org/patch/171953/. This patch makes 'perf
> > stat -e i915/rcs0-busy/' to error out and supports 'perf stat -e
> > i915/rcs0-busy/ -a -C0'. I still think I miss something since 'perf stat
> > -e i915/rcs0-busy/ -a' will give metrics multiplied by number of active
> > CPUs, but that's look closer to what is needed.
> 
> IIRC an uncore PMU should expose a cpumask through sysfs, and then perf
> tools will read that mask and auto-magically limit the number of CPUs it
> instantiates the counter on.
> 
> See for instance arch/x86/events/intel/cstate.c, that is a fairly
> trivial uncore PMU, look for "DEVICE_ATTR(cpumask".

Thank you for the reference! I will go forward and try to implement this
for i915 PMU.

> 
> For example, on my 2 socket 10 core ivb-ep:
> 
> root@ivb-ep:~# cat /sys/bus/event_source/devices/cstate_pkg/cpumask
> 0,10
> 

* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-23 18:22           ` Peter Zijlstra
@ 2017-08-23 19:00             ` Rogozhkin, Dmitry V
  2017-08-23 19:33               ` Peter Zijlstra
  2017-08-28 22:57             ` Rogozhkin, Dmitry V
  1 sibling, 1 reply; 32+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-08-23 19:00 UTC (permalink / raw)
  To: peterz; +Cc: Intel-gfx

On Wed, 2017-08-23 at 20:22 +0200, Peter Zijlstra wrote:
> On Wed, Aug 23, 2017 at 05:51:38PM +0000, Rogozhkin, Dmitry V wrote:
> 
> > Anyhow, returning to the metrics i915 exposes. Some metrics are just
> > exposure of some counters supported already inside i915 PMU which do not
> > require any special sampling: at any given moment you can request the
> > counter value (these are interrupts counts, i915 power consumption).
> 
> > Other metrics are similar to the ever-existing which I just described,
> > but they require activation for i915 to start to count them - this is
> > done on the event initialization (these are engine busy stats).
> 
> Right, so depending on how expensive this activation is and if it can be
> done without scheduling, there are two options:
> 
>  1) activate/deactivate from pmu::start()/pmu::stop()
>  2) activate/deactivate from pmu::event_init()/event->destroy() and
>     disregard all counting between pmu::stop() and pmu::start().
> 
> > Finally, there is a third group which require sampling counting: they
> > are needed to be initialized and i915 pmu starts an internal timer to
> > count these values (these are some engines characteristics referenced
> > in the code as QUEUED, SEMA, WAIT).
> 
> So uncore PMUs can't really do sampling. That is, perf defines sampling
> as interrupting the relevant task and then providing things like the
> %RIP value at interrupt time. Since uncore activity cannot be associated
> with any one task, no sampling allowed.

I read this as we need to add:
static int i915_pmu_event_init(struct perf_event *event)
{
	struct hw_perf_event *hwc = &event->hw;
...
	/* Sampling not supported yet */
	if (hwc->sample_period)
		return -EINVAL;
...
}
and deny a sampling period for our events, right? And what should happen
is that the following command will start to fail: "perf stat -e ev -a -I
100"? Essentially you are saying that uncore PMUs should work only in a
"get value when asked" mode; for perf-stat the results are added at the
end of the run. However, there is no strict perf-subsystem denial of
uncore event sampling; this is delegated to each PMU implementation. Is
that correct? I wonder how this is supposed to work in case a PMU is
capable of producing both global and per-task metrics?


> 
> Now, I'm thinking that what i915 does is slightly different, it doesn't
> provide registers to read out the counter state, but instead
> periodically writes state snapshots into some memory buffer, right?
> 
> That's a bit tricky, maybe the best fit would be what PPC HV 24x7 does.
> They create an event-group, that is a set of counters that are
> co-scheduled, matching the set of counters they get from the HV
> interface (or a subset) and then sys_read() will use a TXN_READ to
> group-read the entire thing at once. In your case it could consume the
> last state snapshot instead of request one (or wait for the next,
> whatever works best).
> 
> Would that work?

Hm. Not sure I understand this with my userspace brain :). Why a group
read? Just to eliminate the overhead of doing the same/similar things
for all the metrics? Anyway, what completely slipped away from me is how
this relates to sampling if it is not allowed for uncore PMUs? So, will
there or will there not be sampling?

Essentially, the i915 PMU implements such "sampling" metrics in the
following way:
1. If such a metric is requested, the i915 PMU starts to do some
internal sampling which is not exposed to the perf subsystem. I think
there is some timer stuff currently associated with this.
2. On each i915_pmu_event_read() we just immediately return the values
we have gathered so far internally.
3. Since we were not aware of the no-sampling policy for uncore PMUs, we
have some code for sampling exposed to the perf subsystem, but it is
unrelated to #1. Essentially it is used just to call
i915_pmu_event_read(). Here is the code:

static DEFINE_PER_CPU(struct pt_regs, i915_pmu_pt_regs);

static enum hrtimer_restart hrtimer_sample(struct hrtimer *hrtimer)
{
	struct pt_regs *regs = this_cpu_ptr(&i915_pmu_pt_regs);
	struct perf_sample_data data;
	struct perf_event *event;
	u64 period;

	event = container_of(hrtimer, struct perf_event, hw.hrtimer);
	if (event->state != PERF_EVENT_STATE_ACTIVE)
		return HRTIMER_NORESTART;

	event->pmu->read(event);

	perf_sample_data_init(&data, 0, event->hw.last_period);
	perf_event_overflow(event, &data, regs);

	period = max_t(u64, 10000, event->hw.sample_period);
	hrtimer_forward_now(hrtimer, ns_to_ktime(period));
	return HRTIMER_RESTART;
}


By the way, I think Tvrtko had a question on why he needs this
"DEFINE_PER_CPU(struct pt_regs, i915_pmu_pt_regs)" stuff, which looked
somewhat strange. Here is a quote: "Also, once I exported the events for
enumeration and tried accessing them with perf, I had to create fake
pt_regs to pass to perf_event_overflow."

* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-23 18:38             ` Rogozhkin, Dmitry V
@ 2017-08-23 19:28               ` Peter Zijlstra
  0 siblings, 0 replies; 32+ messages in thread
From: Peter Zijlstra @ 2017-08-23 19:28 UTC (permalink / raw)
  To: Rogozhkin, Dmitry V; +Cc: Intel-gfx

On Wed, Aug 23, 2017 at 06:38:05PM +0000, Rogozhkin, Dmitry V wrote:
> On Wed, 2017-08-23 at 20:04 +0200, Peter Zijlstra wrote:
> > On Wed, Aug 23, 2017 at 05:51:38PM +0000, Rogozhkin, Dmitry V wrote:
> > 
> > > > The above command tries to add an event 'i915/rcs0-busy/' to a task. How
> > > > are i915 resource associated to any one particular task?
> > > 
> > > Currently in no way, they are global.
> > 
> > Right. So no per DRM context things. Can you have multiple DRM contexts
> > per task? If so that would make things slightly tricky when you get away
> > from !global counters.
> 
> Just to be sure we are on the same page: under task we understand the
> process which is being monitored with the perf: "perf stat -e <ev>
> task.sh"? Yes, each process may have few DRM contexts, so we are in a
> tricky space:).

task is really what userspace calls a thread.

> If we will decide to support per-task counters in the future, would it
> be possible to preserve global mode providing end-user a choice: monitor
> per-process or globally?

Of course, we could even do a whole new PMU for possible finer grained
stuff.

* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-23 19:00             ` Rogozhkin, Dmitry V
@ 2017-08-23 19:33               ` Peter Zijlstra
  0 siblings, 0 replies; 32+ messages in thread
From: Peter Zijlstra @ 2017-08-23 19:33 UTC (permalink / raw)
  To: Rogozhkin, Dmitry V; +Cc: Intel-gfx

On Wed, Aug 23, 2017 at 07:00:33PM +0000, Rogozhkin, Dmitry V wrote:
> On Wed, 2017-08-23 at 20:22 +0200, Peter Zijlstra wrote:
> > On Wed, Aug 23, 2017 at 05:51:38PM +0000, Rogozhkin, Dmitry V wrote:
> > 
> > > Anyhow, returning to the metrics i915 exposes. Some metrics are just
> > > exposure of some counters supported already inside i915 PMU which do not
> > > require any special sampling: at any given moment you can request the
> > > counter value (these are interrupts counts, i915 power consumption).
> > 
> > > Other metrics are similar to the ever-existing which I just described,
> > > but they require activation for i915 to start to count them - this is
> > > done on the event initialization (these are engine busy stats).
> > 
> > Right, so depending on how expensive this activation is and if it can be
> > done without scheduling, there are two options:
> > 
> >  1) activate/deactivate from pmu::start()/pmu::stop()
> >  2) activate/deactivate from pmu::event_init()/event->destroy() and
> >     disregard all counting between pmu::stop() and pmu::start().
> > 
> > > Finally, there is a third group which require sampling counting: they
> > > are needed to be initialized and i915 pmu starts an internal timer to
> > > count these values (these are some engines characteristics referenced
> > > in the code as QUEUED, SEMA, WAIT).
> > 
> > So uncore PMUs can't really do sampling. That is, perf defines sampling
> > as interrupting the relevant task and then providing things like the
> > %RIP value at interrupt time. Since uncore activity cannot be associated
> > with any one task, no sampling allowed.
> 
> I read this as we need to add:
> static int i915_pmu_event_init(struct perf_event *event)
> {
> ...
> 	/* Sampling not supported yet */
> 	if (hwc->sample_period)
> 		return -EINVAL;
> ...
> }
> And deny sampling period for our events, right? 

Something like that, yes.


> And what should happen
> is that the following command will start to fail: "perf stat -e ev -a -I
> 100"? 

No, perf stat -I works by userspace doing sys_read() with a userspace
timer.

It will stop perf-record from working.

* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-23 18:22           ` Peter Zijlstra
  2017-08-23 19:00             ` Rogozhkin, Dmitry V
@ 2017-08-28 22:57             ` Rogozhkin, Dmitry V
  2017-08-29  9:30               ` Peter Zijlstra
  1 sibling, 1 reply; 32+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-08-28 22:57 UTC (permalink / raw)
  To: peterz; +Cc: Intel-gfx

On Wed, 2017-08-23 at 20:22 +0200, Peter Zijlstra wrote:
> On Wed, Aug 23, 2017 at 05:51:38PM +0000, Rogozhkin, Dmitry V wrote:
> 
> > Anyhow, returning to the metrics i915 exposes. Some metrics are just
> > exposure of some counters supported already inside i915 PMU which do not
> > require any special sampling: at any given moment you can request the
> > counter value (these are interrupts counts, i915 power consumption).
> 
> > Other metrics are similar to the ever-existing which I just described,
> > but they require activation for i915 to start to count them - this is
> > done on the event initialization (these are engine busy stats).
> 
> Right, so depending on how expensive this activation is and if it can be
> done without scheduling, there are two options:
> 
>  1) activate/deactivate from pmu::start()/pmu::stop()
>  2) activate/deactivate from pmu::event_init()/event->destroy() and
>     disregard all counting between pmu::stop() and pmu::start().
> 
> > Finally, there is a third group which require sampling counting: they
> > are needed to be initialized and i915 pmu starts an internal timer to
> > count these values (these are some engines characteristics referenced
> > in the code as QUEUED, SEMA, WAIT).
> 
> So uncore PMUs can't really do sampling. That is, perf defines sampling
> as interrupting the relevant task and then providing things like the
> %RIP value at interrupt time. Since uncore activity cannot be associated
> with any one task, no sampling allowed.
> 
> Now, I'm thinking that what i915 does is slightly different, it doesn't
> provide registers to read out the counter state, but instead
> periodically writes state snapshots into some memory buffer, right?
> 
> That's a bit tricky, maybe the best fit would be what PPC HV 24x7 does.
> They create an event-group, that is a set of counters that are
> co-scheduled, matching the set of counters they get from the HV
> interface (or a subset) and then sys_read() will use a TXN_READ to
> group-read the entire thing at once. In your case it could consume the
> last state snapshot instead of request one (or wait for the next,
> whatever works best).
> 
> Would that work?

Hi Peter,

I have updated my fixes to Tvrtko's PMU, they are here:
https://patchwork.freedesktop.org/series/28842/, and I started to check
whether we will be able to cover all the use cases for this PMU which we
had in mind. Here I have some concerns and further questions.

So, as soon as I registered PMU with the perf_invalid_context, i.e. as
an uncore PMU, I got the effect that metrics from our PMU are available
under root only. This happens since we fall to the following case
described in 'man perf_event_open': "A pid == -1 and cpu >= 0 setting is
per-CPU and measures all processes on the specified CPU.  Per-CPU events
need  the  CAP_SYS_ADMIN  capability  or
a /proc/sys/kernel/perf_event_paranoid value of less than 1."

This is a trouble point for us... So could you please clarify:
1. How is the PMU API positioned? Is it for debug purposes only, or can
it be used in end-user release applications to monitor system activity
and make decisions based on that?
2. How can non-privileged applications access uncore PMU metrics?
3. Is restricting uncore PMU metrics reporting to privileged
applications a strong requirement, or can it be relaxed?


I understand why the restriction was relevant when only CPU-level
counters were available: system-wide events were expensive. But I don't
quite understand why these restrictions are in place now for uncore
PMUs, when they actually report metrics right away. Is that just a
remnant of CPU-only times that no one needed changed? Can this be
changed so that uncore metrics are allowed to be accessed from general
applications?


Regards,
Dmitry.

* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-28 22:57             ` Rogozhkin, Dmitry V
@ 2017-08-29  9:30               ` Peter Zijlstra
  2017-08-29 19:16                 ` Rogozhkin, Dmitry V
  0 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2017-08-29  9:30 UTC (permalink / raw)
  To: Rogozhkin, Dmitry V; +Cc: Intel-gfx

On Mon, Aug 28, 2017 at 10:43:17PM +0000, Rogozhkin, Dmitry V wrote:

> Hi Peter,
>
> I have updated my fixes to Tvrtko's PMU, they are here:
> https://patchwork.freedesktop.org/series/28842/, and I started to check
> whether we will be able to cover all the use cases for this PMU which we
> had in mind. Here I have some concerns and further questions.
>
> So, as soon as I registered PMU with the perf_invalid_context, i.e. as
> an uncore PMU, I got the effect that metrics from our PMU are available
> under root only. This happens since we fall to the following case
> described in 'man perf_event_open': "A pid == -1 and cpu >= 0 setting is
> per-CPU and measures all processes on the specified CPU.  Per-CPU events
> need  the  CAP_SYS_ADMIN  capability  or
> a /proc/sys/kernel/perf_event_paranoid value of less than 1."
>
> This a trouble point for us... So, could you, please, clarify:
> 1. How PMU API is positioned? It is for debug purposes only or it can be
> used in the end-user release applications to monitor system activity and
> make some decisions based on that?

Perf is meant to also be usable for end-users, _however_ by default it
will only allow users to profile their own userspace (tasks).

Allowing unpriv users access to kernel data is a clear data leak (you
instantly void KASLR for instance).

And since uncore PMUs are not uniquely tied to individual user context,
unpriv users have no access.

> 2. How applications can access uncore PMU metrics from non-privileged
> applications?

Only by lowering sysctl.kernel.perf_event_paranoid.

Since uncore is shared among many users, its events can be used to
construct side-channel attacks. So from a security pov this is not
something that can otherwise be done.

Imagine user A using the GPU to do crypto and user B reading the GPU
events to infer state or similar things.

Without privilege separation we cannot allow unpriv access to system
wide resources.

> 3. Is that a strong requirement to restrict uncore PMU metrics reporting
> to privileged applications or this can be relaxed?

Pretty strict, people tend to get fairly upset every time we leak stuff.
In fact Debian and Android carry a perf_event_paranoid patch that
default disables _everything_ :-(

* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-29  9:30               ` Peter Zijlstra
@ 2017-08-29 19:16                 ` Rogozhkin, Dmitry V
  2017-08-29 19:21                   ` Peter Zijlstra
  0 siblings, 1 reply; 32+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-08-29 19:16 UTC (permalink / raw)
  To: peterz; +Cc: Intel-gfx

On Tue, 2017-08-29 at 11:30 +0200, Peter Zijlstra wrote:
> On Mon, Aug 28, 2017 at 10:43:17PM +0000, Rogozhkin, Dmitry V wrote:
> 
> > Hi Peter,
> >
> > I have updated my fixes to Tvrtko's PMU, they are here:
> > https://patchwork.freedesktop.org/series/28842/, and I started to check
> > whether we will be able to cover all the use cases for this PMU which we
> > had in mind. Here I have some concerns and further questions.
> >
> > So, as soon as I registered PMU with the perf_invalid_context, i.e. as
> > an uncore PMU, I got the effect that metrics from our PMU are available
> > under root only. This happens since we fall to the following case
> > described in 'man perf_event_open': "A pid == -1 and cpu >= 0 setting is
> > per-CPU and measures all processes on the specified CPU.  Per-CPU events
> > need  the  CAP_SYS_ADMIN  capability  or
> > a /proc/sys/kernel/perf_event_paranoid value of less than 1."
> >
> > This a trouble point for us... So, could you, please, clarify:
> > 1. How PMU API is positioned? It is for debug purposes only or it can be
> > used in the end-user release applications to monitor system activity and
> > make some decisions based on that?
> 
> Perf is meant to also be usable for end-users, _however_ by default it
> will only allow users to profile their own userspace (tasks).
> 
> Allowing unpriv users access to kernel data is a clear data leak (you
> instantly void KASLR for instance).
> 
> And since uncore PMUs are not uniquely tied to individual user context,
> unpriv users have no access.
> 

If only someone knew how much I don't like to step into such
situations :). Ok, thank you for the clarification. This affects how we
can use this API, but it does not make it unusable. Good that it is
intended for end-users as well.

> > 2. How applications can access uncore PMU metrics from non-privileged
> > applications?
> 
> Only by lowering sysctl.kernel.perf_event_paranoid.
> Since uncore is shared among many users, its events can be used to
> construct side-channel attacks. So from a security pov this is not
> something that can otherwise be done.
> 
> Imagine user A using the GPU to do crypto and user B reading the GPU
> events to infer state or similar things.
> 
> Without privilege separation we cannot allow unpriv access to system
> wide resources.
> 
> > 3. Is that a strong requirement to restrict uncore PMU metrics reporting
> > to privileged applications or this can be relaxed?
> 
> Pretty strict, people tend to get fairly upset every time we leak stuff.
> In fact Debian and Android carry a perf_event_paranoid patch that
> default disables _everything_ :-(

Can you say more on that for Debian and Android? What exactly they do?
What is the value of perf_event_paranoid there? They disable everything
even for root and CAP_SYS_ADMIN? But still they don't remove this from
kernel on compilation stage, right? So users can explicitly change
perf_event_paranoid to the desired value?
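
The per-CPU restriction quoted from `man perf_event_open` earlier in this
message can be modeled as a small predicate. This is an illustrative Python
sketch of the documented rule, not the kernel's actual implementation; the
default paranoid value of 2 used in the examples is an assumption about a
typical distribution:

```python
def can_open_percpu_event(pid, cpu, perf_event_paranoid, has_cap_sys_admin):
    """Model the rule quoted above from man perf_event_open:
    a pid == -1 && cpu >= 0 (per-CPU / uncore-style) event needs
    CAP_SYS_ADMIN or perf_event_paranoid < 1."""
    if pid == -1 and cpu >= 0:
        return has_cap_sys_admin or perf_event_paranoid < 1
    # Other pid/cpu combinations are governed by different (looser) checks.
    return True

# An unprivileged user on a perf_event_paranoid=2 system is refused:
assert not can_open_percpu_event(-1, 0, 2, False)
# Root (CAP_SYS_ADMIN) is still allowed:
assert can_open_percpu_event(-1, 0, 2, True)
```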

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-29 19:16                 ` Rogozhkin, Dmitry V
@ 2017-08-29 19:21                   ` Peter Zijlstra
  2017-09-13 23:05                     ` Rogozhkin, Dmitry V
  0 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2017-08-29 19:21 UTC (permalink / raw)
  To: Rogozhkin, Dmitry V; +Cc: Intel-gfx

On Tue, Aug 29, 2017 at 07:16:31PM +0000, Rogozhkin, Dmitry V wrote:
> > Pretty strict, people tend to get fairly upset every time we leak stuff.
> > In fact Debian and Android carry a perf_event_paranoid patch that
> > default disables _everything_ :-(
> 
> Can you say more on that for Debian and Android? What exactly they do?
> What is the value of perf_event_paranoid there? They disable everything
> even for root and CAP_SYS_ADMIN? But still they don't remove this from
> kernel on compilation stage, right? So users can explicitly change
> perf_event_paranoid to the desired value?

They introduce (and default to) perf_event_paranoid = 3. Which
disallows everything for unpriv user, root can still do things IIRC, I'd
have to dig out the patch.

This way apps have no access to the syscall, but you can enable it using
ADB by lowering the setting. So developers still have access, but
regular apps do not.
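
Taking the mainline levels documented in `man perf_event_open` together with
the out-of-tree level described above, the effect of perf_event_paranoid on
unprivileged users can be summarized in a small helper. This is an
illustrative sketch, not kernel source; the restrictions are cumulative, and
level 3 exists only in the Debian/Android patch:

```python
def strictest_restriction(paranoid):
    """Return the strictest restriction placed on *unprivileged* users at
    a given perf_event_paranoid level. Levels -1..2 are mainline; 3 is
    the out-of-tree Debian/Android default."""
    if paranoid >= 3:
        return "perf_event_open() disallowed entirely"
    if paranoid >= 2:
        return "kernel profiling disallowed"
    if paranoid >= 1:
        return "per-CPU (system-wide) events disallowed"
    if paranoid >= 0:
        return "raw tracepoint access disallowed"
    return "no restrictions"

assert strictest_restriction(3) == "perf_event_open() disallowed entirely"
assert strictest_restriction(-1) == "no restrictions"
```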



* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-08-29 19:21                   ` Peter Zijlstra
@ 2017-09-13 23:05                     ` Rogozhkin, Dmitry V
  2017-09-20 20:14                       ` Rogozhkin, Dmitry V
  0 siblings, 1 reply; 32+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-09-13 23:05 UTC (permalink / raw)
  To: peterz; +Cc: Intel-gfx

On Tue, 2017-08-29 at 21:21 +0200, Peter Zijlstra wrote:
> On Tue, Aug 29, 2017 at 07:16:31PM +0000, Rogozhkin, Dmitry V wrote:
> > > Pretty strict, people tend to get fairly upset every time we leak stuff.
> > > In fact Debian and Android carry a perf_event_paranoid patch that
> > > default disables _everything_ :-(
> > 
> > Can you say more on that for Debian and Android? What exactly they do?
> > What is the value of perf_event_paranoid there? They disable everything
> > even for root and CAP_SYS_ADMIN? But still they don't remove this from
> > kernel on compilation stage, right? So users can explicitly change
> > perf_event_paranoid to the desired value?
> 
> They introduce (and default to) perf_event_paranoid = 3. Which
> disallows everything for unpriv user, root can still do things IIRC, I'd
> have to dig out the patch.
> 
> This way apps have no access to the syscall, but you can enable it using
> ADB by lowering the setting. So developers still have access, but
> regular apps do not.
> 

Hi, Peter.

How would you feel about the following idea (or something close to it):
1. We introduce one more level, perf_event_paranoid=4 (or =3, I am not
sure whether the Debian/Android =3 is considered uAPI), which would mean:
"disallow kernel profiling for unpriv, but let individual kernel modules
have their own settings".
2. We will have an i915 PMU custom setting,
"/sys/module/i915/parameters/perf_event_paranoid", which will be in
effect only if the global perf_event_paranoid=4 (or =3), and it will
prevail over the global setting.

Would anything like that be acceptable upstream? This would permit
customers to configure i915 PMU support for unpriv users separately from
the rest of the PMU subsystem.
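
As an illustration only, the proposed two-level check might look like the
sketch below. Everything here is hypothetical: neither perf_event_paranoid=4
nor the i915 module parameter exists upstream, and the `< 1` threshold for
per-CPU events is taken from the man-page rule discussed earlier:

```python
def i915_event_allowed(global_paranoid, i915_paranoid, has_cap_sys_admin):
    """Sketch of the proposal: with global perf_event_paranoid == 4, the
    per-module /sys/module/i915/parameters/perf_event_paranoid value takes
    precedence for i915 events; otherwise the global setting applies."""
    if has_cap_sys_admin:
        return True
    effective = i915_paranoid if global_paranoid == 4 else global_paranoid
    return effective < 1  # per-CPU (uncore) events need paranoid < 1

# Global lockdown at 4, but the admin opted i915 in for unpriv users:
assert i915_event_allowed(4, 0, False)
# Default global policy (2) still blocks unpriv uncore access:
assert not i915_event_allowed(2, 0, False)
```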

Regards,
Dmitry.



* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-09-13 23:05                     ` Rogozhkin, Dmitry V
@ 2017-09-20 20:14                       ` Rogozhkin, Dmitry V
  2017-10-02 21:28                         ` Rogozhkin, Dmitry V
  0 siblings, 1 reply; 32+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-09-20 20:14 UTC (permalink / raw)
  To: Rogozhkin, Dmitry V, peterz; +Cc: Intel-gfx

Hi Peter, could you please comment on the below?

-----Original Message-----
From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf Of Rogozhkin, Dmitry V
Sent: Wednesday, September 13, 2017 4:06 PM
To: peterz@infradead.org
Cc: Intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [RFC 04/10] drm/i915: Expose a PMU interface for perf queries

On Tue, 2017-08-29 at 21:21 +0200, Peter Zijlstra wrote:
> On Tue, Aug 29, 2017 at 07:16:31PM +0000, Rogozhkin, Dmitry V wrote:
> > > Pretty strict, people tend to get fairly upset every time we leak stuff.
> > > In fact Debian and Android carry a perf_event_paranoid patch that 
> > > default disables _everything_ :-(
> > 
> > Can you say more on that for Debian and Android? What exactly they do?
> > What is the value of perf_event_paranoid there? They disable 
> > everything even for root and CAP_SYS_ADMIN? But still they don't 
> > remove this from kernel on compilation stage, right? So users can 
> > explicitly change perf_event_paranoid to the desired value?
> 
> They introduce (and default to) perf_event_paranoid = 3. Which 
> disallows everything for unpriv user, root can still do things IIRC, 
> I'd have to dig out the patch.
> 
> This way apps have no access to the syscall, but you can enable it 
> using ADB by lowering the setting. So developers still have access, 
> but regular apps do not.
> 

Hi, Peter.

How would you feel about the following idea (or something close to it):
1. We introduce one more level, perf_event_paranoid=4 (or =3, I am not sure whether the Debian/Android =3 is considered uAPI), which would mean:
"disallow kernel profiling for unpriv, but let individual kernel modules have their own settings".
2. We will have an i915 PMU custom setting, "/sys/module/i915/parameters/perf_event_paranoid", which will be in effect only if the global perf_event_paranoid=4 (or =3), and it will prevail over the global setting.

Would anything like that be acceptable upstream? This would permit customers to configure i915 PMU support for unpriv users separately from the rest of the PMU subsystem.

Regards,
Dmitry.



* Re: [RFC 04/10] drm/i915: Expose a PMU interface for perf queries
  2017-09-20 20:14                       ` Rogozhkin, Dmitry V
@ 2017-10-02 21:28                         ` Rogozhkin, Dmitry V
  0 siblings, 0 replies; 32+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-10-02 21:28 UTC (permalink / raw)
  To: peterz; +Cc: Intel-gfx

Hi Peter, a friendly reminder. Could you please respond?

-----Original Message-----
From: Rogozhkin, Dmitry V 
Sent: Wednesday, September 20, 2017 1:15 PM
To: Rogozhkin, Dmitry V <dmitry.v.rogozhkin@intel.com>; peterz@infradead.org
Cc: Intel-gfx@lists.freedesktop.org
Subject: RE: [Intel-gfx] [RFC 04/10] drm/i915: Expose a PMU interface for perf queries

Hi Peter, could you, please, comment on below?

-----Original Message-----
From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf Of Rogozhkin, Dmitry V
Sent: Wednesday, September 13, 2017 4:06 PM
To: peterz@infradead.org
Cc: Intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [RFC 04/10] drm/i915: Expose a PMU interface for perf queries

On Tue, 2017-08-29 at 21:21 +0200, Peter Zijlstra wrote:
> On Tue, Aug 29, 2017 at 07:16:31PM +0000, Rogozhkin, Dmitry V wrote:
> > > Pretty strict, people tend to get fairly upset every time we leak stuff.
> > > In fact Debian and Android carry a perf_event_paranoid patch that 
> > > default disables _everything_ :-(
> > 
> > Can you say more on that for Debian and Android? What exactly they do?
> > What is the value of perf_event_paranoid there? They disable 
> > everything even for root and CAP_SYS_ADMIN? But still they don't 
> > remove this from kernel on compilation stage, right? So users can 
> > explicitly change perf_event_paranoid to the desired value?
> 
> They introduce (and default to) perf_event_paranoid = 3. Which 
> disallows everything for unpriv user, root can still do things IIRC, 
> I'd have to dig out the patch.
> 
> This way apps have no access to the syscall, but you can enable it 
> using ADB by lowering the setting. So developers still have access, 
> but regular apps do not.
> 

Hi, Peter.

How would you feel about the following idea (or something close to it):
1. We introduce one more level, perf_event_paranoid=4 (or =3, I am not sure whether the Debian/Android =3 is considered uAPI), which would mean:
"disallow kernel profiling for unpriv, but let individual kernel modules have their own settings".
2. We will have an i915 PMU custom setting, "/sys/module/i915/parameters/perf_event_paranoid", which will be in effect only if the global perf_event_paranoid=4 (or =3), and it will prevail over the global setting.

Would anything like that be acceptable upstream? This would permit customers to configure i915 PMU support for unpriv users separately from the rest of the PMU subsystem.

Regards,
Dmitry.



end of thread, other threads:[~2017-10-02 21:28 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-02 12:32 [RFC v2 00/10] i915 PMU and engine busy stats Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 01/10] drm/i915: Convert intel_rc6_residency_us to ns Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 02/10] drm/i915: Add intel_energy_uJ Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 03/10] drm/i915: Extract intel_get_cagf Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 04/10] drm/i915: Expose a PMU interface for perf queries Tvrtko Ursulin
2017-08-02 13:00   ` Tvrtko Ursulin
2017-08-12  2:15     ` Rogozhkin, Dmitry V
2017-08-22 18:17       ` Peter Zijlstra
2017-08-23 17:51         ` Rogozhkin, Dmitry V
2017-08-23 18:01           ` Peter Zijlstra
2017-08-23 18:40             ` Rogozhkin, Dmitry V
2017-08-23 18:04           ` Peter Zijlstra
2017-08-23 18:38             ` Rogozhkin, Dmitry V
2017-08-23 19:28               ` Peter Zijlstra
2017-08-23 18:05           ` Peter Zijlstra
2017-08-23 18:22           ` Peter Zijlstra
2017-08-23 19:00             ` Rogozhkin, Dmitry V
2017-08-23 19:33               ` Peter Zijlstra
2017-08-28 22:57             ` Rogozhkin, Dmitry V
2017-08-29  9:30               ` Peter Zijlstra
2017-08-29 19:16                 ` Rogozhkin, Dmitry V
2017-08-29 19:21                   ` Peter Zijlstra
2017-09-13 23:05                     ` Rogozhkin, Dmitry V
2017-09-20 20:14                       ` Rogozhkin, Dmitry V
2017-10-02 21:28                         ` Rogozhkin, Dmitry V
2017-08-02 12:32 ` [RFC 05/10] drm/i915/pmu: Suspend sampling when GPU is idle Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 06/10] drm/i915: Wrap context schedule notification Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 07/10] drm/i915: Engine busy time tracking Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 08/10] drm/i915: Export engine busy stats in debugfs Tvrtko Ursulin
2017-08-02 12:32 ` [RFC 09/10] drm/i915/pmu: Wire up engine busy stats to PMU Tvrtko Ursulin
2017-08-02 12:39 ` [RFC 10/10] drm/i915: Gate engine stats collection with a static key Tvrtko Ursulin
2017-08-02 12:56 ` ✗ Fi.CI.BAT: warning for i915 PMU and engine busy stats (rev2) Patchwork
