intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH 0/8] Enable triggered perf query for Xe_HP
@ 2021-08-03 20:13 Umesh Nerlige Ramappa
  2021-08-03 20:13 ` [Intel-gfx] [PATCH 1/8] drm/i915/gt: Lock intel_engine_apply_whitelist with uncore->lock Umesh Nerlige Ramappa
                   ` (10 more replies)
  0 siblings, 11 replies; 15+ messages in thread
From: Umesh Nerlige Ramappa @ 2021-08-03 20:13 UTC (permalink / raw)
  To: intel-gfx, Lionel G Landwerlin, Ashutosh Dixit, daniel; +Cc: dri-devel

This is a revival of the patch series to support triggered perf query reports
from here - https://patchwork.freedesktop.org/series/83831/

The patches were not pushed earlier because corresponding UMD changes were
missing. This revival addresses UMD changes in GPUvis for this series. GPUvis
uses the perf library in igt-gpu-tools. Changes to the library are here -
https://patchwork.freedesktop.org/series/93355/

GPUvis changes will be posted as a PR once the above library and kernel changes
are pushed.

Summary of the feature:

Current platforms provide MI_REPORT_PERF_COUNT to query a snapshot of perf
counters from a batch. This mechanism does not have consistent support on all
engines for newer platforms. To support perf query, all new platforms use a
mechanism to trigger OA report snapshots into the OA buffer by writing to a HW
register from a batch. To lookup this report in the OA buffer quickly, the OA
buffer is mmapped into the user space.

This series implements the new query mechanism.

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>

Chris Wilson (3):
  drm/i915/gt: Refactor _wa_add to reuse wa_index and wa_list_grow
  drm/i915/gt: Check for conflicting RING_NONPRIV
  drm/i915/gt: Enable dynamic adjustment of RING_NONPRIV

Piotr Maciejewski (1):
  drm/i915/perf: Ensure observation logic is not clock gated

Umesh Nerlige Ramappa (4):
  drm/i915/gt: Lock intel_engine_apply_whitelist with uncore->lock
  drm/i915/perf: Whitelist OA report trigger registers
  drm/i915/perf: Whitelist OA counter and buffer registers
  drm/i915/perf: Map OA buffer to user space for gen12 performance query

 drivers/gpu/drm/i915/gem/i915_gem_mman.c      |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_mman.h      |   2 +
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 269 +++++++++++++-----
 drivers/gpu/drm/i915/gt/intel_workarounds.h   |   7 +
 .../gpu/drm/i915/gt/selftest_workarounds.c    | 237 +++++++++++++++
 drivers/gpu/drm/i915/i915_perf.c              | 228 ++++++++++++++-
 drivers/gpu/drm/i915/i915_perf_types.h        |   8 +
 drivers/gpu/drm/i915/i915_reg.h               |  30 +-
 include/uapi/drm/i915_drm.h                   |  33 +++
 9 files changed, 728 insertions(+), 88 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 15+ messages in thread
* [Intel-gfx] [PATCH 0/8] Enable triggered perf query for Xe_HP
@ 2021-08-30 19:38 Umesh Nerlige Ramappa
  2021-08-30 19:38 ` [Intel-gfx] [PATCH 1/8] drm/i915/gt: Lock intel_engine_apply_whitelist with uncore->lock Umesh Nerlige Ramappa
  0 siblings, 1 reply; 15+ messages in thread
From: Umesh Nerlige Ramappa @ 2021-08-30 19:38 UTC (permalink / raw)
  To: intel-gfx, Lionel G Landwerlin, Ashutosh Dixit
  Cc: dri-devel, daniel.vetter, Joonas Lahtinen, jason

This is a revival of the patch series to support triggered perf query reports
from here - https://patchwork.freedesktop.org/series/83831/

The patches were not pushed earlier because corresponding UMD changes were
missing. This revival addresses UMD changes in GPUvis for this series. GPUvis
uses the perf library in igt-gpu-tools. Changes to the library are here -
https://patchwork.freedesktop.org/series/93355/

GPUvis changes will be posted as a PR once the above library and kernel changes
are pushed.

Summary of the feature:

Current platforms provide MI_REPORT_PERF_COUNT to query a snapshot of perf
counters from a batch. This mechanism does not have consistent support on all
engines for newer platforms. To support perf query, all new platforms use a
mechanism to trigger OA report snapshots into the OA buffer by writing to a HW
register from a batch. To lookup this report in the OA buffer quickly, the OA
buffer is mmapped into the user space.

This series implements the new query mechanism.

v2: Fix BAT failure (Umesh)
v3: Fix selftest (Umesh)
v4: Update uapi comment (Umesh)

Test-with: 20210830193337.15260-1-umesh.nerlige.ramappa@intel.com
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>

Chris Wilson (3):
  drm/i915/gt: Refactor _wa_add to reuse wa_index and wa_list_grow
  drm/i915/gt: Check for conflicting RING_NONPRIV
  drm/i915/gt: Enable dynamic adjustment of RING_NONPRIV

Piotr Maciejewski (1):
  drm/i915/perf: Ensure observation logic is not clock gated

Umesh Nerlige Ramappa (4):
  drm/i915/gt: Lock intel_engine_apply_whitelist with uncore->lock
  drm/i915/perf: Whitelist OA report trigger registers
  drm/i915/perf: Whitelist OA counter and buffer registers
  drm/i915/perf: Map OA buffer to user space for gen12 performance query

 drivers/gpu/drm/i915/gem/i915_gem_mman.c      |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_mman.h      |   2 +
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 269 +++++++++++++-----
 drivers/gpu/drm/i915/gt/intel_workarounds.h   |   7 +
 .../gpu/drm/i915/gt/selftest_workarounds.c    | 267 +++++++++++++++++
 drivers/gpu/drm/i915/i915_perf.c              | 228 ++++++++++++++-
 drivers/gpu/drm/i915/i915_perf_types.h        |   8 +
 drivers/gpu/drm/i915/i915_reg.h               |  30 +-
 include/uapi/drm/i915_drm.h                   |  33 +++
 9 files changed, 758 insertions(+), 88 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 15+ messages in thread
* [Intel-gfx] [PATCH 1/8] drm/i915/gt: Lock intel_engine_apply_whitelist with uncore->lock
@ 2020-11-17 11:01 Chris Wilson
  0 siblings, 0 replies; 15+ messages in thread
From: Chris Wilson @ 2020-11-17 11:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

From: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>

Refactor intel_engine_apply_whitelist into locked and unlocked versions
so that a caller who already has the lock can apply whitelist.

v2: Fix sparse warning
v3: (Chris)
- Drop prefix and suffix for static function
- Use longest to shortest line ordering for variable declaration

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 44 +++++++++++++++------
 1 file changed, 31 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index adc9a8ea410a..c49083957074 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1339,7 +1339,8 @@ void intel_gt_init_workarounds(struct drm_i915_private *i915)
 }
 
 static enum forcewake_domains
-wal_get_fw_for_rmw(struct intel_uncore *uncore, const struct i915_wa_list *wal)
+wal_get_fw(struct intel_uncore *uncore, const struct i915_wa_list *wal,
+	   unsigned int op)
 {
 	enum forcewake_domains fw = 0;
 	struct i915_wa *wa;
@@ -1348,8 +1349,7 @@ wal_get_fw_for_rmw(struct intel_uncore *uncore, const struct i915_wa_list *wal)
 	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
 		fw |= intel_uncore_forcewake_for_reg(uncore,
 						     wa->reg,
-						     FW_REG_READ |
-						     FW_REG_WRITE);
+						     op);
 
 	return fw;
 }
@@ -1379,7 +1379,7 @@ wa_list_apply(struct intel_uncore *uncore, const struct i915_wa_list *wal)
 	if (!wal->count)
 		return;
 
-	fw = wal_get_fw_for_rmw(uncore, wal);
+	fw = wal_get_fw(uncore, wal, FW_REG_READ | FW_REG_WRITE);
 
 	spin_lock_irqsave(&uncore->lock, flags);
 	intel_uncore_forcewake_get__locked(uncore, fw);
@@ -1705,27 +1705,45 @@ void intel_engine_init_whitelist(struct intel_engine_cs *engine)
 	wa_init_finish(w);
 }
 
-void intel_engine_apply_whitelist(struct intel_engine_cs *engine)
+static void __engine_apply_whitelist(struct intel_engine_cs *engine)
 {
 	const struct i915_wa_list *wal = &engine->whitelist;
 	struct intel_uncore *uncore = engine->uncore;
 	const u32 base = engine->mmio_base;
+	enum forcewake_domains fw;
 	struct i915_wa *wa;
 	unsigned int i;
 
-	if (!wal->count)
-		return;
+	lockdep_assert_held(&uncore->lock);
+
+	fw = wal_get_fw(uncore, wal, FW_REG_WRITE);
+	intel_uncore_forcewake_get__locked(uncore, fw);
 
 	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
-		intel_uncore_write(uncore,
-				   RING_FORCE_TO_NONPRIV(base, i),
-				   i915_mmio_reg_offset(wa->reg));
+		intel_uncore_write_fw(uncore,
+				      RING_FORCE_TO_NONPRIV(base, i),
+				      i915_mmio_reg_offset(wa->reg));
 
 	/* And clear the rest just in case of garbage */
 	for (; i < RING_MAX_NONPRIV_SLOTS; i++)
-		intel_uncore_write(uncore,
-				   RING_FORCE_TO_NONPRIV(base, i),
-				   i915_mmio_reg_offset(RING_NOPID(base)));
+		intel_uncore_write_fw(uncore,
+				      RING_FORCE_TO_NONPRIV(base, i),
+				      i915_mmio_reg_offset(RING_NOPID(base)));
+
+	intel_uncore_forcewake_put__locked(uncore, fw);
+}
+
+void intel_engine_apply_whitelist(struct intel_engine_cs *engine)
+{
+	const struct i915_wa_list *wal = &engine->whitelist;
+	unsigned long flags;
+
+	if (!wal->count)
+		return;
+
+	spin_lock_irqsave(&engine->uncore->lock, flags);
+	__engine_apply_whitelist(engine);
+	spin_unlock_irqrestore(&engine->uncore->lock, flags);
 }
 
 static void
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-08-30 19:39 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-03 20:13 [Intel-gfx] [PATCH 0/8] Enable triggered perf query for Xe_HP Umesh Nerlige Ramappa
2021-08-03 20:13 ` [Intel-gfx] [PATCH 1/8] drm/i915/gt: Lock intel_engine_apply_whitelist with uncore->lock Umesh Nerlige Ramappa
2021-08-03 20:13 ` [Intel-gfx] [PATCH 2/8] drm/i915/gt: Refactor _wa_add to reuse wa_index and wa_list_grow Umesh Nerlige Ramappa
2021-08-03 20:13 ` [Intel-gfx] [PATCH 3/8] drm/i915/gt: Check for conflicting RING_NONPRIV Umesh Nerlige Ramappa
2021-08-03 20:13 ` [Intel-gfx] [PATCH 4/8] drm/i915/gt: Enable dynamic adjustment of RING_NONPRIV Umesh Nerlige Ramappa
2021-08-03 20:13 ` [Intel-gfx] [PATCH 5/8] drm/i915/perf: Ensure observation logic is not clock gated Umesh Nerlige Ramappa
2021-08-03 20:13 ` [Intel-gfx] [PATCH 6/8] drm/i915/perf: Whitelist OA report trigger registers Umesh Nerlige Ramappa
2021-08-03 20:13 ` [Intel-gfx] [PATCH 7/8] drm/i915/perf: Whitelist OA counter and buffer registers Umesh Nerlige Ramappa
2021-08-03 20:13 ` [Intel-gfx] [PATCH 8/8] drm/i915/perf: Map OA buffer to user space for gen12 performance query Umesh Nerlige Ramappa
2021-08-03 20:18 ` [Intel-gfx] [PATCH 0/8] Enable triggered perf query for Xe_HP Umesh Nerlige Ramappa
2021-08-03 20:27   ` Umesh Nerlige Ramappa
2021-08-03 21:17 ` [Intel-gfx] ✗ Fi.CI.DOCS: warning for " Patchwork
2021-08-03 21:58 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2021-08-30 19:38 [Intel-gfx] [PATCH 0/8] " Umesh Nerlige Ramappa
2021-08-30 19:38 ` [Intel-gfx] [PATCH 1/8] drm/i915/gt: Lock intel_engine_apply_whitelist with uncore->lock Umesh Nerlige Ramappa
2020-11-17 11:01 Chris Wilson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).